Design sgRNA dual guides for MP targets, AP targets, positive and negative controls

First, find the major promoters (MP) and alternative promoters (AP) from the integrated data (PacBio, etc.). Second, define the regions to use as FlashFry input.

Positive Controls: These need to be genes with only one promoter, so that both protospacers target the main promoter and therefore give a full KD, e.g. TOPBP1.

Negative Controls: The share of negative controls varies from ~2% to ~5% of total sgRNAs, with a minimum of 5 negative controls. See the "Negative controls" section: For each library, the frequency of each DNA base at each position along the sgRNA protospacer sequence was calculated. Random sgRNA protospacer sequences weighted by these base frequencies were then generated to mirror the composition of the targeting sgRNAs. These were then filtered for sgRNAs with 0 alignments with a mismatch score less than 31 proximal to the TSS and 0 alignments under 21 in the genome as above.
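The generation step described above can be sketched as follows. This is a minimal illustration, not the code used in the quoted protocol: the function names are hypothetical, the input is assumed to be equal-length uppercase A/C/G/T protospacers, and the off-target alignment filter is left out because it depends on an external aligner. The sizing helper simply applies the ~2-5% share and the minimum of 5 mentioned above.

```python
import math
import random
from collections import Counter

BASES = "ACGT"


def position_base_frequencies(protospacers):
    """Return one {base: frequency} dict per position along the protospacer."""
    length = len(protospacers[0])
    if any(len(p) != length for p in protospacers):
        raise ValueError("all protospacers must have the same length")
    freqs = []
    for pos in range(length):
        counts = Counter(p[pos] for p in protospacers)
        total = sum(counts[b] for b in BASES)
        freqs.append({b: counts[b] / total for b in BASES})
    return freqs


def random_protospacers(freqs, n, seed=0):
    """Draw n random sequences, sampling each position from its base frequencies."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    sequences = []
    for _ in range(n):
        bases = [
            rng.choices(BASES, weights=[f[b] for b in BASES])[0]
            for f in freqs
        ]
        sequences.append("".join(bases))
    return sequences


def n_negative_controls(total_sgrnas, fraction=0.05, minimum=5):
    """Number of negative controls: a share of the library, never below the minimum."""
    return max(minimum, math.ceil(total_sgrnas * fraction))
```

The candidates returned by `random_protospacers` would still need the alignment-based filtering described in the text before being used as non-targeting controls.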

Full KD genes: A dual guide in which one protospacer targets the main promoter and the second protospacer targets the AP.

1) Extract the promoter regions identified by Weissman et al. and lift them over from hg19 to hg38.

2) Overlap these with the candidate promoter regions and APs identified in the alternative promoter identification GitLab project, which requires support from 2/3 of proActiv, CAGE and PacBio for a TSS to be identified.

3) Then select the most upstream promoter found.
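Steps 2 and 3 above can be sketched in plain Python. This is only an illustration under stated assumptions: the `Region` type and function names are hypothetical, coordinates are assumed to be 0-based half-open and already on hg38, overlaps are assumed to require the same strand, and "most upstream" is taken to mean the lowest start on the + strand and the highest end on the - strand.

```python
from typing import NamedTuple


class Region(NamedTuple):
    chrom: str
    start: int   # 0-based, inclusive
    end: int     # exclusive
    strand: str  # "+" or "-"
    name: str


def overlaps(a, b):
    """True if two regions share at least one base on the same chromosome and strand."""
    return (
        a.chrom == b.chrom
        and a.strand == b.strand
        and a.start < b.end
        and b.start < a.end
    )


def supported_promoters(candidates, reference):
    """Keep candidate promoters that overlap at least one reference promoter region."""
    return [c for c in candidates if any(overlaps(c, r) for r in reference)]


def most_upstream(promoters):
    """Pick the most upstream promoter of one gene, taking strand into account."""
    strands = {p.strand for p in promoters}
    if len(strands) != 1:
        raise ValueError("expected a non-empty list of promoters on a single strand")
    if "+" in strands:
        return min(promoters, key=lambda p: p.start)
    # On the minus strand, upstream means higher genomic coordinates.
    return max(promoters, key=lambda p: p.end)
```

In practice the function would be called once per gene, on the promoters that survive the overlap step.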

Promoter KD genes: A dual guide in which both protospacers target the main 5' promoter.

1) Extract the promoter regions identified by Weissman et al. and lift them over from hg19 to hg38.

2) Overlap these with the candidate promoter regions identified in the alternative promoter identification GitLab project, which requires support from 2/3 of proActiv, CAGE and PacBio for a TSS to be identified.

3) Then select the most upstream promoter found.

Type of control % of sgRNAs # of Genes # of sgRNAs
Positive 5 10
Negative Non-Targeting 25
Full KD Genes 45 118 118
Promoter KD Genes 45 118 118

The output table /merklist/polot_sgRNA_protospacersequences.txt will be in the same format as Replogle et al.:

  • unique identifier, type, gene, distance, protospacer_1, protospacer_2, sgID_1, sgID_2, oligo_for_dualguide_cloning, oligo_for_dualguide_cloning_withprimers
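As a rough sketch of how a table with these columns could be written, the snippet below uses Python's `csv` module. Several details are assumptions rather than facts from this project: the tab delimiter, the spelling `unique_identifier` for the first column, and the function name `write_guide_table` are all hypothetical, and the values written are whatever the caller supplies.

```python
import csv
import io

# Column order as listed above; "unique_identifier" is an assumed spelling.
COLUMNS = [
    "unique_identifier", "type", "gene", "distance",
    "protospacer_1", "protospacer_2", "sgID_1", "sgID_2",
    "oligo_for_dualguide_cloning", "oligo_for_dualguide_cloning_withprimers",
]


def write_guide_table(rows, handle):
    """Write dual-guide rows (dicts keyed by COLUMNS) as tab-separated text."""
    writer = csv.DictWriter(
        handle, fieldnames=COLUMNS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for row in rows:
        missing = [c for c in COLUMNS if c not in row]
        if missing:
            raise ValueError(f"row is missing columns: {missing}")
        writer.writerow(row)


if __name__ == "__main__":
    # Placeholder values only; real rows come from the guide selection step.
    example = {c: "NA" for c in COLUMNS}
    example["gene"] = "TOPBP1"
    buffer = io.StringIO()
    write_guide_table([example], buffer)
    print(buffer.getvalue())
```

Checking for missing columns before writing keeps incomplete rows from silently producing empty cells.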

Other tables of interest:

  • ./dual_guide/flashfry_output.txt (raw FlashFry output, with foundinCRISPRiv2.1)
  • ./dual_guide/flashfry_positive_output.txt (raw FlashFry output, with foundinCRISPRiv2.1)

Their respective "top" files are the FlashFry output files for the chosen protospacer sequences.