microRNA Sequencing Data Report for Dr. Researcher. Prepared by LC Sciences, LLC, Jan. 1, 2009
Sequencing Data Report
microRNA Discovery Sequencing Service
On sample_
For Dr. Researcher, Life Sciences, University of USA
Prepared by LC Sciences, LLC, Jan. 1, 2009
LC Sciences, LLC, W. Bellfort, Suite 270, Houston, Texas
microRNA Discovery Sequencing Data Report: sample_

I. PROJECT INFORMATION

Project-related information is listed in Table 1.

Table 1. Sample, service, and project tracking information

Customer Sample Name:     sample_
Sample Type:              Human total RNA
Date Sample Received:     12/15/2009
Service Requested:        microRNA Discovery Sequencing Service
Data Analysis Requested:  Standard Data Analysis
LCS Project Number:
LCS Sample ID:            sample1

II. DATA REPORT

The received RNA sample was processed to generate a cDNA library, which was then used for deep sequencing. The data generated were analyzed, and the full data files of 2-3 Gb were saved onto a DVD disc that is included with this report. Experimental procedures and analysis methods are described in Section III of this report. The statistics of the data analysis are given in the file Data_summary_sample1.xls, and a summary is presented in Table 2. The detailed data files, which may be tens of Mb in size, and the recommended software programs for reviewing the data are listed in Table 3.

Terminologies Used

Sequ Seq: Raw sequencing read generated after image extraction and base-calling
Unique Seq: Family of sequ seqs with the same sequence
Copy Number: Number of sequ seqs in the same unique seq family
Count: Number of sequ seqs in the same unique seq family
Mapping: Aligning a sequence to a reference database
mir: pre-miRNA registered in miRBase
miR: mature miRNA registered in miRBase

LC Sciences, LLC, support@lcsciences.com, 2575 W. Bellfort, Suite 270, Houston, Texas
Table 2. A summary of standard data analysis results

Category                                 #SequSeq    %SequSeq   #UniqueSeq   %UniqueSeq
Raw                                      6,870,…     …          1,445,905    …
Mappable                                 4,764,…     …          40,890       2.83%
Mapped to miRBase (including nohit 1)    4,200,…     …          8,971        0.62%
Mapped to Cluster I                      4,181,…     …          7,…          0.55%
Mapped to Cluster II                     …           …          …            0.00%
Mapped to Cluster III                    2,…         …          …            0.01%
Mapped to Cluster IV                     13,…        …          …,128        0.08%
Mapped (total)                           4,196,…     …          9,201        0.64%
Nohit (including nohit 1 and nohit 2)    567,…       …          31,689       2.19%

Figure: Flow chart of sequencing data analysis of a single sequencing reaction through various filters and the number of miRNAs detected. Labels in the chart:
  7.42% mRNA, RFam, Repbase filter
  1.35% ADT filter
  0.05% junk filter
  0.07% sequence pattern filter
  9,031,007 reads (91%) are mappable
  6.07% length <15 or >…
  …% copy# <3
  7,944,551 reads (88%) passed optional filter
  4,722,478 reads (59%) are mapped to or are miRNA candidates
  41% unmapped
  39% (group 1): 300 miRNAs detected
  4% (group 2): 27 miRNAs detected
  19% (group 3): 149 miRNAs detected
  38% (group 4): 292 miRNAs detected
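The %UniqueSeq column of Table 2 appears to be each row's unique seq count divided by the raw unique seq count. A minimal sketch of that arithmetic follows, using only the counts that are fully legible in Table 2; the function name `percent_of_raw` is illustrative and is not part of the LC Sciences software.

```python
# Unique seq counts copied from the legible rows of Table 2.
RAW_UNIQUE = 1_445_905
unique_counts = {
    "Mappable": 40_890,
    "Mapped to miRBase (including nohit 1)": 8_971,
    "Mapped (total)": 9_201,
    "Nohit (including nohit 1 and nohit 2)": 31_689,
}


def percent_of_raw(count, raw=RAW_UNIQUE):
    """Return count as a percentage of the raw total, rounded to two decimals."""
    return round(100.0 * count / raw, 2)


percentages = {name: percent_of_raw(n) for name, n in unique_counts.items()}

for name, pct in percentages.items():
    print(f"{name}: {pct:.2f}%")

# Consistency check: mappable unique seqs split into mapped and nohit.
mapped_plus_nohit = unique_counts["Mapped (total)"] + unique_counts[
    "Nohit (including nohit 1 and nohit 2)"
]
```

The same check shows that the mappable unique seqs (40,890) are the sum of the mapped (9,201) and nohit (31,689) rows.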
Table 3. Data files delivered and programs recommended for reviewing

Folder         Data file                           Program   Description
Raw Data       sample1_rawdata.txt                 Wordpad   Sequencing sequences (sequ seqs) as obtained from sequencer
Filtered Data  sample1_fhg_unique.txt              Wordpad   Sequ seqs listed by family (unique seqs)
               sample1_fhg_pass.txt                Wordpad   Unique seqs passed digital filters
               sample1_fhg_long.txt                Wordpad   Unique seqs with length >= 15
               sample1_fhg_short.txt               Wordpad   Unique seqs with length < 15
               sample1_fhg_hc.txt                  Wordpad   Unique seqs with copy number >= 3
               sample1_fhg_lc.txt                  Wordpad   Unique seqs with copy number < 3
               sample1_fhg_db.fa                   Wordpad   Final mappable unique seqs
Mapped Data    sample1_fhg_gp1_align.txt           Wordpad   Cluster I: see Table 4
               sample1_fhg_gp1_mirlist.txt         Excel     Cluster I: see Table 4
               sample1_fhg_gp1_sum.txt             Excel     Cluster I: see Table 4
               sample1_fhg_gp2_align.txt           Wordpad   Cluster II: see Table 4
               sample1_fhg_gp2_mirlist.txt         Excel     Cluster II: see Table 4
               sample1_fhg_gp2_sum.txt             Excel     Cluster II: see Table 4
               sample1_fhg_gp3_align.txt           Wordpad   Cluster III: see Table 4
               sample1_fhg_gp3_mirlist.txt         Excel     Cluster III: see Table 4
               sample1_fhg_gp3_sum.txt             Excel     Cluster III: see Table 4
               sample1_fhg_gp4_align.txt           Wordpad   Cluster IV: see Table 4
               sample1_fhg_gp4_mirlist.txt         Excel     Cluster IV: see Table 4
               sample1_fhg_gp4_sum.txt             Excel     Cluster IV: see Table 4
               sample1_fhg_uni_mirs.txt            Wordpad   The list of all unique seqs from Cluster I to IV
               sample1_fhg_nohit.txt               Wordpad   Unique seqs having no hit with reference libraries or the genome
Summary        sample1_fhg_clusterposition.txt     Excel     Genomic chromosomal positions of the mapped unique seqs
               sample1_fhg_mirdistribution.png     Paint     Plot of position of mapped unique seqs inside genome
               Data_summary_sample1.xls            Excel     Statistics of data analysis at the various steps and the final results
III. METHODS AND EXPERIMENTS

A. Small RNA Library Construction

A small RNA library was generated from the customer sample according to Illumina's sample preparation instructions [1]. The procedures performed are summarized below.

1. Small RNA Isolation by Denaturing PAGE Gel
The received total RNA sample was size-fractionated on a 15% tris-borate-EDTA-urea polyacrylamide gel. The RNA fragments of length … nts were isolated, quantified following gel elution, and ethanol precipitated.

2. 5′ and 3′ Adapter Ligation
The SRA 5′ adapter (Illumina) was ligated to the aforementioned RNA fragments with T4 RNA ligase (Promega). The ligated RNAs were size-fractionated on a 15% tris-borate-EDTA-urea polyacrylamide gel, and the RNA fragments of size ~41-76 nts were isolated. The SRA 3′ adapter (Illumina) ligation was then performed, followed by a second size-fractionation under the same gel conditions as described above. The RNA fragments of size ~64-99 nts were isolated through gel elution and ethanol precipitation.

3. Reverse Transcription and PCR Amplification
The ligated RNA fragments were reverse transcribed to single-stranded cDNAs using M-MLV (Invitrogen) with the RT primers recommended by Illumina. The cDNAs were amplified with Pfx DNA polymerase (Invitrogen) in 20 cycles of PCR using Illumina's small RNA primer set.

4. Purification of Amplified cDNA Library for Sequencing
The PCR products were purified on a 12% TBE polyacrylamide gel, and a gel slice of ~… bps was excised. This fraction was eluted, and the recovered cDNAs were precipitated and quantified on a Nanodrop (Thermo Scientific) and on a TBS-380 mini-fluorometer (Turner Biosystems) using PicoGreen dsDNA quantitation reagent (Invitrogen). The concentration of the sample was adjusted to ~10 nM, and a total of 10 µL was used in the sequencing reaction.

B. Deep Sequencing

The purified cDNA library was used for cluster generation on Illumina's Cluster Station and then sequenced on an Illumina GAIIx following the vendor's instructions for running the instrument. Raw sequencing reads were obtained using Illumina's Pipeline v1.5 software following sequencing image analysis by the Pipeline Firecrest Module and base-calling by the Pipeline Bustard Module. The extracted sequencing reads were stored in the file sample1_rawdata.txt and were then used in the standard data analysis, which is described in the next section.

C. Standard Data Analysis

A proprietary software package, ACGT101-miR v3.x (LC Sciences), was used for standard data analysis. The key functions performed by this software and the relevant analysis results are described here.

5. Obtaining Mappable Sequences from Raw Sequencing Data
After the raw sequence reads, or sequenced sequences (sequ seqs), were extracted from the image data, a series of digital filters (LC Sciences) was employed to remove various un-mappable sequencing reads. A FASTA file named sample1_fhg_db.fa was generated and used for mapping.

a. Generating Unique Families of Sequ Seqs by Sorting Raw Sequencing Reads
In this step, identical sequ seqs in the raw data file were counted, and a file of unique families of sequences (unique seqs), sample1_fhg_unique.txt, was generated. An example of a typical entry of this file is shown below:

23 TTTGTCGG GTCTTTGGATATGCCGTGTGACAATGGTGG 1,8560

where 23 is the index of this sequence, followed by the sequ seq and then the count (copy number) of the sequ seq.

b. Generating Mappable Sequ Seqs
In this step, the impurity sequences arising from sample preparation, sequencing chemistry and processes, and the optical digital resolution of the sequencer detector were removed to give the sequ seqs that were used for mapping against the reference database files. The remaining sequ seqs were grouped by families (unique seqs) and stored in the file sample1_fhg_pass.txt.

c. Filtering Unique Seqs by Length
In this step, unique seqs were separated into two groups based on their sequence lengths.
Unique seqs with a sequence length at or above the cut-off length (default = 15 nts for microRNA discovery; see Table 3) were saved in the file named sample1_fhg_long.txt, while those of shorter length were saved in the file named sample1_fhg_short.txt.
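Steps a and c can be sketched in a few lines. This is only an illustration under stated assumptions, not the ACGT101-miR implementation: the function names are hypothetical, the sort order is chosen for readability, and the comparison at the cut-off follows Table 3 (length >= 15 versus length < 15).

```python
from collections import Counter


def collapse_to_unique(reads):
    """Group identical reads into families; return (sequence, count) pairs,
    most abundant first (ties broken alphabetically)."""
    counts = Counter(reads)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def split_by_length(unique_seqs, cutoff=15):
    """Split (sequence, count) pairs into those at or above the cut-off
    length and those below it."""
    long_seqs = [(s, c) for s, c in unique_seqs if len(s) >= cutoff]
    short_seqs = [(s, c) for s, c in unique_seqs if len(s) < cutoff]
    return long_seqs, short_seqs


# Toy reads made up for illustration; they are not taken from the sample data.
reads = [
    "ACGTACGTACGTACGTACGT",
    "ACGTACGTACGTACGTACGT",
    "TTTTGGGGCC",
    "ACGTACGTACGTACGTACGT",
    "TTTTGGGGCC",
    "GATTACAGATTACAGATTACA",
]
unique = collapse_to_unique(reads)
long_seqs, short_seqs = split_by_length(unique)
```

Here `unique` plays the role of sample1_fhg_unique.txt, while `long_seqs` and `short_seqs` correspond to the contents of the long and short files.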
d. Filtering Unique Seqs by Copy Number
In this step, unique seqs were further sorted based on their copy numbers. Those with copy numbers at or above a predefined cut-off number (default = 3; see Table 3) were stored in the file named sample1_fhg_hc.txt, while those with fewer copies were stored in the file named sample1_fhg_lc.txt, where hc means high copy and lc means low copy.

e. Removing Unique Seqs Mapped to Certain Known RNA Reference Databases
Standard procedures were employed to remove those unique seqs that mapped to mRNA, RFam, and Repbase. The unique seqs that passed the filter at this step were saved in the file sample1_fhg_db.fa.

6. Mapping Mappable Unique Seqs to mirs and Genome
In this section, various mappings were performed on unique seqs against pre-miRNA (mir) and mature miRNA (miR) sequences listed in the latest release of miRBase [2, 3, 4], or against the genome based on the public releases for the appropriate species. Mappings were also performed for mirs of interest against the genome sequence. Methods and criteria used for the various mappings are documented in the ACGT-101 User's Manual [5]. Brief descriptions of the analyses are presented below, and the characteristics of the various groups of unique seqs are summarized in Table 4.

a. Mapping Unique Seqs to mirs in miRBase
The cleaned unique seqs in sample1_fhg_db.fa were blasted against mirs in miRBase. The mapped unique seqs were grouped as "unique seqs mapped to mirs in miRBase", while the remaining ones were grouped as "unique seqs un-mapped to mirs in miRBase".

b. Mapping mirs Mapped by Unique Seqs to Genome
The mirs to which the unique seqs in the "unique seqs mapped to mirs in miRBase" group were mapped were further blasted against the genome. The mirs mapped to the genome were sorted out, and the unique seqs associated with these mirs were grouped as "unique seqs mapped to mirs that further mapped to genome".
This group of unique seqs was categorized as Cluster I and saved in the file sample1_fhg_gp1_mirlist.txt. Their alignments are presented in the file sample1_fhg_gp1_align.txt. A summary file was also generated and saved as sample1_fhg_gp1_sum.txt. The unique seqs mapped to mirs that did not map to the genome were grouped as "unique seqs mapped to mirs that un-mapped to genome".
c. Mapping Unique Seqs with mirs Un-mapped to Genome to Genome
The unique seqs in the group "unique seqs mapped to mirs that un-mapped to genome" were blasted against the genome. The unique seqs that mapped to the genome here were grouped as "unique seqs mapped to mirs and genome but mirs un-mapped to genome", categorized as Cluster II, and saved in the file sample1_fhg_gp2_mirlist.txt. Their alignments are presented in the file sample1_fhg_gp2_align.txt. A summary file was also generated and saved as sample1_fhg_gp2_sum.txt. The remaining unique seqs were grouped as "unique seqs mapped to mirs but neither unique seqs nor their mirs mapped to genome".

d. Mapping Unique Seqs to miRs
The unique seqs in the group "unique seqs mapped to mirs but neither unique seqs nor their mirs mapped to genome" were further categorized based on whether they mapped to any mature miRNAs (miRs) within the pre-miRNAs (mirs) to which they were mapped. All unique seqs in this group that mapped to miRs were grouped as "unique seqs mapped to mirs and miRs but neither unique seqs nor their mirs mapped to genome", categorized as Cluster III, and saved in the file sample1_fhg_gp3_mirlist.txt. Their alignments are presented in the file sample1_fhg_gp3_align.txt. A summary file was also generated and saved as sample1_fhg_gp3_sum.txt. The rest of the unique seqs in the group, which did not map to miRs, were grouped as "unique seqs mapped to mirs but not to miR and neither unique seqs nor their mirs mapped to genome" and termed "unique seqs nohit 1".

e. Mapping Unique Seqs Un-mapped to mirs to Genome
Unique seqs in the group "unique seqs un-mapped to mirs in miRBase" were blasted against the genome directly, and those that mapped to the genome were identified. The extended sequences of the mapped genome sequences were tested for possible formation of stable hairpins.
When stable hairpins were predicted, their associated unique seqs were grouped as "unique seqs un-mapped to mirs but mapped to genome with possible hairpin formation". These unique seqs were categorized as Cluster IV and saved in the file sample1_fhg_gp4_mirlist.txt. Their alignments are presented in the file sample1_fhg_gp4_align.txt. A summary file was also generated and saved as sample1_fhg_gp4_sum.txt. All unique seqs in Clusters I to IV are listed in sample1_fhg_uni_mirs.txt as mapped miRs or predicted miRs.
The unique seqs that mapped neither to mirs in miRBase nor to the genome were grouped as "unique seqs un-mapped to mirs and genome" and termed "unique seqs nohit 2". The unique seqs in both groups, "unique seqs nohit 1" and "unique seqs nohit 2", were combined as "unique seqs nohit" and saved in the file sample1_fhg_nohit.txt.

f. Plot of the Chromosome Genomic Positions of the Mapped Unique Seqs
The genomic positions of the Cluster I to IV sequences were mapped to chromosomes; the results were saved in the file named sample1_fhg_clusterposition.txt and displayed in the plot file named sample1_fhg_mirdistribution.png.

Table 4. Summary of mapping of unique seqs to mirs, miRs, and genome*

Cluster I
  Group description: Unique seqs mapped to mirs that further mapped to genome
  Unique seqs mapped*: mir yes

Cluster II
  Group description: Unique seqs mapped to mirs and genome but mirs un-mapped to genome
  Unique seqs mapped*: mir yes; genome yes
  Comments: mirs un-mapped to genome

Cluster III
  Group description: Unique seqs mapped to mirs and miRs but neither unique seqs nor their mirs mapped to genome
  Unique seqs mapped*: mir yes; miR yes; genome no
  Comments: mirs un-mapped to genome

Cluster IV
  Group description: Unique seqs un-mapped to mirs but mapped to genome with possible hairpin formation
  Unique seqs mapped*: mir no; genome yes

Unique seqs nohit
  Group description (unique seqs nohit 1): Unique seqs mapped to mirs but not to miR and neither unique seqs nor their mirs mapped to genome
  Unique seqs mapped*: mir yes; miR no; genome no
  Comments: mirs un-mapped to genome
  Group description (unique seqs nohit 2): Unique seqs un-mapped to mirs and genome
  Unique seqs mapped*: mir no; genome no

* Note: "yes" indicates that a unique seq was mapped to the mir, miR, or genome; "no" indicates that a unique seq was not mapped to the mir, miR, or genome.
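The grouping logic of steps 6a to 6e can be read as a small decision tree. The sketch below is one reading of the text, not the ACGT101-miR code: the function name, the boolean arguments, and the "unassigned" label are hypothetical, and the report does not say which group receives a unique seq that maps to the genome without a predicted hairpin.

```python
def classify_unique_seq(seq_maps_to_mir, mir_maps_to_genome, seq_maps_to_genome,
                        seq_maps_to_mature, hairpin_predicted):
    """Return the group label for one unique seq, following steps 6a to 6e.

    seq_maps_to_mir:     the unique seq maps to a pre-miRNA (mir) in miRBase
    mir_maps_to_genome:  that pre-miRNA itself maps to the genome
    seq_maps_to_genome:  the unique seq maps to the genome
    seq_maps_to_mature:  the unique seq maps to a mature miRNA in that pre-miRNA
    hairpin_predicted:   the extended genome sequence can form a stable hairpin
    """
    if seq_maps_to_mir:
        if mir_maps_to_genome:
            return "Cluster I"
        if seq_maps_to_genome:
            return "Cluster II"
        if seq_maps_to_mature:
            return "Cluster III"
        return "nohit 1"
    if not seq_maps_to_genome:
        return "nohit 2"
    if hairpin_predicted:
        return "Cluster IV"
    # Assumption: the text names no group for genome hits without a hairpin.
    return "unassigned"


example = classify_unique_seq(True, False, True, False, False)  # "Cluster II"
```

Arguments that a given branch never inspects (for example `hairpin_predicted` for a seq that maps to a mir) can be passed as False.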
Figure: Length Distribution of Mappable Data (bar chart of read counts by sequence length, in nt). The accompanying table has the rows Length (nt), Total # of Reads, % of Total Reads, # of Unique Seqs, and Reads # / Unique seqs. Legible read counts per length bin, in order: 105,314; 81,131; 77,…; …; …; …,824; 2,203,539; 3,533,567; …; …,713; 42,930; 20,549; total 7,944,551.
Figure: Chromosomal location of pre-miRNAs. The relative locations of individual pre-miRNAs (mirs) are shown across the 19 chromosomes. MID (Maximum Inter-Distance) is the maximum distance between any two pre-miRNAs on the same chromosome that are considered to be in the same cluster. Fifty-nine clusters (black dots) are obtained when the MID is limited to 50 kb.
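Position clustering with a maximum inter-distance can be sketched as follows. This is an assumption-laden illustration rather than the report's algorithm: it interprets the MID as the largest allowed gap between neighbouring pre-miRNA start positions on one chromosome, and the function name and toy positions are made up.

```python
def cluster_positions(positions, mid=50_000):
    """Group start positions on one chromosome into clusters.

    A position joins the current cluster when its gap to the previous
    (sorted) position is at most `mid`; otherwise it starts a new cluster.
    """
    clusters = []
    for pos in sorted(positions):
        if clusters and pos - clusters[-1][-1] <= mid:
            clusters[-1].append(pos)
        else:
            clusters.append([pos])
    return clusters


# Toy start positions (in bases) on a single chromosome.
positions = [1_000, 30_000, 70_000, 500_000, 520_000, 2_000_000]
clusters = cluster_positions(positions)
for cluster in clusters:
    print(len(cluster), cluster)
```

Under this interpretation a long chain of closely spaced pre-miRNAs forms one cluster even when its two ends are more than the MID apart; a stricter reading would cap the full span of each cluster instead.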
IV. REFERENCE

1. Preparing Samples for Analysis of Small RNA, Illumina Inc., Part # Rev. A, 2008.
2. Griffiths-Jones, S., Saini, H.K., van Dongen, S., Enright, A.J., miRBase: tools for microRNA genomics, 2008, Nucleic Acids Research, 36, D154-D158.
3. Griffiths-Jones, S., Grocock, R.J., van Dongen, S., Bateman, A., Enright, A.J., miRBase: microRNA sequences, targets and gene nomenclature, 2006, Nucleic Acids Research, 34, D140-D144.
4. Griffiths-Jones, S., The microRNA Registry, 2004, Nucleic Acids Research, 32, D109-D111.
5. LC Sciences ACGT 101 manual.
Example printouts of the included sequence data files are attached. The data files included with each sample report are listed in Table 3. These printouts are truncated sample data files.
sample1_rawdata

>ILLUMINA-57021F:7:1:3:1127#0/1
ACATTGGGTTCTCATTCAAATATACTTTTGAAGTATGTGC
>ILLUMINA-57021F:7:1:3:352#0/1
ACGAAGAGGGAGCGCAATNNNTCAGTATATATTGAAGGAC
>ILLUMINA-57021F:7:1:3:804#0/1
TCTGAGCAGTGACTAGNACCCGTAATANGAGGTGAGCAGC
>ILLUMINA-57021F:7:1:3:2003#0/1
CAACAAGGGTAAGTTAATGCAATCGCCCCTCCNNAAAGGG
>ILLUMINA-57021F:7:1:3:399#0/1
TTAAGGTAATTAGCGTGGGCGGTAGCGCTCTGTATAAGCT
>ILLUMINA-57021F:7:1:3:921#0/1
TAAGCAGGCCATGCCGCTACGNNGGGGATAAATCTGGCTG
>ILLUMINA-57021F:7:1:3:1598#0/1
TATTGGGAGGGAATAGATCGCTGNCCAGCCTGATNTAAGG
>ILLUMINA-57021F:7:1:3:1981#0/1
GCCGCCTGCCTAGGTCTTCTTATCTTGAGAATGAGTCAAG
>ILLUMINA-57021F:7:1:3:118#0/1
TGAGCACACATTACATAATGCGGCTACTGTTTGACAAAGT
>ILLUMINA-57021F:7:1:3:1347#0/1
CTCGATGTAAGAGATCACTATTTCGCCACATGGTATTCCG
>ILLUMINA-57021F:7:1:3:185#0/1
TAAGGGACCTATTGTCAGCGGATCACAATGTCTTGAAGGA
>ILLUMINA-57021F:7:1:3:824#0/1
TTTGCAAACCGATGTCAGTGGCACTGCAAATGTCCACTGT
>ILLUMINA-57021F:7:1:3:1603#0/1
CAAAACAAACGTAATACGGCGGTATCCCACTAGAGTTGTC
>ILLUMINA-57021F:7:1:3:613#0/1
TCCCATGAATCAGGCCNGACTAGGGAAANTTCAATCAGAC
>ILLUMINA-57021F:7:1:3:1459#0/1
GCCCGCCGTTATGGACAATAAGTAAATTGCTACAATTGAC
>ILLUMINA-57021F:7:1:3:494#0/1
GCTGTTCTAGAAAATGGTTTATCTATTCCTGCGTCAATCT
>ILLUMINA-57021F:7:1:3:1747#0/1
TGGAAAACTCTATTAGAGTCTAACTTATCCAATGCGCACG
>ILLUMINA-57021F:7:1:3:1574#0/1
AGTAAATNATANAAATAAAATTAAAAAAAAAANAAAAAAA
>ILLUMINA-57021F:7:1:3:1448#0/1
CTATGTACAGCCACTCTCTTGATGGCGGGAAATATTTATT
>ILLUMINA-57021F:7:1:3:149#0/1
GGTGCTGGATTCCCGTTTTGCGTATTTTGGGAGAGGTCCA
>ILLUMINA-57021F:7:1:3:460#0/1
TAGGCTGTTTGCTACATTTTGAGACAAACTGTATAGAGTG
>ILLUMINA-57021F:7:1:3:1436#0/1
AATGGCGGAGCGATTTATAGGGAGAGGGGCGATTGGCTCG
>ILLUMINA-57021F:7:1:3:1847#0/1
GACTATCTGCCTGTAGCGGATAAGGCAGCATCCAACCTAA
>ILLUMINA-57021F:7:1:3:909#0/1
sample1_fhg_unique

# of input sequences:
# of families: ; / =21.0%

Index seq count
1 ATTATGGTACTTGTATTTAACAGGCTCACT
CCTCTTATGTAGACCGTTGTCCAGTGGTGA
ATTTTTATAGTACCAAGAGGCTACGCAGT
GTAAAAAGTCTATCGCCGCACTGTCGTCA
CTTAACGGTTCTAACTATTCACCGGTAAAG
CAAAGAAGCCATAGGCGCCCGGGAACACC
AACCGTGAGGCGCTGGAAAGACGCTAGAAG
AGTAGTACTGGCGTACACATTCTCCACGG
CATCCTATTCTAGCAATCAGGAGAACATTC
AGGCCCATATCAAGAAGTAGAACTATCGA
ACATGAATGGCGAATGCTTCCCGTGATA
GTTAGCTAGTGCCCGGTTTTATCAAGCCC
CGCGAATGTCTTCGTATGCTCAGGTAGC
CATGAGAGGTCGAGGGACTTGATTCCTAC
CTTCTGGCACCGTGGGCCAGCGGAAGGACA
TGCCCCAACGACGCGGAAAATCAAGCGAGC
CTTCTGGAAGCCAACGCTCTGGCGGGATCC
AAACAAATTAACTCGACGACCTCTCCTCT
AAACTATGCATTATTTCCCCCTAAGATCT
CATATCTGTCTCCTACCAGATTATCACCCC
TGTTTGCTGTGGCATTTTTCCCATGGATTA
ATTATTAACGTGGTGTGGTAAATAGAGGGT
GGTTCATGCCTAAATTGCATCTATAATA
GAGTCGTAACGCTACCCTATACGAAGCG
CCTTGAATGTGACCCTGAGGCTTCTATTAG
AAGAAGCGTCAACCCCCACGTCAAGACGT
TCCGTTGCTAGCCGGAGGACCTCCTGGT
CCGCAGAGCCAGGCACTATGTCAGGGGCTA
CATATGGCAAAAGACCGGACTGGACGCGA
ACATCAAGAATCCTCAATCCTACGTGGACG
AAGAGCCGGAGAACACATGATGGAGGCGAC
TGCGATAAAAACGGTGATGACCAAAGAACA
TGATCACGAAAAGTTGCTTGACAAGGTT
CTGGTTAAGCACCCCCTGGTGGTGCTGCCT
TGGGGCTACCGGGGCCTACACGCACCCAT
GTTTATACTATTAATATGCAATGGTGACT
GCCCCAAGCGTAGGTTGGGGGTCCGTTCG
GCCCGGAGTCAGATGACTCGCTTACGTG
CACGCTAGAAGGTGCTAGGGCTAGCTCTTT
GTCATGGGAGCATTCATGCCGCGACGCAC
CGCCTTTACTCCTGGAAGATATGACATGA
GTAAACCCCGGGCGGTGCAACCACAGGCGG
TGGTCTGGGCATTGTGCTTGAGCACACTTA 67972
sample1_fhg_pass

# of input seq:
# of input family:
# of seq after filter: ; / =95.0%
# of family after filter: ; /987972=83.6%
# of seq with repeat fragment: ; / =5.0%
# of family with repeat fragment: ; /987972=16.4%
# of seq of >=7A,>=8C,>=6G,>=7T: ; / =5.0%
# of family of >=7A,>=8C,>=6G,>=7T: ; /987972=16.4%
# of seq of >10 dimer: 5; 5/ =0.0%
# of family of >10 dimer: 5; 5/987972=0.0%
# of seq of >6 trimer: 18; 18/ =0.0%
# of family of >6 trimer: 18; 18/987972=0.0%
# of seq of >5 tetramer: 8; 8/ =0.0%
# of family of >5 tetramer: 8; 8/987972=0.0%

Index seq count
1 TAGCACTCAAGTGTTTTGCACTGG
AGAACGGTTTGCTATTTCTG
TAACGGGGGTCACCTTCGGCAG
CTCGAATTTTTCCAATCAC
GTATGCATAATTGCAAGCACAT
TTGCCTCACTCGTACAAAAGGCC
GGATCAGGACCCGACTCCACATTAG
AGAGGCCACCAAGATCTTAGGCC
AAACAGCGTCTCAGTGTAATTG
TCTGTCTATCTTCTTAAT
CTGTCCCCCTTGTCTGATACA
ACAAATGGCGGACGCGAATC
GGCAAGTCTTCCTGCGA
TGATGGAGGAACGTAAACGT
GTCCACATCATCCCGGGGTCG
TCGCGTATTGCTATTTAGGA
ATCAACAACCAATCTCGATAT
TATCAGTAACGTACATGCCCCCGAT
ACTCCTGGAGGTCTCGCTCGTCTA
AATCGAATCTTCGATACGTCGT
CGTGGAGGGAGGCCGTCAGTTT
GTCCATCTAGCCAATAGGC
AGTAACCAACCGTGAGAGTGTTGGC
GCTGGTTTTTGGAGCATG
AGTAGGTCTGTAAGGGGT
CAGTAAACGAAAGGACCGAGACT
TTACACAAACATATCAGCGAT
CTCGTTATTAGTAATACTC
TAAGTGGTAAATTAACCGTTACACC
TCGCGAGCTGACCAGTATCACG 23074
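The header of the sample1_fhg_pass printout lists the repeat-fragment criteria by name only. The sketch below shows one way such a sequence pattern filter could work. It is an assumption, not the LC Sciences digital filter: it reads ">=7A,>=8C,>=6G,>=7T" as minimum homopolymer run lengths and ">10 dimer", ">6 trimer", ">5 tetramer" as limits on consecutive copies of a 2-, 3-, or 4-nt unit. All names are hypothetical.

```python
import re

# Assumed reading of ">=7A,>=8C,>=6G,>=7T": minimum run length that is rejected.
HOMOPOLYMER_MIN = {"A": 7, "C": 8, "G": 6, "T": 7}
# Assumed reading of ">10 dimer, >6 trimer, >5 tetramer":
# unit length -> largest number of consecutive copies still accepted.
REPEAT_MAX = {2: 10, 3: 6, 4: 5}


def has_repeat_fragment(seq):
    """Return True if seq contains a homopolymer run or a short tandem repeat
    that exceeds the thresholds above."""
    seq = seq.upper()
    for base, min_run in HOMOPOLYMER_MIN.items():
        if base * min_run in seq:
            return True
    for unit_len, max_copies in REPEAT_MAX.items():
        # One captured unit followed by at least max_copies further copies
        # means more than max_copies copies in a row.
        pattern = r"([ACGT]{%d})\1{%d,}" % (unit_len, max_copies)
        if re.search(pattern, seq):
            return True
    return False


# First entry of the sample1_fhg_pass printout: expected to pass the filter.
passes = not has_repeat_fragment("TAGCACTCAAGTGTTTTGCACTGG")
```

Under these assumed thresholds, entries shown in the printout such as TAACGGGGGTCACCTTCGGCAG (a run of five G) are kept, which is at least consistent with their appearing in the filtered file.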
sample1_fhg_gp1_align

input file: sample1/miralign/sample1_fhg_align.txt for mirs
input file: sample1/db/sample1_fhg_db.fa for sequ seq
input file: sample1/output/sample1_fhg_gp1.txt for cluster & genomeseq

Conventions:
1. the . in alignments means that the base is the same as that in the reference.
2. the * in alignments means that the base is the same as that in the reference, but the * is the mature part of the precursor. The capital bases in the * region also belong to the mature part.
3. the * in #error means that this sequseq has a deletion compared with the reference.
4. the + in #error means that this sequseq is without 3ADT cut, and the previous part of the sequseq is mapped to the reference and the other part is removed.
5. full length precursor and sequseq (except for the sequseq without 3ADT cut, which is indicated by + in #error) are listed below.

clusterno=1 chr=1 gi=nt_ strand=1 #mirs=7 #copy(all)=3832 #family(all)=37 #copy(0error)=1167 #family(0error)=12 #copy(1error)=2665 #family(1error)=25
genome GTATGCCTTAACAGCAAGCGCAGTAGCGTAGCGACTGGGCATGAACGCGACGTTGATGAACTCGTAAGTTCTTCCACAAGTCTGACCGTCGTATAAG #error
hsa-mir-xxe 1...**********************...********************** hsa-mir-xxe,hsa-mir-xxe*
ptr-mir-xxe 1...********************** ptr-mir-xxe
mml-mir-xxe 1...a...********************** mml-mir-xxe
mmu-mir-xxe 1...**********************...c...********************** mmu-mir-xxe,mmu-mir-xxe*
bta-mir-xxe 1...************************...c...cga bta-mir-xxe-5p
oan-mir-xxe 1.t...***********************..g... c.ga...**********************...a.. 94 oan-mir-xxe,oan-mir-xxe*
cfa-mir-xxe 1...a...g.********************** 64 cfa-mir-xxe
275_count= _count= _count= _count= _count= _count= _count= _count= c _count= c _count= c _count= t _count= c _count= a _count= c _count=8 1...c _count=7 1...c _count=5 1...a _count=4 1...t _count=3 1...t _count=3 1...c _count=3 1...t _count=3 1...c _count= _count= _count= _count= _count= _count= t _count= g _count= g _count= g _count=9 1...a 23 1
19877_count=6 1...g _count=3 1...t 21 1

clusterno=2 chr=1 gi=nt_ strand=1 #mirs=5 #copy(all)=156 #family(all)=7 #copy(0error)=137 #family(0error)=4 #copy(1error)=19 #family(1error)=3
genome CTCTTGCGAAAAATAAATAAACGCTCAATTAGATGGCGGCGGATTGGGTCCCCCCTAGAAGCGACAGGGTTGCTGCTGAACTCGGTGGTTCTGTGAG #error
bta-mir-xxc 4 a..g...c...***********************...g...g. 104 bta-mir-xxc
hsa-mir-xxc ***********************...********************** hsa-mir-xxc
ptr-mir-xxc *********************** ptr-mir-xxc
mmu-mir-xxc t...***********************...********************** mmu-mir-xxc
cfa-mir-xxc-1 1 ************************ cfa-mir-xxc
1630_count= _count= _count= _count= a _count=6 1...a _count=3 1...t _count=

clusterno=3 chr=1 gi=nt_ strand=1 #mirs=5 #copy(all)=22 #family(all)=2 #copy(0error)=19 #family(0error)=1 #copy(1error)=3 #family(1error)=1
genome TCATAAAAATGTCGAGGAATGGCGGCTCGCGTAGACCCGCACCCCACCCCTTCGAAGCTCATTGCGTCAGTTCCACGATTC #error
mmu-mir-xxx 1...********************** mmu-mir-xxx
bta-mir-xxx 1...**********************.g...a.. 80 bta-mir-xxx
hsa-mir-xxx 1...********************** hsa-mir-xxx
mne-mir-xxx 1...*******************G** mne-mir-xxx
cfa-mir-xxx 1...********************** 61 cfa-mir-xxx
6971_count= _count=3 1...a 24 1

clusterno=4 chr=1 gi=nt_ strand=1 #mirs=2 #copy(all)=33402 #family(all)=49 #copy(0error)=481 #family(0error)=12 #copy(1error)=32921 #family(1error)=37
genome GCTCCCCTATAAGAAGCGCGGAAGCCGGCTTATATGTTTCCCCATTATCATCGAACTTTCGATTGGGCCCCGTAACTCT #error
ptr-mir-xxx ********************** ptr-mir-xxx
hsa-mir-xxx ********************** hsa-mir-xxx
1326_count= _count= _count= _count= _count= _count= _count= _count= _count= g _count= g _count= g _count= g _count= t _count= t _count= g _count= a _count= t _count= g _count= g _count= t 21 1
sample1_fhg_gp1_mirlist

#sequ_seq_id seq length

clusterno=1: mir 40e 3p
_count=837 AAATTGTCGTCCGAACGACCCA 24 CTAAATTGTCGTCCGAACGACCCA
_count=62 AAATTGTCGTCCGAACGACCCA 23 CTAAATTGTCGTCCGAACGACCCA
_count=32 AAATTGTCGTCCGAACGACCCA 23 CTAAATTGTCGTCCGAACGACCCA
_count=12 CTAAATTGTCGTCCGAACGACC 22 CTAAATTGTCGTCCGAACGACCCA
_count=7 CTAAATTGTCGTCCGAACGACCCA 25 CTAAATTGTCGTCCGAACGACCCA
_count=7 CTAAATTGTCGTCCGAACGACC 22 CTAAATTGTCGTCCGAACGACCCA
_count=5 TAAATTGTCGTCCGAACGACCCA 24 CTAAATTGTCGTCCGAACGACCCA
_count=4 AAATTGTCGTCCGAACGACCCA 24 CTAAATTGTCGTCCGAACGACCCA
_count=3 TAAATTGTCGTCCGAACGACCCA 24 CTAAATTGTCGTCCGAACGACCCA
_count=3 AAATTGTCGTCCGAACGAC 19 CTAAATTGTCGTCCGAACGACCCA
_count=3 AAATTGTCGTCCGAACGACCCA 24 CTAAATTGTCGTCCGAACGACCCA
_count=3 CTAAATTGTCGTCCGAACGA 20 CTAAATTGTCGTCCGAACGACCCA 1 3

clusterno=1: mir 40e 5p
_count=133 AAGAGTGCGTTGATTGTGGGTA 22 TCAAGAGTGCGTTGATTGTGGGTA
_count=61 CAAGAGTGCGTTGATTGTGGG 21 TCAAGAGTGCGTTGATTGTGGGTA
_count=4 AAGAGTGCGTTGATTGTGG 19 TCAAGAGTGCGTTGATTGTGGGTA
_count=4 AAGAGTGCGTTGATTGTGGGT 21 TCAAGAGTGCGTTGATTGTGGGTA
_count=4 AAGAGTGCGTTGATTGTGGG 20 TCAAGAGTGCGTTGATTGTGGGTA
_count=168 CAAGAGTGCGTTGATTGTGGGT 22 TCAAGAGTGCGTTGATTGTGGGTA
_count=80 AAGAGTGCGTTGATTGTGGGTA 22 TCAAGAGTGCGTTGATTGTGGGTA
_count=19 AAGAGTGCGTTGATTGTGGGT 21 TCAAGAGTGCGTTGATTGTGGGTA
_count=12 TCAAGAGTGCGTTGATTGTGG 21 TCAAGAGTGCGTTGATTGTGGGTA
_count=9 TCAAGAGTGCGTTGATTGTGGGT 23 TCAAGAGTGCGTTGATTGTGGGTA
_count=6 AAGAGTGCGTTGATTGTGGG 20 TCAAGAGTGCGTTGATTGTGGGTA
_count=3 CAAGAGTGCGTTGATTGTGGG 21 TCAAGAGTGCGTTGATTGTGGGTA
_count=3 AAGAGTGCGTTGATTGTGGGTA 22 TCAAGAGTGCGTTGATTGTGGGTA
_count=3 AAGAGTGCGTTGATTGTGGGTA 23 TCAAGAGTGCGTTGATTGTGGGTA
_count=3 AAGAGTGCGTTGATTGTGGGTA 23 TCAAGAGTGCGTTGATTGTGGGTA 3 3
clusterno=2: mir 40c 5p
_count=106 AGTGGAGAGTGCCGCGTGTCTCG 24 GAGTGGAGAGTGCCGCGTGTCTCG
_count=19 GAGTGGAGAGTGCCGCGTGTCTC 23 GAGTGGAGAGTGCCGCGTGTCTCG
_count=7 AGTGGAGAGTGCCGCGTGTCTC 22 GAGTGGAGAGTGCCGCGTGTCTCG
_count=10 AGTGGAGAGTGCCGCGTGTCTCG 24 GAGTGGAGAGTGCCGCGTGTCTCG
_count=6 AGTGGAGAGTGCCGCGTGTCTCG 25 GAGTGGAGAGTGCCGCGTGTCTCG
_count=3 GAGTGGAGAGTGCCGCGTGTCTCG 25 GAGTGGAGAGTGCCGCGTGTCTCG 1 2

clusterno=2: mir 40c 3p
_count=5 ATCGCAGAATGCGCCTTGAT 22 CATCGCAGAATGCGCCTTGAT 2 1

clusterno=3: mir p
_count=9 AACTAGCGGTCTCTTTCGCGT 21 AACTAGCGGTCTCTTTCGCGTGGA
_count=13 ACTAGCGGTCTCTTTCGCGTGG 22 AACTAGCGGTCTCTTTCGCGTGGA 2
sample1_fhg_gp1_sum

input file: sample1/miralign/sample1_fhg_align.txt for mirs
input file: sample1/db/sample1_fhg_db.fa for sequ seq
input file: sample1/output/sample1_fhg_gp1.txt for cluster & genomeseq

Title: Position clusters of mirs mapping to genome

input file: sample1/lists/sample1_fhg_align_chry.txt for sequence start position
input file: sample1/miralign/sample1_fhg_matchedmirs.fa for mapped mir IDs
input file: sample1/lists/sample1_fhg_align_chr1.txt for alignment data
input file: sample1/lists/sample1_fhg_align_chr2.txt for alignment data
input file: sample1/lists/sample1_fhg_align_chr3.txt for alignment data
input file: sample1/lists/sample1_fhg_align_chr4.txt for alignment data
input file: sample1/lists/sample1_fhg_align_chr5.txt for alignment data
input file: sample1/lists/sample1_fhg_align_chr6.txt for alignment data
input file: sample1/lists/sample1_fhg_align_chr7.txt for alignment data
input file: sample1/lists/sample1_fhg_align_chr8.txt for alignment data
input file: sample1/lists/sample1_fhg_align_chr9.txt for alignment data
input file: sample1/lists/sample1_fhg_align_chr10.txt for alignment data
input file: sample1/lists/sample1_fhg_align_chr11.txt for alignment data
input file: sample1/lists/sample1_fhg_align_chr12.txt for alignment data
input file: sample1/lists/sample1_fhg_align_chr13.txt for alignment data
input file: sample1/lists/sample1_fhg_align_chr14.txt for alignment data
input file: sample1/lists/sample1_fhg_align_chr15.txt for alignment data
input file: sample1/lists/sample1_fhg_align_chr16.txt for alignment data
input file: sample1/lists/sample1_fhg_align_chr17.txt for alignment data
input file: sample1/lists/sample1_fhg_align_chr18.txt for alignment data
input file: sample1/lists/sample1_fhg_align_chr19.txt for alignment data
input file: sample1/lists/sample1_fhg_align_chr20.txt for alignment data
input file: sample1/lists/sample1_fhg_align_chr21.txt for alignment data
input file: sample1/lists/sample1_fhg_align_chr22.txt for alignment data
input file: sample1/lists/sample1_fhg_align_chrx.txt for alignment data
input file: sample1/lists/sample1_fhg_align_chry.txt for alignment data

# of unique mammalian mirs mapped by sequ seq: 6456
# of unique mammalian mirs mapped by sequ seq & genome: 5256; 5256/6456=85.6%
# of position clusters of the mapped mirs: 574
length of the genome:
position cluster: the distance of two near positions in a cluster is < 50

For the mirs mapped by sequ seq and genome after re-alignment:
# of position clusters: 457
# of mammalian mirs mapped to sequ seq and genome: 4839
# of unique sequ seq:
# of unique sequ family: 5330

Note:
The representative mir at 5p or 3p is the sequenced sequence with the lowest error# and the highest copy# in each cluster.
The mir_name is composed of 4 parts: mir, extension number, 5p or 3p, and index of sequ seq. The extension number is the extension number of highest occurrence in the mirids.
# of unique mirs detected: 563; # of unique mirs is counted based on mir_name.
Column headers as printed: Index; copy#(all isoforms in 5p or 3p); family#(all isoforms in 5p or 3p); chr#; chr_seqid; strand; mir_start; mir_end; mir_start; mir_end; #mirs; mirids; clusterno; mir_name; mir_seq; mir_len; copy# of the isoform

1 1 mir 30e 5p 275 TGTAAACATCCTTGACTGGAAGCT NT_ hsa mir XXX
2 20 mir 28 3p 2238 CACTAGATTGTGAGCTCCTGGA NT_ hsa mir XXX
3 57 mir 27b 3p 127 TTCACAGTGGCTAAGTTCTGC NT_ hsa mir XXX
4 87 mir 625 3p GACTATAGAACTTTCCCCCTCA NT_ hsa mir XXX
5 98 mir p AGGGTAGATAGAACAGGTCTTG NT_ hsa mir XXX
mir 21 3p 413 CAACACCAGTCGATGGGCTGTC NT_ hsa mir XXX
mir 7e 3p CTATACGGCCTCCTAGCTTTCC NT_ hsa mir XXX
mir 101 3p 121 GTACAGTACTGTGATAACTGAA NT_ hsa mir XXX
mir 135b 3p ATGTAGGGCTAAAAGCCATGGG NT_ hsa mir XXX
mir 425 3p 5257 ATCGGGAATGTCGTGTCCGCC NT_ hsa mir XXX
mir 548p 3p CCAAAACTGCAGTTACTTTTGC NT_ hsa mir XXX
mir 31 5p 412 AGGCAAGATGCTGGCATAGCTG NT_ hsa mir XXX
mir 548l 5p AAAAGTATTTGCGGGTTTTGTC NT_ hsa mir XXX
mir 125b 5p 90 TCCCTGAGACCCTAACTTGTGA NT_ hsa mir XXX
mir p 5124 TCTGGGTGGTCTGGAGATTTGTG NT_ hsa mir XXX
mir 16 3p 9924 CCAGTATTAACTGTGCTGCTGA NT_ hsa mir XXX
mir p GAGGCAGAAGCAGGATGACAA NT_ hsa mir XXX
mir 33b 5p 3282 GTGCATTGCTGTTGCATTGCA NT_ hsa mir XXX
mir 454 3p 8695 TAGTGCAATATTGCTTATAGGGTTT NT_ hsa mir XXX
mir 320 3p 1624 AAAAGCTGGGTTGAGAGGG NT_ hsa mir XXX
mir 548j 5p AAAAGTAATTGCGGTCTTTGGT NT_ hsa mir XXX
mir 659 5p AGGACCTTCCCTGAACCAAGGA NT_ hsa mir XXX
mir p ACGCCCTTCCCCCCCTTCTTCA NT_ hsa mir XXX
mir 221 5p 816 ACCTGGCATACAATGTAGATTTCT X NT_ hsa mir XXX
mir 221 3p 5 AGCTACATTGTCTGCTGGGTTTC X NT_ hsa mir XXX
mir 222 5p 4253 CTCAGTAGCCAGTGTAGATCC X NT_ hsa mir XXX
mir 222 3p 23 AGCTACATCTGGCTACTGGGTCTC X NT_ hsa mir XXX
mir 548i 5p AAAAGTACTTGCGGATTTTGC X NT_ hsa mir XXX
mir 361 5p 1854 TTATCAGAATCTCCAGGGGTAC X NT_ hsa mir XXX
mir 361 3p TCCCCCAGGTGTGATTCTGATTT X NT_ hsa mir XXX
mir 421 3p 3255 ATCAACAGACATTAATTGGGCGC X NT_ hsa mir XXX
23 sample1_fhg_clusterposition
input from sample1/0finalreport/3_sample1_fhg_gp1_sum.txt
input from sample1/0finalreport/4_sample1_fhg_gp2_sum.txt
input from sample1/0finalreport/6_sample1_fhg_gp4_sum.txt
The Position refers to the start position of the mir in the human genome. ESTs are added one by one after chromosome X.
The PositionInSeq refers to the position of the mir in its own contig sequence or EST.
Refer to the above three input files for the definitions of mir_name and mir_seq.
The clusterdistance is the difference between the positions of the current and previous clusters.
Minimum c 1 Maximum c
Columns: Index, Position, cluster Distance, #Copy (all isoforms in 5p or 3p), #family (all isoforms in 5p or 3p), Chr#, Strand, StartPositionInSeq, EndPositionInSeq, Type, unique_mirs, mir_name, mir_seq
predict (gp4) PC 5p XXXXX GCGGCACTGAGGCTTATAGCGGAA
predict (gp4) PC 3p XXXXX TGTACGGCCATCCAGCTCTAGGCC
predict (gp4) PC 5p XXXXX GGAATAGCACATCAAGTAGGT
predict (gp4) PC 5p XXXXX CACGGCCATTAGACGACGCCGGG
predict (gp4) PC 3p XXXXX TGTACGTAAAGTGACTCCACTAA
predict (gp4) PC 5p XXXXX TGATAGGCCCTACTGTCCATGTT
known (gp1) hsa mir XXX mir XXX 5p XXX TATGCCAGGCAGTTATACCAT
known (gp1) hsa mir XXX mir XXX 5p XXX CAAGCGTTTGTCAACAAAGTGTTGA
predict (gp4) PC 5p XXXXX TTGTCCGATTATGTGCTCG
known (gp1) mmu mir XXX;mml mir XXX;b mir XXX 5p XXX CCGCGACGTTTTCGGGACCGA
known (gp1) mmu mir XXX;mml mir XXX;b mir XXX 3p XXX CCAAACTCGCGAACTAG
known (gp1) mmu mir XXX;mml mir XXX;b mir XXX 5p XXX GAACTCTACGAATCATCCTAGTATG
known (gp1) mmu mir XXX;mml mir XXX;b mir XXX 3p XXX GCTGCCTCCGTACGATGCTA
predict (gp4) PC 5p XXXXX ATCTAATGTGGGTGACACTGGT
predict (gp4) PC 5p XXXXX GGGGTTTAGGGTACCCGCTTCTG
predict (gp4) PC 5p XXXXX TGTTGAGCGATTGCATGCAACTTA
predict (gp4) PC 5p XXXXX TGGGTGCGTGTGGTCACGTC
predict (gp4) PC 5p XXXXX TGCGCGTCTTTATTATC
predict (gp4) PC 5p XXXXX CCGTGATTGGACCGTCGCGTTCGT
predict (gp4) PC 5p XXXXX CACTGCCGAACGATCTGTGATTCC
predict (gp4) PC 3p XXXXX TTCACGCTGGGTTATATCTCTCGC
predict (gp4) PC 5p XXXXX CCTCTCCTGGTTAGTCCA
predict (gp4) PC 3p XXXXX CCAAGCAGTCTGGCATCTTATGC
known (gp1) mmu mir XXX;mml mir XXX;b mir XXX 3p XXX GTATCGCTATCGCCCAGAGCGTCG
predict (gp4) PC 5p XXXXX CACCCAAGGACCCCGCC
known (gp1) ssc mir XXX mir XXX 3p XXX GCGCCTTCCGCCGATTTTGT
predict (gp4) PC 3p XXXXX ACGATACTGTACTCGGG
predict (gp4) PC 5p XXXXX CCAAGAGGTGTGTTGAGCA
predict (gp4) PC 5p XXXXX AGTTTTCGCACGGCGTGTCAT
predict (gp4) PC 5p XXXXX TGCTTATGCAGCTTTGTAGCCT
known (gp1) mmu mir XXX;mml mir XXX;b mir XXX 5p XXX AAGGCGGGTCTACTAAGGGGAGC
predict (gp4) PC 5p XXXXX GAAACCAGCTAAGCAATGC
known (gp1) mmu mir XXX;mml mir XXX;b mir XXX 5p XXX GGGCATAACTGTGGGCTGAC
predict (gp4) PC 5p XXXXX TTGAGGTCCGTTCCTCAGTCGACCT
predict (gp4) PC 3p XXXXX TAGAGGTAGCCACAAGGATAGCG
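The clusterdistance definition for this table (difference between the positions of the current and previous clusters) can be sketched in Python as follows. The sketch assumes the cluster positions are already listed in table order on one shared coordinate axis, and it leaves the first cluster without a distance because it has no predecessor; the report does not state how the first row is filled. The function name and the example positions are hypothetical.

```python
from typing import List, Optional


def cluster_distances(positions: List[int]) -> List[Optional[int]]:
    """Return, for each cluster position, its distance to the previous cluster position."""
    distances: List[Optional[int]] = []
    previous: Optional[int] = None
    for pos in positions:
        # The first cluster has no predecessor, so its distance is left undefined.
        distances.append(None if previous is None else pos - previous)
        previous = pos
    return distances


# Hypothetical cluster start positions (not taken from the report).
print(cluster_distances([1000, 1200, 5000]))
# -> [None, 200, 3800]
```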
24 Table 1 - Data summary
Raw Data  #SequSeq  #UniqueSeq
# Raw Sequ seq  9,922,513  1,592,666
Data Processing  #SequSeq  %SequSeq  #UniqueSeq  %UniqueSeq
1. impurity sequences filtered  948, %  623, %
2. Copy#<3 filtered  913, %  845, %
3. Length < 15 filtered  204, %  56, %
4. mrna,rfam,repbase filtered  362, %  11, %
5. Final Mappable  7,493, %  55, %
Total  9,922, %  1,592, %
Table 2 - Length distribution of mappable data
Length  #SequSeq  %FinalMappable SequSeq  #UniqueSeq  %FinalMappable UniqueSeq  #SequSeq/#UniqueSeq
15 52, % % , % 1, % , % % , % 1, % , % % , % 1, % , % 3, % ,438, % 16, % ,810, % 8, % , % 2, % , % % , % % , % % , % % , % % , % % , % 15, % 7.4
Final Mappable  7,493, %  55, %  135.1
25 Table 3 - Unique seq mapped to unique mammalian mirs
#SequSeq  %SequSeq  %FinalMappable  #UniqueSeq  %UniqueSeq  %FinalMappable
# Known mammalian unique mir in mirbase v14.0  3,924
# Known mammalian unique mir in mirbase v14.0  2,656
# Unique mir in mirbase mapped  1,599
# Unique mir in mirbase mapped  2,273
Mapped to mirbase  6,013, %  80.25%  12, %  22.46%
# Known hsa mir in mirbase v
# Known hsa mir in mirbase v
# Unique hsa mir in mirbase mapped  286
# Unique hsa mir in mirbase mapped  399
Mapped to hsa of mirbase  5,937, %  79.24%  8, %  15.13%
Table 4 - Cluster I: Sequ seq mapped to mammalian mirs that further mapped to genome
Mapping to mammalian:
Sequ Seq  Unique Seq
#SequSeq  %SequSeq  %FinalMappable  #UniqueSeq  %UniqueSeq  %FinalMappable
Cluster I  5,734, %  76.53%  8, %  15.03%
# Alignment-cluster in Cluster I  389
# Unique mir in Cluster I  1,456
# Unique mir in Cluster I  309
FileName: Mapped_Data/sample1_FHG_gp1_Align.txt
FileName: Mapped_Data/sample1_FHG_gp1_Sum.txt
FileName: Mapped_Data/sample1_FHG_gp1_miRlist.txt
Mapping to species hsa:
Sequ Seq  Unique Seq
#SequSeq  %SequSeq  %FinalMappable  #UniqueSeq  %UniqueSeq  %FinalMappable
Cluster I  5,734, %  76.53%  8, %  15.01%
# Unique hsa mir in Cluster I  295
# Unique hsa mir in Cluster I  399
26 Table 5 - Cluster II: Sequ seq mapped to both mammalian mirs and genome, but the mirs unmapped to genome:
Mapping to mammalian:
Sequ Seq  Unique Seq
#SequSeq  %SequSeq  %FinalMappable  #UniqueSeq  %UniqueSeq  %FinalMappable
Cluster II  %  0.00%  %  0.02%
# Alignment-cluster in Cluster II  7
# Unique mir in Cluster II  3
# Unique mir in Cluster II  4
FileName: Mapped_Data/sample1_FHG_gp2_Align.txt
FileName: Mapped_Data/sample1_FHG_gp2_Sum.txt
FileName: Mapped_Data/sample1_FHG_gp2_miRlist.txt
Table 6 - Cluster III: Sequ seq mapped to mammalian mirs, but the mirs unmapped to genome (sequence cluster):
Mapping to mammalian:
Sequ Seq  Unique Seq
#SequSeq  %SequSeq  %FinalMappable  #UniqueSeq  %UniqueSeq  %FinalMappable
Cluster III  2, %  0.04%  %  0.26%
# Alignment-cluster in Cluster III  23
# Unique mir in Cluster III  37
# Unique mir in Cluster III  18
FileName: Mapped_Data/sample1_FHG_gp3_Align.txt
FileName: Mapped_Data/sample1_FHG_gp3_Sum.txt
FileName: Mapped_Data/sample1_FHG_gp3_miRlist.txt
Mapping to species hsa:
Sequ Seq  Unique Seq
#SequSeq  %SequSeq  %FinalMappable  #UniqueSeq  %UniqueSeq  %FinalMappable
Cluster III  %  0.00%  %  0.00%
# Unique hsa mir in Cluster III  0
# Unique hsa mir in Cluster III  0
27 Table 7 - Cluster IV: Sequ seq mapped to genome, but unmapped to mammalian mirs (predict new hairpin by mfold):
Sequ Seq  Unique Seq
#SequSeq  %SequSeq  %FinalMappable  #UniqueSeq  %UniqueSeq  %FinalMappable
Cluster IV  18, %  0.25%  1, %  2.53%
# Alignment-cluster in Cluster IV  1,095
# Unique mir in Cluster IV  703
FileName: Mapped_Data/sample1_FHG_gp4_Align.txt
FileName: Mapped_Data/sample1_FHG_gp4_Sum.txt
FileName: Mapped_Data/sample1_FHG_gp4_miRlist.txt
Table 8 - Unmapped Sequ Seq
Sequ Seq  Unique Seq
#SequSeq  %SequSeq  %FinalMappable  #UniqueSeq  %UniqueSeq  %FinalMappable
Nohit  847, %  11.31%  40, %  72.15%
FileName: Mapped_Data/sample1_FHG_nohit.txt
Table 9 - Mapping summary
#SequSeq  %SequSeq  #UniqueSeq  %UniqueSeq
Raw  9,922, %  1,592, %
Mappable  7,493, %  55, %
Mapped to mirbase (including nohit 1)  6,013, %  12, %
Mapped to Cluster I  5,734, %  8, %
Mapped to Cluster II  %  %
Mapped to Cluster III  2, %  %
Mapped to Cluster IV  18, %  1, %
Mapped (total)  5,756, %  9, %
Nohit (including nohit 1 and nohit 2)  847, %  40, %
Note: Mapped (total) + Nohit should equal Mappable.
Table 10 - Detected mir summary
# of unique mirs detected in Cluster I  309
# of unique mirs detected in Cluster II  4
# of unique mirs detected in Cluster III  18
# of unique mirs detected in Cluster IV  703
Total  1,034
FileName: Mapped_Data/sample1_FHG_uni_miRs.txt
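The two bookkeeping rules on this page (the Table 10 total is the sum of the four cluster counts, and Mapped (total) + Nohit should equal Mappable) lend themselves to a quick consistency check. The Python sketch below uses the Table 10 counts as printed; the helper `mapping_is_consistent` and the counts passed to it are hypothetical and only illustrate the form of the check.

```python
# Unique mirs detected per cluster, as listed in Table 10.
detected = {"Cluster I": 309, "Cluster II": 4, "Cluster III": 18, "Cluster IV": 703}
total_detected = sum(detected.values())
print(total_detected)  # -> 1034


def mapping_is_consistent(mapped_total: int, nohit: int, mappable: int) -> bool:
    """Check the rule from the Table 9 note: Mapped (total) + Nohit == Mappable."""
    return mapped_total + nohit == mappable


# Made-up counts, used only to show how the check is called.
print(mapping_is_consistent(900, 100, 1000))  # -> True
```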
28 Note: Definition of mir_name:
in cluster group 1: the mir_name is composed of 4 parts: mir, extension number, 5p or 3p, and index of sequ seq. The extension number is the extension number in the mirids of highest occurrence.
in cluster group 2: the mir_name is composed of 4 parts: PC, extension number, 5p or 3p, and index of sequ seq. PC means 'Predicted Candidate'. The extension number is the extension number in the mirids of highest occurrence.
in cluster group 3: the mir_name is composed of 4 parts: PN, extension number, 5p or 3p, and index of sequ seq. PN means 'Predicted Novel mir'. The extension number is the extension number in the mirids of highest occurrence.
in cluster group 4: the mir_name is composed of 3 parts: PC, 5p or 3p, and index of sequ seq. PC means 'Predicted Candidate'.
The mir sequence is taken from the isoform with the lowest error# and the highest copy# in cluster groups 1, 2 & 4.
The mir sequence is taken from the isoform with the highest copy#, regardless of its error#, in cluster group 3.
The # of unique mirs detected in these 4 groups is counted from distinct mir_seq.
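The isoform selection rule in the note above can be written as a short Python sketch. It reads "lowest error# and highest copy#" as: pick the lowest error# first and break ties by the highest copy#. That ordering is an assumption, since the report does not say how the two criteria are combined. The `Isoform` class, the function name, and the example values are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Isoform:
    seq: str     # isoform sequence
    errors: int  # error# of the isoform
    copies: int  # copy# of the isoform


def pick_representative(isoforms: List[Isoform], group: int) -> Isoform:
    """Choose the isoform whose sequence represents the mir in a cluster group."""
    if not isoforms:
        raise ValueError("no isoforms given")
    if group == 3:
        # Group 3: highest copy#, regardless of error#.
        return max(isoforms, key=lambda iso: iso.copies)
    if group in (1, 2, 4):
        # Groups 1, 2 and 4: lowest error# first, then highest copy# (assumed tie-break).
        return min(isoforms, key=lambda iso: (iso.errors, -iso.copies))
    raise ValueError(f"unknown cluster group: {group}")


# Made-up isoforms for illustration only.
candidates = [
    Isoform("ACGTACGTACGTACGTACGTA", errors=0, copies=10),
    Isoform("ACGTACGTACGTACGTACGTAC", errors=0, copies=50),
    Isoform("ACGTACGTACGTACGTACGTT", errors=1, copies=900),
]
print(pick_representative(candidates, group=1).copies)  # -> 50
print(pick_representative(candidates, group=3).copies)  # -> 900
```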
More informationBioinformatics tools for phylogeny and visualization. Yanbin Yin
Bioinformatics tools for phylogeny and visualization Yanbin Yin 1 Homework assignment 5 1. Take the MAFFT alignment http://cys.bios.niu.edu/yyin/teach/pbb/purdue.cellwall.list.lignin.f a.aln as input and
More informationProtein Synthesis. Unit 6 Goal: Students will be able to describe the processes of transcription and translation.
Protein Synthesis Unit 6 Goal: Students will be able to describe the processes of transcription and translation. Protein Synthesis: Protein synthesis uses the information in genes to make proteins. 2 Steps
More informationPlease click the link below to view the YouTube video offering guidance to purchasers:
Guide Contents: Video Guide What is Quick Quote? Quick Quote Access Levels Your Quick Quote Control Panel How do I create a Quick Quote? How do I Distribute a Quick Quote? How do I Add Suppliers to a Quick
More informationFirefly Luciferase 1. ATP + Luciferin AMP + Oxyluciferin + Light (565 nm)
Berthold Detection Systems GmbH Bleichstrasse 56 68 D-75173 Pforzheim/Germany Phone: +49(0)7231/9206-0 Fax: +49(0)7231/9206-50 E-Mail: contact@berthold-ds.com Internet: www.berthold-ds.com Dual Luciferase
More informationRibo-Zero Magnetic Kit*
Ribo-Zero Magnetic Kit* (Bacteria) Cat. No. MRZMB126 (Contains 1 box of Cat. No. RZMB11086 and 1 box of Cat. No. MRZ116C) Cat. No. MRZB12424 24 Reactions (Contains 1 box of Cat. No. RZMB12324 and 1 box
More informationDEVELOP YOUR LAB, YOUR WAY
Technical brochure DEVELOP YOUR LAB, YOUR WAY The FLOW Solution: Expand your potential today Redesign your lab with the FLOW Solution The FLOW Solution is a highly flexible modular, semi-automated data
More informationRNA- seq read mapping
RNA- seq read mapping Pär Engström SciLifeLab RNA- seq workshop October 216 IniDal steps in RNA- seq data processing 1. Quality checks on reads 2. Trim 3' adapters (opdonal (for species with a reference
More information