New pipeline against build37 with GATK integrated

1. Look up the existing pipeline
2. Any additional improvements
3. Submit jobs directly from the lsrc cluster
4. GATK unsolved issue (read group)
5. Revisit the proposal Jessica sent earlier: Pipeline_upgrade_plan
6. It turned out that we are not going to put 30 computing hours into our existing pipeline. (We shall keep this in the back of our minds for the future GATK integration.)

OK, updating the pipeline details on the internal bioinformatics wiki

Getting ready for the new pipeline

Step 1 Copy the working script /nfs/chgv/seqpipe01/SOFTWARE/alignment_scripts/bwa_seqpipe01.pl over to lsrc under subversion control
Step 2 Copy the working “b37” script to lsrc under subversion control: bwa59_b37_v1.pl
Step 3 Compare with the existing pipeline bwa55_b36_v1.pl and make changes accordingly
Step 4 Work on the reference setting on dscr

Major updates:

1. Reference genome b36 -> b37
2. Software update

a. bwa (0.5.9)
1) -I in “aln” command: input qualities are in Illumina 1.3+ format and are converted to Sanger’s scoring system
2) -P in “sampe” command
b. samtools (0.1.12a and above)
c. Picard
d. Erds (1.0.3)
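As a rough sketch of how the two bwa options above fit into a paired-end run, the helper below only builds the command lines and does not execute them. The function name, the file names and the thread count are hypothetical and are not taken from bwa59_b37_v1.pl; the flags themselves (-I, -t and -f for “aln”, -P and -f for “sampe”) are those documented for bwa 0.5.x.

```python
import shlex


def bwa_commands(ref, fq1, fq2, out_sam, illumina_quals=True, threads=4):
    """Build bwa 0.5.x command lines for one paired-end sample (nothing is run)."""
    sai1, sai2 = fq1 + ".sai", fq2 + ".sai"
    aln = ["bwa", "aln", "-t", str(threads)]
    if illumina_quals:
        # -I: input qualities are Illumina 1.3+ (Phred+64); bwa rescales them
        aln.append("-I")
    cmds = [
        aln + ["-f", sai1, ref, fq1],
        aln + ["-f", sai2, ref, fq2],
        # -P: load the whole FM-index into memory to cut down on disk access
        ["bwa", "sampe", "-P", "-f", out_sam, ref, sai1, sai2, fq1, fq2],
    ]
    return [" ".join(shlex.quote(arg) for arg in cmd) for cmd in cmds]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only
    for line in bwa_commands("b37.fa", "s_1.fq", "s_2.fq", "s.sam"):
        print(line)
```

If the reads are already on the Sanger scale, pass illumina_quals=False so that -I is left out.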

3. Variant calling procedure

a. pileup -> mpileup
b. samtools format -> vcf format
c. bco files/directory generation
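For reference, the mpileup-based calling in samtools 0.1.x is usually a two-stage pipe through bcftools and vcfutils.pl. The sketch below only assembles those two shell lines as strings; the function name, the output naming scheme and the depth cutoff of 100 are assumptions for illustration, not the settings of our pipeline.

```python
import shlex


def mpileup_pipeline(ref, bam, prefix, max_depth=100):
    """Build samtools 0.1.x style calling commands (nothing is run).

    Stage 1 writes raw calls as BCF, stage 2 filters them into VCF.
    The names <prefix>.raw.bcf and <prefix>.flt.vcf are hypothetical.
    """
    ref_q, bam_q = shlex.quote(ref), shlex.quote(bam)
    raw_bcf = shlex.quote(prefix + ".raw.bcf")
    flt_vcf = shlex.quote(prefix + ".flt.vcf")
    # mpileup -u: uncompressed BCF on stdout, -f: reference FASTA
    call = (
        f"samtools mpileup -uf {ref_q} {bam_q} "
        f"| bcftools view -bvcg - > {raw_bcf}"
    )
    # varFilter -D: drop sites whose read depth exceeds max_depth
    filt = (
        f"bcftools view {raw_bcf} "
        f"| vcfutils.pl varFilter -D{int(max_depth)} > {flt_vcf}"
    )
    return [call, filt]


if __name__ == "__main__":
    for line in mpileup_pipeline("b37.fa", "sample.bam", "sample"):
        print(line)
```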

4. Coverage computation

a. exon concordance (export from sva)
b. script to convert

5. Data transfer

a. rsync -> aspera
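A minimal sketch of what the switch in transfer tool looks like on the command line, assuming the Aspera client is invoked as ascp. The function name, the 100 Mbps rate cap and the host and paths are hypothetical; -l (maximum rate) and -k (resume level) are standard ascp options, but the flags actually needed depend on the local Aspera setup.

```python
import shlex


def transfer_command(src, dest, tool="ascp", rate_mbps=100):
    """Build a transfer command line for rsync or ascp (nothing is run)."""
    if tool == "ascp":
        # -l: cap on transfer rate, -k 1: resume a partially copied file
        cmd = ["ascp", "-l", f"{rate_mbps}m", "-k", "1", src, dest]
    elif tool == "rsync":
        # -a: archive mode, -v: verbose
        cmd = ["rsync", "-av", src, dest]
    else:
        raise ValueError(f"unknown transfer tool: {tool}")
    return " ".join(shlex.quote(arg) for arg in cmd)


if __name__ == "__main__":
    # Hypothetical host and paths, for illustration only
    print(transfer_command("sample.bam", "user@remote-host:/data/incoming/"))
    print(transfer_command("sample.bam", "user@remote-host:/data/incoming/", tool="rsync"))
```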
