1. Look up the existing pipeline
2. Any additional improvements
3. Submit jobs directly from the lsrc cluster
4. Unresolved GATK issue (read groups)
5. Revisit the proposal Jessica sent earlier (Pipeline_upgrade_plan)
6. It turned out that we are not going to spend 30 computing hours on our existing pipeline. (We should keep this in the back of our minds for the future GATK integration.)
OK, updating the pipeline details on the internal bioinformatics wiki
Getting ready for the new pipeline
Step 1 Copy the working script /nfs/chgv/seqpipe01/SOFTWARE/alignment_scripts/bwa_seqpipe01.pl over to lsrc under subversion control
Step 2 Copy the working “b37” script to lsrc under subversion control: bwa59_b37_v1.pl
Step 3 Compare it with the existing pipeline script bwa55_b36_v1.pl and make changes accordingly
Step 4 Work on the reference settings on dscr
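Steps 1 to 3 can be sketched roughly as follows. This is only an illustration, assuming a local subversion working copy on lsrc; the working-copy path, the commit message, and the helper names `svn_import_commands` and `script_diff` are hypothetical and do not come from the pipeline itself. The sketch only builds the commands and the diff, it does not run anything.

```python
import difflib
import posixpath


def svn_import_commands(script_path, working_copy, message):
    """Build the commands that copy one script into an svn working copy
    and commit it. The commands are returned, not executed."""
    target = posixpath.join(working_copy, posixpath.basename(script_path))
    return [
        ["cp", script_path, target],
        ["svn", "add", target],
        ["svn", "commit", "-m", message, target],
    ]


def script_diff(old_lines, new_lines, old_name, new_name):
    """Return a unified diff between two script versions as a list of lines."""
    return list(
        difflib.unified_diff(
            old_lines, new_lines, fromfile=old_name, tofile=new_name, lineterm=""
        )
    )


if __name__ == "__main__":
    commands = svn_import_commands(
        "/nfs/chgv/seqpipe01/SOFTWARE/alignment_scripts/bwa_seqpipe01.pl",
        "/path/to/lsrc/working_copy",  # hypothetical working-copy location
        "Import working alignment script",  # hypothetical commit message
    )
    for cmd in commands:
        print(" ".join(cmd))
```

Passing the lines of bwa55_b36_v1.pl and bwa59_b37_v1.pl to `script_diff` would list the changes to review in Step 3.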
Major updates:
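1. Reference genome b36 -> b37
2. Software update
a. bwa (0.5.9)
1) -I in the “aln” command (tells bwa that the input qualities are in Illumina 1.3+ encoding, so they are converted to Sanger’s scoring system)
2) -P in the “sampe” command (loads the entire FM-index into memory to reduce disk operations)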
b. samtools (0.1.12a and above)
c. Picard
d. Erds (1.0.3)
3. Variant calling procedure
a. pileup -> mpileup
b. samtools format -> vcf format
c. bco files/directory generation
4. Coverage computation
a. exon concordance (export from sva)
b. script to convert
5. Data transfer
a. rsync -> aspera
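To make items 2 and 3 concrete, here is a minimal sketch of what the updated alignment and variant-calling commands might look like. It assumes paired-end FASTQ input in Illumina 1.3+ quality encoding (the case where -I applies) and uses the samtools 0.1.x style mpileup/bcftools invocation from the samtools documentation. The file names and the helper functions `bwa_commands` and `mpileup_commands` are hypothetical, and the actual options used in bwa59_b37_v1.pl may differ.

```python
import shlex


def bwa_commands(ref, fq1, fq2, prefix):
    """Build bwa 0.5.x command lines for one paired-end sample.

    -I (aln): input qualities are in Illumina 1.3+ encoding.
    -P (sampe): load the whole FM-index into memory.
    """
    q = shlex.quote
    sai1, sai2 = prefix + "_1.sai", prefix + "_2.sai"
    return [
        f"bwa aln -I {q(ref)} {q(fq1)} > {q(sai1)}",
        f"bwa aln -I {q(ref)} {q(fq2)} > {q(sai2)}",
        f"bwa sampe -P {q(ref)} {q(sai1)} {q(sai2)} {q(fq1)} {q(fq2)} "
        f"> {q(prefix + '.sam')}",
    ]


def mpileup_commands(ref, bam, prefix):
    """Build samtools 0.1.x style variant-calling commands:
    mpileup replaces pileup, and the calls end up in VCF format."""
    q = shlex.quote
    raw_bcf = prefix + ".raw.bcf"
    return [
        f"samtools mpileup -uf {q(ref)} {q(bam)} "
        f"| bcftools view -bvcg - > {q(raw_bcf)}",
        f"bcftools view {q(raw_bcf)} > {q(prefix + '.raw.vcf')}",
    ]


if __name__ == "__main__":
    # All file names below are placeholders for illustration only.
    for line in bwa_commands("b37.fa", "s_1.fq", "s_2.fq", "sample"):
        print(line)
    for line in mpileup_commands("b37.fa", "sample.sorted.bam", "sample"):
        print(line)
```

The functions return shell command strings rather than running them, so the generated lines can be checked against the real script before anything is submitted to the cluster.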