Review on samtools :-)

It is time to review samtools, since there have been many new releases since v0.1.7.

Jessica Maia has been comparing the features and analysis results of samtools version 0.1.7 vs. v0.1.12, although v0.1.13 (which changed the order of genotypes displayed in the variant files to conform to the Variant Call Format 4.1 spec) and v0.1.14 (gender-aware variant calls on the X chromosome) have already come out.

mpileup has been a good tool for calling SNPs and indels and reporting them in VCF format; I need to make sure it works in my hands.

A copy of samtools v0.1.12a is located at: /nfs/seqsata01/ALIGNMENT/Software/samtools-0.1.12a

Getting a BAM file with RG tags added is giving me quite a headache!

It is quite confusing, among samtools, awk, qsub…

This command seems to work from the command line: samtools view -h /nfs/testrun/GATK-recal/samples/dukscz0106/precal/Run1/s.5.bam | cat /nfs/testrun/GATK-recal/samples/dukscz0106/precal/rg_temp.txt - | awk '{ if (substr($1,1,1)=="@") print; else printf "%s\tRG:Z:ga\n",$0; }' | samtools view -uS - | samtools rmdup - - | samtools rmdup -s - /nfs/testrun/GATK-recal/samples/dukscz0106/precal/aln.bam
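To see what the awk step in that pipeline does on its own, here is a minimal sketch that runs it on a made-up two-line SAM fragment instead of real samtools output. The function name tag_read_group and the sample record are only for illustration; the awk program is the one from the command above.

```shell
# Header lines (starting with "@") pass through unchanged; every alignment
# record gets a tab-separated RG:Z:ga tag appended.
tag_read_group() {
  awk '{ if (substr($1,1,1)=="@") print; else printf "%s\tRG:Z:ga\n",$0; }'
}

# Hypothetical SAM fragment: one header line and one alignment record.
sample=$(printf '@HD\tVN:1.0\tSO:unsorted\nread1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\n')

tagged=$(printf '%s\n' "$sample" | tag_read_group)
printf '%s\n' "$tagged"
```

The header line should come out untouched and the read line should end with the RG:Z:ga tag, which is what samtools view -uS then reads back in.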

However, it does not seem to work when it gets wrapped into a qsub script. Problem solved; maybe it was some weird characters.
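If weird characters really were the culprit (curly quotes or long dashes pasted in place of plain ones are a common case), a quick check along these lines might catch them before submitting the job. This is only a sketch: find_non_ascii is a made-up helper name, the job script here is a throwaway temp file, and it assumes a grep that accepts -a (GNU and BSD grep do).

```shell
# Print every line (with its line number) that contains a byte outside
# printable ASCII and whitespace. Exit status is 0 when such lines exist.
find_non_ascii() {
  LC_ALL=C grep -a -n '[^[:print:][:space:]]' "$1"
}

# Hypothetical job script: line 2 uses curly quotes, written here as the
# octal escapes of their UTF-8 bytes so this example itself stays plain ASCII.
job=$(mktemp)
printf 'echo "plain quotes"\necho \342\200\234curly quotes\342\200\235\n' > "$job"

report=$(find_non_ascii "$job")
printf '%s\n' "$report"
rm -f "$job"
```

Only the second line of the temp file should be reported; a clean script produces no output at all.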

OK, let’s try the following steps:

1. Sort the individual .bam file: echo "$sam_dir/samtools sort $run_dir/s.$lane.bam $run_dir/sorted.Run1.$lane"

2. Merge and add @RG information: $samtoolsCMD merge -rh /nfs/testrun/GATK-recal/samples/dukscz0106/precal/rg.txt $mergedBAM $dataDir/s.1.bam $dataDir/s.2.bam $dataDir/s.3.bam $dataDir/s.4.bam $dataDir/s.5.bam $dataDir/s.6.bam $dataDir/s.7.bam $dataDir/s.8.bam

3. Remove PCR duplicates:
#Picard
java -jar -Xmx14g $picard_dir/MarkDuplicates.jar TMP_DIR=$combined_dir VALIDATION_STRINGENCY=SILENT INPUT=$combined_bam OUTPUT=$combined_rmdup_bam METRICS_FILE=$duplicate_metrics REMOVE_DUPLICATES=true MAX_SEQUENCES_FOR_DISK_READ_ENDS_MAP=5000000 VERBOSITY=WARNING ASSUME_SORTED=true
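For the merge in step 2, the rg.txt file has to carry the @RG header lines. As far as I understand the samtools manual, merge -r tags each read with an RG value inferred from the name of the file it came from, so the header IDs would need to match s.1 ... s.8. A rough sketch of generating such a header follows; make_rg_header is a made-up helper, and the SM, LB and PL values are placeholders I have assumed, not the real ones for this sample.

```shell
# Write one @RG header line per lane file. The ID matches the input file name
# without ".bam" (s.1 ... s.8); SM, LB and PL are assumed placeholder values.
make_rg_header() {
  sample=$1
  for lane in 1 2 3 4 5 6 7 8; do
    printf '@RG\tID:s.%s\tSM:%s\tLB:%s\tPL:ILLUMINA\n' "$lane" "$sample" "$sample"
  done
}

rg_header=$(make_rg_header dukscz0106)
printf '%s\n' "$rg_header"    # redirect this output to rg.txt for merge -rh
```

It is worth checking the merged header with samtools view -H afterwards to confirm the IDs line up with the RG tags on the reads.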
