A good wiki site to start off with
The same question has been asked before. The answer? I don’t know.
Another post on seqanswers comparing different aligners.
The SAM file tag format, with the numeric tags explained.
Picard also provides an online SAM tag checker.
The SAM format page (SAM_format) helps to extract “perfect match” reads.
The special tags that BWA adds to its alignments are described here.
I got help from our collaborator, Florian (my own note).
To get all flag counts:
samtools view ES583_miRbaseMature_bwa.bam | grep -v "@" | awk -F"\t" 'BEGIN{print "flag\toccurrences"} {a[$2]++} END{for(i in a)print i"\t"a[i]}'
flag    occurrences
4       2109951
20      1096
0       1190274
16      10244
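To read the four flag values above, it helps to decode the FLAG bits. The sketch below is only an illustration: the function names `count_flags` and `describe_flag` are hypothetical, and it works on SAM text from stdin rather than calling samtools. It uses two bits defined in the SAM specification, 0x4 (read unmapped) and 0x10 (reverse strand). Note that `samtools view` prints no header unless asked to, and that `grep -v "@"` would also drop any read whose quality string contains `@`, so the sketch skips only lines that start with `@`.

```shell
# Count FLAG values (column 2) in SAM text read from stdin.
# Only header lines (those starting with "@") are skipped.
count_flags() {
  awk -F'\t' '!/^@/ { n[$2]++ } END { for (f in n) print f "\t" n[f] }' | sort -n
}

# Describe a FLAG using only bits 0x4 (unmapped) and 0x10 (reverse strand).
# Other bits are ignored in this sketch.
describe_flag() {
  if [ $(( $1 & 4 )) -ne 0 ]; then
    if [ $(( $1 & 16 )) -ne 0 ]; then
      echo "unmapped (0x10 also set)"
    else
      echo "unmapped"
    fi
  elif [ $(( $1 & 16 )) -ne 0 ]; then
    echo "mapped, reverse strand"
  else
    echo "mapped, forward strand"
  fi
}

# Decode the four flags seen in the counts.
for f in 0 4 16 20; do
  printf '%s\t%s\n' "$f" "$(describe_flag "$f")"
done
```

Under this reading, flag 0 is a read mapped to the forward strand, 16 is a read mapped to the reverse strand, and 4 and 20 are both unmapped reads.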
Now, let’s focus on flag 0
Scenario I
NNNNNNNNNNN TGAGGTAGTAGGTTGTATAGTTNNNNNNNNNNN
pppppppppppGTGAGGTAGTAGGTTGTATAGTT
With one base off : NM:i:1
With one mismatch in the alignment : XM:i:1
With one ambiguous base in the reference: XN:i:1
Scenario II, same as I, but reported differently by BWA
NNNNNNNNNNN TGAGGTAGTAGGTTGTATAGTTNNNNNNNNNNN
pppppppppppGTGAGGTAGTAGGTTGTATAGTT (in scenario I)
pppppppppppTTGAGGTAGTAGGTTGTATAGTT
With no base off : NM:i:0
With no mismatch in the alignment : XM:i:0
With one ambiguous base in the reference: XN:i:1
Scenario III, same as I, but shifted to the right
NNNNNNNNNNN TGAGGTAGTAGGTTGTATAGTTNNNNNNNNNNN
pppppppppppGTGAGGTAGTAGGTTGTATAGTT (in scenario I)
ppppppppppp TGAGGTAGTAGGTTGTATAGTTT
With one base off : NM:i:1
With one mismatch in the alignment : XM:i:1
With one ambiguous base in the reference: XN:i:1
Scenario IV, a perfect case
NNNNNNNNNNNTGAGGTAGTAGGTTGTATAGTTNNNNNNNNNNN
pppppppppppTGAGGTAGTAGGTTGTATAGTT
With no base off : NM:i:0
With no mismatch in the alignment : XM:i:0
With no ambiguous base in the reference: XN:i:0
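The four scenarios suggest that a “perfect match” needs all three tags to be zero, since scenario II has NM:i:0 and XM:i:0 but still XN:i:1. A minimal sketch of such a filter follows. The function name `perfect_matches` is hypothetical, it reads SAM text from stdin, and it assumes that an absent XN tag means zero, because an aligner may leave out a tag whose value is 0. That assumption should be checked against the actual output.

```shell
# Keep only flag-0 alignments with NM:i:0, XM:i:0 and XN:i:0.
# Assumption: a missing XN tag counts as 0; NM and XM must be present.
perfect_matches() {
  awk -F'\t' '
    /^@/    { next }                 # skip header lines
    $2 != 0 { next }                 # keep flag 0 only
    {
      nm = ""; xm = ""; xn = "0"
      for (i = 12; i <= NF; i++) {   # optional tags start at column 12
        if      ($i ~ /^NM:i:/) nm = substr($i, 6)
        else if ($i ~ /^XM:i:/) xm = substr($i, 6)
        else if ($i ~ /^XN:i:/) xn = substr($i, 6)
      }
      if (nm == "0" && xm == "0" && xn == "0") print
    }'
}
```

It could be used as `samtools view ES583_miRbaseMature_bwa.bam | perfect_matches`, which would keep scenario IV and drop scenarios I to III.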