Learning and using MITOMAP

Today, I registered with MITOMAP and started to learn how to use it.

It is quite interesting that deletions often occur at direct repeats, e.g.

DelJunction   DelSize   RepeatLocation            RepeatType
10058:14593   -4534     10059-10063/14593-14597   D, 5/5
10169:14435   -4265     10161-10165/14424-14428   D, 5/5

In the first case, the deletion starts at 10058, right before the first repeat (10059-10063), and ends at 14593, right where the second “direct repeat” (14593-14597) starts. As a result, one repeat (10059-10063) disappears.

In the second case, the deletion starts at 10169, after the first repeat (10161-10165), and ends at 14435, so the deleted span contains the second “direct repeat” (14424-14428). As a result, one repeat (14424-14428) disappears.
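
The two cases can be checked with a few lines of Python. This is only a sketch: it assumes that a junction `a:b` names the last retained base before the deletion and the first retained base after it, so the deleted bases are a+1 through b-1. That reading is my assumption, although it does reproduce both DelSize values in the table. The function names are made up for illustration.

```python
def deleted_interval(junction):
    """Return (first, last) deleted base, 1-based inclusive, for a junction 'a:b'.

    Assumption: a is the last retained base, b is the first retained base.
    """
    left, right = (int(x) for x in junction.split(":"))
    return left + 1, right - 1


def repeats_lost(junction, repeat_location):
    """List the repeat copies that lie entirely inside the deleted span."""
    first, last = deleted_interval(junction)
    lost = []
    for copy in repeat_location.split("/"):
        start, end = (int(x) for x in copy.split("-"))
        if first <= start and end <= last:
            lost.append((start, end))
    return lost


rows = [
    ("10058:14593", "10059-10063/14593-14597"),
    ("10169:14435", "10161-10165/14424-14428"),
]
for junction, repeats in rows:
    first, last = deleted_interval(junction)
    print(junction, "deleted length:", last - first + 1,
          "lost repeat:", repeats_lost(junction, repeats))
```

Under that assumption, exactly one repeat copy falls inside the deleted span in each row, which matches the description above.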

So, understanding these scenarios better helps when implementing the analysis.

There is a Perl module called MitoMaster, developed by the MITOMAP team. It turns out to be a good tool.

Now, with our mitochondrial project, we can search for possible “repeats” that flank a deletion.

For a given deletion junction: 10058:14593

  • Case I: search up to 15 bp after the break starts (10059-10074) against up to 15 bp right after the break ends (starting at 14593)
  • Case II: search up to 15 bp before the break starts (10043-10058) against up to 15 bp before the break ends (14578-14593)
  • For each case, "number of repeat bases" = min(15 bp, lengthOfDel)
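
The window arithmetic for the two cases can be sketched in Python as below. This is an illustration rather than the pipeline itself: the function names are hypothetical, the window endpoints simply follow the ranges written in the list (so each range is inclusive), the end of the second Case I window is extrapolated the same way, and the repeat search is reduced to "longest shared run of bases" using the standard-library `difflib`, which stands in for a real local alignment.

```python
from difflib import SequenceMatcher


def flank_windows(break_start, break_end, del_len, max_flank=15):
    """Return the coordinate windows for Case I and Case II.

    n follows the rule above: min(15 bp, lengthOfDel).
    """
    n = min(max_flank, del_len)
    case1 = ((break_start + 1, break_start + 1 + n), (break_end, break_end + n))
    case2 = ((break_start - n, break_start), (break_end - n, break_end))
    return case1, case2


def longest_shared_run(left, right):
    """Longest stretch of identical bases found in both window sequences."""
    matcher = SequenceMatcher(None, left, right, autojunk=False)
    m = matcher.find_longest_match(0, len(left), 0, len(right))
    return left[m.a:m.a + m.size]


case1, case2 = flank_windows(10058, 14593, 4534)
print("Case I :", case1)
print("Case II:", case2)

# Toy window sequences (made up, not real mtDNA) to show the comparison step.
print(longest_shared_run("TTACCTCCGAT", "GGACCTCCAAA"))
```

In practice the two window sequences would be sliced from the reference genome at the returned coordinates before being compared.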

I got help from T-Coffee for the local-alignment Perl scripts.

While working on this project, I ran into a LOT of hassle with blat and its output, especially when determining the “deletion” start and stop. It also has something to do with “zero”-based versus “one”-based coordinates across different systems. Therefore, I created a separate post, which will hopefully help me get to the bottom of these issues.
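
To keep the two conventions straight, here is a tiny Python sketch of the conversion. It assumes the usual description of PSL output as zero-based, half-open (start counted from 0, end not included) and of position lists such as deletion junctions as one-based, inclusive. The helper names are hypothetical.

```python
def zero_half_open_to_one_closed(start, end):
    """[start, end) counted from 0  ->  [start, end] counted from 1."""
    return start + 1, end


def one_closed_to_zero_half_open(start, end):
    """[start, end] counted from 1  ->  [start, end) counted from 0."""
    return start - 1, end


# A block that starts at zero-based position 200 and is 4 bases long
# covers [200, 204) zero-based, i.e. bases 201..204 one-based.
print(zero_half_open_to_one_closed(200, 200 + 4))
```

The length of an interval is end - start in the zero-based, half-open form and end - start + 1 in the one-based, inclusive form, which is where most of the off-by-one confusion comes from.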

I found a very interesting alignment with G451E.

In this directory: /ddn/gs1/home/li11/project2014/Copeland/IlluminaData/blatApp/gapHeatMap/
I issued this command: awk -F"\t" '{ if ($18 == 3 && $8 == 16054) print $1, "\t", $8, "\t", $9, "\t", $10, "\t", $18, "\t", $19, "\t", $20, "\t", $21 }' G451E_R1.psl

I found:

108 16054 - @HWI-M01825:53:000000000-A8MJM:1:1101:17663:13886 3 50,4,55, 191,241,245, 148,200,16256,
295 16054 - @HWI-M01825:53:000000000-A8MJM:1:1117:6130:11630 3 69,4,227, 0,69,73, 129,200,16256,
295 16054 + @HWI-M01825:53:000000000-A8MJM:1:2117:21755:9298 3 178,4,118, 0,178,182, 20,200,16256,
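
For further processing, the same selection can be sketched in Python. This assumes the standard 21-column PSL layout, where column 8 is tBaseInsert and column 18 is blockCount, matching the awk filter. The function names are hypothetical, and the demo row fills every column that was not printed above with a placeholder "0" and uses a shortened read name.

```python
def _int_list(text):
    """Parse a PSL comma-terminated list such as '50,4,55,' into [50, 4, 55]."""
    return [int(x) for x in text.rstrip(",").split(",")]


def parse_psl_line(line):
    """Return the fields used above from one PSL row, or None for header lines."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 21 or not fields[0].isdigit():
        return None
    return {
        "matches": int(fields[0]),       # column 1
        "tBaseInsert": int(fields[7]),   # column 8
        "strand": fields[8],             # column 9
        "qName": fields[9],              # column 10
        "blockCount": int(fields[17]),   # column 18
        "blockSizes": _int_list(fields[18]),
        "qStarts": _int_list(fields[19]),
        "tStarts": _int_list(fields[20]),
    }


def select_rows(lines, block_count=3, t_base_insert=16054):
    """Keep rows with the given blockCount and tBaseInsert, like the awk filter."""
    rows = (parse_psl_line(line) for line in lines)
    return [r for r in rows
            if r and r["blockCount"] == block_count
            and r["tBaseInsert"] == t_base_insert]


# Demo row: printed values from the first hit, placeholders elsewhere.
demo = "\t".join(["108"] + ["0"] * 6 + ["16054", "-", "read1"] + ["0"] * 7
                 + ["3", "50,4,55,", "191,241,245,", "148,200,16256,"])
print(select_rows([demo]))
```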

Based on my deletion detection rule, with one line per read in the order listed above:

diff1 = 200 - (148 + 4)  = 48
diff1 = 200 - (129 + 4)  = 67
diff1 = 200 - (20 + 4)   = 176

diff2 = 16256 - (200 + 50)   = 16006
diff2 = 16256 - (200 + 69)   = 15987 
diff2 = 16256 - (200 + 178)  = 15878 

 
totalDelLength = 16006 + 48   = 16054
totalDelLength = 15987 + 67   = 16054
totalDelLength = 15878 + 176  = 16054
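
As a cross-check on these totals, the target-side gaps can also be computed by pairing each block start with its own block size (tStarts[i] + blockSizes[i]), which is how a block's end is normally derived from a PSL row. Note that this sketch is not the rule used in the arithmetic above: the individual gaps come out differently (2 and 16052 for each read here), but every total is still 16054, the value in column 8. The function name is hypothetical.

```python
def target_gaps(block_sizes, t_starts):
    """Gap on the target between each pair of consecutive alignment blocks."""
    gaps = []
    for i in range(len(t_starts) - 1):
        block_end = t_starts[i] + block_sizes[i]   # end of block i (exclusive)
        gaps.append(t_starts[i + 1] - block_end)
    return gaps


# blockSizes and tStarts of the three reads listed above.
reads = [
    ([50, 4, 55], [148, 200, 16256]),
    ([69, 4, 227], [129, 200, 16256]),
    ([178, 4, 118], [20, 200, 16256]),
]
for sizes, starts in reads:
    gaps = target_gaps(sizes, starts)
    print(gaps, "total =", sum(gaps))
```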

#First one:

if (diff1 > 15)  ==> deletion is [153, 200]
DelStart = (148 + 4)  + 1 
DelEnd   = 200

if (diff2 > 15)  ==> deletion is [16055, 16256]
DelStart = 16006 + 48  + 1
DelEnd   = 16256

#Second one: 
if (diff1 > 15)  ==> deletion is [134, 200]
DelStart = (129 + 4)  + 1 
DelEnd   = 200

if (diff2 > 15)  ==> deletion is [16055, 16256]
DelStart = 15987 + 67  + 1
DelEnd   = 16256

#Third one:
if (diff1 > 15)  ==> deletion is [25, 200]
DelStart = (20 + 4)  + 1 
DelEnd   = 200

if (diff2 > 15)  ==> deletion is [16055, 16256]
DelStart = 15878 + 176  + 1
DelEnd   = 16256
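
The three worked cases follow the same pattern, so the rule can be written once as a function. The Python sketch below is a literal transcription of the arithmetic above for three-block alignments, with the threshold of 15 taken from the `if` conditions. It is not a validated deletion caller, and the function name is hypothetical.

```python
def deletions_from_three_blocks(block_sizes, t_starts, min_gap=15):
    """Apply the rule above to one read with exactly three blocks.

    Returns a list of (DelStart, DelEnd) pairs, one per gap larger than min_gap.
    """
    s1, s2, _ = block_sizes
    t1, t2, t3 = t_starts
    diff1 = t2 - (t1 + s2)          # e.g. 200 - (148 + 4)
    diff2 = t3 - (t2 + s1)          # e.g. 16256 - (200 + 50)
    calls = []
    if diff1 > min_gap:
        calls.append((t1 + s2 + 1, t2))
    if diff2 > min_gap:
        calls.append((diff2 + diff1 + 1, t3))
    return calls


reads = [
    ([50, 4, 55], [148, 200, 16256]),
    ([69, 4, 227], [129, 200, 16256]),
    ([178, 4, 118], [20, 200, 16256]),
]
for sizes, starts in reads:
    print(deletions_from_three_blocks(sizes, starts))
```

Run on the three reads, this reproduces the intervals listed above, including the shared [16055, 16256] call.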