When I tried to install GD module, I had error saying lsdgp 2.0.28 or higher is required.
Frank helped me with a command “yum install GD” 🙂
To get some modules work, sometimes one has to take care of the dependencies. i.e. Font::TTF
When I tried to install GD module, I had error saying lsdgp 2.0.28 or higher is required.
Frank helped me with a command “yum install GD” 🙂
To get some modules work, sometimes one has to take care of the dependencies. i.e. Font::TTF
At Steve’s lab meeting, I came across with this paper, published on Genome Research. It seems to be a good piece of software with full length documentation. Circos has been used in many scientific publications. However, to make it work, it is not trivial.
I am trying to document what I have and hopefully, I can make it working in my hand. Fingers crossed 🙂
Since I am the root on SeqBig, I am testing the installation on this machine. Really, the goal would be if I could create a virtualmachine so that I can give it to whoever wants to use it.
Step 1. Get all the CPAN modules installed. Not trivial though..
Honestly, it did not work out as well as I expected. Therefore, ONLY way I made it work is using my virtual machine.
Here is another related page that has more information.
In fact, there are many groups who are consistently searching for tools for draw a circular genome results like circos, i.e. a paper published on Bioinformatics call circleator does similar plots also.
Okay, I am trying to help a researcher for her project involving the MARCO genes. She was interested in the transcription binding site upstream of this gene.
I asked my college and got his help with ARE domain analysis. And, I also found this post done by Dave Tang ??
It is very helpful document.
To plot a sequence logo, researchers in the this community tend to use weblogo to plot
Here is the deal, I need to parse the “blat” output with two, three, and even four blocks to figure out where the deletion is exactly located:
Sample file format match T gap strand block blockSizes qStarts tStarts bases count 298 9962 + 3 69,205,24, 1,70,275, 4132,14162,14368, 299 63 - 3 5,107,188, 0,5,112, 2932,2999,3107, 298 1 + 2 136,163, 1,137, 2970,3107, 213 1 - 2 172,42, 82,257, 137,310, 4-blocks are yet to be implemented in the future
For 2-block positive strand case
298 1 + 2 136,163, 1,137, 2970,3107, 298 1 + 2 147,152, 1,148, 2299,2447, 299 96 + 2 293,6, 1,294, 14350,14739,
Solution is:
3107 - 2970 - 136 = 1 --> [3107, 3107] DelStart = 2970 + 136 + 1 [3107] DelEnd: = 2970 + 136 + 1 [3107] # Match stops at 3106 (2970 + 136), with 3107 missing, another match starts at 3108 2447 - 2299 - 147 = 1 --> [2447, 2447] DelStart = 2299 + 147 + 1 [2447] DelEnd: = 2299 + 147 + 1 [2447] # Match stops at 2446 (2299 + 147), with 2447 missing, another match starts at 2448 14739 - 14350 - 293 = 96 --> [14644, 14739] DelStart = 14350 + 293 + 1 [14644] DelEnd: = 14350 + 293 + 96 [14739] # Match stops at 14643 (14350 + 293 ), with [14644-14739] missing, another match starts at 14740
For 2-block negative strand case, same as POSITIVE strand !!!
298 402 - 2 7,293, 0,7, 263,672, #Query:@HWI-M01825:53:000000000-A8MJM:1:1101:16256:1461 672 - 7 - 263 = 402 --> [264, 270] [271, 672] # Match starts at 264 and stops at 270 with 7 bps aligned, breaks [271, 672] , match start at 673 # As a result ==> break starts at 271 (= 263 + 7 + 1), with 402 long deletion Therefore: DelStart: 263 + 7 + 1 [271] DelEnd: 263 + 7 + 402 [672] BLAT results details as following: ---------------------------------------------------------------- Side by Side Alignment 0001 ggggagggggtgatctaaaacactctttacgccggcttctattgacttgg 0050 <<<< |||||||||||||||||||||||||||||||||||||||||||||||||| <<<< 0965 ggggagggggtgatctaaaacactctttacgccggcttctattgacttgg 0916 0051 gttaatcgtgtgaccgcggtggctggcacgaaattgaccaaccctggggt 0100 <<<< |||||||||||||||||||||||||||||||||||||||||||||||||| <<<< 0915 gttaatcgtgtgaccgcggtggctggcacgaaattgaccaaccctggggt 0866 0101 tagtatagcttagttaaactttcgtttattgctaaaggttaatcactgct 0150 <<<< |||||||||||||||||||||||||||||||||||||||||||||||||| <<<< 0865 tagtatagcttagttaaactttcgtttattgctaaaggttaatcactgct 0816 0151 gtttcccgtgggggtgtggctaggctaagcgttttgagctgcattgctgc 0200 <<<< |||||||||||||||||||||||||||||||||||||||||||||||||| <<<< 0815 gtttcccgtgggggtgtggctaggctaagcgttttgagctgcattgctgc 0766 0201 gtgcttgatgcttgtcccttttgatcgtggtgatttagagggtgaactca 0250 <<<< ||||||||||||||| |||||||||||||||||||||||||||||||||| <<<< 0765 gtgcttgatgcttgttccttttgatcgtggtgatttagagggtgaactca 0716 0251 ctggaacgggggtgcttgcatgtgtaatcttactaagagctaa 0293 <<<< ||||||||||| ||||||||||||||||||||||||||||||| <<<< 0715 ctggaacggggatgcttgcatgtgtaatcttactaagagctaa 0673 ------------------------------------------------------------------- 0294 tggaaag 0300 <<<< ||||||| <<<< 0270 tggaaag 0264 -------------------------------------------------------------------
More on negative cases:
213 1 - 2 172,42, 82,257, 137,310, 298 1 - 2 37,262, 0,37, 3069,3107, 298 222 - 2 293,6, 1,294, 11425,11940, 298 4939 - 2 239,60, 1,240, 717,5895, 310 - 172 -137 = 1 --> [138, 309] ==> break starts at 310 (137 + 172 + 1 ) with 1 bp long deletion DelStart: 137 + 172 + 1 [310] DelEnd: 137 + 172 + 1 [310] 3107 - 37 -3069 = 1 --> [3070, 3106] ==> break starts at 3107 (3069 + 37 + 1) with 1 bp long deletion DelStart: 3069 + 37 + 1 [3107] DelEnd: 3069 + 37 + 1 [3107] 11940 - 293 - 11425 = 222 --> [11426, 11939] ==> braak starts at 11719 (11425 + 293 + 1) with 222 bps long deletion DelStart: 11425 + 293 + 1 [11719] DelEnd: 11425 + 293 + 222 [11940] 5895 - 239 - 717 = 4939 --> [718, 5894 ] ==> break starts at 957 (717 + 239 + 1) with 4939 bps long deletion DelStart: 717 + 239 + 1 [957] DelEnd: 717 + 239 + 4939 [5895]
Solution is: there was one-base off !!!
Solution is:
Sample read:
Hs_Mito_ref:
Case: 3-blocks positive strand
298 9962 + 3 69,205,24, 1,70,275, 4132,14162,14368, Score E Sequences producing significant alignments: (bits) Value gi|251831106|ref|NC_012920.1|_ChrM 437 e-122 gi|251831106|ref|NC_012920.1|_ChrM 135 7e-32 >gi|251831106|ref|NC_012920.1|_ChrM Length = 16569 Score = 437 bits (1127), Expect = e-122 Identities = 229/230 (100%) Strand = Plus / Plus Query: 71 caatctcaattacaatatatacaccaacaaacaatgttcaaccagtaactactactaatc 130 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| Sbjct: 14163 caatctcaattacaatatatacaccaacaaacaatgttcaaccagtaactactactaatc 14222 Query: 131 aacgcccataatcatacaaagcccccgcaccaataggatcctcccgaatcaaccctgacc 190 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| Sbjct: 14223 aacgcccataatcatacaaagcccccgcaccaataggatcctcccgaatcaaccctgacc 14282 Query: 191 cctctccttcataaattattcagcttcctacactattaaagtttaccacaaccaccaccc 250 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| Sbjct: 14283 cctctccttcataaattattcagcttcctacactattaaagtttaccacaaccaccaccc 14342 Query: 251 catcatactctttcacccacagcac-aatcctacctccatcgctaacccc 299 ||||||||||||||||||||||||| |||||||||||||||||||||||| Sbjct: 14343 catcatactctttcacccacagcaccaatcctacctccatcgctaacccc 14392 Score = 135 bits (348), Expect = 7e-32 Identities = 69/69 (100%) Strand = Plus / Plus Query: 2 catacccccgattccgctacgaccaactcatacacctcctatgaaaaaacttcctaccac 61 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| Sbjct: 4133 catacccccgattccgctacgaccaactcatacacctcctatgaaaaaacttcctaccac 4192 Query: 62 tcaccctag 70 ||||||||| Sbjct: 4193 tcaccctag 4201
Another 3 block examples with both breaks greater than 15
Score E Sequences producing significant alignments: (bits) Value gi|251831106|ref|NC_012920.1|_ChrM 472 e-133 gi|251831106|ref|NC_012920.1|_ChrM 92 7e-19 gi|251831106|ref|NC_012920.1|_ChrM 14 2e+05 >gi|251831106|ref|NC_012920.1|_ChrM Length = 16569 Score = 472 bits (1219), Expect = e-133 Identities = 244/246 (99%) Strand = Plus / Plus Query: 48 caattctccgatccgtccctaacaaactaggaggcgtccttgccctattactatccatcc 107 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| Sbjct: 15582 caattctccgatccgtccctaacaaactaggaggcgtccttgccctattactatccatcc 15641 Query: 108 tcatcctagcaataatccccatcctccatatatccaaacaacaaagcataatacttcgcc 167 ||||||||||||||||||||||||||||||||||||||||||||||||||||| |||||| Sbjct: 15642 tcatcctagcaataatccccatcctccatatatccaaacaacaaagcataatatttcgcc 15701 Query: 168 cactaagccaatcactttattgactcctagccgcagacctcctcattctaacctgaatcg 227 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| Sbjct: 15702 cactaagccaatcactttattgactcctagccgcagacctcctcattctaacctgaatcg 15761 Query: 228 gaggacaaccagtaagctacccttttaccatcattggaccagtagcatccgtactatact 287 ||||||||||||||||||||||||||||||||||||||| |||||||||||||||||||| Sbjct: 15762 gaggacaaccagtaagctacccttttaccatcattggacaagtagcatccgtactatact 15821 Query: 288 tcacaa 293 |||||| Sbjct: 15822 tcacaa 15827 Score = 92 bits (237), Expect = 7e-19 Identities = 47/47 (100%) Strand = Plus / Plus Query: 1 catcccgatggtgcagccgctattaaaggttcgtttgttcaacgatt 47 ||||||||||||||||||||||||||||||||||||||||||||||| Sbjct: 3004 catcccgatggtgcagccgctattaaaggttcgtttgttcaacgatt 3050 Score = 14 bits (36), Expect = 2e+05 Identities = 7/7 (100%) Strand = Plus / Plus Query: 294 cctgtcc 300 ||||||| Sbjct: 15885 cctgtcc 15891
Case: 3-blocks negative strand
296 10788 - 3 113,182,5, 0,113,295, 5320,16167,16403, Sequences producing significant alignments: (bits) Value gi|251831106|ref|NC_012920.1|_ChrM 347 1e-95 gi|251831106|ref|NC_012920.1|_ChrM 216 2e-56 gi|251831106|ref|NC_012920.1|_ChrM 10 3e+06 >gi|251831106|ref|NC_012920.1|_ChrM Length = 16569 Score = 347 bits (895), Expect = 1e-95 Identities = 179/182 (98%) Strand = Minus / Plus Query: 187 ccaatccacatcaaaaccccctcctcatgcttacaagcaagtacagcaatcaaccctcaa 128 |||||||||||||||||||||||| ||||||||||||||||||||||||||||||||||| Sbjct: 16168 ccaatccacatcaaaaccccctccccatgcttacaagcaagtacagcaatcaaccctcaa 16227 Query: 127 ctatcacacatcaactgcaactccaaagtcacccctcacccattaggataccaacaaacc 68 |||||||||||||||||||||||||||| ||||||||||||| ||||||||||||||||| Sbjct: 16228 ctatcacacatcaactgcaactccaaagccacccctcacccactaggataccaacaaacc 16287 Query: 67 tacccacccttaacagtacatagtacataaagccatttaccgtacatagcacattacagt 8 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| Sbjct: 16288 tacccacccttaacagtacatagtacataaagccatttaccgtacatagcacattacagt 16347 Query: 7 ca 6 || Sbjct: 16348 ca 16349 Score = 216 bits (558), Expect = 2e-56 Identities = 112/113 (99%) Strand = Minus / Plus Query: 300 catcaccctccttaacctttacttctacctacgcctaatctactccacctcaatcacact 241 |||||||||||||||||| ||||||||||||||||||||||||||||||||||||||||| Sbjct: 5321 catcaccctccttaacctctacttctacctacgcctaatctactccacctcaatcacact 5380 Query: 240 actccccatatctaacaacgtaaaaataaaatgacagtttgaacatacaaaac 188 ||||||||||||||||||||||||||||||||||||||||||||||||||||| Sbjct: 5381 actccccatatctaacaacgtaaaaataaaatgacagtttgaacatacaaaac 5433 Score = 10 bits (25), Expect = 3e+06 Identities = 5/5 (100%) Strand = Minus / Plus Query: 5 catcc 1 ||||| Sbjct: 16404 catcc 16408
Case: 3-blocks negative strand — bad example
275 13167 - 3 82,132,62, 24,106,238, 3024,3107,16405, Output: BLASTN 2.2.11 [blat] Reference: Kent, WJ. (2002) BLAT - The BLAST-like alignment tool Query= @HWI-M01825:53:000000000-A8MJM:1:1101:23159:10038 (300 letters) Database: ../../../reference/hs_mit_seq.fasta 1 sequences; 16,569 total letters Searching.done Score E Sequences producing significant alignments: (bits) Value gi|251831106|ref|NC_012920.1|_ChrM 405 e-113 gi|251831106|ref|NC_012920.1|_ChrM 122 6e-28 >gi|251831106|ref|NC_012920.1|_ChrM Length = 16569 Score = 405 bits (1044), Expect = e-113 Identities = 213/215 (99%) Strand = Minus / Plus Query: 276 attaaaggttcgtttgttcaacgattaaagtcctacgtgatctgagttcagaccggagta 217 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| Sbjct: 3025 attaaaggttcgtttgttcaacgattaaagtcctacgtgatctgagttcagaccggagta 3084 Query: 216 atccaggtcggtttctatctac-ttcaaattcctccctgtacgaaaggacaagagaaata 158 |||||||||||||||||||||| ||||||||||||||||||||||||||||||||||||| Sbjct: 3085 atccaggtcggtttctatctacnttcaaattcctccctgtacgaaaggacaagagaaata 3144 Query: 157 aggcctacttcacaaagcgccttcccccgtaaatgatatcatctcaacttagcattatac 98 |||||||||||||||||||||||||||||||||||||||||||||||||||| ||||||| Sbjct: 3145 aggcctacttcacaaagcgccttcccccgtaaatgatatcatctcaacttagtattatac 3204 Query: 97 ccacacccacccaagaacagggtttgttaagatgg 63 ||||||||||||||||||||||||||||||||||| Sbjct: 3205 ccacacccacccaagaacagggtttgttaagatgg 3239 Score = 122 bits (315), Expect = 6e-28 Identities = 62/62 (100%) Strand = Minus / Plus Query: 62 tcctccgtgaaatcaatatcccgcacaagagtgctactctcctcgctccgggcccataac 3 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| Sbjct: 16406 tcctccgtgaaatcaatatcccgcacaagagtgctactctcctcgctccgggcccataac 16465 Query: 2 ac 1 || Sbjct: 16466 ac 16467
Here is one , but we are missing biolperl-ext package.
This one gives me error.
An hand coded Smith & Waterman algorithm. It is clear!