methylKit seems like a promising R package for this.
Installation needs a little more attention. Here are the steps in detail:
# Dependencies
install.packages("data.table")
source("http://www.bioconductor.org/biocLite.R")
biocLite("GenomicRanges")
# Installation
download.file("http://methylkit.googlecode.com/files/methylKit_0.5.7.tar.gz", destfile="methylKit_0.5.7.tar.gz")
install.packages("methylKit_0.5.7.tar.gz", repos=NULL, type="source")
unlink("methylKit_0.5.7.tar.gz")
Well, a few hurdles were blocking me:
1. I can't get the CpG island annotation for mouse from UCSC when I follow the instructions: For CpG island annotation, select "Expression and Regulation" from the "group" drop-down menu. Following that, select "CpG islands" from the "track" drop-down menu. Select "BED - browser extensible data" for the "output format". Click "get output" and on the following page click "get BED" without changing any options. Save the output as a text file.
2. Our .sam file was not properly sorted. One extra step is needed:
grep -v '^[[:space:]]*@' raw.sam | sort -k3,3 -k4,4n > sorted.sam
3. It turns out that this sorting takes up so much of the /tmp/ space that no server is designed to handle it, except wine. Thanks go to Frank, who allowed me to access wine2 temporarily. Both servers work with this just fine.
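Since the sort step is what fills /tmp/, one thing that may help on the smaller servers is pointing sort at a scratch directory with more room. This is a minimal sketch, assuming a sort that supports -T (GNU and POSIX sort do). The function name sort_sam_body and the scratch path are hypothetical, and the explicit tab separator and LC_ALL=C are my additions, not part of the original command.

```shell
#!/bin/sh
# Hypothetical helper: sort SAM alignment lines by reference name (column 3)
# and position (column 4), writing sort's temporary files to a chosen
# directory instead of /tmp.
sort_sam_body() {
    in_sam=$1      # unsorted SAM file
    out_sam=$2     # sorted output; header lines are dropped, as in the command above
    scratch=$3     # directory with enough free space for sort's temporary files
    tab=$(printf '\t')

    mkdir -p "$scratch" || return 1
    # Same filter and sort keys as the original command; -T redirects temp files.
    grep -v '^[[:space:]]*@' "$in_sam" |
        LC_ALL=C sort -t "$tab" -k3,3 -k4,4n -T "$scratch" > "$out_sam"
}

# Example call (paths are placeholders):
# sort_sam_body raw.sam sorted.sam /path/to/big/scratch
```

Like the original one-liner, this drops the header lines, so they would have to be re-attached if a downstream tool needs them.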
It turns out that we have three separate libraries, with both raw and de-duplicated .sam files. So first I need to combine those three files before anything else can be done. Possible solutions are below.
Basically, I need to generate deduplicated, merged-library, Picard-sorted and reordered SAM files for each animal.
Let’s try samtools
- Convert .sam to .bam: samtools view -bS -o out.bam in.sam
- Sort .bam file: samtools sort out.bam out.bam.sorted (legacy samtools 0.1.x syntax: the second argument is an output prefix, so this writes out.bam.sorted.bam)
- Merge multiple .bam files: samtools merge merged.bam 1.out.bam.sorted.bam 2.out.bam.sorted.bam 3.out.bam.sorted.bam
- Sort merged .bam file: samtools sort merged.bam merged.sorted (writes merged.sorted.bam)
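Newer samtools releases (1.x) changed sort to take -o for the output file, so the legacy sort commands above need adjusting there. Here is a hedged sketch of the same four steps as a script for that newer syntax. The function names, the output naming scheme and the DRY_RUN switch are all hypothetical. With DRY_RUN=1 it only prints the commands, which is a cheap way to check them before touching real data.

```shell
#!/bin/sh
# Hypothetical wrapper around the four samtools steps listed above,
# written for the samtools 1.x syntax (sort takes -o for its output file).
# Set DRY_RUN=1 to print the commands instead of running them.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

merge_libraries() {
    prefix=$1; shift          # prefix for the merged output files
    sorted_list=""
    for sam in "$@"; do
        base=${sam%.sam}
        run samtools view -b -o "$base.bam" "$sam" || return 1
        run samtools sort -o "$base.sorted.bam" "$base.bam" || return 1
        sorted_list="$sorted_list $base.sorted.bam"
    done
    # Word splitting of $sorted_list is intentional; file names with spaces would break it.
    run samtools merge "$prefix.merged.bam" $sorted_list || return 1
    run samtools sort -o "$prefix.merged.sorted.bam" "$prefix.merged.bam"
}

# Example call (file names are placeholders):
# DRY_RUN=1; merge_libraries B6_M_1 lib1.sam lib2.sam lib3.sam
```

The samtools manual describes merge as taking sorted inputs and producing sorted output, so the final sort is probably redundant. I kept it only to mirror the steps listed above.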
Now, let’s take a look at picard tools
- Convert .sam to .bam:
- Merge multiple .bam files
- sort them
What about plain Unix commands? A quite simple Unix solution:
- Remove headers
tail -n +42 B6_M_1.L2x4.mm9.raw_bismark_deduplicated.sam > temp_B6_M_1.L2x4.mm9.raw_bismark_deduplicated.sam
tail -n +51 B6_M_1.L3x13.mm9.raw_bismark_deduplicated.sam > temp_B6_M_1.L3x13.mm9.raw_bismark_deduplicated.sam
- Concatenate them
cat B6_M_1.L1x2.mm9.raw_bismark_deduplicated.sam temp_B6_M_1.L2x4.mm9.raw_bismark_deduplicated.sam temp_B6_M_1.L3x13.mm9.raw_bismark_deduplicated.sam > B6_M_1.merged.sam
- Convert and sort them with picard tools
java -jar picard-tools-1.42/SortSam.jar INPUT=hello.sam OUTPUT=hello.sorted.sam CREATE_INDEX=false SO=coordinate COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=7500000 TMP_DIR=.
java -jar picard-tools-1.42/SortSam.jar INPUT=hello.sorted.sam OUTPUT=hello.sorted.bam CREATE_INDEX=true SO=coordinate COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=7500000 TMP_DIR=.
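The header-stripping and concatenation steps above depend on counting header lines by hand for each file (the +42 and +51). Since SAM header lines are the ones that start with '@', the same thing can probably be done without counting. This is a minimal sketch; the function name concat_sams is hypothetical. It keeps only the first file's header, as the cat command above does.

```shell
#!/bin/sh
# Hypothetical helper: concatenate several SAM files, keeping the header
# of the first file only. Header lines are the ones starting with '@',
# so there is no need to count them by hand for tail -n +N.
concat_sams() {
    out=$1; shift       # merged output file
    first=$1; shift     # first input, copied whole (header + alignments)
    cat "$first" > "$out" || return 1
    for sam in "$@"; do
        # grep exits 1 when a file has no alignment lines; treat that as fine.
        grep -v '^@' "$sam" >> "$out" || [ $? -eq 1 ] || return 1
    done
}

# Example call with the files from above:
# concat_sams B6_M_1.merged.sam \
#     B6_M_1.L1x2.mm9.raw_bismark_deduplicated.sam \
#     B6_M_1.L2x4.mm9.raw_bismark_deduplicated.sam \
#     B6_M_1.L3x13.mm9.raw_bismark_deduplicated.sam
```

This assumes all three libraries were aligned against the same reference, so the first header is valid for the merged file. Any header lines specific to the other libraries are lost.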