I found the FastQC report for Rick’s miRNA-seq experiment very confusing. I did not understand it and wanted to confirm what I had seen.
I downloaded the data from the Kuchen, S. et al. paper in Genome Research (2010), using SRR042443.sra as an example to evaluate the data quality. For details, see command_01112013.txt.
Using SRR042443.fastq as an example
Step 1: Run fastq-dump to convert the .sra file to FASTQ.
Step 2: Run the FastQC analysis.
Step 3: The results were similar to our in-house data, but I did not see the same level of "over-represented sequences" matching 100% to the RNA PCR primer sequences.
Step 4: Discussed with Adam, David and Pierre, and we decided to perform adapter sequence trimming.
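
The Step 3 check (how much of the library is made up of a few repeated sequences, and whether those look like primer) can be cross-checked outside FastQC. The following is a minimal sketch, not FastQC's own algorithm: it counts exact duplicate reads in a plain 4-line-per-record FASTQ and tests whether a read contains the start of a primer. The function names and the `min_len` cutoff are my own assumptions, and the primer sequence has to be supplied from the library prep documentation.

```python
from collections import Counter


def read_sequences(lines):
    """Yield the sequence line (2nd line of every 4-line FASTQ record)."""
    for i, line in enumerate(lines):
        if i % 4 == 1:
            yield line.strip()


def overrepresented(lines, top=5):
    """Return (sequence, count, fraction of all reads) for the most common reads."""
    counts = Counter(read_sequences(lines))
    total = sum(counts.values())
    return [(seq, n, n / total) for seq, n in counts.most_common(top)]


def contains_primer_start(seq, primer, min_len=10):
    """True if the read contains at least the first min_len bases of the primer.

    Exact matching only; min_len=10 is an arbitrary illustrative cutoff.
    """
    return primer[:min_len] in seq
```

Possible usage on the example file, where `primer` stands for the actual RNA PCR primer sequence: open `SRR042443.fastq`, pass the file handle to `overrepresented(fh, top=10)`, and call `contains_primer_start(seq, primer)` on each returned sequence.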
Strategy:
Step 1. Remove the "primer sequence", done by JYL (fastx-toolkit) or by AB (cutadapt).
Step 2. Align to miRBase (failed when using the input file with the primer sequence removed).
Step 3. Test the alignment with the raw file while trying to figure out the error.
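
To make clear what the trimming in Step 1 does to each read, here is a simplified sketch of 3' adapter removal. It is a stand-in for illustration only, not how fastx-toolkit or cutadapt are implemented: it uses exact matching with no mismatch tolerance, and the function names, `min_overlap` and `min_length` are assumptions of mine. Dropping reads that end up very short is also an assumption; these notes do not establish whether short or empty reads were the cause of the alignment failure in Step 2.

```python
def trim_adapter(seq, qual, adapter, min_overlap=3):
    """Cut the read at the 3' adapter; return the trimmed (seq, qual).

    Looks first for the full adapter anywhere in the read, then for a
    partial adapter (at least min_overlap bases) at the very end of the read.
    """
    idx = seq.find(adapter)
    if idx == -1:
        # Try progressively shorter adapter prefixes at the end of the read.
        longest = min(len(adapter), len(seq)) - 1
        for k in range(longest, min_overlap - 1, -1):
            if seq.endswith(adapter[:k]):
                idx = len(seq) - k
                break
    if idx == -1:
        return seq, qual  # no adapter found, read unchanged
    return seq[:idx], qual[:idx]


def trim_records(records, adapter, min_length=15):
    """Trim (name, seq, qual) records and drop reads shorter than min_length."""
    for name, seq, qual in records:
        s, q = trim_adapter(seq, qual, adapter)
        if len(s) >= min_length:
            yield name, s, q
```

Comparing the read-length distribution before and after trimming is one quick way to see whether the trimmed file still looks like a miRNA library (most inserts around 20 to 24 nt) before it goes into the aligner.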