Calling somatic variants

Posted on June 7, 2017 by Jin Tong

Came across a paper in BMC Genomics comparing five different somatic SNP callers.

GATK UnifiedGenotyper in NaiveSubtract
MuTect1
SomaticSniper from the original paper
Installation help document for Strelka with updated user's guide and a quick start guide
And the original paper on somatic-sniper
VarScan2 from the original paper
The classical samtools method also works.

Calling variants with samtools/bcftools

Help from samtools protocol
Help from samtools/bcftools protocol

Rmarkdown

Posted on June 6, 2017 by Jin Tong

Working with R Markdown seems quite beneficial: it keeps a good record of the analysis and renders an HTML or PDF report.

Group project with github

Posted on May 24, 2017 by Jin Tong

I need to work with Yicheng on a project hosted on my GitHub. Starting from my existing repository, I need to make a branch and work with him on it.

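	I found a help post, and I am following it step by step:
	git checkout -b Rproj_w_yicheng    # create the branch and switch to it
	git checkout Rproj_w_yicheng       # switch back to the branch later on
	git remote add ocri3-w-yicheng https://github.com/ImageRecognitionMaster/myOCRI-iii    # register the shared repository as a remote
	git branch                         # list local branches
	git branch -D Rproj_w_yicheng      # force-delete the local branch (cleanup only)
	git push origin :Rproj_w_yicheng   # delete the branch on origin (cleanup only)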

Here is the end result.

Python help docs

Posted on May 9, 2017 by Jin Tong

Here are a few highly regarded Python docs.

Official python3 online doc

NGS consolidated software/tools collection

Posted on May 9, 2017 by Jin Tong

bcbio-nextgen has a strong reputation, but it is hard to set up.
Getting Genetics Done by Stephen Turner on WES coverage with bedtools

Learning ggplot

Posted on February 14, 2017 by Jin Tong

ggplot manual is available

Goal 1: ggplot without grid (someone has a post on this)
Goal 2: Modify with no legend
Goal 3: Add x/y labels
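
Goals 1 to 3 can be combined in a single plot. Here is a minimal sketch, assuming the ggplot2 package is installed; the data frame `df` and its columns are made up for illustration and do not come from any real project.

```r
library(ggplot2)

# Made-up example data (hypothetical, for illustration only)
df <- data.frame(
  position = 1:10,
  depth    = c(12, 15, 9, 20, 18, 25, 22, 30, 28, 35),
  sample   = rep(c("A", "B"), each = 5)
)

p <- ggplot(df, aes(x = position, y = depth, colour = sample)) +
  geom_point() +
  theme_bw() +
  theme(panel.grid = element_blank()) +    # Goal 1: remove the grid lines
  theme(legend.position = "none") +        # Goal 2: drop the legend
  labs(x = "Position", y = "Read depth")   # Goal 3: set the x/y labels
```

Printing `p` at the console draws the figure. Note that theme_bw() comes before the other theme() calls; a complete theme added later would reset them.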

Goal 4: Plot SNP panels with multiple samples

After some research, I found this post very helpful; it can be used as a starting point.
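
One way to get one panel per sample is faceting. This is only a sketch under my own assumptions, not the approach from the post: the sample names, positions and allele frequencies below are randomly generated placeholders.

```r
library(ggplot2)

# Made-up allele frequencies for three hypothetical samples
set.seed(1)
snps <- data.frame(
  sample   = rep(c("S1", "S2", "S3"), each = 20),
  position = rep(seq(1000, 20000, by = 1000), times = 3),
  freq     = runif(60)
)

panels <- ggplot(snps, aes(x = position, y = freq)) +
  geom_point() +
  facet_wrap(~ sample, ncol = 1) +   # one panel per sample, stacked vertically
  labs(x = "Position (bp)", y = "Allele frequency")
```

Stacking the panels in one column keeps the x axis shared, so the same SNP lines up across samples.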

Goal 5: save a vector figure, .emf

p <- ggplot(someObject)
ggsave("someFigure.emf", p)   # as far as I know, the .emf device relies on grDevices::win.metafile(), so this works on Windows only

Working with EMMA

Posted on January 23, 2017 by Jin Tong

It has been a while, but I need to make sure that EMMA works for Dr. Kleeberger’s project

Understand the population genetics

A hands-on tutorial is here.

Understand the Mixed model

PLINK provides many good tools.

Implementation with R — I encountered some hurdles with R programming and would like to document them here before I forget.

If fctr.cols holds the names of your factor columns, convert them to character with lapply(), which keeps each column as its own vector even when only one column is selected:

X[fctr.cols] <- lapply(X[fctr.cols], as.character)
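
A small worked example of the conversion, for my own reference. The data frame is made up, and this sketch uses lapply() rather than sapply() so that nothing gets simplified into a matrix along the way.

```r
# Made-up data frame with two factor columns and one numeric column
X <- data.frame(
  strain = factor(c("B6", "C3H", "DBA")),
  sex    = factor(c("F", "M", "F")),
  weight = c(21.5, 24.1, 19.8)
)

# Find the factor columns by class instead of typing their names
fctr.cols <- names(X)[sapply(X, is.factor)]

# lapply() returns a list with one character vector per column
X[fctr.cols] <- lapply(X[fctr.cols], as.character)

str(X)   # strain and sex are now chr, weight is still num
```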

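To set every element of a matrix to zero while keeping its dimensions:

distances <- matrix(1:25, nrow = 5, ncol = 5)
Now, any of the following gives a 5 x 5 matrix of zeros:
apply(distances, c(1, 2), function(x) 0)   # returns a new matrix; distances is unchanged
distances * 0                              # returns a new matrix; distances is unchanged
distances[] <- 0L                          # overwrites distances in place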
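
The three ways of zeroing the matrix are not quite interchangeable: they differ in whether they modify the object and in the storage type of the result. A quick check, with variable names of my own choosing:

```r
distances <- matrix(1:25, nrow = 5, ncol = 5)

via_apply    <- apply(distances, c(1, 2), function(x) 0)  # new matrix
via_multiply <- distances * 0                             # new matrix

via_assign <- distances
via_assign[] <- 0L   # fills the copy in place, dimensions are kept

typeof(via_apply)     # "double"
typeof(via_multiply)  # "double"
typeof(via_assign)    # "integer"
```

The empty brackets in `via_assign[] <- 0L` are what preserve the matrix shape; a plain `via_assign <- 0L` would replace the whole matrix with a single number.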

Viewing/reporting the results

I found a very useful and handy help doc on Manhattan Plot
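
For a quick look without extra packages, a Manhattan plot can also be sketched in base R. Everything below is simulated: the five chromosomes of 200 markers each, the uniform p-values, and the Bonferroni line are my own placeholders, not output from EMMA.

```r
# Simulated GWAS results (hypothetical data, for illustration only)
set.seed(42)
gwas <- data.frame(
  chr = rep(1:5, each = 200),
  pos = rep(seq_len(200), times = 5),
  p   = runif(1000)
)

# Cumulative x coordinate so the chromosomes sit side by side
gwas$x <- gwas$pos + (gwas$chr - 1) * 200

plot(gwas$x, -log10(gwas$p),
     col = ifelse(gwas$chr %% 2 == 0, "grey40", "steelblue"),  # alternate colours by chromosome
     pch = 20, xaxt = "n",
     xlab = "Chromosome", ylab = "-log10(p)")
axis(1, at = (1:5 - 0.5) * 200, labels = 1:5)     # label each chromosome at its midpoint
abline(h = -log10(0.05 / nrow(gwas)), lty = 2)    # Bonferroni threshold
```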

Protected: Stat help doc

Posted on January 19, 2017 by Jin Tong

GWAS using EMMA

Posted on November 29, 2016 by Jin Tong

A good note on using EMMA

Or, I can get EMMAX from Ed Buckler’s GAPIT bundle.
Here FastMap has moved to a new location
Where is snpster??

Facts about EMMA

Kang, H. published EMMA in Genetics in 2008.

Kang, H. extended EMMA to EMMAX.

Zhou presents the EMMA approach in GEMMA. It seems to be very similar to EMMA, without citing EMMA’s work.
GenABEL seems to be quite popular in the community.

FastMap note

I need to take a note on this

Help Heather with PCA

Posted on October 21, 2016 by Jin Tong

Hi Jianying:

I hope you are doing well and having a nice week so far. I hate to
bother you with this question, but depending on whether I will have
the data to do so, I may be running a principal component analysis
on some past data from the lab. I need to speak with Dr. Kleeberger
tomorrow morning so that I can clarify some things, but if possible,
would you possibly be available to meet sometime tomorrow afternoon or
anytime on Friday so that I can discuss the principal component
analysis? I just want to be sure I do it correctly and I would feel
better knowing I had your expertise. If you do not have the time,
though, please do not worry and know that I understand.
Thank you for your consideration, regardless.

Sincerely,
Heather

Well, I found very good lecture notes from Penn State on principal component analysis.
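
As a starting point for the meeting, here is a minimal PCA sketch in base R. It assumes the lab data will be a numeric table with samples in rows; the built-in USArrests data set only stands in for it here.

```r
# USArrests ships with R; it is a placeholder for the lab data
pca <- prcomp(USArrests, center = TRUE, scale. = TRUE)

# Proportion of variance explained by each component
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
round(var_explained, 3)

# Scores of the first few rows on PC1 and PC2
head(pca$x[, 1:2])
```

Scaling (`scale. = TRUE`) matters when the variables are measured in different units; without it, the variable with the largest variance tends to dominate the first component.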