Python help docs

Posted on May 9, 2017 by Jin Tong

Here are a few highly regarded Python docs.

Official Python 3 online documentation

NGS consolidated software/tools collection

Posted on May 9, 2017 by Jin Tong

bcbio-nextgen has a high reputation, but it is hard to set up.
Getting Genetics Done by Stephen Turner on WES coverage with bedtools

Learning ggplot

Posted on February 14, 2017 by Jin Tong

ggplot manual is available

Goal 1: ggplot without grid (someone has a post on this)
Goal 2: Modify with no legend
Goal 3: Add x/y labels

Goal 4: Plot SNP panels with multiple samples

After some research, I found this post very helpful; it can be used as a starting point.

Goal 5: save a vector figure, .emf

p <- ggplot(someObject)
Usage: ggsave("someFigure.emf", p)
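
Goals 1, 2, 3 and 5 can be combined in one small sketch. The data frame `df`, its column names and the output file name are made up for illustration, and the code assumes the ggplot2 package is installed. It saves to PDF, a vector format that works on every platform; as far as I know, the .emf device relies on the Windows metafile driver and is only available on Windows.

```r
library(ggplot2)

# Made-up data for illustration
df <- data.frame(pos = 1:10, score = (1:10)^2, group = rep(c("a", "b"), 5))

p <- ggplot(df, aes(x = pos, y = score, colour = group)) +
  geom_point() +
  theme_bw() +
  theme(panel.grid = element_blank(),   # Goal 1: no grid lines
        legend.position = "none") +     # Goal 2: no legend
  labs(x = "Position", y = "Score")     # Goal 3: x/y labels

# Goal 5: save as a vector figure (PDF here; use ".emf" on Windows)
out <- file.path(tempdir(), "someFigure.pdf")
ggsave(out, p, width = 5, height = 4)
```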

Working with EMMA

Posted on January 23, 2017 by Jin Tong

It has been a while, but I need to make sure that EMMA works for Dr. Kleeberger’s project.

Understand the population genetics

A hands-on tutorial is here.

Understand the Mixed model

plink provides many good tools.

Implementation with R: I encountered some hurdles with R programming and would like to document them here before I forget.

If fctr.cols holds the names of your factor columns in the data frame X, convert them to character with:

X[, fctr.cols] <- sapply(X[, fctr.cols], as.character)
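
A slightly more defensive variant, as a sketch: the data frame below is hypothetical, and `lapply` is swapped in for `sapply` so that the assignment behaves the same way with one factor column as with several.

```r
# Hypothetical data frame with two factor columns and one numeric column
X <- data.frame(strain = factor(c("A", "B", "A")),
                sex    = factor(c("F", "M", "F")),
                weight = c(21.5, 24.1, 22.3))

# Find the factor columns instead of typing their names by hand
fctr.cols <- names(X)[sapply(X, is.factor)]

# lapply returns one list element per column, which is what
# assignment into a data frame expects
X[fctr.cols] <- lapply(X[fctr.cols], as.character)

str(X)
```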

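distances <- matrix(1:25, nrow=5, ncol=5)
Now, there are three ways to get a matrix of zeros with the same shape:
apply(distances, c(1, 2), function(x) 0)   # returns a new matrix; distances is unchanged
distances[] <- 0L                          # overwrites distances in place, keeping its dimensions
distances*0                                # returns a new matrix; distances is unchanged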

Viewing/reporting the results

I found a very useful and handy help doc on the Manhattan plot.
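
For reference, a minimal base R sketch of a Manhattan plot. The results table `gwas` is simulated (uniform random p-values on three chromosomes), and its column names are my own choice, not those of EMMA output. It also assumes chromosomes are numbered 1, 2, 3, ... without gaps.

```r
set.seed(1)
# Simulated GWAS results, made up for illustration: 3 chromosomes, 200 SNPs each
gwas <- data.frame(chr = rep(1:3, each = 200),
                   bp  = rep(seq(1, 2e6, length.out = 200), times = 3),
                   p   = runif(600))

# Shift each chromosome by the total length of the chromosomes before it
chr.len <- tapply(gwas$bp, gwas$chr, max)
offset  <- c(0, cumsum(chr.len)[-length(chr.len)])
gwas$x  <- gwas$bp + offset[gwas$chr]

# Alternate two colours by chromosome and plot -log10(p) along the genome
plot(gwas$x, -log10(gwas$p), col = c("grey30", "steelblue")[gwas$chr %% 2 + 1],
     pch = 20, xaxt = "n", xlab = "Chromosome", ylab = "-log10(p)")
axis(1, at = tapply(gwas$x, gwas$chr, mean), labels = names(chr.len))
abline(h = -log10(0.05 / nrow(gwas)), lty = 2)  # Bonferroni threshold
```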

Protected: Stat help doc

Posted on January 19, 2017 by Jin Tong

GWAS using EMMA

Posted on November 29, 2016 by Jin Tong

A good note on using EMMA

Or, I can get EMMAX from Ed Buckler’s GAPIT bundle.
Here FastMap has moved to a new location
Where is snpster??

Facts about EMMA

Kang, H. published EMMA in Genetics in 2008.

Kang, H. extended EMMA to EMMAX.

Zhou presented the EMMA approach in GEMMA. It seems to be very similar to EMMA without citing EMMA’s work.
GenABEL seems to be quite popular in the community.

FastMap note

I need to take a note on this.

Help Heather with PCA

Posted on October 21, 2016 by Jin Tong

Hi Jianying:

I hope you are doing well and having a nice week so far.  I hate to
 bother you with this question, but depending on whether I will have 
the data to do so, I may be running a principal component analysis 
on some past data from the lab.  I need to speak with Dr. Kleeberger
tomorrow morning so that I can clarify some things, but if possible,
would you possibly be available to meet sometime tomorrow afternoon or 
anytime on Friday so that I can discuss the principal component 
analysis?  I just want to be sure I do it correctly and I would feel 
better knowing I had your expertise.  If you do not have the time, 
though, please do not worry and know that I understand.
Thank you for your consideration, regardless.

Sincerely,
Heather

Well, I found a very good lecture note from Penn State on Principal Component Analysis.
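
As a starting point for the discussion, a minimal PCA sketch in base R. The built-in `iris` measurements stand in for the lab data, which I do not have yet; scaling the variables first is an assumption that suits variables measured in different units.

```r
# Built-in iris measurements stand in for the lab data
pca <- prcomp(iris[, 1:4], center = TRUE, scale. = TRUE)

# Proportion of variance explained by each component
var.explained <- pca$sdev^2 / sum(pca$sdev^2)
round(var.explained, 3)

# Scores of the samples on the first two components, coloured by species
plot(pca$x[, 1:2], col = as.integer(iris$Species), pch = 19)
```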

Getting all the bolts and nuts ready

Posted on August 17, 2016 by Jin Tong

I am trying to create a workable environment for data analysis. Here is my list of things to install on my Windows machine.

Anaconda (maybe conda also)

I installed Anaconda (v1.4.0) at the user level.

Jupyter

I followed the link and installed Jupyter.
I can launch "jupyter notebook" and see the web page on localhost:8888,
but I do not see a coding environment available.

Zipline

It takes a bit of effort to get zipline installed. So, I tested using conda with a “sub-environment”. zipline only works with Python 2.7!

C:\Users\li11>conda create -n ForZipline python=2.7 biopython
C:\Users\li11>activate ForZipline
(ForZipline) C:\Users\li11>conda install -c Quantopian zipline

Using jupyter

Thanks go to our system admin and Anaconda. Both Python 2 and Python 3 have been installed on my Windows machines. Now I can launch Jupyter directly. From now on, I will stick with the Jupyter IDE (integrated development environment).

Protected: Study note

Posted on August 12, 2016 by Jin Tong

R basic

Posted on May 12, 2016 by Jin Tong

In this post, I will document the basics of R and R programming.

Scenario 1, useful R introduction manuals and websites

	CRAN
	Thomas Girke -- UC Riverside

Scenario 2, differentiating matrix and dataframe

A matrix is a two-dimensional data structure. All the elements of a matrix must be of the same type (numeric, logical, character, complex).
A data frame combines features of matrices and lists. In fact, we can think of a data frame as a rectangular list, that is, a list in which all items have the same length. The items of the list serve as the columns of the data frame, so every item within a particular column has to be of the same type. However, different columns can be of different types.
Matrix -- Dataframe
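
The difference is easiest to see on a tiny example. The objects `m` and `d` below are made up for illustration.

```r
# A matrix holds one type: mixing numbers and text coerces everything to character
m <- cbind(id = 1:3, name = c("a", "b", "c"))
typeof(m)          # "character"

# A data frame keeps a separate type per column
d <- data.frame(id = 1:3, name = c("a", "b", "c"), stringsAsFactors = FALSE)
sapply(d, class)   # id is "integer", name is "character"

# A data frame is a list of equal-length columns; a matrix is not a list
is.list(d)         # TRUE
is.list(m)         # FALSE

# Converting between the two
as.matrix(d)       # character matrix, because of the name column
as.data.frame(m)   # data frame with columns id and name
```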

Scenario 3, how to combine two matrices by row names

It seems like an easy and legitimate question, but not everyone knows the answer. I found one pretty decent solution:

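# Column-bind two matrices by row name, filling unmatched cells with NA.
cbind.fill <- function(x, y){
  xrn <- rownames(x)
  yrn <- rownames(y)
  rn <- union(xrn, yrn)
  xcn <- colnames(x)
  ycn <- colnames(y)
  # Both inputs need row and column names for the matching to work
  if(is.null(xrn) || is.null(yrn) || is.null(xcn) || is.null(ycn))
    stop("NULL rownames or colnames")
  z <- matrix(NA, nrow=length(rn), ncol=length(xcn)+length(ycn))
  rownames(z) <- rn
  colnames(z) <- c(xcn, ycn)
  # Copy the rows of x into the left block of z
  idx <- match(rn, xrn)
  z[!is.na(idx), seq_along(xcn)] <- x[na.omit(idx), , drop=FALSE]
  # Copy the rows of y into the right block of z
  idy <- match(rn, yrn)
  z[!is.na(idy), length(xcn)+seq_along(ycn)] <- y[na.omit(idy), , drop=FALSE]
  return(z)
}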
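
For comparison, a sketch of the same outer join using base R's `merge` with `by = "row.names"`. This is an alternative, not the solution I found; the two small matrices are made up for illustration.

```r
# Two small matrices with partly overlapping row names (made-up values)
x <- matrix(1:4, nrow = 2, dimnames = list(c("g1", "g2"), c("s1", "s2")))
y <- matrix(5:8, nrow = 2, dimnames = list(c("g2", "g3"), c("s3", "s4")))

# by = "row.names" joins on row names; all = TRUE keeps unmatched rows as NA
m <- merge(x, y, by = "row.names", all = TRUE)

# merge returns a data frame with the names in a Row.names column;
# move them back into the row names and convert to a matrix
z <- as.matrix(m[, -1])
rownames(z) <- as.character(m$Row.names)
z
```

Row g1 exists only in x and row g3 only in y, so their missing cells come back as NA, while g2 is filled from both.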

Scenario 4, a thorough note on the apply function in R

There was a simple question about the R apply function. Although it seems quite straightforward, it causes a lot of confusion. Therefore, I decided to write a thorough document on it.
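
Until that document is written, here is a short sketch of the main members of the apply family. The matrix `m` and the list `lst` are made up for illustration.

```r
m <- matrix(1:6, nrow = 2)   # 2 rows, 3 columns

row.sums <- apply(m, 1, sum)   # MARGIN = 1: one result per row
col.sums <- apply(m, 2, sum)   # MARGIN = 2: one result per column
times.ten <- apply(m, c(1, 2), function(v) v * 10)  # both margins: element by element

lst <- list(a = 1:3, b = 4:6)
as.list.out <- lapply(lst, mean)            # always returns a list
as.vec.out  <- sapply(lst, mean)            # simplifies to a named vector when it can
checked.out <- vapply(lst, mean, numeric(1))  # like sapply, but checks each result's type and length
```

The usual source of confusion is the MARGIN argument: 1 means "for each row", 2 means "for each column", and c(1, 2) means "for each cell".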

Scenario 5, Invoking R from the command line

Here is a good example of invoking R from the Linux command line.
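
A minimal sketch of a script meant to be run with `Rscript`. The file name `summarize.R` and the fallback values are hypothetical.

```r
# Save as summarize.R (hypothetical name) and run from a shell with:
#   Rscript summarize.R 3 5 8
# commandArgs(trailingOnly = TRUE) returns only the arguments after the script name
args <- commandArgs(trailingOnly = TRUE)

# Fall back to sample values when no arguments are given
if (length(args) == 0) args <- c("1", "2", "3")

# Command-line arguments always arrive as character strings
values <- as.numeric(args)
cat("n =", length(values), " mean =", mean(values), "\n")
```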

Scenario 6, Use R to perform clustering and then produce a heatmap

Good example-by-mannheimia
Example of savvi by Earl Glynn
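
A minimal base R sketch of the clustering and heatmap steps. The built-in `mtcars` data stands in for real expression data, and the choices of Euclidean distance, complete linkage and three clusters are assumptions for illustration only.

```r
# Built-in mtcars data, scaled so that each column contributes equally
m <- scale(as.matrix(mtcars))

# Hierarchical clustering of the rows (cars) on Euclidean distance
hc <- hclust(dist(m), method = "complete")
groups <- cutree(hc, k = 3)   # cut the tree into 3 clusters
table(groups)

# heatmap() clusters rows and columns itself and draws both dendrograms;
# scale = "none" because m is already scaled
heatmap(m, scale = "none",
        hclustfun = function(d) hclust(d, method = "complete"))
```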
