Here are a few highly regarded Python documentation resources.
Official Python 3 online documentation
bcbio-nextgen has a strong reputation, but it is hard to set up.
Getting Genetics Done by Stephen Turner on WES coverage with bedtools
ggplot manual is available
Goal 1: ggplot without grid lines (someone has a post on this)
Goal 2: Remove the legend
Goal 3: Add x/y labels
Goal 4: Plot SNP panels with multiple samples
After some research, I found this post, which is very helpful and can be used as a starting point.
Goal 5: Save a vector figure (.emf)
p <- ggplot(someObject)
Usage: ggsave("someFigure.emf", p)
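To keep the five goals in one place, here is a minimal sketch of how they might fit together. The data frame snp_df, its column names, and the output file location are made up for illustration, and it assumes the ggplot2 package is installed. As far as I know, the "emf" device in ggsave is only available on Windows, so the sketch falls back to PDF elsewhere.

```r
library(ggplot2)

# Hypothetical data: allele frequency of 3 SNPs measured in 2 samples
snp_df <- data.frame(
  snp    = rep(c("rs1", "rs2", "rs3"), times = 2),
  sample = rep(c("sampleA", "sampleB"), each = 3),
  freq   = c(0.10, 0.45, 0.30, 0.20, 0.40, 0.35)
)

p <- ggplot(snp_df, aes(x = snp, y = freq, fill = snp)) +
  geom_col() +
  facet_wrap(~ sample) +                    # Goal 4: one panel per sample
  theme_bw() +
  theme(panel.grid = element_blank(),       # Goal 1: no grid lines
        legend.position = "none") +         # Goal 2: no legend
  labs(x = "SNP", y = "Allele frequency")   # Goal 3: x/y labels

# Goal 5: vector output; "emf" on Windows, PDF as a portable fallback
ext <- if (.Platform$OS.type == "windows") "emf" else "pdf"
out_file <- file.path(tempdir(), paste0("someFigure.", ext))
ggsave(out_file, p, width = 6, height = 4)
```

Each goal maps to one layer, so any of them can be dropped without affecting the others.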
It has been a while, but I need to make sure that EMMA works for Dr. Kleeberger’s project
Understand the population genetics
A hands-on tutorial is here.
Understand the Mixed model
plink provides many good tools.
Implementation with R: I encountered some hurdles with R programming and would like to document them here before I forget.
If fctr.cols holds the names of your factor columns, convert them to character like this:
X[, fctr.cols] <- sapply(X[, fctr.cols], as.character)
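A small self-contained sketch of the same conversion, for my own reference. The data frame X below is made up, and finding the factor columns with is.factor is my assumption about how fctr.cols would be built. I use lapply here instead of sapply so each column stays a separate list element, which also behaves well when there is only one factor column.

```r
# Hypothetical data frame with two factor columns
X <- data.frame(id     = 1:3,
                strain = factor(c("A", "B", "A")),
                sex    = factor(c("F", "M", "F")))

# Collect the names of the factor columns
fctr.cols <- names(X)[sapply(X, is.factor)]

# Convert each factor column to character, column by column
X[fctr.cols] <- lapply(X[fctr.cols], as.character)

str(X)
```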
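To set every element of a matrix to zero, start with
distances <- matrix(1:25, nrow=5, ncol=5)
Now, any of the following gives an all-zero matrix of the same shape:
apply(distances, c(1, 2), function(x) 0)
distances[] <- 0L
distances*0
Only the second form modifies distances in place; the other two return a new matrix.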
Viewing/reporting the results
I found a very useful and handy help doc on Manhattan Plot
Or, I can get EMMAx from Ed Buckler’s GAPIT bundle.
FastMap has moved to a new location.
Where is snpster??
Fact about EMMA
FastMap note
I need to take a note on this
Hi Jianying:
I hope you are doing well and having a nice week so far. I hate to bother you with this question, but depending on whether I will have the data to do so, I may be running a principal component analysis on some past data from the lab. I need to speak with Dr. Kleeberger tomorrow morning so that I can clarify some things, but if possible, would you possibly be available to meet sometime tomorrow afternoon or anytime on Friday so that I can discuss the principal component analysis? I just want to be sure I do it correctly and I would feel better knowing I had your expertise. If you do not have the time, though, please do not worry and know that I understand. Thank you for your consideration, regardless.
Sincerely,
Heather
Well, I found a very good set of lecture notes from Penn State on principal component analysis.
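As a quick reminder of what a principal component analysis looks like in R, here is a minimal sketch using prcomp from base R. It runs on the built-in USArrests data set purely as a stand-in, since the lab data is not available here; centering and scaling the variables is my assumption about what would be appropriate.

```r
# PCA on a built-in data set (stand-in for the real lab data)
pca <- prcomp(USArrests, center = TRUE, scale. = TRUE)

# Proportion of total variance explained by each principal component
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
print(round(var_explained, 3))

# Scores of each observation on the first two components
scores <- pca$x[, 1:2]
head(scores)

# Loadings: how each original variable contributes to each component
print(pca$rotation)
```

Scaling matters when the variables are measured in different units; without it, the variable with the largest variance tends to dominate the first component.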
I am trying to create a workable environment for data analysis. The following are on my list to be installed on my Windows machine:
Anaconda (maybe conda also)
I installed Anaconda (v1.4.0) at the user level.
Jupyter
I followed the link and installed jupyter. I can launch "jupyter notebook" and see the web page on localhost:8888, but I do not see the coding environment available.
Zipline
It takes a bit of work to get zipline installed, so I tested it in a separate conda environment. zipline only works with Python 2.7!
C:\Users\li11>conda create -n ForZipline python=2.7 biopython
C:\Users\li11>activate ForZipline
(ForZipline) C:\Users\li11>conda install -c Quantopian zipline
Using jupyter
Thanks go to our system admin and Anaconda. Both Python 2 and Python 3 have been installed on my Windows machines. Now I can launch “jupyter” directly. From now on, I will stick with the jupyter IDE (Integrated Development Environment).
In this post, I will document the basics of R and R programming.
Scenario 1, useful R introduction manuals and websites
Scenario 2, differentiating a matrix from a data frame
A matrix is a two-dimensional data structure. All the elements of a matrix must be of the same type (numeric, logical, character, complex). A data frame combines features of matrices and lists. In fact, we can think of a data frame as a rectangular list, that is, a list in which all items have the same length. The items of the list serve as the columns of the data frame, so every item within a particular column has to be of the same type. However, different columns can be of different types. Matrix -- Dataframe
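A short sketch makes the difference concrete. The values are made up; the point is only that a matrix coerces everything to one type while a data frame keeps a type per column.

```r
# A matrix holds a single type: mixing numbers and strings coerces all to character
m <- matrix(c(1, 2, "a", "b"), nrow = 2)
typeof(m)          # "character"

# A data frame keeps one type per column
df <- data.frame(num = c(1, 2), chr = c("a", "b"), stringsAsFactors = FALSE)
col_classes <- sapply(df, class)
print(col_classes)

# A data frame is a list whose items (the columns) all have the same length
is.list(df)        # TRUE
lengths(df)
```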
Scenario 3, how to combine two matrices by row names
It seems like an easy and legitimate question, but not everyone knows the answer. I found one pretty decent solution:
cbind.fill <- function(x, y) {
  xrn <- rownames(x)
  yrn <- rownames(y)
  rn <- union(xrn, yrn)   # all row names, those of x first
  xcn <- colnames(x)
  ycn <- colnames(y)
  if (is.null(xrn) || is.null(yrn) || is.null(xcn) || is.null(ycn))
    stop("NULL rownames or colnames")
  # result is NA wherever a row name is missing from x or y
  z <- matrix(NA, nrow = length(rn), ncol = length(xcn) + length(ycn))
  rownames(z) <- rn
  colnames(z) <- c(xcn, ycn)
  # fill in the columns of x for the rows that exist in x
  idx <- match(rn, xrn)
  z[!is.na(idx), 1:length(xcn)] <- x[na.omit(idx), ]
  # fill in the columns of y for the rows that exist in y
  idy <- match(rn, yrn)
  z[!is.na(idy), length(xcn) + (1:length(ycn))] <- y[na.omit(idy), ]
  return(z)
}
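For comparison, here is a sketch of a different route to the same kind of result using merge from base R with by = "row.names". This is not the function above, just an alternative I believe gives an equivalent NA-filled matrix; the two small matrices are made up for illustration.

```r
# Two hypothetical matrices that share only row "b"
x <- matrix(1:4, nrow = 2, dimnames = list(c("a", "b"), c("x1", "x2")))
y <- matrix(5:8, nrow = 2, dimnames = list(c("b", "c"), c("y1", "y2")))

# Full outer join on row names; rows missing from one side are filled with NA
m <- merge(x, y, by = "row.names", all = TRUE)

# merge returns a data frame with the row names in a "Row.names" column,
# so move them back and convert to a matrix
z <- as.matrix(m[, -1])
rownames(z) <- as.character(m$Row.names)
print(z)
```

One difference to keep in mind: merge returns a data frame, so the conversion back with as.matrix only stays numeric when all columns are numeric.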
Scenario 4, a thorough note on the apply function in R
There was a simple question about the R apply function. Although it seems quite straightforward, it causes a lot of confusion, so I decided to write a thorough document on it.
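As a starting point for that note, here is a minimal sketch of the cases that I think cause most of the confusion: the MARGIN argument of apply, and how lapply and sapply differ in what they return. The matrix is made up for illustration.

```r
m <- matrix(1:6, nrow = 2)   # 2 rows, 3 columns: rows are (1, 3, 5) and (2, 4, 6)

# MARGIN = 1 applies the function to each row
row_sums <- apply(m, 1, sum)

# MARGIN = 2 applies the function to each column
col_max <- apply(m, 2, max)

# MARGIN = c(1, 2) applies the function to each element and keeps the shape
doubled <- apply(m, c(1, 2), function(v) v * 2)

# lapply always returns a list; sapply simplifies to a vector when it can
squares_list <- lapply(1:3, function(i) i^2)
squares_vec  <- sapply(1:3, function(i) i^2)
```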
Scenario 5, Invoking R from the command line
Here is a good example of invoking R from the Linux command line.
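For my own notes, a minimal sketch of a script meant to be run with Rscript. The file name sum_args.R and the idea of summing the arguments are made up for illustration; commandArgs(trailingOnly = TRUE) returns only the arguments given after the script name.

```r
# Save as sum_args.R (hypothetical file name), then run from a shell:
#   Rscript sum_args.R 1 2 3

# Keep only the arguments supplied after the script name
args <- commandArgs(trailingOnly = TRUE)

# Arguments always arrive as character strings, so convert before computing
nums  <- as.numeric(args)
total <- sum(nums)

cat("Sum of", length(nums), "arguments:", total, "\n")
```

When no arguments are given, args is an empty character vector and the sum is 0.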
Scenario 6, Use R to perform clustering and then produce a heatmap
Good example-by-mannheimia
Example of savvi by Earl Glynn
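A minimal sketch of the clustering-then-heatmap workflow using only base R (hclust and heatmap from the stats package). It uses the built-in mtcars data as a stand-in; the choice of average linkage, three clusters, and the output file name are my assumptions, not taken from the examples above.

```r
# Scale the data so that no single variable dominates the distances
x <- scale(as.matrix(mtcars))

# Hierarchical clustering of rows (cars) and columns (variables)
hc_rows <- hclust(dist(x), method = "average")
hc_cols <- hclust(dist(t(x)), method = "average")

# Cut the row dendrogram into 3 groups (the number of groups is arbitrary here)
clusters <- cutree(hc_rows, k = 3)
print(table(clusters))

# Draw the heatmap with the precomputed dendrograms and save it as a PDF
out_file <- file.path(tempdir(), "heatmap_sketch.pdf")
pdf(out_file, width = 7, height = 9)
heatmap(x,
        Rowv  = as.dendrogram(hc_rows),
        Colv  = as.dendrogram(hc_cols),
        scale = "none")   # data is already scaled above
invisible(dev.off())
```

Passing the dendrograms explicitly keeps the clustering step visible, so the distance measure or linkage method can be swapped without touching the plotting call.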