Worknote-09-2014

Posted on September 29, 2014 by Jin Tong

Samseq

Posted on September 23, 2014 by Jin Tong

This is a competition to our EPIGseq, but it is very intuitive that researchers have shifted their focus from microarray to RNAseq. I am learning how it works and read the research paper by Li & Tibshirani

There is a rough but very interesting webportal in Japan, which lists all the R packages/functions at CRAN, R release, and Bioconductor

Comparing perl and R for implementing the FDR

Posted on September 13, 2014 by Jin Tong

Here, I am trying to implement Benjamini

Wikipedia provides the write up

The Original paper published in 1995

Perl has a module: Multtest on cpan, it implemented what implemented in an R module: p.adjust

Implementation in R is as following:

        i <- lp:1L
        o <- order(p, decreasing = TRUE)
        ro <- order(o)
        pmin(1, cummin(n/i * p[o]))[ro]

Given a sequence of example p-values in the original paper, R produces:

pval <- c(0.0001 ,0.0004, 0.0019, 0.0095, 0.0201, 0.0278, 0.0298, 0.0344, 0.0459, .03240, 0.4262, 0.5719, 0.6528, 0.7590, 1)
p.adjust (pval, "BH")
 [1] 0.00150000 0.00300000 0.00950000 0.03562500 0.05733333 0.05733333 0.05733333 0.05733333
 [9] 0.06885000 0.05733333 0.58118182 0.71487500 0.75323077 0.81321429 1.00000000

Therefore, FDR is control at 0.05. p(4) = 0.0356 < 0.05

Dispersion estimation

Posted on September 8, 2014 by Jin Tong

When dealing with count data, one consideration is the dispersion. Here is a good post on modeling count data regression in R

Here is a link on dispersion with a few kernels

A comment based on McCullagh and Nelder, with a online course from pennstate explains on the theoretical level. With a special dedication to an R session here

To estimate the dispersion, it is as simple as fit a GLM model with link function chosen accordingly

onlinenote

Monthly Archives: September 2014

Worknote-09-2014

Samseq

Comparing perl and R for implementing the FDR

Dispersion estimation