Python programming architecture

I need to give credit to Kenneth Reitz, who gives great instruction on this topic.

Lesson 1: reproduce what Ken suggested

	
  • Fork Ken's GitHub repository to make my own copy
  • When I run "make" in the ./sample/ folder, it fails with "OSError: [Errno 13] Permission denied: '/ddn/gs1/biotools/python/lib/python2.7/site-packages/nose'"
  • When I run "python setup.py install" in the ./sample/ folder, it also fails, with "[Errno 13] Permission denied: '/ddn/gs1/biotools/python/lib/python2.7/site-packages/test-easy-install-147300.write-test'"
  • Topic 1: how can I solve this problem?
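Both errors say that my account is not allowed to write to the shared site-packages directory. Here is a minimal sketch of how one might check that before installing. The helper names `is_writable` and `suggest_install_command` are my own inventions and are not part of Ken's repository. The usual ways around a read-only system Python are to install into the per-user site directory with `--user`, or to work inside a virtual environment.

```python
import os
import sysconfig


def is_writable(path):
    """Return True if `path` is a directory the current user may write to."""
    return os.path.isdir(path) and os.access(path, os.W_OK)


def suggest_install_command(site_dirs):
    """Suggest an install command (hypothetical helper, suggestion only)."""
    if any(is_writable(d) for d in site_dirs):
        return "python setup.py install"
    # Fall back to the per-user site directory, which needs no admin rights.
    return "python setup.py install --user"


if __name__ == "__main__":
    # "purelib" is where pure-Python packages are installed for this interpreter.
    site_packages = sysconfig.get_paths()["purelib"]
    print("site-packages:", site_packages)
    print("suggested:", suggest_install_command([site_packages]))
```

If the check reports that the directory is not writable, running the install with `--user` keeps everything inside my home directory.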

    Using Structural Equation Modeling (SEM)

    What is an SEM model?

  • SEM is a combination of factor analysis and multiple regression
  • It also goes by the aliases “causal modeling” and “analysis of covariance structure”
  • Special cases of SEM include confirmatory factor analysis and path analysis
  • The SEM can be divided into two parts. The measurement model is the part which relates measured variables to latent variables. The structural model is the part that relates latent variables to one another.

    Path analysis is SEM with no latent variables. In other words, path analysis is SEM with a structural model but no measurement model: only a single indicator is employed for each of the variables in the causal model.
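Because path analysis has no latent variables, each arrow in the path diagram is just a regression between observed variables. Here is a minimal sketch in plain Python, assuming a made-up three-variable chain x -> m -> y with simulated data. The variable names, the "true" coefficients 0.5 and 0.8, and the helper `slope` are all illustrative assumptions and do not come from any of the lecture notes.

```python
import random


def slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs (single predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    return cov_xy / var_x


random.seed(0)
n = 2000
# Simulated chain x -> m -> y; 0.5 and 0.8 are assumed path coefficients.
x = [random.gauss(0, 1) for _ in range(n)]
m = [0.5 * xi + random.gauss(0, 1) for xi in x]
y = [0.8 * mi + random.gauss(0, 1) for mi in m]

a = slope(x, m)   # estimated path x -> m
b = slope(m, y)   # estimated path m -> y
indirect = a * b  # indirect effect of x on y through m

print("a = %.2f, b = %.2f, indirect = %.2f" % (a, b, indirect))
```

The indirect effect of x on y is the product of the two path coefficients along the chain. In lavaan syntax the same kind of model would use only `~` lines and no `=~` lines.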

    To use SEM models in genetics studies, I am collecting some lecture notes on SEM.

    Karl Wuensch from ECU
    David Kenny
    Dr. William Revelle from Northwestern University
    Dr. Brannick from USF has a good post on SEM vs. Path Analysis
    

    David Kenny presents a very thorough explanation of using SEM
    A very handy tutorial on SEM

    It turns out that Dr. William Revelle from Northwestern University has a syllabus for his Psychology 454 class with details on R and R packages for SEM.

    In one of the lectures, he has a lecture note on “Latent Variable Modeling”.

    The R package lavaan was created for latent variable analysis, and it works perfectly well for an SEM model such as the following.

    library(lavaan)  # provides sem() and the PoliticalDemocracy data set
    model <- '
       # latent variables
         ind60 =~ x1 + x2 + x3
         dem60 =~ y1 + y2 + y3 + y4
         dem65 =~ y5 + y6 + y7 + y8
       # regressions
         dem60 ~ ind60
         dem65 ~ ind60 + dem60
       # residual covariances
         y1 ~~ y5
         y2 ~~ y4 + y6
         y3 ~~ y7
         y4 ~~ y8
         y6 ~~ y8
    '
    fit <- sem(model,
               data=PoliticalDemocracy)
    summary(fit)
    

    Sometimes there is a data-related “NOT converged” error. It is apparently data-dependent, but my model

      mod1 <- '
       SOX17_lev =~ GATA2_lev + PGR_lev
       IHH  ~ SOX17_lev '
    

    happens to run into it!

    There are a lot of resources from Dr. James Grace at USGS, who covers basic to advanced features and applications of SEM in his research.

    Getting a word-cloud image with R

    A word-cloud image I made for a collaborator’s project caught people’s attention. It seems to be a good way to present scientific research vividly. I would like to document this effort for future reference.

    I need to give credit to Rshiny-Application, which got me interested in producing word clouds in R.
    A few libraries are quite popular for making a word cloud: library(tm), library(SnowballC), library(wordcloud), library(memoise).
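Under the hood, a word cloud is a table of word frequencies, drawn so that more frequent words appear larger. Here is a minimal sketch of that counting step, written in Python rather than R purely for illustration. The sample text, the tiny stop-word set, and the function name `word_frequencies` are all made up for this example, and this is not how tm or wordcloud are implemented internally.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; real packages ship much longer ones.
STOP_WORDS = {"and", "of", "the", "a", "in", "to"}


def word_frequencies(text):
    """Lower-case the text, split it into words, drop stop words, and count."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in STOP_WORDS)


sample = "Gene expression data and gene networks: expression of the gene"
freqs = word_frequencies(sample)
print(freqs.most_common(2))  # the two most frequent words with their counts
```

A table like this (word plus count) is the kind of input that a word-cloud plotting function takes.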
    
    
    

    I have found some websites that helped me learn about word clouds. I’d like to document them for others and give the authors credit for their work.

    The one from data science does not work for me. It generates an error like "Error in simple_triplet_matrix(i, j, v, nrow = length(terms), ncol = length(corpus),  : 
      'i, j' invalid"
    Here is another one
    Another package is called wordcloud2 with a few examples
    

    Calling somatic variants

    I came across a paper in BMC Genomics comparing five different somatic SNP callers.

    GATK UnifiedGenotyper in NaiveSubtract
    MuTect1
    SomaticSniper from the original paper
    Installation help document for Strelka with updated user's guide and a quick start guide
    And the original paper on somatic-sniper
    VarScan2 from the original paper
    The classical samtools method also works.
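The simplest idea in this list is naive subtraction. As I understand it, this means calling variants in the tumor and the normal sample separately, then keeping only the tumor calls that are absent from the normal. Here is a minimal sketch, assuming each call has already been reduced to a (chromosome, position, ref, alt) tuple. The function name `naive_subtract` and the toy calls are hypothetical. It ignores coverage and genotype quality, so it only illustrates the idea.

```python
def naive_subtract(tumor_calls, normal_calls):
    """Return tumor variants that were not also called in the matched normal."""
    normal = set(normal_calls)
    return [v for v in tumor_calls if v not in normal]


# Toy calls as (chromosome, position, ref, alt); all values are made up.
tumor = [
    ("chr1", 1000, "A", "T"),
    ("chr1", 2000, "G", "C"),
    ("chr2", 500, "T", "G"),
]
normal = [("chr1", 2000, "G", "C")]

somatic = naive_subtract(tumor, normal)
print(somatic)
```

The variant shared by both samples is treated as germline and dropped, and the remaining tumor-only calls are the candidate somatic variants.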
    

    Calling variants with samtools/bcftools

    Help from samtools protocol
    Help from samtools/bcftools protocol
    

    Group project with github

    I need to work with Yicheng on a project in my GitHub account. Starting from my existing repository, I need to make a branch and work with him on it.

    	
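  • I found a helpful post, and I am following it step by step
  • git checkout -b Rproj_w_yicheng    # create the branch and switch to it
  • git checkout Rproj_w_yicheng    # switch back to the branch later on
  • git remote add ocri3-w-yicheng https://github.com/ImageRecognitionMaster/myOCRI-iii    # register the shared repository as a remote
  • git branch    # list local branches
  • git branch -D Rproj_w_yicheng    # delete the local branch once it is no longer needed
  • git push origin :Rproj_w_yicheng    # delete the branch on the remote "origin"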
  • Here is the end result.