Linux note

Basic commands

To remove write permission for the group: chmod -R g-w .

To compare two files: diff file1 file2

To count how many files are in a directory: ls | wc -l

To extract a few lines from a file: sed -n '1,10p' filename > newfile

To run a shell script: chmod u+rwx yourfile.sh; ./yourfile.sh

To download files from an HTTP portal without a password: wget -r -nd -np <URL>

To download files from an HTTP portal with a password: wget --http-user=<user> --http-password=<password> -r -nd -np <URL>

To get unique entries: uniq
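Note that uniq only collapses adjacent duplicate lines, so the input usually has to be sorted first; a minimal example (file names are made up): sort myfile.txt | uniq > unique_entries.txt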

To create a symbolic link: ln -s [TARGET DIRECTORY OR FILE] ./[SHORTCUT]

To sort file: sort

To get to a person's home directory on topsail when it is inaccessible, we can try going through /ifs1/scr/someone/ instead.

To modify an environment variable: go to the home directory, open .bash_profile, and edit the relevant line, e.g. PATH=$PATH:/ifs1/home/ferizs/lbgapps/bin

To copy the whole directory: cp -r /dir/ .

To kill a process: ps -ef | grep "jyli", then kill -9 <process id>

To check an installed package: rpm -q <package name>

To put a process into the background:
1. ctrl-z suspends it
2. bg puts it in the background
3. fg brings it back to the foreground
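
For example, a sketch of suspending and resuming a job (the script name long_job.sh is hypothetical):

./long_job.sh       # running in the foreground
# press ctrl-z to suspend it
bg                  # resume it in the background
jobs                # list background jobs
fg %1               # bring job 1 back to the foreground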

A use case: excluding redundancy in a file

To get SINGLE PLACEMENT reads
1. Work off .mapView files with the unix command "uniq"
2. Chop the mapview file to leave only three columns: "cut -f 2,3,4 s_1_mapview.txt > s_1_mapview.txt.cut"
3. Extract the rows that occur only once: "uniq -u s_1_mapview.txt.cut s_1_mapview.txt.cut.uniq"
4. Extract just a single copy of each duplicated row: "uniq -d s_1_mapview.txt.cut s_1_mapview.txt.cut.dup"
5. Concatenate the ".uniq" and ".dup" files and get a total count of the rows: "cat *.dup *.uniq > s_1_mapview.txt.cut.final" and "wc -l s_1_mapview.txt.cut.final"
6. Should check out the "samtools" command
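
The same final file (one copy of every distinct row) can also be produced in a single pipeline; this sketch sorts the cut output first, because uniq only collapses adjacent duplicates (file names as in the steps above):

cut -f 2,3,4 s_1_mapview.txt | sort | uniq > s_1_mapview.txt.cut.final
wc -l s_1_mapview.txt.cut.final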

To exclude the first line of a file

tail -n +2 file > newfile

To exclude/remove blank lines from a text file, use the following command:

$ sed '/^$/d' input.txt > output.txt
$ grep -v '^$' input.txt > output.txt
$ strings input.txt > output.txt   (note: strings also drops lines shorter than four printable characters)

The credit goes to the "I love Linux" post.

To ssh between two Linux systems without a password:


From system one:

1. Go to the home directory; "cd" will do it, and "pwd" confirms
2. cd .ssh
3. ssh-keygen -t dsa
4. Press "enter" 3 times
5. Copy id_dsa.pub to the other system and append it to ~/.ssh/authorized_keys there (see the sketch below)

From the other system (two), do exactly the same process. This way, both machines can ssh to each other without a password.
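
A condensed sketch of the same setup (user and host names are hypothetical; ssh-copy-id, where available, does the appending for you):

cd ~/.ssh
ssh-keygen -t dsa                      # press enter 3 times for defaults and an empty passphrase
ssh-copy-id user@other-host            # appends id_dsa.pub to ~/.ssh/authorized_keys on the other host
# or, without ssh-copy-id:
cat ~/.ssh/id_dsa.pub | ssh user@other-host 'cat >> ~/.ssh/authorized_keys'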

Difference between “scp” and “rsync”:

"scp" copies and overwrites the files on the destination. Also, if the network is interrupted, it stops the transfer and loses track of where it was.

"rsync" checks the timestamp on the destination; if the file at the destination has the same timestamp (and size), it won't copy it again. Otherwise it copies or updates the destination file.

Use "rsync -av source destination"
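
For example (the paths and host are hypothetical; the trailing slash on the source means "copy the contents of the directory", not the directory itself):

rsync -av /data/project/ user@remote:/backup/project/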

Download files in Linux

Although it seems simple, sometimes a small roadblock can throw people off. Let's take a look at the options for downloading:

  • "wget" seems to be the first choice
  • Sometimes, I have to use "--no-check-certificate" to download from "unauthorized" or questionable sites
  • We also have "rsync", "aspera", and who knows what else is available out there
  • "aria" is another good option, in the sense that it can break a download into small chunks
  • About aria:
      • "# yum install aria2" gets me the tool
      • aria2c -x 4 "url"
      • check out the aria2 wiki for details
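
A slightly fuller aria2 sketch (the URL is a placeholder): -x sets the number of connections per server and -s splits the download into that many segments:

aria2c -x 4 -s 4 "http://example.com/bigfile.tar.gz"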
Weird situation: since we don't have the "sge" environment, we have to rely on the Linux way of detaching a process to run long jobs.

    Scenario 1, using “nohup”

    Example: Printing lines to both standard output & standard error

    while true
    do
        echo "standard output"
        echo "standard error" 1>&2
        sleep 1
    done

    Execute the script without redirection

    $ nohup sh custom-script.sh &
    [1] 12034
    $ nohup: ignoring input and appending output to `nohup.out’

    $ tail -f nohup.out
    standard output
    standard error
    standard output
    standard error

    Execute the script with redirection

    $ nohup sh custom-script.sh > custom-out.log &
    [1] 11069
    $ nohup: ignoring input and redirecting stderr to stdout

    $ tail -f custom-out.log
    standard output
    standard error
    standard output
    standard error
    ..

    Check the process status

    $ ps aux | grep li11

    Send an email in Linux

    Send an email using the file "Email_content" as the body, with a title:
    
    
    mail -s "Email title" li11@niehs.nih.gov < Email_content
    To email a file as an attachment:
    echo "Email body" | mutt -a command_02132012.txt li11@niehs.nih.gov

    Echo an exclamation mark

    echo "something!" will generate a history-expansion error in an interactive bash shell.
    This is because only single quotes protect the special character (the "!" triggers history expansion inside double quotes). So,
    echo 'something!' will do the trick. Thanks to the post here
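
    If single quotes are inconvenient, an alternative (a bash-specific sketch) is to turn off history expansion, which is what the "!" triggers:

    set +H                  # disable ! history expansion in this shell
    echo "something!"       # now prints literally
    set -H                  # turn it back on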

    It turns out that I had to resize the tmpfs (mounted at /dev/shm) on seqbig, which was set at the default size. I did plenty of research and found a link. Here is what I did:

    You can raise the size limit in /etc/fstab:

    tmpfs                  /dev/shm      tmpfs     size=20G,nr_inodes=10k  0      0
    

    Then remount it:

    # mount -o remount /dev/shm
    

    Be careful with the size, though. Since it exists in RAM, you don’t want a tmpfs partition to be bigger than your RAM, otherwise the big bad OOM killer will come along and start assassinating your processes.
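
    To confirm the new size after remounting:

    df -h /dev/shm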

    Download website recursively

    I found a very good web page for teaching R, from Syracuse University bio793. I really like it and wanted to download all the pages.
    I found this video by an Indian guy. It is very helpful.
    The command to use is: wget --random-wait -r -p -e robots=off -U Mozilla www.mozilla.com

    In fact, there is a more straightforward note from the plain Linux documentation on downloading a website with wget.

    As a test case, I followed this instruction and downloaded Dr. Dickey's ST512 notes. It was very successful.

    wget \
    --recursive \
    --no-clobber \
    --page-requisites \
    --html-extension \
    --convert-links \
    --restrict-file-names=windows \
    --domains ncsu.edu \
    --no-parent http://www.stat.ncsu.edu/people/dickey/courses/st512/index.html
    

    Modify file header

    I encountered a very annoying situation. I have 277 files; some files have "DI" as the header but some have "DNA_Index" as the header. Now, how can I modify them under the Linux shell?

  • If it were a line added at the end, it could easily be appended with the "echo" command.
  • But I wanted to add to the beginning!!
  • This approach is much better
  • So, the solution is a few steps:

  • Step 1, remove the header: tail -n +2 initial_file > temp
  • Step 2, add a new header: sed -i '1i\'"DNA_Index" temp
  • Step 3, copy it back: mv temp initial_file
  • All in one loop: for k in `ls *.csv`; do tail -n +2 $k > temp; sed -i '1i\'"DNA_Index" temp; mv temp $k; done;
  • I wanted to kill running jobs

  • List the process ids: ps aux | grep li11 | cut -d" " -f6
  • for k in `ps aux | grep li11 | cut -d" " -f6`; do kill -9 $k; done;
  • R sessions: for k in `ps aux | grep li11 | grep "R" | cut -d" " -f6`; do echo $k; kill -9 $k; done;
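
  Note that ps aux separates its columns with runs of spaces, so cut -d" " field numbers can be fragile; a hedged alternative for the same idea, using awk (the PID is the second column) or pkill:

  for k in `ps aux | grep li11 | grep R | awk '{print $2}'`; do echo $k; kill -9 $k; done;
  pkill -9 -u li11 R        # or: kill all of user li11's processes named R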