Monthly Archives: May 2011
Shell command to print the number of occurrences of chromosomes
The following is a text file containing human genetics data.
http://svn.software-carpentry.org/swc/data/1000gp.vcf
Here is the command to print the number of occurrences of each chromosome in the above data file:

grep -v "^#" 1000gp.vcf | cut -f 1 | sort | uniq -c

Here the pipe symbol '|' is used to combine several commands into a single one. A pipe sends the output of the command on its left as the input to the command on its right. Now let's see what each command in the pipeline does.

grep -v "^#" prints the lines other than the header lines, i.e. the lines of the given file that do not start with "#".

cut -f 1 selects only the first field of each input line (fields are separated by tabs by default), which in this file is the chromosome name.

sort sorts the lines in ascending order, so that identical chromosome names end up on adjacent lines. This matters because uniq only compares neighbouring lines.

uniq -c collapses each run of adjacent identical lines into a single line and prefixes it with the number of times that line occurred.

Thus the final result is the list of unique chromosomes, each with its number of occurrences.
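To try the pipeline without downloading the full data file, here is a minimal sketch that runs it on a tiny made-up sample. The function name count_chromosomes and the sample records are assumptions for illustration only; they are not taken from 1000gp.vcf.

```shell
# Hypothetical helper: count how many data lines each chromosome has in a VCF file.
count_chromosomes() {
    # drop header lines, keep column 1, group identical names, count each group
    grep -v '^#' "$1" | cut -f 1 | sort | uniq -c
}

# Build a small tab-separated sample file (made-up records, for illustration only).
sample=$(mktemp)
printf '##fileformat=VCFv4.0\n'  > "$sample"
printf '#CHROM\tPOS\tID\n'      >> "$sample"
printf 'X\t500\trsA\n'          >> "$sample"
printf '1\t100\trsB\n'          >> "$sample"
printf '2\t200\trsC\n'          >> "$sample"
printf 'X\t600\trsD\n'          >> "$sample"
printf '1\t150\trsE\n'          >> "$sample"
printf 'X\t700\trsF\n'          >> "$sample"

# Print one line per chromosome: the count first, then the chromosome name.
count_chromosomes "$sample"

rm -f "$sample"
```

With this sample, chromosome 1 is counted twice, chromosome 2 once and chromosome X three times. The amount of leading whitespace in front of each count depends on the uniq implementation.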