
BioInfo Club - useful tips

Diverse tips

** to write tab or Enter characters in the shell
press Ctrl+V first and then the special character. "Enter" (a carriage return) is displayed as "^M"
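In a script, the same literal characters can be produced without Ctrl+V; a minimal sketch, assuming a Bash-like shell (the sample line is made up):

```shell
# $'\t' is Bash/ksh/zsh quoting for a literal tab; $'\r' would give
# the carriage return that the shell displays as ^M.
line=$'gene1\t42'                  # hypothetical tab-separated record
printf '%s\n' "$line" | cut -f 2   # extracts the field after the tab -> 42
```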

Useful commands


$head ctd.txt

shows the first 10 lines

$head -n 2 *.pdb
shows the first 2 lines of each .pdb file

$history | tail -n 15
shows the 15 most recent items in your command history

$tail -n +2 *.txt
shows from the second line to the end
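head and tail compose nicely: to show an arbitrary line range, skip lines with tail and trim with head. A sketch on generated input:

```shell
# lines 2 through 4: drop the first line, keep the next three
seq 10 | tail -n +2 | head -n 3
# prints:
# 2
# 3
# 4
```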

$head -n -1 *.txt
shows all the lines except the last one (a negative count drops that many lines from the end of each file)

prints out the lines matching the pattern; here, the FASTA header lines
$grep ">" *.fasta

$grep "\-122" ctd.txt
searches for a negative number (the backslash keeps grep from reading -122 as an option)
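Equivalent, more explicit ways to protect a pattern that starts with a hyphen are `grep -e PATTERN` and `grep -- PATTERN`; a sketch on made-up data:

```shell
# without -e (or -- or the backslash), grep would parse -122 as an option
printf '%s\n' 'lat -122.5' 'lat 45.0' | grep -e "-122"
# prints: lat -122.5
```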

-c
shows only a count of the matching lines

-v
shows only the lines that do not match the pattern. Inverted search.

-i
ignores case

grep patterns are regular expressions: quote them, use [] to indicate a character range, [[:space:]] for \s, and [[:digit:]] for \d

-n
shows the line number of each match
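These flags can be combined in one call; a sketch on an inline, made-up FASTA snippet:

```shell
# -c counts matches, -v inverts, -i ignores case
printf '%s\n' '>seq1' 'ACGT' '>seq2' 'TTGA' | grep -c ">"      # 2 header lines
printf '%s\n' '>seq1' 'ACGT' '>seq2' 'TTGA' | grep -c -v ">"   # 2 sequence lines
printf '%s\n' 'Gene' 'gene' 'other' | grep -c -i "gene"        # 2 case-insensitive hits
```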

agrep
searches for a nearly exact (approximate) match

-d "\>"
uses > as a delimiter between records rather than end-of-line

-B -y
returns only the best match (-B selects best-match mode, -y suppresses its confirmation prompt)
$agrep -B -y -d "\>" CYG FPexcerpt.fta

-1 … -8
returns results with up to that many mismatches between query and record (e.g. -2 allows two). The maximum allowed is 8.

-l
only lists the filenames that contain a match

-i
case-insensitive search


$cut -f 1,3 *.txt
returns columns 1 and 3 delimited by tabs

$cut -f 1-3 *.txt
returns columns 1 to 3 delimited by tabs

$cut -c 16-20,30 *.txt
returns characters 16 to 20 and 30 from each line

$grep ">" *.fta | cut -c 2-11
prints out the gene names

$head *.txt | cut -f 5,7 -d ","
returns columns 5 and 7, which are delimited by commas in both the input file and the output
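Put together, -d and -f extract chosen columns from a CSV stream; a sketch with a made-up two-line file:

```shell
# keep fields 1 and 3 of a comma-delimited stream;
# the comma is reused as the output delimiter
printf 'id,name,score\n7,cyg,0.93\n' | cut -d "," -f 1,3
# prints:
# id,score
# 7,0.93
```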


$grep ">" *.fasta | sort

-n
sorts by numerical value rather than alphabetically

-f
makes all lines uppercase before sorting

-r
sorts in reverse order

-k 3
sorts lines based on column 3, with columns delimited by spaces or tabs
$head *.txt | sort -k 3
-t ","
uses commas as the field delimiter
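-t, -k and -n combine for the common task of sorting a CSV by a numeric column; a sketch on made-up data:

```shell
# sort by the numeric value in column 2 of a comma-delimited stream
printf '%s\n' 'b,10' 'a,2' 'c,1' | sort -t "," -k 2 -n
# prints:
# c,1
# a,2
# b,10
```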

uniq
removes identical lines that are in immediate succession, keeping a single copy. Sort the input first so uniq returns one representative of each repeated item.

-c
counts the number of occurrences of each unique line and writes the count before the line
$cut -c 12-21 ctd.txt | uniq -c

-f 4
ignores the first 4 fields (columns delimited by any amount of whitespace) when determining uniqueness

-i
ignores case when determining uniqueness
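The classic frequency-table pipeline builds on these flags: sort first so duplicates become adjacent, count them with uniq -c, then rank by count. A sketch on made-up sequence tags:

```shell
printf '%s\n' 'ACG' 'TTA' 'ACG' 'ACG' 'TTA' | sort | uniq -c | sort -rn
# prints (counts are padded with leading spaces):
#    3 ACG
#    2 TTA
```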