Mac Unix & Perl
- On a Mac, plugged-in drives appear as subdirectories of the special /Volumes directory.
- cd ~ and cd do the same thing: go to the home directory (/Users/whatever).
- ls -l (lowercase L): long listing. ls -a: list all files (including hidden ones).
- In a pager such as man (try man man): Space shows the next page, b the previous page, q quits; j scrolls down a line, k scrolls up a line; h brings up a help page.
- rm: remove a file; rm -i *.txt: ask for confirmation before deleting each file; rmdir: remove an (empty) directory; mkdir: create a directory; mkdir -p Temp1/Temp2: create a nested directory, including any missing parents.
- touch: create a new, empty file, e.g. touch heaven.txt.
- mv: move (or rename) files, e.g. mv heaven.txt Temp/
- wildcards: ? matches any single character, and * matches any run of characters (including none).
- cp: copy a file. By default cp overwrites an existing destination without asking. cp -R Storage Storage1: copy a directory and its contents.
- less: view text files (read-only, no editing).
- ls -l: a leading d on a line marks a directory. ls -p: append a trailing slash to directory names.
- $ alias ls='ls -p' makes ls behave as ls -p. Aliases only exist in the current terminal session.
- nano profile: open a file named profile in the nano text editor. Write the following into it and save (Ctrl + O):
# some useful command line short-cuts
alias ls='ls -p'
alias rm='rm -i'
$ source profile
source tells Unix to read the contents of a file and treat it as a series of Unix commands.
- to show hidden files in the Finder on a Mac (run "killall Finder" afterwards to see the effect):
- defaults write com.apple.finder AppleShowAllFiles 1
- to hide them again (also followed by "killall Finder"):
- defaults write com.apple.finder AppleShowAllFiles 0
- Type source /Volumes/USB/Unix_and_Perl_course/.profile each time you open a terminal.
- grep: print the lines that match a pattern; -v inverts the match, -i ignores case, -c counts matching lines.
$ grep "ATGTGA" intro_IME_data.fasta | less
$ grep -i ACGTC * | head # show the first 10 lines of matches
$ head -n 1 chr1.fasta | sed 's/Chr1/Chromosome 1/' # head -n 1 prints the first line; sed substitutes text; the pipe (|) feeds one command's output into the next
- inside less, press "/" to search forward for a pattern such as "ATGTGA", and "?" to search backward.
- wc At_genes.gff; wc -l At_genes.gff: wc counts lines, words and bytes; -l counts only lines.
#!/usr/bin/perl
# by Wade
use warnings;
$x = 3;
print($x, "\n");
the first line (the shebang) tells Unix to run the script with perl, so an executable script can be launched by its own name instead of being passed to perl.
- Any text between single quotes will print exactly as shown: print '$x $s\n' #=> $x $s\n
- if we turn on strict (use strict;), every variable must be declared before use, for example as a local variable with my:
my $pi = 3.14;
my: declares a local (lexically scoped) variable.
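A minimal sketch of the two notes above: declaring variables with my under strict, and single versus double quotes. The values of $x and $s are made up for illustration.

```perl
use strict;
use warnings;

# With strict enabled, every variable must be declared (here with 'my').
my $x = 3;       # hypothetical values, for illustration only
my $s = "seq";

# Double quotes interpolate variables and escapes; single quotes do not.
my $double = "$x $s\n";   # becomes: 3 seq, followed by a newline
my $single = '$x $s\n';   # stays as the literal characters $x $s\n

print $double;
print $single, "\n";
```

Removing either my makes the script fail to compile under strict, which is the point of turning it on.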
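- string operators: eq (equal to); ne (not equal to); gt (greater than); lt (less than); . (concatenation); cmp (three-way comparison, returns -1, 0 or 1). For numbers use ==, !=, >, <, <=> instead.
- print $x == $y ? "yes\n" : "no\n"; # ternary operator: condition ? value_if_true : value_if_false
- functions: length() returns the length of a string; ord() converts a character to its numeric code; chr() converts a numeric code to a character.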
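The difference between numeric and string comparison is easy to get wrong, so here is a small sketch. The values 10 and 9 and the example words are assumptions chosen to make the difference visible.

```perl
use strict;
use warnings;

my ($x, $y) = (10, 9);   # hypothetical values

# Numeric comparison: 10 is greater than 9.
my $numeric = $x > $y ? "yes" : "no";

# String comparison: "10" sorts before "9" because "1" comes before "9".
my $stringy = $x gt $y ? "yes" : "no";

# cmp returns -1, 0 or 1, like <=> does for numbers.
my $order = "apple" cmp "banana";

# Concatenation, length, and character/code conversion.
my $codon = "AT" . "G";
my $len   = length($codon);
my $code  = ord("A");
my $char  = chr(65);

print "$numeric $stringy $order $codon $len $code $char\n";
```

Using gt where > was intended is a common source of wrongly sorted numbers.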
- =~ m// is the same as =~ //: matching; !~ //: not matching; =~ s///: substitution (first match only); =~ s///g: substitution of every match (global); =~ tr///: transliteration.
die "non-DNA character in input\n" if ($input =~ /[efijlopqxz]/i);
die ... if syntax: stop the program with a message when the condition is true.
$sequence =~ tr/A-Z/a-z/;
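The operators above can be tried together in one short script. This is only a sketch: the sequence in $input is invented, and the helper variable names are not from the course.

```perl
use strict;
use warnings;

my $input = "ATGCATGA";   # hypothetical input sequence

# Validate first: stop the program if a non-DNA letter is present.
die "non-DNA character in input\n" if $input =~ /[efijlopqxz]/i;

# Match test with =~ and its negation with !~.
my $has_start = $input =~ /ATG/ ? 1 : 0;
my $no_n      = $input !~ /N/   ? 1 : 0;

# s/// replaces only the first match; s///g replaces all of them.
(my $first = $input) =~ s/ATG/atg/;
(my $all   = $input) =~ s/ATG/atg/g;

# tr/// maps characters one-to-one and returns how many it translated.
my $lower = $input;
my $count = ($lower =~ tr/A-Z/a-z/);

print "$has_start $no_n $first $all $lower $count\n";
```

The (my $copy = $original) =~ s/// form edits a copy and leaves $input untouched.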
push @animals, "fox";
my $length = @animals;
my @gene_names = qw(unc-10 cyc-1 act-1 let-7 dyf-2);
my $joined_names = join(", ", @gene_names);
my @digest = split("", $dna); # an empty pattern splits the string $dna into single characters
- If you assign an array to a scalar variable, the scalar receives the number of elements in the array.
- difference between:
- $length = @animals; # scalar context: $length is the size of the array
- ($length) = @animals; # list context: $length is the first element of @animals
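A quick sketch of scalar versus list context, together with join and split. The animal names other than "fox" are invented for the example.

```perl
use strict;
use warnings;

my @animals = ("cat", "dog");   # hypothetical starting list
push @animals, "fox";

# Scalar context: the array yields its number of elements.
my $length = @animals;

# List context: the first element is assigned, the rest are dropped.
my ($first) = @animals;

# join glues a list into one string; split breaks a string into a list.
my $joined = join(", ", @animals);
my @bases  = split("", "ACGT");

print "$length $first $joined @bases\n";
```

The only difference between the two assignments is the pair of parentheses around the variable on the left.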
- Array:
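- pop(@array); shift(@array); push(@array, "element"); unshift(@array, "element"); splice(); pop and push work on the end of the array, shift and unshift on the front, and splice removes or replaces elements anywhere.
- scalar(@array): returns the number of elements in the array;
- a fractional index such as 1.2 or 1.7 is truncated to an integer (both mean index 1). -1 means count from the tail.
- @sorted_list = sort { $a <=> $b or uc($a) cmp uc($b) } @list; # numeric order first, then case-insensitive string order for ties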
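The array functions listed above are easier to remember after seeing them change a small array. The contents of @array and @nums below are assumptions for illustration.

```perl
use strict;
use warnings;

my @array = ("b", "c");      # hypothetical contents

push(@array, "d");           # add to the end:    b c d
unshift(@array, "a");        # add to the front:  a b c d
my $last  = pop(@array);     # remove from the end, returns "d"
my $first = shift(@array);   # remove from the front, returns "a"

# splice(ARRAY, OFFSET, LENGTH, LIST) replaces a slice in place.
my @nums = (10, 2, 33, 4);
splice(@nums, 1, 2, 7, 8);   # now 10 7 8 4

# Negative indices count from the end of the array.
my $tail = $nums[-1];

# Sort numerically first, then case-insensitively as a tie-breaker.
my @sorted = sort { $a <=> $b or uc($a) cmp uc($b) } @nums;

print "@array | $last $first | @nums | $tail | @sorted\n";
```

Without the comparison block, sort compares as strings, so 10 would come before 4.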
- foreach $animal (@animals) {print "$animal\n"}
- for my $i (0..5) {print "$i\n"}
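- 0, "0", "" (the empty string) and undef are considered false.
- next, redo, last: loop controls; next and last correspond to continue and break in C-like languages, and redo restarts the current iteration.
- while(<>) is short for while(defined($_ = <>)). The chomp() function removes a trailing \n from the end of a line if present.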
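A sketch of a read loop using chomp, next and last. To keep it self-contained it reads from an in-memory filehandle instead of <>; the input text and the "STOP" marker are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical input, read through an in-memory filehandle.
my $text = "ATG\n\n#comment\nSTOP\nCCC\n";
open(my $in, "<", \$text) or die "cannot open in-memory file: $!";

my @kept;
while (<$in>) {             # each line is placed in $_
    chomp;                  # strip the trailing newline, if any
    next if $_ eq "";       # like 'continue': skip empty lines
    next if /^#/;           # skip comment lines
    last if $_ eq "STOP";   # like 'break': leave the loop
    push @kept, $_;
}
close $in;

foreach my $line (@kept) { print "$line\n" }
for my $i (0..2)         { print "$i\n" }
```

Only "ATG" is kept: the empty line and the comment are skipped, and the loop ends at "STOP" before "CCC" is read.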
#!/usr/bin/perl
use strict; use warnings;
open(IN, "<$ARGV[0]") or die "error opening $ARGV[0] for reading";
open(OUT, ">$ARGV[0].munge") or die "error creating $ARGV[0].munge";
while(<IN>) {
    chomp;
    my $rev = reverse $_;
    print OUT "$rev\n";
}
close IN;
close OUT;
- I/O notes: $! stores the error message of the last failed system call (e.g. a failed open). select(HANDLE) changes the default output filehandle. Perl now allows you to use a regular scalar as a filehandle.
- the reverse() function reverses both arrays (in list context) and strings (in scalar context).
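The same idea can be sketched with a scalar filehandle and $! in the error message. Note the assumptions: this uses the three-argument form of open rather than the two-argument form in the script above, and it writes to a temporary file from the core File::Temp module instead of a file named on the command line.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a scratch file so the sketch does not touch real data.
my ($tmp, $path) = tempfile(UNLINK => 1);
print $tmp "ATGC\nGGA\n";   # hypothetical file contents
close $tmp;

# Three-argument open with a scalar filehandle; $! explains a failure.
open(my $in, "<", $path) or die "cannot read $path: $!";
my @reversed;
while (my $line = <$in>) {
    chomp $line;
    push @reversed, scalar(reverse($line));   # scalar context: reverse the string
}
close $in;

# In list context reverse flips the order of the elements instead.
my @flipped = reverse @reversed;

print "@reversed | @flipped\n";
```

The explicit scalar() matters: inside push, reverse would otherwise run in list context and return the line unchanged.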
# hash
%genetic_code = (
    ATG => 'Met',
    AAA => 'Lys',
    CCA => 'Pro',
);
foreach $key (keys %genetic_code) {
    print "$key $genetic_code{$key}\n";
}
if (exists $genetic_code{AAA}) {print "AAA codon has a value\n"}
else {print "No values set for AAA codon\n"}
delete $genetic_code{AAA};
- The keys() function returns a list of keys; the values() function returns a list of values.
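A strict-safe sketch of the hash operations above. Sorting the keys is an addition made here so that the output order is predictable; keys() on its own returns them in no particular order.

```perl
use strict;
use warnings;

my %genetic_code = (
    ATG => 'Met',
    AAA => 'Lys',
    CCA => 'Pro',
);

# Sort the keys for stable output, then build one report line per codon.
my @report;
foreach my $key (sort keys %genetic_code) {
    push @report, "$key $genetic_code{$key}";
}

my $before = exists $genetic_code{AAA} ? 1 : 0;
delete $genetic_code{AAA};
my $after  = exists $genetic_code{AAA} ? 1 : 0;

# In scalar context values() gives the number of values left.
my $remaining = scalar(values %genetic_code);

print join("\n", @report), "\n";
print "$before $after $remaining\n";
```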
- sort function:
- variable $&: the string matched by the last successful pattern match.
- uc(), lc() functions: return an uppercase or lowercase copy of a string (the original is unchanged unless you assign the result back):
my $str = "What is Perl Language for";
lc($str);
print $str, "\n"; # displays: What is Perl Language for
$str = lc($str);
print $str, "\n"; # displays: what is perl language for
- \s matches whitespace; \S matches non-whitespace. "=~ m/[^ATGC]/i" uses a negated character class: it matches any character other than A, T, G or C.
if($text =~ m/A{1,3}/) {...} # matches between 1 and 3 As
if($text =~ m/C{42}/) {...} # matches exactly 42 Cs
if($text =~ m/T{6,}/) {...} # matches at least 6 Ts
# to match a literal "." you have to escape it with a backslash: "\.", e.g. if($sequence =~ m/A\. thaliana/) {...}
my @fields = split; # with no arguments, split works on $_ and splits on whitespace
- Regular expressions in list context return the values captured by parenthesized patterns:
my ($beg, $end) = $line =~ /(\d+)\.\.(\d+)/;
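The regular expression notes above can be checked with a few small tests. The test strings and the feature line in $line are invented for illustration, and the exact-count example uses 4 instead of 42 to keep the string short.

```perl
use strict;
use warnings;

# Quantifiers: {1,3} is a range, {4} is exact, {6,} is a minimum.
my $range = "GAAT"  =~ /A{1,3}/ ? 1 : 0;
my $exact = "CCCC"  =~ /^C{4}$/ ? 1 : 0;
my $least = "TTTTT" =~ /T{6,}/  ? 1 : 0;   # only five Ts, so no match

# A negated character class finds anything that is not A, T, G or C.
my $bad = "ATGX" =~ /[^ATGC]/i ? 1 : 0;

# An escaped dot matches a literal "." rather than any character.
my $species = "A. thaliana" =~ /A\. thaliana/ ? 1 : 0;

# In list context a match returns the parenthesized captures.
my $line = "gene 120..450";   # hypothetical feature line
my ($beg, $end) = $line =~ /(\d+)\.\.(\d+)/;

# split with no arguments splits $_ on whitespace.
my @fields;
for ($line) { @fields = split }

print "$range $exact $least $bad $species $beg $end @fields\n";
```

The for ($line) { ... } wrapper is just a way to put $line into $_ so that the bare split has something to work on.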
- a filehandle can only be read through once per pass: after the loop reaches end-of-file, further reads return nothing. "If your inner loop is a filehandle iterator, then you will need to reset it."
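One possible way to reset the inner filehandle is seek, which moves the read position back to the start. This is an assumption about what the quoted advice means, not the course's own code, and the two in-memory inputs are made up so the sketch is self-contained.

```perl
use strict;
use warnings;

# Two hypothetical inputs held in memory instead of real files.
my $outer_text = "x\ny\n";
my $inner_text = "1\n2\n";
open(my $outer, "<", \$outer_text) or die "cannot open outer: $!";
open(my $inner, "<", \$inner_text) or die "cannot open inner: $!";

my @pairs;
while (my $o = <$outer>) {
    chomp $o;
    # Rewind the inner handle; without this it would sit at end-of-file
    # after the first outer iteration and yield no more lines.
    seek($inner, 0, 0) or die "cannot rewind: $!";
    while (my $i = <$inner>) {
        chomp $i;
        push @pairs, "$o$i";
    }
}
close $outer;
close $inner;

print "@pairs\n";
```

Closing and reopening the inner file inside the outer loop would be another option; for small files, reading the inner file into an array once is often simpler.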
