Comment
Variables
Values
Variable Typing
Example: Print a scalar
#!/usr/bin/perl -w
# First we store the DNA in a variable called $DNA
$DNA = 'ACGGGAGGACGGGAAAATTACTACGGCATTAGC';

# Next, we print the DNA onto the screen
print $DNA;

# Finally, we'll specifically tell the program to exit.
exit;
Example: Concatenate a scalar
#!/usr/bin/perl -w
# Store two DNA fragments into two variables called $DNA1 and $DNA2
$DNA1 = 'ACGGGAGGACGGGAAAATTACTACGGCATTAGC';
$DNA2 = 'ATAGTGCCGTGAGAGTGATGTAGTA';

# Print the DNA onto the screen
print $DNA1, "\n";
print $DNA2, "\n\n";

# Concatenate the DNA fragments into a third variable and print them
$DNA3 = "$DNA1$DNA2";
print "Here is the concatenation of the first two fragments (version 1):\n\n";
print "$DNA3\n\n";

# An alternative way using the "dot operator":
$DNA3 = $DNA1 . $DNA2;
print "Here is the concatenation of the first two fragments (version 2):\n\n";
print "$DNA3\n\n";

# Print the same thing without using the variable $DNA3
print $DNA1, $DNA2, "\n";

exit;
Operators
Arithmetic Operators
Numeric Comparisons
String Comparisons
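As a rough illustration of the three operator groups listed above, the sketch below applies a few arithmetic operators, one numeric comparison (<) and one string comparison (lt) to the same pair of scalars. The variable names and values are made up for this example. Perl has a separate set of comparison operators for numbers (==, !=, <, >, <=, >=, <=>) and for strings (eq, ne, lt, gt, le, ge, cmp), and the two sets can give different answers for the same values.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical values chosen only for illustration
my $length1 = 10;
my $length2 = 9;

# Arithmetic operators
my $sum       = $length1 + $length2;    # addition
my $remainder = $length1 % 3;           # modulus
my $power     = 2 ** 3;                 # exponentiation

# Numeric comparison: the operands are compared as numbers
my $numeric_less = ($length1 < $length2) ? 'yes' : 'no';

# String comparison: the operands are compared character by character,
# so "10" sorts before "9" because "1" comes before "9"
my $string_less = ($length1 lt $length2) ? 'yes' : 'no';

print "sum=$sum remainder=$remainder power=$power\n";
print "10 <  9 ? $numeric_less\n";
print "10 lt 9 ? $string_less\n";
```

Using a string operator on numbers (or the reverse) is a common source of bugs, which is one reason to run scripts with warnings enabled.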
Arrays are ordered collections of zero or more scalar values, indexed by position.
Array assignment
Accessing array elements
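A minimal sketch of array assignment and element access, using the four DNA bases as sample data. The variable names are chosen for this example only. Note that positions start at 0 and that a single element is written with a $ sigil because it is a scalar.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Array assignment: a list of scalars in parentheses
my @bases = ('A', 'C', 'G', 'T');

# Accessing elements: positions start at 0
my $first      = $bases[0];
my $last       = $bases[-1];    # negative indices count from the end
my $last_index = $#bases;       # index of the final element

# Assigning to a position changes that element in place
$bases[3] = 'U';

print "first=$first last=$last last index=$last_index\n";
print "@bases\n";               # prints: A C G U
```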
Array copy (using assignment operator)
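#!/usr/bin/perl -w
# Array copies

# Initialize two arrays with same content
@bases1 = ('A', 'C', 'G', 'T');
@bases2 = @bases1;
print "--- Initial values of two arrays ---\n";
print "@bases1\n@bases2\n";

# Modify the first array
$bases1[0] = 'U';
print "--- New values of two arrays ---\n";
print "@bases1\n@bases2\n";

exit;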
Scalar vs List context
#!/usr/bin/perl -w
# Demonstration of "scalar context" and "list context"
@bases = ('A', 'C', 'G', 'T');
print "@bases\n";

$a = @bases;
print $a, "\n";

($a) = @bases;
print $a, "\n";

exit;
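The difference between the two contexts can be sketched with named variables, as below. The names are illustrative. Assigning an array to a plain scalar evaluates it in scalar context and yields the number of elements, while parentheses on the left-hand side force list context and copy elements one by one.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @bases = ('A', 'C', 'G', 'T');

# Scalar context: the array evaluates to its number of elements
my $count = @bases;

# The scalar() function forces scalar context explicitly
my $explicit = scalar(@bases);

# List context: the parentheses make this a list assignment,
# so $first receives the first element
my ($first) = @bases;

# A longer list on the left takes elements in order;
# elements without a matching variable are dropped
my ($one, $two) = @bases;

print "count=$count explicit=$explicit first=$first two=$two\n";
```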
Array operators
#!/usr/bin/perl -w
@bases = ('A', 'C', 'G', 'T');
$base1 = shift @bases;
print "@bases";
output: ?
#!/usr/bin/perl -w
@bases = ('A', 'C', 'G', 'T');
unshift(@bases, 'U');
print "@bases";
output: ?
#!/usr/bin/perl -w
@bases = ('A', 'C', 'G', 'T');
$base1 = pop @bases;
print "@bases";
output: ?
#!/usr/bin/perl -w
@bases = ('A', 'C', 'G', 'T');
push(@bases, 'U');
print "@bases";
output: ?
#!/usr/bin/perl -w
@bases = ('A', 'C', 'G', 'T');
@reverse = reverse @bases;
print "@reverse";
output: ?
#!/usr/bin/perl -w
@bases = ('A', 'C', 'G', 'T');
$len = scalar @bases;
print $len;
output: ?
#!/usr/bin/perl -w
@bases = ('A', 'C', 'G', 'T');
splice (@bases, 2, 0, 'X');
print "@bases";
output: ?
How about splice(@bases, 2, 1, 'X'); instead?
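The two forms of splice differ only in the third argument, the number of elements to remove. The sketch below shows both on a separate, made-up array of codons so that the exercise above is left open. splice(ARRAY, OFFSET, LENGTH, LIST) removes LENGTH elements starting at OFFSET, puts LIST in their place, and returns the removed elements.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data for illustration
my @codons = ('ATG', 'GCC', 'TAA');

# LENGTH 0: nothing is removed, the new element is inserted before index 1
my @inserted = @codons;
splice(@inserted, 1, 0, 'AAA');
print "@inserted\n";              # prints: ATG AAA GCC TAA

# LENGTH 1: the element at index 1 is removed and replaced
my @replaced = @codons;
my @removed  = splice(@replaced, 1, 1, 'AAA');
print "@replaced\n";              # prints: ATG AAA TAA
print "removed: @removed\n";      # prints: removed: GCC
```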
#!/usr/bin/perl -w
$bases = 'ACGT';
@bases=split('', $bases);
print "@bases";
output: ?
#!/usr/bin/perl -w
@bases = ('A', 'C', 'G', 'T');
$bases=join('', @bases);
print $bases;
output: ?
#!/usr/bin/perl -w
@array = ('a', 'b', 'C',3, 1);
@sorted = sort (@array);
print "@sorted";
output: ?
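By default, sort compares elements as strings, which matters when an array holds numbers or mixed-case letters. The sketch below, on made-up data, contrasts the default order with a numeric sort (using <=>) and a case-insensitive sort (using cmp on lowercased values). Inside the sort block, $a and $b are the two elements being compared.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data for illustration
my @numbers = (10, 9, 100, 1);

# Default sort: string comparison, so "10" and "100" come before "9"
my @as_strings = sort @numbers;
print "@as_strings\n";            # prints: 1 10 100 9

# Numeric sort: compare with the <=> operator
my @as_numbers = sort { $a <=> $b } @numbers;
print "@as_numbers\n";            # prints: 1 9 10 100

my @names = ('b', 'C', 'a');

# Default sort: uppercase letters come before lowercase ones
my @default_order = sort @names;
print "@default_order\n";         # prints: C a b

# Case-insensitive sort: compare lowercased copies
my @no_case = sort { lc($a) cmp lc($b) } @names;
print "@no_case\n";               # prints: a b C
```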
#!/usr/bin/perl -w
#
# Calculating the reverse complement of a strand of DNA using string
#

# The DNA
$DNA = 'ACGGGAGGACGGGAAAATTACTACGGCATTAGC';
print "Here is the starting DNA:\n\n$DNA\n\n";

# Calculate the reverse
$revcom1 = reverse $DNA;

# Calculate the complement
$revcom1 =~ tr/ACGTacgt/TGCAtgca/;
print "Here is the reverse complement DNA using STRING:\n\n$revcom1\n\n";

#
# Calculating the reverse complement of a strand of DNA using array
#

# Split the DNA string into an array of characters
@DNA = split('', $DNA);

# Calculate the reverse
@reverse = reverse @DNA;

# Join the array of characters of the reverse
$revcom2 = join('', @reverse);

# Calculate the complement
$revcom2 =~ tr/ACGTacgt/TGCAtgca/;
print "Here is the reverse complement DNA using ARRAY:\n\n$revcom2\n";
A hash (also called an associative array) is a collection of zero or more pairs of scalar values, called keys and values; each value is looked up by its key.
Hash assignment
Accessing Hash elements
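A minimal sketch of hash assignment and element access. The second sequence and the variable names are invented for this example. A single element is written with a $ sigil and curly braces, and exists tests whether a key is present without creating it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hash assignment: a list of key => value pairs
my %genes = (
    'gene1' => 'AACCCGGTTGGTT',
);

# Adding (or overwriting) one pair; this sequence is hypothetical
$genes{'gene2'} = 'GGTTAACC';

# Accessing a value by its key
my $sequence = $genes{'gene1'};

# Testing whether a key is present
my $has_gene3 = exists $genes{'gene3'} ? 'yes' : 'no';

# Counting the pairs
my $count = scalar(keys %genes);

print "gene1 is $sequence\n";
print "gene3 present? $has_gene3\n";
print "number of genes: $count\n";
```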
Hash operators
#!/usr/bin/perl -w
%genes = ( 'gene1' => 'AACCCGGTTGGTT', 'gene2'=>'CCTTTDGGAAGGTC' );
@keys = keys %genes; @values = values %genes;
print "Keys are: @keys\n"; print "Values are: @values";
output: ?
#!/usr/bin/perl -w
%genes = ( 'gene1' => 'AACCCGGTTGGTT', 'gene2'=>'CCTTTDGGAAGGTC' );
%rev_genes = reverse %genes; @keys = keys %rev_genes; @values = values %rev_genes;
print "Keys are: @keys\n"; print "Values are: @values";
output: ?
What if there are duplicates in the values?
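One way to explore that question is the sketch below, which uses invented short sequences. Because hash keys are unique, two genes that share a sequence collapse into a single entry when the hash is reversed. Which gene name survives depends on the internal order of the hash, so it should not be relied on.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical data: gene1 and gene2 share the same sequence
my %genes = (
    'gene1' => 'AACC',
    'gene2' => 'AACC',
    'gene3' => 'GGTT',
);

# reverse turns (key, value) pairs into (value, key) pairs
my %rev_genes = reverse %genes;

my $before = scalar(keys %genes);
my $after  = scalar(keys %rev_genes);

print "Pairs before reversing: $before\n";    # 3
print "Pairs after reversing:  $after\n";     # 2

# Either gene1 or gene2 is kept for 'AACC'; the other is lost
print "AACC now maps to $rev_genes{'AACC'}\n";
```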
#!/usr/bin/perl -w
%genes = ( 'gene1' => 'AACCCGGTTGGTT', 'gene2'=>'CCTTTDGGAAGGTC' );
delete $genes{'gene1'}; @keys = keys %genes; @values = values %genes;
print "Keys are: @keys\n"; print "Values are: @values";
output: ?
Example: restriction enzyme hash
#!/usr/bin/perl -w
# Restriction enzymes are proteins that cut DNA at short, specific sequences
# e.g., EcoRI cuts where it finds GAATTC, between G and A
#
# Initialize restriction enzyme hash
# keys are the names of restriction enzymes, values are the DNA sequence they cut
#
%re_lookup = (
    'Eco47III' => 'AGCGCT',
    'EcoRI'    => 'GAATTC',
    'HindIII'  => 'AAGCTT',
);

print "Enter restriction enzyme name\n";
$re = <STDIN>;
chomp $re;
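The script above stops after reading the enzyme name. A plausible next step is to look the name up in the hash; the sketch below shows one way to do that. The recognition_site helper and the messages are assumptions made for illustration, not part of the original script, and fixed names stand in for the keyboard input so the sketch runs on its own.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %re_lookup = (
    'Eco47III' => 'AGCGCT',
    'EcoRI'    => 'GAATTC',
    'HindIII'  => 'AAGCTT',
);

# Hypothetical helper: return the recognition site for an enzyme name,
# or undef if the name is not in the hash
sub recognition_site {
    my ($name) = @_;
    return exists $re_lookup{$name} ? $re_lookup{$name} : undef;
}

# Fixed names replace <STDIN> here; BamHI is deliberately absent from the hash
for my $re ('EcoRI', 'BamHI') {
    my $site = recognition_site($re);
    if (defined $site) {
        print "$re cuts at $site\n";
    } else {
        print "$re is not in the lookup table\n";
    }
}
```

Checking with exists before reading the value avoids an "uninitialized value" warning when the user types a name that is not in the hash.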
Example: Genetic code
use strict;
use warnings;

#
# codon2aa
#
# A subroutine to translate a DNA 3-character codon to an amino acid
# Version 3, using hash lookup
sub codon2aa {
    my($codon) = @_;

    $codon = uc $codon;

    my(%genetic_code) = (
        'TCA' => 'S',    # Serine
        'TCC' => 'S',    # Serine
        'TCG' => 'S',    # Serine
        'TCT' => 'S',    # Serine
        'TTC' => 'F',    # Phenylalanine
        'TTT' => 'F',    # Phenylalanine
        'TTA' => 'L',    # Leucine
        'TTG' => 'L',    # Leucine
        'TAC' => 'Y',    # Tyrosine
        'TAT' => 'Y',    # Tyrosine
        'TAA' => '_',    # Stop
        'TAG' => '_',    # Stop
        'TGC' => 'C',    # Cysteine
        'TGT' => 'C',    # Cysteine
        'TGA' => '_',    # Stop
        'TGG' => 'W',    # Tryptophan
        'CTA' => 'L',    # Leucine
        'CTC' => 'L',    # Leucine
        'CTG' => 'L',    # Leucine
        'CTT' => 'L',    # Leucine
        'CCA' => 'P',    # Proline
        'CCC' => 'P',    # Proline
        'CCG' => 'P',    # Proline
        'CCT' => 'P',    # Proline
        'CAC' => 'H',    # Histidine
        'CAT' => 'H',    # Histidine
        'CAA' => 'Q',    # Glutamine
        'CAG' => 'Q',    # Glutamine
        'CGA' => 'R',    # Arginine
        'CGC' => 'R',    # Arginine
        'CGG' => 'R',    # Arginine
        'CGT' => 'R',    # Arginine
        'ATA' => 'I',    # Isoleucine
        'ATC' => 'I',    # Isoleucine
        'ATT' => 'I',    # Isoleucine
        'ATG' => 'M',    # Methionine
        'ACA' => 'T',    # Threonine
        'ACC' => 'T',    # Threonine
        'ACG' => 'T',    # Threonine
        'ACT' => 'T',    # Threonine
        'AAC' => 'N',    # Asparagine
        'AAT' => 'N',    # Asparagine
        'AAA' => 'K',    # Lysine
        'AAG' => 'K',    # Lysine
        'AGC' => 'S',    # Serine
        'AGT' => 'S',    # Serine
        'AGA' => 'R',    # Arginine
        'AGG' => 'R',    # Arginine
        'GTA' => 'V',    # Valine
        'GTC' => 'V',    # Valine
        'GTG' => 'V',    # Valine
        'GTT' => 'V',    # Valine
        'GCA' => 'A',    # Alanine
        'GCC' => 'A',    # Alanine
        'GCG' => 'A',    # Alanine
        'GCT' => 'A',    # Alanine
        'GAC' => 'D',    # Aspartic Acid
        'GAT' => 'D',    # Aspartic Acid
        'GAA' => 'E',    # Glutamic Acid
        'GAG' => 'E',    # Glutamic Acid
        'GGA' => 'G',    # Glycine
        'GGC' => 'G',    # Glycine
        'GGG' => 'G',    # Glycine
        'GGT' => 'G',    # Glycine
    );

    if(exists $genetic_code{$codon}) {
        return $genetic_code{$codon};
    }else{
        print STDERR "Bad codon \"$codon\"!!\n";
        exit;
    }
}

#
# dna2peptide
#
# A subroutine to translate DNA sequence into a peptide
sub dna2peptide {
    my($dna) = @_;

    # Initialize variables
    my $protein = '';

    # Translate each three-base codon to an amino acid, and append to a protein
    for(my $i=0; $i < (length($dna) - 2) ; $i += 3) {
        $protein .= codon2aa( substr($dna,$i,3) );
    }

    return $protein;
}

print "Please enter your dna sequence:\n";
my $dna = <STDIN>;
chomp $dna;    # remove the trailing newline so it is never read as part of a codon
my $peptide = dna2peptide($dna);
print "Here is the translated protein sequence: $peptide\n";

exit;
How about modifying the above code to accommodate the six reading frames?
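One possible starting point for this exercise is sketched below. It only produces the six DNA strings to translate: the forward strand starting at offsets 0, 1 and 2, and the reverse complement starting at the same three offsets. The revcom and six_frames helpers and the sample sequence are assumptions made for illustration. Each returned string could then be passed to dna2peptide, whose loop already ignores an incomplete codon at the end.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: reverse complement of a DNA string,
# using the same reverse and tr steps as the earlier example
sub revcom {
    my ($dna) = @_;
    my $rc = reverse $dna;
    $rc =~ tr/ACGTacgt/TGCAtgca/;
    return $rc;
}

# Hypothetical helper: return the six reading frames as DNA strings.
# Assumes the sequence is at least two bases long.
sub six_frames {
    my ($dna) = @_;
    my @frames;
    for my $strand ($dna, revcom($dna)) {
        for my $offset (0 .. 2) {
            # substr with two arguments returns everything from $offset onward
            push @frames, substr($strand, $offset);
        }
    }
    return @frames;
}

# Sample sequence chosen only for illustration
my @frames = six_frames('ATGAAACCC');
print "$_\n" for @frames;
```

For the sample sequence this prints ATGAAACCC, TGAAACCC and GAAACCC for the forward strand, followed by GGGTTTCAT, GGTTTCAT and GTTTCAT for the reverse complement.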
Some examples and Perl scripts are adapted from the book Beginning Perl for Bioinformatics, James Tisdall, 2001, ISBN 0-596-00080-4.