MIE453 - Bioinformatics Systems (Fall 06)

Tutorial 3 - Loops, Control & Subroutines

Contents

  1. Statements & Blocks
  2. Loops
  3. Logical Operators & Flow Control
  4. Subroutines
  5. Unix Basic (from Software Carpentry): 1, 2

1. Statements & Blocks

Statements

Blocks

2. Loops

Loops repeatedly execute the statements in a block until a conditional test changes value.

While Loop

Repeatedly executes a block of code as long as the condition remains true.

Syntax

while ( CONDITION ) {
  block of code to execute while condition is true
}

Control Flow

  1. Evaluate CONDITION,
  2. If CONDITION is true, execute the block, and return to step 1,
  3. otherwise do nothing and the loop is over

Note that the block may be executed zero or more times.

Example

# This program prints the integers
# 1, 2, 3 in reverse order
my $x = 3;
while ($x > 0) {
	print "$x";
	$x = $x-1;
}
print "\n";

Special operators modify the flow of a loop; for example, last exits the loop immediately.

Example

# This program prints the integers from 10 to 12
my $x = 10;
while (1) {
	print "$x\n";
	if ($x == 12) {
		last;
	} 
	$x++;
}
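
As a minimal sketch of loop control, the code below combines last with next, a second control operator that skips the rest of the current iteration. The variable names @odds and $n are illustrative only.

```perl
use strict;
use warnings;

# Collect the odd integers starting from 1, stopping once 7 is reached.
my @odds;
my $n = 0;
while (1) {
    $n++;
    next if $n % 2 == 0;   # even number: skip to the next iteration
    push @odds, $n;
    last if $n == 7;       # leave the loop entirely
}
print "@odds\n";   # prints "1 3 5 7"
```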

For Loop

Like the while loop, it executes a block of code as long as the condition holds true. It differs from the while loop in that it
incorporates the initialization and increment of the condition variable into the loop header.

Syntax

for (INITIALIZATION; CONDITION; INCREMENT ) {
  block of code to execute while condition is true
}

Control Flow

  1. Evaluate INITIALIZATION,
  2. Evaluate CONDITION,
  3. If CONDITION is true, execute the block, and then evaluate INCREMENT and return to step 2,
  4. otherwise do nothing and the loop is over

Note that the block may be executed zero or more times.

Example

# This program prints the integers
# 1, 2, 3 in reverse order
for ($x = 3; $x > 0; $x = $x-1) {
	print "$x";
}
print "\n";

This program behaves exactly the same as:

my $x = 3;
while ($x > 0) {
	print "$x";
	$x = $x-1;
}
print "\n";
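
A for loop is also handy for walking through the positions of a string. The sketch below assumes an illustrative sequence in $dna and counts its G bases with substr; the variable names are not part of the tutorial.

```perl
use strict;
use warnings;

my $dna     = 'ACGGTG';
my $g_count = 0;

# Visit every position in the string, from 0 to length - 1.
for (my $pos = 0; $pos < length($dna); $pos++) {
    $g_count++ if substr($dna, $pos, 1) eq 'G';
}
print "$g_count\n";   # prints "3"
```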

Foreach

A convenient way to iterate through the elements of an array.

Syntax

foreach VARIABLE (LIST) {
  block of code to execute for each element in LIST
}

Control Flow

  1. Set VARIABLE to next element of LIST,
  2. if above step succeeded, execute the block, return to step 1,
  3. otherwise do nothing and the loop is over

Note that the block may be executed zero or more times.

Example

# This program prints the elements of an array
@array = ('one', 'two', 'three');
foreach $element (@array){
	print "$element\n";
}

Note: inside the block, the loop variable $element is an alias for the current array element. If you assign to it, the array itself is changed, and the change stays in effect after you have left the foreach loop.

Example

@array = ('one', 'two', 'three');
foreach $element (@array){
	$element = 'four';
}
foreach $element (@array){
	print "$element\n";
}
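
The aliasing behaviour can be put to use, or avoided, as in this small sketch. The array @sequences and the variable $copy are illustrative names.

```perl
use strict;
use warnings;

my @sequences = ('acgt', 'ttga');

# Assigning to the loop variable changes the array elements themselves.
foreach my $seq (@sequences) {
    $seq = uc $seq;
}
print "@sequences\n";   # prints "ACGT TTGA"

# Working on a copy leaves the array untouched.
foreach my $seq (@sequences) {
    my $copy = lc $seq;
}
print "@sequences\n";   # prints "ACGT TTGA"
```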

Exercise: rewrite the code above without using foreach.

Do-until loop

Executes the block before the conditional test.

Syntax

do {
  block of code to execute at least once
} until ( CONDITION );

Control Flow

  1. Execute the block,
  2. Evaluate CONDITION,
  3. If CONDITION is false, return to step 1,
  4. otherwise do nothing and the loop is over

Note that the block is always executed at least once.

Example

# This program prints the integers
# 1, 2, 3 in reverse order
$i = 3;
do {
	print $i, "\n";
	$i--;
} until ($i <= 0);
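
To see the difference from a while loop, the following sketch starts with a counter that already fails the while test. The counters $while_runs and $do_runs are illustrative and only record how often each block runs.

```perl
use strict;
use warnings;

my $count = 0;

# The condition is false from the start, so the while block never runs.
my $while_runs = 0;
while ($count > 0) {
    $while_runs++;
    $count--;
}

# The do-until block runs once before its condition is tested.
my $do_runs = 0;
do {
    $do_runs++;
} until ($count <= 0);

print "while ran $while_runs time(s), do-until ran $do_runs time(s)\n";
# prints "while ran 0 time(s), do-until ran 1 time(s)"
```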

3. Logical Operators & Flow Control

Conditional Test

In a conditional test, an expression evaluates to True or False.

Using numeric and string operators

Other expressions

Almost all other expressions in Perl return some value, so you can use them in a conditional test.

Example                              Result
""                                   False
'abc'                                True
()                                   False
(1=>'A', 2=>'G', 3=>'C', 4=>'T')     True
0                                    False
-13.5                                True
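
These rules can be checked directly. The sketch below uses a hypothetical helper named truth, which is not a Perl built-in, to report how a value behaves in a conditional test.

```perl
use strict;
use warnings;

# Hypothetical helper: report how Perl treats a value in a conditional test.
sub truth {
    my ($value) = @_;
    return $value ? 'True' : 'False';
}

my @empty = ();
my @bases = ('A', 'G', 'C', 'T');

print 'empty string: ', truth(''),    "\n";   # False
print 'abc:          ', truth('abc'), "\n";   # True
print '0:            ', truth(0),     "\n";   # False
print '-13.5:        ', truth(-13.5), "\n";   # True
print 'string "0":   ', truth('0'),   "\n";   # False
print 'string "0.0": ', truth('0.0'), "\n";   # True, a common surprise

# An array in a conditional test is true when it has at least one element.
print "empty array is false\n"    unless @empty;
print "non-empty array is true\n" if @bases;
```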

Logical Operators

Logical operators are used to build complex expressions out of simple ones.

Perl has four basic logical operators:

not operator (!)

Unary operator; returns the logical opposite of its operand.

Example                    Result
not 'abc'                  False
not $my_gene # undefined   True

and operator (&&)

Binary operator; the result is True if both operands are True, otherwise False.

Example                    Result
0 and 'abc'                False
('ACG') and 'ACG'          True

or operator (||)

Binary operator; the result is False if both operands are False, otherwise True.

Example                    Result
0 or 'abc'                 True
('ACG') or 'ACG'           True

xor operator

Exclusive-or, binary operator; the result is True if exactly one of the operands is True, otherwise False.

Example                    Result
0 xor 'abc'                True
('ACG') xor 'ACG'          False
() xor 0                   False
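
A short sketch of the operators in use follows. It also illustrates a detail worth knowing: && and || stop as soon as the result is decided and return the last operand they evaluated, not just a plain True or False. The variable names are illustrative.

```perl
use strict;
use warnings;

my $dna   = 'ACG';
my $empty = '';

# && and || short-circuit and return the last operand they evaluated.
my $both   = $dna && 'second';      # 'second'
my $either = $empty || 'default';   # 'default'

# ! and xor yield plain true/false values.
my $negated   = !$dna ? 'True' : 'False';              # 'False'
my $exclusive = ($dna xor $empty) ? 'True' : 'False';  # 'True'

print "$both $either $negated $exclusive\n";   # prints "second default False True"
```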

Example: Use a loop to read data from file

#!/usr/bin/perl -w
# Reading protein sequence data from a file using a loop

# The filename of the file containing the protein sequence data
$proteinfilename = 'fragment.pep';

open(PROTEINFILE, $proteinfilename) or die "Cannot open $proteinfilename: $!\n";

# Read the protein sequence data from the file one line at a time and print it
while ($protein = <PROTEINFILE>) {
	print $protein;
}

close PROTEINFILE;
exit;

Example: Use a loop to read data from keyboard

#!/usr/bin/perl -w
# Reading protein sequence data from the keyboard using a loop

# Read the protein sequence data from STDIN one line at a time and print it,
# stopping when the user types 'quit'
while ($protein = <STDIN>) {
	chomp($protein);
	if ($protein eq 'quit') {
		exit;
 	}
	print "$protein\n";
}

Example: Operator Chaining

#!/usr/bin/perl -w
# Show how to use operator chaining in flow control

# Read the protein sequence data from STDIN one line at a time and print it;
# an empty line ends the program
while ($protein = <STDIN>) {
	chomp($protein);
	$protein or die "Done.\n";
	print "Sequence entered: $protein\n";
}

If Statement

Executes a block of code once if its condition is true; optional elsif and else branches handle the remaining cases.

Syntax

if ( CONDITION1 ) {
  BLOCK1
} elsif ( CONDITION2 ) {
  BLOCK2
} else {
  BLOCK3
}

Control Flow

  1. Evaluate CONDITION1,
  2. If CONDITION1 is true, execute BLOCK1, and exit the if statement,
  3. otherwise evaluate CONDITION2,
  4. if CONDITION2 is true, execute BLOCK2, and exit the if statement,
  5. otherwise execute BLOCK3

Notice

Example: A conditional statement

#!/usr/bin/perl -w
# if-elsif-else

$word = 'MNIDDKL';

# if-elsif-else conditionals
if($word eq 'QSTVSGE') {
    print "QSTVSGE\n";
} elsif($word eq 'MRQQDMISHDEL') {
    print "MRQQDMISHDEL\n";
} elsif ( $word eq 'MNIDDKL' ) {
    print "MNIDDKL-the magic word!\n";
} else {
    print "Is \"$word\" a peptide? This program is not sure.\n";
}

Example: Valid DNA sequence

#!/usr/bin/perl -w

# Read the DNA sequence data from STDIN, validate it, and print it
while ($protein = <STDIN>) {
	chomp($protein);
	if (!$protein) {
		die "Done.\n";
	} elsif ($protein =~ /^[ACGTacgt]*$/) {
		$protein = uc $protein;
		print "Sequence entered: $protein\n";
	} else {
		print "Invalid sequence!\n";
	}
}

4. Subroutines

Subroutines

A subroutine encapsulates a piece of code so that it can be "invoked" from other code.

Syntax:

sub SUBROUTINE_NAME {
  BLOCK
} 

Subroutine Invocation

A subroutine is called by writing its name followed by a list of arguments enclosed in parentheses:

SUBROUTINE_NAME(ARG_LIST);

A subroutine receives all its arguments in the special array @_.

Example: Subroutine

#!/usr/bin/perl -w

# Subroutine
sub concatenate_dna {
	my($dna1, $dna2) = @_;
	
	my($new_dna);

	$new_dna = "$dna1$dna2";

	return $new_dna;
}

print concatenate_dna('AAA', 'CGC'),"\n";

exit;

Note: the keyword my gives $dna1, $dna2 and $new_dna local scope (i.e., they are visible only inside the subroutine).

Array as argument

If any arrays are given as arguments, their elements are interpolated into the @_ list:

Example: Array as argument

#!/usr/bin/perl -w

# Array as argument
sub concatenate_dna {
	my(@genes) = @_;
	
	my($new_dna);

	$new_dna = $genes[0].$genes[1];

	return $new_dna;
}

my @array = ('AAA', 'CGC');
print concatenate_dna(@array),"\n";

exit;

Example: Array and scalar as arguments

#!/usr/bin/perl -w

# Array as argument
sub concatenate_dna {
	my(@genes) = @_;
	
	my($new_dna);

	$new_dna = $genes[0].$genes[1].$genes[2];

	return $new_dna;
}

my @array = ('AA', 'CC');
print concatenate_dna(@array, 'GG'),"\n";

exit;

Pass by Reference

In the examples above, the elements of arrays (this is also true for hashes) get "flattened" into the single @_ array, so the subroutine cannot tell where one argument ends and the next begins.

Use pass by reference to avoid this situation.

Example: Pass by Reference

#!/usr/bin/perl -w

# Arrays passed by reference
sub concatenate_dna {
	# In Pass by Reference, arguments are collected
	# from the @_ array, and saved as scalar variable.
	# This is because a reference is a special kind of 
	# scalar data value. 
	my($genes1_ref, $genes2_ref) = @_;
	
	# We need to dereference the referenced arguments 
	# before using them, by prepending the symbol showing
	# the type of referenced arguments (e.g., @ for arrays).
	my(@genes1) = @$genes1_ref;
	my(@genes2) = @$genes2_ref;
 
	my($new_dna);

	$new_dna = $genes1[0].$genes1[1].$genes2[0].$genes2[1];

	return $new_dna;
}

my @genes1 = ('AA', 'CC');
my @genes2 = ('GG', 'TT');

# To pass arguments by reference, preface the argument
# names with a backslash.
print concatenate_dna(\@genes1, \@genes2),"\n";

exit;
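
The same approach works for hashes. In the sketch below, a hypothetical subroutine complement_bases receives one hash reference and one array reference, so the two collections stay separate instead of being flattened into @_. The names %pairs and @bases are illustrative.

```perl
use strict;
use warnings;

# Hypothetical subroutine: look up the complement of each base in a list.
sub complement_bases {
    my ($pairs_ref, $bases_ref) = @_;
    my @result;
    foreach my $base (@$bases_ref) {
        push @result, $pairs_ref->{$base};   # -> reaches into the referenced hash
    }
    return join('', @result);
}

my %pairs = (A => 'T', T => 'A', C => 'G', G => 'C');
my @bases = ('A', 'A', 'C', 'G');

print complement_bases(\%pairs, \@bases), "\n";   # prints "TTGC"
```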

Scoping and Strictness

By default, every variable in Perl is publicly visible, i.e., global.

Local Scoping

Use the my keyword to declare variables with local scope.

Use the our keyword to declare variables with global scope.
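
A minimal sketch of the two keywords side by side, using the illustrative names $organism and describe:

```perl
use strict;
use warnings;

our $organism = 'E. coli';    # global variable

sub describe {
    my $organism = 'yeast';   # local variable: hides the global inside this sub only
    return "inside: $organism";
}

my $inside  = describe();
my $outside = "outside: $organism";

print "$inside\n";    # prints "inside: yeast"
print "$outside\n";   # prints "outside: E. coli"
```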

Strictness

Perl is considered a "loose" programming language.

Some "golden rules" of traditional programming languages are not enforced by default.

Put the use strict; directive near the top of the program to enforce these "golden rules".
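
One rule that use strict enforces is that every variable must be declared before it is used. The sketch below wraps the offending statement in a string eval so that the error can be shown without stopping the program; the variable names are illustrative.

```perl
use strict;
use warnings;

my $dna = 'ACGT';   # declared with my, so strict accepts it

# $undeclared was never declared, so compiling this statement fails under strict.
my $ok    = eval q{ $undeclared = 'ACGT'; 1 };
my $error = $@;

print "strict rejected the undeclared variable:\n$error" unless $ok;
```

Without use strict, the same assignment would silently create a new global variable, which is how misspelled variable names go unnoticed.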

Notice:

Line Input Operator: <>

This operator is usually used in the condition of a while loop, as follows:

while (<STDIN>) {
    print;
}

Which is equivalent to:

while (defined($_ = <STDIN>)) {
    print $_ ;
}

$_ is a special variable; Perl automatically assigns each input line to it when the line input operator is the only thing in the condition of a while loop.

Sometimes, this operator is used alone as shown below:

while (<>) {
    print $_; 
}

Which is equivalent to:

# "-" returns the standard input when opened.
# So we assume STDIN iff empty
@ARGV = ('-') unless @ARGV;
while (@ARGV) {
	# @ARGV is the special array which contains the command line arguments
	# remove first argument from list
	$ARGV = shift @ARGV;

	if (!open(ARGV, $ARGV)) {
		warn "Can't open $ARGV: $!\n";
		next;
	}

	# process each file in turn
	while (<ARGV>) {
		print $_;
	}
}

Suppose we put this code in a Perl script called print_all.pl and run it with several files:

perl print_all.pl a.txt b.txt c.txt

It will print out the content of each file one by one.
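
The automatic assignment to $_ can be tried without any input files. The sketch below reads from an in-memory filehandle, which stands in for STDIN or a real file; the names $text, $fh and @lines are illustrative.

```perl
use strict;
use warnings;

# An in-memory filehandle stands in for STDIN or a file here.
my $text = "MNIDDKL\nQSTVSGE\n";
open(my $fh, '<', \$text) or die "Cannot open in-memory file: $!\n";

my @lines;
while (<$fh>) {       # each line is assigned to $_
    chomp;            # chomp works on $_ by default
    push @lines, $_;
}
close $fh;

print scalar(@lines), " lines read: @lines\n";   # prints "2 lines read: MNIDDKL QSTVSGE"
```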


Some examples and Perl scripts are adapted from the book Beginning Perl for Bioinformatics, James Tisdall, 2001, ISBN 0-596-00080-4.