MIE453 - Bioinformatics Systems (Fall 06)

Tutorial 6 - HTML & CGI

Contents

  1. HTML
  2. XHTML
  3. CGI
  4. Using CGI on ECF

1. HTML

Hyper Text Markup Language

Document structure

Basic Tags

Comments

Comments are defined with <!-- -->.

A comment will be ignored by the browser.

<!-- This is a comment -->

Headings

Headings are defined with the <h1> (largest heading) to <h6> (smallest heading) tags.

HTML automatically adds an extra blank line before and after a heading.

<h1>This is a Heading 1 line</h1>

Paragraphs

Paragraphs are defined with the <p> tag.

HTML automatically adds an extra blank line before and after a paragraph.

<p>This is a paragraph</p>

Line Breaks

The <br> tag is used when you want to end a line, but don't want to start a new paragraph.

Line one <br>
Line two <br>

Named Anchor

Named anchors are used to link to a location within an HTML document.

Named anchors are defined with <a name="ANCHOR_NAME">.

<a name="a1">This is an anchor</a>

Links

Links are defined with the anchor tag <a href="URL">.

The href attribute gives the address of the document to link to.

Use #ANCHOR_NAME in the URL to link to a named anchor within an HTML document.

<a href="#a1">This is a link to above anchor</a>

Unordered Lists

An unordered list is a list of items marked with bullets.

An unordered list starts with the <ul> tag. Each list item starts with the <li> tag.

<ul>
  <li>Item 1</li>
  <li>Item 2</li>
</ul>

Inside a list item you can put paragraphs, line breaks, images, links, other lists, etc.

Ordered Lists

An ordered list is a list of items marked with numbers.

An ordered list starts with the <ol> tag. Each list item starts with the <li> tag.

<ol>
  <li>Item 1</li>
  <li>Item 2</li>
</ol>

Definition Lists

A definition list is a list of terms and their explanations.

A definition list starts with the <dl> tag.

Each definition-list term starts with the <dt> tag.

Each definition-list definition starts with the <dd> tag.

<dl>
  <dt>Item 1</dt>
  <dd>Definition for Item 1</dd>
  <dt>Item 2</dt>
  <dd>Definition for Item 2</dd>
</dl>

Images

An image can be inserted into a document with the <img src="URL"> tag. The <img> tag has no closing tag.

<img src="http://www.mie.utoronto.ca/labs/cvdl/personnel/alireza/images/embgold2.gif">

Tables

Tables are defined with the <table> tag.

A table is divided into rows (with the <tr> tag), and each row is divided into data cells (with the <td> tag).

A data cell can contain text, images, lists, paragraphs, forms, horizontal rules, tables, etc.

<table border="1">
<tr>
	<td>cell 1.1</td>
	<td>cell 1.2</td>
</tr>
<tr>
	<td>cell 2.1</td>
	<td>cell 2.2</td>
</tr>
</table>

Forms

A form contains a set of form elements, which allow the user to enter information.

A form is defined with the <form> tag.

Input

The input element allows single-line input and is defined with the <input> tag.

The type of input is specified with the type attribute. It's also a good idea to include the name attribute.

The most commonly used input types are text field, radio button and checkbox.

Text Fields

Text fields are used to enter letters, numbers, etc. in a form.

<form>
	Text Field 1: <input type="text" name="tr1"><br>
	Text Field 2: <input type="text" name="tr2">
</form>

Notice:

Radio Buttons

Radio Buttons are used when you want to select exactly one of a number of choices.

<form>
	Radio Button 1 <input type="radio" name="rb1"><br>
	Radio Button 2 <input type="radio" name="rb1">
</form>

Checkboxes

Checkboxes are used when you want to select one or more options of a number of choices.

<form>
	<input type="checkbox" name="cb1"> Checkbox 1 <br>
	<input type="checkbox" name="cb2"> Checkbox 2 <br>
</form>

Textareas

The textarea element allows multi-line text input and is defined with the <textarea> tag.

<form>
	<textarea rows="10" cols="20"> Please enter your text here ..</textarea> 
</form>

Selects

The select element creates a drop-down list and is defined with the <select> tag. With the multiple and size attributes it is shown as a list box that allows several selections.

<form>
	<select name="select1" multiple="multiple" size="3">
		<option value="op1">Option 1</option>
		<option value="op2" selected="selected">Option 2</option>
		<option value="op3">Option 3</option>
	</select>
</form>

Action Attribute and the Submit Button

When the user clicks on the "Submit" button, the content of the form is sent to another file.

The form's action attribute defines the name of the file to send the content to.

<form name="" action="something.pl" method="get">
	<input type="submit" value="Submit">
</form>

2. XHTML 1.0

EXtensible Hyper Text Markup Language

Rules

3. CGI

Common Gateway Interface

How CGI works

This figure is taken from http://www.oreilly.com/openbook/cgi/

Sample client request:

GET /ta/mie453/tutorial/tut6/t6-1.cgi HTTP/1.0
Accept: text/html
Accept: image/gif
User-Agent: Lynx/2.4 libwww/2.14

Explanation:

Input to CGI Programs

On Unix systems, CGI programs get their input in two ways: through environment variables and through standard input. The environment variables set by the server are listed below.

GATEWAY_INTERFACE

The revision of the Common Gateway Interface that the server uses.

SERVER_NAME

The server's hostname or IP address.

SERVER_SOFTWARE

The name and version of the server software that is answering the client request.

SERVER_PROTOCOL

The name and revision of the information protocol the request came in with.

SERVER_PORT

The port number of the host on which the server is running.

REQUEST_METHOD

The method with which the information request was issued.

PATH_INFO

Extra path information passed to a CGI program.

PATH_TRANSLATED

The translated version of the path given by the variable PATH_INFO.

SCRIPT_NAME

The virtual path (e.g., /cgi-bin/program.pl) of the script being executed.

DOCUMENT_ROOT

The directory from which Web documents are served.

QUERY_STRING

The query information passed to the program. It is appended to the URL with a "?".

REMOTE_HOST

The remote hostname of the user making the request.

REMOTE_ADDR

The remote IP address of the user making the request.

AUTH_TYPE

The authentication method used to validate a user.

REMOTE_USER

The authenticated name of the user.

REMOTE_IDENT

The user making the request. This variable is only set if the NCSA IdentityCheck flag is enabled and the client machine supports the RFC 931 identification scheme (ident daemon).

CONTENT_TYPE

The MIME type of the query data, such as "text/html".

CONTENT_LENGTH

The length of the data (in bytes or the number of characters) passed to the CGI program through standard input.

HTTP_FROM

The email address of the user making the request. Most browsers do not support this variable.

HTTP_ACCEPT

A list of the MIME types that the client can accept.

HTTP_USER_AGENT

The browser the client is using to issue the request.

HTTP_REFERER

The URL of the document that the client points to before accessing the CGI program.

This table is taken from http://www.oreilly.com/openbook/cgi/

Output from CGI Programs

There are also two ways that CGI programs respond to the request

Sample CGI response:

A response consists of two parts, HTTP header and a body, separated by a blank line.

HTTP/1.0 200 OK
Date: Thursday, 02-October-06 08:28:00 GMT
Server: Apache/1.3.29 (Unix)
MIME-version: 1.0
Content-type: text/html
Content-length: 2000

<HTML>
<HEAD><TITLE>Server and User Information</TITLE></HEAD>
<BODY>
.
.
.
</BODY>
</HTML>
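
The header, blank line and body layout can be sketched in Perl as follows. This is only an illustration and not one of the tutorial scripts: build_response is a hypothetical helper, and the sketch assumes that the web server adds the status line and headers such as Date and Server itself, so the program supplies only Content-type and Content-length.

```perl
use strict;
use warnings;

# build_response is a hypothetical helper, not part of the tutorial scripts
sub build_response {
    my ($content_type, $body) = @_;
    # the header lines, one per line
    my $header = "Content-type: $content_type\n"
               . "Content-length: " . length($body) . "\n";
    # a blank line separates the header from the body
    return $header . "\n" . $body;
}

print build_response("text/plain", "Hello\n");
```

Leaving out the blank line is a common mistake; the server then cannot tell where the header ends.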

Explanation of the HTTP header:


Example: Server and User Information (with query string)

#!/local/bin/perl
print "Content-type: text/html", "\n\n";
print "<HTML>", "\n";
print "<HEAD><TITLE>Server and User Information</TITLE></HEAD>",    "\n";
print "<BODY>";
print "<H1>Information about this Server</H1>", "\n";
print "<Hr><PRE>"; 

# The server's hostname or IP address
print "Server Name: ", $ENV{'SERVER_NAME'}, '<BR>', "\n";

# The port number of the host on which the server is running
print "Server Running on Port: ", $ENV{'SERVER_PORT'}, '<BR>',    "\n";

# The name and version of the server software that is running
print "Server Software: ", $ENV{'SERVER_SOFTWARE'}, '<BR>',    "\n";

# The name and revision of the information protocol the server 
# is using to communicate with the client
print "Server Protocol: ", $ENV{'SERVER_PROTOCOL'}, '<BR>',    "\n";

# The revision of the CGI that the server uses
print "CGI Revision: ", $ENV{'GATEWAY_INTERFACE'}, '<BR>', "\n";

print "<Hr></Pre>", "\n";
print "<H1>Information about the remote user</H1>", "\n";
print "<Hr><PRE>";

# The remote hostname and IP address of the user who is making this request
print "Remote Host Name: ", $ENV{'REMOTE_HOST'}, '<BR>', "\n";
print "Remote Host IP address: ", $ENV{'REMOTE_ADDR'}, '<BR>',    "\n";

# The authenticated name of the user
print "User Name: ", $ENV{'REMOTE_USER'}, '<BR>', "\n";

# The list of MIME types the client can accept
print "Accept Types: ", $ENV{'HTTP_ACCEPT'}, '<BR>', "\n";

# The browser the user is using to make this request
print "Browser: ", $ENV{'HTTP_USER_AGENT'}, '<BR>', "\n";

# The URL of the document the client points to before accessing 
# the CGI program
print "Referer: ", $ENV{'HTTP_REFERER'}, '<BR>', "\n";

# The query information the user sent.
# It is appended to the URL with a "?".
print "Query String: ", $ENV{'QUERY_STRING'}, '<BR>', "\n";
print "<Hr></Pre>", "\n";
print "</BODY></HTML>", "\n";
exit;
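
Some of these variables are often not set. REMOTE_USER, for example, holds the authenticated name of the user, so it is empty when no authentication took place. The sketch below shows one possible way to print a placeholder instead of an empty value; env_or_default and the placeholder text are hypothetical and not part of the script above.

```perl
use strict;
use warnings;

# env_or_default is a hypothetical helper, not part of the tutorial scripts
sub env_or_default {
    my ($name, $default) = @_;
    my $value = $ENV{$name};
    # fall back to the default when the variable is missing or empty
    return $default unless (defined($value) && $value ne '');
    return $value;
}

print "User Name: ", env_or_default('REMOTE_USER', '(not set)'), "\n";
```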

Accessing Form Input

As mentioned above, depending on which method the client uses to send data, CGI programs get their input in one of two ways: through environment variables or through standard input.

The GET method

To use the GET method to pass data to CGI programs:

<form name="form1" action="t6-1.cgi" method="get">
	......
</form>
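
With GET, the form data reaches the program in QUERY_STRING as name=value pairs joined by "&", with spaces sent as "+" and other special characters sent as %XX. The sketch below imitates that encoding so the format can be inspected outside a browser. The helpers url_encode and build_query_string and the field called name are hypothetical, and the sketch only handles plain ASCII text.

```perl
use strict;
use warnings;

# hypothetical helper: encode one string the way form data is encoded
sub url_encode {
    my ($text) = @_;
    # replace everything except letters, digits, "-", "_", "." and space with %XX
    $text =~ s/([^A-Za-z0-9\-_. ])/sprintf("%%%02X", ord($1))/eg;
    # spaces are sent as "+"
    $text =~ tr/ /+/;
    return $text;
}

# hypothetical helper: join name/value pairs into a query string
sub build_query_string {
    my @pairs = @_;
    my @parts;
    while (@pairs) {
        my ($name, $value) = splice(@pairs, 0, 2);
        push @parts, url_encode($name) . "=" . url_encode($value);
    }
    return join("&", @parts);
}

my $query = build_query_string("name", "Jane Doe", "show_data", "yes");
print "t6-1.cgi?$query\n";
```

For these sample values the printed URL is t6-1.cgi?name=Jane+Doe&show_data=yes.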

When using the GET method

Advantage

Disadvantage

The POST method

To use the POST method to pass data to CGI programs:

<form name="form1" action="t6-1.cgi" method="post">
	......
</form>
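
With POST, the encoded form data arrives on standard input, and CONTENT_LENGTH tells the program how many bytes to read. The sketch below reads from an in-memory filehandle that stands in for STDIN so that it can be run outside a web server; read_post_body and the sample data are hypothetical.

```perl
use strict;
use warnings;

# read_post_body is a hypothetical helper, not part of the tutorial scripts
sub read_post_body {
    my ($fh, $length) = @_;
    my $body = '';
    # read up to $length bytes; skip the read when no length was given
    if (defined($length) && $length > 0) {
        read($fh, $body, $length);
    }
    return $body;
}

# an in-memory filehandle plays the role of STDIN here
my $sample = "name=Jane+Doe&show_data=yes";
open(my $fake_stdin, '<', \$sample) or die "cannot open in-memory file: $!";
my $query = read_post_body($fake_stdin, length($sample));
close($fake_stdin);
print "$query\n";
```

In a real CGI program the call would be read_post_body(\*STDIN, $ENV{'CONTENT_LENGTH'}).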

When using the POST method

Advantage

Disadvantage

Example: Accessing Form Data (forms: GET, POST)

#!/local/bin/perl

# a perl script to handle both GET and POST methods

# get the type of method that was used by the 
# user to issue this request
$request_method = $ENV{'REQUEST_METHOD'};

# if the GET method was used
if ($request_method eq "GET") {
	# get the query request from the environment variable
	$query = $ENV{'QUERY_STRING'};

# if the POST method was used
} elsif ($request_method eq "POST") {
	# get the query request from the standard input
	# first we get the size of the query request (# of chars)
	$query_size = $ENV{'CONTENT_LENGTH'};
	# then we read from STDIN that many number of chars
	read (STDIN, $query, $query_size);

# otherwise, we got an invalid request
} else {
	print "Content-type: text/html", "\n";
	print "Status: 500 Server Error", "\n\n";

	print "<title>Server Error</title>", "\n";
	print "<h1>Server Error</h1>", "\n";
	print "<hr>Unsupported request method<hr>", "\n";
	
	exit;
}

# now we have the query string, let's construct a hash table from it
# where keys are the keys in the query string, values are the corresponding values
%query_data = ();

# first we separate the key value pairs
@key_values = split (/&/, $query);
# for each key value pair ...
foreach $key_value (@key_values) {
	($key, $value) = split (/=/, $key_value);
	# replace + with a space
	$value =~ tr/+/ /;

	# the regular expression matches a % followed by two hexadecimal digits
	# and stores the digits in the variable $1
	# the pack and hex operators convert the value in $1 to the corresponding character
	# e means evaluating the second argument as an expression
	$value =~ s/%([\dA-Fa-f][\dA-Fa-f])/pack ("C", hex ($1))/eg;

	# insert the key value pair into the query hash
	# we need to consider multiple values (select element with "multiple" turned on)
	# if the key already exists in the hash
	if (defined($query_data{$key})) {
		# we append the new value to the old ones
		$query_data{$key} = join (", ", $query_data{$key}, $value);
	} else {
		# otherwise, just add a new entry in the hash
		$query_data{$key} = $value;
	}
}

# output the result
print "Content-type: text/html", "\n\n";

print "<HTML>", "\n";
print "<HEAD><TITLE>GET and POST Example</TITLE></HEAD>", "\n";
print "<BODY>";

# depending on the query string, the output is different
if ($query_data{'show_data'} eq 'yes') {
	print "<h3>Thanks for your submission!</h3>", "\n";
	print "<h3>Here is the form data you submitted</h3>", "\n";
	print "<hr>", "\n";
	foreach $key (keys(%query_data)) {
		print "<b>", $key, "</b>:	", $query_data{$key}, "<br>\n";
	}
	print "<hr>", "\n";
} else {
	print "<h3>Thanks for your submission!</h3>", "\n";
}

print "</BODY></HTML>", "\n";
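
The decoding loop in the script above can also be tried outside the web server. The sketch below wraps the same steps in a function and feeds it a sample string. It is a variation and not the script itself: parse_query and the sample data are hypothetical, the split is limited to two fields so that a "=" inside a value is kept, and the key is decoded as well as the value.

```perl
use strict;
use warnings;

# parse_query is a hypothetical helper based on the loop in the script above
sub parse_query {
    my ($query) = @_;
    my %data;
    foreach my $pair (split(/&/, $query)) {
        # limit the split to 2 fields so a "=" inside the value is kept
        my ($key, $value) = split(/=/, $pair, 2);
        next unless (defined($key) && $key ne '');
        $value = '' unless defined($value);
        # decode "+" and %XX in both the key and the value
        foreach ($key, $value) {
            tr/+/ /;
            s/%([\dA-Fa-f][\dA-Fa-f])/pack("C", hex($1))/eg;
        }
        # repeated keys are joined with ", " as in the script above
        if (defined($data{$key})) {
            $data{$key} = join(", ", $data{$key}, $value);
        } else {
            $data{$key} = $value;
        }
    }
    return %data;
}

my %form = parse_query("name=Jane+Doe&select1=op1&select1=op3&note=a%26b");
foreach my $key (sort keys %form) {
    print "$key: $form{$key}\n";
}
```

For the sample string, name decodes to "Jane Doe", note to "a&b", and the two select1 values are combined into "op1, op3".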

Generating Output

A CGI program can produce

In fact CGI programs can return any type of document, as long as the client can handle it properly (e.g. PDF).
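
The client decides how to handle a document from its Content-type header, so a program that returns files needs to choose that header to match the data. One possible way to do this is a lookup by file extension, sketched below. The function content_type_for, the table and its fallback value are hypothetical and cover only a few common types.

```perl
use strict;
use warnings;

# a small, incomplete table of file extensions and their MIME types
my %mime_types = (
    html => 'text/html',
    txt  => 'text/plain',
    gif  => 'image/gif',
    jpg  => 'image/jpeg',
    pdf  => 'application/pdf',
);

# content_type_for is a hypothetical helper, not part of the tutorial scripts
sub content_type_for {
    my ($file) = @_;
    # take the text after the last "." in the file name, if any
    my ($ext) = $file =~ /\.([^.\/]+)$/;
    return 'application/octet-stream' unless defined($ext);
    # unknown extensions fall back to a generic binary type
    return $mime_types{lc($ext)} || 'application/octet-stream';
}

print "Content-type: ", content_type_for('cranetower.jpg'), "\n\n";
```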

Example: Data format on demand (form)

#!/local/bin/perl

# a perl script to return an image or text or existing file on demand

# first get the user input data
# (get_form_data also fills the global hash %query_data, which is tested below)
%user_input = get_form_data();

# depends on the query string, the output is different
if ($query_data{'format'} eq 'text') {
	# return a piece of plain text
	print "Content-type: text/plain", "\n\n";
	
	print "Yellow Crane Tower is an imposing pagoda close to the Yangzi River. ", "\n"; 
	print "Situated at the top of Sheshan (Snake Hill), in Wuchang, the tower ", "\n";
	print "was originally built at a place called Yellow Crane Rock projecting ", "\n";
	print "over the water, hence the name. Over the centuries the tower was ", "\n";
	print "destroyed by fire many times, but its popularity with Wuhan residents ", "\n";
	print "ensured that it was always rebuilt. The current tower was completed ", "\n";
	print "in 1985 and its design was copied from a Qing dynasty (1644-1911) ", "\n";
	print "picture. The tower has 5 stories and rises to 51 meters (168ft). ", "\n";
	print "Covered with yellow glazed tiles and supported with 72 huge pillars, ", "\n";
	print "it has 60 upturned eaves layer upon layer. It is an authentic ", "\n";
	print "reproduction of both the exterior and interior design, with the ", "\n";
	print "exception of the addition of air-conditioning and an elevator.", "\n";
} elsif ($query_data{'format'} eq 'image') {
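	# return an image
	$image = 'cranetower.jpg';
	# open the image file in read mode
	if (open (IMAGE, "<" . $image)) {
		# treat the file and the output as binary data
		binmode (IMAGE);
		binmode (STDOUT);
		# the size of the image in bytes
		$len = (stat ($image))[7];
		print "Content-type: image/jpeg", "\n";
		print "Content-length: $len", "\n\n";
		# send the image data
		print <IMAGE>;
		close (IMAGE);
	} else {
		# the image could not be opened, so report an error instead
		print "Content-type: text/plain", "\n";
		print "Status: 500 Server Error", "\n\n";
		print "Cannot open image file", "\n";
	}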
} elsif ($query_data{'format'} eq 'both') {
	# Server Redirection: redirect the server to an existing document
	# the server will return it as if it is the response from the CGI program
	print 'Location: cranetower.html', "\n\n";
} else {
	print "Content-type: text/plain", "\n\n";
	
	print "Sorry! This section is under construction", "\n";
}

sub get_form_data {
	# get the type of method that was used by the 
	# user to issue this request
	$request_method = $ENV{'REQUEST_METHOD'};
	
	# if the GET method was used
	if ($request_method eq "GET") {
		# get the query request from the environment variable
		$query = $ENV{'QUERY_STRING'};
	
	# if the POST method was used
	} elsif ($request_method eq "POST") {
		# get the query request from the standard input
		# first we get the size of the query request (# of chars)
		$query_size = $ENV{'CONTENT_LENGTH'};
		# then we read from STDIN that many number of chars
		read (STDIN, $query, $query_size); 
	
	# otherwise, we got an invalid request
	} else {
		print "Content-type: text/html", "\n";
		print "Status: 500 Server Error", "\n\n";
	
		print "<title>Server Error</title>", "\n";
		print "<h1>Server Error</h1>", "\n";
		print "<hr>Unsupported request method<hr>", "\n";
		
		exit;
	}
	
	# now we have the query string, let's construct a hash table from it
	# where keys are the keys in the query string, values are the corresponding values
	%query_data = ();
	
	# first we separate the key value pairs
	@key_values = split (/&/, $query);
	# for each key value pair ...
	foreach $key_value (@key_values) {
		($key, $value) = split (/=/, $key_value);
		# replace + with a space
		$value =~ tr/+/ /;
	
		# the regular expression matches a % followed by two hexadecimal digits
		# and stores the digits in the variable $1
		# the pack and hex operators convert the value in $1 to the corresponding character
		# e means evaluating the second argument as an expression
		$value =~ s/%([\dA-Fa-f][\dA-Fa-f])/pack ("C", hex ($1))/eg;
	
		# insert the key value pair into the query hash
		# we need to consider multiple values (select element with "multiple" turned on)
		# if the key already exists in the hash
		if (defined($query_data{$key})) {
			# we append the new value to the old ones
			$query_data{$key} = join (", ", $query_data{$key}, $value);
		} else {
			# otherwise, just add a new entry in the hash
			$query_data{$key} = $value;
		}
	}
	
	return %query_data;
}

4. Using CGI on ECF

Please refer to http://www.ecf.toronto.edu/ecf/Webpage

Example: http://www.ecf.utoronto.ca/~jianglei/test.html (CGI: http://www.ecf.utoronto.ca/~jianglei/cgi_script)

Some CGI examples are adapted from the book CGI Programming on the World Wide Web, Shishir Gundavaram, Sebastopol, Calif.: O'Reilly & Associates, 1996, ISBN 1565921682, http://www.oreilly.com/openbook/cgi/.