FAQ: Publishing on the WWW at CSLab

1. General

2. Using The CSLab Web Server


1. General

1.1) Who should I contact if I have problems with the web server?

If you have questions or comments regarding the contents of a particular user's page, you should contact the owner of the page.

If you have questions or comments regarding the contents of the DCS website, you should send mail to the DCS webmaster at www@cs.

If the web server is not operating properly, or you have a technical question about its operation, please contact your Point of Contact (PoC).

1.2) How do I publish documents via anonymous FTP?

Please contact your PoC for your FTP directory to be set up.

Your PoC will create a directory on the anonymous FTP server where you can publish your documents. You can access that directory from any CSLab host as /cs/ftp/pub/username/.

Note: Your /cs/ftp/pub/username/ directory is provided only for the purpose of publishing information via FTP. Disk space on the CSLab FTP partition is shared amongst many users and is thus potentially a very scarce resource. Please do not use it for general storage of your files: use your home directory or a work disk instead.

1.3) How do I create a web page?

Your web page should already be created. If this is not the case, please send an email to your PoC to rectify the situation. The rest of this section will provide details about how to access your web page, as well as usage information.

The URL http://www.cs.toronto.edu/~username/ corresponds to the directory /cs/htuser/username/public_html/. For your convenience, a symbolic link called public_html to this location has been placed in your home directory.

Note: Your /cs/htuser/username/ directory is provided only for the purpose of publishing information via the World Wide Web. Disk space on the CSLab web partitions is shared amongst many users and is thus potentially a contested resource. Please do not use it for general storage of your files: use your home directory or a work disk instead.

1.4) How do I redirect my web page to another site?

If you have a web page at another department, you have the option to have requests for http://www.cs.toronto.edu/~username/ automatically redirected to your page.

For example, if user jsmith has a page on DGP's web server, he can have the CSLab web server's configuration changed so that requests for http://www.cs.toronto.edu/~jsmith are redirected to his DGP web page at http://www.dgp.toronto.edu/~jsmith/.

To set up redirection in this manner for your home page, or to have it removed, please contact your PoC.

1.5) I want my files to be available via either anonymous FTP or HTTP. What's the best way to do that?

All documents in your FTP area can also be retrieved via HTTP.

For example, if you put the file "example.ps.gz" into the directory /cs/ftp/pub/username/, it can be retrieved via either of the following URLs:

If you have a specific reason why you don't want the files in your FTP area to be retrievable via HTTP, please contact your Point of Contact.

1.6) Can I allow users to upload files to my /cs/ftp/pub/username/ directory via anonymous FTP?

No.

The problem with arbitrarily allowing FTP uploads is that as the delinquents on the Internet find sites that allow them, they use those sites to distribute pirated software and other copyrighted materials.

If you require a mechanism to enable a user without a CSLab account to upload files to our filesystems and no alternatives will work for you (courier of a burned DVD, offsite individual making the files available for download from their own ftp or http server, email attachments, etc.) then please get in touch with your Point of Contact to open discussion on the subject.

You should never set the permissions for any file in your FTP area (or under your /cs/htuser/username/public_html/ directory) to be world-writable. Making your FTP area world-writable will not allow anonymous FTP uploads to take place there.

For more information regarding security considerations when publishing documents via FTP or HTTP, please see the document titled The WWW at CSLab: Security Considerations.

1.7) Are the FTP and HTTP logs available to me?

The server logs are located on www.cs.toronto.edu, in the directory /var/log/apache2. The HTTP transfer logs are in the file access.log. The HTTP error logs are in error.log, also in that directory.

1.8) What am I allowed / not allowed to put on my web pages?

Anything you put onto your web pages must conform to CSLab policies and University of Toronto policies.

1.9) How do I publish a CSRG technical report?

DCS technical reports are available for anonymous FTP, so that anyone on the network can obtain the formatted reports electronically.

The reports are stored as compressed PostScript files in the directory /cs/ftp/pub/reports. You can tell people to get these reports "by anonymous FTP from ftp.cs.toronto.edu, in pub/reports".

If you'd like to make a report available for anonymous FTP, please send mail to joan@cs including the title and author(s). She will assign a report number, and we will create a directory for it, and add it to the index. You can then copy your report into this directory.

1.10) I want to co-author a set of web pages with some colleagues as part of a collaborative project. How do I set this up?

See the document titled Project Collaboration at CSLab for a detailed explanation of how to use Unix file permissions to collaborate on projects.

2. Using The CSLab Web Server

2.1) What web server software do we use at CSLab, and where can I find documentation for it?

We use the Apache web server. The documentation for the version of Apache we're currently using (2.0) is here.

2.2) Why can't I access my home directory from the CSLab web server?

The web server runs unaudited programs which may be written by any CSLab user. Such programs could easily contain security holes. For this reason, the security of the web server host is inherently poorer than that of the rest of the lab.

As such, the web server is generally trusted much less than the rest of the lab's workstations and servers. It does not have permission to access user home directories, project directories, mailboxes, or most other CSLab filesystems.

In general, the web server is considered to be outside CSLab for the purposes of security and trust.

Do not set up any mechanism that grants the web server any access to your account without first getting permission from the system administrators. Allowing any unauthenticated access to your CSLab account from the web server may result in the suspension of your account.

Note that this means that when you log into www.cs.toronto.edu, details such as your environment variables will probably not be set up the way they usually are when you log into a compute server, since the server can't read the .cshrc file in your CSLab home directory.

2.3) I set up my pages in my public_html directory, but web browsers report "403 Forbidden" when I try to access them. What's wrong?

Most probably, you've set the permissions on your files or directories too strictly, so that the web server isn't able to read your files.

In general, directories beneath your public_html area should have permissions rwx--x--x (mode 711), and files should have permissions rw-r--r-- (mode 644). This allows the web server to read your files, but not write to them. For more information about setting up file permissions for your public_html area, see the document titled The WWW at CSLab: Security Considerations.

Other reasons you might get this error include:

2.4) Why won't the web server follow my symbolic link?

By default, the server will only follow symbolic links if you own the file or directory the link points to. You should not be pointing symbolic links at other people's files.

The reason for this restriction is that another user might want some of his or her files to be controlled by an .htaccess file; for instance, to publish certain files only to specific hosts or networks, or to password-protect them. If you created symbolic links to those files within your own pages, you would bypass that user's .htaccess file.

Symbolic links within the web server's space should be avoided, in general. If you want to refer to someone else's pages from within your own, it's probably more appropriate to use a Redirect directive to send the visitor's browser to the user's page.
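As a minimal sketch of the Redirect approach, the following line could go into an .htaccess file in your public_html/ directory. The names "username" and "colleague" and the path "colleague-project" are placeholders chosen for illustration, not actual CSLab accounts or pages.

```apache
# .htaccess in /cs/htuser/username/public_html/
# Requests for /~username/colleague-project (and anything beneath it)
# are sent to the colleague's own pages instead of following a symlink.
Redirect /~username/colleague-project http://www.cs.toronto.edu/~colleague/project/
```

Because the visitor's browser is sent to the other user's URL, any access restrictions in that user's .htaccess file still apply.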

2.5) Why are browsers reporting "500 Internal Server Error" for my web pages?

This is almost always due to an error in an .htaccess file in the directory containing the pages, or possibly a directory above it. The file /var/log/apache2/error.log on the web server may include error messages which will help you determine where the error is.

2.6) How do I create a CGI program?

At CSLab, CGI programs generally must have filenames which end in ".cgi", to identify to the server that you really intend the program to be run as a CGI. The program must also be world-executable (i.e., chmod o+x program.cgi).

Please be aware that CGI programs are typically very difficult to write securely. A security hole in a CGI program is a threat not only to your own published files, but also to CSLab's research systems as a whole.

If your CGI programs use files to store information, you should probably make sure that those files are not anywhere under your /cs/htuser/username/public_html/ directory. Storing them there would allow the web server to publish the files directly, without going through your CGI. Instead, you should create other directories under your /cs/htuser/username/ directory, e.g., /cs/htuser/username/private_cgi.

(If you maintain web pages in a project directory under /cs/htdocs/, the CGI programs for those pages should keep their files in a corresponding directory under /cs/htdata/; if such a directory doesn't already exist, ask the system administrators to create one for you.)

2.7) My CGI program isn't working. How do I go about troubleshooting it?

The topic of troubleshooting CGI programs in general is beyond the scope of this document, but there are a few things you should keep in mind:

2.8) My CGI program doesn't seem to be able to send mail. What's wrong?

We don't generally permit the HTTP server to send mail messages. One reason for this is the lack of accountability: it can be extremely difficult to track down which particular CGI was responsible for sending a particular mail message.

If you feel you have a genuine reason why your CGI needs the ability to send mail, please contact your Point of Contact. Please do not circumvent the measures we've taken to prevent CGIs from sending mail without the permission of the systems administrators; doing so may result in the suspension of your account.

2.9) How do I use SSI (server-side includes)?

Files which end in .shtml will be parsed by the server.

For reasons of system security, the default setting is that server-parsed files are not permitted to use <!--#exec ...--> elements, nor may they <!--#include ...--> CGI scripts.

If you require these features, you may turn them on by adding the following line to your .htaccess file:

Options +Includes

However, please don't use Options +Includes unless you genuinely need the ability to execute programs within a server-side include.

See the Apache documentation for more information about server-side includes.

2.10) Does the CSLab web server support servlets?

At present, we don't provide any specific resources to support servlets. The CSLab web server is a general research resource, and researchers' requirements for servlets tend to be specific to the sort of research they're doing. It's tricky, to put it mildly, for us to configure and run a single web server that's shared by nearly a thousand CSLab users; for us to try to configure and maintain a single servlet engine that would accommodate everyone would not be practical.

However, servlet engines (such as Jakarta Tomcat) incorporate their own HTTP server, and they can be built and deployed independently from the CSLab web server. If you would like to set up your own servlet engine for your research, please contact us and we'll work out the details together.

2.11) Can the web server automatically publish directory listings for my directories?

Yes, but this feature is now turned off by default. To turn it on, put the directive

Options +Indexes

into an .htaccess file in your public_html/ directory, or in the directory for which you want automatic listings to be generated.

2.12) How do I specify the MIME type of a document I'm publishing?

Put an appropriate AddType or ForceType directive into an .htaccess file in your public_html/ directory, or in the directory where the files are. Use AddType to associate a MIME type with a particular extension (e.g., to say that files ending in ".class" are of type application/octet-stream); use ForceType in an .htaccess file to force all files in a directory to be served as a particular MIME type regardless of their extensions.
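The two directives might be used as sketched below. The ".class" mapping comes from the example above; the choice of text/plain for ForceType and the idea of a separate directory for it are assumptions made only for illustration.

```apache
# .htaccess in public_html/ (or any directory beneath it):
# serve files ending in .class as application/octet-stream.
AddType application/octet-stream .class

# In the .htaccess file of a DIFFERENT directory, where every file
# should be served as plain text regardless of its extension:
ForceType text/plain
```

The two are shown as belonging to separate directories because ForceType overrides extension-based mappings such as AddType wherever it applies.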

Please do not ask the system administrators to add MIME types to the global server configuration.

2.13) How can I restrict access to my web pages to specific hosts or networks?

Use the order, allow, and deny directives in an .htaccess file.

Also note that the ErrorDocument directive can be used in the .htaccess file to explain why you're denying access to your pages (e.g., ErrorDocument 403 /~username/denied.html).

The following example restricts access to University of Toronto networks:

order deny,allow
deny from all
allow from 128.100
allow from 142.150
allow from 142.151

If you place the lines above into a file named .htaccess in your web area, then access to that directory and below will be restricted to the U of T network blocks.

(Note that because of the way hosts within CSLab are named locally--e.g., apps2.cs rather than apps2.cs.toronto.edu--a statement like allow from .cs.toronto.edu may not be applied to local systems the way you might expect based on the Apache documentation.)
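The access restriction and the ErrorDocument directive can be combined in one .htaccess file, roughly as follows. The page name denied.html is taken from the example above; placing it at the top of your web area, outside the restricted directory, is an assumption of this sketch.

```apache
# .htaccess in the directory being restricted
order deny,allow
deny from all
allow from 128.100
allow from 142.150
allow from 142.151
# Page shown to visitors who are refused access.
ErrorDocument 403 /~username/denied.html
```

The explanatory page should live outside the restricted directory; if it sat inside, requests for it would be refused by the same rules.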

2.14) How can I put a password on some of my web pages?

Although you can protect web pages with a password, be aware that neither the password nor the documents themselves are encrypted as they travel over the network, so the password protection is not an absolute guarantee of privacy by any means.

Furthermore, remember that the server is shared by all CSLab users, and runs with the same privileges for everybody; any local user can write their own CGI program that can read the files you're trying to protect with a password.

Here's a walkthrough of the steps required to set up password access on a directory:

If everything has been set up properly, then attempting to access any documents under the URL "http://www.cs.toronto.edu/~username/protected/" should now cause the web browser to prompt the user for a username and password. Supplying the username "example" and the password you typed into the htpasswd command above should give you access to the document.
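As a rough sketch of what such a setup typically involves with Apache's basic authentication, an .htaccess file in the protected/ directory might look like the following. The password file location and the realm name are assumptions for illustration only; the file itself is normally created with Apache's htpasswd utility (for example, htpasswd -c followed by the file name and the username), and it should be kept outside public_html/ so that the server cannot publish it.

```apache
# .htaccess in /cs/htuser/username/public_html/protected/
AuthType Basic
# Realm name shown in the browser's login prompt (arbitrary text).
AuthName "Protected Area"
# Hypothetical location: under /cs/htuser/username/ but outside public_html/.
AuthUserFile /cs/htuser/username/private_auth/htpasswd
# Only the user named "example" in the password file is admitted.
Require user example
```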

2.15) Does our web server support SSL (https://)?

Our web server is not set up to provide SSL functionality. Reasons for this include: