Client-Server Paradigm
Client:
- initiates contact with server (speaks first)
- typically requests service from server
- for Web, client is implemented in browser; for e-mail, in mail reader
Server:
- typically waits for request from clients
- provides requested service to client
- e.g., Web server sends requested Web page, mail server delivers e-mail
Protocols
- Protocols define
- message format
- order of messages sent and received among network entities
- actions taken on message transmission, receipt
- In general, they define
- Syntax of messages: use a grammar (e.g., BNF or FSM)
- Semantics of messages: document each message type
- Syntax of conversation: grammar (again BNF or FSM)
- Semantics of conversation: document the states of the
FSM, label the edges with actions.
- A protocol allows separate groups to write clients and
servers that can talk to each other.
- It defines what a good conversation looks like; it does not
define how the clients or servers are implemented.
- Examples: Ethernet, TCP, IP, HTTP (we will jump into
this one).
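The idea of describing the syntax of a conversation with an FSM can be sketched in Python. The state names (IDLE, WAITING, DONE) and message names (REQUEST, RESPONSE) are hypothetical, chosen only to illustrate a one-request, one-response exchange; they do not come from any protocol specification.

```python
# Hypothetical conversation FSM: each edge is (state, message) -> next state.
TRANSITIONS = {
    ("IDLE", "REQUEST"): "WAITING",    # the client speaks first
    ("WAITING", "RESPONSE"): "DONE",   # the server answers
}

def is_valid_conversation(messages):
    """Return True if the message sequence walks the FSM from IDLE to DONE."""
    state = "IDLE"
    for message in messages:
        key = (state, message)
        if key not in TRANSITIONS:
            return False               # no edge for this message in this state
        state = TRANSITIONS[key]
    return state == "DONE"

print(is_valid_conversation(["REQUEST", "RESPONSE"]))
print(is_valid_conversation(["RESPONSE"]))
```

A client and a server written by separate groups can both be checked against the same table, which is the point of writing the conversation down as a grammar.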
HTTP (HyperText Transfer Protocol)
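HTTP is a connectionless, stateless protocol (connectionless in the sense that client and server do not stay connected between requests). It defines how a web browser and web server communicate. A client issues
a request, and the server answers with a response. Both request and response have the following layout: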
INITIAL LINE
HEADERS
(an empty line)
BODY
For a request, the INITIAL LINE starts with the request type (GET, POST, DELETE, HEAD, ...). It also includes the URL-encoded resource
and the HTTP protocol version number.
The HEADERS specify things like Host: ... Cookie: ...
The BODY of a request contains any payload data. For example, in a POST request used to transfer a file
from the client to the server, the file data appears in the HTTP request body. Similarly,
a form with method=POST will have the URL-encoded data in the body of the HTTP request.
For a response, the INITIAL LINE has the status of the request (e.g., HTTP/1.1 200 OK, HTTP/1.1 404 Not Found),
and the HEADERS contain things like Set-Cookie:, Last-Modified:, Content-Type:, ...
The BODY contains the payload of the response, for example the HTML for the web page or an image.
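The layout above (INITIAL LINE, HEADERS, an empty line, BODY) can be taken apart with a few lines of Python. This is a minimal sketch, not a full HTTP parser: the function name parse_http_message is made up for illustration, and it assumes CRLF line endings and at most one header of each name.

```python
def parse_http_message(raw: bytes):
    """Split a raw HTTP message into (initial_line, headers, body)."""
    # The first empty line (CRLF CRLF) separates the headers from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("ascii").split("\r\n")
    initial_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lower case.
        headers[name.strip().lower()] = value.strip()
    return initial_line, headers, body

request = b"GET /~arnold/index.html HTTP/1.1\r\nHost: localhost\r\n\r\n"
initial_line, headers, body = parse_http_message(request)
print(initial_line)
print(headers)
print(body)
```

The body is kept as bytes because, unlike the initial line and headers, it may be binary data.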
Example
A GET request along with the associated response. For a GET request, the body is empty.
GET /~arnold/index.html HTTP/1.1
Host: localhost

HTTP/1.1 200 OK
Date: Thu, 23 Jan 2014 04:32:32 GMT
Server: Apache/2.4.6 (Ubuntu)
Last-Modified: Mon, 13 Jan 2014 14:32:04 GMT
ETag: "2c-4efdaf08b1a5a"
Accept-Ranges: bytes
Content-Length: 44
Content-Type: text/html

Got here!
Example
A GET request for a file the server cannot find, along with the server's response.
GET /~arnold/missing.html HTTP/1.1
Host: localhost

HTTP/1.1 404 Not Found
Date: Thu, 23 Jan 2014 04:34:55 GMT
Server: Apache/2.4.6 (Ubuntu)
Content-Length: 292
Content-Type: text/html; charset=iso-8859-1

404 Not Found
Not Found
The requested URL /~arnold/missing.html was not found on this server.
Apache/2.4.6 (Ubuntu) Server at localhost Port 80
HTTP (HyperText Transfer Protocol) Details
Note: Examples can be found at http.zip.
- Purpose: Deliver content over the WWW
- Pieces: Browser is HTTP client, web server is HTTP server.
- Resource: HTTP transmits these, could be files or other dynamically generated data.
- Protocol:
- Client connects to server, issues request, server responds, connection is closed.
- Message format: (all lines CRLF terminated). Everything except the body must be ASCII text.
Generic | Note | Request | Response
initialLine | | requestType location HTTPVersion (requestType can be GET, POST, HEAD) | HTTPVersion statusCode reason (status codes look like 1xx=information, 2xx=success, 3xx=redirect, 4xx=client error, 5xx=server error)
headers | Attribute: value CRLF (describe message and body, possibly many) | HTTP/1.1 requires Host: |
CRLF | separates headers from body | |
body | can be binary data | |
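The response initial line (HTTPVersion statusCode reason) and the 1xx to 5xx classes can be decoded as follows. This is a small sketch; the names STATUS_CLASSES and parse_status_line are illustrative, not part of any library.

```python
# Status code classes, keyed by the first digit of the code.
STATUS_CLASSES = {
    1: "information",
    2: "success",
    3: "redirect",
    4: "client error",
    5: "server error",
}

def parse_status_line(line: str):
    """Split 'HTTPVersion statusCode reason' and classify the status code."""
    # The reason phrase may itself contain spaces, so split at most twice.
    version, code, reason = line.split(" ", 2)
    status_class = STATUS_CLASSES.get(int(code) // 100, "unknown")
    return version, int(code), reason, status_class

print(parse_status_line("HTTP/1.1 200 OK"))
print(parse_status_line("HTTP/1.1 404 Not Found"))
```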
HTTP Examples
http://www.cs.toronto.edu/~arnold/309/http/hello.html
telnet to www.cs.toronto.edu port 80 (make sure that each line ends with CRLF)
That is,
telnet www.cs.toronto.edu 80
GET /~arnold/309/http/hello.html HTTP/1.0

GET /~arnold/309/http/hello.html HTTP/1.1
Host:www.cs.toronto.edu

GET /~arnold/309/http/hello.html HTTP/1.1
Host:www.cs.toronto.edu
If-Modified-Since: Mon, 06 Jan 2003 11:11:11 GMT

GET /~arnold/309/http/hello.html HTTP/1.1
Host:www.cs.toronto.edu
If-Modified-Since: Sat, 29 Oct 1994 19:43:31 GMT

GET /~arnold/309/http/anotherDocument.html HTTP/1.1
Host:www.cs.toronto.edu

DELETE /~arnold/309/http/anotherDocument.html HTTP/1.1
Host:www.cs.toronto.edu

HEAD /~arnold/309/http/saturn_family.jpg HTTP/1.0

GET /~arnold/309/http/saturn_family.jpg HTTP/1.1
Host:www.cs.toronto.edu
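The telnet session can also be scripted. The sketch below builds the same kind of request by hand and sends it over a plain TCP socket. The helper names build_request and send_request are made up for illustration. The Connection: close header is an addition not used in the telnet examples: it asks an HTTP/1.1 server to close the connection after responding, so that reading until end-of-stream terminates.

```python
import socket

def build_request(method, path, host=None, version="HTTP/1.1", extra_headers=None):
    """Return the request as bytes: initial line, headers, then an empty line."""
    lines = [f"{method} {path} {version}"]
    if host is not None:
        lines.append(f"Host: {host}")            # required by HTTP/1.1
    for name, value in (extra_headers or {}).items():
        lines.append(f"{name}: {value}")
    # Every line ends with CRLF; one extra CRLF marks the end of the headers.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")

def send_request(host, request, port=80, timeout=10.0):
    """Open a TCP connection, send the request, read until the server closes."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request)
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:                        # empty read: connection closed
                break
            chunks.append(chunk)
    return b"".join(chunks)

request = build_request(
    "GET", "/~arnold/309/http/hello.html",
    host="www.cs.toronto.edu",
    extra_headers={"Connection": "close"},
)
print(request.decode("ascii"))
```

Calling send_request("www.cs.toronto.edu", request) would then return the raw response bytes, assuming the server is reachable and still serves that path.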
Summary of HTTP commands
- GET: retrieve a specified resource
- DELETE: delete a specified resource
- HEAD: get the response headers (without the associated data)
- PUT: store the accompanying data at the specified location
- POST: submit the accompanying data as a subordinate of the specified location
Notes
- HTTP 1.1 requires that you specify the host name you are requesting the resource from.
This allows one IP to be associated with many 'web sites'. The web server
checks the Host field to determine which site the location refers to.
- Returned headers include: HTTP version, return code, response date, server type,
allowed operations, how return data is encoded, a description of the returned
data type.
HTTP/1.1 405 Method Not Allowed
Date: Wed, 16 Jan 2002 04:48:24 GMT
Server: Apache/1.3.19 (Win32)
Allow: GET, HEAD, OPTIONS, TRACE
Transfer-Encoding: chunked
Content-Type: text/html; charset=iso-8859-1
Authentication
Note: Examples can be found at phttp.zip
Authentication Examples
Authentication and the Apache Web Server
Note: These examples currently do not work on our server. I am asking the admins about it.
- Create a password file with user names and passwords. The contents
of my passwd file can be found below. It says that arnold's password
is prof and joe's password is student.
arnold:$apr1$qe2.....$PhSXQ.GvNdBVzxS/GVkJR0
joe:$apr1$cg2.....$AbIos1EJ.ofCVkkGUpFpZ0
- Restrict access for any resource below a directory by placing an .htaccess file
in it. The contents of the .htaccess file for phttp can be found below.
AuthType Basic
AuthName "By Invitation Only"
AuthUserFile "/home/a/arnold/public_html/phttp/passwd"
Require user arnold joe
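With AuthType Basic, the browser answers the server's 401 challenge by resending the request with an Authorization header whose value is the word Basic followed by the base64 encoding of user:password. The sketch below shows that encoding and its inverse; the function names are illustrative. Note that base64 is an encoding, not encryption, so the password is readable by anyone who sees the request.

```python
import base64

def basic_auth_value(user: str, password: str) -> str:
    """Build the value of an 'Authorization' header for Basic authentication."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"

def parse_basic_auth(header_value: str):
    """Recover (user, password) from the value of an 'Authorization' header."""
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not Basic authentication")
    # Split on the first colon only, since the password may contain colons.
    user, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return user, password

value = basic_auth_value("arnold", "prof")
print("Authorization:", value)
print(parse_basic_auth(value))
```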
Cookies
- Note: Examples can be found at chttp.zip, or see Cookie examples
- Part of the HTTP headers, NOT part of the data!
- Server can include a
Set-Cookie: name=value
header in its response.
- Client includes a
Cookie: name=value
header carrying each cookie it wishes to communicate back to the server (several cookies are separated by "; ").
- Client typically includes all relevant cookies in all requests to a server. Relevant cookies are those previously set by that server.
- Cookies are used to maintain session information, that is, to add state to a stateless protocol.
- Server usually associates cookies with previous choices, login information, etc.
- Example:
GET /cgi-bin/mycookiecgi.cgi HTTP/1.1
Host: www.cs.toronto.edu
- Details (from developer.netscape.com):
Set-Cookie:
name=value
[;EXPIRES=dateValue]
[;DOMAIN=domainName]
[;PATH=pathName]
[;SECURE]
name=value is a sequence of characters excluding semicolon, comma and white space.
To place restricted characters in the name or value,
use an encoding method such as URL-style %XX encoding.
EXPIRES=dateValue specifies a date string that defines the valid
life time of that cookie. Once the expiration date has been
reached, the cookie will no longer be stored or given out.
If you do not specify dateValue, the cookie expires when
the user's session ends.
The date string is formatted as:
Wdy, DD-Mon-YY HH:MM:SS GMT
where
Wdy is the day of the week (for example, Mon or Tues);
DD is a two-digit representation of the day of the month;
Mon is a three-letter abbreviation for the month
(for example, Jan or Feb);
YY is the last two digits of the year;
HH:MM:SS are hours, minutes, and seconds, respectively.
DOMAIN=domainName specifies the domain attributes for a valid cookie.
If you do not specify a value for domainName, Navigator uses the
host name of the server which generated the cookie response.
Example: DOMAIN=royalairways.com
matches hostnames anvil.royalairways.com
and ship.crate.royalairways.com
PATH=pathName specifies the path attributes for a valid cookie.
If you do not specify a value for pathName, Navigator uses the
path of the document that created the cookie property
(or the path of the document described by the HTTP header,
for CGI programming).
Example: PATH=/foo
matches /foobar and /foo/bar.html
The path "/" is the most general path.
Should a browser send back a cookie?
sendCookie(){
for(each cookie stored in cache){
if(DOMAIN matches){
if(PATH matches){
send cookie
}
}
}
}
SECURE specifies that the cookie is transmitted only if the
communications channel with the host is secure.
Only HTTPS (HTTP over SSL) servers are currently secure.
If SECURE is not specified, the cookie is considered safe to send over
any channel.
- secrets.php
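The DOMAIN, PATH and SECURE rules described in the details above can be combined into one decision function. This is a sketch under stated assumptions: it follows the simple tail match on the domain and prefix match on the path from the Netscape description (so PATH=/foo matches /foobar), it ignores EXPIRES, and the Cookie class and function names are hypothetical. Current browsers apply stricter rules.

```python
from dataclasses import dataclass

@dataclass
class Cookie:
    name: str
    value: str
    domain: str
    path: str = "/"
    secure: bool = False

def should_send(cookie, host, path, is_https):
    """Decide whether a stored cookie accompanies a request to host + path."""
    # DOMAIN: the request host is the domain itself or a subdomain of it.
    domain_ok = host == cookie.domain or host.endswith("." + cookie.domain)
    # PATH: simple prefix match, as in the description above.
    path_ok = path.startswith(cookie.path)
    # SECURE: a secure cookie is only sent over HTTPS.
    secure_ok = is_https or not cookie.secure
    return domain_ok and path_ok and secure_ok

def cookie_header(cookies, host, path, is_https=False):
    """Build the Cookie request header, or return None if nothing matches."""
    pairs = [f"{c.name}={c.value}" for c in cookies
             if should_send(c, host, path, is_https)]
    return "Cookie: " + "; ".join(pairs) if pairs else None

jar = [
    Cookie("id", "42", "royalairways.com", "/foo"),
    Cookie("token", "abc", "royalairways.com", "/", secure=True),
]
print(cookie_header(jar, "anvil.royalairways.com", "/foo/bar.html"))
print(cookie_header(jar, "anvil.royalairways.com", "/foo/bar.html", is_https=True))
```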
URL, URI and Encoding
The URL standard defines what a URL looks like as well as how parameters are to be passed to a server.
For example, to get a quote on Apple from Yahoo, you might go to the following URL.
http://ca.finance.yahoo.com/q?s=aapl&ql=1
This executes some script externally known as 'q' with parameters 's=aapl' and 'ql=1'.
If you look back at the HTTP protocol, you realize that part of this line ends up in the
INITIAL LINE of the HTTP request. This presents a bit of a problem: spaces in the INITIAL LINE
have meaning for the HTTP protocol, so spaces in the data have to be encoded in some way.
The URL encoding scheme replaces spaces with '+' and other special characters with %XX hexadecimal escapes (for example, an apostrophe becomes %27),
so that data appearing in a URL does not interfere with the HTTP protocol.
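Python's standard urllib.parse module implements this encoding, which makes it easy to experiment with. A short sketch using the parameters from the Yahoo URL above and a sample string containing a space and an apostrophe:

```python
from urllib.parse import quote_plus, urlencode, parse_qs

# Encode one value: the space becomes '+', the apostrophe becomes %27.
encoded = quote_plus("Who's there")
print(encoded)   # Who%27s+there

# Encode a whole set of parameters into a query string.
query = urlencode({"s": "aapl", "ql": "1"})
print(query)     # s=aapl&ql=1

# Decode a query string back into parameter names and values.
decoded = parse_qs("answer=Who%27s+there")
print(decoded)   # {'answer': ["Who's there"]}
```

parse_qs returns a list for each name because the same parameter may appear more than once in a query string.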
CGI
Say you want to write a program to provide dynamic content to the web. One choice is to
write a program which understands the HTTP protocol and responds dynamically. A more flexible
approach is to somehow extend the capabilities of an existing web server. CGI (Common Gateway Interface) describes
how a program, written in any language, can plug into a web server and provide
dynamic content to the web.
To plug into a web server, your program will need to know where to expect inputs and where to
place outputs. The CGI protocol specifies
this. Your program need only look in appropriately prepared environment variables
and in stdin for its inputs, and write an appropriate response to stdout as its output.
Consider the http request that arises from the following URL.
http://www.cs.toronto.edu/~arnold/cgi-bin/environment.cgi?answer=Who%27s+there
The web server determines that it needs to execute environment.cgi. It prepares environment variables as follows and then executes
environment.cgi. environment.cgi responds simply by printing to stdout. Each HTTP request executes the corresponding program (called a CGI script) once.
So the CGI script handles one request and then terminates.
HTTP_HOST=www.cs.toronto.edu
DOCUMENT_ROOT=/cs/htdocs
SERVER_ADDR=128.100.3.30
SCRIPT_URL=/~arnold/cgi-bin/environment.cgi
GATEWAY_INTERFACE=CGI/1.1
REQUEST_URI=/~arnold/cgi-bin/environment.cgi?answer=Who%27s+there
PATH=/usr/local/bin:/usr/bin:/bin
HTTP_X_FORWARDED_FOR=172.28.179.33
SERVER_PROTOCOL=HTTP/1.1
REMOTE_ADDR=137.132.3.8
HTTP_COOKIE=
HTTP_ACCEPT_LANGUAGE=en-US,en;q=0.5
SERVER_SIGNATURE=Apache/2.2.22 (Ubuntu) Server at www.cs.toronto.edu Port 80
SERVER_PORT=80
HTTP_USER_AGENT=Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:26.0) Gecko/20100101 Firefox/26.0
REMOTE_PORT=32393
QUERY_STRING=answer=Who%27s+there
SERVER_SOFTWARE=Apache/2.2.22 (Ubuntu)
SCRIPT_FILENAME=/cs/htuser/arnold/public_html/cgi-bin/environment.cgi
SCRIPT_URI=http://www.cs.toronto.edu/~arnold/cgi-bin/environment.cgi
HTTP_ACCEPT=text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
REQUEST_METHOD=GET
HTTP_CONNECTION=keep-alive
HTTP_ACCEPT_ENCODING=gzip, deflate
SERVER_ADMIN=www@cs.toronto.edu
HTTP_REFERER=http://www.cs.toronto.edu/~arnold/cp3101b/lectures/htmlIntro/forms.html
SERVER_NAME=www.cs.toronto.edu
SCRIPT_NAME=/~arnold/cgi-bin/environment.cgi
STDIN
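A CGI script that consumes these inputs might look like the sketch below. It is not the actual environment.cgi, whose source is not shown here. It reads the parameters from QUERY_STRING for a GET request, or from stdin for a POST request, using the standard CGI variable CONTENT_LENGTH (which does not appear in the listing above because that request was a GET). The function name handle_request is illustrative.

```python
import os
import sys
from html import escape
from urllib.parse import parse_qs

def handle_request(environ, stdin):
    """Build the CGI output: headers, an empty line, then the body."""
    if environ.get("REQUEST_METHOD", "GET") == "POST":
        # For POST, the URL-encoded form data arrives on stdin.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length)
    else:
        # For GET, the parameters are in the QUERY_STRING variable.
        raw = environ.get("QUERY_STRING", "")
    params = parse_qs(raw)
    answer = params.get("answer", [""])[0]
    # Escape user input before placing it in HTML.
    body = f"<html><body><p>answer = {escape(answer)}</p></body></html>\n"
    return "Content-Type: text/html\r\n\r\n" + body

if __name__ == "__main__":
    # The web server sets the environment, then runs the script once per request.
    sys.stdout.write(handle_request(os.environ, sys.stdin))
```

Keeping the logic in a function that takes the environment and stdin as arguments makes the script easy to try without a web server, for example by passing a plain dictionary.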