Using TCP sockets, you will write a simplified version of an HTTP client and server. The client program will use HTTP to fetch a web page (stored in a file) from the server using the GET method, cache it, and then use conditional GET operations to fetch the file again only if it has been modified.
The HTTP client will perform the following functions:
1. Take a single command-line argument that specifies a URL containing the hostname and port where the server is running, as well as the name of the file to be fetched, in the appropriate format.
Example: localhost:12000/filename.html
2. If the file is not yet cached, use an HTTP GET operation to fetch the file named in the URL.
3. If the file is cached, use a conditional GET operation for the file named in the URL.
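Step 1 above can be sketched as follows. This is only one possible way to split the argument; the function name parse_url_arg and the tolerance for an optional http:// prefix are assumptions, not part of the assignment.

```python
def parse_url_arg(url: str) -> tuple[str, int, str]:
    """Split 'host:port/filename' into (host, port, '/filename')."""
    # Assumption: tolerate an optional scheme; the assignment example has none.
    if url.startswith("http://"):
        url = url[len("http://"):]
    hostport, slash, filename = url.partition("/")
    host, colon, port_text = hostport.partition(":")
    if not slash or not colon or not host or not filename:
        raise ValueError(f"expected host:port/filename, got {url!r}")
    # int() raises ValueError if the port is not numeric.
    return host, int(port_text), "/" + filename


host, port, path = parse_url_arg("localhost:12000/filename.html")
print(host, port, path)
```

The returned path keeps its leading slash so it can be placed directly in the request line.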
The HTTP server will perform the following functions:
1. Read command-line arguments specifying the IP address and port the server is to listen on, e.g. 127.0.0.1 12000
2. Open a TCP socket and listen for incoming HTTP GET and conditional GET requests from one or more HTTP clients at the above address and port
3. In the case of an HTTP GET request:
4. In the case of an HTTP conditional GET request:
5. In the case that the named file does not exist, return the appropriate "Not Found" error (return code 404)
6. The server must ignore all header fields in HTTP requests that it does not understand
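Requirement 6 above (ignoring header fields the server does not understand) might be handled while parsing the request. The sketch below is a minimal example under assumptions: the names parse_request and KNOWN_HEADERS are hypothetical, and the set of understood fields is taken to be Host and If-Modified-Since.

```python
# Assumption: the only request header fields this server acts on.
KNOWN_HEADERS = {"host", "if-modified-since"}


def parse_request(raw: bytes):
    """Return (method, path, headers), silently dropping unknown header fields."""
    # Everything before the first blank line is the request head.
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    lines = head.split("\r\n")
    method, path, _version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        # Split on the first colon only; header values may contain colons.
        name, sep, value = line.partition(":")
        key = name.strip().lower()
        if sep and key in KNOWN_HEADERS:
            headers[key] = value.strip()
    return method, path, headers


raw = (
    b"GET /filename.html HTTP/1.1\r\n"
    b"Host: localhost:12000\r\n"
    b"User-Agent: ExampleBrowser/1.0\r\n"
    b"If-Modified-Since: Fri, 02 Mar 2018 21:06:02 GMT\r\n"
    b"\r\n"
)
print(parse_request(raw))
```

Header names are compared in lower case because HTTP field names are case-insensitive.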
Simplifying Assumptions:
Cache Implementation:
Test Cases:
Enable Wireshark during all of the following test cases. (One Wireshark .pcap is fine for the test cases in this section, but the .pcap must show the test cases in this order.) Run the client four times, once for each test case.
1. Run client when web object not cached (or no cache exists): Using your HTTP client, fetch the contents of a text-based HTML file named filename.html from your HTTP server using the appropriate URL. Example: localhost:12000/filename.html. The client must:
2. Run client when web object cached, but not modified on server: Using your HTTP Client, send a conditional GET request to your HTTP server. The client must:
3. Run client when web object cached, but modified on server: Using your HTTP Client, send a conditional GET request to your HTTP server. The client must:
4. Web object does not exist: Using your HTTP Client, send a GET request for a filename that does not exist. The client must:
Using a web browser, such as Firefox or Chrome (note: the Safari web browser may not implement the conditional GET as expected), perform the same test cases as above on your server. Enable Wireshark during all of the following test cases. One Wireshark .pcap is fine for the web browser test cases.
1. Enter the URL in the web browser search bar and press <return>. The web browser should display the contents of the file downloaded from your server.
2. Re-enter or refresh the URL in the web browser search bar and press <return>. The web browser should show the same web page contents as in step 1 (assuming the file has not been modified), and the Wireshark trace should show a conditional GET and a "Not Modified" response.
3. Modify the file. Re-enter the URL in the web browser search bar and press <return>.
4. Enter a non-existent URL. The browser should indicate "Not Found".
HTTP messages are encoded as strings in the format defined by the HTTP specification.
As part of this assignment, your HTTP Client and HTTP Server programs are only expected to handle the following header fields:
HTTP Client GET Request Message:
Your GET Request must include the following:
Example:
GET /filename.html HTTP/1.1\r\n
Host: localhost:12000\r\n
\r\n
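The request above could be assembled as in the following sketch. The function name build_get_request is hypothetical; the point is that every line ends with \r\n and the header block ends with an empty line.

```python
def build_get_request(host: str, port: int, path: str) -> bytes:
    """Build a minimal GET request containing the request line and Host field."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}:{port}",
        "",  # empty line that terminates the header block
        "",  # makes join() emit the final \r\n
    ]
    return "\r\n".join(lines).encode("ascii")


request = build_get_request("localhost", 12000, "/filename.html")
print(repr(request))
```

The bytes returned here can be passed to a connected TCP socket's sendall() method.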
HTTP Server Response to Client GET Request (assuming file exists):
The response from the HTTP Server must include the following:
Example:
HTTP/1.1 200 OK\r\n
Date: Sun, 04 Mar 2018 21:24:58 GMT\r\n
Last-Modified: Fri, 02 Mar 2018 21:06:02 GMT\r\n
Content-Length: 75\r\n
Content-Type: text/html; charset=UTF-8\r\n
\r\n
<html><p>First Line<br />Second Line<br />Third Line<br />COMPLETE<p><html>
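A response like the example above might be produced as sketched below. The helper names http_date and build_ok_response are hypothetical, and the two timestamps are the epoch values corresponding to the dates in the example. Content-Length is computed from the body in bytes; the example page is 75 bytes long, which matches the header shown.

```python
import time


def http_date(secs: float) -> str:
    """Format seconds since 1 Jan 1970 as an HTTP date string in GMT."""
    return time.strftime("%a, %d %b %Y %H:%M:%S GMT", time.gmtime(secs))


def build_ok_response(body: bytes, mtime_secs: float, now_secs: float) -> bytes:
    """Build a 200 OK response carrying the file contents as the body."""
    head = (
        "HTTP/1.1 200 OK\r\n"
        f"Date: {http_date(now_secs)}\r\n"
        f"Last-Modified: {http_date(mtime_secs)}\r\n"
        f"Content-Length: {len(body)}\r\n"  # length in bytes, not characters
        "Content-Type: text/html; charset=UTF-8\r\n"
        "\r\n"
    )
    return head.encode("ascii") + body


body = b"<html><p>First Line<br />Second Line<br />Third Line<br />COMPLETE<p><html>"
response = build_ok_response(body, mtime_secs=1520024762, now_secs=1520198698)
print(response.decode("ascii"))
```

In a real server, mtime_secs would come from os.path.getmtime() and now_secs from time.time().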
HTTP Client Conditional GET Request Message:
Your GET Request must include the following:
Example:
GET /filename.html HTTP/1.1\r\n
Host: localhost:12000\r\n
If-Modified-Since: Fri, 02 Mar 2018 21:06:02 GMT\r\n
\r\n
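On the client side, choosing between a plain GET and a conditional GET might look like the sketch below. The cache layout is an assumption made for illustration only: a dictionary mapping each path to the Last-Modified string received from the server. The function name choose_request is likewise hypothetical.

```python
def choose_request(cache: dict, host: str, port: int, path: str) -> bytes:
    """Return a conditional GET if the path is cached, otherwise a plain GET."""
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}:{port}"]
    entry = cache.get(path)
    if entry is not None:
        # Echo back the Last-Modified value the server sent with the cached copy.
        lines.append(f"If-Modified-Since: {entry['last_modified']}")
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


# Hypothetical in-memory cache with one entry.
cache = {"/filename.html": {"last_modified": "Fri, 02 Mar 2018 21:06:02 GMT"}}
print(choose_request(cache, "localhost", 12000, "/filename.html").decode("ascii"))
print(choose_request(cache, "localhost", 12000, "/other.html").decode("ascii"))
```

Sending the server's own Last-Modified string back unchanged avoids any clock difference between client and server.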
HTTP Server Conditional Response Message (Not Modified):
HTTP/1.1 304 Not Modified\r\n
Date: Sun, 04 Mar 2018 21:24:58 GMT\r\n
\r\n
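The server's decision to send the response above could be sketched as follows. The names is_not_modified and build_not_modified are hypothetical. Two choices here are assumptions rather than requirements: an unparseable date falls back to a full response, and the file's modification time is truncated to whole seconds because HTTP dates carry no fractional part.

```python
import calendar
import time

HTTP_DATE = "%a, %d %b %Y %H:%M:%S GMT"


def is_not_modified(if_modified_since: str, mtime_secs: float) -> bool:
    """True if the file has not changed since the date the client sent."""
    try:
        parsed = time.strptime(if_modified_since.strip(), HTTP_DATE)
    except ValueError:
        return False  # assumption: treat a bad date as "send the full file"
    # calendar.timegm() interprets the time tuple as UTC.
    since_secs = calendar.timegm(parsed)
    return int(mtime_secs) <= since_secs


def build_not_modified(now_secs: float) -> bytes:
    """Build a 304 response, which has headers only and no body."""
    date = time.strftime(HTTP_DATE, time.gmtime(now_secs))
    return f"HTTP/1.1 304 Not Modified\r\nDate: {date}\r\n\r\n".encode("ascii")


print(is_not_modified("Fri, 02 Mar 2018 21:06:02 GMT", 1520024762.7))
print(build_not_modified(1520198698))
```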
HTTP Server Response when file not found:
HTTP/1.1 404 Not Found\r\n
Date: Sun, 04 Mar 2018 21:24:58 GMT\r\n
Content-Length: 0\r\n
\r\n
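The client has to tell the three response types (200, 304, 404) apart. A minimal parsing sketch is shown below; the function name parse_response is hypothetical, and it assumes the whole response has already been read from the socket into one bytes object.

```python
def parse_response(raw: bytes):
    """Return (status_code, headers, body) from a complete HTTP response."""
    head, _, rest = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    # Status line: version, numeric code, then a reason phrase that may contain spaces.
    _version, code, _reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    # A 304 or 404 as shown above carries no body, so default to 0.
    length = int(headers.get("content-length", "0"))
    return int(code), headers, rest[:length]


not_found = (
    b"HTTP/1.1 404 Not Found\r\n"
    b"Date: Sun, 04 Mar 2018 21:24:58 GMT\r\n"
    b"Content-Length: 0\r\n"
    b"\r\n"
)
print(parse_response(not_found))
```

A client could then branch on the status code: store the body on 200, reuse the cached copy on 304, and report an error on 404.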
Get current time in UTC/GMT time zone and convert to string in HTTP format:
import datetime
# Use an aware UTC datetime; write the literal "GMT" because %Z would print "UTC".
t = datetime.datetime.now(datetime.timezone.utc)
date = t.strftime("%a, %d %b %Y %H:%M:%S GMT\r\n")
Determine a file's modification time (in seconds since 1 Jan, 1970 on Unix machines):
import os.path
secs = os.path.getmtime(filename)
Convert the above time to UTC/GMT (returns a time tuple):
import time
t = time.gmtime(secs)
Convert above time tuple to a string in HTTP format:
last_mod_time = time.strftime("%a, %d %b %Y %H:%M:%S GMT\r\n", t)
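Convert a date/time in string format back to a time tuple and seconds since 1 Jan, 1970:
import calendar
t = time.strptime(last_mod_time, "%a, %d %b %Y %H:%M:%S GMT\r\n")
# time.mktime() would treat t as local time; calendar.timegm() treats it as UTC.
secs = calendar.timegm(t)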
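The conversions above can be checked end to end with a round trip, sketched below. The helper names format_http_date and parse_http_date are hypothetical, and a temporary file stands in for the web page. The sketch uses calendar.timegm() for the reverse step because it interprets the time tuple as UTC, which is what a GMT date string requires.

```python
import calendar
import os
import tempfile
import time

HTTP_DATE = "%a, %d %b %Y %H:%M:%S GMT"


def format_http_date(secs: float) -> str:
    """Seconds since 1 Jan 1970 -> HTTP date string (fractional seconds dropped)."""
    return time.strftime(HTTP_DATE, time.gmtime(secs))


def parse_http_date(text: str) -> int:
    """HTTP date string -> whole seconds since 1 Jan 1970, interpreted as UTC."""
    return calendar.timegm(time.strptime(text.strip(), HTTP_DATE))


# Create a throwaway file so the example does not depend on any existing path.
with tempfile.NamedTemporaryFile(suffix=".html", delete=False) as f:
    f.write(b"<html></html>")
    name = f.name
try:
    mtime = os.path.getmtime(name)
    header_value = format_http_date(mtime)
    round_trip = parse_http_date(header_value)
    print(header_value, round_trip)
finally:
    os.remove(name)
```

Because the string form has one-second resolution, the round-tripped value equals the modification time truncated to whole seconds, so comparisons should be made on whole seconds.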