Notes for Week 3

  1. configuration file /etc/apache/httpd.conf

    • Scoping directives (directives go between these pairs)

      • <Directory directory path>
        </Directory>
        defaults for all directories:
        <Directory />
            Options FollowSymLinks
            AllowOverride None
        </Directory>
        
        (AllowOverride None indicates that options may not be overridden by .htaccess files)
      • <DirectoryMatch regular expression>
        </DirectoryMatch>
      • <Files file path>
        </Files>
      • <FilesMatch regular expression>
        </FilesMatch>
    • Logging directives

      • ErrorLog log file path
        default: /var/log/apache/error_log
      • LogLevel emerg | alert | crit | error | warn | notice | info | debug (from least to most verbose; messages at the chosen level and all more severe levels are logged)
        default: warn
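
To see which levels actually show up in the ErrorLog, a small shell helper can tally lines by level. This is only a sketch: count_levels is a made-up name, and it assumes each log line begins with "[timestamp] [level]", as in the sample line in the comment. Other log formats would need a different pattern.

```shell
#!/bin/sh
# count_levels [FILE...]: count error-log lines per LogLevel.
# Assumes lines of the form (an assumption about the log format):
#   [Wed Oct 11 14:32:52 2000] [error] [client 127.0.0.1] File does not exist: /x
# Reads stdin when no file is given.
count_levels() {
    # Keep only the word inside the second pair of brackets, then tally.
    sed -n 's/^\[[^]]*] \[\([a-z]*\)].*/\1/p' "$@" | sort | uniq -c
}
```

For example, count_levels /var/log/apache/error_log prints one line per level, each preceded by its count.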
    • Network directives

      • KeepAlive On (allows multiple HTTP requests per TCP connection)
        default: On
      • KeepAliveTimeout seconds (how long to wait before closing connection)
        default: 5
      • Listen IP address (which IP address to listen on)
        default: all
      • MaxKeepAliveRequests max requests per connection
        default: 100
      • Port port number to listen on
        default: 80
      • TimeOut seconds (maximum time to wait to receive a GET request, between packets of a POST or PUT request, or between ACKs while sending a response)
        default: 300
    • URL directives

      • DocumentRoot path to server root directory (omit trailing slash) - all URL paths are relative to this directory
        default: /srv/www/htdocs
      • ErrorDocument error_number path_to_error_message_file
        Error Number  Meaning
        100           Continue
        101           Switching Protocols
        200           OK
        201           Created
        202           Accepted
        203           Non-Authoritative Information
        204           No Content
        205           Reset Content
        206           Partial Content
        300           Multiple Choices
        301           Moved Permanently
        302           Found
        303           See Other
        304           Not Modified
        305           Use Proxy
        306           (Unused)
        307           Temporary Redirect
        400           Bad Request
        401           Unauthorized
        402           Payment Required
        403           Forbidden
        404           Not Found
        405           Method Not Allowed
        406           Not Acceptable
        407           Proxy Authentication Required
        408           Request Timeout
        409           Conflict
        410           Gone
        411           Length Required
        412           Precondition Failed
        413           Request Entity Too Large
        414           Request-URI Too Long
        415           Unsupported Media Type
        416           Requested Range Not Satisfiable
        417           Expectation Failed
        500           Internal Server Error
        501           Not Implemented
        502           Bad Gateway
        503           Service Unavailable
        504           Gateway Timeout
        505           HTTP Version Not Supported
      • Redirect old_URL_path new_URL_path
      • RedirectMatch regular_expression new_URL_path ($1 represents parenthesized part of regular expression)
    • Security directives

      • Allow from ALL | host | net | net/mask
      • Deny from ALL | host | net | net/mask
      • Options directory options delimited by spaces
        options include FollowSymLinks, Indexes
        Note that FollowSymLinks and Indexes are on by default if no Options directives apply; if you want to turn them off, you must either use the Options -FollowSymLinks and/or -Indexes directives, or use an Options None directive. It is probably most convenient to put the Options None directive in the default directory scope (<Directory />), and then turn on any desired options in individual directories.
      • Order Allow,Deny | Deny,Allow
        for example, to deny access to anyone except my.pc.at.foo.com:
        order deny,allow
        deny from all
        allow from my.pc.at.foo.com
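
The effect of Order can be sketched as a small decision function. This only models the logic and is not how Apache is implemented; access_result and its yes/no arguments are hypothetical names for illustration. With Deny,Allow the default is to allow and a matching Allow overrides a matching Deny; with Allow,Deny the default is to deny and a matching Deny overrides a matching Allow.

```shell
#!/bin/sh
# access_result ORDER MATCHES_ALLOW MATCHES_DENY
#   ORDER          "deny,allow" or "allow,deny"
#   MATCHES_ALLOW  "yes" if some Allow directive matches the client
#   MATCHES_DENY   "yes" if some Deny directive matches the client
access_result() {
    case "$1" in
        deny,allow)
            # Default is allow; denied only if a Deny matches and no Allow does.
            if [ "$3" = yes ] && [ "$2" != yes ]; then
                echo denied
            else
                echo allowed
            fi ;;
        allow,deny)
            # Default is deny; allowed only if an Allow matches and no Deny does.
            if [ "$2" = yes ] && [ "$3" != yes ]; then
                echo allowed
            else
                echo denied
            fi ;;
        *)
            echo "unknown order: $1" >&2
            return 1 ;;
    esac
}
```

In the example above, my.pc.at.foo.com matches both "deny from all" and the allow line, so access_result deny,allow yes yes prints allowed; every other host matches only the deny line and gets denied.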
    • Server directives

      • MaxClients max simultaneous child server processes
        default: 256
      • ServerRoot path where server lives, containing conf files, etc.
        default: /etc/apache
      • StartServers number of child server processes to start
        default: 5
  2. CGI - Common Gateway Interface

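    • program mode must be at least 705 (the server usually runs as an unprivileged user, so "others" need read and execute permission; 755 is typical)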
    • data from client is
      path?fieldname1=value&fieldname2=value
      (path? is stripped off)

      Data validation is extremely important in CGI! This is how most web server attacks are made: invalid form data (e.g., embedded SQL commands) that is not rejected may end up being executed if the CGI program passes it along unchecked.

    • most special characters are escaped as hex (%xx; e.g., "@" becomes %40)
    • GET request -> environment variable QUERY_STRING
      GET method is used to retrieve information (ordinary web requests, or from forms such as the one below)
    • POST request -> stdin
      In a POST request, information is processed by the server, but no reply is necessary (beyond status)
    • stdout must begin with a MIME header, e.g. "Content-type: text/html\r\n\r\n"
    • when maintaining CGI web content across multiple platforms, always use binary transfer
      If a CGI script is uploaded from a UNIX client to a Windows server in text mode, the Windows server will change \n to \r\n at the end of every line; if that same script is then transferred in binary mode to a UNIX server, the \r will be interpreted as part of the content, and NOT as end of line markers.
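
If a script has already picked up stray \r characters this way, its first line reads "#!/bin/sh\r" and the system looks for an interpreter whose name ends in \r, so the script fails to run. One possible repair is sketched below; strip_cr is a made-up helper name, and it assumes mktemp is available. Note that it deletes every \r in the file, not just those at line ends.

```shell
#!/bin/sh
# strip_cr FILE: delete every carriage return from FILE in place.
# The result is written back with cat (rather than mv) so that FILE keeps
# its mode bits and an executable CGI script stays executable.
strip_cr() {
    tmp=$(mktemp) || return 1
    tr -d '\r' < "$1" > "$tmp" && cat "$tmp" > "$1"
    status=$?
    rm -f "$tmp"
    return $status
}
```

Usage would be along the lines of strip_cr get.cgi, run on the UNIX server after the transfer.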

    Sample CGI script, which illustrates the data provided for either GET or POST requests:

    #!/bin/sh
    
    echo Content-type: text/plain
    echo 
    echo Server name is "$SERVER_NAME"
    echo Server is listening on port "$SERVER_PORT"
    echo Request method was "$REQUEST_METHOD"
    echo Query string is "$QUERY_STRING"
    echo Client IP address is "$REMOTE_ADDR"
    echo Content type is "$CONTENT_TYPE"
    echo Content length is "$CONTENT_LENGTH"
    echo Client will accept the MIME types "$HTTP_ACCEPT"
    echo Client Browser is "$HTTP_USER_AGENT"
    echo 
    echo stdin follows:
    read -r l
    while [ -n "$l" ]; do
        echo "$l"
        read -r l
    done
    echo end of stdin
    echo 
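
The sample script reads stdin line by line, which is fine for display. POST data is usually not terminated by a newline, though, and a CGI program is expected to read no more than CONTENT_LENGTH bytes. A byte-counted read might look like this sketch; read_post_body is a hypothetical helper, not part of the script above.

```shell
#!/bin/sh
# read_post_body: copy exactly CONTENT_LENGTH bytes of POST data from
# stdin to stdout. Fails if CONTENT_LENGTH is missing or not a number.
read_post_body() {
    case "$CONTENT_LENGTH" in
        ''|*[!0-9]*) return 1 ;;   # empty or contains a non-digit
    esac
    # bs=1 makes count a byte count, so nothing past the body is consumed.
    dd bs=1 count="$CONTENT_LENGTH" 2>/dev/null
}
```

In a POST handler this could be used as body=$(read_post_body) before picking the fields apart.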
    
    A CGI script for a calculator, which generates HTML; note the translation of escaped characters back into their original form for input into BASH:
    #!/bin/sh
    
    echo Content-type: text/html
    echo 
    echo '<html>'
    echo '<body bgcolor="#E0FFFF">'
    echo '<h3>'
    echo "Hi $REMOTE_ADDR, welcome to the calculator at $SERVER_NAME!"
    echo '</h3><p>'
    echo '<h2>'
    if [ "$REQUEST_METHOD" = GET ]; then
        text=$(echo "$QUERY_STRING" | sed -e 's/^.*text=//')
    else
        read l
        text=$(echo "$l" | sed -e 's/^.*text=//')
    fi
    trtext=$(echo $text | 
        sed -e 's/%2B/+/g' | 
        sed -e 's^%2F^/^g' | 
        sed -e 's/%26/\&/g' | 
        sed -e 's/%7C/|/g' | 
        sed -e 's/%21/!/g' | 
        sed -e 's/%7E/~/g' | 
        sed -e 's/%5E/^/g' | 
        sed -e 's/%28/(/g' | 
        sed -e 's^%29^)^g' | 
        sed -e 's/%3C/</g' |
        sed -e 's/%3E/>/g' |
        sed -e 's/%25/%/g')
    result=$(echo $(($trtext)))
    if [ $? -ne 0 ]; then
        echo "Syntax error in expression $trtext"
    else
        echo "$trtext = $result"
    fi
    echo '</h2>'
    echo '</body>'
    echo '</html>'
    echo 
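
The sed chain in the calculator translates a fixed set of escapes. A more general decoder might look like the sketch below. urldecode is a hypothetical helper name, not part of the original script, and it assumes an awk whose %c format turns a number into the character with that code (results for codes above 127 depend on the awk and the locale). Each escape is decoded exactly once in a single left-to-right pass, so a decoded % never combines with the characters that follow it.

```shell
#!/bin/sh
# urldecode STRING: print STRING with '+' turned into a space and every
# %xx escape replaced by the character with that hex code.
urldecode() {
    printf '%s\n' "$1" | awk '
    BEGIN { hex = "0123456789abcdef" }
    {
        s = $0
        gsub(/\+/, " ", s)              # form encoding: "+" means space
        out = ""
        while (match(s, /%[0-9A-Fa-f][0-9A-Fa-f]/)) {
            h = tolower(substr(s, RSTART + 1, 2))
            code = (index(hex, substr(h, 1, 1)) - 1) * 16 \
                 + (index(hex, substr(h, 2, 1)) - 1)
            out = out substr(s, 1, RSTART - 1) sprintf("%c", code)
            s = substr(s, RSTART + 3)   # continue after this escape
        }
        print out s
    }'
}
```

For the calculator, trtext=$(urldecode "$text") would stand in for the whole sed pipeline; the decoded text still needs validating before it is evaluated.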
    
  3. Sample forms for GET and POST

    • GET - this form (a checkbox labeled 'A button named "a_button"', a text field labeled "text:", and a Send button) was created using this code:
      <form action="http://linux265.rwc.uc.edu/cgi-bin/get.cgi" method="GET">
      <input type="checkbox" name="a_button" value="on">A button named "a_button"
      <p>
      text: <input type=text size=25 value="" name=text>
      <input type="submit" value="Send">
      </form>
      
    • POST - this form (a checkbox labeled 'A button named "a_button"', a text field labeled "text:", and a Send button) was created using this code:
      <form action="http://linux265.rwc.uc.edu/cgi-bin/post.cgi" method="POST">
      <input type="checkbox" name="a_button" value="on">A button named "a_button"
      <p>
      text: <input type=text size=25 value="" name=text>
      <input type="submit" value="Send">
      </form>
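
Either form above delivers its data as name=value pairs joined by "&", for example a_button=on&text=hi+there when the box is checked. A sketch of pulling out one field and rejecting anything unexpected follows. get_field and is_safe are made-up helper names, the field name is assumed to contain no regular expression metacharacters, and the set of accepted characters is only an illustration that a real script would adapt to its input.

```shell
#!/bin/sh
# get_field NAME DATA: print the raw (still URL-encoded) value of the
# first field called NAME in the form data string DATA.
get_field() {
    printf '%s\n' "$2" | tr '&' '\n' | sed -n "s/^$1=//p" | head -n 1
}

# is_safe VALUE: succeed only if VALUE is non-empty and made up solely of
# letters, digits, '+', '.', '_' and '-'. Anything else is rejected,
# including '%', so escaped special characters never get through.
is_safe() {
    case "$1" in
        ''|*[!A-Za-z0-9+._-]*) return 1 ;;
        *) return 0 ;;
    esac
}
```

A script might then do value=$(get_field text "$QUERY_STRING") and refuse to continue unless is_safe "$value" succeeds.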
      
  4. EXERCISES for Week 3:

    1. Configure Apache to only serve to the PCs on your island, and to the instructor (192.168.1.150). Do not use firewall rules to do this.
    2. First change the name of the file "index.html" to "home.html".

      Then turn off FollowSymLinks and Indexes for all directories. This is a non-trivial exercise because on our systems, /srv is a symbolic link!

    3. Create a form which asks the client for their first name (use name=firstname in the input tag). Write a short CGI script to send them an HTML page welcoming them by name to your system. Use both the GET and POST methods, calling the scripts get.cgi and post.cgi, respectively.


©2007, Kenneth R. Koehler. All Rights Reserved. This document may be freely reproduced provided that this copyright notice is included.

Please send comments or suggestions to the author.