Hypertext Transfer Protocol (HTTP)


Einleitung


HTTP Konzepte

HTTP 0.9, 1.0, 1.1 spezifiziert die Syntax und Semantik von Anfragen und Antworten.

Ablauf einer Anfrage

www.uni-mannheim.de, localhost

Telnet-Verbindung an den Port 80

$ telnet localhost 80
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
GET / HTTP/1.0
User-Agent: Heinz Kredel
Host: localhost
Accept: */*

Antwort des Web-Servers

HTTP/1.1 200 OK
Date: Sat, 07 Nov 1998 20:53:21 GMT
Server: Apache/1.2.4 S.u.S.E./5.1
Last-Modified: Thu, 21 May 1998 19:19:48 GMT
ETag: "21047-44c-35647e54"
Content-Length: 1100
Accept-Ranges: bytes
Connection: close
Content-Type: text/html

<HTML>
...
</HTML>
Connection closed by foreign host.

Die Schritte der Anfrage im Detail

  1. TCP/IP-Verbindung (z.B. telnet) an einen verabredeten Port (default 80)

    erste Zeile: method-Kommando mit Parametern

    GET / HTTP/1.0
    
    
  2. folgende Zeilen: Informationen des Browsers (UA) an den Server, abgeschlossen durch eine Leerzeile.

    User-Agent: Heinz Kredel
    Host: localhost
    Accept: */*
    
    
  3. Abhägig von method folgen weitere Informationen.

    bei POST z.B. die Daten des Formulars, oder auch eine Datei.

Allgemein:

Method-Line
General-Header(s)
Request-Header(s)
Entity-Header(s)
CRLF
Entity-Body

Die Schritte der Antwort des Servers

  1. erste Zeile: Statuszeile über Erfolg oder Fehler

    HTTP/1.1 200 OK
    
  2. folgende Zeilen: Header-Informationen des Servers, abgeschlossen durch eine Leerzeile.

    Date: Sat, 07 Nov 1998 20:53:21 GMT
    Server: Apache/1.2.4 S.u.S.E./5.1
    Last-Modified: Thu, 21 May 1998 19:19:48 GMT
    ETag: "21047-44c-35647e54"
    Content-Length: 1100
    Accept-Ranges: bytes
    Connection: close
    Content-Type: text/html
    
    
  3. Abhägig von Content-Type folgen weitere Informationen.

    zum Beispiel bei text/html eine HTML-Datei, oder bei image/gif ein GIF-Bild.

Allgemein:

Status-Line
General-Header(s)
Response-Header(s)
Entity-Header(s)
CRLF
Entity-Body

Nach dem Abbau der Verbindung gibt es auf beiden Seiten keine Informationen mehr über den Partner.

Methoden

Statusmeldungen des Servers

HTTP Header

Allgemeine Header

Client Anfragen

Server Antworten

Inhaltsangaben

MIME-Types

Multipurpose Internet Mail Extensions

Typ/Subtyp Datei-Erweiterung Datei-Typ
application/pdf pdf Portable Document Format
von Adobe
application/ps eps, ps PostScript
application/x-tar tar "Tape-Archive"
audio/basic au, snd Audioformate
image/gif gif Graphikformat
image/tiff tiff, tif Graphikformat
text/html html, htm HTML Datei
text/plain txt reine ASCII Datei

Web-Server

Arbeitsweise

(Apache) Konfiguration

httpd.conf:

# ServerType is either inetd, or standalone.

ServerType standalone

# If you are running from inetd, go to "ServerAdmin".

# Port: The port the standalone listens to. For ports < 1023, you will
# need httpd to be run as root initially.

Port 80

# HostnameLookups: Log the names of clients or just their IP numbers
#   e.g.   www.apache.org (on) or 204.62.129.132 (off)
# You should probably turn this off unless you are going to actually
# use the information in your logs, or with a CGI.  Leaving this on
# can slow down access to your site.
HostnameLookups on

# If you wish httpd to run as a different user or group, you must run
# httpd as root initially and it will switch.  

# User/Group: The name (or #number) of the user/group to run httpd as.
#  On SCO (ODT 3) use User nouser and Group nogroup
#  On HPUX you may not be able to use shared memory as nobody, and the
#  suggested workaround is to create a user www and use that user.
User wwwrun
Group #-2

# The following directive disables keepalives and HTTP header flushes for
# Netscape 2.x and browsers which spoof it. There are known problems with
# these

BrowserMatch Mozilla/2 nokeepalive

# ServerAdmin: Your address, where problems with the server should be
# e-mailed.

ServerAdmin you@your.address

# ServerRoot: The directory the server's config, error, and log files
# are kept in

ServerRoot /usr/local/httpd

# BindAddress: You can support virtual hosts with this option. This option
# is used to tell the server which IP address to listen to. It can either
# contain "*", an IP address, or a fully qualified Internet domain name.
# See also the VirtualHost directive.

#BindAddress *

# ErrorLog: The location of the error log file. If this does not start
# with /, ServerRoot is prepended to it.

ErrorLog /var/log/httpd.error_log

# TransferLog: The location of the transfer log file. If this does not
# start with /, ServerRoot is prepended to it.

TransferLog /var/log/httpd.access_log

# PidFile: The file the server should log its pid to
PidFile /var/run/httpd.pid

# ScoreBoardFile: File used to store internal server process information.
# Not all architectures require this.  But if yours does (you'll know because
# this file is created when you run Apache) then you *must* ensure that
# no two invocations of Apache share the same scoreboard file.
ScoreBoardFile /var/log/apache_status

# ServerName allows you to set a host name which is sent back to clients for
# your server if it's different than the one the program would get (i.e. use
# "www" instead of the host's real name).
#
# Note: You cannot just invent host names and hope they work. The name you 
# define here must be a valid DNS name for your host. If you don't understand
# this, ask your network administrator.

#ServerName new.host.name

# CacheNegotiatedDocs: By default, Apache sends Pragma: no-cache with each
# document that was negotiated on the basis of content. This asks proxy
# servers not to cache the document. Uncommenting the following line disables
# this behavior, and proxies will be allowed to cache the documents.

#CacheNegotiatedDocs

# Timeout: The number of seconds before receives and sends time out

Timeout 300

# KeepAlive: Whether or not to allow persistent connections (more than
# one request per connection). Set to "Off" to deactivate.

KeepAlive On

# MaxKeepAliveRequests: The maximum number of requests to allow
# during a persistent connection. Set to 0 to allow an unlimited amount.
# We reccomend you leave this number high, for maximum performance.

MaxKeepAliveRequests 100

# KeepAliveTimeout: Number of seconds to wait for the next request

KeepAliveTimeout 15

# Server-pool size regulation.  Rather than making you guess how many
# server processes you need, Apache dynamically adapts to the load it
# sees --- that is, it tries to maintain enough server processes to
# handle the current load, plus a few spare servers to handle transient
# load spikes (e.g., multiple simultaneous requests from a single
# Netscape browser).

# It does this by periodically checking how many servers are waiting
# for a request.  If there are fewer than MinSpareServers, it creates
# a new spare.  If there are more than MaxSpareServers, some of the
# spares die off.  These values are probably OK for most sites ---

MinSpareServers 5
MaxSpareServers 10

# Number of servers to start --- should be a reasonable ballpark figure.

StartServers 5

# Limit on total number of servers running, i.e., limit on the number
# of clients who can simultaneously connect --- if this limit is ever
# reached, clients will be LOCKED OUT, so it should NOT BE SET TOO LOW.
# It is intended mainly as a brake to keep a runaway server from taking
# Unix with it as it spirals down...

MaxClients 150

# MaxRequestsPerChild: the number of requests each child process is
#  allowed to process before the child dies.
#  The child will exit so as to avoid problems after prolonged use when
#  Apache (and maybe the libraries it uses) leak.  On most systems, this
#  isn't really needed, but a few (such as Solaris) do have notable leaks
#  in the libraries.

MaxRequestsPerChild 30

# Proxy Server directives. Uncomment the following line to
# enable the proxy server:

#ProxyRequests On

# To enable the cache as well, edit and uncomment the following lines:

#CacheRoot /usr/local/etc/httpd/proxy
#CacheSize 5
#CacheGcInterval 4
#CacheMaxExpire 24
#CacheLastModifiedFactor 0.1
#CacheDefaultExpire 1
#NoCache a_domain.com another_domain.edu joes.garage_sale.com

# Listen: Allows you to bind Apache to specific IP addresses and/or
# ports, in addition to the default. See also the VirtualHost command

#Listen 3000
#Listen 12.34.56.78:80

# VirtualHost: Allows the daemon to respond to requests for more than one
# server address, if your server machine is configured to accept IP packets
# for multiple addresses. This can be accomplished with the ifconfig 
# alias flag, or through kernel patches like VIF.

# Any httpd.conf or srm.conf directive may go into a VirtualHost command.
# See alto the BindAddress entry.
 
#<VirtualHost host.some_domain.com>
#ServerAdmin webmaster@host.some_domain.com
#DocumentRoot /www/docs/host.some_domain.com
#ServerName host.some_domain.com
#ErrorLog logs/host.some_domain.com-error_log
#TransferLog logs/host.some_domain.com-access_log
#</VirtualHost>

#
# Read config files from /etc/httpd
#
ResourceConfig  /etc/httpd/srm.conf
AccessConfig    /etc/httpd/access.conf
TypesConfig     /etc/httpd/mime.types

Problem: starten externer CGI-Programme ausgehend von Port 80

srm.conf:

# DocumentRoot: The directory out of which you will serve your
# documents. By default, all requests are taken from this directory, but
# symbolic links and aliases may be used to point to other locations.

DocumentRoot /usr/local/httpd/htdocs

# UserDir: The name of the directory which is appended onto a user's home
# directory if a ~user request is recieved.

UserDir public_html

# DirectoryIndex: Name of the file or files to use as a pre-written HTML
# directory index.  Separate multiple entries with spaces.

DirectoryIndex index.html

# FancyIndexing is whether you want fancy directory indexing or standard

FancyIndexing on

# AddIcon tells the server which icon to show for different files or filename
# extensions

AddIconByEncoding (CMP,/icons/compressed.gif) x-compress x-gzip

AddIconByType (TXT,/icons/text.gif) text/*
AddIconByType (IMG,/icons/image2.gif) image/*
AddIconByType (SND,/icons/sound2.gif) audio/*
AddIconByType (VID,/icons/movie.gif) video/*

AddIcon /icons/binary.gif .bin .exe
AddIcon /icons/binhex.gif .hqx
AddIcon /icons/tar.gif .tar
AddIcon /icons/world2.gif .wrl .wrl.gz .vrml .vrm .iv
AddIcon /icons/compressed.gif .Z .z .tgz .gz .zip
AddIcon /icons/a.gif .ps .ai .eps
AddIcon /icons/layout.gif .html .shtml .htm .pdf
AddIcon /icons/text.gif .txt
AddIcon /icons/c.gif .c
AddIcon /icons/p.gif .pl .py
AddIcon /icons/f.gif .for
AddIcon /icons/dvi.gif .dvi
AddIcon /icons/uuencoded.gif .uu
AddIcon /icons/script.gif .conf .sh .shar .csh .ksh .tcl
AddIcon /icons/tex.gif .tex
AddIcon /icons/bomb.gif core

AddIcon /icons/back.gif ..
AddIcon /icons/hand.right.gif README
AddIcon /icons/folder.gif ^^DIRECTORY^^
AddIcon /icons/blank.gif ^^BLANKICON^^

# DefaultIcon is which icon to show for files which do not have an icon
# explicitly set.

DefaultIcon /icons/unknown.gif

# AddDescription allows you to place a short description after a file in
# server-generated indexes.
# Format: AddDescription "description" filename

# ReadmeName is the name of the README file the server will look for by
# default. Format: ReadmeName name
#
# The server will first look for name.html, include it if found, and it will
# then look for name and include it as plaintext if found.
#
# HeaderName is the name of a file which should be prepended to
# directory indexes. 

ReadmeName README
HeaderName HEADER

# IndexIgnore is a set of filenames which directory indexing should ignore
# Format: IndexIgnore name1 name2...

IndexIgnore */.??* *~ *# */HEADER* */README* */RCS

# AccessFileName: The name of the file to look for in each directory
# for access control information.

AccessFileName .htaccess

# DefaultType is the default MIME type for documents which the server
# cannot find the type of from filename extensions.

DefaultType text/plain

# AddEncoding allows you to have certain browsers (Mosaic/X 2.1+) uncompress
# information on the fly. Note: Not all browsers support this.

AddEncoding x-compress Z
AddEncoding x-gzip gz

# AddLanguage allows you to specify the language of a document. You can
# then use content negotiation to give a browser a file in a language
# it can understand.  Note that the suffix does not have to be the same
# as the language keyword --- those with documents in Polish (whose
# net-standard language code is pl) may wish to use "AddLanguage pl .po" 
# to avoid the ambiguity with the common suffix for perl scripts.

AddLanguage en .en
AddLanguage fr .fr
AddLanguage de .de
AddLanguage da .da
AddLanguage el .el
AddLanguage it .it

# LanguagePriority allows you to give precedence to some languages
# in case of a tie during content negotiation.
# Just list the languages in decreasing order of preference.

LanguagePriority en fr de

# Redirect allows you to tell clients about documents which used to exist in
# your server's namespace, but do not anymore. This allows you to tell the
# clients where to look for the relocated document.
# Format: Redirect fakename url


# Aliases: Add here as many aliases as you need (with no limit). The format is 
# Alias fakename realname

# Note that if you include a trailing / on fakename then the server will
# require it to be present in the URL.  So "/icons" isn't aliased in this
# example.

Alias /icons/ /usr/local/httpd/icons/

# ScriptAlias: This controls which directories contain server scripts.
# Format: ScriptAlias fakename realname

ScriptAlias /cgi-bin/ /usr/local/httpd/cgi-bin/

# If you want to use server side includes, or CGI outside
# ScriptAliased directories, uncomment the following lines.

# AddType allows you to tweak mime.types without actually editing it, or to
# make certain files to be certain types.
# Format: AddType type/subtype ext1

# AddHandler allows you to map certain file extensions to "handlers",
# actions unrelated to filetype. These can be either built into the server
# or added with the Action command (see below)
# Format: AddHandler action-name ext1

# To use CGI scripts:
AddHandler cgi-script .cgi

# To use server-parsed HTML files
#AddType text/html .shtml
#AddHandler server-parsed .shtml

# Uncomment the following line to enable Apache's send-asis HTTP file
# feature
#AddHandler send-as-is asis

# If you wish to use server-parsed imagemap files, use
#AddHandler imap-file map

# To enable type maps, you might want to use
#AddHandler type-map var

# Action lets you define media types that will execute a script whenever
# a matching file is called. This eliminates the need for repeated URL
# pathnames for oft-used CGI file processors.
# Format: Action media/type /cgi-script/location
# Format: Action handler-name /cgi-script/location

# MetaDir: specifies the name of the directory in which Apache can find
# meta information files. These files contain additional HTTP headers
# to include when sending the document

#MetaDir .web

# MetaSuffix: specifies the file name suffix for the file containing the
# meta information.

#MetaSuffix .meta

# Customizable error response (Apache style)
#  these come in three flavors
#
#    1) plain text
#ErrorDocument 500 "The server made a boo boo.
#  n.b.  the (") marks it as text, it does not get output
#
#    2) local redirects
#ErrorDocument 404 /missing.html
#  to redirect to local url /missing.html
#ErrorDocument 404 /cgi-bin/missing_handler.pl
#  n.b. can redirect to a script or a document using server-side-includes.
#
#    3) external redirects
#ErrorDocument 402 http://some.other_server.com/subscription_info.html
#

access.conf:

# This should be changed to whatever you set DocumentRoot to.

<Directory /usr/local/httpd/htdocs>

# This may also be "None", "All", or any combination of "Indexes",
# "Includes", "FollowSymLinks", "ExecCGI", or "MultiViews".

# Note that "MultiViews" must be named *explicitly* --- "Options All"
# doesn't give it to you (or at least, not yet).

Options Indexes FollowSymLinks

# This controls which options the .htaccess files in directories can
# override. Can also be "All", or any combination of "Options", "FileInfo", 
# "AuthConfig", and "Limit"

AllowOverride None

# Controls who can get stuff from this server.

order allow,deny
allow from all

</Directory>

# /usr/local/etc/httpd/cgi-bin should be changed to whatever your ScriptAliased
# CGI directory exists, if you have that configured.

<Directory /usr/local/httpd/cgi-bin>
AllowOverride None
Options None
</Directory>

# Allow server status reports, with the URL of http://servername/server-status
# Change the ".your_domain.com" to match your domain to enable.

#<Location /server-status>
#SetHandler server-status

#order deny,allow
#deny from all
#allow from .your_domain.com
#</Location>

# There have been reports of people trying to abuse an old bug from pre-1.1
# days.  This bug involved a CGI script distributed as a part of Apache.
# By uncommenting these lines you can redirect these attacks to a logging 
# script on phf.apache.org.  Or, you can record them yourself, using the script
# support/phf_abuse_log.cgi.

#<Location /cgi-bin/phf*>
#deny from all
#ErrorDocument 403 http://phf.apache.org/phf_abuse_log.cgi
#</Location>

# You may place any other directories or locations you wish to have
# access information for after this one.

mime.types:

application/activemessage
application/andrew-inset
application/applefile
application/atomicmail
application/dca-rft
application/dec-dx
application/mac-binhex40	hqx
application/mac-compactpro	cpt
application/macwriteii
application/msword		doc
application/news-message-id
application/news-transmission
application/octet-stream	bin dms lha lzh exe class
application/oda			oda
application/pdf			pdf
application/postscript		ai eps ps
application/powerpoint		ppt
application/remote-printing
application/rtf			rtf
application/slate
application/wita
application/wordperfect5.1
application/x-bcpio		bcpio
application/x-cdlink		vcd
application/x-compress
application/x-cpio		cpio
application/x-csh		csh
application/x-director		dcr dir dxr
application/x-dvi		dvi
application/x-gtar		gtar
application/x-gzip
application/x-hdf		hdf
application/x-koan		skp skd skt skm
application/x-latex		latex
application/x-mif		mif
application/x-netcdf		nc cdf
application/x-sh		sh
application/x-shar		shar
application/x-stuffit		sit
application/x-sv4cpio		sv4cpio
application/x-sv4crc		sv4crc
application/x-tar		tar
application/x-tcl		tcl
application/x-tex		tex
application/x-texinfo		texinfo texi
application/x-troff		t tr roff
application/x-troff-man		man
application/x-troff-me		me
application/x-troff-ms		ms
application/x-ustar		ustar
application/x-wais-source	src
application/zip			zip
audio/basic			au snd
audio/midi			mid midi kar
audio/mpeg			mpga mp2
audio/x-aiff			aif aiff aifc
audio/x-pn-realaudio		ram
audio/x-pn-realaudio-plugin	rpm
audio/x-realaudio		ra
audio/x-wav			wav
chemical/x-pdb			pdb xyz
image/gif			gif
image/ief			ief
image/jpeg			jpeg jpg jpe
image/png			png
image/tiff			tiff tif
image/x-cmu-raster		ras
image/x-portable-anymap		pnm
image/x-portable-bitmap		pbm
image/x-portable-graymap	pgm
image/x-portable-pixmap		ppm
image/x-rgb			rgb
image/x-xbitmap			xbm
image/x-xpixmap			xpm
image/x-xwindowdump		xwd
message/external-body
message/news
message/partial
message/rfc822
multipart/alternative
multipart/appledouble
multipart/digest
multipart/mixed
multipart/parallel
text/html			html htm
text/plain			txt
text/richtext			rtx
text/tab-separated-values	tsv
text/x-setext			etx
text/x-sgml			sgml sgm
video/mpeg			mpeg mpg mpe
video/quicktime			qt mov
video/x-msvideo			avi
video/x-sgi-movie		movie
x-conference/x-cooltalk		ice
x-world/x-vrml			wrl vrml

Sicherheit

Ablauf:

  1. Zugriff auf geschützte Seite

  2. Server schickt Fehlermeldung 401 Unauthorized  und verlangt Authentifizierung

  3. Browser fragt mit Dialogbox nach den Informationen, z.B. Benutzer/Passwort

  4. Browser wiederholt die Anfrage nach der geschützten Seite, diesmal aber mit Authentifizierung

  5. falls OK, schickt der Server die Seite

Installation


Ausblick


© Universität Mannheim, Rechenzentrum, 1998-2000.

Heinz Kredel
Last modified: Thu Nov 18 16:41:48 MET 1999