Advanced HTML Authoring (Stretching the Web)
- Part 1 - Intro, Imagemaps, Simple Dynamic Docs
- Dr Lawrie Brown
- Computer Science, ADFA
- Administrivia
- background familiarity assumed with:
- navigating the WWW using web browsers
- basic authoring of static HTML documents
- simple script programming
- the web version of these notes is at:
- will have a practical session later each day
- Introduction
- will discuss the following features
- Clickable Images
- provides "hotspot" regions on an image
- Dynamic Document Creation
- documents whose contents change
- may be static with included information
- may be entirely dynamic, created by CGI script
- Forms
- enables information to be entered and returned
- Accessing Documents on the Web
- key (but not only) protocol is HTTP
- HTTP = HyperText Transfer Protocol
- agreed language and procedures for exchanging hypermedia (not just text)
documents
- User interacts with a client program (browser) eg Cello, Lynx, Netscape
- browser requests information from server programs (CERN, NCSA) using HTTP
- Accessing Documents
- each document (or separate part eg graphic) involves a NEW connection to
some server
- each request is separate, independent
- HTTP gives form of request & response
- HTTP Servers
- program which accepts HTTP requests
- run continuously, listening for requests
- many servers, commercial & free exist
- must run on a suitably accessible system
- high availability
- good network connectivity
- server installation beyond scope of this course
- HTTP Protocol
- two versions of the HTTP protocol are in use:
- HTTP/0.9 original simple protocol
- GET url <CR><LF>
- document is returned (in raw binary format)
- HTTP/1.0 more complex protocol with multimedia support & complex
responses
- GET url HTTP/1.0 <CR><LF><CR><LF>
- document is returned (MIME encoded & tagged)
- also has HEAD & POST requests
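The two request forms above can be sketched in a few lines of Python. This is only an illustration: the function name `build_request` and its parameters are assumptions made for the example, and no headers are sent.

```python
def build_request(url: str, version: str = "1.0", method: str = "GET") -> bytes:
    """Return the raw bytes a client would send for a simple request."""
    if version == "0.9":
        # HTTP/0.9: GET only, a single line, no version token, no headers.
        if method != "GET":
            raise ValueError("HTTP/0.9 only supports GET")
        return f"GET {url}\r\n".encode("ascii")
    # HTTP/1.0: request line, no headers in this sketch, then an empty line.
    return f"{method} {url} HTTP/1.0\r\n\r\n".encode("ascii")


print(build_request("/index.html", version="0.9"))
print(build_request("/index.html"))
print(build_request("/index.html", method="HEAD"))
```

The only visible difference on the wire is the version token and the empty line that ends the (here empty) header section.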
- Structure of HTTP Response
- HTTP/1.0 200 Document follows
- MIME-Version: 1.0
- Server: CERN/3.0
- Content-Type: text/html
- (a blank line ends the headers; the document body follows)
- <html><head>
- <title>My Doc</title></head><body>
- .... body of HTML document
- </body></html>
- Structure of HTTP Response
- HTTP/1.0 200 Document follows
- MIME-Version: 1.0
- Server: CERN/3.0
- Content-Type: image/gif
- (a blank line ends the headers; the image data follows)
- GIF87a .... rest of binary gif image
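As a rough sketch of how a client might take such a response apart, the Python below splits it into status line, headers and body. The function name `parse_response` is an assumption for illustration, and the sketch expects a well-formed response with CR LF line endings.

```python
def parse_response(raw: bytes):
    """Split a raw HTTP/1.0 response into (version, status, reason, headers, body)."""
    # The headers end at the first empty line; everything after it is the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("latin-1").split("\r\n")
    version, status, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(status), reason, headers, body


raw = (b"HTTP/1.0 200 Document follows\r\n"
       b"MIME-Version: 1.0\r\n"
       b"Server: CERN/3.0\r\n"
       b"Content-Type: image/gif\r\n"
       b"\r\n"
       b"GIF87a...")
version, status, reason, headers, body = parse_response(raw)
print(status, reason, headers["content-type"], body[:6])
```

The Content-Type header is what tells the browser whether to render the body as HTML or display it as an image.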
- Clickable Images
- provides "hotspot" regions on an image
- different locations link to different docs
- can provide a more intuitive interface for spatially oriented
information, BUT
- at a cost of additional:
- download time (graphics are larger than text)
- resources (needed to map a clicked location to URL)
- Using Imagemaps
- a graphical inline image (gif)
- a map file - relating areas on image to URLs
- a mechanism to take a location and map using the map file to some desired
URL
- an HTML document that includes the image and ties all of the above together
- Image
- may be any graphic which can be included in an HTML document
- currently GIF or perhaps JPEG are suitable
- should be aware of image size
- larger images are slower to download
- use the ISMAP attribute on the IMG tag when including it in an HTML document
- <IMG SRC="nifty.gif" ISMAP>
- informs browser that users can select areas on it
- Map File
- a text file containing info on which parts of the image are linked to
which URLs
- can partition image into:
- rectangles, circles, polygons & a default
- can create the file manually or with a map editor
- program which lets you draw regions on the image and automatically
creates the map file
- unfortunately there are TWO equivalent but incompatible standards (more
later)
- Mapping Mechanism
- a program running on the HTTP server
- given the coordinates in the image selected by the user & the map file
- redirects the browser to the desired URL
- can use, with appropriate map file:
- htimage (CERN)
- imagemap (NCSA)
- in future this may be done by browser
- HTML document
- may be any HTML document including:
- an anchor referring to the mapping program and the map file
- the anchor "text" is the included image
- <A HREF="/cgi-bin/htimage/~user/nifty.map"><IMG SRC="nifty.gif" ISMAP></A>
- note: map file name immediately follows mapping program name
- server may allow shorter references
- "/img/~user/nifty.map" or "/~user/nifty.map"
- Which Standard?
- unfortunately there are TWO equivalent but incompatible standards for
these:
- htimage (CERN)
- imagemap (NCSA)
- functionally these are basically the same
- need to determine which (perhaps both) are run on the server you use
- check with your administrator for details
- htimage
- provided with the CERN web server
- example HTML file using htimage
- in file ~lpb/maptest1.html
- <html><head><title>htimage map test</title></head>
- <body><h1>Map Test (htimage)</h1>
- Please select a country:<p>
- <a href="/cgi-bin/htimage/~lpb/aus-nz.cmap">
- <img src="aus-nz.gif" ISMAP></a>
- </body></html>
- htimage map file format
- may contain any of the following:
- default URL
- circle (x,y) r URL
- rectangle (x1,y1) (x2,y2) URL
- polygon (x1,y1) (x2,y2) ... (xn,yn) URL
- shapes are checked in order, first match used
- URLs may be:
- full (with access method, hostname, path)
- partial (file pathname must start with /)
- htimage map file example
- aus-nz.cmap
- rect (88,53) (117,96) /~lpb/map_resp1.html
- rect (2,2) (86,74) /~lpb/map_resp2.html
- default /~lpb/fumble.html
- nb. keywords can be abbreviated
- def, circ, rect, poly accepted
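To make the mapping step concrete, here is a minimal Python sketch of what a program like htimage has to do with such a file: read the rules, then return the URL of the first shape containing the clicked point, falling back to the default. It is not the CERN implementation; the names `parse_map`, `in_polygon` and `lookup` are assumptions for illustration, and the parser expects coordinates written without spaces, as in the example above.

```python
import re

POINT = re.compile(r"\((\d+),(\d+)\)")


def parse_map(text):
    """Parse htimage-style lines into (shape, points, radius, url) rules."""
    rules = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        keyword, url = fields[0].lower(), fields[-1]
        points = [(int(x), int(y)) for x, y in POINT.findall(line)]
        radius = None
        if keyword.startswith("def"):
            shape = "default"
        elif keyword.startswith("circ"):
            shape, radius = "circle", int(fields[-2])
        elif keyword.startswith("rect"):
            shape = "rect"
        elif keyword.startswith("poly"):
            shape = "poly"
        else:
            continue  # skip lines this sketch does not understand
        rules.append((shape, points, radius, url))
    return rules


def in_polygon(x, y, pts):
    """Even-odd (ray casting) point-in-polygon test."""
    inside = False
    for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]):
        if (y1 > y) != (y2 > y) and x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
            inside = not inside
    return inside


def lookup(rules, x, y):
    """Return the URL of the first matching shape, else the default URL."""
    default = None
    for shape, pts, radius, url in rules:
        if shape == "default":
            default = url
        elif shape == "rect":
            (x1, y1), (x2, y2) = pts
            if min(x1, x2) <= x <= max(x1, x2) and min(y1, y2) <= y <= max(y1, y2):
                return url
        elif shape == "circle":
            cx, cy = pts[0]
            if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2:
                return url
        elif shape == "poly" and in_polygon(x, y, pts):
            return url
    return default


AUS_NZ = """\
rect (88,53) (117,96) /~lpb/map_resp1.html
rect (2,2) (86,74) /~lpb/map_resp2.html
default /~lpb/fumble.html
"""
rules = parse_map(AUS_NZ)
print(lookup(rules, 100, 60))   # /~lpb/map_resp1.html
print(lookup(rules, 10, 10))    # /~lpb/map_resp2.html
print(lookup(rules, 120, 5))    # /~lpb/fumble.html
```

A real mapping program would then send the chosen URL back to the browser as a redirect rather than printing it.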
- imagemap
- provided with the NCSA web server
- example HTML file using imagemap
- in file ~lpb/maptest2.html
- <html><head><title>imagemap test</title></head>
- <body><h1>Map Test (imagemap)</h1>
- Please select a country:<p>
- <a href="/cgi-bin/imagemap/~lpb/aus-nz.nmap">
- <img src="aus-nz.gif" ISMAP></a>
- </body></html>
- imagemap map file format
- may contain any of the following:
- default URL
- circle URL x1,y1 x2,y2
- rectangle URL x1,y1 x2,y2
- polygon URL x1,y1 x2,y2 ... xn,yn
- # arbitrary comment
- circles defined by bounding rect, may be ovals
- shapes are checked in order, first match used
- imagemap map file example
- aus-nz.nmap
- # comments are allowed in imagemap files
- default /~lpb/fumble.html
- rect /~lpb/map_resp1.html 88,53 117,96
- rect /~lpb/map_resp2.html 2,2 86,74
- nb. keywords can be abbreviated
- def, circ, rect, poly accepted
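Since the two formats carry the same information in a different order, a small converter is easy to sketch. The Python below rewrites htimage-style lines in imagemap style for default, rectangle and polygon entries. The function name `cern_to_ncsa` is an assumption for illustration. Circles are deliberately left unconverted (emitted as comments), because the two formats describe them differently.

```python
import re

POINT = re.compile(r"\((\d+),(\d+)\)")


def cern_to_ncsa(cern_text):
    """Rewrite htimage-style map lines as imagemap-style lines."""
    out = []
    for line in cern_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        keyword, url = fields[0], fields[-1]
        # "(x,y)" pairs become bare "x,y" pairs placed after the URL.
        coords = " ".join(f"{x},{y}" for x, y in POINT.findall(line))
        if keyword.lower().startswith("circ"):
            # Circle conventions differ; leave these for manual review.
            out.append(f"# unconverted: {line.strip()}")
        elif coords:
            out.append(f"{keyword} {url} {coords}")
        else:
            out.append(f"{keyword} {url}")  # default line, no coordinates
    return "\n".join(out)


cmap = """\
rect (88,53) (117,96) /~lpb/map_resp1.html
rect (2,2) (86,74) /~lpb/map_resp2.html
default /~lpb/fumble.html
"""
print(cern_to_ncsa(cmap))
```

Run on the aus-nz.cmap example, this produces the same rules as aus-nz.nmap, apart from the position of the default line.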
- Map Editors
- determining the coordinates for map regions from raw gif is not easy
- much better to use a custom map editor
- these load a graphics file
- let you draw regions over it and link to URLs
- export results in CERN/NCSA map formats
- examples:
- WebMap (Macintosh)
- Web Hotspots, Mapedit (Windows)
- tkmapedit (Unix/X)
- Some final words on Imagemaps
- remember they can be nice BUT
- some people don't/can't load images
- make sure you have a plain text alternative
- image loading can be slow
- not everyone is on your local LAN
- may need to compromise on:
- image size
- image resolution
- number of colours (16 colours is smaller than 256)
- Including Other Documents into HTML
- would be nice to have standard headers, logos, footers that could be
included
- can do this for graphics, viz. inline images (IMG)
- would be nice to have the same for text, say a tag to include another document
- BUT we don't
- one reason is it requires client to open a separate connection to server
=> inefficient
- Client-Side Includes
- inline graphics are a client-side include
- client must open another connection to server to retrieve it (or have it
in cache)
- referenced image need not be static
- could be generated at request time by a program
- common uses are for
- page counters
- random image selection (eg advertising)
- dynamic image creation (eg from GIS data)
- Counters
- one form of counter returns a graphic
- <IMG SRC="/cgi-bin/nph-count?link=user1">
- there are a number of these programs available
- MUST be installed on server by administrator
- so check which if any you have available
- have Heini Withagen's WWW Counter on walrus
- to use, just have a program reference as SRC
- Using WWW Counter
- follow name of counter program with argument=value pairs
- <IMG SRC="/cgi-bin/nph-count?width=4&link=link_arg&increase=1">
- important that link_arg is unique, using the document URL is a good idea
- width specifies how many digits to use
- to show value without changing: increase=0
- to update without showing: show=NO
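If many pages need counters, the SRC value can be generated rather than typed by hand. The Python sketch below builds it from the argument=value pairs described above. The helper name `counter_url` and its defaults are assumptions for illustration, and the set of arguments is limited to those listed on this slide.

```python
from urllib.parse import urlencode


def counter_url(link, width=4, increase=True, show=True,
                program="/cgi-bin/nph-count"):
    """Build the SRC value for a counter image from argument=value pairs."""
    params = [("width", width), ("link", link),
              ("increase", 1 if increase else 0)]
    if not show:
        params.append(("show", "NO"))  # update the count without displaying it
    return program + "?" + urlencode(params)


print(counter_url("user1"))
print(counter_url("user1", increase=False))
print(counter_url("user1", show=False))
```

`urlencode` also escapes characters that are not safe in a query string, which matters when the document URL is used as the link argument.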
- Why Use a Web Counter
- yes, it's a neat trick, BUT
- it defeats the efficiency of caching proxy servers
- uses extra network connections
- makes server run another program for each page
- if you really want to know how many people look at your pages,
statistical analysis of server logs is MUCH more efficient
- if you must use one, use it sparingly
- don't use other sites' counter programs!
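As a sketch of the log-analysis alternative, the Python below counts successful GET requests per document. It assumes the server writes its access log in the Common Log Format, which needs checking against your own server. The function name `count_hits` and the sample lines are made up for illustration.

```python
import re
from collections import Counter

# Matches the quoted request and the status code that follows it.
REQUEST = re.compile(r'"GET ([^\s"]+)[^"]*" (\d{3}) ')


def count_hits(log_lines):
    """Count successful (status 200) GET requests per requested path."""
    hits = Counter()
    for line in log_lines:
        match = REQUEST.search(line)
        if match and match.group(2) == "200":
            hits[match.group(1)] += 1
    return hits


sample = [
    'host1 - - [02/Feb/1996:10:00:00 +1100] "GET /~lpb/index.html HTTP/1.0" 200 1234',
    'host2 - - [02/Feb/1996:10:00:05 +1100] "GET /~lpb/index.html HTTP/1.0" 200 1234',
    'host2 - - [02/Feb/1996:10:00:09 +1100] "GET /~lpb/missing.html HTTP/1.0" 404 210',
]
for path, count in count_hits(sample).most_common():
    print(count, path)
```

This runs once over an existing log, so it adds no extra connections or program launches per page view.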
- Server-Side Includes
- can also make server do includes
- more efficient than if done on client
- still makes server do more work scanning docs
- known as Server-Side Includes (SSI)
- only some servers currently support SSI
- NCSA, Apache, Netscape Commerce
- generally require a special file suffix file.shtml
- others don't, can fudge using a program
- CERN doesn't, can use ssis program
- SSI Format
- all SSI directives are SGML comments
- <!--#command tag1="val1" tag2="val2" -->
- common variants include:
- <!--#include file="name" -->
- includes contents of another file at the specified point
- <!--#exec cgi="script" -->
- executes script placing output at specified point
- <!--#exec cmd="program" -->
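What the server does with an include directive can be sketched as a simple text substitution. The Python below handles only the `#include file` form. The names `expand_includes` and `read_file` are assumptions for illustration, and real servers also check paths and permissions.

```python
import re

DIRECTIVE = re.compile(r'<!--#include\s+file="([^"]+)"\s*-->')


def expand_includes(html, read_file):
    """Replace each include directive with the text returned by read_file(name)."""
    return DIRECTIVE.sub(lambda match: read_file(match.group(1)), html)


# A dict stands in for the file system in this sketch.
files = {"footer.html": "<hr>Lawrie Brown"}
page = '<body>Hello <!--#include file="footer.html" --></body>'
print(expand_includes(page, files.__getitem__))
```

Because the directive is an SGML comment, a server without SSI support passes it through unchanged and the browser ignores it.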
- Security Concerns
- SSI include is relatively safe, BUT
- SSI exec lets anyone, anywhere, run a program on your system, as server
user
- easy to be careless and compromise system
- hence exec capability is strictly controlled
- exec cgi="script" restricts scripts to the known central cgi-bin script directory
- exec cmd="program" very dangerous, program could be anywhere, often
disallowed
- Dynamic HTML Documents
- documents whose contents change
- as a result of included information
- by client (client-side include)
- by server (server-side include)
- either another file or output from a program
- may be dynamic, created by CGI script
- CGI = Common Gateway Interface
- standard for interfacing to web scripts
- Simple Dynamic Documents
- created by running a program/script
- by default, scripts are located in a central area
- called by using a URL referring to them
- <A HREF="/cgi-bin/fortune">thought for day</a>
- output from script is most of HTTP response
- Content-Type: text/html
- HTML document
- Sample Script
- #!/bin/sh
- cat << EOM
- Content-type: text/html
- <html><head><title>Fortune</title></head>
- <body><h1>Thought for the Day</h1><pre>
- EOM
- /usr/games/fortune
- cat << EOM
- </pre></body></html>
- EOM
- nb. the script must output a blank line after the Content-type line, before the HTML
- More on Simple Scripts
- could use any command giving useful info in place of fortune
- could process the output inserting HTML tags to improve its look if known
format
- basic idea though still the same
- use a program to generate a complete HTML file as part of the HTTP
response
- will consider CGI interface in next segment
- including how to pass information in to server
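The same idea can be sketched in Python with any command in place of fortune. The example below wraps the output of `date`. The function name `cgi_page` is an assumption for illustration, and the script assumes a Unix system where `date` is on the path.

```python
import html
import subprocess


def cgi_page(title, text):
    """Return a complete CGI response: header, blank line, then HTML."""
    return (
        "Content-type: text/html\n"
        "\n"  # the blank line ends the headers
        f"<html><head><title>{html.escape(title)}</title></head>\n"
        f"<body><h1>{html.escape(title)}</h1><pre>\n"
        f"{html.escape(text)}"
        "</pre></body></html>\n"
    )


if __name__ == "__main__":
    # Run the command and wrap whatever it prints in an HTML page.
    output = subprocess.run(["date"], capture_output=True, text=True).stdout
    print(cgi_page("Current Date", output), end="")
```

Escaping the command output with `html.escape` keeps characters such as `<` and `&` from being read as markup.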
- Standard Scripts
- some common scripts are often set up on the server
- htimage or imagemap
- nph-count
- fortune, date
- must be installed by an administrator
- run as server user
- User CGI Scripts
- some servers can be configured to allow scripts in user areas (eg NCSA)
- others can use a standard script cgiwrap
- user scripts allow anyone, anywhere, to run a program on your system, as
YOU
- any program flaws compromise your files
- extreme care should be taken writing scripts
- Using cgiwrap
- create a directory called cgi-bin in your public web directory (WWW or
public_html)
- copy script to your cgi-bin
- make it executable by anyone
- refer to it from some html document
- URL invokes cgiwrap followed by user script
- <A HREF="/cgi-bin/cgiwrap/~user/fortune">my thought for today</a>
- Lab Session
- goals for this session
- modify and use clickable images
- use maptest.html, adfa.html as guides
- edit the map files using a Map Editor
- add a counter to one of your documents
- create a simple user cgi-script to display info
- eg. show output from: cal, date, uptime, users
- based on sample fortune script
- using cgiwrap to allow user scripts
- References
- some useful references on advanced HTML:
- World Wide Web FAQ
- Local Info on assorted web features
- Barebone Guide to HTML
- the definitive reference
- found on the official W3 consortium server www.w3.org
- Review
- HTTP protocol & servers
- clickable images
- client and server side includes, counters
- simple dynamic documents
Lawrie.Brown@adfa.edu.au / 02-Feb-96