FIT3084: eXtensible HyperText Markup Language (XHTML)
In the previous lecture:
The WWW is a document network linked by hyperlinks.
The WWW and Web browser mask the complexities of accessing computers and files on the Internet to simplify the retrieval of information from remote computers.
In this lecture:
Identifying files on the Internet.
The Internet ("The Net") is a global network (of networks) of computers.
Every computer on the Net has a unique numerical address (an IP address) and a people-friendly equivalent.
130.194.64.81 ...is the numerical address for... molly.cs.monash.edu.au
The Net is divided into domains and subdomains.
molly | is the machine name. |
cs | is the Computer Science subdomain. |
monash | is the Monash University domain. |
edu | indicates the address is educational. What other extensions are there for different types of institutions? |
au | indicates the address is Australian. What other extensions are there for different countries? |
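The breakdown above can be sketched in a few lines of Python. This is only an illustration: the function name split_hostname and the list of roles are made up for this example, and the roles apply to this particular five-part name rather than to host names in general.

```python
from typing import List


def split_hostname(hostname: str) -> List[str]:
    """Split a dotted host name into its labels, most specific first."""
    return hostname.strip(".").lower().split(".")


labels = split_hostname("molly.cs.monash.edu.au")

# Roles taken from the table above; they fit this example name only.
roles = ["machine name", "subdomain", "domain", "institution type", "country"]

for label, role in zip(labels, roles):
    print(label, "is the", role)
```

Reading the labels from right to left moves from the most general domain (au) to the most specific (molly).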
Every file on a computer has a filename unique for that machine. When that filename is appended to the address of its host computer, the result is a name that is unique across the whole Internet.
Steps for Retrieving Documents from the Web.
Computers on the Internet called name servers keep lists of numerical
IP addresses & people-friendly names and translate between them.
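A program can ask for this translation itself. The sketch below uses Python's standard socket module, which passes the question to the operating system's resolver (and from there, usually, to a name server). The function name lookup is made up for this example.

```python
import socket


def lookup(hostname: str) -> str:
    """Return the IPv4 address that the system resolver gives for hostname.

    If hostname is already a dotted IPv4 address it is returned unchanged.
    Raises socket.gaierror when the name cannot be resolved.
    """
    return socket.gethostbyname(hostname)
```

Calling lookup("molly.cs.monash.edu.au") on a networked machine would return 130.194.64.81 only if the mapping quoted earlier in these notes is still current; the answer depends entirely on what the name servers hold at the time.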
1) A web browser (client) sends a request using HyperText Transfer
Protocol (HTTP) for a document, specified by its unique name, to a remote
(server) machine.
The unique file name is specified within a Uniform Resource Locator (URL)...
Protocol://server_domain_name/file_path
The protocol may be omitted in some web browsers, in which case HTTP is assumed.
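To see the three parts of a URL pulled apart, here is a small sketch using urlsplit from Python's standard library. The helper with_default_scheme is hypothetical; it only imitates the "assume HTTP" behaviour described above and is not how any particular browser is implemented.

```python
from urllib.parse import urlsplit


def with_default_scheme(url: str) -> str:
    """Prefix http:// when no protocol is given (a simplified assumption)."""
    return url if "://" in url else "http://" + url


parts = urlsplit("http://www.cs.monash.edu.au/~aland/index.html")
print(parts.scheme)   # the protocol
print(parts.netloc)   # the server domain name
print(parts.path)     # the file path
```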
Absolute URLs
http://www.cs.monash.edu.au/~aland/index.html
ftp://ftp.cs.monash.edu.au/pub/
are absolute because they include a domain name and a path.
Relative URLs
index.html
../index.html
are relative because they specify a path and domain name by reference to (usually) the URL of the file currently open in the browser (often referred to as the base).
Locations within documents
http://www.cs.monash.edu.au/~aland/index.html#chapter
index.html#fred
The text after the # symbol indicates a location within the document specified by the URL.
These locations are named whilst the document is being created. The #location is an optional part of a URL. When would it be useful to specify a location within a document in a URL?
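The way a relative URL is combined with the base, and the way a #location is separated from the rest of the URL, can be tried out with Python's standard urllib.parse module. The sketch assumes that the base is the absolute URL used as an example in these notes.

```python
from urllib.parse import urldefrag, urljoin

# Assumed base: the document currently open in the browser.
base = "http://www.cs.monash.edu.au/~aland/index.html"

same_dir = urljoin(base, "index.html")        # same directory as the base
parent_dir = urljoin(base, "../index.html")   # one directory up
with_location = urljoin(base, "index.html#fred")

# Split a URL into the document part and the location within it.
document, location = urldefrag(
    "http://www.cs.monash.edu.au/~aland/index.html#chapter"
)

print(same_dir)
print(parent_dir)
print(with_location)
print(document, location)
```

Note that the location is not needed to identify the document; it only tells the browser where to scroll once the document has arrived.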
2) A web server program on a remote machine always 'listens'
on a 'well-known' port for incoming requests. (Port 80 for HTTP)
3) The web server checks the client's access privileges; if all is well, it
sends the requested document.
4) The browser displays the document retrieved from the server on the client machine in human-readable form.
A web document is anything accessed with a single request from a client to a server.
Try this in your own time*
Commands to type. | Explanation. |
telnet www.csse.monash.edu.au 80 | Telnet to the school's WWW server (on port 80) |
GET /index.html HTTP/1.0 | Access the web page "index.html" using the GET command which the browser would normally do for you. Follow your command with two carriage returns. |
>> The server should send you the HTML of file "index.html" | See? The protocol isn't magic, you can participate in it manually. |
* A little exercise taken from Lloyd Allison's old notes
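The same exercise can be scripted. The sketch below is one possible way to do it in Python with a plain socket; the function names build_get_request and fetch are made up for this example. The "two carriage returns" typed in telnet correspond to the two CR LF pairs that end the request.

```python
import socket


def build_get_request(path: str) -> bytes:
    """Build the request typed by hand above: a request line, then a blank line."""
    return ("GET " + path + " HTTP/1.0\r\n\r\n").encode("ascii")


def fetch(host: str, path: str, port: int = 80, timeout: float = 10.0) -> bytes:
    """Send the request to host:port and return everything the server sends back."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(build_get_request(path))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # an HTTP/1.0 server closes the connection when done
                break
            chunks.append(data)
    return b"".join(chunks)
```

For example, fetch("www.csse.monash.edu.au", "/index.html") would repeat the telnet session above, assuming that server still exists and still answers plain HTTP on port 80. The reply begins with a status line and headers, followed by a blank line and then the HTML.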
HyperText Markup Language (HTML)
HTML is a document-layout and hyperlink specification language that was derived from the Standard Generalized Markup Language (SGML).
HTML tags specify:
Several versions of HTML were approved by the WWW Consortium (W3C) (see http://www.w3.org/ ). The last of these versions was HTML 4.01, approved in 1999.
How was HTML supposed to be used?
HTML was intended for specification of document structure, not control of document appearance.
I.e. HTML was not originally intended for graphic design & typography.
Originally the browser interpreted and displayed a document's elements as it liked. Hence the final appearance of a document was up to the client browser, not the HTML author.
E.g. HTML allows the specification of a Heading level 2 but the client decides that all Headings level 2 shall be displayed in bold, 12pt Times Roman text (or otherwise).
HTML moved towards specification of document appearance with the addition of Style Sheets and tags that allowed specification of exact fonts and colours, but...
...differences between the way browsers of different authorship displayed and interpreted HTML made things tricky for designers.
Some browsers incorporated proprietary extensions to HTML...
...which did not work on other browsers (e.g. Microsoft Internet Explorer & Netscape Navigator).
eXtensible HyperText Markup Language (XHTML)
Writing Your Own XHTML
For best results use one of the following:
For often poor results, use:
You will also require:
XHTML tags
Sample XHTML document code
In the sample code that follows, for interest's sake, XHTML-specific tags are marked in blue. The remaining tags were also present in HTML, although it was possible to get away with leaving some of them out altogether.
<?xml version = "1.0" encoding = "utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns = "http://www.w3.org/1999/xhtml">
<head>
<title> A silly, simple, sample page </title>
</head>
<body>
<!-- Document content goes here -->
<h2>A Grand Day</h2>
<p>
Oh what a <i>lovely</i> day <br /> for a walk!
</p>
<p>
Let's wander over to CEMA's <a href="http://www.csse.monash.edu.au/~cema">home page</a> and take a look!
</p>
</body>
</html>
The page produced by this code is available.
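Because XHTML is written as XML, every tag must be closed and properly nested, which is something a program can check. The sketch below uses the XML parser in Python's standard library; the function name is_well_formed and the two short test strings are made up for this example. It checks well-formedness only and does not validate a page against the XHTML DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if markup parses as XML, False otherwise."""
    try:
        ET.fromstring(markup.encode("utf-8"))
    except ET.ParseError:
        return False
    return True


good = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
    "<p>Oh what a <i>lovely</i> day <br /> for a walk!</p>"
    "</body></html>"
)

# Old HTML habit: a line break written without the closing slash.
bad = good.replace("<br />", "<br>")

print(is_well_formed(good))
print(is_well_formed(bad))
```

The second string is the kind of markup older HTML tolerated but XHTML does not: the unclosed line break leaves the paragraph's closing tag mismatched.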
An XHTML document has two parts, a head and a body...
<a href="linked_to_doc_URL#anchor_name"> clickable elements go here </a>
<a name="anchor_name"> clickable elements go here </a>
The basic requirements for an image tag are the source (src) attribute (a URL) of the image and some alternate (alt) text to display if images are turned off. (See these important notes on accessibility).
<img src="images/bug.GIF" alt="picture of bug" />
Additional attributes of the image tag may also be added. For example,
<img width="100" height="150" border="1" src="images/bug.GIF" align="left" alt="picture of bug" />
<a href="bug.html"> <img src="bug.GIF" alt="picture of bug" /> </a>
border=0 | border=1 | border=3 |
<img src="myImage.JPG" usemap="#myImageMap" />
<map id="myImageMap">
<area href="page1.html" shape="circle" coords="152,113,14" alt="page 1" />
<area href="page2.html" shape="polygon" coords="241,64,235,91,332,91,338,67" alt="page 2"/>
</map>
Click on the bugs above to see an image map in action!
(See these important notes on accessibility)
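To make the coords values in the image map above less mysterious, here is a rough Python sketch of the test a browser has to perform when the image is clicked. A circle is given as centre x, centre y and radius; a polygon is given as a list of x,y corner pairs. The function names are made up for this example, and real browsers may treat points exactly on an edge differently.

```python
from typing import List, Optional


def in_circle(x: float, y: float, coords: List[int]) -> bool:
    """coords = [centre_x, centre_y, radius]"""
    cx, cy, r = coords
    return (x - cx) ** 2 + (y - cy) ** 2 <= r ** 2


def in_polygon(x: float, y: float, coords: List[int]) -> bool:
    """coords = [x1, y1, x2, y2, ...]; ray-casting (even-odd) test."""
    pts = list(zip(coords[0::2], coords[1::2]))
    inside = False
    j = len(pts) - 1
    for i in range(len(pts)):
        xi, yi = pts[i]
        xj, yj = pts[j]
        # Does a horizontal ray from (x, y) cross the edge from j to i?
        if (yi > y) != (yj > y) and x < (xj - xi) * (y - yi) / (yj - yi) + xi:
            inside = not inside
        j = i
    return inside


def target_for_click(x: float, y: float) -> Optional[str]:
    """Return the href of the first area containing the click, if any."""
    if in_circle(x, y, [152, 113, 14]):
        return "page1.html"
    if in_polygon(x, y, [241, 64, 235, 91, 332, 91, 338, 67]):
        return "page2.html"
    return None
```

A click at (152, 113) lands in the circle and leads to page1.html, a click at (285, 80) lands in the polygon and leads to page2.html, and a click at (0, 0) hits neither area.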
Ordered and Unordered Lists
<h4>Spot the odd one out</h4> <ul> |
Spot the odd one out
|
<h4>Spot the dog</h4> <ol> <li>Collar</li> <li>Cat</li> <li>Caterpillar</li> </ol> | Spot the dog 1. Collar 2. Cat 3. Caterpillar |
Additional things to research
Tables - very useful for laying out pages.
|
<table> <tr> |
Forms - useful to obtain data from users
Formatting and other tags
Tags such as <br />, <hr />, <span>, <div> & <meta> are all handy to know about, as are many others... do some reading to find out what tags are available. Some of them will be touched upon in later lectures.
©Copyright Alan Dorin 2008