CSE3325: eXtensible HyperText Markup Language (XHTML)


In the previous lecture:

In this lecture:


Identifying files on the Internet.

[Image: big bug talks to little bug]

The Internet ("The Net") is a global network (of networks) of computers.

Every computer on the Net has a unique numerical address (an IP address) and a people-friendly equivalent.

 

130.194.64.81 ...is the numerical address for... molly.cs.monash.edu.au

 

The Net is divided into domains, and subdomains.

molly is the machine name.
cs is the Computer Science subdomain.
monash is the Monash University domain.
edu indicates the address is educational. What other extensions are there for different types of institutions?
au indicates the address is Australian. What other extensions are there for different countries?
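The breakdown above can be reproduced with a few lines of Python. This is only a sketch for illustration: the function name domain_labels is made up here, and it simply splits the name on the dots and lists the parts from the most general (the country) to the most specific (the machine).

```python
def domain_labels(hostname):
    """Split a host name into its dot-separated parts, most general first."""
    labels = hostname.split(".")
    # Names are written most-specific first, so reverse them to read
    # from the country down to the individual machine.
    return list(reversed(labels))


if __name__ == "__main__":
    for label in domain_labels("molly.cs.monash.edu.au"):
        print(label)
```

Reading the output top to bottom walks down the hierarchy of domains and subdomains.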

Every file on a computer has a filename that is unique on that machine. Appending that filename to the address of its host computer therefore gives every file on the Internet a unique name.


Steps for Retrieving Documents from the Web.

Computers on the Internet called name servers keep lists of numerical IP addresses & people-friendly names and translate between them.
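As a toy illustration of what a name server does, the sketch below keeps a single entry (the one from these notes) in a Python dictionary and translates in either direction. The table and the function name translate are assumptions made for the example; real name servers hold many records and answer queries over the network.

```python
import ipaddress

# Hypothetical lookup table holding the one example from these notes.
NAME_TO_ADDRESS = {"molly.cs.monash.edu.au": "130.194.64.81"}
ADDRESS_TO_NAME = {addr: name for name, addr in NAME_TO_ADDRESS.items()}


def translate(query):
    """Return the address for a name, or the name for an address.

    Returns None when the query is not in the table.
    """
    try:
        # ip_address() raises ValueError for anything that is not a
        # numerical address, which tells us the query is a name.
        ipaddress.ip_address(query)
    except ValueError:
        return NAME_TO_ADDRESS.get(query.lower())
    return ADDRESS_TO_NAME.get(query)


if __name__ == "__main__":
    print(translate("molly.cs.monash.edu.au"))
    print(translate("130.194.64.81"))
```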

1) A web browser (client) sends a request using HyperText Transfer Protocol (HTTP) for a document, specified by its unique name, to a remote (server) machine.

The unique file name is specified within a Uniform Resource Locator (URL)...

Protocol://server_domain_name/file_path

The protocol may be omitted in some web browsers, in which case HTTP is assumed.

Absolute URLs

http://www.cs.monash.edu.au/~aland/index.html

ftp://ftp.cs.monash.edu.au/pub/

are absolute because they include a domain name and a path.
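To see the three parts of an absolute URL, Python's standard urllib.parse module can split one up. This is a small sketch using the two URLs above; the helper name url_parts is made up for the example.

```python
from urllib.parse import urlsplit


def url_parts(url):
    """Return the (protocol, server domain name, file path) of a URL."""
    parts = urlsplit(url)
    # urlsplit calls the protocol the "scheme" and the server the "netloc".
    return parts.scheme, parts.netloc, parts.path


if __name__ == "__main__":
    print(url_parts("http://www.cs.monash.edu.au/~aland/index.html"))
    print(url_parts("ftp://ftp.cs.monash.edu.au/pub/"))
```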

Relative URLs

index.html

../index.html

are relative because they specify a path and domain name by reference to (usually) the URL of the file currently open in the browser (often referred to as the base).
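A browser turns a relative URL into an absolute one by combining it with the base. The sketch below does the same with urljoin from Python's standard library, assuming (for the sake of the example) that the file currently open is the absolute URL shown earlier.

```python
from urllib.parse import urljoin

# Assumed base: the document currently open in the browser.
BASE = "http://www.cs.monash.edu.au/~aland/index.html"


def resolve(relative_url, base=BASE):
    """Resolve a relative URL against the base, as a browser would."""
    return urljoin(base, relative_url)


if __name__ == "__main__":
    print(resolve("index.html"))     # same directory as the base
    print(resolve("../index.html"))  # one directory up from the base
```

Note that "../" steps up out of the ~aland directory, so the second URL points at a different file from the first.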

Locations within documents

http://www.cs.monash.edu.au/~aland/index.html#chapter

index.html#fred

The text after the # symbol indicates a location within the document specified by the URL.

These locations are named whilst the document is being created. The #location is an optional part of a URL. When would it be useful to specify a location within a document in a URL?
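The #location can be separated from the rest of a URL with urldefrag from Python's standard library. A minimal sketch follows; the helper name split_location is made up for the example. The browser uses the location itself after the document arrives; it is not part of the request sent to the server.

```python
from urllib.parse import urldefrag


def split_location(url):
    """Return (URL of the document, location within it)."""
    document, location = urldefrag(url)
    return document, location


if __name__ == "__main__":
    print(split_location("http://www.cs.monash.edu.au/~aland/index.html#chapter"))
    print(split_location("index.html#fred"))
    print(split_location("index.html"))  # no location: the second item is empty
```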

2) A web server program on a remote machine always 'listens' on a 'well-known' port for incoming requests. (Port 80 for HTTP)

3) The web server checks the client's access privileges. If all is well, it sends the requested document.

4) The browser displays the document retrieved from the server on the client machine in human-readable form.

A web document is anything accessed with a single request from a client to a server.
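The four steps can be tried out on a single machine. The sketch below is an illustration only, built on Python's standard http.server and http.client modules: the page content, the handler class and the function fetch_once are all made up for the example, the server runs on the local machine rather than a remote one, and it does not check access privileges.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical document held by the server.
PAGE = b"<html><body><p>Hello from the server</p></body></html>"


class OnePageHandler(BaseHTTPRequestHandler):
    """Serves PAGE for /index.html and a 404 error for anything else."""

    def do_GET(self):
        if self.path == "/index.html":
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.send_header("Content-Length", str(len(PAGE)))
            self.end_headers()
            self.wfile.write(PAGE)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        pass  # keep the demonstration quiet


def fetch_once(path):
    """Start a server, request `path` from it as a client, then shut it down."""
    # Step 2: the server listens on a port. Port 0 asks the operating system
    # for any free port, since using port 80 usually needs special privileges.
    server = HTTPServer(("127.0.0.1", 0), OnePageHandler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # Step 1: the client sends an HTTP GET request for the document.
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", path)
        # Step 3: the server replies with a status code and the document.
        response = conn.getresponse()
        body = response.read()
        conn.close()
        return response.status, body
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    status, body = fetch_once("/index.html")
    # Step 4: a browser would render the markup; here it is just printed.
    print(status, body.decode("utf-8"))
```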


Try this in your own time*

Commands to type, each followed by its explanation:

telnet www.csse.monash.edu.au 80
    Telnet to the school's WWW server (on port 80)
GET /index.html HTTP/1.0
    Access the web page "index.html" using the GET command which the browser would normally do for you. Follow your command with two carriage returns.
>> The server should send you the HTML of file "index.html"
    See? The protocol isn't magic, you can participate in it manually.

* A little exercise taken from Lloyd Allison's old notes
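For the curious, the exact bytes typed in the exercise can be written out in Python. This is a sketch only; the function name build_get_request is made up for the example, and it builds the request without sending it anywhere.

```python
def build_get_request(path, version="HTTP/1.0"):
    """Return the bytes a client sends for a bare GET request."""
    # The request line ends with carriage return + line feed, and a second,
    # empty line tells the server that the request is complete. These are
    # the "two carriage returns" of the exercise.
    return f"GET {path} {version}\r\n\r\n".encode("ascii")


if __name__ == "__main__":
    print(build_get_request("/index.html"))
```

Writing these bytes to a connection on the server's port 80 would have the same effect as typing the command into telnet.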


HyperText Markup Language (HTML)

HTML is a document-layout and hyperlink specification language that was derived from the Standard Generalized Markup Language (SGML).

HTML tags specify:

Several versions of HTML were approved by the World Wide Web Consortium (W3C) (see http://www.w3.org/ ). The last of these versions was HTML 4.01, approved in 1999.


How was HTML supposed to be used?

HTML was intended for specification of document structure, not control of document appearance.

I.e. HTML was not originally intended for graphic design & typography.

Originally the browser interpreted and displayed a document's elements as it liked. Hence the final appearance of a document was up to the client browser, not the HTML author.

E.g. HTML allows the specification of a Heading level 2 but the client decides that all Headings level 2 shall be displayed in bold, 12pt Times Roman text (or otherwise).

HTML moved towards specification of document appearance with the addition of Style Sheets and tags that allowed specification of exact fonts and colours, but...

...differences between the way browsers of different authorship displayed and interpreted HTML made things tricky for designers.

Some browsers incorporated proprietary extensions to HTML...

...which did not work on other browsers (e.g. Microsoft Internet Explorer & Netscape Navigator).


eXtensible HyperText Markup Language (XHTML)


Writing Your Own XHTML

For best results use one of the following:

For often poor results, use:


You will also require:


XHTML tags

 


Sample XHTML document code

In the sample code that follows, for interest's sake, XHTML-specific tags are marked in blue. The remaining tags were also present in HTML although it was possible to get away with missing some of them out altogether.

<?xml version = "1.0" encoding = "utf-8"?>

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">

<html xmlns = "http://www.w3.org/1999/xhtml">

<head>

<title> A silly, simple, sample page </title>

</head>

<body>

<!-- Document content goes here -->
<h2>A Grand Day</h2>
<p>
Oh what a <i>lovely</i> day <br /> for a walk!
</p>
<p>
Let's wander over to CEMA's <a href="http://www.csse.monash.edu.au/~cema">home page</a> and take a look!
</p>

</body>

</html>

The page produced by this code is available.
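Because XHTML is XML, every tag must be closed, which is one of the things HTML let authors get away with skipping. The sketch below uses the XML parser in Python's standard library to compare a paragraph from the sample with an HTML-style version of it. The function name is_well_formed is made up for the example, and the check only tests that the tags are properly closed and nested, not that the document is valid against the XHTML DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if the markup parses as XML, False otherwise."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


# The line break written the XHTML way, as an empty element that closes itself.
xhtml_style = "<p>Oh what a <i>lovely</i> day <br /> for a walk!</p>"
# The same paragraph with an unclosed <br>, which HTML browsers tolerated.
html_style = "<p>Oh what a <i>lovely</i> day <br> for a walk!</p>"

if __name__ == "__main__":
    print(is_well_formed(xhtml_style))
    print(is_well_formed(html_style))
```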


Some special tags

An XHTML document has two parts, a head and a body...

<a href="linked_to_doc_URL#anchor_name"> clickable elements go here </a>

<a name="anchor_name"> elements at the named location go here </a>


Inline Images

The basic requirements for an image tag are the source (src) attribute (a URL) of the image and some alternate (alt) text to display if images are turned off. (See these important notes on accessibility).

<img src="images/bug.GIF" alt="picture of bug" />

Additional attributes of the image tag may also be added. For example,

<img width="100" height="150" border="1" src="images/bug.GIF" align="left" alt="picture of bug" />

<a href="bug.html"> <img src="bug.GIF" alt="little bug" /> </a>

[Image: little bug, border=0]

[Image: little bug, border=1]

[Image: little bug, border=3]
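Since alt text is a basic requirement, it can be handy to check a page for images that lack it. The sketch below does this with the HTML parser in Python's standard library; the class ImageAudit and the function images_missing_alt are made up for the example, and an empty alt="" is counted as present.

```python
from html.parser import HTMLParser


class ImageAudit(HTMLParser):
    """Collects the src of every img tag that has no alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing_alt = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <img ... />.
        if tag == "img":
            attributes = dict(attrs)
            if "alt" not in attributes:
                self.missing_alt.append(attributes.get("src"))


def images_missing_alt(markup):
    """Return the src values of the images in `markup` that lack alt text."""
    audit = ImageAudit()
    audit.feed(markup)
    audit.close()
    return audit.missing_alt


if __name__ == "__main__":
    print(images_missing_alt('<img src="images/bug.GIF" alt="picture of bug" />'))
    print(images_missing_alt('<a href="bug.html"> <img src="bug.GIF" /> </a>'))
```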


Image Maps

[Image map: fat bug, thin bug]

Click on the bugs above to see an image map in action!

(See these important notes on accessibility)


Ordered and Unordered Lists

<h4>Spot the odd one out</h4>
<ul>
<li>Tomatoes</li>
<li>Potatoes
<ul>
<li>sweet</li>
<li>rotten</li>
</ul>
</li>
<li>Elephantoes</li>
</ul>

Spot the odd one out

  • Tomatoes
  • Potatoes
    • sweet
    • rotten
  • Elephantoes



<h4>Spot the dog</h4>
<ol>
<li>Collar</li>
<li>Cat</li>
<li>Caterpillar</li>
</ol>

Spot the dog

  1. Collar
  2. Cat
  3. Caterpillar
 

Additional things to research

Tables - very useful for laying out pages.

this     | is
a simple | table

<table>
<tr>
<td>this</td>
<td>is</td>
</tr>

<tr>
<td>a simple</td>
<td>table</td>
</tr>
</table>

Forms - useful to obtain data from users


( ) I'm a radio button
( ) I'm a radio button too


Formatting and other tags

Tags such as <br />, <hr />, <span>, <div> & <meta> are all handy to know about, as are many others... do some reading to find out what tags are available. Some of them will be touched upon in later lectures.

Web References:



This lecture's key point(s):



©Copyright Alan Dorin & Jon McCormack 1999-2006