Uniform Resource Locators
A Uniform Resource Locator, or URL, is a unique name that identifies
a node, or a part of a node, somewhere in the world. A URL consists of five parts:
- protocol: The network protocol to be used to retrieve the node.
- host: The complete (Internet) address of the host on which the
node resides.
- port: The port number to use (optional; the default for http is port 80).
- nodename: The complete pathname of the node.
- anchor: An (optional) indication of the part of the node that
is of interest.
In Xanadu
(a fully distributed hypertext system, developed by Ted Nelson at Brown
University, from 1965 on)
there was only one protocol, so the protocol part
could be omitted. Within a node, every possible (contiguous) subpart
could be the destination of a link.
In the World Wide Web the standard protocol is
http, the HyperText Transfer Protocol. However, other protocols
such as gopher, ftp and telnet can be used with most browsers as well.
Unlike in Xanadu, the destination anchor must be defined in the node itself,
so only those parts of a node that the author has marked can be selected.
The syntax of a URL for the World Wide Web is:
protocol://host:port/nodename#anchorname
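The five parts can be picked out of a URL mechanically. The following is a minimal sketch in Python, assuming the standard library module urllib.parse; the function name split_url and the example address are made up for illustration and do not come from the course material.

```python
from urllib.parse import urlsplit

def split_url(url):
    """Return the five parts named in the list above as a dictionary."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,        # None when the URL gives no explicit port
        "nodename": parts.path,
        "anchor": parts.fragment,  # empty string when there is no anchor
    }

# Hypothetical address, used only to show the split.
example = split_url("http://www.example.org:8080/docs/intro.html#history")
print(example)
```

When the port is left out of the URL, this sketch reports None rather than 80; filling in the protocol's default is left to the program that opens the connection.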
The complete syntax description for URLs can be found in two
standard documents: RFC 1738 for absolute addresses and
RFC 1808 for relative addresses.
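A relative address is interpreted with respect to the URL of the document that contains it. The sketch below shows this with urljoin from Python's standard library; the base URL and file names are invented for the example. Note that urljoin follows the later URI specification (RFC 3986), which replaced RFC 1808, so treat it as an illustration of the idea rather than of the exact RFC 1808 rules.

```python
from urllib.parse import urljoin

# Hypothetical URL of the document in which the relative addresses appear.
base = "http://www.example.org/course/urls.html"

# A plain file name replaces the last component of the base path.
sibling = urljoin(base, "html.html")

# ".." moves one directory up before the name is appended.
parent = urljoin(base, "../index.html")

# A leading slash restarts from the root of the same host.
rooted = urljoin(base, "/images/logo.gif")

# An anchor alone refers to a part of the base document itself.
anchor = urljoin(base, "#syntax")

for resolved in (sibling, parent, rooted, anchor):
    print(resolved)
```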
Common errors when first creating HTML documents are:
- Including spaces in URLs, which is not allowed.
- Ignoring the difference between uppercase and lowercase, which is
significant in the nodename (though not in the protocol or host name).
- Separating the parts of a URL with a backslash (\) instead of a
"normal" slash (/).
These errors may not be apparent when using "file:" URLs on a Windows 95
system, for instance, but they do appear when using an http (Web) server
on a Unix system.
If you make these errors in the final assignment of this course,
you will not get a grade until you correct them.
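Two of the common errors listed above, spaces and backslashes, can be detected by inspecting the URL text alone. The following is a rough sketch in Python; the function names find_url_problems and escape_path are hypothetical, and the check is deliberately simple. Errors in uppercase versus lowercase cannot be found this way, because that requires comparing the URL with the actual file names on the server.

```python
from urllib.parse import quote

def find_url_problems(url):
    """Return a list of the textual errors found in a URL."""
    problems = []
    if " " in url:
        problems.append("contains a space")
    if "\\" in url:
        problems.append("contains a backslash")
    return problems

def escape_path(path):
    """Percent-encode characters (such as spaces) that may not appear in a path."""
    return quote(path)  # slashes are left untouched by default

# Hypothetical examples.
print(find_url_problems("http://www.example.org/My Docs\\page.html"))
print(escape_path("my notes/page 1.html"))  # my%20notes/page%201.html
```

Renaming files so that they contain no spaces is usually simpler than encoding each space as %20.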
If you are reading this document using any one of the popular browsers,
the URL and title of the current document are always displayed.