Uniform Resource Locator

From Wikipedia, the free encyclopedia

In computing, a Uniform Resource Locator (URL) is a type of Uniform Resource Identifier (URI) that specifies where an identified resource is available and the mechanism for retrieving it.[1] In popular usage and in many technical documents and verbal discussions it is often, imprecisely and confusingly, used as a synonym for uniform resource identifier. The confusion in usage stems from historically different interpretations of the semantics of the terms involved.[2] In popular language, a URL is also referred to as a Web address.

History

The URL was created in 1990 by Tim Berners-Lee as part of the URI.[3] He regrets the format of the URL: instead of dividing it into the route to the server, separated by dots, and the file path, separated by slashes, he would have liked it to be one coherent hierarchical path.[4] For example, http://www.serverroute.com/path/to/file.html would look like http://com/serverroute/www/path/to/file.html.

Syntax

Every URL is made up of some combination of the following: the scheme name or resource type, a registered domain name or Internet Protocol (IP) address, the port number, the path name of the file to be fetched or the program to be run, the query string,[5][6] and, with HTML files, an anchor indicating where in the page the display should start.[7]

The combined syntax will look similar to this: resource_type://domain:port/filepathname?query_string#anchor

  • The scheme name, or resource type, defines its namespace, purpose, and the syntax of the remaining part of the URL. Most Web-enabled programs will try to dereference a URL according to the semantics of its scheme and the context in which it appears. For example, a Web browser will usually dereference the URL http://example.org:80 by performing an HTTP request to the host example.org, at the port number 80. Dereferencing the URL mailto:bob@example.com will usually start an e-mail composer with the address bob@example.com in the To field.
    • Other examples of scheme names include https:, gopher:, wais:, and ftp:. URLs that specify https as a scheme (such as https://example.com/) normally denote a website that is accessed over a secure connection.
  • The registered domain name or IP address gives the destination location for the URL. The domain google.com, or its IP address 72.14.207.99, directs you to where Google's website resides.
  • The hostname and domain name portions of a URL are case insensitive, since the DNS is specified to ignore case. http://en.wikipedia.org/ and HTTP://EN.WIKIPEDIA.ORG/ will both open the same page.
  • The port number is optional. If it is not provided, the default for the scheme will be used. For example, typing http://google.com:80 into your browser would bring you to google.com on port 80. If you left out :80, your browser would navigate to the same location, because port 80 is the default for HTTP.
  • The file path name identifies the file or program on the server that is being requested. It is generally case sensitive, although some servers, especially those based on Microsoft Windows, may treat it as case insensitive. For example, on a case-sensitive server:
    • http://en.wikipedia.org/wiki/URL is correct, but http://en.wikipedia.org/WIKI/URL/ will result in an HTTP 404 error page.
  • The query string contains data to be passed to web applications such as CGI programs. It consists of name/value pairs separated by ampersands, with the name and value in each pair separated by an equals sign, for example first_name=John&last_name=Doe.
  • The anchor part, when used with HTTP, directs you to a specific location on the page once you have navigated there. For example, http://en.wikipedia.org/wiki/URL#Syntax would bring you to the beginning of the Syntax section of this page.
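
The components described above can be pulled apart with a URL parser. The following is a minimal sketch using Python's standard urllib.parse module; the example URL, the DEFAULT_PORTS table and the effective_port helper are assumptions made for illustration and are not part of any specification quoted here.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical URL following the pattern
# resource_type://domain:port/filepathname?query_string#anchor
url = "HTTP://EN.Example.ORG:8080/wiki/URL?first_name=John&last_name=Doe#Syntax"

parts = urlsplit(url)

print(parts.scheme)    # scheme name, lower-cased by the parser
print(parts.hostname)  # host name, lower-cased (it is case insensitive)
print(parts.port)      # explicit port number as an integer
print(parts.path)      # file path name, case preserved
print(parts.query)     # raw query string
print(parts.fragment)  # anchor

# Split the query string into name/value pairs.
# Each name maps to a list because a name may appear more than once.
query = parse_qs(parts.query)
print(query)

# Illustrative table of well-known default ports (an assumption of this sketch).
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}


def effective_port(u):
    """Return the explicit port of a URL, or the scheme default if none is given."""
    p = urlsplit(u)
    return p.port if p.port is not None else DEFAULT_PORTS.get(p.scheme)


# Both forms resolve to the same port, since 80 is the default for HTTP.
print(effective_port("http://example.org/"))
print(effective_port("http://example.org:80"))
```

Note that the parser lower-cases only the scheme and the host name; the path is left untouched because a server may treat it as case sensitive.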

Absolute vs Relative URLs

An absolute URL is one that points to the exact location of a file. It is unique, meaning that if two URLs are identical, they point to the same file.[8] An example of this would be: http://en.wikipedia.org/wiki/File:Raster_to_Vector_Mechanical_Example.jpg

A relative URL points to the location of a file from a point of reference. This reference is usually the directory that contains the referring file.[8] A relative URL begins with two dots (../directory_path/file.txt) to refer to the directory above, with one dot (./directory_path/file.txt) to refer to the current directory, or with no leading slash or dots (directory_path/file.txt), which also refers to the current directory.
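
As a rough illustration of how relative URLs are resolved against a point of reference, the sketch below uses urljoin from Python's standard urllib.parse module. The base URL is a hypothetical address chosen for the example.

```python
from urllib.parse import urljoin

# Hypothetical point of reference: the document that contains the links.
base = "http://example.org/docs/guide/index.html"

# No leading slash or dots: resolved against the current directory.
plain = urljoin(base, "directory_path/file.txt")

# One dot: also the current directory.
one_dot = urljoin(base, "./directory_path/file.txt")

# Two dots: the directory above the current one.
two_dots = urljoin(base, "../directory_path/file.txt")

# An absolute URL ignores the point of reference entirely.
absolute = urljoin(base, "http://en.wikipedia.org/wiki/URL")

for result in (plain, one_dot, two_dots, absolute):
    print(result)
```

The first two results are identical, which matches the statement that both forms refer to the current directory.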

URLs as locators

In its current strict technical meaning, a URL is a URI that, “in addition to identifying a resource, [provides] a means of locating the resource by describing its primary access mechanism (e.g., its network ‘location’).”[9][10]

Internet hostnames

On the Internet, a hostname is a domain name assigned to a host computer. This is usually a combination of the host's local name with its parent domain's name. For example, "en.wikipedia.org" consists of a local hostname ("en") and the domain name "wikipedia.org". This kind of hostname is translated into an IP address via the local hosts file or the Domain Name System (DNS) resolver. It is possible for a single host computer to have several hostnames, but generally the operating system of the host prefers to have one hostname that the host uses for itself.
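
The split into local name and parent domain, and a lookup in a hosts-file style table, can be sketched as follows. This is only an illustration under stated assumptions: HOSTS_TEXT is invented sample data using a documentation-range address, and parse_hosts, split_hostname and lookup are hypothetical helper names. A real resolver would fall back to DNS when the table has no entry; that step is omitted so the sketch runs without a network.

```python
# Invented sample data in the style of a hosts file (not a real configuration).
HOSTS_TEXT = """
# address      names
127.0.0.1      localhost
192.0.2.10     en.example.org en
"""


def parse_hosts(text):
    """Build a name -> address table from hosts-file style text."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            # Hostnames are case insensitive, so store them lower-cased.
            table.setdefault(name.lower(), address)
    return table


def split_hostname(hostname):
    """Split a hostname into its local name and its parent domain name."""
    local, _, parent = hostname.partition(".")
    return local, parent


def lookup(hostname, table):
    """Return the address for a hostname, or None if the table has no entry."""
    return table.get(hostname.lower())


table = parse_hosts(HOSTS_TEXT)
print(split_hostname("en.wikipedia.org"))
print(lookup("EN.Example.ORG", table))
print(lookup("missing.example.org", table))
```

One host may appear under several names in such a table, which corresponds to a single host computer having several hostnames.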

Any domain name can also be a hostname, as long as the restrictions that apply to hostnames are followed. So, for example, both "en.wikimedia.org" and "wikimedia.org" are hostnames because they both have IP addresses assigned to them. The domain name "pmtpa.wikimedia.org" is not a hostname since it does not have an IP address, but "rr.pmtpa.wikimedia.org" is a hostname. All hostnames are domain names, but not all domain names are hostnames.

References

  1. RFC 1738
  2. RFC 3305
  3. URL Specification
  4. World Wide Web History
  5. RFC 1738
  6. PHP parse_url() Function, http://us.php.net/parse_url, retrieved on 2009-03-12
  7. URL Syntax
  8. Absolute vs Relative URLs
  9. Tim Berners-Lee, Roy T. Fielding, Larry Masinter (January 2005). “Uniform Resource Identifier (URI): Generic Syntax”. Internet Society. RFC 3986; STD 66.
  10. by describing its primary access mechanism
