A Uniform Resource Locator (URL) is a string of characters conforming to a standardized format, which refers to a resource on the Internet (such as a document or an image) by its location. For example, the URL of this page on Wikipedia is http://en.wikipedia.org/wiki/Uniform_Resource_Locator.

An HTTP URL, commonly called a web address, is usually shown in the address bar of a web browser.

The term is typically pronounced either as a spelled-out initialism ("yoo arr ell") or as an acronym (''earl'' or ''ural'' as in the ''Ural Mountains'').

Tim Berners-Lee created the URL in 1991 to allow the publishing of hyperlinks on the World Wide Web, a fundamental innovation in the history of the Internet. Since 1994, the URL has been subsumed into the more general Uniform Resource Identifier (URI), but ''URL'' is still a widely used term.

The ''U'' in ''URL'' has always stood for Uniform, but it is sometimes described as Universal, perhaps because URI did mean Universal Resource Identifier before RFC 2396.


DEFINITION


URIs and URLs


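Every URL is a type of URI. A URI can identify a resource without saying where to find it: for example, the URN urn:ietf:rfc:1738 identifies RFC 1738 without indicating where to find the text of this RFC. Now consider three URLs for three separate documents containing the text of this RFC: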
  • http://www.ietf.org/rfc/rfc1738.txt

  • http://www.w3.org/Addressing/rfc1738.txt

  • http://rfc.sunsite.dk/rfc/rfc1738.html

Each URL uniquely identifies its document and thus is a URI itself, but URL syntax is such that the identifier also allows one to locate each of these documents. Thus, a URL functions as the document's address.


Historically, the terms have been almost synonymous as almost all URIs have also been URLs.
For this reason, many definitions in this article mention URIs instead of URLs; the discussion applies to both URIs and URLs.


URL scheme


A URL is classified by its scheme, which typically indicates the network protocol used to retrieve a representation of the identified resource over a computer network. A URL begins with the name of its scheme, followed by a colon, followed by a '''scheme-specific part'''.
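The split between the scheme and the scheme-specific part can be sketched in a few lines of Python. This is only an illustration of the rule stated above, not a full URL parser; the helper name `split_scheme` and the `mailto` address are made up for the example.

```python
def split_scheme(url: str) -> tuple[str, str]:
    """Split a URL into (scheme, scheme-specific part) at the first colon."""
    scheme, colon, rest = url.partition(":")
    if not colon or not scheme:
        raise ValueError(f"no scheme found in {url!r}")
    # Scheme names are case-insensitive; lowercase is the usual normal form.
    return scheme.lower(), rest


# The first URL comes from the text; the mailto address is hypothetical.
print(split_scheme("http://www.ietf.org/rfc/rfc1738.txt"))
print(split_scheme("mailto:someone@example.org"))
```

Everything after the first colon is left untouched here, because its interpretation depends on the scheme.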

Examples of URL schemes include "http", "https", "ftp", "mailto", and "file".


Some of the first URL schemes, such as the still-popular "mailto", "http", "ftp", and "file" schemes, along with the general syntax of URLs, were first detailed in RFC 1738 (1994). The generic syntax was later revised in RFC 2396 (1998) and RFC 2732 (1999), both of which are obsolete but still widely referenced by URL scheme definitions, and in the current standard, STD 66 / RFC 3986 (2005).


Generic URL syntax


All URLs, regardless of scheme, must conform to a generic syntax. Each scheme can impart its own requirements for the syntax of the scheme-specific part, but the URL must still conform to the generic syntax.

Using a limited subset of characters compatible with the printable subset of the ASCII repertoire, the generic syntax allows a URL to represent a resource's address, regardless of the original format of the components of the address.

Schemes using typical connection-based protocols use a common "generic URI" syntax, defined below:

''scheme''://''authority''/''path''?''query''#''fragment''

The ''authority'' typically consists of the name or IP address of a server, optionally followed by a colon and a TCP port number. It may also contain a username and password for authenticating to the server.

The ''path'' is a specification of a location in some hierarchical structure, using a slash ("/") as delimiter between components.

The ''query'' typically expresses parameters of a dynamic query to some database, program, or script residing on the server.

The ''fragment'' identifies a portion of a resource, often a location in a document.
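As a rough illustration of the five components, Python's standard `urllib.parse.urlsplit` breaks a URL along the same lines. The URL below is hypothetical and was chosen only because it exercises every component, including the optional username, password, and port inside the authority.

```python
from urllib.parse import urlsplit

# Hypothetical URL that uses every part of the generic syntax.
url = "http://user:secret@www.example.org:8080/docs/guide.html?lang=en&page=2#install"
parts = urlsplit(url)

print(parts.scheme)    # scheme
print(parts.netloc)    # authority: userinfo, host, and port together
print(parts.username)  # optional username inside the authority
print(parts.password)  # optional password inside the authority
print(parts.hostname)  # server name
print(parts.port)      # TCP port as an integer, or None if absent
print(parts.path)      # hierarchical path
print(parts.query)     # query string, without the leading "?"
print(parts.fragment)  # fragment, without the leading "#"
```

Note that `urlsplit` calls the authority `netloc`; the other attribute names match the terms used above.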


Example: HTTP URLs


The URLs employed by HTTP, the protocol used to transmit web pages, are the most popular kind of URI and can be used as an example to demonstrate the concept of the URI. The HTTP URL syntax is:

''scheme://host:port/path?parameter=value#anchor''

  • scheme, in the case of HTTP, is usually http, but https can also be used to signify HTTP over a TLS connection (to make the connection more secure).

  • Many web browsers allow the use of scheme://''username:password@''host:port/... for HTTP authentication. This format has been used as an exploit to make it difficult to correctly identify the server involved. Consequently, support for this format has been dropped from some browsers. Section 3.2.1 of RFC 3986 recommends that browsers should display the username / password not in the address bar, but in a different way because of the security problems mentioned and because passwords should never be displayed in clear-text.

  • host, which is probably the most prominent part of a URL, is in almost all cases the domain name of a server, e.g. www.wikipedia.org, google.com, www.imv.au.dk, etc.

  • The :port portion specifies a TCP port number. It is usually omitted, and web browsers will use the default, port 80. However, HTTP can be used over any port number, provided the server has been set up to provide HTTP service at that port.

  • The path portion is used by the server (specified by host) in whatever way the server's software is set up, but in many cases it specifies a filename, possibly prepended with directory names. In the path /wiki/Cow for example, wiki would be a (pseudo-)directory and Cow would be a (pseudo-)filename.

  • The part given above as ?parameter=value is referred to as the ''query'' portion (sometimes the ''search'' portion). It can be omitted, have one parameter-value pair as in the example, or have many of them, which is expressed as ?para=value&anotherpara=value&.... The parameter-value pairs are only relevant if the file specified by the path is not a simple, static webpage, but some sort of automatically generated page. The generator software uses the parameter-value pairs in whatever way it is set up; mostly they carry information specific to one user and one moment in the use of the site, like concrete search terms, usernames, etc. (Watch, for example, how the URL in your browser's address bar behaves during a Google search: your search term is passed to some sophisticated program on google.com as a parameter, and Google's program returns a page with the search results to you.)

  • The #anchor part, lastly, is called the ''fragment identifier'' and refers to certain significant places inside a page; for example, this page has anchors at each section heading which can be referred to via the fragment ID. Fragments are relevant when a URL should, when loaded in a browser, jump directly to a certain point in a long page, such as the beginning of this section.
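The parts listed above can be pulled out of an HTTP URL with Python's standard library, as in the following sketch. The helper name `describe_http_url` is invented for this example, and the default port 443 for https is a well-known value that the text above does not state.

```python
from urllib.parse import parse_qs, urlsplit

# Well-known default ports; 443 for https is an addition not stated in the text.
DEFAULT_PORTS = {"http": 80, "https": 443}


def describe_http_url(url: str) -> dict:
    """Return the scheme, host, port, path, query pairs, and anchor of an HTTP URL."""
    parts = urlsplit(url)
    if parts.scheme not in DEFAULT_PORTS:
        raise ValueError(f"not an HTTP URL: {url!r}")
    # An omitted port means the scheme's default port is used.
    port = parts.port if parts.port is not None else DEFAULT_PORTS[parts.scheme]
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": port,
        "path": parts.path,
        # parse_qs maps each parameter name to the list of its values.
        "query": parse_qs(parts.query),
        "anchor": parts.fragment,
    }


print(describe_http_url("http://en.wikipedia.org/wiki/Special:Search?search=train&go=Go"))
```

Each query value is a list because the same parameter name may legitimately appear more than once in a query string.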


For another example of an HTTP URL, see "URLs in everyday use" below.


URI references


The term URI reference means a particular instance of a URI, or portion thereof, as used in, for instance, an HTML document.

An absolute URL is a URI reference that is just like a URL defined above; it starts with a scheme followed by a colon and then a scheme-specific part. A '''relative URL''' is a URI reference that comprises just the scheme-specific part of a URL, or some trailing component thereof. The scheme and leading components are inferred from the context in which the URL reference appears: the '''base URI''' (or '''base URL''') of the document containing the reference.

A URI reference can also be followed by a hash sign ("#") and a pointer to within the resource referenced by the URI as a whole. This is not a part of the URI as such, but is intended for the "user agent" (browser) to interpret after a representation of the resource has been retrieved. Therefore, it is not supposed to be sent to the server in HTTP requests.

Examples of absolute URLs:
  • http://en.wikipedia.org/w/wiki.phtml?title=Train&action=history

  • http://en.wikipedia.org/wiki/Train#Freight_trains


Examples of relative URLs:
  • //nl.wikipedia.org/wiki/Train

  • /wiki/Train

  • Train#Passenger_trains
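How a relative URL is completed from its base can be tried out with `urllib.parse.urljoin`. The base URL below is an assumption made for the example (the text does not name one); the three relative URLs are the ones listed above. `urldefrag` shows the part that would actually be sent to the server once the fragment is set aside.

```python
from urllib.parse import urldefrag, urljoin

# Assumed base URL: the address of the document that contains the references.
base = "http://en.wikipedia.org/wiki/Main_Page"

for reference in ("//nl.wikipedia.org/wiki/Train", "/wiki/Train", "Train#Passenger_trains"):
    # urljoin fills in the scheme and leading components from the base.
    print(reference, "->", urljoin(base, reference))

# The fragment is kept by the browser; only the rest is requested from the server.
absolute = urljoin(base, "Train#Passenger_trains")
without_fragment, fragment = urldefrag(absolute)
print(without_fragment, fragment)
```

The first reference inherits only the scheme, the second inherits scheme and host, and the third also inherits the leading part of the path.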



Case-sensitivity


According to the current standard, the scheme and host components are case-insensitive, and when normalized during processing, should be lowercase. Other components should be assumed to be case-sensitive. However, in practice the case-sensitivity of the components other than the scheme and hostname is up to the web server and operating system of the system hosting the website.
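A minimal sketch of this case normalization in Python follows. The function name `normalize_case` is hypothetical, and the sketch deliberately lowercases only the scheme and the host, leaving any username, password, path, query, and fragment exactly as written.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_case(url: str) -> str:
    """Lowercase the scheme and host of a URL; leave all other parts untouched."""
    parts = urlsplit(url)
    # Keep any "user:password@" prefix as is, since it may be case-sensitive.
    userinfo, at, hostport = parts.netloc.rpartition("@")
    netloc = userinfo + at + hostport.lower()
    return urlunsplit((parts.scheme.lower(), netloc, parts.path, parts.query, parts.fragment))


print(normalize_case("HTTP://EN.Wikipedia.ORG/wiki/Train"))
# Paths that differ only in case are not treated as equivalent.
print(normalize_case("http://en.wikipedia.org/wiki/Train") == normalize_case("http://en.wikipedia.org/wiki/train"))
```

Whether /wiki/Train and /wiki/train actually lead to the same page is decided by the server, so a client-side normalizer should not guess.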


URLS IN EVERYDAY USE


An HTTP URL combines into one simple address the four basic items of information necessary to retrieve a resource from anywhere on the Internet:
  • the protocol to use to communicate,

  • the host (server) to communicate with,

  • the network port on the server to connect to,

  • the path to the resource on the server (for example, its file name).


A typical URL can look like:

http://en.wikipedia.org:80/wiki/Special:Search?search=train&go=Go

In the example above:
  • ''http'' is the protocol,

  • ''en.wikipedia.org'' is the host,

  • ''80'' is the network port number on the server (as 80 is the default value for the HTTP protocol, this portion could have been omitted entirely),

  • ''/wiki/Special:Search'' is the resource path,

  • ''?search=train&go=Go'' is the query string; this part is optional.


Most web browsers do not require the user to type "http://", as HTTP is by far the most common protocol used in web browsers. Likewise, since 80 is the default port for http, it is not usually specified. One usually just enters a partial URL such as www.wikipedia.org/wiki/Train. To go to a home page, one usually just enters the host name, such as www.wikipedia.org.

Since the HTTP protocol allows a server to respond to a request by redirecting the web browser to a different URL, many servers additionally allow users to omit certain parts of the URL, such as the "www." part or the trailing slash. (The redirection may be performed at a top-level server and not on the HTTP server, but this distinction is transparent to an end user.) However, these omissions technically make it a different URL, so the web browser cannot make these adjustments itself and has to rely on the server to respond with a redirect. It is possible, but due to tradition rare, for a web server to serve two different pages for URLs that differ only in a trailing slash.

Note that in en.wikipedia.org/wiki/Train, the hierarchical order of the five elements is org (generic top-level domain) - wikipedia (second-level domain) - en (subdomain) - wiki - Train; i.e. before the first slash from right to left, then the rest from left to right.
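The reading order described above (host labels from right to left, then path segments from left to right) can be written down as a small Python function. The name `hierarchy` is made up for this sketch, and it assumes a partial URL without a scheme, as in the example.

```python
def hierarchy(partial_url: str) -> list[str]:
    """List the elements of a scheme-less URL from most general to most specific."""
    host, _, path = partial_url.partition("/")
    # Host labels become more general toward the right, so reverse them.
    labels = host.split(".")[::-1]
    # Path segments already run from general to specific; drop empty ones.
    segments = [segment for segment in path.split("/") if segment]
    return labels + segments


print(hierarchy("en.wikipedia.org/wiki/Train"))
```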


THE BIG PICTURE


The term URL is also used outside the context of the World Wide Web. Database servers specify URLs as a parameter for making connections to them. Similarly, any client-server application following a particular protocol may specify a URL format as part of its communication process.

Example of a database URL:
jdbc:datadirect:oracle://myserver:1521;sid=testdb

If a webpage is uniquely and more or less permanently defined by a URL, it can be linked to.

Apart from linking to a page or page component, one may want to know its URL in order to display the component on its own, or to escape restrictions such as a browser window without toolbars or of a small, non-adjustable size.

Web servers also have the ability to redirect URLs if the destination has changed, allowing sites to change their structure without affecting existing links. This process is known as URL redirection.


EXTERNAL LINKS


  • RFC 3986 / STD 66 (2005) – the current generic URI syntax specification

  • RFC 2396 (1998) and RFC 2732 (1999) – obsolete, but widely implemented, version of the generic URI syntax

  • RFC 1808 (1995) – obsolete companion to RFC 1738 covering relative URL processing

  • RFC 1738 (1994) – mostly obsolete definition of URL schemes and generic URI syntax

  • RFC 1630 (1994) – the first generic URI syntax specification; first acknowledgment of URLs in an Internet standard

  • URI Working Group – coordination center for development of URI standards

  • Architecture of the World Wide Web, Volume One, §2: Identification – by W3C

  • The IANA's official list of registered URI schemes