This is a visual explainer based on the Wikipedia article on Uniform Resource Locators (URLs).

The Web's Address System

An in-depth exploration of how URLs locate and retrieve resources across the digital landscape, covering syntax, history, and internationalization.

What is a URL?

Resource Locator

A Uniform Resource Locator (URL), colloquially known as an address on the Web, is a reference to a resource that specifies its location on a computer network and a mechanism for retrieving it.[6] It is a specific type of Uniform Resource Identifier (URI), though the terms are often used interchangeably.[7][a]

Ubiquitous Application

URLs are most commonly used to reference web pages via the Hypertext Transfer Protocol (HTTP/HTTPS). However, their application extends to various other protocols and services, including file transfer (FTP), email (mailto), and database access (JDBC), among many others.[10]

Address Bar Standard

Most web browsers display the URL of the current web page in their address bar. A typical URL, such as http://www.example.com/index.html, indicates the protocol (http), the hostname (www.example.com), and the path to the resource (/index.html).

Historical Development

Genesis of URLs

The formal definition of Uniform Resource Locators emerged in 1994 within RFC 1738, authored by Tim Berners-Lee, the architect of the World Wide Web, and the URI working group of the Internet Engineering Task Force (IETF).[11][12] This development was a culmination of collaborative efforts initiated at an IETF meeting in 1992.[12]

Syntactic Evolution

The URL format combined the existing system of domain names (created in 1985) with file path syntax, using slashes to separate directory and file names.[13] Early proposals considered "Universal Document Identifiers" (UDIs) and "Universal Resource Locators", reflecting a desire for a comprehensive naming system, though the term "Uniform" eventually prevailed.[15] Berners-Lee later expressed a preference for the "universal" designation and noted that the double slash following the scheme is redundant.[14]

URL Syntax Breakdown

Hierarchical Structure

A URL adheres to the generic URI syntax, comprising five hierarchical components, ordered by decreasing significance from left to right:

URI = scheme ":" ["//" authority] path ["?" query] ["#" fragment]

The scheme and path are always defined, while authority, query, and fragment are optional. A component that has an associated delimiter is undefined when that delimiter does not appear in the URI, and a component is empty when it contains no characters. The scheme is never empty.[1]
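
As a rough illustration of the generic syntax above, the sketch below splits a URI with the regular expression given in RFC 3986, Appendix B. The function name split_uri and the sample URIs are assumptions made for this example. None stands for an undefined component and an empty string for an empty one.

```python
import re

# Regular expression from RFC 3986, Appendix B, for splitting a URI reference.
URI_PATTERN = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def split_uri(uri: str) -> dict:
    """Return the five generic components (hypothetical helper).

    A value of None means the component is undefined (its delimiter is
    absent); an empty string means it is defined but empty.
    """
    match = URI_PATTERN.match(uri)  # every group is optional, so this always matches
    return {
        "scheme": match.group(2),
        "authority": match.group(4),
        "path": match.group(5),
        "query": match.group(7),
        "fragment": match.group(9),
    }


# Sample URIs chosen for illustration only.
for sample in (
    "http://www.example.com/index.html",
    "mailto:user@example.com",  # no "//", so the authority is undefined
    "http://example.com/?",     # "?" present, so the query is defined but empty
):
    print(sample, "->", split_uri(sample))
```

The path is always reported as a string, possibly empty, which mirrors the rule that the path is always defined.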

Key Components Explained

Let's dissect the primary components:

  • Scheme: Identifies the protocol (e.g., http, https, ftp, mailto). It begins with a letter and can include letters, digits, '+', '.', or '-'. Schemes are case-insensitive, but lowercase is canonical.[18]
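  • Authority: Optional component, preceded by //. It consists of an optional userinfo part followed by @ (a username, optionally with a password; the username:password form is deprecated for security reasons), the host (a hostname or IP address), and an optional port number preceded by a colon.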
  • Path: A sequence of segments separated by slashes (/), representing the location of the resource. It can resemble a file system path but doesn't always imply a direct mapping.
  • Query: Optional component, preceded by ?, containing non-hierarchical data, often in attribute-value pairs.
  • Fragment: Optional component, preceded by #, identifying a specific part of a resource (e.g., a section in an HTML document).
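
For a practical view of these components, the following sketch uses urlsplit from Python's standard urllib.parse module. The helper name describe and the sample URL (including its user name and port) are made up for illustration. Note that urlsplit reports an absent query or fragment as an empty string, so it does not distinguish "undefined" from "empty".

```python
from urllib.parse import urlsplit


def describe(url: str) -> dict:
    """Map a URL onto the components listed above (hypothetical helper)."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,      # urlsplit lowercases the scheme
        "userinfo": parts.username,  # None when no userinfo is present
        "host": parts.hostname,      # lowercased by urlsplit
        "port": parts.port,          # int, or None when no port is given
        "path": parts.path,
        "query": parts.query,
        "fragment": parts.fragment,
    }


# Illustrative URL only; the mixed case shows that scheme and host are case-insensitive.
sample = "HTTPS://user@WWW.Example.com:8443/docs/guide.html?lang=en&page=2#syntax"
for name, value in describe(sample).items():
    print(f"{name:9} {value!r}")
```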

Query Delimiters

The query component often consists of attribute-value pairs. While the syntax is flexible, conventions dictate how these pairs are structured:

Query Delimiter      Example
Ampersand (&)        key1=value1&key2=value2
Semicolon (;)[c]     key1=value1;key2=value2

The historic RFC 1866 (since obsoleted by RFC 2854) encouraged CGI authors to support semicolons in addition to ampersands.[c]
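
The two delimiter conventions can be tried out with parse_qsl from Python's urllib.parse. This is a minimal sketch that assumes a Python version in which parse_qsl accepts the separator argument (3.10 and later); parse_query is a hypothetical wrapper name.

```python
from urllib.parse import parse_qsl


def parse_query(query: str, separator: str = "&") -> list:
    """Split a query string into (key, value) pairs using one delimiter."""
    return parse_qsl(query, separator=separator)


# Ampersand-delimited pairs (the default).
print(parse_query("key1=value1&key2=value2"))

# Semicolon-delimited pairs must be requested explicitly.
print(parse_query("key1=value1;key2=value2", separator=";"))

# With the default "&", a semicolon is treated as part of the value.
print(parse_query("key1=value1;key2=value2"))
```

Only one separator is honored per call, so a server that wants to accept both conventions has to decide which one applies before parsing.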

Internationalized URLs

Embracing Global Languages

To accommodate users worldwide, URLs can include characters from many writing systems. An Internationalized Resource Identifier (IRI) is a form of URL that includes Unicode characters.[23][24] All modern browsers support IRIs.

IDNs and Encoding

The domain name part of an IRI is known as an Internationalized Domain Name (IDN). Web software automatically converts the domain name into Punycode, an ASCII form usable by the Domain Name System (DNS). For example, a Chinese URL might become http://xn--fsqu00a.xn--3lr804guic/.[25] Similarly, the path can be written in a local script; it is encoded as UTF-8, and any characters outside the basic URL character set are percent-encoded (e.g., the Japanese file name 引き割り.html becomes %E5%BC%95%E3%81%8D%E5%89%B2%E3%82%8A.html).[23]
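
Both conversions can be reproduced with Python's standard library, as in the hedged sketch below. It relies on the built-in "idna" codec, which implements the older IDNA 2003 rules and may differ from what current browsers do for some names. The helper names are hypothetical; the host and the encoded file name are the ones quoted above.

```python
from urllib.parse import quote, unquote


def host_to_unicode(ascii_host: str) -> str:
    """Decode Punycode ("xn--") labels into their Unicode form."""
    return ascii_host.encode("ascii").decode("idna")


def host_to_ascii(unicode_host: str) -> str:
    """Encode a Unicode host name into its DNS-compatible ASCII form."""
    return unicode_host.encode("idna").decode("ascii")


def encode_path(path: str) -> str:
    """Percent-encode the UTF-8 bytes of characters outside the basic set."""
    return quote(path, safe="/")  # quote() encodes as UTF-8 by default


# Host name taken from the example in the text.
ascii_host = "xn--fsqu00a.xn--3lr804guic"
unicode_host = host_to_unicode(ascii_host)
print("decoded host is ASCII:", unicode_host.isascii())
print("host round trip ok:", host_to_ascii(unicode_host) == ascii_host)

# Percent-encoded file name taken from the example in the text.
encoded_name = "%E5%BC%95%E3%81%8D%E5%89%B2%E3%82%8A.html"
decoded_name = unquote(encoded_name)
print("decoded name length:", len(decoded_name))
print("path round trip ok:", encode_path(decoded_name) == encoded_name)
```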

Protocol-Relative URLs

Flexible Linking

Protocol-relative links (PRLs), or protocol-relative URLs (PRURLs), are URLs that omit the protocol scheme. For instance, //example.com will automatically adopt the protocol (typically HTTP or HTTPS) of the current page.[26][27] This offers flexibility, especially when serving content over both HTTP and HTTPS.
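
The resolution described above can be sketched with urljoin from Python's urllib.parse, which resolves a reference against a base URL. The page and script URLs here are placeholders, and resolve is a hypothetical helper name.

```python
from urllib.parse import urljoin


def resolve(page_url: str, reference: str) -> str:
    """Resolve a (possibly protocol-relative) reference against the current page."""
    return urljoin(page_url, reference)


reference = "//example.com/lib.js"  # no scheme: inherits it from the page

# The same reference resolves differently depending on the page's scheme.
print(resolve("https://example.org/articles/page.html", reference))
print(resolve("http://example.org/articles/page.html", reference))
```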

Related Concepts

Key Terminology

Understanding URLs involves familiarity with related concepts:

  • Hyperlink: A reference enabling users to navigate between resources.
  • URI: A broader term encompassing URLs and URNs (Uniform Resource Names).
  • URI Fragment: A part of a URL (after '#') that points to a specific section within a resource.
  • Hostname: The domain name or IP address identifying a server.
  • URI Scheme: The protocol identifier (e.g., http, ftp).
  • URL Normalization: The process of transforming a URL into a canonical form.
  • URL Redirection: Automatically forwarding a user from one URL to another.
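
To make the URL normalization entry more concrete, here is a minimal sketch that applies only a few common rules: lowercasing the scheme and host, dropping the default port, and replacing an empty path with "/". The function name normalize and the choice of rules are assumptions for this example; real normalizers apply more rules. The sketch also drops any userinfo and does not handle bracketed IPv6 hosts.

```python
from urllib.parse import urlsplit, urlunsplit

# Well-known default ports for the two schemes handled in this sketch.
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize(url: str) -> str:
    """Return a simplified canonical form of an http(s) URL (illustrative only)."""
    parts = urlsplit(url)
    scheme = parts.scheme        # already lowercased by urlsplit
    host = parts.hostname or ""  # lowercased; userinfo is discarded here
    netloc = host
    # Keep the port only when it differs from the scheme's default.
    if parts.port is not None and parts.port != DEFAULT_PORTS.get(scheme):
        netloc = f"{host}:{parts.port}"
    path = parts.path or "/"     # an empty path is equivalent to "/"
    return urlunsplit((scheme, netloc, path, parts.query, parts.fragment))


print(normalize("HTTP://WWW.Example.COM:80/index.html"))
print(normalize("https://example.com"))
print(normalize("http://example.com:8080/a"))
```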

Identifier Landscape

URLs exist within a larger framework of resource identification:

  • Internationalized Resource Identifier (IRI): A form of URL that supports Unicode characters, allowing non-ASCII characters in domain names and paths.
  • Persistent Uniform Resource Locator (PURL): A URL designed to remain stable over time, even if the resource's actual location changes.
  • Uniform Resource Name (URN): A persistent, location-independent identifier for a resource.

Further Exploration

Official Specifications & Tools

For deeper technical understanding and practical application, consult these resources:

  • URL specification at WHATWG
  • URL splitter tool

References

A full list of references for this article is available at the URL Wikipedia page.

Disclaimer

Important Notice

This page was generated by an Artificial Intelligence and is intended for informational and educational purposes only. The content is based on a snapshot of publicly available data from Wikipedia and may not be entirely accurate, complete, or up-to-date.

This is not professional technical advice. The information provided on this website is not a substitute for professional consultation regarding web development, network protocols, or software architecture. Always refer to official documentation and consult with qualified professionals for specific technical requirements or implementation guidance.

The creators of this page are not responsible for any errors or omissions, or for any actions taken based on the information provided herein.