Understanding URL Encoding
URL encoding, also known as percent-encoding, is a mechanism defined in RFC 3986 for representing characters in a Uniform Resource Identifier (URI) that are not allowed or that have special meaning in URL syntax. A character that cannot appear literally is first converted to its UTF-8 byte sequence, and each byte is then written as a percent sign (%) followed by two hexadecimal digits. For example, a space becomes %20, an ampersand becomes %26, a forward slash becomes %2F, and é, which takes two bytes in UTF-8, becomes %C3%A9.
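The byte-by-byte rule can be sketched in a few lines of JavaScript. This is a minimal illustration, not a replacement for the built-in encoders: the helper name percentEncode is hypothetical, and it assumes a runtime that provides the standard TextEncoder global (modern browsers and Node.js).

```javascript
// Hypothetical helper: percent-encode every UTF-8 byte of `text`,
// leaving only the RFC 3986 unreserved characters as literals.
function percentEncode(text) {
  const unreserved = /^[A-Za-z0-9._~-]$/;
  let out = "";
  // TextEncoder always produces UTF-8 bytes.
  for (const byte of new TextEncoder().encode(text)) {
    const ch = String.fromCharCode(byte);
    if (byte < 0x80 && unreserved.test(ch)) {
      out += ch; // safe to keep as-is
    } else {
      // One %XX triplet per byte, upper-case hex, zero-padded.
      out += "%" + byte.toString(16).toUpperCase().padStart(2, "0");
    }
  }
  return out;
}

const encodedSpace = percentEncode(" ");       // "%20"
const encodedAmp = percentEncode("&");         // "%26"
const encodedSlash = percentEncode("/");       // "%2F"
const encodedEAcute = percentEncode("\u00e9"); // "%C3%A9" (é is two bytes in UTF-8)
```

A non-ASCII character therefore expands to one %XX triplet per byte, which is why é produces two triplets rather than one.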
URLs can contain only a limited set of characters from the ASCII character set. The unreserved characters, which can always appear literally, are: uppercase and lowercase letters (A-Z, a-z), digits (0-9), hyphens (-), periods (.), underscores (_), and tildes (~). Everything else, including spaces, international characters, and reserved characters like ? & = # /, must be percent-encoded when used as data. Reserved characters stay literal only when they act as URL delimiters.
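The distinction between data and delimiters is easiest to see when a URL is parsed back. The sketch below uses the standard URL class; the example.com address and the parameter name q are placeholders chosen for illustration.

```javascript
// An unencoded "&" is read as a delimiter, so the value is cut short
// and "chips" turns into a separate, empty parameter.
const rawUrl = new URL("https://example.com/search?q=fish&chips");
const rawQuery = rawUrl.searchParams.get("q");           // "fish"
const strayParam = rawUrl.searchParams.has("chips");     // true

// Encoded as %26, the same character is treated as data and survives.
const encodedUrl = new URL("https://example.com/search?q=fish%26chips");
const encodedQuery = encodedUrl.searchParams.get("q");   // "fish&chips"

console.log(rawQuery, strayParam, encodedQuery);
```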
JavaScript provides two built-in functions for URL encoding: encodeURIComponent() and encodeURI(). The key difference is scope. encodeURIComponent() encodes everything except the unreserved characters and a handful of others (! ' ( ) *), which makes it the right choice for individual query parameter names and values. encodeURI() additionally leaves the reserved characters ; / ? : @ & = + $ , # untouched, so it suits a whole URL whose structure must be preserved. Using the wrong function is a common source of bugs: encodeURI() on a parameter value leaves & and = in place and corrupts the query string, while encodeURIComponent() on a whole URL escapes the : and / that delimit it.
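A short comparison makes the difference concrete. The input strings and the example.com URLs are made up for illustration; the two functions are the standard built-ins.

```javascript
const value = "rock & roll/jazz?";

// Every reserved character in the value is escaped.
const asComponent = encodeURIComponent(value); // "rock%20%26%20roll%2Fjazz%3F"

// & / ? are left alone, so this is unsafe for a parameter value.
const asUri = encodeURI(value);                // "rock%20&%20roll/jazz?"

// Typical use: encode each value, then assemble the URL around it.
const searchUrl = "https://example.com/search?q=" + encodeURIComponent(value);

// encodeURI() fits a complete URL: the structure stays intact, while
// spaces and non-ASCII characters are escaped.
const wholeUrl = encodeURI("https://example.com/my docs/r\u00e9sum\u00e9.pdf");
// "https://example.com/my%20docs/r%C3%A9sum%C3%A9.pdf"

// Neither function escapes these five characters.
const leftAlone = encodeURIComponent("!'()*"); // "!'()*"

console.log(asComponent, asUri, searchUrl, wholeUrl, leftAlone);
```

If the target is specifically a query string, the URLSearchParams class is another option, but note that it follows form-encoding rules and writes a space as + rather than %20.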