URL encoding

(also percent-encoding)

URL encoding definition

URL encoding is a method of encoding information in a URL by replacing disallowed characters with a percent sign (%) followed by two hexadecimal digits. Spaces are replaced by %20 or, in form-encoded query strings, by the plus sign (+). The two hexadecimal digits of each triplet represent the byte value of the replaced character; a character outside ASCII is typically converted to UTF-8 first and becomes several triplets.
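As a rough illustration of the definition, the sketch below uses Python's standard `urllib.parse` module. The sample strings are made up for the example, and the manual `%{:02X}` formatting is only there to show where the two hexadecimal digits come from.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

# A space is byte 0x20, so its triplet is "%20".
space_triplet = "%{:02X}".format(ord(" "))

# quote() percent-encodes spaces as %20 (typical for URL paths).
path_style = quote("red shoes & socks")

# quote_plus() uses "+" for spaces (typical for form-encoded query strings).
form_style = quote_plus("red shoes & socks")

# Non-ASCII characters are converted to UTF-8 bytes, one triplet per byte.
non_ascii = quote("café")

print(space_triplet)   # %20
print(path_style)      # red%20shoes%20%26%20socks
print(form_style)      # red+shoes+%26+socks
print(non_ascii)       # caf%C3%A9

# Decoding reverses the process; "+" only means a space in unquote_plus().
print(unquote(path_style))
print(unquote_plus(form_style))
```

Both encoded forms decode back to the same original text, which is why a receiver needs to know which convention the sender used.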

Some characters have special meanings in URLs. For example, characters like #, &, and ? delineate different URL parts, and % itself introduces an encoded triplet. When these characters are part of the data intended for transmission, they’re percent-encoded to avoid misinterpretation by web browsers or servers.
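To see why reserved characters in data need encoding, consider this minimal sketch. The host `example.com` and the parameter names `q` and `tag` are hypothetical; `urlencode`, `urlsplit`, and `parse_qs` are from Python's standard library.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Data that happens to contain reserved characters (&, #, ?).
params = {"q": "cats & dogs", "tag": "#1?"}

# Naive concatenation: the "#" starts a fragment and the "&" splits the value.
naive_url = "https://example.com/search?q=cats & dogs&tag=#1?"
naive_parts = urlsplit(naive_url)
print(naive_parts.fragment)          # 1?
print(parse_qs(naive_parts.query))   # the "q" value is cut off at the "&"

# Encoded query string: reserved characters become triplets.
query = urlencode(params)
print(query)                         # q=cats+%26+dogs&tag=%231%3F

safe_url = "https://example.com/search?" + query
recovered = parse_qs(urlsplit(safe_url).query)
print(recovered)                     # {'q': ['cats & dogs'], 'tag': ['#1?']}
```

With encoding, the receiver recovers exactly the values that were sent; without it, part of the data is read as URL structure.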

See also: URL hijack, URL redirection attack, vulnerability, SQL injection, XSS

Dangers associated with URL encoding

  • Incorrect encoding. The URL won’t work if the special characters aren’t properly encoded.
  • Security vulnerabilities. Incorrect encoding or decoding may lead to security vulnerabilities like cross-site scripting (XSS) or SQL injection attacks, for example when input is checked before it has been fully decoded.
  • Data corruption. Incorrect handling of URL encoding or decoding might result in data corruption. Characters might be misinterpreted or lost, leading to incorrect data being processed by the server or client.
  • Character set mismatches. URLs traditionally only support ASCII characters. If characters outside this range are not properly encoded, it can lead to issues — especially if the server expects a different character encoding than what is sent.
  • Compatibility issues. Different web servers and browsers may handle URL encoding differently, particularly with characters not usually treated as needing encoding. This can lead to inconsistencies in how different platforms interpret URLs.
  • Information leaks. URLs, including their encoded components, are often logged in server logs or browser history. Sensitive information, if included in URLs and not adequately protected, can be exposed in these logs.
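Two of the dangers above, character set mismatches and decoding-related security issues, can be sketched in a few lines of Python. The filter function `naive_is_safe` and the payload are hypothetical and deliberately simplistic; real applications and attacks vary.

```python
from urllib.parse import quote, unquote

# Character set mismatch: the sender encodes as UTF-8,
# but the receiver decodes the bytes as Latin-1.
encoded = quote("café")
misread = unquote(encoded, encoding="latin-1")
print(encoded)   # caf%C3%A9
print(misread)   # cafÃ©

# Double encoding: a hypothetical filter that decodes once and looks for "..".
def naive_is_safe(raw_path: str) -> bool:
    return ".." not in unquote(raw_path)

# "%25" is an encoded "%", so one decode yields "%2e%2e%2fsecret".
payload = "%252e%252e%252fsecret"
passes_filter = naive_is_safe(payload)

# If a later component decodes again, the traversal sequence appears.
after_second_decode = unquote(unquote(payload))
print(passes_filter)         # True
print(after_second_decode)   # ../secret
```

A common mitigation, under the same assumptions, is to decode exactly once at a single well-defined point and validate the decoded value rather than the raw string.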