URL Decoder/Encoder

URL Decoder and Encoder by bojidar.com

URL encoding is a technique for converting the characters of a string into a form that is valid within a URL.

Input a string of text and encode or decode it as you like.
Handy for turning encoded JavaScript URLs from complete gibberish into readable gibberish.
If you’d like to have the URL Decoder/Encoder for offline use, just view source and save to your hard drive.

Percent-encoding, also known as URL encoding, is a mechanism for encoding information in a Uniform Resource Identifier (URI) under certain circumstances. Although it is known as URL encoding it is, in fact, used more generally within the main Uniform Resource Identifier (URI) set, which includes both Uniform Resource Locator (URL) and Uniform Resource Name (URN). As such, it is also used in the preparation of data of the application/x-www-form-urlencoded media type, as is often used in the submission of HTML form data in HTTP requests.

Types of URI characters: The characters allowed in a URI are either reserved or unreserved (or a percent character as part of a percent-encoding). Reserved characters are those characters that sometimes have special meaning. For example, forward slash characters are used to separate different parts of a URL (or more generally, a URI). Unreserved characters have no such meanings. Using percent-encoding, reserved characters are represented using special character sequences. The sets of reserved and unreserved characters and the circumstances under which certain reserved characters have special meaning have changed slightly with each revision of specifications that govern URIs and URI schemes.

According to the specification for a valid URL format, published in RFC 1738, the permissible characters in URLs are limited to only AlphaNumerics [0-9, a-z, A-Z], the special characters “$-_.+!*'(),” and some very limited special purpose reserved characters.
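The split between reserved and unreserved characters can be sketched in a few lines of Python. This is only an illustration: the character sets below are taken from RFC 3986, which draws the line slightly differently from the RFC 1738 list quoted above, and the function name classify is made up for this example.

```python
import string

# Character sets as defined in RFC 3986 (an assumption for this sketch;
# RFC 1738 groups a few of these characters differently).
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")
RESERVED = set(":/?#[]@" + "!$&'()*+,;=")


def classify(ch: str) -> str:
    """Return the URI character class of a single character."""
    if ch in UNRESERVED:
        return "unreserved"
    if ch in RESERVED:
        return "reserved"
    if ch == "%":
        return "percent"
    return "must be encoded"


for ch in "a/~ %<":
    print(repr(ch), classify(ch))
```

Characters that fall into the last group, such as the space or "<", cannot appear literally in a URI and always have to be percent-encoded.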

URLs usually need to be encoded in an HTML document wherever a URL refers to an object to access or retrieve, which happens in elements such as A, APPLET, AREA, BASE, BGSOUND, BODY, EMBED, FORM, FRAME, IFRAME, ILAYER, IMG, INPUT, ISINDEX, LAYER, LINK, OBJECT, SCRIPT, SOUND, TABLE, TD, TH, and TR.

All versions of HTML allow ISO Latin (ISO-8859-1) character sets within HTML documents.

HTML 4 allows the Unicode character set in addition to ISO Latin (ISO-8859-1). According to RFC 2396, however, a URL offers no standard way to state which character set was used to encode its characters.

The most common use of URL encoding is converting data submitted through HTML forms into a valid URL format. This is necessary because the data may contain characters that are not valid in a URL, or that have a special meaning there, such as “/” or “#”.

For instance, the space character, though permissible within an HTML document as a separator and delimiter, cannot be part of a valid URL. It is normally encoded as %20, or as “+” in form-encoded query strings. The “$” symbol is encoded as %24, the “&” ampersand as %26, and so on.
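The two space encodings can be compared with Python's standard urllib.parse module. This is a small sketch; the sample string is made up for the example.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

text = "price $5 & up"  # hypothetical sample input

# quote() writes a space as %20; safe="" means no character is exempted.
percent_style = quote(text, safe="")
print(percent_style)  # price%20%245%20%26%20up

# quote_plus() writes a space as "+", the form-encoding convention.
plus_style = quote_plus(text)
print(plus_style)     # price+%245+%26+up

# Each style has a matching decoder. unquote() leaves "+" untouched,
# so a "+"-style string must be decoded with unquote_plus().
print(unquote(percent_style))   # price $5 & up
print(unquote_plus(plus_style)) # price $5 & up
```

Mixing the two styles is a common source of bugs: decoding a “+”-style string with unquote() leaves the plus signs in place instead of turning them back into spaces.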

Usually, form data submitted with the “GET” or “POST” method is URL encoded automatically by the browser.
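What a browser does with form fields can be approximated in Python with urllib.parse.urlencode. This is a sketch only: the field names, the values and the example.com address are placeholders, not part of any real form.

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical form fields, as a user might have typed them.
form = {"q": "cats & dogs", "page": "2"}

# urlencode() escapes each name and value and joins the pairs with "&".
query = urlencode(form)
print(query)  # q=cats+%26+dogs&page=2

# With the GET method, the encoded data is appended to the URL.
url = "http://example.com/search?" + query
print(url)

# The receiving side reverses the process.
print(parse_qs(query))  # {'q': ['cats & dogs'], 'page': ['2']}
```

Note that the “&” inside the value becomes %26, so it cannot be confused with the “&” that separates the two fields.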

List of Common Special Characters and Their Encodings: The following is a list of special characters (also referred to as reserved characters) with their URL encodings:

Character   Encoding
;           %3B
?           %3F
/           %2F
:           %3A
#           %23
&           %26
=           %3D
+           %2B
$           %24
,           %2C
(space)     %20 or +
%           %25
<           %3C
>           %3E
~           %7E

Note: The URL encoding of a character consists of a % sign followed by the two-digit hexadecimal code of that character.
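The rule of a % sign followed by a hexadecimal code is simple enough to write out by hand. The sketch below is for illustration rather than production use: the function names are invented, UTF-8 is assumed as the byte encoding, and only the RFC 3986 unreserved characters are left untouched.

```python
import string

# Characters copied through unchanged (the RFC 3986 unreserved set).
UNRESERVED = string.ascii_letters + string.digits + "-._~"


def percent_encode(text: str) -> str:
    """Replace every other character with %XX for each of its UTF-8 bytes."""
    out = []
    for ch in text:
        if ch in UNRESERVED:
            out.append(ch)
        else:
            out.extend(f"%{byte:02X}" for byte in ch.encode("utf-8"))
    return "".join(out)


def percent_decode(text: str) -> str:
    """Turn each %XX sequence back into the byte it stands for."""
    out = bytearray()
    i = 0
    while i < len(text):
        hex_pair = text[i + 1:i + 3]
        is_escape = (
            text[i] == "%"
            and len(hex_pair) == 2
            and all(c in string.hexdigits for c in hex_pair)
        )
        if is_escape:
            out.append(int(hex_pair, 16))
            i += 3
        else:
            # Not a valid escape: keep the character as it is.
            out.extend(text[i].encode("utf-8"))
            i += 1
    return out.decode("utf-8", errors="replace")


print(percent_encode("a b/c"))      # a%20b%2Fc
print(percent_decode("%24%26%20"))  # "$& "
```

The hexadecimal values match the table above: “$” has the code 24, “&” has 26 and the space has 20.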

Apart from the reserved characters listed here, the characters that need to be URL encoded are the ASCII control characters, which cover the ranges 00-1F hex (0-31 decimal) and 7F hex (127 decimal), and the non-ASCII characters, which range from 80 to FF hex (128-255 decimal).
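A short Python check shows how control characters and non-ASCII characters come out. It is a sketch with sample characters chosen for this example; the main assumption to watch is the byte encoding, because the 80-FF range above describes single-byte ISO-8859-1 values, while Python's quote() uses UTF-8 unless told otherwise.

```python
from urllib.parse import quote

# ASCII control characters: tab (09 hex) and delete (7F hex).
tab_encoded = quote("\t", safe="")
delete_encoded = quote("\x7f", safe="")
print(tab_encoded, delete_encoded)  # %09 %7F

# A non-ASCII character, encoded as one ISO-8859-1 byte (E9 hex)...
latin1_encoded = quote("é", safe="", encoding="latin-1")
print(latin1_encoded)  # %E9

# ...and as two UTF-8 bytes, which is quote()'s default.
utf8_encoded = quote("é", safe="")
print(utf8_encoded)    # %C3%A9
```

Because the same character produces different byte sequences, the sender and the receiver have to agree on the character set used.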

Programming Language Support for URL Encoding and URL Decoding
Most web programming languages provide built-in functions for URL encoding and URL decoding. The following is a list of the important methods used for URL encoding and URL decoding in various programming languages:

PHP string urlencode ( string $str ) encodes a string for use in the query part of a URL, where $str is the string to be encoded. string urldecode ( string $str ) decodes a string, where $str is the string to be decoded, and returns the decoded string. Two similar functions are rawurlencode() and rawurldecode(), which encode and decode according to RFC 3986 (RFC 1738 before PHP 5.3) and represent a space as %20 rather than “+”.

.NET HttpUtility.UrlEncode is used for URL encoding and has several overloads:
UrlEncode(Byte[]) converts a byte array into a URL-encoded string.
UrlEncode(String) encodes a URL string.
UrlEncode(String, Encoding) encodes a URL string using the specified encoding object.
UrlEncode(Byte[], Int32, Int32) converts a byte array into a URL-encoded string, starting at the specified position in the array and continuing for the specified number of bytes.
For URL decoding, .NET provides HttpUtility.UrlDecode.

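JavaScript and VBScript escape(string) is used for URL encoding and unescape(string) for URL decoding. In JavaScript both functions are deprecated; encodeURIComponent(string) and decodeURIComponent(string) are the recommended replacements.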

ASP Server.URLEncode is used for URL encoding. Example:
<% Response.Write(Server.URLEncode("http://www.google.com")) %>
Outputs: http%3A%2F%2Fwww%2Egoogle%2Ecom

Perl uri_escape and uri_unescape, provided by the URI::Escape module, are the functions used for URL encoding and decoding respectively.
