
URL Encoding in HTML – HTML URL Encode Characters


Welcome to the DataFlair HTML Tutorial. In this article, we will learn about URL encoding in HTML: what a URL is, how the encoding process works, and which characters get encoded.

HTML URL Encoding

A Uniform Resource Locator, or URL, is the address of a document on the web. It can be composed of words (a domain name, which is resolved through the Domain Name System, DNS) or of an IP address. For example, is a URL.

The structure of this URL is as follows-


Scheme – Defines the internet service type, commonly http or https.
Prefix – Defines the domain prefix, www.
Domain – Defines the internet domain name.
Port – Defines the host’s port number, 80 is the default port number for http.
Path – Defines the path at the server.
Filename – It defines the name of the file or the document that is being displayed.
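
To make the parts listed above concrete, here is a minimal sketch that splits a URL with Python's standard `urllib.parse` module. The URL `https://www.example.com:443/html/default.htm` is a hypothetical one chosen only for illustration, and treating the first host label as the "prefix" is an assumption made to match the list above.

```python
from urllib.parse import urlsplit

# Hypothetical URL, chosen only so that every part in the list is present.
url = "https://www.example.com:443/html/default.htm"
parts = urlsplit(url)

scheme = parts.scheme                            # internet service type
host = parts.hostname                            # prefix + domain
prefix, _, domain = host.partition(".")          # split off the first label
port = parts.port                                # explicit port number
path, _, filename = parts.path.rpartition("/")   # directory path and file name

print(scheme, prefix, domain, port, path, filename)
```

Note that `parts.port` is `None` when the URL does not state a port explicitly; the default (such as 80 for http) is implied by the scheme rather than stored in the URL.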

HTML URL Schemes

Some common URL schemes are-

HTML URL Encode Characters

URL encoding is the practice of translating the characters in a URL into a form that uses only ASCII, so that the URL can be transmitted reliably and accepted by every browser and server on the internet. Characters outside the allowed set are replaced with a percent sign (%) followed by two hexadecimal digits.

Hence, URL encoding replaces each character that is not allowed in a URL with ‘%’ followed by the two hexadecimal digits of its byte value. For example, a space in a URL is written as %20, and $ is replaced by %24.


For instance, a file name written as new%20article.htm would be interpreted as ‘new article.htm’.
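
The %20 and %24 replacements can be checked with Python's standard library. This is a small sketch, not the browser's own implementation; the encoded name `new%20article.htm` is assumed here as the encoded form of ‘new article.htm’.

```python
from urllib.parse import quote, unquote

# Encoding: each disallowed character becomes % plus two hex digits.
encoded_space = quote(" ")
encoded_dollar = quote("$")
encoded_name = quote("new article.htm")

# Decoding reverses the process.
decoded_name = unquote("new%20article.htm")

print(encoded_space, encoded_dollar, encoded_name, decoded_name)
```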

The browser encodes the input according to the character set used in the document. The default character set in HTML5 is UTF-8.

The characters that are encoded are:

a. HTML ASCII Control Characters

These are the characters used for output control: 00-1F in hexadecimal (0-31 in decimal) and 7F (127 in decimal).
ASCII Encoding Example:

Character From Windows-1252 From UTF-8
€ %80 %E2%82%AC
£ %A3 %C2%A3
© %A9 %C2%A9
® %AE %C2%AE
À %C0 %C3%80
Á %C1 %C3%81
 %C2 %C3%82
à %C3 %C3%83
Ä %C4 %C3%84
Å %C5 %C3%85
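
The rows of this table can be reproduced by telling the encoder which character set to use. The sketch below relies on the `encoding` parameter of Python's `urllib.parse.quote`, with `cp1252` being Python's name for Windows-1252; the helper name `encode_both` is made up for this example.

```python
from urllib.parse import quote

def encode_both(ch):
    """Return (Windows-1252 encoding, UTF-8 encoding) of one character."""
    return quote(ch, encoding="cp1252"), quote(ch, encoding="utf-8")

rows = {ch: encode_both(ch) for ch in "€£©®ÀÁÂÃÄÅ"}

for ch, (from_cp1252, from_utf8) in rows.items():
    print(ch, from_cp1252, from_utf8)
```

The same character yields one byte in Windows-1252 but two or three bytes in UTF-8, which is why the decoder must know which character set the page used.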

For entire URL encoding, please visit

b. HTML Non-ASCII Characters

These are the characters beyond the ASCII range, i.e., with codes from 128 to 255 in a single-byte character set such as Windows-1252. Following is the list of non-ASCII URL encodings-

128 %80
129 %81
130 %82
131 %83
132 %84
133 %85
134 %86
135 %87
136 %88
137 %89
138 %8a
139 %8b
140 %8c
141 %8d
142 %8e
143 %8f
144 %90
145 %91
146 %92
147 %93
148 %94
149 %95
150 %96
151 %97
152 %98
153 %99
154 %9a
155 %9b
156 %9c
157 %9d
158 %9e
159 %9f
160 %a0
161 %a1
162 %a2
163 %a3
164 %a4
165 %a5
166 %a6
167 %a7
168 %a8
169 %a9
170 %aa
171 %ab
172 %ac
173 %ad
174 %ae
175 %af
176 %b0
177 %b1
178 %b2
179 %b3
180 %b4
181 %b5
182 %b6
183 %b7
184 %b8
185 %b9
186 %ba
187 %bb
188 %bc
189 %bd
190 %be
191 %bf
192 %c0
193 %c1
194 %c2
195 %c3
196 %c4
197 %c5
198 %c6
199 %c7
200 %c8
201 %c9
202 %ca
203 %cb
204 %cc
205 %cd
206 %ce
207 %cf
208 %d0
209 %d1
210 %d2
211 %d3
212 %d4
213 %d5
214 %d6
215 %d7
216 %d8
217 %d9
218 %da
219 %db
220 %dc
221 %dd
222 %de
223 %df
224 %e0
225 %e1
226 %e2
227 %e3
228 %e4
229 %e5
230 %e6
231 %e7
232 %e8
233 %e9
234 %ea
235 %eb
236 %ec
237 %ed
238 %ee
239 %ef
240 %f0
241 %f1
242 %f2
243 %f3
244 %f4
245 %f5
246 %f6
247 %f7
248 %f8
249 %f9
250 %fa
251 %fb
252 %fc
253 %fd
254 %fe
255 %ff
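
Each entry in the list above is simply the decimal code written as two hexadecimal digits after a percent sign. A minimal sketch that regenerates the list follows; `percent_code` is a hypothetical helper name, and decoding with `latin-1` is an assumption used only to show one single-byte interpretation.

```python
from urllib.parse import unquote

def percent_code(value):
    """Format a byte value (0-255) as a lowercase percent-encoded triplet."""
    return f"%{value:02x}"

table = [(n, percent_code(n)) for n in range(128, 256)]

# Which character a code stands for depends on the character set used to decode it.
e_acute = unquote("%e9", encoding="latin-1")

print(table[0], table[-1], e_acute)
```

Hexadecimal digits in percent-encodings are case-insensitive, so %c6 and %C6 denote the same byte.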

c. HTML Reserved Encode Characters

These include special characters such as the semicolon (;), dollar sign ($), and question mark (?). These characters have a special meaning in URLs and need to be encoded when they are used as ordinary data. For example, the ‘/’ character is reserved because it separates the path segments of a URL; when it appears as data, it is encoded as %2F. Following is the list of reserved characters.

! %21
* %2A
( %28
) %29
; %3B
: %3A
@ %40
& %26
= %3D
+ %2B
$ %24
, %2C
/ %2F
? %3F
# %23
[ %5B
] %5D
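
As a sketch of how this works in practice, Python's `urllib.parse.quote` leaves ‘/’ alone by default because it assumes a path is being encoded; passing `safe=""` forces every reserved character to be encoded. The query value `Tom & Jerry` is a made-up example.

```python
from urllib.parse import quote

reserved = "!*();:@&=+$,/?#[]"

# safe="" means no reserved character is exempt from encoding.
reserved_codes = {ch: quote(ch, safe="") for ch in reserved}

kept_slash = quote("a/b")              # default safe="/" keeps the separator
encoded_slash = quote("a/b", safe="")  # slash treated as data

# A reserved character inside a query value must be encoded,
# otherwise "&" would be read as a parameter separator.
query = "name=" + quote("Tom & Jerry", safe="")

print(reserved_codes, kept_slash, encoded_slash, query)
```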

d. HTML Safe Encode Characters

Alphanumeric characters, i.e. 0-9, a-z, A-Z, and special characters such as $, -, _, ., +, !, *, ‘, (, ) are known as safe characters and are not encoded. Reserved characters are also left unencoded when they are used for their reserved purpose.
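
Different tools draw the "safe" line in slightly different places. The sketch below uses Python's `urllib.parse.quote`, which never encodes letters, digits, or the characters `_.-~`, and shows how its `safe` parameter can be set to keep the special characters named above; the variable names are illustrative only.

```python
import string
from urllib.parse import quote

# Characters Python's quote() never encodes.
always_safe = string.ascii_letters + string.digits + "_.-~"
unchanged = quote(always_safe, safe="")

# The special characters listed as safe in this section.
listed_safe = "$-_.+!*'()"
kept = quote(listed_safe, safe=listed_safe)  # explicitly marked safe
default = quote(listed_safe)                 # Python's stricter default

print(unchanged == always_safe, kept, default)
```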

e. HTML Unsafe Encode Characters

These include space, greater than and less than signs, quotation marks, etc. They have a tendency to be misinterpreted in the URL and thus should be encoded properly. Following is the list of some unsafe characters-

space %20
# %23
% %25
{ %7B
} %7D
| %7C
\ %5C
^ %5E
~ %7E
[ %5B
] %5D
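
The codes in this list can be derived directly from each character's byte value. The following is a minimal sketch with a hand-written helper, `percent_encode` (a hypothetical name), used instead of a library call because Python's `quote` (3.7 and later) treats ‘~’ as unreserved and leaves it unencoded.

```python
def percent_encode(ch):
    """Percent-encode every UTF-8 byte of ch, with uppercase hex digits."""
    return "".join(f"%{byte:02X}" for byte in ch.encode("utf-8"))

unsafe = " #%{}|\\^~[]"
unsafe_codes = {ch: percent_encode(ch) for ch in unsafe}

# The angle brackets and quotation mark mentioned in the text above.
extra_codes = {ch: percent_encode(ch) for ch in '<>"'}

print(unsafe_codes, extra_codes)
```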

URL-Encoding for Control Characters

The ASCII characters encoded as %00 to %1F were originally designed to control hardware devices.

ASCII Character Description URL-encoding
NUL Null Character %00
SOH Start of Header %01
STX Start of Text %02
ETX End of Text %03
EOT End of Transmission %04
ENQ Enquiry %05
ACK Acknowledge %06
BEL Bell (Ring) %07
BS Backspace %08
HT Horizontal Tab %09
LF Line Feed %0A
VT Vertical Tab %0B
FF Form Feed %0C
CR Carriage Return %0D
SO Shift Out %0E
SI Shift In %0F
DLE Data Link Escape %10
DC1 Device Control 1 %11
DC2 Device Control 2 %12
DC3 Device Control 3 %13
DC4 Device Control 4 %14
NAK Negative Acknowledge %15
SYN Synchronize %16
ETB End Transmission Block %17
CAN Cancel %18
EM End of Medium %19
SUB Substitute %1A
ESC Escape %1B
FS File Separator %1C
GS Group Separator %1D
RS Record Separator %1E
US Unit Separator %1F
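
The table above can be regenerated, and a control character round-tripped, with Python's `urllib.parse`. This is a small sketch; the string `line1%0Aline2` is a made-up example.

```python
from urllib.parse import quote, unquote

# Codes 0-31 plus 127 (DEL) are the ASCII control characters.
control_codes = {n: quote(chr(n)) for n in range(0x20)}
control_codes[0x7F] = quote(chr(0x7F))

# A line feed embedded in a URL component decodes back to "\n".
decoded = unquote("line1%0Aline2")

print(control_codes[0x09], control_codes[0x0A], repr(decoded))
```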


In this article, we’ve discussed the Uniform Resource Locator, which defines the address of a document on the web. We’ve discussed how the characters of a URL are encoded as a percent sign followed by hexadecimal digits, a form that is understood by browsers across the internet. We’ve also looked at some common URL-encoded characters in Windows-1252 and UTF-8, along with the URL encoding of control characters.
