A better HTMLencode for protection against cross-site scripting (XSS)

Cross-site scripting can get really ugly. When interacting with a website, visitors might try to post script (mostly JavaScript) that could be harmful.

In VBScript you can use server.HTMLencode to replace problematic characters such as <, > and " with HTML entities; other programming languages have their own counterparts. Unfortunately, this is often not enough. Read Microsoft's article about server.HTMLencode to see which characters are replaced.
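To illustrate why the built-in encoder may not be enough, here is a minimal sketch of one case I am aware of: server.HTMLencode, to my knowledge, leaves the apostrophe untouched, so a value placed inside a single-quoted attribute can break out of it. The variable name userInput and the hostile string are made up for this example.

```vbscript
<%
' Hypothetical hostile input; the name and the value are only for illustration.
Dim userInput
userInput = "x' onmouseover='alert(1)"

' The apostrophe is assumed to pass through Server.HTMLEncode unchanged,
' so the attribute value ends early and a new attribute is injected.
' Expected page source under that assumption:
'   <input type='text' value='x' onmouseover='alert(1)'>
Response.Write "<input type='text' value='" & Server.HTMLEncode(userInput) & "'>"
%>
```

Encoding every non-letter character as a numeric reference is one way to close this kind of gap, at the cost of a larger page.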

What if you are paranoid or simply want encoding options?

HTMLencode with alternatives

I use the following:

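'//////////////////////////////////////////////////////////////
'///
'/// This script was created by Marcin Nowak and published at http://mnowak.se
'/// Feel free to use this script as long as you include this disclaimer.
'/// A link to http://mnowak.se is appreciated.
'///
'//////////////////////////////////////////////////////////////

' Returns the numeric character reference (&#NNN;) for a single character.
function encodeChar(c)
    dim code
    code = ascw(c)
    ' ascw returns a signed 16-bit integer, so characters above U+7FFF
    ' come back negative; shift them into the 0-65535 range.
    if code < 0 then code = code + 65536
    encodeChar = "&#" & code & ";"
end function

' byval keeps the caller's variable untouched (VBScript passes byref by default).
function getHTMLencode(byval str, byval char)
    dim temp, i
    temp = ""
    if len(str) = 0 then
        str = ""
    else
        if char = "all" then
            ' encode every character
            for i = 1 to len(str)
                temp = temp & encodeChar(mid(str, i, 1))
            next
            str = temp
        elseif char = "nonletters" then
            ' encode everything except the English letters a-z and A-Z
            for i = 1 to len(str)
                if instr("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ", mid(str, i, 1)) = 0 then
                    temp = temp & encodeChar(mid(str, i, 1))
                else
                    temp = temp & mid(str, i, 1)
                end if
            next
            str = temp
        elseif len(char) > 0 then
            ' encode only the characters listed in char (case-sensitive match)
            for i = 1 to len(str)
                if instr(char, mid(str, i, 1)) > 0 then
                    temp = temp & encodeChar(mid(str, i, 1))
                else
                    temp = temp & mid(str, i, 1)
                end if
            next
            str = temp
        else
            ' no option given: fall back to the built-in encoder
            str = server.HTMLEncode(str)
        end if
    end if
    getHTMLencode = str
end function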

str is the string to encode; char selects the encoding alternative.

char can have the following values:
all - encodes ALL characters, perfect for the paranoid one.
nonletters - encodes all characters except English letters.
custom characters - decide exactly which characters are to be encoded. Ex: "abc" would encode the letters a, b and c.
"" - results in the usage of the traditional server.HTMLencode.
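The four alternatives can be compared side by side with a short sketch like the one below. It assumes getHTMLencode from above is defined on the same ASP page; the sample string is made up, and the comments show the page source each call should write.

```vbscript
<%
' Assumes getHTMLencode (listed above) is defined or included on this page.
' String literals are passed so that each call starts from the same input.

' all: every character becomes a numeric reference
Response.Write getHTMLencode("<b>Hi</b>", "all") & vbCrLf
' &#60;&#98;&#62;&#72;&#105;&#60;&#47;&#98;&#62;

' nonletters: only a-z and A-Z are left as they are
Response.Write getHTMLencode("<b>Hi</b>", "nonletters") & vbCrLf
' &#60;b&#62;Hi&#60;&#47;b&#62;

' custom characters: only < and > are encoded
Response.Write getHTMLencode("<b>Hi</b>", "<>") & vbCrLf
' &#60;b&#62;Hi&#60;/b&#62;

' empty string: falls back to Server.HTMLEncode
Response.Write getHTMLencode("<b>Hi</b>", "") & vbCrLf
' &lt;b&gt;Hi&lt;/b&gt;
%>
```

Note that "all" and "nonletters" are reserved words here, so they cannot double as custom character lists.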

Works in UTF-8

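This script works with UTF-8 pages thanks to ascW, which returns the Unicode character code of a character instead of the ANSI code that asc returns.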
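One detail worth knowing about ascW is that it returns a signed 16-bit value, so characters above U+7FFF can come back as negative numbers. The sketch below shows this and one possible correction. It is meant to be run with cscript.exe (Windows Script Host) rather than inside an ASP page, and UnsignedAscW is a hypothetical helper name chosen for this example.

```vbscript
Option Explicit

' Hypothetical helper: the character code as an unsigned value (0-65535).
Function UnsignedAscW(c)
    Dim code
    code = AscW(c)
    ' Shift negative results back into the unsigned range.
    If code < 0 Then code = code + 65536
    UnsignedAscW = code
End Function

Dim euro, cjk
euro = ChrW(8364)     ' U+20AC, the euro sign
cjk = ChrW(40864)     ' U+9FA0, a character above U+7FFF

WScript.Echo "&#" & AscW(euro) & ";"          ' &#8364;
WScript.Echo AscW(cjk)                        ' negative, because the result is signed
WScript.Echo "&#" & UnsignedAscW(cjk) & ";"   ' &#40864;
```

Characters outside the Basic Multilingual Plane (many emoji, for example) are stored as two UTF-16 code units in a VBScript string, so encoding them one unit at a time produces two references that browsers do not accept as a valid character. Handling that case would need extra logic that is not part of this sketch.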

As mentioned at the beginning of the script, if you choose to use it, a link to mnowak.se (my primary blog) would be appreciated.
