Regular Expression: ASP Strip Tags


This week, I have been mostly coding MAZIN. You will find out what that means when listening to Ram FM soon.

As part of the project, I had to create a page with a WYSIWYG editor (also known as a Rich Text Editor) that would allow users to compose copy that may or may not include simple HTML tags such as bold, italic, lists, breaks and paragraphs.

As all sites we develop these days are XHTML based and standards compliant, I found that FCKEditor was the best choice - even though it is what Rich would call “Bloatware” - i.e. it’s ridiculously large in terms of directories, language files and so on. That reminds me, I need to go through it and delete all the unwanted languages and plug-in scripts before going live.

The problem with FCKEditor is that it cannot stop users from posting HTML you have not allowed, even if you have configured it so that users can only make text bold, italic or whatever. This means that when the form is submitted, we need a server-side strip_tags function, like PHP has.

Haven’t you already posted an ASP one!? I hear you ask. Well, I did. But the PHP version of strip_tags allows you to specify which tags you want to remain: www.php.net/strip_tags

After some researching, it seems that nobody has come up with an ASP version of this sweet function, so I wrote my own. Et voilà:

'	=============================================================================================================
'	@name		stripHTML
'	@desc 		strips all HTML from code except for tags separated by commas in the param "allowedTags"
'	@returns	string
'	=============================================================================================================
function stripHTML(strHTML, ByVal allowedTags) ' ByVal so the caller's variable is not modified
	
	dim objRegExp, strOutput, matches, match, tagName
	set objRegExp = new regexp
	
	strOutput = strHTML
	allowedTags = "," & lcase(replace(allowedTags, " ", "")) & ","
	
	objRegExp.IgnoreCase = true
	objRegExp.Global = true
	objRegExp.MultiLine = true
	objRegExp.Pattern = "<(.|\n)+?>" ' match all tags, even XHTML ones
	set matches = objRegExp.execute(strHTML)
	objRegExp.Pattern = "<(/?)(\w+)[^>]*>" ' group 2 captures the tag name
	for each match in matches
		tagName = objRegExp.Replace(match.value, "$2")
		tagName = "," & lcase(tagName) & ","
	
		if instr(allowedTags,tagName) = 0 then
			strOutput = replace(strOutput, match.value, "")
		end if
	next
	
	stripHTML = strOutput    'Return the value of strOutput
	set objRegExp = nothing
end function
	

Usage is simple, just do:

html = stripHTML(html, "b,i,strong,em,p,br")

Where b, i, strong, em, p and br are the tags you are allowing.
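To make the tag-name step a little more concrete, here is a minimal sketch of how the second pattern pulls the name out of a single tag. The helper name getTagName is hypothetical and is not part of the function above; it only isolates the Replace call that uses "$2".

```vbscript
' Hypothetical helper (not part of stripHTML): returns the lower-case
' name of a single tag, whether it is an opening or a closing tag.
function getTagName(strTag)
	dim objRegExp
	set objRegExp = new regexp
	objRegExp.IgnoreCase = true
	objRegExp.Global = false
	' group 1 = optional closing slash, group 2 = the tag name
	objRegExp.Pattern = "<(/?)(\w+)[^>]*>"
	' replacing the whole tag with "$2" leaves just the name
	getTagName = lcase(objRegExp.Replace(strTag, "$2"))
	set objRegExp = nothing
end function
```

The result can then be wrapped in commas and looked up with instr in the same comma-wrapped list of allowed tags, which is why "b" never accidentally matches "br".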

That’s all for now :)

Useful Regular Expressions in ASP


While working on an ASP ticket system today that required regular expressions, I came up with a couple of useful regular expression patterns that may save people a few hours of thinking time.

Matching and extracting a string

Problem: I have the following chunk of arbitrary text and I want to extract the order number prefixed “ORD_”:

The quick brown fox... ORD_1012345678 ...jumped over the lazy dog

Solution: ORD_[a-zA-Z0-9_-]*

What is going on? Well, quite simply the regular expression engine is being asked to match the three letters “ORD” followed by an underscore “_”. It then accepts a series (*, meaning zero or more) of letters, numbers, underscores or dashes (but nothing else). Therefore, once the regular expression engine has found the order number “ORD_1012345678” and comes to a whitespace, new line, period or whatever - it stops matching.

ASP VBScript Code:

Set regEx = New RegExp
With regEx
	.Pattern = "ORD_[a-zA-Z0-9_-]*"
	.IgnoreCase = true
	.Global = false
End With
set matches = regEx.Execute(text)
if matches.count > 0 then
	result = matches.item(0).value
end if

The string “ORD_1012345678”, extracted from the chunk of text, will be stored in the variable “result”.
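If the text might contain several order numbers, one possible variation is to set Global to true and loop over every match. The function name extractOrderNumbers is made up for this sketch, and it uses + instead of * so that a bare “ORD_” with nothing after it is not counted.

```vbscript
' Sketch: returns every order number found in strText as a
' comma-separated string, or "" if there are none.
function extractOrderNumbers(strText)
	dim regEx, matches, match, result
	set regEx = new RegExp
	with regEx
		.Pattern = "ORD_[a-zA-Z0-9_-]+"
		.IgnoreCase = true
		.Global = true ' keep searching after the first match
	end with
	result = ""
	set matches = regEx.Execute(strText)
	for each match in matches
		if result <> "" then result = result & ","
		result = result & match.value
	next
	extractOrderNumbers = result
end function
```

The comma-separated result can be turned into an array with split() if that is easier to work with.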

A very similar version of string extraction

Problem: I have the following chunk of arbitrary text and I want to extract the ID number in square brackets (prefixed “[#”):

The quick brown fox jumped over the lazy dog [#101234-56789]

Solution: \[#([a-zA-Z0-9_-]*)

What is going on? In a similar way to the first one, this regular expression match pattern is asking for a square bracket followed by a hash “[#” - but because the opening square bracket is a reserved character (used to define sets), we have to escape it with a backslash beforehand. We then surround the series of allowed characters with parentheses ( ), which groups that part of the match as a “sub match”.

ASP VBScript Code:

Set regEx = New RegExp
With regEx
	.Pattern = "\[#([a-zA-Z0-9_-]*)"
	.IgnoreCase = true
	.Global = false
End With
set matches = regEx.Execute(text)
if matches.count > 0 then
	result = matches(0).subMatches(0)
end if

The ID number “101234-56789” will be stored in “result”.

The important difference to note in this code is the use of “subMatches(0)”, which returns the text captured by the first pair of parentheses rather than the whole match.
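As a small extension of the same idea, a pattern can contain more than one pair of parentheses, and each pair gets its own entry in subMatches. The sketch below assumes, purely for illustration, that the ID is always two runs of digits separated by a dash; the function name splitTicketId is hypothetical.

```vbscript
' Sketch: returns the two halves of an ID such as [#101234-56789]
' joined by " / ", or "" if no ID is found.
function splitTicketId(strText)
	dim regEx, matches
	set regEx = new RegExp
	with regEx
		.Pattern = "\[#(\d+)-(\d+)\]"
		.IgnoreCase = true
		.Global = false
	end with
	splitTicketId = ""
	set matches = regEx.Execute(strText)
	if matches.count > 0 then
		' subMatches(0) is the first group, subMatches(1) the second
		splitTicketId = matches(0).subMatches(0) & " / " & matches(0).subMatches(1)
	end if
end function
```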

Stripping HTML tags

This function can be used to strip HTML tags from a string. It is very similar to the PHP function strip_tags(); but this one is not as advanced (yet).

A more advanced version is now available in “Regular Expression: ASP Strip Tags” :)

Let’s just jump straight to the code, you don’t really need to know what is going on (you can probably guess anyway)…

ASP VBScript Code:

function stripTags(strHTML)
	dim regEx
	Set regEx = New RegExp
	With regEx
		.Pattern = "<(.|\n)+?>"
		.IgnoreCase = true
		.Global = true ' replace every tag, not just the first one
	End With
	stripTags = regEx.replace(strHTML, "")
end function

Trimming unwanted whitespace

If you want to trim unwanted whitespace from a string, e.g. turning “Text[space]spaced[space]normally[space][space][space]or[space][space]not?” into “Text[space]spaced[space]normally[space]or[space]not?”, use the following method:

function trimWhitespace(strIn, singleSpacing)
	dim regEx
	Set regEx = New RegExp
	With regEx
		.Pattern = "\s+"
		.IgnoreCase = true
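		.Global = true ' replace every run of whitespace, not just the first
	End With
	dim strReplacement ' "space" is avoided because Space() is a built-in VBScript function
	if singleSpacing then
		strReplacement = " "
	else
		strReplacement = ""
	end if
	trimWhitespace = regEx.replace(strIn, strReplacement)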
end function

When set to false, the second parameter “singleSpacing” will simply remove all whitespace from the string, giving: “Textspacednormallyornot?”

I hope the above examples help someone!

You may find the following websites useful, I certainly did!

ASP Weather Class


I’m currently working on a project that requires live weather data to be displayed on the homepage. Having searched around for a quick ASP script to do the work for me, I was left disappointed.

I ended up writing my own after finding that Yahoo! offer a free developers’ weather service here: http://developer.yahoo.com/weather.

The finished class is shown below - but requires functions “writeCache” and “readCache” which I might post another day. For now, you could just write your own or set “useCache” to false in the class.
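Until those are posted, here is one rough way the two helpers could look. This is only a sketch under assumptions: the signatures are inferred from how the class calls them (readCache takes a name and a maximum age in days and returns a string, writeCache takes a name and a string), and storing the data in the ASP Application object under a "cache_" prefix is my own choice, not necessarily what the original functions do.

```vbscript
' Assumed signature: returns the cached string for cacheName, or ""
' if nothing is cached or the entry is older than maxAgeDays.
Function readCache(cacheName, maxAgeDays)
	Dim entry
	readCache = ""
	entry = Application("cache_" & cacheName)
	' entries are stored as Array(timestamp, value) by writeCache
	If IsArray(entry) Then
		' subtracting two dates gives the age in days
		If CDbl(Now() - entry(0)) < maxAgeDays Then
			readCache = entry(1)
		End If
	End If
End Function

' Assumed signature: stores strValue under cacheName with the current time.
Sub writeCache(cacheName, strValue)
	Application("cache_" & cacheName) = Array(Now(), strValue)
End Sub
```

Because the timestamp and the value are written in a single assignment, the sketch does not call Application.Lock. Anything stored this way is lost when the application restarts, so a file or database backed cache may suit a busy site better.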

Instructions on usage are in the comments.

Enjoy.

'	=============================================================================================================
'	@name			Weather Class
'
'	@author			James Crooke
'					james@fish-media.net
'
'	@copyright		Fish Media Ltd 2006
'
'	@desc 			retrieves latest weather from weather.yahoo.com
'
'	@usage			dim objWeather
'					set objWeather = new weather
'					with objWeather
'						.location = "SPXX0015" 	'(Spain)
'						.celsius = true			'(true uses celsius, false uses fahrenheit)
'						.fetch()
'					end with
'
'	@notes			Find location at http://weather.yahoo.com/regional/EUROPEX.html
'
'	@requires		readCache, writeCache functions
'	=============================================================================================================
	
class weather
	
	private parseError
	public condition
	public forecast
	private p_location
	private p_celsius
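	private useCache
	private objXML, objLst ' declared here so every method shares the objects created in class_initialize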
	
	private sub class_initialize()
		set objXML = Server.CreateObject("Microsoft.XMLDOM")
		set objLst = Server.CreateObject("Microsoft.XMLDOM")
		celsius = false
		useCache = true
	end sub
	
	public property let location(str)
		p_location = str
	end property
	
	public property let celsius(str)
		p_celsius = str
	end property
	
	private sub class_terminate()
		set objXML = nothing
		set objLst = nothing
	end sub
	
	public sub fetch()
		cacheEx = ""
		weatherCache = ""
		url = "http://xml.weather.yahoo.com/forecastrss?p=" & p_location
		if p_celsius then
			url = url & "&u=c"
			cacheEx = "cel"
		end if
	
		if useCache then
			weatherCache = readCache("weather" & p_location & cacheEx, 1) '1 day cache
		end if
	
		if weatherCache = "" then
			sourceXML = getXML(url)
			objXML.async = False
			objXML.loadXML(sourceXML)
			If objXML.parseError.errorCode <> 0 Then
				parseError = true
			else
				condition = getCondition("yweather:condition")
				forecast = getForecast("yweather:forecast")
				if useCache then
					cacheStr = condition(0) & "|" & condition(1) & "|" & condition(2) & "###" & forecast(0) & "|" & forecast(1)
					call writeCache("weather" & p_location & cacheEx, cacheStr)
				end if
			end if
		else
			weatherList = split(weatherCache, "###")
			condition = split(weatherList(0), "|")
			forecast = split(weatherList(1), "|")
		end if
	end sub
	
	private function getCondition(inElem)
		dim result(2)
		Set elemList = objXML.getElementsByTagName(inElem)
		For i=0 To (elemList.length -1)
			result(0) = elemList.item(i).getAttribute("temp")
			result(1) = elemList.item(i).getAttribute("text")
			result(2) = elemList.item(i).getAttribute("code")
		Next
		getCondition = result
	end function
	
	private function getForecast(inElem)
		dim result(1)
		Set elemList = objXML.getElementsByTagName(inElem)
		For i=0 To (elemList.length -1)
			' if the day is today...
			if lcase(elemList.item(i).getAttribute("day")) = lcase(weekdayName(weekday(date),true)) then
				result(0) = elemList.item(i).getAttribute("low")
				result(1) = elemList.item(i).getAttribute("high")
			end if
		Next
		' return the array result
		getForecast = result
	end function
	
	public function getXML(sourceFile)
		Dim xmlhttp
		' ServerXMLHTTP is the MSXML component intended for server-side requests
		Set xmlhttp = Server.CreateObject("MSXML2.ServerXMLHTTP")
		xmlhttp.Open "GET", sourceFile, false
		xmlhttp.Send
		getXML = xmlhttp.ResponseText
		set xmlhttp = nothing
	end function
end class
	

Example Usage:

' (after including the class)...
	
dim objWeather
set objWeather = new weather
with objWeather
	.location = "SPXX0015"
	.celsius = true
	.fetch()
end with
	
response.write("<ul>")
response.write("<li>Temperature: " & objWeather.condition(0) & "</li>")
response.write("<li>Condition: " & objWeather.condition(1) & "</li>")
response.write("<li>Low: " & objWeather.forecast(0) & "</li>")
response.write("<li>High: " & objWeather.forecast(1) & "</li>")
response.write("</ul>")

Download “ASP Weather Class” here.

PHP Date Function


The person that wrote the PHP date() function is a genius. I use it in almost every PHP application I write and it is so simple to work with. I have recently completed work on a booking calendar for www.tenerife-property-rentals.co.uk - which allows users to book any number of dates from an availability calendar. If it wasn’t for mktime() and date(), it would have taken a long long time!

If anyone has an ASP alternative, please let me know.
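One rough starting point, and certainly not a full port, might look like the sketch below. The function name phpDate is hypothetical, and it only understands six of PHP’s format letters (Y, m, d, H, i, s); every other character is copied through unchanged. For the mktime() side, VBScript’s built-in DateSerial and TimeSerial can build a date from its parts.

```vbscript
' Sketch: formats the date dt using a small subset of PHP's date() letters.
Function phpDate(fmt, dt)
	Dim i, ch, out
	out = ""
	For i = 1 To Len(fmt)
		ch = Mid(fmt, i, 1)
		Select Case ch
			Case "Y"
				out = out & Year(dt)
			Case "m"
				out = out & Right("0" & Month(dt), 2)
			Case "d"
				out = out & Right("0" & Day(dt), 2)
			Case "H"
				out = out & Right("0" & Hour(dt), 2)
			Case "i"
				out = out & Right("0" & Minute(dt), 2)
			Case "s"
				out = out & Right("0" & Second(dt), 2)
			Case Else
				' unsupported letters and punctuation pass through as-is
				out = out & ch
		End Select
	Next
	phpDate = out
End Function
```

For example, phpDate("Y-m-d", DateSerial(2008, 3, 9)) gives “2008-03-09”.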

ASP, it’s not all bad.


Joining Fish Media meant getting to grips with ASP at a commercial level. As some of you know, I wasn’t the greatest fan of ASP (given the other choices available: PHP/JSP). I felt ASP lacked the control, the speed, the complexity and depth of PHP. The fact it was developed by Microsoft and closed source also put me off. It seems I was wrong, in most cases.

In actual fact, ASP 3.0 isn’t all that bad. Perhaps it’s because I am a better programmer than I was 3 years ago…

My first large scale ASP project was to write a (re-usable) e-commerce web site with all sorts of advanced features. I have written it entirely in ASP, using MySQL v5.0 as a backend datasource. The best thing about the application is that it’s built with search engine optimisation in mind - one thing that 99% of e-commerce sites fail to achieve; even Amazon - Jakob take note.

I researched quite extensively on ASP URL re-writing techniques - where PHP has its Apache mod_rewrite tricks, ASP/IIS seems to be lacking big time. Never fear, I came up with my own solution © CJ ASP SEO-Friendly URL 2006! Other SEO features include 1) making the site accessible to search engines (rule number 1), 2) structuring the site correctly, 3) usability, 4)…. I can’t give all my secrets away, jesus.

Technical geeks read on… ASP is such a basic language (and by language I really mean VBScript); it’s when you start writing spaghetti code that it slows down. My tip of the day is to write clear, minimalistic code that gets to the point. Code that is called more than once should be put in a function and re-used effectively! Only one database connection and one query need to be open at any one time (unless you are doing nested querying, that is) - make them global to your procedures/functions and your pages will load so much quicker, I guarantee.

This is a great page that will show you how you can open/close/rewind/skip through/re-query/cancel/resync a query and much more: http://www.w3schools.com/ado/ado_ref_recordset.asp. I bet you didn’t even know you could do half of that with ADO Recordsets.


© James Crooke 2000-2008