Regular Expression: ASP Strip Tags


This week, I have been mostly coding MAZIN. You will find out what that means when listening to Ram FM soon.

As part of the project, I had to create a page with a WYSIWYG editor (also known as a rich text editor) that would allow users to compose copy that may or may not include simple HTML tags such as bold, italic, lists, breaks and paragraphs.

As all sites we develop these days are XHTML based and standards compliant, I found that FCKEditor was the best choice - even though it is what Rich would call “Bloatware”, i.e. it’s ridiculously large in terms of directories, language files, etc. That reminds me, I need to go through it and delete all the unwanted languages and plug-in scripts before going live.

The problem with FCKEditor is that it will still allow users to post HTML that is not allowed, even if you have told FCKEditor that you only want users to be able to make text bold, italic or whatever. This means that when the form is submitted, we need a strip_tags function on the server, like the one PHP has.

Haven’t you already posted an ASP one!? I hear you ask. Well, I did. But the PHP version of strip_tags allows you to specify which tags you want to remain: www.php.net/strip_tags

After some research, it seems that nobody has come up with an ASP version of this sweet function, so I wrote my own, et voilà:

'	=============================================================================================================
'	@name		stripHTML
'	@desc 		strips all HTML from code except for tags separated by commas in the param "allowedTags"
'	@returns	string
'	=============================================================================================================
function stripHTML(strHTML, allowedTags)
	
	dim objRegExp, strOutput, strAllowed, matches, match, tagName
	set objRegExp = new regexp
	
	strOutput = strHTML
	' build ",b,i,p," in a local variable so the caller's argument is left untouched
	' (VBScript passes arguments ByRef by default)
	strAllowed = "," & lcase(replace(allowedTags, " ", "")) & ","
	
	objRegExp.IgnoreCase = true
	objRegExp.Global = true
	objRegExp.MultiLine = true
	objRegExp.Pattern = "<(.|\n)+?>" ' match all tags, even XHTML ones
	set matches = objRegExp.execute(strHTML)
	objRegExp.Pattern = "<(/?)(\w+)[^>]*>" ' $2 captures the tag name
	for each match in matches
		tagName = objRegExp.Replace(match.value, "$2")
		tagName = "," & lcase(tagName) & ","
	
		' remove the tag unless its name is in the allowed list
		if instr(strAllowed, tagName) = 0 then
			strOutput = replace(strOutput, match.value, "")
		end if
	next
	
	stripHTML = strOutput    'Return the value of strOutput
	set objRegExp = nothing
end function
	

Usage is simple, just do:

html = stripHTML(html, "b,i,strong,em,p,br")

Where b, i, strong, em, p and br are the tags you are allowing.
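
For anyone who wants to try the allow-list idea quickly outside ASP, here is a minimal JavaScript sketch of the same approach. It is not a line-for-line port of the function above: the name stripHtml and the sample string are made up for illustration, and it only looks at things that start like a tag name, so comments and doctypes are left alone.

```javascript
// Hypothetical JavaScript sketch of allow-list tag stripping (illustrative names).
function stripHtml(html, allowedTags) {
  // Normalise the allow-list: "b, P" -> ["b", "p"]
  const allowed = allowedTags
    .toLowerCase()
    .split(",")
    .map((tag) => tag.trim())
    .filter((tag) => tag.length > 0);

  // Match each opening or closing tag and capture its name,
  // then keep the tag only if the name is on the allow-list.
  return html.replace(/<\/?(\w+)[^>]*>/g, (tag, name) =>
    allowed.includes(name.toLowerCase()) ? tag : ""
  );
}

const sample = "<p>Hello <b>bold</b> <script>alert(1)</script><br /></p>";
console.log(stripHtml(sample, "b,p,br"));
// <p>Hello <b>bold</b> alert(1)<br /></p>
```

Note that, as with the ASP function, only the tags themselves are removed: the text between a stripped pair of tags stays, and attributes on allowed tags are kept as they are.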

That’s all for now :)

Useful Regular Expressions in ASP


While working on an ASP ticket system today that required regular expressions, I came up with a couple of useful regular expression patterns that may save people a few hours of thinking time.

Matching and extracting a string

Problem: I have the following chunk of arbitrary text and I want to extract the order number prefixed “ORD_”:

The quick brown fox... ORD_1012345678 ...jumped over the lazy dog

Solution: ORD_[a-zA-Z0-9_-]*

What is going on? Quite simply, the regular expression engine is being asked to match the three letters “ORD” followed by an underscore “_”. It then accepts a series (*, meaning zero or more) of letters, numbers, underscores or dashes, but nothing else. So once the engine has matched the order number “ORD_1012345678” and reaches whitespace, a new line, a period or any other character outside the set, it stops matching.

ASP VBScript Code:

Set regEx = New RegExp
With regEx
	.Pattern = "ORD_[a-zA-Z0-9_-]*"
	.IgnoreCase = true
	.Global = false
End With
set matches = regEx.Execute(text)
if matches.count > 0 then
	result = matches.item(0).value
end if

The string “ORD_1012345678”, extracted from the chunk of text, will be stored in the variable “result”.
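
The same pattern can be tried in JavaScript, whose regular expression syntax is very close to the one VBScript's RegExp object uses. This is only a sketch; the function name extractOrderNumber is an assumption for illustration.

```javascript
// Returns the first order number found in the text, or null if there is none.
// The "i" flag mirrors IgnoreCase = true; leaving out "g" mirrors Global = false.
function extractOrderNumber(text) {
  const match = text.match(/ORD_[a-zA-Z0-9_-]*/i);
  return match ? match[0] : null;
}

const orderText = "The quick brown fox... ORD_1012345678 ...jumped over the lazy dog";
console.log(extractOrderNumber(orderText)); // ORD_1012345678
```

Because * also allows zero characters, a bare “ORD_” on its own would match too; swapping * for + would require at least one character after the underscore.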

A very similar version of string extraction

Problem: I have the following chunk of arbitrary text and I want to extract the ID number in square brackets (prefixed “[#”):

The quick brown fox jumped over the lazy dog [#101234-56789]

Solution: \[#([a-zA-Z0-9_-]*)

What is going on? In a similar way to the first one, this pattern asks for an opening square bracket followed by a hash “[#” - but because the opening square bracket is a reserved character (used to define sets), we have to escape it with a backslash beforehand. We then surround the series of allowed characters with parentheses ( ), which groups that part of the match as a “sub match”.

ASP VBScript Code:

Set regEx = New RegExp
With regEx
	.Pattern = "\[#([a-zA-Z0-9_-]*)"
	.IgnoreCase = true
	.Global = false
End With
set matches = regEx.Execute(text)
if matches.count > 0 then
	result = matches(0).subMatches(0)
end if

The ID number “101234-56789” will be stored in “result”.

The important difference to note in this code is the use of “subMatches(0)”, which returns the text captured by the first pair of parentheses rather than the whole match.
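
As a rough JavaScript equivalent (a sketch only, with extractTicketId being a made-up name), the captured group is read from index 1 of the match, because index 0 holds the whole match including the “[#” prefix. That index 1 plays the role of subMatches(0) in the VBScript version.

```javascript
// Returns the ID inside "[#...]" without the "[#" prefix, or null if not found.
function extractTicketId(text) {
  const match = /\[#([a-zA-Z0-9_-]*)/i.exec(text);
  // match[0] is the whole match ("[#101234-56789"), match[1] is the first group
  return match ? match[1] : null;
}

const ticketText = "The quick brown fox jumped over the lazy dog [#101234-56789]";
console.log(extractTicketId(ticketText)); // 101234-56789
```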

Stripping HTML tags

This function can be used to strip HTML tags from a string. It is very similar to the PHP function strip_tags(); but this one is not as advanced (yet).

A more advanced version is now available here :)

Let’s just jump straight to the code, you don’t really need to know what is going on (you can probably guess anyway)…

ASP VBScript Code:

function stripTags(strHTML)
	dim regEx
	Set regEx = New RegExp
	With regEx
		.Pattern = "<(.|\n)+?>" ' any tag, including ones that span lines
		.IgnoreCase = true
		.Global = true ' replace every tag, not just the first one
	End With
	stripTags = regEx.replace(strHTML, "")
end function

Trimming unwanted whitespace

If you want to trim unwanted whitespace from a string, e.g. turning “Text[space]spaced[space]normally[space][space][space]or[space][space]not?” into “Text[space]spaced[space]normally[space]or[space]not?”, use the following method:

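function trimWhitespace(strIn, singleSpacing)
	dim regEx, strReplacement
	Set regEx = New RegExp
	With regEx
		.Pattern = "\s+"
		.IgnoreCase = true
		.Global = true ' replace every run of whitespace, not just the first
	End With
	' avoid "space" as a variable name: Space() is a built-in VBScript function
	if singleSpacing then
		strReplacement = " "
	else
		strReplacement = ""
	end if
	trimWhitespace = regEx.replace(strIn, strReplacement)
end function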

When set to false, the second parameter “singleSpacing” will simply remove all whitespace from the string, giving: “Textspacednormallyornot?”
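
Both modes can be checked with a small JavaScript sketch of the same idea (the function here is an illustrative equivalent, not the ASP code itself). The g flag is what makes every run of whitespace get replaced rather than only the first one.

```javascript
// Collapses each run of whitespace to a single space, or removes it entirely.
function trimWhitespace(input, singleSpacing) {
  const replacement = singleSpacing ? " " : "";
  return input.replace(/\s+/g, replacement);
}

const spaced = "Text spaced normally   or  not?";
console.log(trimWhitespace(spaced, true));  // Text spaced normally or not?
console.log(trimWhitespace(spaced, false)); // Textspacednormallyornot?
```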

I hope the above examples help someone!

You may find the following websites useful, I certainly did!

UK Companies Act 2007


Limited companies in the UK must include certain regulatory information on their websites and in their email footers before 1st January 2007 or they will breach the Companies Act and risk a fine.

Whether in hard copy, electronic or any other form:

A company must state its name, in legible lettering, on the following -

  • all the company’s business letters;
  • all its notices and other official publications;
  • all bills of exchange, promissory notes, endorsements, cheques and orders for money or goods purporting to be signed by, or on behalf of, the company;
  • all its bills of parcels, invoices, receipts and letters of credit
  • on all its websites

On all of its business letters, order forms or any of the company’s web sites, the company must show in legible lettering:

  • its place of registration
  • registered number
  • its registered office address
  • and if it is being wound up, that fact,

Whenever an email is used where its paper equivalent would be caught by the stationery requirements then that email is also subject to the requirements

The above also applies to Limited Liability Partnerships.

Above information taken from http://www.companieshouse.gov.uk/promotional/busStationery.shtml

This regulation seems to have come out of nowhere - but it is true, no wind-up or early April fools. The important part to note for webmasters is:

For websites, contrary to the fears of some, the specified information does not need to appear on every page. Again, many websites will already list the required information, perhaps on their ‘About us’ or ‘Legal info’ pages.

As quoted here: http://www.out-law.com/page-7594

Get your site updated!

ASP Weather Class


I’m currently working on a project that requires live weather data to be displayed on the homepage. Having searched around for a quick ASP script to do the work for me, I was left disappointed.

I ended up writing my own after finding that Yahoo! offer a free developers’ weather service here: http://developer.yahoo.com/weather.

The finished class is shown below - but requires functions “writeCache” and “readCache” which I might post another day. For now, you could just write your own or set “useCache” to false in the class.

Instructions on usage are in the comments.

Enjoy.

'	=============================================================================================================
'	@name			Weather Class
'
'	@author			James Crooke
'					james@fish-media.net
'
'	@copyright		Fish Media Ltd 2006
'
'	@desc 			retrieves latest weather from weather.yahoo.com
'
'	@usage			dim objWeather
'					set objWeather = new weather
'					with objWeather
'						.location = "SPXX0015" 	'(Spain)
'						.celsius = true			'(true uses celsius, false uses fahrenheit)
'						.fetch()
'					end with
'
'	@notes			Find location at http://weather.yahoo.com/regional/EUROPEX.html
'
'	@requires		readCache, writeCache functions
'	=============================================================================================================
	
class weather
	
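	private parseError
	public condition
	public forecast
	private p_location
	private p_celsius
	private useCache
	private objXML, objLst	' declared at class level so every method shares the objects created in class_initialize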
	
	private sub class_initialize()
		set objXML = Server.CreateObject("Microsoft.XMLDOM")
		set objLst = Server.CreateObject("Microsoft.XMLDOM")
		celsius = false
		useCache = true
	end sub
	
	public property let location(str)
		p_location = str
	end property
	
	public property let celsius(str)
		p_celsius = str
	end property
	
	private sub class_terminate()
		set objXML = nothing
		set objLst = nothing
	end sub
	
	public sub fetch()
		cacheEx = ""
		weatherCache = ""
		url = "http://xml.weather.yahoo.com/forecastrss?p=" & p_location
		if p_celsius then
			url = url & "&u=c"
			cacheEx = "cel"
		end if
	
		if useCache then
			weatherCache = readCache("weather" & p_location & cacheEx, 1) '1 day cache
		end if
	
		if weatherCache = "" then
			sourceXML = getXML(url)
			objXML.async = False
			objXML.loadXML(sourceXML)
			If objXML.parseError.errorCode <> 0 Then
				parseError = true
			else
				condition = getCondition("yweather:condition")
				forecast = getForecast("yweather:forecast")
				if useCache then
					cacheStr = condition(0) & "|" & condition(1) & "|" & condition(2) & "###" & forecast(0) & "|" & forecast(1)
					call writeCache("weather" & p_location & cacheEx, cacheStr)
				end if
			end if
		else
			weatherList = split(weatherCache, "###")
			condition = split(weatherList(0), "|")
			forecast = split(weatherList(1), "|")
		end if
	end sub
	
	private function getCondition(inElem)
		dim result(2)
		Set elemList = objXML.getElementsByTagName(inElem)
		For i=0 To (elemList.length -1)
			result(0) = elemList.item(i).getAttribute("temp")
			result(1) = elemList.item(i).getAttribute("text")
			result(2) = elemList.item(i).getAttribute("code")
		Next
		getCondition = result
	end function
	
	private function getForecast(inElem)
		dim result(1)
		Set elemList = objXML.getElementsByTagName(inElem)
		For i=0 To (elemList.length -1)
			' if the day is today...
			if lcase(elemList.item(i).getAttribute("day")) = lcase(weekdayName(weekday(date),true)) then
				result(0) = elemList.item(i).getAttribute("low")
				result(1) = elemList.item(i).getAttribute("high")
			end if
		Next
		' return the array result
		getForecast = result
	end function
	
	public function getXML(sourceFile)
		dim styleFile
		dim xmlDoc
	
		Dim xmlhttp
		Set xmlhttp = Server.CreateObject("Microsoft.XMLHTTP")
		xmlhttp.Open "GET", sourceFile, false
		xmlhttp.Send
		getXML = xmlhttp.ResponseText
		set xmlhttp = nothing
	end function
end class
	

Example Usage:

' (after including the class)...
	
dim objWeather
set objWeather = new weather
with objWeather
	.location = "SPXX0015"
	.celsius = true
	.fetch()
end with
	
response.write("<ul>")
response.write("<li>Temperature: " & objWeather.condition(0) & "</li>")
response.write("<li>Condition: " & objWeather.condition(1) & "</li>")
response.write("<li>Low: " & objWeather.forecast(0) & "</li>")
response.write("<li>High: " & objWeather.forecast(1) & "</li>")
response.write("</ul>")

Download “ASP Weather Class” here.

Cron for Windows IIS


First, a lesson in the real implementation of cron

The crontab command, found in Unix operating systems, is used to schedule commands to be executed periodically. It collects commands into a file, also known as a “crontab”, which the cron daemon later reads in order to carry out the instructions contained within.

One of the most useful things you can do with cron, particularly from a web developer’s point of view, is call up a webpage silently, in the background, at intervals or at a specified time. Think of the possibilities:

  • Running a routine check on users in a database (deleting inactive accounts for example)
  • Scheduling an email, several days after a user has purchased an item from a site
  • Checking a site is functioning correctly

So how can you get Windows to perform cron-like tasks?

It’s very simple once you know how - which I do, you’ll be happy to learn. You have probably found that several web sites offer programs that cost around $100 that promise to emulate cron. Don’t bother; especially if all you want to do is call up a web address.

This is how you do it…

  1. Download wget (GnuWin32 is a good, clean, harmless one)
  2. Install wget
  3. Fire up the command line (Start > Run > then type “cmd”)
  4. Run a command as shown below:

schtasks /create /tn "Test Cron" /tr "C:\wget.exe http://www.site.com/cron.asp" /sc minute /mo 5 /ru "System"

Where

“Test Cron” is the name of your cron job (useful for reference purposes)
“C:\wget.exe” is the location of the wget.exe file
“http://www.site.com/cron.asp” is the URL for wget to fetch

More Examples

Let’s take a look at some more examples, that way you will understand the different options available…

Run a cron job every week

schtasks /create /tn "Test Cron" /tr "C:\wget.exe http://www.site.com/cron.asp" /sc weekly /ru "System"

Run a cron every week on a Friday

schtasks /create /tn "Test Cron" /tr "C:\wget.exe http://www.site.com/cron.asp" /sc weekly /d FRI /ru "System"

Run a cron every Friday at 11:00am

schtasks /create /tn "Test Cron" /tr "C:\wget.exe http://www.site.com/cron.asp" /sc weekly /st 11:00:00 /d FRI /ru "System"

Run a cron every day at 17:00 (5pm)

schtasks /create /tn "Test Cron" /tr "C:\wget.exe http://www.site.com/cron.asp" /sc daily /st 17:00:00 /ru "System"

Run a cron every hour, on the hour

schtasks /create /tn "Test Cron" /tr "C:\wget.exe http://www.site.com/cron.asp" /sc hourly /st 00:00:00 /ru "System"

So what’s happening exactly?

Well, all you are doing is utilising Windows’ “Scheduled Tasks” program - using a command line interface, which I find easier to use. If you have installed wget and run one of the commands above, you will be able to see your cron in action by going to “Start > Programs > Accessories > System Tools > Scheduled Tasks”.

And of course, you’re not just limited to fetching URLs (using wget) - you can also execute a whole load of other programs - providing you know the right commands (or parameters, or both)!

The best thing about this Windows Cron method is you don’t need to have access to your web server for it to work - just any PC that is connected to the Internet (note: it needs to be connected when the Cron is scheduled to “fire”!). It’s also free and, because it runs as standard in Windows, it doesn’t use up any additional memory resources.

I hope this has helped you out! :)

The alternative of course is to host on a linux server, like this great host offers

Web Standards Solutions


Today my book arrived; Web Standards Solutions by Dan Cederholm.

I’ve read this book already (University was good for something at least - a well stocked library!), but I wanted to get a copy of it. I managed to get a copy for around £10 from Amazon Marketplace - and believe me, it’s worth every penny.

I’ve read dozens of books on usability, accessibility and web standards and have to say this is the best, from a technical perspective. Dan gives solutions to common CSS problems and discusses methods of overcoming them - giving the pros and cons of each.

It’s the kind of book I would like to write - I would call mine “position:relative or height:1% usually fixes IE”.

Submit button cursors


The other day Colin made a valid point about input button cursors. As most of us know, by default they are arrow cursors - but he said they should be pointing fingers, like hyperlinks are. I told him that changing them goes against web conventions (even Google uses arrows I said) and “Krug says” to stick to conventions as we all know. Stu then lifted up the cover of Don’t Make Me Think with a big finger hovering over a submit button. “pwn3d” comes to mind.

So I gave in to the pressure and decided to fix the usability issue, which I agree should be addressed. The fix is simple - or could be. If CSS had advanced to version 2 in all modern browsers (in other words, if IE supported CSS2 attribute selectors), you could simply apply the following CSS rule:

input[type="submit"] { cursor:pointer;}

Alas, you have to end up doing this (in HTML):

<input type="submit" class="submitbutton" value="Whatever" />

and this (in CSS):

.submitbutton{ cursor:pointer;}

You could also do a nifty JavaScript to do it for you. Something like this would work:

function cursorMySubmits()
{
	var submitButtons = document.getElementsByTagName("input");
	for (var i=0; i<submitButtons.length; i++)
	{
		if(submitButtons[i].type == "submit")
		{
			submitButtons[i].style.cursor = "pointer";
		}
	}
}
window.onload = cursorMySubmits;

It should be noted that using JavaScript is considered bad for accessibility reasons - so only use the above method if it’s too late to use CSS.

This is the above “Submit button finger-pointer cursor” script in action.

I should write a technical book on addressing these kinds of usability issues - there are many more, just ask Colin - our in-house usability Guru.

Browser Stats (November ‘06)


With the recent release of IE7 and Firefox 2.0, I wondered how this would affect browser statistics. The following statistics table is taken from webreference.com, and is therefore probably biased by their “developer” visitor base, who tend to upgrade software more often. That said, it still makes for interesting reading:

Tue Nov 21 23:50:07 EST 2006
Unique Visitors:    41989
	
Major Versions               Count               Share(%)
---------------------------------------------------------
OP 3                             1                   0.00
OP 5                             7                   0.02
OP 6                             3                   0.01
OP 7                            13                   0.03
OP 8                            73                   0.17
OP 9                           523                   1.25
OP ALL                         620                   1.48
SF 0                             3                   0.01
SF 100                           2                   0.00
SF 125                           6                   0.01
SF 312                         126                   0.30
SF 412                          21                   0.05
SF 416                           7                   0.02
SF 417                          41                   0.10
SF 419                         723                   1.72
SF 420                           7                   0.02
SF 521                           3                   0.01
SF 85                            5                   0.01
SF 94                            1                   0.00
SF ALL                         945                   2.25
NS 2                             1                   0.00
NS 3                             2                   0.00
NS 4                           205                   0.49
NS 5                          1906                   4.54
NS 6                             2                   0.00
NS ALL                        2116                   5.04
FF 0                             1                   0.00
FF 0.1                           1                   0.00
FF 0.10                          5                   0.01
FF 0.10.1                       10                   0.02
FF 0.8                           4                   0.01
FF 0.9                           1                   0.00
FF 0.9.1                         1                   0.00
FF 0.9.2                         1                   0.00
FF 0.9.3                         4                   0.01
FF 0.9.6                         2                   0.00
FF 1.0                          69                   0.16
FF 1.0.1                        13                   0.03
FF 1.0.2                         9                   0.02
FF 1.0.3                         8                   0.02
FF 1.0.4                        72                   0.17
FF 1.0.5                         3                   0.01
FF 1.0.6                        88                   0.21
FF 1.0.7                       294                   0.70
FF 1.0.8                        23                   0.05
FF 1.4                           2                   0.00
FF 1.4.1                         4                   0.01
FF 1.5                          72                   0.17
FF 1.5.0.1                      80                   0.19
FF 1.5.0.2                      26                   0.06
FF 1.5.0.3                      74                   0.18
FF 1.5.0.4                     113                   0.27
FF 1.5.0.5                      52                   0.12
FF 1.5.0.6                     172                   0.41
FF 1.5.0.7                     499                   1.19
FF 1.5.0.8                    4383                  10.44
FF 1.6                           2                   0.00
FF 2.0                        5214                  12.42
FF 2.9.0.1                       1                   0.00
FF 3.0                           1                   0.00
FF ALL                       11304                  26.92
IE 3                             1                   0.00
IE 4                            27                   0.06
IE 5                           679                   1.62
IE 6                         15819                  37.67
IE 7                          3132                   7.46
IE ALL                       19658                  46.82
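
The Share(%) column appears to be each count divided by the unique visitor total (that is an assumption on my part, but it matches the figures in the table). A quick JavaScript check of a few rows, with sharePercent being a made-up helper name:

```javascript
// Share of visitors as a percentage string with two decimals.
// Assumes share = count / unique visitors, which is inferred from the table.
function sharePercent(count, total) {
  return ((count / total) * 100).toFixed(2);
}

const uniqueVisitors = 41989;
console.log(sharePercent(19658, uniqueVisitors)); // IE ALL -> 46.82
console.log(sharePercent(11304, uniqueVisitors)); // FF ALL -> 26.92
console.log(sharePercent(3132, uniqueVisitors));  // IE 7   -> 7.46
```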

© James Crooke 2000-2008