Regular Expression (Tcl)


deathadder
11-19-2008, 01:22 PM
Hi,

I'm having a bit of trouble with a regular expression I'm trying to write and I'm not sure if it's something Tcl specific or my lack of regexp understanding.

set a1 "3|14"

if [
regexp { (3\|(0|[5-12])) } $a1 match
] then {
puts "MATCH"
}
I get a number of strings passed to a proc in the format 3|x where x is a number, either 0 or within the range 5-12. My understanding is that the regexp will match the literal '3' followed by a '|' (the \ escapes the special meaning of |), and then either 0 or, because of the unescaped |, a number within the range 5-12.

However I'm getting the error 'couldn't compile regular expression pattern: invalid character range'. Anyone got any hints on how to fix this problem? I'm googling at the moment, but regexp is something I'm not all that great with so any advice / pointers would be good.

TIA

bwkaz
11-20-2008, 01:47 AM
The issue is that [5-12] does not mean "all integers from 5 to 12". It means "any single character: either a character in the range from 5 to 1, or the character 2". And "the range 5 to 1" is invalid; the second character in a range can't sort before the first character. (So 1-5 would be valid from the regex syntax point of view. Of course it still wouldn't do what you want though.) Regexes don't understand integers; they're pure string processors. :)
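To see both points for yourself, here is a small sketch (the test strings are just made up for illustration). It catches the compile error from [5-12] and then shows that [1-5], while valid, only ever matches a single character:

```tcl
# [5-12] is parsed as the range 5-1 plus the single character 2,
# so the pattern fails to compile; catch reports the error.
if {[catch {regexp {[5-12]} "7"} err]} {
    puts "error: $err"
}

# [1-5] compiles, but it matches exactly one character from 1 to 5.
puts [regexp {^[1-5]$} "3"]    ;# 1: a single character in the range
puts [regexp {^[1-5]$} "12"]   ;# 0: two characters, so no match
```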

If you want to match 5-12, you have to do something like ([5-9]|1[0-2]), to match either 5 through 9, or 10 through 12. (Split it up into digit strings.)
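Another option, since regexes only see strings, is to let the regex check the shape and do the range test with ordinary arithmetic. This is only a sketch: the proc name validCode is made up, and it assumes values with leading zeros (like 05) should be rejected.

```tcl
# Hypothetical helper: accept "3|x" where x is 0 or in the range 5..12.
proc validCode {s} {
    # Capture the digits after "3|"; the pattern rejects leading zeros.
    if {![regexp {^3\|(0|[1-9][0-9]*)$} $s -> n]} {
        return 0
    }
    # The range test is plain integer comparison, not regex.
    return [expr {$n == 0 || ($n >= 5 && $n <= 12)}]
}

foreach s {3|0 3|5 3|12 3|13 3|05} {
    puts "$s -> [validCode $s]"
}
```

The advantage is that changing the allowed range later only means changing two numbers rather than reworking the digit patterns.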

deathadder
11-20-2008, 05:16 AM
Ah, that makes a bit more sense now :) Well I've changed it to the below now, it's not quite right but I'll post back once it is!

lappend alist "3|0" "3|5" "3|6" "3|7" "3|8" "3|9" "3|10" "3|11" "3|12" "3|13" "3|04" "3|05"

foreach x $alist {
    if {[regexp {(3\|(0|[5-9]|1[0-2]))} $x match]} {
        puts "$x matches $x"
    } else {
        puts "$x does not match"
    }
}


[EDIT]
Well it wasn't working because the pattern matched the 3|0 at the start of 3|05, which isn't what I wanted. So I found out that ^ anchors the match to the beginning of the string, and $ anchors it to the end. So I've changed my code to the below and it works fine.

ajefferi@arkab:~$ cat test.tcl
lappend alist "3|0" "3|5" "3|6" "3|7" "3|8" "3|9" "3|10" "3|11" "3|12" "3|13" "3|04" "3|05"

foreach x $alist {
    if {[regexp {(^3\|0$)|(^3\|([5-9]|1[0-2])$)} $x match]} {
        puts "$x matches $x"
    } else {
        puts "$x does not match"
    }
}
ajefferi@arkab:~$ tclsh test.tcl
3|0 matches 3|0
3|5 matches 3|5
3|6 matches 3|6
3|7 matches 3|7
3|8 matches 3|8
3|9 matches 3|9
3|10 matches 3|10
3|11 matches 3|11
3|12 matches 3|12
3|13 does not match
3|04 does not match
3|05 does not match
ajefferi@arkab:~$
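The anchored pattern can probably be shortened a little, since the 0 case and the 5-12 case share the same 3| prefix. A sketch of that, with one pair of anchors around the whole thing (the extra num variable is just there to show the captured number, it isn't needed for the test):

```tcl
set alist {3|0 3|5 3|9 3|10 3|12 3|13 3|04 3|05}

foreach x $alist {
    # One ^...$ pair wraps the whole pattern; the group holds the alternatives.
    if {[regexp {^3\|(0|[5-9]|1[0-2])$} $x match num]} {
        puts "$x matches (number: $num)"
    } else {
        puts "$x does not match"
    }
}
```

This should accept and reject the same strings as the two-branch version.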

By the way I found a couple of sites that someone might find interesting...

http://osteele.com/tools/rework/ Lets you try out regexps on the fly. You enter some text and then enter your regexp and it tells you what matches what. It updates as you type, which I found helpful.

http://www.regular-expressions.info/ Some great info on regular expressions and stuff about using them in different languages too.

bwkaz
11-22-2008, 01:19 AM
The one possible issue with random regular-expression-testing sites is that there are several different, incompatible versions of regular expressions. (Figures, right? :p) There are also a couple of different, incompatible ways to write a regular expression (e.g. in Tcl or Perl, you need to escape paren characters to match them literally; in vim, and in sed's default syntax, you need to escape them before they'll act as grouping operators).

Most of the differences are minor (as in: affect only a few types of regular expressions), but beware that they do exist...
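For what it's worth, here is how the paren behaviour looks on the Tcl side, as a quick sketch with made-up test strings. Other tools may treat the same pattern differently, so it's worth checking in the tool you'll actually run it in:

```tcl
# In Tcl's default regexp syntax, bare parentheses group and capture.
regexp {(ab)+} "xababy" whole group
puts "$whole $group"               ;# abab ab

# Escaped parentheses match literal paren characters instead.
puts [regexp {\(ab\)} "x(ab)y"]    ;# 1
puts [regexp {\(ab\)} "xaby"]      ;# 0
```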