PtokaX forum

Development Section => HOW-TO's => Topic started by: plop on 17 December, 2003, 02:09:28

Title: HOW-TO: pattern matching
Post by: plop on 17 December, 2003, 02:09:28
. all characters
%a letters
%c control characters
%d digits
%l lower case letters
%p punctuation characters
%s space characters
%u upper case letters
%w alphanumeric characters
%x hexadecimal digits
%z the character with representation 0
%b<> matches a balanced pair of < > together with everything between them; the < and > can be replaced by any two different characters

An upper case version of any of the % classes above represents the opposite of the class.
For instance, %A represents all non-letter characters.
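
A minimal sketch of a class and its upper case opposite. It assumes the Lua 5.x names string.find and string.gsub (in Lua 4 the same calls are strfind and gsub), and the sample text is made up for illustration.

```lua
-- Assumes Lua 5.x; in Lua 4 use strfind / gsub instead.
local sample = "Hub 42: ok!"   -- made-up sample text

-- %d matches a single digit; find returns where the first one starts and ends.
local s, e = string.find(sample, "%d")
print(s, e)            --> 5   5

-- %A is the opposite of %a, so every non-letter is removed here.
local letters_only = string.gsub(sample, "%A", "")
print(letters_only)    --> Hubok
```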


Except for %b, each of these matches exactly 1 character. If you want more, follow the class with one of these:

+ 1 or more repetitions, longest match possible (no match if there are none)
* 0 or more repetitions, longest match possible (matches "" on 0 repetitions)
- also 0 or more repetitions, but the shortest match possible (matches "" on 0 repetitions)
? optional, 0 or 1 repetition (matches "" on 0 repetitions)
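
A small sketch of how the repetition characters behave, assuming Lua 5.x (string.find; Lua 4 spells it strfind). The chat line is made up. It shows that * takes the longest match, - the shortest, and that + fails where * still succeeds.

```lua
-- Assumes Lua 5.x; in Lua 4 use strfind instead of string.find.
local line = "<plop> hello <kepp> hi"   -- made-up chat line

-- * is greedy: .* runs on to the LAST ">" in the line.
local _, _, greedy = string.find(line, "<(.*)>")
-- - is lazy: .- stops at the FIRST ">".
local _, _, lazy = string.find(line, "<(.-)>")
print(greedy)   --> plop> hello <kepp
print(lazy)     --> plop

-- + needs at least one digit, so with none the whole search returns nil.
local no_digits = string.find("abc", "%d+")
print(no_digits)     --> nil

-- * is satisfied by zero digits, so it succeeds and captures the empty string.
local _, _, empty = string.find("abc", "(%d*)")
print(empty == "")   --> true
```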


Some characters, called magic characters, have special meanings when used in a pattern.
The magic characters are:

( ) . % + - * ? [ ^ $

The character % works as an escape for those magic characters.
So, %. matches a dot, and %% matches the % itself.
You can use the escape % not only for the magic characters, but for any non-alphanumeric character.
When in doubt, play safe and put an escape.
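
To see why the escape matters, here is a short sketch (Lua 5.x names assumed, sample strings made up) comparing an unescaped dot with %. and finding a literal % with %%.

```lua
-- Assumes Lua 5.x; in Lua 4 use strfind instead of string.find.
local text = "speed: 100% at v1.5"   -- made-up sample text

-- An unescaped "." matches any character, so "1.5" also matches "1x5".
local loose_s, loose_e = string.find("v1x5", "1.5")
print(loose_s, loose_e)   --> 2   4

-- "%." only matches a real dot, so "1x5" is no longer accepted.
local strict_s = string.find("v1x5", "1%.5")
print(strict_s)           --> nil

-- The escaped forms find the literal dot and the literal percent sign.
local dot_s, dot_e = string.find(text, "1%.5")
local pct_s, pct_e = string.find(text, "%%")
print(dot_s, dot_e)       --> 17  19
print(pct_s, pct_e)       --> 11  11
```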



plop

*(more later when i have time)
Title:
Post by: plop on 30 January, 2004, 00:18:21
we're again going to need the lua command line for this howto, to execute the example scripts.

time to explain a bit more about the so-called magic characters.
here they all are again.
( ) . % + - * ? [ ^ $

. + - * ? if you don't know what these do, i suggest you read the above post again.

( ) are always used as a pair.
s,e,cmd = strfind(data, "%b<>%s+(%S+)") is something you can find in nearly every script for ptokax.
it means that we want to give the variable cmd the value of what is matched between the ( ).
but that's not all, you can use one search to get lots of values like that.
s,e,var1,var2,var3=strfind(data, "%b<>%s+(pat1)%s+(pat2)%s+(pat3)")
as you can see the order is straightforward: var1 gets the value found by pattern 1, etc.
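
a sketch of the multi-capture idea with real patterns filled in. the chat line, the !kick command and the variable names are made up for illustration, and it uses the Lua 5.x name string.find (strfind in Lua 4).

```lua
-- Assumes Lua 5.x; in Lua 4 use strfind instead of string.find.
-- Hypothetical main-chat line: "<nick> command argument rest of the text"
local data = "<plop> !kick baduser spamming the hub"

-- %b<> skips the "<plop>" part, then each ( ) hands back one value, in order.
local s, e, cmd, nick, reason = string.find(data, "%b<>%s+(%S+)%s+(%S+)%s+(.+)")
print(cmd)      --> !kick
print(nick)     --> baduser
print(reason)   --> spamming the hub
```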

^ now this one can do two things, but i'm going to explain only one of them right now.
in a pattern like "^%b<>%s+(%S+)" it means the match has to start right at the beginning of the string.
if %b<> isn't found there, it's going to return nil.

$ works the opposite way to ^.
"%b<>%s+(%S+)$" means the match has to end right at the end of the string, and just like ^ it's going to return nil if the pattern can't match at that spot.

let's show this with an example script.
string = "why is ptokax is way cooler then yhub? because it can do scripting"

print(" ")
print("the full string we're using is: \""..string.."\"")
print(" ")
print("1st searching for the word \"why\"")
s,e,tmp =strfind(string, "(why)")
print("the pattern \"(why)\" returns: "..tmp)
s,e,tmp =strfind(string, "(why)$")
print("the pattern \"(why)$\" returns: "..(tmp or "not found"))
s,e,tmp =strfind(string, "^(why)")
print("the pattern \"^(why)\" returns: "..(tmp or "not found"))

print(" ")
print("now let's do the same patterns on the word \"cooler\"")
s,e,tmp =strfind(string, "(cooler)")
print("the pattern \"(cooler)\" returns: "..tmp)
s,e,tmp =strfind(string, "(cooler)$")
print("the pattern \"(cooler)$\" returns: "..(tmp or "not found"))
s,e,tmp =strfind(string, "^(cooler)")
print("the pattern \"^(cooler)\" returns: "..(tmp or "not found"))

print(" ")
print("now let's do the same patterns on the word \"scripting\"")
s,e,tmp =strfind(string, "(scripting)")
print("the pattern \"(scripting)\" returns: "..tmp)
s,e,tmp =strfind(string, "(scripting)$")
print("the pattern \"(scripting)$\" returns: "..(tmp or "not found"))
s,e,tmp =strfind(string, "^(scripting)")
print("the pattern \"^(scripting)\" returns: "..(tmp or "not found"))


% this is also not new, you've seen it in part 1, it's a so-called escape.
we used it in the above post on things like %S.
in case we want to find one of the magic characters we're going to need this one too.
for example, if we want to find the % itself we have to escape it from being magic.
sounds more complex than it is: %% is all.
now you can guess what is needed to find the ?, again simply put a % in front of it, %? and we're done.

again a simple example script.
string = "why is ptokax is way cooler then yhub? because it can do scripting"

print(" ")
print("the full string we're using is: \""..string.."\"")
print(" ")
print("now let's find the ? in the string, and to make it easy the word before it")
s,e,tmp = strfind(string, "(yhub%?)")
print(tmp)

[ this one can do real magic.
we know that %d matches digits and %a matches letters, but what if we don't want the whole class?
then this is going to be your best friend, but just like ( it has a brother which it always needs, the ].
you can make your own sets of characters with this baby.
let's start with an example script to show the basics, this time not using strfind but gsub (which replaces what is found).
first without the magic [ ], then with, so we can clearly see the difference.
string = "why is ptokax is way cooler then yhub? because it can do scripting"

print(" ")
print("the full string we're using is: \""..string.."\"")
print(" ")
print("now let's replace the word yhub with the letter \"x\", so we don't have to see it")
print("first without the magic [ ]")
tmp = gsub(string, "yhub", "x")
print(tmp)
print("now let's replace each of the letters which make up the word yhub with the letter \"x\"")
print("now with the magic [ ]")
tmp = gsub(string, "[yhub]", "x")
print(tmp)

you can see the difference: the first only matches the whole word, the last matches the individual characters.
but we can do more than this with it.
i promised i would return to the ^, and now is that time.
just like %S is the opposite of %s, we can put ^ right after the [ to flip around what the set matches.
again an example script.
string = "why is ptokax is way cooler then yhub? because it can do scripting"

print(" ")
print("the full string we're using is: \""..string.."\"")
print(" ")
print("now let's replace everything except the letters which make up the word yhub (and spaces) with the letter \"x\"")
print("now with the magic [ ]")
tmp = gsub(string, "[^yhub%s]", "x")
print(tmp)
that was all for now, i hope you can understand this at least a bit better now.

plop
Title:
Post by: NightLitch on 30 January, 2004, 00:39:49
GREAT explanation Plop. This is a great guide for learning
gsub as well as strfind better.

This will really come in handy.
Title:
Post by: plop on 30 January, 2004, 03:34:32
yw.

gsub showed the last part better than strfind.
forgot to say that i added a %s to the set in the last example script, so the spaces didn't get replaced and what was left stayed at least a bit readable.

plop
Title:
Post by: VidFamne on 30 January, 2004, 04:02:55
I have always wondered if you could search a string for only, e.g., two letters,
like which mode a client is in, M:A or M:P.
So for example, is this valid?
s,e,version,mode = strfind(data,"V:(%w.%w+)M:([AP])")
regexp has always been a nightmare to me  :D
Title:
Post by: plop on 30 January, 2004, 04:56:46
you forgot a , that's all.
but you can make it even cooler.
try this to see that it works fine.
data = "$MyINFO $ALL odc53 $ $DSL$$49945258045$|"
s,e,version,mode = strfind(data,"V:(%w.%w+),M:([AP])")
print(mode)
s,e,version,mode = strfind(data,"V:([%w%.]+),M:([AP])")
print(version)
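
a sketch of both patterns run against a MyINFO line that carries a client tag. the tag "<++ V:0.401,M:A,H:1/0/0,S:3>" is made up for illustration (an assumption, not taken from a real client), and the code uses the Lua 5.x name string.find (strfind in Lua 4).

```lua
-- Assumes Lua 5.x; in Lua 4 use strfind instead of string.find.
-- Hypothetical MyINFO line; the <++ ...> tag values are invented for this example.
local data = "$MyINFO $ALL odc53 <++ V:0.401,M:A,H:1/0/0,S:3>$ $DSL$$49945258045$|"

-- First pattern: one alphanumeric, any character, then one or more alphanumerics.
local s1, e1, version1, mode1 = string.find(data, "V:(%w.%w+),M:([AP])")
-- Second pattern: a set that accepts alphanumerics and literal dots, repeated.
local s2, e2, version2, mode2 = string.find(data, "V:([%w%.]+),M:([AP])")

print(version1, mode1)   --> 0.401   A
print(version2, mode2)   --> 0.401   A
```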

plop
Title:
Post by: VidFamne on 30 January, 2004, 05:39:51
It works like a charm :)
Thank you very much for clearing this up for me  :D
Title:
Post by: plop on 30 January, 2004, 05:59:05
yw

plop
Title:
Post by: kepp on 31 January, 2004, 20:18:44
If you really paid attention you will figure this one out for sure :)

:P

Here we go:


find = "I$%LOVE$%PTOKAX$%"

local s,e,IP = strfind(find,"([%a%$%]+)")

print(IP)


Now what's wrong with this one?
Title:
Post by: plop on 31 January, 2004, 21:10:08
the last % in your pattern escapes the ] that comes after it, so the [ never gets closed. to match the % itself it needs another % in front of it.
local s,e,IP = strfind(find,"([%a%$%%]+)")
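
a sketch that runs both versions side by side, assuming Lua 5.x (string.find and pcall). pcall is only used here so the broken pattern's error can be shown without stopping the script.

```lua
-- Assumes Lua 5.x; pcall catches the error raised by the malformed pattern.
local find = "I$%LOVE$%PTOKAX$%"

-- "%]" escapes the closing bracket, so the set is never closed and Lua raises an error.
local ok, err = pcall(string.find, find, "([%a%$%]+)")
print(ok)    --> false

-- With "%%" the percent sign is matched literally and the final "]" closes the set.
local s, e, all = string.find(find, "([%a%$%%]+)")
print(all)   --> I$%LOVE$%PTOKAX$%
```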

plop
Title: advertisement pattern matching
Post by: jiten on 07 May, 2004, 09:08:46
which kind of pattern do i need to use so that i can censor this kind of trigger?
- hu.b (only a dot in between)
- h..............ub (more than one dot between the letters)

can it be done with an edit somewhere here in plop's sneaky anti advertiser?

-------

               local s,e, adver = strfind(data, "%b<>%s(%S+%.[^%.]+%.[^%.]+)")
               if adver ~= nil then
                  local s,e,hubby = strfind(adver, "(%S+%.[^%.]+%.%a+)/.*")
                  if hubby == nil then hubby = adver end
                  if OKHUBS[hubby] == nil then
                     SendToAll(user.sName, HUBADRESS)
                     return 1
                  end
               else
                  local s,e,msg,adver,msg2 = strfind(data, "%b<>%s(.*)%s([^%.]+%.[^%.]+%.%S+)(.*)$")
                  if adver ~= nil then
                     local s,e,hubby = strfind(adver, "(%S+%.[^%.]+%.%a+)/.*")
                     if hubby == nil then hubby = adver end
                     if OKHUBS[hubby] == nil then
                        SendToAll(user.sName, msg.." "..HUBADRESS..msg2)
                        return 1
                     end
                  end
               end
Title:
Post by: Fangs404 on 29 April, 2005, 00:21:07
bumping this because it's come in awfully handy in the past couple days.
Title:
Post by: plop on 29 April, 2005, 17:52:12
the . is a magic char, this means it can do special things.
in your case you really want a . and not something special.
to do this you have to escape the special meaning, this is done with a %
so you get %.
but you want to find 1 or more of them, so we add a + to it.
result: %.+
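
a minimal sketch of %.+ applied to the two triggers from the question. the helper name has_dotted_hub, the full pattern around %.+ and the test messages are assumptions made up for illustration; it uses the Lua 5.x name string.find (strfind in Lua 4).

```lua
-- Assumes Lua 5.x; in Lua 4 use strfind instead of string.find.
-- Hypothetical helper: true when "hub" is written with dots in the middle.
local function has_dotted_hub(msg)
  -- "h", an optional "u", one or more literal dots, an optional "u", then "b"
  return string.find(msg, "hu?%.+u?b") ~= nil
end

print(has_dotted_hub("join my hu.b now"))    --> true
print(has_dotted_hub("h..............ub"))   --> true
print(has_dotted_hub("nice hub"))            --> false
```

this only looks for dots; spaces or other filler characters between the letters would need their own class inside a [ ] set.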

plop
Title:
Post by: jiten on 29 April, 2005, 18:36:52
That post of mine is quite old  (07.05.2004, 09:08) and hopefully I solved that problem  :D
Anyway, thanks for the hint eheheh

Cheers
Title:
Post by: Dessamator on 29 April, 2005, 19:06:58
lol, plop, mayb u should write a new howto, for lua 5, :D, with some optimizations, etc etc,
Title:
Post by: Herodes on 29 April, 2005, 19:35:15
there are no changes in pattern matching from lua4 to lua5 .. only the string library was enhanced a lil bit .. so maybe some additions will do just great for plop's guide.
Although the Programming in Lua book gives a thorough look at those string library enhancements .. ( one book never matches two books .. )
Title:
Post by: plop on 29 April, 2005, 20:00:39
Quote: Originally posted by Dessamator
lol, plop, mayb u should write a new howto, for lua 5, :D, with some optimizations, etc etc,
already have some in mind but i need to be in the mood to write them.

plop
Title:
Post by: Dessamator on 30 April, 2005, 17:31:52
Quote: Originally posted by Herodes
Quote: Originally posted by Dessamator
lol, plop, mayb u should write a new howto, for lua 5, :D, with some optimizations, etc etc,
there are no changes in pattern matching from lua4 to lua5. .. only the string library was enhanced a lil bit .. so maybe some additions will do just great for plop's guide.
Although the Programming in Lua book gives a thourough look on those string library enhancements,.. ( one book never matches two books .. )

yep, i agree. the problem is most people are lazy and don't want to look through a whole book; it's far simpler to read a shorter, condensed summary :)