HOW-TO: pattern matching
 


Started by plop, 17 December, 2003, 02:09:28


plop

. all characters
%a letters
%c control characters
%d digits
%l lower case letters
%p punctuation characters
%s space characters
%u upper case letters
%w alphanumeric characters
%x hexadecimal digits
%z the character with representation 0
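%bxy matches a balanced block that starts with x and ends with y, e.g. %b<> matches everything from a < up to its matching >. The two characters can be replaced by anything you want.

An upper case version of any of the % classes above represents the opposite of the class.
For instance, %A represents all non-letter characters.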
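A quick sketch to try a few of these classes out. It assumes a Lua 5.x interpreter, where the strfind used in this thread is spelled string.find; the chat line and the variable names are made up for illustration.

```lua
-- Sketch for Lua 5.x: string.find plays the role of strfind from this thread.
-- The chat line below is made up for illustration.
local line = "<plop> port 411 is open"

-- %d+ : a run of digits
local _, _, digits = string.find(line, "(%d+)")
-- %a+ : a run of letters; %A+ : a run of anything that is not a letter
local _, _, letters = string.find(line, "(%a+)")
local _, _, non_letters = string.find(line, "(%A+)")
-- %b<> : the balanced block from "<" to its matching ">"
local _, _, nick_block = string.find(line, "(%b<>)")

print(digits, letters, non_letters, nick_block)
```

Note that %A+ stops after the single "<" because the very next character is a letter.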


Except for %b, they all match exactly 1 character. If you want more, you can put one of the following behind them:

+ 1 or more repetitions, as many as possible (the search returns nil if there isn't at least 1)
* 0 or more repetitions, as many as possible (matches "" on 0 repetitions)
- also 0 or more repetitions, but as few as possible (matches "" on 0 repetitions)
? optional, 0 or 1 repetitions (matches "" on 0 repetitions)
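The difference between * and - is easiest to see side by side. A small sketch, again assuming Lua 5.x and string.find in place of strfind; the sample strings are invented.

```lua
-- Sample text made up for this sketch.
local text = "<b>bold</b> and <i>italic</i>"

-- ".*" takes as much as possible: it runs up to the LAST ">".
local _, _, greedy = string.find(text, "(<.*>)")
-- ".-" takes as little as possible: it stops at the FIRST ">".
local _, _, lazy = string.find(text, "(<.->)")

-- "?" makes the preceding character optional.
local _, _, us = string.find("color", "(colou?r)")
local _, _, uk = string.find("colour", "(colou?r)")

-- "+" needs at least one digit, so this search fails and returns nil.
local _, _, none = string.find("no digits here", "(%d+)")
-- "*" is happy with zero digits, so this matches an empty string.
local _, _, empty = string.find("no digits here", "(%d*)")

print(greedy)
print(lazy)
print(us, uk, none, empty == "")
```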


Some characters, called magic characters, have special meanings when used in a pattern.
The magic characters are:

( ) . % + - * ? [ ^ $

The character % works as an escape for those magic characters.
So, %. matches a dot, and %% matches the % itself.
You can use the escape % not only for the magic characters, but for any non alphanumeric character.
When in doubt, play safe and put an escape.
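To see what the escape buys you, here is a hedged sketch for Lua 5.x (string.find and string.gsub instead of strfind and gsub). The sample strings are made up, and escape_pattern is a hypothetical helper name of my own, not something from PtokaX or the Lua library.

```lua
-- Sample text made up for this sketch.
local s = "price: 9.99 (50% off)"

-- An unescaped "." matches ANY character, so this also matches "9x99".
local _, _, loose = string.find("9x99", "(9.99)")
-- "%." matches only a real dot, so this finds nothing.
local _, _, strict = string.find("9x99", "(9%.99)")

local _, _, price = string.find(s, "(%d+%.%d+)")  -- digits, a dot, digits
local _, _, percent = string.find(s, "(%d+%%)")   -- digits followed by a %

-- "When in doubt, put an escape": escape every non-alphanumeric character
-- so that arbitrary text can be searched for literally.
-- (escape_pattern is a hypothetical helper for illustration.)
local function escape_pattern(text)
  return (string.gsub(text, "(%W)", "%%%1"))
end

local escaped = escape_pattern("1+1=2?")
local from, to = string.find("is 1+1=2? yes", escaped)

print(loose, strict, price, percent)
print(escaped, from, to)
```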



plop

*(more later when i have time)
http://www.plop.nl lua scripts/howto's.
http://www.thegoldenangel.net
http://www.vikingshub.com
http://www.lua.org

>>----> he who fights hatred with hatred, drives the spreading of hatred <----<<

plop

We're again going to need the Lua command line for this how-to, to execute the example scripts.

Time to explain a bit more about the so-called magic characters.
Here they all are again.
( ) . % + - * ? [ ^ $

. + - * ? If you don't know what these do, I suggest you read the above post again.

( ) are always used as a pair.
s,e,cmd = strfind(data, "%b<>%s+(%S+)") is something you can find in nearly every script for PtokaX.
It means that we want to give the variable cmd the value of what is matched between the ( ).
But that's not all, you can use one search to get lots of values like that.
s,e,var1,var2,var3=strfind(data, "%b<>%s+(pat1)%s+(pat2)%s+(pat3)")
As you can see, the order is straightforward: var1 gets the value found by pattern 1, etc.
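A filled-in version of that three-capture search might look like this. It is only a sketch for Lua 5.x (string.find instead of strfind), and the chat line, the !kick command and the variable names are invented for the example.

```lua
-- Made-up chat line in the "<nick> text" form used throughout this thread.
local data = "<plop> !kick baduser spamming"

-- Three pairs of ( ) give three captures, handed out from left to right.
local s, e, cmd, target, reason =
  string.find(data, "^%b<>%s+(%S+)%s+(%S+)%s+(.*)")

print(s, e)                  -- start and end position of the whole match
print(cmd, target, reason)

-- When the pattern does not match, every returned value is nil.
local _, _, missing = string.find("no nick here", "^%b<>%s+(%S+)")
print(missing)
```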

^ Now this one can do two things, but I'm going to explain only one of them right now.
In a pattern like "^%b<>%s+(%S+)" it means the match has to start right at the beginning of the string.
If %b<> isn't found there, strfind returns nil.

$ works the opposite way to ^.
"%b<>%s+(%S+)$" means the match has to end right at the end of the string, and just like with ^, strfind returns nil if the last thing in the pattern doesn't reach the end.

Let's show this with an example script.
string = "why is ptokax is way cooler then yhub? because it can do scripting"

print(" ")
print("the full string were using is: \""..string.."\"")
print(" ")
print("1st searching for the word \"why\"")
s,e,tmp =strfind(string, "(why)")
print("the pattern \"(why)\" returns: "..tmp)
s,e,tmp =strfind(string, "(why)$")
print("the pattern \"(why)$\" returns: "..(tmp or "not found"))
s,e,tmp =strfind(string, "^(why)")
print("the pattern \"^(why)\" returns: "..(tmp or "not found"))

print(" ")
print("now lets do the same patterns on the word \"cooler\"")
s,e,tmp =strfind(string, "(cooler)")
print("the pattern \"(cooler)\" returns: "..tmp)
s,e,tmp =strfind(string, "(cooler)$")
print("the pattern \"(cooler)$\" returns: "..(tmp or "not found"))
s,e,tmp =strfind(string, "^(cooler)")
print("the pattern \"^(cooler)\" returns: "..(tmp or "not found"))

print(" ")
print("now lets do the same patterns on the word \"scripting\"")
s,e,tmp =strfind(string, "(scripting)")
print("the pattern \"(scripting)\" returns: "..tmp)
s,e,tmp =strfind(string, "(scripting)$")
print("the pattern \"(scripting)$\" returns: "..(tmp or "not found"))
s,e,tmp =strfind(string, "^(scripting)")
print("the pattern \"^(scripting)\" returns: "..(tmp or "not found"))
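The same checks in compact form, as a sketch for Lua 5.x. One thing to watch there: a variable called string would hide the string library, so the sketch names it text instead. The sentence is a tidied copy of the one above and the variable names are my own.

```lua
local text = "why is ptokax way cooler than yhub? because it can do scripting"
local len = string.len(text)

-- "^" pins the match to the start of the string.
local s1, e1 = string.find(text, "^why")          -- found at 1..3
local s2 = string.find(text, "^scripting")        -- nil, not at the start

-- "$" pins the match to the end of the string.
local s3, e3 = string.find(text, "scripting$")    -- found, ends at len
local s4 = string.find(text, "why$")              -- nil, not at the end

-- Both together: the WHOLE string has to fit the pattern.
local only_digits = string.find("12345", "^%d+$") ~= nil
local not_only_digits = string.find("123a5", "^%d+$") ~= nil

print(s1, e1, s2, s3, e3, s4)
print(only_digits, not_only_digits)
```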


% This is also not new, you've seen it in part 1. It's the so-called escape.
We used it in the above post in things like %S.
In case we want to find one of the magic characters themselves, we're going to need this one too.
For example, if we want to find the % itself, we have to escape it from being magic.
Sounds more complex than it is: %% is all it takes.
Now you can guess what is needed to find the ?. Again simply put a % in front of it, %?, and we're done.

Again a simple example script.
string = "why is ptokax is way cooler then yhub? because it can do scripting"

print(" ")
print("the full string were using is: \""..string.."\"")
print(" ")
print("now let's find the ? in the string, and to make it easy the word before it")
s,e,tmp = strfind(string, "(yhub%?)")
print(tmp)

[ This one can do real magic.
We know that %d matches digits and %a letters, but what if we don't want to match the whole class?
Then this is going to be your best friend, but just like ( it has a brother which it always needs, the ].
You can make your own sets and ranges with this baby.
Let's start with an example script to show the basics, this time not using strfind but gsub (which replaces what is found),
first without the magic [ ] and then with, so we can clearly see the difference.
string = "why is ptokax is way cooler then yhub? because it can do scripting"

print(" ")
print("the full string were using is: \""..string.."\"")
print(" ")
print("now let's replace the word yhub, so we don't have to see that, with the letter \"x\"")
print("first without the magic [ ]")
tmp = gsub(string, "yhub", "x")
print(tmp)
print("now let's replace all the letters which make up the word yhub with the letter \"x\"")
print("now with the magic [ ]")
tmp = gsub(string, "[yhub]", "x")
print(tmp)

You can see the difference: the first only matches the whole word, the second matches the individual characters.
But we can do more than this with it.
I promised you I would return to the ^, and now is that time.
Just like %S is the opposite of %s, we can put ^ right after the [ to flip around the set between [ ].
Again an example script.
string = "why is ptokax is way cooler then yhub? because it can do scripting"

print(" ")
print("the full string were using is: \""..string.."\"")
print(" ")
print("now let's replace everything but the letters which make up the word yhub (and the spaces) with the letter \"x\"")
print("now with the magic [ ]")
tmp = gsub(string, "[^yhub%s]", "x")
print(tmp)
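For a shorter string where the result is easy to check by eye, here is a sketch for Lua 5.x (string.gsub instead of gsub). The sample strings are made up, and the [0-7] range is my own addition to show what "your own ranges" can look like.

```lua
-- Short made-up sample so the results are easy to verify by hand.
local word = "yhub rocks"

-- gsub returns the new string AND the number of replacements it made.
local whole, n_whole = string.gsub(word, "yhub", "x")      -- the whole word
local each, n_each = string.gsub(word, "[yhub]", "x")      -- each letter of the set
local rest, n_rest = string.gsub(word, "[^yhub%s]", "x")   -- everything else, spaces kept

-- A dash inside [ ] makes a range: [0-7] is any digit from 0 to 7.
local masked = string.gsub("mode 0755", "[0-7]", "#")

print(whole, n_whole)
print(each, n_each)
print(rest, n_rest)
print(masked)
```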
That was all for now, I hope you understand this at least a bit better now.

plop

NightLitch

GREAT explanation, Plop. This is a great guide for learning
gsub as well as strfind better.

This will really come in handy.
//NL

plop

Quote: Originally posted by NightLitch
GREAT explanation, Plop. This is a great guide for learning
gsub as well as strfind better.

This will really come in handy.

yw.

gsub showed the last part better than strfind.
Forgot to say that I added a %s to the set in the last example script, so the spaces didn't get replaced and what was left stayed at least a bit readable.

plop

VidFamne

I have always wondered if you could search in a string for only, e.g., two letters,
like which mode a client is in, M:A or M:P.
So, for example, is this valid?
s,e,version,mode = strfind(data,"V:(%w.%w+)M:([AP])")
Regexps have always been a nightmare to me  :D

plop

Quote: Originally posted by VidFamne
I have always wondered if you could search in a string for only, e.g., two letters,
like which mode a client is in, M:A or M:P.
So, for example, is this valid?
s,e,version,mode = strfind(data,"V:(%w.%w+)M:([AP])")
Regexps have always been a nightmare to me  :D

You forgot a , between the version and the M:, that's all.
But you can make it even cooler.
Try this to see that it works fine.
data = "$MyINFO $ALL odc53 $ $DSL$$49945258045$|"
s,e,version,mode = strfind(data,"V:(%w.%w+),M:([AP])")
print(mode)
s,e,version,mode = strfind(data,"V:([%w%.]+),M:([AP])")
print(version)
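To try these patterns on a line that actually carries a client tag, here is a sketch for Lua 5.x (string.find instead of strfind). The <++ ...> tag and all of its values are invented for illustration; they are an assumption about what such a tag looks like, not data from this thread.

```lua
-- Hypothetical $MyINFO line: the <++ ...> tag and its values are invented.
local data = "$MyINFO $ALL odc53 <++ V:0.306,M:A,H:1/0/0,S:3>$ $DSL$$49945258045$|"

-- With the comma in place, both captures come back.
local _, _, version, mode = string.find(data, "V:([%w%.]+),M:([AP])")

-- Without the comma the search fails: "," sits between the version and "M:".
local _, _, v2, m2 = string.find(data, "V:(%w.%w+)M:([AP])")

print(version, mode)
print(v2, m2)
```

The set [%w%.] accepts letters, digits and real dots, so it keeps working if the version has more than one dot in it.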

plop

VidFamne

It works like a charm :)
Thank you very much for clearing this up for me  :D

plop

Quote: Originally posted by VidFamne
It works like a charm :)
Thank you very much for clearing this up for me  :D

yw

plop

kepp

If you really paid attention you will figure this one out for sure :)

:P

Here we go:

find = "I$%LOVE$%PTOKAX$%"

local s,e,IP = strfind(find,"([%a%$%]+)")

print(IP)

Now what's wrong with this one?
Guarding    

plop

Quote: Originally posted by kepp
If you really paid attention you will figure this one out for sure :)

:P

Here we go:

find = "I$%LOVE$%PTOKAX$%"

local s,e,IP = strfind(find,"([%a%$%]+)")

print(IP)

Now what's wrong with this one?
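
The last % in your pattern is waiting for something to escape. As written it escapes the ], so the [ ] set never gets closed. What you want there is a literal %, so give it another %.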
local s,e,IP = strfind(find,"([%a%$%%]+)")
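Both versions can be tried safely with pcall, which catches the error the broken pattern raises. A sketch for Lua 5.x (string.find instead of strfind); the variable names ok, err and all are my own.

```lua
local find = "I$%LOVE$%PTOKAX$%"

-- "%]" escapes the closing bracket, so the set is never closed and
-- string.find raises a "malformed pattern" error. pcall catches it.
local ok, err = pcall(string.find, find, "([%a%$%]+)")
print(ok, err)

-- With "%%" for a literal percent sign the set is closed properly:
-- letters, "$" and "%" are all allowed, so the whole string matches.
local _, _, all = string.find(find, "([%a%$%%]+)")
print(all)
```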

plop

jiten

Which kind of pattern do I need to use so that I can censor this kind of trigger?
- hu.b (only a dot in between)
- h..............ub (more than one dot in between)

Maybe it can be done with an edit somewhere here in plop's sneaky anti-advertiser?

-------

               local s,e, adver = strfind(data, "%b<>%s(%S+%.[^%.]+%.[^%.]+)")
               if adver ~= nil then
                  local s,e,hubby = strfind(adver, "(%S+%.[^%.]+%.%a+)/.*")
                  if hubby == nil then hubby = adver end
                  if OKHUBS[hubby] == nil then
                     SendToAll(user.sName, HUBADRESS)
                     return 1
                  end
               else
                  local s,e,msg,adver,msg2 = strfind(data, "%b<>%s(.*)%s([^%.]+%.[^%.]+%.%S+)(.*)$")
                  if adver ~= nil then
                     local s,e,hubby = strfind(adver, "(%S+%.[^%.]+%.%a+)/.*")
                     if hubby == nil then hubby = adver end
                     if OKHUBS[hubby] == nil then
                        SendToAll(user.sName, msg.." "..HUBADRESS..msg2)
                        return 1
                     end
                  end
               end

Fangs404

bumping this because it's come in awfully handy in the past couple days.

plop

The . is a magic char, which means it can do special things.
In your case you really want a . and not something special.
To do this you have to escape the special meaning, which is done with a %,
so you get %.
But you want to find 1 or more of them, so we add a + to it.
Result: %.+
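Applied to the two triggers from the question, that could be sketched like this for Lua 5.x (string.find and string.gsub instead of strfind and gsub). The pattern h%.*u%.*b and the helper name matches_hub are my own illustration built on %. and are not taken from the anti-advertiser script.

```lua
-- "%." is a literal dot; "*" allows zero or more of them between the letters.
-- (This combined pattern is an assumption for illustration.)
local pattern = "h%.*u%.*b"

-- Hypothetical helper: true when the word hides "hub" behind dots.
local function matches_hub(word)
  return string.find(word, pattern) ~= nil
end

print(matches_hub("hu.b"))               -- one dot in between
print(matches_hub("h..............ub"))  -- many dots in between
print(matches_hub("hu-b"))               -- a dash is not a dot

-- Another way to use %.+ : squeeze every run of dots out before checking.
local squeezed = string.gsub("h..............ub", "%.+", "")
print(squeezed)
```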

plop

jiten

Quote: Originally posted by plop
The . is a magic char, which means it can do special things.
In your case you really want a . and not something special.
To do this you have to escape the special meaning, which is done with a %,
so you get %.
But you want to find 1 or more of them, so we add a + to it.
Result: %.+

plop

That post of mine is quite old (07.05.2004, 09:08) and hopefully I solved that problem  :D
Anyway, thanks for the hint, eheheh

Cheers

Dessamator

lol, plop, maybe you should write a new how-to for Lua 5 :D, with some optimizations, etc.
Ignorance is Bliss.

Herodes

Quote: Originally posted by Dessamator
lol, plop, maybe you should write a new how-to for Lua 5 :D, with some optimizations, etc.

There are no changes in pattern matching from Lua 4 to Lua 5, only the string library was enhanced a little bit, so maybe some additions will do just great for plop's guide.
Although the Programming in Lua book gives a thorough look at those string library enhancements... (one book never matches two books...)

plop

Quote: Originally posted by Dessamator
lol, plop, maybe you should write a new how-to for Lua 5 :D, with some optimizations, etc.

I already have some in mind, but I need to be in the mood to write them.

plop

Dessamator

Quote: Originally posted by Herodes
Quote: Originally posted by Dessamator
lol, plop, maybe you should write a new how-to for Lua 5 :D, with some optimizations, etc.
There are no changes in pattern matching from Lua 4 to Lua 5, only the string library was enhanced a little bit, so maybe some additions will do just great for plop's guide.
Although the Programming in Lua book gives a thorough look at those string library enhancements... (one book never matches two books...)

Yep, I agree. The problem is most people are lazy and don't want to look through a whole book; it's far simpler to read a smaller, condensed summary :)
