Author Topic: pattern matching and string.gsub  (Read 4907 times)


Offline st0ne-db

  • Scripter
  • Double Ace
  • ******
  • Posts: 107
  • Karma: +14/-3
pattern matching and string.gsub
« on: 19 March, 2006, 00:50:44 »
Can someone please help me? I'm trying to remove all spaces and tabs from the beginning of a string.
I am using string.gsub...

tried a few things without success..

  ???    sText=string.gsub(sText, "^%s+","")      ???


TIA  :)

-St0ne db

Offline st0ne-db

  • Scripter
  • Double Ace
  • ******
  • Posts: 107
  • Karma: +14/-3
Re: pattern matching and string.gsub
« Reply #1 on: 19 March, 2006, 01:28:52 »
OK... the data I'm working with is for an updated version of my RSS bot. The string comes directly from the host via bluebear's newest version of pxwsa. Here is a sample of the raw data coming in from the feed.

Code: [Select]
- <?xml version="1.0"?>
- <!-- RSS generated by NFOrce on Sun, 19 Mar 2006 01:20:02 +0100 -->
- <rss version="2.0">
<channel>
<title>NFOrce NFOs - Xbox</title>
<link>http://www.nforce.nl/</link>
<description>All the latest Xbox NFOs provided by NFOrce.nl</description>
<image>
<url>http://www.nforce.nl/rss/logo.gif</url>
<title>NFOrce NFOs - Xbox</title>
<link>http://www.nforce.nl/</link>
</image>
<item>
<title>Sonic Riders (c) Sega *FULLDVD* *PAL* - PAL</title>
<link>http://www.nforce.nl/index.php?nfoid=103575</link>
<description>NFOrce NFOs-&gt;Xbox&lt;br /&gt;On 2006-03-18 &lt;b&gt;PAL&lt;/b&gt; released &lt;b&gt;Sonic Riders (c) Sega *FULLDVD* *PAL*&lt;/b&gt;&lt;br /&gt;Size: 42x50MB&lt;br /&gt;</description>
<pubDate>Sat, 18 Mar 2006 00:00:00 +0100</pubDate>
<guid>http://www.nforce.nl/index.php?nfoid=103575</guid>
<comments></comments>
</item>
<item>



OK, I have tried to remove the spaces with string.gsub, but they look like they are tabs. Can I match a pattern of tab characters?

Offline bastya_elvtars

  • Forum God
  • ****
  • Posts: 3 744
  • Karma: +173/-7
  • The rock n' roll doctor
    • The FreshStuff3 Site
Re: pattern matching and string.gsub
« Reply #2 on: 19 March, 2006, 01:45:11 »
1) Never mind. I am stupid.
2) Why don't you use the xml parser library?
« Last Edit: 19 March, 2006, 02:18:14 by bastya_elvtars »
Everything could have been anything else and it would have just as much meaning.

Offline bastya_elvtars

  • Forum God
  • ****
  • Posts: 3 744
  • Karma: +173/-7
  • The rock n' roll doctor
    • The FreshStuff3 Site
Re: pattern matching and string.gsub
« Reply #3 on: 19 March, 2006, 01:52:52 »
In this case, he should use ^%s-
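
A caveat worth testing before relying on this: in Lua patterns, "-" is the shortest-match quantifier, so when nothing follows it in the pattern it is satisfied by matching nothing at all. The sketch below compares the three quantifiers on a made-up string (the variable names are only for illustration):

```lua
-- Hypothetical input: a tab and two spaces in front of the text.
local s = "\t  hello"

-- "+" : one or more, longest match  -> leading whitespace removed
local plus  = string.gsub(s, "^%s+", "")
-- "*" : zero or more, longest match -> leading whitespace removed
local star  = string.gsub(s, "^%s*", "")
-- "-" : zero or more, shortest match -> matches the empty string,
--       so nothing is removed
local minus = string.gsub(s, "^%s-", "")

print(plus == "hello", star == "hello", minus == s)
```

So "-" is mostly useful in the middle of a pattern, as in "<(.-)>", where whatever follows it forces the match to grow.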
Everything could have been anything else and it would have just as much meaning.

Offline bastya_elvtars

  • Forum God
  • ****
  • Posts: 3 744
  • Karma: +173/-7
  • The rock n' roll doctor
    • The FreshStuff3 Site
Re: pattern matching and string.gsub
« Reply #4 on: 19 March, 2006, 02:14:25 »
If you want to be 100% sure then you should pull all between <item> & </item> and parse the tags that come up then, cause RSS is a standard in theory. :)
In 5.1, string.gfind is called string.gmatch.
Code: [Select]
for w in string.gfind(rss,"%<item%>(.-)%<%/item%>") do
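
For illustration, here is a self-contained sketch of that loop which runs on both 5.0 and 5.1. The two-item feed and the variable names are made up for the example. It uses the non-greedy "(.-)" so that each pass stops at the nearest closing tag; a greedy "(.+)" would run from the first <item> to the last </item> in one go.

```lua
-- string.gfind (Lua 5.0) was renamed string.gmatch in 5.1; use whichever exists.
local gmatch = string.gmatch or string.gfind

-- Hypothetical two-item feed, far shorter than a real one.
local rss = "<channel><item><title>First</title></item>"
         .. "<item><title>Second</title></item></channel>"

local titles = {}
-- Each iteration yields the text between one <item> ... </item> pair.
for item in gmatch(rss, "<item>(.-)</item>") do
    -- string.find returns start, end, then the captures.
    local _, _, title = string.find(item, "<title>(.-)</title>")
    table.insert(titles, title)
end

print(table.concat(titles, ", "))
```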
Everything could have been anything else and it would have just as much meaning.

Offline bastya_elvtars

  • Forum God
  • ****
  • Posts: 3 744
  • Karma: +173/-7
  • The rock n' roll doctor
    • The FreshStuff3 Site
Re: pattern matching and string.gsub
« Reply #5 on: 19 March, 2006, 02:35:11 »
RSS feeds are divided into items (at least 0.9x and 2.0) and this is a 2.0 feed, which is why I suggested it. Your pattern is perfect for parsing what is between the item tags.
Everything could have been anything else and it would have just as much meaning.

Offline bastya_elvtars

  • Forum God
  • ****
  • Posts: 3 744
  • Karma: +173/-7
  • The rock n' roll doctor
    • The FreshStuff3 Site
Re: pattern matching and string.gsub
« Reply #6 on: 19 March, 2006, 02:38:28 »
BTW, reading line by line is not always a good idea; with this feed it would fail. I don't mean to belittle your code, Sir, this is just a (benign) warning.
Everything could have been anything else and it would have just as much meaning.

Offline st0ne-db

  • Scripter
  • Double Ace
  • ******
  • Posts: 107
  • Karma: +14/-3
Re: pattern matching and string.gsub
« Reply #7 on: 19 March, 2006, 03:11:06 »
First let me say thank you for the responses!


The examples you provided are similar to what I was doing, so

here is my actual code from my script...

Code: [Select]
-- create the separator between feeds
xFeedData=string.gsub(xFeedData, "<item>","\r\n"..string.rep("*",85).."\r\n")
-- remove the tags
xFeedData=string.gsub(xFeedData, "<([^>]-)>","")
-- remove the header
xFeedData=string.gsub(xFeedData, "HTTP/1(.-)/xml","")
-- replace any pipe characters with "I"
xFeedData=string.gsub(xFeedData, "|","I")
-- remove leading whitespace (only at the very start of the string)
xFeedData=string.gsub(xFeedData, "^%s+","")

Everything else was working great, except that no matter what I do, I can't get rid of the tabs.
I also tried this:

Code: [Select]
xFeedData=string.gsub(xFeedData, "<([^>]-)>","\r\n")

which works, but the string ends up with far too many blank lines, which I cannot seem to remove.
I might add that I am trying to minimize the number of disk writes, and would like to parse the feed without saving to disk.
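
A possible explanation, sketched under the assumption that the feed arrives as one long string: "^" anchors only at the very start of the whole string, not at the start of each line, so "^%s+" strips the first run of whitespace and leaves every later line alone. The sample string and variable names below are made up; the idea is to strip spaces and tabs after each line break, then collapse runs of line breaks, all in memory.

```lua
-- Hypothetical sample standing in for xFeedData after the tags were removed.
local sample = "\t\tSonic Riders\r\n\r\n\r\n   \thttp://www.nforce.nl/\r\n"

-- Anchored pattern: only the whitespace at the very start goes away.
local onlyFirst = string.gsub(sample, "^%s+", "")

-- 1) strip spaces/tabs at the start of the string
local cleaned = string.gsub(sample, "^[ \t]+", "")
-- 2) strip spaces/tabs that directly follow a line break
cleaned = string.gsub(cleaned, "([\r\n])[ \t]+", "%1")
-- 3) collapse any run of line breaks into a single "\r\n"
cleaned = string.gsub(cleaned, "[\r\n]+", "\r\n")

print(cleaned)
```

Nothing here touches the disk, so it should fit the goal of parsing the feed without extra writes.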

-St0ne db

Offline bastya_elvtars

  • Forum God
  • ****
  • Posts: 3 744
  • Karma: +173/-7
  • The rock n' roll doctor
    • The FreshStuff3 Site
Re: pattern matching and string.gsub
« Reply #8 on: 19 March, 2006, 03:30:50 »
Tab=string.char(9), maybe you can use this. ;)
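
A minimal sketch of how that might be used, assuming the goal is simply to drop every tab wherever it occurs (the sample line and variable names are made up). Note that string.char(9) is the same character as "\t" and that %s matches it as well, so an unanchored pattern is what makes the difference here.

```lua
local tab = string.char(9)

-- Hypothetical line from the feed, indented with two tabs.
local line = "\t\t<title>NFOrce NFOs - Xbox</title>"

-- No "^" anchor, so every tab in the string is removed, not just leading ones.
local noTabs = string.gsub(line, tab, "")

print(tab == "\t")   -- the escape and the char code are the same character
print(noTabs)
```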
Everything could have been anything else and it would have just as much meaning.

Offline st0ne-db

  • Scripter
  • Double Ace
  • ******
  • Posts: 107
  • Karma: +14/-3
Re: pattern matching and string.gsub
« Reply #9 on: 19 March, 2006, 03:36:28 »
Tab=string.char(9), maybe you can use this. ;)

THANK YOU SO MUCH!!!!!

this works perfectly!!!!!

;D ;D ;D ;D ;D ;D ;D

Offline bastya_elvtars

  • Forum God
  • ****
  • Posts: 3 744
  • Karma: +173/-7
  • The rock n' roll doctor
    • The FreshStuff3 Site
Re: pattern matching and string.gsub
« Reply #10 on: 19 March, 2006, 03:40:21 »
Don't forget:
Code: [Select]
\r=string.char(13)
\n=string.char(10)
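
A quick way to double-check these codes rather than memorise them is string.byte, which returns the numeric value of a character. This small sketch prints the three control characters that come up in this thread (the variable names are only for illustration):

```lua
local tabCode = string.byte("\t")  -- horizontal tab
local lfCode  = string.byte("\n")  -- line feed
local crCode  = string.byte("\r")  -- carriage return

print(tabCode, lfCode, crCode)  -- 9, 10, 13
```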
Everything could have been anything else and it would have just as much meaning.

Offline jiten

  • Scripter
  • Forum Legend
  • ******
  • Posts: 1 577
  • Karma: +71/-5
Re: pattern matching and string.gsub
« Reply #11 on: 19 March, 2006, 08:45:32 »
I'm working on an RSS Feeder for EntryBot and this is the parser I came up with:

Code: [Select]
RSSParser = function(rFeed)
    -- Get Link and User from #1 in Queue
    local Host, sUser, trig = RSSQueue(3)
    -- If found
    if sUser and Host and trig and rFeed then
        local sLine, sContent = "", ""
        local tTable= {
            [1] = {
                ["/a&gt;"] = "", ["&lt;"] = "", ["b&gt;"] = "",
                ["/b&gt;"] = "", ["&gt;"] = "", ["br /"] = "",
                ["/ br"] = "", ["a href="] = "", ["&apos;"] = "",
                ["&quot;"] = "", ["&lt;/a"] = "", ["<%!%[CDATA%["] = "",
                ["</(.-)>"] = "", ["<(.-)>"] = "", ["]]>"] = "", ["\t"] = "",
            },
            [2] = {
                ["<item>"] = "</item>", ["<item%s.->"] = "</item>",
            },
        }
        -- Create/Clear Host Cache
        tCache[Host] = {}
        -- For each pair in sub-table
        for a,b in pairs(tTable[2]) do
            -- Extract content between <item> and </item>
            for sItem in string.gfind(rFeed,a.."(.-)"..b) do
                -- string.gsub unwanted chars
                for i,v in pairs(tTable[1]) do sItem = string.gsub(sItem,i,v) end
                -- Insert sItem in Cache file
                table.insert(tCache[Host],sItem)
            end
        end
        -- Save Cache file
        SaveToFile(Settings.eFolder.."/"..Settings.cFile,tCache,"tCache")
        local user = GetItemByName(sUser)
        -- Write os.clock to RSS Cache
        RSS[trig]["Cache"].iTime = os.clock()
        -- Remove first user from queue, Save RSS file
        Queuer()
        -- If Host is cached
        if next(tCache[Host]) then
            -- Loop through specific host RSS feeds
            for i,v in ipairs(tCache[Host]) do sContent = sContent..v end
            -- If sUser is online
            if user then
                -- Send it
                user:SendData(Settings.sBot,"*** Your request for "..Host.." has been completed!")
                user:SendPM(Settings.sBot,"\r\n\r\n"..string.rep("- -",80).."\r\nFeed: "..Host.."\r\n"..sContent..
                    "\r\n"..string.rep("- -",80).."\r\n")
            end
        elseif user then
            -- Only report the error if sUser is still online (user is nil otherwise)
            user:SendData(Settings.sBot,"*** Error: An error occurred. Check your RSS please.")
        end
    end
end

PS: I just ripped it from there :)
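
One thing that may be worth watching with a replacement table like this: pairs() does not visit keys in any guaranteed order, so overlapping keys such as "&lt;" and "&lt;/a" can give different leftovers from run to run. A rough sketch of an ordered alternative follows; the rule list, the clean() helper and the sample text are all made up for illustration and are not taken from EntryBot.

```lua
-- Rules are applied top to bottom, so longer/more specific patterns go first.
local rules = {
    { "&lt;br%s*/?&gt;", "\r\n" },  -- escaped <br /> becomes a line break
    { "&lt;.-&gt;",      ""     },  -- any other escaped tag, e.g. &lt;b&gt;
    { "&gt;",            ">"    },  -- leftover entities are decoded
    { "&lt;",            "<"    },
    { "\t",              ""     },  -- drop tabs
}

-- Hypothetical helper: run every rule over the string, in order.
local function clean(s)
    for _, rule in ipairs(rules) do
        s = string.gsub(s, rule[1], rule[2])
    end
    return s
end

-- Shortened from the <description> in the sample feed earlier in the thread.
local raw = "NFOs-&gt;Xbox&lt;br /&gt;On 2006-03-18 &lt;b&gt;PAL&lt;/b&gt;"
local cleanedText = clean(raw)
print(cleanedText)
```

Because ipairs walks the array part in index order, the result no longer depends on how the table happens to be hashed.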

Offline bastya_elvtars

  • Forum God
  • ****
  • Posts: 3 744
  • Karma: +173/-7
  • The rock n' roll doctor
    • The FreshStuff3 Site
Re: pattern matching and string.gsub
« Reply #12 on: 19 March, 2006, 14:38:37 »
All one needs to do is capture the tag names and the text within.
I think a wisely set gfind [gmatch in 5.1] is all that is required, thereby
building a table as you go. Now the data is indexed, sortable and easily searchable.
I wouldn't gsub this at all; I was just providing an example of how one might, and why the
original pattern failed with this data format.

Yes, I was telling the same... string.gsub would in no way be my preferred choice. Now we finally agreed! :P
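
A rough sketch of that idea, assuming simple tags without attributes or nesting (the item text and the fields table are made up for the example). The "%1" back-reference makes the closing tag repeat whatever tag name was captured.

```lua
-- string.gfind (Lua 5.0) was renamed string.gmatch in 5.1; use whichever exists.
local gmatch = string.gmatch or string.gfind

-- Hypothetical item body, trimmed from the sample feed earlier in the thread.
local item = "<title>Sonic Riders</title>"
          .. "<link>http://www.nforce.nl/index.php?nfoid=103575</link>"
          .. "<comments></comments>"

local fields = {}
-- Capture 1 is the tag name, capture 2 the text up to the matching close tag.
for tag, text in gmatch(item, "<(%w+)>(.-)</%1>") do
    fields[tag] = text
end

print(fields.title, fields.link)
```

Once the data is in a table like this, each field can be looked up by name instead of being scrubbed out of one big string.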
Everything could have been anything else and it would have just as much meaning.
