Can someone please help me? I'm trying to remove all spaces and tabs from the beginning of a string.
I am using string.gsub...
I tried a few things without success.
??? sText=string.gsub(sText, "^%s+","") ???
TIA :)
-St0ne db
OK... the data I'm working with is for an updated version of my RSS bot. The string is coming directly from the host via bluebear's newest version of pxwsa. Here is a sample of the raw data coming in from the feed.
<?xml version="1.0"?>
<!-- RSS generated by NFOrce on Sun, 19 Mar 2006 01:20:02 +0100 -->
<rss version="2.0">
<channel>
<title>NFOrce NFOs - Xbox</title>
<link>http://www.nforce.nl/</link>
<description>All the latest Xbox NFOs provided by NFOrce.nl</description>
<image>
<url>http://www.nforce.nl/rss/logo.gif</url>
<title>NFOrce NFOs - Xbox</title>
<link>http://www.nforce.nl/</link>
</image>
<item>
<title>Sonic Riders (c) Sega *FULLDVD* *PAL* - PAL</title>
<link>http://www.nforce.nl/index.php?nfoid=103575</link>
<description>NFOrce NFOs->Xbox<br />On 2006-03-18 <b>PAL</b> released <b>Sonic Riders (c) Sega *FULLDVD* *PAL*</b><br />Size: 42x50MB<br /></description>
<pubDate>Sat, 18 Mar 2006 00:00:00 +0100</pubDate>
<guid>http://www.nforce.nl/index.php?nfoid=103575</guid>
<comments></comments>
</item>
<item>
OK, I have tried to remove the spaces with string.gsub, but they look like they are tabs. Can I match a pattern of tab characters?
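A likely reason "^%s+" seems to do nothing here: in Lua patterns, ^ anchors only at the very start of the whole string, so tabs at the start of later lines are never touched. Below is a minimal sketch of stripping indentation from every line, assuming the feed is held in one multi-line string. The sample text and the helper name stripLineIndent are made up for illustration.

```lua
-- Hypothetical sample: two indented lines, using both tabs and spaces.
local sample = "\t\t<title>First</title>\r\n \t<title>Second</title>"

-- "^%s+" only removes whitespace at the start of the whole string.
local onlyFirst = string.gsub(sample, "^%s+", "")

-- Strip spaces/tabs at the start of every line:
-- once at the start of the string, then after each line feed.
local function stripLineIndent(s)
  s = string.gsub(s, "^[ \t]+", "")
  s = string.gsub(s, "(\n)[ \t]+", "%1")
  return s
end

local cleaned = stripLineIndent(sample)
print(cleaned)
```

Tabs can be matched with "\t" (or with %s, which also covers spaces and line breaks), so the character class [ \t] is used here to leave the line breaks themselves alone.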
1) Nevermind. I am stupid.
2) Why don't you use the xml parser library?
In this case, he should use ^%s* (a lazy ^%s- on its own matches the empty string, so it would strip nothing).
If you want to be 100% sure then you should pull all between <item> & </item> and parse the tags that come up then, cause RSS is a standard in theory. :)
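As a rough sketch of that idea (not anyone's actual bot code), the loop below pulls out each <item> ... </item> block and then reads individual tags from it. The shortened feed text and the items table are assumptions for illustration; the gmatch alias only exists so the snippet runs on both Lua 5.0 and 5.1.

```lua
-- string.gmatch in Lua 5.1 and later, string.gfind in Lua 5.0.
local gmatch = string.gmatch or string.gfind

-- Hypothetical, shortened feed text.
local rss = [[
<channel>
  <item>
    <title>Sonic Riders</title>
    <link>http://www.nforce.nl/index.php?nfoid=103575</link>
  </item>
  <item>
    <title>Another Release</title>
    <link>http://www.nforce.nl/</link>
  </item>
</channel>]]

local items = {}
-- "(.-)" is lazy, so each match stops at the nearest </item>.
for body in gmatch(rss, "<item>(.-)</item>") do
  local _, _, title = string.find(body, "<title>(.-)</title>")
  local _, _, link = string.find(body, "<link>(.-)</link>")
  table.insert(items, { title = title, link = link })
end

for i, item in ipairs(items) do
  print(i, item.title, item.link)
end
```

Because only the text inside the tags is captured, the indentation in front of each tag never reaches the output, which sidesteps the tab problem entirely.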
In 5.1, string.gfind is called string.gmatch
for w in string.gfind(rss,"%<item%>(.-)%<%/item%>") do
RSS feeds are divided into items (at least 0.9x and 2.0) and this is a 2.0 feed, that's why I said so. Your pattern is perfect to parse what is between item tags.
BTW reading by lines is not always good, with this feed (http://media-cyber.law.harvard.edu/blogs/gems/tech/rss2sample.xml) it would fail. Not to disparage your code, Sir, just a (benign) warning.
First let me say thank you for the responses!
The examples you provided are similar to what I was doing... so
here is my actual code from my script...
-- create the separator between feeds
xFeedData=string.gsub(xFeedData, "<item>","\r\n"..string.rep("*",85).."\r\n")
-- remove the tags
xFeedData=string.gsub(xFeedData, "<([^>]-)>","")
-- remove the header
xFeedData=string.gsub(xFeedData, "HTTP/1(.-)/xml","")
-- remove any pipe characters
xFeedData=string.gsub(xFeedData, "|","I")
-- remove leading spaces
xFeedData=string.gsub(xFeedData, "^%s+","")
Everything else was working great... except no matter what I do, I can't get rid of the tabs.
I also tried this:
xFeedData=string.gsub(xFeedData, "<([^>]-)>","\r\n")
which works... but the string ends up with way too many blank lines, which I cannot seem to remove.
I might add that I am trying to minimize the amount of disk writes, and would like to parse the feed without saving to disk.
-St0ne db
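For the blank lines left behind when every tag is replaced with "\r\n", one possible approach is to collapse runs of line breaks in memory, with no disk writes involved. This is only a sketch: the sample text and the helper name squeezeBlankLines are invented for illustration.

```lua
-- Hypothetical fragment after tags were replaced with "\r\n".
local text = "\r\n\r\n\t\r\nSonic Riders\r\n\r\n  \r\nSize: 42x50MB\r\n\r\n"

local function squeezeBlankLines(s)
  -- Drop spaces/tabs next to line breaks so whitespace-only lines become empty.
  s = string.gsub(s, "[ \t]*([\r\n]+)[ \t]*", "%1")
  -- Collapse any run of CR/LF characters into a single CRLF.
  s = string.gsub(s, "[\r\n]+", "\r\n")
  -- Trim line breaks left at the very start and end.
  s = string.gsub(s, "^[\r\n]+", "")
  s = string.gsub(s, "[\r\n]+$", "")
  return s
end

local result = squeezeBlankLines(text)
print(result)
```

Spaces inside a line (such as the one in "Sonic Riders") are left alone, since the first pattern only matches whitespace that touches a line break.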
Tab=string.char(9), maybe you can use this. ;)
Quote from: bastya_elvtars on 19 March, 2006, 03:30:50
Tab=string.char(9), maybe you can use this. ;)
THANK YOU SO MUCH!!!!!
this works perfectly!!!!!
;D ;D ;D ;D ;D ;D ;D
Don't forget:
\r=string.char(13)
\n=string.char(10)
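Character codes are easy to mix up, so a quick check with string.byte can help. The short sketch below prints the codes for tab, line feed and carriage return, then removes every tab from a sample line; the variable names are illustrative only.

```lua
print(string.byte("\t"))  -- tab
print(string.byte("\n"))  -- line feed
print(string.byte("\r"))  -- carriage return

local Tab = string.char(9)
local line = Tab .. Tab .. "<title>NFOrce NFOs - Xbox</title>"
-- Remove every tab character, wherever it appears in the string.
local noTabs = string.gsub(line, Tab, "")
print(noTabs)
```

Writing "\t" directly in the pattern has the same effect as using string.char(9).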
I'm working on an RSS feeder for EntryBot and this is the parser I came up with:
RSSParser = function(rFeed)
    -- Get Link and User from #1 in Queue
    local Host, sUser, trig = RSSQueue(3)
    -- If found
    if sUser and Host and trig and rFeed then
        local sLine, sContent = "", ""
        local tTable= {
            -- [1]: patterns to strip from each item (pattern = replacement)
            [1] = {
                ["/a>"] = "", ["<"] = "", ["b>"] = "",
                ["/b>"] = "", [">"] = "", ["br /"] = "",
                ["/ br"] = "", ["a href="] = "", ["'"] = "",
                ["\""] = "", ["</a"] = "", ["<%!%[CDATA%["] = "",
                ["</(.-)>"] = "", ["<(.-)>"] = "", ["]]>"] = "", ["\t"] = "",
            },
            -- [2]: opening pattern = closing tag for each item block
            [2] = {
                ["<item>"] = "</item>", ["<item%s.->"] = "</item>",
            },
        }
        -- Create/Clear Host Cache
        tCache[Host] = {}
        -- For each pair in sub-table
        for a,b in pairs(tTable[2]) do
            -- Extract content between <item> and </item>
            for sItem in string.gfind(rFeed,a.."(.-)"..b) do
                -- string.gsub unwanted chars
                -- (pairs() order is unspecified, so replacements run in no fixed order)
                for i,v in pairs(tTable[1]) do sItem = string.gsub(sItem,i,v) end
                -- Insert sItem in Cache file
                table.insert(tCache[Host],sItem)
            end
        end
        -- Save Cache file
        SaveToFile(Settings.eFolder.."/"..Settings.cFile,tCache,"tCache")
        local user = GetItemByName(sUser)
        -- Write os.clock to RSS Cache
        RSS[trig]["Cache"].iTime = os.clock()
        -- Remove first user from queue, Save RSS file
        Queuer()
        -- If Host is cached
        if next(tCache[Host]) then
            -- Loop through specific host RSS feeds
            for i,v in ipairs(tCache[Host]) do sContent = sContent..v end
            -- If sUser is online
            if user then
                -- Send it
                user:SendData(Settings.sBot,"*** Your request for "..Host.." has been completed!")
                user:SendPM(Settings.sBot,"\r\n\r\n"..string.rep("- -",80).."\r\nFeed: "..Host.."\r\n"..sContent..
                "\r\n"..string.rep("- -",80).."\r\n")
            end
        -- Nothing cached: report the error, but only if the user is still online
        elseif user then
            user:SendData(Settings.sBot,"*** Error: An error occurred. Check your RSS please.")
        end
    end
end
PS: I just ripped it from there :)
Quote from: Mutor on 19 March, 2006, 05:09:20
All one needs to do is capture tag names and the text within.
I think a wisely set gfind [gmatch in 5.1] is all that is required, thereby
building a table as you go. Now the data is indexed, sortable and easily searchable.
I wouldn't gsub this at all; I was just providing an example of how one might, and why the
original pattern failed with this data format.
Yes, I was telling the same... string.gsub would in no way be my preferred choice. Now we finally agreed! :P
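To illustrate the capture-based approach both posters describe, here is a minimal sketch that captures each tag name together with its text and stores them in a table. It assumes simple, non-nested tags without attributes; the function name parseItem is made up, and the item body reuses values from the sample feed above.

```lua
-- string.gmatch in Lua 5.1 and later, string.gfind in Lua 5.0.
local gmatch = string.gmatch or string.gfind

-- Sample item body, as it might appear between <item> and </item>.
local itemBody = [[
<title>Sonic Riders (c) Sega *FULLDVD* *PAL* - PAL</title>
<link>http://www.nforce.nl/index.php?nfoid=103575</link>
<pubDate>Sat, 18 Mar 2006 00:00:00 +0100</pubDate>
<comments></comments>]]

local function parseItem(body)
  local fields = {}
  -- Capture a tag name, then lazily capture text up to the matching close tag.
  -- "%1" is a back-reference to the first capture (the tag name).
  for tag, text in gmatch(body, "<(%w+)>(.-)</%1>") do
    fields[tag] = text
  end
  return fields
end

local item = parseItem(itemBody)
print(item.title)
print(item.pubDate)
```

With the fields indexed by tag name, no gsub cleanup of tabs or leftover tags is needed, since whitespace between tags is never captured.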