strsub vs. table lookup
 

News:

29 December 2022 - PtokaX 0.5.3.0 (20th anniversary edition) released...
11 April 2017 - PtokaX 0.5.2.2 released...
8 April 2015 Anti child and anti pedo pr0n scripts are not allowed anymore on this board!
28 September 2015 - PtokaX 0.5.2.1 for Windows 10 IoT released...
3 September 2015 - PtokaX 0.5.2.1 released...
16 August 2015 - PtokaX 0.5.2.0 released...
1 August 2015 - Crowdfunding for ADC protocol support in PtokaX ended. Clearly nobody want ADC support...
30 June 2015 - PtokaX 0.5.1.0 released...
30 April 2015 Crowdfunding for ADC protocol support in PtokaX
26 April 2015 New support hub!
20 February 2015 - PtokaX 0.5.0.3 released...
13 April 2014 - PtokaX 0.5.0.2 released...
23 March 2014 - PtokaX testing version 0.5.0.1 build 454 is available.
04 March 2014 - PtokaX.org sites were temporary down because of DDOS attacks and issues with hosting service provider.

Main Menu

strsub vs. table lookup

Started by c h i l l a, 13 January, 2004, 01:25:01

Previous topic - Next topic

0 Members and 1 Guest are viewing this topic.

c h i l l a

strtable = {
["$MyINFO"] = 1,
["$To:"] = 1,
["$Kick"] = 1,
["$ConnectToMe"] = 1,
["$RevConnectToMe"] = 1,
["$ValidateNick"] = 1,
["$Search"] =1,
["Other"] = 1,
["Other2"] = 1,

}

string = 0
table = 0

function TestStrings(line)
	local _,_,str = strfind(line,"^(%S+)")
	--print(str)
	if str then
		t1 = clock()
		if strsub(str,1,12) == "$ConnectToMe" then
		elseif strsub(str, 1, 7) == "$Search" then
		elseif strsub(str,1,5) == "$Kick" then
		elseif strsub(str,1,13) == "$ValidateNick" then
		elseif strsub(str,1,7) == "$MyINFO" then
		elseif strsub(str,1,4) == "$To:" then
		end
		t2 = clock()
		if strtable[str] then
		end
		t3 = clock()
	end
	string = string + (t2-t1)
	--print("STRSUB:   "..(t2-t1))
	--print("TABLE:   "..(t3-t2))
	table = table + (t3-t2)
end


function CreateString()
	for i = 1,1000000 do
		local word = strchar(random(65,90))
		len = random(3,30)
		for i = 1,len do
			word = word..strchar(random(97,122))
		end
		TestStrings(word)
	end
	print("STRSUB:   "..string.."     "..(string/1000000))
	print("TABLE:   "..table.."     "..(table/1000000))
end




> CreateString()
STRSUB:   3.25800000000163     3.25800000000163e-006
TABLE:   0.5000000000027285     5.000000000027285e-007
> CreateString()
STRSUB:   6.742999999995845     6.742999999995846e-006
TABLE:   0.9350000000040382     9.350000000040382e-007


pro's and contras are welcome.

NotRabidWombat

Your test is VERY biased towards the hash table being victorious. You create a random dc command which will most likely always be worst case scenario (ie: does not exist in the set you are searching through). So you perform 7 strsubs every single time you test that method. You should extend the test to have a set of actual results (chat, search, invalid, etc) that you randomly select and test.
The next problem is you are not taking advantage of one of the most powerful functions, strfind. You should Use strfind to grab the first word for each test and then do comparisons and table indexing.
However, you will find the hashing is MUCH faster than a switch style if/elseif/else statement.

Hashing has also been improved (collision handling I believe) in Lua 5.0. *hint* *hint* *hint*

-NotRabidWombat


I like childish behavior. Maybe this post will be deleted next.

c h i l l a

but its worst case for both methods, but I will only compare the str  from strfind, for beeing right.

Done.  Now they are both almost equal.
So i guess.. either hashing in LUA 5.0 is way faster than in 4.0, or else it actually doesn't matter.

here again the code

so you can check ;)

strtable = {
["$MyINFO"] = 1,
["$To:"] = 1,
["$Kick"] = 1,
["$ConnectToMe"] = 1,
["$RevConnectToMe"] = 1,
["$ValidateNick"] = 1,
["$Search"] =1,
["Other"] = 1,
["Other2"] = 1,

}

string = 0
table = 0
ttotal = 0

function TestStrings(line)
	tbegin = clock()
	local _,_,str = strfind(line,"^(%S+)")
	--print(str)
	if str then
		t1 = clock()
		if str == "$ConnectToMe" then
		elseif str == "$Search" then
		elseif str == "$Kick" then
		elseif str == "$ValidateNick" then
		elseif str == "$MyINFO" then
		elseif str == "$To:" then
		end
		t2 = clock()
		if strtable[str] then
			--print("yes")
		end
		t3 = clock()
	end
	string = string + (t2-t1)
	--print("STRSUB:   "..(t2-t1))
	--print("TABLE:   "..(t3-t2))
	table = table + (t3-t2)
	tend = clock()
	ttotal = ttotal + (tend-tbegin)
end

function CreateString()
	var = 0
	for _ = 1,10000 do
		for i1 = 1,10 do
			var = var + 1
			if i1 == 10 then
				TestStrings("$ConnectToMe")
			elseif i1 == 5 then
				TestStrings("$Search")
			else
				local word = strchar(random(65,90))
				len = random(3,30)
				for _ = 1,len do
					word = word..strchar(random(97,122))
				end
				TestStrings(word)
			end
		end
	end
	print("STRSUB:   "..string.."     "..(string/1000000))
	print("TABLE:   "..table.."     "..(table/1000000))
	print("TOTAL: "..ttotal)
	print(var)
	collectgarbage()
end

resuts:

> CreateString()
STRSUB:   0.3539999999999282     3.539999999999281e-007
TABLE:   0.2499999999996589     2.49999999999659e-007
TOTAL: 2.256999999999835
100000
> CreateString()
STRSUB:   0.3849999999999909     3.849999999999909e-007
TABLE:   0.280999999999608     2.80999999999608e-007
TOTAL: 2.583999999999946
100000

plop

chilla have you tryed it with 50+ elseif's???
here table indexing should show much more speed.
btw if your sneaky you can use str=gsub(line, 1, 3), if you look you only need the 1st 3 characters.

wombat if hashing has been improved on 5.0 i vote for your hint.
i love those thingys, i try 2 use them as much as posible on a.i..

plop
http://www.plop.nl lua scripts/howto\'s.
http://www.thegoldenangel.net
http://www.vikingshub.com
http://www.lua.org

>>----> he who fights hatred with hatred, drives the spreading of hatred <----<<

c h i l l a

table is getting faster. yepp.
but why strsub, when you can compare the first word,
maybe someone wants to do a test on strsub and strfind.

> CreateString()
STRSUB:   1.472999999994499     1.472999999994499e-006
TABLE:   0.2029999999995198     2.029999999995198e-007
TOTAL: 3.174999999995634
100000
> CreateString()
STRSUB:   1.658999999996013     1.658999999996013e-006
TABLE:   0.2179999999989377     2.179999999989377e-007
TOTAL: 3.671999999995023
100000

else I will stick to strfind now.

plop

strfind checks every character, so $RevConnectToMe whill take 16 checks and $To: only 5.
if you grab only the 1st 3 characters all whill use the same time.

plop
http://www.plop.nl lua scripts/howto\'s.
http://www.thegoldenangel.net
http://www.vikingshub.com
http://www.lua.org

>>----> he who fights hatred with hatred, drives the spreading of hatred <----<<

c h i l l a

#6
sounds logical ;).
Will improve LIS with this stuff.

NotRabidWombat

Yes, but most scripts should use strfind to get the DC command or the command from the chat. You would need the strfind for either method, so it should be outside of the time test.

The problem with using worst case for both is hashing will always beat a switch at worst case (if it deals with collisions well).

-NotRabidWombat


I like childish behavior. Maybe this post will be deleted next.

c h i l l a

its outside the strfind..

but truth is, a good strsub, can beat the strfind. almost, half of the time can be saved as shown...

but...  I know that Ptokax, has optimised its strfind, through strsub...  in c++ way of course...

but its not as secura as I found out.. because you can

connect to ptokax with  $MyInfo   but dc protocoll  is case sensitiva so you should get dissed, well you get by LIS, but only cause it checks, for the whole MyINFO.

But to do Cases, you can improve your script by using strsub, to filterout, only the data you need or not need.

test and see for yourself what suits best, it always depends on what you need.

plop

QuoteOriginally posted by NotRabidWombat
Yes, but most scripts should use strfind to get the DC command or the command from the chat. You would need the strfind for either method, so it should be outside of the time test.

The problem with using worst case for both is hashing will always beat a switch at worst case (if it deals with collisions well).

-NotRabidWombat
yep, would also be very boring if every1 would have 2 use the same length for a nick.
and all the commands the same length can result in some weird abreviation's. lol

if a string is always on the same place then gsub is the best 2 use.
but then again cmd = gsub(data, (strlen(user.sName)+1), -1) also works for cmd's without arguments. lol
but i think this is slower then strfind.

plop
http://www.plop.nl lua scripts/howto\'s.
http://www.thegoldenangel.net
http://www.vikingshub.com
http://www.lua.org

>>----> he who fights hatred with hatred, drives the spreading of hatred <----<<

SMF spam blocked by CleanTalk