Prefix Tree Optimization

To all lua gurus including Ansca staff:

I'm trying to implement a prefix tree (trie) to be used as my dictionary for a word game. It loads around 260k+ words from a flat file. Although I was able to build a program that works, it is slow. Running via simulator takes around 2.6 seconds to load and build the trie. In my iPod Touch 2G, it takes 40 seconds. I want this number to go down to about <10 seconds of loading time in the device.

So why am I doing this and not just use a simple SELECT statement in SQLite to check if words exist? Yes I could do that but I also want a find-all-possible-words solver at the end of each level. Got the idea from a discussion at StackOverflow.

Questions:
- How do I optimize the trie to reduce loading time?
- Does the bottleneck occur in the trie algorithm? Or is the io.lines() too slow? Or is the list just too big?

Many thanks for the help! Get the SOWPODS list here.

trie.lua

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
module(..., package.seeall)
 
function createTrie()
 
    -- PRIVATE
    local trie = {}
 
    local function addNode(word, position, node, wordLength)
        if position <= wordLength then
            local letter = string.sub(word, position, position)
            local child = node[letter]
 
            if child == nil then
                child = {}
                node[letter] = child
            end
 
            if position < wordLength then
                addNode(word, (position + 1), child, wordLength)
            else
                child.isWord = true
            end
        else
            return
        end
    end
 
    -- PUBLIC
    function trie:addWord(word)
        local letter = string.sub(word, 1, 1)
        local node = trie[letter]
        local wordLength = string.len(word)
 
        if node == nil then
            node = {}
            trie[letter] = node
        end
 
        addNode(word, 2, node, wordLength)
    end
 
    return trie
end

jon....don't parse this words file at all.......let Lua do it natively by storing it in table format as so:

return (word1 = true, word2 = true, ... word260000 = true}

when you require the above file, you instantly have a lua table in memory with zero code to run.

You can't get any faster than that

dgaedcke, though I was not convinced by your example, your explanation was spot on and gave me a very good idea. Dictionary now loading 5.3 seconds on the device! Woot!

Thanks!

Jon

Please share your solution........if you've found a faster way than my suggestion, I'll bow at your feet

btw, I just noticed that my first parenthesis was ( when it should have been { (a table constructor)

You were absolutely right that the dictionary should be a lua table to begin with. But I still need it as a trie structure. Your example was not what I was looking for. So what I did was still parse the text file, create the trie and physically write that trie into another lua file. This new file needs no extra processing and can be imported as a native lua table as you suggested.

Here is a snippet of what my trie looks like:

1
local trie = {A={A={L={I={I={S={isWord=true},isWord=true}},S={isWord=true},isWord=true},H={E={D={isWord=true}}...

Mine sharing the code that you used to create that? I'm also making a word game and need the same exact thing.

views:1490 update:2011/9/29 19:21:19
corona forums © 2003-2011