This first edition was written for Lua 5.0. While still largely relevant for later versions, there are some differences.
The fourth edition targets Lua 5.3 and is available at Amazon and other bookstores.

10.2 – Markov Chain Algorithm

Our second example is an implementation of the Markov chain algorithm. The program generates random text, based on what words may follow a sequence of n previous words in a base text. For this implementation, we will use n=2.

The first part of the program reads the base text and builds a table that, for each prefix of two words, gives a list of the words that follow that prefix in the text. After building the table, the program uses it to generate random text, in which each word follows the two previous words with the same probability as in the base text. As a result, we have text that is very, but not quite, random. For instance, when applied to this book, the output of the program has pieces like "Constructors can also traverse a table constructor, then the parentheses in the following line does the whole file in a field n to store the contents of each function, but to show its only argument. If you want to find the maximum element in an array can return both the maximum value and continues showing the prompt and running the code. The following words are reserved and cannot be used to convert between degrees and radians."

We will encode each prefix as its two words concatenated with a space in between:

    function prefix (w1, w2)
      return w1 .. ' ' .. w2
    end

We use the string NOWORD ("\n") to initialize the prefix words and to mark the end of the text. For instance, for the following text

    the more we try the more we do

the table of following words would be

    { ["\n \n"] = {"the"},
      ["\n the"] = {"more"},
      ["the more"] = {"we", "we"},
      ["more we"] = {"try", "do"},
      ["we try"] = {"the"},
      ["try the"] = {"more"},
      ["we do"] = {"\n"},
    }
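As a quick check of the table above, the following sketch rebuilds it from the example sentence. It assumes Lua 5.1 or later, because it uses `string.gmatch` and the `#` operator, neither of which exists in Lua 5.0. The names `build`, `add`, and `tab` are illustrative and are not part of the program in this section.

```lua
-- Sketch only: assumes Lua 5.1+; `build` is a hypothetical helper.
local NOWORD = "\n"

local function prefix (w1, w2)
  return w1 .. ' ' .. w2
end

-- Build the table of following words for the text in `text`.
local function build (text)
  local tab = {}
  local function add (key, word)
    local list = tab[key]
    if not list then          -- first follower for this prefix
      list = {}
      tab[key] = list
    end
    list[#list + 1] = word    -- append; duplicates are kept
  end
  local w1, w2 = NOWORD, NOWORD
  for w in string.gmatch(text, "%w+") do
    add(prefix(w1, w2), w)
    w1, w2 = w2, w            -- slide the two-word window
  end
  add(prefix(w1, w2), NOWORD) -- mark the end of the text
  return tab
end

local tab = build("the more we try the more we do")
print(table.concat(tab["the more"], " "))   -- we we
print(table.concat(tab["more we"], " "))    -- try do
```

Note that "we" appears twice under "the more": keeping duplicates is what makes frequent followers proportionally more likely to be chosen later.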

The program keeps its table in the global variable statetab. To insert a new word in a prefix list of this table, we use the following function:

    function insert (index, value)
      if not statetab[index] then
        statetab[index] = {value}
      else
        table.insert(statetab[index], value)
      end
    end

It first checks whether that prefix already has a list; if not, it creates a new one with the new value. Otherwise, it uses the predefined function table.insert to insert the new value at the end of the existing list.
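To see both branches at work, the small sketch below calls insert twice with the same prefix and once with a new one. It declares statetab as a local so that the snippet runs on its own, and it uses the `#` length operator, which assumes Lua 5.1 or later.

```lua
-- Sketch only: assumes Lua 5.1+ for the `#` length operator.
local statetab = {}

local function insert (index, value)
  if not statetab[index] then
    statetab[index] = {value}               -- first word seen for this prefix
  else
    table.insert(statetab[index], value)    -- append to the existing list
  end
end

insert("the more", "we")    -- creates the list {"we"}
insert("the more", "we")    -- appends, giving {"we", "we"}
insert("more we", "try")    -- creates a second list

print(#statetab["the more"])   -- 2
print(#statetab["more we"])    -- 1
```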

To build the statetab table, we keep two variables, w1 and w2, with the last two words read. For each prefix, we keep a list of all words that follow it.

After building the table, the program starts to generate a text with at most MAXGEN words. First, it re-initializes variables w1 and w2. Then, for each prefix, it randomly chooses the next word from the list of valid next words, prints that word, and updates w1 and w2. Next we show the complete program.

    -- Markov Chain Program in Lua
    function allwords ()
      local line = io.read()    -- current line
      local pos = 1             -- current position in the line
      return function ()        -- iterator function
        while line do           -- repeat while there are lines
          local s, e = string.find(line, "%w+", pos)
          if s then      -- found a word?
            pos = e + 1  -- update next position
            return string.sub(line, s, e)   -- return the word
          else
            line = io.read()    -- word not found; try next line
            pos = 1             -- restart from first position
          end
        end
        return nil            -- no more lines: end of traversal
      end
    end
    function prefix (w1, w2)
      return w1 .. ' ' .. w2
    end
    local statetab
    function insert (index, value)
      if not statetab[index] then
        statetab[index] = {n=0}
      end
      table.insert(statetab[index], value)
    end
    local N  = 2
    local MAXGEN = 10000
    local NOWORD = "\n"
    -- build table
    statetab = {}
    local w1, w2 = NOWORD, NOWORD
    for w in allwords() do
      insert(prefix(w1, w2), w)
      w1 = w2; w2 = w;
    end
    insert(prefix(w1, w2), NOWORD)
    -- generate text
    w1 = NOWORD; w2 = NOWORD     -- reinitialize
    for i=1,MAXGEN do
      local list = statetab[prefix(w1, w2)]
      -- choose a random item from list
      local r = math.random(table.getn(list))
      local nextword = list[r]
      if nextword == NOWORD then return end
      io.write(nextword, " ")
      w1 = w2; w2 = nextword
    end
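The generation loop above relies on table.getn, which was deprecated in Lua 5.1 and removed in Lua 5.2. As a rough sketch of how the same step could look in Lua 5.1 or later, the version below wraps the loop in a function and uses the length operator `#` instead. The names `generate` and `choose` are illustrative, and the tiny table is made up so that every prefix has exactly one follower and the output does not depend on the random choice.

```lua
-- Sketch only: assumes Lua 5.1+; `generate` and `choose` are hypothetical names.
local NOWORD = "\n"

local function prefix (w1, w2)
  return w1 .. ' ' .. w2
end

-- Return a list of at most `maxgen` words generated from `statetab`.
-- `choose(n)` must return an integer in [1, n]; math.random fits.
local function generate (statetab, maxgen, choose)
  local out = {}
  local w1, w2 = NOWORD, NOWORD
  for i = 1, maxgen do
    local list = statetab[prefix(w1, w2)]
    local nextword = list[choose(#list)]   -- #list replaces table.getn(list)
    if nextword == NOWORD then break end   -- reached the end marker
    out[#out + 1] = nextword
    w1, w2 = w2, nextword
  end
  return out
end

-- Made-up table: each prefix has a single follower.
local statetab = {
  ["\n \n"]       = {"hello"},
  ["\n hello"]    = {"world"},
  ["hello world"] = {NOWORD},
}

local words = generate(statetab, 10, math.random)
print(table.concat(words, " "))   -- hello world
```

Passing the chooser as a parameter is only a convenience for testing; calling math.random directly inside the loop, as the original program does, works just as well.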