Lua DDJ article

reprint from Dr. Dobb's Journal 21 #12 (Dec 1996) 26–33. Copyright © 1996 Miller Freeman, Inc.

Lua: an extensible embedded language
A few metamechanisms replace a host of features

by Luiz Henrique de Figueiredo, Roberto Ierusalimschy, and Waldemar Celes

In recent years, a multitude of little languages have been proposed for extending and customizing applications. In general, these extension languages should have the following attributes:

Since extension languages are not for writing large pieces of software, mechanisms for supporting programming-in-the-large, like static type checking and information hiding, are not essential.

Lua, the extensible, embedded language we present here, satisfies these requirements. Its syntax and control structures are simple and familiar. Lua is small—the whole implementation is less than 6000 lines of ANSI C. Besides the facilities common to most procedural languages, Lua has special features that make it a powerful high-level extensible language:

Lua is a general-purpose embedded programming language designed to support procedural programming with data-description facilities. Although it is not in the public domain (TeCGraf retains the copyright), Lua is freely available for both academic and commercial purposes at http://www.lua.org/. The distribution also includes a standard library of mathematical functions (sin, cos, and so on), I/O and system functions, and string-manipulation functions. This optional library adds some 1000 lines of code to the system. Also included are a debugger and a separate compiler that produces portable binary files containing bytecodes. The code compiles without change in most ANSI C compilers, including gcc (on AIX, IRIX, Linux, Solaris, SunOS, and ULTRIX), Turbo C (on DOS), Visual C++ (on Windows 3.1/95/NT), Think C (MacOS), and CodeWarrior (MacOS).

All external identifiers are prefixed with lua to avoid name clashing when linking with applications. Even the code generated by yacc passes through a sed filter to comply with this rule, so that it is possible to link Lua with applications that use yacc for other purposes.

The Lua Implementation

Lua is provided as a small library of C functions to be linked to host applications. For example, the simplest Lua client is the interactive, stand-alone interpreter in Listing One. In this program, the function lua_dostring calls the interpreter over a section of code contained in a string. Each chunk of Lua code may contain a mixture of statements and function definitions.

The header file lua.h defines Lua's API, which has about 30 functions. Besides lua_dostring, there is a lua_dofile function to interpret Lua code contained in files, lua_getglobal and lua_setglobal to manipulate Lua global variables, lua_call to call Lua functions, lua_register to make C functions accessible from Lua, and so on.

Lua has a syntax somewhat similar to Pascal. To avoid dangling elses, control structures like ifs and whiles finish with an explicit end. Comments follow the Ada convention, starting with "--" and run until the end of the line. Lua supports multiple assignment; for example, x, y = y, x swaps the values of x and y. Likewise, functions can return multiple values.

Lua is a dynamically typed language. This means that values have types but variables don't, so there are no type or variable declarations. Internally, each value has a tag that identifies its type; the tag can be queried at run time with the built-in function type. Variables are typeless and can hold values of any type. Lua's garbage collection keeps track of which values are being used, discarding those that are not.

Lua provides the types nil, string, number, user data, function, and table. nil is the type of the value nil; its main property is that it is different from any other value. This is handy to use as the initial value of variables, for instance. The type number represents floating-point real numbers. string has the usual meaning. Type user data corresponds to a generic void* pointer in C, and represents host objects in Lua. All of these types are useful, but the flexibility of Lua is due to functions and tables, the product of two key lessons from Lisp and Scheme:

Function values in Lua can be stored into variables, passed as parameters to other functions, stored in tables, and the like.

When you declare a function in Lua (see Listing Two), the function body is precompiled into bytecodes, creating a function value. This value is assigned to a global variable with the given name. C functions, on the other hand, are provided by the host program through an appropriate call to the API. Lua cannot call C functions that have not been registered by their host. Therefore, the host has complete control over what a Lua program can do, including any potentially dangerous access to the operating system.

Tables

Tables are for Lua what lists are for Lisp: powerful data-structuring mechanisms. A table in Lua is similar to an associative array. Associative arrays can be indexed with values of any type, not just numbers.

Many algorithms become trivial when implemented with associative arrays, because the data structures and algorithms for searching them are implicitly provided by the language. Lua implements associative arrays as hash tables.

Unlike other languages that implement associative arrays, tables in Lua are not bound to a variable name. Instead, they are dynamically created objects that can be manipulated much like pointers in conventional languages. In other words, tables are objects, not values. Variables do not contain tables, only references to them. Assignment, parameter passing, and function returns always manipulate references to tables, and do not imply any kind of copy. While this means that a table must be explicitly created before it is used, it also allows tables to freely refer to other tables. So, tables in Lua can be used to represent recursive data types and to create generic graph structures, even those with cycles.

Tables simulate records simply by using field names as indices. Lua makes this easier by providing a.name as syntactic sugar for a["name"]. Sets also can be easily implemented by storing their elements as indices of a table. Note that tables (and therefore sets) need not be homogeneous; they can store values of all types simultaneously, including functions and tables.

Lua provides a constructor, a special kind of expression to create tables, that is handy for initializing lists, arrays, records, etc. See Example 1.

User-Defined Constructors

Sometimes you need finer control over the data structures you are building. Following the philosophy of providing only a few general metamechanisms, Lua provides user-defined constructors. These constructors are written name{...}, which is a more intuitive version of name({...}). In other words, with such a constructor, a table is created, initialized, and passed as a parameter to a function. This function can do whatever initialization is needed, such as dynamic type checking, initialization of absent fields, and auxiliary data-structure update, even in the host program.

User-defined constructors can be used to provide higher-level abstractions. So, in an environment with proper definitions, you can write window1=Window{x=200, y=300, color="blue"} and think about "windows," not plain tables. Moreover, because constructors are expressions, they can be nested to describe more complex structures in a declarative style, as in Listing Four.

Object-Oriented Programming

Because functions are first-class values, table fields can refer to functions. This is a step toward object-oriented programming, and one made easier by simpler syntax for defining and calling methods.

A method definition is written as Example 2(a), which is equivalent to Example 2(b). In other words, defining a method is equivalent to defining a function, with a hidden first parameter called self and storing the function in a table field.

A method call is written as receiver: method(params), which is translated to receiver.method(receiver,params). The receiver of the method is passed as the first argument of the method, giving the expected meaning to the parameter self.

These constructions do not provide information hiding, so purists may (correctly) claim that an important part of object orientation is missing. Moreover, Lua does not provide classes; each object carries its own method-dispatch tables. Nevertheless, these constructions are extremely light, and classes can be simulated using inheritance, as is common in other prototype-based languages, such as Self.

Fallbacks

Because Lua is an untyped language, many abnormal run-time events can happen: arithmetic operations being applied to nonnumerical operands, nontable values being indexed, nonfunction values being called. In typed, stand-alone languages, some of these conditions are flagged by the compiler; others result in aborting the program at run time. It's rude for an embedded language to abort its host program, so embedded languages usually provide hooks for error handling.

In Lua, these hooks are called "fallbacks" and are also used for handling situations that are not strictly error conditions, such as accessing an absent field in a table and signaling garbage collection. Lua provides default fallback handlers, but you can set your own handlers by calling the built-in function setfallback with two arguments: a string identifying the fallback condition (see Table 1), and the function to be called whenever the condition occurs. setfallback returns the old fallback function, so you can chain fallback handlers if necessary

Inheritance via Fallbacks

One of the most interesting uses of fallbacks is in implementing inheritance in Lua. Simple inheritance allows an object to look for the value of an absent field in another object called its "parent;" in particular, this field can be a method. This mechanism is a kind of object inheritance, in contrast to the more traditional class inheritance adopted in Smalltalk and C++.

One way to implement simple inheritance in Lua is to store the parent object in a distinct field, parent for instance, and set an "index" fallback function; see Listing Five. This code defines a function Inherit and sets it as the index fallback. Whenever Lua attempts to access a field that is absent in an object, the fallback mechanism calls the function Inherit. This function first checks whether the object has a field parent containing a table value. If so, it attempts to access the desired field in the parent object. If the field is not present in the parent, the fallback is automatically called again. This process is repeated "upwards" until a value for the field is found or the parent chain ends. When better performance is needed, the same inheritance scheme can be implemented in C using Lua's API.

Reflexive Facilities

Being an interpreted language, Lua provides some reflexive facilities. One example is the function type, already mentioned. Other powerful reflexive functions are next, which traverses a table, and nextvar, which traverses all global variables. The function next gets two arguments, a table and an index in the table, and returns a "next" index in some implementation-dependent order. (Recall that tables are implemented as hash tables.) It also returns the value associated with the index in the table. (Recall that functions in Lua can return multiple values.) The function nextvar has similar behavior, but it traverses the global variables instead of the indices of a table.

An interesting example using reflexivity is dynamic typing. As noted before, Lua has no static typing. However, sometimes it is useful to check if a given value has the correct type to prevent weird behavior in a program. It's easy to check simple types with type. But for tables, we must check whether all fields are present and correctly filled.

Using Lua's data-description facilities, you can use values to describe types: A single type is described by its name, and a table type is described by a table that maps each field to its required type (Listing Six). Given such descriptions, you can write a single, polymorphic function that checks whether a value has a given type; see Listing Seven.

Reflexive facilities also allow a program to manipulate its own environment. For instance, a program can create a "protected environment" in which to run another piece of code. This situation is common in agent-based applications, when a host runs untrusted code received through the Internet (for instance, Web pages with executable content, a trendy thing in these days of Java). Some extension languages have to provide specific support for safe execution, but Lua is flexible enough to do this using the language itself. Listing Eight shows how the whole global environment can be saved in a table. A similar function restores a saved environment. Because all functions are first-class values assigned to variables, it is trivial to remove a function from the global environment. Listing Nine shows a function that runs a piece of code in a protected environment.

Binding Tk to Lua

A natural use of Lua is in the description of GUIs—where you need facilities to describe hierarchies of objects (widgets) and to bind user actions to them. Lua is suitable for such tasks because it combines data-description mechanisms with simple, powerful, and extensible semantics. Indeed, we have developed several UI toolkits with Lua.

Although Tk is a versatile GUI toolkit, Tcl isn't the kind of language that everyone feels comfortable with. Since we regard Lua as an alternative to Tcl, we decided to implement a Tk/Lua binding, allowing access to Tk widgets from Lua.

As far as possible, we preserved Tk's philosophy, including widget names, attributes, and commands. Everyone knows how tempting it is to try to improve an existing API, but in the long run, leaving it alone is better for Tk users, because they do not have to learn new concepts. (It's also better for us, because we don't have to write a new manual!)

Creating Tk/Lua Widgets

We have mapped all Tk widgets to Lua. You create a widget describing its attributes, using Lua table constructors. For instance, Example 3(a) creates a button and stores it in b; b is now an object that represents the button. After this definition, you can use ordinary Lua syntax to manipulate the object. Hence, the assignment b.label="Hello world from Lua!" changes the button's label and updates its image if it is already displayed on the screen. A text-entry widget limited to 20 characters can be created with e=entry{width=20}. After this widget is mapped on a displayed window, e.current contains the current value assigned by the user to the widget. (We use the current field to store the widget value, instead of using global variables as in Tcl/Tk.)

Widgets are not automatically mapped on a window. Unlike the Tk/Tcl environment, there isn't the concept of a current window in Tk/Lua. You must create a window (which can be the main window or a top-level widget) to hold the other widgets, and then explicitly map it onto the screen; see Example 3(b).

In this way, users can freely describe their dialogs, even cross-referencing widgets and mapping them when necessary. We have also eliminated the need for packing widgets explicitly, because it is more natural to specify layouts in a descriptive manner. Thus, the window (main and top-level) and frame widgets are used as containers that automatically pack their contents. For instance, to show a message with two bottom buttons, you can write Example 3(c). Besides all regular Tk widgets, we have also implemented two additional canvases, one using a simplified API to Xlib, and another using OpenGL.

Almost all functions provided by these libraries were mapped to Lua. So you can create sophisticated graphical applications that use direct manipulation on custom widgets—solely in Lua.

Accessing Widget Commands

All Tk widget commands are implemented as object methods in Tk/Lua. Their names, parameters, and functionality were preserved. If lb represents a listbox widget, then lb:insert("New item") inserts a new item in the list, following the Tk insert command for listboxes. On the other hand, the most-used Tk widget command, configure, is no longer needed because its effect is now obtained with simple assignments.

The main and top-level widgets inherit the methods from the window manager. If w represents a window, then w:iconify() has its usual effect.

Behind the Scenes

Implementing Tk/Lua wasn't hard. Using the C interface to Tcl/Tk, we created a service provider and registered it to be accessed from Lua. The Lua code that implements the binding uses an object-oriented approach, with the index fallback, as mentioned earlier. Each widget instance inherits from a class object, with the widget class at the top of the hierarchy. This class provides standard methods used by all widgets. Listing Ten shows the definition of this generic class and its method to set the focus to a widget. It also shows the button class definition.

As you now know, each widget is created with a table constructor. The constructor sets the instance class, creates the widget, and stores it in a global array. However, we also use a small trick—instead of returning the new table, the constructor returns the widget location as a numeric ID (Listing Eleven).

Hence, when Lua tries to index a widget, as in b.label, it calls a fallback, because numbers cannot be indexed. This trick gives us complete control over widget semantics. For instance, if b is a button (actually, b stores an ID) and you set b.label = "New label," then the fallback is responsible for calling the appropriate service command to update the widget.

Listing Twelve shows the Tk/Lua "settable" fallback function. This fallback is called at every attempt to index a nontable value. First, we check whether the first parameter corresponds to a valid widget ID. If it does, then we retrieve the widget table by accessing the global array. Otherwise, we dispatch the occurrence to the previously registered fallback.

The widgets in table tklua_IDtable have an internal field, called tkname, that stores the corresponding Tk widget name. This name is used to invoke Tk commands. We check whether there is a corresponding Tk widget and whether the indexing value is a valid Tk attribute. If so, we ask the service provider to change the widget attribute (calling the registered C function tklua_configure). The assignment h[f]=v assures us that we can use the widget table to store values besides Tk attributes.

Implementing the "gettable" fallback is similar. In addition to these two fallbacks, Tk/Lua also uses the index fallback to implement inheritance (Listing Five) and the "function" fallback to call widget commands or window-manager commands.

Conclusion

Extension languages are always interpreted, in one way or another. Simple extension languages can be interpreted directly from source code. On the other hand, embedded languages are usually powerful programming languages with complex syntax and semantics. A more-efficient implementation technique for embedded languages is now standard: Design a virtual machine suited to the needs of the language, compile extension programs into bytecodes for this machine, and then simulate the virtual machine by interpreting the bytecodes. We have chosen this hybrid architecture for implementing Lua because lexical and syntactical analysis are done only once, resulting in faster execution. Also, it allows extension programs to be provided in precompiled bytecode form only, resulting in faster loading and safer environments.

Example 1: The table definitions in (a) are equivalent to (b).

(a)
     t = {}  -- empty table
     t[1] = i
     t[2] = i*2
     t[3] = i*3
     t[4] = i+j

     s = {}
     s.a = x   -- same as s["a"] = x
     s.b = y

(b)
     t = {i, i*2, i*3, i+j}
     s = {a=x, b=y}

Example 2: The method definition in (a) is equivalent to (b).

(a)
     function object:method(params)
       ...
     end

(b)
     function object.method(self, params)
       ...
     end

Example 3: (a) Creating and storing a button; (b) explicitly mapping a window onto the screen; (c) showing a message with two bottom buttons.

(a)
     b = button{label = "Hello world!"
                command = "exit(0)"
               }


(b)
     w = toplevel{b}
     w:show()

(c)
     b1 = button{label="Yes", command="yes=1"}
     b2 = button{label="No", command="yes=0"}
     w  = toplevel{message{text="Overwrite file?"},
                   frame{b1, b2; side="left"};
                   side="top"
                  }

Table 1: Fallback conditions.

String Condition
"arith" Arithmetic on invalid operands.
"order" Order comparison on invalid operands.
"concat" String concatenation on invalid operands.
"getglobal" Reading the value of a global variable that has not been defined.
"index" Retrieving the value of an index not present in a table.
"gettable" Reading the value of an index in a nontable value.
"settable" Writing the value of an index in a nontable value.
"function" Calling a nonfunction value.
"gc" Called during garbage collection for each table being collected.
"error" Called when a fatal error occurs.

Listing One

#include <stdio.h>
#include "lua.h"

int main()
{
 char line[BUFSIZ];
 while (fgets(line,sizeof(line),stdin)!=0)
   lua_dostring(line);
 return 0;
}

Listing Two

function map(list, func)
  local newlist = {}
  local i = 1
  while list[i] do
    newlist[i] = func(list[i])
    i = i+1
  end
  return newlist
end

Listing Three

list = {}
i = 4
while i >= 1 do
   list = {head=i,tail=list}
   i = i-1
end

Listing Four

S = Separator{
     drawStyle = DrawStyle{style = FILLED},
     material =  Material{
       ambientColor  = {0.377, 0.377, 0.377},
       diffuseColor  = {0.800, 0.771, 0.093},
       emissiveColor = {0.102, 0.102, 0.102},
       specularColor = {0.0, 0.0, 0.0}
     },

     transform = Transform{
       translation = {64.293, 20.206, 0.0},
       rotation    = {0.0, 0.0, 0.0, 0.0}
     },
     shape = Sphere{radius = 10.0}
   }

Listing Five

function Inherit(t,f)
  if f == "parent" then  -- avoid loops
    return nil
  end
  local p = t.parent
  if type(p) == "table" then
    return p[f]
  else
    return nil
  end
end

setfallback("index", Inherit)

Listing Six

TNumber="number"
TPoint={x=TNumber, y=TNumber}
TColor={red=TNumber, blue=TNumber, green=TNumber}
TRectangle={topleft=TPoint, botright=TPoint}
TWindow={title="string", bounds=TRectangle, color=TColor}

Listing Seven

function checkType(d, t)
  if type(t) == "string" then
    -- t is the name of a type
    return (type(d) == t)
  else
    -- t is a table, so d must also be a table
    if type(d) ~= "table" then
      return nil
    else
      -- d is also a table; check its fields
      local i,v = next(t,nil)
      while i do
        if not checkType(d[i],v) then
          return nil
        end
        i,v = next(t,i)
      end
    end
  end
  return 1
end

Listing Eight

function save()
  -- create table to hold environment

  local env = {}
  -- get first global var and its value
  local n, v = nextvar(nil)
  while n do
    -- save global variable in table
    env[n] = v
    -- get next global var and its value
    n, v = nextvar(n)
  end
  return env
end

Listing Nine

function runProtected(code)
  -- save current environment
  local oldenv = save()
  -- erase "dangerous" functions
  readfrom,writeto,execute = nil,nil,nil
  -- run untrusted code
  dostring(code)
  -- restore original environment
  restore(oldenv)
end

Listing Ten

widgetClass = {}
function widgetClass:focus()
 if self.tkname then
  tklua_setFocus(self.tkname)
 end
end
buttonClass = {
 parent = widgetClass,
 tkwidget = "button"
}

Listing Eleven

function button(self)
 self.parent = classButton
 tklua_ID = tklua_ID + 1
 tklua_IDtable[tklua_ID] = self
 return tklua_ID
end

Listing Twelve

function setFB(id, f, v)
 local h = tklua_IDtable[id]
 if h == nil then
  old_setFB(id,f,v)
  return
 end
 if h.tkname and h:isAttrib(f) then
  tklua_configure(h.tkname,f,v)
 end
 h[f] = v

end
old_setFB = setfallback("settable",setFB)