This document lists the set of patches we need to make to the lua interpreter. I'm doing my best to keep the set of patches to a minimum, and to keep the patches as simple as possible.
## Error Line Number Patch
Standard Lua is pretty inconsistent about how it reports the line number and filename of errors:
- Some errors have the line number and filename right in the error message. Some don't.
- Some errors generate stack tracebacks, some don't.
- When there's a traceback, the file and line number in the error message are redundant.
The goal of this patch is to achieve a more consistent state of affairs. First, we follow the rule that every error must include a stack traceback. Doing this doesn't require any patches to lua. It just requires consistent use of Lua's traceback facilities at every entry point into lua.
Second, since the traceback will always contain the file and line number, and since there will always be a traceback, there's no reason to put the file and line number into the error message itself. We have taken steps to remove that feature. To accomplish this, the following patches to lua were made:
- removed calls to 'lua_where' (replacing them with 'lua_no_where')
- removed calls to 'addinfo' (replacing them with 'no_addinfo')
This patch is live and functioning.
## Print Integers Patch
Lua numbers are actually double-precision floating point. There is no separate "int" data type. If you try to store an integer in lua, it gets stored as a double-precision float. That's usually just fine. Double precision can store integers up to 53 bits losslessly and precisely.
When you use the builtin lua function "tostring" to convert a number to a string, the conversion is actually implemented using 'sprintf', using the format directive in LUA_NUMBER_FMT, which by default is "%.14g". As you can see, that's a number format for double-precision numbers, which makes sense, because lua numbers *are* double-precision floats.
That particular format has the effect of printing large integers (15 digits or more) in scientific notation. I found that to be problematic, because you can't see all the digits. It's a very bad format for printing, say, unique ID numbers.
So, I changed the default format from "%.14g" to "%.16g". That may seem like a trivial change, but it means that every integer up to 53 bits, which is the largest integer that can be losslessly stored in a double, is printed as a simple integer, showing all the digits.
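To make the difference concrete, here is a minimal standalone C sketch that formats the same double with both directives. It does not touch the Lua sources; the helper names (`format_14g`, `format_16g`, `uses_exponent`) are made up for illustration, and only the two format strings come from the description above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format a double the way the stock LUA_NUMBER_FMT ("%.14g") would. */
static int format_14g(char *buf, size_t size, double value) {
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%.14g", value);
}

/* Format a double with the patched directive ("%.16g"). */
static int format_16g(char *buf, size_t size, double value) {
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%.16g", value);
}

/* Returns nonzero if the formatted text fell back to scientific notation. */
static int uses_exponent(const char *text) {
    return strchr(text, 'e') != NULL;
}
```

For 2^53 (9007199254740992.0), `format_14g` produces `9.007199254741e+15`, while `format_16g` produces `9007199254740992`.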
The 'print', 'pprint', and 'tostring' routines built into our lua interpreter do not use this builtin functionality. We have rewritten all of those functions from scratch, to take precise control over what is printed. None of those functions relies on the builtin *sprintf*.
The patch is probably still live, though. The builtin *sprintf* probably does get called implicitly from inside certain lua error message generating routines. It's just of very minor importance these days.
## The Generalized Less-Than Patch
The standard lua less-than operator will throw an error if you try to compare two objects of different types, or if you try to compare two tables, two functions, or two threads.
This patch adds the lua function *genlt* and the C function *lua_genlt*. This is a generalized less-than operator to compare any two lua objects. It always returns true or false, and never generates an error. Here is how comparison works:
- boolean: false is less than true.
- number: ordered by ascending value.
- string: ordered lexically.
- table: all tables are equal (i.e., not less than).
- functions: all functions are equal (i.e., not less than).
- threads: all threads are equal (i.e., not less than).
- two different types: they are ordered nil, bool, number, string, table, function, thread.
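The rules above can be sketched in standalone C over a toy tagged value. This is not the real *lua_genlt*, which works on Lua's internal value representation; the `Value` struct, the `v_*` constructors, and the name `genlt_sketch` are all hypothetical, and using `strcmp` for "lexical" string order is an assumption.

```c
#include <assert.h>
#include <string.h>

/* Type tags, declared in the cross-type order listed above. */
typedef enum {
    T_NIL, T_BOOL, T_NUMBER, T_STRING, T_TABLE, T_FUNCTION, T_THREAD
} Type;

/* Toy stand-in for a Lua value (hypothetical, not Lua's TValue). */
typedef struct {
    Type type;
    union { int b; double n; const char *s; } u;
} Value;

static Value v_nil(void)          { Value v; v.type = T_NIL;    v.u.b = 0; return v; }
static Value v_bool(int b)        { Value v; v.type = T_BOOL;   v.u.b = b; return v; }
static Value v_num(double n)      { Value v; v.type = T_NUMBER; v.u.n = n; return v; }
static Value v_str(const char *s) { Value v; v.type = T_STRING; v.u.s = s; return v; }
static Value v_table(void)        { Value v; v.type = T_TABLE;  v.u.b = 0; return v; }

/* Generalized less-than: always returns 0 or 1, never signals an error. */
static int genlt_sketch(const Value *a, const Value *b) {
    assert(a != NULL && b != NULL);
    if (a->type != b->type)
        return a->type < b->type;              /* order by type tag */
    switch (a->type) {
    case T_BOOL:   return !a->u.b && b->u.b;   /* false < true */
    case T_NUMBER: return a->u.n < b->u.n;     /* ascending value */
    case T_STRING: return strcmp(a->u.s, b->u.s) < 0;
    default:       return 0;  /* nil, tables, functions, threads: never less */
    }
}
```

Because same-type tables, functions, and threads never compare as less than each other, a sort using this comparison leaves their relative order up to the sort algorithm.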
This patch is live and functioning. The generalized less-than operator is quite useful as the second parameter to lua's builtin *table.sort* function. We have also provided an iterator *table.sortedpairs*, similar to the lua builtin *pairs*, that iterates over a table in sorted order. This implicitly uses the *genlt* comparison operator.
## The Deterministic Table Iteration Patch
This patch is designed to address the nondeterminism of the lua 'next' iterator. In the original Lua design, table iteration was nondeterministic. By that, I mean that in the original lua, I can create two empty tables and then perform the same sequence of insertions and deletions on both of them. The two tables are identical: they have the same keys, and they've had the same sequence of operations applied to them. But when I iterate over them using "pairs", they can produce their keys in two different orders. That's nondeterminism.
We can't allow nondeterminism in our version of Lua. To fix this was a big deal: we had to change the internal representation of lua tables. Internally, a lua table now contains a separate "sequence" vector which works as follows:
- When you insert a new key into a table (by mapping that key to non-nil), the new key is appended to the sequence.
- When you remove a key from a table (by mapping that key to nil), that key is removed from the sequence, creating a gap. The last key in the sequence is moved to fill the gap.
- When you call the iterator, next(table, k), it finds k in the sequence, and then returns the key immediately after k in the sequence.
Importantly, the sequence is deterministic - given a fixed set of operations on a table, it will always be in the same order. Therefore the *next* iterator and the *lua_next* function are now both deterministic, as is the *pairs* iterator.
There's one special case. The standard lua 'next' iterator can find a successor to a key that was *recently deleted*. This functionality is necessary in order to be able to iterate over a table and delete elements while iterating. We support this same functionality, but only for the single most recently deleted key. That is still sufficient to allow deleting table elements while iterating.
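The sequence behavior described above can be sketched in standalone C. This is only an illustration, not the patched table code: keys are plain non-negative ints, capacity is fixed, lookup is a linear scan, and all names (`Sequence`, `seq_insert`, `seq_remove`, `seq_next`, `NO_KEY`) are hypothetical. In particular, remembering the slot of the last deleted key is my guess at one way the "most recently deleted key" case could work.

```c
#include <assert.h>
#include <stddef.h>

#define SEQ_CAP 64
#define NO_KEY  (-1)   /* sentinel: start of iteration / no more keys */

typedef struct {
    int    keys[SEQ_CAP];
    size_t len;
    int    last_deleted;      /* most recently removed key, or NO_KEY */
    size_t last_deleted_pos;  /* slot that key used to occupy */
} Sequence;

static void seq_init(Sequence *s) {
    s->len = 0;
    s->last_deleted = NO_KEY;
    s->last_deleted_pos = 0;
}

/* Linear search; returns s->len when the key is absent. */
static size_t seq_find(const Sequence *s, int key) {
    size_t i;
    for (i = 0; i < s->len; i++)
        if (s->keys[i] == key) break;
    return i;
}

/* Inserting a new key appends it to the sequence. */
static void seq_insert(Sequence *s, int key) {
    assert(key != NO_KEY && s->len < SEQ_CAP);
    if (seq_find(s, key) != s->len) return;   /* already present */
    s->keys[s->len++] = key;
    s->last_deleted = NO_KEY;
}

/* Removing a key moves the last key into the gap it leaves. */
static void seq_remove(Sequence *s, int key) {
    size_t i = seq_find(s, key);
    if (i == s->len) return;                  /* not present */
    s->len--;
    s->keys[i] = s->keys[s->len];
    s->last_deleted = key;
    s->last_deleted_pos = i;
}

/* Successor of key, or NO_KEY at the end. Pass NO_KEY to start. */
static int seq_next(const Sequence *s, int key) {
    size_t i;
    if (key == NO_KEY) {
        i = 0;
    } else {
        size_t pos = seq_find(s, key);
        if (pos != s->len)
            i = pos + 1;                      /* live key: next slot */
        else if (key == s->last_deleted)
            i = s->last_deleted_pos;          /* resume at the filled gap */
        else
            return NO_KEY;                    /* unknown key */
    }
    return i < s->len ? s->keys[i] : NO_KEY;
}
```

In this sketch, inserting 10, 20, 30, 40 and then removing 20 leaves the sequence as 10, 40, 30. Asking for the successor of the deleted key 20 returns 40, the key that was moved into its slot, so an iteration that deletes the current key still visits every remaining key once.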
This patch is live and is used implicitly whenever you iterate over a lua table.
## The luaL_tolstring Patch (Unimplemented)
The lua library function luaL_tolstring converts a lua object into a string, no matter what the type of the lua object is. In the case of tables, the string it generates looks like "table: 0123456701234567", with the address of the table being part of the string. That would make any code that uses that string dependent on the precise address of the table. That's considered nondeterminism for our purposes, so it's not allowed.
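To illustrate why the address is a problem, here is a small standalone C sketch that builds a string in the same general shape, using a "%p"-style pointer format. It is not the lauxlib code, and the helper names (`table_name`, `names_differ`) are hypothetical. The exact text that `%p` prints is implementation-defined.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build a "table: <address>" string for the object at addr. */
static void table_name(char *buf, size_t size, void *addr) {
    assert(buf != NULL && size > 0);
    snprintf(buf, size, "table: %p", addr);
}

/* Returns nonzero if the two objects get different names. */
static int names_differ(void *a, void *b) {
    char x[64], y[64];
    table_name(x, sizeof x, a);
    table_name(y, sizeof y, b);
    return strcmp(x, y) != 0;
}
```

Two objects with identical contents still get different names, and the name of a given object can change from one run to the next depending on where the allocator places it.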
We don't use luaL_tolstring in the Luprex codebase. However, the luaL_tolstring function is used from within the lua runtime in three different places:
- In the lua function *string.format*. This is no longer an issue. We wrote our own version of format which replaces the built-in version. Our implementation doesn't use luaL_tolstring.
- In the builtin function *tostring*. This is no longer an issue. We wrote our own version of tostring which replaces the built-in version. Our implementation doesn't use luaL_tolstring.
- In the eris runtime. It is using luaL_tolstring to generate keys that become part of an associative map.
So the obvious thing to do would be to just change the code for luaL_tolstring to not include the table address. But that would totally break the eris code.
Another possibility would be to patch or override the code for string.format. That fixes the problem, but it leaves luaL_tolstring in there as a time-bomb where somebody else might use it not realizing that it's an issue.
There's no obvious approach to fixing this, so I haven't patched it yet.
## The GC Finalizer Patch and the Weak Table Patch (Unimplemented)
GC finalizers and weak tables both introduce nondeterminism into Lua execution. We can't allow that. It may be necessary to patch the lua interpreter to simply disable these features. Alternatively, we could simply ask the scripters not to use these features, and declare "undefined behavior" if they do.
Update 1: I'm using GC finalizers in some cases to clean up userdata objects. I think it's safe as long as the only thing the finalizer does is free memory. (NOTE: WHERE?)