Files
integration/Docs/A-Summary-of-our-Lua-Patches.md

281 lines
15 KiB
Markdown
Raw Permalink Normal View History

2026-02-05 12:40:27 -05:00
### A Summary of Our Lua Patches
This document lists the set of patches we need to make to the lua interpreter. I'm doing my best to keep the set of patches to a minimum, and to keep the patches as simple as possible.
## Error Line Number Patch
Standard Lua is pretty inconsistent about how it reports the line number and filename of errors:
- Some errors have the line number and filename right in the error message. Some don't.
- Some errors generate stack tracebacks, some don't.
- When there's a traceback, the file and line number in the error message is redundant.
The goal of this patch is to achieve a more consistent state of affairs. First, we follow the rule that every error must include a stack traceback. Doing this doesn't require any patches to lua. It just requires consistent use of Lua's traceback facilities at every entry point into lua.
Second, since the traceback will always contain the file and line number, and since there will always be a traceback, there's no reason to put the file and line number into the error message itself. We have taken steps to remove that feature. To accomplish this, the following patches to lua were made:
- removed calls to 'lua_where' (replacing them with 'lua_no_where')
- removed calls to 'addinfo' (replacing them with 'no_addinfo')
This patch is live and functioning.
## Print Integers Patch
Lua numbers are actually double-precision floating point. There is no separate "int" data type. If you try to store an integer in lua, it gets stored as a double-precision float. That's usually just fine. Double precision can store integers up to 53 bits losslessly and precisely.
When you use the builtin lua function "tostring" that converts a number to a string, this is actually implemented using 'sprintf', using the format directive in LUA_NUMBER_FMT, which by default is "%.14d". As you can see, that's a number format for double-precision numbers, which makes sense, because lua numbers *are* double-precision floats.
That particular format has the effect of printing certain large integers in scientific notation. I found that to be problematic, because you can't see all the digits. It's a very bad format for printing, say, unique ID numbers.
So, I changed the default format from "%.14d" to "%.16d". That may seem like a trivial change, but it makes it so that every large integer up to 53 bits, which is the largest integer that can be losslessly stored in a double, is printed as a simple integer, showing all the digits.
The 'print', 'pprint,' and 'tostring' routines built into our lua interpreter do not use this builtin functionality. We have rewritten all of those functions from scratch, to take precise control over what is printed. None of those functions rely on the builtin *sprintf*.
The patch is probably still live, though. The builtin *sprintf* probably does get called implicitly from inside certain lua error message generating routines. It's just of very minor importance these days.
## The Generalized Less-Than Patch
The standard lua less-than operator will throw an error if you try to compare two objects of different types, or if you try to compare two tables, two functions, or two threads.
This patch adds the lua function *genlt* and the C function *lua_genlt*. This is a generalized less-than operator to compare any two lua objects. It always returns true or false, and never generates an error. Here is how comparison works:
- boolean: false is less than true.
- number: ordered by ascending value.
- string: ordered lexically.
- table: all tables are equal (ie, not less than).
- functions: all functions are equal (ie, not less than).
- threads: all threads are equal (ie, not less than).
- two different types: they are ordered nil, bool, number, string, table, function, thread.
This patch is live and functioning. The generalized less-than operator is quite useful as the second parameter to lua's builtin *table.sort* function. We have also provided an iterator *table.sortedpairs* that is similar to the lua builtin *table.pairs* that iterates over a table in sorted order. This implicitly uses the *genlt* comparison operator.
2026-02-11 11:41:32 -05:00
We originally designed this patch to help with determinism.
But we eventually realized it was the wrong design, and we
ended up not needing it for determinism. It's still a
very useful function, though.
2026-02-05 12:40:27 -05:00
## The Table Iterator Patch
This patch is designed to address the nondeterminism of the lua 'next' iterator. In the original Lua design, table iteration was nondeterministic. By that, I mean that in the original lua, I can create two empty tables, I can then perform the same sequence of insertions and deletions on those two tables. The two tables are identical: they have the same keys, and they've had the same same sequence of operations applied to them. But I can iterate over them using "table.pairs", and they produce their keys in two different orders. That's nondeterminism.
We can't allow nondeterminism in our version of Lua. To fix this was a big deal: we had to change the internal representation of lua tables. Internally, a lua table now contains a separate "sequence" vector which works as follows:
- When you insert a new key into a table (by mapping that key to non-nil), the new key is appended to the sequence.
- When you remove a key from a table (by mapping that key to nil) that key is removed from the sequence, creating a gap in the sequence. The last key in the sequence is moved to fill the gap in the sequence.
- When you call the iterator, next(table, k), it finds k in the sequence, and then returns the key immediately after k in the sequence.
Importantly, the sequence is deterministic - given a fixed set of operations on a table, it will always be in the same order. Therefore the *next* iterator and the *lua_next* function are now both deterministic, as is the *pairs* iterator.
There's a very special case. The standard lua 'next' iterator can find a successor to a key that was *recently deleted*. This functionality is necessary in order to be able to iterate over a table and delete elements while iterating. We support this same functionality, but only on the very most recently deleted key. It's still sufficient to allow deleting table elements while iterating.
This patch is live and is used implicitly whenever you iterate over a lua table.
## The Table Length Patch
I've changed the lua length operator so that when it is
applied to a table, it returns the number of keys in the
table. It does this in constant time. This change affects
lua_len, lua_rawlen, and the lua # operator.
You might be wondering what the lua length operator
used to do? The lua documentation says this:
> The length operator applied on a table returns a border in
> that table. A border in a table t is any non-negative
> integer that satisfies the following condition:
>
> (border == 0 or t[border] ~= nil) and
> (t[border + 1] == nil or border == math.maxinteger)
Those are *terrible* semantics:
- They're not useful for anything.
- It's not deterministic.
- In no sense of the word "length" is this the
length of the table.
Let me explain how that mess happened. They obviously
wanted the length operator to return the number of keys in
the table. Unfortunately, to count the number of keys in a
lua table actually takes O(N) time. So they came up with a
hack to make it faster: O(1). Unfortunately, the hack relies
on the table being a vector. That is, the table must have
numbered keys starting with 1. As long as you apply their
hack to a vector, it works perfectly and returns the
number of keys.
Unfortunately, if you apply the hacked length algorithm to a
table that isn't a vector, it doesn't work at all.
But I think the lua documentation didn't want to admit, "it
doesn't work at all." So instead, they invented this
concept of "a border" and pretended that was in some way a
helpful result. They should have just said, "the result is
undefined."
I had to change the table internal representation
for the table iterator patch (above). With the modified
table representation, returning the number of keys in the
table can be done in constant time, whether it's a vector or
not. So I changed the length operator to just return
the number of keys, full stop.
I've also added another function, lua_nkeys. This also
returns the number of keys in the table. It doesn't add any
functionality - I could use lua_rawlen and that would work
just as well. However, using lua_nkeys emphasizes the fact
that my code needs the *real* table length, not the "border"
bullshit that lua used to provide.
This patch is live, and is necessary to the determinism of
the system.
2026-02-05 12:40:27 -05:00
## The Table Flag Bits Patch
Our difference transmission algorithm does a recursive walk
of the tables in a given tangible. That recursive walk
requires a *visited* bit in each table. Of course, the lua
way of doing this would be to store the set of visited
tables as a separate table. But that would be a lot slower
than just setting a bit, and difference transmission is the
core of our system's performance bottleneck.
2026-02-05 12:40:27 -05:00
We also need to store, in each table, a *table-type* enum.
We have several subtypes of tables: general tables, tangible
tables, class tables, and so forth. The difference
transmitter treats different types of tables differently.
2026-02-05 12:40:27 -05:00
This patch adds a 16-bit "flagbits" field to every Lua
table. We have added these lua API functions to access these
flag bits:
2026-02-05 12:40:27 -05:00
```cpp
uint16_t lua_getflagbits(lua_State *L, int index);
void lua_setflagbits(lua_State *L, int index, uint16_t flagbits);
void lua_modflagbits(lua_State *L, int index, uint16_t clearbits, uint16_t setbits);
```
The eris code for serializing the lua data structures has
been modified to save and restore the flagbits. Aside from
simply storing them, and saving and restoring them with
eris, the lua runtime doesn't do anything at all with the
flagbits.
2026-02-05 12:40:27 -05:00
The Luprex engine has set aside four of the bits to store
the table-type enum. It has set aside one of the bits for
the 'visited' bit of the difference transmission algorithm.
The rest of the bits are currently unused.
2026-02-05 12:40:27 -05:00
This patch is live and in-use.
## The Insert Frame Patch
When we write C functions for Lua, we allocate a "stack
frame" on the lua stack. This is accomplished by class
LuaDefStack. See the document "Our In-House Lua API" for
more information.
2026-02-05 12:40:27 -05:00
This function has to insert some "nils" into the base of the
stack. The lua API does have a function that can do this,
but using it would be O(N^2). Since this functionality is
used in every single C function for Lua, we decided to
optimize things a little. We added a function to the lua API
that can do it in O(N) time. The name of the function is
lua_insert_frame, which sounds fancy, but all it does is
insert N "nils" at the bottom of the stack.
2026-02-05 12:40:27 -05:00
This patch is live and is used in class LuaDefStack.
## The C++ Exceptions Patch
We've compiled lua to use C++ exceptions instead of longjmp.
The advantage of this is that if you do a lua_yield or
lua_error, any C++ destructors on the stack will get called.
2026-02-05 12:40:27 -05:00
Although lua_yield and lua_error both throw C++ exceptions,
Lua cannot *deal with* C++ exceptions except for those it
generates itself. Therefore:
2026-02-05 12:40:27 -05:00
- Never call the lua interpreter inside a C++ catch-block!
- Never throw an exception from inside a LuaDefine!
Exception 1: If you throw an uncaught exception, all that
does is terminate the program. It's always legal to
terminate the program.
Exception 2: If you throw an exception inside a LuaDefine
and then catch it inside the same LuaDefine, that's OK,
because the lua interpreter is not getting unwound.
Using C++ exceptions in lua_yield and lua_error means that
C++ destructors get called. Normally, calling destructors is
a good thing. However, there is one known case where this
causes issues: class LuaExtStack. Class LuaExtStack pushes
values onto the lua stack in its constructor, and later, in
its destructor, it pops those values back off.
Straightforward enough. But if you throw an error using the
lua_error function, then the error message is pushed on top
of the lua stack. If the throw triggers the LuaExtStack
destructor, then LuaExtStack will pop the stack, and in
doing so, it will unintentionally throw out the error
message. Oops.
To fix this, we had to add a lua patch, which adds a new API
function "lua_isthrowing." This API function is used by the
LuaExtStack destructor, to decide whether to clean up the
stack or not. This new API function is not used anywhere
else in Luprex, and I do not expect it will ever be needed
anywhere else.
This patch is live and is needed to keep lua error messages
working.
2026-02-05 12:40:27 -05:00
## The Object-Oriented Lua Patch
We have a patch to make lua object-oriented. To
really understand this patch, we need a separate
markdown file, "Object-Oriented-Lua.md". See
that file for an explanation.
2026-02-05 12:40:27 -05:00
## The Print No Address Patch (Unimplemented)
The lua library function luaL_tolstring converts a lua object into a string, no matter what the type of the lua object is. In the case of tables, the string it generates looks like "table: 0123456701234567", with the address of the table being part of the string. That would make any code that uses that string dependent on the precise address of the table. That's considered nondeterminism for our purposes, it's not allowed.
We don't use luaL_tolstring in the Luprex codebase. However, the luaL_tolstring function is used from within the lua runtime in three different places:
- In the lua function *string.format*. This is no longer an issue. We wrote our own version of format which replaces the built-in version. Our implementation doesn't use luaL_tostring.
2026-02-05 12:40:27 -05:00
- In the builtin function *tostring*. This is no longer an issue. We wrote our own version of tostring which replaces the built-in version. Our implementation doesn't use luaL_tolstring.
- In the eris runtime. It is using luaL_tolstring to generate keys that become part of an associative map.
So the obvious thing to do would be to just change the code for luaL_tolstring to not include the table address. But that would totally break the eris code.
Another possibility would be to patch or override the code for string.format. That fixes the problem, but it leaves luaL_tolstring in there as a time-bomb where somebody else might use it not realizing that it's an issue.
There's no obvious approach to fixing this, so I haven't patched it yet.
## The GC Finalizer Patch and the Weak Table Patch (Unimplemented)
GC Finalizers and weak tables both introduce nondeterminism into Lua execution. We can't allow that. It may be necessary to patch the lua interpreter to simply disable these functions. Alternately, we could simply ask the scripters not to use these features, and declare "undefined behavior" if they do.
2026-02-11 11:41:32 -05:00
Update 1: I'm using GC finalizers in some cases to clean up userdata objects. I think it's safe as long as the only thing the finalizer does is free memory. (NOTE: WHERE?)
2026-02-05 12:40:27 -05:00
Update 2: I don't remember using userdata objects at all. I am not sure that Update 1 is the truth any more.
2026-02-19 00:11:44 -05:00
## Token Literal Syntax Patch
2026-02-19 00:32:29 -05:00
Tokens are lightuserdata values encoding short alphanumeric
strings as base38 numbers (see `Tokens-A-New-Lua-Type.md`).
2026-02-19 00:32:29 -05:00
This patch adds a literal syntax to the Lua parser so that
tokens can be written directly in Lua source code using the
`@` prefix:
2026-02-19 00:11:44 -05:00
```lua
local x = @null
local y = @found
```
This patch is live and functioning.