Better documentation of the object-oriented-lua patch, and removal of an unused patch from lua

2026-03-11 04:00:16 -04:00
parent 9fa6bd4bb6
commit d5fb9cd224
6 changed files with 407 additions and 88 deletions


@@ -19,18 +19,6 @@ Second, since the traceback will always contain the file and line number, and si
This patch is live and functioning.
## Thread NextID Patch
We had the idea that Lua threads should contain an int64 field, "nextid", which theoretically was meant to help our system's unique ID allocator. We implemented a patch to store this field. When a Lua thread is created, the nextid field is initialized to zero. The following accessors were added to the Lua API:
```cpp
lua_Integer lua_getnextid(lua_State *L);
void lua_setnextid(lua_State *L, lua_Integer id);
```
This patch is dead code: as it turns out, we ended up not using this feature. These two accessors are never used anywhere in the Luprex codebase.
## Print Integers Patch
Lua numbers are actually double-precision floating point. There is no separate "int" data type. If you try to store an integer in lua, it gets stored as a double-precision float. That's usually just fine. Double precision can store integers up to 53 bits losslessly and precisely.
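The 53-bit limit is easy to check from Lua itself. This is only a small sketch, and it assumes lua numbers are IEEE 754 doubles, as described above:

```lua
-- 2^53 is the point where doubles stop representing every integer exactly.
local limit = 2^53

-- Just below the limit, neighbouring integers are still distinct.
assert(limit - 1 ~= limit)
assert((limit - 2) + 1 == limit - 1)

-- At the limit, adding 1 is lost to rounding: 2^53 + 1 is not representable.
assert(limit + 1 == limit)
```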
@@ -146,11 +134,22 @@ the system.
## The Table Flag Bits Patch
Our difference transmission algorithm does a recursive walk of the tables in a given tangible. That recursive walk requires a *visited* bit in each table. Of course, the lua way of doing this would be to store the set of visited tables as a separate table. But that would be a lot slower than just setting a bit, and difference transmission is the core of our system's performance bottleneck.
Our difference transmission algorithm does a recursive walk
of the tables in a given tangible. That recursive walk
requires a *visited* bit in each table. Of course, the lua
way of doing this would be to store the set of visited
tables as a separate table. But that would be a lot slower
than just setting a bit, and difference transmission is the
core of our system's performance bottleneck.
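For comparison, here is a sketch of the conventional "lua way" mentioned above, where the visited set is a separate table. The function name `walk` and the callback are made up for this illustration; this is not the difference transmitter's actual code:

```lua
-- Recursively visit every table reachable from `root`, each exactly once.
-- `visited` is the separate set of tables: keys are the tables themselves,
-- values are `true`. Only table values are followed, not table keys.
local function walk(root, visit, visited)
  visited = visited or {}
  if visited[root] then return end   -- already seen; this also stops cycles
  visited[root] = true
  visit(root)
  for _, value in pairs(root) do
    if type(value) == "table" then
      walk(value, visit, visited)
    end
  end
end

-- Usage: a small structure with a shared subtable and a cycle.
local shared = {}
local root = { a = shared, b = { c = shared } }
root.self = root

local count = 0
walk(root, function() count = count + 1 end)
assert(count == 3)   -- root, shared, and root.b
```

Every step of this walk costs a hash lookup and possibly a hash insert into `visited`, which is the overhead the per-table bit avoids.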
We also need to store, in each table, a *table-type* enum. We have several subtypes of tables: general tables, tangible tables, class tables, and so forth. The difference transmitter treats different types of tables differently.
We also need to store, in each table, a *table-type* enum.
We have several subtypes of tables: general tables, tangible
tables, class tables, and so forth. The difference
transmitter treats different types of tables differently.
This patch adds a 16-bit "flagbits" field to every Lua table. We have added these lua API functions to access these flag bits:
This patch adds a 16-bit "flagbits" field to every Lua
table. We have added these lua API functions to access these
flag bits:
```cpp
uint16_t lua_getflagbits(lua_State *L, int index);
@@ -160,70 +159,87 @@ void lua_setflagbits(lua_State *L, int index, uint16_t flagbits);
void lua_modflagbits(lua_State *L, int index, uint16_t clearbits, uint16_t setbits);
```
The eris code for serializing the lua data structures has been modified to save and restore the flagbits. Aside from simply storing them, and saving and restoring them with eris, the lua runtime doesn't do anything at all with the flagbits.
The eris code for serializing the lua data structures has
been modified to save and restore the flagbits. Aside from
simply storing them, and saving and restoring them with
eris, the lua runtime doesn't do anything at all with the
flagbits.
The Luprex engine has set aside four of the bits to store the table-type enum. It has set aside one of the bits for the 'visited' bit of the difference transmission algorithm. The rest of the bits are currently unused.
The Luprex engine has set aside four of the bits to store
the table-type enum. It has set aside one of the bits for
the 'visited' bit of the difference transmission algorithm.
The rest of the bits are currently unused.
This patch is live and in-use.
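To make the clear-then-set contract of lua_modflagbits concrete, here is a pure-Lua model of it. The bit positions below are an assumption chosen for illustration (the document does not say which bits are used), and the Lua function `modflagbits` is a stand-in, not the C API:

```lua
-- Hypothetical layout, for illustration only: bits 0-3 hold the table-type
-- enum and bit 4 is the visited bit.
local TYPE_MASK   = 15   -- 0x000F
local VISITED_BIT = 16   -- 0x0010

-- Model of the contract: clear `clearbits`, then set `setbits`.
-- Written with arithmetic because older Lua versions have no bitwise operators.
local function modflagbits(flags, clearbits, setbits)
  local result, bit = 0, 1
  for _ = 1, 16 do
    local on = flags % (2 * bit) >= bit            -- is this bit set in flags?
    if clearbits % (2 * bit) >= bit then on = false end
    if setbits % (2 * bit) >= bit then on = true end
    if on then result = result + bit end
    bit = bit * 2
  end
  return result
end

local flags = 0
flags = modflagbits(flags, TYPE_MASK, 3)      -- set the table-type enum to 3
flags = modflagbits(flags, 0, VISITED_BIT)    -- mark as visited
assert(flags == 19)
assert(flags % 16 == 3)                       -- read back the table-type enum
flags = modflagbits(flags, VISITED_BIT, 0)    -- clear the visited bit
assert(flags == 3)
```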
## The Insert Frame Patch
When we write C functions for Lua, we allocate a "stack frame" on the lua stack. This is accomplished by class LuaDefStack. See the document "Our In-House Lua API" for more information.
When we write C functions for Lua, we allocate a "stack
frame" on the lua stack. This is accomplished by class
LuaDefStack. See the document "Our In-House Lua API" for
more information.
This function has to insert some "nils" into the base of the stack. The lua API does have a function that can do this, but using it would be O(N^2). Since this functionality is used in every single C function for Lua, we decided to optimize things a little. We added a function to the lua API that can do it in O(N) time. The name of the function is lua_insert_frame, which sounds fancy, but all it does is insert N "nils" at the bottom of the stack.
This function has to insert some "nils" into the base of the
stack. The lua API does have a function that can do this,
but using it would be O(N^2). Since this functionality is
used in every single C function for Lua, we decided to
optimize things a little. We added a function to the lua API
that can do it in O(N) time. The name of the function is
lua_insert_frame, which sounds fancy, but all it does is
insert N "nils" at the bottom of the stack.
This patch is live and is used in class LuaDefStack.
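The complexity difference can be seen in a toy model. The sketch below treats the stack as a plain Lua array (index 1 is the bottom) and uses the string "nil" as a stand-in for real nils. Both function names are invented for this example, and the naive version is only my guess at what repeated single-slot insertion looks like:

```lua
local NIL = "nil"   -- placeholder so the array has no holes

-- Naive: insert one slot at the bottom, n times. Every insertion shifts
-- the whole stack, so the work grows with n * size.
local function insert_naive(stack, n)
  local moves = 0
  for _ = 1, n do
    for i = #stack, 1, -1 do
      stack[i + 1] = stack[i]
      moves = moves + 1
    end
    stack[1] = NIL
  end
  return moves
end

-- Single pass: shift every slot up by n once, then fill the bottom.
local function insert_frame(stack, n)
  local moves = 0
  for i = #stack, 1, -1 do
    stack[i + n] = stack[i]
    moves = moves + 1
  end
  for i = 1, n do stack[i] = NIL end
  return moves
end

local a = { "x", "y", "z" }
local b = { "x", "y", "z" }
local naive_moves = insert_naive(a, 3)
local frame_moves = insert_frame(b, 3)
assert(table.concat(a, ",") == "nil,nil,nil,x,y,z")
assert(table.concat(b, ",") == table.concat(a, ","))
assert(naive_moves == 12 and frame_moves == 3)
```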
## The C++ Exceptions Patch
We've compiled lua to use C++ exceptions instead of longjmp. The advantage of this is that if you do a lua_yield or lua_error, any C++ destructors on the stack will get called.
We've compiled lua to use C++ exceptions instead of longjmp.
The advantage of this is that if you do a lua_yield or
lua_error, any C++ destructors on the stack will get called.
Although lua_yield and lua_error both throw C++ exceptions, Lua cannot *deal with* C++ exceptions except for those it generates itself. Therefore:
Although lua_yield and lua_error both throw C++ exceptions,
Lua cannot *deal with* C++ exceptions except for those it
generates itself. Therefore:
- Never call the lua interpreter inside a C++ catch-block!
- Never throw an exception from inside a LuaDefine!
Exception 1: If you throw an uncaught exception, all that does is terminate the program. It's always legal to terminate the program.
Exception 1: If you throw an uncaught exception, all that
does is terminate the program. It's always legal to
terminate the program.
Exception 2: If you throw an exception inside a LuaDefine and then catch it inside the same LuaDefine, that's OK, because the lua interpreter is not getting unwound.
Exception 2: If you throw an exception inside a LuaDefine
and then catch it inside the same LuaDefine, that's OK,
because the lua interpreter is not getting unwound.
Using C++ exceptions in lua_yield and lua_error means that C++ destructors get called. Normally, calling destructors is a good thing. However, there is one known case where this causes issues: class LuaExtStack. Class LuaExtStack pushes values onto the lua stack in its constructor, and later, in its destructor, it pops those values back off. Straightforward enough. But if you throw an error using the lua_error function, then the error message is pushed on top of the lua stack. If the throw triggers the LuaExtStack destructor, then LuaExtStack will pop the stack, and in doing so, it will unintentionally throw out the error message. Oops.
Using C++ exceptions in lua_yield and lua_error means that
C++ destructors get called. Normally, calling destructors is
a good thing. However, there is one known case where this
causes issues: class LuaExtStack. Class LuaExtStack pushes
values onto the lua stack in its constructor, and later, in
its destructor, it pops those values back off.
Straightforward enough. But if you throw an error using the
lua_error function, then the error message is pushed on top
of the lua stack. If the throw triggers the LuaExtStack
destructor, then LuaExtStack will pop the stack, and in
doing so, it will unintentionally throw out the error
message. Oops.
To fix this, we had to add a lua patch, which adds a new API function "lua_isthrowing." This API function is used by the LuaExtStack destructor, to decide whether to clean up the stack or not. This new API function is not used anywhere else in Luprex, and I do not expect it will ever be needed anywhere else.
To fix this, we had to add a lua patch, which adds a new API
function "lua_isthrowing." This API function is used by the
LuaExtStack destructor, to decide whether to clean up the
stack or not. This new API function is not used anywhere
else in Luprex, and I do not expect it will ever be needed
anywhere else.
This patch is live and is needed to keep lua error messages working.
This patch is live and is needed to keep lua error messages
working.
## The Object-Oriented Lua Patch
Lua has a colon operator, for method lookup:
```lua
obj:method(arg1, arg2, arg3)
```
This looks for the method closure in *obj*. Instead of looking for the method in *obj*, I would like lua to look for the method in a separate table, the object's *class* table. One way to accomplish that is to use the *index* metamethod:
```lua
setmetatable(obj, { __index = class } )
```
That's not bad, but it puts both values and methods into the same namespace:
- Looking up data (e.g., *obj.value*) will look for value in *obj*, and if it's not found, it will look for value in the *class* table. Looking for a value in the *class* table is pointless and inefficient.
- Looking up a method (eg, *obj:method*) will look for the closure in *obj* first, before looking in the *class* table. Looking for a closure in the data table *obj* is again pointless and inefficient.
- By putting both values and methods into the same namespace, we create the possibility of unintended mistakes.
I am thinking about implementing a new metatable entry: __METHODS = true. If this flag is present, then the colon operator *obj:method* looks for the method in the metatable, instead of looking for it in the object. With this new metamethod, the way to create a class would be to make a table full of methods, and then in that table, also put __METHODS = true. Then you would do this:
```lua
setmetatable(obj, class)
```
The class table would *be* the metatable. The result of this patch is that method lookup is done in the class table, value lookup is done in the object, and the two namespaces are kept separate.
Implementing this, as it turns out, is quite simple. The lua virtual machine contains an opcode, OP_SELF. This opcode is only used when the lua scripter uses the colon operator for method invocation. The opcode has two inputs: *obj*, and *method* (where method is a string, a method name). It returns the closure to call. Currently, OP_SELF just calls luaV_gettable to fetch the method from *obj*. We could easily replace the call to luaV_gettable with a function that we write ourselves, luaV_getmethod.
We have a patch to make lua object-oriented. To
really understand this patch, we need a separate
markdown file, "Object-Oriented-Lua.md". See
that file for an explanation.
## The Print No Address Patch (Unimplemented)

Docs/Object-Oriented-Lua.md Normal file

@@ -0,0 +1,336 @@
## How to do Methods and Inheritance in Standard Lua
Standard Lua has the ability to do classes and
inheritance. Let me explain how.
Lua has a colon operator, for method lookup. It looks
like this:
```lua
local result = v1:dotproduct(v2)
```
By default, this looks for the method "dotproduct" inside
the table v1. But you don't really want to put a whole
bunch of function objects (like dotproduct) into v1, and
into every other vector. Instead, you want to put those
function objects in a separate table, the class table:
```lua
vector = {}
function vector.dotproduct(v1, v2) ... end
function vector.crossproduct(v1, v2) ... end
```
Fortunately, Lua can do that. Lua has a metamethod, INDEX
(spelled `__index` in code), that basically says: "if you
don't find what you're looking for in this table, look in
that table instead." We can use the INDEX metamethod to tell
lua: if you're looking for a method "dotproduct" in v1, and
it's not there, look in class vector instead.
So to do that, you only need two steps. First, we're going
to decide that class vector isn't just going to hold the
methods for vectors. We're also going to use it as the
metatable for all vectors:
```lua
v1 = {x=100, y=100, z=100}
v2 = {x=200, y=200, z=200}
setmetatable(v1, vector)
setmetatable(v2, vector)
```
Now that vector is the metatable for all vectors, you can
put metamethods into class vector and they will affect all
vectors. We're going to put an `__index` metamethod in there:
```lua
vector = {}
vector.__index = vector
function vector.dotproduct(v1, v2) ... end
function vector.crossproduct(v1, v2) ... end
```
That says: if you're doing a lookup on a vector,
and the lookup fails, look in class vector instead.
OK, in case you didn't quite follow, let me walk you through
step by step what happens when you evaluate the expression
v1:dotproduct(v2).
First, the colon operator looks for "dotproduct" in v1.
It's not going to find it: we're not going to put all
those function objects into every single vector.
When the table lookup code fails to find "dotproduct" in v1,
it goes to a fallback codepath: it checks whether v1 has a
metatable. It does: vector is the metatable for all vectors.
So v1 does indeed have a metatable. The table lookup
code then checks whether the metatable contains INDEX:
again, it does, class vector contains INDEX. Finally,
the table lookup checks what INDEX is pointing to:
class vector again. So the table lookup code, having
failed to find "dotproduct" in v1, looks for it in
class vector. The method lookup succeeds.
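The walkthrough above can be written out as a small Lua function. This is a simplified model of the lookup rule, not lua's actual implementation: it only handles the case where `__index` points to a table, and the name `lookup` is made up for this sketch.

```lua
-- Simplified model of t[key] when every __index in the chain is a table.
local function lookup(t, key)
  local value = rawget(t, key)            -- 1. look in the table itself
  if value ~= nil then return value end
  local mt = getmetatable(t)              -- 2. does it have a metatable?
  if mt == nil then return nil end
  local fallback = rawget(mt, "__index")  -- 3. does the metatable have __index?
  if fallback == nil then return nil end
  return lookup(fallback, key)            -- 4. repeat the lookup over there
end

local vector = {}
vector.__index = vector
function vector.dotproduct(a, b)
  return a.x * b.x + a.y * b.y + a.z * b.z
end

local v1 = setmetatable({ x = 1, y = 2, z = 3 }, vector)
assert(lookup(v1, "dotproduct") == vector.dotproduct)  -- found via the fallback
assert(lookup(v1, "x") == 1)                           -- found directly in v1
assert(lookup(v1, "missing") == nil)                   -- found nowhere
```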
The little trick "vector.__index = vector" certainly feels
arcane, but it works. I like to hide it with a little
syntactic sugar:
```lua
function makeclass(name)
  _G[name] = {}
  _G[name].__index = _G[name]
end
```
Now I can say this, and it hides all the metatable magic:
```lua
makeclass("vector")
function vector.dotproduct(v1, v2) ... end
function vector.crossproduct(v1, v2) ... end
```
A useful property of this design is that vector is *both*
the metatable for all vectors, and *also* the class table
for all vectors. One consequence of that is that I can
put both regular methods and metamethods in there:
```lua
makeclass("vector")
function vector.__eq(v1, v2) ... end
function vector.dotproduct(v1, v2) ... end
function vector.crossproduct(v1, v2) ... end
```
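Here is the same idea as a complete, runnable sketch with the elided method bodies filled in. The bodies are my own simple choices for illustration, and note that standard lua spells the key in lowercase, `__index`:

```lua
local function makeclass(name)
  local class = {}
  class.__index = class   -- failed lookups on instances fall back to the class
  _G[name] = class
end

makeclass("vector")

-- A metamethod and a regular method, side by side in the same table.
function vector.__eq(a, b)
  return a.x == b.x and a.y == b.y and a.z == b.z
end
function vector.dotproduct(a, b)
  return a.x * b.x + a.y * b.y + a.z * b.z
end

local v1 = setmetatable({ x = 100, y = 100, z = 100 }, vector)
local v2 = setmetatable({ x = 200, y = 200, z = 200 }, vector)
local v3 = setmetatable({ x = 100, y = 100, z = 100 }, vector)

assert(v1:dotproduct(v2) == 60000)   -- regular method, found via __index
assert(v1 == v3)                     -- __eq metamethod: same components
assert(v1 ~= v2)
assert(not rawequal(v1, v3))         -- still two distinct tables
```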
Let's say you also want inheritance: you want vector to
inherit from vectorbase. That's another thing you can
accomplish using INDEX again: "if you can't find what you're
looking for in vector, look in vectorbase." Here's the code
for that:
```lua
makeclass("vectorbase")
function vectorbase.whatever(...) ... end
makeclass("vector")
function vector.dotproduct(v1, v2) ... end
function vector.crossproduct(v1, v2) ... end
setmetatable(vector, vectorbase)
```
That 'setmetatable' directs all failed lookups from
vector into vectorbase. I've set up a search path.
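A runnable version of the inheritance setup might look like the following. The method `sum` is a hypothetical stand-in for "whatever", and I set `__index` by hand rather than going through makeclass so the block is self-contained:

```lua
local vectorbase = {}
vectorbase.__index = vectorbase
function vectorbase.sum(v) return v.x + v.y + v.z end

local vector = {}
vector.__index = vector
function vector.dotproduct(a, b)
  return a.x * b.x + a.y * b.y + a.z * b.z
end

-- Failed lookups in vector now continue into vectorbase.
setmetatable(vector, vectorbase)

local v1 = setmetatable({ x = 1, y = 2, z = 3 }, vector)
assert(v1:dotproduct(v1) == 14)        -- found in vector
assert(v1:sum() == 6)                  -- not in v1 or vector; found in vectorbase
assert(rawget(vector, "sum") == nil)   -- vector itself never got a copy
```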
That's really all there is to it. Honestly, it's
reasonably straightforward.
## Why I Don't Love It: High Bug Potential
Let's say you want to create a class, "json". This is
for receiving and transmitting json over HTTP:
```lua
json={}
function json.validate() ... end
function json.serialize(output) ... end
```
Now, suppose you receive this json from an HTTP client:
```json
{
"jsonrpc" : "2.0",
"validate" : true
}
```
You convert the json into a lua table *request*, call
setclass(request, json), and then you try to validate
the request:
```lua
local legal = request:validate()
```
This produces an error: "true" is not a callable function.
That's because when it looks for the "validate" method,
it instead finds the "validate" data that is in the request.
The underlying problem is this: method lookups should
*only* look in the class table, not in the request.
Meanwhile, data lookup should be *only* in the request,
not in the class table. But the INDEX metamethod
doesn't differentiate between method lookups and data
lookups. All lookups are treated the same. So all
accesses go to *both* tables.
Using this design for object-orientation means you're always
at risk of data hiding your methods, at unexpected times.
It also means you can get false readings on instance
variables if the actual instance variable is nil.
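Both failure modes can be reproduced in a few lines of standard lua. In this sketch I use plain setmetatable with `__index` where the text uses setclass (I am assuming the two are equivalent for this purpose), and the method bodies are invented:

```lua
local json = {}
json.__index = json
function json.validate(self) return self.jsonrpc == "2.0" end
function json.serialize(self) return "{}" end

-- The decoded request happens to contain a field named "validate".
local request = setmetatable({ jsonrpc = "2.0", validate = true }, json)

-- Problem 1: data hides the method. The call finds `true`, not a function.
local ok = pcall(function() return request:validate() end)
assert(ok == false)

-- Problem 2: a false reading. The request has no "serialize" field, but
-- reading it as data returns the class's method instead of nil.
assert(rawget(request, "serialize") == nil)
assert(request.serialize == json.serialize)
```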
## Why I Don't Love It: Speed
How much does this cost:
```lua
local result = v1:dotproduct(v2)
```
The answer is: it has to do three table lookups for just
the method lookup:
- v1["dotproduct"]
- metatable["__index"]
- vector["dotproduct"]
That also applies to data lookups. Suppose I do this:
```lua
local param = request.param
```
That's a simple data lookup. But if the request is of class
json, and the request doesn't contain "param", then it does
three table lookups, trying to find the data "param":
- request["param"]
- metatable["__index"]
- json["param"]
Looking for param in the class table, aside from being
risky, is wasting CPU time. This makes lua, an already slow
interpreted language, significantly slower.
## A Patch
To solve these problems, we need to establish a rule: method
lookups should *only* go to the class table, and data
lookups should *only* go to the data table.
To make that happen, I introduce a new metamethod, CLASS
(with leading underscores). This metamethod changes *only*
the behavior of the method lookup operator (colon):
```lua
v1:dotproduct(v2)
```
The behavior is: if you do a method lookup, then the colon
operator looks to see if v1 has a metatable with the CLASS
metamethod in it. If so, then it looks for the method in
the *metatable* of v1 instead of looking in v1.
Let me show you how this is intended to be used. We create
the class table in pretty much the same way, but instead
of putting INDEX in there, we put CLASS:
```lua
vector = {}
vector.__CLASS = true
function vector.dotproduct(v1, v2) ... end
function vector.crossproduct(v1, v2) ... end
```
We again intend to make vector the metatable for all
vectors. So again, when you create a vector, you do
this:
```lua
v1 = {x=100, y=100, z=100}
v2 = {x=200, y=200, z=200}
setmetatable(v1, vector)
setmetatable(v2, vector)
```
So far, it looks almost identical. But when you do
v1:dotproduct(v2), the method lookup checks whether v1 has a
metatable (it does), and whether that metatable has CLASS in
it (it does). So therefore, it *doesn't* look for dotproduct
in v1. It *only* looks for it in the metatable of v1, which
is class vector.
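The real patch changes the colon operator inside the lua virtual machine, so it can't be demonstrated in stock lua. As a rough sketch, though, the rule can be emulated with an ordinary function. `getmethod` here is a hypothetical helper that stands in for the patched colon operator, and the use of raw lookups is my assumption:

```lua
-- Emulates the described rule: if the metatable carries __CLASS, look for
-- the method only in the metatable; otherwise do an ordinary lookup.
local function getmethod(obj, name)
  local mt = getmetatable(obj)
  if mt ~= nil and rawget(mt, "__CLASS") ~= nil then
    return rawget(mt, name)
  end
  return obj[name]
end

local json = { __CLASS = true }
function json.validate(self) return self.jsonrpc == "2.0" end

-- The same troublesome request as before: it has a "validate" data field.
local request = setmetatable({ jsonrpc = "2.0", validate = true }, json)

-- Method lookup ignores the data field and finds the method.
assert(getmethod(request, "validate") == json.validate)
assert(getmethod(request, "validate")(request) == true)

-- Data lookup is a plain table lookup; the class is not consulted.
assert(request.validate == true)
assert(request.serialize == nil)
```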
The CLASS metamethod has no effect on data lookups. That
includes the dot operator, and the array lookup operator:
```lua
local param = request.param
local param = request["param"]
```
Even if request above has the json class as its metatable,
these lookups are just plain lua table lookups, and that's
all. Class json isn't involved.
This cleanly separates the two namespaces: data is looked
up in the data table, methods are looked up in the methods
table.
So that leaves inheritance. We use a slightly different
version of the CLASS metamethod:
```lua
vector = {}
vector.__CLASS = vectorbase
function vector.dotproduct(v1, v2) ... end
function vector.crossproduct(v1, v2) ... end
```
Saying CLASS=true means "this is a class." Saying
CLASS=vectorbase means "this is a class, derived from
vectorbase". If you do that, then the method look-up
operator will look in vector first, then vectorbase, and it
will follow the inheritance chain upward.
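Extending the emulation idea to inheritance, a sketch of the chain walk could look like this. Again, `getmethod` is a hypothetical pure-Lua stand-in for the patched colon operator, `sum` is an invented base-class method, and the exact lookup details of the real patch may differ:

```lua
local function getmethod(obj, name)
  local mt = getmetatable(obj)
  if mt == nil or rawget(mt, "__CLASS") == nil then
    return obj[name]                   -- not a class instance: ordinary lookup
  end
  local class = mt
  while type(class) == "table" do
    local method = rawget(class, name)
    if method ~= nil then return method end
    class = rawget(class, "__CLASS")   -- a table continues the chain; true ends it
  end
  return nil
end

local vectorbase = { __CLASS = true }
function vectorbase.sum(v) return v.x + v.y + v.z end

local vector = { __CLASS = vectorbase }
function vector.dotproduct(a, b)
  return a.x * b.x + a.y * b.y + a.z * b.z
end

local v1 = setmetatable({ x = 1, y = 2, z = 3 }, vector)
assert(getmethod(v1, "dotproduct") == vector.dotproduct)  -- found in vector
assert(getmethod(v1, "sum")(v1) == 6)                     -- found in vectorbase
assert(getmethod(v1, "x") == nil)       -- data fields are never consulted
assert(getmethod(v1, "nosuch") == nil)  -- chain exhausted
```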
How does the CLASS metamethod compare, performance-wise, to
using the INDEX metamethod to achieve object-orientation?
Let's start with data lookups. Using INDEX, this can take
up to three table lookups:
```lua
local param = request.param
```
However, using CLASS instead, this is only ever one table
lookup. You might assume that since request has a
metatable, that lua would at least have to do a table lookup
to *see* if there's an INDEX metamethod, even if there's not
one. However, lua has a clever optimization: lua remembers
that INDEX is not present in class json, and it doesn't
look a second time. It only ever does the INDEX lookup
once, for the entire program.
Method lookups like v1:dotproduct(v2) are also accelerated.
Using INDEX, it takes three table lookups to find
"dotproduct" in class vector. Using CLASS, however, it
usually takes only one table lookup. It skips looking in v1
entirely. And, like INDEX, lua remembers that CLASS=true is
present in class vector, and it doesn't look a second time.
So it goes straight to looking for "dotproduct" in class
vector. It only has to look up CLASS if the method is not
found, and we have to walk the inheritance tree.
Finally, I'd like to say something about readability:
```lua
vector.__CLASS = true
```
## Summary
Metamethod CLASS achieves object-oriented method lookup,
but with two advantages over using INDEX:
* Safely separates the two namespaces: data, and methods.
* CLASS is faster than using INDEX.


@@ -679,25 +679,6 @@ bool SourceDB::function_docs(const LuaCoreStack &LS, LuaSlot fn, std::ostream &o
}
}
// These should go away eventually. They're for debugging.
LuaDefine(coroutine_setnextid, "thread,id", "set the next id of a thread (debugging only)") {
LuaArg co, lid;
LuaDefStack LS(L, co, lid);
lua_State *CO = LS.ckthread(co);
lua_Number id = LS.ckinteger(lid);
lua_setnextid(CO, id);
return LS.result();
}
LuaDefine(coroutine_getnextid, "thread", "get the next id of a thread (debugging only)") {
LuaArg co;
LuaRet lid;
LuaDefStack LS(L, co, lid);
lua_State *CO = LS.ckthread(co);
LS.set(lid, lua_getnextid(CO));
return LS.result();
}
LuaDefine(unittests_sourcedb, "", "some unit tests") {
LuaSnap msnap;
LuaSnap ssnap;


@@ -514,16 +514,6 @@ LUA_API const void *lua_topointer (lua_State *L, int idx) {
}
LUA_API lua_Integer lua_getnextid (lua_State *L) {
return L->nextid;
}
LUA_API void lua_setnextid (lua_State *L, lua_Integer id) {
L->nextid = id;
}
/*
** push functions (C -> stack)
*/


@@ -173,7 +173,6 @@ struct lua_State {
struct lua_longjmp *errorJmp; /* current error recover point */
ptrdiff_t errfunc; /* current error handling function (stack index) */
CallInfo base_ci; /* CallInfo for first level (C calling Lua) */
lua_Integer nextid; /* ID allocator for luprex */
};


@@ -178,9 +178,6 @@ LUA_API void *(lua_touserdata) (lua_State *L, int idx);
LUA_API lua_State *(lua_tothread) (lua_State *L, int idx);
LUA_API const void *(lua_topointer) (lua_State *L, int idx);
LUA_API lua_Integer (lua_getnextid) (lua_State *L);
LUA_API void (lua_setnextid) (lua_State *L, lua_Integer id);
/*
** Comparison and arithmetic functions
*/