diff --git a/Docs/A-Summary-of-our-Lua-Patches.md b/Docs/A-Summary-of-our-Lua-Patches.md
index 11528b35..37b6ccdd 100644
--- a/Docs/A-Summary-of-our-Lua-Patches.md
+++ b/Docs/A-Summary-of-our-Lua-Patches.md
@@ -19,18 +19,6 @@ Second, since the traceback will always contain the file and line number, and si
 
 This patch is live and functioning.
 
-## Thread NextID Patch
-
-We had the idea that Lua threads should contain an int64 field, "nextid", which theoretically was meant to help our system's unique ID allocator. We implemented a patch to store this field. When a Lua thread is created, the nextid field is initialized to zero. The following accessors were added to the Lua API:
-
-```cpp
-lua_Integer lua_getnextid(lua_State *L);
-
-void lua_setnextid(lua_State *L, lua_Integer id);
-```
-
-This patch is dead code: as it turns out, we ended up not using this feature. These two accessors are never used anywhere in the Luprex codebase.
-
 ## Print Integers Patch
 
 Lua numbers are actually double-precision floating point. There is no separate "int" data type. If you try to store an integer in lua, it gets stored as a double-precision float. That's usually just fine. Double precision can store integers up to 53 bits losslessly and precisely.
@@ -146,11 +134,22 @@ the system.
 
 ## The Table Flag Bits Patch
 
-Our difference transmission algorithm does a recursive walk of the tables in a given tangible. That recursive walk requires a *visited* bit in each table. Of course, the lua way of doing this would be to store the set of visited tables as a separate table. But that would be a lot slower than just setting a bit, and difference transmission is the core of our system's performance bottleneck.
+Our difference transmission algorithm does a recursive walk
+of the tables in a given tangible. That recursive walk
+requires a *visited* bit in each table. Of course, the lua
+way of doing this would be to store the set of visited
+tables as a separate table.
But that would be a lot slower
+than just setting a bit, and difference transmission is the
+core of our system's performance bottleneck.
 
-We also need to store, in each table, a *table-type* enum. We have several subtypes of tables: general tables, tangible tables, class tables, and so forth. The difference transmitter treats different types of tables differently.
+We also need to store, in each table, a *table-type* enum.
+We have several subtypes of tables: general tables, tangible
+tables, class tables, and so forth. The difference
+transmitter treats different types of tables differently.
 
-This patch adds a 16-bit "flagbits" field to every Lua table. We have added these lua API functions to access these flag bits:
+This patch adds a 16-bit "flagbits" field to every Lua
+table. We have added these lua API functions to access these
+flag bits:
 
 ```cpp
 uint16_t lua_getflagbits(lua_State *L, int index);
@@ -160,70 +159,87 @@ void lua_setflagbits(lua_State *L, int index, uint16_t flagbits);
 void lua_modflagbits(lua_State *L, int index, uint16_t clearbits, uint16_t setbits);
 ```
 
-The eris code for serializing the lua data structures has been modified to save and restore the flagbits. Aside from simply storing them, and saving and restoring them with eris, the lua runtime doesn't do anything at all with the flagbits.
+The eris code for serializing the lua data structures has
+been modified to save and restore the flagbits. Aside from
+simply storing them, and saving and restoring them with
+eris, the lua runtime doesn't do anything at all with the
+flagbits.
 
-The Luprex engine has set aside four of the bits to store the table-type enum. It has set aside one of the bits for the 'visited' bit of the difference transmission algorithm. The rest of the bits are currently unused.
+The Luprex engine has set aside four of the bits to store
+the table-type enum. It has set aside one of the bits for
+the 'visited' bit of the difference transmission algorithm.
+The rest of the bits are currently unused.
 
 This patch is live and in-use.
 
 ## The Insert Frame Patch
 
-When we write C functions for Lua, we allocate a "stack frame" on the lua stack. This is accomplished by class LuaDefStack. See the document "Our In-House Lua API" for more information.
+When we write C functions for Lua, we allocate a "stack
+frame" on the lua stack. This is accomplished by class
+LuaDefStack. See the document "Our In-House Lua API" for
+more information.
 
-This function has to insert some "nils" into the base of the stack. The lua API does have a function that can do this, but using it would be O(N^2). Since this functionality is used in every single C function for Lua, we decided to optimize things a little. We added a function to the lua API that can do it in O(N) time. The name of the function is lua_insert_frame, which sounds fancy, but all it does is insert N "nils" at the bottom of the stack.
+This function has to insert some "nils" into the base of the
+stack. The lua API does have a function that can do this,
+but using it would be O(N^2). Since this functionality is
+used in every single C function for Lua, we decided to
+optimize things a little. We added a function to the lua API
+that can do it in O(N) time. The name of the function is
+lua_insert_frame, which sounds fancy, but all it does is
+insert N "nils" at the bottom of the stack.
 
 This patch is live and is used in class LuaDefStack.
 
 ## The C++ Exceptions Patch
 
-We've compiled lua to use C++ exceptions instead of longjmp. The advantage of this is that if you do a lua_yield or lua_error, any C++ destructors on the stack will get called.
+We've compiled lua to use C++ exceptions instead of longjmp.
+The advantage of this is that if you do a lua_yield or
+lua_error, any C++ destructors on the stack will get called.
 
-Although lua_yield and lua_error both throw C++ exceptions, Lua cannot *deal with* C++ exceptions except for those it generates itself.
Therefore:
+Although lua_yield and lua_error both throw C++ exceptions,
+Lua cannot *deal with* C++ exceptions except for those it
+generates itself. Therefore:
 
 - Never call the lua interpreter inside a C++ catch-block!
 - Never throw an exception from inside a LuaDefine!
 
-Exception 1: If you throw an uncaught exception, all that does is terminate the program. It's always legal to terminate the program.
+Exception 1: If you throw an uncaught exception, all that
+does is terminate the program. It's always legal to
+terminate the program.
 
-Exception 2: If you throw an exception inside a LuaDefine and then catch it inside the same LuaDefine, that's OK, because the lua interpreter is not getting unwound.
+Exception 2: If you throw an exception inside a LuaDefine
+and then catch it inside the same LuaDefine, that's OK,
+because the lua interpreter is not getting unwound.
 
-Using C++ exceptions in lua_yield and lua_error means that C++ destructors get called. Normally, calling destructors is a good thing. However, there is one known case where this causes issues: class LuaExtStack. Class LuaExtStack pushes values onto the lua stack in its constructor, and later, in its destructor, it pops those values back off. Straightforward enough. But if you throw an error using the lua_error function, then the error message is pushed on top of the lua stack. If the throw triggers the LuaExtStack destructor, then LuaExtStack will pop the stack, and in doing so, it will unintentionally throw out the error message. Oops.
+Using C++ exceptions in lua_yield and lua_error means that
+C++ destructors get called. Normally, calling destructors is
+a good thing. However, there is one known case where this
+causes issues: class LuaExtStack. Class LuaExtStack pushes
+values onto the lua stack in its constructor, and later, in
+its destructor, it pops those values back off.
+Straightforward enough.
But if you throw an error using the
+lua_error function, then the error message is pushed on top
+of the lua stack. If the throw triggers the LuaExtStack
+destructor, then LuaExtStack will pop the stack, and in
+doing so, it will unintentionally throw out the error
+message. Oops.
 
-To fix this, we had to add a lua patch, which adds a new API function "lua_isthrowing." This API function is used by the LuaExtStack destructor, to decide whether to clean up the stack or not. This new API function is not used anywhere else in Luprex, and I do not expect it will ever be needed anywhere else.
+To fix this, we had to add a lua patch, which adds a new API
+function "lua_isthrowing." This API function is used by the
+LuaExtStack destructor, to decide whether to clean up the
+stack or not. This new API function is not used anywhere
+else in Luprex, and I do not expect it will ever be needed
+anywhere else.
 
-This patch is live and is needed to keep lua error messages working.
+This patch is live and is needed to keep lua error messages
+working.
 
 ## The Object-Oriented Lua Patch
 
-Lua has a colon operator, for method lookup:
-
-```lua
-obj:method(arg1, arg2, arg3)
-```
-
-This looks for the method closure in *obj*. Instead of looking for the method in *obj*, I would like lua to look for the method in a separate table, the object's *class* table. One way to accomplish that is to use the *index* metamethod:
-
-```lua
-setmetatable(obj, { __index = class } )
-```
-
-That's not bad, but it puts both values and methods into the same namespace:
-
-- Looking up data (eg, *obj.value)* will look for value in *obj*, and if it's not found, it will look for value in the *class* table. Looking for a value in the *class* table is pointless and inefficient.
-
-- Looking up a method (eg, *obj:method*) will look for the closure in *obj* first, before looking in the *class* table. Looking for a closure in the data table *obj* is again pointless and inefficient.
-
-- By putting both values and methods into the same namespace, we create the possibility of unintended mistakes.
-
-I am thinking about implementing a new metatable entry: __METHODS = true. If this flag is present, then the colon operator *obj:method* looks for the method in the metatable, instead of looking for it in the object. With this new metamethod, the way to create a class would be to make a table full of methods, and then in that table, also put __METHODS = true. Then you would do this:
-
-```lua
-setmetatable(obj, class)
-```
-
-The class table would *be* the metatable. The result of this patch is that method lookup is done in the class table, value lookup is done in the object, and the two namespaces are kept separate.
-
-Implementing this, as it turns out, is quite simple. The lua virtual machine contains an opcode, OP_SELF. This opcode is only used when the lua scripter uses the colon operator for method invocation. The opcode has two inputs: *obj*, and *method* (where method is a string, a method name). It returns the closure to call. Currently, OP_SELF just calls luaV_getttable to fetch the method from *obj*. We could easily replace the call to luaV_gettable with a function that we write ourselves, luaV_getmethod.
+We have a patch to make lua object-oriented. To
+really understand this patch, we need a separate
+markdown file, "Object-Oriented-Lua.md". See
+that file for an explanation.
 
 ## The Print No Address Patch (Unimplemented)
 
diff --git a/Docs/Object-Oriented-Lua.md b/Docs/Object-Oriented-Lua.md
new file mode 100644
index 00000000..285840a1
--- /dev/null
+++ b/Docs/Object-Oriented-Lua.md
@@ -0,0 +1,336 @@
+
+## How to do Methods and Inheritance in Standard Lua
+
+Standard Lua has the ability to do classes and
+inheritance. Let me explain how.
+
+Lua has a colon operator, for method lookup. It looks
+like this:
+
+```lua
+    local result = v1:dotproduct(v2)
+```
+
+By default, this looks for the method "dotproduct" inside
+the table v1.
But you don't really want to put a whole
+bunch of function objects (like dotproduct) into v1, and
+into every other vector. Instead, you want to put those
+function objects in a separate table, the class table:
+
+```lua
+    vector = {}
+    function vector.dotproduct(v1, v2) ... end
+    function vector.crossproduct(v1, v2) ... end
+```
+
+Fortunately, Lua can do that. Lua has a metamethod, INDEX,
+that basically says: "if you don't find what you're looking
+for in this table, look in that table instead." We can use
+the INDEX metamethod to tell lua: if you're looking for a
+method "dotproduct" in v1, and it's not there, look in class
+vector instead.
+
+So to do that, you only need two steps. First, we're going
+to decide that class vector doesn't just hold the methods
+for vectors. We're also going to use it as the metatable
+for all vectors:
+
+```lua
+    v1 = {x=100, y=100, z=100}
+    v2 = {x=200, y=200, z=200}
+    setmetatable(v1, vector)
+    setmetatable(v2, vector)
+```
+
+Now that vector is the metatable for all vectors, you can
+put metamethods into class vector and they will affect all
+vectors. We're going to put an __index metamethod in there:
+
+```lua
+    vector = {}
+    vector.__index = vector
+    function vector.dotproduct(v1, v2) ... end
+    function vector.crossproduct(v1, v2) ... end
+```
+
+That says: if you're doing a lookup on a vector,
+and the lookup fails, look in class vector instead.
+
+OK, in case you didn't quite follow, let me walk you through
+step by step what happens when you evaluate the expression
+v1:dotproduct(v2).
+
+First, the colon operator looks for "dotproduct" in v1.
+It's not going to find it: we're not going to put all
+those function objects into every single vector.
+
+When the table lookup code fails to find "dotproduct" in v1,
+it goes to a fallback codepath: it checks whether v1 has a
+metatable. It does: vector is the metatable for all vectors.
+So v1 does indeed have a metatable.
The table lookup
+code then checks whether the metatable contains INDEX:
+again, it does, class vector contains INDEX. Finally,
+the table lookup checks what INDEX is pointing to:
+class vector again. So the table lookup code, having
+failed to find "dotproduct" in v1, looks for it in
+class vector. The method lookup succeeds.
+
+The little trick "vector.__index = vector" certainly feels
+arcane, but it works. I like to hide it with a little
+syntactic sugar:
+
+```lua
+    function makeclass(name)
+        _G[name] = {}
+        _G[name].__index = _G[name]
+    end
+```
+
+Now I can say this, and it hides all the metatable magic:
+
+```lua
+    makeclass("vector")
+    function vector.dotproduct(v1, v2) ... end
+    function vector.crossproduct(v1, v2) ... end
+```
+
+A useful property of this design is that vector is *both*
+the metatable for all vectors, and *also* the class table
+for all vectors. One consequence of that is that I can
+put both regular methods and metamethods in there:
+
+```lua
+    makeclass("vector")
+    function vector.__eq(v1, v2) ... end
+    function vector.dotproduct(v1, v2) ... end
+    function vector.crossproduct(v1, v2) ... end
+```
+
+Let's say you also want inheritance: you want vector to
+inherit from vectorbase. That's another thing you can
+accomplish using INDEX again: "if you can't find what you're
+looking for in vector, look in vectorbase." Here's the code
+for that:
+
+```lua
+    makeclass("vectorbase")
+    function vectorbase.whatever(...) ... end
+
+    makeclass("vector")
+    function vector.dotproduct(v1, v2) ... end
+    function vector.crossproduct(v1, v2) ... end
+
+    setmetatable(vector, vectorbase)
+```
+
+That 'setmetatable' directs all failed lookups from
+vector into vectorbase. I've set up a search path.
+
+That's really all there is to it. Honestly, it's
+reasonably straightforward.
+
+## Why I Don't Love It: High Bug Potential
+
+Let's say you want to create a class, "json".
This is
+for receiving and transmitting json over HTTP:
+
+```lua
+    json={}
+    function json.validate() ... end
+    function json.serialize(output) ... end
+```
+
+Now, suppose you receive this json from an HTTP client:
+
+```json
+    {
+        "jsonrpc" : "2.0",
+        "validate" : true
+    }
+```
+
+You convert the json into a lua table *request*, call
+setclass(request, json), and then you try to validate
+the request:
+
+```lua
+    local legal = request:validate()
+```
+
+This produces an error: "true" is not a callable function.
+That's because when it looks for the "validate" method,
+it instead finds the "validate" data that is in the request.
+
+The underlying problem is this: method lookups should
+*only* look in the class table, not in the request.
+Meanwhile, data lookup should be *only* in the request,
+not in the class table. But the INDEX metamethod
+doesn't differentiate between method lookups and data
+lookups. All lookups are treated the same. So all
+accesses go to *both* tables.
+
+Using this design for object-orientation means you're always
+at risk of data hiding your methods, at unexpected times.
+It also means you can get false readings on instance
+variables if the actual instance variable is nil.
+
+## Why I Don't Love It: Speed
+
+How much does this cost:
+
+```lua
+    local result = v1:dotproduct(v2)
+```
+
+The answer is: it has to do three table lookups for just
+the method lookup:
+
+- v1["dotproduct"]
+- metatable["__index"]
+- vector["dotproduct"]
+
+That also applies to data lookups. Suppose I do this:
+
+```lua
+    local param = request.param
+```
+
+That's a simple data lookup. But if the request is of class
+json, and the request doesn't contain "param", then it does
+three table lookups, trying to find the data "param":
+
+- request["param"]
+- metatable["__index"]
+- json["param"]
+
+Looking for param in the class table, aside from being
+risky, is wasting CPU time. This makes lua, an already slow
+interpreted language, significantly slower.
+
+## A Patch
+
+To solve these problems, we need to establish a rule: method
+lookups should *only* go to the class table, and data
+lookups should *only* go to the data table.
+
+To make that happen, I introduce a new metamethod, CLASS
+(with leading underscores). This metamethod changes *only*
+the behavior of the method lookup operator (colon):
+
+```lua
+    v1:dotproduct(v2)
+```
+
+The behavior is: if you do a method lookup, then the colon
+operator looks to see if v1 has a metatable with the CLASS
+metamethod in it. If so, then it looks for the method in
+the *metatable* of v1 instead of looking in v1.
+
+Let me show you how this is intended to be used. We create
+the class table in pretty much the same way, but instead
+of putting INDEX in there, we put CLASS:
+
+```lua
+    vector = {}
+    vector.__CLASS = true
+    function vector.dotproduct(v1, v2) ... end
+    function vector.crossproduct(v1, v2) ... end
+```
+
+We again intend to make vector the metatable for all
+vectors. So again, when you create a vector, you do
+this:
+
+```lua
+    v1 = {x=100, y=100, z=100}
+    v2 = {x=200, y=200, z=200}
+    setmetatable(v1, vector)
+    setmetatable(v2, vector)
+```
+
+So far, it looks almost identical. But when you do
+v1:dotproduct(v2), the method lookup checks whether v1 has a
+metatable (it does), and whether that metatable has CLASS in
+it (it does). So therefore, it *doesn't* look for dotproduct
+in v1. It *only* looks for it in the metatable of v1, which
+is class vector.
+
+The CLASS metamethod has no effect on data lookups. That
+includes the dot operator, and the array lookup operator:
+
+```lua
+    local param = request.param
+
+    local param = request["param"]
+```
+
+Even if request above has the json class as its metatable,
+these lookups are just plain lua table lookups, and that's
+all. Class json isn't involved.
+
+This cleanly separates the two namespaces: data is looked
+up in the data table, methods are looked up in the methods
+table.
+
+So that leaves inheritance.
We use a slightly different
+version of the CLASS metamethod:
+
+```lua
+    vector = {}
+    vector.__CLASS = vectorbase
+    function vector.dotproduct(v1, v2) ... end
+    function vector.crossproduct(v1, v2) ... end
+```
+
+Saying CLASS=true means "this is a class." Saying
+CLASS=vectorbase means "this is a class, derived from
+vectorbase". If you do that, then the method lookup
+operator will look in vector first, then vectorbase, and it
+will follow the inheritance chain upward.
+
+How does the CLASS metamethod compare, performance-wise, to
+using the INDEX metamethod to achieve object-orientation?
+Let's start with data lookups. Using INDEX, this can take
+up to three table lookups:
+
+```lua
+    local param = request.param
+```
+
+However, using CLASS instead, this is only ever one table
+lookup. You might assume that since request has a
+metatable, that lua would at least have to do a table lookup
+to *see* if there's an INDEX metamethod, even if there's not
+one. However, lua has a clever optimization: lua remembers
+that INDEX is not present in class json, and it doesn't
+look a second time. It only ever does the INDEX lookup
+once, for the entire program.
+
+Method lookups like v1:dotproduct(v2) are also accelerated.
+Using INDEX, it takes three table lookups to find
+"dotproduct" in class vector. Using CLASS, however, it
+usually takes only one table lookup. It skips looking in v1
+entirely. And, like INDEX, lua remembers that CLASS=true is
+present in class vector, and it doesn't look a second time.
+So it goes straight to looking for "dotproduct" in class
+vector. It only has to look up CLASS if the method is not
+found, and we have to walk the inheritance tree.
+
+Finally, I'd like to say something about readability:
+
+```lua
+    vector.__CLASS = true
+```
+
+## Summary
+
+Metamethod CLASS achieves object-oriented method lookup,
+but with two advantages over using INDEX:
+
+* Safely separates the two namespaces: data, and methods.
+
+* CLASS is faster than using INDEX.
+
+
+
+
diff --git a/luprex/cpp/core/source.cpp b/luprex/cpp/core/source.cpp
index ec1704e8..cbff7559 100644
--- a/luprex/cpp/core/source.cpp
+++ b/luprex/cpp/core/source.cpp
@@ -679,25 +679,6 @@ bool SourceDB::function_docs(const LuaCoreStack &LS, LuaSlot fn, std::ostream &o
     }
 }
 
-// These should go away eventually. They're for debugging.
-LuaDefine(coroutine_setnextid, "thread,id", "set the next id of a thread (debugging only)") {
-    LuaArg co, lid;
-    LuaDefStack LS(L, co, lid);
-    lua_State *CO = LS.ckthread(co);
-    lua_Number id = LS.ckinteger(lid);
-    lua_setnextid(CO, id);
-    return LS.result();
-}
-
-LuaDefine(coroutine_getnextid, "thread", "get the next id of a thread (debugging only)") {
-    LuaArg co;
-    LuaRet lid;
-    LuaDefStack LS(L, co, lid);
-    lua_State *CO = LS.ckthread(co);
-    LS.set(lid, lua_getnextid(CO));
-    return LS.result();
-}
-
 LuaDefine(unittests_sourcedb, "", "some unit tests") {
     LuaSnap msnap;
     LuaSnap ssnap;
diff --git a/luprex/ext/eris-master/src/lapi.c b/luprex/ext/eris-master/src/lapi.c
index f2d47919..2b1337dc 100644
--- a/luprex/ext/eris-master/src/lapi.c
+++ b/luprex/ext/eris-master/src/lapi.c
@@ -514,16 +514,6 @@ LUA_API const void *lua_topointer (lua_State *L, int idx) {
 }
 
 
-LUA_API lua_Integer lua_getnextid (lua_State *L) {
-  return L->nextid;
-}
-
-
-LUA_API void lua_setnextid (lua_State *L, lua_Integer id) {
-  L->nextid = id;
-}
-
-
 /*
 ** push functions (C -> stack)
 */
diff --git a/luprex/ext/eris-master/src/lstate.h b/luprex/ext/eris-master/src/lstate.h
index 7a26c7c9..daffd9aa 100644
--- a/luprex/ext/eris-master/src/lstate.h
+++ b/luprex/ext/eris-master/src/lstate.h
@@ -173,7 +173,6 @@ struct lua_State {
   struct lua_longjmp *errorJmp;  /* current error recover point */
   ptrdiff_t errfunc;  /* current error handling function (stack index) */
   CallInfo base_ci;  /* CallInfo for first level (C calling Lua) */
-  lua_Integer nextid; /* ID allocator for luprex */
 };
 
 
diff --git a/luprex/ext/eris-master/src/lua.h
b/luprex/ext/eris-master/src/lua.h
index 17d5e621..d96e244c 100644
--- a/luprex/ext/eris-master/src/lua.h
+++ b/luprex/ext/eris-master/src/lua.h
@@ -178,9 +178,6 @@ LUA_API void *(lua_touserdata) (lua_State *L, int idx);
 LUA_API lua_State *(lua_tothread) (lua_State *L, int idx);
 LUA_API const void *(lua_topointer) (lua_State *L, int idx);
 
-LUA_API lua_Integer (lua_getnextid) (lua_State *L);
-LUA_API void (lua_setnextid) (lua_State *L, lua_Integer id);
-
 /*
 ** Comparison and arithmetic functions
 */