Change the lua length operator to do the obvious thing.

2026-02-11 14:51:03 -05:00
parent 24075cd356
commit 30e53c3054
4 changed files with 56 additions and 53 deletions

@@ -86,21 +86,63 @@ This patch is live and is used implicitly whenever you iterate over a lua table.
## The Table Length Patch
The builtin lua function lua_len is nondeterministic. By that, I mean that two tables with the exact same keys might return different values for lua_len. We can't allow nondeterministic anything in our version of Lua. We have altered the implementation of lua_len so that it is deterministic. Two tables with the same keys will always return the same lua_len; that is now guaranteed.
I've changed the lua length operator so that when it is
applied to a table, it returns the number of keys in the
table. It does this in constant time. This change affects
lua_len, lua_rawlen, and the lua # operator.
Our new implementation of lua_len conforms to the specification in the documentation. I'm not sure that's the right thing to do.
You might be wondering what the lua length operator
used to do. The lua documentation says this:
It's obvious how this specification got written: they implemented an algorithm to find the length of a vector as efficiently as possible. By "vector," I mean a table whose keys are 1,2,3,4,5 and so forth. After they wrote this vector-length algorithm, somebody asked, "what happens if you apply that algorithm to a table that's not a vector?" The implementor replied, "it wasn't meant for non-vectors." "Ok, but what *does* it do if you apply it to a non-vector?" They puzzled it out, and they wrote down what it does as "the specification." But that specification when applied to non-vectors isn't *useful*. It's just what their vector-length algorithm happens to spit out when you feed it an input that it wasn't designed to handle.
> The length operator applied on a table returns a border in
> that table. A border in a table t is any non-negative
> integer that satisfies the following condition:
>
> (border == 0 or t[border] ~= nil) and
> (t[border + 1] == nil or border == math.maxinteger)
So why did they use an algorithm that only works on vectors? Why not use a better algorithm, one that can return the number of keys in the table regardless of whether the table is a vector? The answer is that given the lua table internal representation, returning the number of keys in the table is O(N), whereas the vector-only implementation is usually O(1).
Those are *terrible* semantics:
However, I had to change the table internal representation for the table iterator patch (above). With the modified table representation, returning the number of keys in the table can be done in constant time, whether it's a vector or not.
- They're not useful for anything.
- They're not deterministic.
- In no sense of the word "length" is this the
length of the table.
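To make the "not deterministic" complaint concrete, here is a minimal sketch that checks the quoted border condition against a toy table. `ToyTable`, `toy_present`, `is_border`, and `count_borders` are hypothetical names made up for this illustration; they are not part of lua or of this patch. The sketch only models integer keys, and it drops the `math.maxinteger` clause because the toy tables are tiny.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy table with integer keys 1..n only.
   slots[i - 1] != 0 means t[i] is non-nil. */
typedef struct {
    const int *slots;
    size_t n;
} ToyTable;

/* Nonzero if t[i] is non-nil; keys outside 1..n count as nil. */
int toy_present(const ToyTable *t, size_t i) {
    return i >= 1 && i <= t->n && t->slots[i - 1] != 0;
}

/* The border condition from the quoted manual text
   (math.maxinteger clause omitted for these small tables). */
int is_border(const ToyTable *t, size_t b) {
    return (b == 0 || toy_present(t, b)) && !toy_present(t, b + 1);
}

/* Count how many candidates in 0..n satisfy the condition. */
size_t count_borders(const ToyTable *t) {
    size_t count = 0;
    for (size_t b = 0; b <= t->n; b++) {
        if (is_border(t, b)) count++;
    }
    return count;
}

/* Self-check: a vector has one border, a table with a hole has two. */
void border_demo(void) {
    const int vec[] = {1, 1, 1, 1};    /* keys 1, 2, 3, 4 */
    const int holes[] = {1, 1, 0, 1};  /* keys 1, 2, 4 */
    ToyTable v = {vec, 4};
    ToyTable h = {holes, 4};
    assert(count_borders(&v) == 1 && is_border(&v, 4));
    assert(count_borders(&h) == 2 && is_border(&h, 2) && is_border(&h, 4));
}
```

For the table with keys 1, 2, 4, both 2 and 4 satisfy the condition, so the specification permits either answer. Neither one is the key count, which is 3.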
Now, I'm seriously tempted to have lua_len just return the number of keys in the table. That would be so straightforward and self-explanatory, and faster than the current algorithm. The only reason I haven't done this is that it wouldn't conform to the specification! My new lua_len algorithm is similar to the original algorithm, in that it fails in exactly the same way on non-vectors, in order to be compliant with the specification.
Let me explain how that mess happened. They obviously
wanted the length operator to return the number of keys in
the table. Unfortunately, counting the keys in a stock lua
table takes O(N) time. So they came up with a hack that is
faster, usually O(1). Unfortunately, the hack relies
on the table being a vector. That is, the table must have
numbered keys starting with 1. As long as you apply their
hack to a vector, it works perfectly and returns the
number of keys.
Since this feels insane, I have also provided a totally new API function: lua_nkeys. This returns the number of keys in the table, full stop. It's constant-time.
Unfortunately, if you apply the hacked length algorithm to a
table that isn't a vector, it doesn't work at all.
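Here is a rough sketch of that failure mode. It reproduces only the doubling probe and binary search that appear in the old `luaH_getn` in this commit, applied to a hypothetical toy table. `ToyTable`, `toy_present`, `border_search`, and `count_keys` are illustration-only names, and the overflow fallback from the real function is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy table with integer keys 1..n only.
   slots[i - 1] != 0 means t[i] is non-nil. */
typedef struct {
    const int *slots;
    size_t n;
} ToyTable;

/* Nonzero if t[i] is non-nil; keys outside 1..n count as nil. */
int toy_present(const ToyTable *t, size_t i) {
    return i >= 1 && i <= t->n && t->slots[i - 1] != 0;
}

/* Doubling probe followed by binary search, in the same shape as
   the old luaH_getn (overflow fallback omitted in this sketch). */
size_t border_search(const ToyTable *t) {
    size_t i = 0, j = 1;
    /* find i and j such that i is present (or 0) and j is not */
    while (toy_present(t, j)) {
        i = j;
        j *= 2;
    }
    /* binary search between them */
    while (j - i > 1) {
        size_t m = (i + j) / 2;
        if (toy_present(t, m)) i = m;
        else j = m;
    }
    return i;
}

/* The answer we actually want: the number of non-nil keys. */
size_t count_keys(const ToyTable *t) {
    size_t count = 0;
    for (size_t k = 1; k <= t->n; k++) {
        if (toy_present(t, k)) count++;
    }
    return count;
}

/* Self-check: the search matches the key count only on a vector. */
void search_demo(void) {
    const int vec[] = {1, 1, 1, 1, 1};  /* keys 1..5 */
    const int gap[] = {1, 0, 1, 1};     /* keys 1, 3, 4 */
    ToyTable v = {vec, 5};
    ToyTable g = {gap, 4};
    assert(border_search(&v) == 5 && count_keys(&v) == 5);
    assert(border_search(&g) == 1 && count_keys(&g) == 3);
}
```

On the vector, the search and the key count agree. On the table with keys 1, 3, 4, the search stops at the first gap it happens to probe and reports 1, while the table really holds 3 keys.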
This patch is live, and is necessary to the determinism of the system.
But I think the lua documentation didn't want to admit, "it
doesn't work at all." So instead, they invented this
concept of "a border" and pretended that was in some way a
helpful result. They should have just said, "the result is
undefined."
I had to change the table internal representation
for the table iterator patch (above). With the modified
table representation, returning the number of keys in the
table can be done in constant time, whether it's a vector or
not. So I changed the length operator to just return
the number of keys, full stop.
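The C side of this commit shows the length being read from a `t->nnkeys` field, but not how that field is kept up to date. The following is only a guess at the general idea, on a hypothetical toy structure: adjust a counter whenever a key changes between nil and non-nil, so that reading the length is a single field access. `CountedTable`, `counted_init`, `counted_set`, and `counted_len` are made-up names, not functions from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_CAPACITY 16

/* Hypothetical table: present[k] != 0 means key k holds a value. */
typedef struct {
    int present[TOY_CAPACITY];
    size_t nkeys;  /* running count of non-nil keys */
} CountedTable;

/* Start with every key nil and a count of zero. */
void counted_init(CountedTable *t) {
    memset(t, 0, sizeof *t);
}

/* Make key k non-nil (non_nil != 0) or nil (non_nil == 0).
   Returns 0 on success, -1 if k is outside the toy capacity. */
int counted_set(CountedTable *t, size_t k, int non_nil) {
    if (k >= TOY_CAPACITY) return -1;
    if (non_nil && !t->present[k]) t->nkeys++;       /* nil -> value */
    else if (!non_nil && t->present[k]) t->nkeys--;  /* value -> nil */
    t->present[k] = non_nil ? 1 : 0;
    return 0;
}

/* Constant-time length: just read the counter. */
size_t counted_len(const CountedTable *t) {
    return t->nkeys;
}

/* Self-check: the count follows inserts, overwrites, and deletes. */
void counted_demo(void) {
    CountedTable t;
    counted_init(&t);
    counted_set(&t, 1, 1);
    counted_set(&t, 2, 1);
    counted_set(&t, 4, 1);
    assert(counted_len(&t) == 3);  /* keys 1, 2, 4: not a vector */
    counted_set(&t, 2, 1);         /* overwrite, count unchanged */
    assert(counted_len(&t) == 3);
    counted_set(&t, 2, 0);         /* delete */
    assert(counted_len(&t) == 2);
}
```

The cost moves to the write path: every assignment pays a small constant to keep the counter honest, and in exchange the length never depends on which keys happen to be present.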
I've also added another function, lua_nkeys. This also
returns the number of keys in the table. It doesn't add any
functionality - I could use lua_rawlen and that would work
just as well. However, using lua_nkeys emphasizes the fact
that my code needs the *real* table length, not the "border"
bullshit that lua used to provide.
This patch is live, and is necessary to the determinism of
the system.
## The Table Flag Bits Patch

@@ -379,6 +379,10 @@ def generate_integration_code_workspace():
def build_clean():
    """
    This code is underdeveloped.
    For a more aggressive form of cleaning, use 'git clean -xfd',
    which deletes every untracked and ignored file in your working tree.
    DANGER: this deletes any new source code you haven't added to git!
    """
    shell(f"{INTEGRATION}/luprex", "make clean")
    shell(INTEGRATION, f"{UNREALENGINE}/Engine/Build/BatchFiles/{BUILD_BAT} -waitmutex IntegrationEditor {OS} {DEBUG} {INTEGRATION}/Integration.uproject -clean")

@@ -163,27 +163,6 @@ int luaH_nkeys (Table *t) {
  return t->nnkeys;
}

int luaH_nthkey (lua_State *L, Table *t, int n, StkId pair) {
  n -= 1;  /* convert to C indexing */
  if ((n < 0) || (n >= t->nnkeys)) {
    setnilvalue(pair+0);
    setnilvalue(pair+1);
    return 0;
  }
  int index = t->sequence[n];
  if (index < t->sizearray) {
    setnvalue(pair + 0, index + 1);
    setobj2s(L, pair + 1, &t->array[index].i_val);
    return 1;
  } else {
    index -= t->sizearray;
    Node *n = t->node + index;
    setobj2s(L, pair + 0, gkey(n));
    setobj2s(L, pair + 1, gval(n));
    return 1;
  }
}

int successorindex (lua_State *L, Table *t, StkId key) {
  int i, seqno;
  if (ttisnil(key)) {
@@ -706,32 +685,11 @@ void luaH_setint (lua_State *L, Table *t, int key, TValue *value) {
}
/*
** Try to find a boundary in table `t'. A `boundary' is an integer index
** such that t[i] is non-nil and t[i+1] is nil (and 0 if t[1] is nil).
** Return the number of keys in the table.
*/
int luaH_getn (Table *t) {
unsigned int j = 1;
unsigned int i = 0;
/* find `i' and `j' such that i is present and j is not */
while (!ttisnil(luaH_getint(t, j))) {
i = j;
j *= 2;
if (j > cast(unsigned int, MAX_INT)) { /* overflow? */
/* table was built with bad purposes: resort to linear search */
i = 1;
while (!ttisnil(luaH_getint(t, i))) i++;
return i - 1;
return t->nnkeys;
}
}
/* now do a binary search between them */
while (j - i > 1) {
unsigned int m = (i+j)/2;
if (ttisnil(luaH_getint(t, m))) j = m;
else i = m;
}
return i;
}
#if defined(LUA_DEBUG)

@@ -37,7 +37,6 @@ LUAI_FUNC void luaH_resize (lua_State *L, Table *t, int nasize, int nhsize);
LUAI_FUNC void luaH_resizearray (lua_State *L, Table *t, int nasize);
LUAI_FUNC void luaH_free (lua_State *L, Table *t);
LUAI_FUNC int luaH_nkeys (Table *t);
LUAI_FUNC int luaH_nthkey (lua_State *L, Table *t, int n, StkId pair);
LUAI_FUNC int luaH_next (lua_State *L, Table *t, StkId key);
LUAI_FUNC int luaH_getn (Table *t);