### A New Lua Type: Tokens
Tokens are a custom Lua data type built on top of Lua's lightuserdata. They are mainly intended for use as sentinels and special reserved values.
## Motivation
Tokens were invented while we were developing a JSON-to-Lua converter. Such a converter is mostly straightforward, because JSON objects and Lua tables are very similar. However, we did hit one stumbling block. Consider this JSON:
```json
{ "foo": null }
```
In Lua, setting a table key to nil deletes the key. There is no way to represent "foo is present with value null" in a Lua table. You might try `{foo = 0}` or `{foo = "null"}`, but both are lossy: you can no longer distinguish JSON null from the number 0 or the string "null". Any sentinel value drawn from an existing Lua type collides with legitimate values of that type.

The solution is to use lightuserdata. A lightuserdata is a distinct Lua type: it cannot be confused with a string, number, boolean, or nil, and unlike nil, it can be stored in a table. The Luprex engine does not use lightuserdata for any other purpose, so all lightuserdata values are available for use as tokens.
## What a Token Is
A token is a short string encoded as a base36 number and stored in the 8-byte lightuserdata value. The lightuserdata is not actually a pointer to anything; it holds the base36-encoded integer directly. Tokens may only contain the characters a-z and 0-9. Since 36^12 (about 4.7 × 10^18) fits in 64 bits but 36^13 (about 1.7 × 10^20) does not, the maximum token length is 12 characters. That is sufficient for most natural identifiers.

Since lightuserdata is not used for anything else, it is safe to assume that any lightuserdata in our engine represents a token.
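
The encoding described above can be sketched in a few lines of C++. This is a minimal illustration, not the engine's code: `base36_digit` and `encode_base36` are hypothetical names, and the digit order (most significant character first, with `0`-`9` mapping to 0-9 and `a`-`z` mapping to 10-35) is an assumption, chosen because it reproduces the `0x10FAA9` value this document gives for `"null"`.

```cpp
#include <cstdint>
#include <optional>
#include <string_view>

// Hypothetical helper, not part of the engine: maps one character to its
// base36 digit value, or returns -1 if the character is not allowed.
constexpr int base36_digit(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'z') return c - 'a' + 10;
    return -1;
}

// Hypothetical helper: encodes up to 12 characters as a base36 integer,
// most significant character first. Returns nullopt for invalid input.
constexpr std::optional<std::uint64_t> encode_base36(std::string_view text) {
    if (text.size() > 12) return std::nullopt;  // a 13th digit could overflow 64 bits
    std::uint64_t value = 0;
    for (char c : text) {
        const int digit = base36_digit(c);
        if (digit < 0) return std::nullopt;
        value = value * 36 + static_cast<std::uint64_t>(digit);
    }
    return value;
}

// Compile-time checks of the sketch against the values quoted in the text.
static_assert(encode_base36("null") == 0x10FAA9ULL);
static_assert(encode_base36("zzzzzzzzzzzz").has_value());   // 12 characters still fit
static_assert(!encode_base36("thirteenchars").has_value()); // 13 characters do not
static_assert(!encode_base36("Null").has_value());          // uppercase is rejected
```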
## The C++ Side: struct LuaToken
On the C++ side, tokens are represented by `struct LuaToken` (in `luastack.hpp`). You can construct one from a string or from the raw integer:
```cpp
LuaToken("null")    // parsed at compile time via consteval; becomes 0x10FAA9
LuaToken(0x10FAA9)  // equivalent raw value
```
The string form is preferred: it is readable, and because the constructor is `consteval`, it compiles down to the same constant as the raw integer, with zero runtime cost. If the string contains invalid characters (anything outside a-z, 0-9) or is too long, the error is caught at compile time.

There is also a runtime constructor that accepts `std::string_view`, for cases where the token string is not known at compile time.
The LuaStack API provides the usual accessors for tokens:
```cpp
LS.set(slot, LuaToken("null"));    // store a token in a LuaSlot
LuaToken t = LS.cktoken(slot);     // extract a token (error if not lightuserdata)
auto maybe_t = LS.trytoken(slot);  // extract a token (returns empty optional on mismatch)
```
Named token constants can be auto-registered into the Lua environment using the `LuaTokenConstant` macro, which works the same way `LuaDefine` auto-registers functions:
```cpp
LuaTokenConstant(null, "null", "Represents JSON null")
```
## Properties
- **Distinct type.** Tokens are lightuserdata, a separate Lua type. They cannot collide with strings, numbers, booleans, tables, or nil.
- **Storable in tables.** Unlike nil, tokens can be used as both table keys and table values.
- **No allocation.** Tokens are 8 bytes inline. There is no heap allocation and no string interning.
- **Fast comparison.** Comparing two tokens is just an integer comparison.
## Limitation: No Token Literals in Lua
Lua's parser has no syntax for token literals. In C++, you can write `LuaToken("null")`, which is clean and evaluated at compile time. In Lua, there is no equivalent: you cannot write a token literal the way you write `"hello"` or `42`.

Currently, tokens reach Lua through C++ code that uses `LuaTokenConstant` to insert specific token values into global tables. Lua scripts can then reference these pre-registered constants by name.

Modifying the Lua parser to add token literal syntax has been considered but is unappealing, since it would be a significant and invasive patch. Adding a Lua function like `token("null")` to construct tokens at runtime is also possible and not off the table, but there hasn't been a need for it yet.
## Passing Tokens to Unreal
Tokens can get passed to Unreal in a variety of ways. For example, in animation step key-value pairs, the value can be a token. When animation queues are passed to Unreal, tokens are converted to FNames. Since both tokens and FNames are short identifier-like strings with fast comparison, the mapping is natural.
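
Converting a token to an FName presumably starts by turning the integer back into its text. The sketch below shows one way to do that decoding, assuming plain base36 with the most significant character first. `decode_base36` is a hypothetical name, and the hand-off to Unreal is left out because this document does not describe it.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>

// Hypothetical helper: recovers the text of a token from its integer value.
std::string decode_base36(std::uint64_t value) {
    static constexpr char kDigits[] = "0123456789abcdefghijklmnopqrstuvwxyz";
    std::string text;
    while (value > 0) {
        text.push_back(kDigits[value % 36]);  // least significant digit first
        value /= 36;
    }
    std::reverse(text.begin(), text.end());   // restore most-significant-first order
    return text;
}
```

With this scheme, `decode_base36(0x10FAA9)` yields `"null"`. Leading `0` characters cannot be recovered from the integer alone, so the sketch returns an empty string for the value 0; how the engine treats such cases is not stated here.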
## Usage
Tokens are mainly intended as sentinels and special reserved values. The JSON null example above is the motivating case, but tokens can represent any short reserved constant the engine needs.