integration/Docs/Major Data Structures.md
2026-02-05 12:41:07 -05:00

## Overview
This document gives a grand overview of the major data structures used by the engine.
## The World Model
There will be a C++ class World.
Each World will own its own lua interpreter. The World constructor will call *lua_newstate* to create a new lua interpreter. The World will retain a pointer to the lua_State returned by lua_newstate. When the World is destroyed, the associated lua_State will be closed down.
When you call *lua_newstate* to create a new lua interpreter, it automatically creates:
- One lua registry table
- One global environment table
We think of these two tables as "owned" by the World.
We're going to put a pointer to the C++ World object into the lua registry. Any C++ code that takes a lua_State can access the registry, so therefore, any C++ code that takes a lua_State can access the World object.
## The Global Environment Table
When you create a lua interpreter, it automatically creates one global environment table. We use the one global environment table created by lua as the global environment for all functions, methods, and threads inside that lua interpreter. In short: there's only one global environment table per interpreter. Or to put it differently, one global environment table per World.
The global environment table will contain the class database, which is described in the next section. We will take steps to ensure that the global environment doesn't contain anything other than classes.
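One possible way to take those steps is a metatable guard on the global environment. This is only a minimal sketch under assumptions: `env` is a stand-in table rather than the real global table, and `install_class` is a hypothetical engine-side helper, not something this document defines.

```lua
-- Stand-in for the interpreter's global environment table.
local env = {}

-- Hypothetical engine-side helper: the only sanctioned way to add a global.
local function install_class(name, class)
  rawset(env, name, class)   -- rawset bypasses the __newindex guard below
end

setmetatable(env, {
  -- Any ordinary assignment that would create a new global is rejected.
  __newindex = function(_, key)
    error("cannot create global '" .. tostring(key) .. "': only classes live here", 2)
  end,
})

install_class("furnace", {})
```

Note that `__newindex` only fires for keys that are absent, so this guard alone would not stop a script from overwriting an existing class.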
## Class Database
We will have a database of classes. Classes have these properties:
- A class is a lua table, typically full of functions and methods.
- Classes have names, which are strings. Class names are flat, not hierarchical.
- Classes are stored in the global environment table, by name.
- A class has a "__class" key, which holds the class name.
- A class has an "__index" key that points back to the class itself.
- A class may have metamethods in it.
- A class can be used directly as a metatable.
- A class can be used as the __index part of a metatable.
In addition, if the class is meant to be used with tangible objects, the class may have these things inside it:
- An 'action' subtable containing plans.
We provide an operator *makeclass(name).* This operator does the following steps:
- Creates the table if it doesn't already exist.
- If the table already exists, it is retained.
- Inserts the key __class=name into the table.
- Inserts the key __index=(self) into the table.
- Does *not* remove any existing methods or functions.
We also provide an operator *maketangible(name).* This is a variant of makeclass. In addition to the steps above, it also does these:
- Creates an 'action' subtable within the class table, for plans.
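The steps for both operators can be sketched in plain Lua. This is a minimal sketch, not the engine's implementation: `classdb` is a stand-in for the global environment table, and returning the class table and preserving an existing 'action' subtable are both assumptions.

```lua
-- Stand-in for the global environment table, which holds classes by name.
local classdb = {}

local function makeclass(name)
  local class = classdb[name]
  if class == nil then          -- create the table only if it doesn't exist
    class = {}
    classdb[name] = class
  end
  class.__class = name          -- the class records its own name
  class.__index = class         -- lets the class serve directly as a metatable
  return class                  -- assumption: the operator returns the class
end

local function maketangible(name)
  local class = makeclass(name)
  if class.action == nil then   -- assumption: existing plans are kept
    class.action = {}
  end
  return class
end
```

Because existing tables are retained and nothing is removed, calling either operator repeatedly with the same name is harmless.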
## Source Database
The game's script will consist of a collection of lua source files. The lua files will be stored in a source database. The source database is a table inside the lua registry, under the key "sourcedb". The table contains keys which are filenames. Each filename maps to the following:
```
filename : {
"name" : the filename again,
"fingerprint" : a hash of the file's modification date and size,
"code" : entire source code of the lua file as a string,
"hash": a 128-bit hash of the code,
"loadresult": result of calling 'load' - either a closure or an error message,
"sequence": an integer indicating order in which files are to be loaded
}
```
The source database is loaded from the lua subdirectory. This subdirectory contains a file "control.lst" which governs which lua files are to be loaded, and in what order.
The source database is considered part of the World.
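As an illustration of how an entry might be filled in, here is a minimal sketch. The table `sourcedb` is a stand-in for the registry table, `add_source` and `in_load_order` are hypothetical helper names, and the "fingerprint" and "hash" fields are left out because this document doesn't specify how they are computed.

```lua
-- Lua 5.1 compiles strings with loadstring; Lua 5.2 and later use load.
local load_chunk = loadstring or load

-- Stand-in for the registry's "sourcedb" table.
local sourcedb = {}

-- Hypothetical helper: compile the code once and record the result.
local function add_source(name, code, sequence)
  local chunk, err = load_chunk(code, "=" .. name)
  local entry = {
    name = name,
    code = code,
    loadresult = chunk or err,   -- a closure on success, an error message otherwise
    sequence = sequence,
  }
  sourcedb[name] = entry
  return entry
end

-- Hypothetical helper: list the entries in the order they should be loaded.
local function in_load_order()
  local entries = {}
  for _, entry in pairs(sourcedb) do entries[#entries + 1] = entry end
  table.sort(entries, function(a, b) return a.sequence < b.sequence end)
  return entries
end
```

Storing the error message in the same field as the closure means a consumer has to check `type(entry.loadresult)` before calling it.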
## Class Rebuilds
Frequently, we will do "class rebuilds." That means we reconstruct the entire class database, from scratch, from the source database. Here's how that works. Suppose you write this source file:
```lua
makeclass("furnace")
function furnace.burn() end
```
That source file gets loaded into the source database. As part of this process, the code gets passed to lua's *load* function, which returns a *loadresult*, which is a closure (if there's no error). When we call that closure, it creates the class 'furnace' (if it's not already there), and inserts the function *burn* into that class. In summary: calling a source database *loadresult* typically has the effect of inserting functions into the class database.
Therefore, a class rebuild consists of these steps: first, empty out all the classes in the class database. Then, for every source file in the source database, call the *loadresult*. This will repopulate the class database.
The class rebuild is an extremely cheap operation. It doesn't need to load the source from disk, because the source is already in RAM. It doesn't need to compile the source, because the loaded closures are already present in the source database. The only thing actually happening is that the loaded closures reinstall their function pointers into the class database. It should take less than a millisecond.
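The two rebuild steps can be sketched as follows, assuming Lua 5.2 or later (for `load` with an environment argument). Everything here is a stand-in: `classdb` plays the role of the global environment, `sourcedb` the registry table, and `builtins`, `add_source`, and `rebuild_classes` are hypothetical names. Emptying each class in place, rather than discarding the class tables, is also an assumption.

```lua
local classdb = {}    -- stand-in for the global environment (classes by name)
local sourcedb = {}   -- stand-in for the registry's "sourcedb" table

local builtins = {}
function builtins.makeclass(name)
  local class = classdb[name] or {}
  classdb[name] = class
  class.__class = name
  class.__index = class
  return class
end

-- Environment seen by source files: classes first, then builtins.
local env = setmetatable({}, {
  __index = function(_, key) return classdb[key] or builtins[key] end,
})

local function add_source(name, code, sequence)
  local chunk, err = load(code, "=" .. name, "t", env)   -- compiled once, here
  sourcedb[name] = { name = name, code = code, loadresult = chunk or err, sequence = sequence }
end

local function rebuild_classes()
  -- Step 1: empty every class in place, so existing references stay valid.
  for _, class in pairs(classdb) do
    for key in pairs(class) do class[key] = nil end
  end
  -- Step 2: call every loadresult, in sequence order, skipping load errors.
  local entries = {}
  for _, entry in pairs(sourcedb) do entries[#entries + 1] = entry end
  table.sort(entries, function(a, b) return a.sequence < b.sequence end)
  for _, entry in ipairs(entries) do
    if type(entry.loadresult) == "function" then entry.loadresult() end
  end
end
```

Nothing in `rebuild_classes` touches the disk or the compiler; it only clears tables and calls closures that already exist.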
## Reloadable Source
We need our lua source files to be reloadable, for two reasons. First, the class rebuild procedure described above reloads the source files over and over. Second, in order to support on-the-fly code testing, we're going to be editing, loading, editing, loading, and so forth. So for both reasons, we need our lua source files to be reloadable.
Function and method definitions tend to be reloadable without any extra effort. Let's look at that example source file again:
```lua
makeclass("furnace")
function furnace.burn() end
```
If you load that source file twice, no harm is done. The first time you load it, you create the class and insert the function burn. The second time, you fetch the existing class and re-insert the function burn. It works fine. In general, when you insert functions into the class database, you don't have to do anything special to make the code reloadable.
But other code may not be reloadable. For example, suppose we want to implement a user database module, and let's say that module has some initialization code:
```lua
local user_database = global("user database")
user_database.hashed_passwords = {}
```
That's not going to work - it's going to clear the hashed passwords table every time we reload the source file. Lua source files will sometimes need explicit code to achieve reloadability:
```lua
local user_database = global("user database")
if user_database.hashed_passwords == nil then
  user_database.hashed_passwords = {}
end
```
We will try to provide builtin functions to make this sort of thing as easy as possible, but we can't completely eliminate the burden of thinking about this. Lua source code will need to be carefully written to be reloadable.
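As one idea of what such a builtin could look like, here is a minimal sketch. The helper `ensure` is hypothetical and not an existing builtin, and the plain table `user_database` stands in for the result of `global("user database")`.

```lua
-- Hypothetical helper: return tbl[key], creating it with make_default()
-- only if it is missing.
local function ensure(tbl, key, make_default)
  local value = tbl[key]
  if value == nil then
    value = make_default()
    tbl[key] = value
  end
  return value
end

-- Stand-in for the table returned by global("user database").
local user_database = {}

-- Plays the role of a source file's initialization code; safe to run repeatedly.
local function load_user_module()
  return ensure(user_database, "hashed_passwords", function() return {} end)
end
```

Passing a function instead of a ready-made default avoids building a throwaway table on every reload.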
## Tangibles
Tangibles are what the legacy engine used to call "sprites."
Every tangible consists of two parts: a "lua tangible" and a "C++ tangible". The two parts share an ID number, which is a 64-bit integer. ID numbers are kept small enough to be represented exactly in a lua_Number.
The C++ Tangible is a class that contains these things:
- The tangible's ID number.
- The animation queue.
- The positional tracker (which helps us do "scanradius" searches).
- If it's a player, the IdPlayerPool (which helps with unique ID allocation).
The C++ World contains a table that maps tangible ID to C++ tangible.
The Lua Tangible is a table. The lua tangible is used by the script to store arbitrary data. All keys are available to the scripter, none are reserved. However, the lua tangible always has a metatable which *is* reserved. The metatable of a lua tangible contains:
- __id -- the tangible's ID number.
- __metatable -- helps protect the metatable.
- __index -- points to a class from the class database.
- __threads -- the thread table, see the next section.
The registry contains a key, "tangibles", which is a table that maps ID number to lua tangible.
Although the scripter cannot replace the metatable for a tangible, the scripter *can* use an operator that we provide, *setclass(tangible, class)* to change the tangible's class.
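The layout above can be sketched in plain Lua. This is a sketch under assumptions: `new_tangible` is a hypothetical name, `tangibles` stands in for the registry table, and the private `metatables` table stands in for the engine's direct access to the real metatable (C++ code can reach it with `lua_getmetatable`, which ignores the `__metatable` field).

```lua
-- Stand-in for the registry's "tangibles" table: ID number -> lua tangible.
local tangibles = {}

-- Private, weak-keyed map from tangible to its real metatable.
local metatables = setmetatable({}, { __mode = "k" })

-- Hypothetical constructor for the lua half of a tangible.
local function new_tangible(id, class)
  local tangible = {}            -- every key stays free for the scripter
  local meta = {
    __id = id,
    __metatable = "protected",   -- getmetatable() returns this string instead
    __index = class,             -- method lookup goes to the class
    __threads = {},
  }
  setmetatable(tangible, meta)
  metatables[tangible] = meta
  tangibles[id] = tangible
  return tangible
end

-- Change a tangible's class without exposing the metatable to scripts.
local function setclass(tangible, class)
  metatables[tangible].__index = class
end
```

With `__metatable` set, a script calling `setmetatable` on a tangible gets an error, which is what leaves `setclass` as the only route to changing the class.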
## Threads
Every tangible has a table of threads. A thread has a numeric ID which is a 64-bit number. The table of threads has the following contents:
```
tangible metatable : {
"__threads" : {
tid = { thread info table },
tid = { thread info table },
}
}
```
The thread info table contains the following information:
```
thread info table : {
"thread" = coroutine handle,
"actorid" = actor ID,
"print" = if true, thread return values should be printed on thread completion,
"useppool" = if true, the thread should use the actor's ID allocator,
"new" = if true, the thread is newly-created, if false, it's a yielded thread
}
```
The "new" thread flag helps us understand what's on the thread's stack. In the newly-created state, the thread's stack contains a function, followed by function arguments. In the yielded state, the thread's stack contains values that should be passed to the lua_yieldk continuation function.
The World contains a table of scheduled wakeups. This is an ordered list of tuples: <time, thread id, tangible id>. The scheduled wakeups also contain threads that want to wake up immediately, with time=0.
When a thread is created, it is inserted into the appropriate thread table right away. It is also inserted into the schedule right away, with a time of zero. Then, the scheduler is called to run the thread. The thread stays in the thread table until it terminates.
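The creation and wakeup flow can be sketched with Lua coroutines. This is a sketch under several assumptions: the engine drives threads from C++, while this sketch uses `coroutine.resume`; `threads` stands in for the per-tangible `__threads` tables; `spawn`, `run_until`, and `schedule_wakeup` are hypothetical names; a thread yields the number of time units it wants to sleep; and only the "thread" and "new" fields of the thread info table are modeled.

```lua
local schedule = {}   -- ordered list of { time, tid, tangible_id }
local threads = {}    -- tangible_id -> { tid -> thread info }
local next_tid = 0
local now = 0         -- current time, as last seen by the scheduler

-- Insert a wakeup, keeping the list ordered by time (ties stay first-in, first-out).
local function schedule_wakeup(time, tid, tangible_id)
  local pos = #schedule + 1
  while pos > 1 and schedule[pos - 1].time > time do pos = pos - 1 end
  table.insert(schedule, pos, { time = time, tid = tid, tangible_id = tangible_id })
end

-- Run every thread whose wakeup time has arrived.
local function run_until(time)
  now = time
  while schedule[1] and schedule[1].time <= now do
    local entry = table.remove(schedule, 1)
    local info = threads[entry.tangible_id][entry.tid]
    local ok, delay = coroutine.resume(info.thread)   -- errors are ignored in this sketch
    info.new = false
    if coroutine.status(info.thread) == "dead" then
      threads[entry.tangible_id][entry.tid] = nil     -- terminated: leave the thread table
    else
      schedule_wakeup(now + (delay or 0), entry.tid, entry.tangible_id)
    end
  end
end

-- Create a thread: thread table first, then the schedule at time zero, then run.
local function spawn(tangible_id, fn)
  next_tid = next_tid + 1
  threads[tangible_id] = threads[tangible_id] or {}
  threads[tangible_id][next_tid] = { thread = coroutine.create(fn), new = true }
  schedule_wakeup(0, next_tid, tangible_id)
  run_until(now)
  return next_tid
end
```

A thread that keeps yielding a delay of zero would be resumed repeatedly within a single `run_until` call, so a real scheduler would need a policy for that case.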
## Snapshot
Predictive reexecution asks the client to permanently maintain a synchronous model. The client is also supposed to generate transient asynchronous models, apply predictions to those asynchronous models, and then moments later, throw those asynchronous models away.
To implement this, we're not actually going to create separate synchronous and asynchronous models. Instead, we're going to have one combined "client" model which plays both roles. It will accomplish this using a snapshot-and-rollback ability.
The lua interpreter's state is snapshotted via a memory trick: we keep track of the memory used by Lua, and we just snapshot the memory without understanding what's in it. When we're ready, we restore all that memory to its previous state. This works quite well to restore a lua interpreter to a previous state.
Snapshot and rollback is used as follows. The client model initially contains no predictions, which means it's a synchronous model. We snapshot this synchronous state. Then, we apply predictions. Now it's an asynchronous model. When we're ready, we roll back to the snapshotted state, which means the predictions have been removed and it's back to being a synchronous model.
The snapshot and rollback functionality will be built right into C++ class World.
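The memory-level trick itself can't be expressed in Lua, so the sketch below swaps in a different technique, a deep copy of a plain table, purely to illustrate the snapshot, predict, rollback cycle. The names `new_model`, `snapshot`, and `rollback` are hypothetical and do not describe the World's actual interface.

```lua
-- Recursively copy a value. Cyclic tables are not handled in this sketch.
local function deep_copy(value)
  if type(value) ~= "table" then return value end
  local copy = {}
  for k, v in pairs(value) do copy[k] = deep_copy(v) end
  return copy
end

local Model = {}
Model.__index = Model

-- Hypothetical combined "client" model holding some state.
local function new_model(state)
  return setmetatable({ state = state, saved = nil }, Model)
end

-- Remember the current (synchronous) state.
function Model:snapshot()
  self.saved = deep_copy(self.state)
end

-- Discard predictions by restoring the remembered state.
function Model:rollback()
  assert(self.saved ~= nil, "no snapshot to roll back to")
  self.state = deep_copy(self.saved)
end
```

A typical cycle is `snapshot()`, then apply predictions to `state`, then `rollback()` once those predictions should be thrown away.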
## The World Model Summary
Here is a summary of all the items we've listed as being part of the world model:
- The lua_State pointer - the lua main thread
- The lua registry
- The lua global environment
- The source database (in registry key "sourcedb")
- The class database (in the global environment)
- The script globals (in registry key "globaldb")
- The C++ tangibles (stored by ID in the World model)
- The Lua tangibles (in registry key "tangibles")
- Possibly, a snapshot of a synchronous state.
## What Gets Difference Transmitted
Here's a summary of what gets difference transmitted:
**Source database:** transmitted. Source code is easy to difference transmit because it's just strings. If we want to obfuscate it a little, we could serialize the loaded closures and transmit those instead. Either way works fine.
**Class database**: not transmitted in itself, but it gets rebuilt from the source database whenever the source database gets difference transmitted.
**Globals database**: not transmitted at all. Only the master world model is allowed to access global data structures. Any attempt to access these data structures in the synchronous models is deliberately trapped.
**Tangibles**: difference transmitted when nearby.
## FUTURE STUFF - Client / Server
- Code privacy
- Data privacy
- Walkability (Pancake Data Structure?)
- Global Variables
- Good PRNG functions
## FUTURE STUFF - Game / Library
- Doors to other servers / Blockchain stuff