Rewrite some bad AI-generated documentation.

2026-02-11 11:25:49 -05:00
parent db35967fb9
commit 77fc65898c

@@ -1,58 +1,138 @@
### About Determinism
The driven portion of the Luprex engine is deterministic. This document explains what that means and why it matters. For the specific rules you must follow to maintain determinism, see "The Event-Driven Structure of the Engine."
Luprex uses two different kinds of determinism.
## Two Degrees of Determinism
**Synchronous Model Determinism** Predictive reexecution
uses four world models, including a server-synchronous and
client-synchronous model. These two models are fed the same
events, and must remain in the same state after executing
the same events. See the document "Predictive Reexecution"
for an explanation of why these models exist. If you were to
compare the two models, they would be equal in
the Lisp sense of `equal`, but not in the sense of `eq`,
because corresponding data structures are not at the
same memory address.
There are two distinct degrees of determinism in the engine, each serving a different purpose.
**Replay Log Determinism** The server stores a log of all
events it feeds into the Luprex DLL. It can replay a log by
feeding the same events into a new copy of the Luprex DLL.
When replaying a log, the new copy of the Luprex DLL
reproduces the original execution right down to the memory
level: every data structure is at the same address, every
byte of memory is the same. This is the `eq` level of
equivalence.
**Value-level determinism** is the property that the server-synchronous and client-synchronous models stay in the same logical state. These two models run on different machines and receive the same command acknowledgements in the same order. Value-level determinism guarantees that they end up containing the same Lua tables with the same keys and values. If you go into both models and print things out, everything looks the same. However, a Lua table in one model is not necessarily at the same memory address as the corresponding table in the other, because they are running on different machines with different memory layouts. The two models are *equal* in the Lisp sense of `equal`, but not in the sense of `eq`.
These two forms of determinism serve different purposes and
impose different costs.
**Bit-exact determinism** is the property that a recorded event log, when replayed into a fresh DrivenEngine, reproduces the original execution right down to the memory level: every data structure is at the same address, every byte of memory is the same. This is the `eq` level of equivalence.
## Implementing Synchronous Model Determinism
The engine currently aims for both, but they serve different purposes and impose different costs.
To get the two synchronous models to be deterministic
enough, we had to take several steps:
## Value-Level Determinism: Synchronous Model Pairing
- **Deterministic Lua table iteration.** We patch the Lua
runtime so that iterating over a table always produces
keys in the same order. The order depends only on
the order in which the keys were inserted, not on the
memory layout.
- **No iterating over C++ unordered maps.** Unordered maps
produce elements in an order that depends on memory
addresses. Since addresses differ between the two models,
iteration order would differ, breaking value-level
determinism. An exception: iterating an unordered map and
then immediately sorting the results into a predictable
order is allowed, because the randomness is sandboxed.
- **No genuinely random numbers.** We do not use random
numbers in the world model. We do use pseudorandom
numbers, but we store the generator's state as part of the
world model and maintain it using difference transmission.
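The unordered-map exception in the list above can be illustrated with a short C++ sketch. The helper name `sorted_entries` is hypothetical and is not taken from the Luprex source; it only shows the "iterate, then immediately sort" pattern, assuming the key type has a total order that does not depend on memory addresses.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical helper (not from the Luprex source): copy the entries of an
// unordered_map into a vector and sort them by key, so that callers never
// observe the container's own iteration order.
template <typename K, typename V>
std::vector<std::pair<K, V>> sorted_entries(const std::unordered_map<K, V>& m) {
    // The copy visits elements in an unspecified order...
    std::vector<std::pair<K, V>> out(m.begin(), m.end());
    // ...but the sort erases that order before anyone can depend on it.
    std::sort(out.begin(), out.end(),
              [](const auto& a, const auto& b) { return a.first < b.first; });
    return out;
}
```

Two maps that hold the same keys and values yield the same vector from `sorted_entries`, whatever their bucket counts or insertion histories, so game logic that consumes the sorted vector sees the same sequence in both synchronous models.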
Value-level determinism is what makes the multiplayer architecture work. It is non-negotiable.
Luprex uses four types of world models to handle multiplayer networking (see "Predictive Reexecution" for the full explanation). Two of these models are critical to understand here:
- The **server-synchronous** model runs on the server.
- The **client-synchronous** model runs on the client.
These two models receive the same command acknowledgements in the same order. Because the driven portion is deterministic at the value level, the two models always end up in the same logical state: the same Lua tables, the same values, the same game state. They never need to exchange full state to stay in sync.
The two models are running on different machines, so naturally they have different memory layouts and different pointer addresses. That's fine. All that matters is that the values match. This is why value-level determinism is sufficient for synchronous model pairing.
The constraints that maintain value-level determinism are:
- **Deterministic Lua table iteration.** We patch the Lua runtime so that iterating over a table always produces keys in the same order, regardless of memory layout. Without this, two engines processing the same commands could iterate tables in different orders and produce different results.
- **No iterating over unordered maps.** Unordered maps produce elements in an order that depends on memory addresses. Since addresses differ between the two models, iteration order would differ, breaking value-level determinism. (An exception: iterating an unordered map and then immediately sorting the results into a predictable order is allowed, because the randomness is sandboxed.)
- **No genuinely random numbers.** Pseudorandom numbers are fine as long as the state is privately owned by the driven portion and seeded deterministically.
- **Controlled use of real-time clocks.** The driven portion (the Luprex DLL) cannot call system functions to obtain the current time, because the result would differ between runs and between machines. However, the driver can feed the current time into the driven portion as an event. Since events are the same during paired execution and during replay, the time value is deterministic from the driven portion's perspective.
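The clock rule above can be sketched in C++ as follows. Every name here (`Event`, `DrivenModel`, `run_model`) is hypothetical and chosen for illustration; the real engine's event types are not shown in this document. The point is only that the model learns the time from an event and never asks the operating system.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical event type: events are the only way information enters
// the driven portion.
struct Event {
    enum class Kind { Tick, Command } kind;
    std::int64_t value;  // Tick: time in ms chosen by the driver; Command: payload
};

// Toy stand-in for a world model inside the driven portion.
class DrivenModel {
public:
    void handle(const Event& e) {
        if (e.kind == Event::Kind::Tick) {
            now_ms_ = e.value;  // the time is whatever the driver reported
        } else {
            // Mix the payload and the model's notion of "now" into the state.
            // No system clock is consulted anywhere in this class.
            checksum_ = checksum_ * 31
                      + static_cast<std::uint64_t>(e.value)
                      + static_cast<std::uint64_t>(now_ms_);
        }
    }
    std::uint64_t checksum() const { return checksum_; }

private:
    std::int64_t now_ms_ = 0;
    std::uint64_t checksum_ = 0;
};

// Feed a sequence of events into a fresh model and return its final state.
std::uint64_t run_model(const std::vector<Event>& events) {
    DrivenModel model;
    for (const Event& e : events) model.handle(e);
    return model.checksum();
}
```

Because the time arrives as data, two models given the same event sequence compute the same checksum, whether they run on different machines or at different wall-clock times.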
## Bit-Exact Determinism: Replay Debugging
Bit-exact determinism enables replay debugging. It is
valuable but expensive, and its cost-benefit tradeoff is an
open question.
As the server runs, the driver can write a log of every
event it feeds into the driven portion. Later, a new
DrivenEngine can be created and fed those same events from
the log file. The goal of bit-exact determinism is that
during this replay, the DrivenEngine does the *exact* same
thing it did during the live run, right down to every data
structure being at the same memory address.
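The record-and-replay loop described above can be sketched in C++. `LoggedEvent`, `ToyEngine`, `RecordingDriver`, and `replay` are hypothetical names, not the engine's API, and the log is kept in memory rather than in a file. The sketch only demonstrates that replay reproduces the same values, including the pseudorandom generator's state; reproducing the same memory addresses additionally depends on the allocator and is not modeled here.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical event type: an opcode plus one integer argument.
struct LoggedEvent {
    int opcode;          // 0 = seed the generator, 1 = roll and accumulate
    std::uint32_t arg;
};

// Toy stand-in for the driven portion. The generator state is ordinary
// model state, so replaying the same events reproduces it.
class ToyEngine {
public:
    void feed(const LoggedEvent& e) {
        if (e.opcode == 0) {
            rng_state_ = e.arg ? e.arg : 1u;  // xorshift must not be seeded with 0
        } else {
            total_ += next_random() % 6 + e.arg;
        }
    }
    std::uint64_t total() const { return total_; }
    std::uint32_t rng_state() const { return rng_state_; }

private:
    // 32-bit xorshift step; any deterministic generator would do here.
    std::uint32_t next_random() {
        rng_state_ ^= rng_state_ << 13;
        rng_state_ ^= rng_state_ >> 17;
        rng_state_ ^= rng_state_ << 5;
        return rng_state_;
    }
    std::uint32_t rng_state_ = 1;
    std::uint64_t total_ = 0;
};

// Toy driver: appends each event to a log, then forwards it to the engine.
class RecordingDriver {
public:
    explicit RecordingDriver(ToyEngine& engine) : engine_(engine) {}
    void send(const LoggedEvent& e) {
        log_.push_back(e);
        engine_.feed(e);
    }
    const std::vector<LoggedEvent>& log() const { return log_; }

private:
    ToyEngine& engine_;
    std::vector<LoggedEvent> log_;
};

// Replay: feed a recorded log into a fresh engine.
ToyEngine replay(const std::vector<LoggedEvent>& log) {
    ToyEngine fresh;
    for (const LoggedEvent& e : log) fresh.feed(e);
    return fresh;
}
```

A live run sends events through `RecordingDriver`; afterwards, `replay(driver.log())` returns an engine whose `total()` and `rng_state()` match the live engine's.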
Why does this matter? If the server crashed during the live
run, the replay will crash in exactly the same way. You can
run the replay inside a debugger, single-step right up to
the crash, and examine the exact same pointers and memory
layout that existed during the original crash.
Value-level determinism alone is not sufficient for this. If
the replay produces the same logical state but at different
memory addresses, then pointer-related bugs (buffer
overruns, use-after-free, etc.) might not reproduce.
Bit-exact determinism ensures they do.
The additional constraints that maintain bit-exact determinism, beyond those needed for value-level determinism, are:
To implement replay determinism, we took several
difficult steps:
- **The eng::malloc heap.** A custom memory allocator positioned at a fixed address, used exclusively by the driven portion. Because the driven portion is deterministic, the sequence of allocations and frees is identical between the live run and the replay, so every data structure ends up at the same address. See "The Event-Driven Structure of the Engine" for details.
- **No threads in the driven portion.** Thread scheduling is nondeterministic at the OS level. Even if two threaded programs produce the same final values, the interleaving of operations differs between runs, which would cause memory allocations to occur in different orders and at different addresses.
- **The Driver/Driven Partition.** The Luprex engine is
divided into an event-driven portion and an event driver. The driven
portion contains all the game logic. The driver is mainly
for I/O. The driven portion cannot contain any I/O. That
includes:
Note that the constraints for value-level determinism (deterministic table iteration, no unordered maps, etc.) also contribute to bit-exact determinism. But they are *required* for value-level determinism regardless. The eng::malloc heap and the no-threads rule are the additional cost imposed specifically by the bit-exact guarantee.
- **Clocks only in the Driver.** The driven portion cannot
call system functions to obtain the current time.
However, the driver can feed the current time into the
driven portion as an event.
- **Lua source files only in the Driver.** The driven
portion cannot read Lua source files. It can, however,
enter a state that indicates to the driver that it
wants a Lua source file. Then the driver can feed
the Lua source file in as an event.
- **Sockets only in the Driver.** The driven portion
cannot open TCP/IP sockets. However, it can enter
a state that indicates its desire to make a TCP/IP
connection, and then the driver can open the connection and feed
the data into the driven portion.
## The Practical Distinction
- **The eng::malloc heap.** A custom memory allocator
positioned at a fixed address, used exclusively by the
driven portion. The memory allocator, if asked to
perform the same sequence of malloc/free operations,
will return the same addresses.
If the engine ever relaxed its determinism requirements, the value-level constraints would remain because they are essential to the multiplayer architecture. The bit-exact constraints (eng::malloc, no threads) could theoretically be dropped if replay debugging were deemed not worth the cost. Dropping the no-threads rule in particular would be a significant performance benefit.
- **No threads in the driven portion.** Thread scheduling is
nondeterministic at the OS level, so we cannot use threads in the
driven portion.
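The allocator property in the list above (the same sequence of malloc/free operations returns the same addresses) can be illustrated with a toy C++ pool. This is not the real eng::malloc: `ToyPool`, `kNoBlock`, and `run_script` are hypothetical, the pool hands out fixed-size blocks only, and it returns offsets into an arena rather than pointers. The assumption is that, with the arena mapped at a fixed base address, equal offsets would translate into equal pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sentinel returned when the pool has no block left (hypothetical name).
constexpr std::size_t kNoBlock = static_cast<std::size_t>(-1);

// Toy pool of fixed-size blocks. Its answers depend only on the sequence
// of allocate/release calls it has seen, never on the host process.
class ToyPool {
public:
    ToyPool(std::size_t block_size, std::size_t block_count)
        : block_size_(block_size), block_count_(block_count) {}

    // Returns the offset of a free block, or kNoBlock when exhausted.
    std::size_t allocate() {
        if (!free_list_.empty()) {
            std::size_t offset = free_list_.back();  // most recently freed first
            free_list_.pop_back();
            return offset;
        }
        if (next_unused_ == block_count_) return kNoBlock;
        return block_size_ * next_unused_++;
    }

    // Returns a block to the pool. The caller must pass an offset that
    // allocate() returned and that has not been released already.
    void release(std::size_t offset) { free_list_.push_back(offset); }

private:
    std::size_t block_size_;
    std::size_t block_count_;
    std::size_t next_unused_ = 0;
    std::vector<std::size_t> free_list_;
};

// Runs a fixed script of operations and records every offset returned.
std::vector<std::size_t> run_script(ToyPool& pool) {
    std::vector<std::size_t> seen;
    std::size_t a = pool.allocate();
    seen.push_back(a);
    seen.push_back(pool.allocate());
    pool.release(a);
    seen.push_back(pool.allocate());  // reuses the block just released
    seen.push_back(pool.allocate());  // takes a never-used block
    return seen;
}
```

Running `run_script` on two separately constructed pools with the same parameters produces identical offset sequences, which is the behavior a replay relies on.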
## Should we Ditch Replay Determinism?
Implementing synchronous model determinism is necessary
for predictive reexecution. It is non-negotiable.
On the other hand, replay log determinism is not necessarily
required for us to have a usable engine. We could ditch it.
It certainly does impose a lot of difficult constraints on
the engine.
The driver/driven distinction certainly required us to tie
ourselves into knots in some parts of the engine design.
But that's pretty baked in at this point; we're probably
never going to change it.
However, replay determinism also imposes a no-threads requirement. That
is certainly a bummer from a performance perspective.
## Lua Scripters Don't Need to Worry
The Lua environment is carefully sandboxed to be
deterministic at both levels without any effort from the
scripter. Lua's random number generators are seeded
pseudorandom generators owned by the driven portion. Table
iteration is patched to be deterministic. Lua "threads"
(coroutines) are not real OS threads and don't run
concurrently. The scripter writes ordinary Lua code and gets
determinism for free.