Improved Docs, AnimationStepApplyMesh+Materials, some other minor tweaks

2026-02-06 17:34:26 -05:00
parent a2e179e15b
commit 56765fdc16
28 changed files with 731 additions and 165 deletions

58
Docs/About Determinism.md Normal file

@@ -0,0 +1,58 @@
# About Determinism
The driven portion of the Luprex engine is deterministic. This document explains what that means and why it matters. For the specific rules you must follow to maintain determinism, see "The Event-Driven Structure of the Engine."
## Two Degrees of Determinism
There are two distinct degrees of determinism in the engine, each serving a different purpose.
**Value-level determinism** is the property that the server-synchronous and client-synchronous models stay in the same logical state. These two models run on different machines and receive the same command acknowledgements in the same order. Value-level determinism guarantees that they end up containing the same Lua tables with the same keys and values. If you go into both models and print things out, everything looks the same. However, a Lua table in one model is not necessarily at the same memory address as the corresponding table in the other, because they are running on different machines with different memory layouts. The two models are *equal* in the Lisp sense of `equal`, but not in the sense of `eq`.
**Bit-exact determinism** is the property that a recorded event log, when replayed into a fresh DrivenEngine, reproduces the original execution right down to the memory level: every data structure is at the same address, every byte of memory is the same. This is the `eq` level of equivalence.
The engine currently aims for both, but the two guarantees impose different costs.
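To make the `equal` versus `eq` distinction concrete, here is a minimal C++ sketch. The `Table` alias and the helper names are hypothetical stand-ins for illustration, not the engine's actual types.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for a Lua table held by one world model.
using Table = std::map<std::string, int>;

// Value-level equivalence (Lisp `equal`): same keys, same values.
bool value_equal(const Table& a, const Table& b) {
    return a == b;
}

// Address-level equivalence (Lisp `eq`): the very same object in memory.
bool address_equal(const Table* a, const Table* b) {
    return a == b;
}

// Builds the same logical table in a fresh allocation, the way two
// separate models would each build their own copy.
std::unique_ptr<Table> build_player_table() {
    auto t = std::make_unique<Table>();
    (*t)["health"] = 100;
    (*t)["gold"] = 25;
    return t;
}

// Usage sketch: two independently built tables are `equal` but not `eq`.
void demo_equal_vs_eq() {
    auto server = build_player_table();
    auto client = build_player_table();
    assert(value_equal(*server, *client));
    assert(!address_equal(server.get(), client.get()));
}
```

Value-level determinism only asks for the first assertion to hold across machines. Bit-exact determinism asks that a replay also reproduce the addresses themselves.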
## Value-Level Determinism: Synchronous Model Pairing
Value-level determinism is what makes the multiplayer architecture work. It is non-negotiable.
Luprex uses four types of world models to handle multiplayer networking (see "Predictive Reexecution" for the full explanation). Two of these models are critical to understand here:
- The **server-synchronous** model runs on the server.
- The **client-synchronous** model runs on the client.
These two models receive the same command acknowledgements in the same order. Because the driven portion is deterministic at the value level, the two models always end up in the same logical state: the same Lua tables, the same values, the same game state. They never need to exchange full state to stay in sync.
The two models are running on different machines, so naturally they have different memory layouts and different pointer addresses. That's fine. All that matters is that the values match. This is why value-level determinism is sufficient for synchronous model pairing.
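As a rough illustration of synchronous model pairing, the sketch below applies one list of command acknowledgements to two independent models and compares the results. `Command`, `Model`, and `run_model` are hypothetical names. The real engine's command format is not shown in this document.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <random>
#include <string>
#include <vector>

// Hypothetical command acknowledgement.
struct Command {
    std::string player;
    int amount;
};

// Hypothetical stand-in for one synchronous world model.
struct Model {
    std::map<std::string, std::int64_t> gold;  // ordered map: stable iteration
    std::mt19937 rng;                           // PRNG state owned by the model

    explicit Model(std::uint32_t seed) : rng(seed) {}

    void apply(const Command& c) {
        // The raw output sequence of mt19937 for a given seed is fixed by
        // the C++ standard, so both models draw the same bonus here.
        int bonus = static_cast<int>(rng() % 10);
        gold[c.player] += c.amount + bonus;
    }
};

// Feeds every acknowledgement, in order, into a freshly seeded model.
Model run_model(std::uint32_t seed, const std::vector<Command>& acks) {
    Model m(seed);
    for (const Command& c : acks) m.apply(c);
    return m;
}

// Usage sketch: same seed, same acknowledgements, same order.
void demo_pairing() {
    std::vector<Command> acks = {{"alice", 5}, {"bob", 7}, {"alice", -2}};
    Model server = run_model(42, acks);
    Model client = run_model(42, acks);
    assert(server.gold == client.gold);  // equal values, separate objects
}
```

The two `Model` objects occupy different memory, yet their contents compare equal, which is all that pairing needs.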
The constraints that maintain value-level determinism are:
- **Deterministic Lua table iteration.** We patch the Lua runtime so that iterating over a table always produces keys in the same order, regardless of memory layout. Without this, two engines processing the same commands could iterate tables in different orders and produce different results.
- **No iterating over unordered maps.** An unordered map's iteration order can depend on memory addresses, for example when its keys are pointers. Since addresses differ between the two models, iteration order would differ, breaking value-level determinism. (An exception: iterating an unordered map and then immediately sorting the results into a predictable order is allowed, because the nondeterminism is contained before anything else can observe it.)
- **No genuinely random numbers.** Pseudorandom numbers are fine as long as the state is privately owned by the driven portion and seeded deterministically.
- **Controlled use of real-time clocks.** The driven portion (the Luprex DLL) cannot call system functions to obtain the current time, because the result would differ between runs and between machines. However, the driver can feed the current time into the driven portion as an event. Since events are the same during paired execution and during replay, the time value is deterministic from the driven portion's perspective.
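The exception to the unordered-map rule above can be sketched as follows. `sorted_entries` is a hypothetical helper, not an engine function. It copies the entries out and sorts them by key, so the caller never sees the container's own order.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Copies the entries out of an unordered map and sorts them by key, so the
// caller never observes the container's own iteration order.
std::vector<std::pair<std::string, int>>
sorted_entries(const std::unordered_map<std::string, int>& m) {
    std::vector<std::pair<std::string, int>> out(m.begin(), m.end());
    std::sort(out.begin(), out.end(),
              [](const auto& a, const auto& b) { return a.first < b.first; });
    return out;
}

// Usage sketch: two maps with the same contents but different insertion
// order and bucket counts yield the same sorted sequence.
void demo_sorted_iteration() {
    std::unordered_map<std::string, int> a, b;
    b.reserve(1024);  // different bucket count, so raw order may differ
    a["sword"] = 3; a["apple"] = 1; a["map"] = 2;
    b["map"] = 2;   b["sword"] = 3; b["apple"] = 1;
    assert(sorted_entries(a) == sorted_entries(b));
}
```

Keys are unique within a map, so sorting by key alone gives a total order and the result does not depend on how the container happened to lay out its buckets.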
## Bit-Exact Determinism: Replay Debugging
Bit-exact determinism enables replay debugging. It is valuable but expensive, and its cost-benefit tradeoff is an open question.
As the server runs, the driver can write a log of every event it feeds into the driven portion. Later, a new DrivenEngine can be created and fed those same events from the log file. The goal of bit-exact determinism is that during this replay, the DrivenEngine does the *exact* same thing it did during the live run, right down to every data structure being at the same memory address.
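The record-and-replay flow can be sketched like this. Everything here is a simplified assumption: the `Event` layout, the `RecordingDriver` class, and the toy `DrivenEngine` are hypothetical stand-ins, and the log is kept in memory rather than in a file. The sketch only checks that the replay reaches the same values. Reproducing addresses as well needs the additional constraints of bit-exact determinism.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical event kinds; the real engine's event set is not listed here.
enum class EventKind { Tick, Input };

struct Event {
    EventKind kind;
    std::int64_t value;  // Tick: time in ms supplied by the driver; Input: payload
};

// Toy stand-in for the driven portion. It never reads a clock itself; the
// only time it knows is whatever the driver fed in as a Tick event.
class DrivenEngine {
public:
    void feed(const Event& e) {
        if (e.kind == EventKind::Tick) {
            now_ms_ = e.value;
        } else {
            checksum_ = checksum_ * 31
                      + static_cast<std::uint64_t>(e.value)
                      + static_cast<std::uint64_t>(now_ms_);
        }
    }
    std::uint64_t checksum() const { return checksum_; }

private:
    std::int64_t now_ms_ = 0;
    std::uint64_t checksum_ = 0;
};

// Driver side: appends each event to the log, then feeds it to the engine.
class RecordingDriver {
public:
    explicit RecordingDriver(DrivenEngine& engine) : engine_(engine) {}
    void send(const Event& e) {
        log_.push_back(e);
        engine_.feed(e);
    }
    const std::vector<Event>& log() const { return log_; }

private:
    DrivenEngine& engine_;
    std::vector<Event> log_;
};

// Replays a recorded log into a fresh engine and returns its final checksum.
std::uint64_t replay(const std::vector<Event>& log) {
    DrivenEngine fresh;
    for (const Event& e : log) fresh.feed(e);
    return fresh.checksum();
}

// Usage sketch: the replay ends in the same state as the live run.
void demo_replay() {
    DrivenEngine live;
    RecordingDriver driver(live);
    driver.send({EventKind::Tick, 1000});
    driver.send({EventKind::Input, 7});
    assert(replay(driver.log()) == live.checksum());
}
```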
Why does this matter? If the server crashed during the live run, the replay will crash in exactly the same way. You can run the replay inside a debugger, single-step right up to the crash, and examine the exact same pointers and memory layout that existed during the original crash.
Value-level determinism alone is not sufficient for this. If the replay produces the same logical state but at different memory addresses, then pointer-related bugs (buffer overruns, use-after-free, etc.) might not reproduce. Bit-exact determinism ensures they do.
The additional constraints that maintain bit-exact determinism, beyond those needed for value-level determinism, are:
- **The eng::malloc heap.** A custom memory allocator positioned at a fixed address, used exclusively by the driven portion. Because the driven portion is deterministic, the sequence of allocations and frees is identical between the live run and the replay, so every data structure ends up at the same address. See "The Event-Driven Structure of the Engine" for details.
- **No threads in the driven portion.** Thread scheduling is nondeterministic at the OS level. Even if two runs of a threaded program produce the same final values, the interleaving of operations can differ between runs, which would cause memory allocations to occur in different orders and at different addresses.
Note that the constraints for value-level determinism (deterministic table iteration, no iteration over unordered maps, etc.) also contribute to bit-exact determinism, but they are *required* for value-level determinism regardless. The eng::malloc heap and the no-threads rule are the additional cost imposed specifically by the bit-exact guarantee.
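The sketch below illustrates why an identical sequence of allocations and frees yields identical addresses, and why reordering that sequence (as thread interleaving could) does not. It is not the eng::malloc implementation. `DeterministicPool` is a hypothetical fixed-block pool, and byte offsets into it stand in for addresses inside a heap placed at a fixed base.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fixed-block pool. Offsets into the pool stand in for
// addresses inside a heap mapped at a fixed base address.
class DeterministicPool {
public:
    static constexpr std::size_t kBlockSize = 64;

    explicit DeterministicPool(std::size_t blocks)
        : storage_(blocks * kBlockSize), next_unused_(0), capacity_(blocks) {}

    // Returns the byte offset of the allocated block. Freed blocks are
    // reused most-recently-freed first; otherwise the next unused block.
    std::size_t alloc() {
        if (!free_list_.empty()) {
            std::size_t off = free_list_.back();
            free_list_.pop_back();
            return off;
        }
        assert(next_unused_ < capacity_ && "pool exhausted");
        return kBlockSize * next_unused_++;
    }

    void release(std::size_t offset) { free_list_.push_back(offset); }

    unsigned char* at(std::size_t offset) { return storage_.data() + offset; }

private:
    std::vector<unsigned char> storage_;
    std::vector<std::size_t> free_list_;
    std::size_t next_unused_;
    std::size_t capacity_;
};

// Runs a small allocate/free script and returns every offset handed out.
// The flag moves one release earlier, imitating a different interleaving.
std::vector<std::size_t> run_script(bool release_first_block_early) {
    DeterministicPool pool(8);
    std::vector<std::size_t> seen;
    std::size_t a = pool.alloc(); seen.push_back(a);
    std::size_t b = pool.alloc(); seen.push_back(b);
    if (release_first_block_early) pool.release(a);
    std::size_t c = pool.alloc(); seen.push_back(c);
    if (!release_first_block_early) pool.release(a);
    seen.push_back(pool.alloc());
    return seen;
}
```

Calling `run_script(true)` twice returns the same offsets both times, while `run_script(true)` and `run_script(false)` differ even though both perform the same set of operations. That difference is the kind of divergence the no-threads rule is meant to rule out.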
## The Practical Distinction
If the engine ever relaxed its determinism requirements, the value-level constraints would remain because they are essential to the multiplayer architecture. The bit-exact constraints (eng::malloc, no threads) could theoretically be dropped if replay debugging were deemed not worth the cost. Dropping the no-threads rule in particular could bring a significant performance benefit.
## Lua Scripters Don't Need to Worry
The Lua environment is carefully sandboxed to be deterministic at both levels without any effort from the scripter. Lua's random number generators are seeded pseudorandom generators owned by the driven portion. Table iteration is patched to be deterministic. Lua "threads" (coroutines) are not real OS threads and don't run concurrently. The scripter writes ordinary Lua code and gets determinism for free.