Better data formatters in progress.

2026-04-20 05:42:34 -04:00
parent 275698c5aa
commit 21d8c40005
4 changed files with 1106 additions and 860 deletions

@@ -6,183 +6,167 @@ When debugging Unreal with VS Code + CodeLLDB, the **Variables** pane and
the **Watch** pane use two completely different evaluation paths:
- **Variables pane** walks a tree built by lldb's *synthetic children
providers* (including Unreal's Python formatters for TArray, TMap, FName,
FString, TSharedRef, etc.). Values are looked up by offset/type — no
compilation. Base classes appear as named children ("SCompoundWidget",
"UWidget"), smart-pointer inners get unwrapped, container elements get
indexed.
providers* (Unreal's Python formatters for TArray, TMap, FName, FString,
and now our TObjectPtr/TSharedPtr/TWeakPtr). Values are looked up by
offset/type — no compilation. Base classes appear as named children
("SCompoundWidget", "UWidget"); smart-pointer inners get unwrapped;
container elements get indexed.
- **Watch pane** (and `p`/`expression`) runs your text through Clang to
compile a real C++ expression against the program's full type system,
then applies formatters to the result. For Unreal, this is slow and
often fails outright — synthetic children don't exist as C++ members.
- **Watch pane** (without intervention) runs text through the `/se` (simple)
or `/nat` (Clang) evaluator. `/nat` fails for most Unreal paths;
`/se` transpiles to Python and walks the SBValue tree, but — as
originally shipped — fails on pointer auto-deref and can't find base
classes by type name.
The practical consequence: a path you see in the Variables pane — like
`Widget.Object.SCompoundWidget.SWidget.bCanSupportFocus` — cannot be
typed into the Watch pane, because the intermediate labels
(`SCompoundWidget`, `SWidget`) aren't real C++ members. Even "Copy as
Expression" in the Variables pane gives you that broken synthetic path.
When you right-click in Variables and **Add to Watch**, CodeLLDB sends
a path like `Top.Widget`. It builds that path itself for synthetic
children, joining every step with `.` regardless of whether the parent
is a pointer. Without the fix below, that path fails in Watch even though
it identifies a real value that Variables can show.
Hand-writing the equivalent real C++ expression (with explicit
`->` for pointers, `static_cast`s through base classes, formatter-internal
array indexing math, etc.) is impractical.
## What We Did
## The Fix: A Python Helper That Walks Synthetic Children
All changes live in `tools/UEDataFormatter.py`, loaded via `initCommands`
in every launch config in `Integration.code-workspace.tpl.json`.
We added a function `fv(path)` to `tools/UEDataFormatter.py`. It walks a
dotted path through the synthetic tree using the same APIs the Variables
pane uses (`GetChildMemberWithName`, then fallback to iterating children
by name so base classes match). Pointers are auto-dereferenced.
### 1. Patched `codelldb.value.Value.__getattr__`
CodeLLDB's `/py` expression mode evaluates Python in the Watch pane and
renders returned `SBValue`s through the full formatter pipeline — same
expandable tree as Variables. So:
At module load, we monkey-patch CodeLLDB's `Value` class — the Python
wrapper it uses inside `/se` and `/py` evaluation — to:
    /py fv("Widget.Object.SCompoundWidget.SWidget")
- **Auto-deref pointers** before descending into a named child.
- **Fall back to iterating children** by `GetName()` when
`GetChildMemberWithName` returns invalid. This catches base classes
(which appear as named children when iterated but can't be looked up
by name).
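The two behaviors above can be sketched as a standalone lookup function. This is a minimal illustration, not the actual patch in `tools/UEDataFormatter.py`: the name `find_child` is hypothetical, and it assumes only the standard `SBValue` methods `TypeIsPointerType`, `Dereference`, `GetChildMemberWithName`, `GetNumChildren`, `GetChildAtIndex`, `GetName`, and `IsValid`.

```python
def find_child(value, name):
    """Return the child of `value` called `name`, or None if there is none.

    `value` is expected to behave like an lldb.SBValue. The function name
    and structure are illustrative, not the real patch.
    """
    # Auto-deref: step through a pointer before looking at its members.
    if value.TypeIsPointerType():
        value = value.Dereference()

    # Fast path: direct lookup by member name.
    child = value.GetChildMemberWithName(name)
    if child is not None and child.IsValid():
        return child

    # Fallback: scan the children and compare display names. This is the
    # step that finds base classes such as "SCompoundWidget".
    for i in range(value.GetNumChildren()):
        candidate = value.GetChildAtIndex(i)
        if candidate.IsValid() and candidate.GetName() == name:
            return candidate
    return None
```

A patched `__getattr__` would presumably call something like this on the wrapped `SBValue` and wrap the result again; how the real patch reaches the underlying value is not shown here.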
works in Watch the way you'd originally expect `Widget.Object.SCompoundWidget.SWidget`
to work.
**Result:** plain "Add to Watch" from the Variables pane produces a path
like `Top.Widget.SomeField` and Watch evaluates it correctly — same
expandable tree you'd see in Variables. No `/py fv(...)` wrapper, no
VS Code extension, no manual prefix typing.
`fv` is also injected into Python builtins, so from the Debug Console
you can call it directly:
Blast radius: every `/se` and `/py` expression that goes through
`Value.__getattr__`. The change is strictly more permissive (paths that
used to fail now succeed), but it is a global behavior change to
CodeLLDB internals. Breaks if CodeLLDB renames `Value` or the
`__sbvalue` slot; re-applied automatically on extension reload since
our patch runs on `command script import`.
    script fv("Widget.Object.SCompoundWidget.SWidget")
### 2. Synth + Summary Providers for Smart Pointers
### How `fv` is loaded
UE's engine formatter covers containers and TWeakObjectPtr, but not the
smart pointer family. We added providers for:
The launch configurations in `Integration.code-workspace.tpl.json` run:
- **`TObjectPtr<T>`** — shows `nullptr` / `unresolved` / wrapped object's
summary. Expanding flattens straight to the target's members (no
intermediate `*DebugPtr` click).
- **`TSharedPtr<T>`** and **`TSharedRef<T>`** — shows `nullptr` / target
summary; expands straight to target's members.
- **`TWeakPtr<T>`** — checks
`WeakReferenceCount.ReferenceController.SharedReferenceCount` before
dereferencing. Expired weak refs show `expired` rather than garbage
from a dangling pointer.
    command script import /home/jyelon/integration/tools/UEDataFormatter.py
All registration regexes are anchored with `^` — otherwise the greedy
`.+` matches nested occurrences (a `TArray<TObjectPtr<X>, ...>` would
get dispatched to the TObjectPtr provider instead of TArray).
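The effect of the anchor can be checked with Python's `re` module. This is only an illustration: the two patterns and the sample type names are assumed for the example and may differ from what `tools/UEDataFormatter.py` registers, and lldb uses its own regex engine rather than Python's.

```python
import re

# Illustrative patterns; the real registrations may be spelled differently.
UNANCHORED = re.compile(r"TObjectPtr<.+>")
ANCHORED = re.compile(r"^TObjectPtr<.+>$")


def matches(pattern, type_name):
    """True if `pattern` is found anywhere in `type_name`."""
    return pattern.search(type_name) is not None


array_type = "TArray<TObjectPtr<UWidget>, TSizedDefaultAllocator<32> >"
pointer_type = "TObjectPtr<UWidget>"

# The unanchored pattern finds the nested TObjectPtr inside the TArray name,
# so a TArray value would be claimed by the wrong provider.
nested_hit = matches(UNANCHORED, array_type)

# The anchored pattern only accepts names that start with TObjectPtr<.
anchored_rejects_array = not matches(ANCHORED, array_type)
anchored_accepts_pointer = matches(ANCHORED, pointer_type)

print(nested_hit, anchored_rejects_array, anchored_accepts_pointer)
```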
This both (a) loads all the Unreal data formatters and (b) defines `fv`
and injects it into builtins.
### 3. Dynamic Type Resolution
### Reloading without restarting the debug session
Every launch config now sets:
    settings set target.prefer-dynamic-value no-run-target
lldb reads the vtable at each polymorphic value and shows the runtime
type's members. A `UObject*` that actually points to a `UUserWidget`
expands to the full `UUserWidget` subtree, not just `UObject`.
`no-run-target` avoids running code in the debuggee, which is important
during synthesis.
### 4. Universal SIGTRAP Handling
`process handle SIGTRAP --notify false --pass false --stop false` is now
in every launch config. Unreal raises SIGTRAP internally in a number of
places (soft asserts, ensure-style checks); without this, the debugger
stops constantly.
## Reloading Without a Session Restart
If you edit `tools/UEDataFormatter.py`, reload in the Debug Console:
    script import importlib; importlib.reload(UEDataFormatter)
    script import importlib; importlib.reload(UEDataFormatter); UEDataFormatter.__lldb_init_module(lldb.debugger, {})
## The Remaining Gap: "Add to Watch"
VS Code's "Add to Watch" context menu copies the variable's `evaluateName`
as reported by the debug adapter. CodeLLDB sends the synthetic path — so
"Add to Watch" produces exactly the unusable expression that started this
whole problem.
The clean solution is a small VS Code extension that registers a new
command **Add to Watch (fv)** on the Variables pane context menu. When
invoked, it reads the variable's evaluateName and adds
`/py fv("<evaluateName>")` to the watch expressions.
### Scaffolding the extension
Create a folder, e.g. `tools/vscode-fv-watch/`:
```
tools/vscode-fv-watch/
  package.json
  src/extension.ts
  tsconfig.json
```
**`package.json`** declares the contribution points — the command, and
its appearance in the Variables pane context menu:
```json
{
  "name": "fv-watch",
  "version": "0.0.1",
  "engines": { "vscode": "^1.80.0" },
  "activationEvents": ["onDebug"],
  "main": "./out/extension.js",
  "contributes": {
    "commands": [{
      "command": "fv.addToWatch",
      "title": "Add to Watch (fv)"
    }],
    "menus": {
      "debug/variables/context": [{
        "command": "fv.addToWatch",
        "group": "3_modification@1"
      }]
    }
  },
  "devDependencies": {
    "@types/vscode": "^1.80.0",
    "typescript": "^5.0.0"
  }
}
```
**`src/extension.ts`** implements the command. The variable reference is
passed as the command argument; we pull its `evaluateName` and call the
built-in `debug.addToWatchExpressions` with the wrapped form:
```ts
import * as vscode from 'vscode';
export function activate(ctx: vscode.ExtensionContext) {
  ctx.subscriptions.push(vscode.commands.registerCommand(
    'fv.addToWatch',
    async (variable: any) => {
      const name = variable?.evaluateName ?? variable?.name;
      if (!name) return;
      const expr = `/py fv(${JSON.stringify(name)})`;
      await vscode.commands.executeCommand(
        'debug.addToWatchExpressions',
        { variable: { evaluateName: expr } }
      );
    }));
}
export function deactivate() {}
```
**`tsconfig.json`** is standard:
```json
{
  "compilerOptions": {
    "module": "commonjs",
    "target": "ES2020",
    "outDir": "out",
    "strict": true,
    "esModuleInterop": true
  },
  "include": ["src"]
}
```
### Building and installing
```
cd tools/vscode-fv-watch
npm install
npx tsc
ln -s "$PWD" ~/.vscode/extensions/fv-watch
```
Restart VS Code. Right-click any entry in the Variables pane and
**Add to Watch (fv)** appears alongside the built-in **Add to Watch**.
The reload re-executes module code (updating the patch and the provider
classes). The explicit `__lldb_init_module` call re-runs the provider
registrations — lldb only fires `__lldb_init_module` on initial import,
not on reload.
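The reload-then-reinitialize sequence could also be wrapped in a small helper so it does not have to be retyped. This is a sketch only: `reload_formatters` is a hypothetical name that does not exist in `tools/UEDataFormatter.py`, and it assumes the module exposes the usual `__lldb_init_module(debugger, internal_dict)` entry point.

```python
import importlib


def reload_formatters(module, debugger):
    """Reload `module`, then re-run its lldb init hook if it defines one.

    Hypothetical helper: it mirrors the one-liner above and returns the
    reloaded module.
    """
    # Re-execute the module body so edited classes and patches take effect.
    module = importlib.reload(module)

    # lldb only calls __lldb_init_module on the first import, so call it
    # by hand to re-run the provider registrations.
    init = getattr(module, "__lldb_init_module", None)
    if callable(init):
        init(debugger, {})
    return module
```

From the Debug Console this would be invoked as `script reload_formatters(UEDataFormatter, lldb.debugger)`, assuming the helper has been made available in the script interpreter.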
## Summary of Workflow
| Action | Where | How |
|---|---|---|
| Explore a value | Variables pane | Click disclosure triangles |
| Track a value across steps | Watch pane | "Add to Watch (fv)" from Variables context menu, or type `/py fv("...")` manually |
| One-shot inspection | Debug Console | `script fv("path")` |
| Reload formatter edits | Debug Console | `script import importlib; importlib.reload(UEDataFormatter)` |
| Track a value across steps | Watch pane | Right-click variable → Add to Watch (just works) |
| One-shot inspection | Debug Console | `v Widget.Object->SCompoundWidget.SWidget` (`v` = `frame variable`; use `->` for pointers explicitly) |
| Reload formatter edits | Debug Console | `script import importlib; importlib.reload(UEDataFormatter); UEDataFormatter.__lldb_init_module(lldb.debugger, {})` |
## Why Not Make It Automatic?
## Notes on the Design
The cleanest solution would be to change CodeLLDB's default Watch
evaluator to route through `frame variable` / synthetic children instead
of Clang. That's not exposed as a setting — CodeLLDB's `"expressions"`
option only chooses between its own parser, lldb's `expr`, or raw Python.
None of those hit the synthetic-children walk.
### Why this works
A feature request against CodeLLDB to add a "native frame-variable"
expression mode would address this at the source. In the meantime, the
`fv` helper + extension combo reproduces the missing behavior.
CodeLLDB's `/se` evaluator transpiles user expressions into Python that
operates on `Value` objects. `Value.__getattr__` drives every `.field`
access. By making that method auto-deref and iterate for base classes,
every downstream mechanism (Watch, Debug Console, hover, conditional
breakpoints) inherits the fix.
### Why `fv` is no longer needed
Earlier we had an `fv(path)` helper plus a plan for a companion VS Code
extension to wrap "Add to Watch" results in `/py fv(...)`. The
`Value.__getattr__` patch makes the default path work, so that whole
layer is obsolete.
### Why not patch `SBValue` instead
Tempting, but much larger blast radius — affects every tool, every
adapter, every Python script using lldb. Patching `Value` confines the
change to CodeLLDB's expression pipeline.
## Ideas for Further Improvement
### Propose `/fv` mode to CodeLLDB upstream
Prefix dispatch is in CodeLLDB's Rust binary, so we can't add a new
prefix from Python. A clean feature request would be to add a `/fv` prefix
that calls `SBFrame::GetValueForVariablePath(code)` as a native evaluator. That
would give direct access to lldb's own frame-variable walker without
the Python/Value indirection — though it has its own limitations
(no arithmetic, no casts).
### More synth providers
- **`TOptional<T>`** — hide the storage bytes and `bIsSet`; expose the
contained value (or `unset`) directly.
- **`TVariant<...>`** — expose the currently-held alternative as the
single child.
- **`FText`** — show the resolved localized string as the summary.
- **`FSoftObjectPtr` / `FSoftClassPtr`** — show the asset path.
### Enrich TObjectPtr summary
When resolved, show both class and name (`UUserWidget 'W_HUD_0'`)
instead of just the name. The class is reachable via
`ClassPrivate->NamePrivate`.
### A `fdump` helper
A Debug Console helper that prints an entire subtree as indented text —
useful for grabbing a snapshot of complex state into a log or comment.
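One possible shape for such a helper is sketched below. Nothing here exists yet: the signature, the `max_depth` and `indent` parameters, and the output format are all assumptions, and the code relies only on the standard `SBValue` methods `GetName`, `GetSummary`, `GetValue`, `GetNumChildren`, `GetChildAtIndex`, and `IsValid`.

```python
def fdump(value, max_depth=3, indent="  "):
    """Return the subtree under `value` as indented text.

    `value` is expected to behave like an lldb.SBValue. `max_depth` bounds
    the walk so that pointer cycles cannot recurse forever.
    """
    lines = []

    def walk(node, depth):
        name = node.GetName() or "<unnamed>"
        # Prefer the formatter summary and fall back to the raw value.
        text = node.GetSummary() or node.GetValue()
        lines.append(indent * depth + (f"{name} = {text}" if text else name))
        if depth >= max_depth:
            return
        for i in range(node.GetNumChildren()):
            child = node.GetChildAtIndex(i)
            if child.IsValid():
                walk(child, depth + 1)

    walk(value, 0)
    return "\n".join(lines)
```

Returning a string instead of printing keeps the helper usable both from `script print(fdump(...))` and from other Python code that wants to log the snapshot.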
### Get `Copy as Expression` to emit `->` for pointers
The path CodeLLDB builds for synthetic children uses `.` uniformly,
regardless of whether intermediate values are pointers. That's why
`v Top.Widget` fails but `v Top->Widget` works. A feature request to
have CodeLLDB emit `->` when traversing a pointer would make paths
`frame variable`-compatible out of the box.