Chapter 05
Decision Pipeline
How to structure an algorithmic trading or allocation system so that LLM consultation has somewhere safe to plug in — without letting the model anywhere near execution.
Updated 2026-05-12
The framing
The interesting question is not "how do I let an LLM trade for me." It is "how do I structure a trading system so that I can usefully consult an LLM about it, without ever giving the model authority over real orders."
Those are very different problems. The first is an autonomous-agent problem and almost always premature. The second is mostly a layout problem: where the data lives, what the human can paste into a chat, what the system exposes read-only, what stays completely sealed.
This chapter is about the second one.
Decompose the loop
A useful investment loop has four stages, and they should be visibly separate processes, not lines in one script:
| Stage | Responsibility | Determinism |
|---|---|---|
| Perception | Ingest market data (candles, orderbook, on-chain state) into local storage | Fully deterministic — pure I/O |
| Analysis | Compute signals from perception data (VWAP, Z-score, position cost, etc.) | Fully deterministic — pure math |
| Decision | Combine signals with constraints to produce an order intent | Deterministic — closed-form rules |
| Execution | Place, cancel, and track orders on the exchange; record fills | Fully deterministic — pure I/O |
Every stage is deterministic. There is no slot for an LLM in any of them. A model that cleans up a noisy candle feed will sometimes invent candles; a model that "helps" with sizing will sometimes invent sizes. The whole stack is built to be reproducible from inputs.
The LLM lives outside this pipeline, talking only to the human.
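To make "deterministic, closed-form rules" concrete, here is a minimal sketch of a Decision stage as a pure function. All field names, the sizing rule, and the threshold are illustrative, not taken from any particular system:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Signals:
    symbol: str
    z_score: float      # deviation of the latest close from the rolling mean, in std devs
    vwap: float         # volume-weighted average price over the window
    last_price: float

@dataclass(frozen=True)
class Constraints:
    max_position_usd: float
    current_position_usd: float
    entry_z: float      # enter only when |z| exceeds this threshold

@dataclass(frozen=True)
class OrderIntent:
    symbol: str
    side: str           # "buy" or "sell"
    size_usd: float
    limit_price: float

def decide(sig: Signals, con: Constraints) -> Optional[OrderIntent]:
    """Closed-form rules only: the same inputs always produce the same intent."""
    headroom = con.max_position_usd - con.current_position_usd
    if headroom <= 0:
        return None
    if sig.z_score <= -con.entry_z:  # price stretched below the mean: ladder a buy
        return OrderIntent(
            symbol=sig.symbol,
            side="buy",
            size_usd=min(headroom, 0.1 * con.max_position_usd),
            limit_price=min(sig.last_price, sig.vwap),
        )
    return None
```

Because `decide` is a pure function of its arguments, any order intent can be replayed later from the logged signal snapshot, which is what makes the audit trail in the execution layer possible.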
What stays sealed
The components that should never see an LLM, not now and not later:
- Data ingestion. A worker polls the exchange's public API on a fixed interval and writes raw candles to a local sorted set (Redis) and a long-term store (SQLite). It has no opinion.
- Signal math. VWAP, Z-score, moving averages, position cost — all closed form. A separate worker reads the candles and writes signal hashes back to Redis.
- Order placement. A third worker reads order intents from a queue, places them through the exchange's private API, and records the fills.
- Fill tracking. Every executed order is written to a local executions table with timestamp, price, size, fee, and a reference back to the signal that produced it.
Splitting these into distinct workers is the same separation-of-failure-mode argument from the previous chapter. A signal worker crashing on a malformed candle does not bring down the execution worker that is managing existing orders.
This entire stack is one half of the system, and the LLM is excluded from all of it.
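The signal math is small enough to show in full. A sketch, assuming candles arrive as (close, volume) pairs; the Redis and SQLite plumbing around it is omitted, and the flat-dict "signal hash" shape is an assumption:

```python
from math import sqrt

def vwap(candles: list[tuple[float, float]]) -> float:
    """Volume-weighted average price over the window."""
    total_volume = sum(v for _, v in candles)
    return sum(p * v for p, v in candles) / total_volume

def z_score(candles: list[tuple[float, float]]) -> float:
    """How far the latest close sits from the window mean, in std devs."""
    closes = [p for p, _ in candles]
    mean = sum(closes) / len(closes)
    var = sum((p - mean) ** 2 for p in closes) / len(closes)
    return (closes[-1] - mean) / sqrt(var) if var > 0 else 0.0

def signal_hash(symbol: str, candles: list[tuple[float, float]]) -> dict:
    """The flat dict a signal worker would write back to Redis as hash fields."""
    return {
        "symbol": symbol,
        "vwap": round(vwap(candles), 8),
        "z_score": round(z_score(candles), 8),
        "c_t": candles[-1][0],  # latest close, matching the naming used later
    }
```

Everything here is closed form, so a signal value can always be recomputed from the stored candles and checked against what was logged.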
What the LLM gets — the analytics layer
The other half of the system is an analytics layer that reads from the same underlying data and produces a human-readable view:
- Portfolio state across exchanges and chains.
- Cost basis and unrealized PnL by position.
- Recent executions with their signal context.
- Tax lots and realized PnL by year.
- LP positions, fee accruals, on-chain balances.
The analytics layer is a read replica, derived from the execution side's writes. It never writes back. A bug in the analytics dashboard cannot corrupt live trading state, because there is no path from the analytics layer to the execution layer.
This is the layer the LLM gets to see. Not through a wired-in API call — through the human, who pastes a portfolio view, a signal snapshot, or a recent decision log into a chat and asks for a second opinion.
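The "no write path" property does not have to rest on discipline alone. If the long-term store is SQLite, as above, the analytics side can open it in read-only mode so that writes fail at the connection level. A sketch; the path and table name are illustrative:

```python
import sqlite3

def analytics_connection(db_path: str) -> sqlite3.Connection:
    """Open the execution database read-only via SQLite's URI syntax.

    With mode=ro, any INSERT/UPDATE/DELETE from the analytics layer raises
    sqlite3.OperationalError instead of touching live trading state.
    """
    return sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
```

A dashboard bug can now at worst render wrong numbers; it cannot alter the executions table it reads from.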
What consultation actually is
In practice, LLM-assisted investing today is one of these conversations:
- "Here is my current portfolio breakdown. Does the LP / spot / cash ratio still match my stated thesis, or have I drifted?"
- "Here is the last 30 days of executions. The strategy was supposed to ladder buys; did it actually? Where did it deviate?"
- "Here is the signal hash for symbol X right now (z_score, vwap, c_t). Walk me through whether the upcoming order makes sense, and what would change my mind."
- "Here is a draft of the tax filing. The lots come from this CSV. Does the total-average computation look right?"
None of these involve the LLM placing an order. None of them give the LLM write access to anything. The LLM is a sounding board for the human, and the human is the one who acts.
The structural requirement for this to be useful is that the analytics layer makes pasteable views easy. If the human has to manually stitch together five queries to assemble a portfolio snapshot, they will not consult the LLM, because the friction is too high. So:
- Every important view has a CSV or JSON export.
- The dashboard pages double as exportable artifacts.
- The signal hashes are dumpable to a single block of text.
That is the actual integration work: not building an advisor service, but making the analytics layer fluent in formats a chat window will accept.
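As an example of what "pasteable" means in practice, here is a sketch of one export function that renders a portfolio snapshot as a single JSON block ready to drop into a chat. The field names are illustrative:

```python
import json

def pasteable_portfolio(positions: list[dict]) -> str:
    """Render a portfolio snapshot as one self-describing JSON block.

    Sorted keys and a computed total make the output stable across runs,
    so diffs between two pasted snapshots are meaningful.
    """
    snapshot = {
        "positions": sorted(positions, key=lambda p: p["symbol"]),
        "total_usd": round(sum(p["value_usd"] for p in positions), 2),
    }
    return json.dumps(snapshot, indent=2, sort_keys=True)
```

One function call, one block of text, zero manual stitching: that is the friction level at which consultation actually happens.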
What not to do
The temptation, once the analytics layer is clean, is to "close the loop" — let the LLM read the signal hash directly, let it propose an order, let it size a position.
Don't.
- Latency. LLM inference is seconds. Markets move in milliseconds. Even with batching, the model cannot sit on the hot path.
- Hallucination on quantities. A model that occasionally invents a price or a size controls real capital. Rare hallucinations compound over months.
- No audit trail without ceremony. A free-form generation has no schema. Three months later, asking "why did we open this position" produces a paragraph instead of a row in a log.
- No risk envelope. The model does not natively respect a position-size cap or a daily-loss limit. Those have to be enforced in the deterministic layer regardless.
Every one of these is solvable in principle — strict JSON schemas, prompt + response logging, deterministic risk gates wrapping the model. The point is that all of that work has to be done before the closed loop is honest, and none of it is necessary for the consultation pattern. The consultation pattern delivers most of the value of "LLM in the loop" with none of the engineering cost or correctness risk.
So the default position is: stay in consultation. Move to a wired-in advisor only when the consultation pattern has been outgrown for a specific, articulated reason — and even then, only as an advisor, never as a driver.
Where the analytics layer can use an LLM internally
There is one safe place where the analytics layer can call an LLM directly, for tasks that never feed back into trading decisions:
- Explaining what happened. Generating a one-paragraph summary of a month's PnL.
- Drafting tax documentation. Producing prose around the numerical filing.
- Annotating execution logs. Tagging trades with regime context ("this was the day of the FOMC announcement").
These are generative tasks with no execution authority. They run on read-only data and produce documents, not orders. They live entirely on the analytics side of the system, and the worst possible failure mode is a misleading paragraph — not a misplaced trade.
This is the only place in the stack where the LLM has a permanent home today. Everything else is the human, copying and pasting.
Sanity checks
Properties the system should preserve:
- The execution pipeline has no LLM calls anywhere in it.
- The analytics layer is a read replica with no write path back to execution.
- Every fill has a signal snapshot ID logged at decision time.
- Important analytics views have a one-click export to CSV or JSON.
- Consultation happens through copy-paste, not through API calls from the LLM.
If those stay true, the system remains in a configuration where introducing an actual advisor service later is a small, deliberate change — not a rewrite. That optionality is the real payoff of keeping the layers separate.
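The first of those properties can even be checked mechanically rather than by code review. A sketch of a CI-style scan, where the package path and the list of client libraries are assumptions:

```python
from pathlib import Path

# Hypothetical list of LLM client libraries to keep out of the execution code.
FORBIDDEN = ("openai", "anthropic", "litellm")

def llm_imports_in(package_dir: str) -> list[str]:
    """Return every file in the package that imports a forbidden library.

    A CI job can assert this list is empty for the execution package,
    turning the 'no LLM calls in the pipeline' rule into a failing test.
    """
    hits = []
    for source in Path(package_dir).rglob("*.py"):
        text = source.read_text()
        for lib in FORBIDDEN:
            if f"import {lib}" in text or f"from {lib}" in text:
                hits.append(f"{source}: {lib}")
    return hits
```

A string scan is deliberately crude; an AST-based check would be stricter. But even this version turns an architectural intention into something a build can enforce.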