Build journal · Trading systems · Live

MES Futures Automation.

A defined-risk intraday futures bot, written in Python against the Tradovate API, running on personal capital. This is the build journal: what's running, why it's built that way, and what won't be in this writeup.

Brad O'Haire
Status: live · single-instrument · paper-then-live workflow
Stack: Python · Tradovate REST + WebSocket · DuckDB

Why this work

The boring parts of running a live bot are the parts that decide whether it survives.

Most retail trading-bot writeups focus on the signal. The signal is the easy part. The hard part is the rest of the system: the kill switches, the fill-quality logging, the reconciliation between what you think you sent and what the broker actually did, the monitoring that catches a stuck order at 3:14 a.m. Every place this bot has been close to losing money has been an operations failure, not a signal failure. That's the part worth writing about.

Craft on display

Python event loop
REST + WebSocket integration
Defined-risk position model
Order-state machine
Reconciliation
Operational alerting
Risk discipline
Honest writeup

What this is

What it trades. Micro E-mini S&P 500 futures (MES). One instrument. Defined risk per trade. Intraday only, no overnight exposure.
Why MES. The contract multiplier is $5 per index point, so one tick (0.25 points) is worth $1.25. That small tick value lets the risk model scale down to a single contract without the position size doing the talking. It is the right instrument for proving an architecture before scaling notional.
What I'm sharing. Architecture, risk framework, operations playbook, and the specific places I've been wrong. Enough for another engineer to rebuild the skeleton.
What I'm not sharing. The signal. Specific parameter values. Live P&L. The reasons are below in section 06.
Disclosure. Personal capital, my own account, my own risk. This isn't a service, isn't a product, isn't advice.

01 / The premise

Why automate this at all

I'm a discretionary trader by background. The reason to automate isn't that the algo finds something my eye can't. The reason is that my eye is inconsistent and the algo isn't. The same setup that I'd take on Tuesday morning I might skip on Friday afternoon because the week has been bad and I'm tired and the cognitive cost of pulling the trigger is higher than it should be. An automation removes that variance from the execution layer. It still leaves the strategy variance, the model risk, and the operational risk, which is plenty.

The other reason is honest measurement. A discretionary trader's journal is full of trades that "I almost took" and trades that "I would have taken if not for X." An automation's journal is the trades it actually took. The accountability gap collapses to zero. That alone is worth the build cost.

The reason to automate isn't that the bot sees something I can't. The reason is that the bot does exactly what it said it would, which is the part I'm bad at.

02 / Architecture

Three loops, one ledger

The system is three asynchronous loops sharing one source of truth. The loops are deliberately separated so that one loop crashing doesn't take the others down with it.

[Diagram: Market loop → Decision loop → Order loop. WebSocket ticks and 1m bars → Local ledger (DuckDB) ← Order state and fill events. Risk monitor → Kill switch → Pager / Slack.]

Three loops sharing one ledger. Risk monitor sits perpendicular and can flatten everything.
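
One way to get that isolation in a single Python process is to run each loop as its own asyncio task under a small supervisor. This is a minimal sketch, not the bot's actual code: `supervise` and `run_all` are hypothetical names, and restarting a crashed loop is an assumption (the text above only says a crash must not take the other loops down).

```python
import asyncio

async def supervise(name, loop_fn, errors, restart_delay=0.0):
    """Run loop_fn until it returns cleanly; on a crash, record it and restart."""
    while True:
        try:
            await loop_fn()
            return  # the loop finished on its own
        except asyncio.CancelledError:
            raise  # let a deliberate shutdown propagate
        except Exception as exc:
            # The failure stays inside this supervisor; sibling loops keep running.
            errors.append((name, repr(exc)))
            await asyncio.sleep(restart_delay)

async def run_all(loops, errors):
    """Run every loop under its own supervisor so failures stay isolated."""
    await asyncio.gather(*(supervise(n, fn, errors) for n, fn in loops.items()))
```

In a real deployment the `errors` list would more likely be an alerting hook, and a loop that keeps crashing should probably halt trading rather than restart forever.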

Market loop

Subscribes to MES tick data over WebSocket. Aggregates into 1-minute bars on the local clock. Writes both the raw tick stream and the bar table to DuckDB on disk. The reason for writing raw ticks is reconciliation: if the bar logic ever drifts from what the broker reports, I can replay the actual tick stream and find the bug instead of arguing with the chart.
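
The aggregation step can be sketched as a pure function over a time-ordered tick stream, which is also what makes replay possible. The `Bar` fields and the `(epoch_seconds, price, size)` tick shape are assumptions for illustration, not the Tradovate message format.

```python
from dataclasses import dataclass

@dataclass
class Bar:
    start: int    # epoch seconds at the start of the bar
    open: float
    high: float
    low: float
    close: float
    volume: int

def aggregate_ticks(ticks, bar_seconds=60):
    """Fold time-ordered (epoch_seconds, price, size) ticks into OHLCV bars."""
    bars = []
    for ts, price, size in ticks:
        start = int(ts // bar_seconds) * bar_seconds  # floor to the bar boundary
        if bars and bars[-1].start == start:
            bar = bars[-1]
            bar.high = max(bar.high, price)
            bar.low = min(bar.low, price)
            bar.close = price
            bar.volume += size
        else:
            bars.append(Bar(start, price, price, price, price, size))
    return bars
```

Because the function has no hidden state, replaying the stored raw ticks through it should reproduce the stored bar table exactly, and any difference points at the bug.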

Decision loop

Reads the bar table on a fixed cadence. Evaluates the entry rule. If a signal fires, writes a trade_intent row to the ledger. The decision loop never talks to the broker. This separation is the single most useful design choice in the whole system.

Order loop

Polls the ledger for new trade_intent rows. Constructs the bracket order (entry + stop + target), submits it via the Tradovate REST endpoint, and writes the broker's response back to the ledger. Then it tails the order events WebSocket and updates the ledger as fills come in. This is the only loop that has authority to send orders.
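
The handoff between the decision loop and the order loop can be sketched as two functions that only ever touch the ledger. To keep the example runnable with the standard library, `sqlite3` stands in for DuckDB here; the table layout, column names, and the `new` / `claimed` status values are hypothetical.

```python
import sqlite3

def open_ledger():
    """Create an in-memory ledger (sqlite3 used as a stand-in for DuckDB)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        """CREATE TABLE trade_intent (
               id     INTEGER PRIMARY KEY,
               bar_ts INTEGER NOT NULL,   -- timestamp of the bar that fired the signal
               side   TEXT    NOT NULL,   -- 'buy' or 'sell'
               status TEXT    NOT NULL DEFAULT 'new'
           )"""
    )
    return db

def record_intent(db, bar_ts, side):
    """Decision loop: write an intent row. It never talks to the broker."""
    cur = db.execute(
        "INSERT INTO trade_intent (bar_ts, side) VALUES (?, ?)", (bar_ts, side)
    )
    db.commit()
    return cur.lastrowid

def claim_new_intents(db):
    """Order loop: fetch unclaimed intents and mark them so they are sent once."""
    rows = db.execute(
        "SELECT id, bar_ts, side FROM trade_intent WHERE status = 'new' ORDER BY id"
    ).fetchall()
    for intent_id, _, _ in rows:
        db.execute("UPDATE trade_intent SET status = 'claimed' WHERE id = ?", (intent_id,))
    db.commit()
    return rows
```

Marking a row as claimed before submission means a restart of the order loop cannot send the same intent twice; the cost is that a crash between claim and submit leaves a row that reconciliation has to resolve.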

Risk monitor

Runs perpendicular to all three loops. Reads the ledger every few seconds. Compares realized P&L against the daily stop, open positions against the max-position rule, and the time of day against the no-trade window. Has the authority to call flatten_all() and to set a halt_until flag that the order loop respects. The risk monitor can't open positions. It can only close them and stop new ones.
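
A minimal sketch of one risk-monitor pass, under stated assumptions: `LedgerView` is a hypothetical in-memory stand-in for the ledger, `flatten_all` is passed in as a callable, and treating every breach the same way (flatten, then halt until the next session open) is an illustrative policy rather than the bot's confirmed behavior.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

@dataclass
class LedgerView:
    realized_pnl: float = 0.0
    open_contracts: int = 0
    halt_until: Optional[datetime] = None

def risk_pass(ledger, next_session_open, daily_stop, max_contracts, flatten_all):
    """One monitor pass: on any breach, close everything and block new orders."""
    breached = (
        ledger.realized_pnl <= -daily_stop
        or abs(ledger.open_contracts) > max_contracts
    )
    if breached:
        flatten_all()                          # the monitor may only close
        ledger.halt_until = next_session_open  # and stop new entries
    return breached

def may_send_orders(ledger, now):
    """Order loop check: respect the halt flag before submitting anything."""
    return ledger.halt_until is None or now >= ledger.halt_until
```

Note that nothing in this interface can open a position, which mirrors the one-way authority described above.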

03 / The risk model

Where the discipline lives

The strategy logic in this bot is the part you can replace with another strategy. The risk model isn't. It is the part that decides whether the system survives a bad day, and it's worth describing precisely:

Per-trade risk

Every trade is bracketed at submission time. The stop loss and profit target are placed as an OCO pair attached to the entry in the same bracket order, not as separate orders the bot tries to maintain. The maximum loss on any single trade is fixed in dollar terms before entry. If the distance from entry to stop, converted to dollars, exceeds that amount, the trade isn't sized down, it's skipped. Sizing down below one micro contract isn't an option, so the rule is binary.
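
The take-or-skip arithmetic is small enough to show directly. The $5-per-point multiplier is the MES contract spec; the prices and the $25 cap in the usage note are placeholder values, not the bot's parameters.

```python
MES_DOLLARS_PER_POINT = 5.0  # MES contract multiplier

def trade_risk_dollars(entry, stop, contracts=1):
    """Dollar loss if the stop is hit, ignoring slippage and fees."""
    return abs(entry - stop) * MES_DOLLARS_PER_POINT * contracts

def should_take(entry, stop, max_loss_dollars):
    """Binary rule: one contract fits under the cap, or the trade is skipped."""
    return trade_risk_dollars(entry, stop) <= max_loss_dollars
```

With a hypothetical $25 cap, an entry at 5000.00 with a stop at 4996.00 risks 4 points × $5 = $20 and is taken; a stop at 4994.00 risks $30 and is skipped.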

Daily stop

A hard daily loss limit. Hit it and the bot flattens any open positions and sets halt_until to the next session open. The number is set so that a worst-case day is annoying but not destructive. The number lives in a config file, not in the code, so changing it is a deliberate edit.

Daily target

Less common in retail writeups, but I think correct. A daily profit target also halts the bot. The reason is asymmetry: a great session followed by a give-back is a frequent enough pattern that capping the upside removes the impulse to "let it run." The bot doesn't have impulses, but the discretionary failure mode would be to manually re-enable it. The cap removes the temptation.
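
The daily stop and daily target together form a band around the session's realized P&L. A sketch, assuming a JSON config file with hypothetical keys `daily_stop` and `daily_target`, both written as positive dollar amounts:

```python
import json

def load_limits(config_text):
    """Parse and validate daily limits from config text (key names are hypothetical)."""
    raw = json.loads(config_text)
    limits = {
        "daily_stop": float(raw["daily_stop"]),
        "daily_target": float(raw["daily_target"]),
    }
    if limits["daily_stop"] <= 0 or limits["daily_target"] <= 0:
        raise ValueError("daily_stop and daily_target must be positive dollar amounts")
    return limits

def session_verdict(realized_pnl, limits):
    """Return which side of the band, if any, ends the session."""
    if realized_pnl <= -limits["daily_stop"]:
        return "halt_stop"
    if realized_pnl >= limits["daily_target"]:
        return "halt_target"
    return "continue"
```

Validating at load time means a typo in the config (a negative stop, a missing key) fails at startup instead of silently disabling a limit mid-session.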

No-trade windows

The bot doesn't trade in the first 5 minutes of the session, doesn't trade in the last 10 minutes of the session, and skips the 30-minute window around scheduled high-impact macro releases (CPI, FOMC, NFP). The reason isn't superstition. The reason is that fill quality during those windows is bad enough to matter at this notional, and the strategy assumes a typical execution environment. Trades that depend on a non-typical environment belong in a different system.
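
The window check can be sketched as a single predicate. Two assumptions are baked in and should be adjusted to taste: the 30-minute release window is treated as centered on the release time (15 minutes either side), and the session open and close are supplied by the caller rather than derived from an exchange calendar.

```python
from datetime import datetime, timedelta

def in_no_trade_window(
    now,
    session_open,
    session_close,
    releases,
    open_buffer=timedelta(minutes=5),
    close_buffer=timedelta(minutes=10),
    release_half_window=timedelta(minutes=15),  # assumption: window is centered
):
    """True if new entries should be blocked at `now`."""
    if now < session_open + open_buffer:
        return True  # first minutes of the session
    if now >= session_close - close_buffer:
        return True  # last minutes of the session
    # Any scheduled release close enough to `now` blocks trading.
    return any(abs(now - release) <= release_half_window for release in releases)
```

All datetimes must share one timezone convention (all naive or all aware); mixing them raises a `TypeError` in Python, which is the failure you want.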

04 / The order-state machine

The boring part that matters most

The single biggest source of bugs in any retail bot I've seen is the assumption that "I sent the order" means "the order exists at the broker." Networks drop packets. APIs return HTTP 200 for requests that never actually result in a working order. Fill logic on the broker side has its own state. The fix is a state machine the bot owns, with the broker's responses as inputs:

# conceptual order states the bot tracks
from enum import Enum

class OrderState(Enum):
    DRAFT            = "draft"           # intent recorded, not sent
    SUBMITTED        = "submitted"       # sent to broker, no ack yet
    ACKNOWLEDGED     = "acknowledged"    # broker returned an order id
    WORKING          = "working"         # live in the book
    PARTIALLY_FILLED = "partial"         # some quantity filled
    FILLED           = "filled"          # fully filled
    CANCEL_PENDING   = "cancel_pending"  # cancel sent
    CANCELED         = "canceled"        # broker confirmed cancel
    REJECTED         = "rejected"        # broker refused
    UNKNOWN          = "unknown"         # reconciliation needed

Every state transition is logged with the broker payload that triggered it. UNKNOWN exists because some failure modes leave the bot's view of the order desynced from the broker. When that happens the reconciliation routine queries the broker's order history and forces the bot's view to match, then alerts.
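
A sketch of how those states might be enforced. The `ALLOWED` transition table, the `Order` class, and the rule that any unexpected event drops the order into UNKNOWN are all assumptions for illustration; the real table depends on what the broker can actually emit.

```python
from enum import Enum

class OrderState(Enum):
    DRAFT = "draft"
    SUBMITTED = "submitted"
    ACKNOWLEDGED = "acknowledged"
    WORKING = "working"
    PARTIALLY_FILLED = "partial"
    FILLED = "filled"
    CANCEL_PENDING = "cancel_pending"
    CANCELED = "canceled"
    REJECTED = "rejected"
    UNKNOWN = "unknown"

S = OrderState

# Hypothetical transition table; terminal states have no outgoing edges.
ALLOWED = {
    S.DRAFT: {S.SUBMITTED},
    S.SUBMITTED: {S.ACKNOWLEDGED, S.REJECTED},
    S.ACKNOWLEDGED: {S.WORKING, S.PARTIALLY_FILLED, S.FILLED, S.REJECTED},
    S.WORKING: {S.PARTIALLY_FILLED, S.FILLED, S.CANCEL_PENDING, S.CANCELED},
    S.PARTIALLY_FILLED: {S.PARTIALLY_FILLED, S.FILLED, S.CANCEL_PENDING, S.CANCELED},
    S.CANCEL_PENDING: {S.CANCELED, S.PARTIALLY_FILLED, S.FILLED},
}

class Order:
    def __init__(self):
        self.state = S.DRAFT
        self.log = []  # (from_state, to_state, broker_payload)

    def apply(self, new_state, payload):
        """Apply a broker event; anything the table does not allow becomes UNKNOWN."""
        allowed = ALLOWED.get(self.state, set())
        target = new_state if new_state in allowed else S.UNKNOWN
        self.log.append((self.state, target, payload))
        self.state = target
        return target

    def reconcile(self, broker_state, payload):
        """Force the local view to match what the broker's order history reports."""
        self.log.append((self.state, broker_state, payload))
        self.state = broker_state
```

In this sketch a duplicate event on a terminal state also lands in UNKNOWN. That is deliberately noisy; a production version might treat same-state repeats as no-ops.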

This sounds tedious because it is. It is also the difference between a bot that runs for months and a bot that you have to reboot every few days because something got stuck.

05 / Operational lessons

The places I've been wrong

The signal hasn't changed much. Almost everything I've fixed since this bot went live has been operational:

Lesson 1: Clock skew is a real problem

An early version of the decision loop assumed local system time matched exchange time within a second. It mostly does. The day it didn't, the bot evaluated a signal on a stale bar and entered a position 90 seconds later than the rule called for. Fix: every signal carries the timestamp of the bar it was generated from, and the order loop refuses to send any order whose intent is more than N seconds old.
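
The staleness guard itself is a one-line comparison. In this sketch `max_age_seconds` plays the role of N, which the text leaves unspecified, and rejecting intents that appear to come from the future is an added assumption that catches skew in the other direction.

```python
def intent_is_fresh(bar_ts, now_ts, max_age_seconds):
    """True if the intent's source bar is recent enough to act on.

    Timestamps are epoch seconds. A negative age means the two clocks
    disagree, so that case is refused as well.
    """
    age = now_ts - bar_ts
    return 0 <= age <= max_age_seconds
```

With a hypothetical limit of 10 seconds, an intent that is 5 seconds old passes and the 90-second-late case described above is refused.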

Lesson 2: WebSocket reconnects are not free

The Tradovate WebSocket disconnects sometimes. Reconnect is fast but not instant. During the gap, ticks are lost. If the bot is in a position during a disconnect, it can't see the stop being approached. Fix: any reconnect lasting longer than a defined threshold triggers flatten_all() on safety grounds. A whipsaw close is cheaper than an unmanaged position.
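
A sketch of the gap watchdog. `ConnectionWatchdog` and its method names are hypothetical, and polling during the outage (rather than waiting for the reconnect to complete) assumes orders can still be sent over REST while the WebSocket is down.

```python
class ConnectionWatchdog:
    """Calls flatten_all() once if a disconnect outlasts max_gap_seconds."""

    def __init__(self, max_gap_seconds, flatten_all):
        self.max_gap_seconds = max_gap_seconds
        self.flatten_all = flatten_all
        self.disconnected_at = None
        self.fired = False

    def on_disconnect(self, now):
        if self.disconnected_at is None:
            self.disconnected_at = now
            self.fired = False

    def on_reconnect(self, now):
        self.poll(now)  # catch a long gap that was never polled
        self.disconnected_at = None

    def poll(self, now):
        """Call on a timer; returns True if this call triggered the flatten."""
        if self.disconnected_at is None or self.fired:
            return False
        if now - self.disconnected_at > self.max_gap_seconds:
            self.fired = True
            self.flatten_all()
            return True
        return False
```

The `fired` flag keeps a long outage from sending a flatten request on every poll.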

Lesson 3: My local clock is not the broker's clock

Order timestamps from the broker and bar timestamps from the data feed don't always come from the same clock. Reconciliation needs to track which clock produced each timestamp before comparing them. This bug ate two days of debugging the first time it showed up.

Lesson 4: Logging is not free either

Writing every tick to disk in the same loop that's processing the tick will eventually back up under volatility. Fix: ticks land in an in-memory ring buffer first, and a separate writer thread drains them to DuckDB. The processing loop never blocks on disk.
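
A sketch of the buffer-and-drain pattern using only the standard library. `TickBuffer` is a hypothetical name, and `sink` stands in for whatever batches rows into DuckDB. A bounded `deque` behaves as a ring buffer: if the writer falls too far behind, the oldest ticks are overwritten rather than the processing loop blocking.

```python
import threading
import time
from collections import deque

class TickBuffer:
    """Bounded in-memory buffer drained to `sink` by a background thread."""

    def __init__(self, capacity, sink, idle_sleep=0.001):
        self._buf = deque(maxlen=capacity)  # drops oldest entries when full
        self._sink = sink                   # callable taking a list of ticks
        self._idle_sleep = idle_sleep
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._drain, daemon=True)

    def start(self):
        self._thread.start()

    def push(self, tick):
        """Called from the processing loop; never touches disk, never blocks."""
        self._buf.append(tick)

    def _drain(self):
        while True:
            batch = []
            while True:
                try:
                    batch.append(self._buf.popleft())
                except IndexError:
                    break  # buffer is empty for now
            if batch:
                self._sink(batch)
            elif self._stop.is_set():
                return  # nothing left to write and shutdown was requested
            else:
                time.sleep(self._idle_sleep)

    def close(self):
        """Flush whatever is still buffered, then stop the writer thread."""
        self._stop.set()
        self._thread.join()
```

The capacity is the knob that matters: it sets how long a disk stall the buffer can absorb before raw ticks are lost, so it is worth sizing against the busiest session on record.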

Lesson 5: The "halt the bot" Slack alert needs to go to my phone, not my laptop

Self-explanatory. The daily stop once fired during a meeting and I didn't see the alert until two hours later. After that I added PagerDuty.

06 / What I'm not sharing, and why

Where the line is

This page deliberately stops short of the signal. The reasons, listed:

The signal isn't novel enough to defend. If I posted the rule, two-thirds of the readers would either already be running it or be capable of running it within a week. Once it's a more crowded trade, the edge erodes. That's a normal property of public trading rules; it's not a moral problem, it's a math one.

Live P&L is more theatre than evidence. A few months of P&L on a single instrument is mostly noise. The number that matters is the multi-year, multi-regime track record, and I don't have that yet. Posting a chart that looks great for six months would be misleading. Posting a chart that looks bad for six months would also be misleading. So I'm not posting a chart.

The architecture is the transferable part. The state machine, the loop separation, the risk monitor, the reconciliation pattern: all of that ports to a different signal, a different broker, a different instrument. If you're an engineer or a trader reading this to figure out whether I can build a real production system, that's the part that answers the question.

For recruiters and operators

If you want to talk through the parts that aren't on the page, I'm happy to do that in a conversation. The line for what I'll discuss in private is much further out than the line for what I'll publish.

07 / Caveats I want stated

Where this could break

Honest about the limits

Single instrument, single regime. The bot has been live across one volatility regime. The risk model is built to survive a worse one but hasn't been tested against one yet.
Tradovate is one broker. The state machine is broker-agnostic by design but only proven against one API. Another broker would surface different edge cases.
Latency is not optimized. This is a bar-cadence strategy, not a tick-cadence one. Low-latency infrastructure isn't on the roadmap because the strategy doesn't depend on it. A signal that did would need a different stack.
I am not a quant. The strategy is informed by reading and discretionary experience, not by a formal research process. The framing of this page reflects that. Anyone who wants to extend the work should treat the architecture as the load-bearing part.
Personal capital. Personal account. Nothing on this page is investment advice or a recommendation to anyone.

Pre-registered open questions

What I'm planning to test next, written down so I can hold myself to the result.

Does the risk model survive a synthetic 2020-style spike? Replay March 2020 tick data through the bot and confirm the daily stop and reconciliation behave correctly. Pre-registered fail condition: any state where the bot is unaware of an open position for more than one bar.
How does fill quality degrade when the bot trades into a no-trade window by mistake? Synthetic test: force the bot to ignore the no-trade window, measure realized vs theoretical fill price. The cost of the window violation is the value of the rule.
What's the smallest meaningful sample size for the daily stop number? The current daily stop is set on a discretionary basis. The honest version is calibrated against a measured loss distribution. I owe a writeup of how I'd derive it.