TACAVAR
Build in Public

How We Built an Autonomous Trading Bot for Crypto + Prediction Markets

A technical breakdown of Tacavar's 9-strategy LLM trading architecture, Polymarket prediction market integration, and the hard-veto critic system that keeps it safe — built in public, paper-traded with real data.

Most AI trading bots run one strategy. Ours runs nine — plus a prediction market layer no competitor has.

That's not a boast. It's a design decision that emerged from a specific failure mode: single-strategy bots work until they don't. A mean reversion bot crushes it in a ranging market and bleeds out in a breakout. A momentum bot prints in a trend and dies in a chop.

We wanted a system that adapts to the market instead of hoping the market adapts to it. So we built one.

This is the architecture, the reasoning behind it, and what the data shows after 30 days of paper trading.

Architecture: The 15-Minute Cycle

The bot runs a continuous loop. Every 15 minutes, it executes this pipeline:

```
data_scraper.py (60s tick) → insights.jsonl
crypto_strategies.py (per cycle, 9 strategies) → insights.jsonl (trade signals)
autopilot.py → reads insights.jsonl → LLM decision → decisions.jsonl → critic system
```

**data_scraper.py** collects market data, sentiment, and on-chain metrics on a 60-second tick. It calls 12+ APIs: exchange order books, funding rates, open interest, Fear & Greed Index, Deribit options flow, Polymarket odds and volume. Circuit breakers catch API failures — when Twitter's API returns 403 (it does, 88-90% of the time), the scraper logs the error and moves on rather than blocking the cycle.
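
The "log and move on" behavior can be sketched as a small circuit breaker. This is a minimal illustration, not the scraper's actual code: the class name, the failure threshold of 3, and the 300-second cooldown are all assumptions.

```python
import logging
import time
from typing import Callable, Optional

log = logging.getLogger("data_scraper")


class CircuitBreaker:
    """Skip a failing data source for a cooldown period instead of blocking the cycle."""

    def __init__(self, max_failures: int = 3, cooldown_s: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures  # assumed threshold
        self.cooldown_s = cooldown_s      # assumed cooldown
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None

    def allow(self) -> bool:
        """Return True if the source may be called right now."""
        if self.opened_at is None:
            return True
        if self.clock() - self.opened_at >= self.cooldown_s:
            # Cooldown elapsed: close the breaker and allow a fresh attempt.
            self.opened_at = None
            self.failures = 0
            return True
        return False

    def call(self, fetch: Callable[[], dict]) -> Optional[dict]:
        """Run fetch(); on error, log it and return None so the cycle continues."""
        if not self.allow():
            return None
        try:
            result = fetch()
        except Exception as exc:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            log.warning("source failed (%s); continuing cycle", exc)
            return None
        self.failures = 0
        return result
```

Each API would get its own breaker, so a source that keeps returning 403 is skipped for a while without slowing the other eleven.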

**crypto_strategies.py** runs 9 signal generators in parallel. Each strategy writes trade_signal events to `insights.jsonl`. They don't compete — they contribute. A single cycle might see mean reversion flagging a BTC oversold condition while volume breakout detects nothing and market regime detection outputs "ranging with bullish tilt."

**autopilot.py** is the decision engine. It reads the accumulated insights, calls an LLM with a sanitized prompt containing the full market context, and writes a decision to `decisions.jsonl`. The decision includes: action (buy/sell/hold), confidence score (0-1), strategy attribution, and reasoning hash.
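
To make the record shape concrete, here is a hedged sketch of one line in `decisions.jsonl`. The field names and the choice of SHA-256 for the reasoning hash are assumptions for illustration; only the four pieces of information come from the description above.

```python
import hashlib
import json
from dataclasses import asdict, dataclass

VALID_ACTIONS = {"buy", "sell", "hold"}


@dataclass
class Decision:
    action: str          # "buy", "sell", or "hold"
    confidence: float    # 0.0 to 1.0
    strategy: str        # which strategy the decision is attributed to
    reasoning_hash: str  # digest of the LLM's reasoning text (SHA-256 assumed)


def make_decision(action: str, confidence: float, strategy: str, reasoning: str) -> Decision:
    """Validate the fields and hash the reasoning text."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    digest = hashlib.sha256(reasoning.encode("utf-8")).hexdigest()
    return Decision(action, confidence, strategy, digest)


def to_jsonl_line(decision: Decision) -> str:
    """Serialize one decision as a single JSON line."""
    return json.dumps(asdict(decision), sort_keys=True)
```

Storing a hash rather than the full reasoning keeps each line small while still letting a later audit confirm that a stored reasoning text matches the decision it produced.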

Then the critic reviews it. More on that below.

The 9 Crypto Strategies

Nine sounded like too many when we started. We kept adding because each captured an edge the others missed.

**Mean Reversion** — Fades extremes in ranging markets. Uses ADX, Bollinger Band width, and distance from EMA50 to identify regimes where reversion is likely. Performed best in the 30-day test.

**RSI Swing Trading** — Enters on RSI dips below 30 and exits above 70. Standard but effective when combined with regime filters.
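
A minimal sketch of the 30/70 rule, assuming a 14-period RSI with Wilder smoothing (the period and smoothing method are not stated above, and the regime filter is omitted here):

```python
from typing import Sequence


def rsi(closes: Sequence[float], period: int = 14) -> float:
    """Relative Strength Index using Wilder's smoothing."""
    if len(closes) < period + 1:
        raise ValueError("need at least period + 1 closes")
    deltas = [b - a for a, b in zip(closes, closes[1:])]
    gains = [max(d, 0.0) for d in deltas]
    losses = [max(-d, 0.0) for d in deltas]
    # Seed with a simple average, then smooth the remaining values.
    avg_gain = sum(gains[:period]) / period
    avg_loss = sum(losses[:period]) / period
    for gain, loss in zip(gains[period:], losses[period:]):
        avg_gain = (avg_gain * (period - 1) + gain) / period
        avg_loss = (avg_loss * (period - 1) + loss) / period
    if avg_loss == 0.0:
        return 100.0
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)


def rsi_signal(value: float, in_position: bool) -> str:
    """Enter when RSI dips below 30; exit an open position above 70."""
    if not in_position and value < 30.0:
        return "enter"
    if in_position and value > 70.0:
        return "exit"
    return "hold"
```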

**MACD Cross** — Tracks histogram momentum shifts. Useful as a confluence signal rather than a standalone entry.

**Bollinger Bounce** — Scalps bounces off the lower band in trending markets. Bands must be narrowing — wide bands in high volatility are unreliable.

**Volume Breakout** — Flags when price moves on volume 2x above the 20-period average. Triggers in trending regimes only.
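
The volume condition can be sketched as below. The function name and the regime label `"trending"` are assumptions, the baseline is taken as the 20 periods before the current bar, and the accompanying price-move check is left out for brevity.

```python
from statistics import fmean
from typing import Sequence


def volume_breakout(volumes: Sequence[float], regime: str,
                    window: int = 20, multiple: float = 2.0) -> bool:
    """True when the latest volume exceeds `multiple` times the prior-window average."""
    if regime != "trending":
        return False  # only triggers in trending regimes
    if len(volumes) < window + 1:
        return False  # not enough history for a baseline
    baseline = fmean(volumes[-(window + 1):-1])  # excludes the current bar
    return volumes[-1] > multiple * baseline
```

Excluding the current bar from the baseline matters: including it would let a large spike inflate its own average and make the threshold harder to cross.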

**DCA Ladder** — Dollar-cost averages into positions during confirmed dips. Limited to 3 entries per cycle, max 0.5% position per entry.

**Minute Scalper** — Fast entries and exits on 1-minute candles. Tight stop, small profit targets. Designed to capture micro-inefficiencies during high-liquidity windows.

**Agentic LLM Analysis** — A separate LLM call that analyzes the full market context and makes an open-ended recommendation. Not constrained to a single indicator. This is the wildcard — sometimes it catches things the indicators miss.

**Market Regime Detection** — The meta-strategy. Identifies whether the market is trending, ranging, or breaking down. This output feeds into every other strategy as a modifier — a buy signal from mean reversion in a strong downtrend gets deprioritized.

Why nine? Diversification of edge. No single strategy dominates, but the ensemble adapts to changing conditions. In ranging markets, mean reversion carries. In breakouts, volume and momentum take over. The LLM synthesizes the ensemble weighted by regime confidence.
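
One way to picture the regime modifier is as a per-strategy weight that is applied more strongly the more confident the regime call is. The sketch below is illustrative only: the weight table, regime labels, and blending formula are all hypothetical, not the values the bot uses.

```python
from typing import Dict, Tuple

# Hypothetical weights: how much a strategy's signal counts in each regime.
REGIME_WEIGHTS: Dict[Tuple[str, str], float] = {
    ("mean_reversion", "ranging"): 1.0,
    ("mean_reversion", "downtrend"): 0.3,   # buy-the-dip is deprioritized
    ("volume_breakout", "trending"): 1.0,
    ("volume_breakout", "ranging"): 0.2,
}
DEFAULT_WEIGHT = 0.5  # assumed weight for unlisted combinations


def adjusted_score(strategy: str, raw_score: float,
                   regime: str, regime_confidence: float) -> float:
    """Scale a raw signal score by the regime weight, blended by confidence."""
    if not 0.0 <= regime_confidence <= 1.0:
        raise ValueError("regime_confidence must be between 0 and 1")
    weight = REGIME_WEIGHTS.get((strategy, regime), DEFAULT_WEIGHT)
    # With zero confidence the weight has no effect; with full confidence it applies fully.
    effective = 1.0 + regime_confidence * (weight - 1.0)
    return raw_score * effective
```

Under these assumed numbers, a 0.8 mean reversion score in a fully confident downtrend drops to roughly 0.24, while the same score is left untouched when the regime detector has no confidence in its call.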

The Prediction Market Layer: Where We're Different

To our knowledge, none of the established retail bot platforms offers Polymarket integration. Not 3Commas. Not Cryptohopper. Not Bitsgap.

Prediction markets are structurally different from crypto spot trading. You're not betting on price direction — you're betting on the probability of a binary event. The edge lives in information synthesis and base rate analysis, not pattern recognition.

The system has five layers:

1. **Market Monitoring** — Polls Polymarket for active markets in focus categories (macro, crypto regulation, Fed policy, ETF decisions). Tracks prices, volume, open interest, and flags unusual movement.

2. **Probability Estimation** — Generates an independent probability estimate using base rates from historical analogues, current data (polling, economic indicators), and LLM-synthesized news analysis. This is the core intellectual work.

3. **Edge Calculation** — Compares our estimate to Polymarket's current price. Edge = (our probability × payout) - cost. Only markets with divergence above 8 percentage points are eligible.

4. **Position Sizing** — Kelly Criterion with a 0.25x fraction cap. High-confidence, high-edge markets get larger positions. Uncertain estimates get minimal allocation regardless of upside.

5. **Execution** — Places orders via Polymarket's CLOB API. Monitors open positions for material information changes. Exits early if our estimate changes significantly.
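
Layers 3 and 4 can be sketched together. This assumes a binary share that pays $1 on resolution, reads the "0.25x fraction" as a multiplier on the full Kelly stake, and ignores fees and slippage. The function and type names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

PAYOUT = 1.0           # a winning share pays $1 (assumed)
MIN_DIVERGENCE = 0.08  # 8 percentage points
KELLY_FRACTION = 0.25  # quarter Kelly


@dataclass
class Sizing:
    side: str                 # "YES" or "NO"
    edge_per_share: float     # (our probability x payout) - cost
    bankroll_fraction: float  # fraction of bankroll to stake


def size_position(p_est: float, yes_price: float) -> Optional[Sizing]:
    """Return a sizing for the favorable side, or None if the edge is too small."""
    if not 0.0 < yes_price < 1.0 or not 0.0 <= p_est <= 1.0:
        raise ValueError("probabilities and prices must lie in [0, 1]")
    if abs(p_est - yes_price) <= MIN_DIVERGENCE:
        return None
    if p_est > yes_price:
        side, p, cost = "YES", p_est, yes_price
    else:
        # Buying NO: win probability and cost are the complements.
        side, p, cost = "NO", 1.0 - p_est, 1.0 - yes_price
    edge = p * PAYOUT - cost
    # Kelly for a $1 binary share bought at `cost`: f* = (p - cost) / (1 - cost).
    full_kelly = (p - cost) / (1.0 - cost)
    return Sizing(side, edge, KELLY_FRACTION * full_kelly)
```

For example, with our estimate at 60% and the market at 45 cents, the edge is 15 cents per share, full Kelly is 0.15 / 0.55 ≈ 27% of bankroll, and quarter Kelly stakes about 6.8%.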

The prediction market layer currently has 4 strategies (momentum, arb, fade, market-making). One of them, market-making (pm_mm), is disabled pending a fee-aware spread implementation: market-making before the fees are fully modeled is a losing proposition.

The Hard-Veto Critic System

This is the safety layer that most trading systems don't have.

The critic runs after every LLM decision. It produces one of five verdicts:

  • **Agree** — Decision is sound. Execute as proposed.
  • **Adjust** — Decision has merit but position size or timing needs modification.
  • **Veto** — Hard no. Decision contradicts safety rules or market data.
  • **Escalate** — Decision needs human review. Confidence is too low or contradiction is too complex.
  • **Monitor** — No action needed. Correct call to do nothing.

The critic has hard rules that cannot be overridden by the LLM:

  • Drawdown >= 4% → automatic veto on new positions
  • Drawdown >= 2.5% → automatic adjust (reduce position size)
  • Fear & Greed Index < 18 → vetoes all long entries
  • Total exposure >= 12% → halts new positions
  • 4 consecutive losses → veto until human review

These are simple, mechanical, non-negotiable. The LLM can reason about the market. The critic reasons about risk.
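
Because the rules are mechanical, they fit in a few lines of plain code with no model in the loop. The sketch below uses the thresholds listed above. The type and function names are hypothetical, the order in which rules are checked is an assumption, and "halts new positions" is treated here as a veto.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class RiskState:
    drawdown_pct: float      # current drawdown, in percent
    fear_greed: int          # Fear & Greed Index, 0-100
    exposure_pct: float      # total exposure, in percent of capital
    consecutive_losses: int


def hard_rule_verdict(state: RiskState, side: str) -> Optional[Tuple[str, str]]:
    """Return (verdict, reason) if a hard rule fires on a new position, else None."""
    if state.consecutive_losses >= 4:
        return ("veto", "4 consecutive losses: wait for human review")
    if state.drawdown_pct >= 4.0:
        return ("veto", "drawdown >= 4%")
    if state.exposure_pct >= 12.0:
        return ("veto", "total exposure >= 12%")
    if side == "long" and state.fear_greed < 18:
        return ("veto", "Fear & Greed < 18 blocks long entries")
    if state.drawdown_pct >= 2.5:
        return ("adjust", "drawdown >= 2.5%: reduce position size")
    return None
```

Vetoes are checked before the adjust rule so that the strictest applicable verdict wins. A `None` result means no hard rule fired and the critic's softer judgment applies.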

Over 30 days, the critic logged **126,915 signals**. Most were "monitor" — the system correctly choosing to watch rather than act.

The LLM Stack

The bot runs a triple-tier fallback chain:

  • **Primary:** Claude Opus via API
  • **Secondary:** Qwen3.5-plus via DashScope
  • **Emergency:** Local Ollama models (Qwen2:7b, Gemma3:4b)

We logged 427 error actions in 30 days. Most were provider unreachability. When Claude timed out, the fallback to Qwen kicked in. When both failed, the bot defaulted to hold.

Never depend on a single LLM provider. A trading bot that freezes because one API endpoint is down is not autonomous — it's fragile.
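
The fallback chain reduces to a loop over providers with a safe default at the end. In this sketch the provider callables are stand-ins for real API clients, and the function name and return shape are assumptions.

```python
import logging
from typing import Callable, List, Tuple

log = logging.getLogger("autopilot")

Provider = Tuple[str, Callable[[str], str]]  # (name, function taking a prompt)


def decide_with_fallback(providers: List[Provider], prompt: str) -> Tuple[str, str]:
    """Try each provider in order; default to "hold" if every one fails."""
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:
            # Timeouts, connection errors, and bad responses all fall through.
            log.warning("provider %s failed: %s", name, exc)
    return "default", "hold"
```

Defaulting to hold means that a total outage costs missed opportunities rather than unreviewed trades.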

Sanitized prompts reduce the risk of the LLM hallucinating market data. Contradiction checks force the model to explicitly confirm that its reasoning doesn't conflict with incoming market data. If the critic disagrees twice in a row, the trade escalates to human review.

Paper Trading Results: 30 Days

The bot ran on $10K simulated capital. 25 closed trades across two markets.

Numbers worth noting:

  • **Mean reversion** in ranging markets performed best. ETH/USDT long at -0.9% closed at +1.3% when price reverted. An extreme fear reading (0.64) provided contrarian entry confidence.
  • **The critic prevented multiple bad trades.** At least 2 potential drawdown events were caught by the drawdown circuit breaker before they materialized.
  • **Confidence thresholds worked.** Auto-execute at 85%+ confidence. 65-85% queues for human review. Below 65% is a hold. 1,550+ instances where the bot correctly did nothing.
  • **427 error actions** — mostly provider outages. The triple-tier fallback prevented any trading halt.
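
The confidence thresholds above amount to a three-way router. A minimal sketch, with hypothetical function and label names, and with exactly 85% treated as auto-execute and exactly 65% as human review:

```python
AUTO_EXECUTE_MIN = 0.85
HUMAN_REVIEW_MIN = 0.65


def route_decision(confidence: float) -> str:
    """Map a 0-1 confidence score to an execution path."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence >= AUTO_EXECUTE_MIN:
        return "auto_execute"
    if confidence >= HUMAN_REVIEW_MIN:
        return "human_review"
    return "hold"
```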

The bot is still in paper trading mode (`dry_run = true`). Before we trade real capital, we need consistent profitability over 6-12 months, max drawdown under 15%, win rate above 45%, and zero catastrophic failures.

Why Polymarket Is Underutilized by Algo Traders

Most algorithmic traders don't touch prediction markets. The reasons are straightforward:

1. **Binary outcomes** don't fit standard backtesting frameworks

2. **Limited historical data** compared to crypto or equities

3. **Thin liquidity** makes large position entries difficult

4. **Dominated by casual, narrative-driven participants**

Each of these is a structural barrier. Each is also a structural opportunity.

When a market is dominated by participants who trade on vibes rather than data, the systematic trader with real probability estimation has a persistent edge. The friction that keeps most algo traders out is the exact friction that makes prediction markets exploitable.

As Polymarket volume grows and 2026 election cycle activity expands, that edge will compress. The time to build the capability is before the competition arrives.

Building in Public

We publish the data because trust in trading systems is earned, not claimed.

The bot runs 24/7. The data is published. When the edge compresses, we'll find or build a new one.

That's what autonomous systems do. They don't stop optimizing — they adapt.

*You built it. We optimize it.*