Dexter is an autonomous research agent for financial analysis. You ask a question, it plans the steps, pulls income statements, balance sheets, and cash flow statements from the Financial Datasets API, validates its own answer, and iterates until it’s done. The README pitches it as “Claude Code, but for financial research” — that framing is closer to honest than most.

The part I’d steal first is the scratchpad. Every query writes a .dexter/scratchpad/<timestamp>_<hash>.jsonl file with newline-delimited entries: the original question, every tool call with its raw result, the LLM summary of that result, and the agent’s reasoning steps. JSONL on disk, not a hosted trace UI, not OpenTelemetry — just a file you can grep. For an agent that touches numbers people will act on, an auditable log isn’t optional, and most agent projects bolt this on later.
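Because the trace is just newline-delimited JSON, filtering it takes a few lines. A minimal sketch below, with the caveat that the entry shape (`type`, `tool`, `result` fields) is my assumption for illustration, not Dexter's actual schema:

```typescript
// Sketch: pull the tool-call entries out of a scratchpad JSONL trace.
// The field names here are illustrative, not Dexter's real schema.
type ScratchpadEntry = { type: string; [key: string]: unknown };

function toolCalls(jsonl: string): ScratchpadEntry[] {
  return jsonl
    .split("\n")
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line) as ScratchpadEntry)
    .filter((entry) => entry.type === "tool_call");
}

// A trace with one question, one tool call, one reasoning step:
const trace = [
  '{"type":"question","text":"How fast is revenue growing?"}',
  '{"type":"tool_call","tool":"income_statements","result":{"revenue":394328000000}}',
  '{"type":"reasoning","text":"Compare against prior year."}',
].join("\n");

console.log(toolCalls(trace).length);
```

The same shape works from the command line with `grep '"tool_call"' *.jsonl`, which is the whole appeal of files over a hosted trace UI.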

Two more things worth flagging. There’s an actual eval suite (bun run src/evals/run.ts), backed by LangSmith and LLM-as-judge scoring against a dataset of real questions, with a sample flag for quick runs. And there’s a WhatsApp gateway — bun run gateway:login, scan a QR, then DM yourself queries from the kitchen. Useful or gimmicky depending on your day.
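For readers who haven't seen LLM-as-judge evals, the mechanics are simple: render the question, a reference answer, and the agent's answer into a grading prompt, send it to a judge model, parse the score. A minimal sketch, with the rubric and field names being my assumptions rather than Dexter's actual eval code:

```typescript
// Sketch of LLM-as-judge scoring. The prompt wording and binary rubric
// are illustrative assumptions, not taken from Dexter's eval suite.
interface EvalCase {
  question: string;
  expected: string; // reference answer from the dataset
  actual: string;   // agent's answer under test
}

function judgePrompt(c: EvalCase): string {
  return [
    "You are grading a financial research agent.",
    `Question: ${c.question}`,
    `Reference answer: ${c.expected}`,
    `Agent answer: ${c.actual}`,
    "Score 1 if the agent answer is factually consistent with the reference, else 0.",
    "Respond with only the digit.",
  ].join("\n");
}

// The judge model's reply reduces to a number:
function parseScore(reply: string): number {
  return reply.trim().startsWith("1") ? 1 : 0;
}

console.log(parseScore("1"));
```

The binary rubric is the simplest possible choice; real suites usually score multiple dimensions (accuracy, completeness, citation quality) per case.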

The skeptical note is the obvious one. The agent summarizes every tool result through an LLM before reasoning over it, which is fine for narrative answers and dangerous for anything you’d put in a financial model. “Revenue grew from $274B to $394B” is a sentence; the underlying API call is the source of truth. I’d want the raw JSON in front of me before I trusted any number from a pipeline like this.
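The safer pattern is to compute figures from the raw statement JSON and let the LLM narrate around them. A sketch of what that looks like, where the field names (`period`, `revenue`) are my assumptions, not the Financial Datasets API's exact schema:

```typescript
// Sketch: derive growth from raw statement JSON rather than trusting a
// summarized sentence. Field names are illustrative, not the real schema.
interface IncomeStatement {
  period: string;
  revenue: number;
}

function revenueGrowth(statements: IncomeStatement[]): number {
  const sorted = [...statements].sort((a, b) => a.period.localeCompare(b.period));
  const first = sorted[0].revenue;
  const last = sorted[sorted.length - 1].revenue;
  return (last - first) / first;
}

// The $274B -> $394B example from the summary sentence, as raw numbers:
const stmts: IncomeStatement[] = [
  { period: "2020-FY", revenue: 274_515_000_000 },
  { period: "2022-FY", revenue: 394_328_000_000 },
];

console.log((revenueGrowth(stmts) * 100).toFixed(1) + "%"); // "43.6%"
```

The point isn't the arithmetic; it's that the number in your model traces back to an API response you can re-fetch, not to a sentence an LLM wrote about one.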

Bun runtime, MIT licensed, 24.7k stars. Bring your own OpenAI (or Anthropic, Google, xAI, OpenRouter, Ollama) and Financial Datasets API keys.
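Setup is the usual env-var dance. A guess at what it looks like; the variable names below are assumptions, so check Dexter's README or `.env.example` for the real keys:

```shell
# Illustrative variable names, not confirmed against Dexter's config.
export OPENAI_API_KEY="sk-your-key-here"              # or your provider of choice
export FINANCIAL_DATASETS_API_KEY="your-fd-key-here"  # assumed name
```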