
Data-OnCall

first place

First place. A 4-agent incident response team for data quality bugs, built around a LoRA fine-tune of Llama 3.1 for NL→GraphQL. Runs end-to-end in 54.8s at ~2¢/run.

April 10, 2026 · DataHub × Nebius Hackathon · Entrepreneur First, SF · repo ↗
Python · DataHub · Nebius · Llama 3.1 + LoRA · Multi-agent

First place, solo build.

A 4-agent system that takes a data quality alert from DataHub and produces a complete incident response: triage, root-cause investigation against lineage, fix proposal, and human-readable writeup. Each agent owns one stage and hands off via structured artifacts.

  • Coordinator (Kimi-K2-Thinking) — long-horizon planner with visible reasoning traces
  • Detective (Llama 3.1 8B + LoRA) — lineage tracer via NL→GraphQL
  • Reality-Checker (same fine-tune, different system prompt)
  • Fixer (MiniMax-M2.5) — writes the postmortem back to the catalog via Python SDK
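
A minimal sketch of the hand-off pattern described above, where each stage reads the artifacts produced so far and appends one of its own. The `Artifact` schema, the stage functions, and the mapping of agents to stage names are assumptions made for illustration; the model calls are replaced by stubs, and none of this is taken from the project's repo.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Artifact:
    """Structured hand-off between stages (hypothetical schema)."""
    stage: str
    summary: str
    evidence: dict = field(default_factory=dict)


# A stage maps the artifacts produced so far to one new artifact.
Stage = Callable[[list[Artifact]], Artifact]


def coordinator(history: list[Artifact]) -> Artifact:
    alert = history[0]
    return Artifact("triage", f"plan investigation for: {alert.summary}")


def detective(history: list[Artifact]) -> Artifact:
    # Stub standing in for the NL->GraphQL lineage lookup.
    return Artifact("investigation", "suspect upstream table",
                    {"upstream": ["raw.sellers"]})


def reality_checker(history: list[Artifact]) -> Artifact:
    # Accept the previous finding only if it carries supporting evidence.
    finding = history[-1]
    confirmed = bool(finding.evidence.get("upstream"))
    summary = "finding confirmed" if confirmed else "finding rejected"
    return Artifact("verification", summary, {"confirmed": confirmed})


def fixer(history: list[Artifact]) -> Artifact:
    # Fold every stage summary (skipping the raw alert) into one writeup.
    return Artifact("postmortem", "; ".join(a.summary for a in history[1:]))


def run_pipeline(alert: Artifact, stages: list[Stage]) -> list[Artifact]:
    history = [alert]
    for stage in stages:
        history.append(stage(history))
    return history


history = run_pipeline(
    Artifact("alert", "row count drop in sellers"),
    [coordinator, detective, reality_checker, fixer],
)
```

Passing the full artifact history rather than only the previous output lets a later stage, such as the writeup, cite every earlier step without re-querying anything.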

Fine-tuned the LoRA adapter myself on 300 synthetic NL→GraphQL pairs targeting a narrow set of DataHub query patterns. Validation loss dropped 34% over 3 epochs, decreasing monotonically with no sign of overfitting.
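
For a sense of what one synthetic NL→GraphQL pair might look like on disk, here is a sketch that serializes a single example as a JSONL line and runs a cheap bracket-balance filter over the target query. The prompt/completion field names, the example question, and the query itself are hypothetical; the query is illustrative only and has not been checked against the DataHub GraphQL schema.

```python
import json

# Hypothetical training pair; the actual prompts and queries are not shown in the writeup.
URN = "urn:li:dataset:(urn:li:dataPlatform:postgres,shop.sellers,PROD)"
pair = {
    "prompt": f"What is the name of the dataset {URN}?",
    "completion": f'query {{ dataset(urn: "{URN}") {{ urn name }} }}',
}


def braces_balanced(query: str) -> bool:
    """Cheap sanity filter: '{' and '(' close in order, ignoring string literals.

    Escaped quotes inside strings are not handled; this is a filter, not a parser.
    """
    closers = {"}": "{", ")": "("}
    stack = []
    in_string = False
    for ch in query:
        if ch == '"':
            in_string = not in_string
        elif in_string:
            continue
        elif ch in "{(":
            stack.append(ch)
        elif ch in closers:
            if not stack or stack.pop() != closers[ch]:
                return False
    return not stack and not in_string


# One JSON object per line is the usual JSONL layout for fine-tuning data.
line = json.dumps(pair)
restored = json.loads(line)
```

A filter like this only catches malformed synthetic targets before training; validating against the real schema would need the GraphQL endpoint itself.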

End-to-end in 54.8 seconds at ~2¢ per run. Found three planted bugs by exact row count: 5,632 truncated seller IDs, 7,955 deleted customers, 988 NULL categories.
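
The three bug classes above each reduce to a row-count query. The sketch below shows the idea on a toy in-memory SQLite database; the table names, column names, and the fixed seller-ID length are assumptions, and "deleted customers" is interpreted here as orders that reference a customer row that no longer exists. The real dataset and queries are not shown in this writeup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id TEXT PRIMARY KEY);
CREATE TABLE orders (order_id TEXT, customer_id TEXT, seller_id TEXT);
CREATE TABLE products (product_id TEXT, category TEXT);

INSERT INTO customers VALUES ('c1'), ('c2');
INSERT INTO orders VALUES
    ('o1', 'c1', 'seller-0001'),
    ('o2', 'c2', 'seller-0'),      -- truncated seller ID
    ('o3', 'c9', 'seller-0002');   -- customer c9 no longer exists
INSERT INTO products VALUES ('p1', 'toys'), ('p2', NULL);
""")

EXPECTED_SELLER_ID_LENGTH = 11  # assumption: seller IDs are fixed-width

# One COUNT(*) query per bug class, with its bound parameters.
checks = {
    "truncated_seller_ids": (
        "SELECT COUNT(*) FROM orders WHERE LENGTH(seller_id) < ?",
        (EXPECTED_SELLER_ID_LENGTH,),
    ),
    "deleted_customers": (
        "SELECT COUNT(*) FROM orders o "
        "LEFT JOIN customers c ON c.customer_id = o.customer_id "
        "WHERE c.customer_id IS NULL",
        (),
    ),
    "null_categories": (
        "SELECT COUNT(*) FROM products WHERE category IS NULL",
        (),
    ),
}

counts = {
    name: conn.execute(sql, params).fetchone()[0]
    for name, (sql, params) in checks.items()
}
```

On this toy data each check counts exactly one offending row. Comparing counts like these against the known planted totals is what makes an "exact row count" claim checkable.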

The motivating problem was real: Elias (my personal agent on OpenClaw) had been telling guests that one of my properties had a hot tub. It doesn’t. The hallucination had propagated across LanceDB / Postgres / Qdrant / Gemini embeddings, and I had no traceability. DataHub gave me the metadata catalog + lineage graph I needed to root-cause it.

Solve your own problem — it’s almost always someone else’s problem too.