The Agentic Wire
Archive

Search

Every story we've curated, in one place. Type a phrase, a tool name, or a researcher's name. Put a phrase in quotes to match it exactly; start a term with - to exclude it.

3 matches for evals
  1. Industry

    AI evals are becoming the new compute bottleneck (19 minute read)

    AI evaluation is emerging as a serious compute bottleneck, with some benchmark runs now rivaling training costs. The piece is useful for builders because it quantifies where eval spend concentrates and argues for better…

  2. Industry

    The Sequence Opinion #860: Every Company's Last eXam: Some Reflection About Practical AI Evals

    Some ideas about how companies should think about evaluations.

  3. Industry

    OpenAI's Database Change Analysis (28 minute read)

    SchemaFlow demonstrates an AI-assisted workflow for database change requests, covering structured request parsing, impact analysis, SQL generation, guardrails, artifact creation, and evals. The cookbook used a retail…