Structured Outputs Are Now GA on the Claude API

Claude Certification Guide · 8 min read
Prompt Engineering & Structured Output

Anthropic has moved structured outputs from beta to general availability. If you've been passing the anthropic-beta: output-128k-2025-02-19 header to get guaranteed JSON schema conformance, you can drop it. Structured outputs are now a first-class API feature with no beta flag required.

Note: The exam tests structured output patterns (Domain 4) — the move to GA status doesn't change the concepts tested, but it means you no longer need to pass the beta header in production.

What changed from beta to GA

Three things:

  1. No more beta header. The anthropic-beta header is no longer required. Structured output parameters work directly in the standard API.

  2. Expanded schema support. The GA release supports a broader set of JSON Schema features including anyOf, oneOf, nested $ref definitions, and more complex array item schemas. During beta, some of these would silently fall back to best-effort mode. Now they're fully enforced.

  3. Improved grammar compilation latency. When you send a JSON schema, the API compiles it into a constrained grammar that guides token generation. In beta, complex schemas could add noticeable latency to the first request. The GA release has faster compilation, particularly for schemas with deep nesting or many properties.

How structured outputs work

Structured outputs guarantee that Claude's response conforms to a JSON schema you provide. This isn't prompt-based — it's not "please output JSON." The API constrains token generation at the decoding level so the output cannot violate your schema. Every key, every type, every required field is enforced.
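The decoding-level constraint can be illustrated in miniature. This is not how the API implements it internally, just the idea: at each step, any continuation that could no longer complete into a schema-valid value is masked out before sampling. A toy sketch over a single enum field:

```python
# Toy constrained decoder over one enum field. The "grammar" here is just
# the four allowed sentiment values; a real grammar covers the full schema.
ALLOWED = ["positive", "negative", "neutral", "mixed"]

def valid_next_chars(prefix: str) -> set[str]:
    """Characters that keep the partial output on a path to a valid value."""
    return {v[len(prefix)] for v in ALLOWED
            if v.startswith(prefix) and len(v) > len(prefix)}

# After emitting "ne", only 'g' (negative) and 'u' (neutral) remain legal,
# so the decoder literally cannot produce an out-of-enum value.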

There are two ways to use this.

Direct JSON schema enforcement

Pass a response_format parameter with your schema:

python
import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-sonnet-4-6-20250514",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": "Analyse this customer review and extract sentiment, key topics, and urgency level: 'The product arrived damaged and support hasn't responded in 3 days. Very frustrated.'"
        }
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "review_analysis",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "sentiment": {
                        "type": "string",
                        "enum": ["positive", "negative", "neutral", "mixed"]
                    },
                    "topics": {
                        "type": "array",
                        "items": {"type": "string"}
                    },
                    "urgency": {
                        "type": "string",
                        "enum": ["low", "medium", "high", "critical"]
                    },
                    "summary": {
                        "type": "string"
                    }
                },
                "required": ["sentiment", "topics", "urgency", "summary"],
                "additionalProperties": False
            }
        }
    }
)

The response is guaranteed to be valid JSON matching that schema. You'll always get a sentiment from the enum, an array of topics, an urgency level, and a summary. No parsing failures, no missing fields, no creative reinterpretation of your structure.
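Because conformance is guaranteed, the consuming code can skip the usual defensive parsing. A minimal sketch, with a hardcoded payload standing in for the text block a real response would carry:

```python
import json

# Hypothetical payload standing in for response.content[0].text
raw = ('{"sentiment": "negative", '
       '"topics": ["shipping damage", "support responsiveness"], '
       '"urgency": "high", '
       '"summary": "Damaged product, no support reply in 3 days."}')

# Guaranteed to parse: schema conformance means the text is valid JSON
analysis = json.loads(raw)

# Enum fields can be branched on directly, no defensive .get() needed
needs_escalation = analysis["urgency"] in ("high", "critical")
```

No try/except around the parse, no fallback for missing keys: the schema already ruled those failure modes out.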

Structured outputs via tool_use

The second approach uses tool definitions with input_schema. When Claude calls a tool, the input it generates is already constrained by the schema you defined in the tool's input_schema. This has been the standard pattern for structured extraction since the tool_use feature launched.

python
response = client.messages.create(
    model="claude-sonnet-4-6-20250514",
    max_tokens=1024,
    tools=[
        {
            "name": "extract_order_details",
            "description": "Extract structured order information from customer messages.",
            "input_schema": {
                "type": "object",
                "properties": {
                    "order_id": {
                        "type": "string",
                        "description": "Order number in format ORD-XXXXX"
                    },
                    "issue_type": {
                        "type": "string",
                        "enum": ["refund", "exchange", "tracking", "damage", "other"]
                    },
                    "amount": {
                        "type": "number",
                        "description": "Order amount if mentioned"
                    },
                    "requires_escalation": {
                        "type": "boolean"
                    }
                },
                "required": ["order_id", "issue_type", "requires_escalation"]
            }
        }
    ],
    tool_choice={"type": "tool", "name": "extract_order_details"},
    messages=[
        {
            "role": "user",
            "content": "I need a refund for order ORD-44821, it was $129.99 and arrived completely smashed."
        }
    ]
)

Using tool_choice to force a specific tool turns the tool call into a structured extraction step. The model must call the tool, and the input must conform to the schema. This gives you the same guarantee as response_format but routes through the tool_use mechanism.
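Pulling the structured data back out is then a matter of finding the tool_use block in the response content. A sketch using a simple stand-in for the SDK's content-block objects (a real `response` would come from the `client.messages.create` call above):

```python
from types import SimpleNamespace

# Stand-in for response.content with hypothetical extracted values
content = [
    SimpleNamespace(
        type="tool_use",
        name="extract_order_details",
        input={"order_id": "ORD-44821", "issue_type": "refund",
               "amount": 129.99, "requires_escalation": True},
    ),
]

# With a forced tool_choice there is exactly one tool_use block
tool_block = next(b for b in content if b.type == "tool_use")
details = tool_block.input  # already a parsed dict, not a JSON string
```

Note that the tool input arrives as a parsed dict on the block, so there is nothing to `json.loads`.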

Which approach to use

Use response_format when you want Claude's direct answer in a structured format. Use tool_use with forced tool choice when you're building the extraction into an agentic workflow where the structured data feeds into subsequent tool calls or agent steps.

In practice, tool_use is more common in production systems because it integrates naturally with the agentic tool loop. response_format is cleaner for standalone extraction tasks where you just need structured data back.

Validation still matters

Structured outputs guarantee schema conformance — the JSON will be valid and match your types. They don't guarantee semantic correctness. If you ask Claude to extract an order number and it hallucinates one that looks plausible but doesn't exist, the output will still conform to your schema. It'll be a valid string in the right format. Just wrong.

Always validate extracted values against your source data:

python
# The tool_use block's input is already a parsed dict; no json.loads needed
tool_block = next(b for b in response.content if b.type == "tool_use")
tool_input = tool_block.input

# Schema conformance is guaranteed — but verify the data
order = db.lookup_order(tool_input["order_id"])
if not order:
    # The model extracted a plausible but incorrect order ID
    handle_extraction_error(tool_input)

This is a critical distinction the exam tests. Schema enforcement and semantic validation are separate concerns. Structured outputs handle the first. Your application code handles the second.
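One way to keep the two layers separate in code is a dedicated semantic-check pass that runs after extraction and only covers what a JSON schema cannot express. A sketch with hypothetical business rules (`known_orders` stands in for a database lookup):

```python
def semantic_checks(extracted: dict, known_orders: set[str]) -> list[str]:
    """Return a list of semantic problems; empty means the data passed.

    Schema conformance is guaranteed upstream, so these checks only
    cover rules a JSON schema cannot encode.
    """
    problems = []
    # Format is schema-checkable, but existence is not
    if extracted["order_id"] not in known_orders:
        problems.append(f"unknown order id: {extracted['order_id']}")
    # Cross-field rule (hypothetical policy): high-value damage claims
    # must be flagged for escalation
    if (extracted["issue_type"] == "damage"
            and extracted.get("amount", 0) > 100
            and not extracted["requires_escalation"]):
        problems.append("high-value damage claim not flagged for escalation")
    return problems
```

The split keeps failure handling honest: a schema violation would be an API-level bug, while a non-empty problems list is expected application behaviour.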

What this means for the exam

Domain 4 tests structured output patterns — specifically how to use JSON schemas with the API, when to use tool_use for structured extraction, and how to validate outputs. The move from beta to GA doesn't change any of these concepts. The patterns are identical; you just don't need the beta header any more.

The exam-relevant concepts are:

  • Schema design: choosing the right types, enums for constrained values, required vs optional fields
  • tool_use as structured extraction: forcing tool calls with tool_choice to guarantee structured output
  • Validation layering: understanding that schema conformance and semantic correctness are different things
  • When structured outputs break down: very large schemas with deep nesting can increase latency; extremely open-ended tasks don't benefit from rigid schemas
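The schema-design bullet above is worth making concrete. A minimal sketch of the trade-offs, using a hypothetical ticket-triage schema: enums for closed value sets, required for fields downstream code always reads, optional for data that is genuinely sometimes absent:

```python
# Hypothetical ticket-triage schema illustrating the design choices
ticket_schema = {
    "type": "object",
    "properties": {
        # Closed set of values -> enum, never a free-form string
        "category": {"type": "string",
                     "enum": ["billing", "technical", "account", "other"]},
        # Always consumed downstream -> listed in "required"
        "priority": {"type": "integer", "minimum": 1, "maximum": 5},
        # Genuinely sometimes absent -> left optional
        "related_ticket_id": {"type": "string"},
    },
    "required": ["category", "priority"],
    "additionalProperties": False,
}
```

Marking a sometimes-absent field as required forces the model to invent a value for it, so "required" should track what the data actually guarantees, not what the code would find convenient.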

Work through Domain 4 — Prompt Engineering & Structured Output for the full treatment, or jump to Task 4.4 — Structured Output for the specific task statement.
