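Structured Output

Force the model to return structured data that strictly conforms to a JSON Schema, so you do not have to parse free-form strings or extract fields with regular expressions. Well suited to data extraction, form filling, and intermediate artifacts in workflows.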

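When to use

You need JSON that can be deserialized directly (a json.loads failure is not acceptable)
The downstream system has a strict schema (database columns, API payloads)
Data extraction / entity recognition / classification
The output is an Agent decision (the model must emit a specific action object)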

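How it differs from tool calling:

Aspect	Structured output	Tool calling
Purpose	Make the final answer JSON	Have the model call your function
Round trips	A single request is enough	At least two rounds (call → return the result)
Typical scenarios	Data extraction, field filling	Agents, external system integration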

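Three implementation approaches

Each vendor takes a different route. Turing passes all of them through over both protocols, Chat Completions and Messages.

Approach A: `response_format` JSON Mode (the earliest OpenAI version)

Guarantees that the output is valid JSON but does not enforce a specific schema, so the model may still return a structure you did not expect.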

import json

from openai import OpenAI

client = OpenAI(
    api_key="your-api-key",
    base_url="https://live-turing.cn.llm.tcljd.com/api/v1",
)

response = client.chat.completions.create(
    model="turing/gpt-5.4-mini",
    messages=[
        {"role": "system", "content": "You are a JSON output assistant. Return strict JSON."},
        {"role": "user", "content": "Extract: Zhang San, 28 years old, engineer"},
    ],
    response_format={"type": "json_object"},
)

# The reply is valid JSON, but its keys are not guaranteed.
data = json.loads(response.choices[0].message.content)

Note

When using json_object, the prompt must mention "JSON"; otherwise OpenAI rejects the request.
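Because JSON Mode only guarantees syntactically valid JSON, it can help to check the parsed object before handing it downstream. A minimal sketch, assuming the three-field person record used on this page; `EXPECTED_FIELDS` and `check_person` are hypothetical names, not part of any SDK:

```python
import json

# Hypothetical expectation for the person record used in the examples.
EXPECTED_FIELDS = {"name": str, "age": int, "occupation": str}


def check_person(raw: str) -> dict:
    """Parse a JSON Mode reply and verify field names and types."""
    data = json.loads(raw)  # raises json.JSONDecodeError on invalid JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    for field, expected_type in EXPECTED_FIELDS.items():
        if field not in data:
            raise ValueError(f"missing field: {field}")
        value = data[field]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected_type):
            raise ValueError(f"wrong type for field: {field}")
    return data
```

In the JSON Mode example this would replace the bare `json.loads` call, for instance `data = check_person(response.choices[0].message.content)`.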

Approach B: `response_format` with JSON Schema (recommended)

Forces the output to conform strictly to the schema. Supported by OpenAI gpt-4o / gpt-5.x.

response = client.chat.completions.create(
    model="turing/gpt-5.4-mini",
    messages=[
        {"role": "user", "content": "Extract: Zhang San, 28 years old, engineer"},
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "person_info",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "age": {"type": "integer"},
                    "occupation": {"type": "string"}
                },
                "required": ["name", "age", "occupation"],
                "additionalProperties": False
            }
        }
    },
)

# With strict mode, parsing succeeds and every required field is present
data = json.loads(response.choices[0].message.content)

curl $TURING_BASE_URL/chat/completions \
  -H "Authorization: Bearer $TURING_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "turing/gpt-5.4-mini",
    "messages": [{"role": "user", "content": "Extract: Zhang San, 28 years old, engineer"}],
    "response_format": {
      "type": "json_schema",
      "json_schema": {
        "name": "person_info",
        "strict": true,
        "schema": {
          "type": "object",
          "properties": {
            "name": {"type": "string"},
            "age": {"type": "integer"},
            "occupation": {"type": "string"}
          },
          "required": ["name", "age", "occupation"],
          "additionalProperties": false
        }
      }
    }
  }'
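The strict-mode payload above follows a regular shape, so it can be generated from a flat field list. A sketch under that assumption; `strict_response_format` is a hypothetical helper and only handles flat objects with primitive JSON Schema types:

```python
def strict_response_format(name: str, fields: dict[str, str]) -> dict:
    """Build a response_format payload for a flat object schema.

    `fields` maps property names to JSON Schema type names,
    for example {"age": "integer"}.
    """
    return {
        "type": "json_schema",
        "json_schema": {
            "name": name,
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {k: {"type": t} for k, t in fields.items()},
                # Mirror the example above: every property is required
                # and no extra keys are allowed.
                "required": list(fields),
                "additionalProperties": False,
            },
        },
    }
```

The result can be passed straight through, for example `response_format=strict_response_format("person_info", {"name": "string", "age": "integer", "occupation": "string"})`.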

Approach C: Gemini native `responseSchema`

The Gemini family uses two fields, responseMimeType and responseSchema.

response = client.chat.completions.create(
    model="turing/gemini-3.1-pro-latest",
    messages=[{"role": "user", "content": "Extract: Zhang San, 28 years old, engineer"}],
    extra_body={
        "responseMimeType": "application/json",
        "responseSchema": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "age": {"type": "integer"},
                "occupation": {"type": "string"}
            },
            "required": ["name", "age", "occupation"]
        }
    }
)

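Claude: emulate with tool calling

Claude does not natively support response_format. The most reliable approach is to treat a tool call as a one-shot structured output: define a tool, then force the model to call it: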

from anthropic import Anthropic

client = Anthropic(
    base_url="https://live-turing.cn.llm.tcljd.com/api/v1",
    auth_token="your-api-key",
)

response = client.messages.create(
    model="turing/claude-sonnet-5",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Extract: Zhang San, 28 years old, engineer"}],
    tools=[{
        "name": "emit_person_info",
        "description": "Return the extracted person information",
        "input_schema": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "age": {"type": "integer"},
                "occupation": {"type": "string"}
            },
            "required": ["name", "age", "occupation"]
        }
    }],
    tool_choice={"type": "tool", "name": "emit_person_info"},
)

# Read the structured data from the tool_use block
data = next(b.input for b in response.content if b.type == "tool_use")
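The one-liner above raises StopIteration if the reply contains no tool_use block. A slightly more defensive sketch, which only assumes that content blocks expose `type`, `name`, and `input` attributes as in the example above; `extract_tool_input` is a hypothetical helper, and the `SimpleNamespace` objects merely stand in for a real response:

```python
from types import SimpleNamespace


def extract_tool_input(content, tool_name: str) -> dict:
    """Return the input of the first tool_use block for `tool_name`."""
    for block in content:
        is_tool_use = getattr(block, "type", None) == "tool_use"
        if is_tool_use and getattr(block, "name", None) == tool_name:
            return block.input
    raise ValueError(f"no tool_use block found for tool {tool_name!r}")


# Stand-in objects that imitate the shape of response.content.
fake_content = [
    SimpleNamespace(type="text", text="Calling the tool."),
    SimpleNamespace(
        type="tool_use",
        name="emit_person_info",
        input={"name": "Zhang San", "age": 28, "occupation": "engineer"},
    ),
]
person = extract_tool_input(fake_content, "emit_person_info")
```

With a real reply the call would be `extract_tool_input(response.content, "emit_person_info")`.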

Vendor differences at a glance

Vendor	JSON Mode	JSON Schema	Native fields	Model selection
OpenAI	✅ `response_format: {"type":"json_object"}`	✅ `response_format: {"type":"json_schema","json_schema":{…,"strict":true}}`	-	See Model list → OpenAI; prefer models that support JSON Schema / strict
Anthropic	❌ Emulate with tool calling	❌ Use tool calling + `tool_choice: {"type":"tool",…}`	-	See Model list → Claude; prefer large models with reliable tool calling
Gemini	✅ `responseMimeType: "application/json"`	✅ `responseSchema` + `responseMimeType`	`responseMimeType` / `responseSchema`	See Model list → Gemini; prefer models that support structured output
Qwen	✅ `response_format: {"type":"json_object"}`	⚠ Model-dependent	-	See Model list → Alibaba; prefer flagship models with more stable JSON output
DeepSeek	✅	⚠ Model-dependent	-	See Model list → DeepSeek
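The table can be turned into a small dispatcher that picks request parameters per vendor. A rough sketch: `structured_output_kwargs` and the vendor keys are hypothetical, the Gemini branch passes the native fields through `extra_body` as in the Gemini example above, and the Qwen / DeepSeek branch assumes the `json_object` form of JSON Mode because their JSON Schema support depends on the model:

```python
def structured_output_kwargs(vendor: str, name: str, schema: dict) -> dict:
    """Return extra keyword arguments for a structured-output request."""
    if vendor == "openai":
        return {
            "response_format": {
                "type": "json_schema",
                "json_schema": {"name": name, "strict": True, "schema": schema},
            }
        }
    if vendor == "gemini":
        return {
            "extra_body": {
                "responseMimeType": "application/json",
                "responseSchema": schema,
            }
        }
    if vendor == "anthropic":
        # Messages API: define one tool and force the model to call it.
        return {
            "tools": [
                {"name": name, "description": f"Return {name}", "input_schema": schema}
            ],
            "tool_choice": {"type": "tool", "name": name},
        }
    if vendor in ("qwen", "deepseek"):
        # JSON Mode only: validate the result against the schema yourself.
        return {"response_format": {"type": "json_object"}}
    raise ValueError(f"unknown vendor: {vendor}")
```

The returned dictionary is meant to be unpacked into the request call, for example `client.chat.completions.create(model=..., messages=..., **structured_output_kwargs("openai", "person_info", schema))`.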

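Parameter reference

For the full schema, see:

OpenAI / cross-vendor: response_format in /api/create-chat-completion
Gemini native: see Approach C (Gemini native `responseSchema`) above
Claude tool method: tools + tool_choice in /api/create-message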

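Billing impact

The model has to generate the entire JSON structure, so completion_tokens includes every field name and punctuation character
The schema definition itself (parameters / json_schema) also counts toward input tokens
With OpenAI strict: true, the first call spends a little extra time compiling the schema (it is then cached on the model serving side)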
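To get a feel for the first point, one can compare the size of a full JSON reply with the size of its values alone. This sketch counts characters, not tokens (real token counts depend on each model's tokenizer), and `structural_overhead` is a hypothetical helper:

```python
import json


def structural_overhead(record: dict) -> tuple[int, int]:
    """Return (total_chars, value_chars) for a compact JSON encoding."""
    total = len(json.dumps(record, ensure_ascii=False, separators=(",", ":")))
    # Only the values carry information the caller did not already know.
    values = sum(len(str(v)) for v in record.values())
    return total, values


total, values = structural_overhead(
    {"name": "Zhang San", "age": 28, "occupation": "engineer"}
)
```

The gap between `total` and `values` is spent on field names, quotes, and separators, which is why short field names can reduce completion size on high-volume extraction jobs.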

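FAQ

strict: true reports an unsupported schema → OpenAI strict mode requires additionalProperties: false and does not support oneOf/anyOf at the root level; see the list of restrictions in the OpenAI Structured Outputs docs
Gemini returns JSON but the field order is scrambled → This is expected; a JSON object is unordered by definition, so read fields by key
Claude called the tool but the argument fields are empty → Add more required entries to input_schema and write richer descriptions
A small model produces invalid JSON → Upgrade to a larger model, or combine json_object with a second validation pass using regular expressions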
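For the last item, a follow-up check might look like the following. A minimal sketch, assuming the model wrapped an otherwise valid JSON object in extra text; `parse_json_lenient` is a hypothetical helper, and the regular expression simply grabs the span from the first `{` to the last `}`:

```python
import json
import re


def parse_json_lenient(text: str):
    """Parse a model reply as JSON, tolerating surrounding non-JSON text."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass
    # Greedy match from the first "{" to the last "}" across newlines.
    match = re.search(r"\{.*\}", text, flags=re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in reply")
    return json.loads(match.group(0))  # may still raise JSONDecodeError
```

This only rescues replies whose JSON is intact but surrounded by extra text; truncated or malformed objects still raise, at which point retrying the request or switching to a larger model is the remaining option.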
