
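3.3 Structured Output

Getting an LLM to produce reliable, parseable structured data, using JSON Schema, Pydantic validation, and the Instructor library, is a foundational skill for building AI applications.

Study time: 1-2 weeks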


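Why structured output?

By default, an LLM produces free text:

  User: "Extract the person's name and company from this text"
  Model: "The text mentions Zhang San, who works at Alibaba..."
  Problem: code cannot parse this directly ❌

With structured output:

  User: "Extract the person's name and company from this text"
  Model: {"name": "Zhang San", "company": "Alibaba"}
  Result: ready to use after json.loads() ✅

Core requirement: downstream code must be able to consume the AI's output reliably.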
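
To make the contrast concrete, here is a minimal sketch using only the standard library. The two reply strings are made-up stand-ins for model output, not real API responses, and `try_parse` is a hypothetical helper.

```python
import json

# Hypothetical model replies, hard-coded for illustration
free_text_reply = "The text mentions Zhang San, who works at Alibaba..."
structured_reply = '{"name": "Zhang San", "company": "Alibaba"}'

def try_parse(reply: str):
    """Return the decoded JSON value, or None if the reply is not valid JSON."""
    try:
        return json.loads(reply)
    except json.JSONDecodeError:
        return None

parsed_free = try_parse(free_text_reply)          # prose is not JSON
parsed_structured = try_parse(structured_reply)   # a dict the code can use directly
print(parsed_free, parsed_structured["company"])
```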


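Approach 1: OpenAI JSON Mode

python
import json

from openai import OpenAI

client = OpenAI()

# Basic JSON mode
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        # JSON mode requires the word "JSON" to appear somewhere in the messages
        {"role": "system", "content": "You are an information extraction assistant. Respond in JSON."},
        {"role": "user", "content": "Extract the person's name and company from this text: Zhang San is a senior engineer at Alibaba"}
    ],
    response_format={"type": "json_object"}  # force the output to be valid JSON
)

data = json.loads(response.choices[0].message.content)
# e.g. {"name": "Zhang San", "company": "Alibaba"}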

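Limitation: JSON mode only guarantees that the output is valid JSON. It does not guarantee that the output matches your schema, so key names and types can still vary.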
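
This limitation is easy to reproduce offline. In the sketch below, the reply string is a made-up example of what JSON mode might return, and `missing_keys` is a hypothetical helper, not part of any SDK. The reply parses without error, yet it uses different key names than the code expects.

```python
import json

REQUIRED_KEYS = {"name", "company"}  # the shape the downstream code expects

def missing_keys(reply: str) -> set[str]:
    """Parse a JSON reply and report which required keys are absent."""
    data = json.loads(reply)  # succeeds for any valid JSON
    if not isinstance(data, dict):
        return set(REQUIRED_KEYS)
    return REQUIRED_KEYS - set(data)

# Valid JSON, but the model picked its own key names
reply = '{"person": "Zhang San", "employer": "Alibaba"}'
problems = missing_keys(reply)
print(sorted(problems))
```

A check like this has to be written by hand for every schema, which is the gap the next two approaches close.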


Approach 2: OpenAI Structured Outputs (recommended) ⭐

python
from pydantic import BaseModel
from openai import OpenAI

client = OpenAI()

# 1. Define the expected output structure with Pydantic
class PersonInfo(BaseModel):
    name: str
    company: str
    role: str
    confidence: float  # confidence score, 0-1

class ExtractionResult(BaseModel):
    persons: list[PersonInfo]
    summary: str

# 2. Call parse(), which validates the output against the model
completion = client.beta.chat.completions.parse(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "Extract information about the people in the text"},
        {"role": "user", "content": "Zhang San is a P8 engineer at Alibaba; Li Si is a product manager at Tencent"}
    ],
    response_format=ExtractionResult  # the Pydantic model serves as the schema
)

result = completion.choices[0].message.parsed
# result is an ExtractionResult, so IDE autocompletion works
print(result.persons[0].name)      # e.g. "Zhang San"
print(result.persons[0].company)   # e.g. "Alibaba"
print(result.summary)              # e.g. "The text mentions two people working in the internet industry..."

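Advantages: the response is constrained to the schema derived from the Pydantic model, which gives you type safety and IDE completion. The exceptions are a model refusal and a response cut off by the token limit, so check for both before using the parsed result.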
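
A refusal or a truncated response leaves no usable parsed object, so a small guard helps. The following is a minimal sketch: `unwrap_parsed` and `StructuredOutputError` are hypothetical names introduced here. The `refusal`, `parsed`, and `finish_reason` fields are the ones the OpenAI Python SDK exposes, but treat the exact handling as an assumption to check against the SDK version in use.

```python
from types import SimpleNamespace

class StructuredOutputError(RuntimeError):
    """Hypothetical error type for responses that carry no usable parsed object."""

def unwrap_parsed(choice):
    """Return choice.message.parsed, raising if the model refused or was truncated."""
    if choice.finish_reason == "length":
        raise StructuredOutputError("response was truncated before the JSON was complete")
    if getattr(choice.message, "refusal", None):
        raise StructuredOutputError(f"model refused: {choice.message.refusal}")
    if choice.message.parsed is None:
        raise StructuredOutputError("no parsed object in the response")
    return choice.message.parsed

# Stand-in object shaped like a completion choice, so the sketch runs without an API call
fake_choice = SimpleNamespace(
    finish_reason="stop",
    message=SimpleNamespace(refusal=None, parsed={"persons": [], "summary": "none"}),
)
print(unwrap_parsed(fake_choice))
```

With a real response, the argument would be `completion.choices[0]`.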


Approach 3: The Instructor library (works across models) ⭐

python
# pip install instructor
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field

# 1. Patch the OpenAI client
client = instructor.from_openai(OpenAI())

# 2. Define the output model, including validation rules
class MovieReview(BaseModel):
    title: str = Field(description="Movie title")
    rating: float = Field(ge=0, le=10, description="Rating from 0 to 10")
    pros: list[str] = Field(description="List of strengths")
    cons: list[str] = Field(description="List of weaknesses")
    recommend: bool = Field(description="Whether the movie is recommended")

# 3. Call the API; Instructor re-asks when validation fails, up to max_retries times
review = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "user", "content": "Review the movie The Wandering Earth 2"}
    ],
    response_model=MovieReview,  # the output model to validate against
    max_retries=3  # retry automatically when validation fails
)

print(review.title)      # e.g. "The Wandering Earth 2"
print(review.rating)     # e.g. 8.5
print(review.recommend)  # e.g. True
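
Conceptually, the retry behaviour is a validate-and-re-ask loop. The sketch below illustrates that idea only; it is not Instructor's implementation. `generate` is a hypothetical callable standing in for the model call, and `validate_review` is a hand-written check standing in for Pydantic validation.

```python
import json

def validate_review(raw: str) -> dict:
    """Stand-in validator: require a JSON object with a rating between 0 and 10."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    rating = data.get("rating")
    if not isinstance(rating, (int, float)) or not 0 <= rating <= 10:
        raise ValueError(f"rating must be between 0 and 10, got {rating!r}")
    return data

def create_with_retries(generate, max_retries: int = 3) -> dict:
    """Call generate(feedback) until the output validates or attempts run out."""
    feedback = None
    for _ in range(max_retries):
        raw = generate(feedback)
        try:
            return validate_review(raw)
        except ValueError as exc:  # json.JSONDecodeError is a ValueError subclass
            feedback = str(exc)    # the error text is passed to the next attempt
    raise RuntimeError(f"no valid output after {max_retries} attempts: {feedback}")

# Fake model: the first answer is out of range, the corrected one is valid
attempts = []
def fake_generate(feedback):
    attempts.append(feedback)
    return '{"rating": 85}' if feedback is None else '{"rating": 8.5}'

review = create_with_retries(fake_generate)
print(review, len(attempts))
```

Feeding the validation error back into the next attempt is what lets the model correct itself instead of repeating the same mistake.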

Instructor supports multiple model providers

python
import instructor

# Anthropic
import anthropic
client = instructor.from_anthropic(anthropic.Anthropic())

# Ollama (local models), through its OpenAI-compatible endpoint
from openai import OpenAI
client = instructor.from_openai(
    OpenAI(base_url="http://localhost:11434/v1", api_key="ollama"),
    mode=instructor.Mode.JSON
)

# DeepSeek
client = instructor.from_openai(
    OpenAI(base_url="https://api.deepseek.com", api_key="sk-xxx")
)

Advanced techniques

1. Enum constraints

python
from enum import Enum

from pydantic import BaseModel, Field

class Sentiment(str, Enum):
    POSITIVE = "positive"
    NEGATIVE = "negative"
    NEUTRAL = "neutral"

class SentimentResult(BaseModel):
    sentiment: Sentiment  # the output must be one of these three values
    confidence: float = Field(ge=0, le=1)
    reason: str
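
The enum also enforces the constraint on the Python side: constructing it from a string outside the declared set raises an error. A small standard-library sketch follows; the `parse_sentiment` helper is hypothetical, and the class is repeated so the block runs on its own.

```python
from enum import Enum
from typing import Optional

class Sentiment(str, Enum):
    POSITIVE = "positive"
    NEGATIVE = "negative"
    NEUTRAL = "neutral"

def parse_sentiment(label: str) -> Optional[Sentiment]:
    """Map a raw label to a Sentiment member, or None if it is not a declared value."""
    try:
        return Sentiment(label)
    except ValueError:
        return None

ok = parse_sentiment("positive")   # a declared value
bad = parse_sentiment("happy")     # not in the enum, so rejected
print(ok, bad)
```

Because `Sentiment` also subclasses `str`, a member compares equal to its plain string value, which keeps downstream comparisons simple.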

2. Nested structures

python
from pydantic import BaseModel, Field

class Address(BaseModel):
    city: str
    district: str
    street: str

class Company(BaseModel):
    name: str
    industry: str
    address: Address  # nested object

class Person(BaseModel):
    name: str
    age: int = Field(ge=0, le=150)
    companies: list[Company]  # nested array
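
To see how nested models validate, the sketch below feeds a hand-written dictionary (a stand-in for model output) through Pydantic v2's `model_validate`. The models are repeated so the block is self-contained, and the sample data is invented.

```python
from pydantic import BaseModel, Field, ValidationError

class Address(BaseModel):
    city: str
    district: str
    street: str

class Company(BaseModel):
    name: str
    industry: str
    address: Address

class Person(BaseModel):
    name: str
    age: int = Field(ge=0, le=150)
    companies: list[Company]

raw = {
    "name": "Zhang San",
    "age": 35,
    "companies": [
        {
            "name": "Alibaba",
            "industry": "e-commerce",
            "address": {"city": "Hangzhou", "district": "Yuhang", "street": "Wenyi West Road"},
        }
    ],
}
person = Person.model_validate(raw)
city = person.companies[0].address.city

# Break the innermost object: the error location includes the full path to the field
raw["companies"][0]["address"] = {"city": "Hangzhou"}
try:
    Person.model_validate(raw)
    error_paths = []
except ValidationError as exc:
    error_paths = [err["loc"] for err in exc.errors()]
print(city, error_paths)
```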

3. Optional fields and default values

python
from typing import Optional

from pydantic import BaseModel, Field

class ArticleSummary(BaseModel):
    title: str
    author: Optional[str] = None  # may not be present in the source text
    publish_date: Optional[str] = None
    key_points: list[str] = Field(default_factory=list)
    word_count: int = Field(default=0)
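
As a quick check of how the defaults behave, the sketch below validates a JSON string that supplies only the required field. The input string is invented, and the model is repeated so the block runs on its own with Pydantic v2.

```python
from typing import Optional

from pydantic import BaseModel, Field

class ArticleSummary(BaseModel):
    title: str
    author: Optional[str] = None
    publish_date: Optional[str] = None
    key_points: list[str] = Field(default_factory=list)
    word_count: int = Field(default=0)

# Only the required field is present; everything else falls back to its default
summary = ArticleSummary.model_validate_json('{"title": "Structured Output"}')
print(summary.model_dump())
```

Downstream code can then tell "not found" (`None` or an empty list) apart from a validation failure, which would raise instead.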

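4. Streaming structured output

python
# Instructor can stream partially populated objects as tokens arrive.
# Assumes `client` is the Instructor-patched client created earlier and
# InvestmentAnalysis is a Pydantic model defined elsewhere.
for partial_result in client.chat.completions.create_partial(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Analyze the investment value of Bitcoin in detail"}],
    response_model=InvestmentAnalysis
):
    # Fields fill in incrementally, so progress can be displayed live
    print(f"Parsed so far: {partial_result.model_dump()}")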

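In practice: typical scenarios for structured output in AI applications

Scenario                      Output model design
─────────────────────────────────────────────────────────────
Information extraction   →    PersonInfo / CompanyInfo
Classification           →    CategoryResult (enum)
Sentiment analysis       →    SentimentResult (enum + score)
Data cleaning            →    CleanedRecord (normalized fields)
Agent tool arguments     →    ToolCallArgs (function params)
RAG query rewriting      →    RewrittenQuery (query + filters)
Content moderation       →    ModerationResult (safe + reasons)
API response formatting  →    StandardResponse (code + data + msg)

python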
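# Worked example: AI content moderation
from typing import Literal

from pydantic import BaseModel, Field

class ContentBlockedError(Exception):
    """Raised when moderation decides the content must be blocked."""

class ModerationResult(BaseModel):
    is_safe: bool
    risk_level: Literal["none", "low", "medium", "high"]
    categories: list[str] = Field(description="Risk categories that were triggered")
    explanation: str
    suggested_action: Literal["pass", "review", "block"]

# `client` is the Instructor-patched client created earlier;
# user_input stands for the text submitted for review.
user_input = "text submitted by the user"

result = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a content safety reviewer"},
        {"role": "user", "content": f"Review the following content: {user_input}"}
    ],
    response_model=ModerationResult
)

if result.suggested_action == "block":
    raise ContentBlockedError(result.explanation)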

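Comparison of approaches

Approach                            Reliability    Multi-model    Retry       Best for
──────────────────────────────────────────────────────────────────────────────────────────────────
Prompt constraint ("output JSON")   ⭐⭐            ✅ any          -           Quick prototypes
JSON Mode                           ⭐⭐⭐           Partial        -           Simple structures
Structured Outputs                  ⭐⭐⭐⭐⭐         OpenAI only    -           OpenAI projects
Instructor                          ⭐⭐⭐⭐⭐         ✅ all          Built in    Recommended for production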
