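Estimated study time: 2-3 weeks
Memory and state management are the core capabilities that let an AI Agent sustain a continuous conversation, personalize its service, and carry out complex tasks. This section runs from simple conversation-history management to the design of advanced long-term memory systems.
📋 Overview of the 8 memory strategies
| # | Strategy | Core idea | Typical use cases | Section |
|---|---|---|---|---|
| 1 | Fixed window | Keep the most recent N turns and drop anything older | Simple chat, cost-sensitive applications | 5.4.1 §1 |
| 2 | Token budget | Trim history by token count for precise cost control | API calls, budget control | 5.4.1 §1 |
| 3 | Sliding window + summary | Keep recent turns verbatim and compress older ones into a summary | Long conversations, customer service | 5.4.1 §1 |
| 4 | Context compression | Use an LLM to extract key information and compress the history | Information-dense conversations | 5.4.1 §2 |
| 5 | Short-term memory | Working memory scoped to one session, cleared when the task ends | One-off tasks, tool calls | 5.4.2 §1 |
| 6 | Long-term memory | Persist to a vector database and retrieve across sessions | Personalization, knowledge accumulation | 5.4.2 §2 |
| 7 | Priority and forgetting | Score memories by importance and automatically forget low-priority ones | Long-running Agents | 5.4.4 §1 |
| 8 | Episodic memory | Record complete event episodes and retrieve them by scenario | Complex reasoning, experience reuse | 5.4.4 §2 |
5.4.1 Conversation History Management
Core concepts
Conversation history is the basis on which an Agent understands context, but it is bounded by the LLM's context window, so it needs a deliberate management strategy.
Challenges and solutions
| Challenge | Problem | Solution |
|---|---|---|
| Context length limit | Windows are finite (GPT-4 Turbo and GPT-4o accept up to 128K tokens) | Sliding window, summary compression |
| Cost control | The longer the history, the higher the API bill | Smart trimming, cache optimization |
| Redundant information | Irrelevant history interferes with reasoning | Relevance filtering, tiered storage |
| Multi-turn dependencies | Pronoun references, elliptical replies | Context completion, coreference resolution |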
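The "relevance filtering" entry in the table can be illustrated without any model. The sketch below is only an assumption about how such a filter might look: it scores each past message by the fraction of query words it contains, which is a crude stand-in for embedding similarity and only suits whitespace-delimited languages. The helper names (`words`, `relevance`, `filter_relevant`) and the `min_score` cut-off are hypothetical.

```python
import re
from typing import Dict, List, Set


def words(text: str) -> Set[str]:
    """Lowercased word set of `text`."""
    return set(re.findall(r"\w+", text.lower()))


def relevance(query: str, content: str) -> float:
    """Fraction of the query's words that also appear in `content`."""
    q = words(query)
    if not q:
        return 0.0
    return len(q & words(content)) / len(q)


def filter_relevant(
    history: List[Dict[str, str]],
    query: str,
    min_score: float = 0.15,  # arbitrary cut-off, tune per application
    keep_last: int = 2,       # the most recent messages are always kept
) -> List[Dict[str, str]]:
    """Keep system messages, the last `keep_last` messages, and messages similar to the query."""
    cutoff = len(history) - keep_last
    kept = []
    for i, msg in enumerate(history):
        if (
            msg["role"] == "system"
            or i >= cutoff
            or relevance(query, msg["content"]) >= min_score
        ):
            kept.append(msg)
    return kept


history = [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "What is the weather in Paris today?"},
    {"role": "assistant", "content": "Sunny with a light breeze."},
    {"role": "user", "content": "Explain Python decorators"},
    {"role": "assistant", "content": "Decorators wrap a function to extend its behaviour."},
    {"role": "user", "content": "Thanks"},
    {"role": "assistant", "content": "You're welcome."},
]
filtered = filter_relevant(history, "How do Python decorators take arguments?")
for msg in filtered:
    print(f"{msg['role']}: {msg['content']}")
```

In this example the two weather messages share no words with the query and are dropped, while the decorator exchange and the final turn survive. A production filter would more plausibly rank by embedding similarity.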
1. Basic history management strategies
Strategy 1: Fixed Window
python
from typing import Dict, List


class FixedWindowMemory:
    """Fixed-window memory: keep only the most recent N turns."""

    def __init__(self, max_turns: int = 10):
        self.max_turns = max_turns
        self.messages: List[Dict[str, str]] = []

    def add_message(self, role: str, content: str) -> None:
        """Append a message, then trim to the window."""
        self.messages.append({"role": role, "content": content})

        # System prompts live outside the window, so they are never dropped
        # and never duplicated when they happen to fall inside it.
        system_msgs = [m for m in self.messages if m["role"] == "system"]
        other_msgs = [m for m in self.messages if m["role"] != "system"]

        # Keep the latest max_turns turns (2 messages = 1 turn)
        self.messages = system_msgs + other_msgs[-(self.max_turns * 2):]

    def get_messages(self) -> List[Dict[str, str]]:
        """Return a copy of the current history."""
        return self.messages.copy()

    def clear(self) -> None:
        """Clear the history."""
        self.messages = []


# Usage example
memory = FixedWindowMemory(max_turns=5)
memory.add_message("system", "You are a helpful AI assistant")
memory.add_message("user", "What is quantum computing?")
memory.add_message("assistant", "Quantum computing uses the principles of quantum mechanics...")
memory.add_message("user", "What are its applications?")
print(f"Current history: {len(memory.get_messages())} messages")
Strategy 2: Token Budget
python
# pip install tiktoken
import tiktoken
from typing import Dict, List


class TokenBudgetMemory:
    """Memory that trims history to a token budget."""

    def __init__(self, max_tokens: int = 4000, model: str = "gpt-4o"):
        self.max_tokens = max_tokens
        self.model = model
        self.messages: List[Dict[str, str]] = []
        try:
            self.encoding = tiktoken.encoding_for_model(model)
        except KeyError:
            # Unknown model name: fall back to a general-purpose encoding
            self.encoding = tiktoken.get_encoding("cl100k_base")

    def _message_tokens(self, msg: Dict[str, str]) -> int:
        """Approximate cost of one message: about 4 framing tokens plus the content."""
        return 4 + len(self.encoding.encode(msg.get("content", "")))

    def count_tokens(self, messages: List[Dict[str, str]]) -> int:
        """Approximate token count of a request built from `messages`."""
        # +2 approximates the tokens that prime the reply; counted once per request
        return sum(self._message_tokens(m) for m in messages) + 2

    def add_message(self, role: str, content: str) -> None:
        """Append a message, then drop the oldest history that exceeds the budget."""
        self.messages.append({"role": role, "content": content})

        # System prompts are always kept
        system_msgs = [m for m in self.messages if m["role"] == "system"]
        other_msgs = [m for m in self.messages if m["role"] != "system"]

        # Walk backwards from the newest message until the budget is used up
        selected = []
        current_tokens = self.count_tokens(system_msgs)
        for msg in reversed(other_msgs):
            msg_tokens = self._message_tokens(msg)
            if current_tokens + msg_tokens > self.max_tokens:
                break
            selected.append(msg)
            current_tokens += msg_tokens
        selected.reverse()  # restore chronological order

        self.messages = system_msgs + selected

    def get_messages(self) -> List[Dict[str, str]]:
        return self.messages.copy()

    def get_token_usage(self) -> Dict[str, float]:
        """Report current token usage."""
        current = self.count_tokens(self.messages)
        return {
            "current": current,
            "max": self.max_tokens,
            "remaining": self.max_tokens - current,
            "usage_percent": round(current / self.max_tokens * 100, 1),
        }


# Usage example
memory = TokenBudgetMemory(max_tokens=2000)
memory.add_message("system", "You are a Python expert")
memory.add_message("user", "Explain decorators")
memory.add_message("assistant", "A decorator is a design pattern in Python...")
usage = memory.get_token_usage()
print(f"Token usage: {usage['current']}/{usage['max']} ({usage['usage_percent']}%)")
Strategy 3: Sliding Window + Summary
python
from openai import OpenAI
from typing import Dict, List

client = OpenAI()  # reads OPENAI_API_KEY from the environment


class SummarizedMemory:
    """Sliding window plus automatic summary compression."""

    def __init__(
        self,
        window_size: int = 10,        # keep the latest 10 turns verbatim
        summary_threshold: int = 20,  # summarize once history exceeds 20 turns
    ):
        self.window_size = window_size
        self.summary_threshold = summary_threshold
        self.messages: List[Dict[str, str]] = []
        self.summary: str = ""

    def add_message(self, role: str, content: str) -> None:
        """Append a message and summarize when the threshold is crossed."""
        self.messages.append({"role": role, "content": content})

        # System messages do not count; thresholds are in turns (2 messages = 1 turn)
        non_system = [m for m in self.messages if m["role"] != "system"]
        if len(non_system) > self.summary_threshold * 2:
            self._compress_history()

    def _compress_history(self) -> None:
        """Compress history: summarize the old part, keep the recent window."""
        system_msgs = [m for m in self.messages if m["role"] == "system"]
        non_system = [m for m in self.messages if m["role"] != "system"]

        # Everything outside the recent window gets summarized
        keep = self.window_size * 2
        to_summarize = non_system[:-keep]
        recent = non_system[-keep:]
        if not to_summarize:
            return

        # Ask the LLM for a summary
        summary_prompt = self._build_summary_prompt(to_summarize)
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": summary_prompt}],
            temperature=0.3,
            max_tokens=500,
        )
        new_summary = response.choices[0].message.content or ""

        # Append to any earlier summary (note: this grows with each compression)
        if self.summary:
            self.summary = f"{self.summary}\n\n{new_summary}"
        else:
            self.summary = new_summary

        # Replace the message list
        self.messages = system_msgs + recent
        print(f"History compressed: {len(to_summarize)} messages -> summary")

    def _build_summary_prompt(self, messages: List[Dict[str, str]]) -> str:
        """Build the summarization prompt."""
        conversation = "\n".join(
            f"{m['role'].upper()}: {m['content']}" for m in messages
        )
        return f"""Summarize the key information in the conversation below, keeping important facts, decisions, and context.
Conversation:
{conversation}
Requirements:
1. Extract key facts and conclusions
2. Keep the user's preferences and needs
3. Be concise, no more than 200 words
4. Use an objective third-person voice
Summary:"""

    def get_messages(self) -> List[Dict[str, str]]:
        """Return the messages, with the summary (if any) placed after the system prompt."""
        messages = self.messages.copy()
        if self.summary:
            system_msgs = [m for m in messages if m["role"] == "system"]
            other_msgs = [m for m in messages if m["role"] != "system"]
            summary_msg = {
                "role": "system",
                "content": f"[Conversation summary]\n{self.summary}",
            }
            return system_msgs + [summary_msg] + other_msgs
        return messages


# Usage example (calls the OpenAI API once the threshold is crossed)
memory = SummarizedMemory(window_size=5, summary_threshold=12)

# Simulate a long conversation
for i in range(15):
    memory.add_message("user", f"Question {i+1}")
    memory.add_message("assistant", f"Answer {i+1}")

messages = memory.get_messages()
print(f"Final message count: {len(messages)}")
print(f"Summary: {memory.summary[:100]}...")
2. Context compression techniques
Method 1: LLMLingua compression (retaining key information)
python
# pip install llmlingua
from typing import Dict, List

from llmlingua import PromptCompressor


class CompressedMemory:
    """Compress conversation history with LLMLingua."""

    def __init__(self, target_token: int = 2000):
        self.compressor = PromptCompressor(
            model_name="microsoft/llmlingua-2-xlm-roberta-large-meetingbank",
            use_llmlingua2=True,
        )
        self.target_token = target_token
        self.messages: List[Dict[str, str]] = []

    def add_message(self, role: str, content: str) -> None:
        self.messages.append({"role": role, "content": content})

    def get_compressed_messages(self) -> List[Dict[str, str]]:
        """Return the older history in compressed form, followed by the last turn verbatim."""
        if not self.messages:
            return []

        # The last turn is kept as-is, so only the older part is compressed
        older, last_turn = self.messages[:-2], self.messages[-2:]
        if not older:
            return list(last_turn)

        # Flatten the older messages into one text
        full_text = "\n".join(f"{m['role']}: {m['content']}" for m in older)

        # Compress
        compressed = self.compressor.compress_prompt(
            full_text,
            target_token=self.target_token,
            use_sentence_level_filter=True,
        )

        # Simplification: the compressed text goes back as one system message
        return [{
            "role": "system",
            "content": f"[Compressed conversation history]\n{compressed['compressed_prompt']}",
        }] + last_turn


# Usage example
memory = CompressedMemory(target_token=1000)
# Add a large number of messages...
compressed = memory.get_compressed_messages()
Method 2: Key information extraction
python
import json
from typing import Dict, List, Set

from openai import OpenAI

client = OpenAI()


class KeyInfoExtractor:
    """Extract key information (entities, facts, preferences) from a conversation."""

    def __init__(self):
        self.key_facts: Set[str] = set()
        self.user_preferences: Dict[str, str] = {}
        self.entities: Set[str] = set()

    def extract_from_conversation(self, messages: List[Dict[str, str]]) -> Dict:
        """Extract key information from the given messages."""
        conversation = "\n".join(
            f"{m['role']}: {m['content']}" for m in messages
        )
        prompt = f"""Analyze the conversation below and return the key information as JSON:
Conversation:
{conversation}
Extract:
1. key_facts: important facts and conclusions (array)
2. user_preferences: user preferences (object of key-value pairs)
3. entities: entities mentioned, such as people, places, and products (array)
Return JSON only, with no other text."""
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": prompt}],
            response_format={"type": "json_object"},
            temperature=0,
        )
        data = json.loads(response.choices[0].message.content)

        # Merge into the accumulated state
        self.key_facts.update(data.get("key_facts", []))
        self.user_preferences.update(data.get("user_preferences", {}))
        self.entities.update(data.get("entities", []))
        return data

    def get_context_summary(self) -> str:
        """Render the accumulated information as a context summary."""
        summary_parts = []
        if self.key_facts:
            summary_parts.append(
                "Key facts:\n" + "\n".join(f"- {fact}" for fact in self.key_facts)
            )
        if self.user_preferences:
            summary_parts.append(
                "User preferences:\n" + "\n".join(
                    f"- {k}: {v}" for k, v in self.user_preferences.items()
                )
            )
        if self.entities:
            summary_parts.append(f"Entities: {', '.join(sorted(self.entities))}")
        return "\n\n".join(summary_parts)


# Usage example
extractor = KeyInfoExtractor()
messages = [
    {"role": "user", "content": "I like coffee, especially Americano"},
    {"role": "assistant", "content": "Got it, you prefer Americano"},
    {"role": "user", "content": "Recommend a coffee shop in Shanghai"},
]
info = extractor.extract_from_conversation(messages)
print(extractor.get_context_summary())
5.4.2 Short-Term and Long-Term Memory
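Memory hierarchy
┌─────────────────────────────────────────┐
│  Working Memory                         │ ← current conversation context
│  - Last 5-10 turns                      │
│  - Immediately available, no retrieval  │
└─────────────────────────────────────────┘
                    ↓
┌─────────────────────────────────────────┐
│  Short-term Memory                      │ ← session-level memory
│  - Full history of the current session  │
│  - Stored after summary compression     │
│  - Archived when the session ends       │
└─────────────────────────────────────────┘
                    ↓
┌─────────────────────────────────────────┐
│  Long-term Memory                       │ ← persisted across sessions
│  - User profile and preferences         │
│  - Facts from past interactions         │
│  - Stored in a vector database          │
│  - Retrieved and activated on demand    │
└─────────────────────────────────────────┘
1. Short-term memory implementation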
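The hierarchy above says short-term memory is archived when a session ends. A minimal sketch of that archival step follows, assuming a session has already been serialized to a plain dict; the field names mirror the `to_dict()` output of the class in this subsection, while the one-JSON-file-per-session layout and the helper names `archive_session` and `load_session` are hypothetical.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict


def archive_session(session: Dict, archive_dir: Path) -> Path:
    """Write one session dict to <archive_dir>/<session_id>.json and return the path."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    path = archive_dir / f"{session['session_id']}.json"
    path.write_text(json.dumps(session, ensure_ascii=False, indent=2), encoding="utf-8")
    return path


def load_session(session_id: str, archive_dir: Path) -> Dict:
    """Read an archived session back into a dict."""
    path = archive_dir / f"{session_id}.json"
    return json.loads(path.read_text(encoding="utf-8"))


# Hypothetical session payload, shaped like a serialized short-term memory
session = {
    "session_id": "user123_20240215",
    "created_at": "2024-02-15T10:00:00",
    "messages": [
        {"user": "Recommend a Python book", "assistant": "Try Fluent Python", "metadata": {}},
    ],
    "metadata": {"topic": "books", "user_intent": None, "sentiment": "neutral"},
}

archive_dir = Path(tempfile.mkdtemp())  # a temp directory keeps the example side-effect free
saved_path = archive_session(session, archive_dir)
restored = load_session("user123_20240215", archive_dir)
print(saved_path.name, len(restored["messages"]))
```

A real deployment would more likely archive to a database or object store; flat files are used here only to keep the example self-contained.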
python
from datetime import datetime
from typing import Dict, List, Optional
import json


class ShortTermMemory:
    """Short-term memory: session-level memory management."""

    def __init__(self, session_id: str):
        self.session_id = session_id
        self.created_at = datetime.now()
        self.messages: List[Dict] = []
        self.metadata: Dict = {
            "topic": None,
            "user_intent": None,
            "sentiment": "neutral",
        }

    def add_turn(
        self,
        user_message: str,
        assistant_message: str,
        metadata: Optional[Dict] = None,
    ) -> None:
        """Record one turn of the conversation."""
        turn = {
            "timestamp": datetime.now().isoformat(),
            "user": user_message,
            "assistant": assistant_message,
            "metadata": metadata or {},
        }
        self.messages.append(turn)

    def get_recent_turns(self, n: int = 5) -> List[Dict]:
        """Return the latest N turns."""
        return self.messages[-n:]

    def to_dict(self) -> Dict:
        """Serialize to a dict (for storage)."""
        return {
            "session_id": self.session_id,
            "created_at": self.created_at.isoformat(),
            "messages": self.messages,
            "metadata": self.metadata,
            "duration_seconds": (datetime.now() - self.created_at).total_seconds(),
        }

    @classmethod
    def from_dict(cls, data: Dict) -> "ShortTermMemory":
        """Restore from a dict."""
        memory = cls(data["session_id"])
        memory.created_at = datetime.fromisoformat(data["created_at"])
        memory.messages = data["messages"]
        memory.metadata = data["metadata"]
        return memory


# Usage example
session = ShortTermMemory(session_id="user123_20240215")
session.add_turn(
    user_message="Recommend a Python book",
    assistant_message="I recommend Fluent Python",
    metadata={"intent": "recommendation", "category": "books"},
)
print(json.dumps(session.to_dict(), indent=2, ensure_ascii=False))
2. Long-term memory implementation (vector database)
python
# pip install chromadb openai
import uuid
from datetime import datetime
from typing import Dict, List, Optional

import chromadb
from chromadb.config import Settings
from openai import OpenAI

client = OpenAI()


class LongTermMemory:
    """Long-term memory: persistent memory across sessions."""

    def __init__(self, user_id: str, persist_directory: str = "./memory_db"):
        self.user_id = user_id
        self.chroma_client = chromadb.PersistentClient(
            path=persist_directory,
            settings=Settings(anonymized_telemetry=False),
        )
        # One collection per user
        self.collection = self.chroma_client.get_or_create_collection(
            name=f"user_{user_id}_memory",
            metadata={"user_id": user_id},
        )

    def _get_embedding(self, text: str) -> List[float]:
        """Embed a piece of text."""
        response = client.embeddings.create(
            model="text-embedding-3-small",
            input=text,
        )
        return response.data[0].embedding

    def store_memory(
        self,
        content: str,
        memory_type: str = "fact",  # fact / preference / event
        metadata: Optional[Dict] = None,
    ) -> str:
        """Store one memory and return its ID."""
        # A UUID avoids collisions when two memories are stored in the same instant
        memory_id = f"{self.user_id}_{uuid.uuid4().hex}"
        full_metadata = {
            "type": memory_type,
            "timestamp": datetime.now().isoformat(),
            "user_id": self.user_id,
            **(metadata or {}),
        }
        self.collection.add(
            documents=[content],
            embeddings=[self._get_embedding(content)],
            metadatas=[full_metadata],
            ids=[memory_id],
        )
        return memory_id

    def recall(
        self,
        query: str,
        n_results: int = 5,
        memory_type: Optional[str] = None,
    ) -> List[Dict]:
        """Retrieve memories relevant to the query."""
        conditions = [{"user_id": self.user_id}]
        if memory_type:
            conditions.append({"type": memory_type})
        # Chroma takes a single condition, or an explicit $and over several
        where_filter = conditions[0] if len(conditions) == 1 else {"$and": conditions}

        results = self.collection.query(
            query_embeddings=[self._get_embedding(query)],
            n_results=n_results,
            where=where_filter,
        )
        memories = []
        if results["documents"]:
            for i, doc in enumerate(results["documents"][0]):
                memories.append({
                    "content": doc,
                    "metadata": results["metadatas"][0][i],
                    "distance": results["distances"][0][i],
                })
        return memories

    def get_user_profile(self) -> Dict:
        """Build a user profile from all stored memories (one LLM call per invocation)."""
        all_memories = self.collection.get()
        if not all_memories["documents"]:
            return {"user_id": self.user_id, "profile": "New user, no history yet"}

        # Group by type
        facts, preferences, events = [], [], []
        for doc, meta in zip(all_memories["documents"], all_memories["metadatas"]):
            if meta.get("type") == "fact":
                facts.append(doc)
            elif meta.get("type") == "preference":
                preferences.append(doc)
            elif meta.get("type") == "event":
                events.append(doc)

        # Ask the LLM to write the profile
        profile_prompt = f"""Write a user profile based on the memories below:
Facts:
{chr(10).join(f'- {f}' for f in facts[:10])}
Preferences:
{chr(10).join(f'- {p}' for p in preferences[:10])}
Past events:
{chr(10).join(f'- {e}' for e in events[:5])}
Write a concise profile (under 200 words) covering:
1. Basic characteristics
2. Interests and preferences
3. Interaction patterns"""
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": profile_prompt}],
            temperature=0.3,
        )
        return {
            "user_id": self.user_id,
            "profile": response.choices[0].message.content,
            "memory_count": len(all_memories["documents"]),
            "last_interaction": max(
                m["timestamp"] for m in all_memories["metadatas"]
            ),
        }


# Usage example
ltm = LongTermMemory(user_id="alice")

# Store memories
ltm.store_memory("The user likes Americano coffee", memory_type="preference")
ltm.store_memory("The user works in Shanghai", memory_type="fact")
ltm.store_memory("Asked about a Python learning path on 15 February 2024", memory_type="event")

# Retrieve memories
memories = ltm.recall("coffee preferences", n_results=3)
for mem in memories:
    print(f"- {mem['content']} (distance: {mem['distance']:.3f})")

# Build the user profile
profile = ltm.get_user_profile()
print(f"\nUser profile:\n{profile['profile']}")
5.4.3 Memory Integration: A Hybrid Memory System
python
# Relies on TokenBudgetMemory, ShortTermMemory and LongTermMemory defined earlier in this section
import json
from datetime import datetime
from typing import Dict, List

from openai import OpenAI

client = OpenAI()


class HybridMemorySystem:
    """Hybrid memory: combines working, short-term, and long-term memory."""

    def __init__(self, user_id: str):
        self.user_id = user_id
        # Working memory (current conversation)
        self.working_memory = TokenBudgetMemory(max_tokens=4000)
        # Short-term memory (current session)
        self.short_term = ShortTermMemory(
            session_id=f"{user_id}_{datetime.now().strftime('%Y%m%d_%H%M%S')}"
        )
        # Long-term memory (across sessions)
        self.long_term = LongTermMemory(user_id=user_id)

    def process_turn(
        self,
        user_message: str,
        assistant_message: str,
        extract_memory: bool = True,
    ) -> None:
        """Process one turn of the conversation."""
        # 1. Add to working memory
        self.working_memory.add_message("user", user_message)
        self.working_memory.add_message("assistant", assistant_message)
        # 2. Add to short-term memory
        self.short_term.add_turn(user_message, assistant_message)
        # 3. Extract and store in long-term memory (optional)
        if extract_memory:
            self._extract_and_store(user_message, assistant_message)

    def _extract_and_store(self, user_msg: str, assistant_msg: str) -> None:
        """Extract key information from a turn and store it in long-term memory."""
        extract_prompt = f"""Extract the information worth remembering long-term from this exchange:
User: {user_msg}
Assistant: {assistant_msg}
Extract, if present:
1. User preferences (e.g. "likes X", "dislikes Y")
2. Important facts (e.g. "works at company X", "lives in city Y")
3. Important events (e.g. "plans to do Z", "finished W")
Return JSON in this form:
{{
"preferences": ["preference 1", "preference 2"],
"facts": ["fact 1", "fact 2"],
"events": ["event 1"]
}}
If nothing needs remembering, return empty arrays. Return JSON only, with no other text."""
        try:
            response = client.chat.completions.create(
                model="gpt-4o-mini",
                messages=[{"role": "user", "content": extract_prompt}],
                response_format={"type": "json_object"},
                temperature=0,
            )
            extracted = json.loads(response.choices[0].message.content)

            # Store in long-term memory
            for pref in extracted.get("preferences", []):
                self.long_term.store_memory(pref, memory_type="preference")
            for fact in extracted.get("facts", []):
                self.long_term.store_memory(fact, memory_type="fact")
            for event in extracted.get("events", []):
                self.long_term.store_memory(event, memory_type="event")
        except Exception as e:
            # Extraction is best-effort; a failure must not break the conversation
            print(f"Memory extraction failed: {e}")

    def get_context_for_llm(
        self,
        current_query: str,
        include_long_term: bool = True,
    ) -> List[Dict[str, str]]:
        """Build the full context to send to the LLM."""
        messages = []
        if include_long_term:
            # 1. System prompt with the user profile (costs one LLM call per request)
            profile = self.long_term.get_user_profile()
            system_content = f"""You are an AI assistant with memory.
User profile:
{profile['profile']}
Personalize your replies using the user's past preferences and background."""
            messages.append({"role": "system", "content": system_content})

            # 2. Relevant long-term memories
            relevant_memories = self.long_term.recall(current_query, n_results=3)
            if relevant_memories:
                memory_context = "Relevant memories:\n" + "\n".join(
                    f"- {m['content']}" for m in relevant_memories
                )
                messages.append({"role": "system", "content": memory_context})

        # 3. Working memory (recent conversation)
        messages.extend(self.working_memory.get_messages())
        return messages

    def end_session(self) -> None:
        """End the session and archive short-term memory."""
        # Simplification: the "summary" stored here is only a turn count
        session_summary = (
            f"Session summary ({self.short_term.session_id}): "
            f"{len(self.short_term.messages)} turns in total"
        )
        self.long_term.store_memory(
            session_summary,
            memory_type="event",
            metadata={"session_id": self.short_term.session_id},
        )


# Full usage example
memory_system = HybridMemorySystem(user_id="bob")

# First turn
memory_system.process_turn(
    user_message="I'm learning Python and want a book for beginners",
    assistant_message="I recommend Python Crash Course; it suits beginners well",
)

# Second turn
memory_system.process_turn(
    user_message="I prefer a project-driven way of learning",
    assistant_message="Then this book is a good fit; it includes many hands-on projects",
)

# Build the context (including long-term memory)
context = memory_system.get_context_for_llm(
    current_query="Are there other learning resources you would recommend?",
    include_long_term=True,
)
print("=== Full context sent to the LLM ===")
for msg in context:
    print(f"{msg['role']}: {msg['content'][:100]}...")

# End the session
memory_system.end_session()
5.4.4 Advanced Memory Techniques
1. Memory priority and forgetting
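The code in this subsection models forgetting as R = e^(-t/S), with S = importance × (1 + ln(1 + access_count)) and a floor of 0.1 on S. A quick numeric check, written as a standalone sketch, shows how fast that curve falls below the default recall threshold of 0.3. The helper names `retention` and `days_until` are illustrative and not part of the class.

```python
import math


def retention(days_since_access: float, importance: float, access_count: int) -> float:
    """R = exp(-t / S), where S = importance * (1 + ln(1 + access_count)), floored at 0.1."""
    stability = max(importance * (1 + math.log(1 + access_count)), 0.1)
    return math.exp(-days_since_access / stability)


def days_until(threshold: float, importance: float, access_count: int) -> float:
    """Solve exp(-t / S) = threshold for t, which gives t = -S * ln(threshold)."""
    stability = max(importance * (1 + math.log(1 + access_count)), 0.1)
    return -stability * math.log(threshold)


for importance, access_count in [(0.2, 0), (0.8, 0), (0.8, 5)]:
    t = days_until(0.3, importance, access_count)
    print(f"importance={importance}, accesses={access_count}: below 0.3 after {t:.2f} days")
```

With these parameters a low-importance memory (0.2) drops below 0.3 after roughly a quarter of a day, an important one (0.8) after just under one day, and the same memory accessed five times after about 2.7 days. If that is too aggressive for a given application, scaling S (for example, measuring t in weeks) is one possible adjustment.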
python
from datetime import datetime, timedelta
from typing import Dict, List
import math


class PrioritizedMemory:
    """Memory with importance scores and a forgetting curve."""

    def __init__(self):
        self.memories: List[Dict] = []
        self._next_id = 0  # monotonically increasing, so IDs stay unique after deletions

    def add_memory(
        self,
        content: str,
        importance: float = 0.5,  # importance in the range 0-1
        memory_type: str = "fact",
    ) -> int:
        """Add a memory and return its ID."""
        now = datetime.now()
        memory = {
            "id": self._next_id,
            "content": content,
            "type": memory_type,
            "importance": importance,
            "created_at": now,
            "last_accessed": now,
            "access_count": 0,
            "strength": 1.0,  # memory strength (decays along the forgetting curve)
        }
        self._next_id += 1
        self.memories.append(memory)
        return memory["id"]

    def _calculate_strength(self, memory: Dict) -> float:
        """Compute memory strength (Ebbinghaus forgetting curve)."""
        # Fractional days; `.days` alone would truncate anything under 24 hours to 0
        elapsed = datetime.now() - memory["last_accessed"]
        days_since_access = elapsed.total_seconds() / 86400
        # Forgetting curve: R = e^(-t/S)
        # R: retention, t: time since last access,
        # S: stability (grows with importance and access count)
        stability = memory["importance"] * (1 + math.log(1 + memory["access_count"]))
        return math.exp(-days_since_access / max(stability, 0.1))

    def recall(self, query: str, threshold: float = 0.3) -> List[Dict]:
        """Retrieve memories by strength (`query` is unused in this simplified version)."""
        # Refresh the strength of every memory
        for mem in self.memories:
            mem["strength"] = self._calculate_strength(mem)
        # Drop memories that have become too weak
        active_memories = [m for m in self.memories if m["strength"] > threshold]
        # Sort by strength (a real system should also weigh semantic similarity)
        active_memories.sort(key=lambda m: m["strength"], reverse=True)
        return active_memories[:5]

    def access_memory(self, memory_id: int) -> None:
        """Access a memory, which reinforces it."""
        for mem in self.memories:
            if mem["id"] == memory_id:
                mem["last_accessed"] = datetime.now()
                mem["access_count"] += 1
                mem["strength"] = self._calculate_strength(mem)
                break

    def consolidate(self) -> None:
        """Consolidation: delete memories that have become too weak."""
        for mem in self.memories:
            mem["strength"] = self._calculate_strength(mem)
        self.memories = [m for m in self.memories if m["strength"] > 0.1]


# Usage example
pm = PrioritizedMemory()
pm.add_memory("The user likes Python", importance=0.8, memory_type="preference")
pm.add_memory("The weather is nice today", importance=0.2, memory_type="fact")

# Simulate the passage of time by backdating the last access by 12 hours
for mem in pm.memories:
    mem["last_accessed"] -= timedelta(hours=12)

# Access the important memory (resets its clock and raises its access count)
pm.access_memory(0)

# Recall
active = pm.recall("user preferences")
for mem in active:
    print(f"{mem['content']} (strength: {mem['strength']:.2f})")
2. Episodic Memory
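Retrieving episodes by tag overlap can be sketched in a few lines. The example below is illustrative only and is not part of the class in this subsection: it assumes episodes are dicts with `tags` and `success` fields, ranks successful episodes by Jaccard similarity between their tags and the query tags, and uses hypothetical helper names (`tag_overlap`, `retrieve_by_tags`).

```python
from typing import Dict, List


def tag_overlap(a: List[str], b: List[str]) -> float:
    """Jaccard similarity between two tag lists."""
    set_a, set_b = set(a), set(b)
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def retrieve_by_tags(episodes: List[Dict], query_tags: List[str], n: int = 3) -> List[Dict]:
    """Return up to n successful episodes, ranked by tag overlap with the query."""
    candidates = [ep for ep in episodes if ep.get("success") is True]
    scored = [(tag_overlap(ep.get("tags", []), query_tags), ep) for ep in candidates]
    scored = [(score, ep) for score, ep in scored if score > 0]  # drop unrelated episodes
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [ep for _, ep in scored[:n]]


# Hypothetical episodes
episodes = [
    {"id": "ep_0", "tags": ["recommendation", "python", "books"], "success": True},
    {"id": "ep_1", "tags": ["debugging", "python"], "success": True},
    {"id": "ep_2", "tags": ["recommendation", "movies"], "success": False},
    {"id": "ep_3", "tags": ["billing"], "success": True},
]
matches = retrieve_by_tags(episodes, ["python", "books"])
print([ep["id"] for ep in matches])
```

Here `ep_0` ranks first (two of three tags shared), `ep_1` second, and the unrelated or unsuccessful episodes are left out. Vector similarity over the episode text would be the more robust choice in practice.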
python
from collections import Counter
from datetime import datetime
from typing import Dict, List, Optional


class EpisodicMemory:
    """Episodic memory: records complete interaction episodes."""

    def __init__(self):
        self.episodes: List[Dict] = []

    def create_episode(
        self,
        context: str,
        user_action: str,
        agent_response: str,
        outcome: Optional[str] = None,
        tags: Optional[List[str]] = None,
    ) -> str:
        """Create an episode and return its ID."""
        episode_id = f"ep_{len(self.episodes)}_{datetime.now().timestamp()}"
        episode = {
            "id": episode_id,
            "timestamp": datetime.now().isoformat(),
            "context": context,
            "user_action": user_action,
            "agent_response": agent_response,
            "outcome": outcome,
            "tags": tags or [],
            "success": None,  # can be labelled later
        }
        self.episodes.append(episode)
        return episode_id

    def mark_success(self, episode_id: str, success: bool) -> None:
        """Label an episode as a success or failure (a signal for later learning)."""
        for ep in self.episodes:
            if ep["id"] == episode_id:
                ep["success"] = success
                break

    def retrieve_similar_episodes(
        self,
        current_context: str,
        n: int = 3,
    ) -> List[Dict]:
        """Retrieve past episodes (simplified: the latest successful ones)."""
        # A real system should rank by vector similarity to `current_context`;
        # this version ignores it and returns the most recent successes.
        successful = [ep for ep in self.episodes if ep.get("success") is True]
        return successful[-n:]

    def get_learning_insights(self) -> Dict:
        """Summarize what the recorded episodes show."""
        total = len(self.episodes)
        successful = len([ep for ep in self.episodes if ep.get("success") is True])
        failed = len([ep for ep in self.episodes if ep.get("success") is False])
        # The same keys are returned even when no episodes exist
        return {
            "total_episodes": total,
            "successful": successful,
            "failed": failed,
            "success_rate": successful / total if total > 0 else 0,
            "common_success_patterns": self._extract_patterns(True),
            "common_failure_patterns": self._extract_patterns(False),
        }

    def _extract_patterns(self, success: bool) -> List[str]:
        """Extract success or failure patterns (simplified: tag frequency)."""
        episodes = [ep for ep in self.episodes if ep.get("success") is success]
        all_tags = []
        for ep in episodes:
            all_tags.extend(ep.get("tags", []))
        common = Counter(all_tags).most_common(3)
        return [f"{tag} (seen {count} times)" for tag, count in common]


# Usage example
em = EpisodicMemory()

# Record an episode
ep_id = em.create_episode(
    context="The user asks about Python learning resources",
    user_action="Requests a book recommendation",
    agent_response="Recommends Python Crash Course",
    outcome="The user is satisfied",
    tags=["recommendation", "python", "books"],
)
em.mark_success(ep_id, success=True)

# Learning insights
insights = em.get_learning_insights()
print(f"Success rate: {insights['success_rate']:.1%}")
5.4.5 Hands-on: A Personalized Conversational Agent
python
"""
Full example: a personalized conversational Agent with memory.
Features:
- Remembers user preferences and history
- Personalizes replies based on context
- Extracts and stores key information automatically
"""
# Relies on HybridMemorySystem from section 5.4.3
from openai import OpenAI

client = OpenAI()


class PersonalizedAgent:
    """Personalized conversational Agent."""

    def __init__(self, user_id: str):
        self.user_id = user_id
        self.memory = HybridMemorySystem(user_id=user_id)

    def chat(self, user_message: str) -> str:
        """Handle one user message."""
        # 1. Build the context (including memory)
        messages = self.memory.get_context_for_llm(
            current_query=user_message,
            include_long_term=True,
        )
        # 2. Append the current user message
        messages.append({"role": "user", "content": user_message})
        # 3. Call the LLM
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=messages,
            temperature=0.7,
        )
        assistant_message = response.choices[0].message.content
        # 4. Update memory
        self.memory.process_turn(user_message, assistant_message)
        return assistant_message

    def get_user_summary(self) -> str:
        """Return the user profile."""
        profile = self.memory.long_term.get_user_profile()
        return profile["profile"]


# Usage example
agent = PersonalizedAgent(user_id="charlie")
print("Agent: Hello! How can I help you?\n")

# First exchange
response = agent.chat("I want to learn machine learning, but my math background is weak")
print(f"Agent: {response}\n")

# Second exchange
response = agent.chat("Recommend some learning resources that suit me")
print(f"Agent: {response}\n")

# Third exchange (a few days later)
response = agent.chat("Now I want to go deeper into deep learning")
print(f"Agent: {response}\n")

# Inspect the user profile
print("=== User profile ===")
print(agent.get_user_summary())
5.4.6 Learning Resources
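Recommended reading
- LangChain Memory documentation
- MemGPT paper (virtual context management)
- Mem0 open-source project (a memory layer for AI)
Open-source tools
- LangChain Memory: implementations of several memory types
- Mem0: production-grade memory management system
- Zep: long-term memory storage and retrieval
Practice exercises
- Implement a customer-service bot with memory (remembers user questions and preferences)
- Build a personal knowledge assistant (tracks learning progress and notes)
- Develop a game NPC (remembers player behaviour and conversation history)
Key takeaways
- ✅ Layered design: working memory + short-term memory + long-term memory
- ✅ Active forgetting: not every piece of information needs to be kept long-term
- ✅ Priority management: important information is retained and retrieved first
- ✅ Privacy protection: store sensitive information encrypted and support user-initiated deletion
- ✅ Cost control: keep context from growing without bound and inflating API costs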
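The cost-control point can be made concrete with a little arithmetic. The sketch below compares total prompt tokens over a conversation when every request resends the full history against a fixed window of recent turns. It assumes, purely for illustration, that every turn costs the same number of tokens; the function names and the figures (100 turns, 200 tokens per turn, a 10-turn window) are hypothetical.

```python
def prompt_tokens_unbounded(turns: int, tokens_per_turn: int) -> int:
    """Total prompt tokens when each request resends the whole history.

    Request k (1-based) is approximated as carrying k turns' worth of tokens.
    """
    return sum(k * tokens_per_turn for k in range(1, turns + 1))


def prompt_tokens_windowed(turns: int, tokens_per_turn: int, window: int) -> int:
    """Total prompt tokens when each request carries at most `window` turns."""
    return sum(min(k, window) * tokens_per_turn for k in range(1, turns + 1))


full = prompt_tokens_unbounded(turns=100, tokens_per_turn=200)
windowed = prompt_tokens_windowed(turns=100, tokens_per_turn=200, window=10)
print(f"Full history: {full:,} prompt tokens")
print(f"10-turn window: {windowed:,} prompt tokens")
print(f"Ratio: {full / windowed:.1f}x")
```

Under these assumptions the unbounded history grows quadratically with the number of turns, while the windowed total grows linearly once the window is full, which is why the trimming strategies in 5.4.1 matter for long conversations.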