Implementation strategy 1: Build a standardized data format
Suppose your raw data looks like this:
{
  "question": "What is the capital of France?",
  "answer": "Paris"
}
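You can convert records like this into a masked form, in which each question becomes the input and each answer is replaced by the mask value -100: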
[
  {"input": "What is the capital of France?", "output": "-100"},
  {"input": "Who wrote Hamlet?", "output": "-100"}
]
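The benefit of this approach is that the model focuses on the input rather than on the answer itself, which in turn is meant to improve its "needle in a haystack" performance. The value -100 is the label that loss functions such as PyTorch's CrossEntropyLoss ignore by default, so masked positions contribute nothing to the training loss.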
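The conversion described above can be sketched in Python. This is a minimal illustration rather than a definitive implementation: the function name `to_masked_records` and the constant `MASK_VALUE` are hypothetical, and the second raw record is added only so the result matches the two-item masked list shown earlier.

```python
import json

# Assumption: the mask value is the string "-100", as in the example above.
# Training frameworks usually apply the integer -100 per token instead.
MASK_VALUE = "-100"


def to_masked_records(records):
    """Map {"question", "answer"} records to {"input", "output"} records.

    The question is kept as the input; the answer is dropped and replaced
    by the mask value.
    """
    return [
        {"input": record["question"], "output": MASK_VALUE}
        for record in records
    ]


# Illustrative raw data in the original question/answer format.
raw = [
    {"question": "What is the capital of France?", "answer": "Paris"},
    {"question": "Who wrote Hamlet?", "answer": "William Shakespeare"},
]

masked = to_masked_records(raw)
print(json.dumps(masked, indent=2))
```

Keeping the mask value in a single constant makes it easy to switch to the integer -100 if your training code expects token-level labels rather than a placeholder string.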