developer.chat
17 April 2024
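Masking
The experimental masking parser and transformer is an extensible module for masking and rehydrating strings. One of its primary use cases is redacting PII (personally identifiable information) from a string before it is sent to an LLM.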
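To make "masking" and "rehydrating" concrete, here is a minimal, library-free sketch of the idea: each match is swapped for a placeholder token, and a token-to-original map is kept so the text can be restored later. The names maskEmails, rehydrate, and MaskState are illustrative assumptions, not part of langchain/experimental/masking, and the real module may work differently internally.

```typescript
// Illustrative only: a stand-in for the mask/rehydrate idea,
// not the langchain/experimental/masking implementation.
type MaskState = Map<string, string>; // placeholder token -> original value

// Replace every email-like match with a numbered token and remember the original.
function maskEmails(text: string, state: MaskState): string {
  let counter = state.size;
  return text.replace(/\S+@\S+\.\S+/g, (match) => {
    const token = `[email-${counter}]`;
    counter += 1;
    state.set(token, match);
    return token;
  });
}

// Put the original values back by replacing each known token.
function rehydrate(text: string, state: MaskState): string {
  let result = text;
  state.forEach((original, token) => {
    result = result.split(token).join(original);
  });
  return result;
}

const state: MaskState = new Map();
const original = "Write to jane.doe@email.com today";
const masked = maskEmails(original, state);
const restored = rehydrate(masked, state);

console.log(masked); // Write to [email-0] today
console.log(restored); // Write to jane.doe@email.com today
```

The important design point is that the state map is the only link between a token and the sensitive value, so whoever holds the state can undo the masking.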
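Real world scenario
A customer support system receives messages that contain sensitive customer information. The system must parse these messages, mask any PII (such as names, email addresses, and phone numbers), and log them for analysis while complying with privacy regulations. Before the transcript is logged, an LLM is used to generate a summary.
Get started
Basic example
Use the RegexMaskingTransformer to create a simple mask for emails and phone numbers.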
TIP
See this section for general instructions on installing integration packages.
npm install @langchain/openai
import {
  MaskingParser,
  RegexMaskingTransformer,
} from "langchain/experimental/masking";

// Define masking strategy
const emailMask = () => `[email-${Math.random().toString(16).slice(2)}]`;
const phoneMask = () => `[phone-${Math.random().toString(16).slice(2)}]`;

// Configure pii transformer
const piiMaskingTransformer = new RegexMaskingTransformer({
  email: { regex: /\S+@\S+\.\S+/g, mask: emailMask },
  phone: { regex: /\d{3}-\d{3}-\d{4}/g, mask: phoneMask },
});

// Register the transformer once, through the constructor.
// (Calling addTransformer with the same instance as well would register it twice.)
const maskingParser = new MaskingParser({
  transformers: [piiMaskingTransformer],
});

const input =
  "Contact me at jane.doe@email.com or 555-123-4567. Also reach me at john.smith@email.com";

const masked = await maskingParser.mask(input);
console.log(masked);
// Example output (the suffixes are random and differ on every run):
// Contact me at [email-a31e486e324f6] or [phone-da8fc1584f224]. Also reach me at [email-d5b6237633d95]

const rehydrated = await maskingParser.rehydrate(masked);
console.log(rehydrated);
// Contact me at jane.doe@email.com or 555-123-4567. Also reach me at john.smith@email.com
API Reference:
- MaskingParser from langchain/experimental/masking
- RegexMaskingTransformer from langchain/experimental/masking
NOTE
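If you plan to store the masking state so that the original values can be rehydrated asynchronously, make sure you follow security best practices. In most cases you will want to define a custom hashing and salting strategy.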
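One possible shape for such a strategy is a keyed hash: the mask token is derived from the match with HMAC-SHA256 and a secret salt, so tokens are stable for the same input but cannot be recomputed without the salt. This is a sketch under assumptions: createSaltedMask is a hypothetical helper, it relies on the Node.js node:crypto module (which may not be available in every runtime, such as some edge runtimes), and the salt is generated inline only for demonstration.

```typescript
import { createHmac, randomBytes } from "node:crypto";

// Hypothetical helper (not part of LangChain): builds a mask function
// whose tokens depend on both the matched text and a secret salt.
function createSaltedMask(
  label: string,
  salt: Buffer
): (match: string) => string {
  return (match: string) => {
    const digest = createHmac("sha256", salt).update(match).digest("hex");
    // Truncated for readability; keep more of the digest if collisions matter.
    return `[${label}-${digest.slice(0, 16)}]`;
  };
}

// Assumption: in a real system the salt comes from a secret store,
// not from a value generated at startup.
const salt = randomBytes(32);
const emailMask = createSaltedMask("email", salt);
const otherSaltMask = createSaltedMask("email", randomBytes(32));

console.log(emailMask("jane.doe@email.com"));
// Same input and same salt give the same token; a different salt gives a different one.
console.log(
  emailMask("jane.doe@email.com") === otherSaltMask("jane.doe@email.com")
);
```

A function like emailMask has the same `(match: string) => string` shape as the mask functions used in the examples on this page, so it could be supplied as the `mask` option of a pattern.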
Next.js stream
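An example Next.js chat endpoint that uses the RegexMaskingTransformer. Each time the API is called with a chat payload, both the current chat message and the chat message history are masked.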
// app/api/chat
import {
  MaskingParser,
  RegexMaskingTransformer,
} from "langchain/experimental/masking";
import { ChatOpenAI } from "@langchain/openai";
import { PromptTemplate } from "@langchain/core/prompts";
import { BytesOutputParser } from "@langchain/core/output_parsers";

export const runtime = "edge";

// Function to format chat messages for consistency
const formatMessage = (message: any) => `${message.role}: ${message.content}`;

const CUSTOMER_SUPPORT = `You are a customer support summarizer agent. Always include masked PII in your response.
Current conversation:
{chat_history}
User: {input}
AI:`;

// Configure Masking Parser
const maskingParser = new MaskingParser();

// Define transformations for masking emails and phone numbers using regular expressions
const piiMaskingTransformer = new RegexMaskingTransformer({
  // If a regex is provided without a mask, we fall back to a simple default hashing function
  email: { regex: /\S+@\S+\.\S+/g },
  phone: { regex: /\d{3}-\d{3}-\d{4}/g },
});
maskingParser.addTransformer(piiMaskingTransformer);

export async function POST(req: Request) {
  try {
    const body = await req.json();
    const messages = body.messages ?? [];
    const formattedPreviousMessages = messages.slice(0, -1).map(formatMessage);
    // Extract the content of the last message
    const currentMessageContent = messages[messages.length - 1].content;

    // Mask sensitive information in the current message
    const guardedMessageContent = await maskingParser.mask(
      currentMessageContent
    );

    // Mask sensitive information in the chat history
    const guardedHistory = await maskingParser.mask(
      formattedPreviousMessages.join("\n")
    );

    const prompt = PromptTemplate.fromTemplate(CUSTOMER_SUPPORT);
    const model = new ChatOpenAI({ temperature: 0.8 });

    // Initialize an output parser that handles serialization and byte-encoding for streaming
    const outputParser = new BytesOutputParser();

    // Chain the prompt, model, and output parser together
    const chain = prompt.pipe(model).pipe(outputParser);

    console.log("[GUARDED INPUT]", guardedMessageContent); // Contact me at -1157967895 or -1626926859.
    console.log("[GUARDED HISTORY]", guardedHistory); // user: Contact me at -1157967895 or -1626926859. assistant: Thank you for providing your contact information.
    console.log("[STATE]", maskingParser.getState()); // { '-1157967895' => 'jane.doe@email.com', '-1626926859' => '555-123-4567'}

    // Stream the AI response based on the masked chat history and current message
    const stream = await chain.stream({
      chat_history: guardedHistory,
      input: guardedMessageContent,
    });

    return new Response(stream, {
      headers: { "content-type": "text/plain; charset=utf-8" },
    });
  } catch (e: any) {
    return new Response(JSON.stringify({ error: e.message }), {
      status: 500,
      headers: {
        "content-type": "application/json",
      },
    });
  }
}
API Reference:
- MaskingParser from langchain/experimental/masking
- RegexMaskingTransformer from langchain/experimental/masking
- ChatOpenAI from @langchain/openai
- PromptTemplate from @langchain/core/prompts
- BytesOutputParser from @langchain/core/output_parsers
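The endpoint above reads the last element of the messages array directly, so an empty payload only surfaces as a generic 500 from the catch block. The following sketch shows one way the "history versus current message" split could be isolated and validated before masking. The ChatMessage interface and the splitChatPayload helper are hypothetical names introduced for illustration; the original route types messages as `any`.

```typescript
// Hypothetical message shape; the route above types messages as `any`.
interface ChatMessage {
  role: string;
  content: string;
}

// Same formatting as formatMessage in the route above.
const formatChatMessage = (message: ChatMessage): string =>
  `${message.role}: ${message.content}`;

// Hypothetical helper: separates the formatted history from the latest message
// and fails with a clear error when the payload is empty.
function splitChatPayload(messages: ChatMessage[]): {
  historyText: string;
  currentText: string;
} {
  if (messages.length === 0) {
    throw new Error("Chat payload contains no messages");
  }
  const historyText = messages
    .slice(0, -1)
    .map(formatChatMessage)
    .join("\n");
  const currentText = messages[messages.length - 1].content;
  return { historyText, currentText };
}

const payload: ChatMessage[] = [
  { role: "user", content: "Contact me at jane.doe@email.com" },
  { role: "assistant", content: "Thank you for providing your contact information." },
  { role: "user", content: "My phone is 555-123-4567" },
];

const { historyText, currentText } = splitChatPayload(payload);
console.log(historyText);
console.log(currentText); // My phone is 555-123-4567
```

Both returned strings would then be passed through maskingParser.mask, as the route does for the history and the current message.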
Kitchen sink
import {
  MaskingParser,
  RegexMaskingTransformer,
} from "langchain/experimental/masking";

// A simple hash function for demonstration purposes (not cryptographically secure)
function simpleHash(input: string): string {
  let hash = 0;
  for (let i = 0; i < input.length; i += 1) {
    const char = input.charCodeAt(i);
    hash = (hash << 5) - hash + char;
    hash |= 0; // Convert to 32bit integer
  }
  return hash.toString(16);
}

const emailMask = (match: string) => `[email-${simpleHash(match)}]`;
const phoneMask = (match: string) => `[phone-${simpleHash(match)}]`;
const nameMask = (match: string) => `[name-${simpleHash(match)}]`;
const ssnMask = (match: string) => `[ssn-${simpleHash(match)}]`;
const creditCardMask = (match: string) => `[creditcard-${simpleHash(match)}]`;
const passportMask = (match: string) => `[passport-${simpleHash(match)}]`;
const licenseMask = (match: string) => `[license-${simpleHash(match)}]`;
const addressMask = (match: string) => `[address-${simpleHash(match)}]`;
const dobMask = (match: string) => `[dob-${simpleHash(match)}]`;
const bankAccountMask = (match: string) => `[bankaccount-${simpleHash(match)}]`;

// Regular expressions for different types of PII
const patterns = {
  email: { regex: /\S+@\S+\.\S+/g, mask: emailMask },
  phone: { regex: /\b\d{3}-\d{3}-\d{4}\b/g, mask: phoneMask },
  name: { regex: /\b[A-Z][a-z]+ [A-Z][a-z]+\b/g, mask: nameMask },
  ssn: { regex: /\b\d{3}-\d{2}-\d{4}\b/g, mask: ssnMask },
  creditCard: { regex: /\b(?:\d{4}[ -]?){3}\d{4}\b/g, mask: creditCardMask },
  // JavaScript does not support the (?i) inline flag; use the `i` flag instead
  passport: { regex: /\b[A-Z]{1,2}\d{6,9}\b/gi, mask: passportMask },
  license: { regex: /\b[A-Z]{1,2}\d{6,8}\b/gi, mask: licenseMask },
  address: {
    // `*` repeats the optional extra words (it was wrongly escaped as a literal asterisk)
    regex: /\b\d{1,5}\s[A-Z][a-z]+(?:\s[A-Z][a-z]+)*\b/g,
    mask: addressMask,
  },
  dob: { regex: /\b\d{4}-\d{2}-\d{2}\b/g, mask: dobMask },
  bankAccount: { regex: /\b\d{8,17}\b/g, mask: bankAccountMask },
};

// Create a RegexMaskingTransformer with multiple patterns
const piiMaskingTransformer = new RegexMaskingTransformer(patterns);

// Hooks for different stages of masking and rehydrating
const onMaskingStart = (message: string) =>
  console.log(`Starting to mask message: ${message}`);
const onMaskingEnd = (maskedMessage: string) =>
  console.log(`Masked message: ${maskedMessage}`);
const onRehydratingStart = (message: string) =>
  console.log(`Starting to rehydrate message: ${message}`);
const onRehydratingEnd = (rehydratedMessage: string) =>
  console.log(`Rehydrated message: ${rehydratedMessage}`);

// Initialize MaskingParser with the transformer and hooks
const maskingParser = new MaskingParser({
  transformers: [piiMaskingTransformer],
  onMaskingStart,
  onMaskingEnd,
  onRehydratingStart,
  onRehydratingEnd,
});

// Example message containing multiple types of PII
const message =
  "Contact Jane Doe at jane.doe@email.com or 555-123-4567. Her SSN is 123-45-6789 and her credit card number is 1234-5678-9012-3456. Passport number: AB1234567, Driver's License: X1234567, Address: 123 Main St, Date of Birth: 1990-01-01, Bank Account: 12345678901234567.";

// Mask and rehydrate the message
maskingParser
  .mask(message)
  .then((maskedMessage: string) => {
    console.log(`Masked message: ${maskedMessage}`);
    return maskingParser.rehydrate(maskedMessage);
  })
  .then((rehydratedMessage: string) => {
    console.log(`Final rehydrated message: ${rehydratedMessage}`);
  })
  .catch((error: unknown) => {
    console.error("Masking or rehydrating failed:", error);
  });
API Reference:
- MaskingParser from langchain/experimental/masking
- RegexMaskingTransformer from langchain/experimental/masking
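Regex-based PII patterns are easy to get subtly wrong, so it can help to check them against a sample message before wiring them into a transformer. The sketch below runs a subset of the kitchen sink patterns over the same example message using only built-in string methods. The helper findMatches and the variable names are assumptions made for illustration, and the passport pattern is written with the `i` flag because JavaScript has no `(?i)` inline flag.

```typescript
// A subset of the kitchen sink patterns, checked with built-in string methods only.
const checks: Record<string, RegExp> = {
  email: /\S+@\S+\.\S+/g,
  phone: /\b\d{3}-\d{3}-\d{4}\b/g,
  ssn: /\b\d{3}-\d{2}-\d{4}\b/g,
  creditCard: /\b(?:\d{4}[ -]?){3}\d{4}\b/g,
  passport: /\b[A-Z]{1,2}\d{6,9}\b/gi,
  dob: /\b\d{4}-\d{2}-\d{2}\b/g,
};

const sample =
  "Contact Jane Doe at jane.doe@email.com or 555-123-4567. Her SSN is 123-45-6789 and her credit card number is 1234-5678-9012-3456. Passport number: AB1234567, Driver's License: X1234567, Address: 123 Main St, Date of Birth: 1990-01-01, Bank Account: 12345678901234567.";

// Hypothetical helper: collects every match of every pattern in the text.
function findMatches(
  text: string,
  patterns: Record<string, RegExp>
): Record<string, string[]> {
  const result: Record<string, string[]> = {};
  for (const [label, regex] of Object.entries(patterns)) {
    result[label] = text.match(regex) ?? [];
  }
  return result;
}

const found = findMatches(sample, checks);
console.log(found.email); // [ 'jane.doe@email.com' ]
console.log(found.phone); // [ '555-123-4567' ]
console.log(found.ssn); // [ '123-45-6789' ]
console.log(found.creditCard); // [ '1234-5678-9012-3456' ]
console.log(found.passport); // [ 'AB1234567', 'X1234567' ]
console.log(found.dob); // [ '1990-01-01' ]
```

Note that the passport pattern also matches the driver's license number in this sample, because both patterns accept one or two letters followed by a run of digits. Overlaps like this are worth checking when several patterns are combined.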