跳转到主要内容

标签(标签)

资源精选(342) Go开发(108) Go语言(103) Go(99) angular(82) LLM(78) 大语言模型(63) 人工智能(53) 前端开发(50) LangChain(43) golang(43) 机器学习(39) Go工程师(38) Go程序员(38) Go开发者(36) React(33) Go基础(29) Python(24) Vue(22) Web开发(20) Web技术(19) 精选资源(19) 深度学习(19) Java(18) ChatGTP(17) Cookie(16) android(16) 前端框架(13) JavaScript(13) Next.js(12) 安卓(11) 聊天机器人(10) typescript(10) 资料精选(10) NLP(10) 第三方Cookie(9) Redwoodjs(9) ChatGPT(9) LLMOps(9) Go语言中级开发(9) 自然语言处理(9) PostgreSQL(9) 区块链(9) mlops(9) 安全(9) 全栈开发(8) OpenAI(8) Linux(8) AI(8) GraphQL(8) iOS(8) 软件架构(7) RAG(7) Go语言高级开发(7) AWS(7) C++(7) 数据科学(7) whisper(6) Prisma(6) 隐私保护(6) JSON(6) DevOps(6) 数据可视化(6) wasm(6) 计算机视觉(6) 算法(6) Rust(6) 微服务(6) 隐私沙盒(5) FedCM(5) 智能体(5) 语音识别(5) Angular开发(5) 快速应用开发(5) 提示工程(5) Agent(5) LLaMA(5) 低代码开发(5) Go测试(5) gorm(5) REST API(5) kafka(5) 推荐系统(5) WebAssembly(5) GameDev(5) CMS(5) CSS(5) machine-learning(5) 机器人(5) 游戏开发(5) Blockchain(5) Web安全(5) Kotlin(5) 低代码平台(5) 机器学习资源(5) Go资源(5) Nodejs(5) PHP(5) Swift(5) devin(4) Blitz(4) javascript框架(4) Redwood(4) GDPR(4) 生成式人工智能(4) Angular16(4) Alpaca(4) 编程语言(4) SAML(4) JWT(4) JSON处理(4) Go并发(4) 移动开发(4) 移动应用(4) security(4) 隐私(4) spring-boot(4) 物联网(4) nextjs(4) 网络安全(4) API(4) Ruby(4) 信息安全(4) flutter(4) RAG架构(3) 专家智能体(3) Chrome(3) CHIPS(3) 3PC(3) SSE(3) 人工智能软件工程师(3) LLM Agent(3) Remix(3) Ubuntu(3) GPT4All(3) 软件开发(3) 问答系统(3) 开发工具(3) 最佳实践(3) RxJS(3) SSR(3) Node.js(3) Dolly(3) 移动应用开发(3) 低代码(3) IAM(3) Web框架(3) CORS(3) 基准测试(3) Go语言数据库开发(3) Oauth2(3) 并发(3) 主题(3) Theme(3) earth(3) nginx(3) 软件工程(3) azure(3) keycloak(3) 生产力工具(3) gpt3(3) 工作流(3) C(3) jupyter(3) 认证(3) prometheus(3) GAN(3) Spring(3) 逆向工程(3) 应用安全(3) Docker(3) Django(3) R(3) .NET(3) 大数据(3) Hacking(3) 渗透测试(3) C++资源(3) Mac(3) 微信小程序(3) Python资源(3) JHipster(3) 语言模型(2) 可穿戴设备(2) JDK(2) SQL(2) Apache(2) Hashicorp Vault(2) Spring Cloud Vault(2) Go语言Web开发(2) Go测试工程师(2) WebSocket(2) 容器化(2) AES(2) 加密(2) 输入验证(2) ORM(2) Fiber(2) Postgres(2) Gorilla Mux(2) Go数据库开发(2) 模块(2) 泛型(2) 指针(2) HTTP(2) PostgreSQL开发(2) Vault(2) K8s(2) Spring boot(2) R语言(2) 深度学习资源(2) 半监督学习(2) semi-supervised-learning(2) architecture(2) 普罗米修斯(2) 嵌入模型(2) productivity(2) 编码(2) Qt(2) 前端(2) Rust语言(2) NeRF(2) 神经辐射场(2) 元宇宙(2) CPP(2) 数据分析(2) spark(2) 流处理(2) Ionic(2) 人体姿势估计(2) human-pose-estimation(2) 视频处理(2) deep-learning(2) kotlin语言(2) kotlin开发(2) burp(2) Chatbot(2) npm(2) quantum(2) OCR(2) 游戏(2) game(2) 内容管理系统(2) MySQL(2) python-books(2) pentest(2) opengl(2) IDE(2) 漏洞赏金(2) Web(2) 知识图谱(2) PyTorch(2) 数据库(2) reverse-engineering(2) 数据工程(2) swift开发(2) rest(2) robotics(2) ios-animation(2) 知识蒸馏(2) 安卓开发(2) nestjs(2) solidity(2) 爬虫(2) 面试(2) 容器(2) C++精选(2) 人工智能资源(2) Machine Learning(2) 备忘单(2) 编程书籍(2) angular资源(2) 速查表(2) cheatsheets(2) SecOps(2) mlops资源(2) R资源(2) DDD(2) 架构设计模式(2) 量化(2) Hacking资源(2) 强化学习(2) flask(2) 设计(2) 性能(2) Sysadmin(2) 系统管理员(2) Java资源(2) 机器学习精选(2) android资源(2) android-UI(2) Mac资源(2) iOS资源(2) Vue资源(2) flutter资源(2) JavaScript精选(2) JavaScript资源(2) Rust开发(2) deeplearning(2) RAD(2)

category

屏蔽


实验掩蔽解析器和转换器是一个可扩展的模块,用于掩蔽和重新水合字符串。该模块的主要用例之一是在调用llm之前,从字符串中编辑PII(个人标识信息)。


真实世界场景​


客户支持系统接收包含敏感客户信息的消息。系统必须解析这些消息,屏蔽任何PII(如姓名、电子邮件地址和电话号码),并在遵守隐私法规的同时将其记录下来进行分析。在记录转录本之前,将使用llm生成摘要。
开始​


基本示例​


使用RegexMaskingTransformer为电子邮件和电话创建一个简单的掩码。

TIP

See this section for general instructions on installing integration packages.

  • npm
  • Yarn
  • pnpm
npm install @langchain/openai

import {
  MaskingParser,
  RegexMaskingTransformer,
} from "langchain/experimental/masking";

// Define masking strategy
const emailMask = () => `[email-${Math.random().toString(16).slice(2)}]`;
const phoneMask = () => `[phone-${Math.random().toString(16).slice(2)}]`;

// Configure pii transformer
const piiMaskingTransformer = new RegexMaskingTransformer({
  email: { regex: /\S+@\S+\.\S+/g, mask: emailMask },
  phone: { regex: /\d{3}-\d{3}-\d{4}/g, mask: phoneMask },
});

const maskingParser = new MaskingParser({
  transformers: [piiMaskingTransformer],
});
maskingParser.addTransformer(piiMaskingTransformer);

const input =
  "Contact me at jane.doe@email.com or 555-123-4567. Also reach me at john.smith@email.com";
const masked = await maskingParser.mask(input);

console.log(masked);
// Contact me at [email-a31e486e324f6] or [phone-da8fc1584f224]. Also reach me at [email-d5b6237633d95]

const rehydrated = await maskingParser.rehydrate(masked);
console.log(rehydrated);
// Contact me at jane.doe@email.com or 555-123-4567. Also reach me at john.smith@email.com

API Reference:

NOTE

如果计划存储掩蔽状态以异步重新水合原始值,请确保遵循最佳安全实践。在大多数情况下,您将希望定义一个自定义哈希和盐析策略。

Next.js stream

示例nextjs聊天端点利用RegexMaskingTransformer。每次使用聊天负载调用api时,都会屏蔽当前聊天消息和聊天消息历史记录。

// app/api/chat

import {
  MaskingParser,
  RegexMaskingTransformer,
} from "langchain/experimental/masking";
import { ChatOpenAI } from "@langchain/openai";
import { PromptTemplate } from "@langchain/core/prompts";
import { BytesOutputParser } from "@langchain/core/output_parsers";

export const runtime = "edge";

// Function to format chat messages for consistency
const formatMessage = (message: any) => `${message.role}: ${message.content}`;

const CUSTOMER_SUPPORT = `You are a customer support summarizer agent. Always include masked PII in your response.
  Current conversation:
  {chat_history}
  User: {input}
  AI:`;

// Configure Masking Parser
const maskingParser = new MaskingParser();
// Define transformations for masking emails and phone numbers using regular expressions
const piiMaskingTransformer = new RegexMaskingTransformer({
  email: { regex: /\S+@\S+\.\S+/g }, // If a regex is provided without a mask we fallback to a simple default hashing function
  phone: { regex: /\d{3}-\d{3}-\d{4}/g },
});

maskingParser.addTransformer(piiMaskingTransformer);

export async function POST(req: Request) {
  try {
    const body = await req.json();
    const messages = body.messages ?? [];
    const formattedPreviousMessages = messages.slice(0, -1).map(formatMessage);
    const currentMessageContent = messages[messages.length - 1].content; // Extract the content of the last message
    // Mask sensitive information in the current message
    const guardedMessageContent = await maskingParser.mask(
      currentMessageContent
    );
    // Mask sensitive information in the chat history
    const guardedHistory = await maskingParser.mask(
      formattedPreviousMessages.join("\n")
    );

    const prompt = PromptTemplate.fromTemplate(CUSTOMER_SUPPORT);
    const model = new ChatOpenAI({ temperature: 0.8 });
    // Initialize an output parser that handles serialization and byte-encoding for streaming
    const outputParser = new BytesOutputParser();
    const chain = prompt.pipe(model).pipe(outputParser); // Chain the prompt, model, and output parser together

    console.log("[GUARDED INPUT]", guardedMessageContent); // Contact me at -1157967895 or -1626926859.
    console.log("[GUARDED HISTORY]", guardedHistory); // user: Contact me at -1157967895 or -1626926859. assistant: Thank you for providing your contact information.
    console.log("[STATE]", maskingParser.getState()); // { '-1157967895' => 'jane.doe@email.com', '-1626926859' => '555-123-4567'}

    // Stream the AI response based on the masked chat history and current message
    const stream = await chain.stream({
      chat_history: guardedHistory,
      input: guardedMessageContent,
    });

    return new Response(stream, {
      headers: { "content-type": "text/plain; charset=utf-8" },
    });
  } catch (e: any) {
    return new Response(JSON.stringify({ error: e.message }), {
      status: 500,
      headers: {
        "content-type": "application/json",
      },
    });
  }
}

API Reference:

Kitchen sink

import {
  MaskingParser,
  RegexMaskingTransformer,
} from "langchain/experimental/masking";

// A simple hash function for demonstration purposes
function simpleHash(input: string): string {
  let hash = 0;
  for (let i = 0; i < input.length; i += 1) {
    const char = input.charCodeAt(i);
    hash = (hash << 5) - hash + char;
    hash |= 0; // Convert to 32bit integer
  }
  return hash.toString(16);
}

const emailMask = (match: string) => `[email-${simpleHash(match)}]`;
const phoneMask = (match: string) => `[phone-${simpleHash(match)}]`;
const nameMask = (match: string) => `[name-${simpleHash(match)}]`;
const ssnMask = (match: string) => `[ssn-${simpleHash(match)}]`;
const creditCardMask = (match: string) => `[creditcard-${simpleHash(match)}]`;
const passportMask = (match: string) => `[passport-${simpleHash(match)}]`;
const licenseMask = (match: string) => `[license-${simpleHash(match)}]`;
const addressMask = (match: string) => `[address-${simpleHash(match)}]`;
const dobMask = (match: string) => `[dob-${simpleHash(match)}]`;
const bankAccountMask = (match: string) => `[bankaccount-${simpleHash(match)}]`;

// Regular expressions for different types of PII
const patterns = {
  email: { regex: /\S+@\S+\.\S+/g, mask: emailMask },
  phone: { regex: /\b\d{3}-\d{3}-\d{4}\b/g, mask: phoneMask },
  name: { regex: /\b[A-Z][a-z]+ [A-Z][a-z]+\b/g, mask: nameMask },
  ssn: { regex: /\b\d{3}-\d{2}-\d{4}\b/g, mask: ssnMask },
  creditCard: { regex: /\b(?:\d{4}[ -]?){3}\d{4}\b/g, mask: creditCardMask },
  passport: { regex: /(?i)\b[A-Z]{1,2}\d{6,9}\b/g, mask: passportMask },
  license: { regex: /(?i)\b[A-Z]{1,2}\d{6,8}\b/g, mask: licenseMask },
  address: {
    regex: /\b\d{1,5}\s[A-Z][a-z]+(?:\s[A-Z][a-z]+)\*\b/g,
    mask: addressMask,
  },
  dob: { regex: /\b\d{4}-\d{2}-\d{2}\b/g, mask: dobMask },
  bankAccount: { regex: /\b\d{8,17}\b/g, mask: bankAccountMask },
};

// Create a RegexMaskingTransformer with multiple patterns
const piiMaskingTransformer = new RegexMaskingTransformer(patterns);

// Hooks for different stages of masking and rehydrating
const onMaskingStart = (message: string) =>
  console.log(`Starting to mask message: ${message}`);
const onMaskingEnd = (maskedMessage: string) =>
  console.log(`Masked message: ${maskedMessage}`);
const onRehydratingStart = (message: string) =>
  console.log(`Starting to rehydrate message: ${message}`);
const onRehydratingEnd = (rehydratedMessage: string) =>
  console.log(`Rehydrated message: ${rehydratedMessage}`);

// Initialize MaskingParser with the transformer and hooks
const maskingParser = new MaskingParser({
  transformers: [piiMaskingTransformer],
  onMaskingStart,
  onMaskingEnd,
  onRehydratingStart,
  onRehydratingEnd,
});

// Example message containing multiple types of PII
const message =
  "Contact Jane Doe at jane.doe@email.com or 555-123-4567. Her SSN is 123-45-6789 and her credit card number is 1234-5678-9012-3456. Passport number: AB1234567, Driver's License: X1234567, Address: 123 Main St, Date of Birth: 1990-01-01, Bank Account: 12345678901234567.";

// Mask and rehydrate the message
maskingParser
  .mask(message)
  .then((maskedMessage: string) => {
    console.log(`Masked message: ${maskedMessage}`);
    return maskingParser.rehydrate(maskedMessage);
  })
  .then((rehydratedMessage: string) => {
    console.log(`Final rehydrated message: ${rehydratedMessage}`);
  });

API Reference: