Open-source, privacy-first PII redaction.
Open-source library for detecting and redacting personally identifiable information. Built on 500+ tested regex patterns. Self-host for complete privacy and control.
Try it out
Why Choose OpenRedaction?
Focus on what matters - we handle the complexity of PII detection
Support GDPR Compliance
Automatically detect and redact PII to help meet GDPR, HIPAA, and CCPA requirements. Our 500+ tested regex patterns give deterministic, transparent results.
Protect Customer Data Automatically
Real-time PII detection ensures sensitive information never leaves your system unprotected.
Simple npm Install
Install via npm and use directly in your application. Self-host for complete control.
Complete Audit Trails
When self-hosted, you control all logging and audit trails. Track all PII detections with detailed reporting for compliance and security reviews.
Transparent Pattern Detection
Transparent, deterministic detection using 500+ tested regex patterns for detecting names, emails, SSNs, phone numbers, and more.
Zero Data Retention
When self-hosted, your data is processed in-memory and never stored. No persistent databases. You maintain complete control over your data.
Why Pattern-Based Detection?
Fast, transparent, and privacy-preserving PII detection built for developers
Deterministic & Transparent
Same input always produces the same output. Patterns are visible and testable - no black box AI.
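As a rough illustration of what deterministic, pattern-based redaction means, here is a minimal sketch. The two patterns and the `redactWithPatterns` helper are hypothetical and written only for this example; they are not taken from OpenRedaction's pattern set or API.

```javascript
// Illustrative patterns only; hypothetical, not OpenRedaction's real pattern set.
const DEMO_PATTERNS = [
  { type: 'EMAIL', regex: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g },
  { type: 'SSN', regex: /\b\d{3}-\d{2}-\d{4}\b/g },
];

// Replace every match of every pattern with a [TYPE] placeholder.
function redactWithPatterns(text) {
  return DEMO_PATTERNS.reduce(
    (out, { type, regex }) => out.replace(regex, `[${type}]`),
    text
  );
}

const demoInput = 'Reach jane@example.com, SSN 123-45-6789.';
const firstRun = redactWithPatterns(demoInput);
const secondRun = redactWithPatterns(demoInput);

console.log(firstRun);               // Reach [EMAIL], SSN [SSN].
console.log(firstRun === secondRun); // true
```

Because the patterns are plain data, each one can be read, reviewed, and tested on its own.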
Fast Processing
Processes in milliseconds with no external API calls. No waiting for third-party AI services.
Runs Locally
No data leaves your environment. Process everything on your infrastructure for maximum privacy.
Privacy-Preserving
No third-party AI models. No data sent to external services. Complete control over your data.
Easy to Audit
Patterns are visible and testable. Perfect for compliance reviews and security audits.
Predictable Costs
No per-token pricing. Self-hosted version has zero variable costs. Predictable and affordable.
AI Layer (Optional)
For messy, unstructured text, we offer an optional AI/NER layer. Note that the AI layer may miss some entities or produce false positives, so use it with caution.
When AI Helps
- ✓ Messy chat logs and transcripts
- ✓ Unstructured text with typos
- ✓ Context-dependent entity detection
Trade-offs
- ⚠ Higher latency (seconds vs milliseconds)
- ⚠ Increased cost per request
- ⚠ Less predictable results
Use the AI layer only when necessary. For most structured data, regex patterns are faster, cheaper, and more reliable.
Regex vs AI: Choose the Right Tool
Each approach has strengths. Pick what works best for your use case.
Regex Patterns
- ✓ Fast - processes in milliseconds
- ✓ Deterministic - same input, same output
- ✓ Easy to audit - patterns are visible
- ✓ Transparent - no black box
- ✓ Predictable costs - no per-token fees
- ✓ Local processing - no external APIs
AI/NER Layer
- ⚠ May help with messy data (not guaranteed)
- ⚠ Slower - seconds vs milliseconds
- ⚠ Less predictable - may vary by run
- ⚠ Higher cost - per-token pricing
- ⚠ May require external API calls
- ⚠ Harder to audit - black box model
Getting Started
Get started in 3 simple steps
Try the Playground
Test OpenRedaction with our free playground. No signup required - see how it works instantly.
Install the Library
Install via npm: npm install openredaction. Use directly in your Node.js application.
Deploy Self-Hosted
Self-host on your infrastructure for complete privacy and control. Contribute on GitHub to help improve the library.
Simple Installation
Install the open-source library and start detecting PII in minutes
npm install openredaction
import { redact } from 'openredaction';
const result = await redact('Your text here');
console.log(result.redacted_text);
Secure PII Detection for Self-Hosted Deployments
Self-hosted security with zero data retention
Self-Hosted Control
Self-hosted deployments give you complete control. Text is processed in memory and raw input is never stored. There are no persistent databases by default, and your data never leaves your environment.
Deploy Anywhere
Open-source library works with Node.js and can be integrated into any application. Self-host on your infrastructure for complete privacy.
Full Audit Trail
When self-hosted, you manage all audit logs. Complete detection logs with entity types, positions, and timestamps. Perfect for compliance reporting.
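One way to picture such a log is sketched below. The `auditDetections` helper, the field names, and the single email pattern are assumptions made for this example, not OpenRedaction's actual log format or API. The entry records the entity type, position, and timestamp, and deliberately leaves out the matched value.

```javascript
// Hypothetical pattern for this sketch; not OpenRedaction's real pattern set.
const AUDIT_EMAIL = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

// Build audit entries that record what was found and where, never the raw value.
function auditDetections(text, now = new Date()) {
  const entries = [];
  for (const match of text.matchAll(AUDIT_EMAIL)) {
    entries.push({
      entityType: 'EMAIL',
      start: match.index,                 // offset of the first matched character
      end: match.index + match[0].length, // offset just past the match
      detectedAt: now.toISOString(),
    });
  }
  return entries;
}

const auditLog = auditDetections(
  'Contact john@example.com',
  new Date('2024-01-01T00:00:00Z')
);
console.log(JSON.stringify(auditLog));
```

Storing offsets instead of matched text keeps the audit trail itself free of PII.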
Loved by Developers Worldwide
See what our users are saying
"OpenRedaction saved us weeks of development time. The open-source library is transparent and easy to integrate. Self-hosting gives us complete control over our data."
"We needed HIPAA-compliant PII detection and OpenRedaction delivered. The self-hosted option gives us complete control, and the regex patterns are transparent and auditable."
"The regex-first approach is perfect for our needs. We can audit all patterns, and self-hosting ensures our data never leaves our environment. The open-source community is helpful."
Our Open-Source Tools
OpenRedaction offers open-source solutions for PII detection and redaction
OpenRedaction (npm library)
Open-source regex library, developer-friendly, available via npm. Use directly in your Node.js applications. Self-host for complete privacy and control.
View on GitHub →
OpenRedaction-site (this site)
Playground where users can try redaction in the browser, with no storage. Free demo of the library capabilities.
Try Playground →
Disclosurely.com
A separate enterprise-grade whistleblowing platform with compliance features and advanced auditing. Uses OpenRedaction for PII protection.
Visit Disclosurely.com →
Cost-Effective Redaction
Self-hosted open-source solution vs. expensive per-token pricing from cloud providers
Self-Hosted OpenRedaction
One-time setup
- No per-request fees
- Only infrastructure costs
- Unlimited usage
- Open-source and free
AWS/Google Cloud
Variable pricing
- Pay per character/token
- Costs scale with usage
- 1M requests: $100s-$1000s
- Proprietary and vendor-locked
Why OpenRedaction vs. AWS/Google?
Open source, self-hostable, and privacy-first - data never leaves your environment
| Feature | OpenRedaction | AWS/Google |
|---|---|---|
| Open Source | ✓ Yes | ✗ Proprietary |
| Self-Hostable | ✓ Yes | ✗ Cloud-only |
| Data Retention | ✓ None | ⚠ May log data |
| Account Required | ✓ No | ✗ Yes |
| Pricing Model | ✓ Predictable | ⚠ Per-token |
| Compliance Setup | ✓ Simple | ⚠ Complex |
| Data Control | ✓ Full control | ✗ Vendor-dependent |
With self-hosted OpenRedaction, your data never leaves your environment. Complete privacy and control.
Transparency & Community
OpenRedaction is open source. Audit the code, contribute patterns, and help improve the library.
Report Issues
Found a bug or have a suggestion? Open an issue on GitHub and help us improve.
View Issues →
Contribute Patterns
Share new regex patterns or improve existing ones. The community helps maintain and expand pattern coverage.
Contribute →
How to Contribute
Fork the repository, make your changes, and submit a pull request. We welcome contributions from the community.
View on GitHub →
Self-Host OpenRedaction
Install the open-source library and deploy on your infrastructure for complete privacy and control
Installation
npm install openredaction
Basic Usage
import { redact } from 'openredaction';
const text = "Contact John Doe at john@example.com";
const result = await redact(text);
console.log(result.redacted_text);
Deployment Options
- Node.js server - Run directly in your Node.js application
- Docker - Containerize and deploy on any infrastructure
- On-premise - Deploy on your own servers for maximum control
For detailed self-hosting instructions, configuration options, and deployment examples, see our documentation or the GitHub README.
Limitations & Best Practices
Important information about using OpenRedaction effectively
Best-Effort Redaction
Redaction is best-effort, not perfect. OpenRedaction uses regex patterns and optional AI to detect PII, but it may miss some entities or produce false positives. Always manually review output when handling highly sensitive data.
Structured vs Unstructured Data
Regex patterns work best on structured data (forms, databases, JSON, well-formatted text). Messy or unstructured input may still leak sensitive information. The optional AI layer may help with messy text but is slower, costlier, and not guaranteed to catch everything.
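The gap between structured and messy input can be seen in a small sketch. The email pattern and the `redactEmails` helper here are hypothetical and exist only for this example; they are not part of OpenRedaction.

```javascript
// Illustrative email pattern; hypothetical, not from OpenRedaction's pattern set.
const STRICT_EMAIL = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
const redactEmails = (text) => text.replace(STRICT_EMAIL, '[EMAIL]');

// Well-formatted input: the address is caught.
const structuredOut = redactEmails('{"email": "john@example.com"}');

// Messy input: the obfuscated address slips past the same pattern.
const messyOut = redactEmails('mail me: john at example dot com');

console.log(structuredOut); // {"email": "[EMAIL]"}
console.log(messyOut);      // mail me: john at example dot com
```

The second result is unchanged, which is the kind of leak the limitation above describes.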
Manual Review Recommended
For legal documents, compliance-critical content, or highly sensitive data, always manually review the redacted output. Automatic redaction should be used as a first pass, not a final solution.
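A first-pass workflow might gate output behind a review step, as in the sketch below. The `needsManualReview` helper, the `ALWAYS_REVIEW` set, and the shape of the `detections` array are all assumptions for illustration, not part of OpenRedaction.

```javascript
// Hypothetical entity types that always go to a human reviewer.
const ALWAYS_REVIEW = new Set(['SSN', 'MEDICAL_RECORD']);

// Decide whether redacted output needs a manual pass before release.
// `detections` is assumed to be an array of { type } objects.
function needsManualReview(detections, { sensitiveDocument = false } = {}) {
  if (sensitiveDocument) return true; // legal or compliance-critical content
  return detections.some((d) => ALWAYS_REVIEW.has(d.type));
}

console.log(needsManualReview([{ type: 'EMAIL' }]));                   // false
console.log(needsManualReview([{ type: 'EMAIL' }, { type: 'SSN' }]));  // true
console.log(needsManualReview([], { sensitiveDocument: true }));       // true
```

An empty detection list on a sensitive document still triggers review, since the absence of matches does not prove the absence of PII.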
Self-Hosted Responsibility
When self-hosting, you are responsible for your own infrastructure, security, compliance certifications, and data handling. OpenRedaction provides the tools, but you maintain full control and responsibility.
Ready to Get Started?
Try the playground, install the library, or contribute on GitHub