octocode-data-masker

v1.0.0

A TypeScript library for masking sensitive data in strings, including PII, tokens, API keys, and more

Readme

sensitive-data-masker

A high-performance TypeScript library for detecting and masking sensitive data in strings. Protect PII, API keys, tokens, credentials, and other confidential information with intelligent masking algorithms and configurable accuracy levels.

Features

  • 🛡️ 200+ Detection Patterns: Comprehensive coverage for modern security needs
  • ⚡ High Performance: Optimized regex engine with pattern caching
  • 🎯 Accuracy Control: Configure detection sensitivity (high/medium/low)
  • 🔧 Flexible Masking: Smart partial masking that preserves readability
  • 📦 Zero Dependencies: Lightweight and secure
  • 🌍 International Support: Handles US, UK, Canadian, and international formats
  • 🔍 Pattern Filtering: Include or exclude specific pattern types
  • 📊 Detailed Results: Get match counts, positions, and masked values

Installation

npm install sensitive-data-masker
yarn add sensitive-data-masker

Quick Start

import { mask, hasSensitiveContent, getPatternMatches } from 'sensitive-data-masker';

// Basic usage - intelligent partial masking
const text = 'My email is [email protected] and my SSN is 123-45-6789';
const result = mask(text);
console.log(result.output);
// "My email is **[email protected]** and my SSN is **3-45-67**"

console.log(result.found);
// { email: 1, ssn: 1 }

// Check if content contains sensitive data
const isSensitive = hasSensitiveContent(text);
console.log(isSensitive); // true

// Get detailed pattern matches with positions
const matches = getPatternMatches(text);
console.log(matches);
// [
//   {
//     pattern: 'email',
//     matches: [{ match: '[email protected]', startIndex: 12, endIndex: 27 }]
//   },
//   {
//     pattern: 'ssn',
//     matches: [{ match: '123-45-6789', startIndex: 44, endIndex: 54 }]
//   }
// ]

API Reference

mask(input: string, options?: MaskingOptions): MaskResult

Masks sensitive content in a string using intelligent partial masking.

Options

interface MaskingOptions {
  maskChar?: string;                    // Character used for masking (default: '*')
  preserveLength?: boolean;             // Preserve original length (default: false)
  excludePatterns?: string[];           // Patterns to exclude from masking
  onlyPatterns?: string[];              // Only mask these patterns
  matchAccuracy?: 'high' | 'medium' | 'low'; // Detection sensitivity
}

Returns

interface MaskResult {
  output: string;                       // Masked string
  found: { [name: string]: number };    // Count of each pattern found
  matches: string[];                    // Original matched values
  masked: string[];                     // Masked versions of matches
}

hasSensitiveContent(input: string, options?): boolean

Quickly check if a string contains sensitive data without performing masking.

import { hasSensitiveContent } from 'sensitive-data-masker';

hasSensitiveContent('[email protected]'); // true
hasSensitiveContent('hello world');      // false

// With options
hasSensitiveContent('sk-1234567890abcdef', { 
  matchAccuracy: 'high',
  excludePatterns: ['genericId']
}); // true

getPatternMatches(input: string, options?): PatternMatch[]

Get detailed information about all pattern matches including their positions.

import { getPatternMatches } from 'sensitive-data-masker';

const matches = getPatternMatches('Contact: [email protected] and key: sk-123abc');
console.log(matches);
// [
//   {
//     pattern: 'email',
//     matches: [{ match: '[email protected]', startIndex: 9, endIndex: 22 }]
//   },
//   {
//     pattern: 'openaiApiKey',
//     matches: [{ match: 'sk-123abc', startIndex: 33, endIndex: 41 }]
//   }
// ]
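
The offsets in the sample output above appear to be inclusive at both ends ('sk-123abc' is 9 characters long and spans 33 to 41). As a rough illustration of how positions like these can be derived, and not a description of the library's internals, here is a minimal sketch built on the standard RegExp.prototype.exec loop. The findMatches helper, the PositionedMatch type, and the token regex are all hypothetical.

```typescript
// Hypothetical shape mirroring the entries shown in the sample output.
interface PositionedMatch {
  match: string;
  startIndex: number; // offset of the first matched character
  endIndex: number;   // offset of the last matched character (inclusive)
}

// Collect every match of a regex together with its inclusive offsets.
function findMatches(input: string, regex: RegExp): PositionedMatch[] {
  // exec() only advances through the string when the "g" flag is set.
  const flags = regex.flags.indexOf('g') === -1 ? regex.flags + 'g' : regex.flags;
  const scanner = new RegExp(regex.source, flags);
  const found: PositionedMatch[] = [];
  let m: RegExpExecArray | null;
  while ((m = scanner.exec(input)) !== null) {
    if (m[0].length === 0) {
      scanner.lastIndex++; // skip empty matches to avoid an infinite loop
      continue;
    }
    found.push({
      match: m[0],
      startIndex: m.index,
      endIndex: m.index + m[0].length - 1
    });
  }
  return found;
}

// Illustrative regex only; the library's real patterns are not reproduced here.
const tokenMatches = findMatches('key: sk-123abc', /sk-[a-z0-9]+/);
console.log(tokenMatches);
// [ { match: 'sk-123abc', startIndex: 5, endIndex: 13 } ]
```

With inclusive offsets, input.slice(startIndex, endIndex + 1) recovers the matched text.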

Advanced Usage

Custom Masking Options

import { mask } from 'sensitive-data-masker';

// Custom masking character
const result = mask('API key: sk-1234567890abcdef', { maskChar: '#' });
console.log(result.output);
// "API key: ##-1234567890ab##"

// Preserve original length
const result2 = mask('secret123', { preserveLength: true });
console.log(result2.output);
// "*********" (full length masked)

// Use high accuracy mode (fewer false positives)
const result3 = mask('sk-1234567890abcdef', { matchAccuracy: 'high' });
console.log(result3.output);
// "**-1234567890ab**" (default mask character, since maskChar is not set)
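
To make the idea of partial masking concrete, the sketch below hides a couple of characters at each end of a value and keeps the middle readable. It is an approximation inferred from outputs such as "**3-45-67**" for an SSN. The partialMask helper and its edge parameter are hypothetical, and the library's own algorithm evidently differs in some cases (the sk- key outputs shown in this README keep fewer middle characters).

```typescript
// Hypothetical helper: mask `edge` characters at each end and keep the middle.
function partialMask(value: string, maskChar = '*', edge = 2): string {
  // Values too short to leave a readable middle are masked completely.
  if (value.length <= edge * 2) {
    return maskChar.repeat(value.length);
  }
  const middle = value.slice(edge, value.length - edge);
  return maskChar.repeat(edge) + middle + maskChar.repeat(edge);
}

console.log(partialMask('123-45-6789'));              // **3-45-67**
console.log(partialMask('4111-1111-1111-1111', '#')); // ##11-1111-1111-11##
console.log(partialMask('abcd'));                     // ****
```

Keeping the middle visible helps when correlating log lines, but it also leaks part of the secret, so full masking is the safer choice for short or low-entropy values.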

Pattern Filtering

// Only mask specific patterns
const result = mask('Email: [email protected], API: sk-123', { 
  onlyPatterns: ['email', 'openaiApiKey'] 
});

// Exclude certain patterns
const result2 = mask('Email: [email protected], UUID: 123e4567-e89b-12d3-a456-426614174000', { 
  excludePatterns: ['uuid', 'genericId']
});

// Combine with accuracy control
const result3 = mask(sensitiveText, {
  matchAccuracy: 'high',
  excludePatterns: ['uuid']
});
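
As a way to reason about how onlyPatterns and excludePatterns interact, here is a small standalone sketch of name-based filtering. It is not taken from the library. The NamedPattern type, the selectPatterns helper, and the rule that an exclusion wins when a name appears in both lists are assumptions made for illustration.

```typescript
// Hypothetical minimal pattern shape for this sketch.
interface NamedPattern {
  name: string;
  regex: RegExp;
}

// Keep a pattern if it is allowed by `onlyPatterns` (when given)
// and not listed in `excludePatterns`. Exclusion wins on conflict (assumption).
function selectPatterns(
  all: NamedPattern[],
  onlyPatterns?: string[],
  excludePatterns?: string[]
): NamedPattern[] {
  return all.filter(p => {
    if (onlyPatterns && onlyPatterns.length > 0 && onlyPatterns.indexOf(p.name) === -1) {
      return false;
    }
    if (excludePatterns && excludePatterns.indexOf(p.name) !== -1) {
      return false;
    }
    return true;
  });
}

// Placeholder regexes; they are not the library's real patterns.
const samplePatterns: NamedPattern[] = [
  { name: 'email', regex: /\S+@\S+/g },
  { name: 'uuid', regex: /[0-9a-f]{8}-[0-9a-f]{4}/gi },
  { name: 'genericId', regex: /\bid_\w+/g }
];

console.log(selectPatterns(samplePatterns, ['email', 'uuid']).map(p => p.name));
// [ 'email', 'uuid' ]
console.log(selectPatterns(samplePatterns, undefined, ['uuid', 'genericId']).map(p => p.name));
// [ 'email' ]
```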

Supported Pattern Categories

The library detects sensitive data across 25 categories with 200+ patterns. Representative categories include:

🆔 Personally Identifiable Information (PII)

  • Email addresses (multiple formats)
  • Phone numbers (US, International, E.164)
  • Social Security Numbers (US with various formats)
  • Driver's license numbers, Medical record numbers
  • Tax IDs (TIN/EIN), Canadian SIN, UK National Insurance Numbers

☁️ Cloud Provider Credentials

  • AWS: Access keys, secret keys, session tokens, account IDs
  • AWS Resources: EC2, S3, RDS, Lambda ARNs, VPC IDs
  • Azure: Subscription IDs, client secrets, resource IDs
  • Google Cloud: API keys, service account keys, project IDs

💳 Financial & Payment Services

  • Credit card numbers (Visa, MasterCard, Amex, Discover)
  • Stripe: Secret keys, publishable keys, webhook secrets
  • PayPal: Access tokens, client IDs
  • Square: Access tokens, application IDs
  • Bank account numbers (US routing numbers, IBAN)

🤖 AI Provider Credentials

  • OpenAI: API keys, organization IDs
  • Anthropic/Claude: API keys
  • Google AI: Gemini API keys, Vertex AI tokens
  • Hugging Face: Access tokens, API keys
  • Other AI: Groq, Perplexity, Replicate, Together AI

🔐 Authentication & Security

  • JWT tokens, Bearer tokens
  • OAuth access tokens, refresh tokens
  • API keys in headers (X-API-Key, Authorization)
  • Session IDs, CSRF tokens
  • Generic secret patterns in environment variables

🔧 Developer Tools & Services

  • GitHub: Personal access tokens, app tokens
  • Slack: Bot tokens, webhook URLs, app secrets
  • Discord: Bot tokens, webhook URLs
  • Analytics: Google Analytics, Mixpanel, Amplitude
  • Monitoring: Datadog, New Relic, Sentry keys

🗄️ Database & Storage

  • Database connection strings (PostgreSQL, MySQL, MongoDB)
  • File Storage: S3 bucket URLs, Azure Blob Storage
  • CDN: CloudFront URLs, Azure CDN
  • Redis connection strings, Elasticsearch URLs

🔑 Cryptographic Materials

  • RSA private keys, SSH private keys
  • EC private keys, DSA private keys
  • X.509 certificates, PGP private key blocks
  • JSON Web Keys (JWK), PKCS#8 keys

🌐 Network & Location

  • IPv4/IPv6 addresses, MAC addresses
  • Geographic coordinates (latitude/longitude)
  • Private network ranges, subnet masks
  • URL patterns with embedded secrets

📱 Communication Services

  • Messaging: Twilio, SendGrid, Mailgun keys
  • Social Media: Twitter, Facebook, Instagram tokens
  • Email Services: Mailchimp, Postmark, SparkPost
  • SMS/Voice: Nexmo, Plivo, MessageBird

🛠️ Infrastructure & DevOps

  • Container Registries: Docker Hub, ECR, GCR tokens
  • CI/CD: Jenkins, GitLab CI, CircleCI tokens
  • Deployment: Vercel, Netlify, Heroku tokens
  • Monitoring: PagerDuty, Datadog, New Relic

🏢 Enterprise & Business

  • CRM: Salesforce, HubSpot tokens
  • E-commerce: Shopify, WooCommerce keys
  • Business Tools: Slack, Microsoft Teams tokens
  • Analytics: Google Analytics, Adobe Analytics

🎯 Generic Patterns

  • UUID v4, Generic IDs
  • Base64 encoded secrets
  • Hex-encoded keys (32, 64, 128 bit)
  • Custom secret patterns in configuration files

🔍 URL & Reference Patterns

  • URLs with embedded tokens
  • Database connection URIs
  • API endpoints with keys
  • Webhook URLs with secrets

💾 Version Control & Code

  • Git repository URLs with tokens
  • Package manager tokens (npm, PyPI)
  • Container registry credentials
  • Code hosting platform tokens

Pattern Accuracy Levels

Control detection sensitivity to balance between security and false positives:

High Accuracy

  • Most specific patterns with minimal false positives
  • Examples: AWS access keys with AKIA prefix, specific API key formats
  • Best for production environments

Medium Accuracy (Default)

  • Balanced detection with reasonable false positive rates
  • Examples: Generic API keys, common secret patterns
  • Good for most use cases

Low Accuracy

  • Broadest detection, may have higher false positive rates
  • Examples: Generic IDs, loose pattern matching
  • Useful for comprehensive scanning

// Use high accuracy for production
const prodResult = mask(text, { matchAccuracy: 'high' });

// Use medium accuracy for development  
const devResult = mask(text, { matchAccuracy: 'medium' });

// Use low accuracy for comprehensive scanning
const scanResult = mask(text, { matchAccuracy: 'low' });
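
One plausible reading of the three levels is that each level enables the patterns rated at that accuracy or higher, so 'low' runs everything and 'high' runs only the most specific patterns. The sketch below illustrates that reading. The RatedPattern type, the ranking table, the patternsForLevel helper, and the choice to treat unrated patterns as 'medium' are assumptions, not documented behavior.

```typescript
type Accuracy = 'high' | 'medium' | 'low';

// Hypothetical ranking: a higher number means a more specific pattern.
const ACCURACY_RANK: Record<Accuracy, number> = { high: 3, medium: 2, low: 1 };

interface RatedPattern {
  name: string;
  matchAccuracy?: Accuracy; // optional, as in the pattern structure shown under Contributing
}

// Keep patterns whose rating is at least as strict as the requested level.
// Unrated patterns are treated as 'medium' (assumption).
function patternsForLevel(all: RatedPattern[], level: Accuracy = 'medium'): RatedPattern[] {
  return all.filter(p => ACCURACY_RANK[p.matchAccuracy ?? 'medium'] >= ACCURACY_RANK[level]);
}

const ratedPatterns: RatedPattern[] = [
  { name: 'awsAccessKey', matchAccuracy: 'high' },
  { name: 'genericApiKey' },
  { name: 'genericId', matchAccuracy: 'low' }
];

console.log(patternsForLevel(ratedPatterns, 'high').map(p => p.name));
// [ 'awsAccessKey' ]
console.log(patternsForLevel(ratedPatterns, 'low').map(p => p.name));
// [ 'awsAccessKey', 'genericApiKey', 'genericId' ]
```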

TypeScript Support

Full TypeScript support with complete type definitions:

import { mask, hasSensitiveContent, getPatternMatches } from 'sensitive-data-masker';
import type { MaskResult, MaskingOptions } from 'sensitive-data-masker';

// Type-safe masking options
const options: MaskingOptions = {
  maskChar: '#',
  matchAccuracy: 'high',
  excludePatterns: ['uuid']
};

const result: MaskResult = mask(text, options);

Real-World Examples

Log File Sanitization

import { mask } from 'sensitive-data-masker';

const logEntry = `
[2024-01-15 10:30:45] INFO User [email protected] logged in
[2024-01-15 10:31:12] DEBUG API call with key sk-1234567890abcdef
[2024-01-15 10:31:15] ERROR Payment failed for card 4111-1111-1111-1111
[2024-01-15 10:31:20] WARN SSN in request: 123-45-6789
`;

const sanitized = mask(logEntry);
console.log(sanitized.output);
// [2024-01-15 10:30:45] INFO User **[email protected]** logged in
// [2024-01-15 10:31:12] DEBUG API call with key **-1234567890ab**
// [2024-01-15 10:31:15] ERROR Payment failed for card **11-1111-1111-11**
// [2024-01-15 10:31:20] WARN SSN in request: **3-45-67**

console.log(sanitized.found);
// { email: 1, openaiApiKey: 1, creditCard: 1, ssn: 1 }
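
To apply this consistently, masking can be pushed into the logging path so that no call site can forget it. The following is a minimal sketch under that assumption. The createSafeLogger helper and the MaskFn type are hypothetical, and the stand-in masker only exists so the example runs without the library. In real use the library's mask function would be passed in instead.

```typescript
// Any function that returns an object with a masked `output` string fits here.
type MaskFn = (input: string) => { output: string };

// Wrap a sink (console.log by default) so every message is masked first.
function createSafeLogger(
  maskFn: MaskFn,
  sink: (line: string) => void = console.log
): (message: string) => void {
  return (message: string): void => {
    sink(maskFn(message).output);
  };
}

// Stand-in masker for demonstration only: replaces SSN-shaped values.
const demoMask: MaskFn = input => ({
  output: input.replace(/\d{3}-\d{2}-\d{4}/g, '[masked]')
});

const safeLog = createSafeLogger(demoMask);
safeLog('SSN in request: 123-45-6789'); // prints "SSN in request: [masked]"
```

Injecting the masker rather than importing it inside the logger keeps the wrapper easy to test and lets each environment supply its own options.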

Configuration File Security

const config = `
DATABASE_URL=postgresql://user:password123@localhost:5432/db
OPENAI_API_KEY=sk-1234567890abcdef1234567890abcdef
STRIPE_SECRET_KEY=sk_live_abcdef123456
[email protected]
JWT_SECRET=super-secret-key-123
`;

const result = mask(config);
console.log(result.output);
// DATABASE_URL=postgresql://user:**ssword1** @localhost:5432/db
// OPENAI_API_KEY=**-1234567890abcdef1234567890ab**
// STRIPE_SECRET_KEY=**_live_abcdef12**
// ADMIN_EMAIL=**[email protected]**
// JWT_SECRET=**per-secret-key-1**

Multi-Environment Setup

import { mask } from 'sensitive-data-masker';

// Production: Mask everything with high accuracy
const prodResult = mask(sensitiveData, { matchAccuracy: 'high' });

// Development: Allow test emails but mask real API keys
const devResult = mask(sensitiveData, { 
  matchAccuracy: 'medium',
  excludePatterns: ['email'] 
});

// Testing: Only mask financial data
const testResult = mask(sensitiveData, { 
  onlyPatterns: ['creditCard', 'bankAccount', 'ssn'],
  matchAccuracy: 'high'
});
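
The three configurations above could be centralized in one lookup so that the environment name selects the options. This is a sketch only. The optionsForEnv helper and the EnvMaskingOptions type are hypothetical, and falling back to the strictest settings for unknown environments is an assumption about what a safe default looks like.

```typescript
// Local copy of the option fields used in this section.
interface EnvMaskingOptions {
  matchAccuracy?: 'high' | 'medium' | 'low';
  excludePatterns?: string[];
  onlyPatterns?: string[];
}

// Map an environment name to masking options.
function optionsForEnv(env: string | undefined): EnvMaskingOptions {
  switch (env) {
    case 'development':
      // Allow test emails through, still mask API keys and the rest.
      return { matchAccuracy: 'medium', excludePatterns: ['email'] };
    case 'test':
      // Only financial data is masked in tests.
      return { matchAccuracy: 'high', onlyPatterns: ['creditCard', 'bankAccount', 'ssn'] };
    default:
      // Production and any unrecognized environment: mask everything.
      return { matchAccuracy: 'high' };
  }
}

console.log(optionsForEnv('development'));
// { matchAccuracy: 'medium', excludePatterns: [ 'email' ] }
```

A call site would then look like mask(sensitiveData, optionsForEnv(process.env.NODE_ENV)), assuming the environment variable follows the usual Node.js naming.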

Data Pipeline Processing

import { hasSensitiveContent, mask } from 'sensitive-data-masker';

// Check if data needs processing
function processBatch(records: string[]) {
  const results = records.map(record => {
    if (hasSensitiveContent(record)) {
      const masked = mask(record, { matchAccuracy: 'high' });
      return {
        data: masked.output,
        hadSensitiveData: true,
        patternsFound: Object.keys(masked.found)
      };
    }
    return { data: record, hadSensitiveData: false };
  });
  
  return results;
}
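
The version above scans each sensitive record twice, once in hasSensitiveContent and once in mask. A possible variant calls the masker once per record and derives the flag from the result. This sketch assumes that found is an empty object and output equals the input when nothing matches, which the README does not state explicitly. The Masker type, the ProcessedRecord type, and processBatchOnce are hypothetical names. Whether it is actually faster depends on how cheap hasSensitiveContent is, so measure before switching.

```typescript
// Minimal slice of the mask result that this sketch relies on.
interface MaskOutcome {
  output: string;
  found: { [name: string]: number };
}
type Masker = (input: string) => MaskOutcome;

interface ProcessedRecord {
  data: string;
  hadSensitiveData: boolean;
  patternsFound: string[];
}

// Mask every record once and derive the "had sensitive data" flag from the result.
function processBatchOnce(records: string[], masker: Masker): ProcessedRecord[] {
  return records.map(record => {
    const result = masker(record);
    const patternsFound = Object.keys(result.found);
    return {
      data: result.output,
      hadSensitiveData: patternsFound.length > 0,
      patternsFound
    };
  });
}

// Stand-in masker for demonstration only; pass the library's mask in real use.
const stubMasker: Masker = input => {
  const hits = input.match(/\d{3}-\d{2}-\d{4}/g) || [];
  const found: { [name: string]: number } = {};
  if (hits.length > 0) {
    found.ssn = hits.length;
  }
  return { output: input.replace(/\d{3}-\d{2}-\d{4}/g, '[masked]'), found };
};

console.log(processBatchOnce(['id 123-45-6789', 'plain text'], stubMasker));
```

Every record now has the same shape, which also removes the need for callers to check whether patternsFound exists.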

Performance Considerations

  • Optimized Regex Engine: Patterns are compiled and cached on first use
  • Single-Pass Processing: Efficient string traversal with minimal overhead
  • Memory Efficient: No unnecessary string copies or allocations
  • Pattern Filtering: Use onlyPatterns when you know which types to look for
  • Accuracy Optimization: Higher accuracy modes are faster due to more specific patterns

// Optimize for specific use cases
const emailsOnly = mask(text, { onlyPatterns: ['email'] }); // Faster
const highAccuracy = mask(text, { matchAccuracy: 'high' }); // Faster, fewer false positives
const comprehensive = mask(text, { matchAccuracy: 'low' }); // Slower, more thorough

Security Best Practices

  1. Always mask before logging: Ensure sensitive data is masked before writing to logs
  2. Use appropriate accuracy: Higher accuracy for production, lower for development/testing
  3. Store results securely: The matches array contains original sensitive values
  4. Regular updates: Keep the library updated for new pattern definitions
  5. Test your patterns: Verify masking works correctly with your specific data formats
  6. Environment-specific config: Use different settings for dev/staging/production
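
Point 3 deserves care: because the matches array holds the original values, persisting or logging a whole result object would undo the masking. One way to guard against that is to copy only the safe fields before a result leaves the function that produced it. The sketch below assumes the MaskResult shape documented in the API reference. The toSafeResult helper and the SafeMaskResult type are hypothetical.

```typescript
// Local copy of the documented result shape.
interface FullMaskResult {
  output: string;
  found: { [name: string]: number };
  matches: string[]; // original sensitive values
  masked: string[];
}

// Same shape without the original values.
interface SafeMaskResult {
  output: string;
  found: { [name: string]: number };
  masked: string[];
}

// Copy only the fields that are safe to store or log.
function toSafeResult(result: FullMaskResult): SafeMaskResult {
  return {
    output: result.output,
    found: { ...result.found },
    masked: [...result.masked]
  };
}

// Hand-built example value; not actual library output.
const exampleResult: FullMaskResult = {
  output: 'SSN: **3-45-67**',
  found: { ssn: 1 },
  matches: ['123-45-6789'],
  masked: ['**3-45-67**']
};

console.log(JSON.stringify(toSafeResult(exampleResult)));
// {"output":"SSN: **3-45-67**","found":{"ssn":1},"masked":["**3-45-67**"]}
```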

Development

Prerequisites

  • Node.js >= 18.12.0
  • Yarn or npm

Setup

git clone https://github.com/bgauryy/sensitive-data-mask.git
cd sensitive-data-mask
yarn install

Commands

yarn build          # Build the library
yarn dev           # Build in watch mode
yarn lint          # Run ESLint
yarn test          # Run tests
yarn typecheck     # Run TypeScript compiler checks

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Adding New Patterns

  1. Choose the appropriate category file in src/regexes/
  2. Add your pattern following the existing structure:

{
  name: 'myPattern',
  regex: /your-regex-here/gi,
  description: 'Description of what this detects',
  matchAccuracy: 'medium' // optional: 'high', 'medium', or 'low'
}

  3. Run tests to ensure no regressions
  4. Submit a PR with a clear description
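
Before running the full test suite, a new pattern can be checked against a handful of positive and negative samples. The sketch below shows one way to do that. The exampleToken pattern is a made-up format used only for illustration, and the PatternDefinition type and matchesPattern helper are hypothetical names that mirror the structure shown above. One detail worth knowing is that a regex with the g flag remembers lastIndex between test() calls, so the helper resets it before each check.

```typescript
interface PatternDefinition {
  name: string;
  regex: RegExp;
  description: string;
  matchAccuracy?: 'high' | 'medium' | 'low';
}

// Made-up token format for this illustration: "xt_" plus 8 alphanumerics.
const exampleToken: PatternDefinition = {
  name: 'exampleToken',
  regex: /\bxt_[a-z0-9]{8}\b/gi,
  description: 'Hypothetical token: "xt_" followed by 8 alphanumeric characters',
  matchAccuracy: 'high'
};

// Test one sample against a pattern definition.
function matchesPattern(def: PatternDefinition, sample: string): boolean {
  def.regex.lastIndex = 0; // global regexes keep state between test() calls
  return def.regex.test(sample);
}

const shouldMatch = ['token=xt_a1b2c3d4', 'XT_ABCDEFGH'];
const shouldNotMatch = ['xt_short', 'text_a1b2c3d4'];

for (const sample of shouldMatch) {
  console.log(sample, matchesPattern(exampleToken, sample)); // true
}
for (const sample of shouldNotMatch) {
  console.log(sample, matchesPattern(exampleToken, sample)); // false
}
```

Negative samples matter as much as positive ones here, since an overly loose regex shows up as false positives at the 'high' accuracy level.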

License

MIT © guybary

Security

If you discover a security vulnerability, please email [email protected] instead of using the issue tracker.


Made with ❤️ for developers who care about data security