contact-deduplication-lib

v0.4.0

A TypeScript library for contact deduplication

Contact Deduplication Library

A TypeScript library for identifying and merging duplicate contacts.

Features

  • Find duplicate contacts using various matching algorithms
  • Configurable matching thresholds and field weights
  • Support for merging duplicate contacts
  • Multiple matching strategies (email-based, phone-based, combined)
  • Fuzzy string matching for names and other text fields
  • Field mapping utilities for integrating with different contact schemas
  • Dual module format (CommonJS and ESM) for maximum compatibility
  • Webpack-friendly ES5 output for seamless integration
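
The README does not say which fuzzy matching algorithm the library uses for names. As an illustration only, here is a minimal sketch of one common choice, a normalized Levenshtein similarity. The helpers `levenshtein` and `nameSimilarity` are hypothetical and are not exports of this package.

```typescript
// Hypothetical helpers for illustration; not part of contact-deduplication-lib.

// Levenshtein edit distance, computed row by row to keep memory small.
function levenshtein(a: string, b: string): number {
  // prev[j] is the distance between the processed prefix of a and the first j chars of b.
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Similarity in [0, 1]; 1 means identical after trimming and lowercasing.
function nameSimilarity(a: string, b: string): number {
  const x = a.trim().toLowerCase();
  const y = b.trim().toLowerCase();
  const longest = Math.max(x.length, y.length);
  return longest === 0 ? 1 : 1 - levenshtein(x, y) / longest;
}
```

A score like this can then be compared against a threshold, so that "Jon" and "John" count as close while unrelated names do not.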

Installation

# Using npm
npm install contact-deduplication-lib

# Using yarn
yarn add contact-deduplication-lib

# Using pnpm
pnpm add contact-deduplication-lib

Usage

Basic Usage

import { ContactDeduplicator } from 'contact-deduplication-lib';

// Create a deduplicator with default options
const deduplicator = new ContactDeduplicator();

// Find duplicates in your contacts
// (`contacts` is a Contact[] that you have loaded elsewhere)
const result = deduplicator.findDuplicates(contacts);

// Access the results
console.log(`Found ${result.duplicateGroups.length} groups of duplicates`);
console.log(`${result.uniqueContacts.length} contacts have no duplicates`);

Auto-Merging Duplicates

// Create a deduplicator with auto-merge enabled
const deduplicator = new ContactDeduplicator({ autoMerge: true });

// Find and merge duplicates
const result = deduplicator.findDuplicates(contacts);

// Access the merged contacts (mergedContacts is only set when autoMerge is true)
console.log(`Created ${result.mergedContacts?.length ?? 0} merged contacts`);

Custom Matching Options

// Create a deduplicator with custom options
const deduplicator = new ContactDeduplicator({
  threshold: 0.8, // Higher threshold for stricter matching
  fieldsToCompare: ['firstName', 'lastName', 'email', 'phone'], // Only compare these fields
  fieldWeights: {
    email: 0.5, // Email matches are more important
    phone: 0.3,
    firstName: 0.1,
    lastName: 0.1,
  },
});
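
How the library combines `fieldWeights` and `threshold` internally is not documented here. The sketch below shows one plausible reading, a weighted average of per-field similarities compared against the threshold. `SimpleContact`, `fieldSimilarity` and `weightedScore` are hypothetical names, and the exact-match comparison is an assumption made to keep the example short.

```typescript
// Hypothetical scoring sketch; the library's real formula may differ.
type SimpleContact = {
  firstName?: string;
  lastName?: string;
  email?: string[];
  phone?: string[];
};
type Field = keyof SimpleContact;

// Assumption: strings must match exactly, arrays match if they share any value.
function fieldSimilarity(a: SimpleContact, b: SimpleContact, field: Field): number {
  const x = a[field];
  const y = b[field];
  if (x === undefined || y === undefined) return 0;
  if (Array.isArray(x) && Array.isArray(y)) {
    return x.some((value) => y.includes(value)) ? 1 : 0;
  }
  return x === y ? 1 : 0;
}

// Weighted average of the field similarities, in [0, 1].
function weightedScore(
  a: SimpleContact,
  b: SimpleContact,
  weights: Partial<Record<Field, number>>
): number {
  let total = 0;
  let weightSum = 0;
  for (const field of Object.keys(weights) as Field[]) {
    const weight = weights[field] ?? 0;
    total += weight * fieldSimilarity(a, b, field);
    weightSum += weight;
  }
  return weightSum === 0 ? 0 : total / weightSum;
}
```

Under this reading, two contacts that share an email and a last name but nothing else score about 0.6 with the weights above, which falls below a threshold of 0.8 and would not be reported as duplicates.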

Using Different Matchers

import { ContactDeduplicator, emailMatcher, phoneMatcher, strictMatcher, hybridMatcher } from 'contact-deduplication-lib';

// Create a deduplicator that prioritizes email matches
const emailDeduplicator = new ContactDeduplicator({}, emailMatcher);

// Create a deduplicator that prioritizes phone matches
const phoneDeduplicator = new ContactDeduplicator({}, phoneMatcher);

// Create a deduplicator that requires fields with high weights to be present and not empty
// This is useful when you want to prevent matching contacts with empty required fields
const strictDeduplicator = new ContactDeduplicator({
  fieldWeights: {
    email: 0.8, // Fields with weight > 0.5 are considered required
    phone: 0.7,
    firstName: 0.15,
    lastName: 0.15,
  }
}, strictMatcher);

// Create a deduplicator that uses a hybrid approach
// Uses strict matcher if a contact has both empty email and phone arrays
// Otherwise uses the combined matcher for normal comparison
const hybridDeduplicator = new ContactDeduplicator({
  fieldWeights: {
    email: 0.8,
    phone: 0.7,
    firstName: 0.15,
    lastName: 0.15,
  }
}, hybridMatcher);

Handling Empty Fields

The library provides different ways to handle contacts with missing or empty fields:

  1. Default behavior: If both contacts have an empty array for a field (e.g., email: []), that field is treated as a perfect match.

  2. Using strictMatcher: The strictMatcher treats fields with high weights (> 0.5) as required fields. If any required field is an empty array in either contact, they will not be matched, regardless of other field similarities.

// Example: Contacts with empty email arrays will not match even if names match perfectly
const strictDeduplicator = new ContactDeduplicator({
  fieldWeights: {
    email: 0.8, // Email is required (weight > 0.5)
    phone: 0.7, // Phone is required (weight > 0.5)
    firstName: 0.15,
    lastName: 0.15,
  }
}, strictMatcher);

  3. Using hybridMatcher: The hybridMatcher provides a balanced approach by using the strictMatcher only when a contact has both empty email and phone arrays. If at least one of these fields has values, it falls back to the regular combinedMatcher.

// Example: Using the hybrid approach
const hybridDeduplicator = new ContactDeduplicator({
  fieldWeights: {
    email: 0.8,
    phone: 0.7,
    firstName: 0.15,
    lastName: 0.15,
  }
}, hybridMatcher);

This approach prevents matching contacts that have no identifying information (no email or phone) while still allowing matches when at least one of these fields is present.
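
The selection rule described above can be sketched as a small wrapper around two matchers. This is an illustration, not the library's implementation: the `Matcher` signature (two contacts in, a score out) is an assumption, as is treating a missing field the same as an empty array and applying the rule when either contact lacks identifiers.

```typescript
// Hypothetical shapes; the real ContactMatcher signature is not shown in this README.
type IdentifiableContact = { email?: string[]; phone?: string[]; firstName?: string };
type Matcher = (a: IdentifiableContact, b: IdentifiableContact) => number;

// Assumption: an undefined field counts as empty, just like [].
const isEmpty = (values?: string[]): boolean => !values || values.length === 0;

// True when a contact has neither an email nor a phone number.
const lacksIdentifiers = (c: IdentifiableContact): boolean =>
  isEmpty(c.email) && isEmpty(c.phone);

// Use the strict matcher when either contact has no identifiers, else the combined one.
function makeHybridMatcher(strict: Matcher, combined: Matcher): Matcher {
  return (a, b) =>
    lacksIdentifiers(a) || lacksIdentifiers(b) ? strict(a, b) : combined(a, b);
}
```

Passing the two matchers in as arguments keeps the selection rule separate from the scoring, so each piece can be checked on its own.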

Field Mapping for Different Contact Schemas

The library includes utilities for mapping contacts from different schemas:

import { ContactDeduplicator, mapExternalContact, Contact } from 'contact-deduplication-lib';

// External contact with different field names
const externalContact = {
  'First Name': 'John',
  'Last Name': 'Smith',
  'Public Email': '[email protected]',
  'Personal Email': '[email protected]',
};

// Map to our internal schema
const mappedContact = mapExternalContact(externalContact, {
  firstName: 'First Name',
  lastName: 'Last Name',
  emails: ['Public Email', 'Personal Email'],
});

// Now you can use the mapped contact with the deduplicator
// (`existingContact` is a Contact you already have, defined elsewhere)
const deduplicator = new ContactDeduplicator();
const result = deduplicator.findDuplicates([existingContact, mappedContact]);

For more complex mapping scenarios:

// External contact with custom field names
const externalContact = {
  contactId: '123',
  contactFirstName: 'Jane',
  contactLastName: 'Doe',
  primaryEmail: '[email protected]',
  secondaryEmail: '[email protected]',
  workPhone: '555-123-4567',
  mobilePhone: '555-987-6543',
  organization: 'Acme Inc',
  position: 'Software Engineer',
  dateCreated: '2023-01-15T00:00:00Z',
  dateModified: '2023-02-20T00:00:00Z',
};

// Define custom field mapping
const fieldMapping = {
  id: 'contactId',
  firstName: 'contactFirstName',
  lastName: 'contactLastName',
  emails: ['primaryEmail', 'secondaryEmail'],
  phones: ['workPhone', 'mobilePhone'],
  company: 'organization',
  jobTitle: 'position',
  createdAt: 'dateCreated',
  updatedAt: 'dateModified',
};

// Map the contact
const mappedContact = mapExternalContact(externalContact, fieldMapping);

API Reference

ContactDeduplicator

The main class for finding and merging duplicate contacts.

Constructor

constructor(
  options?: Partial<DeduplicationOptions>,
  matcher?: ContactMatcher
)

Methods

  • findDuplicates(contacts: Contact[]): DeduplicationResult - Finds duplicate contacts
  • mergeDuplicateGroups(duplicateGroups: Contact[][]): Contact[] - Merges groups of duplicate contacts
  • setOptions(options: Partial<DeduplicationOptions>): void - Updates the deduplication options
  • setMatcher(matcher: ContactMatcher): void - Sets a new matcher function
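
The README does not describe how a group of duplicates is collapsed into one contact. Purely as a sketch of what a merge could look like, the hypothetical `mergeGroup` below keeps the oldest contact's id, takes the first non-empty name, and unions the email and phone arrays. None of these rules are confirmed by the library.

```typescript
// Hypothetical merge policy for illustration; not the library's documented behavior.
interface MergeableContact {
  id: string;
  firstName?: string;
  lastName?: string;
  email?: string[];
  phone?: string[];
  createdAt: Date;
  updatedAt: Date;
}

// Remove repeated values while keeping first-seen order.
const uniqueValues = (values: string[]): string[] => Array.from(new Set(values));

function mergeGroup(group: MergeableContact[]): MergeableContact {
  if (group.length === 0) throw new Error('Cannot merge an empty group');
  // Oldest contact first, so its id and creation date win.
  const sorted = group.slice().sort((a, b) => a.createdAt.getTime() - b.createdAt.getTime());
  const primary = sorted[0];
  const emails = ([] as string[]).concat(...sorted.map((c) => c.email ?? []));
  const phones = ([] as string[]).concat(...sorted.map((c) => c.phone ?? []));
  return {
    id: primary.id,
    // First non-empty value, scanning from the oldest contact.
    firstName: sorted.find((c) => c.firstName)?.firstName,
    lastName: sorted.find((c) => c.lastName)?.lastName,
    email: uniqueValues(emails),
    phone: uniqueValues(phones),
    createdAt: primary.createdAt,
    updatedAt: new Date(Math.max(...sorted.map((c) => c.updatedAt.getTime()))),
  };
}
```

If your data needs a different policy, for example preferring the most recently updated name, it is worth checking what mergeDuplicateGroups actually returns for a small sample before relying on autoMerge.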

Field Mapping

mapExternalContact

function mapExternalContact(
  externalContact: ExternalContact,
  fieldMapping: FieldMapping
): Contact

Maps an external contact with any schema to our internal Contact type.
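
To make the mapping idea concrete, here is a rough sketch of how such a function might work for a subset of the fields. It is not the library's source: `mapContactSketch`, `pickString` and `pickAll` are hypothetical, and skipping blank values and defaulting a missing id to an empty string are assumptions.

```typescript
// Hypothetical re-implementation of the mapping idea; not the library's code.
type ExternalRecord = Record<string, unknown>;

interface SimpleMapping {
  id?: string;
  firstName?: string;
  lastName?: string;
  emails?: string[];
  phones?: string[];
}

interface MappedContact {
  id: string;
  firstName?: string;
  lastName?: string;
  email: string[];
  phone: string[];
}

// Read one external field; assumption: non-strings and blank strings are ignored.
function pickString(source: ExternalRecord, key?: string): string | undefined {
  if (key === undefined) return undefined;
  const value = source[key];
  return typeof value === 'string' && value.trim() !== '' ? value : undefined;
}

// Collect several external fields into one array, dropping missing values.
function pickAll(source: ExternalRecord, keys: string[] = []): string[] {
  return keys
    .map((key) => pickString(source, key))
    .filter((value): value is string => value !== undefined);
}

function mapContactSketch(source: ExternalRecord, mapping: SimpleMapping): MappedContact {
  return {
    id: pickString(source, mapping.id) ?? '', // assumption: empty id when unmapped
    firstName: pickString(source, mapping.firstName),
    lastName: pickString(source, mapping.lastName),
    email: pickAll(source, mapping.emails),
    phone: pickAll(source, mapping.phones),
  };
}
```

Note that the mapping keys are plural (emails, phones) while the Contact fields are singular (email, phone), which the sketch mirrors.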

FieldMapping Interface

interface FieldMapping {
  id?: string;
  firstName?: string;
  lastName?: string;
  emails?: string[];
  phones?: string[];
  company?: string;
  jobTitle?: string;
  addressFields?: {
    street?: string;
    city?: string;
    state?: string;
    postalCode?: string;
    country?: string;
    type?: string;
  };
  createdAt?: string;
  updatedAt?: string;
}

Types

Contact

interface Contact {
  id: string;
  firstName?: string;
  lastName?: string;
  email?: string[];
  phone?: string[];
  address?: Address[];
  company?: string;
  jobTitle?: string;
  notes?: string;
  createdAt: Date;
  updatedAt: Date;
  [key: string]: any; // Additional properties
}

DeduplicationOptions

interface DeduplicationOptions {
  threshold: number; // 0.0 to 1.0
  fieldsToCompare: (keyof Contact)[];
  autoMerge: boolean;
  fieldWeights?: Record<keyof Contact, number>;
}

DeduplicationResult

interface DeduplicationResult {
  duplicateGroups: Contact[][]; // Groups of duplicate contacts
  uniqueContacts: Contact[]; // Contacts with no duplicates
  mergedContacts?: Contact[]; // Merged contacts (if autoMerge is true)
}
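
As a way to picture how pairwise matches could become `duplicateGroups` and `uniqueContacts`, the sketch below clusters items with a small union-find. Whether the library groups transitively (A matches B and B matches C puts all three together) is an assumption, and `groupDuplicates` is a hypothetical function, not part of the package.

```typescript
// Hypothetical grouping sketch; the library's actual grouping strategy is not documented here.
interface Identified {
  id: string;
}

function groupDuplicates<T extends Identified>(
  items: T[],
  isMatch: (a: T, b: T) => boolean
): { duplicateGroups: T[][]; uniqueContacts: T[] } {
  // parent[i] points towards the representative of item i's cluster.
  const parent = items.map((_, i) => i);
  const find = (i: number): number => {
    while (parent[i] !== i) {
      parent[i] = parent[parent[i]]; // path halving keeps lookups short
      i = parent[i];
    }
    return i;
  };

  // Compare every pair once and join the clusters of matching items.
  for (let i = 0; i < items.length; i++) {
    for (let j = i + 1; j < items.length; j++) {
      if (isMatch(items[i], items[j])) parent[find(i)] = find(j);
    }
  }

  // Collect items by cluster representative, preserving input order.
  const clusters = new Map<number, T[]>();
  items.forEach((item, i) => {
    const root = find(i);
    const cluster = clusters.get(root) ?? [];
    cluster.push(item);
    clusters.set(root, cluster);
  });

  const all = Array.from(clusters.values());
  return {
    duplicateGroups: all.filter((group) => group.length > 1),
    uniqueContacts: all.filter((group) => group.length === 1).map((group) => group[0]),
  };
}
```

The pairwise loop is quadratic in the number of contacts, which is fine for a sketch but worth keeping in mind for large contact lists.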

Development

Setup

# Clone the repository
git clone https://github.com/yourusername/contact-deduplication-lib.git
cd contact-deduplication-lib

# Install dependencies
pnpm install

# Build the library
pnpm build

# Run tests
pnpm test

Webpack Integration

If you're using webpack, the library should work out of the box with the following configuration:

// webpack.config.js
module.exports = {
  // ... your other webpack configuration
  resolve: {
    extensions: ['.ts', '.tsx', '.js', '.jsx'],
  },
  module: {
    rules: [
      {
        test: /\.tsx?$/,
        use: 'ts-loader',
        exclude: /node_modules/,
      },
      // If you still encounter issues, you can add this rule:
      {
        test: /\.m?js$/,
        include: /node_modules\/contact-deduplication-lib/,
        use: {
          loader: 'babel-loader',
          options: {
            presets: ['@babel/preset-env']
          }
        }
      }
    ],
  },
};

Scripts

  • pnpm build - Build the library
  • pnpm dev - Build with watch mode
  • pnpm test - Run tests
  • pnpm test:watch - Run tests in watch mode
  • pnpm lint - Type check without emitting files
  • pnpm clean - Remove build artifacts

License

ISC