pickasso

v0.0.3

Select diverse examples from JSON datasets

🎨 pickasso

Intelligent selection of diverse examples from JSON datasets

Features • Quick Start • Installation • Usage • Advanced • How It Works

🎯 Features

  • 🔄 Smart Selection: Intelligently selects diverse examples from your dataset
  • 📊 Completeness Aware: Optionally prioritizes more complete records
  • 🎛️ Configurable Distance: Custom distance functions for your specific needs
  • 💪 Efficient Processing: Smart sampling for large datasets
  • 🎮 CLI Support: Easy command-line interface for quick analysis
  • 📦 TypeScript Ready: Full TypeScript support with comprehensive types

🚀 Quick Start

# From npm
bunx pickasso data.json -n 5

# Using key path for nested data
bunx pickasso response.json -n 5 -k "data.items"

# Prioritize complete records
bunx pickasso users.json -n 10 -p

📦 Installation

As a CLI Tool

bunx pickasso <file> -n <number_of_examples>

As a Package

bun install pickasso

💻 Usage

Command Line Interface

bunx pickasso <input-file> [options]

Options:

  • -n, --num-examples <number>: Number of examples to select (required)
  • -s, --sample-size <number>: Size of random sample to consider
  • -p, --prioritize-complete: Consider record completeness in selection
  • -k, --key-path <path>: Path to array in nested JSON (e.g., 'data.items')
  • -o, --out-file <file>: Output file (defaults to stdout)
  • -w, --completeness-weight <number>: Balance between diversity and completeness (0-1, only used with -p)
    • 0: Pure diversity-based selection
    • 1: Pure completeness-based selection
    • 0.3 (default): Balanced selection favoring diversity

As a Module

import { selectDiverseExamples } from "pickasso";

const dataset = [
  { id: 1, name: "John", age: 25 },
  { id: 2, name: "Jane", age: 30 },
  // ... more objects
];

const diverseExamples = selectDiverseExamples(dataset, {
  numExamples: 5,
  prioritizeComplete: true,
  completenessWeight: 0.3,
});

🔧 Advanced Usage

Custom Distance Functions

Define how similarity is calculated between objects:

const customDistance = (a: any, b: any) => {
  // Custom logic to calculate distance
  // Returns a number between 0 and 1 (clamped, so age gaps over 100 still yield 1)
  return Math.min(1, Math.abs(a.age - b.age) / 100);
};

const selected = selectDiverseExamples(dataset, {
  numExamples: 5,
  distanceFunction: customDistance,
});
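
A distance function can also combine several fields. The sketch below assumes records shaped like the sample dataset shown earlier (`name`, `age`). The `Person` type, the `personDistance` name, and the 50/50 weighting are made up for this illustration and are not part of pickasso.

```typescript
// Hypothetical record shape, matching the sample dataset shown earlier.
type Person = { id: number; name?: string; age?: number };

// Illustrative distance: half from the name (equal or not), half from the age gap.
// The result always lies within [0, 1].
function personDistance(a: Person, b: Person): number {
  // Categorical part: 0 when both names exist and match, 1 otherwise.
  const nameDist = a.name !== undefined && a.name === b.name ? 0 : 1;

  // Numerical part: age gap scaled by an assumed span of 100 years, clamped to 1.
  // If either age is missing, the pair is treated as maximally different.
  const ageDist =
    a.age === undefined || b.age === undefined
      ? 1
      : Math.min(1, Math.abs(a.age - b.age) / 100);

  return 0.5 * nameDist + 0.5 * ageDist;
}

const john: Person = { id: 1, name: "John", age: 25 };
const jane: Person = { id: 2, name: "Jane", age: 30 };
console.log(personDistance(john, jane));
```

A function like this could then be passed as `distanceFunction` in the same way as `customDistance`.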

Handling Large Datasets

For large datasets, set sampleSize so that selection runs on a random subset instead of every record:

// For large datasets, use sample size to control processing
const selected = selectDiverseExamples(largeDataset, {
  numExamples: 10,
  sampleSize: 1000, // Consider 1000 random items
});
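
To make the idea of a random sample concrete, here is a minimal sketch of drawing a fixed-size sample without replacement. It is not pickasso's internal code; `randomSample` is a hypothetical helper, and a partial Fisher-Yates shuffle is simply one common way to do this.

```typescript
// Draw up to `sampleSize` distinct items uniformly at random
// using a partial Fisher-Yates shuffle.
// `randomSample` is a hypothetical helper, not a pickasso export.
function randomSample<T>(items: readonly T[], sampleSize: number): T[] {
  const pool = items.slice(); // copy so the caller's array is not mutated
  const k = Math.min(Math.max(0, Math.floor(sampleSize)), pool.length);
  for (let i = 0; i < k; i++) {
    // Pick a random index from the not-yet-chosen tail and swap it into position i.
    const j = i + Math.floor(Math.random() * (pool.length - i));
    const tmp = pool[i] as T;
    pool[i] = pool[j] as T;
    pool[j] = tmp;
  }
  return pool.slice(0, k);
}

const largeDataset = Array.from({ length: 100000 }, (_, id) => ({ id }));
const sample = randomSample(largeDataset, 1000);
console.log(sample.length); // 1000
```

Only the first `sampleSize` positions are shuffled, so the cost of the swaps depends on the sample size and not on the dataset size (the initial copy is still linear).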

Balancing Diversity and Completeness

When working with real-world data, you often want examples that are both diverse and well-populated. Pickasso lets you control this balance:

// Default behavior: Pure diversity-based selection
const diverse = selectDiverseExamples(dataset, {
  numExamples: 5,
});

// Prioritize complete records while maintaining diversity
const balancedSelection = selectDiverseExamples(dataset, {
  numExamples: 5,
  prioritizeComplete: true, // Enable completeness consideration
  completenessWeight: 0.3, // 30% completeness, 70% diversity
});

// Strongly favor complete records
const completeRecords = selectDiverseExamples(dataset, {
  numExamples: 5,
  prioritizeComplete: true,
  completenessWeight: 0.8, // 80% completeness, 20% diversity
});
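
One plausible reading of `completenessWeight` is a linear blend of a diversity score and a completeness score. The sketch below illustrates that reading only. It is an assumption, not pickasso's actual implementation: `completeness` and `blendedScore` are hypothetical names, and completeness is assumed to be the share of expected fields that are populated.

```typescript
type JsonRecord = Record<string, unknown>;

// Share of the expected keys holding a usable value (not null, undefined, or "").
function completeness(record: JsonRecord, expectedKeys: readonly string[]): number {
  if (expectedKeys.length === 0) return 1;
  const filled = expectedKeys.filter((key) => {
    const value = record[key];
    return value !== null && value !== undefined && value !== "";
  });
  return filled.length / expectedKeys.length;
}

// Linear blend: weight 0 is pure diversity, weight 1 is pure completeness.
function blendedScore(diversity: number, complete: number, weight: number): number {
  return (1 - weight) * diversity + weight * complete;
}

const keys = ["id", "name", "age", "email"];
const sparse = { id: 1, name: "John" };
const full = { id: 2, name: "Jane", age: 30, email: "jane@example.com" };

console.log(completeness(sparse, keys)); // 0.5
console.log(completeness(full, keys)); // 1
console.log(blendedScore(0.9, completeness(sparse, keys), 0.3));
```

Under this reading, a weight of 0.3 lets a very different but sparse record still outrank a complete but redundant one, while 0.8 mostly rewards populated fields.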

Working with Nested Data

Both the CLI and the API work with nested data. The CLI takes a key path to the array, while the API expects you to pass the nested array directly:

// CLI
pickasso complex.json -n 5 -k "response.data.items"

// API
const data = {
  response: {
    data: {
      items: [/* ... */]
    }
  }
};

const selected = selectDiverseExamples(data.response.data.items, {
  numExamples: 5
});
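
When the path is only known at runtime (for example, when mirroring the CLI's `-k` option in your own script), a small resolver can look up the array before it is handed to the selection function. This is a sketch under assumptions: `getByKeyPath` is a hypothetical helper, not a pickasso export, and it handles plain dot-separated paths only.

```typescript
// Walk a dot-separated key path (e.g. "response.data.items") through a parsed JSON value.
// Returns undefined as soon as a segment cannot be followed.
function getByKeyPath(root: unknown, keyPath: string): unknown {
  let current: unknown = root;
  for (const key of keyPath.split(".")) {
    if (current === null || typeof current !== "object") return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

const payload = {
  response: { data: { items: [{ id: 1 }, { id: 2 }, { id: 3 }] } },
};

const items = getByKeyPath(payload, "response.data.items");
if (!Array.isArray(items)) {
  throw new Error("Key path did not resolve to an array");
}
console.log(items.length); // 3
```

Checking with `Array.isArray` before selection gives a clear error when the path is mistyped, instead of a confusing failure later on.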

⚙️ How It Works

Pickasso uses a multi-step algorithm to select diverse examples:

  1. Initial Selection
    • Randomly samples from the dataset if needed
    • Optionally starts with the most complete item

  2. Iterative Selection
    • Calculates distances between candidates and selected items
    • Maximizes minimum distance to ensure diversity
    • Optionally weights completeness scores

  3. Distance Calculation
    • Flattens nested objects for comparison
    • Normalizes numerical differences
    • Handles missing values gracefully
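
The steps above can be sketched as a greedy max-min (farthest-point) selection over flattened records. This is an illustration of the general technique, not pickasso's source: `flatten`, `distance`, and `selectDiverse` are hypothetical names, the per-field rules are assumptions, and the sketch seeds with the first item for determinism instead of sampling or starting from the most complete record.

```typescript
type Flat = Record<string, unknown>;

// Flatten nested objects into dot-separated keys: { a: { b: 1 } } -> { "a.b": 1 }.
function flatten(value: unknown, prefix = "", out: Flat = {}): Flat {
  if (value !== null && typeof value === "object") {
    for (const [key, child] of Object.entries(value)) {
      flatten(child, prefix ? `${prefix}.${key}` : key, out);
    }
  } else if (prefix) {
    out[prefix] = value;
  }
  return out;
}

// Mean per-field difference in [0, 1]. Assumed rules: numbers compare by
// relative gap (clamped to 1); anything else, including a field missing on
// one side, is 0 when strictly equal and 1 otherwise.
function distance(a: unknown, b: unknown): number {
  const fa = flatten(a);
  const fb = flatten(b);
  const keys = Array.from(new Set([...Object.keys(fa), ...Object.keys(fb)]));
  if (keys.length === 0) return 0;
  let total = 0;
  for (const key of keys) {
    const x = fa[key];
    const y = fb[key];
    if (typeof x === "number" && typeof y === "number") {
      const scale = Math.max(Math.abs(x), Math.abs(y));
      total += scale === 0 ? 0 : Math.min(1, Math.abs(x - y) / scale);
    } else {
      total += x === y ? 0 : 1;
    }
  }
  return total / keys.length;
}

// Greedy max-min selection: repeatedly add the candidate whose nearest
// already-selected item is farthest away.
function selectDiverse<T>(items: readonly T[], numExamples: number): T[] {
  if (items.length === 0 || numExamples <= 0) return [];
  const selected: T[] = [items[0] as T];
  const remaining = items.slice(1);
  while (selected.length < numExamples && remaining.length > 0) {
    let bestIndex = 0;
    let bestScore = -1;
    remaining.forEach((candidate, index) => {
      const minDist = Math.min(...selected.map((s) => distance(candidate, s)));
      if (minDist > bestScore) {
        bestScore = minDist;
        bestIndex = index;
      }
    });
    selected.push(...remaining.splice(bestIndex, 1));
  }
  return selected;
}

const people = [
  { id: 1, name: "John", age: 25 },
  { id: 2, name: "John", age: 26 },
  { id: 3, name: "Jane", age: 70, address: { city: "Oslo" } },
];
console.log(selectDiverse(people, 2).map((p) => p.id)); // [1, 3]
```

Each round compares every remaining candidate with every selected item, which is why limiting the candidate pool with a sample size matters on large inputs.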

🛠️ Requirements

  • Node.js 14 or later
  • TypeScript 4.5+ (for development)

Contributing

Contributions are welcome! Check out our contribution guidelines for details.

Created by Hrishi Olickel • Support Pickasso by starring our GitHub repository