@openpets/internet-archive

v1.0.0

Access the Internet Archive's vast digital library including the Wayback Machine, millions of books, audio recordings, videos, and more. Search items, retrieve metadata, check URL snapshots, download files, and list historical captures.

Internet Archive Plugin for OpenPets

Access the Internet Archive's vast digital library including the Wayback Machine, millions of books, audio recordings, videos, and more. This plugin enables searching, metadata retrieval, Wayback Machine queries, and file downloads from archive.org.

Features

  • Search - Query archive.org's catalog with advanced Lucene syntax
  • Metadata - Retrieve detailed item information, files, and collections
  • Wayback Machine - Check if URLs are archived and retrieve snapshots
  • CDX API - List historical captures with timestamps and status codes
  • Downloads - Generate direct download links for item files

Installation

# Install the pet
pets add internet-archive

# Or for development
pets new internet-archive
cd pets/internet-archive
bun install

Configuration

Get Internet Archive Credentials

  1. Create an account at https://archive.org/account/signup
  2. Get your S3-like API keys at https://archive.org/account/s3.php
  3. Set the environment variables:
# Copy and edit the configuration
cp .env.example .env

# Edit .env with your credentials:
IA_ACCESS_KEY=your_access_key
IA_SECRET_KEY=your_secret_key
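
For readers who want to call archive.org directly with these keys, the sketch below reads the two variables and builds an `Authorization: LOW access:secret` header, which is the scheme archive.org documents for its S3-like API. Whether this plugin builds the header the same way is an assumption, and `build_auth_header` is a hypothetical helper name, not part of the plugin.

```python
import os
from typing import Mapping, Optional


def build_auth_header(env: Mapping[str, str] = os.environ) -> Optional[dict]:
    """Return an Authorization header dict, or None if either key is missing.

    Assumption: the keys are sent with archive.org's "LOW access:secret" scheme.
    """
    access = env.get("IA_ACCESS_KEY", "").strip()
    secret = env.get("IA_SECRET_KEY", "").strip()
    if not access or not secret:
        return None  # fall back to anonymous requests
    return {"Authorization": f"LOW {access}:{secret}"}
```

Returning `None` instead of raising keeps anonymous use possible for the features that do not need credentials.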

Required Permissions

Most features work with any account. For write operations (metadata updates), ensure your account has proper permissions.

Available Tools

1. Test Connection

Verify your credentials are working:

opencode run "test internet-archive connection"

2. Search Items

Search archive.org's catalog:

# Basic search
opencode run "search archive.org for public domain astronomy books"

# With filters
opencode run "search archive.org for 'mediatype:audio AND year:2020'"

# Specific collection
opencode run "search archive.org for items in collection NASA"

Search Query Syntax:

The plugin supports Internet Archive's Lucene-style query syntax:

  • title:shakespeare - Search by title
  • mediatype:audio - Filter by media type (texts, audio, video, software, image)
  • year:2020 - Filter by year
  • subject:history - Filter by subject
  • collection:NASA - Search within a collection
  • creator:"Mark Twain" - Search by creator
  • language:eng - Filter by language code
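
As an illustration of how such a query could reach archive.org, the sketch below combines field filters with `AND` and builds a URL for the public `advancedsearch.php` endpoint. That the plugin uses this endpoint is an assumption; `and_terms` and `build_search_url` are hypothetical helper names.

```python
from urllib.parse import urlencode


def and_terms(**fields: str) -> str:
    """Join field:value pairs with AND, quoting values that contain spaces."""
    parts = []
    for name, value in fields.items():
        if " " in value:
            value = f'"{value}"'
        parts.append(f"{name}:{value}")
    return " AND ".join(parts)


def build_search_url(query: str, rows: int = 10, page: int = 1,
                     fields=("identifier", "title", "mediatype")) -> str:
    """Build an advancedsearch.php URL that returns JSON."""
    params = [("q", query)]
    params += [("fl[]", field) for field in fields]  # one fl[] entry per returned field
    params += [("rows", rows), ("page", page), ("output", "json")]
    return "https://archive.org/advancedsearch.php?" + urlencode(params)


query = and_terms(mediatype="audio", year="2020")
url = build_search_url(query, rows=5)
```

Keeping `rows` small limits the size of each response, which also helps with rate limits.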

3. Get Metadata

Retrieve detailed item information:

opencode run "get metadata for item goodytwoshoes00newyiala"
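
For orientation, archive.org serves item metadata as JSON at `https://archive.org/metadata/<identifier>`, with top-level keys such as `metadata` and `files`. The sketch below builds that URL and condenses a response; that the plugin uses this endpoint is an assumption, the helper names are hypothetical, and the sample values in the usage lines are made up.

```python
from urllib.parse import quote


def metadata_url(identifier: str) -> str:
    """Build the metadata API URL for an item identifier."""
    return "https://archive.org/metadata/" + quote(identifier, safe="")


def summarize(item: dict) -> dict:
    """Pick a few fields out of a metadata response, tolerating missing keys."""
    meta = item.get("metadata", {})
    files = item.get("files", [])
    return {
        "title": meta.get("title"),
        "mediatype": meta.get("mediatype"),
        "file_count": len(files),
    }


# Made-up response fragment, for illustration only.
sample = {"metadata": {"title": "Example", "mediatype": "texts"},
          "files": [{"name": "example.pdf"}]}
summary = summarize(sample)
```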

4. Check URL (Wayback Machine)

Check if a URL is archived:

# Check current availability
opencode run "check if https://example.com is archived"

# Check specific date
opencode run "check if https://example.com is archived with timestamp 2020"
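
A check like this can be expressed against the Wayback Machine availability API (`https://archive.org/wayback/available`), which returns the snapshot closest to an optional timestamp. The sketch below assumes that endpoint is what the plugin calls; the helper names are hypothetical.

```python
from typing import Optional
from urllib.parse import urlencode


def availability_url(url: str, timestamp: Optional[str] = None) -> str:
    """Build an availability query; timestamp is a YYYYMMDDhhmmss prefix such as "2020"."""
    params = {"url": url}
    if timestamp:
        params["timestamp"] = timestamp
    return "https://archive.org/wayback/available?" + urlencode(params)


def closest_snapshot(response: dict) -> Optional[str]:
    """Return the archived URL of the closest snapshot, or None if there is none."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None
```

An empty `archived_snapshots` object means the URL has no capture, so `closest_snapshot` returns `None` in that case.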

5. List Snapshots (CDX)

Get all historical captures:

# List all snapshots
opencode run "list snapshots of https://google.com"

# With date range
opencode run "list snapshots of https://google.com from 2020 to 2021"

# With filters
opencode run "list snapshots of https://example.com with filter statuscode:200"
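
The date range and filter in these prompts map naturally onto the `from`, `to`, and `filter` parameters of the Wayback CDX server. The sketch below builds such a query and turns the JSON output, whose first row is a header, into dictionaries. It assumes the plugin talks to `web.archive.org/cdx/search/cdx`; the helper names are hypothetical.

```python
from urllib.parse import urlencode


def cdx_url(url, from_ts=None, to_ts=None, filters=(), limit=None):
    """Build a CDX query URL that asks for JSON output."""
    params = [("url", url), ("output", "json")]
    if from_ts:
        params.append(("from", from_ts))
    if to_ts:
        params.append(("to", to_ts))
    for expr in filters:  # e.g. "statuscode:200"
        params.append(("filter", expr))
    if limit:
        params.append(("limit", limit))
    return "https://web.archive.org/cdx/search/cdx?" + urlencode(params)


def parse_cdx_rows(rows):
    """Convert CDX JSON (header row followed by captures) into a list of dicts."""
    if not rows:
        return []
    header, *captures = rows
    return [dict(zip(header, row)) for row in captures]
```

Passing a `limit` keeps responses small for heavily archived sites.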

6. Download Files

Get download links for item files:

# List all downloadable files
opencode run "download files from item nasa-image-library"

# Filter by format
opencode run "download PDF files from item classic-literature-collection"
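
Direct links on archive.org follow the pattern `https://archive.org/download/<identifier>/<filename>`. The sketch below builds links from the `files` list of a metadata response and filters by file extension. Filtering by extension rather than by the metadata `format` field is a simplification chosen for this example, and `download_links` is a hypothetical helper name.

```python
from urllib.parse import quote


def download_links(identifier, files, extension=None):
    """Return download URLs for an item's files, optionally filtered by extension."""
    base = "https://archive.org/download/" + quote(identifier, safe="")
    links = []
    for entry in files:
        name = entry.get("name", "")
        if not name:
            continue  # skip malformed entries
        if extension and not name.lower().endswith("." + extension.lower()):
            continue
        # quote() keeps "/" so files in subdirectories stay addressable
        links.append(f"{base}/{quote(name)}")
    return links
```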

Example Workflows

Research Workflow

# 1. Search for vintage computing materials
opencode run "search archive.org for 'vintage computing manuals'"

# 2. Get detailed metadata for the most relevant item
opencode run "get metadata for item softwarelibrary"

# 3. Download the PDF version
opencode run "download PDF files from item softwarelibrary"

Wayback Research

# 1. Check if a site is archived
opencode run "check if https://old-website.com is archived"

# 2. Get snapshots from a specific year
opencode run "list snapshots of https://old-website.com from 2015 to 2016"

# 3. Access a specific snapshot (returns archived URL)
opencode run "check if https://old-website.com is archived with timestamp 20150601"
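
The archived URL from step 3 usually has the form `https://web.archive.org/web/<timestamp>/<original-url>`, where the timestamp is a prefix of `YYYYMMDDhhmmss`. A minimal sketch of building one by hand, with `snapshot_url` as a hypothetical helper name:

```python
def snapshot_url(timestamp: str, url: str) -> str:
    """Build a Wayback Machine URL for a capture near the given timestamp."""
    if not (timestamp.isdigit() and 4 <= len(timestamp) <= 14):
        raise ValueError("timestamp must be 4 to 14 digits, e.g. 20150601")
    return f"https://web.archive.org/web/{timestamp}/{url}"
```

The Wayback Machine redirects such a URL to the capture closest to the requested time.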

Environment Variables

| Variable | Required | Description |
|----------|----------|-------------|
| IA_ACCESS_KEY | Yes* | S3-like access key from archive.org |
| IA_SECRET_KEY | Yes* | S3-like secret key from archive.org |

*Most search and Wayback features work without credentials, but authentication is required for full metadata access and all write operations.

Rate Limiting

The Internet Archive enforces rate limits. The plugin automatically sends a descriptive User-Agent header to identify its requests. Please be respectful:

  • Add delays between bulk operations
  • Honor 429 Too Many Requests responses
  • Cache responses when possible
  • Use filters to limit result sizes
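
For scripts that call archive.org in bulk, the first two points can be sketched as a retry loop that waits after each 429 response. The code below is a generic pattern, not the plugin's implementation: `fetch` is a hypothetical callable returning `(status, headers, body)`, and honoring a `Retry-After` header applies only if the server sends one.

```python
import time


def fetch_with_backoff(fetch, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it stops returning 429 or attempts run out.

    Waits for Retry-After seconds when that header is numeric, otherwise
    doubles the delay on each attempt (base_delay, 2x, 4x, ...).
    """
    for attempt in range(max_attempts):
        status, headers, body = fetch()
        if status != 429:
            return status, body
        if attempt == max_attempts - 1:
            break  # out of attempts, report the last 429
        retry_after = headers.get("Retry-After", "")
        if retry_after.isdigit():
            delay = float(retry_after)
        else:
            delay = base_delay * 2 ** attempt
        sleep(delay)
    return status, body
```

Injecting `sleep` makes the function easy to test without real waiting.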

API Documentation

Bot Guidelines

This plugin follows Internet Archive's bot and LLM guidelines:

  • Includes descriptive User-Agent headers
  • Respects rate limits
  • Identifies itself as an automated tool
  • Uses authentication when available

Troubleshooting

Authentication Errors

# Test your credentials
opencode run "test internet-archive connection"

# Check your keys at:
# https://archive.org/account/s3.php

Item Not Found

Items use specific identifiers. To find an identifier:

  1. Visit the item on archive.org
  2. Look at the URL: https://archive.org/details/[IDENTIFIER]
  3. Use that identifier in your queries
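
The same lookup can be done programmatically. The sketch below pulls the identifier out of an archive.org URL; accepting `/download/` and `/metadata/` paths in addition to `/details/` is an extension made for this example, and `identifier_from_url` is a hypothetical helper name.

```python
from typing import Optional
from urllib.parse import urlparse


def identifier_from_url(url: str) -> Optional[str]:
    """Return the item identifier from an archive.org URL, or None if absent."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) >= 2 and parts[0] in ("details", "download", "metadata"):
        return parts[1]
    return None
```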

Rate Limited

If you receive 429 errors:

  • Wait a few minutes before retrying
  • Reduce result sizes with the rows parameter
  • Add more specific filters to your queries

Contributing

Contributions welcome! Please see the main OpenPets repository for contribution guidelines.

License

AGPL-3.0 - See LICENSE file for details.