unstructured-client

v0.29.1

Downloads

210,479

Readme

This is an HTTP client for the Unstructured Platform API. You can sign up here and process 1000 free pages per day for 14 days.

Please refer to our documentation for a full guide on integrating the Partition Endpoint into your JavaScript/TypeScript code. Support for the Workflow Endpoint is coming soon.

SDK Installation

NPM

npm install unstructured-client --include=dev

Yarn

yarn add unstructured-client --dev

Model Context Protocol (MCP) Server

This SDK is also an installable MCP server where the various SDK methods are exposed as tools that can be invoked by AI applications.

Node.js v20 or greater is required to run the MCP server.

Add the following server definition to your claude_desktop_config.json file:

{
  "mcpServers": {
    "Unstructured": {
      "command": "npx",
      "args": [
        "-y", "--package", "unstructured-client",
        "--",
        "mcp", "start"
      ]
    }
  }
}

Go to Cursor Settings > Features > MCP Servers > Add new MCP server and use the following settings:

  • Name: Unstructured
  • Type: command
  • Command:
npx -y --package unstructured-client -- mcp start

For a full list of server arguments, run:

npx -y --package unstructured-client -- mcp start --help

SDK Example Usage

Example

import { UnstructuredClient } from "unstructured-client";
import { PartitionResponse } from "unstructured-client/sdk/models/operations";
import { Strategy } from "unstructured-client/sdk/models/shared";
import * as fs from "fs";

const unstructuredClient = new UnstructuredClient({
    security: {
        apiKeyAuth: "YOUR_API_KEY",
    },
});

const filename = "./sample-file";
const data = fs.readFileSync(filename);

unstructuredClient.general.partition({
    partitionParameters: {
        files: {
            content: data,
            fileName: filename,
        },
        strategy: Strategy.Auto,
    }
}).then((res: PartitionResponse) => {
    if (res.statusCode === 200) {
        console.log(res.elements);
    }
}).catch((e) => {
    console.log(e.statusCode);
    console.log(e.body);
});

Refer to the API parameters page for all available parameters.

Change the base URL

If you are self-hosting the API or developing locally, you can change the server URL when setting up the client.

const client = new UnstructuredClient({
    serverURL: "http://localhost:8000",
    security: {
        apiKeyAuth: key,
    },
});

// OR

const client = new UnstructuredClient({
    serverURL: "https://my-server-url",
    security: {
        apiKeyAuth: key,
    },
});

Custom HTTP Client

The TypeScript SDK makes API calls using an HTTPClient that wraps the native Fetch API. This client is a thin wrapper around fetch that lets you attach hooks around the request lifecycle to modify requests or to handle errors and responses.

The HTTPClient constructor takes an optional fetcher argument that can be used to integrate a third-party HTTP client or when writing tests to mock out the HTTP client and feed in fixtures.

The following example shows how to use the "beforeRequest" hook to add a custom header and a timeout to requests, and how to use the "requestError" hook to log errors:

import { UnstructuredClient } from "unstructured-client";
import { HTTPClient } from "unstructured-client/lib/http";

const httpClient = new HTTPClient({
  // fetcher takes a function that has the same signature as native `fetch`.
  fetcher: (request) => {
    return fetch(request);
  }
});

httpClient.addHook("beforeRequest", (request) => {
  const nextRequest = new Request(request, {
    signal: request.signal || AbortSignal.timeout(5000)
  });

  nextRequest.headers.set("x-custom-header", "custom value");

  return nextRequest;
});

httpClient.addHook("requestError", (error, request) => {
  console.group("Request Error");
  console.log("Reason:", `${error}`);
  console.log("Endpoint:", `${request.method} ${request.url}`);
  console.groupEnd();
});

const sdk = new UnstructuredClient({ httpClient: httpClient });
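
The fetcher option described above can also be used to serve fixtures in tests. The following is a minimal sketch rather than part of the SDK: mockFetcher, fixtureElements, and seenRequests are hypothetical names, the URL in the usage note is a placeholder, and the code assumes a runtime with global Request and Response (for example Node.js v18 or later).

```typescript
// Canned response body used in place of a real API call (made-up fixture data).
const fixtureElements = [
  { type: "Title", text: "Hello" },
  { type: "NarrativeText", text: "World" },
];

// Every request the mock receives is recorded here so a test can inspect it.
const seenRequests: string[] = [];

// Same signature as native `fetch`, so it can stand in wherever `fetch` is expected.
const mockFetcher: typeof fetch = async (input, init) => {
  const request = new Request(input, init);
  seenRequests.push(`${request.method} ${request.url}`);

  // Always answer with the fixture instead of touching the network.
  return new Response(JSON.stringify(fixtureElements), {
    status: 200,
    headers: { "content-type": "application/json" },
  });
};

// Assumed wiring, mirroring the constructor usage shown earlier:
//   const httpClient = new HTTPClient({ fetcher: mockFetcher });
//   const sdk = new UnstructuredClient({ httpClient });
```

Calling mockFetcher("http://localhost:8000/example", { method: "POST" }) resolves with the fixture and records the request, so a test can assert both on what was sent and on how the calling code handles the response.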

Retries

Some of the endpoints in this SDK support retries. If you use the SDK without any configuration, it will fall back to the default retry strategy provided by the API. However, the default retry strategy can be overridden on a per-operation basis, or across the entire SDK.

To change the default retry strategy for a single API call, pass a retry configuration in the call's options (the retries field in the example below):

import { openAsBlob } from "node:fs";
import { UnstructuredClient } from "unstructured-client";
import {
  Strategy,
  VLMModelProvider,
} from "unstructured-client/sdk/models/shared";

const unstructuredClient = new UnstructuredClient();

async function run() {
  const result = await unstructuredClient.general.partition({
    partitionParameters: {
      chunkingStrategy: "by_title",
      files: await openAsBlob("example.file"),
      splitPdfPageRange: [
        1,
        10,
      ],
      strategy: Strategy.Auto,
      vlmModel: "gpt-4o",
      vlmModelProvider: VLMModelProvider.Openai,
    },
  }, {
    retries: {
      strategy: "backoff",
      backoff: {
        initialInterval: 1,
        maxInterval: 50,
        exponent: 1.1,
        maxElapsedTime: 100,
      },
      retryConnectionErrors: false,
    },
  });

  console.log(result);
}

run();

If you'd like to override the default retry strategy for all operations that support retries, you can provide a retryConfig at SDK initialization:

import { openAsBlob } from "node:fs";
import { UnstructuredClient } from "unstructured-client";
import {
  Strategy,
  VLMModelProvider,
} from "unstructured-client/sdk/models/shared";

const unstructuredClient = new UnstructuredClient({
  retryConfig: {
    strategy: "backoff",
    backoff: {
      initialInterval: 1,
      maxInterval: 50,
      exponent: 1.1,
      maxElapsedTime: 100,
    },
    retryConnectionErrors: false,
  },
});

async function run() {
  const result = await unstructuredClient.general.partition({
    partitionParameters: {
      chunkingStrategy: "by_title",
      files: await openAsBlob("example.file"),
      splitPdfPageRange: [
        1,
        10,
      ],
      strategy: Strategy.Auto,
      vlmModel: "gpt-4o",
      vlmModelProvider: VLMModelProvider.Openai,
    },
  });

  console.log(result);
}

run();
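
To get a feel for how the backoff fields interact, the sketch below computes the sequence of waits that a plain exponential backoff would produce. It is an illustration only: backoffSchedule is a hypothetical helper, not an SDK function, it assumes all values are durations in the same unit (milliseconds here), and it ignores jitter and any other details of the SDK's actual retry logic.

```typescript
interface BackoffConfig {
  initialInterval: number; // delay before the first retry
  maxInterval: number;     // upper bound for any single delay
  exponent: number;        // growth factor applied on each attempt
  maxElapsedTime: number;  // total waiting budget across all retries
}

// Returns the delays a simple exponential backoff would wait, stopping before
// the cumulative wait would exceed maxElapsedTime (or after maxAttempts delays).
function backoffSchedule(cfg: BackoffConfig, maxAttempts = 100): number[] {
  const delays: number[] = [];
  let elapsed = 0;

  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const uncapped = cfg.initialInterval * Math.pow(cfg.exponent, attempt);
    const delay = Math.min(uncapped, cfg.maxInterval);

    if (elapsed + delay > cfg.maxElapsedTime) break;
    delays.push(delay);
    elapsed += delay;
  }

  return delays;
}

// Example with round numbers: delays double until they hit the cap.
const exampleDelays = backoffSchedule({
  initialInterval: 100,
  maxInterval: 1000,
  exponent: 2,
  maxElapsedTime: 3000,
});
console.log(exampleDelays.join(", ")); // 100, 200, 400, 800, 1000
```

In this example the fifth delay is capped at maxInterval, and a sixth retry is not scheduled because it would push the total wait past maxElapsedTime.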

Splitting PDF by pages

See page splitting for more details.

In order to speed up processing of large PDF files, the client splits up PDFs into smaller files, sends these to the API concurrently, and recombines the results. splitPdfPage can be set to false to disable this.

The number of parallel requests is controlled by the splitPdfConcurrencyLevel parameter. It defaults to 5 and cannot exceed 15, to avoid excessive resource usage and costs. The size of each batch is determined internally and can vary between 2 and 20 pages per split.

client.general.partition({
    partitionParameters: {
        files: {
            content: data,
            fileName: filename,
        },
        // Set splitPdfPage parameter to false in order to disable splitting PDF
        splitPdfPage: true,
        // Modify splitPdfConcurrencyLevel to change the limit of parallel requests
        splitPdfConcurrencyLevel: 10,
    },
});
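
The split-and-recombine behavior described above can be pictured with a small standalone sketch. This is not the client's internal implementation: splitPageRange and mapWithConcurrency are hypothetical helpers, the fixed batch size is an assumption (the client chooses batch sizes internally), and the worker stands in for a real API request.

```typescript
// Split pages first..last into consecutive [start, end] ranges of at most pagesPerSplit pages.
function splitPageRange(first: number, last: number, pagesPerSplit: number): Array<[number, number]> {
  if (pagesPerSplit < 1) throw new RangeError("pagesPerSplit must be at least 1");
  const ranges: Array<[number, number]> = [];
  for (let start = first; start <= last; start += pagesPerSplit) {
    ranges.push([start, Math.min(start + pagesPerSplit - 1, last)]);
  }
  return ranges;
}

// Run worker over items with at most `limit` calls in flight at once.
// Results are returned in input order so they can be recombined directly.
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;

  // Each lane repeatedly claims the next unprocessed index until none are left.
  async function lane(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index]);
    }
  }

  const laneCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: laneCount }, () => lane()));
  return results;
}

// A 45-page document in batches of up to 20 pages, processed 5 at a time.
const pageRanges = splitPageRange(1, 45, 20); // [[1, 20], [21, 40], [41, 45]]
mapWithConcurrency(pageRanges, 5, async ([start, end]) => `pages ${start}-${end}`)
  .then((parts) => console.log(parts.join("; ")));
```

Keeping results in input order means the per-batch outputs can be concatenated back into a single result regardless of which request finishes first.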

Requirements

For supported JavaScript runtimes, please consult RUNTIMES.md.

Standalone functions

All the methods listed above are available as standalone functions. These functions are ideal for use in applications running in the browser, serverless runtimes or other environments where application bundle size is a primary concern. When using a bundler to build your application, all unused functionality will be either excluded from the final bundle or tree-shaken away.

To read more about standalone functions, check FUNCTIONS.md.

File uploads

Certain SDK methods accept files as part of a multi-part request. It is possible and typically recommended to upload files as a stream rather than reading the entire contents into memory. This avoids excessive memory consumption and potentially crashing with out-of-memory errors when working with very large files. The following example demonstrates how to attach a file stream to a request.

Tip: Depending on your JavaScript runtime, there are convenient utilities that return a handle to a file without reading the entire contents into memory:

  • Node.js v20+: Since v20, Node.js comes with a native openAsBlob function in node:fs.
  • Bun: The native Bun.file function produces a file handle that can be used for streaming file uploads.
  • Browsers: All supported browsers return an instance to a File when reading the value from an <input type="file"> element.
  • Node.js v18: A file stream can be created using the fileFrom helper from fetch-blob/from.js.

import { openAsBlob } from "node:fs";
import { UnstructuredClient } from "unstructured-client";
import {
  Strategy,
  VLMModelProvider,
} from "unstructured-client/sdk/models/shared";

const unstructuredClient = new UnstructuredClient();

async function run() {
  const result = await unstructuredClient.general.partition({
    partitionParameters: {
      chunkingStrategy: "by_title",
      files: await openAsBlob("example.file"),
      splitPdfPageRange: [
        1,
        10,
      ],
      strategy: Strategy.Auto,
      vlmModel: "gpt-4o",
      vlmModelProvider: VLMModelProvider.Openai,
    },
  });

  console.log(result);
}

run();

Debugging

You can set up your SDK to emit debug logs for SDK requests and responses.

You can pass a logger that matches console's interface as an SDK option.

Warning: Debug logging will reveal secrets, such as API tokens in headers, in log messages printed to a console or written to files. Use this feature only during local development, not in production.

import { UnstructuredClient } from "unstructured-client";

const sdk = new UnstructuredClient({ debugLogger: console });
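
Because any object that matches console's interface can be passed, one option is a wrapper that masks secrets before they reach the log. The sketch below is an assumption-heavy illustration: the DebugLogger shape (group, groupEnd, log) is a guess at the subset of console the SDK uses, createRedactingLogger is a hypothetical helper, and the header names and "name: value" log format in the pattern are assumptions to be adjusted to what your logs actually contain.

```typescript
// Assumed subset of the console interface used for debug output.
interface DebugLogger {
  group(label?: string): void;
  groupEnd(): void;
  log(message: unknown, ...args: unknown[]): void;
}

// Matches "header-name: value" or "header-name=value" up to the end of the line.
// The header names listed here are assumptions.
const SECRET_PATTERN = /\b(unstructured-api-key|authorization)(\s*[:=]\s*)[^\r\n]+/gi;

// Only string values are masked; objects are passed through unchanged.
function redact(value: unknown): unknown {
  return typeof value === "string"
    ? value.replace(SECRET_PATTERN, "$1$2[REDACTED]")
    : value;
}

// Wraps another logger and masks secret-looking values before forwarding.
function createRedactingLogger(sink: DebugLogger): DebugLogger {
  return {
    group: (label) => sink.group(label === undefined ? undefined : String(redact(label))),
    groupEnd: () => sink.groupEnd(),
    log: (message, ...args) => sink.log(redact(message), ...args.map(redact)),
  };
}

// Assumed wiring, mirroring the option shown above:
//   const sdk = new UnstructuredClient({ debugLogger: createRedactingLogger(console) });
```

This reduces the risk of leaking keys but does not remove it, since secrets embedded in non-string arguments or in unexpected formats would still pass through, so the warning above continues to apply.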

Maturity

This SDK is in beta, and there may be breaking changes between versions without a major version update. Therefore, we recommend pinning usage to a specific package version. This way, you can install the same version each time without breaking changes unless you are intentionally looking for the latest version.

Contributions

While we value open-source contributions to this SDK, this library is generated programmatically. Feel free to open a PR or a GitHub issue as a proof of concept and we'll do our best to include it in a future release!

SDK Created by Speakeasy