url-inspector-abhinavzspace

v2.3.3

url-inspector

Get metadata about any URL.

Limited memory and network usage.

This is a node.js module.

It returns and normalizes information found in the http headers or in the resource itself, using either exiftool (which knows almost everything about files except html) or a sax parser that reads oembed, opengraph, twitter cards, schema.org attributes and standard html tags.

Both tools stop inspecting as soon as they have gathered enough tags, or once a maximum number of bytes (depending on the media type) has been downloaded.

A demo using this module is available with url-inspector-daemon.

The inspection result is an object with the following properties:

  • url
    url of the inspected resource

  • title
    title of the resource, or filename, or last component of pathname with query

  • description
    optional longer description, without title in it

  • site
    the name of the site, or the domain name

  • mime
    RFC 7231 mime type of the resource (defaults to Content-Type)
    The inspected mime type could be more accurate than the http header.

  • ext
    the extension matching the mime type (not the file extension)

  • type
    what the resource represents
    image, video, audio, link, file, embed, archive

  • html
    a canonical html representation of the full resource,
    depending on the type and mime, could be img, a, video, audio, iframe tag.

  • size
    optional Content-Length; discarded when type is embed

  • icon
    optional link to the favicon of the site

  • width, height
    optional dimensions

  • duration
    optional

  • thumbnail
    optional; a URL to a thumbnail, which could be a data-uri for embedded images

  • source
    optional; a URL that can go in a 'src' attribute. For example, a resource can be an html page representing an image type, in which case the URL of the image itself is stored here. The same applies to audio, video and embed types.

  • error
    optional; an http error code, or a string

  • all
    an object with all additional metadata that was found
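
As a rough illustration of how the properties listed above might be consumed, the sketch below builds a one-line summary from a result object. The sample object and the summarize helper are hypothetical: the values are invented for the example and are not output from the module.

```javascript
// Hypothetical sample shaped like the properties listed above; values are invented.
const sample = {
  url: 'https://example.com/photos/cat.jpg',
  title: 'cat.jpg',
  site: 'example.com',
  mime: 'image/jpeg',
  ext: 'jpg',
  type: 'image',
  size: 48213,
  width: 800,
  height: 600
};

// Build a one-line summary, skipping optional properties that are absent.
function summarize(meta) {
  const parts = [meta.title + ' (' + meta.type + ', ' + meta.mime + ')'];
  if (meta.width && meta.height) parts.push(meta.width + 'x' + meta.height);
  if (meta.size) parts.push(meta.size + ' bytes');
  if (meta.duration) parts.push('duration ' + meta.duration);
  parts.push('from ' + meta.site);
  return parts.join(', ');
}

console.log(summarize(sample));
```

Since size, width, height and duration are optional, the helper checks each one before using it.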

Installation

npm install url-inspector

Add the -g switch to install the executable.

The exiftool executable must be available:

  • a package is available for debian/ubuntu (libimage-exiftool-perl) and for fedora (perl-Image-ExifTool).
  • Otherwise it can be installed from http://owl.phy.queensu.ca/~phil/exiftool/

API

var inspector = require('url-inspector');

var opts = {
	all: false, // return all available non-normalized metadata
	ua: "Mozilla/5.0", // some oembed providers might not answer otherwise
	nofavicon: false, // disable any favicon-related additional request
	nosource: false, // disable any sub-source inspection for audio, video, image types
	providers: [{ // an array of custom OEmbed providers, or path to a module exporting such an array
		provider_name: "Custom OEmbed provider",
		endpoints: [{
			schemes: ["http:\/\/video\.com\/*"],
			builder: function(urlObj, obj) {
				// can see current obj and override arbitrary props
				obj.embed = "custom embed url";
			}
		}]
	}],
	// new in version 2.3.0: allow inspecting file:// urls (false by default)
	file: true
};

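// url is a string holding the address to inspect
inspector(url, opts, function(err, obj) {
	if (err) return console.error(err);
	console.log(obj); // normalized metadata, see the properties listed above
});

// or simply, using the default options

inspector(url, function(err, obj) {
	// same callback as above
});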
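
The callback receives an err argument, and the result object has its own optional error property (an http error code, or a string). A minimal sketch of folding both into a single result shape follows. The toResult helper is hypothetical, and whether the module ever sets obj.error while err is empty is an assumption made here for defensiveness.

```javascript
// Hypothetical helper: normalize the two possible failure paths into one shape.
function toResult(err, obj) {
  // Failure reported through the callback's first argument.
  if (err) return { ok: false, reason: String(err.message || err) };
  // Failure reported through the optional "error" property of the result.
  if (obj && obj.error) return { ok: false, reason: String(obj.error) };
  return { ok: true, meta: obj };
}

console.log(toResult(new Error('boom')));
console.log(toResult(null, { error: 404 }));
console.log(toResult(null, { title: 'Example' }));
```

Inside an inspector callback, this could be called as toResult(err, obj) before deciding what to do with the metadata.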

Command-line client

inspector-url <url>
inspector-url <filepath>

Low resource usage

network:

  • at most several hundred kilobytes (depending on resource type) are downloaded, and usually much less, depending on connection speed.
  • inspection stops as soon as enough metadata is gathered
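
The two stopping rules above (enough metadata, or a byte budget) can be illustrated with a small sketch. This is not the module's implementation: scanWithBudget, its parameters, and the regex-based title extractor are all hypothetical and stand in for the real sax and exiftool pipelines.

```javascript
// Hypothetical sketch: consume chunks until enough metadata is found
// or the byte budget is spent, whichever comes first.
function scanWithBudget(chunks, maxBytes, extract, isEnough) {
  const found = {};
  let bytesRead = 0;
  for (const chunk of chunks) {
    bytesRead += chunk.length;
    extract(chunk, found);
    if (isEnough(found)) return { found, bytesRead, stoppedBy: 'enough' };
    if (bytesRead >= maxBytes) return { found, bytesRead, stoppedBy: 'budget' };
  }
  return { found, bytesRead, stoppedBy: 'end' };
}

// Invented input: three chunks of an html page.
const chunks = [
  Buffer.from('<html><head><title>Hello</title>'),
  Buffer.from('<meta name="description" content="x">'),
  Buffer.from('</head><body>lots of content</body></html>')
];

// Naive extractor for illustration only; it ignores tags split across chunks.
function extractTitle(chunk, found) {
  const m = /<title>([^<]*)<\/title>/.exec(chunk.toString('utf8'));
  if (m) found.title = m[1];
}

const result = scanWithBudget(chunks, 1024, extractTitle, (found) => 'title' in found);
console.log(result.stoppedBy, result.found.title, result.bytesRead);
```

In this example the title sits in the first chunk, so the remaining chunks are never read.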

memory:

  • html is inspected using a sax parser, without building a full DOM.

exiftool:

  • runs through the streat module, which keeps exiftool always open for performance

Since version 2.3.0, the file:// protocol is supported. It is enabled by default through the cli; through the api it requires setting the "file" flag to true (it is false by default).
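
A minimal sketch of preparing a local file for inspection through the api, assuming the module accepts a file:// URL string when the "file" flag is set. The toFileUrl helper and the photo.jpg path are hypothetical; only Node's built-in path and url modules are used.

```javascript
const path = require('path');
const { pathToFileURL } = require('url');

// Hypothetical helper: turn a local path into a file:// URL string.
function toFileUrl(filepath) {
  return pathToFileURL(path.resolve(filepath)).href;
}

const target = toFileUrl('photo.jpg');
const fileOpts = { file: true }; // the "file" flag is false by default through the api

console.log(target, fileOpts);
```

The resulting target and fileOpts would then be passed as the first two arguments of the inspector call.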

License

See LICENSE.

See also

https://github.com/kapouer/url-inspector-daemon

https://github.com/kapouer/node-streat