npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2024 – Pkg Stats / Ryan Hefner

new-link-preview

v3.1.0

Published

Javascript module to extract and fetch HTTP link information from blocks of text.

Downloads

235

Readme

Before creating an issue

It's more than likely there is nothing wrong with the library:

  • It's very simple; fetch HTML, parse HTML, and search for OpenGraph HTML tags.
  • Unless HTML or the OpenGraph standard change, the library will not break
  • If the target website you are trying to preview redirects you to a login page the preview will fail, because it will parse the login page
  • If the target website does not have OpenGraph tags the preview will most likely fail, there are some fallbacks but in general, it will not work
  • You cannot preview (fetch) another web page from YOUR web page. This is an intentional security feature of browsers called CORS

If you use this library and find it useful please consider sponsoring me, open source takes a lot of time and effort.

Link Preview

Allows you to extract information from an HTTP URL/link (or parse an HTML string) and retrieve meta information such as title, description, images, videos, etc. via OpenGraph tags.

GOTCHAs

  • You cannot request a different domain from your web app (Browsers block cross-origin-requests). If you don't know how same-origin-policy works, here is a good intro, therefore this library works on Node.js and certain mobile run-times (Cordova or React-Native).
  • This library acts as if the user would visit the page, sites might re-direct you to sign-up pages, consent screens, etc. You can try to change the user-agent header (try with google-bot or with Twitterbot), but you need to work around these issues yourself.

API

getLinkPreview: you have to pass a string, doesn't matter if it is just a URL or a piece of text that contains a URL, the library will take care of parsing it and returning the info o the first valid HTTP(S) URL info it finds.

getPreviewFromContent: useful for passing a pre-fetched Response object from an existing async/etc. call. Refer to the example below for required object values.

import { getLinkPreview, getPreviewFromContent } from "link-preview-js";

// pass the link directly
getLinkPreview("https://www.youtube.com/watch?v=MejbOFk7H6c").then((data) =>
  console.debug(data)
);

////////////////////////// OR //////////////////////////

// pass a chunk of text
getLinkPreview(
  "This is a text supposed to be parsed and the first link displayed https://www.youtube.com/watch?v=MejbOFk7H6c"
).then((data) => console.debug(data));

////////////////////////// OR //////////////////////////

// pass a pre-fetched response object
// The passed response object should include, at minimum:
// {
//   data: '<!DOCTYPE...><html>...',     // response content
//   headers: {
//     ...
//     // should include content-type
//     content-type: "text/html; charset=ISO-8859-1",
//     ...
//   },
//   url: 'https://domain.com/'          // resolved url
// }
yourAjaxCall(url, (response) => {
  getPreviewFromContent(response).then((data) => console.debug(data));
});

Options

Additionally, you can pass an options object which should add more functionality to the parsing of the link

| Property Name | Result | | -------------------------------------------------------------------------------------- | :---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: | | imagesPropertyType (optional) (ex: 'og') | Fetches images only with the specified property, meta[property='${imagesPropertyType}:image'] | | headers (optional) (ex: { 'user-agent': 'googlebot', 'Accept-Language': 'en-US' }) | Add request headers to fetch call | | timeout (optional) (ex: 1000) | Timeout for the request to fail | | followRedirects (optional) (default 'error') | For security reasons, the library does not automatically follow redirects ('error' value), a malicious agent can exploit redirects to steal data, posible values: ('error', 'follow', 'manual') | | handleRedirects (optional) (with followRedirects 'manual') | When followRedirects is set to 'manual' you need to pass a function that validates if the redirectinon is secure, below you can find an example | | resolveDNSHost (optional) | Function that resolves the final address of the detected/parsed URL to prevent SSRF attacks |

getLinkPreview("https://www.youtube.com/watch?v=MejbOFk7H6c", {
  imagesPropertyType: "og", // fetches only open-graph images
  headers: {
    "user-agent": "googlebot" // fetches with googlebot crawler user agent
    "Accept-Language": "fr-CA", // fetches site for French language
    // ...other optional HTTP request headers
  },
  timeout: 1000
}).then(data => console.debug(data));

SSRF Concerns

Doing requests on behalf of your users or using user-provided URLs is dangerous. One of such attack is trying to fetch a domain that redirects to localhost so the users get the contents of your server (doesn't affect mobile runtimes). To mitigate this attack you can use the resolveDNSHost option:

// example how to use node's dns resolver
const dns = require("node:dns");
getLinkPreview("http://maliciousLocalHostRedirection.com", {
  resolveDNSHost: async (url: string) => {
    return new Promise((resolve, reject) => {
      const hostname = new URL(url).hostname;
      dns.lookup(hostname, (err, address, family) => {
        if (err) {
          reject(err);
          return;
        }

        resolve(address); // if address resolves to localhost or '127.0.0.1' library will throw an error
      });
    });
  },
}).catch((e) => {
  // will throw a detected redirection to localhost
});

This might add some latency to your request but prevents loopback attacks.

Redirections

Same to SSRF, following redirections is dangerous, the library errors by default when the response tries to redirect the user. There are however some simple redirections that are valid (e.g. HTTP to HTTPS) and you might want to allow them, you can do it via:

await getLinkPreview(`http://google.com/`, {
  followRedirects: `manual`,
  handleRedirects: (baseURL: string, forwardedURL: string) => {
    const urlObj = new URL(baseURL);
    const forwardedURLObj = new URL(forwardedURL);
    if (
      forwardedURLObj.hostname === urlObj.hostname ||
      forwardedURLObj.hostname === "www." + urlObj.hostname ||
      "www." + forwardedURLObj.hostname === urlObj.hostname
    ) {
      return true;
    } else {
      return false;
    }
  },
});

Response

Returns a Promise that resolves with an object describing the provided link. The info object returned varies depending on the content type (MIME type) returned in the HTTP response (see below for variations of response). Rejects with an error if the response can not be parsed or if there was no URL in the text provided.

Text/HTML URL

{
  url: "https://www.youtube.com/watch?v=MejbOFk7H6c",
  title: "OK Go - Needing/Getting - Official Video - YouTube",
  siteName: "YouTube",
  description: "Buy the video on iTunes: https://itunes.apple.com/us/album/needing-getting-bundle-ep/id508124847 See more about the guitars at: http://www.gretschguitars.com...",
  images: ["https://i.ytimg.com/vi/MejbOFk7H6c/maxresdefault.jpg"],
  mediaType: "video.other",
  contentType: "text/html",
  charset: "utf-8"
  videos: [],
  favicons:["https://www.youtube.com/yts/img/favicon_32-vflOogEID.png","https://www.youtube.com/yts/img/favicon_48-vflVjB_Qk.png","https://www.youtube.com/yts/img/favicon_96-vflW9Ec0w.png","https://www.youtube.com/yts/img/favicon_144-vfliLAfaB.png","https://s.ytimg.com/yts/img/favicon-vfl8qSV2F.ico"]
}

Image URL

{
  url: "https://media.npr.org/assets/img/2018/04/27/gettyimages-656523922nunes-4bb9a194ab2986834622983bb2f8fe57728a9e5f-s1100-c15.jpg",
  mediaType: "image",
  contentType: "image/jpeg",
  favicons: [ "https://media.npr.org/favicon.ico" ]
}

Audio URL

{
  url: "https://ondemand.npr.org/anon.npr-mp3/npr/atc/2007/12/20071231_atc_13.mp3",
  mediaType: "audio",
  contentType: "audio/mpeg",
  favicons: [ "https://ondemand.npr.org/favicon.ico" ]
}

Video URL

{
  url: "https://www.w3schools.com/html/mov_bbb.mp4",
  mediaType: "video",
  contentType: "video/mp4",
  favicons: [ "https://www.w3schools.com/favicon.ico" ]
}

Application URL

{
  url: "https://assets.curtmfg.com/masterlibrary/56282/installsheet/CME_56282_INS.pdf",
  mediaType: "application",
  contentType: "application/pdf",
  favicons: [ "https://assets.curtmfg.com/favicon.ico" ]
}

License

MIT license