@jgarber/metacrap
v0.1.2
Parse metacrap from a URL or HTML.
Installation
npm install --save @jgarber/metacrap

Usage
If the URL https://jgarber.example existed and its HTML were a little something like this:
<!doctype html>
<html>
<head>
<meta charset="utf-8">
<title>Hello, world!</title>
<meta name="author" content="Jason Garber">
<meta property="og:author" content="Jason Garber">
<!-- Or, equivalently: -->
<!-- <meta name="author" property="og:author" content="Jason Garber"> -->
</head>
</html>

On the command line:
npx metacrap https://jgarber.example
# { author: ["Jason Garber"], "og:author": ["Jason Garber"] }

In a JavaScript file:
import metacrap from "@jgarber/metacrap";
console.log(await metacrap("https://jgarber.example"));
// { author: ["Jason Garber"], "og:author": ["Jason Garber"] }
console.log(await metacrap(`<meta name="author" content="Jason Garber">`));
// { author: ["Jason Garber"] }

[!IMPORTANT] Values in the parsed data are returned as arrays of strings to account for HTML containing multiple <meta> elements with duplicate name or property attributes. This may be uncommon in practice, but HTML allows it.
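Because every value is an array, calling code usually has to decide whether it wants the first value or all of them. The following is a minimal sketch of one way to handle that. The `parsed` object is written by hand to mimic the shape shown above, and `firstValue` is a hypothetical helper, not part of this package.

```javascript
// Hand-written sample shaped like the parsed data described above.
// The "keywords" entry imitates a page with two <meta name="keywords"> elements.
const parsed = {
  author: ["Jason Garber"],
  keywords: ["html", "metadata"],
};

// Hypothetical helper (not provided by the library): return the first
// value stored under a key, or a fallback when the key is absent or empty.
function firstValue(data, key, fallback = null) {
  const values = data[key];
  return Array.isArray(values) && values.length > 0 ? values[0] : fallback;
}

const author = firstValue(parsed, "author");
const keywords = parsed.keywords ?? [];
const description = firstValue(parsed, "description", "(none)");

console.log(author, keywords, description);
```

Reading a single value through a helper like this keeps the duplicate case explicit: the other values are still available in the array when they matter.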
[!NOTE] Parsed data includes only <meta> elements with a content attribute and either a name or property attribute.
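The inclusion rule and the array grouping can be illustrated together. This is a sketch only, not the library's implementation: it skips HTML parsing entirely and works on hand-written attribute objects, and `collectMeta` is a hypothetical name chosen for the example.

```javascript
// Illustrative sketch, not the package's actual code.
// Each object stands in for the attributes of one <meta> element.
function collectMeta(elements) {
  const result = {};
  for (const attrs of elements) {
    // Skip elements without a content attribute.
    if (typeof attrs.content !== "string") continue;
    // An element may carry a name, a property, or both.
    for (const key of [attrs.name, attrs.property]) {
      if (typeof key !== "string") continue;
      // Group values into an array per key.
      (result[key] ??= []).push(attrs.content);
    }
  }
  return result;
}

const sample = [
  { charset: "utf-8" },                                // no content: skipped
  { name: "author", content: "Jason Garber" },
  { property: "og:author", content: "Jason Garber" },
  { name: "viewport" },                                // no content: skipped
];

const collected = collectMeta(sample);
console.log(collected);
// { author: ["Jason Garber"], "og:author": ["Jason Garber"] }
```

Under this sketch, a single element carrying both name and property attributes contributes its content under both keys, which matches the "equivalently" comment in the HTML example above.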
License
This project is freely available under the MIT License.
