@ig3/epub

v0.0.3

Parse ePub electronic book files with Node.JS

epub

epub is a node.js module to parse EPUB electronic book files.

NB! Only ebooks in UTF-8 are currently supported.

Installation

npm install @ig3/epub

zipfile will be used if it is installed, but it is neither a dependency nor an optional dependency. Install it globally if you want to use it. The version that was previously required used deprecated packages and hasn't been updated in 4 years. Issue 83 indicates that it fails to build on node v12, and the author has not responded to the issue in over 2 years.

Usage

import EPub from '@ig3/epub';
const epub = new EPub(pathToFile, imageWebRoot, chapterWebRoot);

OR

const EPub = require('@ig3/epub');
const epub = new EPub(pathToFile, imageWebRoot, chapterWebRoot);

Where

  • pathToFile is the file path to an EPUB file
  • imageWebRoot is the prefix for image URLs. If it's /images/ then the actual URL (inside chapter HTML <img> blocks) is going to be /images/IMG_ID/IMG_FILENAME. IMG_ID can be used to fetch the image from the ebook with getImage. Default: /images/
  • chapterWebRoot is the prefix for chapter URLs. If it's /chapter/ then the actual URL (inside chapter HTML <a> links) is going to be /chapter/CHAPTER_ID/CHAPTER_FILENAME. CHAPTER_ID can be used to fetch the chapter from the ebook with getChapter. Default: /links/
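The rewritten URLs described above carry the manifest ID as their first path segment, so a web handler can recover it before calling getImage or getChapter. The following is a minimal sketch, not part of this module: splitRewrittenUrl is a hypothetical helper, and it assumes the ID itself contains no slash and is not URL-encoded.

```javascript
// Hypothetical helper (not provided by @ig3/epub).
// Splits a rewritten URL such as /images/IMG_ID/IMG_FILENAME into its parts.
// Assumes the ID contains no "/" and needs no URL decoding.
function splitRewrittenUrl(url, webRoot) {
  if (!url.startsWith(webRoot)) {
    return null; // not a URL produced with this web root
  }
  const rest = url.slice(webRoot.length);
  const slash = rest.indexOf('/');
  if (slash <= 0) {
    return null; // no ID segment found
  }
  return {
    id: rest.slice(0, slash),         // pass this to getImage / getChapter
    filename: rest.slice(slash + 1),  // original file name inside the ebook
  };
}
```

The returned id is the value the text above calls IMG_ID or CHAPTER_ID.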

Before the contents of the ebook can be read, it must be opened (EPub is an EventEmitter).

epub.on('end', function() {
  // epub is initialized now
  console.log(epub.metadata.title)

  epub.getChapter('chapter_id', (err, text) => {})
})

epub.parse()
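If you prefer promises over event listeners, the 'end' and 'error' events can be wrapped as sketched below. parseEpub is a hypothetical helper, not something this module provides; it only relies on the parse method and the two events documented here.

```javascript
// Hypothetical helper (not provided by @ig3/epub).
// Resolves with the epub object on 'end', rejects on the first 'error'.
function parseEpub(epub, options) {
  return new Promise((resolve, reject) => {
    // 'error' may be emitted more than once, so keep the listener attached;
    // rejecting an already settled promise is a no-op.
    epub.on('error', reject);
    epub.once('end', () => resolve(epub));
    epub.parse(options || {});
  });
}
```

With this in place, code that needs the parsed book can await parseEpub(epub) and then read epub.metadata or call getChapter.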

events

error

Emitted every time there is an error during parsing of the epub file.

Passed an instance of Error.

end

Emitted when parsing is complete.

methods

constructor

Returns a new instance of EPub.

const epub = new EPub(pathToFile, imageroot, linkroot);
  • pathToFile
  • imageroot
  • linkroot

pathToFile is the file path to an EPUB file

imageroot is the prefix for image URLs. If it's /images/ then the actual URL (inside chapter HTML <img> blocks) is going to be /images/IMG_ID/IMG_FILENAME. IMG_ID can be used to fetch the image from the ebook with getImage. Default: /images/

linkroot is the prefix for chapter URLs. If it's /chapter/ then the actual URL (inside chapter HTML <a> links) is going to be /chapter/CHAPTER_ID/CHAPTER_FILENAME. CHAPTER_ID can be used to fetch the chapter from the ebook with getChapter. Default: /links/

parse([options])

Initiate parsing of the epub file. The epub object is an event emitter. All results come by way of events.

The options object can be used to pass options.xml2jsOptions, to override the defaults for xml2js, which are the '0.1' defaults. Note that the options must be set in options.xml2jsOptions, not in options directly.
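As an illustration of the nesting, the options object might look like the sketch below. normalize is a standard xml2js option; whether overriding it is safe for a given book is an assumption you should check, since the parser expects the '0.1' style output.

```javascript
// xml2js overrides go under xml2jsOptions, not at the top level of options.
const parseOptions = {
  xml2jsOptions: {
    // Standard xml2js option: when false, whitespace inside text nodes
    // is left as-is instead of being collapsed.
    normalize: false,
  },
};

// Usage, given an EPub instance named epub:
// epub.parse(parseOptions);
```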

walkNavMap

Returns an array of objects for the TOC. This is written in a way that it might be useful independent of the parser, but it isn't obvious. It might be for internal use only.

getChapter(id, callback)

Returns the content of a chapter, given the manifest ID of the chapter.

id is a string: the manifest ID of the chapter to retrieve.

callback is a function which will be called with arguments err and str, where str is the content of the chapter, as a string.

If the chapter content includes <body>...</body> then only the content of the body tag is returned.

Script and style blocks and onEvent handlers are removed.

Image and link paths are modified according to imageroot and linkroot passed to the EPub constructor.

The chapter file is assumed to be utf-8 encoded. There is no option to change the encoding.
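Because the callback follows the usual Node.js (err, value) convention, getChapter can be used with util.promisify. This is a sketch only; getChapterText is a hypothetical name, and the bind call is there so that the method still sees the epub instance as this.

```javascript
const { promisify } = require('util');

// Hypothetical helper (not provided by @ig3/epub).
// Returns a promise for the transformed chapter content.
function getChapterText(epub, id) {
  // bind keeps `this` pointing at the epub instance inside getChapter
  const getChapter = promisify(epub.getChapter.bind(epub));
  return getChapter(id);
}
```

The same pattern should apply to getChapterRaw, which takes the same arguments.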

getChapterRaw(id, callback)

Like getChapter except that the full content of the chapter is returned without transformations.

The chapter file is assumed to be utf-8 encoded. There is no option to change the encoding.

getImage(id, callback)

Returns the content of an image file as a Buffer.

The image file mime type must be image/*.

getFile(id, callback)

  • id
  • callback

id is the Manifest id of the file to be read

callback(err, data, mediaType) is the callback function. The data is a Buffer.
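The callback receives two values besides the error, so util.promisify would drop the media type. A hand-written wrapper keeps both, as in the sketch below; getFileWithType is a hypothetical helper name.

```javascript
// Hypothetical helper (not provided by @ig3/epub).
// Resolves with both the Buffer and the media type reported by getFile.
function getFileWithType(epub, id) {
  return new Promise((resolve, reject) => {
    epub.getFile(id, (err, data, mediaType) => {
      if (err) {
        reject(err);
        return;
      }
      resolve({ data, mediaType });
    });
  });
}
```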

readFile(filename[, options], callback)

  • filename
  • options
  • callback

filename is the path of an epub file.

options is the encoding of the epub file.

callback is a function that is called with the decoded, stringified file contents.

hasDRM

Parses the tree to see if there is an encryption file, signifying the presence of DRM.

Returns true if the zip file includes META-INF/encryption.xml, otherwise returns false.
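A caller might use this as a guard before trying to read chapters. The sketch below assumes hasDRM is called as a method with no arguments once parsing has ended; ensureNoDRM is a hypothetical helper.

```javascript
// Hypothetical helper (not provided by @ig3/epub).
// Throws if the book contains META-INF/encryption.xml, otherwise returns the epub.
// Assumes hasDRM() takes no arguments and is called after 'end' has fired.
function ensureNoDRM(epub) {
  if (epub.hasDRM()) {
    throw new Error('This ebook appears to be DRM protected');
  }
  return epub;
}
```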

metadata

Property of the epub object that holds several metadata fields about the book.

epub.metadata

Available fields:

  • creator Author of the book (if multiple authors, then the first on the list) (Lewis Carroll)
  • creatorFileAs Author name on file (Carroll, Lewis)
  • title Title of the book (Alice's Adventures in Wonderland)
  • language Language code (en or en-us etc.)
  • subject Topic of the book (Fantasy)
  • date Creation date of the file (2006-08-12)
  • description
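Any of these fields may be missing from a given book, so it can help to read them defensively. The sketch below builds a one-line summary; describeBook and its fallback strings are hypothetical and only use the field names listed above.

```javascript
// Hypothetical helper (not provided by @ig3/epub).
// Builds a short summary from epub.metadata, tolerating missing fields.
function describeBook(metadata) {
  const title = metadata.title || 'Untitled';
  const creator = metadata.creator || 'Unknown author';
  const language = metadata.language ? ` [${metadata.language}]` : '';
  return `${title} by ${creator}${language}`;
}
```

Call it as describeBook(epub.metadata) after the 'end' event.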

flow

flow is a property of the epub object and holds the actual list of chapters (the TOC is just an indication and can link to a # URL inside a chapter file).

epub.flow.forEach(chapter => {
    console.log(chapter.id)
})

The chapter id is needed to load a chapter with getChapter.
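To read a whole book in reading order, the ids in flow can be fed to getChapter one after another. This is a sketch under the assumption that each flow entry has an id property, as in the loop above; loadAllChapters is a hypothetical helper.

```javascript
// Hypothetical helper (not provided by @ig3/epub).
// Loads every chapter listed in epub.flow, one at a time, in order.
async function loadAllChapters(epub) {
  const chapters = [];
  for (const { id } of epub.flow) {
    // Wrap the callback API in a promise so the loop can await each chapter.
    const text = await new Promise((resolve, reject) => {
      epub.getChapter(id, (err, str) => (err ? reject(err) : resolve(str)));
    });
    chapters.push({ id, text });
  }
  return chapters;
}
```

Loading sequentially keeps the result in flow order and avoids reading many chapters from the archive at once.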

toc

toc is a property of the epub object and holds a list of titles/URLs for the TOC. The actual chapter and its ID need to be detected with the href property.
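One way to do that detection is to strip any # fragment from the TOC entry's href and look for a flow entry with the same href. The sketch below assumes that flow entries expose an href in the same form as the TOC entries, which you should verify against your own books; chapterIdForTocEntry is a hypothetical helper.

```javascript
// Hypothetical helper (not provided by @ig3/epub).
// Maps a TOC entry to a chapter id by comparing href values.
// Assumes flow entries have an href in the same form as TOC entries.
function chapterIdForTocEntry(tocEntry, flow) {
  // A TOC href may point inside a chapter file, e.g. "ch1.xhtml#section2".
  const file = tocEntry.href.split('#')[0];
  const chapter = flow.find((item) => item.href === file);
  return chapter ? chapter.id : null;
}
```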

getChapter(chapter_id, callback)

Load chapter text from the ebook.

epub.getChapter('chapter1', (error, text) => {})

getChapterRaw(chapter_id, callback)

Load raw chapter text from the ebook.

getImage(image_id, callback)

Load image (as a Buffer value) from the ebook.

epub.getImage('image1', (error, img, mimeType) => {})
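Since the image arrives as a Buffer together with its mime type, it can be turned into a data URI for embedding in HTML. This is a minimal sketch; getImageDataUri is a hypothetical helper that relies only on the callback arguments shown above.

```javascript
// Hypothetical helper (not provided by @ig3/epub).
// Resolves with a base64 data URI built from the image Buffer and mime type.
function getImageDataUri(epub, id) {
  return new Promise((resolve, reject) => {
    epub.getImage(id, (err, img, mimeType) => {
      if (err) {
        reject(err);
        return;
      }
      resolve(`data:${mimeType};base64,${img.toString('base64')}`);
    });
  });
}
```

For large images, writing the Buffer to disk or streaming it in an HTTP response is likely a better fit than a data URI.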

getFile(file_id, callback)

Load any file (as a Buffer value) from the ebook.

epub.getFile('css1', (error, data, mimeType) => {})