spectator-document-viewer

v0.1.0

Document Viewer

The document viewer is a React component built in TypeScript that lets a user highlight parts of a document's text to create annotations.

Getting started

You will need Node.js and Yarn.

Run yarn install to install the document viewer's dependencies.

A playground with a very simple test document enables faster local development (see the playground folder). It uses MirageJS (https://miragejs.com/) to mock the server API.

  1. Run yarn install in the playground directory.
  2. Run yarn start to start a server with hot reload, then navigate to http://localhost:1234.

Run yarn build to package the component so that it's ready to be used in another project.

React component

The component can be rendered in a container of any size, but it works best when it is given the full width and height of the viewport.

A quick example of how to use it would be:

<div className="App" style={{height: "100%", width: "100%", margin: "0"}}>
  <DocumentViewer
    annotations={[{
      characterStart: 0,
      characterEnd: 19,
      pageStart: 1,
      pageEnd: 1,
      top: 304,
      left: 301,
      topic: "1"
    }]}
    name={"document.pdf"}
    lazyLoadingWindow={5}   // optional
    onAnnotationCreate={(annotation) => console.log(annotation)}
    onAnnotationDelete={(annotation) => console.log(annotation)}
    onClose={() => console.log('Close')}
    onNextDocument={() => console.log('Next')}
    onPreviousDocument={() => console.log('Previous')}
    pages={[{
      originalHeight: 842,
      originalWidth: 595,
      imageURL: "api/document/1/page/1/image", 
      tokensURL: "api/document/1/page/1/tokens"
    }]}
    topics={["1", "2", "3"]}
  />
</div>

Concepts

A document viewer is a fairly complex user interface, so a few concepts need to be understood:

  • annotations: The highlights in the text. An annotation is mainly defined by the [characterStart, characterEnd[ pair, which should correspond as closely as possible to the start and the end of a word. If the annotation doesn't align with word boundaries, the document viewer does its best to map it to the closest words.
    • characterStart: The 0-based index of the first character of the highlight.
    • characterEnd: The 0-based index of the last character of the highlight.
    • pageStart: The 1-based index of the page containing characterStart.
    • pageEnd: The 1-based index of the page containing characterEnd.
    • top and left: The top and left positions, in pixels, of the first character of the annotation in the original document. They are used to jump directly to an annotation.
    • topic: The string representation of the topic.
  • name: The name of the document.
  • lazyLoadingWindow (optional): The document viewer takes care of lazy loading the pages, and this value defines how many pages on each side of the current page are loaded. A value of 1 loads 3 pages (-1, current, +1).
  • onAnnotationCreate: A callback used when a user highlights text and then clicks on a topic.
  • onAnnotationDelete: A callback used when a user clicks on the X of an annotation.
  • onClose: A callback used when a user clicks on the X of the document viewer.
  • onNextDocument: A callback used when a user clicks on the -> of the document viewer.
  • onPreviousDocument: A callback used when a user clicks on the <- of the document viewer.
  • pages: The definition of all pages. Pages are lazy loaded, which is why each definition requires URLs.
    • originalHeight: The height, in pixels, of the original document. It is used to translate the position of the mouse into coordinates that can be used to query the R-tree.
    • originalWidth: The width, in pixels, of the original document. It is used to translate the position of the mouse into coordinates that can be used to query the R-tree.
    • imageURL: The URL to the image of the page. The PNG format is recommended as it performs well for this scenario.
    • tokensURL: The URL to the tokens of the page. As there's normally a lot of tokens in […]. A token is defined by:
      • line: The 0-based index of the line containing the token. The line index is expected to reset to 0 at the start of each new page.
      • characterStart: The 0-based index of the first character of the token.
      • characterEnd: The 0-based index of the last character of the token.
      • boundingBox: The enclosing box wrapping the token.
        • top: Position in pixels of top in the original document.
        • bottom: Position in pixels of bottom in the original document.
        • left: Position in pixels of left in the original document.
        • right: Position in pixels of right in the original document.
  • topics: An ordered list of topics (strings) to choose from.
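
The data shapes described in the list above can be sketched as TypeScript types. This is only an illustration built from the field descriptions: the type names (Annotation, Token, BoundingBox, Page) and the pagesToLoad helper are hypothetical, and the package's actual exported types may differ. The helper shows one reading of lazyLoadingWindow, with the extra assumption that the window is clamped to the first and last page.

```typescript
// Hypothetical type names derived from the field descriptions above;
// the package's real exported types may differ.
interface Annotation {
  characterStart: number; // 0-based index of the first highlighted character
  characterEnd: number;   // 0-based index of the last highlighted character
  pageStart: number;      // 1-based index of the page containing characterStart
  pageEnd: number;        // 1-based index of the page containing characterEnd
  top: number;            // pixels, in the original document
  left: number;           // pixels, in the original document
  topic: string;
}

interface BoundingBox {
  top: number;
  bottom: number;
  left: number;
  right: number;
}

interface Token {
  line: number;           // 0-based, expected to reset on each new page
  characterStart: number;
  characterEnd: number;
  boundingBox: BoundingBox;
}

interface Page {
  originalHeight: number;
  originalWidth: number;
  imageURL: string;
  tokensURL: string;
}

// Hypothetical helper: the 1-based page numbers that fall inside the lazy
// loading window around the current page. Clamping to [1, pageCount] is an
// assumption, not documented behaviour.
function pagesToLoad(
  currentPage: number,
  lazyLoadingWindow: number,
  pageCount: number
): number[] {
  const first = Math.max(1, currentPage - lazyLoadingWindow);
  const last = Math.min(pageCount, currentPage + lazyLoadingWindow);
  const result: number[] = [];
  for (let page = first; page <= last; page++) {
    result.push(page);
  }
  return result;
}
```

With these definitions, pagesToLoad(5, 1, 10) yields pages 4, 5 and 6, which matches the "(-1, current, +1)" description of a window of 1.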

Design

The document viewer loads the original image of each page and overlays an R-tree containing all the tokens in the page. When a user selects text, the viewer actually tracks the mouse position and translates its coordinates to the corresponding position in the original document. It then queries the R-tree and returns the tokens that intersect with the rectangle created by the selection. As highlighting a passage happens far less often than reading the document, we made the tradeoff of keeping the visualization part of the document viewer simple (loading the original image) while putting more complexity into the text selection part.
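
The selection flow described above can be sketched as follows. This is a minimal illustration, not the component's actual code: all names (toOriginal, selectionRect, tokensInSelection, PositionedToken) are hypothetical, and a plain linear scan stands in for the R-tree query so that the example stays self-contained. An R-tree returns the same set of intersecting tokens, only faster on large pages.

```typescript
interface Point { x: number; y: number; }
interface Size { width: number; height: number; }
interface Rect { top: number; bottom: number; left: number; right: number; }

// Hypothetical token shape, reduced to what the selection needs.
interface PositionedToken {
  characterStart: number;
  characterEnd: number;
  boundingBox: Rect;
}

// Scale a mouse position measured on the rendered page image to pixels in
// the original document (see originalWidth and originalHeight).
function toOriginal(point: Point, rendered: Size, original: Size): Point {
  return {
    x: point.x * (original.width / rendered.width),
    y: point.y * (original.height / rendered.height),
  };
}

// Build the rectangle spanned by the start and end points of a drag,
// whichever direction the user dragged in.
function selectionRect(a: Point, b: Point): Rect {
  return {
    top: Math.min(a.y, b.y),
    bottom: Math.max(a.y, b.y),
    left: Math.min(a.x, b.x),
    right: Math.max(a.x, b.x),
  };
}

// Two axis-aligned rectangles intersect when they overlap on both axes.
function intersects(a: Rect, b: Rect): boolean {
  return a.left <= b.right && b.left <= a.right &&
         a.top <= b.bottom && b.top <= a.bottom;
}

// Stand-in for the R-tree search: a linear scan over every token's box.
function tokensInSelection(tokens: PositionedToken[], selection: Rect): PositionedToken[] {
  return tokens.filter((token) => intersects(token.boundingBox, selection));
}
```

Because the tokens are stored in original-document coordinates, only the two mouse points need to be scaled, regardless of the size at which the page image is displayed.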

A word on annotation colours

To get a maximal number of distinct colours (12 colours), we used a generator. As we don't want the highlight to block the text, we reduced the opacity of the colours and set the CSS property mix-blend-mode to multiply. This still doesn't prevent a user from highlighting so much of the text that it becomes hard to read.
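
One way to picture such a palette is sketched below. The text does not say which generator was used, so this is an assumption: hues are spaced evenly around the HSL colour wheel, and the function names, saturation, lightness and opacity values are all hypothetical. The mix-blend-mode value is the one named above.

```typescript
// Hypothetical generator: `count` colours with evenly spaced hues and a
// reduced opacity so that the highlight does not hide the text.
function topicColours(count: number, opacity = 0.4): string[] {
  const colours: string[] = [];
  for (let i = 0; i < count; i++) {
    const hue = Math.round((360 / count) * i);
    colours.push(`hsla(${hue}, 80%, 60%, ${opacity})`);
  }
  return colours;
}

// Pick a colour by the topic's position in the ordered topics list,
// wrapping around if there are more topics than colours.
function colourForTopic(topics: string[], topic: string, palette: string[]): string | undefined {
  const index = topics.indexOf(topic);
  return index === -1 ? undefined : palette[index % palette.length];
}

// Inline style for a highlight element. "multiply" darkens the page image
// under the highlight instead of painting over it.
function highlightStyle(colour: string): { backgroundColor: string; mixBlendMode: "multiply" } {
  return { backgroundColor: colour, mixBlendMode: "multiply" };
}
```

With 12 colours the hues land 30 degrees apart, which keeps neighbouring topics visually distinct. Overlapping highlights still multiply together, so heavily annotated passages get darker, as noted above.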

Features that are left out (for now)

  • Search within and between documents.
  • Copying text.
  • Nuanced behaviour tracking.
  • Mobile/Tablet support.