ilkkah-odt-to-html

v1.0.0

odt.js

odt.js is a JavaScript library that converts odt to html and back.

Limitations

Currently, setHTML only supports html as returned by getHTML, possibly with minor modifications such as changed text. In practice, this means setHTML throws when given arbitrary html.

Currently, odt.js depends on the browser's XML parser, DOM parser and DOM serializer. If you want to use odt.js on the server, one way forward is to modify it to add support for pure JavaScript parsers. Keep in mind that for strictness parity with the in-browser parser, you need a DOM parser which breaks up the <p> in <p><div></div></p>, as browsers do.
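When evaluating a pure JavaScript parser for this purpose, a quick check along the following lines might help. This is only a sketch: breaksUpNestedParagraph is a hypothetical helper that is not part of odt.js, and parseAndSerialize stands for whatever wrapper you write around the candidate parser.

```javascript
// Hypothetical check, not part of odt.js. `parseAndSerialize` is any function
// that parses an html string and returns the serialized body contents.
function breaksUpNestedParagraph(parseAndSerialize) {
	var out = parseAndSerialize('<p><div></div></p>');
	// A browser-like HTML parser closes the <p> before the <div>,
	// so the <div> must not end up directly inside a <p>.
	return !/<p>\s*<div>/.test(out);
}

// In a browser, this could be called as:
//   breaksUpNestedParagraph(function(html) {
//   	return new DOMParser().parseFromString(html, 'text/html').body.innerHTML;
//   });
```

A parser for which this returns false keeps the <div> nested inside the <p>, so it would accept html that the in-browser parser restructures.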

Unsupported odt features:

  • Features on which getHTML throws
    • Any encoding other than utf-8
    • Annotations
    • Tracked changes
    • Charts
  • Features on which getHTML doesn't throw
    • Various types of images
    • Non-"manual" styles
    • Unordered lists
    • List styles (bullets)
    • Strikethrough
    • Underlined text nested inside non-underlined text
    • Underline-color
    • Loads of other styles

And much more.

Strictness

Warning: currently, the following guarantees hold only within a single version of odt.js. You can't call getHTML, store the html, call setHTML with another version of odt.js and expect a correct odt.
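If you do need to store html between sessions, one way to guard against a version mismatch is to store a version string next to it. The sketch below assumes you supply that string yourself (for example from your package.json). packHTML and unpackHTML are hypothetical helpers, and odt.js is not assumed to expose its version.

```javascript
// Hypothetical helpers, not part of odt.js. `version` is whatever string
// identifies the odt.js build in use; it has to be supplied by the caller.
function packHTML(html, version) {
	return JSON.stringify({ odtjsVersion: version, html: html });
}

function unpackHTML(stored, version) {
	var record = JSON.parse(stored);
	if(record.odtjsVersion !== version) {
		// Refuse html produced by a different odt.js version.
		throw new Error('html was produced by odt.js ' + record.odtjsVersion + ', not ' + version);
	}
	return record.html;
}
```

The string returned by packHTML can be stored anywhere; pass the result of unpackHTML to setHTML only when it does not throw.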

getHTML throws whenever a getHTML -> set html -> get html -> setHTML roundtrip would otherwise not reproduce the original odt file (barring xml encoding and zip file changes). This is the case for most unsupported odt features, except unsupported text styles.

getHTMLUnsafe probably won't produce html that's totally useless, since browsers are very forgiving. It might produce html which does not accurately represent the odt, though getHTML will also do that for unsupported styles.

setHTML throws whenever a setHTML -> getHTML roundtrip would otherwise not reproduce the original html (barring style and html encoding changes). This is the case for most unsupported html features, except unsupported text styles. The hope is that this keeps the resulting odt from being broken, but there is no guarantee.

setHTMLUnsafe might produce completely broken odt files.

Usage

<script src="jszip.js"></script>
<script src="lib/odt.js"></script>
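After the scripts have loaded, it can be useful to confirm that the globals odt.js relies on are present. This is a minimal sketch: missingOdtDependencies is a hypothetical helper that is not part of odt.js, and the two names it checks are the ones the Documentation section says must be defined.

```javascript
// Hypothetical helper, not part of odt.js: list the globals odt.js needs
// that are missing from the given scope (e.g. window in a browser).
function missingOdtDependencies(scope) {
	var required = ['JSZip', 'DOMParser'];
	return required.filter(function(name) {
		return typeof scope[name] === 'undefined';
	});
}

// Outside a browser there is no window, so an empty scope is checked
// and both names are reported as missing.
var missing = missingOdtDependencies(typeof window !== 'undefined' ? window : {});
if(missing.length) {
	console.warn('odt.js cannot run here, missing: ' + missing.join(', '));
}
```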

odt2html

var html;
try {
	html = new ODTDocument(odt).getHTML();
} catch(e) {
	alert("Couldn't parse odt file.");
	throw e;
}

If you definitely want html while caring less about whether or not it is correct:

var html = new ODTDocument(odt).getHTMLUnsafe();

If you want fallback html:

var odtdoc = new ODTDocument(odt);
var html = odtdoc.getHTMLUnsafe();
try {
	html = odtdoc.getHTML();
} catch(e) {
	console.error('html is probably broken');
}
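If the caller needs to know which conversion succeeded rather than only logging it, the fallback pattern could be wrapped as follows. This is a sketch, not part of odt.js: toHTML and the shape of its return value are hypothetical.

```javascript
// Hypothetical helper, not part of odt.js. `odtdoc` is anything with
// getHTML() and getHTMLUnsafe(), such as an ODTDocument.
function toHTML(odtdoc) {
	try {
		return { html: odtdoc.getHTML(), strict: true };
	} catch(e) {
		// Strict conversion failed; fall back to the lenient one.
		// If that throws as well, the error propagates to the caller.
		return { html: odtdoc.getHTMLUnsafe(), strict: false, error: e };
	}
}
```

A caller can then show result.html in either case and warn the user when result.strict is false.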

html2odt

As mentioned in Limitations, this example is currently non-functional for arbitrary html.

var req = new XMLHttpRequest();
req.open('GET', 'res/empty.odt');
req.responseType = 'arraybuffer';
req.addEventListener('load', function() {
	var empty = req.response;
	
	var odtdoc = new ODTDocument(empty);
	try {
		odtdoc.setHTML(html);
	} catch(e) {
		alert("Couldn't generate odt document.");
		throw e;
	}
	var odt = odtdoc.getODT();
});
req.send();

If you definitely want odt while caring less about whether or not it is a valid odt file:

	var odtdoc = new ODTDocument(empty);
	odtdoc.setHTMLUnsafe(html);
	var odt = odtdoc.getODT();

If you want a fallback odt:

	var odtdoc = new ODTDocument(empty);
	try {
		odtdoc.setHTML(html);
	} catch(e) {
		odtdoc.setHTMLUnsafe(html);
		console.error('odt is probably broken');
	}
	var odt = odtdoc.getODT();

Simple odt editor:

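var iframe = document.createElement('iframe');
// The iframe must be attached to the page before contentDocument is available.
document.body.appendChild(iframe);
var odtdoc = new ODTDocument(odt);
// Start with the lenient conversion, then prefer the strict one if it succeeds.
var html = odtdoc.getHTMLUnsafe();
try {
	html = odtdoc.getHTML();
} finally {
	// Show whichever html we have; an error from getHTML still propagates afterwards.
	iframe.contentDocument.write(html);
	iframe.contentDocument.close();
}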
iframe.contentDocument.documentElement.addEventListener('input', function save() {
	try {
		odtdoc.setHTML(iframe.contentDocument.documentElement.outerHTML);
	} catch(e) {
		alert("Generating ODT file failed.");
		throw e;
	}
	odt = odtdoc.getODT();
});

Documentation

ODTDocument

new ODTDocument(String|ArrayBuffer|Uint8Array|Buffer odt[, Object options]) -> ODTDocument | Error

Initialize an ODTDocument.

For arguments and errors, see the JSZip documentation.

ODTDocument#getHTML

ODTDocument#getHTML() -> html | TypeError | Error

Convert the odt document to html.

Throws TypeError if JSZip or DOMParser is undefined or if DOMParser does not support parsing text/xml and text/html.

Throws Error if the odt uses unsupported features (it doesn't throw on unsupported text styles, though). For more details, see Strictness above.
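Since the two failure modes call for different reactions, a caller might want to tell them apart. The following is a sketch under the assumption that getHTML throws only the two error types documented above. describeGetHTMLError and its return labels are hypothetical and not part of odt.js.

```javascript
// Hypothetical helper, not part of odt.js: classify an error thrown by
// getHTML() based on the error types documented above.
function describeGetHTMLError(e) {
	// TypeError is a subclass of Error, so it has to be checked first.
	if(e instanceof TypeError) {
		return 'missing-dependency';
	}
	return 'unsupported-feature';
}
```

For 'unsupported-feature', falling back to getHTMLUnsafe may still give usable html. For 'missing-dependency' it will not help, because getHTMLUnsafe has the same requirements.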

ODTDocument#getHTMLUnsafe

ODTDocument#getHTMLUnsafe() -> html | TypeError

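Convert the odt document to html without the strictness checks of getHTML. The result might not accurately represent the odt.

Throws TypeError if JSZip or DOMParser is undefined or if DOMParser does not support parsing text/xml and text/html.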

ODTDocument#setHTML

ODTDocument#setHTML(String html) -> undefined | TypeError | Error

Convert html to odt content for this document.

Throws TypeError if DOMParser is undefined or if DOMParser does not support parsing text/xml.

Throws Error if the html uses unsupported features. For more details, see Strictness above.

ODTDocument#setHTMLUnsafe

ODTDocument#setHTMLUnsafe(String html) -> undefined | TypeError

Like setHTML, but without the strictness checks. Might produce completely broken odt files.

Throws TypeError if DOMParser is undefined or if DOMParser does not support parsing text/xml.

ODTDocument#getODT

ODTDocument#getODT([Object options]) -> String|ArrayBuffer|Uint8Array|Buffer | Error

Generate an odt file from the ODTDocument.

For options and errors, see the JSZip documentation.

Contributing

Tips

odt.js is very strict towards its own code (except the Unsafe functions, that is). getHTML throws when an odt -> html -> odt roundtrip doesn't produce exactly the same odt file, and also when it produces invalid html (it's not as strict about the latter, though; e.g. it throws when you produce one <p> inside another).

One way to go about adding features to odt.js is to use getHTMLUnsafe and iterate until that generates something sane.

Another way is to use getHTML and set a breakpoint on the line that throws, diff (using a word-granular diff tool) the two things that were different, and work backwards from there.
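If no diff tool is at hand, a few lines in the debugger console can locate the first difference. This is a rough sketch, not part of odt.js: firstWordDifference is a hypothetical helper, and splitting before each < as well as on whitespace is an assumption about what counts as a "word" in xml.

```javascript
// Hypothetical debugging aid, not part of odt.js: find the first token at
// which two strings differ. Tokens are split on whitespace and before each
// '<', so that every tag starts a new token.
function firstWordDifference(a, b) {
	var wordsA = a.split(/(?=<)|\s+/);
	var wordsB = b.split(/(?=<)|\s+/);
	var length = Math.max(wordsA.length, wordsB.length);
	for(var i = 0; i < length; i++) {
		if(wordsA[i] !== wordsB[i]) {
			return { index: i, expected: wordsA[i], actual: wordsB[i] };
		}
	}
	// No difference found.
	return null;
}
```

When one input is longer than the other, the missing side is reported as undefined.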

It also helps to decide in advance on a strategy for producing html from which the original odt can be derived losslessly.

Guidelines

Keep in mind that the html produced should be useful on both screen and print media.

Please follow the code style of surrounding code, so single quotes unless the string contains a single quote, if( instead of if (, etc.

If you want to modify odt.js for use outside the browser, see tips in Limitations.