@createiq/htmldiff
v1.0.3
TypeScript port of htmldiff.net
A library for comparing two HTML snippets and highlighting the differences using simple HTML.
This HTML diff implementation is a TypeScript port of the C# port found here, which is itself a port of the Ruby implementation found here.
Installation
npm install @createiq/htmldiff
Usage
This is intended to be a drop-in replacement for the htmldiff npm library (https://github.com/dfoverdx/htmldiff-js).
It intentionally mirrors the structure and classes of the HtmlDiff.NET library from which it was ported,
so usage differs slightly from many JavaScript-ecosystem libraries.
import HtmlDiff from '@createiq/htmldiff'
const oldHtml = `<p>Some <em>old</em> html here</p>`
const newHtml = `<p>Some <b>new</b> html goes here</p>`
const diffHtml = HtmlDiff.execute(oldHtml, newHtml)
// Output: <p>Some <ins class='mod em'><del class='diffmod'>old</del></ins><b><ins class='mod b'><ins class='diffmod'>new</ins></ins></b> html <ins class='diffins'>goes </ins>here</p>
Configuring the diff output
A couple of options can be set directly on the HtmlDiff instance. HtmlDiff.execute always creates
a new instance and uses the defaults, so for custom output use new HtmlDiff(oldHtml, newHtml) and set the options
directly:
import HtmlDiff from '@createiq/htmldiff'
const oldHtml = `<p>Some <em>old</em> html here</p>`
const newHtml = `<p>Some <b>new</b> html goes here</p>`
const diff = new HtmlDiff(oldHtml, newHtml)
diff.repeatingWordsAccuracy = 0.5
diff.ignoreWhitespaceDifferences = true
diff.orphanMatchThreshold = 0.2
const diffHtml = diff.build()
API
HtmlDiff.execute(oldHtml, newHtml)
Returns an HTML diff from oldHtml to newHtml with the default options. This is exactly the same as calling
new HtmlDiff(oldHtml, newHtml).build()
new HtmlDiff(oldHtml, newHtml)
Returns an HtmlDiff instance.
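Because the options are plain properties set between construction and build(), they can be applied in one step with a small wrapper. The following is a minimal sketch, not part of the library: DiffLike, DiffOptions, and buildWithOptions are hypothetical names, and the constructor is passed in as a parameter so the snippet stays self-contained. In real use you would presumably pass the HtmlDiff class imported from @createiq/htmldiff, assuming its instance shape matches the properties documented here.

```typescript
// Hypothetical shape covering only the members documented in this README.
interface DiffLike {
  repeatingWordsAccuracy: number
  ignoreWhitespaceDifferences: boolean
  orphanMatchThreshold: number
  build(): string
}

// Every documented option, all optional.
type DiffOptions = Partial<Omit<DiffLike, 'build'>>

// Hypothetical helper: construct, apply any provided options, then build.
function buildWithOptions(
  Diff: new (oldHtml: string, newHtml: string) => DiffLike,
  oldHtml: string,
  newHtml: string,
  options: DiffOptions = {},
): string {
  const diff = new Diff(oldHtml, newHtml)
  Object.assign(diff, options) // goes through the property setters
  return diff.build()
}
```

Calling buildWithOptions with an empty options object leaves the defaults untouched, which mirrors what HtmlDiff.execute is described as doing.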
addBlockExpression(expression: RegExp)
Uses expression to group text together so that any change detected within the group is treated as a single block, for
example to keep dates together.
This MUST have the global flag (/g) enabled on the RegExp and must not overlap with other block expressions.
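Since a missing global flag is easy to overlook, the requirement can be enforced before the expression is registered. This is a minimal sketch using only standard RegExp properties; withGlobalFlag is a hypothetical helper, not something the library provides, and it does not check for overlap with other block expressions.

```typescript
// Hypothetical helper: return a RegExp guaranteed to carry the global flag.
function withGlobalFlag(expression: RegExp): RegExp {
  // Already global: hand back the same object unchanged.
  if (expression.global) return expression
  // Otherwise rebuild it with the original flags plus "g".
  return new RegExp(expression.source, `${expression.flags}g`)
}

const datePattern = withGlobalFlag(/\d{1,2}\s*(Jan|Feb)\s*\d{4}/)
console.log(datePattern.flags) // "g"
```

The result could then be passed to addBlockExpression in place of the raw expression.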
Example:
const oldHtml = 'This is a date 1 Jan 2016 that will change'
const newHtml = 'This is a date 22 Feb 2017 that did change'
const diff = new HtmlDiff(oldHtml, newHtml)
diff.addBlockExpression(/\d{1,2}\s*(Jan|Feb)\s*\d{4}/g)
const diffHtml = diff.build()
// Output: This is a date<del class='diffmod'> 1 Jan 2016</del><ins class='diffmod'> 22 Feb 2017</ins> that <del class='diffmod'>will</del><ins class='diffmod'>did</ins> change
.repeatingWordsAccuracy get/set
Type: number
Default: 1.0
Defines how repeating words are compared. Valid values are from 0 to 1. Lower values allow some words to be excluded from comparison, which reduces the total running time of the diff algorithm.
0 means that all words are excluded, so the diff will not find any matching words at all.
1 (default value) means that all words participate in the comparison, so this is the most accurate case.
0.5 means that any word that occurs more than 50% of the time may be excluded from comparison. This doesn't mean that such words will definitely be excluded; it only gives permission to exclude them if necessary.
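To make the threshold concrete, here is a rough sketch of one possible reading of it, assuming the percentage refers to a word's share of all words in the text. It is an illustration only and not the library's actual algorithm; wordsEligibleForExclusion is a hypothetical name.

```typescript
// Illustration only: list words whose occurrence count exceeds the
// fraction of the total word count given by `accuracy`.
function wordsEligibleForExclusion(words: string[], accuracy: number): string[] {
  const counts = new Map<string, number>()
  for (const word of words) {
    counts.set(word, (counts.get(word) ?? 0) + 1)
  }
  const limit = words.length * accuracy
  return Array.from(counts)
    .filter(([, count]) => count > limit)
    .map(([word]) => word)
}

const sample = 'a b a c a'.split(' ')
console.log(wordsEligibleForExclusion(sample, 0.5)) // [ 'a' ]
console.log(wordsEligibleForExclusion(sample, 1)) // []
```

Under this reading, a value of 1 never marks any word as excludable and a value of 0 marks every word, which lines up with the two extremes described above.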
.ignoreWhitespaceDifferences get/set
Type: boolean
Default: false
If true, all whitespace is treated as equal.
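As a rough illustration of what treating whitespace as equal can mean when two tokens are compared, here is a minimal sketch. It assumes that any two whitespace-only tokens count as a match when the option is on; tokensMatch is a hypothetical helper and not library code.

```typescript
// Illustration only: compare two tokens, optionally treating any
// whitespace-only token as equal to any other whitespace-only token.
function tokensMatch(a: string, b: string, ignoreWhitespaceDifferences: boolean): boolean {
  const isWhitespace = (token: string): boolean => /^\s+$/.test(token)
  if (ignoreWhitespaceDifferences && isWhitespace(a) && isWhitespace(b)) {
    return true
  }
  return a === b
}

console.log(tokensMatch(' ', '\n\t', true)) // true
console.log(tokensMatch(' ', '\n\t', false)) // false
```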
.orphanMatchThreshold get/set
Type: number
Default: 0.0
If a match is too small and located far from its neighbors, it is considered an orphan and removed. For example:
aaaaa bb ccccccccc dddddd ee
11111 bb 222222222 dddddd ee
will find two matches, bb and dddddd ee, but the first will be considered an orphan and ignored. As a result, the texts
aaaaa bb ccccccccc and 11111 bb 222222222 are treated as a single replacement:
<del>aaaaa bb ccccccccc</del><ins>11111 bb 222222222</ins> dddddd ee
This property defines the relative size of a match below which it is considered an orphan, from 0 to 1.
1 means that all matches will be considered orphans.
0 (default) means that no match will be considered an orphan.
0.2 means that a match is considered an orphan if its length is less than 20% of the distance between its neighbors.
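The rule can be written as a single comparison. The sketch below assumes lengths are measured in characters and that the distance between neighbors is supplied by the caller; how the library actually measures either is not stated here, so treat both as assumptions. isOrphan is a hypothetical name.

```typescript
// Illustration only: a match is an orphan when its length is less than
// `threshold` times the distance between its neighboring matches.
function isOrphan(matchLength: number, neighborDistance: number, threshold: number): boolean {
  return matchLength < neighborDistance * threshold
}

// For the example text: "bb" is 2 characters, and the span from the start of
// the text to the next match ("aaaaa bb ccccccccc ") is 19 characters.
console.log(isOrphan(2, 19, 0.2)) // true  (2 < 3.8)
console.log(isOrphan(2, 19, 0)) // false (nothing is below 0)
```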
build()
Returns the diff from an HtmlDiff instance.
Contributing
The library uses Biome for linting and formatting, and Vitest for unit tests and benchmarking. It's worth ensuring that you have the appropriate plugins for your development environment, particularly for Biome, to avoid having to fix formatting issues later.
Releasing
Merge requests to the main branch should be reviewed by the team as normal, but they will not release a new version of the
library to npm. A release happens when a merge request is made to the prod branch. This should be an MR directly from
main to prod and MUST include a bump to the version in package.json that satisfies semver.
