tei-xml-fmt

v1.0.4

format tei-xml documents

This project is under active development; expect breaking changes!

What is this?

This repository contains the code for a VS Code extension that formats TEI XML files to be more human-readable. It uses saxes to parse XML files into a tree of nodes that can be formatted back into text. The formatter expects valid XML files.

Resources Used

Yorick Peterse - How to write a code formatter

Gerard Huet - The Zipper

TEI Council - TEI Specification

Definitions and Observations

  • TEI XML prefers explicit spacing and defines no standard for how implicit spaces are treated. These formatting rules are therefore specific to the renderer used in the Eartha M. M. White project. I recommend using explicit spacing wherever possible.
  • A single space is equivalent to multiple spaces, so one spacing node may be expanded into several.
  • Newlines and tabs are also treated as spaces.
  • Block tags are tags that make their own spacing during rendering, thus ignoring the immediate spacing around them.
  • Inline tags are tags that depend on the spacing near them. With no space, the rendered text might be joined together. However, if several inline tags in a row are not interrupted by text, a single space anywhere between them means every gap between them can hold a space without changing the final layout.
  • I have yet to encounter a tag with asymmetrical spacing requirements, so for now we disregard that case.
  • Ignore everything but open tags, close tags, and text nodes for now. Comments, CDATA, processing instructions, and the XML declaration will be implemented at a later date.
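
The whitespace observations above can be sketched in TypeScript. This is a minimal illustration, not the extension's actual code: the `Piece` type and the `normalizeWhitespace` and `splitText` names are hypothetical. It collapses runs of spaces, newlines, and tabs into one space, then turns leading and trailing spaces into separate spacing pieces.

```typescript
// Hypothetical shapes for illustration only; the extension's real types may differ.
type Piece = { kind: "text"; value: string } | { kind: "spacing" };

// Collapse every run of spaces, newlines, and tabs into a single space.
function normalizeWhitespace(raw: string): string {
  return raw.replace(/[ \n\t]+/g, " ");
}

// Split one raw text node into spacing and text pieces.
function splitText(raw: string): Piece[] {
  const normalized = normalizeWhitespace(raw);
  const pieces: Piece[] = [];
  // A leading space becomes its own spacing piece.
  if (normalized[0] === " ") {
    pieces.push({ kind: "spacing" });
  }
  // After normalization there is at most one space at each end.
  const core = normalized.replace(/^ | $/g, "");
  if (core.length > 0) {
    pieces.push({ kind: "text", value: core });
    // A trailing space becomes its own spacing piece.
    if (normalized[normalized.length - 1] === " ") {
      pieces.push({ kind: "spacing" });
    }
  }
  return pieces;
}
```

Under these assumptions, a text node that holds only whitespace yields a single spacing piece, and interior whitespace inside the text is reduced to one space.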

Algorithm

TL;DR

saxes first parses the .xml file into a structure that is easier to process than raw text. From its output we construct a tree called an Abstract Syntax Tree (AST), which contains enough information to distinguish between Tag Nodes, Close Tag Nodes, Text, and Spaces. We then process the AST a bit more and lower it into a Formatting Tree, which strips the information down to just Groups (which contain Text and Spacing nodes), Text, and Spacing Nodes (Line Indent/Deindent, or Space or Line). At this point there is very little information left to process and most of the formatting has been decided. Finally, we render the Formatting Tree back into raw text.
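
To make the last stage concrete, here is a rough sketch of rendering a Formatting Tree. The `Fmt` type, the `flatWidth` and `render` names, the `maxWidth` parameter, and the rule that a group stays on one line whenever it fits are all assumptions made for illustration; the extension's real types and wrapping rules may differ.

```typescript
// Hypothetical Formatting Tree shape, based on the node kinds named above.
type Fmt =
  | { kind: "text"; value: string }
  | { kind: "spaceOrLine" }
  | { kind: "lineIndent" }
  | { kind: "lineDeindent" }
  | { kind: "group"; children: Fmt[] };

// Width of a node if it were printed on a single line.
function flatWidth(node: Fmt): number {
  switch (node.kind) {
    case "text":
      return node.value.length;
    case "group": {
      let total = 0;
      for (const child of node.children) total += flatWidth(child);
      return total;
    }
    default:
      return 1; // assumption: any spacing node is one space when kept flat
  }
}

// Render the tree, breaking a group's spacing nodes only when it does not fit.
function render(root: Fmt, maxWidth: number): string {
  let out = "";
  let column = 0;
  let indent = 0;

  const newline = (): void => {
    out += "\n" + new Array(indent + 1).join("\t");
    column = indent; // assumption: a tab counts as one column in this sketch
  };

  const emit = (node: Fmt, flat: boolean): void => {
    switch (node.kind) {
      case "text":
        out += node.value;
        column += node.value.length;
        return;
      case "group": {
        const fits = flat || column + flatWidth(node) <= maxWidth;
        for (const child of node.children) emit(child, fits);
        return;
      }
      default:
        if (flat) {
          out += " ";
          column += 1;
          return;
        }
        if (node.kind === "lineIndent") indent += 1;
        if (node.kind === "lineDeindent") indent = Math.max(0, indent - 1);
        newline();
    }
  };

  emit(root, false);
  return out;
}
```

For a group holding `<p>`, a line indent, `Dear John,`, a line deindent, and `</p>`, this sketch prints `<p> Dear John, </p>` when `maxWidth` is 80, and puts the text on its own tab-indented line when `maxWidth` is 10.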

Steps

  1. Construct an editable AST from the XML file.

    a. Combine adjacent text nodes into a single text node.

    b. Normalize all spaces ' ', new lines '\n', and tabs '\t' within text to a single space.

    c. A text node containing a single space should be transformed into a Spacing node. If the text node contains text, trailing and leading spaces become Spacing nodes.

     - If the Spacing node will reside next to another Spacing node, do not insert it.
  2. Sanitize the AST using a Zipper to allow for better traversal.

    a. A Spacing node can be '\n', '\t', or ' ' as long as it does not reside between two text characters or text nodes.

    b. Spacing nodes should be carried in both directions.

     - Carrying means inserting another Spacing node after the next node if the node in front of it can be crossed.
     - If we are carrying left, it can cross only open tags. If we are carrying right, it can cross only close tags.
     - If the Spacing node will reside next to another Spacing node, do not insert it.

    c. There should now be a single Spacing node everywhere we can insert spaces into.

  3. Translate the AST into a formatting tree.

    a. Convert all non-spacing nodes directly into text. Spacing nodes require more attention.

    b. When we encounter a spacing node, we look backward and forward to see what type of FMT node to insert.

     - LineIndent - If the previous tag is an open tag and the next node is not a close tag
     - LineDeindent - If the previous tag is not an open tag and the next node is a close tag
     - SpaceOrLine - Default

    // TODO: A group of carried Spacing nodes should be linked together. If none of them needs to be wrapped, only one of the Spacing nodes needs to become a space, not all of them.

  4. Generate the final XML using the formatting tree.

    a. Use width() calculations on the FMT nodes to determine whether to wrap, then output the correct string literal.
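
Steps 2b and 3b can be sketched as follows. For brevity this works on a flat array of nodes instead of a Zipper over a tree, and the `Ast` and `SpacingKind` types and the `carry`, `classifySpacing`, and `classifyAll` names are hypothetical. Treat it as one reading of the rules above, not as the extension's implementation.

```typescript
// Hypothetical flat AST node shape for illustration.
type Ast =
  | { kind: "open"; name: string }
  | { kind: "close"; name: string }
  | { kind: "text"; value: string }
  | { kind: "spacing" };

type SpacingKind = "lineIndent" | "lineDeindent" | "spaceOrLine";

function isKind(node: Ast | undefined, kind: Ast["kind"]): boolean {
  return node !== undefined && node.kind === kind;
}

// Step 2b: copy Spacing nodes across tags they are allowed to cross.
function carry(nodes: Ast[]): Ast[] {
  const out = nodes.slice();
  // Carrying right: a Spacing node may cross close tags only.
  for (let i = 0; i < out.length; i++) {
    if (isKind(out[i], "spacing") && isKind(out[i + 1], "close") && !isKind(out[i + 2], "spacing")) {
      out.splice(i + 2, 0, { kind: "spacing" });
    }
  }
  // Carrying left: a Spacing node may cross open tags only.
  for (let i = out.length - 1; i >= 0; i--) {
    if (isKind(out[i], "spacing") && isKind(out[i - 1], "open") && !isKind(out[i - 2], "spacing")) {
      // Insert before the open tag; the next iteration lands on the new node.
      out.splice(i - 1, 0, { kind: "spacing" });
    }
  }
  return out;
}

// Step 3b: pick the FMT node for a Spacing node from its two neighbours.
function classifySpacing(prev: Ast | undefined, next: Ast | undefined): SpacingKind {
  const prevIsOpen = isKind(prev, "open");
  const nextIsClose = isKind(next, "close");
  if (prevIsOpen && !nextIsClose) return "lineIndent";
  if (!prevIsOpen && nextIsClose) return "lineDeindent";
  return "spaceOrLine";
}

// Classify every Spacing node in document order.
function classifyAll(nodes: Ast[]): SpacingKind[] {
  const kinds: SpacingKind[] = [];
  for (let i = 0; i < nodes.length; i++) {
    if (isKind(nodes[i], "spacing")) {
      kinds.push(classifySpacing(nodes[i - 1], nodes[i + 1]));
    }
  }
  return kinds;
}
```

Note that this sketch also carries a Spacing node past the outermost tags, so the result can start and end with a Spacing node; whether the real formatter does the same is not stated above.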

TODO

  • [ ] add tests for everything

Demonstrations

See below for an example of the formatter's output.

Unformatted

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="custom.xsl"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">
<text><body><div type="letter">
<head>Letter from Emily to John</head><p><hi rend="italic">Dear John,</hi><lb/> I hope this letter finds you well.
The weather here has been <hi rend="bold">unusually warm</hi> for October.
</p><p>I have enclosed the sketches you asked for.
<note type="editorial">Original note: “See attached drawings.”</note></p>
<closer><salute> Yours sincerely, </salute>
<signed>Emily</signed>
</closer></div><hi> 80 808 0808 080808080808 0808008 8 8 08 08 08 80 80 80 8080 8080 8008 080 8080 8080 080 0
</hi></body></text></TEI>

Formatted

Significant improvements over the previous version!

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="custom.xsl"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">
	<text>
		<body>
			<div type="letter">
				<head>Letter from Emily to John</head><p><hi rend="italic">Dear John,</hi>
					<lb />
					I hope this letter finds you well. The weather here has been
					<hi rend="bold">unusually warm</hi>
					for October.
				</p>
				<p>I have enclosed the sketches you asked for.
				<note type="editorial">Original note: “See attached drawings.”</note></p>
				<closer> <salute> Yours sincerely, </salute> <signed>Emily</signed> </closer>
			</div>
			<hi>
				80 808 0808 080808080808 0808008 8 8 08 08 08 80 80 80 8080 8080 8008 080 8080 8080 080 0
			</hi>
		</body>
	</text>
</TEI>

Pre-rewrite Formatted

Prior to the rewrite (commit 0704c5e), this was the formatting output.

<TEI xmlns="http://www.tei-c.org/ns/1.0">
	<text>
		<body>
			<div type="letter">
				<head>Letter from Emily to John</head>
				<p>
					<hi rend="italic">Dear John,</hi>
					<lb/>
					I hope this letter finds you well. The weather here has been
					<hi rend="bold">unusually warm</hi>
					for October.
				</p>
				<p>
					I have enclosed the sketches you asked for.
					<note type="editorial">Original note: “See attached drawings.”</note>
				</p>
				
				
				
				<closer> <salute>Yours sincerely, </salute>  <signed>Emily</signed>  </closer>
				</div>
			
			
			
			<hi>80 808 0808 080808080808 0808008 8 8 08 08 08 80 80 80 8080 8080 8008 080 8080 8080 080 0
			</hi>
			</body>
		</text>
	</TEI>