@gcu/gcu-press
v0.1.0
Typesetting engine — Knuth-Plass line breaking, hyphenation, page layout, canvas rendering. Turn markdown-ish source into typeset pages.
gcu-press
Typesetting engine for auditable notebooks. Produces beautifully typeset PDFs from notebook content.
concept
An auditable notebook is already a programmable document: markdown cells for prose, JS cells for computation, CSS for styling, HTML for layout. gcu-press adds a typesetting backend — same source, two outputs (browser for authoring, PDF for distribution).
The author writes in an auditable notebook. Markdown cells are prose. Code cells compute cross-references, generate tables, produce figures. Cells marked // %hide are the "preamble": they run but don't appear in output. The notebook builtin provides introspection (cell types, source, rendered output, scope). gcu-press reads this and typesets it.
Usage from a code cell:
const press = await load("gcu-press")
const pdf = await press.typeset(notebook, {
  fonts: { body: 'CMU Serif', code: 'CMU Typewriter' },
  page: 'A5',
})
architecture
pipeline
notebook (cells + executed outputs)
→ collect visible content (skip %hide, %norun)
→ resolve cross-references (ref.ch(), ref.ex(), ref.fig())
→ Knuth-Plass line breaking (paragraph → lines)
→ page breaking (lines → pages, with headers/footers/TOC)
→ PDF emission (positioned glyphs + images)
components
layout engine — Knuth-Plass optimal line breaking, page breaking with penalties (keep heading with next paragraph, avoid widows/orphans). the core algorithm is ~200-300 lines of dynamic programming on arrays of box/glue/penalty items.
font metrics — opentype.js or similar for glyph widths, kerning pairs, ligatures. needed to feed accurate measurements into line breaking. fonts: CMU (Computer Modern Unicode) for TeX-quality output, or any OTF/TTF.
PDF emitter — PDF is a simple format for positioned text: "place glyph X at (x,y)". pdf-lib or raw PDF stream generation. also needs image embedding for canvas outputs (figures, plots).
notebook bridge — walks notebook.cells, collects rendered content. markdown cells → parsed prose (headings, paragraphs, emphasis, code spans). code cells → syntax-highlighted code blocks + output captures. HTML cells → rendered content. CSS cells → skipped (styling is for browser view, not PDF).
cross-reference system — std.refs() or press.refs() helper that code cells use to register and reference figures, examples, chapters, equations. two-pass: first pass collects all refs, second pass resolves numbers.
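The dynamic programming behind the layout engine can be sketched in a much reduced form. This is not the full box/glue/penalty model: it assumes fixed-width spaces, no hyphenation, and a made-up cost of squared unused width per line (last line free). The function name breakLines and its parameters are hypothetical, chosen for illustration only.

```javascript
// Simplified total-fit line breaking over word widths.
// Assumption: cost of a line = (unused width)^2; the last line costs nothing.
function breakLines(widths, spaceWidth, lineWidth) {
  const n = widths.length;
  const best = new Array(n + 1).fill(Infinity); // best[i]: min cost for words i..n-1
  const next = new Array(n + 1).fill(n);        // next[i]: first word of the following line
  best[n] = 0;

  for (let i = n - 1; i >= 0; i--) {
    let used = 0;
    for (let j = i; j < n; j++) {
      used += widths[j] + (j > i ? spaceWidth : 0);
      // Stop once the line overflows, but always allow a single overlong word.
      if (used > lineWidth && j > i) break;
      const slack = lineWidth - used;
      const lineCost = j === n - 1 ? 0 : slack * slack;
      if (lineCost + best[j + 1] < best[i]) {
        best[i] = lineCost + best[j + 1];
        next[i] = j + 1;
      }
    }
  }

  // Follow the chain of chosen breaks from the first word.
  const lines = [];
  for (let i = 0; i < n; i = next[i]) lines.push([i, next[i]]);
  return lines;
}

// Treat each character as one unit wide (a monospace assumption for the demo).
const words = ["aaa", "bb", "cc", "ddddd"];
const lines = breakLines(words.map((w) => w.length), 1, 6)
  .map(([from, to]) => words.slice(from, to).join(" "));
console.log(lines); // [ 'aaa', 'bb cc', 'ddddd' ]
```

A greedy breaker would produce "aaa bb" / "cc" / "ddddd" here, leaving a very loose second line. The whole-paragraph search accepts a slightly loose first line to even out the rest, which is the effect the real algorithm achieves with stretchable glue and penalties.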
what makes TeX output beautiful (and what we need)
Knuth-Plass line breaking — considers all possible break points in a paragraph simultaneously, minimizes total "badness" (deviation from ideal line width). this is the single biggest quality difference vs greedy line breaking. well-documented algorithm, has been reimplemented many times.
Kerning and ligatures — "fi", "fl", "ffi" ligatures, pair-wise kerning (AV, To, etc.). comes free from OpenType font metrics.
Microtypography — hanging punctuation, optical margin alignment, character protrusion. nice to have, not essential for v1.
Hyphenation — Liang's algorithm (developed for TeX by Franklin Liang, Knuth's doctoral student). pattern-based, compact, effective. needed for good line breaking in narrow columns. TeX hyphenation patterns are freely available, though licenses vary by pattern file.
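To make the pattern-based idea concrete, here is a minimal sketch of Liang-style hyphenation. It uses a tiny hand-picked pattern subset that only covers the demo word; a real build would load a full pattern file and store it in a trie rather than scanning a list. The function names parsePattern and hyphenate, and the leftMin/rightMin defaults, are assumptions for illustration.

```javascript
// Split a pattern like "hy3ph" into its letters and the digit level at each gap.
function parsePattern(pattern) {
  const letters = [];
  const levels = [0]; // levels[k] is the gap before letter k
  for (const ch of pattern) {
    if (ch >= "0" && ch <= "9") {
      levels[levels.length - 1] = Number(ch);
    } else {
      letters.push(ch);
      levels.push(0);
    }
  }
  return { text: letters.join(""), levels };
}

// Return the word split at permitted hyphenation points.
function hyphenate(word, patterns, leftMin = 2, rightMin = 3) {
  const dotted = "." + word.toLowerCase() + "."; // dots mark the word boundaries
  const points = new Array(dotted.length + 1).fill(0);

  // Every matching pattern contributes its levels; the maximum wins at each gap.
  for (let start = 0; start < dotted.length; start++) {
    for (const { text, levels } of patterns) {
      if (dotted.startsWith(text, start)) {
        levels.forEach((level, k) => {
          points[start + k] = Math.max(points[start + k], level);
        });
      }
    }
  }

  // Odd levels allow a break, even levels forbid it.
  const parts = [];
  let last = 0;
  for (let i = leftMin; i <= word.length - rightMin; i++) {
    if (points[i + 1] % 2 === 1) { // gap before word[i] is points[i + 1] in the dotted word
      parts.push(word.slice(last, i));
      last = i;
    }
  }
  parts.push(word.slice(last));
  return parts;
}

// Illustrative subset only, not a usable pattern set.
const patterns = ["hy3ph", "he2n", "hena4", "hen5at", "1na", "n2at", "1tio", "2io"]
  .map(parsePattern);
const pieces = hyphenate("hyphenation", patterns);
console.log(pieces.join("-")); // hy-phen-ation
```

Each resulting break point would become a penalty item for the line breaker, so hyphenation only happens where it improves the paragraph as a whole.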
what we DON'T need from TeX
- macro expansion engine (we have JS)
- TeX's input language/parser (we have markdown + JS)
- math typesetting (atra book has minimal math; add later if needed)
- float placement algorithm (figures go where the author puts them)
- bibliography/citation system (not needed for v1)
prerequisites (auditable changes)
before gcu-press can work, auditable needs a few small additions:
1. markdown cell ${expr} interpolation
HTML cells already evaluate ${expr} against scope. markdown cells should too. this lets authors write As shown in ${ref.fig("dot")}... in prose cells.
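A rough sketch of how interpolation and two-pass reference numbering could fit together. Only ref.ch(), ref.ex() and ref.fig() come from the design above; createRefs, define, interpolate, the "Figure 1" display format and the "??" placeholder for unresolved labels are all assumptions. The regex also assumes the expression contains no closing brace.

```javascript
// Hypothetical registry: pass 1 calls define(), pass 2 calls ch()/ex()/fig().
function createRefs() {
  const counters = { ch: 0, ex: 0, fig: 0 };
  const titles = { ch: "Chapter", ex: "Example", fig: "Figure" };
  const numbers = new Map(); // e.g. "fig:dot" -> 1

  const lookup = (kind) => (label) => {
    const n = numbers.get(`${kind}:${label}`);
    return n === undefined ? `${titles[kind]} ??` : `${titles[kind]} ${n}`;
  };

  return {
    // Pass 1: give this label the next number of its kind.
    define(kind, label) {
      numbers.set(`${kind}:${label}`, ++counters[kind]);
    },
    // Pass 2: turn a label into display text.
    ch: lookup("ch"),
    ex: lookup("ex"),
    fig: lookup("fig"),
  };
}

// Evaluate each ${expr} in a markdown string against the given scope object.
function interpolate(markdown, scope) {
  const names = Object.keys(scope);
  const values = Object.values(scope);
  return markdown.replace(/\$\{([^}]*)\}/g, (_, expr) => {
    const fn = new Function(...names, `return (${expr});`);
    return String(fn(...values));
  });
}

const ref = createRefs();
const source = 'As shown in ${ref.fig("dot")}...';

const before = interpolate(source, { ref }); // label not collected yet
ref.define("fig", "dot");                    // pass 1 reaches the figure
const after = interpolate(source, { ref });  // pass 2 resolves the number
console.log(before); // As shown in Figure ??...
console.log(after);  // As shown in Figure 1...
```

The before/after pair shows why two passes are needed: prose can mention a figure that is only registered by a later cell.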
2. split view mode
side-by-side layout: cells (editors) on the left, continuous output flow on the right. this is the authoring experience for gcu-press — you edit on the left, see the "page" on the right. also a general UX improvement for auditable on wide screens.
3. notebook output introspection
extend notebook.cells to include rendered output — the innerHTML or DOM content of each cell's output element. gcu-press needs this to know what a code cell actually produced (a canvas? a table? text?).
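With rendered output exposed on each cell, the bridge's collection step might look roughly like this. The cell shape { type, source, output }, the type names, and the assumption that %hide and %norun appear as a leading // comment are all hypothetical. The output check is a crude string test; a real bridge would inspect the DOM.

```javascript
// Assumed directive form: a leading comment such as "// %hide" or "// %norun".
const HIDDEN = /^\s*\/\/\s*%(hide|norun)\b/;

// Guess what a cell produced from its serialized output (innerHTML string).
function outputKind(html) {
  if (/<canvas\b/i.test(html)) return "canvas";
  if (/<table\b/i.test(html)) return "table";
  return "text";
}

// Keep only content that should reach the typesetter.
function collectVisible(cells) {
  return cells
    .filter((cell) => cell.type !== "css" && !HIDDEN.test(cell.source))
    .map((cell) => {
      if (cell.type === "markdown") return { kind: "prose", text: cell.source };
      if (cell.type === "html") return { kind: "html", html: cell.output };
      return {
        kind: "code",
        source: cell.source,
        produced: outputKind(cell.output),
      };
    });
}

// Made-up notebook content for the demo.
const cells = [
  { type: "code", source: "// %hide\nconst ref = setup()", output: "" },
  { type: "markdown", source: "# first program", output: "<h1>first program</h1>" },
  { type: "css", source: "h1 { color: teal }", output: "" },
  { type: "code", source: "plot(data)", output: '<canvas width="300" height="200"></canvas>' },
];
const blocks = collectVisible(cells);
console.log(blocks.map((b) => b.kind)); // [ 'prose', 'code' ]
```

Knowing that the second block produced a canvas is what lets the PDF emitter embed it as an image instead of trying to typeset it as text.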
the atra book
the first project for gcu-press. a concise language reference and tutorial for atra.
structure
- what is atra — one page. Wasm compilation target, tagged templates, numerical focus.
- first program — running atra in an auditable notebook and from JS. "hello world" equivalents.
- types and values — f64, i32, bool. literals, type inference.
- variables and expressions — := assignment, arithmetic, comparison, logical operators, precedence.
- control flow — if/else, begin/end blocks, for (range), while, early return.
- functions — fn declaration, parameters, return types, multiple returns, recursion.
- memory — linear memory model, load/store, pointers as i32 offsets, arrays, structs-by-convention.
- interop — JS host imports, exports, tagged templates, ${} interpolation (numbers, strings, functions), curried form atra({imports}).
- libraries — std.include, source distributions (.src.js), binary distributions, alpack as example.
- worked examples — 3-4 programs of increasing complexity: vector math → matrix operations → a small simulation.
tone
like the Go Tour or K&R C — concise, practical, progressive. not a reference manual (that's the SPEC.md). not a textbook. assumes the reader can program but hasn't seen atra or Wasm before. ~40-60 pages typeset.
naming
gcu-press. from Geoscientific Chaos Union. the typesetting engine for auditable notebooks.
prior art and references
- Knuth and Plass, "Breaking Paragraphs into Lines" (1981) — the line-breaking algorithm
- Liang, "Word Hy-phen-a-tion by Com-put-er" (1983) — hyphenation patterns
- opentype.js — JS OpenType font parser
- pdf-lib — JS PDF creation library
- Computer Modern Unicode fonts — free, high-quality, the TeX look
