lark-hirono

v0.1.29

Published

3 months ago

Feishu (Lark) doc creation pipeline — markdown to styled Feishu documents with heading numbering, table conversion, and upload workflow


hirono

feishu lark markdown document pipeline lark-table

Markdown → Styled Feishu (Lark) documents with heading numbering, table conversion, and narrative optimizations.

Features

Subcommand CLI — upload, optimize, fetch, analyze, highlight, verify, auth
Document types — Automatic analysis: catalog_table, data_table, narrative, mixed
Heading styling — Blue number prefix + rainbow backgrounds per level
Chinese ordinals — 一、二、 → 1. 2. auto-convert
Table conversion — Markdown tables → <lark-table> XML with proportional widths
Keyword highlighting — LLM-assisted {red:keyword} for table titles
Narrative optimizations — Callout injection, code block tagging, signpost bolding
Callout DSL — [!callout ...]...[/callout] bracket syntax auto-converted to <callout> XML
Mermaid diagrams — <add-ons> whiteboard blocks ↔ ```mermaid ``` fences; auto-theming + whiteboard patching
Chunked upload — Large docs (>200KB) split automatically
Verify — Block-level structure validation
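The Chinese ordinal conversion listed above can be illustrated with a minimal sketch. The helper names `parseChineseNumber` and `convertOrdinal` are hypothetical, the 1 to 99 range is an assumption, and this is not the package's actual headings.ts code.

```typescript
// Hypothetical helpers for illustration only.
const DIGITS = new Map<string, number>([
  ["一", 1], ["二", 2], ["三", 3], ["四", 4], ["五", 5],
  ["六", 6], ["七", 7], ["八", 8], ["九", 9],
]);

// Parse a Chinese numeral between 1 and 99 (十二 -> 12, 二十 -> 20).
function parseChineseNumber(text: string): number | null {
  if (text.length === 0) return null;
  const tenIndex = text.indexOf("十");
  if (tenIndex === -1) {
    return text.length === 1 ? DIGITS.get(text) ?? null : null;
  }
  const tensPart = text.slice(0, tenIndex);
  const onesPart = text.slice(tenIndex + 1);
  const tens = tensPart === "" ? 1 : DIGITS.get(tensPart);
  const ones = onesPart === "" ? 0 : DIGITS.get(onesPart);
  if (tens === undefined || ones === undefined) return null;
  return tens * 10 + ones;
}

// Rewrite a leading "一、" ordinal (optionally after "#" heading markers)
// to "1. "; lines without such an ordinal are returned unchanged.
function convertOrdinal(line: string): string {
  const match = /^(#{1,6}\s+)?([一二三四五六七八九十]+)、\s*/.exec(line);
  if (!match) return line;
  const prefix = match[1] ?? "";
  const value = parseChineseNumber(match[2] ?? "");
  if (value === null) return line;
  return `${prefix}${value}. ${line.slice(match[0].length)}`;
}
```

Under these assumptions, `convertOrdinal("## 一、背景")` yields `## 1. 背景`, and a line without a leading ordinal passes through untouched.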

Install

CLI (npm)

Requires Node.js 20+ and lark-cli, the Feishu CLI that lark-hirono depends on. Install the latest release from npm:

npm install -g lark-hirono@latest

Install `lark-cli`

lark-hirono shells out to lark-cli for auth, document APIs, and uploads. Requires lark-cli >= 1.0.9 (whiteboard query API). Install or update:

mkdir -p /tmp/larkcli
cd /tmp/larkcli
npm init -y
npm install @larksuite/cli
node node_modules/@larksuite/cli/scripts/install.js

Claude Code Skill

npx skills add charles1614/lark-hirono

Then use /lark-hirono in Claude Code. See skills/lark-hirono/SKILL.md for action reference.

Local Development

npm install

Authentication

Before using the CLI or skill, authenticate with Feishu:

lark-cli auth login --domain docs

Usage

Upload Local Markdown

lark-hirono upload input.md --title "My Document" --verify

Optimize Existing Document

Create optimized sibling (recommended):

lark-hirono optimize --doc GzlQwunV9iQAqmkQqOBcZzugnjf --new --verify

Update in place (not recommended, because Feishu's Markdown export is lossy; see Limitations):

lark-hirono optimize --doc GzlQwunV9iQAqmkQqOBcZzugnjf --verify

Fetch Document as Markdown

lark-hirono fetch --doc GzlQwunV9iQAqmkQqOBcZzugnjf --output out.md

Analyze Document Type

lark-hirono analyze input.md
# Output: {"document_type": "narrative", "headings": 15, "tables": "0/0"}
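As a rough illustration of the kind of counting behind this output, the sketch below applies the narrative rule this README states under Narrative Optimizations (at least 3 headings, no tables). The function `countStructure` and its field names are hypothetical and do not mirror core/analyze.ts. The rules for the other document types are not given here, so the sketch only answers "narrative or not".

```typescript
interface StructureCounts {
  headings: number;
  tables: number;
  isNarrative: boolean;
}

// Hypothetical classifier: counts ATX headings and table delimiter rows
// outside fenced code blocks.
function countStructure(markdown: string): StructureCounts {
  let headings = 0;
  let tables = 0;
  let inFence = false;
  for (const line of markdown.split("\n")) {
    if (/^\s*(`{3}|~{3})/.test(line)) {
      inFence = !inFence; // toggle on opening and closing fences
      continue;
    }
    if (inFence) continue;
    if (/^#{1,6}\s+\S/.test(line)) headings += 1;
    // A delimiter row such as |---|:---:| marks one Markdown table.
    const isDelimiterRow = /^\s*\|?\s*:?-+:?\s*(\|\s*:?-+:?\s*)*\|?\s*$/.test(line);
    if (isDelimiterRow && line.includes("|")) tables += 1;
  }
  // Rule taken from this README: >= 3 headings and no tables.
  return { headings, tables, isNarrative: headings >= 3 && tables === 0 };
}
```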

Verify Existing Document

lark-hirono verify --doc GzlQwunV9iQAqmkQqOBcZzugnjf

Commands

| Command | Description |
|---------|-------------|
| upload <file.md> | Create new styled Feishu document |
| optimize --doc <id> | Optimize existing document |
| fetch --doc <id> | Retrieve document as markdown |
| analyze <file.md> | Analyze document structure |
| highlight <subcommand> | Extract/apply keyword highlights |
| verify --doc <id> | Validate document structure |
| auth <subcommand> | Feishu authentication |

Common Flags

| Flag | Description |
|------|-------------|
| --doc <id> | Feishu document ID |
| --new | Create sibling doc instead of updating |
| --input <file> | Local markdown source |
| --title <title> | Document title |
| --wiki-space <id> | Target wiki space |
| --wiki-node <token> | Target parent node |
| --bg-mode light\|dark | Heading background palette |
| --verify | Validate after upload/optimize |
| --no-highlight | Skip keyword highlighting |
| --frontmatter-as-callout | Convert YAML frontmatter into a Meta callout |
| --mention-map <path> | JSON map URL → { obj_token, obj_type?, title? }; patches matching text_run links into native mention_doc blocks after upload so Feishu backlinks populate |
| --dry-run | Output to stdout, no API calls |
| -v, --verbose | Detailed logging |

Pipeline

Upload:   Read → Normalize → Analyze → Preprocess → Split → Highlight
          → LarkTable → Upload → Patch → Whiteboard → Verify

Optimize: Fetch → Normalize → Analyze → Narrative → Preprocess → Split
          → LarkTable → Create → Patch → Whiteboard → Verify

Key Steps

Normalize — [!callout] DSL → <callout> XML; <quote-container> → <callout> (in tables) / > blockquote (elsewhere); <add-ons> mermaid → fences; mermaid theming; HTML (<p>, <ul>, <li>, <strong>, <a>) → clean Markdown
Analyze — Classify document type based on table count and heading density
Narrative — For narrative docs: callout injection, code block tagging, signpost bolding
Preprocess — Heading numbering, strip title, blue prefix + rainbow backgrounds
Split — Oversized sections (>40KB) into chunks for API limits
Highlight — Apply {red:keyword} from LLM-selected keywords (table docs only)
LarkTable — Markdown tables → <lark-table> XML
Upload/Create — Chunked doc creation via lark-cli
Patch — Block-level PATCH for heading backgrounds
Whiteboard patch — Color-patch mermaid edge label backgrounds in whiteboard blocks
Verify — Structure validation (block-level, not markdown export)
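To make the Split step concrete, here is a minimal sketch of size-based chunking. It assumes that 40KB means 40 × 1024 UTF-8 bytes and that chunks break at blank-line boundaries; both are assumptions, and `splitBySize` is a hypothetical name rather than the function in core/chunked.ts.

```typescript
const encoder = new TextEncoder();

// Size in UTF-8 bytes (CJK characters take 3 bytes each).
const byteLength = (text: string): number => encoder.encode(text).length;

// Greedily pack blank-line-separated blocks into chunks of at most maxBytes.
// A single block larger than maxBytes is emitted as its own oversized chunk.
function splitBySize(markdown: string, maxBytes: number = 40 * 1024): string[] {
  const chunks: string[] = [];
  let current = "";
  for (const block of markdown.split(/\n{2,}/)) {
    const candidate = current === "" ? block : `${current}\n\n${block}`;
    if (current !== "" && byteLength(candidate) > maxBytes) {
      chunks.push(current); // close the current chunk before it overflows
      current = block;
    } else {
      current = candidate;
    }
  }
  if (current !== "") chunks.push(current);
  return chunks;
}
```

Measuring bytes rather than characters matters for Chinese text, where a string's `length` can understate its payload size by roughly a factor of three.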

Narrative Optimizations

For documentType === "narrative" (≥3 headings, no tables):

Opening callout — Inject <callout emoji="bulb"> XML with first paragraph as description (skipped if already present, including [!callout] DSL form)
Code block tagging — Detect bash, nginx, yaml, python from content patterns
Blockquote conversion — TL;DR and summary phrases → callout format
Signpost bolding — Emphasize transition phrases (具体来说, 值得注意的是)
Chatbot tail stripping — Remove LLM artifact text from end of fetched docs
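Signpost bolding, for instance, could look roughly like the sketch below. The two phrases come from the list above; the function name `boldSignposts`, the use of `**` markers, and the choice to skip fenced code are assumptions rather than the behavior of core/narrative.ts.

```typescript
const SIGNPOSTS: string[] = ["具体来说", "值得注意的是"];

// Escape regex metacharacters so arbitrary phrases can be matched literally.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Wrap each signpost phrase in ** markers, leaving fenced code blocks and
// phrases that are already wrapped in ** untouched.
function boldSignposts(markdown: string, phrases: string[] = SIGNPOSTS): string {
  let inFence = false;
  return markdown
    .split("\n")
    .map((line) => {
      if (/^\s*(`{3}|~{3})/.test(line)) {
        inFence = !inFence;
        return line;
      }
      if (inFence) return line;
      let result = line;
      for (const phrase of phrases) {
        const pattern = new RegExp(`(?<!\\*\\*)${escapeRegExp(phrase)}(?!\\*\\*)`, "g");
        result = result.replace(pattern, () => `**${phrase}**`);
      }
      return result;
    })
    .join("\n");
}
```

The lookbehind and lookahead keep the transform idempotent, so running it again on an already optimized document does not stack extra asterisks.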

Config File

lark-hirono.json in current directory or ancestors:

{
  "wikiSpace": "7620053427331681234",
  "wikiNode": "UNtHwabqNiqc8ZkzvLscWNnwnYd",
  "bgMode": "light",
  "highlight": true
}

Priority: CLI flags → config file → built-in defaults.
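The lookup and priority order can be sketched as follows. The interface fields mirror the example config above; the function names (`findConfig`, `defined`, `resolveConfig`) and the caller-supplied defaults are hypothetical, since the built-in default values are not listed here.

```typescript
import { existsSync, readFileSync } from "node:fs";
import { dirname, join } from "node:path";

interface HironoConfig {
  wikiSpace?: string;
  wikiNode?: string;
  bgMode?: "light" | "dark";
  highlight?: boolean;
}

// Walk from startDir toward the filesystem root; return the first match.
function findConfig(startDir: string, fileName: string = "lark-hirono.json"): string | null {
  let dir = startDir;
  while (true) {
    const candidate = join(dir, fileName);
    if (existsSync(candidate)) return candidate;
    const parent = dirname(dir);
    if (parent === dir) return null; // reached the root without a match
    dir = parent;
  }
}

// Drop undefined entries so an unset flag never overrides a lower layer.
function defined<T extends object>(layer: T): Partial<T> {
  const result: Partial<T> = {};
  for (const key of Object.keys(layer) as (keyof T)[]) {
    if (layer[key] !== undefined) result[key] = layer[key];
  }
  return result;
}

// Later spreads win: defaults < config file < CLI flags.
function resolveConfig(
  flags: HironoConfig,
  startDir: string,
  defaults: HironoConfig,
): HironoConfig {
  const path = findConfig(startDir);
  const fileConfig: HironoConfig = path ? JSON.parse(readFileSync(path, "utf8")) : {};
  return { ...defaults, ...defined(fileConfig), ...defined(flags) };
}
```

Filtering out `undefined` before spreading is the design point: a flag that was simply not passed should fall through to the config file rather than erase its value.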

Tests

npm test  # 127 checks, all local file-based

Architecture

src/
├── pipeline.ts          # Master orchestration
├── cli.ts               # Lark CLI wrapper (auth, API calls)
├── config.ts            # Config file resolution
├── commands/            # CLI subcommands
│   ├── upload.ts
│   ├── optimize.ts
│   ├── fetch.ts
│   ├── analyze.ts
│   ├── highlight.ts
│   ├── verify.ts
│   └── auth.ts
├── core/
│   ├── analyze.ts       # Document classification
│   ├── normalize.ts     # HTML→Markdown cleanup
│   ├── preprocess.ts    # Heading numbering, strip-title
│   ├── narrative.ts     # Narrative doc optimizations
│   ├── headings.ts      # Chinese ordinal conversion
│   ├── lark-table.ts    # Table → XML conversion
│   ├── chunked.ts       # Large doc splitting
│   └── highlight.ts     # Keyword highlighting
├── patch/
│   └── patch.ts         # Heading background PATCH
├── whiteboard/
│   └── mermaid-patch.ts # Mermaid whiteboard color patching
├── image/
│   └── images.ts        # Image upload
└── verify/
    └── verify.ts        # Structure validation

Limitations

Feishu Markdown Export Corruption

lark-cli docs +fetch does not faithfully round-trip markdown:

Plain text blocks exported as ## headings
Consecutive paragraph lines merged (blank lines lost)
Code block language tags stripped (bash → plaintext)
Callout format simplified (blank lines removed)

Mitigation: Verify uses block-level structure (accurate), not markdown export (corrupted).

lark-cli Markdown Parser Limitations

The following are lark-cli parser behaviors that cannot be fixed at the pipeline level:

# in table cells — # at line start creates a heading even inside <lark-td> and <callout> blocks. \# renders as literal \# (backslash visible). Pipeline inserts a zero-width space (U+200B) before # to prevent heading parsing.
***bold-italic*** with underscores — Any _ inside ***...*** causes lark-cli to strip the bold, leaving only *italic*. This applies to a double underscore (__), a single _, and even the <strong><em> HTML form. Use backtick code formatting for metric names that contain underscores.
<quote-container> is fetch-only — lark-cli emits <quote-container> XML when fetching but silently drops it on upload. Pipeline converts: inside tables → <callout>, outside → blockquote >.
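The zero-width-space workaround for a leading # can be sketched in a few lines. `guardLeadingHashes` is a hypothetical name, and treating leading spaces and tabs as part of the line start is an assumption; the pipeline's actual rule may differ.

```typescript
const ZERO_WIDTH_SPACE = "\u200B";

// Insert U+200B before any "#" that begins a line (after optional spaces or
// tabs), so the text is no longer parsed as a heading. A "#" elsewhere in the
// line is left alone.
function guardLeadingHashes(cellText: string): string {
  return cellText.replace(/^([ \t]*)#/gm, (_match, indent: string) => {
    return `${indent}${ZERO_WIDTH_SPACE}#`;
  });
}
```

Because a guarded line now starts with U+200B instead of #, applying the function a second time changes nothing.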

LLM Content Emphasis Not Automated

The optimization guide requires LLM judgment for:

{red:关键结论} — identifying key conclusions
{green:技术术语} — identifying technical terms
Insight callouts — content understanding

These are not automated in the pipeline. Use the skill layer (skills/lark-hirono/SKILL.md) for LLM-assisted content optimization.

License

MIT
