@heripo/research-radar v3.1.1

AI-driven intelligence for Korean cultural heritage. This package serves as both a ready-to-use newsletter service and a practical implementation example for the LLM-Newsletter-Kit.
heripo Research Radar
English | 한국어
Code of Conduct • Security Policy • Contributing
What is this?
An AI-powered newsletter service for Korean cultural heritage. Built on @llm-newsletter-kit/core, it's both a production service (live at heripo.com) and a reference implementation showing how to build automated newsletters with LLMs.
Production metrics:
- Cost: $0.20–$1 USD per issue
- Operation: Fully autonomous 24/7 (no human intervention)
- Engagement: 15% CTR
Technical highlights:
- Type-safe TypeScript with strict interfaces
- Provider pattern for swapping components (Crawling/Analysis/Content/Email)
- 66 crawling targets across heritage agencies, museums, academic societies
- Multiple LLM providers: OpenAI GPT-5 (analysis) + selectable content generation (OpenAI / Anthropic / Google)
- Built-in retries, chain options, preview emails
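The provider pattern mentioned above can be sketched as follows. The interface and class names here are illustrative assumptions for demonstration only; the real interfaces live in @llm-newsletter-kit/core and may differ:

```typescript
// Illustrative sketch of the provider pattern — names are assumptions,
// not the actual @llm-newsletter-kit/core API.
interface Article {
  title: string;
  url: string;
}

interface CrawlingProvider {
  fetchArticles(targetUrl: string): Promise<Article[]>;
}

// A stub implementation: in production this would be an HTTP crawler;
// in tests or previews it can be swapped for canned data.
class StubCrawlingProvider implements CrawlingProvider {
  async fetchArticles(targetUrl: string): Promise<Article[]> {
    return [{ title: 'Example excavation report', url: `${targetUrl}/articles/1` }];
  }
}

// The pipeline depends only on the interface, so any component can be swapped.
async function crawlStep(provider: CrawlingProvider, target: string): Promise<Article[]> {
  return provider.fetchArticles(target);
}
```

Because each stage (Crawling/Analysis/Content/Email) is addressed through an interface like this, one component can be replaced without touching the rest of the pipeline.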
Links: Live service • Newsletter example • Core engine
Background
Created by archaeologist-turned-engineer Hongyeon Kim to answer: "Why must research rely on labor-intensive manual work?"
A personal script evolved into a production service after completing research on Archaeological Informatization Using LLMs. This repository open-sources the running service so developers can build domain-specific newsletters without starting from scratch.
License
Apache License 2.0 — see LICENSE and NOTICE for details.
Citation & Attribution
If you fork this project to build your own newsletter service or use this code in your research, please include the following attribution:
```
Powered by LLM Newsletter Kit
```

We recommend adding this notice to your newsletter template footer or service documentation. This attribution helps support the project and its continued development.
BibTeX Citation
For academic publications:
```bibtex
@software{heripo_research_radar,
  author = {Kim, Hongyeon},
  title = {heripo research radar},
  year = {2025},
  url = {https://github.com/heripo-lab/heripo-research-radar},
  note = {Apache License 2.0}
}
```

Installation
```bash
npm install @heripo/research-radar @llm-newsletter-kit/core
```

Requirements: Node.js >= 24, an OpenAI API key, and a content generation API key (OpenAI / Anthropic / Google).
Note: @llm-newsletter-kit/core is a peer dependency and must be installed separately.
Quick Start
```typescript
import { generateNewsletter } from '@heripo/research-radar';

const newsletterId = await generateNewsletter({
  openAIApiKey: process.env.OPENAI_API_KEY,
  contentGeneration: {
    provider: 'anthropic', // 'openai' | 'anthropic' | 'google'
    apiKey: process.env.ANTHROPIC_API_KEY,
    // model: 'claude-sonnet-4-6', // optional, uses a sensible default
  },

  // Implement these repository interfaces (see src/types/dependencies.ts)
  taskRepository: {
    createTask: async () => db.tasks.create({ status: 'running' }),
    completeTask: async (id) => db.tasks.update(id, { status: 'completed' }),
  },
  articleRepository: {
    findByUrls: async (urls) => db.articles.findByUrls(urls),
    saveCrawledArticles: async (articles, ctx) => db.articles.save(articles, ctx),
    findUnscoredArticles: async () => db.articles.findUnscored(),
    updateAnalysis: async (article) => db.articles.updateAnalysis(article),
    findCandidatesForNewsletter: async () => db.articles.findCandidates(),
  },
  tagRepository: {
    findAllTags: async () => db.tags.findAll(),
  },
  newsletterRepository: {
    getNextIssueOrder: async () => db.newsletters.getNextOrder(),
    saveNewsletter: async (data) => db.newsletters.save(data),
  },

  // Optional parameters:
  logger: console,
  publishDate: '2026-02-20', // Override publication date (ISO format)
  templateOptions: { /* ... */ }, // Newsletter template customization
  customFetch: proxyFetch, // Custom fetch for proxy-based crawling
  previewNewsletter: {
    fetchNewsletterForPreview: async () => db.newsletters.latest(),
    emailService: resendEmailService,
    emailMessage: { from: '[email protected]', to: '[email protected]' },
  },
});
```

Repository interfaces are defined in src/types/dependencies.ts. Each method signature includes JSDoc with expected input/output types.
Architecture
Pipeline: Crawling → Analysis → Content Generation → Save
- Crawling: Fetch articles from target websites
- Analysis: LLM tags and scores articles
- Generation: Create newsletter from top-scoring articles
- Save: Store and optionally send preview email
Uses the Provider-Service pattern from @llm-newsletter-kit/core. See core docs for flow diagrams.
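As a rough sketch, the four stages above compose sequentially. All names here are illustrative assumptions; the actual orchestration lives in @llm-newsletter-kit/core:

```typescript
// Hedged sketch of the four-stage pipeline (Crawling → Analysis → Generation → Save).
// Types and function names are illustrative, not the package's actual API.
type Article = { url: string; title: string; score?: number };

async function runPipeline(deps: {
  crawl: () => Promise<Article[]>;
  analyze: (a: Article) => Promise<number>;
  generate: (top: Article[]) => Promise<string>;
  save: (html: string) => Promise<string>;
}): Promise<string> {
  // 1. Crawling: fetch candidate articles from target sites
  const articles = await deps.crawl();
  // 2. Analysis: the LLM assigns a relevance score to each article
  for (const a of articles) a.score = await deps.analyze(a);
  // 3. Generation: build the newsletter from the top-scoring articles
  const top = [...articles]
    .sort((x, y) => (y.score ?? 0) - (x.score ?? 0))
    .slice(0, 5);
  const html = await deps.generate(top);
  // 4. Save: persist the newsletter and return its id
  return deps.save(html);
}
```

In the real package each `deps` entry corresponds to a swappable provider or repository, which is what makes the stages independently replaceable.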
Components
Config (src/config/): Brand, language, LLM settings
Targets (src/config/crawling-targets.ts): 66 sources (News 52, Business 4, Employment 10)
Parsers (src/parsers/): Custom extractors per organization
Templates (src/templates/): newsletter-html.ts (responsive email with light/dark mode), welcome-html.ts (generateWelcomeHTML()), shared.ts (shared HTML components)
Development commands
```bash
# build
npm run build          # clean dist/ and build with Rollup (CJS + ESM + types)

# type-check & lint
npm run lint           # lint source files
npm run lint:fix       # lint with autofix
npm run typecheck      # TypeScript type-check

# formatting
npm run format         # Prettier formatting
```

Crawler Debugger
A web-based tool for testing crawling parsers during development. Built with Express.js and vanilla HTML/CSS/JS to minimize dependencies.
```bash
npm run dev:crawler        # Start at http://localhost:3333
npm run dev:crawler:proxy  # Start with proxy support (uses .env)
```

Features:
- Test parseList() and parseDetail() parsers via the web UI
- View raw HTML source for debugging
- Copy parsed results as JSON
- 5-minute response cache (with skip/clear options)
- Timing info for fetch and parse operations
Newsletter Preview
Preview rendered newsletter HTML with sample content.
```bash
npm run dev:newsletter-preview  # Start at http://localhost:3334
```

Query params: ?kras=true (KRAS mode), ?krasNews=true (KRAS news section), ?heripolabNews=true (heripo lab news section)
Welcome Email Preview
Preview rendered welcome email HTML.
```bash
npm run dev:welcome-preview  # Start at http://localhost:3335
```

Query params: ?kras=true (KRAS mode), ?name=홍길동 (subscriber name)
Parser Health-Check
CLI tool that validates all active crawling parsers against live websites. Detects silent failures (empty results, broken selectors) caused by upstream website redesigns.
```bash
npm run health-check        # Run health-check
npm run health-check:proxy  # Run with proxy support (uses .env)
```

What it checks per target:
- parseList(): returns a non-empty array with valid title, date, and detailUrl
- parseDetail(): returns non-empty detailContent (20+ chars)
Output: Console table summary + compact text summary for CI integrations.
CI: Daily automated run via GitHub Actions (.github/workflows/parser-health-check.yml) with Slack notifications on pass/fail.
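The health-check contract above can be illustrated with a minimal parser sketch. The HTML shapes and regexes here are hypothetical; real parsers in src/parsers/ target each organization's actual markup, typically with a proper HTML parser rather than regexes:

```typescript
// Illustrative parser shapes implied by the health-check contract.
// The actual interface in src/parsers/ may differ.
type ListItem = { title: string; date: string; detailUrl: string };

// parseList: extract article entries from a listing page's HTML.
function parseList(html: string): ListItem[] {
  const items: ListItem[] = [];
  // Hypothetical markup: <a href="..." data-date="...">title</a>
  const re = /<a href="([^"]+)" data-date="([^"]+)">([^<]+)<\/a>/g;
  for (const m of html.matchAll(re)) {
    items.push({ detailUrl: m[1], date: m[2], title: m[3] });
  }
  return items;
}

// parseDetail: extract the article body (health-check requires 20+ chars).
function parseDetail(html: string): { detailContent: string } {
  const m = html.match(/<article>([\s\S]*?)<\/article>/);
  return { detailContent: (m?.[1] ?? '').trim() };
}
```

A health-check run then simply calls both functions against live pages and asserts the contract: non-empty list with valid fields, and a detail body long enough to rule out a silently broken selector.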
🤝 Contributing
You can use this project in two ways:
- Contribute directly to Heripo Research Radar: bug fixes, improvements, new crawl targets
- Build your own newsletter: fork this repo and adapt it to your domain
See CONTRIBUTING.md for contribution workflow, dev setup, and PR guidelines.
Forking for Your Domain
To build your own newsletter, update these files:
1. Template (src/templates/newsletter-html.ts):
- Logo URLs, brand colors (#D2691E, #E59866), contact info
- Platform intro and footer text
- Unsubscribe link format (currently Resend's {{{RESEND_UNSUBSCRIBE_URL}}})

2. Config (src/config/index.ts):

```typescript
brandName: 'Your Newsletter Name',
subscribeUrl: 'https://yourdomain.com/subscribe',
```

3. Crawling targets (src/config/crawling-targets.ts):
- Replace Korean heritage sites with your domain sources
- Implement parsers in src/parsers/
4. Switch content generation LLM provider (optional):
Content generation supports 3 built-in providers — just change contentGeneration.provider:
```typescript
contentGeneration: {
  provider: 'google', // 'openai' | 'anthropic' | 'google'
  apiKey: process.env.GOOGLE_API_KEY,
  model: 'gemini-3.1-pro-preview', // optional, each provider has a default
}
```

Default models: openai=gpt-5.1, anthropic=claude-sonnet-4-6, google=gemini-3.1-pro-preview
Analysis provider (OpenAI) can be changed by modifying src/providers/analysis.provider.ts. Any Vercel AI SDK provider works.
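For illustration, the default-model fallback described above might look like this. This is a hypothetical helper; the real resolution logic is internal to the package:

```typescript
// Hypothetical helper mirroring the default-model table above.
type ContentProvider = 'openai' | 'anthropic' | 'google';

const DEFAULT_MODELS: Record<ContentProvider, string> = {
  openai: 'gpt-5.1',
  anthropic: 'claude-sonnet-4-6',
  google: 'gemini-3.1-pro-preview',
};

function resolveModel(cfg: { provider: ContentProvider; model?: string }): string {
  // An explicit model wins; otherwise fall back to the provider's default.
  return cfg.model ?? DEFAULT_MODELS[cfg.provider];
}
```

This is why changing only `contentGeneration.provider` is enough to switch vendors: the model is filled in for you unless you override it.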
Why Code-Based?
Code-based automation delivers superior output quality through advanced AI techniques:
- No-code platforms: generic content, limited to built-in features
- This kit: self-reflection, chain-of-thought, and multi-step verification workflows
Key advantages:
- Quality: Sophisticated prompting strategies, custom validation pipelines
- Cost control: Different models per step, token limits, retry logic
- Flexibility: Swap any component (Crawling/Analysis/Content/Email) via Provider interfaces
- Operations: Built-in retries, preview emails, integrates with CI/CD
- No lock-in: OSS, self-hostable, any LLM provider
Design philosophy:
- Logic in code (orchestration, deduplication)
- Reasoning in AI (analysis, scoring, content generation)
- Connections in architecture (swappable Providers)
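The "logic in code, reasoning in AI" split above can be sketched with a simple self-reflection loop: the orchestration (looping, acceptance check) is plain code, while drafting and critique are delegated to the LLM. All names and the acceptance check here are illustrative assumptions, not the package's actual workflow:

```typescript
// Hedged sketch of a self-reflection loop: code orchestrates, the LLM reasons.
// The `llm` callback stands in for any provider's text-generation call.
async function generateWithReflection(
  llm: (prompt: string) => Promise<string>,
  topic: string,
  maxRevisions = 2,
): Promise<string> {
  // Step 1: initial draft
  let draft = await llm(`Write a newsletter section about: ${topic}`);
  for (let i = 0; i < maxRevisions; i++) {
    // Step 2: ask the model to critique its own draft
    const critique = await llm(`Critique this draft:\n${draft}`);
    if (critique.includes('OK')) break; // naive acceptance check (illustrative)
    // Step 3: revise the draft using the critique as feedback
    draft = await llm(`Revise the draft using this critique:\n${critique}\n\nDraft:\n${draft}`);
  }
  return draft;
}
```

A loop like this is exactly what no-code platforms make hard to express, and it is where per-step model choice and retry logic naturally attach.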
Related Projects
- @llm-newsletter-kit/core — Domain-agnostic newsletter engine
- Archaeological Informatization Using LLMs — Academic research (Korean)
