bbc-scraper

v1.1.1

A program, built for study purposes, that parses BBC News titles and content.

BBC News Scraper

This project is a simple web scraper for extracting news titles and content from the BBC News website. Please note that this scraper is designed for educational purposes and may not comply with the BBC's terms of service. Use it responsibly and ensure you have permission to scrape their content.

Features

  • Crawl the BBC News homepage for article titles and links.
  • Fetch the full content of individual news articles.

Usage

  1. Install the package:
npm install bbc-scraper
  2. Example usage in a script:
import { getBBCNewsTitles, getBBCNewsContent, configBBC } from "bbc-scraper";
async function main() {
  configBBC({
    imageResolution: "low", // "medium", "high"
  });
  const titles = await getBBCNewsTitles();
  console.log("Latest BBC News Titles:");
  titles.forEach((title) => {
    console.log(`${title.title} - ${title.newsLink}`);
  });

  if (titles.length > 0) {
    const content = await getBBCNewsContent(titles[0].newsLink);
    console.log(`Content of the first article: ${content.content}`);
  }
}
main().catch(console.error);
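
The example above fetches only the first article. When fetching several, one possible approach is to request them one at a time with a short pause in between, and to record failures instead of aborting. The sketch below assumes nothing about the package internals: `fetchSequentially`, `Outcome`, and `sleep` are hypothetical names that are not part of bbc-scraper, and the task passed in is expected to be something like `getBBCNewsContent`.

```typescript
// Hypothetical helper types and functions; none of these are exported by bbc-scraper.
interface Outcome<T> {
  item: string;
  value?: T;
  error?: unknown;
}

// Resolves after roughly `ms` milliseconds.
const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Runs `task` on each item in order, waiting `delayMs` between calls.
// A failing item is recorded in its outcome and does not stop the loop.
async function fetchSequentially<T>(
  items: string[],
  task: (item: string) => Promise<T>,
  delayMs: number
): Promise<Outcome<T>[]> {
  const outcomes: Outcome<T>[] = [];
  for (let i = 0; i < items.length; i++) {
    if (i > 0) {
      await sleep(delayMs);
    }
    try {
      outcomes.push({ item: items[i], value: await task(items[i]) });
    } catch (error) {
      outcomes.push({ item: items[i], error });
    }
  }
  return outcomes;
}
```

Inside `main`, this could be called as `await fetchSequentially(titles.slice(0, 3).map((t) => t.newsLink), getBBCNewsContent, 1000)`, assuming a one-second pause is acceptable for your use.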

Function Explanation

  • configBBC(config: BBCConfig): void: Configures the BBC scraper environment. Parameters:

    • imageResolution: one of "low", "medium", or "high".
    • titlePageUrl: the URL of the BBC News title page. PLEASE DO NOT SET THIS UNLESS YOU KNOW WHAT YOU'RE DOING.
    • rootUrl: the root URL of the BBC News website. PLEASE DO NOT SET THIS UNLESS YOU KNOW WHAT YOU'RE DOING.
  • crawler(url: string): Promise<string>: Fetches the HTML content of a given URL. This function returns a promise that resolves to the HTML content as a string.

  • getBBCNewsTitles(): Promise<NewsTitles[]>: Fetches the titles and links of the latest news articles from the BBC News homepage.

  • getBBCNewsContent(url: string): Promise<NewsContent>: Fetches the full content of a news article given its URL.

  • getNewsTitle(body: string): NewsTitles[]: Extracts news titles and links from the HTML content.

  • getNewsContent(body: string): NewsContent: Extracts the full content of a news article from the HTML content.
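
The list above separates fetching (`crawler`) from parsing (`getNewsTitle`, `getNewsContent`). A minimal sketch of how such a two-stage design composes is shown below. It uses stand-in functions so it runs offline: `fetchAndParse`, `fakeCrawler`, `fakeParser`, and `TitleEntry` are hypothetical names for illustration, and the regular expression is not how bbc-scraper parses pages.

```typescript
// Reduced shape of a title entry, using two fields from the Models section.
interface TitleEntry {
  title: string;
  newsLink: string;
}

// These mirror the documented signatures of crawler(url) and getNewsTitle(body).
type Fetcher = (url: string) => Promise<string>;
type Parser<T> = (body: string) => T;

// Hypothetical helper (not part of bbc-scraper): fetch the HTML, then parse it.
async function fetchAndParse<T>(
  fetchHtml: Fetcher,
  parse: Parser<T>,
  url: string
): Promise<T> {
  const body = await fetchHtml(url);
  return parse(body);
}

// Stand-ins for the real functions, so the sketch needs no network access.
const fakeCrawler: Fetcher = async (url) =>
  `<a href="${url}/story">Example headline</a>`;

const fakeParser: Parser<TitleEntry[]> = (body) => {
  const match = /<a href="([^"]+)">([^<]+)<\/a>/.exec(body);
  return match ? [{ newsLink: match[1], title: match[2] }] : [];
};
```

Because the parser takes a plain string, it can be exercised against saved HTML in tests, while the fetcher can be swapped for a stub as done here.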

Models

  • NewsTitles: Represents a news title entry with its cover image URL, title, description, and link.
{
  "coverUrl": "string | null",
  "title": "string",
  "description": "string",
  "newsLink": "string"
}
  • NewsContent: Represents the full content of a news article.
{
  "title": "string",
  "author": "string",
  "date": "string",
  "content": "string"
}
  • BBCEnvironment: Represents the environment configuration for the BBC scraper.
{
  "titlePageUrl": "string | null",
  "rootUrl": "string | null",
  "imageResolution": "string | null" // "low", "medium", "high"
}
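
For TypeScript users, the JSON shapes above can be sketched as interfaces. The field names and nullability are taken from the models listed here; the `ImageResolution` union and the helpers `isImageResolution` and `formatTitle` are assumptions added for illustration and may not match the package's own type declarations.

```typescript
// Assumed union of the three documented resolution values.
type ImageResolution = "low" | "medium" | "high";

interface NewsTitles {
  coverUrl: string | null;
  title: string;
  description: string;
  newsLink: string;
}

interface NewsContent {
  title: string;
  author: string;
  date: string;
  content: string;
}

interface BBCEnvironment {
  titlePageUrl: string | null;
  rootUrl: string | null;
  imageResolution: ImageResolution | null;
}

// Hypothetical helper: narrows an arbitrary value to a valid resolution,
// for example before passing user input to configBBC.
function isImageResolution(value: unknown): value is ImageResolution {
  return value === "low" || value === "medium" || value === "high";
}

// Hypothetical helper: formats one entry the same way the usage example prints it.
function formatTitle(entry: NewsTitles): string {
  return `${entry.title} - ${entry.newsLink}`;
}
```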

License

This project is licensed under the MIT License - see the LICENSE file for details.

Issues

If you encounter any issues while using the BBC News Scraper, please check the issues page for known problems and solutions. If your issue is not listed, feel free to open a new issue with a detailed description of the problem.

Development guide

  1. Clone the repository:
    Fork this repository first, then clone your fork. You can fork it from https://github.com/Rduanchen/BBC_Scraper.
git clone https://github.com/<yourusername>/BBC_Scraper.git
cd BBC_Scraper
  2. Install dependencies:
npm install
  3. Start developing:
npm run dev
  4. Run tests:
npm test

Collaboration

We welcome contributions to the BBC News Scraper project! If you'd like to contribute, please follow these steps:

  1. Fork the repository.
  2. Create a new branch for your feature or bug fix.
  3. Make your changes and commit them with descriptive messages.
  4. Push your changes to your forked repository.
  5. Submit a pull request to the main repository.

Please ensure that your code adheres to the project's coding standards and includes appropriate tests.

Author

My name is Rduan (Justin). You can contact me via email: [email protected]