asimov-brightdata-module
v0.0.0
ASIMOV module for data import powered by the Bright Data web data platform.
ASIMOV Bright Data Module
ASIMOV module for data import powered by the Bright Data web data platform.
✨ Features
- Imports structured data from Airbnb, Amazon, Crunchbase, eBay, Facebook, Google, Indeed, Instagram, LinkedIn, Walmart, X (aka Twitter), Yahoo, and YouTube.
- Collects the raw JSON data via the Bright Data API (requires an API key).
- Constructs a semantic knowledge graph based on the KNOW ontology.
- Supports plain JSON output as well as RDF output in the form of JSON-LD.
- Distributed as a standalone static binary with zero runtime dependencies.
🛠️ Prerequisites
- Rust 1.85+ (2024 edition) if building from source code
⬇️ Installation
Installation from PyPI
pip install -U asimov-brightdata-module
Installation from RubyGems
gem install asimov-brightdata-module
Installation from NPM
npm install -g asimov-brightdata-module
Installation from Source Code
cargo install asimov-brightdata-module
👉 Examples
export BRIGHTDATA_API_KEY="..."
Fetching X Profiles
asimov-brightdata-fetcher https://x.com/bright_init # JSON
asimov-brightdata-importer https://x.com/bright_init # JSON-LD
Fetching LinkedIn Profiles
asimov-brightdata-fetcher https://www.linkedin.com/in/orlenchner/
asimov-brightdata-fetcher https://www.linkedin.com/company/bright-data/
Fetching Crunchbase Profiles
asimov-brightdata-fetcher https://www.crunchbase.com/organization/brightdata
Fetching Amazon Products
asimov-brightdata-fetcher https://www.amazon.com/Master-Algorithm-Ultimate-Learning-Machine/dp/0465094279
⚙️ Configuration
Environment Variables
BRIGHTDATA_API_KEY: (required) the Bright Data API key to use
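Because the API key is required, a wrapper script can fail early when it is missing. The following is a minimal sketch for a POSIX shell; the function name `require_brightdata_key` is hypothetical and not part of the module.

```shell
# Hypothetical helper (not part of the module): fail fast when the key is missing.
require_brightdata_key() {
  if [ -z "${BRIGHTDATA_API_KEY:-}" ]; then
    echo "BRIGHTDATA_API_KEY is not set" >&2
    return 1
  fi
  return 0
}

# Usage: only call the fetcher once the key is known to be present.
# require_brightdata_key && asimov-brightdata-fetcher https://x.com/bright_init
```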
📚 Reference
Installed Binaries
- asimov-brightdata-cataloger: discovers entities via the Bright Data API (not implemented yet)
- asimov-brightdata-fetcher: collects JSON data from the Bright Data API
- asimov-brightdata-importer: collects and transforms JSON into JSON-LD (not implemented yet)
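To fetch several URLs in one go, the fetcher can be driven from a small loop. This is a sketch only: it assumes the fetcher takes a single URL argument and prints JSON to standard output, and the `fetch_all` helper and the `FETCHER` override variable are hypothetical names introduced here for illustration.

```shell
# Assumption: the fetcher takes one URL argument and prints JSON to stdout.
# FETCHER is a hypothetical override, useful for dry runs with a stub command.
FETCHER="${FETCHER:-asimov-brightdata-fetcher}"

# Hypothetical helper: read URLs from stdin, one per line, and write each
# result to its own numbered file in the given output directory.
fetch_all() {
  out_dir="$1"
  mkdir -p "$out_dir" || return 1
  n=0
  while IFS= read -r url; do
    # Skip blank lines and comment lines.
    case "$url" in ''|'#'*) continue ;; esac
    n=$((n + 1))
    # Redirect stdin so the command cannot consume the remaining URL list.
    "$FETCHER" "$url" < /dev/null > "$out_dir/$n.json" || echo "failed: $url" >&2
  done
}

# Usage: fetch_all ./out < urls.txt
```

Writing one file per URL keeps a failed request from corrupting the results of the others.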
Supported Datasets
| Dataset | URL Prefix | JSON | RDF |
| :------ | :--------- | :--: | :--: |
| Airbnb | https://www.airbnb.com/rooms/ | ✅ | 🚧 |
| Amazon | https://www.amazon.com/ | ✅ | 🚧 |
| | https://www.amazon.com/sp?seller= | ✅ | 🚧 |
| Crunchbase | https://www.crunchbase.com/organization/ | ✅ | 🚧 |
| eBay | https://www.ebay.com/itm/ | ✅ | 🚧 |
| Facebook | https://www.facebook.com/events/ | ✅ | 🚧 |
| | https://www.facebook.com/groups/ | ✅ | 🚧 |
| | https://www.facebook.com/marketplace/item/ | ✅ | 🚧 |
| | https://www.facebook.com/share/p/ | ✅ | 🚧 |
| Google | https://www.google.com/shopping/product/ | ✅ | 🚧 |
| Indeed | https://www.indeed.com/cmp/ | ✅ | 🚧 |
| Instagram | https://www.instagram.com/ | ✅ | 🚧 |
| | https://www.instagram.com/p/ | ✅ | 🚧 |
| | https://www.instagram.com/reel/ | ✅ | 🚧 |
| LinkedIn | https://www.linkedin.com/company/ | ✅ | 🚧 |
| | https://www.linkedin.com/in/ | ✅ | 🚧 |
| | https://www.linkedin.com/jobs/ | ✅ | 🚧 |
| | https://www.linkedin.com/posts/ | ✅ | 🚧 |
| | https://www.linkedin.com/pulse/ | ✅ | 🚧 |
| Walmart | https://www.walmart.com/global/seller/ | ✅ | 🚧 |
| | https://www.walmart.com/ip/ | ✅ | 🚧 |
| X (Twitter) | https://x.com/ | ✅ | ✅ |
| Yahoo | https://finance.yahoo.com/quote/ | ✅ | 🚧 |
| YouTube | https://www.youtube.com/@ | ✅ | 🚧 |
| | https://www.youtube.com/watch?v= | ✅ | 🚧 |
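A script can check a URL against these prefixes before spending an API call on it. The sketch below covers only a subset of the table and uses plain prefix matching; the `is_supported_url` helper is hypothetical, and how the module itself routes URLs to datasets is not stated here.

```shell
# Hypothetical helper: succeed if the URL starts with one of the prefixes
# from the table (only a subset is listed here; extend as needed).
is_supported_url() {
  case "$1" in
    "https://www.airbnb.com/rooms/"*) return 0 ;;
    "https://www.crunchbase.com/organization/"*) return 0 ;;
    "https://www.ebay.com/itm/"*) return 0 ;;
    "https://www.linkedin.com/company/"*) return 0 ;;
    "https://www.linkedin.com/in/"*) return 0 ;;
    "https://x.com/"*) return 0 ;;
    "https://www.youtube.com/@"*) return 0 ;;
    # The prefix is quoted so that "?" is matched literally, not as a wildcard.
    "https://www.youtube.com/watch?v="*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage:
# is_supported_url "$url" && asimov-brightdata-fetcher "$url"
```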
👨‍💻 Development
git clone https://github.com/asimov-modules/asimov-brightdata-module.git