@agentics/get
v0.2.2
A versatile command-line tool and library for making HTTP requests, scraping web content, automating web interactions, loading/saving cookies, and using Puppeteer for JavaScript evaluation.
Features
- Perform HTTP GET and POST requests.
- Scrape web pages using CSS selectors.
- Extract text, HTML, links, images, and cookies.
- Evaluate custom JavaScript on web pages using Puppeteer.
- Save and load cookies to/from files.
- Use customizable request headers and body.
- Puppeteer integration for headless or visible browser execution.
- Save responses, cookies, and evaluated JavaScript results to files.
Installation
npm install -g @agentics/get
Usage
Command-Line Usage
get [options]
Options
| Option | Alias | Type | Description | Default |
|------------------------|-------|---------|-------------------------------------------------------------------------------------------------------|---------|
| --get | -g | Boolean | Perform a GET request (default). | true |
| --post | -p | Boolean | Perform a POST request. | false |
| --url <url> | -u | String | URL to request. (Required) | |
| --eval <script> | -e | String | Evaluate JavaScript on the page using Puppeteer. | |
| --save | -s | Boolean | Save response, cookies, and evaluation result to files (response.json, cookies.json, eval.json). | false |
| --selector <css> | -S | String | CSS selector to scrape content from the page. | |
| --links | -l | Boolean | Return all links (<a href="">) from the page. | false |
| --text | -t | Boolean | Return the text content of the page (default if no other output option is specified). | false |
| --html | -H | Boolean | Return the HTML content of the page. | false |
| --cookies | -c | Boolean | Return cookies from the response headers. | false |
| --header <key:value> | -h | String | Custom headers for the request (can be used multiple times). | |
| --jar <file> | -j | String | Load cookies from a file (in .json format). | |
| --browser | -b | Boolean | Show browser window (Puppeteer headless mode is off). | false |
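The CLI takes headers as repeated key:value strings, while the library's headers option (see Programmatic Usage) takes an object. A minimal sketch of converting one form to the other, for instance when porting a shell command to Node.js. The helper name parseHeaderArgs is hypothetical and not part of @agentics/get; the tool's own parsing may differ.

```javascript
// Hypothetical helper: not part of @agentics/get.
// Converts repeated "Key: Value" strings (as passed to --header) into a plain object.
function parseHeaderArgs(headerArgs) {
  const headers = {};
  for (const arg of headerArgs) {
    // Split on the first colon only, so values such as URLs keep their colons.
    const idx = arg.indexOf(':');
    if (idx === -1) {
      throw new Error(`Invalid header (expected "key:value"): ${arg}`);
    }
    const key = arg.slice(0, idx).trim();
    const value = arg.slice(idx + 1).trim();
    headers[key] = value;
  }
  return headers;
}

console.log(parseHeaderArgs([
  'Content-Type: application/json',
  'Authorization: Bearer token',
]));
```

Splitting on the first colon rather than every colon is a deliberate choice here: header values often contain colons themselves.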
Examples
Basic GET Request
get -u https://example.com
POST Request with Custom Headers and JSON Body
get -p -u https://example.com/api -h "Content-Type: application/json" -h "Authorization: Bearer token"
Scrape Text Using a CSS Selector
get -u https://example.com -S ".article-title"
Evaluate JavaScript on the Page (Using Puppeteer)
get -u https://example.com -e "document.title"
Get All Links from a Page
get -u https://example.com -l
Save Response, Cookies, and JavaScript Evaluation Result to Files
get -u https://example.com -e "document.title" -s
Get Page HTML
get -u https://example.com -H
Get Cookies from Response Headers
get -u https://example.com -c
Load Cookies from a File and Make a Request
get -u https://example.com -j cookies.json
Combining Options
You can combine multiple options to perform complex tasks. For example:
get -u https://example.com -S ".article-title" -l -c -s -e "document.title"
This command will:
- Scrape content matching .article-title.
- Return all links from the page.
- Return cookies from the response headers.
- Save the response, cookies, and evaluation result to files.
- Evaluate JavaScript (document.title) on the page.
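When a combined command like the one above is assembled from a script, building the argument list from an options object avoids shell-quoting mistakes. The sketch below maps a few option names onto the short flags from the options table. buildGetArgs is a hypothetical helper, not something the package provides, and it covers only a subset of the flags.

```javascript
// Hypothetical helper: not part of @agentics/get.
// Maps an options object onto the short flags listed in the options table.
function buildGetArgs(url, opts = {}) {
  const args = ['-u', url];
  if (opts.post) args.push('-p');
  if (opts.selector) args.push('-S', opts.selector);
  if (opts.links) args.push('-l');
  if (opts.cookies) args.push('-c');
  if (opts.save) args.push('-s');
  if (opts.eval) args.push('-e', opts.eval);
  return args;
}

// Reproduces the arguments of the combined command shown above.
const combinedArgs = buildGetArgs('https://example.com', {
  selector: '.article-title',
  links: true,
  cookies: true,
  save: true,
  eval: 'document.title',
});
console.log(combinedArgs);
```

The resulting array can be handed to Node's child_process.execFile('get', combinedArgs, callback), which passes each element as a separate argument without going through a shell.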
Programmatic Usage
You can also use @agentics/get as a library in your Node.js projects.
Importing the Module
const get = require('@agentics/get');
Using exportFunctions
The exportFunctions function, imported as get in the examples here, lets you perform web scraping, HTTP requests, and JavaScript evaluation programmatically.
Syntax
const results = await get(url, options);
Parameters
- url (String): The URL to request.
- options (Object, optional): Configuration options.
Available Options
| Option | Type | Description |
|------------|-----------|------------------------------------------------------------------------|
| post | Boolean | Use POST request instead of GET. |
| headers | Object | Custom headers for the request. |
| cookies | Boolean | Return cookies from the response headers. |
| links | Boolean | Return all links from the page. |
| html | Boolean | Return the HTML content of the page. |
| text | Boolean | Return the text content of the page. |
| selector | String | CSS selector to scrape content. |
| eval | String | JavaScript code to evaluate on the page (using Puppeteer). |
| save | Boolean | Save response, cookies, and evaluation result to files. |
| jar | String | Load cookies from a file (for maintaining session between requests). |
| headless | Boolean | Run Puppeteer in headless mode (true by default, unless --browser). |
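One way to keep code that uses these options testable is to pass the get function in as a parameter, so a stub can stand in for the real module. The following is a sketch under assumptions: summarizePage and fakeGet are hypothetical names, the result fields links and evalResult are taken from the examples in this README, and the shape of links (an array) is assumed rather than documented.

```javascript
// Sketch only: `get` is passed in so the wrapper can be exercised with a stub.
// In a real project it would be require('@agentics/get').
async function summarizePage(get, url) {
  const results = await get(url, {
    links: true,
    eval: 'document.title',
  });
  // Assumption: `links` is an array; the README does not state its shape.
  const links = Array.isArray(results.links) ? results.links : [];
  return { title: results.evalResult, linkCount: links.length };
}

// Stub standing in for the real module; it returns only the fields used above.
const fakeGet = async (url, options) => ({
  links: ['https://example.com/a', 'https://example.com/b'],
  evalResult: 'Example Domain',
});

summarizePage(fakeGet, 'https://example.com').then((summary) => {
  console.log(summary);
});
```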
Example: Making a Request with Cookies and JavaScript Evaluation
const get = require('@agentics/get');
(async () => {
  try {
    const url = 'https://example.com';
    const options = {
      headers: {
        'User-Agent': 'CustomUserAgent/1.0',
      },
      text: true,
      links: true,
      cookies: true,
      eval: "document.title",
    };
    const results = await get(url, options);
    console.log('Text Content:', results.text);
    console.log('Links:', results.links);
    console.log('Cookies:', results.cookies);
    console.log('Evaluated JS Result:', results.evalResult);
  } catch (error) {
    console.error('Error:', error);
  }
})();
Example: Using Puppeteer with Cookies Loaded from File
const get = require('@agentics/get');
(async () => {
  try {
    const url = 'https://example.com';
    const options = {
      jar: 'cookies.json',
      headless: false,
      text: true,
      eval: "document.querySelector('.article-title').innerText",
    };
    const results = await get(url, options);
    console.log('Text Content:', results.text);
    console.log('Evaluated JS Result:', results.evalResult);
  } catch (error) {
    console.error('Error:', error);
  }
})();
Default Options
If no options are provided, the following defaults are used:
{
  cookies: true,
  links: true,
  html: true,
  text: true,
  headless: true,
}
Saving Cookies and Responses
To persist cookies and responses between sessions, the following methods can be used:
- Use the save option to save the response, cookies, and evaluated result to response.json, cookies.json, and eval.json.
- Load cookies using the jar option to maintain the session between multiple requests.
Example: Saving and Loading Cookies
- First, save cookies from a request:
get -u https://example.com -s
- Then, load those cookies in a subsequent request:
get -u https://example.com -j cookies.json
This allows you to maintain sessions and reuse cookies for authenticated requests.
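In programmatic use, the first run of a script usually has no cookies.json yet. A small guard can add the jar option only when a readable JSON file is present. This is a sketch: jarOption is a hypothetical helper, not part of @agentics/get, and it checks only that the file parses as JSON, since the README does not describe the file's internal structure or how the library reacts to a missing jar.

```javascript
const fs = require('node:fs');

// Hypothetical helper: not part of @agentics/get.
// Returns { jar: file } only if the file exists and parses as JSON, otherwise {}.
function jarOption(file) {
  try {
    JSON.parse(fs.readFileSync(file, 'utf8'));
    return { jar: file };
  } catch (err) {
    // Missing or malformed jar: fall back to a request without stored cookies.
    return {};
  }
}

// With the library, the result would be spread into the options object:
//   get(url, { text: true, ...jarOption('cookies.json') })
console.log(jarOption('cookies.json'));
```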
Contributing
Contributions are welcome! Please open an issue or submit a pull request on GitLab.
License
This project is licensed under the MIT License.
Author: Connor Etherington
Email: [email protected]
Website: https://agentics.co.za
Upstream URL: https://gitlab.com/a4to/get
