@knowcode/imgfetch

v0.1.0

Published 5 months ago

Reliable image downloader CLI that works with any website

lindsay-knowcode

image download scraper puppeteer playwright cli fetch

📸 @knowcode/imgfetch

Download images from anywhere. Even the tricky sites.

A CLI tool that combines multiple download strategies to fetch images from websites, including those with anti-bot protection and dynamic, JavaScript-rendered content. Sites that require login or a CAPTCHA are not supported (see the FAQ).

Features • Installation • Quick Start • Strategies • Examples • API • FAQ

🎯 Why imgfetch?

Ever tried to download an image only to get blocked by anti-bot measures? Or struggled with JavaScript-rendered content? imgfetch addresses both by selecting a download strategy for each URL, and lets you force a specific strategy when you already know which one a site needs.

# Just works™
imgfetch https://linkedin.com/in/john-doe --strategy browser -o profile.jpg

✨ Features

🚀 Smart Detection

Automatically chooses the best download strategy based on URL patterns and site requirements.

🤖 Anti-Bot Bypass

Uses puppeteer-extra-plugin-stealth to reduce the chance of being detected on protected sites.

📦 Batch Processing

Download multiple images efficiently with a single command.

🔄 Automatic Fallback

If one method fails, automatically tries alternative strategies.
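
The fallback idea can be sketched in a few lines. This is only an illustration of the pattern, not imgfetch's internal code: `fetchWithFallback`, the `strategies` map, and the default order are hypothetical names chosen for this example.

```javascript
// Hypothetical sketch: try each strategy in order until one succeeds.
// `strategies` maps a strategy name to an async download function.
// These stand-ins are NOT imgfetch internals.
async function fetchWithFallback(
  url,
  strategies,
  order = ['direct', 'browser', 'playwright']
) {
  const errors = [];
  for (const name of order) {
    const download = strategies[name];
    if (!download) continue; // strategy not provided, skip it
    try {
      const result = await download(url);
      return { success: true, strategy: name, ...result };
    } catch (err) {
      // Remember why this strategy failed, then move on to the next one.
      errors.push(`${name}: ${err.message}`);
    }
  }
  throw new Error(`All strategies failed (${errors.join('; ')})`);
}
```

Trying the cheapest strategy first keeps the common case fast, and the collected error messages make it easier to see why every attempt failed.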

📦 Installation

Global Installation (Recommended)

npm install -g @knowcode/imgfetch

Local Installation

npm install @knowcode/imgfetch

Requirements

Node.js 16.0.0 or higher
npm or yarn

🚀 Quick Start

Basic Usage

# Download a simple image
imgfetch https://example.com/photo.jpg

# Save with custom filename
imgfetch https://example.com/photo.jpg -o my-image.jpg

# Force browser strategy for dynamic content
imgfetch https://instagram.com/p/ABC123 --strategy browser

Batch Download

Create a file urls.txt:

https://example.com/image1.jpg
https://linkedin.com/in/user
https://unsplash.com/photos/abc123

Then run:

imgfetch batch urls.txt --output-dir ./images/
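
Before handing a long list to the batch command, it can help to check it for malformed entries. The snippet below is a small pre-flight sketch you could run yourself. `parseUrlList` is a hypothetical helper, and skipping blank lines and `#` comments is an assumption of this example, not documented imgfetch behavior.

```javascript
// Sketch: turn the contents of a urls.txt file into a clean list of URLs.
// Skipping blank lines and '#' comments is an assumption of this example.
function parseUrlList(text) {
  const urls = [];
  for (const raw of text.split(/\r?\n/)) {
    const line = raw.trim();
    if (line === '' || line.startsWith('#')) continue;
    try {
      const url = new URL(line); // throws on malformed input
      if (url.protocol === 'http:' || url.protocol === 'https:') {
        urls.push(url.href);
      }
    } catch {
      // Ignore lines that are not valid absolute URLs.
    }
  }
  return urls;
}
```

To use it on a real file, read the file with `fs.readFileSync('urls.txt', 'utf8')` and pass the text to `parseUrlList`.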

🎨 Strategies

Direct Strategy

Best for: Static images, CDN links, direct URLs

imgfetch https://cdn.example.com/image.png --strategy direct

⚡ Fastest method
📦 Minimal resource usage
✅ Works with most image hosting services

Browser Strategy (Puppeteer)

Best for: Social media, JavaScript-rendered content

imgfetch https://linkedin.com/in/user --strategy browser

🛡️ Bypasses anti-bot protection
🔍 Handles dynamic content
🍪 Maintains session state

Playwright Strategy

Best for: Complex sites, when Puppeteer fails

imgfetch https://complex-site.com/image --strategy playwright

🌐 Cross-browser support
🔧 Alternative automation engine
🎭 Better handling of modern web apps
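
To make the trade-off between strategies concrete, here is one way an automatic choice could work. The host list, the extension check, and the `pickStrategy` name are illustrative assumptions. The rules imgfetch actually applies in `auto` mode are not documented in this README.

```javascript
// Hypothetical heuristic for choosing a strategy from a URL.
// The host list and extension pattern are illustrative assumptions.
const BROWSER_HOSTS = ['linkedin.com', 'instagram.com'];
const IMAGE_EXT = /\.(jpe?g|png|gif|webp|svg)$/i;

function pickStrategy(rawUrl) {
  const { hostname, pathname } = new URL(rawUrl);
  // A path ending in an image extension is probably a direct file link.
  if (IMAGE_EXT.test(pathname)) return 'direct';
  // Known JavaScript-heavy sites need a real browser.
  const host = hostname.replace(/^www\./, '');
  if (BROWSER_HOSTS.some((h) => host === h || host.endsWith(`.${h}`))) {
    return 'browser';
  }
  // Otherwise start with the cheapest option.
  return 'direct';
}

console.log(pickStrategy('https://cdn.example.com/image.png')); // direct
console.log(pickStrategy('https://linkedin.com/in/user'));      // browser
```

Playwright does not appear in this heuristic because the README positions it as an alternative for when Puppeteer fails, so it fits better as a fallback step than as a first choice.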

📖 Examples

Download LinkedIn Profile Pictures

imgfetch https://linkedin.com/in/john-doe \
  --strategy browser \
  -o john-doe-profile.jpg

Extract Images from Instagram Posts

imgfetch https://instagram.com/p/ABC123XYZ \
  --strategy browser \
  --timeout 45

Batch Download with Mixed Sources

# Create urls.txt with various image sources
cat > urls.txt << EOF
https://pbs.twimg.com/profile_images/123/abc.jpg
https://linkedin.com/in/jane-doe
https://github.com/user.png
EOF

# Download all images
imgfetch batch urls.txt --output-dir ./profile-pics/

JSON Output for Automation

# Get structured output for scripting
result=$(imgfetch https://example.com/img.jpg --json)
# Quote the variable so the JSON reaches jq unmodified; -r prints the raw string
echo "$result" | jq -r '.path'
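
If your automation is written in Node.js rather than shell, the same output can be checked before use. This sketch assumes the `--json` output has the same shape as the result object shown in the Node.js Integration example (`success`, `strategy`, `path`, `size`, `mimeType`). That is an assumption, so verify it against real output. `parseResult` is a hypothetical helper.

```javascript
// Sketch: validate the JSON printed by `imgfetch <url> --json`.
// Assumes the CLI output matches the API result object; verify before relying on it.
function parseResult(stdout) {
  const result = JSON.parse(stdout); // throws SyntaxError on invalid JSON
  if (!result.success) {
    throw new Error('Download failed');
  }
  if (typeof result.path !== 'string') {
    throw new Error('Missing "path" in JSON output');
  }
  return result;
}

// Sample input modeled on the result object from the API example.
const sample =
  '{"success":true,"strategy":"direct","path":"./output.jpg","size":245632,"mimeType":"image/jpeg"}';
console.log(parseResult(sample).path); // ./output.jpg
```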

🧩 API Usage

Node.js Integration

const ImageFetcher = require('@knowcode/imgfetch');

async function downloadImage() {
  const fetcher = new ImageFetcher({
    strategy: 'auto',
    timeout: 30000
  });

  try {
    const result = await fetcher.fetch(
      'https://example.com/image.jpg',
      './output.jpg'
    );
    
    console.log('Success!', result);
    // {
    //   success: true,
    //   strategy: 'direct',
    //   path: './output.jpg',
    //   size: 245632,
    //   mimeType: 'image/jpeg'
    // }
  } catch (error) {
    console.error('Failed:', error.message);
  } finally {
    await fetcher.close();
  }
}

Advanced Configuration

const fetcher = new ImageFetcher({
  strategy: 'browser', // Force a specific strategy
  timeout: 60000,      // 60 seconds, given in milliseconds (the CLI --timeout flag takes seconds)
  retries: 5,          // Number of retry attempts
});
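
For readers unfamiliar with retry settings, the sketch below shows what an option like `retries` commonly means. It is a generic pattern, not imgfetch's implementation: `withRetries`, `baseDelayMs`, and the doubling delay are assumptions of this example, as is treating `retries` as the number of extra attempts after the first.

```javascript
// Generic retry sketch: re-run a failing async operation up to `retries`
// more times, waiting longer between attempts. Not imgfetch's own code.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function withRetries(operation, { retries = 3, baseDelayMs = 500 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await operation(attempt);
    } catch (err) {
      lastError = err;
      if (attempt < retries) {
        // Delay doubles each time: baseDelayMs, 2x, 4x, ...
        await sleep(baseDelayMs * 2 ** attempt);
      }
    }
  }
  throw lastError; // every attempt failed, surface the last error
}
```

With `retries: 5`, a download that keeps failing would be attempted six times in total under this interpretation.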

🎛️ Command Line Options

| Option | Short | Description | Default |
|--------|-------|-------------|---------|
| --output | -o | Output file path | Auto-generated |
| --strategy | -s | Download strategy (auto, direct, browser, playwright) | auto |
| --timeout | -t | Timeout in seconds | 30 |
| --json | | Output as JSON | false |
| --quiet | -q | Suppress progress output | false |
| --help | -h | Show help | |
| --version | -V | Show version | |

🔧 Configuration

Environment Variables

# Set default timeout
export IMGFETCH_TIMEOUT=60

# Set default output directory
export IMGFETCH_OUTPUT_DIR=./downloads
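
If you wrap imgfetch in your own script, you may want to read the same variables. The sketch below shows one way to do that. It assumes `IMGFETCH_TIMEOUT` is in seconds like the `--timeout` flag, that explicit options should win over the environment, and that the current directory is a sensible fallback. None of this is confirmed by the README, and `resolveConfig` is a hypothetical helper.

```javascript
// Sketch: resolve settings from explicit options, then environment
// variables, then defaults. Precedence and defaults are assumptions.
function resolveConfig(env = process.env, cli = {}) {
  const rawTimeout = cli.timeout ?? env.IMGFETCH_TIMEOUT ?? 30;
  const seconds = Number(rawTimeout);
  if (!Number.isFinite(seconds) || seconds <= 0) {
    throw new Error(`Invalid timeout: ${rawTimeout}`);
  }
  return {
    // The Node.js API example uses milliseconds, so convert here.
    timeoutMs: seconds * 1000,
    outputDir: cli.outputDir ?? env.IMGFETCH_OUTPUT_DIR ?? '.',
  };
}

console.log(resolveConfig({ IMGFETCH_TIMEOUT: '60' }).timeoutMs); // 60000
```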

Custom Headers

const fetcher = new ImageFetcher({
  headers: {
    'User-Agent': 'Custom User Agent',
    'Referer': 'https://example.com'
  }
});

❓ FAQ

Why does it download the wrong image from some sites?

Some pages load multiple images. imgfetch tries to identify the main one, but it can pick the wrong candidate. If that happens, use your browser's DevTools to find the URL of the specific image and pass that URL to imgfetch directly.

Can it handle sites that require login?

Currently, imgfetch doesn't support authentication. For sites requiring login, you'll need to download images manually or use browser automation with saved cookies.

Why is browser strategy slow?

Browser automation launches a real browser instance, which takes time. Use direct strategy when possible for better performance.

How do I download from sites with CAPTCHA?

CAPTCHA-protected sites require manual intervention. imgfetch can't automatically solve CAPTCHAs.

🛠️ Troubleshooting

"No suitable image found"

The page might not contain images
Try using --strategy browser for dynamic content
Check if the URL requires authentication

"All strategies failed"

Verify the URL is accessible in a regular browser
Check your internet connection
Some sites may have strong anti-bot protection

Installation Issues

# Clear npm cache
npm cache clean --force

# Install with verbose logging
npm install -g @knowcode/imgfetch --verbose

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Fork the repository
Create your feature branch (git checkout -b feature/AmazingFeature)
Commit your changes (git commit -m 'Add some AmazingFeature')
Push to the branch (git push origin feature/AmazingFeature)
Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

Built on puppeteer-extra and playwright-extra
Uses puppeteer-extra-plugin-stealth for anti-detection
Inspired by the need for reliable image downloading in web scraping projects

Made with ❤️ by knowcode

Report Bug • Request Feature