InstaSave SDK
v1.2.9
📸 Simple Instagram scraper for downloading images from posts
Fast, reliable, and easy-to-use Instagram media downloader with authentication and direct download support.
🚀 Quick Start
```shell
npm install instasave-sdk
```
```javascript
import { MediaScraper } from 'instasave-sdk';

const scraper = new MediaScraper();
const data = await scraper.scrape('https://www.instagram.com/p/ABC123/');
console.log(`Downloaded ${data.media.length} images from @${data.profile_name}`);
```
📖 Usage Guide
Basic Scraping
```javascript
import { MediaScraper } from 'instasave-sdk';

const scraper = new MediaScraper();

// Download images from an Instagram post
const data = await scraper.scrape('https://www.instagram.com/p/ABC123/');
console.log(data);
// Result:
// {
//   url: 'https://www.instagram.com/p/ABC123/',
//   post_id: 'ABC123',
//   platform: 'instagram',
//   profile_name: 'username',
//   profile_url: 'https://www.instagram.com/username/',
//   media: [
//     { url: 'https://instagram.com/image1.jpg', index: 0 },
//     { url: 'https://instagram.com/image2.jpg', index: 1 }
//   ],
//   metadata: {
//     likesCount: 1234,
//     commentsCount: 56,
//     caption: 'Post caption...',
//     timestamp: null,
//     location: null
//   }
// }
```
Progress Tracking
```javascript
const data = await scraper.scrape('https://www.instagram.com/p/ABC123/', {
  progressCallback: (event) => {
    console.log(`${event.type}: ${event.percentage}% - ${event.message}`);
    // Output:
    // browser: 0% - Launching browser...
    // navigation: 25% - Navigating to post...
    // extraction: 50% - Extracting media...
    // metadata: 75% - Fetching metadata...
    // complete: 100% - Done!
  }
});
```
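The callback's `percentage` field can drive a simple console progress bar. A minimal sketch, assuming only the event shape shown above (the `renderBar` helper is illustrative, not part of the SDK):

```javascript
// Render a fixed-width text progress bar from a percentage (0-100).
// Illustrative helper only -- not part of instasave-sdk.
function renderBar(percentage, width = 20) {
  const filled = Math.round((percentage / 100) * width);
  return `[${'#'.repeat(filled)}${'-'.repeat(width - filled)}] ${percentage}%`;
}

// Wired into the progressCallback shape shown above:
const progressCallback = (event) => {
  process.stdout.write(`\r${renderBar(event.percentage)} ${event.message}`);
};

console.log(renderBar(50)); // prints "[##########----------] 50%"
```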
Save to File
```javascript
const data = await scraper.scrape('https://www.instagram.com/p/ABC123/', {
saveToFile: true, // Save data to JSON file
outputPath: './downloads' // Folder to save to
});
```
Private Posts (Login Required)
```javascript
// Log in first
await scraper.login({
  username: 'your_username',
  password: 'your_password'
});

// Then download private posts
const data = await scraper.scrape('https://www.instagram.com/p/private_post/');
```
Advanced Options
```javascript
const data = await scraper.scrape('URL', {
  saveToFile: true,        // Save to file? (default: false)
  outputPath: './folder',  // Where to save? (default: current folder)
  timeout: 60,             // Timeout in seconds (default: 60)
  retries: 2,              // How many retries on error (default: 0)
  headless: true,          // Hidden browser? (default: true)
  progressCallback: (event) => {
    console.log(`${event.type}: ${event.percentage}%`);
  }
});
```
📥 Download Images
Download images directly from scraped data:
```javascript
import { MediaScraper, downloadImages } from 'instasave-sdk';

const scraper = new MediaScraper();
const data = await scraper.scrape('https://www.instagram.com/p/ABC123/');

// Download all images with progress tracking
const result = await downloadImages(data, {
  savePath: './downloads',
  getFile: true,
  filename: '{username}_{postid}_{index}.{ext}',
  skipDuplicates: false,
  onProgress: (percent, current, total) => {
    console.log(`${percent}% (${current}/${total})`);
  }
});

console.log(`Downloaded: ${result.downloaded}/${result.total}`);
console.log(`Files: ${result.files.map(f => f.filename).join(', ')}`);
```
Download Options
| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `savePath` | string | `'./downloads'` | Destination folder |
| `getFile` | boolean | `true` | Save to file or return URLs |
| `filename` | string | `{username}_{postid}_{index}.{ext}` | Filename pattern |
| `skipDuplicates` | boolean | `false` | Skip already downloaded files |
| `createSubfolders` | boolean | `false` | Create a subfolder per username |
| `onProgress` | function | - | Progress callback |
| `timeout` | number | `30000` | Request timeout in ms |
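The `filename` pattern expands placeholders such as `{username}`, `{postid}`, `{index}`, and `{ext}` from the scraped data. A rough sketch of that substitution (the `expandPattern` helper is hypothetical, not the SDK's actual implementation):

```javascript
// Expand {placeholder} tokens in a filename pattern from a map of values.
// Illustrative only -- the SDK's real substitution logic may differ.
function expandPattern(pattern, values) {
  return pattern.replace(/\{(\w+)\}/g, (match, key) =>
    key in values ? String(values[key]) : match
  );
}

const name = expandPattern('{username}_{postid}_{index}.{ext}', {
  username: 'user',
  postid: 'ABC123',
  index: 0,
  ext: 'jpg'
});
console.log(name); // prints "user_ABC123_0.jpg"
```

Unknown placeholders are left as-is rather than replaced with empty strings, which makes misspelled tokens easy to spot in the output filenames.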
Download Result
```javascript
{
  success: true,       // true if all downloads succeeded
  total: 5,            // total images
  downloaded: 4,       // successfully downloaded
  failed: 1,           // failed downloads
  skipped: 0,          // skipped duplicates
  files: [
    {
      index: 0,
      url: 'https://...',
      path: './downloads/user_ABC123_01.jpg',
      filename: 'user_ABC123_01.jpg',
      bytes: 245000,
      format: 'jpg'
    }
  ],
  errors: [
    {
      index: 4,
      url: 'https://...',
      code: 'DOWNLOAD_FAILED',
      message: 'Timeout after 3 attempts'
    }
  ],
  totalBytes: 980000,  // total bytes downloaded
  duration: 5234       // milliseconds
}
```
🔗 Direct URL Download
Download any image URL directly with `downloadScrape()`:
```javascript
import { downloadScrape } from 'instasave-sdk';

// Download with Base64 conversion (default)
// Flow: Download → Base64 → File
const resultBase64 = await downloadScrape(imageUrl, {
  useBase64: true,
  outputPath: './images',
  filename: 'my-image.jpg'
});

// Direct download (faster)
// Flow: Download → File
const result = await downloadScrape(imageUrl, {
  useBase64: false,
  outputPath: './images',
  filename: 'my-image.jpg'
});

console.log(result.localPath); // "./images/my-image.jpg"
console.log(result.bytes);     // 265661
console.log(result.format);    // "jpg"
```
downloadScrape Options
| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `useBase64` | boolean | `true` | Download → Base64 → File when `true`; direct Download → File when `false` |
| `outputPath` | string | `'./downloads'` | Destination folder |
| `filename` | string | auto-generated | Filename |
| `timeout` | number | `30000` | Request timeout in ms |
| `retries` | number | `3` | Retry attempts on failure |
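The `retries` option re-attempts failed requests before giving up. The general pattern looks roughly like the following generic retry-with-backoff sketch; this is not the SDK's actual internals, and `attemptDownload` is a stand-in name for any async operation:

```javascript
// Generic retry helper: run an async operation up to `retries + 1` times,
// waiting longer between each attempt. Illustrative only -- not SDK code.
async function withRetries(operation, retries = 3, baseDelayMs = 500) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await operation(attempt);
    } catch (err) {
      lastError = err;
      if (attempt < retries) {
        // Exponential backoff: 500 ms, 1000 ms, 2000 ms, ...
        await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** attempt));
      }
    }
  }
  throw lastError;
}

// Usage with a hypothetical download function:
// const buffer = await withRetries(() => attemptDownload(imageUrl), 3);
```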
downloadScrape Result
```javascript
{
  success: true,
  url: 'https://instagram.com/image.jpg',
  localPath: './images/my-image.jpg',
  filename: 'my-image.jpg',
  format: 'jpg',
  bytes: 265661,
  error: undefined
}
```
⚙️ Configuration
Create `instasave.config` in your project root for automatic setup:
```ini
[DEFAULT]
PUPPETEER_EXECUTABLE_PATH = /path/to/chrome
BROWSER_TYPE = chrome
WORKFLOWS_DIR = ./workflows

[ACCOUNT]
USERNAME = your_username
PASSWORD = your_password

[SETTINGS]
OUTPUT_DIR = ./downloads
HEADLESS = true
TIMEOUT = 60
```
Supported Browser Types:
- `chrome` - Google Chrome
- `chromium` - Chromium
- `edge` - Microsoft Edge
- `brave` - Brave Browser
- `opera` - Opera
- `vivaldi` - Vivaldi
- `arc` - Arc Browser
- `comet` - Comet Browser
Browser Priority Order:
1. Explicit path (`PUPPETEER_EXECUTABLE_PATH`) - highest priority
2. Browser type (`BROWSER_TYPE`) - resolves to the standard macOS path
3. Environment variable (`PUPPETEER_EXECUTABLE_PATH` env var)
4. Default - Puppeteer's bundled Chromium
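That priority order amounts to a first-match lookup. A sketch of the resolution logic (illustrative only; `resolveExecutable` and the path table are assumptions, not the SDK's code):

```javascript
// Resolve a browser executable path by priority order. Illustrative only --
// the SDK's real lookup table and behavior may differ.
const MACOS_BROWSER_PATHS = {
  chrome: '/Applications/Google Chrome.app/Contents/MacOS/Google Chrome',
  chromium: '/Applications/Chromium.app/Contents/MacOS/Chromium',
  edge: '/Applications/Microsoft Edge.app/Contents/MacOS/Microsoft Edge'
  // ...entries for the other supported browsers
};

function resolveExecutable(config, env = process.env) {
  // 1. Explicit path in config - highest priority
  if (config.PUPPETEER_EXECUTABLE_PATH) return config.PUPPETEER_EXECUTABLE_PATH;
  // 2. Browser type - resolves to a standard macOS path
  if (config.BROWSER_TYPE && MACOS_BROWSER_PATHS[config.BROWSER_TYPE]) {
    return MACOS_BROWSER_PATHS[config.BROWSER_TYPE];
  }
  // 3. Environment variable
  if (env.PUPPETEER_EXECUTABLE_PATH) return env.PUPPETEER_EXECUTABLE_PATH;
  // 4. Default: undefined lets Puppeteer fall back to its bundled Chromium
  return undefined;
}
```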
Benefits:
- ✅ Auto-login - Automatic authentication on startup
- ✅ Custom browser - Use specific Chrome/Chromium installation
- ✅ Custom workflows - Configure the workflow directory with `WORKFLOWS_DIR`
- ✅ Project-specific - Different settings per project
- ✅ Environment fallback - Falls back to environment variables
- ✅ Hot reload - Config changes apply automatically without restart
✨ Features
What's Supported
✅ Single posts - One image
✅ Carousel posts - Multiple images in one post
✅ Private posts - After login
✅ Auto save - To JSON file
✅ Error handling - When something fails
✅ Metadata extraction - Likes, comments, captions
✅ Progress tracking - Real-time feedback during scraping
✅ Reliable login - Stable authentication with Instagram
✅ Browser reuse - 61-70% faster performance through connection pooling
✅ Hot reload config - Automatic config updates without restart
✅ Direct download - Download images directly from scraped data
✅ Download API - Batch download with progress tracking and retry logic
Limitations
❌ Videos/Reels - Images only
❌ Stories - Posts only
🔧 Troubleshooting
Common Issues
"Login required"
```javascript
// Solution: log in before downloading
await scraper.login({ username: 'user', password: 'pass' });
```
"Failed to extract post ID"
```javascript
// Solution: check the URL - it must contain /p/
// ✅ Correct: https://www.instagram.com/p/ABC123/
// ❌ Wrong:   https://www.instagram.com/username/
```
"Timeout"
```javascript
// Solution: increase the timeout
const data = await scraper.scrape(url, { timeout: 120 });
```
📋 Requirements
- Node.js 16 or newer
- Internet connection
- For private posts: valid Instagram account
📄 License
MIT - use anywhere
🆘 Support
Having issues? Create an issue on GitHub
📝 Complete Example
```javascript
import { MediaScraper } from 'instasave-sdk';

async function downloadInstagramPost() {
  const scraper = new MediaScraper();
  try {
    // Optionally log in for private posts
    // await scraper.login({
    //   username: 'your_username',
    //   password: 'your_password'
    // });
    const data = await scraper.scrape(
      'https://www.instagram.com/p/ABC123/',
      {
        saveToFile: true,
        outputPath: './instagram_downloads',
        timeout: 90
      }
    );
    console.log(`✅ Downloaded ${data.media.length} images from @${data.profile_name}`);
    console.log('Images:', data.media.map(item => item.url));
  } catch (error) {
    console.error('❌ Error:', error.message);
  }
}

downloadInstagramPost();
```
