image-recognition-microservice

v0.1.4

Published

2 years ago

ivanoff

Image Recognition Microservice

This is a standalone microservice for image recognition and question answering. It can analyze an image and respond to specific questions about its content. The server is built with Python and moondream, a small vision language model designed to run efficiently on edge devices (more on Hugging Face).

Features

Recognizes objects and scenes in images.
Answers questions related to the image's content.
Simple REST API for integration.

Brief Example

image-example

Describe this image (default question)

The image depicts the animated character Homer Simpson in a room, pointing to a drawing of a car on a whiteboard.

Request:

curl -X POST http://127.0.0.1:5000/ -H "Authorization: Bearer 123" -F "image=@./assets/example.png"

Response:

{
  "answer": "The image depicts the animated character Homer Simpson in a room, pointing to a drawing of a car on a whiteboard.",
  "question": "Describe this image"
}

Describe in detail this image

The image depicts a scene from the animated television series "The Simpsons". The central figure is Homer Simpson, a renowned character known for his love of cars. He is standing in front of a whiteboard, which displays a drawing of a car. Homer is pointing towards the drawing, suggesting he is explaining or admiring it. The background is a vibrant purple color, providing a contrast to the whiteboard and the yellow figure of Homer.

Request:

curl -X POST http://127.0.0.1:5000/ -H "Authorization: Bearer 123" -F "image=@./assets/example.png" -F "question=Describe in detail this image"

Response:

{
  "answer": "The image depicts a scene from the animated television series \"The Simpsons\". The central figure is Homer Simpson, a renowned character known for his love of cars. He is standing in front of a whiteboard, which displays a drawing of a car. Homer is pointing towards the drawing, suggesting he is explaining or admiring it. The background is a vibrant purple color, providing a contrast to the whiteboard and the yellow figure of Homer.",
  "question": "Describe in detail this image"
}

What color is the skin?

The color of skin in the image is yellow.

Request:

curl -X POST http://127.0.0.1:5000/ -H "Authorization: Bearer 123" -F "image=@./assets/example.png" -F "question=What color is the skin?"

Response:

{
  "answer": "The color of skin in the image is yellow.",
  "question": "What color is the skin?"
}

Server

Prerequisites

Install Docker and Docker Compose.
Prepare an .env file for API token configuration.

Installation

Clone this repository:

git clone <repository_url>
cd <repository_name>

Edit the .env file and set API_TOKEN:
```
API_TOKEN=your_secret_token
```
Build and run the service using Docker Compose:
```
docker-compose up --build
```
The API will be available at http://127.0.0.1:5000/.

Configuration

Set the API_TOKEN in the .env file to secure the API. Example:

API_TOKEN=your_secret_token
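
How the server checks the token is not shown in this README. As a rough sketch of what a Bearer check could look like in Python, the hypothetical helper `is_authorized` below compares an `Authorization` header value against the configured token. The function name and the constant-time comparison are assumptions for illustration, not the project's actual code.

```python
import hmac


def is_authorized(authorization_header, expected_token):
    """Return True when the header is 'Bearer <token>' and the token matches.

    Hypothetical helper: the real server may validate the header differently.
    """
    if not authorization_header or not expected_token:
        # A missing header or an unset API_TOKEN never authorizes a request.
        return False
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(
        token.strip().encode("utf-8"), expected_token.encode("utf-8")
    )
```

With `API_TOKEN=your_secret_token`, a request carrying `Authorization: Bearer your_secret_token` would pass this check, while a wrong token or a different scheme would be rejected.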

Start the server

To start the server, run the following commands:

sudo docker compose build
sudo docker compose up

After running the commands, wait for output like the following to appear in the terminal:

api-1  |  * Serving Flask app 'server'
api-1  |  * Debug mode: off
api-1  | WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
api-1  |  * Running on all addresses (0.0.0.0)
api-1  |  * Running on http://127.0.0.1:5000
api-1  |  * Running on http://172.21.0.2:5000
api-1  | Press CTRL+C to quit

The server will be available at http://localhost:5000/.
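
When the server is started from a script, watching the terminal for the startup message is not always practical. One possible alternative, sketched here with a hypothetical `wait_for_port` helper, is to poll until the port accepts TCP connections. This only shows that something is listening on port 5000; it is an assumption that the service accepts connections only once it is ready to answer.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Poll host:port until a TCP connection succeeds or the timeout expires.

    Returns True if a connection was made, False if the deadline passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means a process is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, `wait_for_port("127.0.0.1", 5000)` could be called before sending the first request.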

Dependencies

Python 3.10
Flask
PyTorch
Pillow
Transformers
Einops

API Usage

Endpoint

POST /

Headers

Authorization: Bearer <your_secret_token>

Form Data

image: The image file to be analyzed (e.g., .jpg, .png).
question: A string containing the question about the image. Optional; the default value is "Describe this image."
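
The README names .jpg and .png as example formats but does not list everything the server accepts. As a small client-side precaution, a caller might check the file's leading bytes before uploading. The sketch below uses a hypothetical `sniff_image_type` helper that recognizes only the standard PNG and JPEG signatures; it does not mirror the server's own validation.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # fixed 8-byte PNG file signature
JPEG_SIGNATURE = b"\xff\xd8\xff"      # JPEG start-of-image marker plus next marker byte


def sniff_image_type(data):
    """Guess the MIME type of image bytes from their signature.

    Returns "image/png", "image/jpeg", or None when neither signature matches.
    """
    if data.startswith(PNG_SIGNATURE):
        return "image/png"
    if data.startswith(JPEG_SIGNATURE):
        return "image/jpeg"
    return None
```

A `None` result does not prove the server would reject the file; it only means this sketch does not recognize it.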

Example Request

curl -X POST http://127.0.0.1:5000/ \
-H "Authorization: Bearer your_secret_token" \
-F "image=@./assets/example.png" \
-F "question=Describe this image."
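
The same request can be sent from Python. The following is a minimal sketch using only the standard library; `build_multipart` and `ask` are hypothetical helper names, and the image is assumed to be a PNG. It encodes the `image` and `question` form fields as multipart/form-data and adds the Bearer header, as the curl command does.

```python
import json
import urllib.request
import uuid


def build_multipart(fields, files):
    """Encode form fields and files as a multipart/form-data body.

    fields: dict of name -> str
    files: dict of name -> (filename, data_bytes, content_type)
    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode("utf-8")
        )
    for name, (filename, data, content_type) in files.items():
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"\r\n'
            f"Content-Type: {content_type}\r\n\r\n"
        ).encode("utf-8")
        parts.append(header + data + b"\r\n")
    # The closing boundary marks the end of the form.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def ask(url, token, image_path, question=None):
    """POST an image and an optional question; return the decoded JSON reply."""
    with open(image_path, "rb") as fh:
        image_bytes = fh.read()
    # Omitting the question lets the server fall back to its default.
    fields = {"question": question} if question else {}
    body, content_type = build_multipart(
        fields, {"image": ("example.png", image_bytes, "image/png")}
    )
    request = urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Authorization": f"Bearer {token}", "Content-Type": content_type},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call such as `ask("http://127.0.0.1:5000/", "your_secret_token", "./assets/example.png", "Describe this image.")` would then return the parsed response. Note that `urlopen` raises `urllib.error.HTTPError` for 4xx and 5xx status codes, so callers may want to catch it.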

Example Response

{
  "question":"Describe this image.",
  "answer":"A close-up image of a pile of ripe, red strawberries with green leaves."
}

Example Response with Error

{
  "question":"Describe this image.",
  "error":"Invalid image format."
}
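
Because success and error replies share the same JSON shape, a caller has to check which key is present. A minimal sketch of that branching, using a hypothetical `extract_answer` helper, might look like this:

```python
import json


def extract_answer(response_text):
    """Return the answer from a JSON reply, or raise if it carries an error.

    Assumes the reply has the shape shown in the examples above:
    "question" plus either "answer" or "error".
    """
    payload = json.loads(response_text)
    if "error" in payload:
        question = payload.get("question", "")
        raise ValueError(f"{question!r} failed: {payload['error']}")
    return payload["answer"]
```

Raising an exception is only one design choice; returning the whole payload and letting the caller inspect `error` would work just as well.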

Node.js Image Recognition Client

This client provides a simple interface to the Image Recognition Microservice. It lets you recognize objects and scenes in images and ask questions about them.

See Server section to set up the server.

Installation

To install the image recognition client, use npm:

npm i -S image-recognition-microservice

Usage

Here's a basic example of how to use the image recognition client:

import { readFile } from 'node:fs/promises';
import ImageRecogniton from 'image-recognition-microservice';

// Initialize the client with the server URL (port 5000, as in the Server section)
const imageRecognition = new ImageRecogniton('http://localhost:5000');

// Read the image from disk and wrap it in a File object
const imageBuffer = await readFile('path/to/your/image.jpg');
const image = new File([imageBuffer], 'image.jpg', { type: 'image/jpeg' });

// Ask a question about the image
const result = await imageRecognition.recognize(image, 'Describe this image.');

// Log the result
console.log(result);

In this example, we're reading an image file from disk, creating a File object, and passing it to the recognize method along with a question. The method returns a promise that resolves to an object with the answer to the question.

API

The ImageRecogniton class provides the following method:

recognize(file: File | Blob, question: string): Promise<{ question: string, answer?: string, error?: string }>
Sends the provided image and question to the server. Returns a promise that resolves to an object with:
- answer: a string with the answer to the question asked about the image.
- error: a string with the error message if the image could not be processed.
- question: the question that was asked.

Notes

Make sure the Image Recognition Server (see the Server section) is running and accessible at the URL you provide when initializing the ImageRecogniton client.
The client works with both File and Blob objects, making it flexible for various use cases.
Error handling is built into the client. If there's an error communicating with the server, the recognize method will return { error: 'Error message', question: 'The question asked' }.

For more information on setting up and using the server, refer to the Image Recognition Server documentation above.

Project Structure

Dockerfile: Docker configuration for the service.
docker-compose.yaml: Docker Compose configuration.
requirements.txt: Python dependencies.
src/server.py: Server implementation.
src/client.js: Node.js client.
.env.example: Example of environment variables.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Support

If you encounter any problems or have questions, please open an issue in the project repository.

Created by

Dimitry Ivanov # curl -A cv ivanoff.org.ua
