image-recognition-microservice
v0.1.4

Image Recognition Microservice
This is a standalone microservice for image recognition and question answering. It can analyze an image and respond to specific questions about its content. The server is built with Python and moondream, a small vision language model designed to run efficiently on edge devices (more on Hugging Face).
Features
- Recognizes objects and scenes in images.
- Answers questions related to the image's content.
- Simple REST API for integration.
Brief Example

Describe this image (default question)
The image depicts the animated character Homer Simpson in a room, pointing to a drawing of a car on a whiteboard.
Request:
curl -X POST http://127.0.0.1:5000/ -H "Authorization: Bearer 123" -F "image=@./assets/example.png"
Response:
{
  "answer": "The image depicts the animated character Homer Simpson in a room, pointing to a drawing of a car on a whiteboard.",
  "question": "Describe this image"
}

Describe in detail this image
The image depicts a scene from the animated television series "The Simpsons". The central figure is Homer Simpson, a renowned character known for his love of cars. He is standing in front of a whiteboard, which displays a drawing of a car. Homer is pointing towards the drawing, suggesting he is explaining or admiring it. The background is a vibrant purple color, providing a contrast to the whiteboard and the yellow figure of Homer.
Request:
curl -X POST http://127.0.0.1:5000/ -H "Authorization: Bearer 123" -F "image=@./assets/example.png" -F "question=Describe in detail this image"
Response:
{
  "answer": "The image depicts a scene from the animated television series \"The Simpsons\". The central figure is Homer Simpson, a renowned character known for his love of cars. He is standing in front of a whiteboard, which displays a drawing of a car. Homer is pointing towards the drawing, suggesting he is explaining or admiring it. The background is a vibrant purple color, providing a contrast to the whiteboard and the yellow figure of Homer.",
  "question": "Describe in detail this image"
}

What color is the skin?
The color of skin in the image is yellow.
Request:
curl -X POST http://127.0.0.1:5000/ -H "Authorization: Bearer 123" -F "image=@./assets/example.png" -F "question=What color is the skin?"
Response:
{
  "answer": "The color of skin in the image is yellow.",
  "question": "What color is the skin?"
}

Table of Contents
- Image Recognition Microservice
Server
Prerequisites
- Install Docker and Docker Compose.
- Prepare an .env file for API token configuration.
Installation
Clone this repository:
git clone <repository_url>
cd <repository_name>
Edit the .env file and set the API_TOKEN:
API_TOKEN=your_secret_token
Build and run the service using Docker Compose:
docker-compose up --build
The API will be available at http://127.0.0.1:5000/.
Configuration
Set the API_TOKEN in the .env file to secure the API. Example:
API_TOKEN=your_secret_token
Start the server
To start the server, run the following commands:
sudo docker compose build
sudo docker compose up
After running the commands, wait for the following message to appear in the terminal:
api-1 | * Serving Flask app 'server'
api-1 | * Debug mode: off
api-1 | WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
api-1 | * Running on all addresses (0.0.0.0)
api-1 | * Running on http://127.0.0.1:5000
api-1 | * Running on http://172.21.0.2:5000
api-1 | Press CTRL+C to quit
The server will be available at http://localhost:5000/.
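Instead of watching the terminal for the startup message, a script can poll the port until it accepts connections. The following is a minimal sketch using only the Python standard library; the helper name `wait_for_port` and its default timeouts are assumptions made for illustration and are not part of this project.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    Hypothetical helper: it only checks that something is listening,
    not that the API itself answers requests correctly.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means the server socket is accepting clients.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: retry until the deadline passes.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, `wait_for_port("127.0.0.1", 5000)` could be called in a deployment script before sending the first request.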
Dependencies
- Python 3.10
- Flask
- PyTorch
- Pillow
- Transformers
- Einops
API Usage
Endpoint
POST /
Headers
Authorization: Bearer <your_secret_token>
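To illustrate what a bearer-token check of this kind usually involves, here is a minimal Python sketch. It is not taken from `src/server.py`; the function name `is_authorized` is hypothetical, and the real server may validate the header differently.

```python
import hmac


def is_authorized(authorization_header, expected_token):
    """Return True if the header carries the expected bearer token.

    Hypothetical helper, shown only to illustrate the header format.
    """
    if not authorization_header or not expected_token:
        return False
    # Split "Bearer <token>" into the scheme and the token itself.
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # compare_digest avoids leaking how many characters matched via timing.
    return hmac.compare_digest(token.strip().encode(), expected_token.encode())
```

With `API_TOKEN=123`, the header `Authorization: Bearer 123` would pass this check, while a missing header, a different scheme, or a different token would not.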
Form Data
- image: The image file to be analyzed (e.g., .jpg, .png).
- question: A string containing the question about the image. Optional; default value: Describe this image.
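The same request can be built from Python without third-party packages. The sketch below assembles the multipart/form-data body by hand and parses a response of the shape shown in this section; the helper names `build_request` and `parse_response` are assumptions made for illustration, not part of this project.

```python
import json
import mimetypes
import urllib.request
import uuid


def build_request(url, token, image_bytes, filename, question=None):
    """Build (but do not send) a multipart/form-data POST request."""
    boundary = uuid.uuid4().hex
    content_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    parts = []
    if question is not None:
        # Optional text field; the server falls back to its default when omitted.
        text_part = (
            f"--{boundary}\r\n"
            'Content-Disposition: form-data; name="question"\r\n\r\n'
            f"{question}\r\n"
        )
        parts.append(text_part.encode("utf-8"))
    # File field named "image", as listed under Form Data.
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="image"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    )
    parts.append(file_header.encode("utf-8") + image_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": f"multipart/form-data; boundary={boundary}",
    }
    return urllib.request.Request(
        url, data=b"".join(parts), headers=headers, method="POST"
    )


def parse_response(raw_body):
    """Return the answer, or raise ValueError if the payload reports an error."""
    payload = json.loads(raw_body)
    if "error" in payload:
        raise ValueError(payload["error"])
    return payload["answer"]
```

To send the request, pass the result to `urllib.request.urlopen` and feed the response body to `parse_response`. Note that `urlopen` raises `urllib.error.HTTPError` for non-2xx status codes, so error payloads may need to be read from the exception instead.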
Example Request
curl -X POST http://127.0.0.1:5000/ \
-H "Authorization: Bearer your_secret_token" \
-F "image=@./assets/example.png" \
-F "question=Describe this image."
Example Response
{
"question":"Describe this image.",
"answer":"A close-up image of a pile of ripe, red strawberries with green leaves."
}
Example Response with Error
{
"question":"Describe this image.",
"error":"Invalid image format."
}

Node.js Image Recognition Client
This client provides a simple interface to the Image Recognition server. It lets you recognize objects and scenes in images and ask questions about them.
See Server section to set up the server.
Installation
To install the image recognition client, use npm:
npm i -S image-recognition-microservice
Usage
Here's a basic example of how to use the image recognition client:
import { readFile } from 'node:fs/promises';
import ImageRecogniton from 'image-recognition-microservice';
// Initialize the client with the server URL (the server listens on port 5000 by default)
const imageRecognition = new ImageRecogniton('http://localhost:5000');
// Read the image from disk and wrap it in a File object
const imageBuffer = await readFile('path/to/your/image.jpg');
const image = new File([imageBuffer], 'image.jpg', { type: 'image/jpeg' });
// Ask a question about the image
const result = await imageRecognition.recognize(image, 'Describe this image.');
// Log the result
console.log(result);
In this example, we're reading an image file from disk, creating a File object, and passing it to the recognize method along with a question. The method returns a promise that resolves to an object with the answer to the question.
API
The ImageRecogniton class provides the following method:
recognize(file: File | Blob, question: string): Promise<{ question: string, answer?: string, error?: string }>
Sends the provided image to the server together with the question. Returns a promise that resolves to an object with:
- answer: a string with the answer to the question asked about the image.
- error: a string with an error message if the image could not be recognized.
- question: the question that was asked.
Notes
- Make sure the Image Recognition Server is running and accessible at the URL you provide when initializing the ImageRecogniton client.
- The client works with both File and Blob objects, making it flexible for various use cases.
- Error handling is built into the client. If there's an error communicating with the server, the recognize method will return { error: 'Error message', question: 'The question asked' }.
For more information on setting up and using the server, refer to the Image Recognition Server documentation above.
Project Structure
- Dockerfile: Docker configuration for the service.
- docker-compose.yaml: Docker Compose configuration.
- requirements.txt: Python dependencies.
- src/server.py: Server implementation.
- src/client.js: Node.js client.
- .env.example: Example of environment variables.
License
This project is licensed under the MIT License. See the LICENSE file for details.
Contributing
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
Support
If you encounter any problems or have questions, please open an issue in the project repository.
Created by
Dimitry Ivanov # curl -A cv ivanoff.org.ua
