@ciphercross/nestjs-ai
v1.0.2
AI Module
@ciphercross/nestjs-ai - Production-ready NestJS module for integrating with AI models (OpenAI and Google Gemini), with automatic provider switching, OCR, audio transcription, and custom prompt support.
📋 Table of Contents
- Overview
- Features
- Architecture
- Installation
- Quick Start
- Use Cases & Examples
- API Reference
- Configuration
- Best Practices
- Testing
- Troubleshooting
Overview
The AI Module is a comprehensive, production-ready solution for integrating AI capabilities into NestJS applications. It provides a unified interface for multiple AI providers, automatic fallback mechanisms, and specialized features like OCR and audio transcription.
Key Benefits
- ✅ Unified API - Single interface for multiple AI providers
- ✅ High Availability - Automatic fallback between providers
- ✅ Production Ready - Error handling, logging, and monitoring support
- ✅ Flexible - Easy to extend with new providers
- ✅ Type Safe - Full TypeScript support
- ✅ Well Documented - Comprehensive API documentation
Features
🤖 Multi-Provider Support
- OpenAI (GPT-4o, GPT-3.5, GPT-4 Turbo)
- Google Gemini (Gemini 1.5 Flash, Gemini Pro)
- Easy to extend with new providers
🔀 Automatic Fallback
- Seamless switching between providers on failures
- Configurable primary/secondary providers
- Comprehensive error handling
📝 OCR (Optical Character Recognition)
- Text recognition from images (base64)
- Handwritten text support
- Structure preservation
- Custom prompts for specialized OCR tasks
🎤 Audio Transcription
- Speech-to-Text using OpenAI Whisper
- Multi-language support
- Language auto-detection
- Quality optimization options
🎨 HTML Generation
- Convert plain text to formatted HTML
- Preserve structure and formatting
- Custom styling support
📋 Custom Prompts
- Create reusable prompts
- Organize by type (OCR, transcription, HTML, general)
- Store custom options and parameters
🔌 Flexible Logging
- Automatic detection of DatadogLoggerService
- Fallback to NestJS Logger
- Comprehensive operation logging
🛡️ Production Features
- Error handling and recovery
- Request/response validation
- Swagger documentation
- Unit test coverage
Architecture
Component Structure
AIModule
├── AIService (Core orchestration)
│ ├── OpenAIAdapter (OpenAI integration)
│ ├── GeminiAdapter (Gemini integration)
│ └── PromptsService (Custom prompts management)
├── AIController (REST API endpoints)
├── LoggerAdapter (Flexible logging)
└── DTOs (Request/Response validation)
Design Patterns
- Adapter Pattern - Each AI provider implements the AIProviderAdapter interface
- Strategy Pattern - Provider selection and fallback logic
- Factory Pattern - Adapter initialization based on configuration
- Service Layer - Business logic separated from API layer
Data Flow
Request → Controller → Service → Adapter → AI Provider API
↓
Fallback Logic
↓
Error Handling
↓
Logging
↓
Response
Installation
1. Install dependencies
npm install openai
# Optional: For Datadog logging support
npm install @ciphercross/nestjs-datadog
2. Configure environment variables
Add to your .env file:
# OpenAI
OPENAI_API_KEY=your_openai_api_key
OPENAI_DEFAULT_MODEL=gpt-4o
OPENAI_ENABLED=true
OPENAI_BASE_URL= # Optional, for custom URL
# Gemini
GEMINI_API_KEY=your_gemini_api_key
GEMINI_DEFAULT_MODEL=gemini-1.5-flash
GEMINI_ENABLED=true
GEMINI_BASE_URL= # Optional
# Fallback settings
AI_FALLBACK_ENABLED=true
AI_PRIMARY_PROVIDER=openai # openai or gemini
AI_SECONDARY_PROVIDER=gemini # openai or gemini
3. Import the module
import { Module } from '@nestjs/common';
import { AIModule } from '@ciphercross/nestjs-ai';
// Optional: For Datadog logging
import { DatadogModule } from '@ciphercross/nestjs-datadog';
@Module({
imports: [
// Optional: Datadog module (if you want Datadog logging)
DatadogModule.register({ serviceName: 'my-app' }),
AIModule,
],
// ...
})
export class YourModule {}
Note: If DatadogLoggerService is available in your DI container, the AI module will automatically use it. Otherwise, it falls back to NestJS Logger.
4. Add authentication (optional but recommended)
The AI controller endpoints are not protected by default. To add authentication, you can:
Option 1: Global Guard (Recommended)
import { Module } from '@nestjs/common';
import { APP_GUARD } from '@nestjs/core';
import { AuthGuard } from '@nestjs/passport';
@Module({
imports: [AIModule],
providers: [
{
provide: APP_GUARD,
useClass: AuthGuard('jwt'), // or your custom guard
},
],
})
export class AppModule {}
Option 2: Controller-level Guard
Create a custom guard wrapper or use @UseGuards() decorator directly in your application code.
Usage
Basic usage (simplest way)
import { Injectable } from '@nestjs/common';
import { AIService } from '@ciphercross/nestjs-ai';
@Injectable()
export class YourService {
constructor(private readonly aiService: AIService) {}
async askAI(question: string) {
// Just pass the prompt - everything else is configured automatically
const response = await this.aiService.generateResponse(question);
return response;
}
}
With additional options
const response = await this.aiService.generateResponse(
'What is artificial intelligence?',
{
systemPrompt: 'You are a helpful assistant',
provider: 'openai', // 'openai' | 'gemini' | 'auto'
model: 'gpt-4o',
temperature: 0.7,
maxTokens: 1000,
}
);
Chat with message history
const messages = [
{ role: 'system', content: 'You are a helpful assistant' },
{ role: 'user', content: 'Hello!' },
{ role: 'assistant', content: 'Hello! How can I help?' },
{ role: 'user', content: 'Tell me about AI' },
];
const response = await this.aiService.generateChatResponse(messages, {
provider: 'auto', // Automatically selects provider with fallback
});
OCR - text recognition from image
// base64Image - base64 string of the image
const text = await this.aiService.recognizeTextFromBase64(base64Image, {
provider: 'auto',
model: 'gpt-4o', // Uses vision models
prompt: 'Recognize text, preserve structure', // Optional
});
Audio transcription (Speech-to-Text)
// base64Audio - base64 string of the audio file
// With one language (default is English)
const text = await this.aiService.transcribeAudioFromBase64(base64Audio, {
provider: 'openai', // Whisper is only supported by OpenAI
language: 'uk', // Optional, language code (default: 'en')
prompt: 'This is a recording in Ukrainian', // Optional, to improve quality
temperature: 0, // Optional
});
// With multiple languages (first language is used)
const textMulti = await this.aiService.transcribeAudioFromBase64(base64Audio, {
language: ['uk', 'en', 'ru'], // Array of languages
});
// Without specifying language (automatically English)
const textDefault = await this.aiService.transcribeAudioFromBase64(base64Audio);
Supported language codes:
- en - English (default)
- uk - Ukrainian
- ru - Russian
- pl - Polish
- de - German
- fr - French
- es - Spanish
- and others (ISO 639-1 codes)
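The language handling described above can be sketched as a small helper (hypothetical, not part of the module's API): accept a single ISO 639-1 code or an array, take the first entry, and default to English.

```typescript
// Hypothetical helper mirroring the documented behavior: accept a single
// ISO 639-1 code or an array of codes, use the first entry, default to 'en'.
function resolveLanguage(language?: string | string[]): string {
  if (Array.isArray(language)) {
    // Whisper accepts only one language at a time, so the first entry wins
    return language[0] ?? 'en';
  }
  return language ?? 'en';
}
```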
HTML generation from text
const html = await this.aiService.generateHtmlFromText(
'Title\n\nThis is a paragraph of text.',
{
provider: 'auto',
model: 'gpt-4o',
}
);
Custom prompts
Creating a custom prompt
// Create prompt for OCR
this.aiService.createCustomPrompt(
'OCR for handwritten text',
'Recognize text from image. Preserve structure: indents, lists. If there is crossed out text, mark it as ~~text~~.',
'ocr',
{ temperature: 0.3 }
);
// Create prompt for transcription
this.aiService.createCustomPrompt(
'Transcription in Ukrainian',
'Transcribe audio in Ukrainian. Remove background sounds and pauses.',
'transcription',
{ language: 'uk' }
);
Using a custom prompt
// Use custom prompt for generation
const response = await this.aiService.generateWithCustomPrompt(
'OCR for handwritten text',
base64Image, // Input data
{
provider: 'auto',
}
);
// Or for OCR directly
const text = await this.aiService.recognizeTextWithCustomPrompt(
'OCR for handwritten text',
base64Image,
);
// Or for transcription
const transcript = await this.aiService.transcribeWithCustomPrompt(
'Transcription in Ukrainian',
base64Audio,
);
Managing custom prompts
// Get all prompts
const allPrompts = this.aiService.getAllCustomPrompts();
// Get specific prompt
const prompt = this.aiService.getCustomPrompt('OCR for handwritten text');
// Delete prompt
this.aiService.deleteCustomPrompt('OCR for handwritten text');
API Endpoints
The module also provides REST API endpoints (unprotected by default — see the authentication section above):
POST /ai/generate
Generate response from prompt.
Request:
{
"prompt": "What is artificial intelligence?",
"systemPrompt": "You are a helpful assistant",
"provider": "auto",
"model": "gpt-4o",
"temperature": 0.7,
"maxTokens": 1000
}
Response:
{
"content": "Artificial intelligence is..."
}
POST /ai/ocr/base64
Text recognition from base64 image.
Request:
{
"base64Image": "iVBORw0KGgoAAAANSUhEUgAA...",
"provider": "auto",
"model": "gpt-4o",
"prompt": "Recognize text"
}
Response:
{
"text": "Recognized text..."
}
POST /ai/html/generate
HTML generation from text.
Request:
{
"text": "Title\n\nText",
"provider": "auto",
"model": "gpt-4o"
}
Response:
{
"html": "<h1>Title</h1><p>Text</p>"
}
POST /ai/transcribe/base64
Audio transcription from base64 (Speech-to-Text). Supports one language or an array of languages. Default language is English (en).
Request (one language):
{
"base64Audio": "UklGRiQAAABXQVZFZm10...",
"provider": "openai",
"language": "uk",
"prompt": "This is a recording in Ukrainian",
"temperature": 0
}
Request (multiple languages):
{
"base64Audio": "UklGRiQAAABXQVZFZm10...",
"provider": "openai",
"language": ["uk", "en", "ru"],
"prompt": "This recording may be in Ukrainian, English, or Russian",
"temperature": 0
}
Request (default - English):
{
"base64Audio": "UklGRiQAAABXQVZFZm10...",
"provider": "openai"
}
Response:
{
"text": "Transcribed text..."
}
Note: If an array of languages is provided, the first language from the array is used. OpenAI Whisper supports only one language at a time.
POST /ai/prompts
Create a custom prompt.
Request:
{
"name": "OCR for handwritten text",
"prompt": "Recognize text from image. Preserve structure.",
"type": "ocr",
"options": {
"temperature": 0.3
}
}
Response:
{
"name": "OCR for handwritten text",
"prompt": "Recognize text from image. Preserve structure.",
"type": "ocr",
"options": { "temperature": 0.3 },
"createdAt": "2024-01-01T00:00:00.000Z",
"updatedAt": "2024-01-01T00:00:00.000Z"
}
GET /ai/prompts
Get all custom prompts.
Response:
{
"prompts": [
{
"name": "OCR for handwritten text",
"prompt": "...",
"type": "ocr",
...
}
]
}
GET /ai/prompts/:name
Get a custom prompt by name.
DELETE /ai/prompts/:name
Delete a custom prompt.
POST /ai/prompts/use
Use a custom prompt for generation.
Request:
{
"promptName": "OCR for handwritten text",
"input": "base64ImageString",
"provider": "auto"
}
Response:
{
"content": "Generated response..."
}
POST /ai/providers
Get list of available providers.
Response:
{
"providers": ["openai", "gemini"]
}
Fallback logic
The module automatically switches between providers on failures:
- Primary provider - the provider configured in AI_PRIMARY_PROVIDER is tried first
- Fallback - if the primary is unavailable or returns an error, the secondary provider is used automatically
- Logging - all switches are logged for monitoring
Example:
// If OpenAI is unavailable, automatically uses Gemini
const response = await this.aiService.generateResponse('Hello!', {
provider: 'auto', // Automatic selection with fallback
});
Configuration
Default models
- OpenAI: gpt-4o (can be changed via OPENAI_DEFAULT_MODEL)
- Gemini: gemini-1.5-flash (can be changed via GEMINI_DEFAULT_MODEL)
Supported models
OpenAI:
- gpt-4o (recommended for OCR and complex tasks)
- gpt-4-turbo
- gpt-3.5-turbo
- and others
Gemini:
- gemini-1.5-flash (fast and economical)
- gemini-1.5-pro (more powerful)
- gemini-pro
- and others
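The default-model resolution described above can be sketched as follows (illustrative; the env variable names come from the configuration section of this README, the fallback values from the defaults listed above):

```typescript
// Resolve the default model per provider from env, falling back to the
// documented defaults when the variable is unset.
function defaultModel(provider: 'openai' | 'gemini'): string {
  if (provider === 'openai') {
    return process.env.OPENAI_DEFAULT_MODEL ?? 'gpt-4o';
  }
  return process.env.GEMINI_DEFAULT_MODEL ?? 'gemini-1.5-flash';
}
```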
Preparing for npm package
The module is ready to be used as an npm package. For this:
- Export the module in package.json:
{
"name": "@your-org/nestjs-ai",
"version": "1.0.0",
"main": "dist/modules/ai/index.js",
"types": "dist/modules/ai/index.d.ts",
"exports": {
".": {
"import": "./dist/modules/ai/index.js",
"require": "./dist/modules/ai/index.js",
"types": "./dist/modules/ai/index.d.ts"
}
},
"peerDependencies": {
"@nestjs/common": "^11.0.0",
"@nestjs/config": "^4.0.0",
"openai": "^4.0.0"
}
}
The index file is already created in src/modules/ai/index.ts.
Publishing:
Automatic publishing via GitHub Actions (recommended):
The package is automatically published to npm when:
- You push changes to the main branch and the version in package.json is updated
- You create a GitHub release
Setup required:
- Create an npm access token at https://www.npmjs.com/settings/YOUR_USERNAME/tokens
- Add it as a secret in your GitHub repository:
  - Go to Settings → Secrets and variables → Actions
  - Click "New repository secret"
  - Name: NPM_TOKEN
  - Value: your npm access token
  - Click "Add secret"
Manual publishing (if needed):
npm publish --access public
Using as npm package in other projects
npm install @your-org/nestjs-ai
import { Module } from '@nestjs/common';
import { AIModule } from '@your-org/nestjs-ai';
@Module({
imports: [AIModule],
})
export class AppModule {}
import { Injectable } from '@nestjs/common';
import { AIService } from '@your-org/nestjs-ai';
@Injectable()
export class MyService {
constructor(private readonly aiService: AIService) {}
async ask(prompt: string) {
return await this.aiService.generateResponse(prompt);
}
}
Note about Gemini API
The Gemini adapter uses OpenAI SDK with a custom baseURL. For full Gemini support you can:
- Use Google Vertex AI API (compatible with OpenAI format)
- Or install @google/generative-ai and update GeminiAdapter to use the native SDK
Current implementation works with Gemini through OpenAI-compatible API endpoint.
Use Cases & Examples
Real-World Use Cases
1. 📱 Mobile App with Handwritten Notes Recognition
Scenario: A journaling app where users take photos of handwritten notes and convert them to digital text.
@Injectable()
export class JournalService {
constructor(
private readonly aiService: AIService,
private readonly prisma: PrismaService,
) {}
async createEntryFromPhoto(userId: string, photoBase64: string) {
// Set up custom OCR prompt for handwritten text
this.aiService.createCustomPrompt(
'handwritten-journal',
'Recognize handwritten text from journal entry. Preserve line breaks, paragraphs, and formatting. If text is crossed out, mark as ~~text~~.',
'ocr',
{ temperature: 0.2 }
);
// Recognize text
const text = await this.aiService.recognizeTextWithCustomPrompt(
'handwritten-journal',
photoBase64,
);
// Generate HTML version
const html = await this.aiService.generateHtmlFromText(text, {
provider: 'auto',
});
// Save to database
return this.prisma.entry.create({
data: {
userId,
text,
html,
source: 'photo',
},
});
}
}
2. 🎙️ Voice Notes Transcription Service
Scenario: A productivity app that transcribes voice memos in multiple languages.
@Injectable()
export class VoiceNotesService {
constructor(
private readonly aiService: AIService,
private readonly storageService: StorageService,
private readonly prisma: PrismaService,
) {}
async transcribeVoiceNote(
audioFile: Express.Multer.File,
userId: string,
language?: string,
) {
// Convert file to base64
const base64Audio = await this.storageService.fileToBase64(audioFile);
// Transcribe with language detection
const transcript = await this.aiService.transcribeAudioFromBase64(
base64Audio,
{
provider: 'openai',
language: language || 'en', // Defaults to English if not specified
prompt: 'This is a voice note. Remove filler words and background noise.',
},
);
// Save transcription
return this.prisma.voiceNote.create({
data: {
userId,
transcript,
language: language || 'en',
audioUrl: await this.storageService.upload(audioFile),
},
});
}
async batchTranscribe(audioFiles: Express.Multer.File[], languages: string[]) {
// Process multiple files with different languages
const results = await Promise.all(
audioFiles.map((file, index) =>
this.transcribeVoiceNote(file, 'user-id', languages[index]),
),
);
return results;
}
}
3. 📄 Document Processing Pipeline
Scenario: An enterprise document management system that processes various document types.
@Injectable()
export class DocumentProcessingService {
constructor(private readonly aiService: AIService) {}
async processDocument(document: Document) {
// Collect tasks by name so results are not position-dependent
// (the tasks are conditional, so positional destructuring would be wrong)
const tasks: Record<string, Promise<string>> = {};
// OCR for scanned documents
if (document.type === 'scanned') {
tasks.text = this.aiService.recognizeTextFromBase64(document.imageBase64, {
prompt: 'Extract all text. Preserve tables, lists, and formatting.',
});
}
// Transcription for audio documents
if (document.type === 'audio') {
tasks.transcript = this.aiService.transcribeAudioFromBase64(document.audioBase64, {
language: document.language || 'en',
});
}
// Generate summary
if (document.requiresSummary) {
tasks.summary = this.aiService.generateResponse(
`Summarize this document: ${document.content}`,
{
systemPrompt: 'You are a professional document analyst',
maxTokens: 500,
},
);
}
const keys = Object.keys(tasks);
const values = await Promise.all(keys.map((key) => tasks[key]));
const results = Object.fromEntries(keys.map((key, i) => [key, values[i]]));
return {
...document,
processedText: results.text || results.transcript,
summary: results.summary,
processedAt: new Date(),
};
}
}
4. 🤖 AI-Powered Customer Support Chatbot
Scenario: An e-commerce platform with intelligent customer support.
@Injectable()
export class SupportChatService {
constructor(
private readonly aiService: AIService,
private readonly orderService: OrderService,
) {}
async handleCustomerMessage(userId: string, message: string) {
// Get user context
const user = await this.getUser(userId);
const recentOrders = await this.orderService.getRecentOrders(userId);
// Build context-aware system prompt
const systemPrompt = `You are a helpful customer support agent for an e-commerce platform.
User: ${user.name} (${user.email})
Recent orders: ${recentOrders.length}
Be friendly, professional, and helpful.`;
// Generate response with context
const response = await this.aiService.generateChatResponse(
[
{ role: 'system', content: systemPrompt },
{ role: 'user', content: message },
],
{
provider: 'auto',
temperature: 0.7, // More creative for customer service
},
);
// Save conversation
await this.saveConversation(userId, message, response);
return response;
}
}
5. 📝 Content Generation Platform
Scenario: A content management system that generates articles, social media posts, and marketing copy.
@Injectable()
export class ContentGenerationService {
constructor(private readonly aiService: AIService) {}
// Generate blog article
async generateArticle(topic: string, style: 'professional' | 'casual' = 'professional') {
const systemPrompt =
style === 'professional'
? 'You are an experienced journalist writing for a professional publication.'
: 'You are a friendly blogger writing engaging content.';
return this.aiService.generateResponse(
`Write a comprehensive article about: ${topic}`,
{
systemPrompt,
maxTokens: 2000,
temperature: 0.8,
},
);
}
// Generate social media posts
async generateSocialMediaPost(topic: string, platform: 'twitter' | 'linkedin' | 'instagram') {
const prompts = {
twitter: 'Write a concise, engaging tweet (max 280 characters)',
linkedin: 'Write a professional LinkedIn post (2-3 paragraphs)',
instagram: 'Write an engaging Instagram caption with relevant hashtags',
};
return this.aiService.generateResponse(`${prompts[platform]} about: ${topic}`, {
systemPrompt: 'You are a social media expert',
maxTokens: platform === 'twitter' ? 100 : 300,
});
}
// Generate email marketing copy
async generateEmailCampaign(product: Product, audience: string) {
return this.aiService.generateResponse(
`Write a marketing email for ${product.name} targeting ${audience}`,
{
systemPrompt: 'You are a professional email marketing copywriter',
temperature: 0.7,
maxTokens: 500,
},
);
}
}
6. 🏥 Healthcare: Medical Records Processing
Scenario: A healthcare application that processes medical forms and prescriptions.
@Injectable()
export class MedicalRecordsService {
constructor(private readonly aiService: AIService) {}
async processPrescription(imageBase64: string) {
// Custom prompt for medical documents
this.aiService.createCustomPrompt(
'prescription-ocr',
'Extract prescription details: patient name, medication names, dosages, frequency, doctor name, date. Format as structured JSON.',
'ocr',
{ temperature: 0.1 } // Low temperature for accuracy
);
const text = await this.aiService.recognizeTextWithCustomPrompt(
'prescription-ocr',
imageBase64,
);
// Parse structured data
const prescription = JSON.parse(text);
return {
patientName: prescription.patientName,
medications: prescription.medications,
doctor: prescription.doctor,
date: prescription.date,
};
}
}
7. 🎓 Educational Platform: Assignment Grading
Scenario: An online learning platform with AI-assisted grading and feedback.
@Injectable()
export class GradingService {
constructor(private readonly aiService: AIService) {}
async gradeAssignment(
studentAnswer: string,
assignmentPrompt: string,
rubric: string,
) {
const systemPrompt = `You are an experienced educator. Grade the student's answer based on the rubric.
Assignment: ${assignmentPrompt}
Rubric: ${rubric}
Provide: score (0-100), detailed feedback, and suggestions for improvement.`;
const response = await this.aiService.generateResponse(
`Student's answer:\n${studentAnswer}`,
{
systemPrompt,
temperature: 0.3, // Consistent grading
maxTokens: 1000,
},
);
// Parse response (could be structured JSON)
return this.parseGradingResponse(response);
}
}
8. 🛒 E-commerce: Product Description Generation
Scenario: An online marketplace that auto-generates product descriptions.
@Injectable()
export class ProductService {
constructor(
private readonly aiService: AIService,
private readonly prisma: PrismaService,
) {}
async generateProductDescription(product: Product) {
// Generate description from product image
const imageDescription = await this.aiService.recognizeTextFromBase64(
product.imageBase64,
{
prompt: 'Describe this product in detail: features, colors, materials, style.',
},
);
// Generate marketing copy
const marketingCopy = await this.aiService.generateResponse(
`Create compelling product description for: ${product.name}\nFeatures: ${imageDescription}`,
{
systemPrompt: 'You are a professional e-commerce copywriter',
temperature: 0.7,
maxTokens: 300,
},
);
// Update product
return this.prisma.product.update({
where: { id: product.id },
data: {
description: marketingCopy,
aiGenerated: true,
},
});
}
}
9. 📊 Business Intelligence: Report Generation
Scenario: A business analytics platform that generates insights from data.
@Injectable()
export class AnalyticsService {
constructor(private readonly aiService: AIService) {}
async generateInsights(data: AnalyticsData) {
const dataSummary = this.formatDataForAI(data);
const insights = await this.aiService.generateResponse(
`Analyze this business data and provide insights:\n${dataSummary}`,
{
systemPrompt: 'You are a business analyst. Provide actionable insights and recommendations.',
temperature: 0.5,
maxTokens: 1500,
},
);
// Generate executive summary
const summary = await this.aiService.generateResponse(
`Create an executive summary (3-4 sentences) from these insights:\n${insights}`,
{
systemPrompt: 'You are a C-level executive assistant',
maxTokens: 200,
},
);
return {
insights,
summary,
generatedAt: new Date(),
};
}
}
10. 🌐 Multi-language Content Translation
Scenario: A content management system with automatic translation.
@Injectable()
export class TranslationService {
constructor(private readonly aiService: AIService) {}
async translateContent(
content: string,
targetLanguage: string,
sourceLanguage?: string,
) {
const systemPrompt = `You are a professional translator. Translate the content accurately while preserving meaning, tone, and style.
Target language: ${targetLanguage}
${sourceLanguage ? `Source language: ${sourceLanguage}` : 'Auto-detect source language'}`;
return this.aiService.generateResponse(content, {
systemPrompt,
temperature: 0.3, // Consistent translations
maxTokens: content.length * 2, // Estimate token count
});
}
async batchTranslate(
contents: string[],
targetLanguage: string,
) {
// Process in parallel with rate limiting
const translations = await Promise.all(
contents.map((content) =>
this.translateContent(content, targetLanguage),
),
);
return translations;
}
}
Integration Patterns
Pattern 1: Service Layer Integration
// Best practice: Create a dedicated service that wraps AIService
@Injectable()
export class MyBusinessService {
constructor(
private readonly aiService: AIService,
private readonly otherService: OtherService,
) {}
async processWithAI(data: any) {
// Combine AI with business logic
const aiResult = await this.aiService.generateResponse(data.prompt);
const businessResult = await this.otherService.process(aiResult);
return businessResult;
}
}
Pattern 2: Queue-Based Processing
// For heavy AI tasks, use queues (this sketch assumes Bull via @nestjs/bull)
@Processor('ai-tasks')
export class AIQueueService {
constructor(
@InjectQueue('ai-tasks') private aiQueue: Queue,
private readonly aiService: AIService,
) {}
async processImageAsync(imageBase64: string) {
// Enqueue the job; the @Process handler below picks it up
await this.aiQueue.add('ocr', { imageBase64 });
}
@Process('ocr')
async handleOCR(job: Job) {
return this.aiService.recognizeTextFromBase64(job.data.imageBase64);
}
}
Pattern 3: Caching AI Responses
import { createHash } from 'node:crypto';

@Injectable()
export class CachedAIService {
constructor(
private readonly aiService: AIService,
@Inject(CACHE_MANAGER) private cacheManager: Cache,
) {}
async generateCachedResponse(prompt: string) {
const cacheKey = `ai:${this.hashPrompt(prompt)}`;
// Check cache
const cached = await this.cacheManager.get(cacheKey);
if (cached) return cached;
// Generate and cache
const response = await this.aiService.generateResponse(prompt);
await this.cacheManager.set(cacheKey, response, 3600); // 1 hour
return response;
}
private hashPrompt(prompt: string): string {
// Derive a stable cache key from the prompt text
return createHash('sha256').update(prompt).digest('hex');
}
}
Error handling
The module automatically handles errors and switches to fallback provider. If both providers are unavailable, an error will be thrown:
try {
const response = await this.aiService.generateResponse('Hello!');
} catch (error) {
console.error('AI service error:', error.message);
// Error handling
}
Logging
The module uses a flexible logging system that automatically detects and uses the best available logger:
Supported Loggers
DatadogLoggerService from @ciphercross/nestjs-datadog (recommended)
import { DatadogModule } from '@ciphercross/nestjs-datadog';
@Module({
imports: [DatadogModule.register({ serviceName: 'my-app' })],
})
export class AppModule {}
The module will automatically detect and use DatadogLoggerService if available.
Custom LoggerService with the methods logInfo, logDebug, logWarning, logError
- Any logger service implementing these methods will be used automatically
NestJS Logger (fallback)
- If no custom logger is provided, the module automatically falls back to the standard NestJS Logger
Automatic Detection
The module automatically detects the logger type:
- If DatadogLoggerService is available → uses it
- If a custom logger with logInfo/logWarning methods is available → uses it
- Otherwise → falls back to NestJS Logger
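The detection order above can be sketched roughly as follows (a hypothetical illustration, not the module's actual source): a candidate counts as a custom logger only if it exposes both methods, otherwise the standard NestJS Logger is used.

```typescript
// Minimal sketch of the documented logger-detection logic.
interface CustomLogger {
  logInfo(message: string): void;
  logWarning(message: string): void;
}

function pickLogger(candidate: unknown): 'custom' | 'nest' {
  const c = candidate as Partial<CustomLogger> | null | undefined;
  if (c && typeof c.logInfo === 'function' && typeof c.logWarning === 'function') {
    return 'custom'; // DatadogLoggerService or any compatible logger
  }
  return 'nest'; // fall back to the standard NestJS Logger
}
```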
All operations are logged:
- Adapter initialization
- Switching between providers
- API errors
Note for npm package usage: The module works without any custom logger - it will use NestJS Logger by default. To use Datadog logging, install @ciphercross/nestjs-datadog as an optional peer dependency.
Testing
Swagger Documentation
Swagger is available at http://localhost:8000/api (or your configured port).
All AI endpoints are documented and can be tested directly from Swagger UI:
- POST /ai/generate - Generate response
- POST /ai/ocr/base64 - OCR from image
- POST /ai/transcribe/base64 - Audio transcription
- POST /ai/html/generate - Generate HTML
- POST /ai/prompts - Create custom prompt
- GET /ai/prompts - Get all prompts
- POST /ai/prompts/use - Use custom prompt
Note: All endpoints require authentication. Use /auth/login to get an access token, then click "Authorize" in Swagger UI.
Unit Tests
# Run all tests
npm test
# Run AI module tests
npm test ai.service.spec
npm test ai.controller.spec
# Run with coverage
npm run test:cov
See TESTING.md for detailed testing instructions.
Best Practices
1. Error Handling
Always wrap AI calls in try-catch blocks:
try {
const response = await this.aiService.generateResponse(prompt);
return response;
} catch (error) {
// Log error
this.logger.error('AI generation failed', error);
// Provide fallback
return 'Sorry, I encountered an error. Please try again.';
}
2. Rate Limiting
Implement rate limiting for AI endpoints:
@UseGuards(ThrottlerGuard)
@Throttle(10, 60) // 10 requests per minute
@Post('generate')
async generate(@Body() dto: GenerateResponseDto) {
return this.aiService.generateResponse(dto.prompt);
}
3. Cost Optimization
- Use appropriate models for tasks (GPT-3.5 for simple tasks, GPT-4o for complex)
- Cache frequently requested responses
- Set maxTokens to limit response length
- Use temperature: 0 for deterministic outputs
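One way to apply the model-selection advice above is a small routing helper (a sketch; the length threshold and model names are illustrative, not part of the module):

```typescript
// Illustrative routing: cheaper model for short/simple prompts,
// GPT-4o for complex or long tasks.
function pickModel(prompt: string, complex: boolean): string {
  if (complex || prompt.length > 2000) {
    return 'gpt-4o';
  }
  return 'gpt-3.5-turbo';
}
```

The chosen model name could then be passed via the `model` option of `generateResponse`.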
4. Security
- Never expose API keys in client-side code
- Validate and sanitize all user inputs
- Implement proper authentication for AI endpoints
- Monitor API usage and costs
5. Performance
- Use Promise.all() for parallel AI requests when possible
- Implement request queuing for high-volume scenarios
- Cache responses for identical prompts
- Monitor response times and optimize
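A simple way to combine the parallelism and queuing advice above is a chunked concurrency limiter (illustrative only, not part of the module): at most `limit` AI requests run at once.

```typescript
// Process items in chunks of `limit` so provider rate limits are respected.
async function mapWithLimit<T, R>(
  items: T[],
  limit: number,
  fn: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = [];
  for (let i = 0; i < items.length; i += limit) {
    const chunk = items.slice(i, i + limit);
    // All requests within a chunk run in parallel
    results.push(...(await Promise.all(chunk.map(fn))));
  }
  return results;
}
```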
6. Monitoring
- Log all AI operations
- Track success/failure rates
- Monitor API costs
- Set up alerts for provider failures
Troubleshooting
Common Issues
1. "No AI providers are available"
Cause: API keys not configured or providers disabled.
Solution:
- Check your .env file for OPENAI_API_KEY or GEMINI_API_KEY
- Verify OPENAI_ENABLED=true or GEMINI_ENABLED=true
- Check application logs for initialization errors
2. "Provider X is not available"
Cause: API key invalid, network issues, or quota exceeded.
Solution:
- Verify API key is correct
- Check network connectivity
- Verify API quota/limits
- Check provider status page
3. Slow Response Times
Cause: Large prompts, network latency, or provider issues.
Solution:
- Reduce prompt size
- Use faster models (GPT-3.5 vs GPT-4o)
- Implement caching
- Check provider status
4. High Costs
Cause: Using expensive models for simple tasks or no token limits.
Solution:
- Use GPT-3.5 for simple tasks
- Set maxTokens limits
- Implement response caching
- Monitor usage regularly
5. Authentication Errors
Cause: Invalid or expired access token.
Solution:
- Refresh access token
- Check token expiration
- Verify AccessAuthGuard configuration
Extending
The module is easily extensible to support new providers:
Adding a New Provider
- Create Adapter - Implement the AIProviderAdapter interface:
@Injectable()
export class ClaudeAdapter implements AIProviderAdapter {
async generateResponse(
messages: ChatMessage[],
options?: AIRequestOptions,
): Promise<AIResponse> {
// Implementation goes here
throw new Error('Not implemented yet');
}
isAvailable(): boolean {
// Check configuration (e.g. a hypothetical CLAUDE_API_KEY env variable)
return Boolean(process.env.CLAUDE_API_KEY);
}
getProviderName(): 'claude' {
return 'claude';
}
}
- Register in Module:
@Module({
providers: [
AIService,
OpenAIAdapter,
GeminiAdapter,
ClaudeAdapter, // Add new adapter
],
})
export class AIModule {}
- Register in Service:
private initializeAdapters() {
// ... existing adapters
if (this.claudeAdapter.isAvailable()) {
this.adapters.set('claude', this.claudeAdapter);
}
}
Performance Considerations
Token Limits
- GPT-3.5: ~4,096 tokens context window
- GPT-4o: ~128,000 tokens context window
- Gemini 1.5: ~1,000,000 tokens context window
Response Times
- GPT-3.5: ~1-3 seconds
- GPT-4o: ~3-10 seconds
- Gemini 1.5 Flash: ~1-2 seconds
Cost Estimates (as of 2024)
- GPT-3.5: ~$0.002 per 1K tokens
- GPT-4o: ~$0.01 per 1K tokens
- Gemini 1.5 Flash: ~$0.0001 per 1K tokens
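The rates above make back-of-envelope budgeting easy. A minimal sketch (illustrative only; real pricing varies by model version and between input and output tokens):

```typescript
// Approximate per-1K-token rates from the list above (USD).
const RATE_PER_1K_TOKENS_USD: Record<string, number> = {
  'gpt-3.5-turbo': 0.002,
  'gpt-4o': 0.01,
  'gemini-1.5-flash': 0.0001,
};

// Estimate the cost of a request given a model name and a token count;
// unknown models return 0 rather than guessing a rate.
function estimateCostUSD(model: string, tokens: number): number {
  const rate = RATE_PER_1K_TOKENS_USD[model] ?? 0;
  return (tokens / 1000) * rate;
}
```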
Contributing
When contributing to this module:
- Follow NestJS best practices
- Write unit tests for new features
- Update documentation
- Ensure backward compatibility
- Test with multiple providers
License
ISC
Support
For issues, questions, or contributions:
- Check TESTING.md for testing instructions
- Review QUICK_TEST.md for a quick start guide
- Open an issue in the repository
