Documentation

How It Works

ocr uses a distributed pipeline architecture to process documents at scale. Each document flows through these five stages:

1. Upload

Send documents via REST API or SDK. We accept JPEG, PNG, TIFF, BMP, WebP, and PDF formats up to 100MB.
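A client can reject obviously unsuitable files before sending them. The sketch below is illustrative only: the extension set is a guess at how the listed formats map to file names, and it assumes "100MB" means 100 * 1024 * 1024 bytes, which the service may count differently. `check_upload` is a hypothetical helper, not part of any SDK.

```python
from pathlib import Path

# Extensions assumed for the formats listed above (hypothetical mapping).
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".tif", ".tiff", ".bmp", ".webp", ".pdf"}
# Assumes "100MB" means 100 * 1024 * 1024 bytes; the service may count differently.
MAX_UPLOAD_BYTES = 100 * 1024 * 1024


def check_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    ext = Path(filename).suffix.lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported extension: {ext or '(none)'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        problems.append(f"file too large: {size_bytes} bytes")
    return problems
```

The server remains the authority on what it accepts; a check like this only avoids a wasted round trip.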

2. Categorize

Our AI classifies the document type (invoice, receipt, contract, etc.) to optimize recognition settings.

3. Split

Multi-page documents are intelligently segmented. PDFs are rasterized at 300 DPI for optimal accuracy.
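To get a feel for what 300 DPI means in pixels: PDF page sizes are expressed in points, with 72 points per inch, so the raster size follows from simple arithmetic. The helper name `raster_size` is hypothetical, and how the service rounds dimensions is not documented here.

```python
def raster_size(width_pt: float, height_pt: float, dpi: int = 300) -> tuple[int, int]:
    """Pixel dimensions of a PDF page rendered at the given DPI.

    PDF page sizes are expressed in points, with 72 points per inch.
    """
    return round(width_pt / 72 * dpi), round(height_pt / 72 * dpi)


# US Letter is 612 x 792 points (8.5 x 11 inches).
print(raster_size(612, 792))  # (2550, 3300)
```

A US Letter page at 300 DPI is therefore about 8.4 megapixels, or roughly 25 MB as uncompressed 24-bit RGB.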

4. Recognize

Our OCR engine extracts text with 99.9% precision using advanced deep learning models.

5. Deliver

Results are delivered in real time over WebSocket or through a REST callback. Data is encrypted in transit and at rest.

TypeScript SDK

Install and use our TypeScript SDK for Node.js applications.

Installation & Quick Start

Install

npm install https://ocr.dasolution.sk/sdk/typescript

Quick Start

import { OCRClient } from '@dasolution/ocr-sdk';

// Connects to ocr.dasolution.sk by default
const client = new OCRClient();

// Open WebSocket, get token
const token = await client.registerCallback();

// Listen for OCR results
client.onResultReceived((result) => {
  console.log('Text:', result.recognized_text);
});

// Upload a document
await client.uploadFile('./document.pdf');
            

Configuration

Options

serverUrl - Server URL (default: https://ocr.dasolution.sk)

token - Existing token for reconnection

reconnectInterval - Reconnect delay in ms (default: 3000)

maxReconnectAttempts - Max retries (default: 10)

Batch Processing

const client = new OCRClient();
await client.registerCallback();

// Upload multiple files
const files = ['doc1.pdf', 'doc2.jpg'];
for (const f of files) {
  await client.uploadFile(f);
}

// Or upload a Buffer
await client.uploadFile(buffer, 'scan.png');

// Clean up
client.disconnect();
            

Go SDK

Install and use our Go SDK for high-performance backend applications.

Installation & Quick Start

Install

GONOSUMCHECK=ocr.dasolution.sk/* \
GONOSUMDB=ocr.dasolution.sk/* \
GOPRIVATE=ocr.dasolution.sk/* \
go get ocr.dasolution.sk/sdk/go

Quick Start

package main

import (
    "log"
    ocr "ocr.dasolution.sk/sdk/go"
)

func main() {
    // Connects to ocr.dasolution.sk by default
    client := ocr.NewClient(ocr.ClientOptions{})

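    token, err := client.RegisterCallback()
    if err != nil {
        log.Fatal(err)
    }
    // Go rejects unused variables; keep the token to resume the session later.
    log.Printf("session token: %v", token)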

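    client.OnResult(func(r ocr.OCRResult) {
        // RecognizedText is a pointer; guard against nil before dereferencing.
        if r.RecognizedText != nil {
            log.Printf("%s", *r.RecognizedText)
        }
    })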

    _, err = client.UploadFile("./doc.pdf")
    if err != nil {
        log.Fatal(err)
    }

    select {} // wait for results
}
            

Configuration

Options

ServerURL - Server URL (default: https://ocr.dasolution.sk)

Token - Existing token for reconnection

ReconnectInterval - Reconnect delay (default: 3s)

MaxReconnectAttempts - Max retries (default: 10)

Channel-Based Consumption

// Use the Results channel instead of callbacks
go func() {
    for result := range client.Results {
        switch result.Type {
        case "result":
            log.Printf("OCR: %s", *result.RecognizedText)
        case "error":
            log.Printf("Err: %s", *result.Error)
        }
    }
}()

// Upload from an io.Reader
f, err := os.Open("scan.pdf")
if err != nil {
    log.Fatal(err)
}
defer f.Close()
client.UploadReader(f, "scan.pdf")
            

Java SDK

Install and use our Java SDK for JVM-based applications. Java 11+ compatible.

Installation & Quick Start

Maven Dependency

<!-- Download JAR from the server -->
<dependency>
  <groupId>sk.dasolution</groupId>
  <artifactId>ocr-sdk</artifactId>
  <version>1.0.0</version>
  <scope>system</scope>
  <systemPath>${project.basedir}/lib/ocr-sdk-1.0.0.jar</systemPath>
</dependency>
            

Direct Download

curl -o ocr-sdk-1.0.0.jar https://ocr.dasolution.sk/sdk/java

Quick Start

import java.nio.file.Path;
import sk.dasolution.ocr.*;

var client = new OCRClient();

// Open WebSocket, get token
String token = client.registerCallback();

// Listen for OCR results
client.onResult(result -> {
    System.out.println(result.getRecognizedText());
});

// Upload a document
client.uploadFile(Path.of("document.pdf"));
            

Configuration

Options

serverUrl - Server URL (default: https://ocr.dasolution.sk)

token - Existing token for reconnection

reconnectIntervalMs - Reconnect delay in ms (default: 3000)

maxReconnectAttempts - Max retries (default: 10)

Queue-Based Consumption

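// Use the results queue instead of callbacks
new Thread(() -> {
    try {
        while (true) {
            // take() blocks until a result is available (assuming a BlockingQueue)
            OCRResult r = client.results.take();
            switch (r.getType()) {
            case "result":
                System.out.println(r.getRecognizedText());
                break;
            case "error":
                System.err.println(r.getError());
                break;
            }
        }
    } catch (InterruptedException e) {
        // Restore the interrupt flag and stop consuming.
        Thread.currentThread().interrupt();
    }
}).start();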

// Upload from an InputStream
client.uploadStream(inputStream, "scan.pdf");
            

Python SDK

Async Python SDK using websockets and aiohttp. Requires Python 3.9+.

Installation & Quick Start

Install

pip install https://ocr.dasolution.sk/sdk/python

Quick Start

import asyncio
from ocr_sdk import OCRClient

async def main():
    # Connects to ocr.dasolution.sk by default
    client = OCRClient()

    # Open WebSocket, get token
    token = await client.register_callback()

    # Listen for OCR results
    client.on_result(lambda r: print(r.recognized_text))

    # Upload a document
    await client.upload_file("document.pdf")

    # Wait for results
    await client.wait_for_results()

asyncio.run(main())
            

Configuration

Options

server_url - Server URL (default: https://ocr.dasolution.sk)

token - Existing token for reconnection

reconnect_interval - Reconnect delay in seconds (default: 3.0)

max_reconnect_attempts - Max retries (default: 10)
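How the two reconnect options interact can be sketched as a retry loop. This is not the SDK's actual implementation: it assumes a fixed delay between attempts (the SDK may use backoff), and `connect` is a hypothetical stand-in for whatever opens the WebSocket.

```python
import asyncio


async def connect_with_retry(connect, reconnect_interval=3.0, max_reconnect_attempts=10):
    """Call `connect` until it succeeds, with a fixed delay between retries.

    `connect` is any coroutine function that raises OSError on failure
    (a hypothetical stand-in for opening the WebSocket).
    """
    attempt = 0
    while True:
        try:
            return await connect()
        except OSError:
            attempt += 1
            if attempt > max_reconnect_attempts:
                raise  # give up after the configured number of retries
            await asyncio.sleep(reconnect_interval)
```

With the defaults, a client following this pattern would give up roughly 30 seconds after the first failure, since ten retries are spaced three seconds apart.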

Queue-Based Consumption

# Use the async queue instead of callbacks
while True:
    result = await client.results.get()
    if result.type == "result":
        print(result.recognized_text)
    elif result.type == "error":
        print(result.error)

# Upload from bytes or file-like object
await client.upload_bytes(data, "scan.png")
await client.upload_stream(file_obj, "scan.pdf")
            

API Reference

REST and WebSocket endpoints for the OCR service.

WS /ws

WebSocket endpoint. Opens a connection and receives a token. Reconnect with ?token=<token> to resume a session and receive pending results.

Query Parameters:

token - Existing token for reconnection (optional)
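For clients that do not use an SDK, the WebSocket URL can be derived from the server URL. The sketch below assumes that an https server is reached over wss and that /ws sits at the server root; `websocket_url` is a hypothetical helper and "abc123" is a placeholder token.

```python
from typing import Optional
from urllib.parse import urlencode, urlsplit, urlunsplit


def websocket_url(server_url: str, token: Optional[str] = None) -> str:
    """Derive the /ws URL from the HTTP(S) server URL, adding ?token= when resuming."""
    parts = urlsplit(server_url)
    # Assumption: https maps to wss, anything else to ws.
    scheme = "wss" if parts.scheme == "https" else "ws"
    query = urlencode({"token": token}) if token else ""
    return urlunsplit((scheme, parts.netloc, "/ws", query, ""))


print(websocket_url("https://ocr.dasolution.sk"))
# wss://ocr.dasolution.sk/ws
print(websocket_url("https://ocr.dasolution.sk", "abc123"))
# wss://ocr.dasolution.sk/ws?token=abc123
```

Using `urlencode` keeps the token safe to embed even if it contains characters that are reserved in URLs.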

Server Messages:

{"type":"connected","token":"..."} - Sent on connection
{"type":"result","file_id":"...","recognized_text":"..."} - OCR result
{"type":"error","file_id":"...","error":"..."} - Processing error
{"type":"unknown_type","file_id":"...","error":"..."} - Unsupported format

POST /upload

Upload a document for OCR processing. Requires a valid WebSocket token.

Content-Type: multipart/form-data

Form Fields:

token - Your WebSocket session token (required)
file - The document file (required). Supports JPEG, PNG, TIFF, BMP, WebP, PDF.

Response:

{"file_id":"uuid","status":"uploaded"}

GET /health

Health check endpoint. Returns server and database status.

GET /sdk/typescript

Download the TypeScript SDK as an npm-installable tarball.

GET /sdk/java

Download the Java SDK as a self-contained JAR with all dependencies.

GET /sdk/python

Download the Python SDK as a pip-installable wheel.

Response Format

WebSocket messages are delivered in real time as JSON.

// Successful OCR result
{
  "type": "result",
  "file_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "recognized_text": "Extracted text content..."
}

// Error response
{
  "type": "error",
  "file_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "error": "recognition failed: timeout"
}
        

type - Message type: connected, result, error, unknown_type

file_id - UUID of the uploaded file

recognized_text - Extracted text (present on type: result)

error - Error description (present on type: error)

token - Session token (present on type: connected)
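Because results arrive asynchronously, a client typically uses file_id to match each message to the upload that produced it. The class below is a hypothetical sketch of that bookkeeping. It assumes every upload eventually yields exactly one result, error, or unknown_type message, which the documentation does not state explicitly.

```python
import json


class PendingUploads:
    """Tracks uploaded file_ids until a result or error message arrives for each."""

    def __init__(self):
        self._pending = {}  # file_id -> local filename

    def uploaded(self, file_id: str, filename: str) -> None:
        """Record the file_id returned by the upload response."""
        self._pending[file_id] = filename

    def resolve(self, raw_message: str):
        """Return (filename, message) for a per-file message, else None."""
        message = json.loads(raw_message)
        if message.get("type") not in ("result", "error", "unknown_type"):
            return None  # e.g. "connected" carries no file_id
        # filename is None if the file_id was never recorded by this client.
        filename = self._pending.pop(message["file_id"], None)
        return filename, message

    @property
    def outstanding(self) -> int:
        """Number of uploads still waiting for a message."""
        return len(self._pending)
```

When `outstanding` reaches zero, a batch client knows it has received a message for every file it uploaded and can disconnect.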
