API Documentation

Everything you need to integrate ocr into your applications: SDKs, API reference, and response formats.

How It Works

ocr uses a distributed pipeline architecture to process documents at scale. Each document flows through these five stages:

1. Upload

Send documents via REST API or SDK. We accept JPEG, PNG, TIFF, BMP, WebP, and PDF formats up to 100 MB.

2. Categorize

Our AI classifies the document type (invoice, receipt, contract, etc.) to optimize recognition settings.

3. Split

Multi-page documents are intelligently segmented. PDFs are rasterized at 300 DPI for optimal accuracy.

4. Recognize

Our OCR engine extracts text with 99.9% precision using advanced deep learning models.

5. Deliver

Results are delivered in real time over WebSocket or through a REST callback. Data is encrypted in transit and at rest.
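The Upload stage's format and size limits can be checked on the client before any bytes are sent. The following is a minimal sketch, not part of the SDK: the function name `check_upload`, the exact set of file extensions, and the reading of "100 MB" as 100 * 1024 * 1024 bytes are all assumptions made for illustration.

```python
from pathlib import Path

# Extensions for the formats listed in the Upload stage. The exact suffixes
# the server accepts (".jpg" vs ".jpeg", ".tif" vs ".tiff") are an assumption.
ACCEPTED_SUFFIXES = {".jpg", ".jpeg", ".png", ".tif", ".tiff", ".bmp", ".webp", ".pdf"}

# Assumption: the 100 MB limit means 100 * 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 100 * 1024 * 1024


def check_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    suffix = Path(filename).suffix.lower()
    if suffix not in ACCEPTED_SUFFIXES:
        problems.append(f"unsupported extension: {suffix or '(none)'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        problems.append(f"file is {size_bytes} bytes, limit is {MAX_UPLOAD_BYTES}")
    return problems
```

A check like this only saves a round trip. The server remains the authority on what it accepts, and it reports unsupported formats through the `unknown_type` message.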

TypeScript SDK

Install and use our TypeScript SDK for Node.js applications.

Installation & Quick Start

Install

npm install https://ocr.dasolution.sk/sdk/typescript

Quick Start

import { OCRClient } from '@dasolution/ocr-sdk';

// Connects to ocr.dasolution.sk by default
const client = new OCRClient();

// Open WebSocket, get token
const token = await client.registerCallback();

// Listen for OCR results
client.onResultReceived((result) => {
  console.log('Text:', result.recognized_text);
});

// Upload a document
await client.uploadFile('./document.pdf');

Configuration

Options

  • serverUrl - Server URL (default: https://ocr.dasolution.sk)
  • token - Existing token for reconnection
  • reconnectInterval - Reconnect delay in ms (default: 3000)
  • maxReconnectAttempts - Max retries (default: 10)

Batch Processing

const client = new OCRClient();
await client.registerCallback();

// Upload multiple files
const files = ['doc1.pdf', 'doc2.jpg'];
for (const f of files) {
  await client.uploadFile(f);
}

// Or upload a Buffer
await client.uploadFile(buffer, 'scan.png');

// Clean up
client.disconnect();

Go SDK

Install and use our Go SDK for high-performance backend applications.

Installation & Quick Start

Install

GONOSUMCHECK=ocr.dasolution.sk/* \
GONOSUMDB=ocr.dasolution.sk/* \
GOPRIVATE=ocr.dasolution.sk/* \
go get ocr.dasolution.sk/sdk/go

Quick Start

package main

import (
    "log"

    ocr "ocr.dasolution.sk/sdk/go"
)

func main() {
    // Connects to ocr.dasolution.sk by default
    client := ocr.NewClient(ocr.ClientOptions{})

    token, err := client.RegisterCallback()
    if err != nil {
        log.Fatal(err)
    }
    // Go rejects unused variables, so log the token
    log.Printf("session token: %v", token)

    client.OnResult(func(r ocr.OCRResult) {
        log.Printf("%s", *r.RecognizedText)
    })

    _, err = client.UploadFile("./doc.pdf")
    if err != nil {
        log.Fatal(err)
    }

    select {} // wait for results
}

Configuration

Options

  • ServerURL - Server URL (default: https://ocr.dasolution.sk)
  • Token - Existing token for reconnection
  • ReconnectInterval - Reconnect delay (default: 3s)
  • MaxReconnectAttempts - Max retries (default: 10)

Channel-Based Consumption

// Use the Results channel instead of callbacks
go func() {
    for result := range client.Results {
        switch result.Type {
        case "result":
            log.Printf("OCR: %s", *result.RecognizedText)
        case "error":
            log.Printf("Err: %s", *result.Error)
        }
    }
}()

// Upload from an io.Reader
f, err := os.Open("scan.pdf")
if err != nil {
    log.Fatal(err)
}
defer f.Close()
client.UploadReader(f, "scan.pdf")

Java SDK

Install and use our Java SDK for JVM-based applications. Java 11+ compatible.

Installation & Quick Start

Maven Dependency

<!-- Download JAR from the server -->
<dependency>
    <groupId>sk.dasolution</groupId>
    <artifactId>ocr-sdk</artifactId>
    <version>1.0.0</version>
    <scope>system</scope>
    <systemPath>${project.basedir}/lib/ocr-sdk-1.0.0.jar</systemPath>
</dependency>

Direct Download

curl -o ocr-sdk-1.0.0.jar https://ocr.dasolution.sk/sdk/java

Quick Start

import java.nio.file.Path;
import sk.dasolution.ocr.*;

var client = new OCRClient();

// Open WebSocket, get token
String token = client.registerCallback();

// Listen for OCR results
client.onResult(result -> {
    System.out.println(result.getRecognizedText());
});

// Upload a document
client.uploadFile(Path.of("document.pdf"));

Configuration

Options

  • serverUrl - Server URL (default: https://ocr.dasolution.sk)
  • token - Existing token for reconnection
  • reconnectIntervalMs - Reconnect delay in ms (default: 3000)
  • maxReconnectAttempts - Max retries (default: 10)

Queue-Based Consumption

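// Use the results queue instead of callbacks
// (assumes client.results is a java.util.concurrent.BlockingQueue)
new Thread(() -> {
    try {
        while (true) {
            OCRResult r = client.results.take();
            switch (r.getType()) {
                case "result":
                    System.out.println(r.getRecognizedText());
                    break;
                case "error":
                    System.err.println(r.getError());
                    break;
            }
        }
    } catch (InterruptedException e) {
        // take() was interrupted: restore the flag and let the thread exit
        Thread.currentThread().interrupt();
    }
}).start();

// Upload from an InputStream
client.uploadStream(inputStream, "scan.pdf");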

Python SDK

Async Python SDK using websockets and aiohttp. Requires Python 3.9+.

Installation & Quick Start

Install

pip install https://ocr.dasolution.sk/sdk/python

Quick Start

import asyncio
from ocr_sdk import OCRClient

async def main():
    # Connects to ocr.dasolution.sk by default
    client = OCRClient()

    # Open WebSocket, get token
    token = await client.register_callback()

    # Listen for OCR results
    client.on_result(lambda r: print(r.recognized_text))

    # Upload a document
    await client.upload_file("document.pdf")

    # Wait for results
    await client.wait_for_results()

asyncio.run(main())

Configuration

Options

  • server_url - Server URL (default: https://ocr.dasolution.sk)
  • token - Existing token for reconnection
  • reconnect_interval - Reconnect delay in seconds (default: 3.0)
  • max_reconnect_attempts - Max retries (default: 10)
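To illustrate how a reconnect delay and a retry limit typically interact, here is a minimal sketch of a fixed-interval retry loop. It is not the SDK's implementation: the helper `connect_with_retry`, its `connect` and `sleep` parameters, and the reading of "max retries" as the total number of attempts are assumptions.

```python
import time
from typing import Callable, Optional


def connect_with_retry(
    connect: Callable[[], object],
    reconnect_interval: float = 3.0,
    max_reconnect_attempts: int = 10,
    sleep: Callable[[float], None] = time.sleep,
):
    """Call connect() until it succeeds or the attempt budget is used up."""
    last_error: Optional[Exception] = None
    for attempt in range(1, max_reconnect_attempts + 1):
        try:
            return connect()
        except ConnectionError as exc:
            last_error = exc
            if attempt < max_reconnect_attempts:
                sleep(reconnect_interval)  # fixed delay between attempts
    raise ConnectionError(
        f"gave up after {max_reconnect_attempts} attempts"
    ) from last_error
```

Passing `sleep` as a parameter keeps the loop testable without real delays. With the defaults shown above, a client following this pattern would stop trying after roughly 27 seconds of waiting.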

Queue-Based Consumption

# Use the async queue instead of callbacks
while True:
    result = await client.results.get()
    if result.type == "result":
        print(result.recognized_text)
    elif result.type == "error":
        print(result.error)

# Upload from bytes or file-like object
await client.upload_bytes(data, "scan.png")
await client.upload_stream(file_obj, "scan.pdf")
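A queue consumer usually needs a stopping condition. The sketch below reads from a queue until every uploaded file has produced an outcome. It assumes that `client.results` behaves like an `asyncio.Queue` and that result objects expose `file_id`, `type`, `recognized_text`, and `error` attributes matching the JSON field names. The helper `collect_results` is hypothetical and not part of the SDK.

```python
import asyncio


async def collect_results(results: asyncio.Queue, expected_ids: set, timeout: float = 60.0) -> dict:
    """Read messages from the queue until every expected file_id has an outcome."""
    pending = set(expected_ids)
    outcomes = {}
    while pending:
        # Raises asyncio.TimeoutError if nothing arrives within `timeout` seconds.
        msg = await asyncio.wait_for(results.get(), timeout)
        file_id = getattr(msg, "file_id", None)
        if file_id not in pending:
            continue  # e.g. a "connected" message, or a file we are not tracking
        pending.discard(file_id)
        if msg.type == "result":
            outcomes[file_id] = ("ok", msg.recognized_text)
        else:  # "error" or "unknown_type"
            outcomes[file_id] = ("failed", msg.error)
    return outcomes
```

Called as `await collect_results(client.results, {id1, id2})`, this would return once both files have either succeeded or failed, instead of looping forever.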

API Reference

REST and WebSocket endpoints for the OCR service.

WS /ws

WebSocket endpoint. When a client connects, the server sends a session token. Reconnect with ?token=<token> to resume the session and receive pending results.

Query Parameters:
  • token - Existing token for reconnection (optional)
Server Messages:
  • {"type":"connected","token":"..."} - Sent on connection
  • {"type":"result","file_id":"...","recognized_text":"..."} - OCR result
  • {"type":"error","file_id":"...","error":"..."} - Processing error
  • {"type":"unknown_type","file_id":"...","error":"..."} - Unsupported format
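For clients that talk to the endpoint directly instead of through an SDK, the connection URL with the optional token parameter can be built as follows. This is a sketch under assumptions: the helper name `websocket_url` is invented, an https server URL is assumed to map to the wss scheme, and the endpoint is assumed to live at the server root.

```python
from typing import Optional
from urllib.parse import urlencode, urlsplit, urlunsplit


def websocket_url(server_url: str = "https://ocr.dasolution.sk", token: Optional[str] = None) -> str:
    """Build the /ws URL, adding ?token=... when resuming a session."""
    parts = urlsplit(server_url)
    # Assumption: https maps to wss and http maps to ws.
    scheme = "wss" if parts.scheme == "https" else "ws"
    # urlencode escapes the token so it is safe inside a query string.
    query = urlencode({"token": token}) if token else ""
    return urlunsplit((scheme, parts.netloc, "/ws", query, ""))
```

On the first connection, call it without a token and store the token from the `connected` message. After a disconnect, pass the stored token to resume the session.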
POST /upload

Upload a document for OCR processing. Requires a valid WebSocket token.

Content-Type: multipart/form-data
Form Fields:
  • token - Your WebSocket session token (required)
  • file - The document file (required). Supports JPEG, PNG, TIFF, BMP, WebP, PDF.
Response:
  • {"file_id":"uuid","status":"uploaded"}
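The upload request can be sketched without an SDK using only the Python standard library. The helper names `build_upload_request` and `parse_upload_response` are illustrative. The `application/octet-stream` part type is an assumption, since the source does not say how the server detects the format, and the filename is assumed to contain no double quotes.

```python
import json
import uuid
import urllib.request


def build_upload_request(server_url: str, token: str, filename: str, data: bytes) -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data POST to /upload."""
    boundary = uuid.uuid4().hex
    delimiter = b"--" + boundary.encode()
    body = b"\r\n".join([
        delimiter,
        b'Content-Disposition: form-data; name="token"',
        b"",
        token.encode(),
        delimiter,
        f'Content-Disposition: form-data; name="file"; filename="{filename}"'.encode(),
        b"Content-Type: application/octet-stream",  # assumption, see lead-in
        b"",
        data,
        delimiter + b"--",  # closing delimiter
        b"",
    ])
    return urllib.request.Request(
        server_url.rstrip("/") + "/upload",
        data=body,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


def parse_upload_response(raw: bytes) -> str:
    """Return the file_id from a response shaped like {"file_id": ..., "status": "uploaded"}."""
    payload = json.loads(raw)
    if payload.get("status") != "uploaded":
        raise ValueError(f"unexpected upload status: {payload.get('status')!r}")
    return payload["file_id"]
```

Sending the request is then a matter of `urllib.request.urlopen(req)` and passing the response body to `parse_upload_response`. The returned `file_id` is what later WebSocket messages refer to.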
GET /health

Health check endpoint. Returns server and database status.

GET /sdk/typescript

Download the TypeScript SDK as an npm-installable tarball.

GET /sdk/java

Download the Java SDK as a self-contained JAR with all dependencies.

GET /sdk/python

Download the Python SDK as a pip-installable wheel.

Response Format

WebSocket messages are delivered in real time as JSON.

// Successful OCR result
{
  "type": "result",
  "file_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "recognized_text": "Extracted text content..."
}

// Error response
{
  "type": "error",
  "file_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "error": "recognition failed: timeout"
}
  • type - Message type: connected, result, error, unknown_type
  • file_id - UUID of the uploaded file
  • recognized_text - Extracted text (present on type: result)
  • error - Error description (present on type: error and type: unknown_type)
  • token - Session token (present on type: connected)
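The field rules above can be enforced when decoding raw messages. The following sketch checks that each message carries the field its type requires. The function name `parse_message` is an assumption, and the rule that `unknown_type` messages carry `error` is taken from the Server Messages list in the API reference.

```python
import json

# Field that must be present for each message type.
REQUIRED_FIELD = {
    "connected": "token",
    "result": "recognized_text",
    "error": "error",
    "unknown_type": "error",
}


def parse_message(raw: str) -> dict:
    """Decode one server message and check the field that its type requires."""
    msg = json.loads(raw)
    msg_type = msg.get("type")
    if msg_type not in REQUIRED_FIELD:
        raise ValueError(f"unrecognized message type: {msg_type!r}")
    required = REQUIRED_FIELD[msg_type]
    if required not in msg:
        raise ValueError(f"{msg_type} message is missing {required!r}")
    return msg
```

Rejecting unrecognized types up front means a client fails loudly if the server introduces a message type it does not yet handle.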