🔍 Powered by Tesseract OCR

Extract Text from Any Image or Document

OCR API powered by Tesseract. Extract text from receipts, invoices, scanned PDFs, business cards, and screenshots in 20 languages.

Get API Key →
Request
POST /extract
Content-Type: application/json

{
  "url": "https://example.com/receipt.jpg",
  "language": "eng"
}
Response — 200 OK
{
  "status": "ok",
  "text": "RECEIPT\nDate: 2024-01-15\nItem 1  $12.50\nItem 2  $8.99\nTotal: $21.49",
  "confidence": 94.2,
  "word_count": 12,
  "language": "eng",
  "words": [...]
}
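
The request above could be sent from Python roughly as follows, using only the standard library. This is a sketch under stated assumptions: the base URL `https://api.example.com` and the `X-API-Key` header are hypothetical, since this page does not state the real host or authentication scheme. Substitute the values issued with your API key.

```python
import json
import urllib.request

# Hypothetical values: this page does not give the API host or auth header.
BASE_URL = "https://api.example.com"
API_KEY_HEADER = "X-API-Key"


def build_extract_request(image_url, language="eng", api_key="YOUR_API_KEY"):
    """Build (but do not send) a POST /extract request for an image URL."""
    body = json.dumps({"url": image_url, "language": language}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/extract",
        data=body,
        headers={"Content-Type": "application/json", API_KEY_HEADER: api_key},
        method="POST",
    )


def extract_text(image_url, language="eng", api_key="YOUR_API_KEY", timeout=30):
    """Send the request and return the decoded JSON response as a dict."""
    request = build_extract_request(image_url, language, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `extract_text("https://example.com/receipt.jpg")` would return a dict with the `status`, `text`, `confidence`, `word_count`, `language`, and `words` fields shown in the sample response.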
20 languages supported
URL or Base64 input
Bbox: word-level bounding boxes
Free plans available
Features

Production-grade text extraction

🌐

20 Languages

English, Chinese (Simplified/Traditional), Japanese, Korean, Arabic, Hindi, Russian, German, French, Spanish, and more.

📊

Confidence Score

Every extraction includes a confidence score (0–100) so you know how reliable the result is.
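
One way to use the confidence score is to route low-scoring extractions to manual review. The sketch below relies only on the `status` and `confidence` fields from the sample response; the threshold of 80 is an arbitrary assumption, not a recommendation from this API, and should be tuned against your own documents.

```python
def needs_review(response, min_confidence=80.0):
    """Return True when an extraction should be checked by a person."""
    # "status" and "confidence" are the fields shown in the sample response.
    if response.get("status") != "ok":
        return True
    # The 80.0 default is an assumed cutoff; pick one that fits your data.
    return float(response.get("confidence", 0.0)) < min_confidence


sample = {"status": "ok", "confidence": 94.2, "text": "RECEIPT"}
print(needs_review(sample))        # False: 94.2 is above the 80.0 threshold
print(needs_review(sample, 95.0))  # True: 94.2 is below 95.0
```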

📍

Bounding Boxes

Optional word-level bounding box coordinates for layout analysis and document understanding.
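
As an illustration of working with word boxes, the sketch below looks up the rectangle of every word that matches a search term, for example to highlight or redact it. The shape of each entry in `words` is an assumption: this page elides the array contents, so the keys `text`, `x`, `y`, `width`, and `height` are hypothetical and should be checked against a real response.

```python
def find_word_boxes(words, target):
    """Return (left, top, right, bottom) boxes for words equal to `target`."""
    boxes = []
    for word in words:
        # Assumed keys: text, x, y, width, height (not confirmed by this page).
        if word["text"].lower() == target.lower():
            left, top = word["x"], word["y"]
            boxes.append((left, top, left + word["width"], top + word["height"]))
    return boxes


# Made-up sample entries in the assumed shape.
words = [
    {"text": "Total:", "x": 40, "y": 210, "width": 62, "height": 18},
    {"text": "$21.49", "x": 110, "y": 210, "width": 70, "height": 18},
]
print(find_word_boxes(words, "total:"))  # [(40, 210, 102, 228)]
```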

🔗

URL or Base64

Pass an image URL or base64-encoded image. Supports JPEG, PNG, GIF, and WebP.
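
For base64 input, a local file can be checked and encoded before sending. The sketch below detects the four supported formats from their file signatures and builds a request body. The `image` field name is an assumption, since this page only shows the `url` field; use whatever name the API reference specifies.

```python
import base64


def detect_format(data):
    """Identify JPEG, PNG, GIF, or WebP from the leading bytes, else None."""
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data[:6] in (b"GIF87a", b"GIF89a"):
        return "gif"
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    return None


def build_base64_payload(data, language="eng"):
    """Build a JSON-serializable request body for raw image bytes."""
    if detect_format(data) is None:
        raise ValueError("unsupported format: expected JPEG, PNG, GIF, or WebP")
    # "image" is a hypothetical field name; the page does not document it.
    return {"image": base64.b64encode(data).decode("ascii"), "language": language}
```

A typical call would read the file with `open(path, "rb").read()` and pass the bytes to `build_base64_payload`.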

📄

Document Types

Receipts, invoices, IDs, business cards, scanned PDFs, screenshots, handwritten notes.

Fast Processing

Average response under 2 seconds for standard documents. Larger images may take longer.

20 languages supported

🇬🇧 English (eng)
🇨🇳 Chinese Simplified
🇹🇼 Chinese Traditional
🇯🇵 Japanese (jpn)
🇰🇷 Korean (kor)
🇸🇦 Arabic (ara)
🇮🇳 Hindi (hin)
🇷🇺 Russian (rus)
🇩🇪 German (deu)
🇫🇷 French (fra)
🇪🇸 Spanish (spa)
🇵🇹 Portuguese (por)
🇮🇹 Italian (ita)
🇳🇱 Dutch (nld)
🇵🇱 Polish (pol)
🇸🇪 Swedish (swe)
🇳🇴 Norwegian (nor)
🇩🇰 Danish (dan)
🇫🇮 Finnish (fin)
🇹🇷 Turkish (tur)
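
A client can validate the `language` value before sending a request. The sketch below uses only the codes printed in the list above; the two Chinese variants are left out because this page does not show their codes, so add them once you have confirmed the exact values.

```python
# Codes copied from the list above. Chinese Simplified and Traditional are
# omitted because the page does not print their codes.
SUPPORTED_LANGUAGES = {
    "eng": "English", "jpn": "Japanese", "kor": "Korean", "ara": "Arabic",
    "hin": "Hindi", "rus": "Russian", "deu": "German", "fra": "French",
    "spa": "Spanish", "por": "Portuguese", "ita": "Italian", "nld": "Dutch",
    "pol": "Polish", "swe": "Swedish", "nor": "Norwegian", "dan": "Danish",
    "fin": "Finnish", "tur": "Turkish",
}


def validate_language(code):
    """Return the normalized code, or raise ValueError if it is not listed."""
    normalized = code.strip().lower()
    if normalized not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unknown language code: {code!r}")
    return normalized


print(validate_language(" ENG "))  # eng
```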

Simple, transparent pricing

Free
$0/mo
  • 1,000 requests / month
  • All API endpoints
  • 1 API key
  • Priority support
Get Free Key
Basic
$9/mo
  • 10,000 requests / month
  • All API endpoints
  • 1 API key
  • Email support
Subscribe $9/mo
Ultra
$79/mo
  • 200,000 requests / month
  • All API endpoints
  • 1 API key
  • Priority support + SLA
Subscribe $79/mo
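
To estimate which plan fits a given workload, the quotas above can be compared against expected monthly volume. The sketch below assumes quotas are hard limits with no overage billing, which this page does not confirm.

```python
# (name, price in USD per month, requests per month), from the table above.
PLANS = [
    ("Free", 0, 1_000),
    ("Basic", 9, 10_000),
    ("Ultra", 79, 200_000),
]


def cheapest_plan(monthly_requests):
    """Return (name, price) of the cheapest plan covering the volume, or None."""
    # PLANS is ordered by price, so the first match is the cheapest.
    for name, price, quota in PLANS:
        if monthly_requests <= quota:
            return name, price
    return None  # Volume exceeds every listed plan.


print(cheapest_plan(800))     # ('Free', 0)
print(cheapest_plan(25_000))  # ('Ultra', 79)
```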
FAQ

Common questions

What image formats are supported?
JPEG, PNG, GIF, and WebP. You can pass either a public image URL or a base64-encoded image. The maximum file size is 10 MB.

How accurate is the OCR?
Accuracy depends on image quality. Clear, high-resolution images reach 90-98% accuracy; blurry, low-contrast, or handwritten text scores lower. The confidence score returned with each extraction indicates how reliable the result is.

What are bounding boxes used for?
Word-level bounding boxes give you the x and y coordinates and the dimensions of each detected word. They are useful for document layout analysis, redaction, or highlighting specific words in the original image.

Can it read handwritten text?
Tesseract can read some handwritten text, but accuracy is lower than for printed text. Clear, consistent handwriting works better than cursive or irregular styles.

Start extracting text today

Free plan available: 1,000 requests per month, no credit card required.

Get API Key →