Image to Text OCR
Extract text from photos, screenshots, and scanned documents in your browser. Free, unlimited, and completely private — your images never leave your device.
100% Private
Images never leave your device
8 Languages
English, Spanish, French, German, Simplified Chinese, Japanese, Hindi, Arabic
Browser OCR
tesseract.js WebAssembly engine
Supported image formats: JPG, PNG, WebP, BMP, and HEIC, up to 100 MB each. Pick the language first for the most accurate results.
About Image to Text OCR
Image to text conversion, also known as optical character recognition (OCR), turns pixels of printed text into selectable, copyable characters. It is a fast way to pull the numbers off a paper receipt for expense reports, digitise the contact details on a stack of business cards, lift quotes out of a textbook photo for study notes, transcribe a whiteboard snapshot after a meeting, or recover the text from a scanned PDF that has no text layer (once its pages have been converted to images). FormatFuse runs the whole pipeline on your device, so even receipts with personal addresses, medical notes, or confidential contracts stay private.
Most free OCR tools upload your images to a server, process them in the cloud, and hand you back the extracted text. That model is convenient, but it means every photo you run through the tool, including anything personal or sensitive, sits on someone else's infrastructure. FormatFuse is built on tesseract.js, a JavaScript library that wraps a WebAssembly port of the Tesseract OCR engine and runs entirely inside your browser tab. The only network activity is a one-time download of the OCR engine and the language model for each language you pick; after that, recognition happens offline and nothing leaves your device.
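For readers curious what in-browser recognition looks like in code, here is a minimal sketch. It is not FormatFuse's actual source. The `OcrWorker` and `WorkerFactory` types and the `extractText` helper are names made up for this example; they describe only the two worker methods the sketch needs, `recognize` and `terminate`, which tesseract.js workers provide. The factory is passed in as a parameter, on the assumption that with tesseract.js v5 or later you would hand it the library's `createWorker` function.

```typescript
// Minimal shape of the worker this sketch relies on. The generic I stands for
// whatever image input the worker accepts (a File, a Blob, a URL string, ...).
interface OcrWorker<I> {
  recognize(image: I): Promise<{ data: { text: string } }>;
  terminate(): Promise<unknown>;
}

// Hypothetical factory type. Assumption: with tesseract.js v5+ this would be
// `createWorker` imported from "tesseract.js".
type WorkerFactory<I> = (lang: string) => Promise<OcrWorker<I>>;

// Recognise one image in one language and always release the worker,
// even if recognition throws.
async function extractText<I>(
  image: I,
  lang: string,
  createWorker: WorkerFactory<I>,
): Promise<string> {
  const worker = await createWorker(lang);
  try {
    const result = await worker.recognize(image);
    return result.data.text.trim();
  } finally {
    await worker.terminate();
  }
}
```

Passing the factory in keeps the sketch independent of any one library version and makes it easy to try with a stand-in worker. The image itself is only ever handed to the local worker, which is the property the paragraph above describes.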
FormatFuse supports eight widely used languages out of the box — English, Spanish, French, German, Simplified Chinese, Japanese, Hindi, and Arabic — and you can switch languages in a dropdown before running OCR. Accuracy is strongest on clear, high-contrast printed text at 300 DPI or better: book pages, typed documents, receipts, menus, and screenshots. Results on handwriting, blurry photos, low light, curved surfaces, and stylised fonts will be hit and miss — Tesseract is a general-purpose printed-text engine, not a handwriting model. For best results, crop to the text region, hold the camera square to the page, and take the photo in even lighting.
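Under the hood, Tesseract identifies each language by a short code that matches its training data file. The sketch below maps the eight languages named above to Tesseract's standard codes. The codes themselves are the ones Tesseract uses; the `LANGUAGE_CODES` table and the helper names are made up for this example, and whether FormatFuse's dropdown is wired up this way is an assumption.

```typescript
// The eight languages listed above, keyed by display name, mapped to the
// language codes Tesseract uses for its traineddata files.
const LANGUAGE_CODES = {
  English: "eng",
  Spanish: "spa",
  French: "fra",
  German: "deu",
  "Simplified Chinese": "chi_sim",
  Japanese: "jpn",
  Hindi: "hin",
  Arabic: "ara",
} as const;

type LanguageName = keyof typeof LANGUAGE_CODES;

// Type guard: true only for the eight display names in the table.
function isSupportedLanguage(name: string): name is LanguageName {
  return Object.prototype.hasOwnProperty.call(LANGUAGE_CODES, name);
}

// Look up the Tesseract code for a display name, rejecting anything else.
function toTesseractCode(name: string): string {
  if (!isSupportedLanguage(name)) {
    throw new Error(`Unsupported language: ${name}`);
  }
  return LANGUAGE_CODES[name];
}
```

Rejecting unknown names up front avoids asking the engine for training data that was never shipped.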
Image to Text OCR — Frequently Asked Questions
Are my images uploaded to a server?
No. All OCR processing happens inside your browser using WebAssembly. The only network request is a one-time download of the OCR engine and the language model for the language you pick — both are served from FormatFuse and cached by your browser afterwards. You can verify this in your browser's Network tab: after the first load, running OCR on a new image produces zero outbound traffic.
Can it read handwriting?
Not reliably. Tesseract is trained on printed text, so neat block capitals sometimes work, but cursive, rushed notes, or stylised handwriting will produce poor results. For handwritten content, specialist models perform much better than general-purpose OCR. Treat any handwritten output as a rough draft that needs manual correction.
What image quality do I need for good results?
Aim for 300 DPI or better with sharp focus, even lighting, and good contrast between the text and background. Avoid skewed angles, shadows, glare, and heavy JPG compression. If your text is small in the frame, crop to the text area before uploading. Screenshots and scanned pages typically work better than phone photos of documents.
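To make "good contrast" concrete, here is a small sketch of two common preprocessing steps: converting RGBA pixels to grayscale and stretching the result so the darkest pixel becomes black and the lightest becomes white. This is an illustration only. It is not a description of what FormatFuse or Tesseract does internally, and the function names are made up for this example. The input layout (four bytes per pixel, red, green, blue, alpha) is the one used by a canvas `ImageData.data` buffer.

```typescript
// Convert RGBA pixels to one luminance byte per pixel using the Rec. 601
// luma weights. The alpha channel is ignored.
function toGrayscale(rgba: Uint8ClampedArray): Uint8ClampedArray {
  const gray = new Uint8ClampedArray(rgba.length / 4);
  for (let i = 0; i < gray.length; i++) {
    const r = rgba[i * 4];
    const g = rgba[i * 4 + 1];
    const b = rgba[i * 4 + 2];
    // Uint8ClampedArray rounds and clamps the value to 0..255 on assignment.
    gray[i] = 0.299 * r + 0.587 * g + 0.114 * b;
  }
  return gray;
}

// Linearly rescale gray values so the observed minimum maps to 0 and the
// observed maximum maps to 255. A flat image is returned as an unchanged copy.
function stretchContrast(gray: Uint8ClampedArray): Uint8ClampedArray {
  let min = 255;
  let max = 0;
  for (let i = 0; i < gray.length; i++) {
    if (gray[i] < min) min = gray[i];
    if (gray[i] > max) max = gray[i];
  }
  if (max <= min) return gray.slice();
  const out = new Uint8ClampedArray(gray.length);
  const scale = 255 / (max - min);
  for (let i = 0; i < gray.length; i++) {
    out[i] = (gray[i] - min) * scale;
  }
  return out;
}
```

A washed-out photo whose pixels all sit between mid-gray and light gray ends up using the full black-to-white range after stretching, which is closer to the clean scans that printed-text OCR handles best.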
Which languages work best?
English gives the most consistent results because it has the largest training corpus. Other Latin-script languages (Spanish, French, German) are also strong. Chinese, Japanese, Hindi, and Arabic work well on clear printed text but are more sensitive to resolution and noise. If your document mixes languages, pick the one that appears most often — multi-language OCR in a single pass is not supported in this tool.
Why is the first run slow?
The first time you run OCR, your browser downloads the Tesseract engine (roughly 3 MB) and the training data for the language you picked (2–15 MB depending on language). Both are cached by your browser, so later runs in that language skip the download and start much faster. Switching to a new language triggers one more download for that language's data, which is then cached too.
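The "download once per language" behaviour can be sketched as a small cache keyed by language. This is an in-memory stand-in written for illustration, not the browser cache the answer above refers to and not FormatFuse's code. The `Downloader` type and `createLanguageCache` function are made-up names, and the downloader is passed in so the sketch makes no assumption about where the data comes from.

```typescript
// Hypothetical downloader signature: resolves with the raw language data.
type Downloader = (lang: string) => Promise<Uint8Array>;

// Returns a getter that downloads each language at most once. The promise
// itself is cached, so callers that ask for the same language while a
// download is still in flight share that single download.
function createLanguageCache(
  download: Downloader,
): (lang: string) => Promise<Uint8Array> {
  const cache = new Map<string, Promise<Uint8Array>>();
  return (lang) => {
    let pending = cache.get(lang);
    if (pending === undefined) {
      pending = download(lang);
      // Forget failed downloads so a later call can retry.
      pending.catch(() => cache.delete(lang));
      cache.set(lang, pending);
    }
    return pending;
  };
}
```

Asking for English twice triggers one download; asking for Japanese afterwards triggers exactly one more, which mirrors the first-run cost described above.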
Which image formats can I use?
JPG, PNG, WebP, BMP, and HEIC are all supported. HEIC files from iPhones are decoded in the browser before OCR runs. For scanned PDFs, convert the pages to images first with our PDF to JPG or PDF to PNG tools, then run OCR on the images.
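A simple pre-flight check for the formats listed above might look like the sketch below. It is an illustration under stated assumptions, not FormatFuse's actual validation: the function and type names are made up, `.jpeg` is assumed to be accepted alongside `.jpg`, the page's 100 MB per-file limit is read as 100 × 1024 × 1024 bytes, and the check looks only at the file extension rather than the file contents.

```typescript
// Extensions for the formats named above. Assumption: ".jpeg" counts as JPG.
const SUPPORTED_EXTENSIONS = ["jpg", "jpeg", "png", "webp", "bmp", "heic"];

// Assumption: the 100 MB per-file limit is interpreted as 100 MiB.
const MAX_BYTES = 100 * 1024 * 1024;

type FileVerdict =
  | { ok: true; needsDecoding: boolean }
  | { ok: false; reason: string };

// Decide from the name and size alone whether a file can be queued for OCR.
// HEIC files are flagged because they must be decoded before recognition.
function checkImageFile(name: string, sizeBytes: number): FileVerdict {
  const dot = name.lastIndexOf(".");
  const ext = dot === -1 ? "" : name.slice(dot + 1).toLowerCase();
  if (SUPPORTED_EXTENSIONS.indexOf(ext) === -1) {
    return { ok: false, reason: `Unsupported format: ${ext || "no extension"}` };
  }
  if (sizeBytes > MAX_BYTES) {
    return { ok: false, reason: "File is larger than 100 MB" };
  }
  return { ok: true, needsDecoding: ext === "heic" };
}
```

A PDF is rejected by this check, which matches the advice above to convert scanned PDF pages to images first.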