Build an OCR Document Scanner for searchable PDFs

In this tutorial, you will learn how to build a web app for scanning documents and post-processing them with OCR (optical character recognition) technology to create searchable PDFs. To achieve this, all we need is a few lines of HTML and JavaScript code – and the Scanbot Web Document Scanner SDK.

Our single-page app will display a “Scan document” button that opens the document scanner interface. Users can then scan one or several pages of a document. Tapping “Submit” triggers the OCR post-processing and returns the user to the initial page. As soon as the searchable PDF has been created, it is downloaded to the device.

Scanning and performing OCR on a document and exporting it as a searchable PDF

To accomplish all this, we only need to follow three simple steps:

1. Setting up the HTML page and including the Web SDK
2. Implementing the Document Scanner
3. Enabling OCR and configuring the PDF export

Let’s get started!

Want to see the final code right away? Here it is:

index.html:

<!DOCTYPE html>
<html lang="en">

<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=0" />
    <title>OCR Document Scanner</title>
</head>

<body style="margin: 0">
    <button id="scan-document">Scan document</button>
    <script type="module">
        import "https://cdn.jsdelivr.net/npm/scanbot-web-sdk@7.1.0/bundle/ScanbotSDK.ui2.min.js";
        const sdk = await ScanbotSDK.initialize({
            enginePath:
                "https://cdn.jsdelivr.net/npm/scanbot-web-sdk@7.1.0/bundle/bin/complete/"
        });
        document
            .getElementById("scan-document")
            .addEventListener("click", async () => {
                const config = new ScanbotSDK.UI.Config.DocumentScanningFlow();
                const scanResult = await ScanbotSDK.UI.createDocumentScanner(config);
                const pages = scanResult?.document?.pages;

                if (!pages || !pages.length) {
                    return;
                }

                const options = { pageSize: "A4", pageDirection: "PORTRAIT", pageFit: "FIT_IN", dpi: 72, jpegQuality: 80, runOcr: true };
                const bytes = await scanResult?.document?.createPdf(options);

                function saveBytes(data, name) {
                    const extension = name.split(".")[1];
                    const a = document.createElement("a");
                    document.body.appendChild(a);
                    a.style = "display: none";
                    const blob = new Blob([data], { type: `application/${extension}` });
                    const url = window.URL.createObjectURL(blob);
                    a.href = url;
                    a.download = name;
                    a.click();
                    window.URL.revokeObjectURL(url);
                }

                saveBytes(bytes, "document-scan.pdf");
            });
    </script>
</body>

</html>

Requirements

The Scanbot Web SDK is optimized for mobile devices. Although you can test the document scanner app’s basic functionality using your laptop camera or a webcam, we recommend you run it on a smartphone.

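Browsers only grant camera access to pages served from a secure context (HTTPS or localhost), so to test the app on your phone, you need to serve it over HTTPS. For this, you can use a service like ngrok, which creates a tunnel to one of their SSL-certified domains. Their Quick Start guide will help you get up and running quickly.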
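
If the scanner does not open on your phone, a quick check of the environment can save some guessing. The following is a minimal sketch and not part of the Scanbot SDK: the helper name cameraAvailable is our own, and it takes the global object as a parameter so that it can also be tried outside a browser. It relies only on the standard isSecureContext flag and navigator.mediaDevices.getUserMedia.

```javascript
// Hypothetical helper (not part of the Scanbot SDK): reports whether the
// given global object (normally `window`) is able to request camera access.
function cameraAvailable(globalObj) {
    // Browsers expose camera access only in secure contexts (HTTPS or localhost).
    if (!globalObj || globalObj.isSecureContext !== true) {
        return false;
    }
    const mediaDevices = globalObj.navigator && globalObj.navigator.mediaDevices;
    return Boolean(mediaDevices && typeof mediaDevices.getUserMedia === "function");
}

// In the browser, you might call it like this:
// if (!cameraAvailable(window)) { /* tell the user to open the HTTPS URL */ }
```

A result of false does not say why the camera is unavailable, only that the page should not expect the scanner to get a camera stream.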

Step 1: Set up the HTML page and include the Web SDK

First, create an index.html with the following code:

<!DOCTYPE html>
<html lang="en">

<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=0" />
    <title>OCR Document Scanner</title>
</head>

<body style="margin: 0">
    <button id="scan-document">Scan document</button>
    <script type="module">
        import "https://cdn.jsdelivr.net/npm/scanbot-web-sdk@7.1.0/bundle/ScanbotSDK.ui2.min.js";
        const sdk = await ScanbotSDK.initialize({
            enginePath:
                "https://cdn.jsdelivr.net/npm/scanbot-web-sdk@7.1.0/bundle/bin/complete/"
        });
    </script>
</body>

</html>

This sets up an HTML5 document with a viewport meta tag and a button that we’ll use to start the document scanner in the next step. The code also loads the Scanbot Web SDK from a CDN (content delivery network) and initializes it.

Please note that using a CDN is only recommended for prototyping. In your production environment, please download the Web SDK directly (or install it via npm) and include its files in your project.

💡 Without a license key, our SDK only runs for 60 seconds per session. This is more than enough for the purposes of our tutorial, but if you like, you can generate a license key on our website.
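
If you do have a license key, or later host the SDK files yourself, the object passed to ScanbotSDK.initialize() is the place to change. Below is a small sketch of how that object could be assembled. The helper buildInitOptions is our own invention, and the licenseKey option name is an assumption that you should verify against the SDK’s API reference for your version.

```javascript
// CDN engine path used in this tutorial.
const CDN_ENGINE_PATH =
    "https://cdn.jsdelivr.net/npm/scanbot-web-sdk@7.1.0/bundle/bin/complete/";

// Hypothetical helper: builds the object passed to ScanbotSDK.initialize().
// The `licenseKey` option name is an assumption; check the API reference.
function buildInitOptions(licenseKey, enginePath = CDN_ENGINE_PATH) {
    const options = { enginePath };
    // Without a key, the SDK runs for 60 seconds per session, so we simply
    // leave the property out when no usable key is given.
    if (typeof licenseKey === "string" && licenseKey.trim() !== "") {
        options.licenseKey = licenseKey.trim();
    }
    return options;
}

// Usage in the page (MY_LICENSE_KEY is a placeholder):
// const sdk = await ScanbotSDK.initialize(buildInitOptions(MY_LICENSE_KEY));
```

For a self-hosted setup, you would pass the path to your own copy of the engine files as the second argument.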

Step 2: Implement the Document Scanner

Next, we need to attach an event listener to the button. When the button is clicked, a new instance of the SDK’s configuration object will be created and the createDocumentScanner method will launch the document scanner.

We’ll store the result of the scanning process in a scanResult object and ensure that if no pages are scanned (e.g., when the user cancels the operation), the function exits early.

The result will look like this:

<!DOCTYPE html>
<html lang="en">

<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=0" />
    <title>OCR Document Scanner</title>
</head>

<body style="margin: 0">
    <button id="scan-document">Scan document</button>
    <script type="module">
        import "https://cdn.jsdelivr.net/npm/scanbot-web-sdk@7.1.0/bundle/ScanbotSDK.ui2.min.js";
        const sdk = await ScanbotSDK.initialize({
            enginePath:
                "https://cdn.jsdelivr.net/npm/scanbot-web-sdk@7.1.0/bundle/bin/complete/"
        });
        document
            .getElementById("scan-document")
            .addEventListener("click", async () => {
                const config = new ScanbotSDK.UI.Config.DocumentScanningFlow();
                const scanResult = await ScanbotSDK.UI.createDocumentScanner(config);
                const pages = scanResult?.document?.pages;

                if (!pages || !pages.length) {
                    return;
                }
            });
    </script>
</body>

</html>
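
The early-exit guard is easy to try in isolation. The sketch below pulls it into a hypothetical helper, getScannedPages, and feeds it plain objects in place of a real scan result. The mock shapes are assumptions based only on the properties this tutorial reads (document.pages); the real result object contains more than that.

```javascript
// Hypothetical helper: returns the scanned pages as an array, or an empty
// array if the user cancelled or no page was captured.
function getScannedPages(scanResult) {
    const pages = scanResult?.document?.pages;
    return pages && pages.length ? Array.from(pages) : [];
}

// Mock results standing in for what createDocumentScanner() might resolve to.
const cancelled = null;
const emptyScan = { document: { pages: [] } };
const twoPages = { document: { pages: [{}, {}] } };

console.log(getScannedPages(cancelled).length); // 0
console.log(getScannedPages(emptyScan).length); // 0
console.log(getScannedPages(twoPages).length);  // 2
```

Inside the click handler, the same check could then read: if (getScannedPages(scanResult).length === 0) return;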

Feel free to run the app to see if everything is working so far. At the moment, the user cannot do anything with the scanned pages, but we’ll change that in the next step.

Step 3: Enable OCR and configure the PDF export

First, let’s define how the PDF should be generated. You can adjust this as you see fit.

const options = { pageSize: "A4", pageDirection: "PORTRAIT", pageFit: "FIT_IN", dpi: 72, jpegQuality: 80, runOcr: true };

You can find more information about each parameter in the corresponding section of the SDK’s API reference. The last parameter, runOcr, is especially important for our use case: it enables OCR post-processing, which is what makes the resulting PDF searchable.
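
If you want to export with different settings in different places, one option is to keep the values above as defaults and override only what changes. This is just a sketch: makePdfOptions and DEFAULT_PDF_OPTIONS are our own names, and the dpi value of 150 is purely illustrative.

```javascript
// Defaults copied from the tutorial's options object.
const DEFAULT_PDF_OPTIONS = Object.freeze({
    pageSize: "A4",
    pageDirection: "PORTRAIT",
    pageFit: "FIT_IN",
    dpi: 72,
    jpegQuality: 80,
    runOcr: true,
});

// Hypothetical helper: returns a new options object in which the caller's
// overrides take precedence over the defaults.
function makePdfOptions(overrides = {}) {
    return { ...DEFAULT_PDF_OPTIONS, ...overrides };
}

const searchable = makePdfOptions();                 // same values as in the tutorial
const withoutOcr = makePdfOptions({ runOcr: false }); // skips the OCR step
const sharper = makePdfOptions({ dpi: 150 });         // illustrative value only
```

Because the defaults are frozen and each call returns a fresh object, changing one export’s settings cannot accidentally affect another.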

Next, we’ll use the SDK’s built-in createPdf() method to generate the PDF file.

const bytes = await scanResult?.document?.createPdf(options);

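Finally, the following helper function wraps the PDF data in a Blob, creates a temporary download link for it, and clicks that link to trigger the download: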

function saveBytes(data, name) {
    const extension = name.split(".")[1];
    const a = document.createElement("a");
    document.body.appendChild(a);
    a.style = "display: none";
    const blob = new Blob([data], { type: `application/${extension}` });
    const url = window.URL.createObjectURL(blob);
    a.href = url;
    a.download = name;
    a.click();
    window.URL.revokeObjectURL(url);
}

saveBytes(bytes, "document-scan.pdf");
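
One detail of saveBytes is worth knowing if you reuse it with other file names: name.split(".")[1] takes the text after the first dot, so a name like scan.2024.pdf would produce the type application/2024. A possible variant, sketched here under our own helper name mimeTypeFor, looks at the last dot instead and falls back to a generic binary type when there is no extension.

```javascript
// Hypothetical helper: derives a MIME type from the last extension of a
// file name, following the tutorial's `application/<extension>` pattern.
function mimeTypeFor(name) {
    const dot = name.lastIndexOf(".");
    // No dot, a leading dot only, or a trailing dot: no usable extension.
    if (dot <= 0 || dot === name.length - 1) {
        return "application/octet-stream";
    }
    return `application/${name.slice(dot + 1).toLowerCase()}`;
}

console.log(mimeTypeFor("document-scan.pdf")); // application/pdf
console.log(mimeTypeFor("scan.2024.pdf"));     // application/pdf
console.log(mimeTypeFor("README"));            // application/octet-stream
```

In saveBytes, the Blob could then be created with { type: mimeTypeFor(name) }. For the fixed file name used in this tutorial, both versions behave the same.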

Your final index.html will look something like this:

<!DOCTYPE html>
<html lang="en">

<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=0" />
    <title>OCR Document Scanner</title>
</head>

<body style="margin: 0">
    <button id="scan-document">Scan document</button>
    <script type="module">
        import "https://cdn.jsdelivr.net/npm/scanbot-web-sdk@7.1.0/bundle/ScanbotSDK.ui2.min.js";
        const sdk = await ScanbotSDK.initialize({
            enginePath:
                "https://cdn.jsdelivr.net/npm/scanbot-web-sdk@7.1.0/bundle/bin/complete/"
        });
        document
            .getElementById("scan-document")
            .addEventListener("click", async () => {
                const config = new ScanbotSDK.UI.Config.DocumentScanningFlow();
                const scanResult = await ScanbotSDK.UI.createDocumentScanner(config);
                const pages = scanResult?.document?.pages;

                if (!pages || !pages.length) {
                    return;
                }

                const options = { pageSize: "A4", pageDirection: "PORTRAIT", pageFit: "FIT_IN", dpi: 72, jpegQuality: 80, runOcr: true };
                const bytes = await scanResult?.document?.createPdf(options);

                function saveBytes(data, name) {
                    const extension = name.split(".")[1];
                    const a = document.createElement("a");
                    document.body.appendChild(a);
                    a.style = "display: none";
                    const blob = new Blob([data], { type: `application/${extension}` });
                    const url = window.URL.createObjectURL(blob);
                    a.href = url;
                    a.download = name;
                    a.click();
                    window.URL.revokeObjectURL(url);
                }

                saveBytes(bytes, "document-scan.pdf");
            });
    </script>
</body>

</html>

Now run your app to test scanning a single- or multi-page document and exporting it as a searchable PDF.

Conclusion

🎉 Congratulations! You can now scan documents from your browser, automatically perform OCR on them, and export them as searchable PDFs!

If this tutorial has piqued your interest in integrating document scanning functionalities into your web app or website, make sure to take a look at our SDK’s other neat features in our documentation – or run our example project for a more hands-on experience.

Should you have questions about this tutorial or run into any issues, we’re happy to help! Just shoot us an email via tutorial-support@scanbot.io.

Happy scanning! 🤳
