Linux Document Scanner

Integrate the Scanbot Linux Document Scanner SDK into your C, Java, Python, and Node.js applications. Deploy enterprise-grade document scanning on your servers, embedded systems, and edge devices running Debian, Ubuntu, Raspberry Pi OS, or other Linux distributions.

Trusted by 400+ global industry leaders

Document scanning for server environments

Use the Scanbot Linux SDK to build a server-based document scanner and extract documents from live camera streams and static image files.

The SDK automatically deskews and crops the resulting scans and applies quality-enhancing image filters, such as binarization.

Linux Document Scanner SDK for server environments and embedded devices

On-device intelligence

100% offline – no servers, no tracking, complete data security.

Extract high-quality document scans from any source

Hosted on your server, the Linux Document Scanner SDK accepts input from email attachments, mobile uploads, and even device cameras, and outputs clean scans as PDF, JPEG, or TIFF files.

The built-in Document Quality Analyzer rates the image quality of each incoming document, flagging any that might not be suitable for further processing. By only accepting high-quality input, you can optimize scans for machine-readability and minimize manual work.

High-quality scans

Let your users generate high-quality scans suitable for automated processing.

Integrate an enterprise-grade Linux Document Scanner with C, Java, Python, and Node.js

The Scanbot Linux Document Scanner SDK provides powerful, easy-to-integrate, and easy-to-use high-level APIs for analyzing and processing document images. It runs on all Linux distributions with glibc version 2.27 or higher, including Debian, Ubuntu, and Raspberry Pi OS.

The SDK’s core API is written in pure C, with convenient wrappers for Java, Python, and Node.js. This makes it easy to integrate into a wide variety of environments.

On devices with CUDA and TensorRT support, such as the NVIDIA Jetson, you can make the most of the Document Scanner SDK’s capabilities by enabling GPU acceleration.

Technical requirements

The Scanbot Linux Document Scanner SDK supports the following platforms:

Linux systems with glibc ≥ 2.27 (e.g., Ubuntu ≥ 18.04, Debian ≥ 11, Raspberry Pi OS 64-bit)

NVIDIA Jetson devices

- CPU only: all Jetson devices

- GPU acceleration: Jetson AGX Orin Series, Jetson Orin NX Series, Jetson Orin Nano Series

  - JetPack 6.1

  - CUDA 12.6

  - TensorRT 10.3

Raspberry Pi devices

- Pi 4 series supported, but not recommended

- Pi 5 series recommended
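
The glibc requirement can be checked from Python before installing the SDK. The sketch below uses the standard library's platform.libc_ver(); the helper names parse_version and glibc_is_supported are illustrative and not part of the Scanbot SDK.

```python
import platform

MIN_GLIBC = (2, 27)  # minimum glibc version stated in the requirements above


def parse_version(text: str) -> tuple:
    """Turn a dotted version string such as '2.31' into a tuple of ints."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def glibc_is_supported(version: str, minimum: tuple = MIN_GLIBC) -> bool:
    """Compare numerically, so that '2.9' is correctly older than '2.27'."""
    return parse_version(version) >= minimum


if __name__ == "__main__":
    lib, version = platform.libc_ver()
    if lib != "glibc":
        print(f"No glibc detected (found {lib or 'nothing'})")
    else:
        print(f"glibc {version}: supported = {glibc_is_supported(version)}")
```

Comparing tuples of integers rather than strings matters here: as strings, "2.9" would sort after "2.27".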


Quick start guide: Scanning documents from a live camera stream with Python

Here’s how you can scan documents from a live camera stream with Python. This example uses OpenCV for camera capture and display and the Scanbot Linux SDK for real-time document detection. The entry point takes a camera device string as input, and the Scanbot SDK license key is embedded directly in the source code. The implementation is split into several small, reusable components: a camera helper, a document scanner, a scanning loop, an optional live preview window, and a function that ties them together.

Head over to the Linux Document Scanner SDK documentation for a quick start guide on how to scan documents from still images.

Requirements

Make sure you have these packages installed:

pip install scanbotsdk opencv-python numpy

Your system must also provide:

- GStreamer plugins if using jetson_csi or libcamera pipelines

- A camera device (USB webcam, Jetson CSI, or libcamera-compatible camera)

Optional: Install CUDA + TensorRT

GPU acceleration requires JetPack 6.1, CUDA 12.6, and TensorRT 10.3.

sudo apt install -y nvidia-l4t-cuda libnvinfer10 libnvinfer-plugin10 libnvonnxparsers10
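
Before enabling GPU acceleration, it can help to confirm that the CUDA and TensorRT runtime libraries are visible to the dynamic linker. This is a minimal sketch using ctypes.util.find_library from the standard library. The short library names cudart and nvinfer are assumptions about how these runtimes are typically named, and the helper names are illustrative rather than part of the Scanbot SDK.

```python
from ctypes.util import find_library

# Assumed short names of the CUDA runtime and TensorRT shared libraries.
GPU_LIBRARIES = ("cudart", "nvinfer")


def has_shared_library(name: str) -> bool:
    """Return True if the dynamic linker can locate lib<name>."""
    return find_library(name) is not None


def missing_gpu_libraries(names=GPU_LIBRARIES) -> list:
    """List the libraries from `names` that could not be found."""
    return [name for name in names if not has_shared_library(name)]


if __name__ == "__main__":
    missing = missing_gpu_libraries()
    print("GPU libraries found" if not missing else f"Missing: {', '.join(missing)}")
```

A missing library here only indicates that the linker cannot see it; it does not by itself prove that GPU acceleration will fail or succeed.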

Camera helper

import cv2 as cv

# Default camera settings (resolution and framerate)
WIDTH, HEIGHT, FPS = 1280, 720, 30

def _gst_pipeline(device: str) -> str:
    if device == "jetson_csi":
        return (
            "nvarguscamerasrc ! "
            f"video/x-raw(memory:NVMM), width=(int){WIDTH}, height=(int){HEIGHT}, "
            f"framerate=(fraction){FPS}/1 ! "
            "nvvidconv flip-method=0 ! "
            "video/x-raw, format=(string)BGRx ! "
            "videoconvert ! video/x-raw, format=(string)BGR ! appsink drop=1"
        )
    if device == "libcamera":
        return (
            "libcamerasrc ! video/x-raw, format=BGR, "
            f"width=(int){WIDTH}, height=(int){HEIGHT}, "
            f"framerate=(fraction){FPS}/1 ! appsink"
        )
    raise ValueError(f"Invalid device for GStreamer: {device!r}")


def open_camera(device: str) -> cv.VideoCapture:
    """Open V4L2 (numeric index) or GStreamer ('jetson_csi' / 'libcamera')."""
    try:
        cap = cv.VideoCapture(int(device), cv.CAP_V4L2)
        cap.set(cv.CAP_PROP_FRAME_WIDTH, WIDTH)
        cap.set(cv.CAP_PROP_FRAME_HEIGHT, HEIGHT)
        cap.set(cv.CAP_PROP_FPS, FPS)
    except ValueError:
        cap = cv.VideoCapture(_gst_pipeline(device), cv.CAP_GSTREAMER)
    if not cap.isOpened():
        raise RuntimeError("Cannot open camera")
    return cap

- Opens a camera using V4L2 (numeric index like "0") or GStreamer ("jetson_csi" / "libcamera").

- Requests 1280×720 @ 30fps.

Device strings:

- "0", "1" — USB webcams via V4L2

- "jetson_csi" — Jetson CSI camera (nvarguscamerasrc)

- "libcamera" — modern libcamera-based pipeline (e.g., Raspberry Pi OS Bookworm)
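
The dispatch between V4L2 and GStreamer can also be made explicit before opening the camera, which turns an unknown device string into an early, readable error. This is a sketch with a hypothetical helper, backend_for, that mirrors the rules listed above without importing OpenCV.

```python
GSTREAMER_DEVICES = ("jetson_csi", "libcamera")


def backend_for(device: str) -> str:
    """Return 'v4l2' for a numeric index and 'gstreamer' for a named pipeline."""
    try:
        int(device)
        return "v4l2"
    except ValueError:
        pass
    if device in GSTREAMER_DEVICES:
        return "gstreamer"
    raise ValueError(
        f"Unknown device {device!r}; expected a numeric index or one of {GSTREAMER_DEVICES}"
    )


for name in ("0", "jetson_csi", "libcamera"):
    print(name, "->", backend_for(name))
```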

Document scanner

Creates a DocumentScanner instance with ProcessingMode.AUTO.

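# Assumes these names are exported at the package top level, like ImageRef below.
from scanbotsdk import (
    DocumentDetectionResult,
    DocumentScanner,
    DocumentScannerConfiguration,
    ProcessingMode,
)

def create_document_scanner() -> DocumentScanner:
    config = DocumentScannerConfiguration(
        processing_mode=ProcessingMode.AUTO,
    )
    return DocumentScanner(configuration=config)

def print_documents_text(res: DocumentDetectionResult, frame):
    # Console-only callback: print the detection status and return None
    # so that the scanning loop does not open a preview window.
    print(res.status.name)
    return None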

Scanning loop

import cv2 as cv
from scanbotsdk import ImageRef, RawImageLoadOptions

from snippets.live.preview import preview_on_frame

def run_scanner(cap: cv.VideoCapture, scan, on_result):
    """Read frames, run scanner, and show processed frame if returned."""
    show_preview = None
    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break

            with ImageRef.from_ndarray(frame, RawImageLoadOptions(live_source=True)) as img:
                res = scan(image=img)

            frame_to_show = on_result(res, frame)
            if frame_to_show is not None:
                if show_preview is None:
                    show_preview = preview_on_frame("Live")
                show_preview(frame_to_show)
    except KeyboardInterrupt:
        pass
    finally:
        cap.release()

- Continuously reads frames from the camera.

- Wraps each captured frame in an ImageRef with live_source=True for maximum FPS.

- Sends each frame to the scanner.

- Passes the result to an on_result callback and displays the frame it returns, if any.
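
Because on_result is an ordinary callable that receives the result and the frame, callbacks can be composed. The sketch below wraps any callback to report throughput. The name with_fps, its parameters, and the reporting interval are hypothetical and not part of the Scanbot SDK.

```python
import time


def with_fps(on_result, every: int = 30, clock=time.monotonic, report=print):
    """Wrap an on_result callback and report frames per second every `every` frames."""
    state = {"count": 0, "start": clock()}

    def _wrapped(res, frame):
        state["count"] += 1
        if state["count"] == every:
            elapsed = clock() - state["start"]
            if elapsed > 0:
                report(f"{every / elapsed:.1f} FPS")
            state["count"] = 0
            state["start"] = clock()
        # Delegate to the wrapped callback and pass its return value through.
        return on_result(res, frame)

    return _wrapped
```

For example, run_scanner(cap, scanner.run, on_result=with_fps(print_documents_text)) would log throughput alongside the detection status.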

Optional: Live preview window

import cv2 as cv

def preview_on_frame(window_name: str = "Live"):
    cv.namedWindow(window_name, cv.WINDOW_AUTOSIZE)

    def _on_frame(frame):
        cv.imshow(window_name, frame)
        if cv.waitKey(1) == 27:  # ESC
            cv.destroyAllWindows()
            raise KeyboardInterrupt

    return _on_frame

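This displays the processed frame in a real-time window. Press ESC to stop.

The draw_documents_frame callback below outlines the detected document on the frame and labels it with the detection status.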

import cv2 as cv
import numpy as np
from scanbotsdk import DocumentDetectionResult

def draw_documents_frame(res: DocumentDetectionResult, frame, color=(0, 255, 0), thickness=2):
    pts_list = list(res.points or [])
    if not pts_list and res.points_normalized:
        h, w = frame.shape[:2]
        pts = np.array([[int(p.x * w), int(p.y * h)] for p in res.points_normalized], dtype=np.int32)
    elif pts_list:
        pts = np.array([[int(p.x), int(p.y)] for p in pts_list], dtype=np.int32)
    else:
        return frame

    # Draw contour
    cv.polylines(frame, [pts], True, color, thickness, cv.LINE_AA)

    # Put status label above the top-most vertex
    top_idx = int(np.argmin(pts[:, 1]))
    x, y = int(pts[top_idx, 0]), int(pts[top_idx, 1])
    y = max(0, y - 6)
    label = res.status.name
    cv.putText(frame, label, (x, y), cv.FONT_HERSHEY_SIMPLEX, 0.5, color, 1, cv.LINE_AA)

    return frame
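
The fallback branch in draw_documents_frame scales normalized corner coordinates by the frame size. As a worked example in plain Python, assuming normalized coordinates lie in the range 0 to 1 (to_pixels is an illustrative helper, not an SDK function):

```python
def to_pixels(points_normalized, width: int, height: int):
    """Scale (x, y) pairs in [0, 1] to integer pixel coordinates."""
    return [(int(x * width), int(y * height)) for x, y in points_normalized]


# A quadrilateral covering the central half of a 1280x720 frame.
corners = [(0.25, 0.25), (0.75, 0.25), (0.75, 0.75), (0.25, 0.75)]
print(to_pixels(corners, 1280, 720))
# [(320, 180), (960, 180), (960, 540), (320, 540)]
```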

Running the example

The document_live function combines all components into a runnable demo. It opens the camera, creates the document scanner, and starts the scanning loop with or without a live preview window.

def document_live(device_input: str, show_preview: bool):
    cap = open_camera(device_input)
    scanner = create_document_scanner()
    if show_preview:
        run_scanner(cap, scanner.run, on_result=draw_documents_frame)
    else:
        run_scanner(cap, scanner.run, on_result=print_documents_text)
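
To launch the demo from a shell, document_live can be wired to a small command-line parser. This is a sketch using argparse. The --device and --preview flags and the main helper are hypothetical names chosen for illustration, not options defined by the Scanbot examples.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Live document scanning demo")
    parser.add_argument(
        "--device",
        default="0",
        help="camera index (e.g. 0) or 'jetson_csi' / 'libcamera'",
    )
    parser.add_argument(
        "--preview",
        action="store_true",
        help="show a live preview window instead of printing the status",
    )
    return parser


def main(run, argv=None):
    """Parse the arguments and hand them to `run`, e.g. document_live."""
    args = build_parser().parse_args(argv)
    run(args.device, args.preview)
```

Calling main(document_live) at the bottom of the script would then start the demo with the options given on the command line.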

Performance tip (for NVIDIA Jetson)

To avoid throttling, save the current clock settings and then set the GPU, CPU, and memory clocks to their maximum:

sudo jetson_clocks --store
sudo jetson_clocks

To later restore the defaults, run:

sudo jetson_clocks --restore
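
When the clocks should only be raised for the duration of a scanning session, the same commands can be wrapped in a context manager. This is a minimal sketch, assuming the script is allowed to run sudo without a password prompt; max_clocks and its run parameter are illustrative names.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def max_clocks(run=subprocess.run):
    """Store the current clocks, maximize them, and restore them on exit."""
    run(["sudo", "jetson_clocks", "--store"], check=True)
    run(["sudo", "jetson_clocks"], check=True)
    try:
        yield
    finally:
        # Runs even if the scanning loop raises, so the defaults come back.
        run(["sudo", "jetson_clocks", "--restore"], check=True)
```

A session could then be written as a with max_clocks(): block around the call to document_live.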
