Linux Document Scanner
Integrate the Scanbot Linux Document Scanner SDK into your C, Java, Python, and Node.js applications. Deploy enterprise-grade document scanning on your servers, embedded systems, and edge devices running Debian, Ubuntu, Raspberry Pi OS, or other Linux distributions.
Trusted by 400+ global industry leaders
Document scanning for server environments
Use the Scanbot Linux SDK to build a server-based document scanner and extract documents from live camera streams and static image files.
The SDK automatically deskews and crops the resulting scans and applies quality-enhancing image filters, such as binarization.
On-device intelligence
100% offline – no servers, no tracking, complete data security.
Extract high-quality document scans from any source
Hosted on your server, the Linux Document Scanner SDK accepts input from email attachments, mobile uploads, and even device cameras and outputs clean scans as PDF, JPEG, or TIFF files.
The built-in Document Quality Analyzer rates the image quality of each incoming document, flagging any that might not be suitable for further processing. By only accepting high-quality input, you can optimize scans for machine-readability and minimize manual work.
High-quality scans
Let your users generate high-quality scans suitable for automated processing.
Integrate an enterprise-grade Linux Document Scanner with C, Java, Python, and Node.js
The Scanbot Linux Document Scanner SDK provides powerful, easy-to-integrate, and easy-to-use high-level APIs for analyzing and processing document images. It runs on all Linux distributions with glibc version 2.27 or higher, including Debian, Ubuntu, and Raspberry Pi OS.
The SDK’s core API is written in pure C, with convenient wrappers for Java, Python, and Node.js. This makes it easy to integrate into a wide variety of environments.
On devices with CUDA and TensorRT support, such as the NVIDIA Jetson, you can make the most of the Document Scanner SDK’s capabilities by enabling GPU acceleration.
Technical requirements
The Scanbot Linux Document Scanner SDK supports the following platforms:
- Linux systems with glibc ≥ 2.27 (e.g., Ubuntu ≥ 18.04, Debian ≥ 11, Raspberry Pi OS 64-bit)
- NVIDIA Jetson devices
  - CPU only: all Jetson devices
  - GPU acceleration: Jetson AGX Orin Series, Jetson Orin NX Series, Jetson Orin Nano Series
    - Jetpack 6.1
    - CUDA 12.6
    - TensorRT 10.3
- Raspberry Pi devices
  - Pi 4 series supported, but not recommended
  - Pi 5 series recommended
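The glibc requirement above can be checked before installing anything. The following is a minimal sketch using only the Python standard library; the helper names `parse_version` and `glibc_is_supported` are made up for this illustration and are not part of the Scanbot SDK.

```python
from __future__ import annotations

import platform


def parse_version(text: str) -> tuple[int, ...]:
    """Turn a dotted version string such as '2.35' into (2, 35)."""
    parts = []
    for piece in text.split("."):
        digits = "".join(ch for ch in piece if ch.isdigit())
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def glibc_is_supported(version: str, minimum: tuple[int, int] = (2, 27)) -> bool:
    """Return True if the glibc version string meets the documented minimum."""
    parsed = parse_version(version)
    return bool(parsed) and parsed >= minimum


if __name__ == "__main__":
    # platform.libc_ver() reports the C library the Python binary is linked against
    lib, version = platform.libc_ver()
    if lib != "glibc":
        print(f"glibc not detected (reported: {lib!r}); requirement cannot be confirmed")
    else:
        verdict = "meets" if glibc_is_supported(version) else "is below"
        print(f"glibc {version} {verdict} the 2.27 minimum")
```

This only covers the C library version; it does not verify the Jetson or Raspberry Pi hardware constraints listed above.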
Quick start guide: Scanning documents from a live camera stream with Python
Here’s how you can scan documents from a live camera stream with Python. This example uses OpenCV for camera capture and display and the Scanbot Linux SDK for real-time document detection. The camera is selected with a device string ("0", "jetson_csi", or "libcamera") rather than an image path, and the Scanbot SDK license key is embedded directly in the source code. The implementation is split into several small, reusable components, described in the sections below.
Head over to the Linux Document Scanner SDK documentation for a quick start guide on how to scan documents from still images.
Requirements
Make sure you have these packages installed:
pip install scanbotsdk opencv-python numpy
Your system must also provide:
- GStreamer plugins if using jetson_csi or libcamera pipelines
- A camera device (USB webcam, Jetson CSI, or libcamera-compatible camera)
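The GStreamer pipelines only work if the installed OpenCV build was compiled with GStreamer support, which is not guaranteed for every opencv-python package. As a rough sketch, the build report returned by `cv.getBuildInformation()` can be scanned for the GStreamer entry. The helper names below are illustrative, and the parsing assumes the usual `GStreamer: YES (version)` line format of that report.

```python
def has_gstreamer(build_info: str) -> bool:
    """Return True if an OpenCV build report lists GStreamer as enabled."""
    for line in build_info.splitlines():
        key, _, value = line.strip().partition(":")
        if key.strip() == "GStreamer":
            return value.strip().upper().startswith("YES")
    return False


def opencv_has_gstreamer() -> bool:
    """Check the locally installed OpenCV build (requires opencv-python)."""
    import cv2 as cv  # imported lazily so has_gstreamer stays dependency-free

    return has_gstreamer(cv.getBuildInformation())
```

Calling `opencv_has_gstreamer()` before opening a "jetson_csi" or "libcamera" device gives a clearer error than a generic "Cannot open camera".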
Optional: Install CUDA + TensorRT
Requires Jetpack 6.1, CUDA 12.6, and TensorRT 10.3 for GPU acceleration.
sudo apt install -y nvidia-l4t-cuda libnvinfer10 libnvinfer-plugin10 libnvonnxparsers10
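After installing the packages, it can help to confirm that the dynamic linker actually sees the TensorRT libraries. The sketch below uses `ctypes.util.find_library` from the standard library. The library base names are assumptions derived from the package names above (libnvinfer, libnvinfer_plugin, libnvonnxparser), and a missing entry does not necessarily mean the SDK cannot load the library through another path.

```python
import ctypes.util

# Assumed base names, derived from the apt package names above
TENSORRT_LIBRARIES = ("nvinfer", "nvinfer_plugin", "nvonnxparser")


def find_gpu_libraries(names=TENSORRT_LIBRARIES, finder=ctypes.util.find_library):
    """Map each library base name to the name reported by the linker, or None."""
    return {name: finder(name) for name in names}


def gpu_libraries_present(found) -> bool:
    """Return True only if every requested library was located."""
    return all(path is not None for path in found.values())


if __name__ == "__main__":
    found = find_gpu_libraries()
    for name, path in found.items():
        print(f"lib{name}: {path or 'not found'}")
    print("TensorRT libraries present:", gpu_libraries_present(found))
```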
Camera helper
import cv2 as cv

# Default camera settings (resolution and framerate)
WIDTH, HEIGHT, FPS = 1280, 720, 30


def _gst_pipeline(device: str) -> str:
    if device == "jetson_csi":
        return (
            "nvarguscamerasrc ! "
            f"video/x-raw(memory:NVMM), width=(int){WIDTH}, height=(int){HEIGHT}, "
            f"framerate=(fraction){FPS}/1 ! "
            "nvvidconv flip-method=0 ! "
            "video/x-raw, format=(string)BGRx ! "
            "videoconvert ! video/x-raw, format=(string)BGR ! appsink drop=1"
        )
    if device == "libcamera":
        return (
            "libcamerasrc ! video/x-raw, format=BGR, "
            f"width=(int){WIDTH}, height=(int){HEIGHT}, "
            f"framerate=(fraction){FPS}/1 ! appsink"
        )
    raise ValueError(f"Invalid device for GStreamer: {device!r}")


def open_camera(device: str) -> cv.VideoCapture:
    """Open V4L2 (numeric index) or GStreamer ('jetson_csi' / 'libcamera')."""
    try:
        # int() raises ValueError for non-numeric strings, which selects the GStreamer path
        cap = cv.VideoCapture(int(device), cv.CAP_V4L2)
        cap.set(cv.CAP_PROP_FRAME_WIDTH, WIDTH)
        cap.set(cv.CAP_PROP_FRAME_HEIGHT, HEIGHT)
        cap.set(cv.CAP_PROP_FPS, FPS)
    except ValueError:
        cap = cv.VideoCapture(_gst_pipeline(device), cv.CAP_GSTREAMER)
    if not cap.isOpened():
        raise RuntimeError("Cannot open camera")
    return cap
- Opens a camera using V4L2 (numeric index like "0") or GStreamer ("jetson_csi" / "libcamera").
- Requests 1280×720 @ 30fps.
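The dispatch inside open_camera (numeric strings go to V4L2, named pipelines go to GStreamer) can be validated without touching any camera. The following sketch is a hypothetical helper, `classify_device`, that mirrors that decision and is slightly stricter than the original: it only accepts plain digit strings as V4L2 indices.

```python
from __future__ import annotations

# Pipeline names handled by the GStreamer branch of the camera helper
GSTREAMER_DEVICES = {"jetson_csi", "libcamera"}


def classify_device(device: str) -> tuple[str, int | str]:
    """Return ('v4l2', index) for digit strings or ('gstreamer', name) for known pipelines."""
    if device.isdigit():
        return ("v4l2", int(device))
    if device in GSTREAMER_DEVICES:
        return ("gstreamer", device)
    raise ValueError(f"Unknown camera device: {device!r}")


if __name__ == "__main__":
    for candidate in ("0", "1", "jetson_csi", "libcamera"):
        print(candidate, "->", classify_device(candidate))
```

Running a check like this on user input first lets a command-line tool reject typos such as "jetson-csi" before OpenCV is involved.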
Device strings:
- "0", "1": USB webcams via V4L2
- "jetson_csi": Jetson CSI camera (nvarguscamerasrc)
- "libcamera": modern libcamera-based pipeline (e.g., Raspberry Pi OS Bookworm)
Document scanner
Creates a DocumentScanner instance with ProcessingMode.AUTO.
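# Assumption: these names are exported by the top-level scanbotsdk package,
# like ImageRef and RawImageLoadOptions in the scanning loop.
from scanbotsdk import DocumentDetectionResult, DocumentScanner, DocumentScannerConfiguration, ProcessingMode


def create_document_scanner() -> DocumentScanner:
    config = DocumentScannerConfiguration(
        processing_mode=ProcessingMode.AUTO,
    )
    return DocumentScanner(configuration=config)


def print_documents_text(res: DocumentDetectionResult, frame):
    # Text-only callback: print the detection status and return None so no preview is shown
    print(res.status.name)
    return None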
Scanning loop
import cv2 as cv
from scanbotsdk import ImageRef, RawImageLoadOptions
from snippets.live.preview import preview_on_frame


def run_scanner(cap: cv.VideoCapture, scan, on_result):
    """Read frames, run scanner, and show processed frame if returned."""
    show_preview = None
    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            with ImageRef.from_ndarray(frame, RawImageLoadOptions(live_source=True)) as img:
                res = scan(image=img)
                frame_to_show = on_result(res, frame)
            if frame_to_show is not None:
                # Create the preview window lazily, on the first frame that needs it
                if show_preview is None:
                    show_preview = preview_on_frame("Live")
                show_preview(frame_to_show)
    except KeyboardInterrupt:
        pass
    finally:
        cap.release()
- Each captured frame is wrapped in an ImageRef with live_source=True for maximum FPS.
- Continuously reads frames from the camera.
- Sends them to the scanner.
- Passes results to an on_result callback.
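Because the loop hands every result to on_result, per-frame logic can live in small callbacks. As an illustration, the sketch below builds a callback that fires only after the detection status has stayed the same for several consecutive frames, which is a common way to avoid acting on a single lucky frame. The status name "OK" and the helper `require_stable_status` are assumptions for this example; only `res.status.name` is taken from the code above.

```python
from types import SimpleNamespace


def require_stable_status(on_stable, wanted: str = "OK", frames_needed: int = 5):
    """Build an on_result callback that calls on_stable after N consecutive matching frames."""
    streak = 0

    def on_result(res, frame):
        nonlocal streak
        # Extend the streak on a match, reset it on any other status
        streak = streak + 1 if res.status.name == wanted else 0
        if streak == frames_needed:
            on_stable(res, frame)
        return None  # nothing to preview

    return on_result


if __name__ == "__main__":
    # Stand-in results; a real run would receive SDK detection results instead
    ok = SimpleNamespace(status=SimpleNamespace(name="OK"))
    callback = require_stable_status(lambda res, frame: print("stable at frame", frame), frames_needed=3)
    for index in range(4):
        callback(ok, index)
```

The callback fires once per streak: it triggers when the count reaches the threshold and stays quiet until the status changes and stabilizes again.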
Optional: Live preview window
import cv2 as cv


def preview_on_frame(window_name: str = "Live"):
    cv.namedWindow(window_name, cv.WINDOW_AUTOSIZE)

    def _on_frame(frame):
        cv.imshow(window_name, frame)
        if cv.waitKey(1) == 27:  # ESC
            cv.destroyAllWindows()
            raise KeyboardInterrupt

    return _on_frame
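This displays the processed frame in a real-time window. Press ESC to stop. The draw_documents_frame callback below outlines the detected document on the frame and labels it with the detection status.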
import cv2 as cv
import numpy as np
# Assumption: exported by the top-level scanbotsdk package, like ImageRef
from scanbotsdk import DocumentDetectionResult


def draw_documents_frame(res: DocumentDetectionResult, frame, color=(0, 255, 0), thickness=2):
    pts_list = list(res.points or [])
    if not pts_list and res.points_normalized:
        # Fall back to normalized points, scaled to the frame size
        h, w = frame.shape[:2]
        pts = np.array([[int(p.x * w), int(p.y * h)] for p in res.points_normalized], dtype=np.int32)
    elif pts_list:
        pts = np.array([[int(p.x), int(p.y)] for p in pts_list], dtype=np.int32)
    else:
        return frame
    # Draw contour
    cv.polylines(frame, [pts], True, color, thickness, cv.LINE_AA)
    # Put status label above the top-most vertex
    top_idx = int(np.argmin(pts[:, 1]))
    x, y = int(pts[top_idx, 0]), int(pts[top_idx, 1])
    y = max(0, y - 6)
    label = res.status.name
    cv.putText(frame, label, (x, y), cv.FONT_HERSHEY_SIMPLEX, 0.5, color, 1, cv.LINE_AA)
    return frame
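The fallback branch in draw_documents_frame scales normalized points (values between 0 and 1) to pixel coordinates. The sketch below isolates that step in plain Python so it can be tested without a camera. The helper name `to_pixel_points` is made up for this example, and unlike the original it also clamps results to the last valid pixel.

```python
def to_pixel_points(normalized, width: int, height: int):
    """Scale (x, y) pairs in the 0..1 range to integer pixel coordinates inside the frame."""
    points = []
    for x, y in normalized:
        px = min(max(int(x * width), 0), width - 1)
        py = min(max(int(y * height), 0), height - 1)
        points.append((px, py))
    return points


if __name__ == "__main__":
    quad = [(0.25, 0.25), (0.75, 0.25), (0.75, 0.5), (0.25, 0.5)]
    print(to_pixel_points(quad, 1280, 720))
    # [(320, 180), (960, 180), (960, 360), (320, 360)]
```

Clamping is a design choice for cases where the coordinates are later used for array indexing or cropping; for drawing with cv.polylines alone it is not strictly needed.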
Running the example
The document_live function combines all components into a runnable demo. It opens the camera, creates the document scanner, and starts the scanning loop with or without a live preview window.
def document_live(device_input: str, show_preview: bool):
    cap = open_camera(device_input)
    scanner = create_document_scanner()
    if show_preview:
        run_scanner(cap, scanner.run, on_result=draw_documents_frame)
    else:
        run_scanner(cap, scanner.run, on_result=print_documents_text)
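One way to expose document_live on the command line is a small argparse wrapper. This is a sketch only: the flag names `--device` and `--preview` are hypothetical and not taken from the Scanbot examples, and the function to run is passed in as a parameter so the wrapper can be exercised without a camera or the SDK.

```python
import argparse


def parse_args(argv=None) -> argparse.Namespace:
    """Parse the (hypothetical) --device and --preview options."""
    parser = argparse.ArgumentParser(description="Live document scanning demo")
    parser.add_argument(
        "--device",
        default="0",
        help='camera device string: "0", "1", "jetson_csi", or "libcamera"',
    )
    parser.add_argument(
        "--preview",
        action="store_true",
        help="show the live preview window instead of printing statuses",
    )
    return parser.parse_args(argv)


def main(run, argv=None) -> None:
    """Parse arguments and hand them to run(device, show_preview)."""
    args = parse_args(argv)
    run(args.device, args.preview)
```

In the demo script, `main(document_live)` would start scanning with the options given on the command line, for example `--device libcamera --preview`.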
Performance tip (for NVIDIA Jetson)
To avoid throttling, save the current clock settings and then set the GPU, CPU, and memory clocks to their maximum:
sudo jetson_clocks --store
sudo jetson_clocks
To later restore the defaults, run:
sudo jetson_clocks --restore
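If the scanner is started from Python, the store, maximize, and restore steps can be tied to the lifetime of the scanning session so the defaults come back even when the loop exits with an error. The following context manager is a sketch under that assumption. It shells out to the same three commands shown above, and the `run` parameter exists only so the behavior can be tested without a Jetson or sudo rights.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def max_clocks(run=subprocess.run):
    """Store the current clocks, raise them to maximum, and restore them on exit."""
    run(["sudo", "jetson_clocks", "--store"], check=True)
    try:
        run(["sudo", "jetson_clocks"], check=True)
        yield
    finally:
        # Runs on normal exit and on exceptions alike
        run(["sudo", "jetson_clocks", "--restore"], check=True)
```

Wrapping the scanning call, as in `with max_clocks(): document_live("jetson_csi", True)`, keeps the clock changes scoped to the demo run.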