Barcode parser: How to parse GS1 barcodes, driver’s licenses, and more

Since barcodes store information as ones and zeros, making the data human-readable requires more than just scanning the barcode. The raw data must be parsed and prepared for further processing, especially when dealing with multiple data points in one code. Learn why a capable parser is an essential feature of high-quality barcode scanner software.

Kevin February 15, 2023 4 mins read

What is barcode parsing?

Barcodes store data in a compact, machine-readable format. A barcode scanner extracts the raw data and converts it into a string, such as a serial number or a URL.

But some barcodes, especially 2D barcodes, can store multiple data elements. A good example is a US driver’s license: The PDF417 barcode on the back encodes at least 22 different pieces of information about the document holder. Other examples are the widespread GS1-128 barcode and the Swiss QR Code, which contains all information required for an electronic bank transfer.

While the strings are often human-readable to some extent, they frequently need further processing to be useful. Scanning the barcode on a US driver’s license without parsing the data would give you something like this:

DCSSMITH\nDACJOHN\nDBB06161986\nDBC1\nDAYBRO\n...

What makes this information so hard to decipher is the use of three-letter signifiers for each data element (e.g., DCS for the last name and DAY for the eye color, which is also abbreviated) and the lack of punctuation such as spaces or commas. 

Parsing the data and applying formatting to the elements according to their types gives us the following results:

LastName: Smith
FirstName: John
DateOfBirth: 06.16.1986
Sex: Male
EyesColor: Brown

Transforming the information into these key-value pairs makes working with it much easier, both for staff and for backend systems.
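
As a rough illustration, here is a minimal Python sketch of this step. It is not the parser of any particular SDK. The function name parse_license and the lookup tables are hypothetical and cover only the five elements shown above. It also assumes that each element sits on its own line and that the date of birth is stored as MMDDYYYY, as in the sample string.

```python
from datetime import datetime

# Hypothetical subset of the three-letter element codes; a real parser covers many more.
FIELD_NAMES = {
    "DCS": "LastName",
    "DAC": "FirstName",
    "DBB": "DateOfBirth",
    "DBC": "Sex",
    "DAY": "EyesColor",
}
# Assumed value abbreviations, limited to what this example needs.
SEX_CODES = {"1": "Male", "2": "Female"}
EYE_COLORS = {"BRO": "Brown", "BLU": "Blue", "GRN": "Green"}


def parse_license(raw: str) -> dict[str, str]:
    """Map each three-letter code to a named field and format its value."""
    result = {}
    for line in raw.splitlines():
        code, value = line[:3], line[3:]
        name = FIELD_NAMES.get(code)
        if name is None:
            continue  # skip elements this sketch does not know
        if code == "DBB":
            # Assumes MMDDYYYY, as in the sample string
            value = datetime.strptime(value, "%m%d%Y").strftime("%m.%d.%Y")
        elif code == "DBC":
            value = SEX_CODES.get(value, value)
        elif code == "DAY":
            value = EYE_COLORS.get(value, value)
        else:
            value = value.title()  # SMITH -> Smith
        result[name] = value
    return result


raw = "DCSSMITH\nDACJOHN\nDBB06161986\nDBC1\nDAYBRO"
for key, value in parse_license(raw).items():
    print(f"{key}: {value}")
```

Unknown codes are skipped rather than rejected here, which keeps the sketch tolerant of elements it does not cover. A production parser would more likely report them.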

Data parsing using GS1 Application Identifiers

Thanks to its flexibility and high performance, the GS1-128 barcode is a key logistics standard.

The big advantage of GS1 barcodes is the so-called application identifiers. These sequences of two to four digits tell barcode scanners which type of information comes next. This allows them to parse the data into a human-readable format.

Let’s take a look at this GS1-128 barcode:

It contains three pieces of information marked by the application identifiers (10), (11), and (15). These signify the batch or lot number, the manufacturing date, and the “best before” date. By parsing the data according to the application identifiers, barcode scanners can prepare it for further processing, e.g., for entry into a database.

Here is the parsed information:

Batch/lot number: 123456789
Manufacturing date: 22.07.2024
Best-before date: 24.07.2024
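
The following Python sketch shows how such parsing could work in principle. The raw string is an assumption, since the barcode image is not reproduced here: it supposes that the scanner delivers the data without parentheses and transmits the FNC1 separator as the ASCII group separator character. The function parse_gs1 and its table are hypothetical and handle only the three two-digit application identifiers from this example.

```python
from datetime import datetime

GS = "\x1d"  # FNC1 separator, assumed to arrive as the ASCII group separator

# Subset of application identifiers: AI -> (label, fixed length or None)
AI_TABLE = {
    "10": ("Batch/lot number", None),  # variable length, ends at GS or end of data
    "11": ("Manufacturing date", 6),   # YYMMDD
    "15": ("Best-before date", 6),     # YYMMDD
}


def parse_gs1(data: str) -> dict[str, str]:
    """Walk through the string, reading one AI and its value at a time."""
    result = {}
    pos = 0
    while pos < len(data):
        if data[pos] == GS:
            pos += 1  # skip the separator after a variable-length value
            continue
        ai = data[pos:pos + 2]
        if ai not in AI_TABLE:
            raise ValueError(f"Unsupported application identifier {ai!r} at position {pos}")
        label, length = AI_TABLE[ai]
        pos += 2
        if length is None:
            end = data.find(GS, pos)
            if end == -1:
                end = len(data)
            value = data[pos:end]
        else:
            end = pos + length
            # Raises ValueError if the value is not a valid YYMMDD date
            value = datetime.strptime(data[pos:end], "%y%m%d").strftime("%d.%m.%Y")
        result[label] = value
        pos = end
    return result


raw = "10123456789" + GS + "1124072215240724"  # assumed raw scanner output
for label, value in parse_gs1(raw).items():
    print(f"{label}: {value}")
```

The fixed-length date fields need no separator, which is why the two dates follow each other directly, while the variable-length batch number has to be terminated explicitly.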

Using line breaks to separate information

Some barcode standards don’t use letter codes or application identifiers to separate data elements. Instead, they store each item on a separate line in a fixed order. Here’s an EPC QR Code for electronic SEPA payments:

The raw data converted to a string looks something like this (each “\n” is a line break):

BCD\n002\n1\nSCT\nABCDEFGH\nJohn_Doe\nDE000000000000123456\nEUR128\n\n\nTest_payment

By looking at the line number, a barcode scanner equipped with a parser for this barcode type can infer how to prepare the data for processing. The result would look like this:

Service Tag: BCD
Version: 002
Character Set: 1
Identification: SCT
BIC: ABCDEFGH
Name of Beneficiary: John Doe
IBAN: DE000000000000123456
Amount: EUR128
Remittance Information: Test payment

The disadvantage of this approach is that a single extra or missing line shifts every subsequent value into the wrong field. Therefore, a barcode parser additionally needs to check whether the content of each line matches what it would typically expect to find there.
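
A minimal Python sketch of position-based parsing with such plausibility checks might look like this. Everything in it is a simplifying assumption rather than a complete implementation of the standard: the function parse_epc is hypothetical, the names for the two empty lines (Purpose, Structured Reference) are assumed, the patterns are deliberately loose, and the first eight lines are treated as required. Because the sample string above writes spaces as underscores, the sketch converts them back.

```python
import re

# Field order by line number; the names for lines 9 and 10 are assumptions.
FIELDS = [
    "Service Tag", "Version", "Character Set", "Identification", "BIC",
    "Name of Beneficiary", "IBAN", "Amount", "Purpose",
    "Structured Reference", "Remittance Information",
]

# Loose plausibility patterns for selected fields (simplified assumptions).
CHECKS = {
    "Service Tag": re.compile(r"BCD"),
    "Version": re.compile(r"00[12]"),
    "Identification": re.compile(r"SCT"),
    "IBAN": re.compile(r"[A-Z]{2}\d{2}[A-Z0-9]{1,30}"),
    "Amount": re.compile(r"EUR\d+(\.\d{1,2})?"),
}


def parse_epc(raw: str) -> dict[str, str]:
    """Assign lines to fields by position and reject implausible content."""
    lines = raw.split("\n")
    if not 8 <= len(lines) <= len(FIELDS):
        raise ValueError(f"Unexpected number of lines: {len(lines)}")
    result = {}
    for name, value in zip(FIELDS, lines):
        pattern = CHECKS.get(name)
        if pattern and not pattern.fullmatch(value):
            raise ValueError(f"Line for {name!r} has unexpected content: {value!r}")
        if value:  # leave out empty optional lines
            result[name] = value.replace("_", " ")
    return result


raw = "BCD\n002\n1\nSCT\nABCDEFGH\nJohn_Doe\nDE000000000000123456\nEUR128\n\n\nTest_payment"
for name, value in parse_epc(raw).items():
    print(f"{name}: {value}")
```

If the BIC line were missing from this input, the amount would land in the IBAN position and fail its pattern, so the shifted data would be rejected instead of being passed on silently.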

Data parsing for many different barcodes

Our Scanbot Barcode Scanner SDK covers a wide range of use cases. That’s why we implemented data parsers for several data-dense barcode types, including PDF417, Data Matrix, and QR Code.

Our parsers always check the decoded information for anomalies to safeguard the quality of your data. In addition, our solution can be freely configured to interact seamlessly with any backend. This ensures that you get the most out of digitalizing your workflows.