Machine-readable zones (MRZs) consist of lines of alphanumeric characters and are typically found at the bottom of ID documents and passports. They encode information about the document holder in a standardized format that machines can process easily.
The development of machine-readable zones is closely intertwined with the introduction of standardized travel documents in many countries. The International Civil Aviation Organization (ICAO) was founded under the 1944 Chicago Convention and became a specialized agency of the United Nations in 1947. Its purpose is to coordinate air transport on an international scale and improve the safety of air navigation.
The ICAO's Doc 9303 describes standards for international travel documents. Documents that follow these standards can be read automatically using optical character recognition (OCR), which was a relatively new technology when the standard was first published. In the 1980s, the first countries began issuing machine-readable passports (MRPs).
Different types of MRZs
Nowadays, machine-readable zones are present on all kinds of identity documents. They encode the most important information, though not all of it, recorded in a document’s visual inspection zone (VIZ).
Depending on the size of the document, the MRZ’s formatting varies slightly, which reading devices must account for. There are five standard document formats, each with a different type of MRZ: TD1, TD2, TD3, MRV-A, and MRV-B.
The TD1 format is used for credit card-sized documents, like German ID cards. The downside of this compact size is that the MRZ sits on the back, which means both sides of the document must be scanned.
The MRZ consists of three lines of 30 characters each, including check digits. If the encoded data in a line does not fill it entirely, filler characters (<) are used to complete it. The MRZ code includes the following information:
- Document type
- Issuing state
- Document number
- Check digit for document number
- (Space for optional data)
- Date of birth
- Check digit for date of birth
- Gender (can be omitted by replacing with “<”)
- Expiry date of document
- Check digit for expiry date
- Nationality
- (Space for optional data)
- Check digit for first two MRZ lines
- Last name(s) (or other primary identifier)
- First name(s) (or other secondary identifier)
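The field layout listed above can be sketched as a fixed-position parser. This is a minimal illustration, not a complete implementation: the function name `parse_td1` is hypothetical, the character offsets (including the three-letter nationality code on the second line) are taken from ICAO Doc 9303 rather than from this article, and the sample lines are modelled on the fictional "Utopia" specimen used in ICAO documentation.

```python
def parse_td1(lines):
    """Split a three-line TD1 MRZ into named fields by character position."""
    if len(lines) != 3 or any(len(line) != 30 for line in lines):
        raise ValueError("a TD1 MRZ has three lines of 30 characters each")
    line1, line2, line3 = lines

    def clean(value):
        # Filler characters (<) pad unused positions and separate name parts.
        return value.replace("<", " ").strip()

    # Primary and secondary identifiers are separated by a double filler.
    primary, _, secondary = line3.rstrip("<").partition("<<")

    return {
        "document_type": clean(line1[0:2]),
        "issuing_state": clean(line1[2:5]),
        "document_number": clean(line1[5:14]),
        "document_number_check": line1[14],
        "optional_data_1": clean(line1[15:30]),
        "date_of_birth": line2[0:6],          # YYMMDD
        "date_of_birth_check": line2[6],
        "sex": clean(line2[7]),               # empty string if omitted
        "expiry_date": line2[8:14],           # YYMMDD
        "expiry_date_check": line2[14],
        "nationality": clean(line2[15:18]),
        "optional_data_2": clean(line2[18:29]),
        "composite_check": line2[29],
        "surname": clean(primary),
        "given_names": clean(secondary),
    }


# Fictional sample data; ljust pads each line with fillers to 30 characters.
sample = [
    "I<UTOD231458907".ljust(30, "<"),
    "7408122F1204159UTO".ljust(29, "<") + "6",
    "ERIKSSON<<ANNA<MARIA".ljust(30, "<"),
]
fields = parse_td1(sample)
print(fields["surname"], "/", fields["given_names"])
```

The parser only slices the lines; it does not verify any check digits, so its output should still be validated before it is trusted.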
The TD2 document format is slowly being replaced by the TD1 standard, but is still being used for ID cards in some countries. Since the format is bigger, the MRZ fits on the same side as the human-readable information. The MRZ also occupies just two lines of 36 characters each.
The encoded data is the same as for TD1, but in a different order. Notably, the holder’s name is on the first line rather than the last.
The TD3 format is used for most international passports, specifically the data page at the beginning of the booklet. Just like TD2, it has two MRZ lines on the front of the document, but with 44 characters each. The first line begins with a “P” (for “passport”), and the second line contains an additional check digit for the optional data field, followed by the composite check digit at the very end.
MRV stands for machine-readable visa. This document type is indicated by a “V” at the beginning of the first MRZ line. The order and types of information encoded in the MRZ of an MRV-A document are similar to those of TD2 and TD3, but without a composite check digit at the end of the second line. Each line is 44 characters long.
MRV-B documents are slightly smaller than MRV-A ones, so each of the two MRZ lines is just 36 characters long. The information encoded is the same, however.
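Since the five formats differ in line count, line length, and (for visas) the leading character, a reader can tell them apart before parsing any fields. The following is a minimal sketch of that idea; the function name `detect_mrz_format` and the lookup tables are illustrative assumptions, not part of any particular SDK.

```python
# (number of lines, characters per line) -> format name
DOCUMENT_FORMATS = {(3, 30): "TD1", (2, 36): "TD2", (2, 44): "TD3"}
VISA_FORMATS = {(2, 44): "MRV-A", (2, 36): "MRV-B"}


def detect_mrz_format(lines):
    """Return the MRZ format name, or None if the shape is not recognized."""
    lengths = {len(line) for line in lines}
    if len(lengths) != 1:
        # No lines at all, or lines of unequal length: not a valid MRZ.
        return None
    shape = (len(lines), lengths.pop())
    # Visas share their dimensions with TD2/TD3 but start with "V".
    if lines[0].startswith("V") and shape in VISA_FORMATS:
        return VISA_FORMATS[shape]
    return DOCUMENT_FORMATS.get(shape)


passport = ["P<UTOERIKSSON<<ANNA<MARIA".ljust(44, "<"),
            "L898902C36UTO7408122F1204159ZE184226B<<<<<10"]
print(detect_mrz_format(passport))
```

In practice, OCR output may contain stray whitespace or misread characters, so a real reader would normalize the lines before checking their shape.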
How to scan an MRZ
Just as when machine-readable passports were introduced in the 1980s, MRZs are scanned using OCR. Recognition is made easier by the use of OCR-B, a font designed specifically to be read by electronic devices.
There are three steps to scanning and decoding an MRZ:
- The scanner software captures an image of the document. This image can be saved alongside the information contained in the MRZ if necessary.
- The MRZ is read using OCR. Some software also extracts the information as key-value pairs.
- The software validates the extracted information against the check digits in the MRZ. If the check digit does not match the information it refers to, there is an error and the scan must be repeated.
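The validation step can be illustrated with a short check digit routine. The weighting scheme used here (repeating weights 7, 3, 1; digits count as their value, letters A to Z as 10 to 35, the filler as 0; result taken modulo 10) comes from ICAO Doc 9303 rather than from this article, and the function names are hypothetical. The sample values are from the fictional ICAO passport specimen.

```python
WEIGHTS = (7, 3, 1)


def char_value(ch):
    """Map one MRZ character to its numeric value."""
    if ch in "0123456789":
        return int(ch)
    if "A" <= ch <= "Z":
        return ord(ch) - ord("A") + 10
    if ch == "<":
        return 0
    raise ValueError(f"invalid MRZ character: {ch!r}")


def compute_check_digit(field):
    """Weighted sum of all character values, modulo 10."""
    total = sum(char_value(ch) * WEIGHTS[i % 3] for i, ch in enumerate(field))
    return total % 10


def is_field_valid(field, check_char):
    """Compare the computed check digit with the one printed in the MRZ."""
    return check_char in "0123456789" and len(check_char) == 1 \
        and compute_check_digit(field) == int(check_char)


print(compute_check_digit("L898902C3"))   # document number -> 6
print(is_field_valid("740812", "2"))      # date of birth   -> True
print(is_field_valid("740813", "2"))      # one misread digit -> False
```

A single misread character usually changes the weighted sum, which is why a failed check is a good signal to rescan. The check cannot catch every possible error, so it complements rather than replaces careful OCR.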
Modern text recognition software can extract information from the visual inspection zone as well. With a scanned image of the travel document, the information from the VIZ, and the MRZ data, mistakes can be ruled out almost entirely.
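To illustrate how MRZ and VIZ data can back each other up, here is a minimal sketch that compares fields found in both sources. The field names and the helper functions `normalize` and `find_mismatches` are hypothetical, and the normalization is deliberately simple: it strips accents and punctuation, whereas the transliteration rules actually used in travel documents are more involved.

```python
import unicodedata


def normalize(value):
    """Uppercase, strip accents, and keep only ASCII letters and digits."""
    decomposed = unicodedata.normalize("NFKD", value)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return "".join(ch for ch in ascii_only.upper() if ch.isalnum())


def find_mismatches(mrz_fields, viz_fields):
    """Return the names of shared fields whose normalized values differ."""
    shared = mrz_fields.keys() & viz_fields.keys()
    return sorted(
        name for name in shared
        if normalize(mrz_fields[name]) != normalize(viz_fields[name])
    )


mrz = {"surname": "ERIKSSON", "given_names": "ANNA MARIA",
       "document_number": "L898902C3"}
viz = {"surname": "Eriksson", "given_names": "Anna María",
       "document_number": "L898902G3"}   # "C" misread as "G" in the VIZ
print(find_mismatches(mrz, viz))
```

Any field reported as a mismatch can then be rescanned or flagged for manual review.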
Common problems with scanning MRZs
The traditional MRZ scanner is a stationary device with a level scanning surface. These are still being used in places like airports or administrative offices. While inflexible, this approach has several advantages for OCR: First, the document lies completely flat while being scanned. Second, the document does not move during the scanning process. And third, lighting is controlled by the device.
When using mobile MRZ scanning software, none of this is guaranteed: While you can lay the document flat on a table, the angle at which you hold the mobile device also plays a role. The same goes for the movement of your hands. Lighting conditions are also more likely to vary. The scanning software has to account for all of this.
So, while mobile MRZ scanners offer much more flexibility than fixed hardware, they also must be capable of dealing with a greater number of variables. This is why using powerful MRZ scanning software is important.
Integrating an MRZ scanner into your mobile or web app
You can add MRZ scanning capabilities to your apps with a software development kit. Our Scanbot MRZ Reader SDK for iOS, Android, cross-platform, and web apps gives you all the features you need to quickly and reliably scan MRZs on ID and travel documents. It extracts the document holder’s information as key-value pairs and automatically assigns the correct data types. All of this works without an internet connection, ensuring maximum data security.