Now that almost every item comes with its own barcode, reading barcodes and QR codes seems like the simplest thing in the world. After all, you just place the barcode or QR code under a laser or camera-based scanner and it is scanned.
However, reading physical barcodes on items is one thing, and reading barcodes or QR codes from PDFs or other documents is quite another.
How do we read barcodes from PDFs? Or how do we read barcodes from already scanned documents in general?
In this blog post, we are going to learn the best way to read barcodes and QR codes from PDFs, images, and scanned documents. Image files such as PNG, JPEG, or TIFF (e.g. a scanned document) work as input, too. We will also look at PDF barcode scanner and PDF QR code scanner software.
Reading Barcodes and QR Codes From PDFs and Images
Usually, reading barcodes and QR codes is simple if they are in a scannable format, which means printed on a product or a piece of paper. The difficulty arises when these codes are in PDFs or scanned documents that are stored on your computer. What do we do then? Obviously, we don’t want to print the document just to be able to scan the barcode with a physical scanner, right?
You basically have two ways of scanning barcodes from documents:
1) You code your own script in your favorite programming language (Shell, Java, C++, you name it …)
2) Or you use a ready-to-use document processing and PDF OCR scanner solution such as Docparser
Option 1: Build Your Own PDF Barcode Scanner
Building your own program to scan barcodes and QR codes from PDFs and scanned documents is possible with a couple of free open source tools. Your program will basically consist of two steps: converting your documents to image files, and then scanning those images for barcodes and QR codes.
Convert Your PDF Documents To Image Files
The first step is to convert your documents into a standard image format, such as PNG, JPEG or TIFF. There are many different options available, and you might already have a suitable tool installed on your local machine.
For example, ImageMagick is a very popular open source tool which can be used to convert PDFs to image files. It is primarily a command line tool, and bindings exist for many programming languages.
In case you want to work with ImageMagick on the command line, you can convert your PDF documents with the following command:
convert -density 150 original.pdf image.png
Adding the parameter -density 150 is important, and it has to come before the input file. Without it, ImageMagick renders the PDF at its default of 72 DPI, which is usually too low for the barcode scanning to succeed.
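If you would rather drive the conversion from a script, the command can be wrapped in a few lines of Python. The following is a minimal sketch under stated assumptions, not a definitive implementation: it assumes the ImageMagick convert binary is on your PATH (on ImageMagick 7 the binary is called magick), and the function names, the pages directory and the page-%03d.png file naming are invented for this example.

```python
import subprocess
from pathlib import Path


def build_convert_command(pdf_path, output_dir, density=150):
    """Return the ImageMagick argument list for rendering a PDF to PNG pages."""
    # -density is placed before the input file so it applies while the PDF is rasterized.
    # The %03d pattern gives every page its own numbered file.
    output_pattern = str(Path(output_dir) / "page-%03d.png")
    return ["convert", "-density", str(density), str(pdf_path), output_pattern]


def pdf_to_images(pdf_path, output_dir, density=150):
    """Render the PDF and return the generated image paths in page order."""
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    # check=True raises CalledProcessError if ImageMagick reports a failure.
    subprocess.run(build_convert_command(pdf_path, output_dir, density), check=True)
    return sorted(Path(output_dir).glob("page-*.png"))


if __name__ == "__main__":
    # Hypothetical input file and output directory, used only for illustration.
    for image in pdf_to_images("original.pdf", "pages"):
        print(image)
```

Writing one numbered file per page keeps multi-page PDFs manageable, because each page can then be scanned separately.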
Scan Barcodes From Images
Once you have your PDF document converted to an image file, it’s time for the next step: scanning barcodes from that image file. A widespread, and probably the best, library to read barcodes and QR codes from images is called ZBar.
Several wrappers for ZBar exist for different programming languages, including C++, Java, Python and Ruby, so you can easily integrate it into your system. You can also use the command line version of ZBar, which is really easy to use. Just run the following command to scan all barcodes which are located in your image file:
zbarimg -q --raw image.png
The parameter --raw makes sure that only the values of your barcodes are returned. Without this parameter, each value is prefixed with its barcode type, which you might not need. To see all command line parameters of ZBar, have a look at the ZBar manpage.
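When you call zbarimg from your own program, the type prefix can actually be useful, for example to tell a QR code apart from an EAN-13 barcode. The sketch below is a hedged example rather than the definitive way to do it: it assumes zbarimg is installed and on your PATH, that decoded values do not contain line breaks, and the function names are made up for illustration.

```python
import subprocess


def parse_zbarimg_output(output):
    """Turn zbarimg's default 'TYPE:data' lines into (symbology, value) tuples."""
    results = []
    for line in output.splitlines():
        if not line.strip():
            continue
        # Split on the first colon only, because values such as URLs contain colons too.
        symbology, _, value = line.partition(":")
        results.append((symbology, value))
    return results


def scan_image(image_path):
    """Run zbarimg on one image and return the decoded (symbology, value) pairs."""
    # -q keeps the output quiet; --raw is left out on purpose to keep the type prefix.
    # check=True is not used because zbarimg exits with a non-zero status
    # when it finds no barcode, which is treated here as an empty result.
    completed = subprocess.run(
        ["zbarimg", "-q", str(image_path)],
        capture_output=True,
        text=True,
    )
    return parse_zbarimg_output(completed.stdout)


if __name__ == "__main__":
    # Hypothetical image file name, used only for illustration.
    for symbology, value in scan_image("image.png"):
        print(symbology, value)
```

Keeping the parsing in its own function means it can be checked against sample output without having ZBar installed.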
Option 2: Use A Document Processing Software With Built-In PDF Barcode Recognition
PDF barcode recognition is a great way to identify and categorize a document. But what if you could go one step further and extract more data fields from your documents? For example, you could also extract a date located in the document header, a shipping address, order details, etc.
Extracting any type of data from PDF documents and images, including barcode values, is possible with so-called document data extraction software. And spoiler alert … Docparser is one of them.
Setting up data extraction software is easy and can save your business a lot of time and money. Once set up, documents can be processed in batches and manual workflows can be replaced with a fast and reliable automated data-entry system.
How do I read barcodes from PDF with Docparser?
Under the hood, Docparser uses ZBar, the open source software we mentioned above. ZBar is widely used for reading barcodes and QR codes from a variety of image file formats, such as PNG, JPEG, and TIFF. It can decode many common barcode symbologies, such as EAN-13, EAN-8, Code 128, Code 39, etc.
Hence, with Docparser’s PDF Barcode Scanner you can do any of the following:
1) Read barcodes or QR codes from a PDF
2) Read a barcode or QR code from an image or a document
3) Extract the data from the PDF containing the barcode into a file format of your choice
4) Customise your data extraction rules
5) Automate entire workflows thanks to our API and cloud integrations
How Can Docparser Help Your Business With OCR Barcode Recognition?
Docparser is batch processing software which can extract data, including barcodes and QR codes, from PDFs and scanned documents, e.g. invoices, purchase orders, work orders, shipping notes, etc.
Docparser is used by thousands of customers around the globe to read barcodes from PDFs. Our customers use Docparser to automate tedious, manual document-based workflows. Besides saving our customers time and money, Docparser also reduces error rates and allows you to accelerate your business.
Setting up Docparser is easy and our customers usually get first results within a couple of minutes. All you need to do is sign up for a free account, create your first document parser, and upload your documents. Once Docparser has extracted the data from your documents, you can either download it or use our integrations to fully automate your workflow.