Best Softwares to Extract Tables from PDF (and export them to Excel, CSV, …)

PDF files and scanned documents are ubiquitous in today’s business environment. Important business data is often trapped inside these documents, and extracting data from PDFs is, more often than not, a manual and tedious task. The task becomes even more daunting when we need to extract tables from PDFs or scanned images. Continue reading “Best Softwares to Extract Tables from PDF (and export them to Excel, CSV, …)”

Read Barcodes & QR-codes From PDFs, Scanned Documents And Images

Now that almost every item comes with its own barcode, reading barcodes and QR codes seems like the simplest thing to do. After all, you just place the barcode or QR code under a laser or camera-based scanner and it is scanned.

However, reading physical barcodes on items is one thing; reading barcodes or QR codes from PDFs or other documents is another entirely.

How do we read barcodes from PDFs? Or how do we read barcodes from already scanned documents in general? Continue reading “Read Barcodes & QR-codes From PDFs, Scanned Documents And Images”

What is a PDF Parser? An introduction to PDF and Document Parsing

A PDF Parser (sometimes also called a PDF scraper) is software that can be used to extract data from PDF documents. PDF Parsers come in the form of libraries for developers or as standalone software products for end users.

PDF Parsers are used mainly to extract data from a batch of PDF files. Manual data entry (copy & paste) is a common alternative when data needs to be extracted from only a handful of documents. Continue reading “What is a PDF Parser? An introduction to PDF and Document Parsing”
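As a rough illustration of what a parser does for a batch of documents, the sketch below applies the same extraction rules to the text of several documents. It assumes the text has already been pulled out of each PDF by some other tool. The sample strings, field names, and regular expressions are invented for illustration and are not taken from any particular product.

```python
import re

# Hypothetical parsing rules: field name -> regular expression with one capture group.
# The \b in the "total" rule keeps it from matching inside the word "Subtotal".
RULES = {
    "invoice_number": re.compile(r"Invoice\s*(?:No\.?|#)\s*:?\s*(\S+)", re.IGNORECASE),
    "total": re.compile(r"\bTotal\s*:?\s*\$?\s*([\d,]+\.\d{2})", re.IGNORECASE),
}


def parse_document(text):
    """Apply every rule to one document's text; fields that are not found become None."""
    record = {}
    for field, pattern in RULES.items():
        match = pattern.search(text)
        record[field] = match.group(1) if match else None
    return record


def parse_batch(texts):
    """Run the same rules over a batch of documents."""
    return [parse_document(text) for text in texts]


# Stand-in for text that was already extracted from two PDFs (made-up data).
SAMPLE_TEXTS = [
    "ACME Corp\nInvoice No: A-1001\nTotal: $1,250.00",
    "ACME Corp\nInvoice # A-1002\nSubtotal: 80.00\nTotal: 88.00",
]

if __name__ == "__main__":
    for record in parse_batch(SAMPLE_TEXTS):
        print(record)
```

Rules like these only work when the documents share a layout, which is why a parser pays off for batches while copy and paste remains reasonable for a handful of files.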

Build Your Own Automated Purchase Order System With Docparser

Does your business receive tons of purchase orders and sales orders by e-mail, fax, telephone or snail mail? Whether you are dealing with just a handful of orders per week or hundreds each month, handling those orders manually is time-consuming and error-prone, to say the least. Even worse, purchase orders contain crucial data that your business just can’t afford to miss. Continue reading “Build Your Own Automated Purchase Order System With Docparser”

Email PDF’s to our Parsing Engine

Are you receiving PDF documents containing important data by email? Good news! With Docparser, it’s easy to extract data from PDF email attachments. If you have recurring PDFs with the same physical layout, you can simply email them to the Docparser app and get structured data back.

Once you have created and tested your PDF layout parser, you can upload additional PDFs with our email option. Simply select the layout parser you would like to send attachments to, then select “Settings” from the navigation, and you will see your layout parser “inbox”. Continue reading “Email PDF’s to our Parsing Engine”

OCR PDF Scanner

Optical Character Recognition (OCR) is a technology that allows you to extract data from scanned documents as text, which you can then edit, update, or aggregate with other tools for data analysis and a range of other uses.

Optical Character Recognition (OCR) is essentially the conversion of scanned images of text, be it typed, printed, or written by hand, into … well … text. Typically, you see OCR used to extract text from photos, passports, and scanned documents. OCR is often used for “digitizing” text so that it can later be edited, searched, aggregated for analysis, etc. Continue reading “OCR PDF Scanner”

Convert PDF to JSON – Turn PDF Documents Into Structured JSON Data Objects

Without a doubt, PDF has become the de facto exchange format for business documents. But PDF is “only” a replacement for paper, and businesses around the globe have a hard time accessing important data that is trapped inside their PDF documents. JSON, on the other hand, has probably become the most popular data exchange format for syncing data between two web applications.

That being said, wouldn’t it be great to be able to automatically convert PDF documents into JSON data objects? What if it were actually possible to leverage the data trapped inside PDF documents to automate business processes?

This post will show you how you can do exactly that with Docparser. Docparser allows you to convert PDF to JSON data which can then be used to automate your document-based workflows. Continue reading “Convert PDF to JSON – Turn PDF Documents Into Structured JSON Data Objects”
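To make the idea of a JSON data object concrete, here is a minimal sketch of how fields pulled from an invoice PDF could be represented and serialized. The function name, field names, and values are made up for illustration and do not reflect Docparser's actual output format.

```python
import json


def invoice_to_json(invoice_number, date, line_items):
    """Serialize already-extracted invoice fields to a JSON string.

    line_items is a list of (description, quantity, unit_price) tuples.
    The structure is a hypothetical example, not a real product schema.
    """
    items = [
        {"description": description, "quantity": quantity, "unit_price": unit_price}
        for description, quantity, unit_price in line_items
    ]
    # Derive the total from the line items rather than trusting a separate field.
    total = round(sum(quantity * unit_price for _, quantity, unit_price in line_items), 2)
    document = {
        "invoice_number": invoice_number,
        "date": date,
        "line_items": items,
        "total": total,
    }
    return json.dumps(document, indent=2)


if __name__ == "__main__":
    # Made-up values standing in for data extracted from a PDF.
    print(invoice_to_json("A-1001", "2024-01-15", [("Widget", 2, 10.5), ("Shipping", 1, 4.0)]))
```

Once the data is in this shape, another web application can consume it directly, for example through a webhook or an API call.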

An Introduction to Portable Document Format (PDF)

What is a PDF, and how can you turn all the information in it into structured data that you can download or send to hundreds of other platforms? That is what we at Docparser are here to help you with. There are piles of articles out there on PDFs, but they tend to be either extremely technical or leave us wanting a little more. So we figured we would take a stab at our own interpretation. Let’s go back to the basics. Continue reading “An Introduction to Portable Document Format (PDF)”