What is the best PHP pdf table parser class?: Extract data from a table in a multipage PDF


What is the best PHP pdf table parser class? #pdf table parser


by Francesco Facco de Lagarda - 5 years ago (2019-12-03)

Extract data from a table in a multipage PDF

I need to extract data from rows and columns of a table from a PDF file.

The PDF document contains a 5 column table. I need to extract the data from it.

None of the libraries I have tried could recognize the table structure, so they cannot accurately extract the data contained in the individual cells.

2 Clarification requests
2. by Marco van Oostende - 5 years ago (2020-02-18)
It depends very much on the quality of the PDF. It is not uncommon for cell content to be scattered around the table, or for the extracted text to be gibberish. I would suggest simply copying the text you want from that table to the clipboard and pasting it into Notepad or any other plain-text tool. This should give you an indication of what is actually possible: if you can find a structure in that text, the package linked in the other reply may work. Chances are it won't, however.
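
If the pasted text does show a regular structure, a rough check like the one below can tell you how many lines really split into the 5 columns mentioned in the request. This is only a sketch under assumptions: the function name `splitPastedTable` is made up for illustration, and it assumes cells are separated by a tab or by two or more spaces, which text copied from a PDF does not guarantee.

```php
<?php
// Hypothetical helper (not part of any package mentioned here):
// split pasted table text into rows of cells.
// Assumption: cells are separated by a tab or by two or more spaces.
function splitPastedTable(string $text, int $expectedColumns = 5): array
{
    $rows = [];
    $rejected = [];
    foreach (preg_split('/\R/', $text) as $line) {
        $line = trim($line);
        if ($line === '') {
            continue; // skip blank lines between rows or pages
        }
        // A tab (with any surrounding blanks) or a run of 2+ spaces ends a cell.
        $cells = preg_split('/[ \t]*\t[ \t]*| {2,}/', $line);
        if (count($cells) === $expectedColumns) {
            $rows[] = $cells;
        } else {
            $rejected[] = $line; // keep for manual inspection
        }
    }
    return ['rows' => $rows, 'rejected' => $rejected];
}

// Made-up sample: two table rows and one page footer line.
$sample = "A1  B1  C1  D1  E1\nA2\tB2\tC2\tD2\tE2\nPage 2 of 7\n";
$result = splitPastedTable($sample);
printf(
    "%d rows parsed, %d lines rejected\n",
    count($result['rows']),
    count($result['rejected'])
);
```

If most lines end up in `rejected`, the PDF probably does not keep the cells in a recoverable order, and a text-based parser is unlikely to help.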

1. by Manuel Lemos - 5 years ago (2020-02-18)
Parsing and extracting data from PDF documents is not an easy task because of the complexity of that kind of document.

There is this PDF document parser, but I am not sure whether it handles tables in PDF documents well. Can you please try it and let us know if it works well for you?

https://www.phpclasses.org/package/9732-PHP-Extract-text-contents-from-PDF-files.html
