Solved: Load PDF pages as Strings into a HANA Table

Former Member · 10-13-2015

Hi SCN community,

I created a table in HANA that should be filled with the content of a locally stored PDF file.

The table has two columns, PAGE NUMBER and CONTENT.

Each row should represent one page of the PDF.
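A minimal sketch of that layout, assuming the page texts have already been extracted by some means. The table name PDF_PAGES, the column name PAGE_NUMBER (written with an underscore), and both helper functions are hypothetical, and an in-memory SQLite database stands in for HANA purely so the example runs on its own:

```python
import sqlite3


def pages_to_rows(pages):
    """Pair each page's text with its 1-based page number."""
    return [(number, text) for number, text in enumerate(pages, start=1)]


def load_rows(conn, rows):
    """Create the two-column table if it is missing and insert one row per page."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS PDF_PAGES ("
        "PAGE_NUMBER INTEGER PRIMARY KEY, CONTENT TEXT)"
    )
    conn.executemany(
        "INSERT INTO PDF_PAGES (PAGE_NUMBER, CONTENT) VALUES (?, ?)", rows
    )
    conn.commit()


# Placeholder page texts; real input would come from a PDF extractor.
pages = ["Text of page one", "Text of page two"]

# SQLite is only a stand-in here (assumption); it is not the HANA client.
conn = sqlite3.connect(":memory:")
load_rows(conn, pages_to_rows(pages))

for row in conn.execute(
    "SELECT PAGE_NUMBER, CONTENT FROM PDF_PAGES ORDER BY PAGE_NUMBER"
):
    print(row)
```

The sketch only illustrates the one-row-per-page shape; it does not address the extraction problem itself.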

I tried the extraction step with an external Python script, but the content could not be extracted from many of the PDF files.

The reason might be the variety of PDF types and versions.

My questions are:

How can I extract this information from a PDF file and load it into my HANA table without using external tools such as Python?

Is there a file upload or extraction tool already integrated into HANA?

Thanks in advance!

Sebastian

Bojan-lv-85 · 10-13-2015

Hi Sebastian,

What about the File Adapter feature of EIM:

Chapter "6.5 File"

I am not sure whether it can be configured to take page numbers into account.

BR, Bojan
