on 08-22-2016 2:43 PM
Hi,
I have some PDF files that contain text and images. Each of these files carries a reference number in the form (Ref: 00.00.00001).
I need to extract this reference number from each of the PDF files in a directory into a column of a HANA table.
So far, I have uploaded the PDF files to HANA using a Python script. The full content of the PDF files goes into a single BLOB column of a HANA table.
Now I need to search within this BLOB column (which holds the PDF file content), extract the reference number, and put it in another column.
I am not sure how to do this. Can you please guide me on how this can be done?
Is it possible to do this in HANA with a text mining or text analysis technique, or in any other way? I am new to text mining and text analysis in HANA.
Regards,
Amandeep Singh
Amandeep,
I have not done this myself, but I took one of the openSAP courses on text mining. Here is the official documentation:
SAP HANA Advanced Data Processing – SAP Help Portal Page
For another reference, please check out the openSAP course on text analytics.
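If text analysis in HANA turns out to be more than you need, another option might be to pick out the reference number in your Python script before the upload and write it to its own column. Below is a minimal sketch of only the matching step. It assumes the PDF text has already been extracted to a plain string (that step needs a PDF library and is not shown), and it assumes every reference follows the two-digit, two-digit, five-digit pattern of your single example. The names `REF_PATTERN` and `extract_ref` are made up for illustration.

```python
import re
from typing import Optional

# Assumed format, inferred from the single example "Ref: 00.00.00001":
# two digits, two digits, five digits, separated by dots.
REF_PATTERN = re.compile(r"Ref:\s*(\d{2}\.\d{2}\.\d{5})")


def extract_ref(text: str) -> Optional[str]:
    """Return the first reference number found in text, or None if absent."""
    match = REF_PATTERN.search(text)
    return match.group(1) if match else None


# Hypothetical text, standing in for what a PDF library would return.
sample = "Some data and images (Ref: 00.00.00001) more content"
print(extract_ref(sample))
```

If your real reference numbers vary in length, the digit counts in the pattern would need to be loosened, for example `\d+` in place of `\d{5}`.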