TREX doesn't index some PDF documents ...

zbynek_kabrt3
Participant
0 Kudos

Hello all,

We have installed EP'04s and TREX version 7.00.42.00. It works well except for one thing: TREX is not able to index some PDF documents. Most of them are indexed correctly, but some are not. More precisely, the problematic PDF documents themselves are indexed, but their content is not. In the search results, the message "No document excerpt available" is displayed for these documents.

I found SAP Note 622419, which could be related to this, but I don't know how to check:

1. Which encoding was used in a particular PDF document.

2. Which fonts or font types were used in a particular PDF document.

Do you have any idea how to find out this information about a document? Or do you know where the problem could be when TREX is able to index the content of only some PDF documents?

Regards,

Zbynek

Accepted Solutions (0)

Answers (1)

detlev_beutner
Active Contributor
0 Kudos

Hi Zbynek,

Open the PDF in question (in Acrobat Reader or in Acrobat itself) and go to "File" - "Properties" - "Fonts" (I have translated the menu names from German, so "Properties" may be labeled slightly differently). There you'll see the fonts used within the document, along with their type and encoding.
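If many documents have to be checked, the same font information can be approximated with a script. The following is only a minimal sketch using the Python standard library: it scans the raw bytes of a PDF for font dictionaries and reports their subtype and encoding. The function name `list_fonts` and the sample data are made up for illustration, and the approach is an assumption with clear limits: it only sees font dictionaries stored as uncompressed objects, so PDFs that pack their objects into compressed object streams will show no fonts here even though the "Fonts" tab lists them.

```python
import re

# Matches one indirect object: "<num> <gen> obj ... endobj"
OBJ_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj(.*?)endobj", re.S)
# "/Type /Font" but not "/Type /FontDescriptor"
FONT_TYPE_RE = re.compile(rb"/Type\s*/Font\b")
SUBTYPE_RE = re.compile(rb"/Subtype\s*/(\w+)")
BASEFONT_RE = re.compile(rb"/BaseFont\s*/([^\s/<>\[\]()]+)")
# Encoding is either a name (/WinAnsiEncoding) or a reference (5 0 R)
ENCODING_RE = re.compile(rb"/Encoding\s*(/[^\s/<>\[\]()]+|\d+\s+\d+\s+R)")


def _field(pattern, body):
    """Return the first captured group as text, or None if absent."""
    found = pattern.search(body)
    return found.group(1).decode("latin-1") if found else None


def list_fonts(pdf_bytes):
    """List font dictionaries found in uncompressed PDF objects."""
    fonts = []
    for match in OBJ_RE.finditer(pdf_bytes):
        body = match.group(3)
        if not FONT_TYPE_RE.search(body):
            continue
        fonts.append({
            "object": int(match.group(1)),
            "subtype": _field(SUBTYPE_RE, body),
            "basefont": _field(BASEFONT_RE, body),
            "encoding": _field(ENCODING_RE, body),
        })
    return fonts


# Hypothetical fragment of a PDF body, used only to demonstrate the function.
SAMPLE = b"""1 0 obj
<< /Type /Font /Subtype /TrueType /BaseFont /ABCDEF+Arial /Encoding /WinAnsiEncoding >>
endobj
2 0 obj
<< /Type /FontDescriptor /FontName /ABCDEF+Arial >>
endobj
3 0 obj
<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica /Encoding 5 0 R >>
endobj
"""

for font in list_fonts(SAMPLE):
    print(font)
```

For a real file, the bytes would be read with `open(path, "rb").read()` and passed to `list_fonts`. An encoding shown as an object reference such as `5 0 R` points to a separate encoding dictionary rather than one of the predefined named encodings.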

Hope it helps

Detlev

zbynek_kabrt3
Participant
0 Kudos

Thank you, that was what I needed.

So now I know that the test PDF document uses only TrueType fonts. That means its content should be indexable by TREX, but it isn't. The search results just show the message "No document excerpt available" for this document.

Could someone look at the document and try to index it? It can be downloaded from http://www.volny.cz/kabrtz/TREX/indexing_test.pdf

Regards,

Zbynek

detlev_beutner
Active Contributor
0 Kudos

Hi Zbynek,

I have downloaded the PDF and got the same result as you. I saw that the encoding of the fonts is user-defined; maybe that's the problem. Anyhow, the SAP Note is quite generic, and in the end it does not say much more than: better use Distiller / PDFWriter (that is, original Acrobat products) than any other third-party product. Maybe that's the way to go for you?
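To spot such documents before they reach the index, one could flag fonts whose encoding is not one of the predefined PDF encoding names and that carry no /ToUnicode map, since text extraction then has no standard way to map character codes back to Unicode. The sketch below is a heuristic under that assumption, not a description of what TREX actually checks; the function name `suspicious_fonts` and the sample bytes are hypothetical, and it only inspects uncompressed font dictionaries.

```python
import re

OBJ_RE = re.compile(rb"\d+\s+\d+\s+obj(.*?)endobj", re.S)
FONT_RE = re.compile(rb"/Type\s*/Font\b")  # excludes /FontDescriptor
NAME_RE = re.compile(rb"/BaseFont\s*/([^\s/<>\[\]()]+)")
NAMED_ENCODING_RE = re.compile(rb"/Encoding\s*/([^\s/<>\[\]()]+)")

# Predefined encoding names for simple fonts in the PDF specification.
PREDEFINED = {
    b"StandardEncoding",
    b"MacRomanEncoding",
    b"WinAnsiEncoding",
    b"MacExpertEncoding",
}


def suspicious_fonts(pdf_bytes):
    """Return base font names with a non-predefined encoding and no /ToUnicode."""
    flagged = []
    for match in OBJ_RE.finditer(pdf_bytes):
        body = match.group(1)
        if not FONT_RE.search(body):
            continue
        named = NAMED_ENCODING_RE.search(body)
        custom_encoding = b"/Encoding" in body and (
            named is None or named.group(1) not in PREDEFINED
        )
        if custom_encoding and b"/ToUnicode" not in body:
            name = NAME_RE.search(body)
            flagged.append(name.group(1).decode("latin-1") if name else "(unnamed)")
    return flagged


# Hypothetical PDF fragment: three fonts, only the first should be flagged.
SAMPLE = b"""1 0 obj
<< /Type /Font /Subtype /TrueType /BaseFont /AAAAAA+Custom /Encoding 7 0 R >>
endobj
2 0 obj
<< /Type /Font /Subtype /TrueType /BaseFont /Arial /Encoding /WinAnsiEncoding >>
endobj
3 0 obj
<< /Type /Font /Subtype /TrueType /BaseFont /BBBBBB+Mapped /Encoding 7 0 R /ToUnicode 9 0 R >>
endobj
7 0 obj
<< /Type /Encoding /Differences [ 32 /space 65 /A ] >>
endobj
"""

print(suspicious_fonts(SAMPLE))
```

A flagged font does not prove the document cannot be indexed; it only marks candidates worth regenerating with Distiller or PDFWriter, as the SAP Note recommends. Composite fonts that use `Identity-H` without a /ToUnicode map are flagged by this heuristic as well.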

Hope it helps

Detlev