TREX finds only folders and not documents

saurabh_vakil
Active Contributor
0 Kudos

We have TREX installed and configured on our system. Indexes have also been created. When I try to search for folders, it shows the results, but when I try to search using a document name, it does not show the specific document.

For example, I have a folder named ABC in which I have a word document named xyz.doc.

When I type ABC in the search field, it displays the folder. But when I enter xyz.doc in the search field it shows no matches found.

What could be the problem?

Accepted Solutions (1)

frank_friedrich
Contributor
0 Kudos

Hi Saurabh,

Here are the things that should be considered:

- create an index that indexes everything (documents and folders)

- it is not necessary to create a taxonomy

- it is not necessary to define permissions on the index

- wait until all documents and folders are indexed

- the result set shows only those documents for which you have at least read permission

Are you using "Search from here"?

In that case you must also define an index on this folder structure, otherwise you get no result set.

The easiest way to check your index is to use the search field in the tool area iView. Use "*" as the search term. Alternatively, you can use the standalone tool "TREX Admin" to check the content of your index.

The next question is: What type of repository do you use?

Is the TREX preprocessor able to fetch the documents?

Is the indexing monitor green?

Best Regards

Frank

saurabh_vakil
Active Contributor
0 Kudos

Hi Frank.

Many thanks for your reply.

For my index, Items to Index is set to All (documents and folders). Thanks for clarifying the point about Search from Here. Yes, TREX monitoring shows a green indicator, and so does the status in index administration. When I search using *.doc, I see only a few Word documents and not all of the ones I know are present in that folder. Why might this be happening?

Regards,

Saurabh.

Edited by: Saurabh Vakil on Sep 26, 2008 11:06 AM

former_member188556
Active Contributor
0 Kudos

So now some documents are returned in the results?

Was that not the case earlier?

Answers (8)

peressin_giorgio
Explorer
0 Kudos

I installed EP7/TREX SP10 two years ago. Everything worked fine until I applied Windows SP2 and some security patches. My machine runs Windows 2003 x64.

Now EP7 works fine, but TREX is not able to reindex documents > 10 KB.

Trex preprocessor trace says:

Preprocessor.cpp(03550) : HANDLE: DISPATCH - Processing Document with key '/documents/Segreterie/Documenti PCTP/Doc. in Arrivo/Anno 2009/2009_03123.pdf' failed, returning PREPROCESSOR_ACTIVITY_ERROR (Code 6401)

In the portal security log file I found the corresponding error:

System/Security/Authentication#sap.com/irj#com.sap.engine.services.security.authentication.logincontext#Guest#2####764602d0407311dea83600188b77747b#SAPEngine_Application_Thread[impl:3]_17##0#0#Info#1#com.sap.engine.services.security.authentication.logincontext#Plain###LOGIN.FAILED

User: N/A

Authentication Stack: ticket

It seems to me that the portal is not able to pass the user name to TREX (User: N/A), so TREX is not authorized to retrieve the documents.

I tried changing the user of the indexmanager service ("index_service") and setting the alternative host in the URL Generator Service. Nothing changed.

Any suggestion?

Giorgio Peressin

saurabh_vakil
Active Contributor
0 Kudos

Hi Søgaard.

I am asking this just for my own understanding. Previously I configured TREX 7 on EP 7 and did the indexing, and it worked perfectly. Now I have configured TREX 7 on EP 6.4 SP17, and this is where I am facing problems indexing the documents. Could it be that TREX 7 doesn't work on EP 6?

Regards,

Saurabh.

Former Member
0 Kudos

Hi Saurabh

TREX 7.0 should work on EP 6.0 without any problems, see: Note 1010800 - TREX Java Client Compatibility for Java-Based Applications.

But there might be some configurations of security zones, rights of system user "index_service", etc. that are different between the two systems.

Best regards,

Martin Søgaard

saurabh_vakil
Active Contributor
0 Kudos

Hi Søgaard.

I found the following in the preprocessor trace file :

2008-09-24 22:09:16.556 e SERVER_TRACE TRexApiTextMining.cpp(07042) : calling preprocessor for 'http://entegsr02:50300/irj/servlet/prt/portal/prtroot/com.sap.km.cm.docs/01-Business%20Process/02-Corporate%20Procedures/08-Accounts/04-R/01-Weekly%20Review%20Meetings/04-weekly%20summary%20sheet%20as%20on%2022.05.2007.xls' failed rc=6401

All files show the same error 6401.
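To see which documents fail and whether they really all share one return code, the trace can be scanned with a short script. This is only a sketch: the regular expression is modeled on the single trace line quoted above, `failed_documents` and `summarize` are hypothetical helper names, and other trace formats are not covered.

```python
import re
from collections import Counter

# Pattern modeled on the trace line quoted in this post (an assumption:
# other TREX versions may format the line differently).
FAILED_CALL = re.compile(
    r"calling preprocessor for '(?P<url>[^']+)' failed rc=(?P<rc>\d+)"
)


def failed_documents(trace_lines):
    """Yield (url, return_code) for every failed preprocessor call."""
    for line in trace_lines:
        match = FAILED_CALL.search(line)
        if match:
            yield match.group("url"), int(match.group("rc"))


def summarize(trace_lines):
    """Count failed documents per return code."""
    return Counter(rc for _, rc in failed_documents(trace_lines))
```

Passing an open trace file (or a list of its lines) to `summarize` gives the number of failures per return code, and `failed_documents` gives the URLs that can then be tested one by one.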

When I tried to access the above document by entering the url in the browser, I was able to view the xls document.

What else can be the problem?

Regards,

Saurabh.

Former Member
0 Kudos

Hi Saurabh

Unfortunately I don't know what rc=6401 stands for, so if that is the only error you find in the logs, I suggest you log an OSS message with SAP.

You still have the same problem... your preprocessor cannot access the documents in some way, but is able to process the smaller documents that are sent to it. You "just" need to find the cause...

Best regards,

Martin Søgaard

saurabh_vakil
Active Contributor
0 Kudos

We have installed TREX 7 and configured it on EP6.4 SP17.

Then I created an index where I selected the Service as Trex Search and Classification and Items to Index as All.

Then I assigned the /documents folder as the data source and clicked on reindex.

After a while, under Indexing Monitor, it shows 5552 under Indexed and 7182 under Errors. When I see the details -> View -> Index information for all folders under /documents, it shows

Status of resource OK: index operation.

But for all the documents for which indexing has failed (documents greater than 10 kb in size), it shows Status of resource No information available.

I have entered the correct value in the Host field of the URL Generator Service. But still indexing fails for documents which are greater than 10 kb. What could have gone wrong now?

Regards,

Saurabh.

Former Member
0 Kudos

Hi Saurabh

Your problem can be caused by a bunch of different settings, so you need to do further investigation in order for us to help more.

I still suggest the same:

1) Take a look at the preprocessor logs, 2) find one of the documents that fails and note the error message, 3) try entering the URL that TREX is trying to reach into a local browser on the TREX server, and 4) if you get a "portal error" in the browser, check what is written to the default trace file in the portal.
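If no browser is installed on the TREX server, the URL test can also be scripted. The following is only a sketch, assuming Python 3 is available on that machine; `check_url` is a hypothetical helper, and the URL should be copied from the preprocessor trace. It sends no portal credentials, so an HTTP 401/403 or an HTML logon page in place of the document would suggest an access problem rather than a wrong host name.

```python
import urllib.error
import urllib.request


def check_url(url, timeout=10):
    """Fetch url and return (status, detail).

    status is the HTTP status code, or None if no HTTP response was
    received at all (DNS failure, connection refused, timeout).
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            # For a successful fetch, report the content type: text/html
            # for an .xls or .pdf document may indicate a logon page.
            return response.status, response.headers.get("Content-Type", "")
    except urllib.error.HTTPError as error:
        # The server answered, but with an error status such as 401 or 404.
        return error.code, str(error.reason)
    except (urllib.error.URLError, OSError) as error:
        # The server could not be reached at all.
        return None, str(error)
```

A result of `(None, ...)` means the host or port in the URL Generator Service cannot be reached from the TREX server, which is a different problem from a missing permission.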

Best regards,

Martin Søgaard

frank_friedrich
Contributor
0 Kudos

Hi,

There are many documents with the status "preparation failed".

The most common reason for this is that the preprocessor is not able to fetch the documents.

What kind of repository do you use?

In URL Generator Service what portal URL have you entered (field Host)?

Have a closer look at the stand alone application TREXAdmin and see the trace file for the preprocessor. There is an error message.

Best regards

Frank

saurabh_vakil
Active Contributor
0 Kudos

The value for the host field in URL Generator Service is http://xyzsr02.myxyz.com:50300

Should it be http://xyzsr02:50300 instead?

And if the host value is changed, is it required to restart the J2EE engine?

Regards,

Saurabh.

Edited by: Saurabh Vakil on Sep 30, 2008 1:31 PM

Edited by: Saurabh Vakil on Sep 30, 2008 1:39 PM

Former Member
0 Kudos

Hi Saurabh Vakil

I believe Frank is correct: the preprocessor cannot access your documents. You can search your folders because they are smaller than 10 KB and are therefore sent to the preprocessor. For documents larger than 10 KB, the preprocessor receives a URI and prepends the settings from the URL Generator Service as a prefix. You run into trouble somewhere in this process: either the preprocessor tries to reach an incorrect URL, or it has no access to the files.
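To make the prefix idea concrete, here is a rough sketch of how such a URL could be put together. It is an illustration only, not TREX's actual code: the path segment is copied from the trace line quoted earlier in this thread, and `document_url` and its parameters are hypothetical names.

```python
from urllib.parse import quote

# Path segment as it appears in the trace line quoted earlier in this thread.
KM_DOCS_PATH = "/irj/servlet/prt/portal/prtroot/com.sap.km.cm.docs"


def document_url(host_prefix, resource_path):
    """Join the host prefix and a document path into a fetchable URL.

    host_prefix stands for the Host (or Alternative Host) value of the
    URL Generator Service; resource_path is the document path with
    unencoded characters such as spaces.
    """
    # quote() keeps "/" and percent-encodes spaces as %20.
    return host_prefix.rstrip("/") + KM_DOCS_PATH + quote(resource_path)
```

If the host prefix is wrong or not resolvable from the TREX server, every URL built this way fails, which would match the picture of all larger documents failing while folders are indexed.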

You should use the Alternative Host URL in order to make TREX work, see http://help.sap.com/saphelp_nw70/helpdata/EN/7d/236cfa17034a37a439dc392ec59eb0/frameset.htm. I think you should use the fully qualified domain name in the Alternative Host URL.

Like Frank, I suggest you 1) take a look at the preprocessor logs, 2) find one of the documents that fails and note the error message, 3) try entering the URL that TREX is trying to reach into a local browser on the TREX server, and 4) if you get a "portal error" in the browser, check what is written to the default trace file in the portal.

Best regards,

Martin Søgaard

frank_friedrich
Contributor
0 Kudos

Hi Saurabh,

In the portal, go to System Administration > Monitoring > KM > TREX Monitor > Display Queues.

Have a closer look at all your queues.

In the column OK you will find the number of documents that have been indexed since the last reset of your queue.

Are there other columns, for example "... failed", that are not equal to 0?

Best Regards

Frank

saurabh_vakil
Active Contributor
0 Kudos

In the Index Monitoring, it shows number of documents indexed as 5551 and preparation failed for 7182 documents. I also checked for the appropriate host name in System Administration -> System Configuration -> Knowledge Management ->Content Management -> Global Service (advanced) -> URL Generator Service. Why is it not able to index all the documents in the source folder? What could be the problem?

Regards,

Saurabh.

Edited by: Saurabh Vakil on Sep 26, 2008 2:08 PM

saurabh_vakil
Active Contributor
0 Kudos

When we create an index, there are three properties: Data Source, Taxonomies and Permissions.

I have given appropriate values for Data Source and Permissions. Is it mandatory to define a taxonomy?

Regards,

Saurabh.

former_member188556
Active Contributor
0 Kudos

Hi,

Creation of taxonomy is not required for searching the files.

Is the behavior the same for all portal searches, or only for a custom search?

Please do a check:

Go to your folder -> Details -> View -> Index Information

Do the same for the files in the same location.

Another thing, on the search page do a right click and get the view source code.

Find the term "SelectedItems" and check its value
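If the page source is long, a small script can pull out the values for you. This is just a sketch, assuming you saved the source as text and that the parameter appears in the form `SelectedItems=<value>`; `selected_items_values` is a hypothetical helper name.

```python
import re
from collections import Counter

# Assumes the parameter appears as SelectedItems=<value> in the page source.
SELECTED_ITEMS = re.compile(r"SelectedItems=([A-Za-z0-9_]+)")


def selected_items_values(page_source):
    """Count each distinct SelectedItems value found in the page source."""
    return Counter(SELECTED_ITEMS.findall(page_source))
```

Calling it with the saved source text returns each value together with how often it occurs.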

Regards

BP

saurabh_vakil
Active Contributor
0 Kudos

Hi Bobu,

The behaviour is same for all portal search. When I viewed the source of the search page, I found the term SelectedItems=ALL appearing numerous times.

I navigated to the folder which I have given as the datasource for my index, clicked View -> Index Details.

It displays following Index Information:

Index name KM

Service types Classification, Search

State of Resources OK:index operation

Suppose I enter a term "abcd" in the search field and press Search, it displays some folders with the String abcd in its name. If I select a folder "abcd1", there are 2 word documents inside it (say one.doc and two.doc). If I click on the Search from Here option beside the folder name and try to search for the file two.doc, it displays the message "No matches have been found".

Why am I not able to search for .doc documents???

Regards,

Saurabh.

former_member188556
Active Contributor
0 Kudos

Hi

I navigated to the folder which I have given as the datasource for my index, clicked View -> Index Details.

It displays following Index Information:

Index name KM

Service types Classification, Search

State of Resources OK:index operation

I would like to see the same information for the files.

Also

Go to Content Management -> Global Services -> Crawler Parameters

Standard

Check whether you have any resource filters set for scope or result.

If so, please post the details you see for the filters.

Regards

BP

saurabh_vakil
Active Contributor
0 Kudos

Checked Content Management -> Global Services -> Crawler Parameters.

For both Resource Filter(Scope) and Resource Filter(Result) it shows Not set.

Regards,

Saurabh.

former_member188556
Active Contributor
0 Kudos

Hi,

In the TREX configuration, have you set "Items to Index" to folders only?

Check this [Image|https://www.sdn.sap.com/irj/sdn/wiki?path=/display/sandbox/KMForumthreadID%3D1062014]. It should contain both folders and documents.

Regards

BP

saurabh_vakil
Active Contributor
0 Kudos

The "Items to Index" option for my index is set to All. Even then, I am not able to search for .doc, .xls, .ppt etc files. I can only search for folders.

Can someone please tell me if I am missing something??

Regards,

Saurabh.