on 05-03-2006 3:08 PM
Hi,
I'm new to the TREX search engine. I want to know what a crawler's tasks are, and how and when it performs them after a data source is assigned to a newly created index. In other words, how are crawlers related to search indexes?
Regards
Nitin Mishra
Hi,
The crawler service allows crawlers to collect resources located in internal or external repositories, for example, for indexing purposes. A crawler returns the resources and the hierarchical or net-like structures of the respective repositories.
Services and applications that need repositories to be crawled (for example, the index management service) request a crawler from the crawler service.
http://help.sap.com/saphelp_nw2004s/helpdata/en/fb/38ef207d0a47ee9dc08deeed855392/frameset.htm
Patricio.
Hi Nitin,
Let me add an important architectural fact:
Architecturally, the Crawler Service is part of KM. Thus, its Java processes run on your Portal Server.
Architecturally, the Crawler Service is <b>not</b> part of the TREX engine. The TREX engine merely receives the lists of objects to be indexed from the Crawler Service of KM via KM's Index Management Service (IMS).
Regards, Karsten
hi,
The crawler only searches Web sites, or parts of Web sites, that are not protected by robot instructions. Robot instructions are part of the Internet standards; they allow Web site owners to permit or forbid the crawling of their sites or parts thereof.
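As an aside, the robot instructions mentioned above are typically published in a site's robots.txt file. The following sketch (not part of TREX or KM, just an illustration of the robots exclusion standard) shows how a crawler can honor such rules using Python's standard-library parser; the paths and rules are made up for the example:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content a site owner might publish.
rules = [
    "User-agent: *",
    "Disallow: /private/",
]

rp = RobotFileParser()
rp.parse(rules)

# A well-behaved crawler checks each URL before fetching it.
print(rp.can_fetch("*", "http://example.com/private/report.html"))  # False
print(rp.can_fetch("*", "http://example.com/public/index.html"))    # True
```

In practice a crawler would call `rp.set_url(...)` and `rp.read()` to download the live robots.txt instead of parsing a hard-coded list.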
Depending on the type of repository, you may have to set up a crawler and a schedule.
For Web repositories, the index is updated using a crawler. If you are assigning a Web repository to an index for the first time, it is indexed immediately. You then need to regularly schedule the crawler so that the index is updated.
For hierarchical repositories, the index is updated by events, so it is not strictly necessary to start the crawler at regular intervals. However, you can schedule the crawler regularly to pick up changes for which no event is triggered. This can happen when documents are created, changed, or deleted directly in the file system without going through Knowledge Management.
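The periodic crawl described above is essentially a reconciliation pass: compare what the index believes with what the repository actually contains. This generic sketch (not SAP code; the file-system walk and the in-memory index are assumptions for illustration) shows why such a pass catches changes that bypassed the event mechanism:

```python
import os


def crawl(root):
    """Walk a file-system repository and return {path: mtime} for every document."""
    snapshot = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            snapshot[path] = os.path.getmtime(path)
    return snapshot


def reconcile(index, snapshot):
    """Return documents that changed outside the event mechanism.

    `index` is the indexer's last known view ({path: mtime}); `snapshot`
    is what the crawl just observed.
    """
    created = [p for p in snapshot if p not in index]
    changed = [p for p in snapshot if p in index and snapshot[p] != index[p]]
    deleted = [p for p in index if p not in snapshot]
    return created, changed, deleted
```

Event-driven updates keep `index` current for changes made through the application; the scheduled crawl exists solely so `reconcile` can spot files touched directly on disk.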
Regards,
Ganesh N