
Understanding Knowledge Advanced Content Processing

Answer ID 10101 | Last Review Date 06/27/2022

How can I better understand Knowledge Advanced Content Processing on my site?

Environment:

Knowledge Advanced
Oracle B2C Service 19B release and later

Resolution:

Knowledge Advanced (KA) content processing jobs are listed at the bottom of the Collection Setup page. These jobs crawl and index KB content, web content, and external content so that it is available in Search on the agent desktop, the Browser UI, and the CP web portal. Content processing includes several different job types.

For a full description of each job type, see the Content Processing documentation.

The incremental crawl is actually two jobs: crawling and indexing. For incremental content changes to become visible to users in search, an indexing job must process the content after its crawl job completes. The incremental jobs are optimized to run in parallel every 15 minutes, which allows more frequent updates to KB articles. In general, an article update becomes available in search 15 to 30 minutes after the change.
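The 15 to 30 minute window can be illustrated with a small calculation. This is only a simplified sketch, not a description of how the product schedules its jobs: it assumes the crawl picks up an edit at the first 15-minute tick at or after the edit, and that the matching index job finishes one interval later. The function names are hypothetical.

```python
INTERVAL_MIN = 15  # from the article: incremental jobs run every 15 minutes


def visible_at(edit_minute: int, interval: int = INTERVAL_MIN) -> int:
    """Minute at which an edit becomes searchable under the simplified model."""
    # Assumption: the crawl picks up the edit at the first tick at or after it.
    crawl_tick = ((edit_minute + interval - 1) // interval) * interval
    # Assumption: the index job for that crawl completes one interval later.
    return crawl_tick + interval


def delay(edit_minute: int, interval: int = INTERVAL_MIN) -> int:
    """Minutes between the edit and its appearance in search."""
    return visible_at(edit_minute, interval) - edit_minute


if __name__ == "__main__":
    for minute in (0, 1, 7, 14, 15):
        print(f"edit at minute {minute:2d} -> searchable after {delay(minute)} min")
```

Under these assumptions, an edit saved just after a tick waits the longest, and an edit saved right at a tick waits the shortest.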

The full content processing jobs are also scheduled, and they can be run on demand.

The Queue button adds an extra run of any job type. It is intended for on-demand jobs, and it can also be used to run a daily or weekly job an additional time when needed. Queuing the incremental jobs is not helpful: they are started by a utility that runs every 15 minutes, so queuing them will not make them run any sooner.

If a full crawl is queued, it takes the place of the next incremental crawl. It is best to run full crawls after authoring has slowed down for the day, when the incremental crawls are no longer needed.

When a job is already queued, you do not need to queue another job of the same type.
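The queuing behavior described above can be pictured with a toy model. It is a sketch under stated assumptions, not Oracle's implementation: the class and method names are hypothetical, and modeling the queue as a set is an assumption used only to show why a second request of the same type adds nothing.

```python
class CrawlScheduler:
    """Toy model of the utility that wakes up every 15 minutes (hypothetical)."""

    def __init__(self) -> None:
        # Assumption: at most one pending request per job type.
        self.queued: set[str] = set()

    def queue(self, job_type: str) -> bool:
        """Queue a job; return False if one of that type was already pending."""
        already_pending = job_type in self.queued
        self.queued.add(job_type)
        return not already_pending

    def tick(self) -> str:
        """Called once per 15-minute interval; returns the crawl that runs."""
        if "full" in self.queued:
            # A queued full crawl takes the place of this incremental crawl.
            self.queued.discard("full")
            return "full"
        # Queuing "incremental" changes nothing: it runs on every tick anyway.
        self.queued.discard("incremental")
        return "incremental"


if __name__ == "__main__":
    scheduler = CrawlScheduler()
    scheduler.queue("full")
    scheduler.queue("full")  # redundant: one is already pending
    print([scheduler.tick() for _ in range(3)])
```

In this model the full crawl still waits for the next tick, and only one interval's incremental crawl is displaced.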

For more information on the reasons to run full processing, see Answer ID 9945: Understanding full vs incremental content processing.

Notes:

The web crawl runs only once a week unless it is run manually from the Collection Setup page. The utility still runs every 15 minutes to check whether a manual update has been queued. The web crawl is scheduled only once a week because it is not an incremental crawl: there is no reliable way to determine whether a web page has been updated or changed, so the web crawl is always a full crawl. A website can also require a large amount of crawling, depending on how it is set up in the collection definition (recursively, with N-level depth). Finally, websites tend to be more 'static' in nature.
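To see why the N-level depth setting affects the amount of crawling, consider a depth-limited traversal of a small link graph. This is an illustrative sketch only: the `SITE` graph and the function name are made up, and the real crawler's traversal rules are not documented here.

```python
from collections import deque

# Hypothetical site: each page maps to the pages it links to.
SITE = {
    "/": ["/a", "/b"],
    "/a": ["/a/1", "/a/2"],
    "/b": ["/b/1"],
    "/a/1": ["/a/1/x"],
}


def pages_within_depth(links: dict, start: str, max_depth: int) -> set:
    """Return every page reachable from start in at most max_depth link hops."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        page, depth = frontier.popleft()
        if depth == max_depth:
            continue  # do not follow links beyond the configured depth
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                frontier.append((target, depth + 1))
    return seen


if __name__ == "__main__":
    for depth in range(4):
        print(depth, len(pages_within_depth(SITE, "/", depth)))
```

Each additional level can add many pages, and because the web crawl is always a full crawl, every one of them is fetched again on each run.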
