Understanding Ranking to Better Administer the Managed Search Query tool
Answer ID 12409   |   Last Review Date 09/15/2022

In Managed Search Query (MSQ) how should questions be tuned?

Environment:

  • Oracle B2C Service
  • Sites with Knowledge Advanced only, all versions
  • Search Tuning
  • Manage Search Query

Issue:

When searches are executed, there is a content retrieval process that can be tuned in the Manage Search Query (MSQ) tool.  Together with the following reports:
  • Questions with Low Score Answers
  • Recent Questions
  • Submitting a Question for Tuning
  • Words without Concepts
the searches can be tuned to bring back more relevant results and raise the click-thru rate.  It is important to understand the scoring process first.
 
Scoring Process:
  1. The question words are spell-corrected.  You can see which spell corrections are made in the MSQ tool for each word in the question.  When you hover over a word, the system shows whether the word was spell-corrected.  Note:  There is no backend spell correction of content words.
  2. The words are classified as skip words, stemmed words, concepts, and synonyms.  When you hover over a word you can see how the word is available for scoring.  It could be a stem in the content, an exact match to a concept, or a synonym of a concept.  These are all scored differently.  These words are compared to the words stored in the excerpts in the index.
  3. A set of candidate content is built according to the facets passed in the search: first the user's access levels, then any products, categories, or other facets.
  4. The content is then compared to the tokenized question and scored.  You can see what these matches might be in MSQ.  Click on each word and it will give you example excerpt matches.
  5. All scores are added together to make one score for the content.  When there is a gap in scoring and scores drop dramatically, the rest of the content is not shown.
These are the matching criteria, listed from highest score to lowest:
a. Exact token match to content with concept synonym
--- The user uses the exact word in the content, and that word is a concept.
b. Synonym match (concept annotation)
--- The user uses a synonym of a word in the content.
c. Stem match (concept annotation)
--- The user uses a stem of a word in the content, and that word is also in a concept.
d. Stem match (no concept annotation)
--- The user uses a stem of a word in the content, and that word is not in a concept.
e. Token match (no concept annotation)
--- The user uses the same word as the content, and that word is not in a concept.

The scoring also considers word proximity, closeness to the top, frequency in the content, and title match.
 
  6. Intents trigger answers to be moved to the top.

  7. There is a second scoring pass for the popularity of the content.  Popularity is determined by click-thrus, which are used to populate "Popular Answers" and to tie-break the ranking of search results.  When content is created or updated in Authoring with its display position set to "fix@top", its click-thru count is raised above the highest existing click-thru count.  Over time, a process adjusts these counts: if the content is not clicked and viewed, natural decay will eventually bring its click-thru count to zero.  Answers with a fix-to-top score are brought to the top first.  The click-thru score is then used to break ties, placing those answers ahead of otherwise equivalent answers.  Note:  Where content scores are the same, the content can appear in random order from search to search.
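
The core of the scoring process above (match types, summed scores, the score-gap cutoff, and the click-thru tie-break) can be illustrated with a minimal sketch.  This is not the product's implementation.  The article gives only the order of the match types, so the weight values, the gap_ratio threshold, and all names below (MATCH_WEIGHTS, Answer, content_score, rank) are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical weights: only the ORDER (highest first) comes from the
# article; the numeric values are invented for this sketch.
MATCH_WEIGHTS = {
    "exact_concept": 5.0,    # a. exact token match, word is a concept
    "synonym_concept": 4.0,  # b. synonym match (concept annotation)
    "stem_concept": 3.0,     # c. stem match (concept annotation)
    "stem_plain": 2.0,       # d. stem match (no concept annotation)
    "token_plain": 1.0,      # e. token match (no concept annotation)
}


@dataclass
class Answer:
    answer_id: int
    # One match type per question word that matched this content.
    matches: List[str] = field(default_factory=list)
    click_thrus: int = 0


def content_score(answer: Answer) -> float:
    """Add the per-word match scores into one score for the content."""
    return sum(MATCH_WEIGHTS[m] for m in answer.matches)


def rank(answers: List[Answer], gap_ratio: float = 0.5) -> List[Answer]:
    """Order by score, break ties by click-thrus, cut at a large score gap.

    gap_ratio is an assumed threshold: content is dropped once its score
    falls below gap_ratio times the score of the previous result.
    """
    ordered = sorted(
        answers, key=lambda a: (-content_score(a), -a.click_thrus)
    )
    shown: List[Answer] = []
    previous = None
    for answer in ordered:
        score = content_score(answer)
        if previous is not None and score < previous * gap_ratio:
            break  # scores dropped dramatically; hide the rest
        shown.append(answer)
        previous = score
    return shown


answers = [
    Answer(1, ["exact_concept", "stem_plain"], click_thrus=0),
    Answer(2, ["synonym_concept", "stem_concept"], click_thrus=10),
    Answer(3, ["token_plain"], click_thrus=50),
]
print([a.answer_id for a in rank(answers)])
```

In this example answers 1 and 2 have the same content score, so the click-thru count decides their order, and answer 3 is hidden by the score gap despite its high click-thru count.  The sketch does not model intents, proximity, title match, or fix@top handling.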
 
Note:
Two counts are maintained for content usage: click-thru and browse.  The two reports that show these are:
  • Articles Browsed
  • Search Click-thrus of Articles
Any time content is viewed from the browse tab, from the latest or most popular documents lists, or by typing in a docid or answerid directly, a view is recorded for the content in the article browse metrics.  The second count is the search result click-thru, which is maintained as the click-thru count.
 
This is an important distinction because agents and users who know the articles well and use article IDs to retrieve them contribute to the articles browsed count and not to the click-thru count.
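
The difference between the two counts can be sketched as follows.  This is an illustration only: the UsageCounts class and the source labels ("browse_tab", "latest_list", "popular_list", "direct_id", "search_result") are hypothetical names, not identifiers from the product.

```python
from collections import Counter

# Hypothetical labels for the ways content can be opened outside search.
BROWSE_SOURCES = {"browse_tab", "latest_list", "popular_list", "direct_id"}


class UsageCounts:
    """Keeps the two usage counts described above as separate tallies."""

    def __init__(self):
        self.browsed = Counter()      # feeds "Articles Browsed"
        self.click_thrus = Counter()  # feeds "Search Click-thrus of Articles"

    def record_view(self, answer_id, source):
        if source == "search_result":
            # Only clicks on a search result affect popularity ranking.
            self.click_thrus[answer_id] += 1
        elif source in BROWSE_SOURCES:
            self.browsed[answer_id] += 1
        else:
            raise ValueError(f"unknown view source: {source}")


usage = UsageCounts()
usage.record_view(12409, "direct_id")      # agent opens the article by ID
usage.record_view(12409, "direct_id")
usage.record_view(12409, "search_result")  # user clicks a search result
print(usage.browsed[12409], usage.click_thrus[12409])
```

Here the article has been opened three times, but only one of those views counts toward the click-thru count used for popularity.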
 
Resolution:
Out of the box (OOTB), the concepts shipped with the product have been tuned over many years of usage.  The balance in the search engine is very good.  If you add too many concepts or change the ranking of these concepts too much, for example by creating too many high-ranking concepts, the balance in the scoring can be thrown off.
 
Tuning search is meant to add your unique business terms to the concept base so that common terms do not outrank your business terms.  Note:  Product terms in your product hierarchy are added automatically.  These may or may not need additional synonyms added to them.
 
When a new product is added to the product hierarchy, a product concept is automatically created during the next content indexing run.  The auto-product concept generation task takes into account the frequency of the product name in the knowledge base.  Product names that occur often in the content receive a low concept score.  Product names that are rare receive a high concept score.  Most product names receive a medium score.  In MSQ, if a question contains a product name, the annotations also show the product concepts.  An MSQ user can select the annotation and change the product concept score and synonyms.  Product concepts follow the same match criteria as OOTB concepts and other custom concepts.
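
The frequency rule for automatically generated product concepts can be sketched as below.  The article does not state how frequency is measured or where the cutoffs lie, so the function name product_concept_score, the use of document share as the measure, and the common and rare thresholds are all assumptions for illustration.

```python
def product_concept_score(docs_with_name, total_docs, common=0.30, rare=0.02):
    """Map how often a product name occurs to a low/medium/high score.

    common and rare are assumed cutoffs on the share of documents that
    contain the product name; the real task may use different measures.
    """
    if total_docs <= 0:
        raise ValueError("total_docs must be positive")
    share = docs_with_name / total_docs
    if share >= common:
        return "low"     # name occurs a lot, so it discriminates poorly
    if share <= rare:
        return "high"    # rare name, so a match is a strong signal
    return "medium"      # most product names land here


print(product_concept_score(400, 1000))
print(product_concept_score(100, 1000))
print(product_concept_score(5, 1000))
```

The inverse relationship is the point of the sketch: the more content mentions a product name, the less a match on that name should move a result up.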
 
There are only a few high-ranking concepts OOTB, so ranking a concept as high gives those terms a lot more weight.
 
Some questions to ask when tuning concept scores:
  • Why does a certain term need to become more important in search?
  • Is there too much content with that term?  If so, perhaps the term should rank lower.
  • Are the questions about that term vague or too general?
  • Does the customer want some content to appear higher in the search results than other content, even though both contain the terms used?
  • Is it because of the launch of a new campaign (for a product)?

 
Consider creating an intent if the questions point to a very specific purpose or piece of content.  Concepts should not be tuned for a single outcome; that is the purpose of an intent.