The evolution of enterprise search technology
Iain Fletcher, VP marketing at Search Technologies, shares his expertise on enterprise search technology. In the previous article, a fundamental three-step process of searching was defined as follows:
- Create and submit a search clue
- The search engine then matches that against available documents and returns results
- Browse the results to find the desired information
There have been three key areas of innovation in enterprise search technology over the past 20 years. These align conveniently with our three-step search process and are as follows:
- Automated query enhancement
- Relevancy ranking
- Results browsing and navigation aids
This article will explore the first of these:
Automated Query Enhancement
Years ago, searchers worked hard at formulating their queries using a combination of Boolean syntax and fielded (metadata-based) search operators.
Today, with few exceptions, searchers use relatively simple search clues: for example, more than two-thirds of Google searches contain three words or fewer. Search behaviour is much the same for enterprise search applications.
A first principle of searching is that better (more detailed) search clues produce better results. This holds whatever search technology has been installed. Using a range of techniques, technology can assist by improving the search clue. Some of these techniques are hidden from the user; others require user interaction.
The most commonly used query enhancement methods are:
- Stemming and lemmatisation, to map the different forms of a word together (for example, the singular and plural forms of a noun), so that a search for one form matches documents containing any form of the word
- Semantic expansion, using a thesaurus or other resource, to automatically add synonyms to the search clue. In a banking application, for example, a search clue FSA might be expanded to become FSA OR “financial services authority”. Note that searches submitted to enterprise search systems tend to be heavy with industry and company jargon, so most search systems will not provide the full vocabulary to support semantic expansion out-of-the-box
- Spell checking, typically deployed via a ‘did you mean’ link that gives the searcher an interactive option to change the spelling. Again, industry-specific jargon used in searches may not work perfectly out-of-the-box with a spell-checking algorithm
- Query auto-completion, as used extensively by Web search engines such as Google. Based on the first few letters of a search request, this function tries to predict what the user is going to type and provides an option to complete the query with a single click. Although this is typically viewed as a time-saving device, it is also a way to nudge the searcher into creating a better (that is, longer and more specific) search clue than the user might otherwise have intended
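As a rough illustration of the methods listed above, the following Python sketch combines a very naive plural-stripping stemmer, thesaurus-based expansion and prefix-based auto-completion. The names `SYNONYMS`, `naive_stem`, `expand_query` and `complete` are hypothetical, the one-entry thesaurus is invented for the banking example, and the Boolean output syntax is an assumption rather than that of any particular search product.

```python
import re

# Hypothetical thesaurus for a banking application. A real deployment would
# load industry and company vocabulary from a maintained resource.
SYNONYMS = {
    "fsa": ["financial services authority"],
}


def naive_stem(word: str) -> str:
    """Very rough plural stripping, standing in for a real stemmer."""
    if word.endswith("ies") and len(word) > 4:
        return word[:-3] + "y"
    if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        return word[:-1]
    return word


def expand_query(clue: str) -> str:
    """Turn a short search clue into a Boolean query with synonyms added."""
    terms = []
    for word in re.findall(r"\w+", clue.lower()):
        alternatives = [naive_stem(word)]
        # Add each known synonym as a quoted phrase.
        for phrase in SYNONYMS.get(word, []):
            alternatives.append(f'"{phrase}"')
        if len(alternatives) > 1:
            terms.append("(" + " OR ".join(alternatives) + ")")
        else:
            terms.append(alternatives[0])
    return " AND ".join(terms)


def complete(prefix: str, past_queries: list[str], limit: int = 3) -> list[str]:
    """Suggest earlier queries that start with the letters typed so far."""
    prefix = prefix.lower()
    return [q for q in past_queries if q.startswith(prefix)][:limit]


print(expand_query("FSA rules"))
# (fsa OR "financial services authority") AND rule
```

Production systems would normally rely on the stemmer, thesaurus support and suggestion features built into the search engine rather than hand-written rules like these. The sketch only shows how a short search clue can grow into a richer query before it is matched against documents.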
A common denominator throughout this series of articles is that many of the key innovations used by modern search engines will not operate perfectly out of the box. In the case of query enhancement, for example, techniques may require additional support from industry-specific vocabularies. All of the leading search products provide capabilities for these to be added to the system.
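The effect of a missing industry vocabulary can be sketched with a toy ‘did you mean’ function built on Python's standard `difflib` module. The word lists, the `did_you_mean` helper and the 0.8 similarity cutoff are all illustrative assumptions, not the behaviour of any particular search product.

```python
import difflib

# Hypothetical word lists: a general dictionary and some banking jargon.
GENERAL_VOCABULARY = ["mortgage", "interest", "account", "regulation"]
INDUSTRY_VOCABULARY = ["libor", "isa", "remortgage"]


def did_you_mean(word, vocabulary):
    """Return a suggested spelling, or None if no suggestion is offered."""
    word = word.lower()
    if word in vocabulary:
        return None  # Known word: nothing to suggest.
    matches = difflib.get_close_matches(word, vocabulary, n=1, cutoff=0.8)
    return matches[0] if matches else None


# A genuine typo is corrected from the general dictionary alone.
print(did_you_mean("morgage", GENERAL_VOCABULARY))  # mortgage

# Valid jargon is wrongly "corrected" when the dictionary does not know it.
print(did_you_mean("remortgage", GENERAL_VOCABULARY))  # mortgage

# Adding the industry vocabulary removes the misleading suggestion.
print(did_you_mean("remortgage", GENERAL_VOCABULARY + INDUSTRY_VOCABULARY))  # None
```

In this sketch the second call shows the out-of-the-box problem: without the jargon list, a correctly spelled industry term triggers an unhelpful suggestion.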
Even though today’s searchers generally create short, simple search clues, in most enterprise search systems the search clue that arrives at the server is somewhat more sophisticated. The benefit is that the relevancy ranking algorithms have more material to work with.
The next article will focus on relevancy ranking.
This was posted in Bdaily's Members' News section by Iain Fletcher.