Typo Resistance Analyzer
Humans are not machines; they make mistakes, including when typing a search query: a neighboring key pressed by accident ("quety" instead of "query"), a missed or doubled character ("qury" or "queery"), or a word typed 'by ear' without knowing the correct spelling ("yandax" instead of "yandex").
In this case, the search engine can adhere to one of the following strategies:
1) no processing: search with exact spelling only
2) recognize the typo, but still search for the entered query with an additional hint: "perhaps you were looking for [correct spelling]?"
3) recognize the typo and search for the correct spelling immediately
Depending on the chosen strategy, the user either remains unaware of the mistake, or notices it and may make an extra click (the choice is theirs), or gets the correct results without ever noticing the mistake.
This analyzer compares the search results for a correctly spelled query with the results for several of its possible mistyped forms, and evaluates how similar the results for each mistyped form are to those for the correct query.
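As an illustration of what "several forms of possible mistypings" might look like, here is a minimal Python sketch that generates the three mechanical typo types mentioned earlier: a neighboring key, a doubled character, and a missed character. The function names and the truncated NEIGHBOURS keyboard map are hypothetical and are not taken from the analyzer itself; 'by ear' misspellings such as "yandax" are not covered.

```python
# Hypothetical, deliberately truncated QWERTY neighbor map (an assumption
# for illustration only; a real map would cover the whole keyboard).
NEIGHBOURS = {
    "q": "wa", "u": "yi", "e": "wr", "r": "et", "y": "tu",
}

def doubled(word, i):
    """Repeat the character at position i ("query" -> "queery" for i=2)."""
    return word[:i] + word[i] + word[i:]

def dropped(word, i):
    """Omit the character at position i ("query" -> "qury" for i=2)."""
    return word[:i] + word[i + 1:]

def adjacent(word, i, neighbours=NEIGHBOURS):
    """Replace the character at position i with each neighboring key."""
    return [word[:i] + n + word[i + 1:] for n in neighbours.get(word[i], "")]

def typo_variants(word):
    """Collect all single-position mistypings of word, sorted."""
    variants = set()
    for i in range(len(word)):
        variants.add(doubled(word, i))
        variants.add(dropped(word, i))
        variants.update(adjacent(word, i))
    variants.discard(word)  # never report the correct spelling as a typo
    return sorted(variants)
```

With this map, typo_variants("query") includes "quety", "qury" and "queery" among its variants.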
Apart from deliberate typo correction, matches can arise in four cases:
1) accidentally
2) the page contains both the correct and mistyped spelling
3) incorrect reaction of the engine's morphology (e.g., the unknown word "mushroomz", a typo of "mushrooms", is corrected to "mushroom")
4) promotion of the same websites for both the correct and the incorrect spelling of a query
All of these cases produce noise in this analyzer: the results match even though the engine did not actually correct the typo.
The similarity is evaluated in the same way as for the update analyzer but with a different set of queries.
The more matching results are registered, the higher the search engine's index for this analyzer. This index determines the order of search engines in the analyzer's informer.
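The relationship between matching results, the index, and the ordering of engines can be sketched as follows. This is only an assumed scoring scheme: the text does not give the actual similarity formula (it is the one used by the update analyzer), so the sketch simply takes the share of the correct query's results that also appear for the mistyped query and averages it per engine. All function names and the sample data are hypothetical.

```python
def overlap_share(correct_results, typo_results):
    """Share of the correct query's results that also appear for the typo."""
    if not correct_results:
        return 0.0
    typo_set = set(typo_results)
    matched = sum(1 for url in correct_results if url in typo_set)
    return matched / len(correct_results)

def engine_index(pairs):
    """Average overlap, as a percentage, over (correct, typo) result pairs."""
    if not pairs:
        return 0.0
    return 100.0 * sum(overlap_share(c, t) for c, t in pairs) / len(pairs)

def rank_engines(results_by_engine):
    """Order engines by index, highest first (assumed informer order)."""
    indices = {name: engine_index(pairs)
               for name, pairs in results_by_engine.items()}
    return sorted(indices.items(), key=lambda item: item[1], reverse=True)

# Made-up sample data: one (correct, typo) pair of result lists per engine.
sample = {
    "engine_a": [(["a.example", "b.example"], ["a.example", "x.example"])],
    "engine_b": [(["a.example", "b.example"], ["a.example", "b.example"])],
}
ranking = rank_engines(sample)
```

Here engine_b, whose results for the typo match the correct query completely, is placed ahead of engine_a, which matches on only half of the results.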
In the future, the query sets with typos will be rotated, drawn from a larger pool of queries.
- 90−100%
- 80−90%
- 60−80%
- 40−60%
- 20−40%
- 0−20%