Integral Search Quality
Assessors Analyzer
To keep search quality high, search engine developers must regularly monitor and assess their output. Moreover, the assessment has to be made not by robots but by trained professionals. The analyzer presented here is Runet's first independent project for search engine assessment.
The algorithm is as follows: the search engines under assessment are given an identical set of queries; their output is collected; each query is then randomly directed to an assessor, whose task is to infer what the searcher's intention could be and which answers would matter to them.
Since the raison d'être of this analyzer is its complete objectivity, the person assessing a specific webpage knows neither the search engine that returned the page nor the page's position in the output. The only thing known to the assessor is the corresponding query.
The "Assessor's Guide" - a set of strict instructions concerning the assessment method - ensures even greater objectivity.
The parameters assessed are the website's relevance to the query and its quality (the latter is not assessed in the case of an obligatory page). The relevance grade is the definitive one, but it can be lowered by the quality grade if two or more assessors agree that the quality of the site lies below a standardized threshold. Once the page is evaluated, its grade is multiplied by a coefficient depending on its position in the output (the higher the webpage is placed, the higher the coefficient). Thus we obtain a summarized grade for the whole output page. An additional evaluation parameter for navigational queries is the presence or absence of an obligatory page: its absence results in penalty points.
Once all the output pages corresponding to all the queries have been evaluated, the analyzer calculates the average search quality grade for a given search engine. This is the number you can see in our informer.
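As an illustration, the grading scheme described above can be sketched as follows. The positional coefficients and the quality penalty factor are assumptions: the text only says that higher positions carry higher coefficients and that a confirmed low quality grade lowers the relevance grade.

```python
# Hypothetical sketch of the per-page grading scheme described above.
# Coefficient and penalty values are assumptions, not the project's real ones.

def page_score(relevance, position, quality_low=False):
    """Relevance grade, optionally lowered for poor quality,
    weighted by a position-dependent coefficient."""
    grade = relevance
    if quality_low:                   # two or more assessors flagged low quality
        grade *= 0.5                  # assumed penalty factor
    coeff = (11 - position) / 10      # assumed: position 1 -> 1.0, position 10 -> 0.1
    return grade * coeff

def serp_score(graded_pages):
    """Summarized grade for one output page: sum of weighted page grades."""
    return sum(page_score(r, p, q) for r, p, q in graded_pages)

# One query's top-3 output as (relevance, position, low_quality) tuples.
print(serp_score([(1.0, 1, False), (0.8, 2, False), (0.6, 3, True)]))  # 1.96
```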
Overall Search Quality
This rating helps to assess the overall search quality of each search engine. It is based on the results of all the special analyzers, excluding the 'Clicks' and 'Updates' analyzers, since their results have a purely informational value.
The overall rating is calculated as follows:
1) For each special analyzer, the search engines' scores are normalized to 100 with respect to the best score. This is done in order to eliminate the differences between the scoring scales of various analyzers.
2) For each analyzer, the normalized search engine scores are multiplied by the specific coefficient assigned to that analyzer. These coefficients reflect our understanding of how important the given feature / type of search is for the overall search quality. If in your opinion some search features have more or less merit for the overall search, please feel free to adjust the coefficients of the corresponding analyzers by moving the sliders on top of this page. Once you adjust the weights, the overall search quality scores will be recalculated.
3) Thereafter the numbers obtained are added up and divided by the sum of the coefficients. This operation yields a number in the range between 1 and 100 that represents the overall search quality of the search engine.
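The three steps above can be sketched as follows (the engine names, scores, and weights are purely illustrative):

```python
# Sketch of the three-step overall-quality calculation described above.

def overall_quality(raw_scores, weights):
    """raw_scores: {analyzer: {engine: score}}; weights: {analyzer: coefficient}."""
    totals = {}
    for analyzer, scores in raw_scores.items():
        best = max(scores.values())
        for engine, score in scores.items():
            normalized = score / best * 100            # step 1: best engine -> 100
            weighted = normalized * weights[analyzer]  # step 2: apply coefficient
            totals[engine] = totals.get(engine, 0) + weighted
    weight_sum = sum(weights.values())
    return {e: t / weight_sum for e, t in totals.items()}  # step 3: weighted average

scores = {"assessors": {"A": 80, "B": 60}, "navigation": {"A": 90, "B": 90}}
weights = {"assessors": 2.0, "navigation": 1.0}
print(overall_quality(scores, weights))
```

Moving a slider simply changes one coefficient in `weights`, after which the whole dictionary is recomputed.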
Click Analyzer
This analyzer shows what percentage of clicks leading to Russian web pages comes from each search engine. Unlike the other analyzers, this one does not directly assess search quality; rather, it reflects the popularity and usage of the search engines. The analyzer uses data from LiveInternet.ru: we only take into account clicks on sites that have a LiveInternet.ru counter installed.
Out of all the data of the LiveInternet counter, we only take into account the data on Russian users (Russian IP addresses). This is done to filter out the noise produced by the so-called "idiot clicks", i.e. random clicks of non-Russian-speaking users of "big" search engines such as Google, MSN Live Search, and Yahoo. These are not really Russian search engine users, but they can significantly distort the statistics (since the Internet outside Russia is vast, and the number of such random users is high).
The numbers cited in this analyzer are usually considered the shares of the search engines' market, but this is not quite correct. Here is why:
a) The LiveInternet counter only shows clicks on the sites where it is installed. Some big websites do not install it. Thus the statistics are not, strictly speaking, representative of the whole Russian Internet.
b) It is unclear how exactly the percentage of clicks from a search engine correlates with its true popularity. What if, using a "bad" search engine, the user has to click on multiple search results before (s)he finds the right site, while using a "good" one (s)he finds what (s)he needs at the first click? The "bad" search engine would in this case generate many clicks per user, while the "good" one would generate only one. In general, the exact connection between popularity and clicks is unknown. A large change in the percentage of clicks (say, 5 points or more) would probably reflect a real change in the traffic of a search engine. Smaller fluctuations (1-2%) are probably less informative.
It is important to keep in mind that these figures represent percentages, not absolute traffic or the absolute number of clicks. Thus the small dips clearly visible on the monthly graph of Yandex are mirrored by small increases on the part of Google. The traffic of Yandex decreases on weekends while that of Google suffers less (the reason is unknown to us). Since the share of Yandex is high, its decrease results in a proportional growth of the share of Google on weekends (the sum of all search engines' shares remains constant). For Rambler, the weekend decrease is just as pronounced as it is for Yandex, so its share does not rise the way that of Google does.
In the informer of this analyzer, the search engines are arranged in the descending order of the share of clicks.
Updates Analyzer
‘Update’ refers to the process of search results renewal. When the results are updated, some sites may make it to the top 10, while others may "sink". Every search engine has its own update style, which this analyzer makes visible. Every day the update analyzer monitors the top ten responses to 140 queries in order to assess how many sites changed their positions and by how much.
Let Di be the change in position for the page that appeared i-th in the top 10 search results on day 1. For example, if the fifth page from the first day's top 10 appeared third or seventh on the second day, then D5 = 2. If the second day's top 10 did not contain a certain page which was present on the first day, we assume Di = 10 for that page.
The update indicator is calculated using the formula:
Σi=1..10 Di / 100 (the sum of the ten position changes, divided by 100)
Consider a couple of examples:
Example 1
On Day 1, a certain query has the following Top 10:
C1, C2, C3, C4, C5, C6, C7, C8, C9, C10.
On Day 2, the same query has this Top 10:
Cn, C1, C2, C3, C4, C5, C6, C7, C8, C9.
In this case the update indicator is calculated as follows:
((2-1) + (3-2) + ... + (10-9) + 10) / 100 = 19/100 = 0.19 (19%)
Example 2
For Day 1, a certain query has the following Top 10:
C1, C2, C3, C4, C5, C6, C7, C8, C9, C10.
For Day 2, the same query has this Top 10:
Cn1, Cn2, Cn3, Cn4, Cn5, Cn6, Cn7, Cn8, Cn9, Cn10.
In this case the update indicator equals:
10*10/100 = 1.00 (100%)
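The update indicator, including both worked examples above, can be computed directly from two top-10 lists:

```python
# The update indicator described above, computed from two top-10 lists.

def update_indicator(day1, day2):
    """Sum of position changes D_i over the day-1 results, divided by 100.
    A page missing from day 2 contributes D_i = 10."""
    total = 0
    for i, page in enumerate(day1, start=1):
        if page in day2:
            total += abs(i - (day2.index(page) + 1))  # |old position - new position|
        else:
            total += 10                               # page vanished from the top 10
    return total / 100

day1 = [f"C{i}" for i in range(1, 11)]
day2 = ["Cn"] + day1[:9]                 # Example 1: one new page pushes the rest down
print(update_indicator(day1, day2))      # 0.19
full_swap = [f"Cn{i}" for i in range(1, 11)]
print(update_indicator(day1, full_swap))  # 1.0
```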
The analyzer also calculates additional parameters: the number of sites which disappeared from the search results and the number of sites which changed their positions.
This analyzer carries no value judgment. The results can be interpreted in two ways: a search engine with frequent large updates could be considered more up-to-date, while a search engine with rare updates can be considered more stable and predictable. The informer of this analyzer sorts the search engines in ascending order of update level.
Commercial Updates Analyzer
Search results for commercial queries are subject to rather drastic changes. This is partly due to market dynamics, which are all the more lively when many small players are involved. The other reason is SEO, often used by companies to influence search output. That is why it is particularly interesting to take a look at the output updates for such queries.
The analyzer was created after Yandex had announced its decision to stop attributing any ranking value to links for commercial queries in its Moscow regional results. It looked tempting to investigate the consequences that such a decision would have for commercial output. For the sake of comparison, the same set of queries is presented to the search engines from several other locations in Russia. Moreover, the results can at any moment be compared to those of the basic Updates Analyzer.
The formula used to calculate the Analyzer's values for each region is the same as in the Updates Analyzer. For each domain that is present in the SERP on both the current and the previous day, the sum of position changes is calculated and divided by the total number of elements in the SERP. The number of newly appearing domains is then added, and the Analyzer's index is this total divided by the number of distinct sites in the current SERP.
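One literal reading of the formula described above can be sketched as follows; the exact normalization steps are an assumption based on the wording, not a confirmed specification.

```python
# Hypothetical sketch of the commercial update index; the order of the
# two divisions is an assumption taken literally from the description.

def commercial_update_index(prev, curr):
    """prev / curr: lists of domains in the previous and current SERP."""
    # Sum of position changes for domains present on both days,
    # divided by the total number of SERP elements.
    common_change = sum(
        abs(i - curr.index(d)) for i, d in enumerate(prev) if d in curr
    ) / len(curr)
    new_domains = sum(1 for d in curr if d not in prev)  # newly appearing domains
    return (common_change + new_domains) / len(set(curr))

prev = ["a.ru", "b.ru", "c.ru"]
curr = ["b.ru", "a.ru", "d.ru"]
print(commercial_update_index(prev, curr))
```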
The TIC tab shows the average Thematic Citation Index, an index used by Yandex to determine the total weight of links leading to a given site. The PageRank tab in turn shows the average Google PageRank of all pages shown by each search engine.
Snippet Quality Analyzer
In web search terminology, a snippet is a unit that appears on a search engine results page, represents a particular web page, and consists of a title, a short fragment of the page text, and a link.
The quality of snippets is an important component of overall search quality. It is mainly by reading snippets that the user decides whether or not to open a specific webpage. Thus, if the snippet contains the gist of the page or information highly relevant to the query, it actually helps the user find what they need, while a badly formed or inappropriate snippet often confuses and irritates people.
The quality of snippets is evaluated by our assessors as a part of their search engine output assessment. But we have to bear in mind that the relevance and the quality of a page are not connected to the quality of its snippet. To make this point quite clear, we show both grades side-by-side on this analyzer's page. The evaluation is made according to a common set of instructions and on condition of the snippet's "anonymity" (i.e., the assessor does not know whose results he is assessing). Therefore we can rely on the objectivity of the analysis.
Not only should the content of the snippet correspond to that of the webpage; the quality of the text in the snippet is also taken into consideration. For instance, cut-off phrases or disconnected words, HTML code, or unintelligible lists from the page menu appearing in the snippet significantly harm its clarity and therefore earn the snippet a lower score. If a snippet consists of nothing more than a title, its score will not be too high either.
Every snippet is graded on a 5-point scale from 0.2 to 1. Then the average grade for all queries is calculated and presented in the informer.
Navigational Search Analyzers
The analyzers in this group estimate the search engines' navigational performance. Different kinds of queries are used to check whether or not the site or page in question is found on the first result page.
Navigational queries are those looking for a specific site, file or page. Such queries will usually consist of the name of some organization or business (e.g., "Punjab and Sind Bank" or "Moores Glassworks"), of some print source or web site (e.g., "Cooking Light" or "bash.org"), or they will just name the page needed (like "Bofinger Rue Sherbrooke Ouest Montréal"). Likewise, eminent bloggers or official site owners often become a target of navigational queries (think of Art Garfunkel or Jessica Gottlieb).
Evidently, a navigational query can have more than one meaning: a user searching for "alabama state university" or "avril lavigne" might be looking for independent information about the organization or person in question. Still, the official site must be present in the SERP, and its position must be high enough. Furthermore, the analyzers of this group allow switching between stricter (the official site takes first or close to first position) and looser (it is enough that the official site is among the top ten results) examination criteria.
Analyzer of Navigational Search
A search query with a purpose of finding a certain website is called a navigational query. Such queries include "sberbank", "komsomolskaya pravda", "rambler", "gazeta ru", etc.
The best result for a navigational query is the required site in the first position of search results.
To evaluate navigational search, the search engines were tested with 200 queries randomly selected from an array of navigational queries. Each query was assigned one or more marker sites. The top 10 search results are checked for these markers. When several markers were assigned to a query, any of them appearing among the top positions was considered a hit. The percentage of queries which yielded a marker on the first page was then calculated. This number is the aggregate indicator of navigational search quality.
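The aggregate indicator can be sketched as follows (the queries, domains, and marker sites are purely illustrative):

```python
# Sketch of the aggregate indicator: the share of queries whose
# marker site appears among the top-10 results.

def navigation_score(results, markers):
    """results: {query: [top-10 domains]}; markers: {query: {marker sites}}."""
    hits = sum(
        1 for query, top10 in results.items()
        if markers[query] & set(top10)   # any assigned marker in the top 10 is a hit
    )
    return hits / len(results) * 100     # percentage of queries with a hit

results = {
    "sberbank": ["sberbank.ru", "banki.ru"],
    "rambler": ["mail.ru", "lenta.ru"],
}
markers = {"sberbank": {"sberbank.ru"}, "rambler": {"rambler.ru"}}
print(navigation_score(results, markers))  # 50.0
```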
The best search engine is the one with highest aggregate indicator for this analyzer. In the informer, the search engines are sorted by the aggregate indicator.
Peripheral Navigation Analyzer
The peripheral navigation analyzer works in much the same way as the general navigation analyzer, i.e. the search engines are given a set of purely navigational queries (ones aiming at a specific web page) and then the presence of the marker's site among the search results is checked.
The only distinction between the two analyzers lies in the set of queries (and markers). Whereas the general navigation analyzer is concerned with nationwide businesses and organizations, this analyzer searches for smaller companies, mainly those operating only locally (hence the name). Since such organizations are smaller, their sites are naturally less popular and thus harder for the search engines to find, which explains why the results in this analyzer are usually lower than in the general navigation analyzer.
The result in the "any position" tab is the percentage of queries for which the marker was found on the first search results page. In the "first position" tab, each query result is scored as follows: 0 when the marker is not found; 0.1 when the marker is found in the 10th position; 0.2 for the ninth, and so on, up to 1 when it is found in the first position. These scores are then summed up to form the overall mark for this tab.
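The "first position" scoring rule above reduces to a one-line formula:

```python
# The "first position" tab scoring: 0 when the marker is absent,
# otherwise (11 - position) / 10, so position 1 scores 1.0 and position 10 scores 0.1.

def first_position_score(top10, marker):
    if marker not in top10:
        return 0.0
    position = top10.index(marker) + 1   # 1-based rank of the marker site
    return (11 - position) / 10

serp = ["x.ru", "marker.ru", "y.ru"]
print(first_position_score(serp, "marker.ru"))  # 0.9
print(first_position_score(serp, "absent.ru"))  # 0.0
```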
Analyzer of Internal Page Navigation
Users often make their intent quite clear - they mention not only the desired website, but also the page or the part of the website that they would like to access. Unfortunately, the search engines often fail to make use of the query's precision: the results contain only the main page or irrelevant internal pages.
The queries in this analyzer offer a precise description of the requested pages. The score of the search engine is the percentage of queries which returned the desired page in top 10 results. Neither the position of the target page nor the number of times it occurred in top 10 affects the result.
Analyzer of Regional Navigation
This analyzer is devoted to situations where the users are looking for a particular web page, but the relevant page varies depending on where the query is coming from. Navigational search in general is the search for a particular web page, accomplished by queries such as [sberbank] or [gazeta ru].
If the users are looking for a particular web page, they often want a page which is specific to their current location. For example, the users from both Kazan and Moscow expect to find some page from afisha.ru when they input the query [afisha]. However, the user from Kazan probably will not be happy to see the pages of afisha.ru specific to Moscow.
We test the search engines on 100 queries submitted from different locations within Russia. The sets of queries used in each city are slightly different depending on whether the relevant regional pages exist. For each query, the analyzer records if the relevant regional website (or the regional part of the general website) was found.
For each search engine we calculate the percentage of queries for which the relevant regional response appears in top 10 results.
Analyzer of Person Search
Queries consisting of a first and a last name often target a particular web page - the official personal web page of the person named in the query. Even if the user is not sure that a personal site exists, such a website is an obvious hit and should appear in the top 10 search results.
The queries for this analyzer include the names of celebrities as well as people who are well-known in some domain (for example, photographers, scientists, psychologists etc).
The analysis of queries and hits in this analyzer is analogous to the Analyzer of Navigational Search.
Analyzer of Blog Search
Many celebrities have blogs. In fact, many people became famous because of their blog. Popular blogs often have more readers than many mass media outlets.
A query containing a first and a last name is thus often targeting a blog (or a microblog, social network page etc) of the given person. The search engine results for these queries should contain the blogs.
This analyzer complements the Analyzer of Person Search. It assesses whether blogs appear in the results page for name queries. The search engine scores are calculated in the same way as in the Analyzer of Navigational Search, with one exception: we do not take into account the position of the target page in the search results, since official web pages are arguably more relevant responses than blogs for name queries.
The queries in this analyzer are generally in the first name + last name format. We specifically wanted to include people who got famous because of their blog.
Social Network Navigation Analyzer
The audience of social networks grows by the day, as does the time an average user spends there. That is why most businesses have found it convenient to set up their own page on one or more social networks. Nowadays, every respectable hairdresser or garage owner can be found on Facebook or a similar site. Moreover, the social network page often becomes the only official source of information about the company.
For many small businesses or business-like structures, this means that not being found on Facebook, VKontakte, Livejournal, etc. means not being found at all: there is simply no other place to look for them. This is how the queries in the Analyzer were chosen: we only used companies that have no website other than their social network page (and naturally, we took care to gather all such pages for each company). Thus, the Analyzer evaluates a web user's chances of finding official information about smaller organizations. At the same time it evaluates, along with the Blog Search Analyzer, the overall quality of search in the social network segment of the internet, a parameter that is becoming extremely significant nowadays.
The principle of the Analyzer's work is the same as with any other from the Navigation search group: for each query, we check the presence of the given page in the search results as well as its position.
Information Search Analyzers
The largest and least well-defined group of queries are those aimed at finding information in the broad sense of the word. Although an exhaustive survey of all such queries would seem impossible, some aspects of informational search come under close scrutiny in this group of analyzers.
Our analyzers cover Quote Search (Quotations, Catch Phrase and partly Originals) and Answer Search. It is very important that a search engine is able (and willing) to distinguish the original information source from its copies or imitations. This is the focus of the Originals analyzer.
Our plans include broadening the scope of the search aspects under investigation. Yet even now this scope is wider than it may seem, since a whole chain of analyzers in other groups is directly related to information search. Thus, it is usually informational queries that are affected by a search engine's "mistakes". Several of the Data Freshness analyzers also deal with informational queries. And naturally, informational queries make up most of the Assessing analyzers, as they make up most of web search in general.
Analyzer of Quotation Search Quality
Quotation search is the search for the source of a certain text fragment, i.e. either the original text (in which case a larger portion of it should appear on the site), or at least the author and the title of this text.
This analyzer examines 100 queries that consist of significantly long extracts from texts, published on the Web. For each search engine we calculate the percentage of the search results containing at least one of the following: a. the larger fragment of the original text or b. the name of the author and the title of the text.
The positions of the pages in the search results are not taken into consideration. Nor are the sites where the text in question was first published (unlike in the Original Texts analyzer, where the priority of the copyright holder is important).
Catch Phrase Analyzer
This analyzer is devoted to queries containing short popular quotations. These quotations often come from fiction, but they are also used in everyday life.
For example, a Russian user typing the query [контора пишет] is likely to be looking for the meaning or the original source (the title of the text and the name of the author) of the expression. The search engines, however, often provide multiple examples of use of the expression, which is hardly what the user is looking for.
The analyzer examines 100 queries that consist of a popular quotation, the source of which is known. For each search engine we calculate the percentage of the search results containing at least one of the following: a. the given fragment (or one of several fragments) of the text where the quotation comes from or b. the name of the author and the title of the text. The positions of the pages among the search results are not taken into consideration.
Analyzer of Question Answering
This analyzer assesses the ability of search engines to find answers to question queries. Question queries include obvious questions with a question word ([when did CSKA win UEFA cup], [where is Lhasa located]) as well as other queries that imply an obvious short answer ([currency of Brazil], [nitric acid formula]).
The users typing such queries are most likely looking for the answer to their question. The quicker they get the answer, the better. Ideally, the snippet of the first page in search results should include the answer.
This analyzer has several tabs. In each of the tabs the total score of a search engine is the sum of its scores for each query.
1. Answer position in top 10 snippets
Here a search engine gets a score from 0 to 1 for each query, reflecting the highest position among the top 10 results of a snippet that contained the answer. For example, if the answer was found in the first snippet, the search engine gets 1 for the query. If it was found in the second snippet, the score is 0.9, and so on. If the answer was not found in the snippets of the top 10 results, the score is 0.
2. Answer presence in snippets
Here a search engine gets 1 for a query if at least one of the top 10 snippets contained an answer to the question. 0 is assigned otherwise.
3. Position of answer web pages in top 10
A search engine gets a score from 0 to 1 for each query, reflecting the highest position among the top 10 results of a web page that contained the answer. The score is 1 if the first page found contained an answer, 0.9 if the answer was present on the second page, and 0 if none of the web pages found contained the answer.
4. Answer presence on web pages found
Here a search engine gets 1 for a query if at least one of the top 10 web pages contained an answer to the question. 0 is assigned otherwise.
A question query may have multiple possible answers. For example, the possible answers for [Where are Maldives located] are "Indian Ocean" and "Asia".
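The position-based scoring used in the first and third tabs can be sketched as follows (a minimal illustration; the function name and the naive substring matching of answers are our own assumptions, as the analyzer's real answer matching is likely more elaborate):

```python
def answer_position_score(snippets, answers):
    """Score a single query from 0 to 1 by the highest-ranked of the top 10
    snippets that contains an acceptable answer: position 1 scores 1.0,
    position 2 scores 0.9, ..., position 10 scores 0.1; no match scores 0."""
    for rank, snippet in enumerate(snippets[:10], start=1):
        if any(answer.lower() in snippet.lower() for answer in answers):
            return round(1.0 - 0.1 * (rank - 1), 1)
    return 0.0
```

The second and fourth tabs then reduce to checking whether this score is greater than zero.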
Analyzer of Original Texts Ranking
Unfortunately, copyrighted content is copied illegally all too often on the web. Any author faces the fact that his or her original texts are being stolen: the text of a newly created article can turn up on some other web page within days or even hours after the article has been published. The websites stealing content in this way usually claim that it was "taken from open sources" or "uploaded by one of the users". The stolen content can be used to attract search engine users to a web page, and the resulting traffic can be converted into money. This is the main reason for such 'borrowing'. The ability to identify the original texts and rank the corresponding web pages higher than the pages containing copied material is a crucial property of any search engine.
The analyzer of the ranking of original texts uses exact quotation queries to daily monitor the position of 100 marker articles. For these articles, the web sites of copyright holders are known. The analyzer can thus calculate the percentage of queries for which the original text is ranked higher than the copied material.
The queries in this analyzer are fragments of the original article. By default the queries are submitted to the search engines in quotation marks. This way we expect that the only responses found will be the original article and its copies. However, real users rarely rely on quotation marks, so an additional tab estimates the results based on queries submitted without quotation marks.
In the informer of this analyzer, the search engines are sorted by their ability to rank original texts higher than the copies.
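The percentage the analyzer reports can be sketched like this (a simplified model under our own assumptions: with quoted queries, only the original and its copies are expected in the results, so "original above all copies" reduces to the copyright holder's domain coming first):

```python
def original_first_share(queries):
    """queries: iterable of (ranked_domains, original_domain) pairs, one per
    marker article. A query counts as a success when the copyright holder's
    domain is ranked above every copy, i.e. comes first among the results.
    Returns the percentage of successful queries."""
    queries = list(queries)
    hits = sum(1 for ranked, original in queries
               if ranked and ranked[0] == original)
    return 100.0 * hits / len(queries)
```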
Analyzer of the Location Search Quality
One of the most frequent and most obvious uses of search engines is for mere geographical navigation. We often want to find out whether a certain organization, business, service etc. is located where it should be, from our point of view. Although the overall search quality for this type of queries is not bad at all, mistakes sometimes occur: the search engine finds the entity in question in some other city district or even in another city, or, conversely, it supplies us with similar addresses of other organizations, presuming it makes no difference what we do, provided we do it in the right place. This analyzer was made to compare the respective merits of search engines in dealing with such queries.
As in real life, our input queries consist of the organization's name plus an approximate locality, like a city district, a street or a nearby underground station. To make the results of the evaluation more precise, we only use queries where the object of search is one single entity. The ideal output in such a case would be a full list of contacts. But at this stage we judged that the mere presence of the correct address in the top snippet suffices for the SE to get the maximum grade. On the other hand, results containing some other useful information, but not the address, get nothing.
The results are calculated on the same principle as in the Navigational Analyzer: the higher the snippet with the correct address, the more points the search engine gets (the first snippet earns 1.0 points, the tenth 0.1). In addition, we calculate the ratio of correct answers found outside the snippets.
Transactional Search Analyzers
Every user who has ever tried to download anything from the Web, or to listen to or watch anything online, is familiar with the price we have to pay for obtaining free content: watching obtrusive ads, waiting for a download whose speed falls almost to zero, or even taking the risk of catching a computer virus from some suspicious site. To say nothing of the pangs of conscience - we all know that such content is put on the Web with little or no regard for copyright laws.
Fortunately, the amount of software, movies and music you can download, watch or listen to quite legally (usually on the websites of their creators) has grown significantly of late. The problem of finding a good source for this content therefore has an unequivocal solution. It is all the more important that the search engines know how to find these correct, official sources, instead of the "fun portals" and all the scam sites.
This analyzer is the first in the intended series of analyzers measuring the search quality for transactional queries, i.e. for queries with a narrow practical goal: to download some file, to listen to a specific piece of music, to buy tickets, to transfer money etc.
Official Versions Search Analyzer
Official Software Search
Looking up a necessary program on the Web is a task most users have faced. However, the search engines frequently offer unofficial and sometimes quite untrustworthy sites with the program available for download. But can they do better when a free official version of what the user is after is available?
Data Freshness Analyzer
The degree of a result's relevance often depends not only on the content of the information retrieved, but also on its freshness. When the object of the search is a rapidly changing thing, such as news, schedules, forecasts etc., out-of-date information is worse than meaningless: it may confuse and mislead the user.
This group's analyzers check the presence of valid or, vice versa, outdated results in the search engine output. (Needless to say, we have picked out the types of queries that make this check sensible at all. These are phone numbers and positions of executive managers.) As it happens, the results marked as "fresh" also tend to get outdated, so we regularly arrange additional tests of the analyzers' results. If you catch sight of some outdated marker, please do let us know!
The Indexation analyzer should be considered separately: it shows how long it takes for a new Web page to enter the search engines' indexes. Clearly, fast indexation is a necessary prerequisite for maintaining the freshness of search results.
Data Freshness Analyzer: Jobs
This analyzer is the first one in the series of analyzers which will estimate to which extent the search results are up to date.
For many queries it is important that a search engine does not give outdated results. For example, a user looking for "president of Zimbabwe" is most likely interested in finding out who the current president of Zimbabwe is. The user will not be satisfied if they only get information about the previous presidents.
Within this analyzer each query is associated with one or more up-to-date response markers as well as one or more outdated response markers. Any web page containing the up-to-date information increases the search engine's score in this analyzer. On the other hand, the score is decreased for any page which only contains outdated information. The results which are not recognized as containing either up-to-date or outdated information are not analyzed. Some of these unrecognized results may be irrelevant to the query, but our focus here is only on the freshness of the results.
We plan to develop new analyzers which will assess the freshness of other types of information: recent news, pricing information for everyday goods as well as for stocks and currencies, last-minute deals etc.
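The marker-based scoring described above can be sketched in a few lines (the function and its exact-substring matching of markers are our illustrative assumptions; pages matching neither kind of marker are left out of the score, as the text specifies):

```python
def freshness_score(page_texts, fresh_markers, outdated_markers):
    """Score one query: every page containing an up-to-date marker adds a
    point, every page containing only outdated markers subtracts one, and
    pages matching neither kind of marker are ignored."""
    score = 0
    for text in page_texts:
        if any(marker in text for marker in fresh_markers):
            score += 1
        elif any(marker in text for marker in outdated_markers):
            score -= 1
    return score
```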
Data Freshness: Phone Numbers
Continuing the series of data freshness analyzers, this one estimates the quality of phone number search.
The growing share of outdated numbers in search results is mostly due to the mass renumbering of phones in Russia. Undoubtedly, a company's current contact details are very important both for its clients and its staff. Timely measures taken by search engines prevent firms from losing new clients and spare customers the irritation of an unsuccessful search.
The queries in the analyzer are the names of companies that have recently changed their contact details. Any web page containing the up-to-date information increases the search engine's score in this analyzer; conversely, the score is decreased for any page which only contains outdated information. The pages which contain neither old nor updated information are left out of account. Note that the cases where the dialling code matters for the evaluation (for instance, 8 (499) xxx-xx-xx in Moscow) are also taken into account.
As the quality coefficient we use the share of web pages with up-to-date phone numbers.
Analyzer of Indexation Time
How long does it take for the newly created pages to start showing up in search results?
We selected about two thousand websites from the Russian Internet. All of these websites inform the search engines about the updates using a special sitemap file (sitemap.xml). All of these websites are generally indexed by the search engines: this is confirmed by the fact that the random pages from these websites are almost always found among the first 10 hits in response to the queries identical to the titles of the relevant pages (the text within <title>). So how long does it take for the newly created pages to start appearing among top 10 results for corresponding queries?
We scan the sitemap files for new pages every day (each website may contribute up to 3 pages per day). Search engines are of course able to do the same - after all this is the purpose of the sitemap files.
For every new page, we record the title and the URL. Within a month from the date when the new page was detected, we check if this page is found in top 10 in response to two types of queries:
a) a query consisting of the title text, with the search restricted to the relevant website;
b) a regular text query consisting of the title.
The first type of query assesses the speed of indexation; the second assesses the time it takes for the new pages to surface among regular search results.
The score in this analyzer is calculated as the mean visibility of the new pages within the last month. We also report the data for smaller periods.
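The "mean visibility" over the observation window can be sketched as follows (a hypothetical helper under our own assumptions: each page contributes one boolean per day since it was detected, and the per-day share of visible pages is averaged across pages):

```python
def visibility_by_day(observations):
    """observations: for each new page, a list of daily booleans starting
    from the day it was detected (True = found in top 10 that day).
    Returns the share of visible pages for each day, showing how quickly
    new pages surface in the results."""
    horizon = max(len(days) for days in observations)
    shares = []
    for day in range(horizon):
        flags = [days[day] for days in observations if len(days) > day]
        shares.append(sum(flags) / len(flags))
    return shares
```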
Interestingly, there seem to be pages which appear in the search results in the first couple of days after they were detected but disappear by the end of the first week.
Recall & Diversity Analyzers
As the web search is used for every possible need, it is vital that a search engine be able to find meaningful answers to most diverse questions. This ability depends both on the search engine's scope and on its interpreting potential.
These analyzers reflect the two above-mentioned aspects of the search engines' functioning: the quantity of the output and its diversity (when allowed by the non-strict sense of the query). To keep the results valid, we regularly test and refresh our set of queries.
It is worth noting that since the information that cannot be retrieved by means of a search engine is in some sense unattainable, the relative values in these analyzers are in no way less important than the absolute ones.
Recall analyzer
The recall analyzer estimates the relative size of indices of the Internet search engines.
The total number of documents indexed, as reported by the SE itself, cannot be used for comparison because different SEs have different methods of document counting. For example, some of them include duplicate documents in the count while others do not. Counting the duplicate documents may double the reported index size or increase it even more.
Additionally, the size of the index is a very PR-sensitive issue as it is one of the very few simple notions in SE area easily understandable by journalists. This means that the bigger index database size you report, the better press you get.
The number of documents found for a particular query does not always reflect the real number of documents indexed by a SE. Almost every frequent query will return tens of thousands of results in all search engines. But the user will never be allowed to see them all: the search session will be interrupted after browsing through the first hundreds of pages. Thus the exact number of web pages found can be verified only when the number of possible results is very small, that is, for queries containing very rare words.
For multiple-word queries, certain SEs show in the search results not only the documents where all the words comprising the query are found, but also the documents containing single words from the query. These "tail" documents are usually irrelevant to the query, but counting them can increase the total number of pages found.
In order to obtain independent and reliable data on the relative index size of the popular SEs, we developed a simple automatic method based on a set of sample queries. We gathered a set of very rare words, all of which occur several tens of times on the web. Once a day, we count how many of these occurrences are found by each search engine.
To make the data steadier, we use a different set of sample queries from the whole query pool every day.
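The daily rotation of sample queries can be sketched like this (an illustrative approach we are assuming, not the project's actual code: seeding a generator with the date keeps the sample fixed within a day while rotating it across days):

```python
import random

def daily_sample(query_pool, date_string, sample_size=50):
    """Draw the day's sample of rare-word queries from the whole pool.
    Seeding the generator with the date makes the sample reproducible
    within a day yet different from day to day, smoothing daily noise."""
    rng = random.Random(date_string)
    return rng.sample(query_pool, sample_size)
```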
The set of sample queries is constantly replenished by our linguists. If you have some rare words and want to help us cover the 'faraway' areas of the Net, please send us these words, and we will consider including them into the sample queries list.
Analyzer of subject search
A human being is often able to interpret a search query, determine what the user wants, evaluate the information on the Web and form the search results better than a machine. For this reason, the results formed by an expert are always better than those of an algorithm.
This analyzer daily monitors the search results for a set of queries for which the corresponding links have been selected by experts. It calculates the number of sites found by each search engine that were on the experts' list. For each query, the percentage of similarity between the algorithmic and expert results is calculated.
The output of the expert system Neuron is used as the expert opinion. The aggregate indicator of this analyzer is the proportion of results matching the expert opinion, regardless of the position of the matching site(s).
The best search engine is the one with the highest value of the aggregate indicator. In the informer, the search engines are sorted by the aggregate indicator.
Currently, 18 queries are evaluated. The number of queries will be increased.
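The aggregate indicator for one query can be sketched as a simple overlap ratio (the function name and the use of site identifiers rather than URLs are our assumptions):

```python
def expert_overlap(engine_results, expert_sites):
    """Share of an engine's results for one query that also appear on the
    experts' list, regardless of position."""
    if not engine_results:
        return 0.0
    expert = set(expert_sites)
    matches = sum(1 for site in engine_results if site in expert)
    return matches / len(engine_results)
```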
Analyzer of Ambiguous Queries
When Russian users search for «белки» do they mean the furry animals or the proteins? There is about a dozen universities which can be referred to as "КГУ", so is the user entering this query looking for Kazan State, Kursk State or any other educational institution which happens to have the same abbreviated name?
In the case of an ambiguous query, the user could have had in mind any of the "readings". Therefore, instead of showing the results based on the most frequent interpretation, it seems to be the best bet for a search engine to include the results corresponding to all the different readings. This way the users may at least realize that their query is ambiguous which in turn will help them to make the query more precise.
This analyzer estimates the percentage of possible interpretations of the ambiguous queries which are reflected in the search engines' results.
Most of the marker queries contain just one word, and the possible interpretations usually do not include the numerous businesses bearing a name identical to our query. We only included the very famous companies to which a user could plausibly refer with a one-word query.
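The coverage of interpretations can be sketched like this (an illustrative model under our own assumptions: each reading of an ambiguous query is represented by a hand-picked set of domains, and a reading counts as covered when any of its domains appears in the results):

```python
def interpretation_coverage(result_domains, interpretations):
    """interpretations: dict mapping each reading of an ambiguous query to a
    set of domains representing that reading. Returns the share of readings
    covered by at least one of the results."""
    found = set(result_domains)
    covered = sum(1 for domains in interpretations.values() if domains & found)
    return covered / len(interpretations)
```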
Local Search Analyzers
One of the many factors influencing web search is the location from which the search is conducted. Not that it is of overall importance, but quite often the set of answers to a query will differ according to the specific region.
Here we inquire into the quality of the search conducted from different cities of Russia. The analyzer's queries are entered from servers located in ten cities as far apart as Krasnodar and Vladivostok. Then the results are tested to see if they are region-appropriate.
To populate such an analyzer you have to select queries very carefully. For example, it is possible that a user looking for the "North Star" is really keen to find out something about a restaurant three blocks away, but it is more likely that he is searching for information concerning the celestial body. By contrast, "train schedule" is in most cases a region-dependent query. To make an unbiased evaluation of the local search quality, we only use queries that must be answered on the basis of local data.
Analyzer of Regional Search
When a user from Ufa or Novosibirsk enters the query "pizza delivery", he or she is most likely looking to get some pizza. The websites of companies offering pizza delivery in Moscow would probably be of no use to the user since it does not make sense to deliver pizza from Moscow to Novosibirsk.
In this analyzer, we query the search engines from different cities across Russia. We select the search engine results for which the title, snippet or URL clearly mentions the location from which the query was sent. We only evaluate the information available on the SERPs, because this is all that the regional users see when deciding whether to follow a link to a given webpage found in the SE results.
The percentage of responses to queries averaged over all cities except Moscow is taken to be a good estimate of how friendly the search engine is to the regional users.
This analyzer embraces both the queries which clearly ask for a regional service ("order glasses", "chinese food delivery") and the queries which imply a search for some local information ("drug store prices", "medical exam when pregnant").
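The per-SERP regional share can be sketched as follows (a hypothetical helper under our own assumptions: a list of city spellings to look for, and naive case-insensitive substring matching over title, snippet and URL):

```python
def regional_share(serp_items, city_forms):
    """serp_items: (title, snippet, url) triples from one SERP; city_forms:
    spellings of the query's city to look for. Returns the share of results
    whose title, snippet or URL clearly mentions the city."""
    def mentions_city(item):
        return any(form.lower() in field.lower()
                   for field in item for form in city_forms)
    return sum(1 for item in serp_items if mentions_city(item)) / len(serp_items)
```

The analyzer's aggregate score would then average this share over all queries and all cities except Moscow.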
Analyzer of Regional Navigation
This analyzer is devoted to the situations where the users are looking for a particular web page, but the relevant web page varies depending on where the query is coming from. Navigational search in general is the search for a particular web page accomplished by queries such as [sberbank] or [gazeta ru].
If the users are looking for a particular web page, they often want a page which is specific to their current location. For example, the users from both Kazan and Moscow expect to find some page from afisha.ru when they input the query [afisha]. However, the user from Kazan probably will not be happy to see the pages of afisha.ru specific to Moscow.
We test the search engines on 100 queries submitted from different locations within Russia. The sets of queries used in each city are slightly different depending on whether the relevant regional pages exist. For each query, the analyzer records if the relevant regional website (or the regional part of the general website) was found.
For each search engine we calculate the percentage of queries for which the relevant regional response appears in top 10 results.
Analyzer of Loading Time
The users expect that the search engines will give them the results as fast as possible. Even small differences in the time it takes for the SERP to load may be a reason for some users to prefer one search engine over another.
This analyzer shows the time it takes for users in different Russian cities to get their search results. The SERP loading time data are collected at the same time as the Regional Search analyzer results, hence for the same cities and for the same queries.
Of course the loading time is affected by multiple factors such as the quality and overall load of the connection. However, if some search engine is slower to respond than the others on a substantial number of queries every day, this is a tendency worth analyzing.
This analyzer also estimates the size of SERPs. The data on SERP size can be seen by clicking on the corresponding link below the graph. Just as most modern web browsers do, we download the SERPs compressed.
Query Comprehension Analyzers
Of course, the query comprehension by a search engine is but a metaphor. Yet one eventually gets used to the search engines' "creativity" in the interpretation of queries: their search prompts, spelling corrections, their attempts at refining or improving the query, or re-ordering the answers according to the assumed object of inquiry etc.
At present, the analyzers in this group test such relatively primitive skills of the search engines as correcting typos, giving prompts and expanding queries with synonyms. Meanwhile, some more refined search techniques (query interpretation, dealing with multiple-meaning queries and so on) can only be estimated indirectly, through the mistakes that arise when the query is misunderstood by the search engine. (Search Engine Mistake Analyzers, several in number, are arranged in a distinct group.) But then, it is the occasional mistakes that most clearly demonstrate a SE's skill in processing queries. When the search results are correct, we just don't pay any attention to the search engine's "tricks". These tricks are only exposed through incongruous or funny output.
Analyzer of Correct Hints
Most of the search engines attempt to suggest a correct spelling for a query in case a typo is suspected. The quality of such hints is an important addition to the overall quality of the search. This analyzer looks for the correct hint in the search results for a query with a deliberate typo and estimates the number of occurrences of a 'correct' query contained in the hint.
The evaluation is based on the same set of queries containing typos that is used for the typo resistance analyzer. The more correct hints have been given, the higher is the search engine's index for this analyzer.
Typo Resistance Analyzer
Humans are not machines: they make mistakes. This includes mistakes while typing in a search query: an adjacent key pressed by accident ("quety" instead of "query"), a doubled character or a missed one ("queery" or "qury"); after all, the user may type the word 'by ear' without knowing the correct spelling ("yandax" instead of "yandex").
In this case, the search engine can adhere to one of the following strategies:
1) no processing: search with exact spelling only
2) recognize the typo, but still search for the entered query with an additional hint: "perhaps you were looking for [correct spelling]?"
3) recognize the typo and search for the correct spelling immediately
Depending on the chosen strategy, the user either remains unaware of the fact that (s)he is mistaken, or notices it and makes an extra click (up to the user), or gets the correct results without ever noticing his own mistake.
This analyzer compares the search results of the "correct query" and several forms of its possible mistypings. The similarity of results to those of a "correct" query is evaluated.
Apart from deliberate typo correction, matches can arise in four cases:
1) accidentally
2) the page contains both the correct and the mistyped spelling
3) incorrect reaction of the engine's morphology (e.g., the unknown word "mushroomz", a typo for "mushrooms", is corrected to "mushroom")
4) promotion of the same websites for both correct and incorrect spellings of queries
All of these cases produce noise in this analyzer: an accidental match of results.
The similarity is evaluated in the same way as for the update analyzer but with a different set of queries.
The more matching results are registered, the higher is the index of the search engine for this analyzer. This determines the order of search engines in the informer of the analyzer.
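One simple way to express the similarity of result sets is sketched below (our own assumption: the share of the correct query's top-10 results that reappear for the mistyped form; the analyzer's actual metric may be defined differently):

```python
def result_similarity(correct_results, typo_results, top_k=10):
    """Share of the top-k results for the correctly spelled query that also
    appear among the top-k results for its mistyped form."""
    correct = set(correct_results[:top_k])
    typo = set(typo_results[:top_k])
    if not correct:
        return 0.0
    return len(correct & typo) / len(correct)
```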
In future, a rotation of query sets with typos from a wide array will be introduced.
Analyzer of the Synonymous Queries
One and the same query can be formulated in many different ways. For example, queries like "how to deduce the address from the telephone number", "search for address by telephone number" and "find an address by telephone number" mean the same to the user: these are synonymous queries.
A number of different practices leads to synonymous queries. Among them:
- using well-known abbreviations: "weather in Saint Petersburg" and "weather in SPb";
- using transliteration: "gazeta ru website" and "газета ру website";
- varying word order: "repair automatic transmission" and "automatic transmission repair".
In most of these cases, the users expect to get information related to the meaning of the query, not to its particular formulation. We can take this to imply that the search engines should yield identical results in response to synonymous queries, despite the differences in query formulation.
The analyzer of the synonymous queries assesses the (presumably unnecessary) variability of responses to synonymous queries. The search engines are sorted by the amount of such variability.
All the queries used in this analyzer are real, they were acquired from Rambler's query statistics tool (http://adstat.rambler.ru/wrds/). Note also that we are excluding the queries containing grammatical or spelling errors from this analyzer.
Analyzers of Search Engine Mistakes
Search engine mistakes are the negative side of its ability to refine and to understand the query (discussed in the Query Comprehension section). The queries in this group were sought out to demonstrate such mistakes in a most vivid and obvious manner. They might seem exotic, but every active search engine user knows that such mishaps tend to occur quite regularly.
When your favourite search engine makes a fool of itself, your emotions may vary between laughter and utter rage, but in any case your loyalty to it suffers a crack. The user who is only mildly content when a search engine functions as it should feels downright indignant at the sight of a blunder. This unavoidable irritation is a risk factor that should not be underrated by the developers.
Query Substitution Analyzer
Query substitution analyzer is the first one in the new series of analyzers demonstrating mistakes in the functioning of search engines. The words chosen for queries are correct, existing, meaningful words, but as they are rather infrequent, the search engines often don't take the challenge of finding them. Instead, they offer the searcher an output based on some other query which can more or less resemble the original one, but, as it would seem, is asked more often.
As a result, the query is not answered. What's more, the searcher gets mildly insulted, for some search engines spice their wrong results with a prompt remark like 'We have corrected the misprint', presuming the searcher does not know the correct spelling of the word he is looking for.
The analyzer calculates the percentage of wrong search results (not containing any form of the original query). Latin transliteration is counted as a correct result whenever it is possible.
Two additional parameters counted are: 1) the percentage of automatic query substitution (the search engine actually treats the query as a misprint, which it is not); 2) the percentage of wrong suggestions, like 'you were probably looking for...'. Suggestions are a less rude form of interference with the searcher's wishes. Still, when someone suspects that your search for 'dyery' was probably made under the spell of insanity and your real purpose was to find a 'diary', it is bound to drive you nuts.
Far be it from us to claim that all query corrections and suggestions are evil by nature. There are many cases when they really serve their purpose (as you can see in our Typo Resistance Analyzer). But the search developers have to be extremely careful while applying such strong measures, for their misuse might harm the search engine's image.
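The main metric of this analyzer, the percentage of results containing no form of the original query, can be sketched as follows (the function name and the substring matching over a precomputed list of query forms, including transliterations, are our illustrative assumptions):

```python
def substitution_rate(snippets, query_forms):
    """Percentage of snippets that contain no form of the original query
    word, a telltale sign that the engine silently substituted the query."""
    missing = sum(1 for snippet in snippets
                  if not any(form.lower() in snippet.lower()
                             for form in query_forms))
    return 100.0 * missing / len(snippets)
```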
Phrasal Queries Substitution Analyser
Next in the series of Search Engine Mistakes is the Phrase Substitution Analyzer. As in the Query Substitution Analyzer, the queries here are existing, meaningful, but infrequent language units. They are just longer units: phrases instead of words. So, one would think, they must be found more easily, since the meaning of the second word specifies the meaning of the first, rare one. In this case any replacement or even suggestion seems absolutely senseless: it will prevent the user from finding what he is looking for and possibly irritate him. That's why we opted for checking this parameter of search engine functioning.
The analyzer calculates the percentage of 'bad' snippets, i.e. snippets that do not contain the words of the query, because this is often a sign that the query words have been replaced. In some cases it has proved useful to check the snippets for only part of the query. This was done when insisting on both words could distort the control procedure. Synonyms and transliterations are counted as correct responses whenever possible.
Two additional metrics are the percentage of explicit query replacement and the percentage of replacement with a suggestion.
Names Search Analyzer
Trivial as it seems to look for some information about a person by entering his or her name, it often proves tricky: instead of finding the person needed, we stumble upon his famous namesakes.
One of the most persistent search engine mistakes is to mix up names and surnames (like splitting 'Richard Shakespeare' into 'Richard III' and 'William Shakespeare'), so the queries in the analyzer were chosen with a view to making such results more probable. We only used the names of real people we found on the web, whose surnames either coincided with those of some hot media personalities, or were in some other way non-standard and difficult for a search engine to find.
The analyzer automatically divides the names it finds in the snippets and titles of the first output page into hits and misses, according to whether the right person was found. Then we perform a manual check. It is necessary for disentangling several complicated situations, for instance, when the name and the surname are glued together at random (which would look strange in English, but can easily happen in Russian with its freer word order). The results with name initials are also checked manually, so that the context might help to decide upon the appropriate answer.
The resulting number constitutes the average ratio of correctly found pages to the overall number of pages.
Unbreakable Queries Analyzer
Search queries most often consist not of a single word, but of word chains or word phrases. Sometimes the relation between the words in such a chain is loose enough, but there are cases when the phrase is actually indivisible, expressing a single notion through two or more words, e.g. 'bad bishop' (a chess term that clearly shouldn't be echoed by 'America's worst bishops' in the output). In such cases it is the task of the search engine to preserve the wholeness of the phrase, at least in the top results. Otherwise, the meaning of the query gets distorted.
The unbreakability analyzer continues the series of search engine mistakes analyzers. It measures the percentage of cases when a search engine does not destroy the meaning of an unbreakable query by separating the words. For every query we check the presence of the searched phrase in the output. In a minority of cases the phrase can be split by a very limited number of words (like 'Black Volta', which can appear in 'Black and White Volta' without any change in meaning). Such cases are counted as positive results. Likewise, when the searched words stick together accidentally, they are counted positive if there is no punctuation mark between them.
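The per-snippet check can be sketched with a regular expression (a simplified model of our own: at most a couple of plain words may intervene between the phrase's words, and any punctuation between them breaks the match; the analyzer's real check is likely richer):

```python
import re

def phrase_intact(snippet, phrase, max_gap_words=2):
    """Check that the words of an 'unbreakable' phrase occur together in a
    snippet: a limited number of plain words may intervene (the 'Black and
    White Volta' case) but no punctuation may separate them."""
    words = [re.escape(word) for word in phrase.split()]
    gap = r"(?:\s+\w+){0,%d}\s+" % max_gap_words
    pattern = gap.join(words)
    return re.search(pattern, snippet, re.IGNORECASE) is not None
```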
As usual, the average number of positive results for all the queries appears as the search engine's overall result in the informer.
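As an illustration, the unbreakability check for a single snippet could be sketched roughly as follows. This is a hypothetical sketch: the matching rules, the insertion limit and the function name are our assumptions, not the analyzer's actual code.

```python
import re

def phrase_intact(query_phrase, snippet, max_insertions=2):
    """Check that the words of an unbreakable phrase appear together in the
    snippet: at most `max_insertions` plain words may stand between them,
    and no punctuation mark may break the phrase apart. (Illustrative only;
    the threshold of 2 insertions is an assumption.)"""
    words = [re.escape(w) for w in query_phrase.lower().split()]
    # Between consecutive query words allow a few extra words separated by
    # whitespace only -- any punctuation makes the gap pattern fail.
    gap = r"(?:\s+\w+){0,%d}\s+" % max_insertions
    pattern = r"\b" + gap.join(words) + r"\b"
    return re.search(pattern, snippet.lower()) is not None
```

With this sketch, 'Black and White Volta' still counts as a positive result for the query 'Black Volta', while 'America's worst bishops' does not match 'bad bishop'.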
Grammar Analyzer
The analyzer of grammatical misinterpretation is of a certain linguistic elegance. It deals with cases when a specific word-form belongs to more than one paradigm, like 'miss' in 'Miss Juliet' vs. 'I miss you terribly'. Naturally, such situations are more intricate in languages with rich inflectional morphology, like Russian, where a suffix or an ending increases the chance of there being a homonym of the word in question.
The task facing search engines is then to choose the more appropriate paradigm, and hence the more appropriate meaning of the two, and to adjust the output accordingly. The instrument of the correct choice is the context, however short (even in two-word queries the second word almost invariably narrows the field of possible meanings for the first one).
We have to admit that from this point of view the functioning of the investigated search engines has improved impressively over the last few years. Not so long ago the paradigms got confused so grossly that in order to evaluate the output you only had to cross off all the wrong word-forms. Nowadays the search engines tend not to deviate too much from the given word-form, so there is no choice but to base the checking procedure on the context. It also means the queries have to become more sophisticated, or rarer, or both. However, all the queries we use are quite realistic ones, meaning the search engines should actually be concerned with the results of the analyzer.
It is to be noted that the result of the analyzer, despite it belonging to the Mistakes group, shows the percentage of positive answers among the search engines' output.
Disturbing Content Analyzers
However well the search engine does its job, there are still small annoying things that can easily dampen the user's good spirits and significantly shatter his loyalty to a specific SE. Here belong, e.g., the danger of contracting a computer virus, the presence of irritating and obtrusive ads, etc.
Whereas the amount of advertisements or dangerous scripts on websites is not for the search engines to control, the concentration of objectionable content in the output is totally in their power. So, if the websites in the output abound with annoying factors, the search engine would only benefit from ranking such sites far below high-quality safe ones.
Most analyzers of this group make use of specific techniques developed by "Ashmanov & Partners" for detecting ads, pornography, computer viruses etc. To make the results more telling, we selected the markers so that the probability of undesirable results would be higher than usual.
Analyzer of Search Spam
At "Ashmanov and Partners" we study the phenomenon of search spam – the methods and technologies reducing the quality of search results and interfering with the operation of search engines.
Search spam is a text, URL, technology, program code or other web elements created by the web-master for the sole purpose of promoting the site in search engines' results, and not for a fast and reliable search based on complete and authentic information.
The experts check Top 10 results of search queries on a regular basis, marking the sites which, in their opinion, contain elements of search spam. The collected data is entered into the informer. It shows the percentage of sites marked as spam in the overall number of sites that appeared in Top 10 of analyzed queries.
The source of information on the spam status of a given URL is the data of the anti-spam lab of the company "Ashmanov and Partners". The following categories of search spam are used:
* doorway – definite spam: doorways, leading the user to other pages,
* spamcatalog – definite spam: spammer catalogues,
* spamcontent – definite spam: spammers' stolen content,
* pseudosite – definite spam: site disguised as corporate (pseudo-company),
* catalog – catalogues,
* board – bulletin boards,
* domainsale – domains for sale,
* secondary – secondary, stolen content,
* partner – any partner programs,
* linksite – link support site,
* spamforum – forum containing spam,
* techspam – technical spam,
* searchres – search results,
* cj – circular jerk.
An aggregate indicator is the share of spam sites in the search results. The best search engine has the lowest indicator. This determines the order of search engines in the informer of the analyzer.
Pornography Analyzer
This analyzer collects search results for ambiguous queries which may be interpreted as targeting a certain category of pornography, but also admit other interpretations. No queries which unambiguously indicate that the user is searching for porn are included.
For instance, the query [stockings] could come from a user looking for a stockings shop or for the corresponding category of pornography.
When search engines include pornographic results in responses to such queries, they run the risk of showing these results to the users (potentially including minors) which were not looking for adult content.
We do not claim that it is «bad» or «immoral» for a search engine to return pornographic pages in response to unambiguously pornographic queries. However the appearance of adult results in responses to «regular» queries is a drawback in our opinion.
We make use of the 'Semantic Mirror' technology developed at 'Ashmanov and Partners' in order to detect adult content.
For every search engine, we calculate the percentage of pages with adult content among the top 10 results.
Analyzer of the SafeSearch Quality
SafeSearch is a filter that is supposed to block the «adult only» websites from appearing in the search results. Nowadays most search engines allow their users to filter the results in this way. But does SafeSearch do its job well enough?
This analyzer collects search results for ambiguous queries which may be interpreted as targeting a certain category of pornography, but also admit other interpretations. The question is whether SafeSearch effectively filters out the adult responses in such cases.
The set of queries in this analyzer is the same as in the analyzer of the presence of adult sites in search results. We submit these queries to the search engines with the SafeSearch option turned on wherever possible. In this case, the ideal results should contain no adult pages at all, nor any results with obscene words. If adult pages still show up in the search results, the search engine is not keeping its promise to protect the user with SafeSearch.
We make use of the Semantic Mirror technology developed at 'Ashmanov and Partners' in order to detect pornography. For every search engine, we calculate the percentage of pages with adult content among the top 10 results.
Obscene Language Analyzer
The Analyzer checks for the presence and quantity of obscene words on the pages found by the search engines in response to neutral queries (those that neither contain obscenities nor suppose the response to contain them). The default measurement is made with "Safe search" on, meaning the result should normally be zero; measurements with ordinary search options are also available.
There are two indicators in the analyzer. The simple indicator counts the pages, found by the search engine, that contain at least one obscene word in its regular or masked form (the law makes no distinction, provided the word is easily recognized).
The default indicator is more informative: it shows the quantity of the obscene words in the pages found. Each word, shown clearly, is worth 3 points, while a masked one is worth 1. The points are summed for all the pages found to give the result of the search engine for a specific query. The mean value for all queries is then calculated to form the analyzer indicator for a given search engine.
The recognition of obscene words is performed by the automatic service "Semantic Mirror".
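The default scoring scheme described above (3 points per clearly written obscene word, 1 per masked one, summed over the pages found for a query and averaged over all queries) can be sketched as follows. This is an illustrative sketch, not the analyzer's actual code:

```python
def page_score(clear_hits, masked_hits):
    """Score of one found page: 3 points per clearly written obscene word,
    1 point per masked form (weights taken from the description above)."""
    return 3 * clear_hits + 1 * masked_hits

def engine_indicator(per_query_pages):
    """per_query_pages[q] is a list of (clear, masked) counts, one pair per
    page found for query q. Page scores are summed within each query; the
    analyzer indicator is the mean of the per-query sums."""
    query_scores = [
        sum(page_score(c, m) for c, m in pages) for pages in per_query_pages
    ]
    return sum(query_scores) / len(query_scores)
```

For instance, a query whose pages contain one clear and two masked words scores 3 + 2 = 5 points.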
Intrusive Ads Analyzer
Web pages with similar content may differ in the amount of ads. A search engine that favors pages with less intrusive ads is thus better for the user.
The analyzer is based on an ads recognition technology assessing the scripts, iframes and other code on the page. Each page is assigned a total intrusiveness score. For each SERP a weighted average of the intrusiveness scores is taken; the average intrusiveness over all SERPs is then the score of a search engine in this analyzer.
The intrusiveness of a particular web page is the sum of the intrusiveness scores of all individual ads on this page. The intrusiveness of an ad is calculated as follows:
- context ads and small simple banners receive intrusiveness of 1
- teasers and bigger shiny banners receive 3
- some ads have to be clicked on in order to be closed. These include clickunders (separate windows which open upon a click anywhere on a page) and scrolling banners which cover up a part of a page. These ads receive intrusiveness of 9.
- finally, some particularly nasty ads have to be clicked more than once in order to be closed. These receive intrusiveness of 18.
For each web page found, the intrusiveness is adjusted according to the position of the page in the SERP. The weight of each result reflects the probability that the link will be clicked, and the weights of all positions average to 1 (position 1 has the weight of 1.5 and position 10 has the weight of 0.4). Thus, if the sites with intrusive ads are randomly distributed along the SERP, the average intrusiveness won't be affected by the positional weight.
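The scoring above can be sketched as follows. This is a hypothetical illustration: the ad category names are our shorthand for the classes listed above, and since only the endpoint weights (1.5 for position 1, 0.4 for position 10) are stated, the intermediate positional weights are assumed to be linearly interpolated and then rescaled so that they average to 1:

```python
# Intrusiveness per ad category, per the description above (1 / 3 / 9 / 18).
AD_SCORES = {"context": 1, "small_banner": 1, "teaser": 3, "big_banner": 3,
             "clickunder": 9, "scrolling": 9, "multi_click": 18}

def page_intrusiveness(ads):
    """Intrusiveness of a page = sum over its individual ads."""
    return sum(AD_SCORES[a] for a in ads)

def positional_weights(n=10):
    """Assumed weights: linear from 1.5 (pos. 1) to 0.4 (pos. n),
    rescaled so the weights average to exactly 1."""
    raw = [1.5 + (0.4 - 1.5) * i / (n - 1) for i in range(n)]
    mean = sum(raw) / n
    return [w / mean for w in raw]

def serp_intrusiveness(pages_ads):
    """Weighted average intrusiveness of one SERP (list of per-page ad lists)."""
    weights = positional_weights(len(pages_ads))
    scores = [page_intrusiveness(ads) for ads in pages_ads]
    return sum(w * s for w, s in zip(weights, scores)) / len(pages_ads)
```

Because the weights average to 1, a SERP whose pages all carry the same ad load keeps that load as its weighted average, as the text notes for randomly distributed intrusive sites.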
This analyzer looks at queries seeking popular songs or video content. These queries often yield pages overloaded with ads.
Note that web page assessment is based on html code of a web page while the visibility of ads on a particular page might depend on the user's browser and the cookies.
Adult Content Ads Analyzer
Probably the most intrusive ads on the web are those with adult content. The presence of such ads on a web page irritates users. Therefore a search engine which finds fewer web pages with pornographic ads is better for the user.
This analyzer has two tabs and both of them look at queries seeking popular songs or video content (these queries are also used in the Intrusive Ads Analyzer). These queries often yield pages overloaded with ads.
The analyzer is based on an ads recognition technology which assigns a total intrusiveness score to each page. In the first tab (“% of sites with porno-ads”) each search engine gets a score which is equal to the percentage of the pages found containing explicitly pornographic ads. Each page contributes to the search engine's score if it has at least one advertisement containing pornographic pictures. The ads which are not literally pornographic but can be considered blatant or improper are not taken into account in this tab. The scores of the search engines also do not reflect the amount, size and position of the pornographic ads on the pages found. However in most cases if the porno ads are present they occur in a prominent position on a page.
The other tab (“improper + porn”) shows the mean intrusiveness of all adult advertisements on the page. Here the amount and size of the ads, as well as the position of the website in search results is taken into account. For each ad, the intrusiveness is multiplied by a coefficient reflecting the status of this ad with respect to pornography. The following coefficients are used:
0 - an ad is not pornographic or adult in any way
0.5 - an ad sometimes contains some improper pictures, but sometimes these pictures are absent
1 - an ad always contains improper pictures
10 - an ad always contains explicitly pornographic pictures
The score of a particular web page is the sum of the adjusted porno-intrusiveness scores of all individual ads on the page. The intrusiveness itself depends on the size and shininess of an ad, as well as on whether the ad covers part of the page's content and requires the user to make a click (see the Intrusive Ads Analyzer for the details of its calculation). Although the adjusted porno-intrusiveness is on average similar to the intrusiveness of the ads, these two measures cannot be compared directly.
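The adjusted scoring can be sketched as follows (an illustrative sketch; the status labels are our shorthand for the coefficients listed above):

```python
# Pornography coefficients per the list above: 0 / 0.5 / 1 / 10.
PORN_COEF = {"none": 0, "sometimes_improper": 0.5,
             "always_improper": 1, "pornographic": 10}

def page_porn_score(ads):
    """ads: (intrusiveness, porn_status) pairs, one per ad on the page.
    Each ad's intrusiveness is multiplied by its pornography coefficient;
    the page score is the sum over all ads."""
    return sum(intr * PORN_COEF[status] for intr, status in ads)
```

Thus a small pornographic banner (intrusiveness 3, coefficient 10) outweighs even the nastiest non-adult ad.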
Analyzer of Viruses and Malware
Search engines try to prevent their users from accessing infected or fraudulent web pages. How efficient is this security measure?
This analyzer shows the percentage of infected web pages in each search engine's results. Each web page is analyzed based on the sources of scripts and frames/iframes. In addition to viruses, the Russian web is overwhelmed with ads which are trying to fool people into downloading fraudulent files. These ads are marked as "fraudulent downloads" in the analyzer.
The number of threats on a particular web page is displayed, but it does not influence the overall score of a search engine: a page with 5 threats contributes as much as a page with 1 threat. Each web page is checked individually for viruses based on antivirus software reports, webmaster opinions and other sources.
By default, the analyzer takes into account all kinds of threats. The "% of sites with viruses" tab only takes into account the sites infected with viruses, the sites with fraudulent ads are excluded on that tab.
Some search engines warn their users that a particular page may threaten them. However, for the kind of queries used in this analyzer such warnings may not be sufficient. This analyzer inspects queries devoted to downloading, listening to or watching music and video clips - the same queries as in the intrusive ads analyzer. These queries yield thousands of result pages with roughly the same content. In such a situation it would seem reasonable not just to warn the users about a possible threat, but to lower the infected websites in search results so that an uninfected site is displayed in a high position instead. The data in this analyzer can be displayed based on all the threats (the "All" tab) or, alternatively, based only on the pages for which no warning is shown on the SERP (the "unmarked on SERP" tab).
Analyzers of Additional Search Features
The object of the user's immediate interest is the set of answers he receives to his query. Nevertheless, his overall impression may also be affected by smaller technical factors, not immediately discernible on the SERP.
These analyzers compare some additional parameters of the search engines' functioning that may be of considerable interest for the user. It is worth noting that comparative values matter more here than absolute numbers.
Analyzer of Loading Time
Users expect the search engines to give them results as fast as possible. Even small differences in the time it takes for the SERP to load may be a reason for some users to prefer one search engine over another.
This analyzer shows the time it takes for users in different Russian cities to get their search results. The SERP loading time data are collected at the same time as the Regional Search analyzer results, hence for the same cities and for the same queries.
Of course the loading time is affected by multiple factors such as the quality and overall usage of the connection. However, if some search engine is slower to respond than the others on a substantial amount of queries every day, this is a tendency to analyze.
This analyzer also estimates the size of SERPs. The data on SERP size can be seen by clicking on the corresponding link below the graph. Just as most modern web browsers do, we download the compressed SERPs.
Analyzer of Indexation Time
How long does it take for the newly created pages to start showing up in search results?
We selected about two thousand websites from the Russian Internet. All of these websites inform the search engines about the updates using a special sitemap file (sitemap.xml). All of these websites are generally indexed by the search engines: this is confirmed by the fact that the random pages from these websites are almost always found among the first 10 hits in response to the queries identical to the titles of the relevant pages (the text within <title>). So how long does it take for the newly created pages to start appearing among top 10 results for corresponding queries?
We scan the sitemap files for new pages every day (each website may contribute up to 3 pages per day). Search engines are of course able to do the same - after all this is the purpose of the sitemap files.
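The daily scan for new pages can be sketched along these lines, assuming a standard sitemap.xml with <loc> and <lastmod> entries; the function is illustrative, not the analyzer's actual code:

```python
import xml.etree.ElementTree as ET

# Standard sitemap protocol namespace (sitemaps.org).
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def new_pages(sitemap_xml, today, cap=3):
    """Return up to `cap` URLs from the sitemap whose <lastmod> falls on
    `today` (an ISO date string) -- the 3-pages-per-site daily cap comes
    from the description above."""
    root = ET.fromstring(sitemap_xml)
    fresh = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", namespaces=NS)
        lastmod = url.findtext("sm:lastmod", namespaces=NS)
        if lastmod and lastmod.startswith(today):
            fresh.append(loc)
    return fresh[:cap]
```

For each URL returned, the page's title would then be recorded and used to build the two query types described below.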
For every new page, we record the title and the URL. Within a month from the date when the new page was detected, we check if this page is found in top 10 in response to two types of queries:
a) a query consisting of the text of the title, with the search restricted to the relevant website;
b) a regular text query using the title.
The first type of query assesses the speed of indexation. The second type assesses the time it takes for the new pages to become visible in regular search results.
The score in this analyzer is calculated as the mean visibility of the new pages within the last month. We also report the data for smaller periods.
Interestingly, there seem to be pages which appear in the search results in the first couple of days after they were detected but disappear by the end of the first week.
A search query with the purpose of finding a certain website is called a navigational query. Such queries include "sberbank", "komsomolskaya pravda", "rambler", "gazeta ru", etc.
Evidently, a navigational query can have more than one reading: a user searching for "alabama state university" or "avril lavigne" might be looking for independent information about the organization or person in question. Still, the official site must be present in the SERP, and its position must be high enough. Furthermore, the analyzers of this group allow switching between stricter (the official site takes the first or a near-first position) and looser (it is enough that the official site is among the top ten results) examination criteria.
The best result for a navigational query is the required site in the first position of search results.
For the evaluation of navigational search, the search engines were tested with 200 queries randomly selected from an array of navigational queries. Each query was assigned one or more sites/markers, and the top 10 search results are checked for site/marker entries. When several sites/markers were assigned to a query, any of them appearing in one of the top positions was counted as a hit. The percentage of queries which yielded a site/marker on the first page was then calculated. This number is the aggregate indicator of the quality of navigational search.
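The aggregate indicator can be sketched as follows (an illustrative sketch, assuming each query comes with its marker hosts and the hosts of its top-10 results):

```python
def navigational_indicator(queries):
    """queries: list of (markers, top10_hosts) pairs. A query is a hit when
    any of its marker sites appears anywhere in the top 10; the indicator
    is the percentage of hit queries."""
    hits = sum(
        any(marker in top10 for marker in markers)
        for markers, top10 in queries
    )
    return 100.0 * hits / len(queries)
```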
The best search engine is the one with highest aggregate indicator for this analyzer. In the informer, the search engines are sorted by the aggregate indicator.
Peripheral Navigation Analyzer
The peripheral navigation analyzer works in much the same way as the general navigation analyzer, i.e. the search engines are given a set of purely navigational queries (ones aiming at a specific web page) and then the presence of the marker's site among the search results is checked.
The only distinction between the two analyzers lies in the set of queries (and markers). Whereas the general navigation analyzer is concerned with nationwide businesses and organizations, this analyzer searches for smaller companies, mainly those operating only locally (hence the name). Since such organizations are smaller, their sites are less popular and thus more difficult for the search engines to find, which explains why the results in this analyzer are usually lower than in the general navigation analyzer.
The result of the "any position" tab is the percentage of queries where the marker was found on the first search results page. In the "first position" tab each query result is scored as follows: 0 when the marker is not found; 0.1 for the marker found in the 10th position; 0.2 for the ninth, and so on, up to 1 when it is found in the first position. These scores are then summed up to make the overall mark for this tab.
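The per-query score of the "first position" tab can be sketched as follows (illustrative only):

```python
def position_score(marker_position):
    """marker_position: 1-based position of the marker in the top 10,
    or None when it was not found. Position 1 earns 1.0, position 10
    earns 0.1, absence earns 0."""
    if marker_position is None or not 1 <= marker_position <= 10:
        return 0.0
    return round(1.1 - 0.1 * marker_position, 1)
```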
Analyzer of Internal Page Navigation
Users often make their intent quite clear: they mention not only the desired website, but also the page or part of the website that they would like to access. Unfortunately, the search engines often fail to make use of the query's precision: the results contain only the main page, or internal pages other than the requested one.
The queries in this analyzer offer a precise description of the requested pages. The score of the search engine is the percentage of queries which returned the desired page in top 10 results. Neither the position of the target page nor the number of times it occurred in top 10 affects the result.
Analyzer of Regional Navigation
This analyzer is devoted to situations where the users are looking for a particular web page, but the relevant web page varies depending on where the query is coming from. Navigational search in general is the search for a particular web page accomplished by queries such as [sberbank] or [gazeta ru].
If the users are looking for a particular web page, they often want a page which is specific to their current location. For example, the users from both Kazan and Moscow expect to find some page from afisha.ru when they input the query [afisha]. However, the user from Kazan probably will not be happy to see the pages of afisha.ru specific to Moscow.
We test the search engines on 100 queries submitted from different locations within Russia. The sets of queries used in each city are slightly different depending on whether the relevant regional pages exist. For each query, the analyzer records if the relevant regional website (or the regional part of the general website) was found.
For each search engine we calculate the percentage of queries for which the relevant regional response appears in top 10 results.
Analyzer of Person Search
The queries consisting of a first and a last name often target a particular web page: the official personal web page of the person whose name is in the query. Even if the user is not sure that a personal site exists, such a website is an obvious hit and should appear in the top 10 search results.
The queries for this analyzer include the names of celebrities as well as people who are well-known in some domain (for example, photographers, scientists, psychologists etc).
The analysis of queries and hits in this analyzer is analogous to the Analyzer of Navigational Search.
Analyzer of Blog Search
Many celebrities have blogs. In fact many people got famous because of their blog. The popular blogs often have more readers than many mass media.
A query containing a first and a last name is thus often targeting a blog (or a microblog, social network page etc) of the given person. The search engine results for these queries should contain the blogs.
This analyzer complements the analyzer of person search. It assesses whether blogs appear in the results page for name queries. The search engine scores are calculated in the same way as in the analyzer of navigational search, with one exception: we do not take into account the position of the target page in the search results, since official web pages are arguably more relevant responses to name queries than blogs.
The queries in this analyzer are generally in the first name + last name format. We specifically wanted to include people who got famous because of their blog.
Social Network Navigation Analyzer
The audience of social networks grows by the day, as does the time spent there by an average user. That is why most businesses have found it convenient to set up their own page on one or more social networks. Nowadays, every respectable hairdresser or garage owner can be found on Facebook or similar. Moreover, the social network page often becomes the only official source of information about the company.
That means that for many small businesses or business-like structures, not to be found on Facebook / VKontakte / Livejournal etc. means not to be found at all: there simply is no other place to look for them. This is how the queries in the Analyzer were chosen: we only used companies that have no website other than their social network page (but naturally, we took care to gather all such pages for each company). Thus, the Analyzer evaluates the web user's chances of finding official information about smaller organizations. At the same time it evaluates, along with the Blog Search Analyzer, the overall quality of search on the social network segment of the internet, a parameter that is nowadays becoming extremely significant.
The principle of the Analyzer's work is the same as with any other from the Navigation search group: for each query, we check the presence of the given page in the search results as well as its position.
Information Search Analyzers
The largest and the least well-defined group of queries are those aiming at finding information, in the broad sense of the word. Although an exhaustive survey of all such queries would be impossible, some aspects of informational search come under close scrutiny in this group of analyzers.
Our analyzers cover Quote Search (Quotations, Catch Phrases and partly Originals) and Answer Search. It is very important that a search engine is able (and willing) to distinguish the original information source from its copies or imitations; this is the subject of the Originals analyzer.
Our plans include broadening the scope of the search aspects under investigation. Yet even now this scope is wider than it may seem, since a whole chain of analyzers in other groups is directly related to information search. Thus, it is usually an informational query that is affected by the search engine's "mistakes". Several of the Data Freshness analyzers also deal with informational queries. And naturally, informational queries make up most of the Assessing analyzers, as they make up most of web search in general.
Analyzer of Quotation Search Quality
Quotation search is the search for the source of a certain text fragment, i.e. either the original text (in which case a larger portion of it should appear on the site), or at least the author and the title of this text.
This analyzer examines 100 queries that consist of sufficiently long extracts from texts published on the Web. For each search engine we calculate the percentage of search results containing at least one of the following: a. a larger fragment of the original text, or b. the name of the author and the title of the text.
The positions of the pages in the search results are not taken into consideration. Neither are (unlike the original texts analyzer, where the priority of the copyright holder is important) the sites where the text in question was first published.
Catch Phrase Analyzer
This analyzer is devoted to queries containing short popular quotations. These quotations often come from fiction, but they are also used in everyday life.
For example, a Russian user typing the query [контора пишет] (literally 'the office keeps writing') is most likely looking for the meaning or the original source (the title of the text and the name of the author) of the expression. The search engines, however, often provide multiple examples of the expression in use, which is hardly what the user is looking for.
The analyzer examines 100 queries that consist of a popular quotation, the source of which is known. For each search engine we calculate the percentage of the search results containing at least one of the following: a. the given fragment (or one of several fragments) of the text where the quotation comes from or b. the name of the author and the title of the text. The positions of the pages among the search results are not taken into consideration.
Analyzer of Question Answering
This analyzer assesses the ability of search engines to find answers to question queries. Question queries include obvious questions with a question word ([when did CSKA win UEFA cup], [where is Lhasa located]) as well as other queries that imply an obvious short answer ([currency of Brazil], [nitric acid formula]).
The users typing such queries are most likely looking for the answer to their question. The quicker they get the answer, the better. Ideally, the snippet of the first page in search results should include the answer.
This analyzer has several tabs. In each of the tabs the total score of a search engine is the sum of its scores for each query.
1. Answer position in top 10 snippets
Here a search engine gets a score from 0 to 1 for each query, reflecting the highest position among the top 10 results of a snippet that contained the answer. For example, if the answer was found in the first snippet, the search engine gets 1 for the query. If it was found in the second place, the score is 0.9, and so on. If the answer was not found in the snippets of the top 10 results, the score is 0.
2. Answer presence in snippets
Here a search engine gets 1 for a query if at least one of the top 10 snippets contained an answer to the question. 0 is assigned otherwise.
3. Position of answer web pages in top 10
A search engine gets a score from 0 to 1 for each query, reflecting the highest position among the top 10 results of a web page that contained the answer. The score is 1 if the first page found contained an answer, 0.9 if the answer was present on the second page, and 0 if none of the web pages found contained the answer.
4. Answer presence on web pages found
Here a search engine gets 1 for a query if at least one of the top 10 web pages contained an answer to the question. 0 is assigned otherwise.
A question query may have multiple possible answers. For example, the possible answers for [Where are Maldives located] are "Indian ocean" and "Asia".
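The per-query scores of the four tabs can be sketched as follows. This is an illustrative sketch: `contains_answer` marks, for each of the top 10 ranks, whether the snippet (tabs 1-2) or the full page (tabs 3-4) contains one of the accepted answers, so a query with several valid answers simply counts a rank as a hit when any of them is present.

```python
def best_rank(contains_answer):
    """1-based rank of the first hit among the top 10, or None."""
    for rank, hit in enumerate(contains_answer, start=1):
        if hit:
            return rank
    return None

def answer_position_score(contains_answer):
    """Tabs 1 and 3: 1.0 for rank 1, 0.9 for rank 2, ... 0.1 for rank 10,
    and 0 when no answer is found."""
    rank = best_rank(contains_answer)
    return 0.0 if rank is None else round(1.1 - 0.1 * rank, 1)

def answer_presence_score(contains_answer):
    """Tabs 2 and 4: 1 if any of the top 10 contains an answer, else 0."""
    return int(any(contains_answer))
```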
Analyzer of Original Texts Ranking
Unfortunately, copyrighted content is all too often copied illegally on the web. Any author faces the fact that his or her original texts are being stolen: the text of a newly created article can turn up on some other web page within days or even hours after the article has been published. The websites stealing content in this way usually claim that it was "taken from open sources" or "uploaded by one of the users". The stolen content can be used to attract search engine users to a web page, and the resulting traffic can be converted into money; this is the main reason for such 'borrowing'. The ability to identify original texts and rank the corresponding web pages higher than pages containing copied materials is a crucial property of any search engine.
The analyzer of the ranking of original texts uses exact quotation queries to monitor the positions of 100 marker articles daily. For these articles, the websites of the copyright holders are known. The analyzer can thus calculate the percentage of queries for which the original text is ranked higher than the copied material.
The queries in this analyzer are fragments from the original article. By default the queries are submitted to the search engines in quotation marks. This way we expect that the only responses found will be the original article and its copies. However the real users rarely rely on quotation marks, so an additional tab estimates the results based on queries submitted without quotation marks.
The search engines are sorted by their ability to rank original texts higher than the copies in the informer of this analyzer.
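The indicator can be sketched as follows, under the simplifying assumption that with quoted queries the results contain only the original article and its copies, so the original "wins" exactly when it occupies the first position (an illustrative sketch, not the analyzer's actual code):

```python
def original_ranked_first(ranked_hosts, original_host):
    """True when the copyright holder's host precedes all copies,
    i.e. tops the ranked result list."""
    return bool(ranked_hosts) and ranked_hosts[0] == original_host

def originals_percentage(queries):
    """queries: list of (ranked_hosts, original_host) pairs; returns the
    percentage of queries where the original outranks every copy."""
    wins = sum(original_ranked_first(r, o) for r, o in queries)
    return 100.0 * wins / len(queries)
```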
Analyzer of the Location Search Quality
One of the most frequent and most obvious uses of search engines is for mere geographical navigation. We often want to find out whether a certain organization, business, service etc. is located where it should be, from our point of view. Although the overall search quality for this type of queries is not bad at all, mistakes sometimes occur. E.g., the search engine promptly finds the entity in question in some other city district or even some other city. Or, conversely, it supplies us with similar addresses of other organizations, presuming it doesn't make any difference what we shall do, provided we do it in the right place. This analyzer was made to compare the respective merits of search engines in dealing with such queries.
As in real life, our input queries consist of an organization's name plus an approximate locality, such as a city district, a street or a nearby underground station. To make the evaluation more precise, we only use queries where the object of the search is a single entity. The ideal output in such a case would be an answer block with the full list of contacts, but at this stage we decided that the mere presence of the correct address in the top snippet suffices for the SE to get the maximum grade. On the other hand, results containing some other useful information, but not the address, get no points at all.
The results are calculated on the same principle as in the Navigational Analyzer: the higher the snippet with the correct address is found, the more points the search engine gets (the first snippet earns 1.0 points, the 10th earns 0.1). In addition, we calculate the ratio of correct answers found outside the snippets.
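The position weighting described above can be expressed as a one-liner; this sketch assumes a linear scale over the top ten, which matches the stated endpoints (1.0 for the first snippet, 0.1 for the tenth):

```python
def position_points(position):
    """Points earned by a correct address snippet at a 1-based SERP position."""
    if not 1 <= position <= 10:
        return 0.0  # addresses outside the top ten earn nothing
    return (11 - position) / 10
```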
Transactional Search Analyzers
Every user who has ever tried to download anything from the Web, or to listen to or watch anything online, is familiar with the price we have to pay for obtaining free content: watching obtrusive ads, waiting for a download whose speed falls almost to zero, or even risking a computer virus from some suspicious site. To say nothing of the pangs of conscience: we all know that such content is put on the Web with little or no regard for copyright laws.
Fortunately, the amount of software, movies and music you can download, watch or listen to quite legally (usually on the websites of their creators) has grown significantly of late. The problem of finding a good source for this content therefore has an unequivocal solution. It is all the more important that search engines know how to find these correct, official sources instead of the "fun portals" and outright scam sites.
This analyzer is the first in the intended series of analyzers measuring the search quality for transactional queries, i.e. for queries with a narrow practical goal: to download some file, to listen to a specific piece of music, to buy tickets, to transfer money etc.
Official Versions Search Analyzer
Official Software Search
Looking up a necessary program on the Web is a task most users have faced. However, the search engines frequently offer unofficial and sometimes downright untrustworthy sites with the program available for download. Can they do better when a free official version of what the user is after is available?
Data Freshness Analyzer
The relevance of a result often depends not only on the content of the information retrieved, but also on its freshness. When the object of the search is a rapidly changing thing, such as news, schedules or forecasts, out-of-date information is as harmful as it is meaningless, since it may confuse and mislead the user.
This group's analyzers check for the presence of valid or, vice versa, outdated results in the search engine output. (Needless to say, we have picked the types of queries that make this check sensible: phone numbers and the positions of executive managers.) As it happens, results marked as "fresh" also tend to become outdated, so we regularly arrange additional tests of the analyzers' markers. If you catch sight of an outdated marker, please do let us know!
The Indexation analyzer should be considered separately: it shows how long it takes for a new Web page to enter the search engines' indexes. Clearly, fast indexation is a necessary prerequisite for maintaining the freshness of search results.
Data Freshness Analyzer: Jobs
This analyzer is the first in a series of analyzers estimating to what extent the search results are up to date.
For many queries it is important that a search engine does not give outdated results. For example, a user looking for "president of Zimbabwe" is most likely interested in finding out who the current president of Zimbabwe is. The user will not be satisfied if they only get information about previous presidents.
Within this analyzer, each query is associated with one or more up-to-date response markers as well as one or more outdated response markers. Any web page containing the up-to-date information increases the search engine's score in this analyzer. Conversely, the score is decreased for any page which only contains outdated information. Results recognized as containing neither up-to-date nor outdated information are not analyzed. Some of these unrecognized results may be irrelevant to the query, but our focus here is only on the freshness of the results.
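The scoring rule can be sketched like this. It is a simplification with hypothetical names; real markers are curated strings, and a page with up-to-date information scores positively even if it also mentions the old facts:

```python
def page_freshness_score(page_text, fresh_markers, stale_markers):
    """+1 for up-to-date information, -1 for outdated-only, 0 for unrecognized."""
    if any(m in page_text for m in fresh_markers):
        return 1   # the page carries the current facts
    if any(m in page_text for m in stale_markers):
        return -1  # only outdated information found
    return 0       # neither recognized: excluded from the analysis
```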
We plan to develop new analyzers which will assess actuality of other types of information: recent news, pricing information for everyday goods as well as for stocks and currencies, last-minute deals etc.
Data Freshness: Phone Numbers
Continuing the series of data freshness analyzers, this one estimates the quality of phone numbers' search.
The growing share of outdated numbers in search results is largely due to the mass renumbering of phones in Russia. Undoubtedly, a company's current contact details are very important for both its clients and its staff. Timely measures taken by search engines save firms from losing new clients, and save customers from the irritation of an unsuccessful search.
The queries in the analyzer are the names of companies that have recently changed their contact details. Any web page containing the up-to-date information increases the search engine's score in this analyzer; conversely, the score is decreased for any page which only contains outdated information. Pages which contain neither the old nor the updated information are left out of account. Note that the cases where the dialling code is significant for the evaluation (for instance, 8 (499) xxx-xx-xx in Moscow) are also taken into account.
As the quality coefficient, we use the share of web pages with up-to-date phone numbers.
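Comparing phone numbers while keeping the dialling code significant requires some normalization; a possible sketch (the trunk-prefix handling reflects Russian numbering, where 8 and +7 are interchangeable):

```python
import re

def normalize_phone(raw):
    """Strip formatting and unify the Russian trunk prefix (8 -> 7)."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) == 11 and digits.startswith("8"):
        digits = "7" + digits[1:]
    return digits

def same_number(a, b):
    """Dialling codes stay in the comparison: 499 and 495 never match."""
    return normalize_phone(a) == normalize_phone(b)
```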
Analyzer of Indexation Time
How long does it take for the newly created pages to start showing up in search results?
We selected about two thousand websites from the Russian Internet. All of these websites inform the search engines about the updates using a special sitemap file (sitemap.xml). All of these websites are generally indexed by the search engines: this is confirmed by the fact that the random pages from these websites are almost always found among the first 10 hits in response to the queries identical to the titles of the relevant pages (the text within <title>). So how long does it take for the newly created pages to start appearing among top 10 results for corresponding queries?
We scan the sitemap files for new pages every day (each website may contribute up to 3 pages per day). Search engines are of course able to do the same - after all this is the purpose of the sitemap files.
For every new page, we record the title and the URL. Within a month from the date when the new page was detected, we check if this page is found in top 10 in response to two types of queries:
a) a query consisting of the title text with the search restricted to the relevant website;
b) a regular text query using the title.
The first type of query assesses the speed of indexation. The second type assesses the time it takes for the new pages to become visible in regular search results.
The score in this analyzer is calculated as the mean visibility of the new pages within the last month. We also report the data for smaller periods.
Interestingly, there seem to be pages which appear in the search results in the first couple of days after they were detected but disappear by the end of the first week.
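The score itself is just an average of per-page visibility rates. A sketch under an assumed data layout (one 0/1 flag per page per day of the trailing month):

```python
def mean_visibility(daily_flags_by_page):
    """daily_flags_by_page: {url: [1 if found in top 10 that day, else 0]}."""
    rates = [sum(flags) / len(flags)
             for flags in daily_flags_by_page.values() if flags]
    return sum(rates) / len(rates) if rates else 0.0
```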
Recall & Diversity Analyzers
As the web search is used for every possible need, it is vital that a search engine be able to find meaningful answers to most diverse questions. This ability depends both on the search engine's scope and on its interpreting potential.
These analyzers reflect the two above-mentioned aspects of the search engines' functioning, the quantity of the output and its diversity (when allowed by the non-strict sense of the query). To keep the results valid, we regularly put to test our set of queries.
It is worth noting that since the information that cannot be retrieved by means of a search engine is in some sense unattainable, the relative values in these analyzers are in no way less important than the absolute ones.
Recall analyzer
The recall analyzer estimates the relative size of indices of the Internet search engines.
The total number of documents indexed, as reported by the SE itself, cannot be used for comparison because different SEs count documents differently. For example, some of them include duplicate documents in the count while others do not. Counting duplicate documents may double the reported index size or increase it even more.
Additionally, the size of the index is a very PR-sensitive issue as it is one of the very few simple notions in SE area easily understandable by journalists. This means that the bigger index database size you report, the better press you get.
The number of documents found for a particular query does not always reflect the real number of documents indexed by a SE. Almost every frequent query will return tens of thousands of results in all search engines. But the user will never be allowed to see them all: the search session will be cut off after the first few hundred results. Thus the exact number of web pages found can be verified only when the number of possible results is very small, that is, for queries containing very rare words.
For multiple-word queries, certain SEs show in the search results not only the documents where all the words comprising the query are found, but also documents containing single words from the query. These "tail" documents are usually irrelevant to the query, but counting them can increase the total number of pages found.
In order to obtain independent and reliable data on the relative index sizes of the popular SEs, we developed a simple automatic method based on a set of sample queries. We gathered a set of very rare words, each of which occurs several tens of times on the web. Once a day, we count how many of these occurrences are found by each search engine.
To make the data steadier, we use a different set of sample queries from the whole query pool every day.
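The daily measurement thus reduces to two small steps: drawing the day's sample from the pool and totalling each engine's hits on the rare words. A sketch with hypothetical names:

```python
import random

def recall_score(hits_by_word):
    """Total occurrences of the sampled rare words found by one engine."""
    return sum(hits_by_word.values())

def daily_sample(query_pool, size, day_seed):
    """Deterministic per-day rotation of the sample queries."""
    rng = random.Random(day_seed)
    pool = sorted(query_pool)
    return rng.sample(pool, min(size, len(pool)))
```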
The set of sample queries is constantly replenished by our linguists. If you have some rare words and want to help us cover the 'faraway' areas of the Net, please send us these words, and we will consider including them into the sample queries list.
Analyzer of subject search
A human being is often better than a machine at interpreting a search query, determining what the user wants, evaluating the information on the Web and forming the search results. For this reason, results compiled by an expert are generally better than those of an algorithm.
This analyzer monitors daily the search results for a set of queries for which the corresponding links have been selected by experts. It calculates the number of sites found by each search engine that were on the experts' list. For each query, the percentage of overlap between the algorithmic and the expert results is calculated.
As the expert opinion, we use the output of the expert system Neuron. The aggregate indicator of this analyzer is the proportion of results matching the expert opinion, regardless of the position of the matching site(s).
The best search engine is the one with highest value of aggregate indicator. In the informer, the search engines are sorted by the aggregate indicator.
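The aggregate indicator is a plain set overlap; a sketch assuming sites are compared by their domain strings:

```python
def expert_overlap(engine_sites, expert_sites):
    """Share of expert-selected sites present anywhere in the engine's results."""
    expert = set(expert_sites)
    if not expert:
        return 0.0
    return len(expert & set(engine_sites)) / len(expert)
```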
Currently, 18 queries are evaluated. The number of the queries will be increased.
Analyzer of Ambiguous Queries
When Russian users search for «белки», do they mean the furry animals or the proteins? There are about a dozen universities that can be referred to as "КГУ", so is the user entering this query looking for Kazan State, Kursk State, or another educational institution which happens to have the same abbreviated name?
In the case of an ambiguous query, the user could have had in mind any of the "readings". Therefore, instead of showing the results based on the most frequent interpretation, it seems to be the best bet for a search engine to include the results corresponding to all the different readings. This way the users may at least realize that their query is ambiguous which in turn will help them to make the query more precise.
This analyzer estimates the percentage of possible interpretations of the ambiguous queries which are reflected in the search engines' results.
Most of the marker queries contain just one word, and the possible interpretations usually do not include the numerous businesses bearing the name which equals our query. We only included the very famous companies, to which a user could potentially refer with a one-word query.
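The estimate can be sketched as follows, assuming each reading of an ambiguous query comes with a few marker substrings (names, terms) that identify it in a snippet:

```python
def interpretation_coverage(snippets, interpretations):
    """Share of readings reflected in the SERP.

    interpretations: {reading: [marker substrings]}; a reading is covered
    if any top-10 snippet contains one of its markers.
    """
    if not interpretations:
        return 0.0
    covered = sum(
        any(m in s for m in markers for s in snippets)
        for markers in interpretations.values()
    )
    return covered / len(interpretations)
```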
Local Search Analyzers
Among the factors influencing web search is the location where the search is conducted. Not that it is always important, but quite often the set of answers to a query will differ according to the specific region.
We here inquire into the quality of the search conducted from different cities of Russia. The analyzer's queries are entered by the servers located in ten cities standing as far away as Krasnodar and Vladivostok. Then the results are tested to see if they are region-appropriate.
To fill such an analyzer, you have to sort out queries very carefully. For example, it is possible that a user looking for the "North Star" is keen to find out about a restaurant three blocks away, but it is more likely that he is after information concerning the celestial body. By contrast, "train schedule" is in most cases a region-dependent query. To make an unbiased evaluation of local search quality, we only use queries that must be answered on the basis of local data.
Analyzer of Regional Search
When a user from Ufa or Novosibirsk enters the query "pizza delivery", he or she is most likely looking to get some pizza. The websites of companies offering pizza delivery in Moscow would probably be of no use to the user since it does not make sense to deliver pizza from Moscow to Novosibirsk.
In this analyzer, we query the search engines from different cities across Russia. We select the search results for which the title, snippet or URL clearly mentions the location from which the query was sent. We only evaluate the information available on the SERPs themselves, because that is all the regional users see when deciding whether to follow a link to a given webpage found in the SE results.
The percentage of such responses, averaged over all cities except Moscow, is taken as a good estimate of how friendly the search engine is to regional users.
This analyzer embraces both queries which clearly ask for a regional service ("order glasses", "chinese food delivery") and queries which imply a search for some local information ("drug store prices", "medical exam when pregnant").
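The per-result check can be sketched as a substring scan over the parts of the SERP entry that the user actually sees; the result layout and city-variant lists are assumptions:

```python
def mentions_region(result, city_variants):
    """True if the title, snippet or URL names the query's city."""
    haystack = " ".join([result.get("title", ""),
                         result.get("snippet", ""),
                         result.get("url", "")]).lower()
    return any(v.lower() in haystack for v in city_variants)

def regional_share(results, city_variants):
    """Fraction of SERP entries that clearly mention the location."""
    if not results:
        return 0.0
    return sum(mentions_region(r, city_variants) for r in results) / len(results)
```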
Analyzer of Regional Navigation
This analyzer is devoted to the situations where the users are looking for a particular web page, but the relevant page varies depending on where the query is coming from. Navigational search in general is the search for a particular web page, accomplished by queries such as [sberbank] or [gazeta ru].
If the users are looking for a particular web page, they often want a page which is specific to their current location. For example, the users from both Kazan and Moscow expect to find some page from afisha.ru when they input the query [afisha]. However, the user from Kazan probably will not be happy to see the pages of afisha.ru specific to Moscow.
We test the search engines on 100 queries submitted from different locations within Russia. The sets of queries used in each city are slightly different depending on whether the relevant regional pages exist. For each query, the analyzer records if the relevant regional website (or the regional part of the general website) was found.
For each search engine we calculate the percentage of queries for which the relevant regional response appears in top 10 results.
Analyzer of Loading Time
The users expect that the search engines will give them results as fast as possible. Even small differences in the time it takes for the SERP to load may be a reason for some users to prefer one search engine over another.
This analyzer shows the time it takes for users in different Russian cities to get their search results. The SERP loading time data are collected at the same time as the Regional Search analyzer results, hence for the same cities and for the same queries.
Of course, the loading time is affected by multiple factors, such as the quality and overall load of the connection. However, if some search engine is consistently slower to respond than the others on a substantial share of queries every day, this is a tendency worth analyzing.
This analyzer also estimates the size of the SERPs. The data on SERP size can be seen by clicking on the corresponding link below the graph. Like most modern web browsers, we download the SERPs compressed.
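A minimal sketch of one measurement: time the fetch and record the compressed size. The `fetch` callable stands in for the real HTTP request; gzip here simulates the transfer encoding a browser would negotiate:

```python
import gzip
import time

def measure_serp(fetch):
    """Return (seconds to obtain the SERP, size of its gzipped HTML)."""
    start = time.monotonic()
    html = fetch()
    elapsed = time.monotonic() - start
    compressed_size = len(gzip.compress(html.encode("utf-8")))
    return elapsed, compressed_size
```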
Query Comprehension Analyzers
Of course, the query comprehension by a search engine is but a metaphor. Yet one eventually gets used to the search engines' "creativity" in the interpretation of queries: their search prompts, spelling corrections, their attempts at refining or improving the query, or re-ordering the answers according to the assumed object of inquiry etc.
At present, the analyzers in this group test such relatively primitive skills of the search engines, as correcting typos, giving prompts, expanding queries by synonyms. Meanwhile, some more refined search techniques (query interpretation, dealing with multiple meaning queries and so on) are to be estimated only indirectly, by mistakes arising when the query is misunderstood by the search engine. (Search Engine Mistake Analyzers, several in number, are arranged in a distinct group.) But then, it's the occasional mistakes that most clearly demonstrate a SE's skill in processing the queries. When the search results are correct, we just don't pay any attention to the search engine's "tricks". These tricks are only exposed through incongruent or funny output.
Analyzer of Correct Hints
Most search engines attempt to suggest a correct spelling for a query when a typo is suspected. The quality of such hints is an important contribution to the overall quality of the search. This analyzer looks for the correct hint in the search results for a query with a deliberate typo and estimates the number of hints containing the 'correct' query.
The evaluation is based on the same set of queries containing typos that is used for the Typo Resistance Analyzer. The more correct hints are given, the higher the search engine's index for this analyzer.
Typo Resistance Analyzer
Humans are not machines: they make mistakes, including mistakes while typing in a search query. It may be a typo from an adjacent key pressed by accident ("quety" instead of "query"), a doubled character or a missing one ("queery" or "qury"), or, after all, a word typed 'by ear' without knowing the correct spelling ("yandax" instead of "yandex").
In this case, the search engine can adhere to one of the following strategies:
1) no processing: search with exact spelling only
2) recognize the typo, but still search for the entered query with an additional hint: "perhaps you were looking for [correct spelling]?"
3) recognize the typo and search for the correct spelling immediately
Depending on the chosen strategy, the user either remains unaware of the mistake, or notices it and may make an extra click, or gets the correct results without ever noticing his own mistake.
This analyzer compares the search results for the "correct" query and several of its possible mistypings, and evaluates the similarity of the mistyped queries' results to those of the correct one.
Apart from deliberate typo correction, matches can arise in four cases:
1) accidentally;
2) the page contains both the correct and the mistyped spelling;
3) an incorrect reaction of the engine's morphology (e.g., the unknown word "mushroomz", a typo for "mushrooms", is reduced to "mushroom");
4) promotion of the same websites for both correct and incorrect spellings of queries.
All of these cases produce noise in this analyzer: an accidental match of results.
The similarity is evaluated in the same way as in the update analyzer, but with a different set of queries.
The more matching results are registered, the higher is the index of the search engine for this analyzer. This determines the order of search engines in the informer of the analyzer.
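One plausible way to score the similarity between the outputs of the correct query and a mistyped one is set overlap of the top-10 URLs (the analyzer's exact metric is not published here, so this is an illustration):

```python
def result_similarity(urls_correct, urls_typo):
    """Jaccard overlap of two result lists; 1.0 means identical sets."""
    a, b = set(urls_correct), set(urls_typo)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)
```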
In the future, rotation of typo query sets drawn from a wide pool will be introduced.
Analyzer of the Synonymous Queries
One and the same query can be formulated in many different ways. For example, the queries like "how to deduce the address from the telephone number", "search for address by telephone number", and "find an address by telephone number" have the same meaning for the user, these are synonymous queries.
A number of different practices lead to synonymous queries. Among them:
- using well-known abbreviations: "weather in Saint Petersburg" and "weather in SPb";
- using transliteration: "gazeta ru website" and "газета ру website";
- varying word order: "repair automatic transmission" and "automatic transmission repair".
In most of these cases, the users expect to get information related to the meaning of the query, not to its particular formulation. We can take this to imply that the search engines should yield identical results in response to synonymous queries, despite the differences in query formulation.
The analyzer of the synonymous queries assesses the (presumably unnecessary) variability of responses to synonymous queries. The search engines are sorted by the amount of such variability.
All the queries used in this analyzer are real: they were acquired from Rambler's query statistics tool (http://adstat.rambler.ru/wrds/). Note also that we exclude queries containing grammatical or spelling errors from this analyzer.
Analyzers of Search Engine Mistakes
Search engine mistakes are the negative side of its ability to refine and to understand the query (discussed in the Query Comprehension section). The queries in this group were sought out to demonstrate such mistakes in a most vivid and obvious manner. They might seem exotic, but every active search engine user knows that such mishaps tend to occur quite regularly.
When your favourite search engine makes a fool of itself, your emotions may vary between laughter and utter rage, but in any case your loyalty to it suffers a crack. The user who is only mildly content when a search engine functions as it should feels rightly indignant when he sees a blunder. The resulting irritation is a risk factor that should not be underrated by the developers.
Query Substitution Analyzer
Query substitution analyzer is the first one in the new series of analyzers demonstrating mistakes in the functioning of search engines. The words chosen for queries are correct, existing, meaningful words, but as they are rather infrequent, the search engines often don't take the challenge of finding them. Instead, they offer the searcher an output based on some other query which can more or less resemble the original one, but, as it would seem, is asked more often.
As a result, the query is not answered. What's more, the searcher gets mildly insulted, for some search engines spice their wrong results with a prompt remark like 'We have corrected the misprint', presuming the searcher does not know the correct spelling of the word he is looking for.
The analyzer calculates the percentage of wrong search results (not containing any form of the original query). Latin transliteration is counted as a correct result whenever it is possible.
Two additional parameters counted are: 1) the percentage of automatic query substitution (the search engine actually treats the query as a misprint, which is not the case); 2) the percentage of wrong suggestions, like 'you were probably looking for...'. Suggestions are a less rude form of interference with the searcher's wishes. Still, when someone suspects that your search for 'dyery' was probably made under the spell of insanity and your real purpose was to find a 'diary', it is bound to drive you nuts.
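The core check, whether a result contains any form of the rare query word, can be sketched as follows. Real morphology needs a proper stemmer; a crude prefix match stands in for it here, and all names are hypothetical:

```python
def contains_query_form(snippet_text, query_word, stem_len=5):
    """Crude check for any inflected form of the query word in a snippet."""
    stem = query_word[:stem_len].lower()
    return any(w.lower().startswith(stem) for w in snippet_text.split())

def wrong_result_share(snippets, query_word):
    """Percentage of results not containing any form of the original query."""
    if not snippets:
        return 0.0
    wrong = sum(not contains_query_form(s, query_word) for s in snippets)
    return wrong / len(snippets)
```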
Far be it from us to claim that all query corrections and suggestions are an unmitigated evil. There are many cases when they really serve their purpose (as you can see in our Typo Resistance Analyzer). But search developers have to be extremely careful while applying such strong measures, for their misuse might harm the search engine's image.
Phrasal Queries Substitution Analyzer
Next in the series of Search Engine Mistakes is the Phrase Substitution Analyzer. As in the Query Substitution Analyzer, the queries here are existing, meaningful, but infrequent language units. They are just longer units: phrases instead of words. One would therefore think they must be found more easily, since the meaning of the second word narrows the meaning of the first, rare one. In this case any replacement, or even a suggestion, seems absolutely senseless: it will prevent the user from finding what he is looking for and possibly irritate him. That is why we opted for checking this aspect of search engine functioning.
The analyzer calculates the percentage of 'bad' snippets, i.e. snippets that do not contain the words of the query, because this is often a sign that the query words have been replaced. In some cases it has proved useful to check the snippets for only part of the query; this was done when insisting on both words could distort the control procedure. Synonyms and transliterations are counted as correct responses whenever possible.
Two additional metrics are the percentage of explicit query replacement and the percentage of replacement with suggest.
Names Search Analyzer
Trivial as it seems to look for information about a person by entering his or her name, it often proves tricky: instead of finding the person in question, we stumble upon his famous namesakes.
One of the most persistent search engine mistakes is mixing up names and surnames (like splitting 'Richard Shakespeare' into 'Richard III' and 'William Shakespeare'), so the queries in the analyzer were chosen with a view to making such results more probable. We only used names of real people we found on the web, whose surnames either coincided with those of hot media personalities, or were in some other way non-standard and difficult for a search engine to find.
The analyzer automatically divides the names it finds in the snippets and titles of the first output page into hits and misses, according to whether the right person was found. Then we perform a manual check. It is necessary for disentangling several complicated situations, for instance when the name and the surname are glued together at random (which would look strange in English, but can easily happen in Russian with its freer word order). Results with name initials are also checked manually, so that the context can help to decide upon the appropriate answer.
The resulting number constitutes the average ratio of correctly found pages to the overall number of pages.
Unbreakable Queries Analyzer
Search queries most often consist not of a single word, but of word chains or word phrases. Sometimes the relation between the words in such a chain is loose enough, but there are cases when the phrase is actually indivisible, expressing a single notion through two or more words, e.g. 'bad bishop' (a chess term that clearly shouldn't be echoed by 'America's worst bishops' in the output). In such cases it is the task of the search engine to preserve the wholeness of the phrase, at least in the top results. Otherwise, the meaning of the query gets distorted.
The unbreakability analyzer continues the series of search engine mistakes analyzers. It measures the percentage of cases when a search engine does not destroy the meaning of an unbreakable query by separating its words. For every query we check the presence of the searched phrase in the output. In a minority of cases the phrase can be split by a very limited number of words (like 'Black Volta', which can appear in 'Black and White Volta' without any change of meaning); such cases are counted as positive results. Likewise, when the searched words stick together accidentally, they are counted as positive if there is no punctuation mark between them.
As usual, the average number of positive results for all the queries appears as the search engine's overall result in the informer.
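The adjacency test can be sketched with a regular expression: the phrase counts as preserved if its words occur in order, separated by at most a couple of plain words and no punctuation. A two-word simplification with an assumed gap limit:

```python
import re

def phrase_preserved(text, phrase, max_gap=2):
    """True if the two-word phrase survives in `text` with a small word gap."""
    w1, w2 = phrase.lower().split()[:2]
    gap = r"(?:\w+\s+){0,%d}" % max_gap  # allowed intervening plain words
    pattern = r"\b%s\s+%s%s\b" % (re.escape(w1), gap, re.escape(w2))
    # punctuation between the words breaks the \w+/\s+ chain and fails the match
    return re.search(pattern, text.lower()) is not None
```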
Grammar Analyzer
The analyzer of grammatical misinterpretation is of a certain linguistic elegance. It deals with such cases when a specific word-form belongs to more than one paradigm, like 'miss' in 'Miss Juliet' vs. 'I miss you terribly'. Naturally, such situations are more intricate in languages with a rich inflectional morphology, like Russian, where a suffix or an ending increase the chance of there being a homonym to the word in question.
The task facing a search engine is then to choose the more appropriate paradigm, that is, the more appropriate of the two meanings, and to adjust the output accordingly. The instrument of the correct choice is the context, however short (even in two-word queries the second word almost invariably narrows the field of possible meanings for the first one).
We have to admit that from this point of view the functioning of the investigated search engines has improved impressively over the last few years. Not so long ago the paradigms got confused so grossly that, in order to evaluate the output, you only had to cross off all the wrong word-forms. Nowadays the search engines tend not to deviate too much from the given word-form, so there is no choice but to base the checking procedure on the context. It also means the queries should become more sophisticated, or rarer, or both. However, all the queries we use are quite realistic, meaning the search engines should actually be concerned with the results of the analyzer.
It is to be noted that the result of this analyzer, despite its belonging to the Mistakes group, shows the percentage of positive answers in the search engines' output.
Disturbing Content Analyzers
However well a search engine works, there are still small annoying things that can easily damp the user's good spirits and significantly shake his loyalty to a specific SE. Here belong, e.g., the danger of contracting a computer virus, the presence of irritating and obtrusive ads, etc.
Whereas the amount of advertisements or dangerous scripts on websites is not for the search engines to control, the concentration of objectionable content in their output is totally in their power. So, if the websites in the output abound with annoying factors, the search engine would only benefit from ranking such sites far below high-quality, safe ones.
Most analyzers of this group make use of specific techniques developed by "Ashmanov & Partners" for detecting ads, pornography, computer viruses etc. To make the results more lucid, we sought out markers for which the probability of undesirable results is higher than usual.
Analyzer of Search Spam
At "Ashmanov and Partners" we study the phenomenon of search spam – the methods and technologies reducing the quality of search results and interfering with the operation of search engines.
Search spam is a text, URL, technology, program code or other web elements created by the web-master for the sole purpose of promoting the site in search engines' results, and not for a fast and reliable search based on complete and authentic information.
The experts check Top 10 results of search queries on a regular basis, marking the sites which, in their opinion, contain elements of search spam. The collected data is entered into the informer. It shows the percentage of sites marked as spam in the overall number of sites that appeared in Top 10 of analyzed queries.
The source of information on the spam status of a given URL is the data of the anti-spam lab of the company "Ashmanov and Partners". The following categories of search spam are used:
* doorway – definite spam: doorways, leading the user to other pages,
* spamcatalog – definite spam: spammer catalogues,
* spamcontent – definite spam: spammers' stolen content,
* pseudosite – definite spam: site disguised as corporate (pseudo-company),
* catalog – catalogues,
* board – bulletin boards,
* domainsale – domains for sale,
* secondary – secondary, stolen content,
* partner – any partner programs,
* linksite – link support site,
* spamforum – forum containing spam,
* techspam – technical spam,
* searchres – search results,
* cj – circular jerk.
An aggregate indicator is the share of spam sites in the search results. The best search engine has the lowest indicator. This determines the order of search engines in the informer of the analyzer.
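The aggregate indicator can be sketched as a simple calculation; the function name and the data layout below are illustrative assumptions, not the project's actual code:

```python
def spam_share(serp_labels):
    """Share of spam sites in the search results: the percentage of
    top-10 results marked with any spam category, over all queries.

    serp_labels maps each analyzed query to a list of spam-category
    labels (e.g. "doorway", "catalog") or None for its top-10 results.
    """
    total = marked = 0
    for labels in serp_labels.values():
        total += len(labels)
        marked += sum(1 for label in labels if label is not None)
    return 100.0 * marked / total
```

The best search engine is then simply the one with the lowest returned value.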
Pornography Analyzer
This analyzer collects search results for ambiguous queries which may be interpreted as targeting a certain category of pornography, but also admit other interpretations. No queries which unambiguously indicate that the user is searching for porn are included.
For instance, the query [stockings] could come from a user looking for a stockings shop or for the corresponding category of pornography.
When search engines include pornographic results in responses to such queries, they run the risk of showing these results to the users (potentially including minors) which were not looking for adult content.
We do not claim that it is «bad» or «immoral» for a search engine to return pornographic pages in response to unambiguously pornographic queries. However the appearance of adult results in responses to «regular» queries is a drawback in our opinion.
We make use of the 'Semantic Mirror' technology developed at 'Ashmanov and Partners' in order to detect adult content.
For every search engine, we calculate the percentage of pages with adult content among the top 10 results.
Analyzer of the SafeSearch Quality
SafeSearch is a filter that is supposed to block the «adult only» websites from appearing in the search results. Nowadays most search engines allow their users to filter the results in this way. But does SafeSearch do its job well enough?
This analyzer collects search results for ambiguous queries which may be interpreted as targeting a certain category of pornography, but also admit other interpretations. The question is whether SafeSearch effectively filters out the adult responses in such cases.
The set of queries in this analyzer is the same as in the analyzer of the presence of adult sites in search results. We submit these queries to the search engines with the SafeSearch option turned on wherever possible. In this case, the ideal results should contain no adult pages at all, nor any results with obscene words. If adult pages still show up in the search results, the search engine is not keeping its promise to protect the user with SafeSearch.
We make use of the Semantic Mirror technology developed at 'Ashmanov and Partners' in order to detect pornography. For every search engine, we calculate the percentage of pages with adult content among the top 10 results.
Obscene Language Analyzer
The Analyzer checks for the presence and quantity of obscene words on the pages found by the search engines in response to neutral queries (those that do not contain and do not suppose the response to contain obscenities). The default measurement is made with "Safe search" on, meaning the result should normally be zero; measurements with ordinary search options are also available.
There are two indicators in the analyzer. The simple indicator counts the pages, found by the search engine, that contain at least one obscene word in its regular or masked form (as the law makes no distinction, provided the word is easily recognized).
The default indicator is more informative: it shows the quantity of the obscene words in the pages found. Each word, shown clearly, is worth 3 points, while a masked one is worth 1. The points are summed for all the pages found to give the result of the search engine for a specific query. The mean value for all queries is then calculated to form the analyzer indicator for a given search engine.
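As a sketch, the default indicator might be computed as follows; the names and the data layout are our own illustration, assuming the 3-point/1-point weighting described above:

```python
def page_score(clear_words, masked_words):
    """Each clearly written obscene word is worth 3 points,
    each masked one 1 point (occurrence counts, not distinct words)."""
    return 3 * clear_words + masked_words

def engine_indicator(per_query_pages):
    """per_query_pages: for every query, a list of (clear, masked)
    word counts, one pair per page found. Page scores are summed
    per query; the indicator is the mean over all queries."""
    per_query = [sum(page_score(c, m) for c, m in pages)
                 for pages in per_query_pages]
    return sum(per_query) / len(per_query)
```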
The recognition of obscene words is performed by the automatic service "Semantic Mirror".
Intrusive Ads Analyzer
Web pages with similar content may differ in the amount of ads. A search engine that favors pages with less intrusive ads is thus better for the user.
The analyzer is based on an ads recognition technology assessing the scripts, iframes and other code on the page. Each page is assigned a total intrusiveness score. For each SERP a weighted average of the intrusiveness scores is taken. The average intrusiveness of all SERPs is then the score of a search engine in this analyzer.
The intrusiveness of a particular web page is the sum of the intrusiveness scores of all individual ads on this page. The intrusiveness of an ad is calculated as follows:
- context ads and small simple banners receive intrusiveness of 1
- teasers and bigger shiny banners receive 3
- some ads have to be clicked on in order to be closed. These include clickunders (separate windows which open upon a click anywhere on a page) and scrolling banners which cover up a part of a page. These ads receive intrusiveness of 9.
- finally, some particularly nasty ads have to be clicked more than once in order to be closed. These receive intrusiveness of 18.
For each web page found, the intrusiveness is adjusted according to the position of the page in the SERP. The weight of each result reflects the probability that the link will be clicked, and the weights of all positions average to 1 (position 1 has the weight of 1.5 and position 10 has the weight of 0.4). Thus, if the sites with intrusive ads are randomly distributed along the SERP, the average intrusiveness won't be affected by the positional weight.
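A minimal sketch of this scoring scheme. The per-ad values come from the list above; only the endpoint positional weights (1.5 and 0.4) are stated in the text, so the intermediate weights below are an assumption, chosen so that the ten weights average to exactly 1:

```python
# Per-ad scores from the list above.
AD_INTRUSIVENESS = {
    "context_or_small_banner": 1,
    "teaser_or_large_banner": 3,
    "click_to_close": 9,         # clickunders, covering banners
    "multi_click_to_close": 18,  # need more than one click to close
}

# Assumed intermediate weights; the endpoints (1.5 and 0.4) are given.
POSITION_WEIGHTS = [1.5, 1.4, 1.3, 1.2, 1.1, 1.0, 0.8, 0.7, 0.6, 0.4]

def page_intrusiveness(ads):
    """Sum of the intrusiveness scores of all individual ads on a page."""
    return sum(AD_INTRUSIVENESS[kind] for kind in ads)

def serp_intrusiveness(pages):
    """Positionally weighted average intrusiveness of one SERP.
    pages: a list of 10 ad lists, one per result position."""
    scores = [page_intrusiveness(ads) for ads in pages]
    weighted = [w * s for w, s in zip(POSITION_WEIGHTS, scores)]
    return sum(weighted) / len(weighted)
```

Because the weights average to 1, a SERP whose pages all carry the same ads gets the same score regardless of ordering, as the text notes.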
This analyzer looks at queries seeking popular songs or video content. These queries often yield pages overloaded with ads.
Note that web page assessment is based on the HTML code of a web page, while the visibility of ads on a particular page may depend on the user's browser and cookies.
Adult Content Ads Analyzer
Probably the most intrusive ads on the web are ads with adult content. The presence of such ads on a web page irritates the users. Therefore a search engine which finds fewer web pages with pornographic ads is better for the user.
This analyzer has two tabs and both of them look at queries seeking popular songs or video content (these queries are also used in the Intrusive Ads Analyzer). These queries often yield pages overloaded with ads.
The analyzer is based on an ads recognition technology which assigns a total intrusiveness score to each page. In the first tab (“% of sites with porno-ads”) each search engine gets a score which is equal to the percentage of the pages found containing explicitly pornographic ads. Each page contributes to the search engine's score if it has at least one advertisement containing pornographic pictures. The ads which are not literally pornographic but can be considered blatant or improper are not taken into account in this tab. The scores of the search engines also do not reflect the amount, size and position of the pornographic ads on the pages found. However in most cases if the porno ads are present they occur in a prominent position on a page.
The other tab (“improper + porn”) shows the mean intrusiveness of all adult advertisements on the page. Here the amount and size of the ads, as well as the position of the website in search results is taken into account. For each ad, the intrusiveness is multiplied by a coefficient reflecting the status of this ad with respect to pornography. The following coefficients are used:
0 - an ad is not pornographic or adult in any way
0.5 - an ad sometimes contains some improper pictures, but sometimes these pictures are absent
1 - an ad always contains improper pictures
10 - an ad always contains explicitly pornographic pictures
The score of a particular web page is the sum of the adjusted porno-intrusiveness scores of all individual ads on the page. The intrusiveness itself depends on the size and shininess of an ad as well as on whether the ad covers part of the page's content and requires the user to make a click (see the Intrusive Ads Analyzer for how the intrusiveness score is calculated). Although the adjusted porno-intrusiveness is on average similar to the intrusiveness of the ads, these two measures cannot be compared directly.
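Under the coefficients listed above, a page's adjusted score could be sketched like this; the status keys and the pairing format are our own illustration:

```python
# Coefficients from the list above.
PORNO_COEFFICIENT = {
    "not_adult": 0,
    "sometimes_improper": 0.5,
    "always_improper": 1,
    "explicit_porno": 10,
}

def page_porno_score(ads):
    """Sum of adjusted porno-intrusiveness over a page's ads.
    ads: (intrusiveness, status) pairs; each ad's intrusiveness is
    multiplied by the coefficient for its adult-content status."""
    return sum(intr * PORNO_COEFFICIENT[status] for intr, status in ads)
```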
Analyzer of Viruses and Malware
Search engines try to prevent their users from accessing infected or fraudulent web pages. How efficient is this security measure?
This analyzer shows the percentage of infected web pages in each search engine's results. Each web page is analyzed based on the sources of scripts and frames/iframes. In addition to viruses, the Russian web is overwhelmed with ads which are trying to fool people into downloading fraudulent files. These ads are marked as "fraudulent downloads" in the analyzer.
The number of threats on a particular web page is displayed, but it does not influence the overall score of a search engine: a page with 5 threats contributes as much as a page with 1 threat. Each web page is individually checked for viruses based on antivirus software reports, webmaster opinions and other sources.
By default, the analyzer takes into account all kinds of threats. The "% of sites with viruses" tab only takes into account the sites infected with viruses, the sites with fraudulent ads are excluded on that tab.
Some search engines warn their users that a particular page may threaten them. However, for the kind of queries used in this analyzer such warnings may not be sufficient. This analyzer inspects queries devoted to downloading, listening to or watching music and video clips (the same queries as in the intrusive ads analyzer). These queries yield thousands of result pages with roughly the same content. In such a situation it would seem reasonable not just to warn the users about a possible threat, but to lower the infected websites in search results so that an uninfected site is displayed in a high position instead. The data in this analyzer can be displayed based on all the threats (the "All" tab) or, alternatively, based only on the pages for which no warning is shown on the SERP (the "unmarked on SERP" tab).
Analyzers of Additional Search Features
The object of the user's immediate interest is the set of answers he receives to his query. Nevertheless, his overall impression may well be affected by smaller technical factors, not immediately discernible on the SERP.
These analyzers are meant to compare some additional parameters of the search engines' functioning that may be of considerable interest for the user. It's worth noting that comparative values bear more importance here than absolute numbers.
Analyzer of Loading Time
The users expect the search engines to give them the results as fast as possible. Even small differences in the time it takes for the SERP to load may be a reason for some users to prefer one search engine over another.
This analyzer shows the time it takes for users in different Russian cities to get their search results. The SERP loading time data are collected at the same time as the Regional Search analyzer results, hence for the same cities and for the same queries.
Of course the loading time is affected by multiple factors such as the quality and overall usage of the connection. However, if some search engine is slower to respond than the others on a substantial number of queries every day, this is a tendency worth analyzing.
This analyzer also estimates the size of SERPs. The data on SERP size can be seen by clicking on the corresponding link below the graph. Just like most modern web browsers, we download the compressed SERPs.
Analyzer of Indexation Time
How long does it take for the newly created pages to start showing up in search results?
We selected about two thousand websites from the Russian Internet. All of these websites inform the search engines about the updates using a special sitemap file (sitemap.xml). All of these websites are generally indexed by the search engines: this is confirmed by the fact that the random pages from these websites are almost always found among the first 10 hits in response to the queries identical to the titles of the relevant pages (the text within <title>). So how long does it take for the newly created pages to start appearing among top 10 results for corresponding queries?
We scan the sitemap files for new pages every day (each website may contribute up to 3 pages per day). Search engines are of course able to do the same - after all this is the purpose of the sitemap files.
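A sketch of such a daily scan, assuming a standard sitemap.xml; the function name and cap parameter are illustrative:

```python
import xml.etree.ElementTree as ET

# Standard sitemap protocol namespace.
NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def new_pages(sitemap_xml, seen, per_day_cap=3):
    """Return up to per_day_cap URLs from a sitemap that were not
    recorded on previous scans. The 3-pages-per-site daily cap comes
    from the text; the parsing details are an illustrative sketch."""
    fresh = []
    for loc in ET.fromstring(sitemap_xml).iter(NS + "loc"):
        if loc.text not in seen:
            fresh.append(loc.text)
            if len(fresh) == per_day_cap:
                break
    return fresh
```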
For every new page, we record the title and the URL. Within a month from the date when the new page was detected, we check if this page is found in top 10 in response to two types of queries:
a) a query consisting of the text of the title, with the search restricted to the relevant website;
b) a regular text query using the title.
The first type of query assesses the speed of indexation proper. The second assesses the time it takes for the new pages to start appearing in regular search results.
The score in this analyzer is calculated as the mean visibility of the new pages within the last month. We also report the data for smaller periods.
Interestingly, there seem to be pages which appear in the search results in the first couple of days after they were detected but disappear by the end of the first week.
Analyzer of Local Navigation
The only distinction between the two analyzers lies in the set of queries (and markers). Whereas the general navigation analyzer is concerned with nationwide businesses or organizations, the new analyzer performs the search for smaller companies, mainly those functioning only locally (hence the name). Such organizations being smaller, their sites are of course less popular and thus more difficult for the search engines to find, which explains why the results in this analyzer are usually lower than in the general navigational analyzer.
Users often make what they mean quite clear: they mention not only the desired website, but also the page or the part of the website that they would like to access. Unfortunately the search engines often fail to make use of the query's precision: the results contain only the main page or internal pages other than the requested one.
The queries in this analyzer offer a precise description of the requested pages. The score of the search engine is the percentage of queries which returned the desired page in top 10 results. Neither the position of the target page nor the number of times it occurred in top 10 affects the result.
Analyzer of Regional Navigation
This analyzer is devoted to situations where the users are looking for a particular web page, but the relevant web page varies depending on where the query is coming from. Navigational search in general is the search for a particular web page accomplished by queries such as [sberbank] or [gazeta ru].
If the users are looking for a particular web page, they often want a page which is specific to their current location. For example, the users from both Kazan and Moscow expect to find some page from afisha.ru when they input the query [afisha]. However, the user from Kazan probably will not be happy to see the pages of afisha.ru specific to Moscow.
We test the search engines on 100 queries submitted from different locations within Russia. The sets of queries used in each city are slightly different depending on whether the relevant regional pages exist. For each query, the analyzer records if the relevant regional website (or the regional part of the general website) was found.
For each search engine we calculate the percentage of queries for which the relevant regional response appears in top 10 results.
Analyzer of Person Search
The queries consisting of a first and a last name are often targeting a particular web page: the official personal web page of the person named in the query. Even if the user is not sure that a personal site exists, such a website is an obvious hit and should appear in the top 10 search results.
The queries for this analyzer include the names of celebrities as well as people who are well-known in some domain (for example, photographers, scientists, psychologists etc).
The analysis of queries and hits in this analyzer is analogous to the Analyzer of Navigational Search.
Analyzer of Blog Search
Many celebrities have blogs. In fact, many people got famous because of their blog. Popular blogs often have larger readerships than many mass media outlets.
A query containing a first and a last name is thus often targeting a blog (or a microblog, social network page etc) of the given person. The search engine results for these queries should contain the blogs.
This analyzer complements the analyzer of person search. It assesses whether blogs appear in the results page for name queries. The search engine scores are calculated in the same way as in the analyzer of navigational search, with one exception: we do not take into account the position of the target page in search results, since the official web pages are arguably more relevant responses than blogs for name queries.
The queries in this analyzer are generally in the first name + last name format. We specifically wanted to include people who got famous because of their blog.
Social Network Navigation Analyzer
The audience of social networks grows by the day, as does the time spent there by the average user. That's why most businesses have found it convenient to set up their own page on one or more social networks. Nowadays, every respectable hairdresser or garage owner can be found on Facebook or similar. Moreover, the social network page often becomes the only official source of information about the company.
Now, that means that for many small businesses or business-like structures, not to be found on Facebook / VKontakte / Livejournal etc. means not to be found at all: there simply won't be any other place to look for them. This is how the queries in the Analyzer were chosen: we only used companies that haven't got any website except their social network page (but naturally, we took care to gather all such pages for each company). Thus, the Analyzer evaluates the web user's chances of finding official information about smaller organizations. At the same time it evaluates, along with the Blog Search Analyzer, the overall quality of search in the social networks segment of the internet, a parameter that is becoming extremely significant nowadays.
The principle of the Analyzer's work is the same as with any other from the Navigation search group: for each query, we check the presence of the given page in the search results as well as its position.
Information Search Analyzers
The largest and the least defined group of queries are those aiming at finding information, in a broad sense of the word. Although an exhaustive survey of all such queries would seem impossible, yet some aspects of informational search come under close scrutiny in this group of analyzers.
Our analyzers cover Quote Search (Quotations, Catch Phrase and partly Originals) and Answer Search. It is very important that the search engine is able to (and is willing to) distinguish the original information source from its copies or imitations. This is the issue of the Originals' analyzer.
Our plans include broadening the scope of the search aspects under investigation. Yet even at this moment this scope is wider than it may seem, since a whole chain of analyzers in other groups is immediately related to information search. Thus, it is usually informational queries that are affected by the search engines' "mistakes". Several of the Data Freshness analyzers also deal with informational queries. And naturally, informational queries make up the greater part of the Assessing analyzers, as they make up the greater part of web search in general.
Analyzer of Quotation Search Quality
Quotation search is the search for the source of a certain text fragment, i.e. either the original text (in which case a larger portion of it should appear on the site), or at least the author and the title of this text.
This analyzer examines 100 queries that consist of sufficiently long extracts from texts published on the Web. For each search engine we calculate the percentage of the search results containing at least one of the following: a. a larger fragment of the original text or b. the name of the author and the title of the text.
The positions of the pages in the search results are not taken into consideration. Neither are (unlike the original texts analyzer, where the priority of the copyright holder is important) the sites where the text in question was first published.
Catch Phrase Analyzer
This analyzer is devoted to queries containing short popular quotations. These quotations often come from fiction, but they are also used in everyday life.
For example, a Russian user typing the query [контора пишет] is likely to be looking for the meaning or the original source (the title of the text and the name of the author) of the expression. The search engines, however, often provide multiple examples of use of the expression, which is hardly what the user is looking for.
The analyzer examines 100 queries that consist of a popular quotation, the source of which is known. For each search engine we calculate the percentage of the search results containing at least one of the following: a. the given fragment (or one of several fragments) of the text where the quotation comes from or b. the name of the author and the title of the text. The positions of the pages among the search results are not taken into consideration.
Analyzer of Question Answering
This analyzer assesses the ability of search engines to find answers to question queries. The question queries include obvious questions with a question word ([when did CSKA win UEFA cup], [where is Lhasa located]) as well as other queries that imply an obvious short answer ([currency of Brazil], [nitric acid formula]).
The users typing such queries are most likely looking for the answer to their question. The quicker they get the answer, the better. Ideally, the snippet of the first page in search results should include the answer.
This analyzer has several tabs. In each of the tabs the total score of a search engine is the sum of its scores for each query.
1. Answer position in top 10 snippets
Here a search engine gets a score from 0 to 1 for each query, reflecting the highest position among the top 10 results of a snippet that contained the answer. For example, if the answer was found in the first snippet, the search engine gets 1 for the query. If it was found in second place, the score is 0.9, etc. If the answer was not found in the snippets of the top 10 results, the score is 0.
2. Answer presence in snippets
Here a search engine gets 1 for a query if at least one of the top 10 snippets contained an answer to the question. 0 is assigned otherwise.
3. Position of answer web pages in top 10
A search engine gets a score from 0 to 1 for each query, reflecting the highest position among the top 10 results of a web page that contained the answer. The score is 1 if the first page found contained an answer, 0.9 if the answer was present on the second page and 0 if none of the web pages found contained the answer.
4. Answer presence on web pages found
Here a search engine gets 1 for a query if at least one of the top 10 web pages contained an answer to the question. 0 is assigned otherwise.
A question query may have multiple possible answers. For example, the possible answers for [Where are Maldives located] are "Indian ocean" and "Asia".
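The snippet-based scoring (tabs 1 and 2) can be sketched as follows; the simple substring matching is our own simplification of however answers are actually detected:

```python
def answer_position_score(snippets, answers):
    """Tab 1: score from the highest top-10 snippet containing any
    accepted answer: position 1 -> 1.0, position 2 -> 0.9, ...,
    position 10 -> 0.1; no answer in the top 10 -> 0."""
    for pos, text in enumerate(snippets[:10], start=1):
        if any(ans.lower() in text.lower() for ans in answers):
            return (11 - pos) / 10
    return 0.0

def answer_presence(snippets, answers):
    """Tab 2: 1 if any top-10 snippet contains an answer, else 0."""
    return int(answer_position_score(snippets, answers) > 0)
```

Passing several accepted answers handles queries with multiple possible answers, such as the Maldives example above.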
Analyzer of Original Texts Ranking
Unfortunately, copyrighted content is illegally copied all too often on the web. Any author faces the fact that his or her original texts are being stolen: the text of a newly created article can turn up on some other web page within days or even hours of the article's creation. The websites stealing content in this way usually claim that it was "taken from open sources" or "uploaded by one of the users". The stolen content can be used to attract search engine users to a web page, and the resulting traffic can be converted into money. This is the main reason for such 'borrowing'. The ability to identify original texts and rank the corresponding web pages higher than pages containing copied materials is a crucial property of any search engine.
The analyzer of the ranking of original texts uses exact quotation queries to monitor daily the positions of 100 marker articles. For these articles, the websites of the copyright holders are known. The analyzer can thus calculate the percentage of queries for which the original text is ranked higher than the copied material.
The queries in this analyzer are fragments from the original article. By default the queries are submitted to the search engines in quotation marks. This way we expect that the only responses found will be the original article and its copies. However the real users rarely rely on quotation marks, so an additional tab estimates the results based on queries submitted without quotation marks.
The search engines are sorted by their ability to rank original texts higher than the copies in the informer of this analyzer.
Analyzer of the Location Search Quality
One of the most frequent and most obvious uses of search engines is for mere geographical navigation. We often want to find out whether a certain organization, business, service etc. is located where it should be, from our point of view. Although the overall search quality for this type of queries is not bad at all, mistakes sometimes occur. E.g., the search engine promptly finds the entity in question in some other city district or even some other city. Or, conversely, it supplies us with similar addresses of other organizations, presuming it doesn't make any difference what we shall do, provided we do it in the right place. This analyzer was made to compare the respective merits of search engines in dealing with such queries.
As in real life, our input queries consist of the organization's name plus an approximate locality, like a city district, a street or a nearby underground station. To make the results of the evaluation more precise, we only use queries where the object of search is one single entity. The ideal output in such a case is a helper with a full list of contacts. But at this stage we judged that the mere presence of the correct address in the upper snippet would suffice for the SE to get the maximum grade. On the other hand, results containing some other useful information, but not the address, won't get any points.
The results are calculated on the same principle as in the Navigational Analyzer: the higher the snippet with the correct address, the more points the search engine gets (the first snippet is worth 1.0 points, the 10th 0.1). In addition, we calculate the ratio of correct answers found outside the snippets.
Transactional Search Analyzers
Every user who has ever tried to download anything from the Web, or to listen or watch anything online, is familiar with the price we have to pay for obtaining the free content: watching obtrusive ads, waiting for full download, whereby download speed falls almost to zero, or even taking the risk of catching a computer virus from some suspicious site. To say nothing of the bites of your conscience - we all know that the content is put on the Web with little or no regard for copyright laws.
Fortunately, the amount of software, movies and music you can download or watch/listen to quite legally (usually on the websites of their creators) has grown significantly of late. The problem of finding a good source of this content therefore has an unequivocal solution. It is all the more important that the search engines know how to find these correct, official sources, instead of the "fun portals" and all the scam sites.
This analyzer is the first in the intended series of analyzers measuring the search quality for transactional queries, i.e. for queries with a narrow practical goal: to download some file, to listen to a specific piece of music, to buy tickets, to transfer money etc.
Official Versions Search Analyzer
Official Software Search
Looking up a necessary program on the Web is a task most users have faced at some point. However, the search engines frequently offer unofficial and sometimes quite untrustworthy sites with the program available for download. But can they do better when a free official version of what the user is after is available?
Data Freshness Analyzer
The degree of a result's relevance often depends not only on the content of the information retrieved, but also on its freshness. When the object of the search is a rapidly changing thing, such as news, schedules, forecasts etc., out-of-date invalid information can be as harmful as it is meaningless, since it may confuse and mislead the user.
This group's analyzers check the presence of valid or, conversely, outdated results in the search engine output. (Needless to say, we have picked out the types of queries that make this check sensible: phone numbers and positions of executive managers.) As it happens, the results marked as "fresh" also tend to get outdated. We therefore have to regularly arrange additional tests of the analyzers' results. If you catch sight of some outdated marker, please do let us know!
The Indexation analyzer should be considered separately: it shows how long it takes for a new Web page to enter the search engines' indexes. It is clear that fast indexation is a necessary prerequisite for maintaining the freshness of search results.
Data Freshness Analyzer: Jobs
This analyzer is the first one in the series of analyzers which will estimate to which extent the search results are up to date.
For many queries it is important that a search engine does not give outdated results. For example, a user looking for "president of Zimbabwe" is most likely interested in finding out who the current president of Zimbabwe is. The user will not be satisfied if they only get information about previous presidents.
Within this analyzer, each query is associated with one or more up-to-date response markers as well as one or more outdated response markers. Any web page containing the up-to-date information increases the search engine's score in this analyzer; conversely, the score is decreased for any page which only contains outdated information. Results which are not recognized as containing either up-to-date or outdated information are not analyzed. Some of these unrecognized results may be irrelevant to the query, but our focus here is only on the freshness of the results.
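The marker-based scoring just described can be sketched as follows. The marker lists and the unit weights (+1 per fresh page, -1 per stale-only page) are placeholders: the analyzer's exact formula is not published here.

```python
def freshness_score(pages, fresh_markers, stale_markers):
    """Score a list of result-page texts against freshness markers.

    Hypothetical weights: +1 for a page with an up-to-date marker,
    -1 for a page containing only outdated markers; pages with
    neither kind of marker are ignored, as the analyzer describes.
    """
    score = 0
    for text in pages:
        has_fresh = any(m in text for m in fresh_markers)
        has_stale = any(m in text for m in stale_markers)
        if has_fresh:
            score += 1
        elif has_stale:
            score -= 1
        # unrecognized pages contribute nothing
    return score

pages = [
    "Emmerson Mnangagwa is the president of Zimbabwe",  # up to date
    "Robert Mugabe is the president of Zimbabwe",       # outdated
    "Travel guide to Victoria Falls",                   # unrecognized
]
print(freshness_score(pages, ["Mnangagwa"], ["Mugabe"]))  # 0 (+1 fresh, -1 stale)
```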
We plan to develop new analyzers to assess the freshness of other types of information: recent news, pricing information for everyday goods as well as for stocks and currencies, last-minute deals, etc.
Data Freshness: Phone Numbers
Continuing the series of data freshness analyzers, this one estimates the quality of phone number search.
The growing share of outdated numbers in search results is mostly due to the mass renumbering of phones in Russia. Undoubtedly, a company's current contact details are very important both for its clients and for its staff. Timely measures taken by search engines prevent firms from losing new clients and spare customers the irritation of an unsuccessful search.
The queries in this analyzer are the names of companies that have recently changed their contact details. Any web page containing the up-to-date information increases the search engine's score in this analyzer; conversely, the score is decreased for any page which only contains outdated information. Pages which contain neither the old nor the updated information are left out of account. Note that the cases where the dialling code matters for the evaluation (for instance, 8 (499) xxx-xx-xx in Moscow) are also taken into account.
As the quality coefficient we use the share of web pages with current phone numbers.
Analyzer of Indexation Time
How long does it take for the newly created pages to start showing up in search results?
We selected about two thousand websites from the Russian Internet. All of these websites inform the search engines about the updates using a special sitemap file (sitemap.xml). All of these websites are generally indexed by the search engines: this is confirmed by the fact that the random pages from these websites are almost always found among the first 10 hits in response to the queries identical to the titles of the relevant pages (the text within <title>). So how long does it take for the newly created pages to start appearing among top 10 results for corresponding queries?
We scan the sitemap files for new pages every day (each website may contribute up to 3 pages per day). Search engines are of course able to do the same - after all this is the purpose of the sitemap files.
For every new page, we record the title and the URL. Within a month from the date when the new page was detected, we check if this page is found in top 10 in response to two types of queries:
a) a query which contains the text of the title and restricts the search to the relevant website;
b) a regular text query using the title.
The first type of query assesses the time it takes for the new pages to appear in search results. The second type of query assesses the speed of indexation.
The score in this analyzer is calculated as the mean visibility of the new pages within the last month. We also report the data for smaller periods.
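Under stated assumptions (one top-10 check per day after a page is detected, visibility as the share of checked days on which the page appeared), the mean-visibility score might be computed like this. The formula is our illustrative reading of the description above, not the analyzer's published code.

```python
from datetime import date

def mean_visibility(observations, today, window_days=30):
    """observations: {url: (detected_date, set_of_day_offsets_found_in_top10)}.

    A page's visibility is the share of days since detection on which it
    appeared in the top 10; the score is the mean over pages detected
    within the last `window_days` (a hypothetical reconstruction).
    """
    scores = []
    for detected, days_found in observations.values():
        age = (today - detected).days
        if age == 0 or age > window_days:
            continue  # too new to have data, or outside the window
        scores.append(len(days_found) / age)
    return sum(scores) / len(scores) if scores else 0.0

obs = {
    "site-a.ru/new-page": (date(2024, 1, 21), {1, 2, 3, 4, 5}),  # 5 of 10 days
    "site-b.ru/new-page": (date(2024, 1, 26), {1, 2, 3, 4, 5}),  # 5 of 5 days
}
print(mean_visibility(obs, date(2024, 1, 31)))  # 0.75
```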
Interestingly, there seem to be pages which appear in the search results in the first couple of days after they were detected but disappear by the end of the first week.
Recall & Diversity Analyzers
As the web search is used for every possible need, it is vital that a search engine be able to find meaningful answers to most diverse questions. This ability depends both on the search engine's scope and on its interpreting potential.
These analyzers reflect the two above-mentioned aspects of the search engines' functioning: the quantity of the output and its diversity (when allowed by the non-strict sense of the query). To keep the results valid, we regularly put our set of queries to the test.
It is worth noting that since the information that cannot be retrieved by means of a search engine is in some sense unattainable, the relative values in these analyzers are in no way less important than the absolute ones.
Recall Analyzer
The recall analyzer estimates the relative size of indices of the Internet search engines.
The total number of documents indexed, as reported by the SE itself, cannot be used for comparison because different SEs count documents differently. For example, some of them include duplicate documents in the count while others do not; counting duplicates may double the reported index size or increase it even more.
Additionally, the size of the index is a very PR-sensitive issue, as it is one of the very few simple notions in the SE area that journalists easily understand. This means that the bigger the index size you report, the better press you get.
The number of documents found for a particular query does not always reflect the real number of documents indexed by a SE. Almost every frequent query will return tens of thousands of results in all search engines, but the user will never be allowed to see them all: the search session will be cut off after the first few hundred pages. Thus the exact number of web pages found can be verified only when the number of possible results is very small, that is, for queries containing very rare words.
For multiple-word queries, certain SEs show in the search results not only the documents where all the words comprising the query are found, but also documents containing single words from the query. These "tail" documents are usually irrelevant to the query, but counting them can increase the total number of pages found.
In order to obtain independent and reliable data on the relative index sizes of the popular SEs, we developed a simple automatic method based on a set of sample queries. We gathered a set of very rare words, each of which occurs several tens of times on the web. Once a day, we count how many of these occurrences are found by each search engine.
To make the data steadier, we use a different set of sample queries from the whole query pool every day.
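The daily rotation and counting described above could be sketched as follows. The engine callables and the word pool are stand-ins for the real query submission machinery; seeding the sample by the date is our assumption about how "a different set every day" stays fair across engines.

```python
import random

def daily_recall_sample(query_pool, engines, day_seed, sample_size=50):
    """Pick a rotating daily sample of rare-word queries and count hits.

    `engines` maps an engine name to a callable returning the number of
    occurrences that engine finds for a query (stand-in for a real API).
    Seeding the RNG with the date gives every engine the same sample.
    """
    rng = random.Random(day_seed)
    sample = rng.sample(query_pool, min(sample_size, len(query_pool)))
    return {name: sum(count_hits(q) for q in sample)
            for name, count_hits in engines.items()}

pool = ["rareword%d" % i for i in range(200)]        # placeholder rare words
engines = {"A": lambda q: 3, "B": lambda q: 1}       # fake hit counters
print(daily_recall_sample(pool, engines, day_seed="2024-01-31", sample_size=10))
# {'A': 30, 'B': 10}
```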
The set of sample queries is constantly replenished by our linguists. If you have some rare words and want to help us cover the 'faraway' areas of the Net, please send us these words, and we will consider including them into the sample queries list.
Analyzer of Subject Search
A human being is often able to interpret a search query, determine what the user wants, evaluate the information on the Web and form the search results better than a machine. For this reason, the results formed by an expert are always better than those of an algorithm.
This analyzer monitors daily the search results for a set of queries, for which the corresponding links have been selected by experts. It calculates the number of sites found by each search engine that were on the experts' list. For each query, the percentage of similarity between algorithmic and expert results is calculated.
As an expert opinion, the output of the expert system Neuron is used. The aggregate indicator of this analyzer is the proportion of results matching the expert opinion, regardless of the position of the matching site(s).
The best search engine is the one with the highest value of the aggregate indicator. In the informer, the search engines are sorted by this indicator.
Currently, 18 queries are evaluated. The number of the queries will be increased.
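The aggregate indicator above (the share of expert-selected sites found, positions ignored) amounts to a simple set overlap. A minimal sketch, with hypothetical URL lists:

```python
def expert_overlap(serp_urls, expert_urls):
    """Share of expert-selected sites present anywhere in the engine's
    results, regardless of position (per the analyzer's description)."""
    expert = set(expert_urls)
    return len(expert & set(serp_urls)) / len(expert)

serp = ["a.ru", "b.ru", "c.ru"]                 # engine's results (placeholder)
experts = ["b.ru", "c.ru", "d.ru", "e.ru"]      # expert list (placeholder)
print(expert_overlap(serp, experts))  # 0.5
```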
Analyzer of Ambiguous Queries
When Russian users search for «белки» do they mean the furry animals or the proteins? There is about a dozen universities which can be referred to as "КГУ", so is the user entering this query looking for Kazan State, Kursk State or any other educational institution which happens to have the same abbreviated name?
In the case of an ambiguous query, the user could have had in mind any of the "readings". Therefore, instead of showing the results based on the most frequent interpretation, it seems to be the best bet for a search engine to include the results corresponding to all the different readings. This way the users may at least realize that their query is ambiguous which in turn will help them to make the query more precise.
This analyzer estimates the percentage of possible interpretations of the ambiguous queries which are reflected in the search engines' results.
Most of the marker queries contain just one word, and the possible interpretations usually do not include the numerous businesses whose names coincide with the query. We only included very famous companies to which a user could plausibly refer with a one-word query.
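The percentage of interpretations covered could be estimated roughly as follows, assuming a set of marker substrings per reading. The readings and markers below are placeholders; the real detection of interpretations is certainly more elaborate.

```python
def interpretation_coverage(results, interpretations):
    """Share of a query's possible readings represented in the results.

    `interpretations` maps a reading name to marker substrings
    (hypothetical); a reading counts as covered if any result text
    contains any of its markers.
    """
    texts = [r.lower() for r in results]
    covered = sum(
        1 for markers in interpretations.values()
        if any(m in t for t in texts for m in markers)
    )
    return covered / len(interpretations)

results = ["Squirrel facts and photos", "Feeding squirrels in the park"]
readings = {"animal": {"squirrel", "rodent"}, "protein": {"protein", "amino"}}
print(interpretation_coverage(results, readings))  # 0.5: only one reading shown
```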
Local Search Analyzers
One of the many factors influencing web search is the location from which the search is conducted. It is not of overriding importance, but quite often the set of answers to a query will differ according to the specific region.
Here we inquire into the quality of search conducted from different cities of Russia. The analyzer's queries are submitted by servers located in ten cities as far apart as Krasnodar and Vladivostok. The results are then tested to see if they are region-appropriate.
To populate such an analyzer, queries have to be selected very carefully. For example, it is possible that a user looking for "North Star" is really keen to find out something about a restaurant three blocks away, but it is more likely that he searches for information concerning the celestial body. By contrast, "train schedule" is in most cases a region-dependent query. To make an unbiased evaluation of local search quality, we only use queries that must be answered on the basis of local data.
Analyzer of Regional Search
When a user from Ufa or Novosibirsk enters the query "pizza delivery", he or she is most likely looking to get some pizza. The websites of companies offering pizza delivery in Moscow would probably be of no use to the user since it does not make sense to deliver pizza from Moscow to Novosibirsk.
In this analyzer, we query the search engines from different cities across Russia. We select the search engine results for which the title, snippet or URL clearly mentions the location from where the query was sent. We only evaluate the information available as part of SERP's because this is all that the regional users see when they make decisions as to whether they want to follow a link to a given webpage found in SE results.
The percentage of such responses, averaged over all cities except Moscow, is taken as an estimate of how friendly a search engine is to regional users.
This analyzer embraces both the queries which are clearly asking for a regional service ("order glasses", "chinese food delivery") and the queries which imply a search for some local information ("drug store prices", "medical exam when pregnant")
Analyzer of Regional Navigation
This analyzer is devoted to situations where users are looking for a particular web page, but the relevant page varies depending on where the query is coming from. Navigational search in general is the search for a particular web page, accomplished by queries such as [sberbank] or [gazeta ru].
If the users are looking for a particular web page, they often want a page which is specific to their current location. For example, the users from both Kazan and Moscow expect to find some page from afisha.ru when they input the query [afisha]. However, the user from Kazan probably will not be happy to see the pages of afisha.ru specific to Moscow.
We test the search engines on 100 queries submitted from different locations within Russia. The sets of queries used in each city are slightly different depending on whether the relevant regional pages exist. For each query, the analyzer records if the relevant regional website (or the regional part of the general website) was found.
For each search engine we calculate the percentage of queries for which the relevant regional response appears in top 10 results.
Analyzer of Loading Time
Users expect search engines to give them results as fast as possible. Even small differences in the time it takes the SERP to load may be a reason for a user to prefer one search engine over another.
This analyzer shows the time it takes for users in different Russian cities to get their search results. The SERP loading time data are collected at the same time as the Regional Search analyzer results, hence for the same cities and for the same queries.
Of course the loading time is affected by multiple factors, such as the quality and overall usage of the connection. However, if some search engine is slower to respond than the others on a substantial number of queries every day, this is a tendency worth analyzing.
This analyzer also estimates the size of SERP's. The data on SERP size can be seen by clicking on the corresponding link below the graph. Just as most modern web browsers do, we download compressed SERP's.
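A rough sketch of the measurement idea: time the fetch and record the gzip-compressed size, as a browser would transfer it. The fetcher below is a stand-in for illustration; a real run would issue HTTP requests with an `Accept-Encoding: gzip` header from servers in each city.

```python
import gzip
import time

def serp_metrics(fetch_html):
    """Measure SERP load time and approximate over-the-wire size.

    `fetch_html` is any callable returning the raw SERP HTML; we gzip
    the result ourselves to approximate the compressed transfer size.
    """
    start = time.perf_counter()
    html = fetch_html()
    elapsed = time.perf_counter() - start
    compressed_size = len(gzip.compress(html.encode("utf-8")))
    return elapsed, compressed_size

# stand-in fetcher: a repetitive fake SERP compresses very well
elapsed, size = serp_metrics(lambda: "<html>" + "result " * 1000 + "</html>")
print(size < 1000)  # True: gzip shrinks the repetitive page drastically
```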
Query Comprehension Analyzers
Of course, the query comprehension by a search engine is but a metaphor. Yet one eventually gets used to the search engines' "creativity" in the interpretation of queries: their search prompts, spelling corrections, their attempts at refining or improving the query, or re-ordering the answers according to the assumed object of inquiry etc.
At present, the analyzers in this group test such relatively primitive skills of the search engines as correcting typos, giving prompts, and expanding queries with synonyms. Some more refined techniques (query interpretation, handling of ambiguous queries and so on) can only be estimated indirectly, through the mistakes that arise when the query is misunderstood by the search engine. (The Search Engine Mistake Analyzers, several in number, are arranged in a distinct group.) But then, it is the occasional mistakes that most clearly demonstrate a SE's skill in processing queries. When the search results are correct, we just don't pay attention to the search engine's "tricks"; these tricks are only exposed through incongruous or funny output.
Analyzer of Correct Hints
Most of the search engines attempt to suggest a correct spelling for a query in case a typo is suspected. The quality of such hints is an important addition to the overall quality of the search. This analyzer looks for the correct hint in the search results for a query with a deliberate typo and estimates the number of occurrences of a 'correct' query contained in the hint.
The evaluation is based on the same set of queries containing typos that is used for the typo resistance analyzer. The more correct hints are given, the higher the search engine's score in this analyzer.
Typo Resistance Analyzer
Humans are not machines: they make mistakes. This includes mistakes made while typing in a search query: an adjacent key pressed by accident ("quety" instead of "query"), a doubled or missing character ("queery" or "qury"), or, after all, a word typed 'by ear' without knowing the correct spelling ("yandax" instead of "yandex").
In this case, the search engine can adhere to one of the following strategies:
1) no processing: search with exact spelling only
2) recognize the typo, but still search for the entered query with an additional hint: "perhaps you were looking for [correct spelling]?"
3) recognize the typo and search for the correct spelling immediately
Depending on the chosen strategy, the user either remains unaware of the mistake, or notices it and makes an extra click (at their own discretion), or gets the correct results without ever noticing the mistake.
This analyzer compares the search results of the "correct query" and several forms of its possible mistypings. The similarity of results to those of a "correct" query is evaluated.
Apart from deliberate typo correction, matches can arise in four cases:
1) accidentally;
2) the page contains both the correct and the mistyped spelling;
3) incorrect reaction of the engine's morphology (e.g., the unknown word "mushroomz", a typo for "mushrooms", is reduced to "mushroom");
4) promotion of the same websites for both the correct and incorrect spellings of queries.
All of these cases produce noise in this analyzer: an accidental match of results.
The similarity is evaluated in the same way as for the update analyzer but with a different set of queries.
The more matching results are registered, the higher the search engine's score in this analyzer. This score determines the order of search engines in the analyzer's informer.
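One plausible way to score the similarity of the two result lists is the share of overlapping URLs in the top 10. This is a stand-in for the analyzer's actual measure, which is not spelled out here.

```python
def result_similarity(urls_a, urls_b, top_n=10):
    """Similarity of two result lists as the share of overlapping URLs
    among the top N (an illustrative stand-in measure)."""
    a, b = set(urls_a[:top_n]), set(urls_b[:top_n])
    if not a or not b:
        return 0.0
    return len(a & b) / max(len(a), len(b))

correct = ["u%d.ru" % i for i in range(10)]          # results for "query"
mistyped = correct[:5] + ["v%d.ru" % i for i in range(5)]  # results for "quety"
print(result_similarity(correct, mistyped))  # 0.5
```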
In the future, we will introduce rotation of the typo query sets, drawn from a wide pool.
Analyzer of the Synonymous Queries
One and the same query can be formulated in many different ways. For example, the queries like "how to deduce the address from the telephone number", "search for address by telephone number", and "find an address by telephone number" have the same meaning for the user, these are synonymous queries.
A number of different practices lead to synonymous queries. Among them:
- using well-known abbreviations: "weather in Saint Petersburg" and "weather in SPb";
- using transliteration: "gazeta ru website" and "газета ру website";
- varying word order: "repair automatic transmission" and "automatic transmission repair".
In most of these cases, the users expect to get information related to the meaning of the query, not to its particular formulation. We can take this to imply that the search engines should yield identical results in response to synonymous queries, despite the differences in query formulation.
The analyzer of the synonymous queries assesses the (presumably unnecessary) variability of responses to synonymous queries. The search engines are sorted by the amount of such variability.
All the queries used in this analyzer are real, they were acquired from Rambler's query statistics tool (http://adstat.rambler.ru/wrds/). Note also that we are excluding the queries containing grammatical or spelling errors from this analyzer.
Analyzers of Search Engine Mistakes
Search engine mistakes are the negative side of its ability to refine and to understand the query (discussed in the Query Comprehension section). The queries in this group were sought out to demonstrate such mistakes in a most vivid and obvious manner. They might seem exotic, but every active search engine user knows that such mishaps tend to occur quite regularly.
When your favourite search engine makes a fool of itself, your emotions may range from laughter to utter rage, but in any case your loyalty to it suffers a crack. The user who is only mildly content when a search engine functions as it should feels righteously indignant at the sight of a blunder. This unavoidable irritation is a risk factor that should not be underrated by developers.
Query Substitution Analyzer
The Query Substitution Analyzer is the first in a new series of analyzers demonstrating mistakes in the functioning of search engines. The words chosen for queries are correct, existing, meaningful words, but as they are rather infrequent, the search engines often don't take on the challenge of finding them. Instead, they offer the searcher output based on some other query which more or less resembles the original one but, it would seem, is asked more often.
As a result, the query is not answered. What's more, the searcher gets mildly insulted, for some search engines spice their wrong results with a prompt remark like 'We have corrected the misprint', presuming the searcher does not know the correct spelling of the word he is looking for.
The analyzer calculates the percentage of wrong search results (those not containing any form of the original query). A Latin transliteration is counted as a correct result whenever applicable.
Two additional parameters are counted: 1) the percentage of automatic query substitutions (the search engine treats the query as a misprint, which it is not); 2) the percentage of wrong suggestions, like 'you were probably looking for...'. Suggestions are a less rude form of interference with the searcher's wishes. Still, when someone suspects that your search for 'dyery' was probably made under the spell of insanity and your real purpose was to find a 'diary', it is bound to drive you nuts.
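The main metric, the share of results containing no form of the original query, might be approximated with a crude stem match as below. Real morphology handling is subtler; the stem-prefix rule here is an assumption for illustration.

```python
import re

def substitution_rate(query_stem, snippets):
    """Share of results whose snippet contains no form of the query word.

    `query_stem` is a crude stem: any word starting with it counts as a
    form of the query (a simplification of real morphological matching).
    """
    pattern = re.compile(r"\b" + re.escape(query_stem) + r"\w*", re.IGNORECASE)
    wrong = sum(1 for s in snippets if not pattern.search(s))
    return wrong / len(snippets)

snippets = [
    "Dyery and dyeing workshops of the 19th century",  # keeps the query word
    "Keep a diary every day",                          # query was substituted
    "The dyer's trade in old Moscow",                  # morphological form
]
print(substitution_rate("dyer", snippets))  # one wrong result out of three
```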
Far be it from us to claim that all query corrections and suggestions are an unqualified evil. There are many cases when they really serve their purpose (as you can see in our Typo Resistance Analyzer). But search developers have to be extremely careful when applying such strong measures, for their misuse might harm the search engine's image.
Phrasal Queries Substitution Analyser
Next in the series of Search Engine Mistakes analyzers is the Phrase Substitution Analyzer. As in the Query Substitution Analyzer, the queries here are existing, meaningful, but infrequent language units. They are just longer units: phrases instead of single words. One would therefore think they must be found more easily, since the meaning of the second word narrows down the meaning of the first, rare one. In this case any replacement, or even a suggestion, seems absolutely senseless: it will prevent the user from finding what he is looking for and will probably irritate him. That is why we opted for checking this aspect of search engine functioning.
The analyzer calculates the percentage of 'bad' snippets, i.e. snippets that do not contain the words of the query, since this is often a sign that the query words have been replaced. In some cases it has proved useful to check the snippets for only part of the query; this is done when insisting on both words could distort the control procedure. Synonyms and transliterations are counted as correct responses whenever possible.
Two additional metrics are the percentage of explicit query replacements and the percentage of replacements accompanied by a suggestion.
Names Search Analyzer
Trivial as it may seem to look for information about a person by entering his or her name, it often proves tricky: instead of finding the person needed, we stumble upon their famous namesakes.
One of the most persistent search engine mistakes is mixing up names and surnames (like splitting 'Richard Shakespeare' into 'Richard III' and 'William Shakespeare'), so the queries in the analyzer were chosen with a view to making such results more probable. We only used the names of real people we found on the web, but their surnames either coincided with those of hot media personalities or were in some other way non-standard and difficult for a search engine to find.
The analyzer automatically divides the names it finds in the snippets and titles of the first output page into hits and misses, according to whether the right person was found. Then we perform a manual check. It is necessary for disentangling several complicated situations, for instance when the name and the surname are glued together at random (which would look strange in English, but can easily happen in Russian with its freer word order). The results with name initials are also checked manually, so that the context can help decide upon the appropriate answer.
The resulting number constitutes the average ratio of correctly found pages to the overall number of pages.
Unbreakable Queries Analyzer
Search queries most often consist not of a single word, but of word chains or word phrases. Sometimes the relation between the words in such a chain is loose enough, but there are cases when the phrase is actually indivisible, expressing a single notion through two or more words, e.g. 'bad bishop' (a chess term that clearly shouldn't be echoed by 'America's worst bishops' in the output). In such cases it is the task of the search engine to preserve the wholeness of the phrase, at least in the top results. Otherwise, the meaning of the query gets distorted.
The unbreakability analyzer continues the series of search engine mistake analyzers. It measures the percentage of cases when a search engine does not destroy the meaning of an unbreakable query by separating its words. For every query we check the presence of the searched phrase in the output. In a minority of cases the phrase can be split by a very limited number of words (like 'Black Volta', which can appear in 'Black and White Volta' without any change in meaning); such cases are counted as positive results. Words that end up adjacent by accident are also counted as positive, provided there is no punctuation mark between them.
As usual, the average number of positive results for all the queries appears as the search engine's overall result in the informer.
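The per-snippet check might look like this: the phrase words must occur in order, separated by at most a small gap of plain words and no punctuation, as described above. The gap limit of two words is an assumption for illustration.

```python
import re

def phrase_intact(phrase, snippet, max_gap=2):
    """True if the words of `phrase` occur in order in `snippet`,
    separated by at most `max_gap` other words and no punctuation
    (mirroring the tolerance for e.g. 'Black [and White] Volta')."""
    words = [re.escape(w) for w in phrase.lower().split()]
    sep = r"(?:\s+\w+){0,%d}\s+" % max_gap  # up to max_gap plain words between
    pattern = re.compile(sep.join(r"\b%s\b" % w for w in words), re.IGNORECASE)
    return bool(pattern.search(snippet))

print(phrase_intact("black volta", "the Black and White Volta rivers"))  # True
print(phrase_intact("bad bishop", "bad, bishop said the reporter"))      # False
```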
Grammar Analyzer
The analyzer of grammatical misinterpretation has a certain linguistic elegance. It deals with cases when a specific word-form belongs to more than one paradigm, like 'miss' in 'Miss Juliet' vs. 'I miss you terribly'. Naturally, such situations are more intricate in languages with rich inflectional morphology, like Russian, where a suffix or an ending increases the chance of there being a homonym to the word in question.
The task facing search engines is to choose the more appropriate paradigm, and hence the more appropriate meaning, of the two and to adjust the output accordingly. The instrument of the correct choice is the context, however short: even in two-word queries the second word almost invariably narrows the field of possible meanings for the first one.
We have to admit that from this point of view the functioning of the investigated search engines has improved impressively over the last few years. Not so long ago the paradigms got confused so grossly that, in order to evaluate the output, you only had to cross off all the wrong word-forms. Nowadays, search engines tend not to deviate too much from the given word-form, so there is no choice but to base the checking procedure on the context. It also means the queries should become more sophisticated, or rarer, or both. However, all the queries we use are quite realistic, so the search engines should actually be concerned with the results of this analyzer.
It should be noted that the result of this analyzer, despite its belonging to the Mistakes group, shows the percentage of positive answers in the search engines' output.
Disturbing Content Analyzers
However well the search engine works, there are still small annoying things that can easily dampen the user's spirits and significantly shake his loyalty to a specific SE. Among these are, e.g., the danger of contracting a computer virus, the presence of irritating and obtrusive ads, etc.
Whereas the amount of advertisements or dangerous scripts on websites is not for the search engines to control, the concentration of objectionable content in the output is totally in their power. So, if the websites in the output abound with annoying factors, the search engine would only benefit from ranking such sites far below high-quality, safe ones.
Most analyzers of this group make use of specific techniques developed by "Ashmanov & Partners" for detecting ads, pornography, computer viruses etc. To make the results more revealing, we selected marker queries for which the probability of undesirable results is higher than usual.
Analyzer of Search Spam
At "Ashmanov and Partners" we study the phenomenon of search spam – the methods and technologies reducing the quality of search results and interfering with the operation of search engines.
Search spam is a text, URL, technology, program code or other web elements created by the web-master for the sole purpose of promoting the site in search engines' results, and not for a fast and reliable search based on complete and authentic information.
The experts check Top 10 results of search queries on a regular basis, marking the sites which, in their opinion, contain elements of search spam. The collected data is entered into the informer. It shows the percentage of sites marked as spam in the overall number of sites that appeared in Top 10 of analyzed queries.
The source of information on the spam status of a given URL is the data of the anti-spam lab of the company "Ashmanov and Partners". The following categories of search spam are used:
* doorway – definite spam: doorways, leading the user to other pages,
* spamcatalog – definite spam: spammer catalogues,
* spamcontent – definite spam: spammers' stolen content,
* pseudosite – definite spam: site disguised as corporate (pseudo-company),
* catalog – catalogues,
* board – bulletin boards,
* domainsale – domains for sale,
* secondary – secondary, stolen content,
* partner – any partner programs,
* linksite – link support site,
* spamforum – forum containing spam,
* techspam – technical spam,
* searchres – search results,
* cj – circular jerk.
The aggregate indicator is the share of spam sites in the search results; the best search engine has the lowest indicator. This determines the order of search engines in the informer of the analyzer.
Pornography Analyzer
This analyzer collects search results for ambiguous queries which may be interpreted as targeting a certain category of pornography, but also admit other interpretations. No queries which unambiguously indicate that the user is searching for porn are included.
For instance, the query [stockings] could come from a user looking for a stockings shop or for the corresponding category of pornography.
When search engines include pornographic results in responses to such queries, they run the risk of showing these results to users (potentially including minors) who were not looking for adult content.
We do not claim that it is «bad» or «immoral» for a search engine to return pornographic pages in response to unambiguously pornographic queries. However the appearance of adult results in responses to «regular» queries is a drawback in our opinion.
We make use of the 'Semantic Mirror' technology developed at 'Ashmanov and Partners' in order to detect adult content.
For every search engine, we calculate the percentage of pages with adult content among the top 10 results.
Analyzer of the SafeSearch Quality
SafeSearch is a filter that is supposed to block the «adult only» websites from appearing in the search results. Nowadays most search engines allow their users to filter the results in this way. But does SafeSearch do its job well enough?
This analyzer collects search results for ambiguous queries which may be interpreted as targeting a certain category of pornography, but also admit other interpretations. The question is whether SafeSearch effectively filters out the adult responses in such cases.
The set of queries in this analyzer is the same as in the analyzer of the presence of adult sites in search results. We submit these queries to the search engines with the SafeSearch option turned on wherever possible. In this case, the ideal results should contain no adult pages at all, nor any results with obscene words. If adult pages still show up in the search results, the search engine is not keeping its promise to protect the user with SafeSearch.
We make use of the Semantic Mirror technology developed at 'Ashmanov and Partners' in order to detect pornography. For every search engine, we calculate the percentage of pages with adult content among the top 10 results.
Obscene Language Analyzer
The Analyzer checks for the presence and quantity of obscene words on the pages found by the search engines in response to neutral queries (queries that neither contain obscenities nor suggest that the response should contain them). By default the measurement is taken with "Safe search" on, so the result should normally be zero; measurements with ordinary search options are also available.
There are two indicators in the analyzer. The simple indicator counts the pages found by the search engine that contain at least one obscene word in its regular or masked form (the law makes no distinction between the two, provided the word is easily recognized).
The default indicator is more informative: it measures the quantity of obscene words on the pages found. Each clearly shown word is worth 3 points, while a masked one is worth 1. The points are summed over all the pages found to give the search engine's result for a specific query. The mean value over all queries then forms the analyzer's indicator for a given search engine.
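As a sketch, the default indicator can be computed like this (the per-page word counts here are purely illustrative):

```python
# Sketch of the default indicator: a clearly shown obscene word is worth
# 3 points, a masked one 1 point; points are summed over all pages found
# for a query, then averaged over queries.

def page_score(clear_words, masked_words):
    """Points for a single result page."""
    return 3 * clear_words + 1 * masked_words

def engine_score(queries):
    """Mean over queries of the summed page scores.

    `queries` maps a query to a list of (clear, masked) counts,
    one pair per page found for that query.
    """
    per_query = [
        sum(page_score(c, m) for c, m in pages)
        for pages in queries.values()
    ]
    return sum(per_query) / len(per_query)

# Example with illustrative counts for two queries:
results = {
    "query A": [(1, 0), (0, 2)],   # 3 + 2 = 5 points
    "query B": [(0, 0)],           # clean results: 0 points
}
print(engine_score(results))       # (5 + 0) / 2 = 2.5
```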
The recognition of obscene words is performed by the automatic service "Semantic Mirror".
Intrusive Ads Analyzer
Web pages with similar content may differ in the amount of ads. A search engine that favors pages with less intrusive ads is thus better for the user.
The analyzer is based on an ad recognition technology that assesses the scripts, iframes and other code on the page. Each page is assigned a total intrusiveness score. For each SERP, a weighted average of the intrusiveness scores is taken. The average intrusiveness over all SERPs is then the search engine's score in this analyzer.
The intrusiveness of a particular web page is the sum of the intrusiveness scores of all individual ads on that page. The intrusiveness of an ad is calculated as follows:
- context ads and small simple banners receive an intrusiveness score of 1
- teasers and bigger, flashier banners receive 3
- some ads have to be clicked on in order to be closed. These include clickunders (separate windows which open upon a click anywhere on a page) and scrolling banners which cover up part of a page. These ads receive a score of 9.
- finally, some particularly nasty ads have to be clicked more than once in order to be closed. These receive a score of 18.
For each web page found, the intrusiveness is adjusted according to the position of the page in the SERP. The weight of each result reflects the probability that the link will be clicked, and the weights of all positions average to 1 (position 1 has the weight of 1.5 and position 10 has the weight of 0.4). Thus, if the sites with intrusive ads are randomly distributed along the SERP, the average intrusiveness won't be affected by the positional weight.
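A minimal sketch of the SERP-level calculation follows. The four ad categories are taken from the text above; the intermediate positional weights are not published here, so the sketch assumes a linear ramp from 1.5 (position 1) down to 0.4 (position 10), rescaled so that the ten weights average to exactly 1:

```python
# Sketch of the SERP score: page intrusiveness is the sum of its ads'
# scores; the SERP score is the positionally weighted average over the
# ten results. The linear ramp for intermediate weights is an assumption.

AD_SCORES = {"context": 1, "teaser": 3, "clickunder": 9, "multi_click": 18}

def page_intrusiveness(ads):
    """Total intrusiveness of one page: sum over its individual ads."""
    return sum(AD_SCORES[a] for a in ads)

def positional_weights(n=10, first=1.5, last=0.4):
    """Assumed linear ramp, rescaled so the weights average to 1."""
    raw = [first + (last - first) * i / (n - 1) for i in range(n)]
    mean = sum(raw) / n
    return [w / mean for w in raw]

def serp_score(pages):
    """Weighted average of page scores over the SERP."""
    weights = positional_weights(len(pages))
    return sum(w * page_intrusiveness(p)
               for w, p in zip(weights, pages)) / len(pages)

# If every page carried one context ad, the weighting changes nothing:
uniform = [["context"]] * 10
print(serp_score(uniform))   # ~ 1.0
```

As the text notes, this construction guarantees that a random distribution of intrusive sites along the SERP leaves the average unaffected by the positional weights.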
This analyzer looks at queries seeking popular songs or video content. These queries often yield pages overloaded with ads.
Note that web page assessment is based on the HTML code of the page, while the visibility of ads on a particular page may depend on the user's browser and cookies.
Adult Content Ads Analyzer
Probably the most intrusive ads on the web are those with adult content. The presence of such ads on a web page irritates users. Therefore a search engine which finds fewer web pages with pornographic ads is better for the user.
This analyzer has two tabs and both of them look at queries seeking popular songs or video content (these queries are also used in the Intrusive Ads Analyzer). These queries often yield pages overloaded with ads.
The analyzer is based on an ads recognition technology which assigns a total intrusiveness score to each page. In the first tab (“% of sites with porno-ads”) each search engine gets a score which is equal to the percentage of the pages found containing explicitly pornographic ads. Each page contributes to the search engine's score if it has at least one advertisement containing pornographic pictures. The ads which are not literally pornographic but can be considered blatant or improper are not taken into account in this tab. The scores of the search engines also do not reflect the amount, size and position of the pornographic ads on the pages found. However in most cases if the porno ads are present they occur in a prominent position on a page.
The other tab (“improper + porn”) shows the mean intrusiveness of all adult advertisements on the page. Here the amount and size of the ads, as well as the position of the website in search results is taken into account. For each ad, the intrusiveness is multiplied by a coefficient reflecting the status of this ad with respect to pornography. The following coefficients are used:
0 - an ad is not pornographic or adult in any way
0.5 - an ad sometimes contains some improper pictures, but sometimes these pictures are absent
1 - an ad always contains improper pictures
10 - an ad always contains explicitly pornographic pictures
The score of a particular web page is the sum of the adjusted porno-intrusiveness scores of all individual ads on the page. The intrusiveness itself depends on the size and flashiness of an ad, as well as on whether the ad covers part of the page's content and requires the user to make a click. You can read more about how the intrusiveness score is calculated here. Although the adjusted porno-intrusiveness is on average similar to the intrusiveness of the ads, these two measures cannot be compared directly.
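The adjusted score of a single page can be sketched as follows; the status names and the (intrusiveness, status) pairs below are illustrative assumptions, while the coefficients are the ones listed above:

```python
# Sketch of the adjusted porno-intrusiveness of one page: each ad's base
# intrusiveness is multiplied by its adult-content coefficient, and the
# products are summed over the page's ads.

PORN_COEFF = {
    "clean": 0,        # not pornographic or adult in any way
    "sometimes": 0.5,  # improper pictures appear only occasionally
    "improper": 1,     # always contains improper pictures
    "porn": 10,        # always explicitly pornographic
}

def page_porn_score(ads):
    """Sum over ads of intrusiveness x adult-content coefficient."""
    return sum(intr * PORN_COEFF[status] for intr, status in ads)

# A page with a clean context ad (1), an improper teaser (3)
# and a pornographic clickunder (9):
ads = [(1, "clean"), (3, "improper"), (9, "porn")]
print(page_porn_score(ads))   # 0 + 3 + 90 = 93
```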
Analyzer of Viruses and Malware
Search engines try to prevent their users from accessing infected or fraudulent web pages. How efficient is this security measure?
This analyzer shows the percentage of infected web pages in each search engine's results. Each web page is analyzed based on the sources of scripts and frames/iframes. In addition to viruses, the Russian web is overwhelmed with ads which are trying to fool people into downloading fraudulent files. These ads are marked as "fraudulent downloads" in the analyzer.
The number of threats on a particular web page is displayed, but it does not influence the overall score of a search engine: a page with 5 threats contributes as much as a page with 1 threat. Each web page is individually checked for viruses based on antivirus software reports, webmaster opinions and other sources.
By default, the analyzer takes into account all kinds of threats. The "% of sites with viruses" tab only takes into account the sites infected with viruses, the sites with fraudulent ads are excluded on that tab.
Some search engines warn their users that a particular page may be a threat. However, for the kind of queries used in this analyzer, such warnings may not be sufficient. This analyzer inspects queries devoted to downloading, listening to or watching music and video clips - the same queries as in the Intrusive Ads Analyzer. These queries yield thousands of result pages with roughly the same content. In such a situation it would seem reasonable not just to warn the users about a possible threat, but to rank the infected websites lower so that an uninfected site is displayed in a high position instead. The data in this analyzer can be displayed based on all the threats (the "All" tab) or, alternatively, based only on the pages for which no warning is shown on the SERP (the "unmarked on SERP" tab).
Analyzers of Additional Search Features
The object of the user's immediate interest is the set of answers he receives to his query. Nevertheless, his overall impression may also be affected by smaller technical factors, not immediately discernible on the SERP.
These analyzers compare some additional parameters of the search engines' functioning that may be of considerable interest to the user. It is worth noting that comparative values bear more importance here than absolute numbers.
Analyzer of Loading Time
Users expect search engines to deliver results as fast as possible. Even small differences in the time it takes for the SERP to load may be a reason for a user to prefer one search engine over another.
This analyzer shows the time it takes for users in different Russian cities to get their search results. The SERP loading time data are collected at the same time as the Regional Search analyzer results, hence for the same cities and for the same queries.
Of course, the loading time is affected by multiple factors, such as the quality and overall load of the connection. However, if some search engine is slower to respond than the others on a substantial number of queries every day, that is a tendency worth analyzing.
This analyzer also estimates the size of SERPs. The data on SERP size can be seen by clicking the corresponding link below the graph. Just like most modern web browsers, we download the compressed SERPs.
Analyzer of Indexation Time
How long does it take for the newly created pages to start showing up in search results?
We selected about two thousand websites from the Russian Internet. All of these websites inform the search engines about the updates using a special sitemap file (sitemap.xml). All of these websites are generally indexed by the search engines: this is confirmed by the fact that the random pages from these websites are almost always found among the first 10 hits in response to the queries identical to the titles of the relevant pages (the text within <title>). So how long does it take for the newly created pages to start appearing among top 10 results for corresponding queries?
We scan the sitemap files for new pages every day (each website may contribute up to 3 pages per day). Search engines are of course able to do the same - after all this is the purpose of the sitemap files.
For every new page, we record the title and the URL. Within a month from the date when the new page was detected, we check if this page is found in top 10 in response to two types of queries:
a) a query consisting of the title text, with the search restricted to the relevant website;
b) a regular text query using the title
The first type of query assesses the speed of indexation; the second, the time it takes for new pages to appear in ordinary search results.
The score in this analyzer is calculated as the mean visibility of the new pages within the last month. We also report the data for smaller periods.
Interestingly, there seem to be pages which appear in the search results in the first couple of days after they were detected but disappear by the end of the first week.
If the users are looking for a particular web page, they often want a page which is specific to their current location. For example, the users from both Kazan and Moscow expect to find some page from afisha.ru when they input the query [afisha]. However, the user from Kazan probably will not be happy to see the pages of afisha.ru specific to Moscow.
For each search engine we calculate the percentage of queries for which the relevant regional response appears in top 10 results.
Queries consisting of a first and a last name often target a particular web page - the official personal web page of the person named in the query. Even if the user is not sure that the personal site exists, such a website is an obvious hit and should appear in the top 10 search results.
The queries for this analyzer include the names of celebrities as well as people who are well-known in some domain (for example, photographers, scientists, psychologists etc).
The analysis of queries and hits in this analyzer is analogous to the Analyzer of Navigational Search.
Analyzer of Blog Search
Many celebrities have blogs. In fact many people got famous because of their blog. The popular blogs often have more readers than many mass media.
A query containing a first and a last name is thus often targeting a blog (or a microblog, social network page etc) of the given person. The search engine results for these queries should contain the blogs.
This analyzer complements the analyzer of person search. It assesses whether blogs appear in the results page for name queries. The search engine scores are calculated in the same way as in the analyzer of navigational search, with one exception: we do not take into account the position of the target page in search results, since official web pages are arguably more relevant responses than blogs for name queries.
The queries in this analyzer are generally in the first name + last name format. We specifically wanted to include people who got famous because of their blog.
Social Network Navigation Analyzer
The audience of social networks grows by the day, as does the time spent there by an average user. That is why most businesses have found it convenient to set up their own page on one or more social networks. Nowadays, every respectable hairdresser or garage owner can be found on Facebook or similar. Moreover, the social network page often becomes the only official source of information about the company.
That means that for many small businesses or business-like structures, not being findable on Facebook / VKontakte / Livejournal etc. means not being found at all: there is simply no other place to look for them. This is how the queries for the Analyzer were chosen: we only used companies that have no website other than their social network page (and naturally, we took care to gather all such pages for each company). The Analyzer thus evaluates the web user's chances of finding official information about smaller organizations. At the same time, together with the Blog Search Analyzer, it evaluates the overall quality of search over the social network segment of the internet, a parameter that is becoming extremely significant.
The principle of the Analyzer's work is the same as with any other from the Navigation search group: for each query, we check the presence of the given page in the search results as well as its position.
Information Search Analyzers
The largest and least well-defined group of queries are those aimed at finding information in the broad sense of the word. Although an exhaustive survey of all such queries would be impossible, some aspects of informational search come under close scrutiny in this group of analyzers.
Our analyzers cover Quote Search (Quotations, Catch Phrases and partly Originals) and Answer Search. It is very important that a search engine is able (and willing) to distinguish the original information source from its copies or imitations. This is the issue addressed by the Originals analyzer.
We plan to broaden the scope of the search aspects under investigation. Yet even now this scope is wider than it may seem, since a whole series of analyzers in other groups relates directly to information search. Thus, it is usually informational queries that are affected by a search engine's "mistakes". Several of the Data Freshness analyzers also deal with informational queries. And naturally, informational queries make up most of the Assessing analyzers, as they make up most of web search in general.
Analyzer of Quotation Search Quality
Quotation search is the search for the source of a certain text fragment, i.e. either the original text (in which case a larger portion of it should appear on the site), or at least the author and the title of this text.
This analyzer examines 100 queries that consist of significantly long extracts from texts, published on the Web. For each search engine we calculate the percentage of the search results containing at least one of the following: a. the larger fragment of the original text or b. the name of the author and the title of the text.
The positions of the pages in the search results are not taken into consideration. Neither are the sites where the text in question was first published (unlike in the Original Texts analyzer, where the priority of the copyright holder is important).
Catch Phrase Analyzer
This analyzer is devoted to queries containing short popular quotations. These quotations often come from fiction, but they are also used in everyday life.
For example, a Russian user typing the query [контора пишет] is likely to be looking for the meaning or the original source (the title of the text and the name of the author) of the expression. The search engines, however, often provide multiple examples of use of the expression, which is hardly what the user is looking for.
The analyzer examines 100 queries that consist of a popular quotation, the source of which is known. For each search engine we calculate the percentage of the search results containing at least one of the following: a. the given fragment (or one of several fragments) of the text where the quotation comes from or b. the name of the author and the title of the text. The positions of the pages among the search results are not taken into consideration.
Analyzer of Question Answering
This analyzer assesses the ability of search engines to find answers to question queries. Question queries include obvious questions with a question word ([when did CSKA win the UEFA cup], [where is Lhasa located]) as well as other queries that imply an obvious short answer ([currency of Brazil], [nitric acid formula]).
The users typing such queries are most likely looking for the answer to their question. The quicker they get the answer, the better. Ideally, the snippet of the first page in search results should include the answer.
This analyzer has several tabs. In each of the tabs the total score of a search engine is the sum of its scores for each query.
1. Answer position in top 10 snippets
Here a search engine gets a score from 0 to 1 for each query, reflecting the highest position of a snippet, that contained the answer, in top 10 results. For example, if the answer was found in the first snippet, the search engine gets 1 for the query. If it was found in the second place, the score is 0.9 etc. If the answer was not found in the snippets of top 10 results, the score is 0.
2. Answer presence in snippets
Here a search engine gets 1 for a query if at least one of the top 10 snippets contained an answer to the question. 0 is assigned otherwise.
3. Position of answer web pages in top 10
A search engine gets a score from 0 to 1 for each query, reflecting the highest position of a web page, that contained the answer, in top 10 results. The score is 1 if the first page found contained an answer, 0.9 if the answer was present on the second page and 0 if none of the web pages found contained the answer.
4. Answer presence on web pages found
Here a search engine gets 1 for a query if at least one of the top 10 web pages contained an answer to the question. 0 is assigned otherwise.
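The per-query scores for the four tabs can be sketched as follows (the 0.1 step per position follows the examples above; snippet and page tabs use the same two formulas, differing only in where the answer is looked for):

```python
# Sketch of the per-query scores, assuming `pos` is the 1-based rank at
# which the answer was first found, or None if it was not found at all.

def position_score(pos):
    """Tabs 1 and 3: 1.0 for rank 1, 0.9 for rank 2, ... 0 if absent."""
    if pos is None or pos > 10:
        return 0.0
    return 1.0 - 0.1 * (pos - 1)

def presence_score(pos):
    """Tabs 2 and 4: 1 if the answer occurs anywhere in the top 10."""
    return 1 if pos is not None and pos <= 10 else 0

print(position_score(1))     # 1.0
print(position_score(2))     # 0.9
print(position_score(None))  # 0.0
print(presence_score(7))     # 1
```

The total score of a search engine in each tab is then the sum of these per-query values, as described above.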
A question query may have multiple possible answers. For example, the possible answers for [Where are Maldives located] are "Indian ocean" and "Asia".
Analyzer of Original Texts Ranking
Unfortunately, copyrighted content is illegally copied all too often on the web. Any author faces the fact that his or her original texts are being stolen: the text of a newly created article can be copied to some other web page within days or even hours after the article appears. The websites stealing content in this way usually claim that it was "taken from open sources" or "uploaded by one of the users". The stolen content can be used to attract search engine users to a web page, and the resulting traffic can be converted into money. This is the main reason for such 'borrowing'. The ability to identify original texts and rank the corresponding web pages higher than pages containing copied material is a crucial property of any search engine.
The analyzer of the ranking of original texts uses exact quotation queries to daily monitor the position of 100 marker articles. For these articles, the web sites of copyright holders are known. The analyzer can thus calculate the percentage of queries for which the original text is ranked higher than the copied material.
The queries in this analyzer are fragments from the original article. By default the queries are submitted to the search engines in quotation marks. This way we expect that the only responses found will be the original article and its copies. However the real users rarely rely on quotation marks, so an additional tab estimates the results based on queries submitted without quotation marks.
The search engines are sorted by their ability to rank original texts higher than the copies in the informer of this analyzer.
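A minimal sketch of the core check, assuming the SERP is an ordered list of URLs and the copyright holder's site is known for each marker article (all domains below are hypothetical):

```python
# Sketch of the originals check: with quoted-fragment queries, the
# results should contain only the original article and its copies, so
# the original "wins" if it occupies the top position.

def original_ranked_highest(serp, original_site):
    """True if the original outranks every copy in this SERP."""
    return bool(serp) and serp[0].startswith(original_site)

def engine_score(markers):
    """Share of marker queries where the original outranks the copies."""
    wins = sum(original_ranked_highest(serp, site) for serp, site in markers)
    return wins / len(markers)

# Illustrative data: the original wins one of two marker queries.
markers = [
    (["https://original.example/post", "https://copy.example/p"],
     "https://original.example"),
    (["https://copy.example/p", "https://original.example/post"],
     "https://original.example"),
]
print(engine_score(markers))   # 0.5
```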
Analyzer of the Location Search Quality
One of the most frequent and most obvious uses of search engines is for mere geographical navigation. We often want to find out whether a certain organization, business, service etc. is located where it should be, from our point of view. Although the overall search quality for this type of queries is not bad at all, mistakes sometimes occur. E.g., the search engine promptly finds the entity in question in some other city district or even some other city. Or, conversely, it supplies us with similar addresses of other organizations, presuming it doesn't make any difference what we shall do, provided we do it in the right place. This analyzer was made to compare the respective merits of search engines in dealing with such queries.
As in real life, our input queries consist of the organization's name plus an approximate locality, like a city district, a street or a nearby underground station. To make the evaluation more precise, we only use queries where the object of search is one single entity. The ideal output in such a case would be a helper block with a full list of contacts. But at this stage we judged that the mere presence of the correct address in the top snippet suffices for the search engine to get the maximum grade. On the other hand, results containing some other useful information, but not the address, get none.
The results are calculated on the same principle as in the Navigational Analyzer: the higher the snippet with the correct address is found, the more points the search engine gets (the first snippet is worth 1.0 points, the 10th 0.1). In addition, we calculate the share of correct answers found outside the snippets.
Transactional Search Analyzers
Every user who has ever tried to download anything from the Web, or to listen to or watch anything online, is familiar with the price we pay for free content: watching obtrusive ads, waiting for a full download while the download speed falls almost to zero, or even risking catching a computer virus from some suspicious site. To say nothing of the pangs of conscience - we all know that such content is put on the Web with little or no regard for copyright laws.
Fortunately, the amount of software, movies and music you can download or watch and listen to quite legally (usually on the websites of their creators) has grown significantly of late. The problem of finding a good source for this content therefore has an unequivocal solution. It is all the more important that search engines know how to find these correct, official sources, instead of "fun portals" and scam sites.
This analyzer is the first in the intended series of analyzers measuring the search quality for transactional queries, i.e. for queries with a narrow practical goal: to download some file, to listen to a specific piece of music, to buy tickets, to transfer money etc.
Official Versions Search Analyzer
Official Software Search
Looking up a necessary program on the Web is a task most users have faced. However, the search engines frequently offer unofficial and sometimes highly untrustworthy sites where the program is available for download. Can they do better when a free official version of what the user is after is available?
Data Freshness Analyzer
The relevance of a result often depends not only on the content of the information retrieved, but also on its freshness. When the object of the search is a rapidly changing thing, such as news, schedules or forecasts, out-of-date information can be as harmful as it is meaningless, since it may confuse and mislead the user.
The analyzers in this group check for the presence of valid or, vice versa, outdated results in search engine output. (Needless to say, we have picked the types of queries that make such a check sensible: phone numbers and the positions of executive managers.) As it happens, the results marked as "fresh" also tend to become outdated, so we regularly run additional tests of the analyzers' results. If you catch sight of an outdated marker, please do let us know!
The Indexation analyzer should be considered separately: it shows how long it takes for a new Web page to enter the search engines' indexes. Clearly, fast indexation is a necessary prerequisite for maintaining the freshness of search results.
Data Freshness Analyzer: Jobs
This analyzer is the first in a series of analyzers estimating to what extent search results are up to date.
For many queries it is important that a search engine does not give outdated results. For example, a user looking for "president of Zimbabwe" is most likely interested in finding out who the current president of Zimbabwe is. The user will not be satisfied if they only get information about previous presidents.
Within this analyzer, each query is associated with one or more up-to-date response markers as well as one or more outdated response markers. Any web page containing the up-to-date information increases the search engine's score in this analyzer. Conversely, the score is decreased for any page which only contains outdated information. Results which are not recognized as containing either up-to-date or outdated information are not analyzed. Some of these unrecognized results may be irrelevant to the query, but our focus here is only on the freshness of the results.
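A sketch of the per-page scoring under these rules; the marker strings are illustrative, not the analyzer's actual markers:

```python
# Sketch of freshness scoring: +1 for a page containing an up-to-date
# marker, -1 for a page containing only an outdated marker, 0 (ignored)
# for pages where neither kind of marker is recognized.

def page_delta(text, fresh_markers, stale_markers):
    if any(m in text for m in fresh_markers):
        return 1
    if any(m in text for m in stale_markers):
        return -1
    return 0   # unrecognized: not analyzed

# Illustrative markers for the Zimbabwe example above:
fresh = ["Emmerson Mnangagwa"]   # current president
stale = ["Robert Mugabe"]        # previous president

pages = [
    "Emmerson Mnangagwa is the president of Zimbabwe",
    "Robert Mugabe, president of Zimbabwe",
    "Zimbabwe travel guide",
]
print(sum(page_delta(p, fresh, stale) for p in pages))   # 1 - 1 + 0 = 0
```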
We plan to develop new analyzers which will assess actuality of other types of information: recent news, pricing information for everyday goods as well as for stocks and currencies, last-minute deals etc.
Data Freshness: Phone Numbers
Continuing the series of data freshness analyzers, this one estimates the quality of phone numbers' search.
The growing number of outdated numbers in search results is mostly due to the mass renumbering of phones in Russia. Undoubtedly, up-to-date company contacts are very important for both clients and staff. Timely measures taken by search engines save firms from losing new clients, and customers from the irritation of an unsuccessful search.
The queries in this analyzer are the names of companies that have recently changed their contact numbers. Any web page containing the up-to-date information increases the search engine's score; conversely, the score is decreased for any page which only contains outdated information. Pages which contain neither the old nor the updated information are left out of account. Cases where the dialling code matters for the assessment (for instance, 8 (499) xxx-xx-xx in Moscow) are also taken into account.
As the quality coefficient we use the share of web pages with up-to-date phone numbers.
Recall & Diversity Analyzers
As the web search is used for every possible need, it is vital that a search engine be able to find meaningful answers to most diverse questions. This ability depends both on the search engine's scope and on its interpreting potential.
These analyzers reflect the two above-mentioned aspects of search engine functioning: the quantity of the output and its diversity (when allowed by the non-strict sense of the query). To keep the results valid, we regularly put our set of queries to the test.
It is worth noting that since the information that cannot be retrieved by means of a search engine is in some sense unattainable, the relative values in these analyzers are in no way less important than the absolute ones.
Recall analyzer
The recall analyzer estimates the relative size of indices of the Internet search engines.
The total number of documents indexed, as reported by a search engine itself, cannot be used for comparison because different SEs count documents differently. For example, some of them include duplicate documents in the count while others do not. Counting duplicate documents may double the reported index size or increase it even more.
Additionally, index size is a very PR-sensitive issue, as it is one of the very few simple notions in the SE area easily understood by journalists. This means that the bigger the index size you report, the better press you get.
The number of documents found for a particular query does not always reflect the real number of documents indexed by a SE. Almost any frequent query will return tens of thousands of results in all search engines. But the user will never be allowed to see them all: the search session is cut off after the first few hundred results. Thus the exact number of web pages found can only be verified when the number of possible results is very small, that is, for queries containing very rare words.
For multiple-word queries, certain SEs show in the search results not only the documents containing all the words of the query, but also documents containing single words from it. These "tail" documents are usually irrelevant to the query, but counting them can inflate the total number of pages found.
In order to obtain independent and reliable data on the relative index size of the popular SEs, we developed a simple automatic method based on a set of sample queries. We gathered a set of very rare words, each of which occurs several tens of times on the web. Once a day, we count how many of these occurrences are found by each search engine.
To make the data steadier, we use a different set of sample queries from the whole query pool every day.
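The relative recall estimate described above can be sketched in a few lines. This is a simplified illustration, not the production method; the engine names, rare words and hit counts below are hypothetical.

```python
# Sketch: relative index size from rare-word hit counts.
# All data below is hypothetical.
def relative_index_sizes(hit_counts):
    """hit_counts: {engine: {rare_word: occurrences_found}}.
    Returns each engine's total as a share of the leader's total."""
    totals = {engine: sum(words.values()) for engine, words in hit_counts.items()}
    leader = max(totals.values())
    return {engine: total / leader for engine, total in totals.items()}

counts = {
    "EngineA": {"rareword1": 12, "rareword2": 30},
    "EngineB": {"rareword1": 6, "rareword2": 15},
}
print(relative_index_sizes(counts))  # {'EngineA': 1.0, 'EngineB': 0.5}
```

Since the absolute counts are unreliable for the reasons given above, only the relative shares matter for comparison.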
The set of sample queries is constantly replenished by our linguists. If you have some rare words and want to help us cover the 'faraway' areas of the Net, please send us these words, and we will consider including them into the sample queries list.
Analyzer of subject search
A human being is often able to interpret a search query, determine what the user wants, evaluate the information on the Web and form the search results better than a machine. For this reason, the results formed by an expert are always better than those of an algorithm.
This analyzer monitors daily the search results for a set of queries, for which the corresponding links have been selected by experts. It calculates the number of sites found by each search engine that were on the experts' list. For each query, the percentage of similarity between algorithmic and expert results is calculated.
The output of the expert system Neuron is used as the expert opinion. The aggregate indicator of this analyzer is the proportion of results matching the expert opinion, regardless of the position of the matching site(s).
The best search engine is the one with highest value of aggregate indicator. In the informer, the search engines are sorted by the aggregate indicator.
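The aggregate indicator, i.e. the share of an engine's results that also appear on the experts' list regardless of position, can be sketched as follows (the site names are made up):

```python
# Sketch of the aggregate indicator: the fraction of the engine's results
# that the experts also selected, ignoring positions. Toy data.
def expert_overlap(engine_results, expert_sites):
    expert_set = set(expert_sites)
    hits = sum(1 for site in engine_results if site in expert_set)
    return hits / len(engine_results)

serp = ["a.ru", "b.ru", "c.ru", "d.ru"]
experts = ["b.ru", "d.ru", "e.ru"]
print(expert_overlap(serp, experts))  # 0.5
```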
Currently, 18 queries are evaluated. The number of queries will be increased.
Analyzer of Ambiguous Queries
When Russian users search for «белки», do they mean the furry animals or the proteins? There are about a dozen universities which can be referred to as "КГУ", so is the user entering this query looking for Kazan State, Kursk State, or some other educational institution that happens to have the same abbreviated name?
In the case of an ambiguous query, the user could have had in mind any of the "readings". Therefore, instead of showing the results based on the most frequent interpretation, it seems to be the best bet for a search engine to include the results corresponding to all the different readings. This way the users may at least realize that their query is ambiguous which in turn will help them to make the query more precise.
This analyzer estimates the percentage of possible interpretations of the ambiguous queries which are reflected in the search engines' results.
Most of the marker queries contain just one word, and the possible interpretations usually do not include the numerous businesses bearing the name which equals our query. We only included the very famous companies, to which a user could potentially refer with a one-word query.
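As a sketch of the estimate, suppose each reading of an ambiguous query is represented by a small set of marker sites (the readings and sites below are invented):

```python
# Sketch: share of a query's readings that surface in the top results.
# Readings and marker sites are hypothetical.
def interpretation_coverage(top_results, interpretations):
    """interpretations: {reading: set of marker sites}."""
    shown = sum(1 for sites in interpretations.values()
                if any(site in top_results for site in sites))
    return shown / len(interpretations)

readings = {
    "animal": {"zoo.ru", "squirrel-wiki.ru"},
    "protein": {"biochem.ru"},
}
print(interpretation_coverage(["biochem.ru", "news.ru"], readings))  # 0.5
```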
Local Search Analyzers
One of the factors influencing web search is the location from which the search is conducted. It is not of overriding importance, but quite often the set of answers to a query will differ from region to region.
Here we inquire into the quality of search conducted from different cities of Russia. The analyzer's queries are entered by servers located in ten cities as far apart as Krasnodar and Vladivostok. The results are then tested to see if they are region-appropriate.
To fill such an analyzer, you have to select the queries very carefully. For example, it is possible that a user looking for the "North Star" is really keen to find out about a restaurant three blocks away, but it is more likely that he wants information concerning the celestial body. By contrast, "train schedule" is in most cases a region-dependent query. To make an unbiased evaluation of local search quality, we only use queries that must be answered on the basis of local data.
Analyzer of Regional Search
When a user from Ufa or Novosibirsk enters the query "pizza delivery", he or she is most likely looking to get some pizza. The websites of companies offering pizza delivery in Moscow would probably be of no use to the user since it does not make sense to deliver pizza from Moscow to Novosibirsk.
In this analyzer, we query the search engines from different cities across Russia. We select the search engine results for which the title, snippet or URL clearly mentions the location from where the query was sent. We only evaluate the information available as part of SERP's because this is all that the regional users see when they make decisions as to whether they want to follow a link to a given webpage found in SE results.
The percentage of responses to queries averaged over all cities except Moscow is taken to be a good estimate of how friendly the search engine is to the regional users.
This analyzer embraces both the queries which are clearly asking for a regional service ("order glasses", "chinese food delivery") and the queries which imply a search for some local information ("drug store prices", "medical exam when pregnant").
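The per-SERP check can be sketched as follows: a result counts as regional when the query's city is mentioned in its title, snippet or URL. The naive substring matching here is an assumption; real city detection would need morphology and transliteration, and all the data is invented.

```python
# Sketch: share of SERP entries that visibly mention the query's city.
# Substring matching is a simplification; the data is hypothetical.
def regional_share(results, city):
    def mentions(result):
        return any(city.lower() in (result.get(field) or "").lower()
                   for field in ("title", "snippet", "url"))
    return sum(1 for r in results if mentions(r)) / len(results)

serp = [
    {"title": "Pizza delivery in Ufa", "snippet": "", "url": "pizza.ru"},
    {"title": "Pizza in Moscow", "snippet": "", "url": "mospizza.ru"},
]
print(regional_share(serp, "Ufa"))  # 0.5
```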
Analyzer of Regional Navigation
This analyzer is devoted to situations where the users are looking for a particular web page, but the relevant page varies depending on where the query is coming from. Navigational search in general is the search for a particular web page accomplished by queries such as [sberbank] or [gazeta ru].
If the users are looking for a particular web page, they often want a page which is specific to their current location. For example, the users from both Kazan and Moscow expect to find some page from afisha.ru when they input the query [afisha]. However, the user from Kazan probably will not be happy to see the pages of afisha.ru specific to Moscow.
We test the search engines on 100 queries submitted from different locations within Russia. The sets of queries used in each city are slightly different depending on whether the relevant regional pages exist. For each query, the analyzer records if the relevant regional website (or the regional part of the general website) was found.
For each search engine we calculate the percentage of queries for which the relevant regional response appears in top 10 results.
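A minimal sketch of this score, assuming that for every query the relevant regional page is known in advance (all the names below are invented):

```python
# Sketch: percentage of queries whose known regional page is in the top 10.
# Query names, URLs and target pages are hypothetical.
def navigation_score(top10_by_query, regional_page_by_query):
    found = sum(1 for query, top10 in top10_by_query.items()
                if regional_page_by_query[query] in top10)
    return 100.0 * found / len(top10_by_query)

top10s = {"afisha": ["kzn.afisha.ru", "other.ru"],
          "sberbank": ["sberbank.ru"]}
targets = {"afisha": "kzn.afisha.ru", "sberbank": "kzn.sberbank.ru"}
print(navigation_score(top10s, targets))  # 50.0
```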
Analyzer of Loading Time
The users expect that the search engines will give them the results as fast as possible. Even small differences in the time it takes for the SERP to load may be a reason for some user to prefer one search engine over the other.
This analyzer shows the time it takes for users in different Russian cities to get their search results. The SERP loading time data are collected at the same time as the Regional Search analyzer results, hence for the same cities and for the same queries.
Of course the loading time is affected by multiple factors, such as the quality and overall load of the connection. However, if some search engine responds more slowly than the others on a substantial number of queries every day, that is a tendency worth analyzing.
This analyzer also estimates the size of SERP's. The data on SERP size can be seen by clicking on the corresponding link below the graph. Just as most modern web browsers, we download the compressed SERP's.
Query Comprehension Analyzers
Of course, the query comprehension by a search engine is but a metaphor. Yet one eventually gets used to the search engines' "creativity" in the interpretation of queries: their search prompts, spelling corrections, their attempts at refining or improving the query, or re-ordering the answers according to the assumed object of inquiry etc.
At present, the analyzers in this group test such relatively primitive skills of the search engines as correcting typos, giving prompts, and expanding queries with synonyms. Meanwhile, more refined search techniques (query interpretation, dealing with multiple-meaning queries and so on) can only be estimated indirectly, through the mistakes arising when the query is misunderstood by the search engine. (The Search Engine Mistake Analyzers, several in number, are arranged in a distinct group.) But then, it is the occasional mistakes that most clearly demonstrate a SE's skill in processing queries. When the search results are correct, we just don't pay any attention to the search engine's "tricks". These tricks are only exposed through incongruent or funny output.
Analyzer of Correct Hints
Most of the search engines attempt to suggest a correct spelling for a query in case a typo is suspected. The quality of such hints is an important addition to the overall quality of the search. This analyzer looks for the correct hint in the search results for a query with a deliberate typo and estimates the number of occurrences of a 'correct' query contained in the hint.
The evaluation is based on the same set of queries containing typos that is used for the typo resistance analyzer. The more correct hints are given, the higher the search engine's index for this analyzer.
Typo Resistance Analyzer
Humans are not machines; they make mistakes. This includes mistakes while typing in a search query: a typo from an adjacent key pressed by accident ("quety" instead of "query"), a doubled or missing character ("queery" or "qury"), or, finally, a word typed 'by ear' when the user does not know the correct spelling ("yandax" instead of "yandex").
In this case, the search engine can adhere to one of the following strategies:
1) no processing: search with exact spelling only
2) recognize the typo, but still search for the entered query with an additional hint: "perhaps you were looking for [correct spelling]?"
3) recognize the typo and search for the correct spelling immediately
Depending on the chosen strategy, the user either remains unaware of the fact that (s)he is mistaken, or notices it and makes an extra click (up to the user), or gets the correct results without ever noticing his own mistake.
This analyzer compares the search results of the "correct query" and several forms of its possible mistypings. The similarity of results to those of a "correct" query is evaluated.
Apart from deliberate typo correction, matches can arise in four cases:
1) accidentally
2) the page contains both the correct and the mistyped spelling
3) incorrect reaction of the engine's morphology (e.g., the unknown word "mushroomz", which is a typo of "mushrooms", is reduced to "mushroom")
4) promotion of the same websites for both correct and incorrect spellings of queries
All of these cases produce noise in this analyzer: an accidental match of results.
The similarity is evaluated in the same way as for the update analyzer but with a different set of queries.
The more matching results are registered, the higher the search engine's index for this analyzer. This determines the order of search engines in the informer.
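One simple way to express this similarity, shown here as an assumption since the update analyzer's exact formula is not restated in this section, is the share of the mistyped query's results that also appear among the correct query's results:

```python
# Sketch: overlap between the result sets of the correct and mistyped
# query. URLs are hypothetical.
def result_similarity(correct_urls, typo_urls):
    if not typo_urls:
        return 0.0
    return len(set(correct_urls) & set(typo_urls)) / len(typo_urls)

print(result_similarity(["a.ru", "b.ru", "c.ru"],
                        ["b.ru", "c.ru", "x.ru"]))  # 2 of 3 results match
```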
In the future, rotation of typo query sets drawn from a wider pool will be introduced.
Analyzer of the Synonymous Queries
One and the same query can be formulated in many different ways. For example, the queries like "how to deduce the address from the telephone number", "search for address by telephone number", and "find an address by telephone number" have the same meaning for the user, these are synonymous queries.
A number of different practices leads to synonymous queries. Among them:
- using well-known abbreviations: "weather in Saint Petersburg" and "weather in SPb";
- using transliteration: "gazeta ru website" and "газета ру website";
- varying word order: "repair automatic transmission" and "automatic transmission repair".
In most of these cases, the users expect to get information related to the meaning of the query, not to its particular formulation. We can take this to imply that the search engines should yield identical results in response to synonymous queries, despite the differences in query formulation.
The analyzer of the synonymous queries assesses the (presumably unnecessary) variability of responses to synonymous queries. The search engines are sorted by the amount of such variability.
All the queries used in this analyzer are real, they were acquired from Rambler's query statistics tool (http://adstat.rambler.ru/wrds/). Note also that we are excluding the queries containing grammatical or spelling errors from this analyzer.
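The variability can be sketched as the average pairwise dissimilarity between the result lists of the synonymous formulations. The Jaccard distance used here is our illustrative choice, not necessarily the analyzer's exact measure, and the URLs are made up.

```python
# Sketch: average pairwise Jaccard distance between the top-result sets
# of synonymous formulations. 0.0 means identical results everywhere.
from itertools import combinations

def variability(results_per_formulation):
    distances = []
    for a, b in combinations(results_per_formulation, 2):
        a, b = set(a), set(b)
        distances.append(1 - len(a & b) / len(a | b))
    return sum(distances) / len(distances)

forms = [["a.ru", "b.ru"], ["a.ru", "b.ru"], ["a.ru", "c.ru"]]
print(round(variability(forms), 2))  # 0.44
```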
Analyzers of Search Engine Mistakes
Search engine mistakes are the negative side of its ability to refine and to understand the query (discussed in the Query Comprehension section). The queries in this group were sought out to demonstrate such mistakes in a most vivid and obvious manner. They might seem exotic, but every active search engine user knows that such mishaps tend to occur quite regularly.
When your favourite search engine makes a fool of itself, your emotions may vary between laughter and utter rage, but in any case your loyalty to it suffers a crack. The user who is only mildly content when a search engine functions as it should feels downright indignant when he sees a blunder. The resulting irritation is a risk factor that should not be underrated by the developers.
Query Substitution Analyzer
Query substitution analyzer is the first one in the new series of analyzers demonstrating mistakes in the functioning of search engines. The words chosen for queries are correct, existing, meaningful words, but as they are rather infrequent, the search engines often don't take the challenge of finding them. Instead, they offer the searcher an output based on some other query which can more or less resemble the original one, but, as it would seem, is asked more often.
As a result, the query is not answered. What's more, the searcher gets mildly insulted, for some search engines spice their wrong results with a prompt remark like 'We have corrected the misprint', presuming the searcher does not know the correct spelling of the word he is looking for.
The analyzer calculates the percentage of wrong search results (not containing any form of the original query). Latin transliteration is counted as a correct result whenever it is possible.
Two additional parameters are counted: 1) the percentage of automatic query substitution (the search engine treats the query as a misprint when it is not); 2) the percentage of wrong suggestions, like 'you were probably looking for...'. Suggestions are a less rude form of interference with the searcher's wishes. Still, when someone suspects that your search for 'dyery' was probably made under the spell of insanity and that your real purpose was to find a 'diary', it is bound to drive you nuts.
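The main metric, the share of results containing no form of the query word, can be sketched as below. Real form matching requires Russian morphology; the prefix stemming used here is a loudly simplifying stand-in, and all the snippets are invented.

```python
# Sketch: percentage of snippets lacking any form of the query word.
# Prefix stemming stands in for real morphology (an assumption).
def substitution_rate(query_word, snippets, stem_len=4):
    stem = query_word.lower()[:stem_len]
    wrong = sum(1 for snippet in snippets if stem not in snippet.lower())
    return 100.0 * wrong / len(snippets)

snippets = ["dyeries of old England", "keep a diary every day", "a dyery nearby"]
print(substitution_rate("dyery", snippets))  # one of three snippets is wrong
```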
Far be it from us to claim that all query corrections and suggestions are an unmitigated evil. There are many cases when they really serve their purpose (as you can see in our Typo Resistance Analyzer). But search developers have to be extremely careful when applying such powerful means, for their misuse might harm the search engine's image.
Phrasal Queries Substitution Analyzer
Next in the series of Search Engine Mistakes is the Phrase Substitution Analyzer. As in the Query Substitution Analyzer, the queries here are existing and meaningful, but infrequent, language units. They are just longer units: phrases instead of words. One would think they should be found more easily, since the meaning of the second word narrows down the meaning of the first, rare one. In this case any replacement or even suggestion seems absolutely senseless: it will prevent the user from finding what he is looking for and will probably irritate him. That is why we opted to check this aspect of search engine functioning.
The analyzer calculates the percentage of 'bad' snippets, i.e. snippets that do not contain the words of the query, since this is often a sign that the query words have been replaced. In some cases it has proved useful to check the snippets for only a part of the query; this was done when insisting on both words could distort the control procedure. Synonyms and transliterations are counted as correct responses whenever possible.
Two additional metrics are the percentage of explicit query replacement and the percentage of replacement accompanied by a suggestion.
Names Search Analyzer
Trivial as it seems to look for some information about a person by entering his or her name, it often proves tricky: instead of finding the person needed, we stumble upon his famous namesakes.
One of the most persistent search engine mistakes is to make a mixture of names and surnames (like splitting 'Richard Shakespeare' into 'Richard III' and 'William Shakespeare'), so the queries in the analyzer were chosen with a view to making such results more probable. We only used the names of some real people we found on the web, but their surnames either coincided with those of some hot media personalities, or were in some other ways non-standard and difficult for a search engine to find.
The analyzer automatically divides the names it finds in the snippets and titles of the first output page into hits and misses, according to whether the right person was found. Then we perform a manual check. It is necessary for disentangling several complicated situations, for instance, when the name and the surname are glued together at random (which would look strange in English, but can easily happen in Russian with its freer word order). The results with name initials are also checked manually, so that the context might help to decide upon the appropriate answer.
The resulting number constitutes the average ratio of correctly found pages to the overall number of pages.
Unbreakable Queries Analyzer
Search queries most often consist not of a single word, but of word chains or word phrases. Sometimes the relation between the words in such a chain is loose enough, but there are cases when the phrase is actually indivisible, expressing a single notion through two or more words, e.g. 'bad bishop' (a chess term that clearly shouldn't be echoed by 'America's worst bishops' in the output). In such cases it is the task of the search engine to preserve the wholeness of the phrase, at least in the top results. Otherwise, the meaning of the query gets distorted.
The unbreakability analyzer continues the series of search engine mistakes analyzers. It measures the percentage of cases when a search engine does not destroy the meaning of an unbreakable query by separating the words. For every query we check the presence of the searched phrase in the output. In a minority of cases the phrase can be split by a very limited number of words (like 'Black Volta', which can appear in 'Black and White Volta' without any change in the meaning). Such cases are counted as positive results. Also, when the searched words stick together accidentally, they are counted as positive if there is no punctuation mark between them.
As usual, the average number of positive results for all the queries appears as the search engine's overall result in the informer.
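The phrase check can be sketched with a regular expression that tolerates a small number of intervening words but no punctuation between them. The gap limit of two words is our assumption based on the 'Black and White Volta' example; the snippets are invented.

```python
# Sketch: does the snippet keep the unbreakable phrase intact?
# At most max_gap intervening words, and no punctuation in between.
import re

def phrase_intact(snippet, first_word, second_word, max_gap=2):
    gap = r"(?:\w+\s+){0,%d}" % max_gap
    pattern = r"\b%s\s+%s%s\b" % (re.escape(first_word), gap,
                                  re.escape(second_word))
    return re.search(pattern, snippet, re.IGNORECASE) is not None

print(phrase_intact("the Black and White Volta rivers", "Black", "Volta"))  # True
print(phrase_intact("black shoes. Volta was a physicist", "Black", "Volta"))  # False
```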
Grammar Analyzer
The analyzer of grammatical misinterpretation is of a certain linguistic elegance. It deals with cases when a specific word-form belongs to more than one paradigm, like 'miss' in 'Miss Juliet' vs. 'I miss you terribly'. Naturally, such situations are more intricate in languages with a rich inflectional morphology, like Russian, where a suffix or an ending increases the chance of there being a homonym to the word in question.
The task facing search engines is then to choose the more appropriate of the two paradigms, i.e. the more appropriate meaning, and to adjust the output accordingly. The instrument of the correct choice is the context, however short (even in two-word queries the second word almost invariably seems to narrow the field of possible meanings for the first one).
We have to admit that from this point of view the functioning of the investigated search engines has improved impressively over the last few years. Not so long ago the paradigms got confused so grossly that in order to evaluate the output you only had to cross off all the wrong word-forms. Nowadays, search engines tend not to deviate too much from the given word-form, so there is no choice but to base the checking procedure on the context. It also means the queries should become more sophisticated, or rarer, or both. However, all the queries we use are quite realistic, meaning the search engines should actually get concerned with the results of the analyzer.
It is to be noted that the result of this analyzer, despite its belonging to the Mistakes group, shows the percentage of positive answers in the search engines' output.
Disturbing Content Analyzers
A search engine may work as well as it can; there still are small annoying things that can easily damp the user's good spirits and significantly shake his loyalty to a specific SE. These include, e.g., the danger of contracting a computer virus, the presence of irritating and obtrusive ads, etc.
Whereas the amount of advertisements or dangerous scripts on websites is not for the search engines to control, the concentration of objectionable content in the output is totally in their power. So, if the websites in the output abound with annoying factors, the search engine would only benefit from ranking such sites far below high-quality, safe ones.
Most analyzers of this group make use of specific techniques developed by "Ashmanov & Partners" for detecting ads, pornography, computer viruses etc. To make the results more lucid, we sought out marker queries for which the probability of undesirable results is higher than usual.
Analyzer of Search Spam
At "Ashmanov and Partners" we study the phenomenon of search spam – the methods and technologies reducing the quality of search results and interfering with the operation of search engines.
Search spam is a text, URL, technology, program code or other web elements created by the web-master for the sole purpose of promoting the site in search engines' results, and not for a fast and reliable search based on complete and authentic information.
The experts check Top 10 results of search queries on a regular basis, marking the sites which, in their opinion, contain elements of search spam. The collected data is entered into the informer. It shows the percentage of sites marked as spam in the overall number of sites that appeared in Top 10 of analyzed queries.
The source of information on the spam status of a given URL is the data of the anti-spam lab of the company "Ashmanov and Partners". The following categories of search spam are used:
* doorway – definite spam: doorways, leading the user to other pages,
* spamcatalog – definite spam: spammer catalogues,
* spamcontent – definite spam: spammers' stolen content,
* pseudosite – definite spam: site disguised as corporate (pseudo-company),
* catalog – catalogues,
* board – bulletin boards,
* domainsale – domains for sale,
* secondary – secondary, stolen content,
* partner – any partner programs,
* linksite – link support site,
* spamforum – forum containing spam,
* techspam – technical spam,
* searchres – search results
* cj – circular jerk
An aggregate indicator is the share of spam sites in the search results. The best search engine has the lowest indicator. This determines the order of search engines in the informer of the analyzer.
Pornography Analyzer
This analyzer collects search results for ambiguous queries which may be interpreted as targeting a certain category of pornography, but also admit other interpretations. No queries which unambiguously indicate that the user is searching for porn are included.
For instance, the query [stockings] could come from a user looking for a stockings shop or for the corresponding category of pornography.
When search engines include pornographic results in responses to such queries, they run the risk of showing these results to users (potentially including minors) who were not looking for adult content.
We do not claim that it is «bad» or «immoral» for a search engine to return pornographic pages in response to unambiguously pornographic queries. However the appearance of adult results in responses to «regular» queries is a drawback in our opinion.
We make use of the 'Semantic Mirror' technology developed at 'Ashmanov and Partners' in order to detect adult content.
For every search engine, we calculate the percentage of pages with adult content among the top 10 results.
Analyzer of the SafeSearch Quality
SafeSearch is a filter that is supposed to block the «adult only» websites from appearing in the search results. Nowadays most search engines allow their users to filter the results in this way. But does SafeSearch do its job well enough?
This analyzer collects search results for ambiguous queries which may be interpreted as targeting a certain category of pornography, but also admit other interpretations. The question is whether SafeSearch effectively filters out the adult responses in such cases.
The set of queries in this analyzer is the same as in the analyzer of the presence of adult sites in search results. We submit these queries to the search engines with the SafeSearch option turned on wherever possible. In this case, the ideal results should contain no adult pages at all, nor any results with obscene words. If adult pages still show up in the search results, the search engine is not keeping its promise to protect the user with SafeSearch.
We make use of the Semantic Mirror technology developed at 'Ashmanov and Partners' in order to detect pornography. For every search engine, we calculate the percentage of pages with adult content among the top 10 results.
Obscene Language Analyzer
The analyzer checks the presence and the quantity of obscene words on the pages found by the search engines in response to neutral queries (those that do not contain obscenities and do not suppose the response to contain them). The default measurement is taken with "Safe search" on, meaning the result should normally be zero; measurements with ordinary search options are also available.
There are two indicators in the analyzer. The simple indicator counts the pages, found by the search engine, that contain at least one obscene word in its regular or masked form (as the law makes no difference, provided the word is easily recognized).
The default indicator is more informative: it shows the quantity of the obscene words in the pages found. Each word, shown clearly, is worth 3 points, while a masked one is worth 1. The points are summed for all the pages found to give the result of the search engine for a specific query. The mean value for all queries is then calculated to form the analyzer indicator for a given search engine.
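The scoring rule translates directly into code; the per-page counts in the example are hypothetical:

```python
# Sketch of the default indicator: 3 points per clearly shown obscene
# word, 1 point per masked one, summed over the pages of each query's
# results and averaged over all queries.
def page_score(clear_words, masked_words):
    return 3 * clear_words + 1 * masked_words

def analyzer_indicator(per_query_pages):
    """per_query_pages: for each query, a list of (clear, masked) counts."""
    query_scores = [sum(page_score(c, m) for c, m in pages)
                    for pages in per_query_pages]
    return sum(query_scores) / len(query_scores)

# Two queries: a clean SERP, and one with 2 clear + 1 masked word found.
print(analyzer_indicator([[(0, 0)], [(2, 1)]]))  # 3.5
```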
The recognition of obscene words is performed by the automatic service "Semantic Mirror".
Intrusive Ads Analyzer
Web pages with similar content may differ in the amount of ads. A search engine that favors pages with less intrusive ads is thus better for the user.
The analyzer is based on an ads recognition technology assessing the scripts, iframes and other code on the page. Each page is assigned a total intrusiveness score. For each SERP, a weighted average of the intrusiveness scores is taken. The average intrusiveness over all SERP's is then the score of a search engine in this analyzer.
The intrusiveness of a particular web page is the sum of the intrusiveness scores of all individual ads on this page. The intrusiveness of an ad is calculated as follows:
- context ads and small simple banners receive intrusiveness of 1
- teasers and bigger shiny banners receive 3
- some ads have to be clicked on in order to be closed. These include clickunders (separate windows which open upon a click anywhere on a page) and scrolling banners which cover up a part of a page. These ads receive intrusiveness of 9.
- finally, some particularly nasty ads have to be clicked more than once in order to be closed. These receive intrusiveness of 18.
For each web page found, the intrusiveness is adjusted according to the position of the page in the SERP. The weight of each result reflects the probability that the link will be clicked, and the weights of all positions average to 1 (position 1 has a weight of 1.5 and position 10 a weight of 0.4). Thus, if sites with intrusive ads are randomly distributed along the SERP, the average intrusiveness is not affected by the positional weighting.
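Only the endpoint weights (1.5 for position 1, 0.4 for position 10) and the average-to-one constraint are stated above. The linear interpolation in this sketch is our own assumption, renormalized so that the ten weights average exactly 1 (which shifts the endpoints slightly).

```python
# Sketch: positional weighting of page intrusiveness scores.
# The interpolation between the stated endpoint weights is assumed.
def position_weights(n=10, first=1.5, last=0.4):
    raw = [first + (last - first) * i / (n - 1) for i in range(n)]
    mean = sum(raw) / n
    return [w / mean for w in raw]  # renormalize to average exactly 1

def serp_intrusiveness(page_scores):
    """Weighted average intrusiveness of one SERP (top positions count more)."""
    weights = position_weights(len(page_scores))
    return sum(w * s for w, s in zip(weights, page_scores)) / len(page_scores)

# If every result has the same intrusiveness, weighting changes nothing:
print(round(serp_intrusiveness([3.0] * 10), 6))  # 3.0
```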
This analyzer looks at queries seeking popular songs or video content. These queries often yield pages overloaded with ads.
Note that web page assessment is based on html code of a web page while the visibility of ads on a particular page might depend on the user's browser and the cookies.
Adult Content Ads Analyzer
Probably the most intrusive ads on the web are the ads with adult content. The presence of such ads on a web page irritates the users. Therefore a search engine which finds fewer web pages with pornographic ads is better for the user.
This analyzer has two tabs and both of them look at queries seeking popular songs or video content (these queries are also used in the Intrusive Ads Analyzer). These queries often yield pages overloaded with ads.
The analyzer is based on an ads recognition technology which assigns a total intrusiveness score to each page. In the first tab (“% of sites with porno-ads”) each search engine gets a score which is equal to the percentage of the pages found containing explicitly pornographic ads. Each page contributes to the search engine's score if it has at least one advertisement containing pornographic pictures. The ads which are not literally pornographic but can be considered blatant or improper are not taken into account in this tab. The scores of the search engines also do not reflect the amount, size and position of the pornographic ads on the pages found. However in most cases if the porno ads are present they occur in a prominent position on a page.
The other tab (“improper + porn”) shows the mean intrusiveness of all adult advertisements on the page. Here the amount and size of the ads, as well as the position of the website in search results is taken into account. For each ad, the intrusiveness is multiplied by a coefficient reflecting the status of this ad with respect to pornography. The following coefficients are used:
0 - an ad is not pornographic or adult in any way
0.5 - an ad sometimes contains some improper pictures, but sometimes these pictures are absent
1 - an ad always contains improper pictures
10 - an ad always contains explicitly pornographic pictures
The score of a particular web page is the sum of the adjusted porno-intrusiveness scores of all individual ads on the page. The intrusiveness itself depends on the size and shininess of an ad as well as on whether the ad covers part of the page's content and requires the user to make a click. You can read more about how intrusiveness score is calculated here. Although the adjusted porno-intrusiveness is on average similar to the intrusiveness of the ads, these two measures cannot be compared directly.
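Combining the coefficients above with the intrusiveness scores from the Intrusive Ads Analyzer gives the per-page value; the ads listed in the example are hypothetical.

```python
# Sketch: adjusted porno-intrusiveness of one page, using the stated
# coefficients. Each ad is (intrusiveness, porn_status); data is made up.
PORN_COEF = {"none": 0, "sometimes": 0.5, "always": 1, "explicit": 10}

def page_porn_score(ads):
    return sum(intrusiveness * PORN_COEF[status]
               for intrusiveness, status in ads)

ads = [(1, "none"), (3, "sometimes"), (9, "explicit")]
print(page_porn_score(ads))  # 1*0 + 3*0.5 + 9*10 = 91.5
```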
Analyzer of Viruses and Malware
Search engines try to prevent their users from accessing infected or fraudulent web pages. How efficient is this security measure?
This analyzer shows the percentage of infected web pages in each search engine's results. Each web page is analyzed based on the sources of scripts and frames/iframes. In addition to viruses, the Russian web is overwhelmed with ads which are trying to fool people into downloading fraudulent files. These ads are marked as "fraudulent downloads" in the analyzer.
The number of threats on a particular web page is displayed, but it does not influence the overall score of a search engine: a page with 5 threats contributes as much as a page with 1 threat. Each web page is individually checked for viruses based on antivirus software reports, webmaster opinions and other sources.
By default, the analyzer takes into account all kinds of threats. The "% of sites with viruses" tab only takes into account the sites infected with viruses, the sites with fraudulent ads are excluded on that tab.
Some search engines warn their users that a particular page may threaten them. However, for the kind of queries used in this analyzer, such warnings may not be sufficient. This analyzer inspects queries devoted to downloading, listening to or watching music and video clips - the same queries as in the intrusive ads analyzer. These queries yield thousands of result pages with roughly the same content. In such a situation it would seem reasonable not just to warn the users about a possible threat, but to lower the infected websites in search results so that an uninfected site is displayed in a high position instead. The data in this analyzer can be displayed based on all the threats (the "All" tab) or, alternatively, based only on the pages for which no warning is shown on the SERP (the "unmarked on SERP" tab).
Analyzers of Additional Search Features
The object of the user's immediate interest is the set of answers he receives to his query. Nevertheless, his overall impression may well be affected by smaller technical factors, not immediately discernible on the SERP.
These analyzers are meant to compare some additional parameters of the search engines' functioning that may be of considerable interest for the user. It is worth noting that comparative values bear more importance here than the absolute numbers.
Analyzer of Loading Time
The users expect that the search engines will give them the results as fast as possible. Even small differences in the time it takes for the SERP to load may be a reason for some users to prefer one search engine over another.
This analyzer shows the time it takes for users in different Russian cities to get their search results. The SERP loading time data are collected at the same time as the Regional Search analyzer results, hence for the same cities and for the same queries.
Of course, the loading time is affected by multiple factors, such as the quality and overall usage of the connection. However, if some search engine is slower to respond than the others on a substantial number of queries every day, this is a tendency worth analyzing.
This analyzer also estimates the size of the SERPs. The data on SERP size can be seen by clicking on the corresponding link below the graph. Like most modern web browsers, we download the SERPs in compressed form.
Analyzer of Indexation Time
How long does it take for the newly created pages to start showing up in search results?
We selected about two thousand websites from the Russian Internet. All of these websites inform the search engines about the updates using a special sitemap file (sitemap.xml). All of these websites are generally indexed by the search engines: this is confirmed by the fact that the random pages from these websites are almost always found among the first 10 hits in response to the queries identical to the titles of the relevant pages (the text within <title>). So how long does it take for the newly created pages to start appearing among top 10 results for corresponding queries?
We scan the sitemap files for new pages every day (each website may contribute up to 3 pages per day). Search engines are of course able to do the same - after all this is the purpose of the sitemap files.
For every new page, we record the title and the URL. Within a month from the date when the new page was detected, we check if this page is found in top 10 in response to two types of queries:
a) a query which contains the text of the title and restricts the search to the relevant website;
b) a regular text query using the title
The first type of query assesses the speed of indexation. The second type of query assesses the time it takes for the new pages to appear in regular search results.
The score in this analyzer is calculated as the mean visibility of the new pages within the last month. We also report the data for smaller periods.
Interestingly, there seem to be pages which appear in the search results in the first couple of days after they were detected but disappear by the end of the first week.
Blog Search Analyzer
A query containing a first and a last name is often targeting a blog (or a microblog, a social network page, etc.) of the given person. The search engine results for these queries should contain those blogs.
The queries in this analyzer are generally in the first name + last name format. We specifically wanted to include people who got famous because of their blog.
The audience of social networks grows by the day, as does the time spent there by the average user. That is why most businesses have found it convenient to set up their own page on one or more social networks. Nowadays, every respectable hairdresser or garage owner can be found on Facebook or similar. Moreover, the social network page often becomes the only official source of information about the company.
Now, that means that for many small businesses or business-like structures, not to be found on Facebook / VKontakte / Livejournal etc. means not to be found at all: there simply won't be any other place to look for them. This is how the queries in the Analyzer were chosen: we only used companies that have no website other than their social network page (but naturally, we took care to gather all such pages for each company). Thus, the Analyzer evaluates the web user's chances of finding official information about smaller organizations. At the same time it evaluates, along with the Blog Search Analyzer, the overall quality of search in the social networks segment of the internet, a parameter that is nowadays becoming extremely significant.
The principle of the Analyzer's work is the same as with any other from the Navigation search group: for each query, we check the presence of the given page in the search results as well as its position.
Information Search Analyzers
Our plans include broadening the scope of the search aspects under investigation. Yet, even at this moment, this scope is wider than it may seem, since a whole range of analyzers in other groups is directly related to information search. Thus, it is usually informational queries that are affected by the search engines' "mistakes". Several of the Data Freshness analyzers also deal with informational queries. And naturally, informational queries form the largest part of the Assessing analyzers, as they form the largest part of web search in general.
Quotation search is the search for the source of a certain text fragment, i.e. either the original text (in which case a larger portion of it should appear on the site), or at least the author and the title of this text.
This analyzer examines 100 queries that consist of sufficiently long extracts from texts published on the Web. For each search engine we calculate the percentage of the search results containing at least one of the following: a. a larger fragment of the original text or b. the name of the author and the title of the text.
The positions of the pages in the search results are not taken into consideration. Neither are (unlike the original texts analyzer, where the priority of the copyright holder is important) the sites where the text in question was first published.
Catch Phrase Analyzer
This analyzer is devoted to the queries containing short popular quotations. These quotations often come from fiction, but they are also used in everyday life.
For example, a Russian user typing the query [контора пишет] is likely to be looking for the meaning or the original source (the title of the text and the name of the author) of the expression. The search engines, however, often provide multiple examples of use of the expression, which is hardly what the user is looking for.
The analyzer examines 100 queries that consist of a popular quotation, the source of which is known. For each search engine we calculate the percentage of the search results containing at least one of the following: a. the given fragment (or one of several fragments) of the text where the quotation comes from or b. the name of the author and the title of the text. The positions of the pages among the search results are not taken into consideration.
Analyzer of Question Answering
This analyzer assesses the ability of search engines to find answers to question queries. The question queries include obvious questions with a question word ([when did CSKA win UEFA cup], [where is Lhasa located]) as well as other queries that imply an obvious short answer ([currency of Brazil], [nitric acid formula]).
The users typing such queries are most likely looking for the answer to their question. The quicker they get the answer, the better. Ideally, the snippet of the first page in search results should include the answer.
This analyzer has several tabs. In each of the tabs the total score of a search engine is the sum of its scores for each query.
1. Answer position in top 10 snippets
Here a search engine gets a score from 0 to 1 for each query, reflecting the highest position of a snippet that contained the answer in the top 10 results. For example, if the answer was found in the first snippet, the search engine gets 1 for the query. If it was found in second position, the score is 0.9, etc. If the answer was not found in the snippets of the top 10 results, the score is 0.
2. Answer presence in snippets
Here a search engine gets 1 for a query if at least one of the top 10 snippets contained an answer to the question. 0 is assigned otherwise.
3. Position of answer web pages in top 10
A search engine gets a score from 0 to 1 for each query, reflecting the highest position of a web page that contained the answer in the top 10 results. The score is 1 if the first page found contained an answer, 0.9 if the answer was present on the second page, and 0 if none of the web pages found contained the answer.
4. Answer presence on web pages found
Here a search engine gets 1 for a query if at least one of the top 10 web pages contained an answer to the question. 0 is assigned otherwise.
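The four tabs reduce to two scoring rules: tabs 1 and 3 map the highest answer position to a score between 0 and 1, while tabs 2 and 4 record mere presence. A minimal sketch of those two rules (the helper names are ours):

```python
def position_score(positions):
    """Score 1.0 for position 1 down to 0.1 for position 10 (tabs 1 and 3).

    positions: 1-based positions within the results where the answer
    was found; returns 0 if the answer never appeared in the top 10.
    """
    in_top10 = [p for p in positions if 1 <= p <= 10]
    if not in_top10:
        return 0.0
    return (11 - min(in_top10)) / 10

def presence_score(positions):
    """1 if the answer appears anywhere in the top 10, else 0 (tabs 2 and 4)."""
    return 1 if any(1 <= p <= 10 for p in positions) else 0

print(position_score([3, 7]))  # 0.8 -- best hit is position 3
print(presence_score([]))      # 0
```

The total score of a search engine on each tab is then the sum of these per-query scores.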
A question query may have multiple possible answers. For example, the possible answers for [Where are Maldives located] are "Indian ocean" and "Asia".
Analyzer of Original Texts Ranking
Unfortunately, copyrighted content is illegally copied all too often on the web. Any author faces the fact that his or her original texts are being stolen: the text of a newly created article can be copied to some web page within days or even hours after the article has been published. The websites stealing content in this way usually claim that it was "taken from open sources" or "uploaded by one of the users". The stolen content can be used to attract search engine users to a web page, and the resulting traffic can be converted to money. This is the main reason for such 'borrowing'. The ability to identify the original texts and rank the corresponding web pages higher than the pages containing copied materials is a crucial property of any search engine.
The analyzer of the ranking of original texts uses exact quotation queries to daily monitor the position of 100 marker articles. For these articles, the web sites of copyright holders are known. The analyzer can thus calculate the percentage of queries for which the original text is ranked higher than the copied material.
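A sketch of that per-query check, under the assumption (stated below) that a quotation query returns only the original article and its copies; the site names are hypothetical:

```python
# For one quotation query: does the copyright holder's page appear
# above all copies? Under the quotation-query assumption, every result
# is either the original or a copy, so the first result decides it.

def original_wins(result_hosts, original_hosts):
    """result_hosts: hosts of SERP entries in rank order;
    original_hosts: known copyright-holder site(s) for this article."""
    for host in result_hosts:
        return host in original_hosts  # decided by the first result
    return False  # nothing found at all

queries = [
    (["author-site.example", "scraper.example"], {"author-site.example"}),
    (["scraper.example", "author-site.example"], {"author-site.example"}),
]
share = sum(original_wins(r, o) for r, o in queries) / len(queries)
print(share)  # 0.5 -- the original outranks the copies in half the queries
```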
The queries in this analyzer are fragments from the original article. By default the queries are submitted to the search engines in quotation marks. This way we expect that the only responses found will be the original article and its copies. However the real users rarely rely on quotation marks, so an additional tab estimates the results based on queries submitted without quotation marks.
In the informer of this analyzer, the search engines are sorted by their ability to rank original texts higher than the copies.
Analyzer of the Location Search Quality
One of the most frequent and most obvious uses of search engines is for mere geographical navigation. We often want to find out whether a certain organization, business, service etc. is located where it should be, from our point of view. Although the overall search quality for this type of queries is not bad at all, mistakes sometimes occur. E.g., the search engine promptly finds the entity in question in some other city district or even some other city. Or, conversely, it supplies us with similar addresses of other organizations, presuming it doesn't make any difference what we shall do, provided we do it in the right place. This analyzer was made to compare the respective merits of search engines in dealing with such queries.
As in real life, our input queries consist of the organization's name plus the approximate locality, like a city district, a street or a nearby underground station. To make the results of the evaluation more precise, we only use queries where the object of search is a single entity. The ideal output in such a case is an answer card with a full list of contacts. But at this stage we judged that the mere presence of the correct address in the top snippet will suffice for the SE to get the maximum grade. On the other hand, results containing some other useful information, but not the address, won't get any points.
The results are calculated on the same principle as in the Navigational Analyzer: the higher we find the snippet with the correct address, the more points the search engine gets (the first snippet gets 1.0 point, the 10th gets 0.1). In addition, we calculate the ratio of correct answers found outside the snippets.
Transactional Search Analyzers
Every user who has ever tried to download anything from the Web, or to listen to or watch anything online, is familiar with the price we have to pay for obtaining free content: watching obtrusive ads, waiting for a full download while the download speed falls almost to zero, or even taking the risk of catching a computer virus from some suspicious site. To say nothing of the pangs of conscience - we all know that such content is put on the Web with little or no regard for copyright laws.
Fortunately, the amount of software, movies and music you can download, watch or listen to quite legally (usually on the websites of their creators) has grown significantly of late. The problem of finding a good source for this content therefore has an unequivocal solution. It is all the more important that the search engines know how to find these correct, official sources, instead of the "fun portals" and all the scam sites.
This analyzer is the first in the intended series of analyzers measuring the search quality for transactional queries, i.e. for queries with a narrow practical goal: to download some file, to listen to a specific piece of music, to buy tickets, to transfer money etc.
Official Versions Search Analyzer
Official Software Search
Looking for a necessary program on the Web is a task most users have faced. However, the search engines frequently offer unofficial and sometimes highly untrustworthy sites with the program available for download. But can they do better when a free official version of what the user is after is available?
Data Freshness Analyzer
The degree of a result's relevance often depends not only on the content of the information retrieved, but also on its freshness. When the object of the search is a rapidly changing thing, such as news, schedules, forecasts etc., out-of-date information can be actively harmful rather than merely useless, since it may confuse and mislead the user.
This group's analyzers check the presence of valid or, conversely, outdated results in the search engine output. (Needless to say, we have picked out such types of queries as make this check sensible. These are phone numbers and positions of executive managers.) As it happens, the results marked as "fresh" also tend to get outdated. We therefore have to regularly arrange additional tests of the analyzers' results. If you catch sight of some outdated marker, please do let us know!
The Indexation analyzer should be considered separately: it shows how long it takes for a new Web page to enter the search engines' indexes. It is clear that fast indexation is a necessary prerequisite for maintaining the freshness of search results.
Data Freshness Analyzer: Jobs
This analyzer is the first one in the series of analyzers which will estimate to which extent the search results are up to date.
For many queries it is important that a search engine does not give outdated results. For example, a user looking for "president of Zimbabwe" is most likely interested in finding out who the current president of Zimbabwe is. The user will not be satisfied if they only get information about the previous presidents.
Within this analyzer, each query is associated with one or more up-to-date response markers as well as one or more outdated response markers. Any web page containing the up-to-date information increases the search engine's score in this analyzer. On the other hand, the score is decreased for any page which only contains outdated information. The results recognized as containing neither up-to-date nor outdated information are not analyzed. Some of these unrecognized results may be irrelevant to the query, but our focus here is only on how up to date the results are.
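The marker-based rule above can be sketched for one query as follows; the marker strings are invented examples, not the analyzer's actual markers:

```python
# Freshness score for one query: pages with an up-to-date marker add a
# point, pages with only outdated markers subtract one, and pages
# matching neither set are ignored.

def freshness_score(pages, fresh_markers, stale_markers):
    score = 0
    for text in pages:
        if any(m in text for m in fresh_markers):
            score += 1
        elif any(m in text for m in stale_markers):
            score -= 1
        # unrecognized pages are not counted
    return score

pages = [
    "E. Mnangagwa is the president of Zimbabwe",  # up to date
    "R. Mugabe is the president of Zimbabwe",     # outdated
    "travel guide to Zimbabwe",                   # unrecognized
]
print(freshness_score(pages, ["Mnangagwa"], ["Mugabe"]))  # 0
```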
We plan to develop new analyzers which will assess actuality of other types of information: recent news, pricing information for everyday goods as well as for stocks and currencies, last-minute deals etc.
Data Freshness: Phone Numbers
Continuing the series of data freshness analyzers, this one estimates the quality of search for phone numbers.
The growing number of outdated phone numbers in search results is mostly due to the mass renumbering of phones in Russia. Undoubtedly, up-to-date company contacts are very important both for the clients and for the staff. Timely measures taken by search engines prevent the firms from losing new clients, and the customers from the irritation caused by an unsuccessful search.
The queries in the analyzer are the names of companies that have recently changed their contacts. Any web page containing the up-to-date information increases the search engine's score in this analyzer. Conversely, the score is decreased for any page which only contains outdated information. The pages which contain neither old nor updated information are left out of account. Note that the cases where the dialling code is significant for the evaluation (for instance, 8 (499) xxx-xx-xx in Moscow) are also taken into account.
As the quality coefficient, we use the share of web pages with up-to-date phone numbers.
Recall & Diversity Analyzers
As the web search is used for every possible need, it is vital that a search engine be able to find meaningful answers to most diverse questions. This ability depends both on the search engine's scope and on its interpreting potential.
These analyzers reflect the two above-mentioned aspects of the search engines' functioning: the quantity of the output and its diversity (when allowed by the non-strict sense of the query). To keep the results valid, we regularly put our set of queries to the test.
It is worth noting that since the information that cannot be retrieved by means of a search engine is in some sense unattainable, the relative values in these analyzers are in no way less important than the absolute ones.
Recall analyzer
The recall analyzer estimates the relative size of indices of the Internet search engines.
The total number of documents indexed, as reported by a SE itself, cannot be used for comparison because different SEs have different methods of document count. For example, some of them include duplicate documents in the count while others do not. Counting the duplicate documents may double the reported index size or increase it even more.
Additionally, the size of the index is a very PR-sensitive issue as it is one of the very few simple notions in SE area easily understandable by journalists. This means that the bigger index database size you report, the better press you get.
The number of documents found for a particular query does not always reflect the real number of documents indexed by a SE. Almost every frequent query will return tens of thousands of results in any search engine. But the user will never be allowed to see them all: the search session will be interrupted after browsing through the first few hundred results. Thus the exact number of web pages found can only be verified when the number of possible results is very small, that is, for queries containing very rare words.
For multiple-word queries, certain SEs show in the search results not only the documents where all the words comprising the query are found, but also the documents containing single words from the query. These "tail" documents are usually irrelevant to the query, but counting them can increase the total number of pages found.
In order to obtain independent and reliable data on the relative index size of the popular SEs, we developed a simple automatic method based on a set of sample queries. We gathered a set of very rare words, each of which occurs several tens of times on the web. Once a day, we count how many of these occurrences are found by each search engine.
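A sketch of that recall estimate: for a sample of very rare words with a known (small) number of web occurrences, we measure what share of those occurrences each engine returns. The words and counts below are invented:

```python
# Relative index-size estimate from rare-word sample queries:
# share of known occurrences that the engine actually finds.

def recall_estimate(found_counts, known_counts):
    """found_counts: occurrences returned by the engine per word;
    known_counts: occurrences known to exist on the web per word."""
    total_found = sum(found_counts[w] for w in known_counts)
    total_known = sum(known_counts.values())
    return total_found / total_known

known = {"rareword1": 40, "rareword2": 25}  # known web occurrences
found = {"rareword1": 30, "rareword2": 20}  # occurrences the engine found
print(recall_estimate(found, known))  # 50/65, roughly 0.769
```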
To make the data steadier, we use a different set of sample queries from the whole query pool every day.
The set of sample queries is constantly replenished by our linguists. If you have some rare words and want to help us cover the 'faraway' areas of the Net, please send us these words, and we will consider including them into the sample queries list.
Analyzer of subject search
A human being is often able to interpret a search query, determine what the user wants, evaluate the information on the Web and form the search results better than a machine. For this reason, the results formed by an expert are usually better than those of an algorithm.
This analyzer monitors daily the search results for a set of queries, for which the corresponding links have been selected by experts. It calculates the number of sites found by each search engine that were on the experts' list. For each query, the percentage of similarity between algorithmic and expert results is calculated.
As the expert opinion, we use the output of the expert system Neuron. The aggregate indicator of this analyzer is the proportion of results matching the expert opinion, regardless of the position of the matching site(s).
The best search engine is the one with highest value of aggregate indicator. In the informer, the search engines are sorted by the aggregate indicator.
Currently, 18 queries are evaluated. The number of the queries will be increased.
Analyzer of Ambiguous Queries
When Russian users search for «белки», do they mean the furry animals or the proteins? There are about a dozen universities which can be referred to as "КГУ", so is the user entering this query looking for Kazan State, Kursk State or some other educational institution which happens to have the same abbreviated name?
In the case of an ambiguous query, the user could have had in mind any of the "readings". Therefore, instead of showing the results based on the most frequent interpretation, it seems to be the best bet for a search engine to include the results corresponding to all the different readings. This way the users may at least realize that their query is ambiguous which in turn will help them to make the query more precise.
This analyzer estimates the percentage of possible interpretations of the ambiguous queries which are reflected in the search engines' results.
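For a single query, this coverage measure might look as follows; the interpretation labels are illustrative, not the analyzer's actual markup:

```python
# Share of a query's known interpretations that are represented
# among the engine's top results.

def interpretation_coverage(result_labels, interpretations):
    """result_labels: the interpretation each top result corresponds to;
    interpretations: all known readings of the ambiguous query."""
    covered = set(result_labels) & set(interpretations)
    return len(covered) / len(interpretations)

readings = {"squirrels", "proteins"}      # readings of the query [белки]
labels = ["squirrels", "squirrels", "squirrels"]  # top results: animals only
print(interpretation_coverage(labels, readings))  # 0.5
```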
Most of the marker queries contain just one word, and the possible interpretations usually do not include the numerous businesses whose names coincide with our query. We only included the most famous companies, to which a user could plausibly refer with a one-word query.
Local Search Analyzers
One of the many factors influencing web search is the location from which the search is conducted. It is not of overall importance, but quite often the set of answers to a query will differ according to the specific region.
Here we inquire into the quality of search conducted from different cities of Russia. The analyzer's queries are entered by servers located in ten cities as far apart as Krasnodar and Vladivostok. The results are then tested to see if they are region-appropriate.
To populate such an analyzer, you have to select the queries very carefully. For example, it is possible that a user looking for "North Star" is really keen to find out something about a restaurant three blocks away, but it is more likely that he is searching for information concerning the celestial body. By contrast, "train schedule" is in most cases a region-dependent query. To make an unbiased evaluation of the local search quality, we only use queries that must be answered on the basis of local data.
Analyzer of Regional Search
When a user from Ufa or Novosibirsk enters the query "pizza delivery", he or she is most likely looking to get some pizza. The websites of companies offering pizza delivery in Moscow would probably be of no use to the user since it does not make sense to deliver pizza from Moscow to Novosibirsk.
In this analyzer, we query the search engines from different cities across Russia. We select the search engine results for which the title, snippet or URL clearly mentions the location from where the query was sent. We only evaluate the information available as part of the SERPs, because this is all that the regional users see when they decide whether to follow a link to a given webpage found in SE results.
The percentage of responses to queries averaged over all cities except Moscow is taken to be a good estimate of how friendly the search engine is to the regional users.
This analyzer embraces both the queries which are clearly asking for a regional service ("order glasses", "chinese food delivery") and the queries which imply a search for some local information ("drug store prices", "medical exam when pregnant").
Analyzer of Regional Navigation
This analyzer is devoted to the situations where the users are looking for a particular web page, but the relevant web page varies depending on where the query is coming from. Navigational search in general is the search for a particular web page accomplished by queries such as [sberbank] or [gazeta ru].
If the users are looking for a particular web page, they often want a page which is specific to their current location. For example, the users from both Kazan and Moscow expect to find some page from afisha.ru when they input the query [afisha]. However, the user from Kazan probably will not be happy to see the pages of afisha.ru specific to Moscow.
We test the search engines on 100 queries submitted from different locations within Russia. The sets of queries used in each city are slightly different depending on whether the relevant regional pages exist. For each query, the analyzer records if the relevant regional website (or the regional part of the general website) was found.
For each search engine we calculate the percentage of queries for which the relevant regional response appears in top 10 results.
Query Comprehension Analyzers
Of course, the query comprehension by a search engine is but a metaphor. Yet one eventually gets used to the search engines' "creativity" in the interpretation of queries: their search prompts, spelling corrections, their attempts at refining or improving the query, or re-ordering the answers according to the assumed object of inquiry etc.
At present, the analyzers in this group test such relatively primitive skills of the search engines, as correcting typos, giving prompts, expanding queries by synonyms. Meanwhile, some more refined search techniques (query interpretation, dealing with multiple meaning queries and so on) are to be estimated only indirectly, by mistakes arising when the query is misunderstood by the search engine. (Search Engine Mistake Analyzers, several in number, are arranged in a distinct group.) But then, it's the occasional mistakes that most clearly demonstrate a SE's skill in processing the queries. When the search results are correct, we just don't pay any attention to the search engine's "tricks". These tricks are only exposed through incongruent or funny output.
Analyzer of Correct Hints
Most search engines attempt to suggest a correct spelling for a query when a typo is suspected. The quality of such hints is an important addition to the overall quality of the search. This analyzer looks for the correct hint in the search results for a query with a deliberate typo and counts the occurrences of the 'correct' query contained in the hint.
The evaluation is based on the same set of queries containing typos that is used for the Typo Resistance Analyzer. The more correct hints are given, the higher the search engine's index for this analyzer.
Typo Resistance Analyzer
Humans are not machines: they make mistakes. This includes mistakes while typing in a search query: an adjacent key pressed by accident ("quety" instead of "query"), a doubled character or a missed one ("queery" or "qury"); after all, the user may type a word 'by ear' without knowing the correct spelling ("yandax" instead of "yandex").
In this case, the search engine can adhere to one of the following strategies:
1) no processing: search with exact spelling only
2) recognize the typo, but still search for the entered query with an additional hint: "perhaps you were looking for [correct spelling]?"
3) recognize the typo and search for the correct spelling immediately
Depending on the chosen strategy, the user either remains unaware of the fact that (s)he is mistaken, or notices it and makes an extra click (up to the user), or gets the correct results without ever noticing his own mistake.
This analyzer compares the search results of the "correct query" and several forms of its possible mistypings. The similarity of results to those of a "correct" query is evaluated.
Apart from deliberate typo correction, matches can arise in four cases:
1) accidentally
2) the page contains both the correct and the mistyped spelling
3) incorrect reaction of the engine's morphology (e.g., the unknown word "mushroomz", a typo for "mushrooms", is reduced to "mushroom")
4) promotion of the same websites for both correct and incorrect spellings of queries
All of these cases produce noise in this analyzer: an accidental match of results.
The similarity is evaluated in the same way as for the update analyzer but with a different set of queries.
The more matching results are registered, the higher the index of the search engine for this analyzer. This determines the order of search engines in the analyzer's informer.
In the future, we will introduce a rotation of typo query sets drawn from a wider pool.
Analyzer of Synonymous Queries
One and the same query can be formulated in many different ways. For example, queries like "how to deduce the address from the telephone number", "search for address by telephone number", and "find an address by telephone number" have the same meaning for the user; these are synonymous queries.
A number of different practices leads to synonymous queries. Among them:
- using well-known abbreviations: "weather in Saint Petersburg" and "weather in SpB";
- using transliteration – "gazeta ru website" and "газета ру website";
- varying word order: "repair automatic transmission" and "automatic transmission repair".
In most of these cases, the users expect to get information related to the meaning of the query, not to its particular formulation. We can take this to imply that the search engines should yield identical results in response to synonymous queries, despite the differences in query formulation.
The analyzer of the synonymous queries assesses the (presumably unnecessary) variability of responses to synonymous queries. The search engines are sorted by the amount of such variability.
All the queries used in this analyzer are real: they were acquired from Rambler's query statistics tool (http://adstat.rambler.ru/wrds/). Note also that we exclude queries containing grammatical or spelling errors from this analyzer.
Analyzers of Search Engine Mistakes
Search engine mistakes are the negative side of its ability to refine and to understand the query (discussed in the Query Comprehension section). The queries in this group were sought out to demonstrate such mistakes in a most vivid and obvious manner. They might seem exotic, but every active search engine user knows that such mishaps tend to occur quite regularly.
When your favourite search engine makes a fool of itself, your emotions may vary between laughter and utter rage, but in any case your loyalty to the search engine suffers a crack. The user who is only mildly content when a search engine functions as it should feels rightly indignant when he sees a blunder. The unavoidable irritation is a risk factor that should not be underrated by the developers.
Query Substitution Analyzer
Query substitution analyzer is the first one in the new series of analyzers demonstrating mistakes in the functioning of search engines. The words chosen for queries are correct, existing, meaningful words, but as they are rather infrequent, the search engines often don't take the challenge of finding them. Instead, they offer the searcher an output based on some other query which can more or less resemble the original one, but, as it would seem, is asked more often.
As a result, the query is not answered. What's more, the searcher gets mildly insulted, for some search engines spice their wrong results with a prompt remark like 'We have corrected the misprint', presuming the searcher does not know the correct spelling of the word he is looking for.
The analyzer calculates the percentage of wrong search results (not containing any form of the original query). Latin transliteration is counted as a correct result whenever it is possible.
Two additional parameters counted are: 1) the percentage of automatic query substitution (the search engine actually deals with the query as if it were a misprint, which is not the case); 2) the percentage of wrong suggests, like 'you were probably looking for...'. Suggests are a less rude form of interference with the searcher's wishes. Still, when someone suspects that your search for 'dyery' was probably made under the spell of insanity and your real purpose was to find a 'diary', it is bound to drive you nuts.
Far be it from us to claim that all query corrections and suggests are an evil by nature. There are many cases when they really serve their purpose (as you can see in our Typo Resistance Analyzer). But the search developers have to be extremely careful when applying such strong means, for their misuse might harm the search engine's image.
Phrasal Queries Substitution Analyser
Next in the series of Search Engine Mistakes is the Phrase Substitution Analyzer. As in the Query Substitution Analyzer, the queries here are existing and meaningful, but infrequent language units. They are just longer units: phrases instead of words. So, one would think, they must be found more easily, since the meaning of the second word specifies the meaning of the first, rare one. In this case any replacement or even suggest seems absolutely senseless. It will prevent the user from finding what he is looking for and possibly irritate him. That's why we opted for checking this parameter of search engine functioning.
The analyzer calculates the percentage of 'bad' snippets, i.e. snippets that do not contain the words of the query, because often this is a sign that the query words have been replaced. In some cases it has proved useful to check the snippets for only part of the query; this was done when insisting on both words could distort the control procedure. Synonyms and transliterations are counted as correct responses whenever possible.
Two additional metrics are the percentage of explicit query replacement and the percentage of replacement with suggest.
Names Search Analyzer
Trivial as it seems to look for some information about a person by entering his or her name, it often proves tricky: instead of finding the person needed, we stumble upon his famous namesakes.
One of the most persistent search engine mistakes is to make a mixture of names and surnames (like splitting 'Richard Shakespeare' into 'Richard III' and 'William Shakespeare'), so the queries in the analyzer were chosen with a view to making such results more probable. We only used the names of some real people we found on the web, but their surnames either coincided with those of some hot media personalities, or were in some other ways non-standard and difficult for a search engine to find.
The analyzer automatically divides the names it finds in the snippets and titles of the first output page into hits and misses, according to whether the right person was found. Then we perform a manual check. It is necessary for disentangling several complicated situations, for instance, when the name and the surname are glued together at random (which would look strange in English, but can easily happen in Russian with its freer word order). The results with name initials are also checked 'manually', so that the context might help to decide upon the appropriate answer.
The resulting number constitutes the average ratio of correctly found pages to the overall number of pages.
Unbreakable Queries Analyzer
Search queries most often consist not of a single word, but of word chains or word phrases. Sometimes the relation between the words in such a chain is loose enough, but there are cases when the phrase is actually indivisible, expressing a single notion through two or more words, e.g. 'bad bishop' (a chess term that clearly shouldn't be echoed by 'America's worst bishops' in the output). In such cases it is the task of the search engine to preserve the wholeness of the phrase, at least in the top results. Otherwise, the meaning of the query gets distorted.
The unbreakability analyzer continues the series of search engine mistakes analyzers. It measures the percentage of cases when a search engine does not destroy the meaning of an unbreakable query by separating the words. For every query we check the presence of the searched phrase in the output. In a minority of cases the phrase can be split by a very limited number of words (like 'Black Volta', which can appear in 'Black and White Volta' without any change in meaning); such cases are counted as positive results. Likewise, when the searched words stick together accidentally, they are counted as positive if there is no punctuation mark between them.
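The matching rules just described can be sketched as a simple regular-expression check. The exact rules used by the analyzer are not published; the function name and the maximum gap of two intervening words are our own assumptions for illustration.

```python
import re

def phrase_intact(snippet, phrase, max_gap=2):
    """Hypothetical sketch of the unbreakability check: the words of
    `phrase` must occur in `snippet` in order, separated by at most
    `max_gap` intervening words and by no punctuation marks."""
    words = [re.escape(w) for w in phrase.lower().split()]
    # Between consecutive query words, allow up to `max_gap` plain words
    # (whitespace-separated, no punctuation in between).
    gap = r"(?:\s+\w+){0,%d}\s+" % max_gap
    pattern = r"\b" + gap.join(words) + r"\b"
    return re.search(pattern, snippet.lower()) is not None
```

With this check, 'Black and White Volta' still counts as a positive result for the query 'Black Volta', while 'bad, bishop' would not count for 'bad bishop' because of the punctuation mark.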
As usual, the average number of positive results for all the queries appears as the search engine's overall result in the informer.
Grammar Analyzer
The analyzer of grammatical misinterpretation is of a certain linguistic elegance. It deals with cases when a specific word-form belongs to more than one paradigm, like 'miss' in 'Miss Juliet' vs. 'I miss you terribly'. Naturally, such situations are more intricate in languages with rich inflectional morphology, like Russian, where a suffix or an ending increases the chance of there being a homonym of the word in question.
The task facing search engines is then to choose the more appropriate paradigm, i.e. the more appropriate meaning of the two, and to adjust the output accordingly. The instrument of the correct choice is the context, however short (even in two-word queries the second word almost invariably narrows the field of possible meanings for the first one).
We have to admit that from this point of view the functioning of the investigated search engines has improved impressively over the last few years. Not so long ago the paradigms got confused so grossly that in order to evaluate the output you only had to cross off all the wrong word-forms. Nowadays, the search engines tend not to deviate too much from the given word-form, so there is no choice but to base the checking procedure on the context. It also means the queries should become more sophisticated, or rarer, or both. However, all the queries we use are quite realistic, meaning the search engines should actually be concerned with the results of the analyzer.
It is to be noted that the result of the analyzer, despite it belonging to the Mistakes group, shows the percentage of positive answers among the search engines' output.
Disturbing Content Analyzers
However well a search engine may work, there are still small annoying things that can easily damp the user's good spirits and significantly shatter his loyalty to a specific SE. These include, e.g., the danger of contracting a computer virus, the presence of irritating and obtrusive ads, etc.
Whereas the amount of advertisements or dangerous scripts on websites is not for the search engines to control, the concentration of objectionable content in the output is totally in their power. So, if the websites in the output abound with annoying factors, the search engine would only benefit from ranking such sites far below high-quality safe ones.
Most analyzers of this group make use of specific techniques developed by "Ashmanov & Partners" for detecting ads, pornography, computer viruses, etc. To make the results more lucid, we sought out marker queries for which the probability of undesirable results is higher than usual.
Analyzer of Search Spam
At "Ashmanov and Partners" we study the phenomenon of search spam – the methods and technologies reducing the quality of search results and interfering with the operation of search engines.
Search spam is a text, URL, technology, program code or other web elements created by the web-master for the sole purpose of promoting the site in search engines' results, and not for a fast and reliable search based on complete and authentic information.
The experts check Top 10 results of search queries on a regular basis, marking the sites which, in their opinion, contain elements of search spam. The collected data is entered into the informer. It shows the percentage of sites marked as spam in the overall number of sites that appeared in Top 10 of analyzed queries.
The source of information on the spam status of a given URL is the data of the anti-spam lab of the company "Ashmanov and Partners". The following categories of search spam are used:
* doorway – definite spam: doorways, leading the user to other pages,
* spamcatalog – definite spam: spammer catalogues,
* spamcontent – definite spam: spammers' stolen content,
* pseudosite – definite spam: site disguised as corporate (pseudo-company),
* catalog – catalogues,
* board – bulletin boards,
* domainsale – domains for sale,
* secondary – secondary, stolen content,
* partner – any partner programs,
* linksite – link support site,
* spamforum – forum containing spam,
* techspam – technical spam,
* searchres – search results,
* cj – circular jerk.
An aggregate indicator is the share of spam sites in the search results. The best search engine has the lowest indicator. This determines the order of search engines in the informer of the analyzer.
Pornography Analyzer
This analyzer collects search results for ambiguous queries which may be interpreted as targeting a certain category of pornography, but also admit other interpretations. No queries which unambiguously indicate that the user is searching for porn are included.
For instance, the query [stockings] could come from a user looking for a stockings shop or for the corresponding category of pornography.
When search engines include pornographic results in responses to such queries, they run the risk of showing these results to the users (potentially including minors) which were not looking for adult content.
We do not claim that it is "bad" or "immoral" for a search engine to return pornographic pages in response to unambiguously pornographic queries. However, the appearance of adult results in responses to "regular" queries is, in our opinion, a drawback.
We make use of the 'Semantic Mirror' technology developed at 'Ashmanov and Partners' in order to detect adult content.
For every search engine, we calculate the percentage of pages with adult content among the top 10 results.
Analyzer of the SafeSearch Quality
SafeSearch is a filter that is supposed to block "adult only" websites from appearing in the search results. Nowadays most search engines allow their users to filter the results in this way. But does SafeSearch do its job well enough?
This analyzer collects search results for ambiguous queries which may be interpreted as targeting a certain category of pornography, but also admit other interpretations. The question is whether SafeSearch effectively filters out the adult responses in such cases.
The set of queries in this analyzer is the same as in the analyzer of the presence of adult sites in search results. We submit these queries to the search engines with the SafeSearch option turned on wherever possible. In this case, the ideal results should not contain any adult pages at all, as well as no results with obscene words. If the adult pages still show up in the search results, the search engine is not keeping its promise to protect the user with SafeSearch.
We make use of the 'Semantic Mirror' technology developed at 'Ashmanov and Partners' in order to detect pornography. For every search engine, we calculate the percentage of pages with adult content among the top 10 results.
Obscene Language Analyzer
The analyzer checks for the presence and quantity of obscene words on the pages found by the search engines in response to neutral queries (those that do not contain obscenities and do not suppose the response to contain any). The default measurement is taken with "Safe search" on, meaning the result should normally be zero; measurements with ordinary search options are also available.
There are two indicators in the analyzer. The simple indicator counts the number of pages found by the search engine that contain at least one obscene word in its regular or masked form (since the law makes no distinction, provided the word is easily recognized).
The default indicator is more informative: it shows the quantity of obscene words on the pages found. Each word shown clearly is worth 3 points, while a masked one is worth 1. The points are summed over all the pages found to give the search engine's result for a specific query. The mean value over all queries is then calculated to form the analyzer indicator for a given search engine.
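The scoring just described can be sketched in a few lines. The point values (3 for a clear word, 1 for a masked one) come from the text; the function names and the data shapes are our own illustrative assumptions.

```python
def page_score(clear_words, masked_words):
    # A clearly written obscene word is worth 3 points, a masked one 1 point.
    return 3 * clear_words + 1 * masked_words

def analyzer_indicator(per_query_pages):
    # per_query_pages: for each query, a list of (clear, masked) word counts
    # per page found. Page scores are summed within a query, then the mean
    # over all queries forms the indicator for the search engine.
    query_scores = [sum(page_score(c, m) for c, m in pages)
                    for pages in per_query_pages]
    return sum(query_scores) / len(query_scores)
```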
The recognition of obscene words is performed by the automatic service "Semantic Mirror".
Intrusive Ads Analyzer
Web pages with similar content may differ in the amount of ads. A search engine that favors pages with less intrusive ads is thus better for the user.
The analyzer is based on an ads recognition technology assessing the scripts, iframes and other code on the page. Each page is assigned a total intrusiveness score: the intrusiveness of a particular web page is the sum of the intrusiveness scores of all individual ads on that page. For each SERP, a weighted average of the page scores is taken; the average over all SERPs is then the search engine's score in this analyzer.
The intrusiveness of an individual ad is calculated as follows:
- context ads and small simple banners receive intrusiveness of 1
- teasers and bigger shiny banners receive 3
- some ads have to be clicked on in order to be closed. These include clickunders (separate windows which open upon a click anywhere on a page) and scrolling banners which cover up a part of a page. These ads receive intrusiveness of 9.
- finally, some particularly nasty ads have to be clicked more than once in order to be closed. These receive intrusiveness of 18.
For each web page found, the intrusiveness is adjusted according to the position of the page in the SERP. The weight of each result reflects the probability that the link will be clicked, and the weights of all positions average to 1 (position 1 has the weight of 1.5 and position 10 has the weight of 0.4). Thus, if the sites with intrusive ads are randomly distributed along the SERP, the average intrusiveness won't be affected by the positional weight.
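The positional weighting above can be sketched as follows. The text only states the endpoint weights (1.5 for position 1, 0.4 for position 10) and that all weights average to 1; the intermediate weights are not published, so we interpolate linearly and renormalize, which makes the endpoint values only approximate.

```python
def positional_weights(n=10):
    # Assumed shape: linear decay from 1.5 (position 1) to 0.4 (position 10),
    # renormalized so the weights average to exactly 1, as the text requires.
    raw = [1.5 + (0.4 - 1.5) * i / (n - 1) for i in range(n)]
    mean = sum(raw) / n
    return [w / mean for w in raw]

def serp_intrusiveness(page_scores):
    # Weighted average of per-page intrusiveness over one SERP. Because the
    # weights average to 1, a SERP where intrusive ads are evenly spread
    # gets the same score as an unweighted average would give.
    weights = positional_weights(len(page_scores))
    return sum(w * s for w, s in zip(weights, page_scores)) / len(page_scores)
```

Note that the same total intrusiveness counts for more when it sits at the top of the SERP, reflecting the higher probability of a click there.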
This analyzer looks at queries seeking popular songs or video content. These queries often yield pages overloaded with ads.
Note that web page assessment is based on html code of a web page while the visibility of ads on a particular page might depend on the user's browser and the cookies.
Adult Content Ads Analyzer
Probably the most intrusive ads on the web are those with adult content. The presence of such ads on a web page irritates the users. Therefore a search engine which finds fewer web pages with pornographic ads is better for the user.
This analyzer has two tabs and both of them look at queries seeking popular songs or video content (these queries are also used in the Intrusive Ads Analyzer). These queries often yield pages overloaded with ads.
The analyzer is based on an ads recognition technology which assigns a total intrusiveness score to each page. In the first tab (“% of sites with porno-ads”) each search engine gets a score which is equal to the percentage of the pages found containing explicitly pornographic ads. Each page contributes to the search engine's score if it has at least one advertisement containing pornographic pictures. The ads which are not literally pornographic but can be considered blatant or improper are not taken into account in this tab. The scores of the search engines also do not reflect the amount, size and position of the pornographic ads on the pages found. However in most cases if the porno ads are present they occur in a prominent position on a page.
The other tab (“improper + porn”) shows the mean intrusiveness of all adult advertisements on the page. Here the amount and size of the ads, as well as the position of the website in search results is taken into account. For each ad, the intrusiveness is multiplied by a coefficient reflecting the status of this ad with respect to pornography. The following coefficients are used:
0 - an ad is not pornographic or adult in any way
0.5 - an ad sometimes contains some improper pictures, but sometimes these pictures are absent
1 - an ad always contains improper pictures
10 - an ad always contains explicitly pornographic pictures
The score of a particular web page is the sum of the adjusted porno-intrusiveness scores of all individual ads on the page. The intrusiveness itself depends on the size and shininess of an ad, as well as on whether the ad covers part of the page's content and requires the user to make a click (see the Intrusive Ads Analyzer above for how the intrusiveness score is calculated). Although the adjusted porno-intrusiveness is on average similar to the intrusiveness of the ads, these two measures cannot be compared directly.
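The adjustment by the pornography coefficient can be sketched as below. The coefficient values come from the list above; the category names and function signature are our own illustrative assumptions.

```python
# Coefficients as given in the analyzer description; the key names are ours.
PORN_COEFF = {
    "not_adult": 0.0,           # not pornographic or adult in any way
    "sometimes_improper": 0.5,  # improper pictures appear only sometimes
    "always_improper": 1.0,     # always contains improper pictures
    "pornographic": 10.0,       # always explicitly pornographic
}

def page_porn_score(ads):
    # ads: (intrusiveness, category) pairs, one per ad on the page.
    # Each ad's intrusiveness is multiplied by its pornography coefficient.
    return sum(intr * PORN_COEFF[cat] for intr, cat in ads)
```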
Analyzer of Viruses and Malware
Search engines try to prevent their users from accessing infected or fraudulent web pages. How efficient is this security measure?
This analyzer shows the percentage of infected web pages in each search engine's results. Each web page is analyzed based on the sources of scripts and frames/iframes. In addition to viruses, the Russian web is overwhelmed with ads which are trying to fool people into downloading fraudulent files. These ads are marked as "fraudulent downloads" in the analyzer.
The number of threats on a particular web page is displayed, but it does not influence the overall score of a search engine: a page with 5 threats contributes as much as a page with 1 threat. Each web page is searched individually for viruses based on antivirus software reports, webmaster opinions and other sources.
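The per-page contribution rule can be made explicit with a short sketch (the function name and input shape are our own assumptions): only whether a page is infected matters, not how many threats it carries.

```python
def infected_share(threat_counts):
    # threat_counts: number of threats detected on each page found.
    # A page with 5 threats contributes exactly as much as a page with 1.
    infected = sum(1 for t in threat_counts if t > 0)
    return 100.0 * infected / len(threat_counts)
```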
By default, the analyzer takes into account all kinds of threats. The "% of sites with viruses" tab only takes into account the sites infected with viruses, the sites with fraudulent ads are excluded on that tab.
Some search engines warn their users that a particular page may threaten them. However, for the kind of queries used in this analyzer such warnings may not be sufficient. This analyzer inspects queries devoted to downloading, listening to or watching music and video clips - the same queries as in the Intrusive Ads Analyzer. These queries yield thousands of result pages with roughly the same content. In such a situation it would seem reasonable not just to warn the users about a possible threat, but to lower the infected websites in search results so that an uninfected site is displayed in a high position instead. The data in this analyzer can be displayed based on all the threats (the "All" tab) or, alternatively, based only on the pages for which no warning is shown on the SERP (the "unmarked on SERP" tab).
Analyzers of Additional Search Features
The object of the user's immediate interest is the set of answers he receives to his query. Nevertheless, his overall impression may well be affected by smaller technical factors not immediately discernible on the SERP.
These analyzers are meant to compare some additional parameters of the search engines' functioning that may be of considerable interest to the user. It's worth noting that comparative values bear more importance here than absolute numbers.
Analyzer of Loading Time
The users expect that the search engines will give them results as fast as possible. Even small differences in the time it takes for the SERP to load may be a reason for some users to prefer one search engine over another.
This analyzer shows the time it takes for users in different Russian cities to get their search results. The SERP loading time data are collected at the same time as the Regional Search analyzer results, hence for the same cities and for the same queries.
Of course the loading time is affected by multiple factors, such as the quality and overall usage of the connection. However, if some search engine is slower to respond than the others on a substantial number of queries every day, this is a tendency worth analyzing.
This analyzer also estimates the size of SERPs. The data on SERP size can be seen by clicking on the corresponding link below the graph. Like most modern web browsers, we download the SERPs in compressed form.
Analyzer of Indexation Time
How long does it take for the newly created pages to start showing up in search results?
We selected about two thousand websites from the Russian Internet. All of these websites inform the search engines about the updates using a special sitemap file (sitemap.xml). All of these websites are generally indexed by the search engines: this is confirmed by the fact that the random pages from these websites are almost always found among the first 10 hits in response to the queries identical to the titles of the relevant pages (the text within <title>). So how long does it take for the newly created pages to start appearing among top 10 results for corresponding queries?
We scan the sitemap files for new pages every day (each website may contribute up to 3 pages per day). Search engines are of course able to do the same - after all this is the purpose of the sitemap files.
For every new page, we record the title and the URL. Within a month from the date when the new page was detected, we check if this page is found in top 10 in response to two types of queries:
a) a query consisting of the title text, with the search restricted to the relevant website;
b) a regular text query using the title.
The first type of query assesses the speed of indexation; the second assesses the time it takes for new pages to become visible in regular search results.
The score in this analyzer is calculated as the mean visibility of the new pages within the last month. We also report the data for smaller periods.
Interestingly, there seem to be pages which appear in the search results in the first couple of days after they were detected but disappear by the end of the first week.
For example, a Russian user typing the query [контора пишет] is likely to be looking for the meaning or the original source (the title of the text and the name of the author) of the expression. The search engines, however, often provide multiple examples of use of the expression, which is hardly what the user is looking for.
This analyzer assesses the ability of search engines to find answers to question queries. The question queries include obvious questions with a question word ([when did CSKA win UEFA cup], [where is Lhasa located]) as well as other queries that imply an obvious short answer ([currency of Brazil], [nitric acid formula]).
The users typing such queries are most likely looking for the answer to their question. The quicker they get the answer, the better. Ideally, the snippet of the first page in search results should include the answer.
This analyzer has several tabs. In each of the tabs the total score of a search engine is the sum of its scores for each query.
1. Answer position in top 10 snippets
Here a search engine gets a score from 0 to 1 for each query, reflecting the highest position of a snippet that contained the answer in the top 10 results. For example, if the answer was found in the first snippet, the search engine gets 1 for the query. If it was found in second place, the score is 0.9, etc. If the answer was not found in the snippets of the top 10 results, the score is 0.
2. Answer presence in snippets
Here a search engine gets 1 for a query if at least one of the top 10 snippets contained an answer to the question. 0 is assigned otherwise.
3. Position of answer web pages in top 10
A search engine gets a score from 0 to 1 for each query, reflecting the highest position of a web page that contained the answer in the top 10 results. The score is 1 if the first page found contained an answer, 0.9 if the answer was present on the second page, and 0 if none of the web pages found contained the answer.
4. Answer presence on web pages found
Here a search engine gets 1 for a query if at least one of the top 10 web pages contained an answer to the question. 0 is assigned otherwise.
A question query may have multiple possible answers. For example, the possible answers for [Where are the Maldives located] are "the Indian Ocean" and "Asia".
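The per-query scoring in the tabs above can be sketched as follows. The 0.1-per-position step (1 for position 1, 0.9 for position 2, and so on) follows directly from the examples given; the function names are our own.

```python
def position_score(pos):
    # Tabs 1 and 3: pos is the 1-based rank of the first snippet (or page)
    # containing the answer, or None if it was absent from the top 10.
    if pos is None or pos > 10:
        return 0.0
    return round(1.0 - 0.1 * (pos - 1), 1)

def presence_score(pos):
    # Tabs 2 and 4: binary presence of the answer anywhere in the top 10.
    return 1 if pos is not None and pos <= 10 else 0
```

The total score of a search engine in each tab is then the sum of these per-query scores.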
Analyzer of Original Texts Ranking
Unfortunately, copyrighted content is illegally copied all too often on the web. Any author faces the fact that his or her original texts are being stolen: the text of a newly created article can be copied to some web page within days or even hours after the article has been created. The websites stealing content in this way usually claim that it was "taken from open sources" or "uploaded by one of the users". The stolen content can be used to attract search engine users to a web page, and the resulting traffic can be converted into money. This is the main reason for such 'borrowing'. The ability to identify original texts and rank the corresponding web pages higher than pages containing copied materials is a crucial property of any search engine.
The analyzer of the ranking of original texts uses exact quotation queries to monitor daily the positions of 100 marker articles. For these articles, the websites of the copyright holders are known. The analyzer can thus calculate the percentage of queries for which the original text is ranked higher than the copied material.
The queries in this analyzer are fragments of the original articles. By default, the queries are submitted to the search engines in quotation marks. This way we expect that the only responses found will be the original article and its copies. However, real users rarely rely on quotation marks, so an additional tab estimates the results based on queries submitted without quotation marks.
The search engines are sorted by their ability to rank original texts higher than the copies in the informer of this analyzer.
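Under the assumption that a quoted-fragment query returns only the original article and its copies, the analyzer's headline figure reduces to checking whose site holds the top position. A sketch (the data layout and domain names are hypothetical):

```python
def share_original_first(results_per_query):
    """results_per_query: list of (ranked_urls, original_domain) pairs,
    one per marker article.  Returns the percentage of queries where
    the copyright holder's page is ranked above all copies, i.e. where
    the first hit comes from the original domain."""
    wins = sum(1 for urls, domain in results_per_query
               if urls and domain in urls[0])
    return 100.0 * wins / len(results_per_query)

data = [
    (["https://holder.example/article", "https://scraper.example/copy"],
     "holder.example"),
    (["https://scraper.example/copy", "https://holder.example/article"],
     "holder.example"),
]
print(share_original_first(data))  # 50.0
```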
Analyzer of the Location Search Quality
One of the most frequent and most obvious uses of search engines is plain geographical navigation. We often want to find out whether a certain organization, business, service etc. is located where we think it should be. Although the overall search quality for this type of queries is not bad at all, mistakes sometimes occur. E.g., the search engine promptly finds the entity in question in some other city district or even some other city. Or, conversely, it supplies us with similar addresses of other organizations, presuming it makes no difference what we do, provided we do it in the right place. This analyzer was made to compare the respective merits of search engines in dealing with such queries.
As in real life, our input queries consist of the organization's name plus an approximate locality, like a city district, a street or a nearby underground station. To make the results of the evaluation more precise, we only use queries where the object of search is one single entity. The ideal output in such a case would be an answer card with a full list of contacts. But at this stage we judged that the mere presence of the correct address in the top snippet suffices for the SE to get the maximum grade. On the other hand, results containing some other useful information, but not the address, won't get any points.
The results are calculated on the same principle as in the Navigational Analyzer: the higher the snippet with the correct address is found, the more points the search engine gets (the first snippet earns 1.0 point, the 10th 0.1). In addition, we calculate the ratio of correct answers found outside the snippets.
Transactional Search Analyzers
Every user who has ever tried to download anything from the Web, or to listen to or watch anything online, is familiar with the price we have to pay for obtaining free content: watching obtrusive ads, waiting for a full download while the download speed falls almost to zero, or even risking catching a computer virus from some suspicious site. To say nothing of the pangs of conscience - we all know that such content is put on the Web with little or no regard for copyright laws.
Fortunately, the amount of software, movies and music you can download, watch or listen to quite legally (usually on the websites of their creators) has grown significantly of late. The problem of finding a good source for this content therefore has an unequivocal solution. It is all the more important that the search engines know how to find these correct, official sources, instead of the "fun portals" and all the scam sites.
This analyzer is the first in the intended series of analyzers measuring the search quality for transactional queries, i.e. for queries with a narrow practical goal: to download some file, to listen to a specific piece of music, to buy tickets, to transfer money etc.
Official Versions Search Analyzer
Official Software Search
Looking for a necessary program on the Web is a task most users have faced. However, the search engines frequently offer unofficial and sometimes downright untrustworthy sites with the program available for download. Can they do better when a free official version of what the user is after is available?
Data Freshness Analyzer
The degree of a result's relevance often depends not only on the content of the information retrieved, but also on its freshness. When the object of the search is a rapidly changing thing, such as news, schedules, forecasts etc., out-of-date invalid information can be as harmful as it is meaningless, since it may confuse and mislead the user.
This group's analyzers check the presence of valid or, vice versa, outdated results in the search engine output. (Needless to say, we have picked the types of queries that make this check sensible: phone numbers and the positions of executive managers.) As it happens, the results marked as "fresh" also tend to get outdated. We therefore have to regularly arrange additional tests of the analyzers' results. If you catch sight of some outdated marker, please do let us know!
The Indexation analyzer should be considered separately: it shows how long it takes for a new Web page to enter the search engines' indexes. Fast indexation is clearly a necessary prerequisite for maintaining the freshness of search results.
Data Freshness Analyzer: Jobs
This analyzer is the first in a series of analyzers estimating to what extent the search results are up to date.
For many queries it is important that a search engine does not give outdated results. For example, a user looking for "president of Zimbabwe" is most likely interested in finding out who the current president of Zimbabwe is. The user will not be satisfied if they only get information about the previous presidents.
Within this analyzer, each query is associated with one or more up-to-date response markers as well as one or more outdated response markers. Any web page containing the up-to-date information increases the search engine's score in this analyzer. On the other hand, the score is decreased for any page which only contains outdated information. The results which are not recognized as containing either up-to-date or outdated information are not analyzed. Some of these unrecognized results may be irrelevant to the query, but our focus here is only on the actuality of the results.
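The scoring rule above can be illustrated as follows (the marker names, page texts and naive substring matching are invented for the example; they are not the analyzer's real markers):

```python
def freshness_score(pages, fresh_markers, stale_markers):
    """+1 per page containing an up-to-date marker, -1 per page that
    contains only outdated markers; pages with neither are ignored."""
    score = 0
    for text in pages:
        t = text.lower()
        if any(m.lower() in t for m in fresh_markers):
            score += 1
        elif any(m.lower() in t for m in stale_markers):
            score -= 1
    return score

pages = [
    "Ivanov was appointed CEO last month",               # up to date: +1
    "Petrov, the company's CEO, announced the results",  # outdated only: -1
    "The company reported record revenue",               # unrecognized: 0
]
# Hypothetical markers: "Ivanov" is the current CEO, "Petrov" the former one.
print(freshness_score(pages, ["Ivanov"], ["Petrov"]))  # 0
```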
We plan to develop new analyzers which will assess actuality of other types of information: recent news, pricing information for everyday goods as well as for stocks and currencies, last-minute deals etc.
Data Freshness: Phone Numbers
Continuing the series of data freshness analyzers, this one estimates the quality of phone number search.
The growing number of outdated phone numbers in search results is largely due to the mass renumbering of telephones in Russia. Undoubtedly, up-to-date company contacts are very important both for the clients and for the staff. Timely measures taken by search engines prevent firms from losing new clients and spare customers the irritation caused by an unsuccessful search.
The queries in this analyzer are the names of companies that have recently changed their contacts. Any web page containing the up-to-date information increases the search engine's score; conversely, the score is decreased for any page which only contains outdated information. Pages which contain neither old nor updated information are left out of account. Cases where the dialling code is significant for the evaluation (for instance, 8 (499) xxx-xx-xx in Moscow) are also taken into account.
As the quality coefficient we use the share of web pages with up-to-date phone numbers.
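Comparing phone numbers across pages requires normalizing away formatting while keeping the dialling code, since a 495 → 499 code change is exactly the kind of update being tracked. A rough sketch (the normalization rules are our simplification, not the analyzer's):

```python
import re

def normalize_phone(raw):
    """Strip formatting, drop a leading 8/+7 trunk prefix, and keep the
    three-digit dialling code separate so 495 and 499 numbers differ."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) == 11 and digits[0] in "78":
        digits = digits[1:]
    return digits[:3] + " " + digits[3:]

def share_up_to_date(page_phones, current_phone):
    """Quality coefficient: share of pages citing the current number."""
    target = normalize_phone(current_phone)
    return sum(normalize_phone(p) == target for p in page_phones) / len(page_phones)

print(normalize_phone("8 (499) 123-45-67"))   # '499 1234567'
print(share_up_to_date(["+7 499 123-45-67", "(495) 123-45-67"],
                       "8 (499) 123-45-67"))  # 0.5
```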
Analyzer of Indexation Time
How long does it take for the newly created pages to start showing up in search results?
We selected about two thousand websites from the Russian Internet. All of these websites inform the search engines about the updates using a special sitemap file (sitemap.xml). All of these websites are generally indexed by the search engines: this is confirmed by the fact that the random pages from these websites are almost always found among the first 10 hits in response to the queries identical to the titles of the relevant pages (the text within <title>). So how long does it take for the newly created pages to start appearing among top 10 results for corresponding queries?
We scan the sitemap files for new pages every day (each website may contribute up to 3 pages per day). Search engines are of course able to do the same - after all this is the purpose of the sitemap files.
For every new page, we record the title and the URL. Within a month from the date when the new page was detected, we check if this page is found in top 10 in response to two types of queries:
a) a query which mentions the text of the title and restricts the search to the relevant website;
b) a regular text query using the title
The first type of query assesses the speed of indexation proper. The second type assesses the time it takes for the new pages to become visible in general search results.
The score in this analyzer is calculated as the mean visibility of the new pages within the last month. We also report the data for smaller periods.
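Mean visibility over the last month can be computed from the daily check log roughly like this (the data layout, URLs and dates are our assumptions for the sketch):

```python
from datetime import date, timedelta

def mean_visibility(checks, today, days=30):
    """checks: {url: {date: bool}}, where the bool records whether the
    page was in the top 10 on that day.  Returns the mean share of
    positive checks per page over the last `days` days."""
    start = today - timedelta(days=days)
    shares = []
    for by_day in checks.values():
        recent = [found for d, found in by_day.items() if d > start]
        if recent:
            shares.append(sum(recent) / len(recent))
    return sum(shares) / len(shares) if shares else 0.0

checks = {
    "site.example/new-page": {date(2012, 1, 28): False,
                              date(2012, 1, 29): True,
                              date(2012, 1, 30): True},
    "other.example/post":    {date(2012, 1, 30): True},
}
print(mean_visibility(checks, today=date(2012, 1, 31)))
```

The same function applied with a smaller `days` value yields the figures for shorter reporting periods.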
Interestingly, there seem to be pages which appear in the search results in the first couple of days after they were detected but disappear by the end of the first week.
Recall & Diversity Analyzers
As the web search is used for every possible need, it is vital that a search engine be able to find meaningful answers to most diverse questions. This ability depends both on the search engine's scope and on its interpreting potential.
These analyzers reflect the two above-mentioned aspects of the search engines' functioning, the quantity of the output and its diversity (when allowed by the non-strict sense of the query). To keep the results valid, we regularly put to test our set of queries.
It is worth noting that since the information that cannot be retrieved by means of a search engine is in some sense unattainable, the relative values in these analyzers are in no way less important than the absolute ones.
Recall analyzer
The recall analyzer estimates the relative size of indices of the Internet search engines.
The total number of documents indexed, as reported by a SE itself, cannot be used for comparison, because different SEs count documents differently. For example, some of them include duplicate documents in the count while others do not. Counting duplicate documents may double the reported index size or increase it even more.
Additionally, the size of the index is a very PR-sensitive issue, as it is one of the very few simple notions in the SE area easily understandable by journalists. This means that the bigger the index size you report, the better press you get.
The number of documents found for a particular query does not always reflect the real number of documents indexed by a SE. Almost every frequent query will return tens of thousands of results in all search engines. But the user will never be allowed to see them all: the search session will be interrupted after browsing through the first few hundred pages. Thus the exact number of web pages found can be verified only when the number of possible results is very small, that is, for queries containing very rare words.
For multiple-word queries, certain SEs show in the search results not only the documents where all the words comprising the query are found, but also documents containing single words from the query. These "tail" documents are usually irrelevant to the query, but counting them can increase the total number of pages found.
In order to obtain independent and reliable data on the relative index size of the popular SEs, we developed a simple automatic method based on a set of sample queries. We gathered a set of very rare words, each of which occurs several tens of times on the web. Once a day, we count how many of these occurrences are found by each search engine.
To make the data steadier, we use a different set of sample queries from the whole query pool every day.
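The daily rotation and the counting can be sketched as follows (the sample size, the seeding scheme and the capping of found counts are illustrative assumptions):

```python
import random

def daily_sample(query_pool, day_iso, k=50):
    """Draw a reproducible per-day subset of the rare-word queries, so
    every engine is measured on the same sample on any given day."""
    rng = random.Random(day_iso)            # seed with e.g. "2012-01-31"
    return rng.sample(query_pool, min(k, len(query_pool)))

def relative_recall(found_counts, known_counts):
    """Share of the known web occurrences of the rare words that a
    search engine actually finds (capped at the known count)."""
    total_known = sum(known_counts.values())
    total_found = sum(min(found_counts.get(w, 0), n)
                      for w, n in known_counts.items())
    return total_found / total_known

known = {"word_a": 20, "word_b": 30}        # hypothetical rare words
found = {"word_a": 10, "word_b": 30}
print(relative_recall(found, known))  # 0.8
```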
The set of sample queries is constantly replenished by our linguists. If you have some rare words and want to help us cover the 'faraway' areas of the Net, please send us these words, and we will consider including them into the sample queries list.
Analyzer of subject search
A human being is often able to interpret a search query, determine what the user wants, evaluate the information on the Web and form the search results better than a machine. For this reason, the results formed by an expert are generally better than those of an algorithm.
This analyzer monitors daily the search results for a set of queries for which the corresponding links have been selected by experts. It calculates the number of sites found by each search engine that appear on the experts' list. For each query, the percentage of similarity between the algorithmic and expert results is calculated.
As the expert opinion, the output of the expert system Neuron is used. The aggregate indicator of this analyzer is the proportion of results matching the expert opinion, regardless of the position of the matching site(s).
The best search engine is the one with the highest value of the aggregate indicator. In the informer, the search engines are sorted by the aggregate indicator.
Currently, 18 queries are evaluated. The number of the queries will be increased.
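The aggregate indicator is a plain overlap ratio. A minimal sketch (we assume normalization by the number of the engine's results, which the text does not state explicitly; the site names are invented):

```python
def expert_overlap(engine_sites, expert_sites):
    """Percentage of the engine's results that appear on the experts'
    list, irrespective of their positions."""
    if not engine_sites:
        return 0.0
    expert = set(expert_sites)
    hits = sum(1 for s in engine_sites if s in expert)
    return 100.0 * hits / len(engine_sites)

print(expert_overlap(["a.ru", "b.ru", "c.ru", "d.ru"],
                     ["b.ru", "d.ru", "e.ru"]))  # 50.0
```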
Analyzer of Ambiguous Queries
When Russian users search for «белки» do they mean the furry animals or the proteins? There is about a dozen universities which can be referred to as "КГУ", so is the user entering this query looking for Kazan State, Kursk State or any other educational institution which happens to have the same abbreviated name?
In the case of an ambiguous query, the user could have had in mind any of the "readings". Therefore, instead of showing the results based on the most frequent interpretation, it seems to be the best bet for a search engine to include the results corresponding to all the different readings. This way the users may at least realize that their query is ambiguous which in turn will help them to make the query more precise.
This analyzer estimates the percentage of possible interpretations of the ambiguous queries which are reflected in the search engines' results.
Most of the marker queries contain just one word, and the possible interpretations usually do not include the numerous businesses bearing a name equal to our query. We only included very famous companies, to which a user could plausibly refer with a one-word query.
Local Search Analyzers
One of the many factors influencing web search is the location where the search is conducted. Not that it is of overall importance, but quite often the set of answers to a query will differ according to the specific region.
Here we inquire into the quality of search conducted from different cities of Russia. The analyzer's queries are entered from servers located in ten cities as far apart as Krasnodar and Vladivostok. Then the results are tested to see if they are region-appropriate.
To fill such an analyzer you have to sort out queries very carefully. For example, it is possible that a user looking for "North Star" is really keen to find out something about a restaurant three blocks away, but it is more likely that he is searching for information concerning the celestial body. By contrast, "train schedule" is in most cases a region-dependent query. To make an unbiased evaluation of the local search quality, we only use queries that must be answered on the basis of local data.
Analyzer of Regional Search
When a user from Ufa or Novosibirsk enters the query "pizza delivery", he or she is most likely looking to get some pizza. The websites of companies offering pizza delivery in Moscow would probably be of no use to the user since it does not make sense to deliver pizza from Moscow to Novosibirsk.
In this analyzer, we query the search engines from different cities across Russia. We select the search engine results for which the title, snippet or URL clearly mentions the location from which the query was sent. We only evaluate the information available as part of the SERPs, because this is all that regional users see when deciding whether to follow a link to a given webpage found in the SE results.
The percentage of responses to queries, averaged over all cities except Moscow, is taken to be a good estimate of how friendly the search engine is to regional users.
This analyzer embraces both the queries which clearly ask for a regional service ("order glasses", "chinese food delivery") and the queries which imply a search for some local information ("drug store prices", "medical exam when pregnant").
Analyzer of Regional Navigation
This analyzer is devoted to the situations where the users are looking for a particular web page, but the relevant page varies depending on where the query is coming from. Navigational search in general is the search for a particular web page accomplished by queries such as [sberbank] or [gazeta ru].
If the users are looking for a particular web page, they often want a page which is specific to their current location. For example, the users from both Kazan and Moscow expect to find some page from afisha.ru when they input the query [afisha]. However, the user from Kazan probably will not be happy to see the pages of afisha.ru specific to Moscow.
We test the search engines on 100 queries submitted from different locations within Russia. The sets of queries used in each city are slightly different depending on whether the relevant regional pages exist. For each query, the analyzer records if the relevant regional website (or the regional part of the general website) was found.
For each search engine we calculate the percentage of queries for which the relevant regional response appears in top 10 results.
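The per-engine figure is a success rate over queries. As a sketch (detecting a "regional" page by a substring in the URL is a crude stand-in for the analyzer's manual markup; the domains are invented):

```python
def regional_success_rate(per_query_top10, region):
    """Share of queries (in %) whose top 10 contains a page specific
    to `region`, e.g. a regional subdomain or section."""
    hits = sum(1 for top10 in per_query_top10
               if any(region in url for url in top10[:10]))
    return 100.0 * hits / len(per_query_top10)

results = [
    ["https://kazan.afisha.example/", "https://afisha.example/msk/"],
    ["https://afisha.example/msk/"],
]
print(regional_success_rate(results, "kazan"))  # 50.0
```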
Analyzer of Loading Time
The users expect that the search engines will give them results as fast as possible. Even small differences in the time it takes for the SERP to load may be a reason for some users to prefer one search engine over another.
This analyzer shows the time it takes for users in different Russian cities to get their search results. The SERP loading time data are collected at the same time as the Regional Search analyzer results, hence for the same cities and for the same queries.
Of course the loading time is affected by multiple factors, such as the quality and overall usage of the connection. However, if some search engine is slower to respond than the others on a substantial number of queries every day, this is a tendency worth analyzing.
This analyzer also estimates the size of the SERPs. The data on SERP size can be seen by clicking on the corresponding link below the graph. Like most modern web browsers, we download compressed SERPs.
Query Comprehension Analyzers
Of course, the query comprehension by a search engine is but a metaphor. Yet one eventually gets used to the search engines' "creativity" in the interpretation of queries: their search prompts, spelling corrections, their attempts at refining or improving the query, or re-ordering the answers according to the assumed object of inquiry etc.
At present, the analyzers in this group test such relatively primitive skills of the search engines, as correcting typos, giving prompts, expanding queries by synonyms. Meanwhile, some more refined search techniques (query interpretation, dealing with multiple meaning queries and so on) are to be estimated only indirectly, by mistakes arising when the query is misunderstood by the search engine. (Search Engine Mistake Analyzers, several in number, are arranged in a distinct group.) But then, it's the occasional mistakes that most clearly demonstrate a SE's skill in processing the queries. When the search results are correct, we just don't pay any attention to the search engine's "tricks". These tricks are only exposed through incongruent or funny output.
Analyzer of Correct Hints
Most of the search engines attempt to suggest a correct spelling for a query in case a typo is suspected. The quality of such hints is an important addition to the overall quality of the search. This analyzer looks for the correct hint in the search results for a query with a deliberate typo and estimates the number of occurrences of a 'correct' query contained in the hint.
The evaluation is based on the same set of queries containing typos that is used for the Typo Resistance Analyzer. The more correct hints given, the higher the search engine's score in this analyzer.
Typo Resistance Analyzer
Humans are not machines; they make mistakes. This includes mistakes made while typing a search query: a typo from an adjacent key pressed by accident ("quety" instead of "query"), a doubled character or a missed one ("queery" or "qury"), or, after all, a word typed 'by ear' by a user who does not know the correct spelling ("yandax" instead of "yandex").
In this case, the search engine can adhere to one of the following strategies:
1) no processing: search with exact spelling only
2) recognize the typo, but still search for the entered query with an additional hint: "perhaps you were looking for [correct spelling]?"
3) recognize the typo and search for the correct spelling immediately
Depending on the chosen strategy, the user either remains unaware of the fact that (s)he is mistaken, or notices it and makes an extra click (up to the user), or gets the correct results without ever noticing his own mistake.
This analyzer compares the search results for the "correct query" and several forms of its possible mistypings. The similarity of the results to those of the "correct" query is evaluated.
Apart from deliberate typo correction, matches can arise in four cases:
1) accidentally
2) the page contains both the correct and the mistyped spelling
3) incorrect reaction of the engine's morphology (e.g., the unknown word "mushroomz", a typo of "mushrooms", is corrected to "mushroom")
4) promotion of the same websites for both correct and incorrect spellings of queries
All of these cases produce noise in this analyzer: accidental matches of results.
The similarity is evaluated in the same way as for the update analyzer, but with a different set of queries.
The more matching results are registered, the higher the search engine's index for this analyzer. This determines the order of the search engines in the analyzer's informer.
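Result-set similarity between the correct and the mistyped query can be measured as a simple top-10 overlap (a sketch; the analyzer's exact similarity formula is not spelled out in the text):

```python
def result_similarity(correct_urls, typo_urls, k=10):
    """Share of the mistyped query's top-k results that also occur in
    the correct query's top-k."""
    correct = set(correct_urls[:k])
    typo = typo_urls[:k]
    if not typo:
        return 0.0
    return sum(1 for u in typo if u in correct) / len(typo)

print(result_similarity(["a", "b", "c"], ["b", "c", "d"]))  # 2 of 3 match
```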
In the future, rotation of query sets with typos drawn from a wide pool will be introduced.
Analyzer of the Synonymous Queries
One and the same query can be formulated in many different ways. For example, queries like "how to deduce the address from the telephone number", "search for address by telephone number", and "find an address by telephone number" have the same meaning for the user; these are synonymous queries.
A number of different practices lead to synonymous queries. Among them:
- using well-known abbreviations: "weather in Saint Petersburg" and "weather in SPb";
- using transliteration: "gazeta ru website" and "газета ру website";
- varying word order: "repair automatic transmission" and "automatic transmission repair".
In most of these cases, the users expect to get information related to the meaning of the query, not to its particular formulation. We can take this to imply that the search engines should yield identical results in response to synonymous queries, despite the differences in query formulation.
The analyzer of the synonymous queries assesses the (presumably unnecessary) variability of responses to synonymous queries. The search engines are sorted by the amount of such variability.
All the queries used in this analyzer are real, they were acquired from Rambler's query statistics tool (http://adstat.rambler.ru/wrds/). Note also that we are excluding the queries containing grammatical or spelling errors from this analyzer.
Analyzers of Search Engine Mistakes
Search engine mistakes are the negative side of its ability to refine and to understand the query (discussed in the Query Comprehension section). The queries in this group were sought out to demonstrate such mistakes in a most vivid and obvious manner. They might seem exotic, but every active search engine user knows that such mishaps tend to occur quite regularly.
When your favourite search engine makes a fool of itself, your emotions may vary between laughter and utter rage, but in any case your loyalty to it suffers a crack. The user who is only mildly content when a search engine functions as it should feels rightly indignant when he sees a blunder. The unavoidable irritation is a risk factor that should not be underrated by the developers.
Query Substitution Analyzer
The Query Substitution Analyzer is the first in a new series of analyzers demonstrating mistakes in the functioning of search engines. The words chosen for queries are correct, existing, meaningful words, but as they are rather infrequent, the search engines often don't take up the challenge of finding them. Instead, they offer the searcher an output based on some other query which may more or less resemble the original one but, as it would seem, is asked more often.
As a result, the query is not answered. What's more, the searcher gets mildly insulted, for some search engines spice their wrong results with a prompt remark like 'We have corrected the misprint', presuming the searcher does not know the correct spelling of the word he is looking for.
The analyzer calculates the percentage of wrong search results (those not containing any form of the original query). Latin transliteration is counted as a correct result whenever applicable.
Two additional parameters are counted: 1) the percentage of automatic query substitution (the search engine treats the query as a misprint, which it is not); 2) the percentage of wrong suggests, like 'you were probably looking for...'. Suggests are a less rude form of interference with the searcher's wishes. Still, when someone suspects that your search for 'dyery' was probably made under the spell of insanity and your real purpose was to find a 'diary', it is bound to drive you nuts.
Far be it from us to claim that all query corrections and suggests are an absolute evil. There are many cases when they really serve their purpose (as you can see in our Typo Resistance Analyzer). But search developers have to be extremely careful when applying such strong measures, for their misuse might harm the search engine's image.
Phrasal Query Substitution Analyzer
Next in the series of Search Engine Mistakes analyzers is the Phrase Substitution Analyzer. As in the Query Substitution Analyzer, the queries here are existing, meaningful, but infrequent language units. They are just longer units - phrases instead of words. So, one would think, they must be found more easily, since the meaning of the second word specifies the meaning of the first, rare one. In this case any replacement or even suggest seems absolutely senseless: it will prevent the user from finding what he is looking for and possibly irritate him. That's why we opted for checking this aspect of search engine functioning.
The analyzer calculates the percentage of 'bad' snippets, i.e. snippets that do not contain the words of the query, because this is often a sign that the query words have been replaced. In some cases it has proved useful to check the snippets for only a part of the query; this was done when insisting on both words could distort the control procedure. Synonyms and transliterations are counted as correct responses whenever possible.
Two additional metrics are the percentage of explicit query replacement and the percentage of replacement with suggest.
Names Search Analyzer
Trivial as it seems to look for information about a person by entering his or her name, it often proves tricky: instead of finding the person we need, we stumble upon his famous namesakes.
One of the most persistent search engine mistakes is mixing up names and surnames (like splitting 'Richard Shakespeare' into 'Richard III' and 'William Shakespeare'), so the queries in the analyzer were chosen with a view to making such results more probable. We only used the names of real people we found on the web, but their surnames either coincided with those of some hot media personalities, or were in some other way non-standard and difficult for a search engine to find.
The analyzer automatically divides the names it finds in the snippets and titles of the first output page into hits and misses, according to whether the right person was found. Then we perform a manual check. It is necessary for disentangling several complicated situations, for instance when a name and a surname are glued together at random (which would look strange in English, but can easily happen in Russian with its freer word order). The results with name initials are also checked manually, so that the context might help to decide upon the appropriate answer.
The resulting number is the average ratio of correctly found pages to the overall number of pages.
Unbreakable Queries Analyzer
Search queries most often consist not of a single word, but of word chains or word phrases. Sometimes the relation between the words in such a chain is loose enough, but there are cases when the phrase is actually indivisible, expressing a single notion through two or more words, e.g. 'bad bishop' (a chess term that clearly shouldn't be echoed by 'America's worst bishops' in the output). In such cases it is the task of the search engine to preserve the wholeness of the phrase, at least in the top results. Otherwise, the meaning of the query gets distorted.
The unbreakability analyzer continues the series of search engine mistakes analyzers. It measures the percentage of cases when a search engine does not destroy the meaning of an unbreakable query by separating its words. For every query we check the presence of the searched phrase in the output. In a minority of cases the phrase can be split by a very limited number of words (like 'Black Volta', which can appear in 'Black and White Volta' without any change in meaning). Such cases are counted as positive results. Likewise, when the searched words stick together accidentally, they are counted as positive if there is no punctuation mark between them.
As usual, the average number of positive results for all the queries appears as the search engine's overall result in the informer.
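The phrase check, including the tolerance for a couple of intervening words and the ban on intervening punctuation, can be approximated with a single regular expression (a sketch; the gap limit of two words is our assumption):

```python
import re

def phrase_kept(snippet, phrase, max_gap=2):
    """True if the words of `phrase` occur in order with at most
    `max_gap` plain words between them and no punctuation in between,
    so 'Black and White Volta' still counts for 'Black Volta'."""
    words = [re.escape(w) for w in phrase.split()]
    # Allowed separator: whitespace plus up to `max_gap` bare words;
    # any punctuation between the words makes the match fail.
    sep = r"(?:\s+\w+){0,%d}\s+" % max_gap
    return re.search(sep.join(words), snippet, re.IGNORECASE) is not None

print(phrase_kept("the Black and White Volta rivers", "Black Volta"))  # True
print(phrase_kept("a bad, bishop move", "bad bishop"))                 # False
```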
Grammar Analyzer
The analyzer of grammatical misinterpretation is of a certain linguistic elegance. It deals with cases when a specific word-form belongs to more than one paradigm, like 'miss' in 'Miss Juliet' vs. 'I miss you terribly'. Naturally, such situations are more intricate in languages with rich inflectional morphology, like Russian, where a suffix or an ending increases the chance of there being a homonym to the word in question.
The task facing search engines is then to choose the more appropriate paradigm, i.e. the more appropriate meaning of the two, and to adjust the output accordingly. The instrument of the correct choice is the context, however short (even in two-word queries the second word almost invariably narrows the field of possible meanings for the first one).
We have to admit that in this respect the functioning of the investigated search engines has improved impressively over the last few years. Not so long ago the paradigms were confused so grossly that in order to evaluate the output you only had to cross off all the wrong word-forms. Nowadays search engines tend not to deviate too much from the given word-form, so there is no choice but to base the checking procedure on the context. It also means the queries should become more sophisticated, or rarer, or both. However, all the queries we use are quite realistic ones, meaning the search engines should actually be concerned with the results of the analyzer.
It is to be noted that the result of this analyzer, despite its belonging to the Mistakes group, shows the percentage of positive answers in the search engines' output.
Disturbing Content Analyzers
However well a search engine may work, there are still small annoying things that can easily dampen the user's good spirits and significantly shake his loyalty to a specific SE. Among them are, e.g., the danger of contracting a computer virus, the presence of irritating and obtrusive ads, etc.
Whereas the amount of advertisements or dangerous scripts on websites is not something search engines can control, the concentration of objectionable content in the output is totally in their power. So, if the websites in the output abound with annoying factors, the search engine would only benefit from ranking such sites far below high-quality safe ones.
Most analyzers of this group make use of specific techniques developed by "Ashmanov & Partners" for detecting ads, pornography, computer viruses etc. To make the results more telling, we selected marker queries for which the probability of undesirable results is higher than usual.
Analyzer of Search Spam
At "Ashmanov and Partners" we study the phenomenon of search spam – the methods and technologies reducing the quality of search results and interfering with the operation of search engines.
Search spam is a text, URL, technology, program code or other web element created by the webmaster for the sole purpose of promoting the site in search engine results, and not for fast and reliable search based on complete and authentic information.
The experts check Top 10 results of search queries on a regular basis, marking the sites which, in their opinion, contain elements of search spam. The collected data is entered into the informer. It shows the percentage of sites marked as spam in the overall number of sites that appeared in Top 10 of analyzed queries.
The source of information on the spam status of a given URL is the data of the anti-spam lab of the company "Ashmanov and Partners". The following categories of search spam are used:
* doorway – definite spam: doorways, leading the user to other pages,
* spamcatalog – definite spam: spammer catalogues,
* spamcontent – definite spam: spammers' stolen content,
* pseudosite – definite spam: site disguised as corporate (pseudo-company),
* catalog – catalogues,
* board – bulletin boards,
* domainsale – domains for sale,
* secondary – secondary, stolen content,
* partner – any partner programs,
* linksite – link support site,
* spamforum – forum containing spam,
* techspam – technical spam,
* searchres – search results,
* cj – circular jerk.
An aggregate indicator is the share of spam sites in the search results. The best search engine has the lowest indicator. This determines the order of search engines in the informer of the analyzer.
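As a rough sketch, the aggregate indicator could be computed like this (the category labels come from the list above; the data structures are our own illustration, not the analyzer's actual code):

```python
# Spam-category labels from the list above.
SPAM_CATEGORIES = {
    "doorway", "spamcatalog", "spamcontent", "pseudosite", "catalog",
    "board", "domainsale", "secondary", "partner", "linksite",
    "spamforum", "techspam", "searchres", "cj",
}

def spam_share(serps):
    """serps: list of top-10 result lists; each result is the set of
    spam-category labels assigned by the experts (empty if clean)."""
    total = spam = 0
    for top10 in serps:
        for labels in top10:
            total += 1
            if labels & SPAM_CATEGORIES:
                spam += 1
    return spam / total if total else 0.0

# Example: two queries, three results each; two results marked as spam.
serps = [
    [set(), {"doorway"}, set()],
    [{"cj", "linksite"}, set(), set()],
]
print(round(spam_share(serps), 3))  # -> 0.333
```

The engine with the lowest share would then come first in the informer.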
Pornography Analyzer
This analyzer collects search results for ambiguous queries which may be interpreted as targeting a certain category of pornography, but also admit other interpretations. No queries which unambiguously indicate that the user is searching for porn are included.
For instance, the query [stockings] could come from a user looking for a stockings shop or for the corresponding category of pornography.
When search engines include pornographic results in responses to such queries, they run the risk of showing these results to users (potentially including minors) who were not looking for adult content.
We do not claim that it is «bad» or «immoral» for a search engine to return pornographic pages in response to unambiguously pornographic queries. However the appearance of adult results in responses to «regular» queries is a drawback in our opinion.
We make use of the 'Semantic Mirror' technology developed at 'Ashmanov and Partners' in order to detect adult content.
For every search engine, we calculate the percentage of pages with adult content among the top 10 results.
Analyzer of the SafeSearch Quality
SafeSearch is a filter that is supposed to block the «adult only» websites from appearing in the search results. Nowadays most search engines allow their users to filter the results in this way. But does SafeSearch do its job well enough?
This analyzer collects search results for ambiguous queries which may be interpreted as targeting a certain category of pornography, but also admit other interpretations. The question is whether SafeSearch effectively filters out the adult responses in such cases.
The set of queries in this analyzer is the same as in the analyzer of the presence of adult sites in search results. We submit these queries to the search engines with the SafeSearch option turned on wherever possible. In this case the ideal results should contain no adult pages at all, and no results with obscene words. If adult pages still show up in the search results, the search engine is not keeping its promise to protect the user with SafeSearch.
We make use of the Semantic Mirror technology developed at 'Ashmanov and Partners' in order to detect pornography. For every search engine, we calculate the percentage of pages with adult content among the top 10 results.
Obscene Language Analyzer
The analyzer checks for the presence and quantity of obscene words on the pages found by the search engines in response to neutral queries (those that do not contain obscenities and do not suppose the response to contain them). The default measurement is made with "Safe search" on, meaning the result should normally be zero; measurements with ordinary search options are also available.
There are two indicators in the analyzer. The simple indicator counts the number of pages found by the search engine that contain at least one obscene word in its regular or masked form (as the law makes no difference, provided the word is easily recognized).
The default indicator is more informative: it shows the quantity of obscene words on the pages found. Each word shown in full is worth 3 points, while a masked one is worth 1. The points are summed over all the pages found to give the search engine's result for a specific query. The mean value over all queries is then calculated to form the analyzer's indicator for a given search engine.
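The scoring rule can be illustrated with a short sketch (in reality the per-page hit counts would come from the 'Semantic Mirror' service; here they are supplied by hand):

```python
def page_score(clear_hits, masked_hits):
    """3 points for each obscene word shown in full, 1 for a masked one."""
    return 3 * clear_hits + 1 * masked_hits

def engine_score(queries):
    """queries: list of SERPs; each SERP is a list of
    (clear_hits, masked_hits) tuples, one per page found."""
    per_query = [sum(page_score(c, m) for c, m in serp) for serp in queries]
    return sum(per_query) / len(per_query)  # mean over all queries

# One query with a page containing 2 full and 1 masked word (7 points),
# another query whose two pages are clean (0 points): the mean is 3.5.
print(engine_score([[(2, 1)], [(0, 0), (0, 0)]]))  # -> 3.5
```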
The recognition of obscene words is performed by the automatic service "Semantic Mirror".
Intrusive Ads Analyzer
Web pages with similar content may differ in the amount of ads. A search engine that favors pages with less intrusive ads is thus better for the user.
The analyzer is based on an ad-recognition technology assessing the scripts, iframes and other code on the page. Each page is assigned a total intrusiveness score. For each SERP a weighted average of the intrusiveness scores is taken; the average intrusiveness over all SERPs is then the search engine's score in this analyzer.
The intrusiveness of a particular web page is the sum of the intrusiveness scores of all individual ads on this page. The intrusiveness of an ad is calculated as follows:
- context ads and small simple banners receive intrusiveness of 1
- teasers and bigger shiny banners receive 3
- some ads have to be clicked on in order to be closed. These include clickunders (separate windows which open upon a click anywhere on a page) and scrolling banners which cover up a part of a page. These ads receive intrusiveness of 9.
- finally, some particularly nasty ads have to be clicked more than once in order to be closed. These receive intrusiveness of 18.
For each web page found, the intrusiveness is adjusted according to the position of the page in the SERP. The weight of each result reflects the probability that the link will be clicked, and the weights of all positions average to 1 (position 1 has the weight of 1.5 and position 10 has the weight of 0.4). Thus, if the sites with intrusive ads are randomly distributed along the SERP, the average intrusiveness won't be affected by the positional weight.
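A minimal sketch of this weighting scheme, with one assumption the text does not pin down: we interpolate the weights between positions 1 and 10 linearly and then normalize them so that they average exactly to 1, as required above. The ad-type names are our shorthand for the categories listed.

```python
# Per-ad intrusiveness scores from the list above.
AD_SCORES = {"context": 1, "small_banner": 1, "teaser": 3,
             "big_banner": 3, "clickunder": 9, "scrolling": 9,
             "multi_click": 18}

def positional_weights(n=10, first=1.5, last=0.4):
    """Assumed linear weight curve, normalized to average 1."""
    raw = [first + (last - first) * i / (n - 1) for i in range(n)]
    mean = sum(raw) / n
    return [w / mean for w in raw]

def serp_intrusiveness(pages):
    """pages: list of lists of ad types found on each result page,
    ordered by SERP position."""
    weights = positional_weights(len(pages))
    page_scores = [sum(AD_SCORES[ad] for ad in ads) for ads in pages]
    return sum(w * s for w, s in zip(weights, page_scores)) / len(pages)

# If every page carries one context ad (score 1), the weighted average
# is 1 regardless of the weights, since the weights average to 1.
print(round(serp_intrusiveness([["context"]] * 10), 2))  # -> 1.0
```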
This analyzer looks at queries seeking popular songs or video content. These queries often yield pages overloaded with ads.
Note that web page assessment is based on the HTML code of a web page, while the visibility of ads on a particular page may depend on the user's browser and cookies.
Adult Content Ads Analyzer
Probably the most intrusive ads on the web are those with adult content. The presence of such ads on a web page irritates the users. Therefore a search engine which finds fewer web pages with pornographic ads is better for the user.
This analyzer has two tabs and both of them look at queries seeking popular songs or video content (these queries are also used in the Intrusive Ads Analyzer). These queries often yield pages overloaded with ads.
The analyzer is based on an ads recognition technology which assigns a total intrusiveness score to each page. In the first tab (“% of sites with porno-ads”) each search engine gets a score which is equal to the percentage of the pages found containing explicitly pornographic ads. Each page contributes to the search engine's score if it has at least one advertisement containing pornographic pictures. The ads which are not literally pornographic but can be considered blatant or improper are not taken into account in this tab. The scores of the search engines also do not reflect the amount, size and position of the pornographic ads on the pages found. However in most cases if the porno ads are present they occur in a prominent position on a page.
The other tab (“improper + porn”) shows the mean intrusiveness of all adult advertisements on the page. Here the amount and size of the ads, as well as the position of the website in search results is taken into account. For each ad, the intrusiveness is multiplied by a coefficient reflecting the status of this ad with respect to pornography. The following coefficients are used:
0 - an ad is not pornographic or adult in any way
0.5 - an ad sometimes contains some improper pictures, but sometimes these pictures are absent
1 - an ad always contains improper pictures
10 - an ad always contains explicitly pornographic pictures
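The coefficient table can be sketched in a few lines (the status labels are our shorthand for the four cases above):

```python
# Pornography coefficients from the table above.
PORN_COEF = {"none": 0, "sometimes": 0.5, "always_improper": 1,
             "explicit_porn": 10}

def page_porn_score(ads):
    """ads: list of (intrusiveness, porn_status) pairs for one page.
    Each ad's intrusiveness is multiplied by its pornography coefficient."""
    return sum(intr * PORN_COEF[status] for intr, status in ads)

# A big shiny banner (intrusiveness 3) that is explicitly pornographic
# plus a context ad (intrusiveness 1) with no adult content:
print(page_porn_score([(3, "explicit_porn"), (1, "none")]))  # -> 30
```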
The score of a particular web page is the sum of the adjusted porno-intrusiveness scores of all individual ads on the page. The intrusiveness itself depends on the size and shininess of an ad, as well as on whether the ad covers part of the page's content and requires the user to make a click. You can read more about how the intrusiveness score is calculated here. Although the adjusted porno-intrusiveness is on average similar to the intrusiveness of the ads, these two measures cannot be compared directly.
Analyzer of Viruses and Malware
Search engines try to prevent their users from accessing infected or fraudulent web pages. How efficient is this security measure?
This analyzer shows the percentage of infected web pages in each search engine's results. Each web page is analyzed based on the sources of scripts and frames/iframes. In addition to viruses, the Russian web is overwhelmed with ads which are trying to fool people into downloading fraudulent files. These ads are marked as "fraudulent downloads" in the analyzer.
The number of threats on a particular web page is displayed, but it does not influence the overall score of a search engine: a page with 5 threats contributes as much as a page with 1 threat. Each web page is searched individually for viruses based on antivirus software reports, webmaster opinions and other sources.
By default, the analyzer takes into account all kinds of threats. The "% of sites with viruses" tab only takes into account the sites infected with viruses, the sites with fraudulent ads are excluded on that tab.
Some search engines warn their user that a particular page may threaten them. However for the kind of queries used in this analyzer such warnings may not be sufficient. This analyzer inspects the queries devoted to downloading, listening to or watching music and video clips - same queries as in the intrusive ads analyzer. These queries yield thousands of result pages with roughly the same content. In such a situation it would seem reasonable to not just warn the users about a possible threat, but to lower the infected websites in search results so that an uninfected site is displayed on a high position instead. The data in this analyzer can be displayed based on all the threats (the "All" tab) or, alternatively, it can be displayed based only on the pages for which no warning is shown on the SERP (the "unmarked on SERP" tab).
Analyzers of Additional Search Features
The object of the user's immediate interest is the set of answers he receives to his query. Nevertheless, his overall impression may as well be affected by smaller technical factors, not immediately discernible on the SERP.
These analyzers are meant to compare some additional parameters of the search engines' functioning that may be of considerable interest for the user. It is worth noting that comparative values bear more importance here than the absolute numbers.
Analyzer of Loading Time
The users expect the search engines to give them results as fast as possible. Even small differences in the time it takes for the SERP to load may be a reason for some users to prefer one search engine over another.
This analyzer shows the time it takes for users in different Russian cities to get their search results. The SERP loading time data are collected at the same time as the Regional Search analyzer results, hence for the same cities and for the same queries.
Of course the loading time is affected by multiple factors such as the quality and overall usage of the connection. However, if some search engine is slower to respond than the others on a substantial amount of queries every day, this is a tendency to analyze.
This analyzer also estimates the size of SERPs. The data on SERP size can be seen by clicking on the corresponding link below the graph. Like most modern web browsers, we download the SERPs compressed.
Analyzer of Indexation Time
How long does it take for the newly created pages to start showing up in search results?
We selected about two thousand websites from the Russian Internet. All of these websites inform the search engines about the updates using a special sitemap file (sitemap.xml). All of these websites are generally indexed by the search engines: this is confirmed by the fact that the random pages from these websites are almost always found among the first 10 hits in response to the queries identical to the titles of the relevant pages (the text within <title>). So how long does it take for the newly created pages to start appearing among top 10 results for corresponding queries?
We scan the sitemap files for new pages every day (each website may contribute up to 3 pages per day). Search engines are of course able to do the same - after all this is the purpose of the sitemap files.
For every new page, we record the title and the URL. Within a month from the date when the new page was detected, we check if this page is found in top 10 in response to two types of queries:
a) a query which mentions the text of the title and restricts the search to the relevant website,
b) a regular text query using the title
The first type of query assesses the time it takes for the new pages to appear in search results. The second type of query assesses the speed of indexation.
The score in this analyzer is calculated as the mean visibility of the new pages within the last month. We also report the data for smaller periods.
Interestingly, there seem to be pages which appear in the search results in the first couple of days after they were detected but disappear by the end of the first week.
The queries in this analyzer are fragments from the original article. By default the queries are submitted to the search engines in quotation marks. This way we expect that the only responses found will be the original article and its copies. However the real users rarely rely on quotation marks, so an additional tab estimates the results based on queries submitted without quotation marks.
The search engines are sorted by their ability to rank original texts higher than the copies in the informer of this analyzer.
One of the most frequent and most obvious uses of search engines is for mere geographical navigation. We often want to find out whether a certain organization, business, service etc. is located where it should be, from our point of view. Although the overall search quality for this type of queries is not bad at all, mistakes sometimes occur. E.g., the search engine promptly finds the entity in question in some other city district or even some other city. Or, conversely, it supplies us with similar addresses of other organizations, presuming it doesn't make any difference what we shall do, provided we do it in the right place. This analyzer was made to compare the respective merits of search engines in dealing with such queries.
As in real life, our input queries consist of the organization's name plus the approximate locality, like a city district, a street or a nearby underground station. To make the results of the evaluation more precise, we only use queries where the object of search is a single entity. The ideal output in such a case would be a helper with the full list of contacts, but at this stage we judged that the mere presence of the correct address in the top snippet suffices for the SE to get the maximum grade. On the other hand, results containing some other useful information, but not the address, get none.
The results are calculated on the same principle as in the Navigational Analyzer: the higher the snippet with the correct address is found, the more points the search engine gets (the first snippet is worth 1.0 points, the tenth 0.1). In addition, we calculate the ratio of correct answers found outside the snippets.
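The positional scoring can be sketched in a few lines (assuming, as the numbers above imply, a linear scale from 1.0 for the first snippet down to 0.1 for the tenth):

```python
def address_score(position):
    """position: 1-based rank of the first snippet with the correct
    address, or None if it is absent from the top 10."""
    if position is None or position > 10:
        return 0.0
    return round((11 - position) / 10, 1)

print(address_score(1))     # -> 1.0
print(address_score(10))    # -> 0.1
print(address_score(None))  # -> 0.0
```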
Transactional Search Analyzers
This analyzer is the first in the intended series of analyzers measuring the search quality for transactional queries, i.e. for queries with a narrow practical goal: to download some file, to listen to a specific piece of music, to buy tickets, to transfer money etc.
Every user who has ever tried to download anything from the Web, or to listen to or watch anything online, is familiar with the price we have to pay for obtaining free content: watching obtrusive ads, waiting for a full download while the download speed falls almost to zero, or even taking the risk of catching a computer virus from some suspicious site. To say nothing of the pangs of conscience: we all know that the content is put on the Web with little or no regard for copyright laws.
Official Software Search
Looking up a necessary program on the Web is a task most users have faced. However, the search engines frequently offer unofficial and sometimes quite untrustworthy sites with the program available for download. But can they do better when a free official version of what the user is after is available?
Data Freshness Analyzer
The degree of a result's relevance often depends not only on the content of the information retrieved, but also on its freshness. When the object of the search is a rapidly changing thing, such as news, schedules, forecasts etc., out-of-date information is worse than meaningless: it is harmful, since it may confuse and mislead the user.
This group's analyzers check the presence of valid or, vice versa, outdated results in the search engine output. (Needless to say, we have picked out query types for which this check makes sense: phone numbers and positions of executive managers.) As it happens, the results marked as "fresh" also tend to get outdated, so we regularly arrange additional tests of the analyzers' results. If you catch sight of some outdated marker, please do let us know!
The Indexation analyzer should be considered separately: it shows how long it takes for a new Web page to enter the search engines' indexes. Clearly, fast indexation is a necessary prerequisite for maintaining the freshness of search results.
Data Freshness Analyzer: Jobs
This analyzer is the first one in the series of analyzers which will estimate to which extent the search results are up to date.
For many queries it is important that a search engine does not give outdated results. For example, a user looking for "president of Zimbabwe" is most likely interested in finding out who the current president of Zimbabwe is. The user will not be satisfied if they only get information about the previous presidents.
Within this analyzer each query is associated with one or more up-to-date response markers as well as one or more outdated response markers. Any web page containing the up-to-date information increases the search engine's score in this analyzer. On the other hand, the score is decreased for any page which only contains outdated information. Results which are not recognized as containing either up-to-date or outdated information are not analyzed. Some of these unrecognized results may be irrelevant to the query, but our focus here is only on the actuality of the results.
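A hedged sketch of this marker-based check (plain substring matching is a simplification of ours; the real markers and matching rules are not described in the text):

```python
def freshness_score(pages, fresh_markers, stale_markers):
    """pages: list of page texts from one SERP. Pages with an up-to-date
    marker add a point, pages with only outdated markers subtract one,
    and pages matching neither kind of marker are not analyzed."""
    score = 0
    for text in pages:
        if any(m in text for m in fresh_markers):
            score += 1
        elif any(m in text for m in stale_markers):
            score -= 1
    return score

pages = ["the current president is X",
         "former president Y resigned",
         "unrelated page"]
print(freshness_score(pages, ["president is X"], ["president Y"]))  # -> 0
```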
We plan to develop new analyzers which will assess actuality of other types of information: recent news, pricing information for everyday goods as well as for stocks and currencies, last-minute deals etc.
Data Freshness: Phone Numbers
Continuing the series of data freshness analyzers, this one estimates the quality of phone numbers' search.
The growing number of outdated phone numbers in search results is mostly due to the mass renumbering of phones in Russia. Up-to-date company contacts are undoubtedly very important both for the clients and for the staff. Timely measures taken by search engines save firms from losing new clients and spare customers the irritation caused by an unsuccessful search.
The queries in the analyzer are the names of companies that have recently changed their contact details. Any web page containing the up-to-date information increases the search engine's score in this analyzer; conversely, the score is decreased for any page which only contains outdated information. The pages which contain neither old nor updated information are left out of account. Cases where the dialling code matters for the evaluation (for instance, 8 (499) xxx-xx-xx in Moscow) are also taken into account.
As the quality coefficient we use the share of web pages with current phone numbers.
Recall & Diversity Analyzers
As the web search is used for every possible need, it is vital that a search engine be able to find meaningful answers to most diverse questions. This ability depends both on the search engine's scope and on its interpreting potential.
These analyzers reflect the two above-mentioned aspects of the search engines' functioning, the quantity of the output and its diversity (when allowed by the non-strict sense of the query). To keep the results valid, we regularly put to test our set of queries.
It is worth noting that since the information that cannot be retrieved by means of a search engine is in some sense unattainable, the relative values in these analyzers are in no way less important than the absolute ones.
Recall analyzer
The recall analyzer estimates the relative size of indices of the Internet search engines.
The total number of documents indexed, as reported by the SE itself, cannot be used for comparison, because different SEs have different methods of document counting. For example, some of them include duplicate documents in the count while others do not. Counting duplicate documents may double the reported index size or increase it even more.
Additionally, the size of the index is a very PR-sensitive issue as it is one of the very few simple notions in SE area easily understandable by journalists. This means that the bigger index database size you report, the better press you get.
The number of documents found for a particular query does not always reflect the real number of documents indexed by a SE. Almost every frequent query will return tens of thousands results in all search engines. But the user will never be allowed to see them all: the search session will be interrupted after browsing through first hundreds of pages. Thus the exact number of web pages found can be verified only in the case when the number of possible results is very small, that is for queries containing very rare words.
For multiple-word queries, certain SEs show in the search results not only the documents in which all the words comprising the query are found, but also documents containing single words from the query. These "tail" documents are usually irrelevant to the query, but counting them can increase the total number of pages found.
In order to obtain independent and reliable data on the relative index size of the popular SEs, we developed a simple automatic method based on a set of sample queries. We gathered a set of very rare words, each of which occurs several tens of times on the web. Once a day, we count how many of these occurrences are found by each search engine.
To make the data steadier, we use a different set of sample queries from the whole query pool every day.
The set of sample queries is constantly replenished by our linguists. If you have some rare words and want to help us cover the 'faraway' areas of the Net, please send us these words, and we will consider including them into the sample queries list.
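A minimal sketch of the recall estimate, under the assumption that the number of known occurrences of each rare word has been established beforehand (the fetching of search results is omitted; all names and figures are illustrative):

```python
def recall_estimate(found, known_occurrences):
    """Relative index size per engine: the share of known occurrences
    retrieved, averaged over the sample words.
    found: engine -> {word -> hits reported}."""
    scores = {}
    for engine, hits in found.items():
        ratios = [min(hits.get(w, 0), n) / n       # cap at the known count
                  for w, n in known_occurrences.items()]
        scores[engine] = sum(ratios) / len(ratios)
    return scores

known = {"rareword1": 40, "rareword2": 20}
found = {"engine_a": {"rareword1": 30, "rareword2": 20},
         "engine_b": {"rareword1": 10}}
print(recall_estimate(found, known))
```

Only the relative values matter here, which is why the same word set is submitted to every engine on the same day.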
Analyzer of Subject Search
A human being is often able to interpret a search query, determine what the user wants, evaluate the information on the Web and form the search results better than a machine. For this reason, the results formed by an expert are always better than those of an algorithm.
This analyzer monitors daily the search results for a set of queries, for which the corresponding links have been selected by experts. It calculates the number of sites found by each search engine that were on the experts' list. For each query, the percentage of similarity between algorithmic and expert results is calculated.
As an expert opinion, the output of the expert system Neuron is used. The aggregate indicator of this analyzer is the proportion of results matching the expert opinion, regardless of the position of the matching site(s).
The best search engine is the one with highest value of aggregate indicator. In the informer, the search engines are sorted by the aggregate indicator.
Currently, 18 queries are evaluated. The number of the queries will be increased.
Analyzer of Ambiguous Queries
When Russian users search for «белки», do they mean the furry animals or the proteins? There are about a dozen universities which can be referred to as "КГУ", so is the user entering this query looking for Kazan State, Kursk State or some other educational institution which happens to have the same abbreviated name?
In the case of an ambiguous query, the user could have had in mind any of the "readings". Therefore, instead of showing the results based on the most frequent interpretation, it seems to be the best bet for a search engine to include the results corresponding to all the different readings. This way the users may at least realize that their query is ambiguous which in turn will help them to make the query more precise.
This analyzer estimates the percentage of possible interpretations of the ambiguous queries which are reflected in the search engines' results.
Most of the marker queries contain just one word, and the possible interpretations usually do not include the numerous businesses bearing a name identical to the query: we only included very famous companies, those to which a user could plausibly refer with a one-word query.
Local Search Analyzers
One of the factors influencing web search is the location from which the search is conducted. It is not of overriding importance, but quite often the set of answers to a query will differ according to the specific region.
Here we inquire into the quality of search conducted from different cities of Russia. The analyzer's queries are entered by servers located in ten cities as far apart as Krasnodar and Vladivostok. The results are then tested to see if they are region-appropriate.
To fill such an analyzer you have to sort out queries very carefully. For example, it is possible that a user looking for a "North Star" is really keen to find out anything about a restaurant three blocks away, but it is more likely that he searches for some information concerning the celestial body. On the contrary, "train schedule" in most cases is a region-dependent query. To make an unbiased evaluation of the local search quality we only use such queries as must be answered on the basis of local data.
Analyzer of Regional Search
When a user from Ufa or Novosibirsk enters the query "pizza delivery", he or she is most likely looking to get some pizza. The websites of companies offering pizza delivery in Moscow would probably be of no use to the user since it does not make sense to deliver pizza from Moscow to Novosibirsk.
In this analyzer, we query the search engines from different cities across Russia. We select the search engine results whose title, snippet or URL clearly mentions the location from which the query was sent. We only evaluate the information available as part of the SERPs, because this is all that regional users see when deciding whether to follow a link to a given webpage found in the SE results.
The percentage of regional responses, averaged over all cities except Moscow, is taken as a good estimate of how friendly a search engine is to regional users.
This analyzer embraces both queries that clearly ask for a regional service ("order glasses", "chinese food delivery") and queries that imply a search for local information ("drug store prices", "medical exam when pregnant").
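The SERP-level check described above can be sketched as follows. This is a minimal illustration: the real analyzer's matching rules (region-name variants, morphology) are not published, so the marker list and field names are assumptions.

```python
def mentions_region(result, markers):
    """True if the title, snippet or URL of a result mentions the query's
    region. `markers` holds region-name variants (an assumed simplification)."""
    text = " ".join(result.get(k, "") for k in ("title", "snippet", "url")).lower()
    return any(m.lower() in text for m in markers)

def regional_share(serp, markers):
    """Share of results on one SERP that clearly mention the region."""
    return sum(mentions_region(r, markers) for r in serp) / len(serp) if serp else 0.0

serp = [
    {"title": "Pizza delivery in Novosibirsk", "snippet": "", "url": "pizza-nsk.ru"},
    {"title": "Pizza delivery", "snippet": "Fast delivery across Moscow", "url": "mos-pizza.ru"},
]
share = regional_share(serp, ["Novosibirsk"])  # 0.5: one of two results is regional
```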
Analyzer of Regional Navigation
This analyzer is devoted to situations where users are looking for a particular web page, but the relevant page varies depending on where the query is coming from. Navigational search in general is the search for a particular web page accomplished by queries such as [sberbank] or [gazeta ru].
If the users are looking for a particular web page, they often want a page which is specific to their current location. For example, the users from both Kazan and Moscow expect to find some page from afisha.ru when they input the query [afisha]. However, the user from Kazan probably will not be happy to see the pages of afisha.ru specific to Moscow.
We test the search engines on 100 queries submitted from different locations within Russia. The sets of queries used in each city are slightly different depending on whether the relevant regional pages exist. For each query, the analyzer records if the relevant regional website (or the regional part of the general website) was found.
For each search engine, we calculate the percentage of queries for which the relevant regional response appears in the top 10 results.
Analyzer of Loading Time
Users expect that search engines will give them results as fast as possible. Even small differences in the time it takes for the SERP to load may be a reason for a user to prefer one search engine over another.
This analyzer shows the time it takes for users in different Russian cities to get their search results. The SERP loading time data are collected at the same time as the Regional Search analyzer results, hence for the same cities and for the same queries.
Of course, the loading time is affected by multiple factors, such as the quality and overall load of the connection. However, if some search engine is slower to respond than the others on a substantial number of queries every day, that is a trend worth analyzing.
This analyzer also estimates the size of the SERPs. The data on SERP size can be seen by clicking on the corresponding link below the graph. Like most modern web browsers, we download the SERPs compressed.
Query Comprehension Analyzers
Of course, the query comprehension by a search engine is but a metaphor. Yet one eventually gets used to the search engines' "creativity" in the interpretation of queries: their search prompts, spelling corrections, their attempts at refining or improving the query, or re-ordering the answers according to the assumed object of inquiry etc.
At present, the analyzers in this group test such relatively primitive skills of the search engines as correcting typos, giving prompts, and expanding queries with synonyms. Meanwhile, some more refined search techniques (query interpretation, dealing with multiple-meaning queries and so on) can only be estimated indirectly, through the mistakes that arise when a query is misunderstood by the search engine. (The Search Engine Mistake Analyzers, several in number, are arranged in a distinct group.) But then, it is the occasional mistakes that most clearly demonstrate a SE's skill in processing queries. When the search results are correct, we just don't pay any attention to the search engine's "tricks". These tricks are only exposed through incongruent or funny output.
Analyzer of Correct Hints
Most search engines attempt to suggest a correct spelling for a query when a typo is suspected. The quality of such hints is an important contribution to the overall quality of the search. This analyzer looks for the correct hint in the search results for a query with a deliberate typo and counts how often the 'correct' query is contained in the hint.
The evaluation is based on the same set of queries containing typos that is used for the typo resistance analyzer. The more correct hints have been given, the higher is the search engine's index for this analyzer.
Typo Resistance Analyzer
Humans are not machines: they make mistakes. This includes mistakes made while typing in a search query: a typo, an adjacent key pressed by accident ("quety" instead of "query"), a doubled character or a missed one ("qury" or "queery"); after all, the user may type the word 'by ear' without knowing the correct spelling ("yandax" instead of "yandex").
In this case, the search engine can adhere to one of the following strategies:
1) no processing: search with exact spelling only
2) recognize the typo, but still search for the entered query with an additional hint: "perhaps you were looking for [correct spelling]?"
3) recognize the typo and search for the correct spelling immediately
Depending on the chosen strategy, the user either remains unaware of the fact that (s)he is mistaken, or notices it and makes an extra click (up to the user), or gets the correct results without ever noticing his own mistake.
This analyzer compares the search results of the "correct query" and several forms of its possible mistypings. The similarity of results to those of a "correct" query is evaluated.
Apart from deliberate typo correction, matches can arise in four cases:
1) accidentally
2) the page contains both the correct and the mistyped spelling
3) incorrect reaction of the engine's morphology (e.g., the unknown word "mushroomz", which is a typo of "mushrooms", is corrected to "mushroom")
4) promotion of the same websites for both correct and incorrect spellings of queries
All of these cases produce noise in this analyzer: an accidental match of results.
The similarity is evaluated in the same way as for the update analyzer but with a different set of queries.
The more matching results are registered, the higher the search engine's index for this analyzer. This index determines the order of search engines in the analyzer's informer.
In the future, rotation of typo query sets drawn from a wide array will be introduced.
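The similarity check at the heart of this analyzer can be sketched as a plain overlap of top-10 URL lists. This is an assumption for illustration: the exact similarity formula (shared with the update analyzer) is not published.

```python
def serp_similarity(correct_urls, typo_urls):
    """Fraction of the 'correct' query's top results that reappear in the
    results for the mistyped query. Plain top-10 overlap is one plausible
    sketch; the real formula may differ."""
    if not correct_urls:
        return 0.0
    return len(set(correct_urls) & set(typo_urls)) / len(correct_urls)

correct = ["a.ru", "b.ru", "c.ru", "d.ru"]
typo = ["a.ru", "x.ru", "c.ru", "y.ru"]
sim = serp_similarity(correct, typo)  # 0.5: two of four results match
```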
Analyzer of the Synonymous Queries
One and the same query can be formulated in many different ways. For example, queries like "how to deduce the address from the telephone number", "search for address by telephone number", and "find an address by telephone number" have the same meaning for the user; these are synonymous queries.
A number of different practices lead to synonymous queries. Among them:
- using well-known abbreviations: "weather in Saint Petersburg" and "weather in SpB";
- using transliteration: "gazeta ru website" and "газета ру website";
- varying word order: "repair automatic transmission" and "automatic transmission repair".
In most of these cases, the users expect to get information related to the meaning of the query, not to its particular formulation. We can take this to imply that the search engines should yield identical results in response to synonymous queries, despite the differences in query formulation.
The analyzer of the synonymous queries assesses the (presumably unnecessary) variability of responses to synonymous queries. The search engines are sorted by the amount of such variability.
All the queries used in this analyzer are real: they were acquired from Rambler's query statistics tool (http://adstat.rambler.ru/wrds/). Note also that we exclude queries containing grammatical or spelling errors from this analyzer.
Analyzers of Search Engine Mistakes
Search engine mistakes are the negative side of its ability to refine and to understand the query (discussed in the Query Comprehension section). The queries in this group were sought out to demonstrate such mistakes in a most vivid and obvious manner. They might seem exotic, but every active search engine user knows that such mishaps tend to occur quite regularly.
When your favourite search engine makes a fool of itself, your emotions may vary between laughter and utter rage, but in any case your loyalty to the search engine suffers a crack. The user who is only mildly content when a search engine functions as it should feels rightly indignant when he sees a blunder. The resulting irritation is a risk factor that should not be underrated by the developers.
Query Substitution Analyzer
The Query Substitution Analyzer is the first in a new series of analyzers demonstrating mistakes in the functioning of search engines. The words chosen for queries are correct, existing, meaningful words, but as they are rather infrequent, search engines often do not take up the challenge of finding them. Instead, they offer the searcher output based on some other query which more or less resembles the original one but, it would seem, is asked more often.
As a result, the query is not answered. What's more, the searcher gets mildly insulted, for some search engines spice their wrong results with a prompt remark like 'We have corrected the misprint', presuming the searcher does not know the correct spelling of the word he is looking for.
The analyzer calculates the percentage of wrong search results (not containing any form of the original query). Latin transliteration is counted as a correct result whenever it is possible.
Two additional parameters counted are: 1) the percentage of automatic query substitution (the search engine actually treats the query as a misprint, which it is not); 2) the percentage of wrong suggestions, like: 'you were probably looking for...'. Suggestions are a less rude form of interference with the searcher's wishes. Still, when someone suspects that your search for 'dyery' was probably made under the spell of insanity and that your real purpose was to find a 'diary', it is bound to drive you nuts.
Far be it from us to claim that all query corrections and suggestions are an unqualified evil. There are many cases when they really serve their purpose (as you can see in our Typo Resistance Analyzer). But search developers have to be extremely careful when applying such strong means, for their misuse might harm the search engine's image.
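The main indicator described above, the percentage of results not containing any form of the original query, can be sketched as follows. The prefix "stemming" is a deliberately naive stand-in for real morphological matching, whose details are not published.

```python
def contains_query_form(snippet, query, stem_len=5):
    """Very naive morphology: treat a fixed-length prefix as the 'stem'
    of a one-word query and look for it at the start of snippet words."""
    stem = query.lower()[:stem_len]
    return any(w.lower().startswith(stem) for w in snippet.split())

def substitution_rate(snippets, query):
    """Share of results whose snippet contains no form of the query,
    the analyzer's main indicator."""
    if not snippets:
        return 0.0
    return sum(not contains_query_form(s, query) for s in snippets) / len(snippets)

rate = substitution_rate(
    ["buy a diary online cheap", "dyery and glazes for ceramics"], "dyery")
# rate == 0.5: the first snippet answers the substituted query instead
```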
Phrasal Queries Substitution Analyzer
Next in the series of Search Engine Mistakes is the Phrase Substitution Analyzer. As in the Query Substitution Analyzer, the queries here are existing and meaningful, but infrequent, language units. They are just longer units: phrases instead of words. So, one would think, they must be found more easily, since the meaning of the second word specifies the meaning of the first, rare one. In this case any replacement, or even a suggestion, seems absolutely senseless. It will prevent the user from finding what he is looking for and will probably irritate him. That is why we opted for checking this parameter of search engine functioning.
The analyzer calculates the percentage of 'bad' snippets, i.e. snippets that do not contain the words of the query, because this is often a sign that the query words have been replaced. In some cases it has proved useful to check the snippets for only part of the query; this was done when insisting on both words could distort the control procedure. Synonyms and transliterations are counted as correct responses whenever possible.
Two additional metrics are the percentage of explicit query replacement and the percentage of replacement accompanied by a suggestion.
Names Search Analyzer
Trivial as it seems to look for some information about a person by entering his or her name, it often proves tricky: instead of finding the person needed, we stumble upon his famous namesakes.
One of the most persistent search engine mistakes is to mix up names and surnames (like splitting 'Richard Shakespeare' into 'Richard III' and 'William Shakespeare'), so the queries in this analyzer were chosen with a view to making such results more probable. We only used the names of real people we found on the web, but their surnames either coincided with those of hot media personalities or were in some other way non-standard and difficult for a search engine to find.
The analyzer automatically divides the names it finds in the snippets and titles of the first output page into hits and misses, according to whether the right person was found. Then we perform a manual check. It is necessary for disentangling several complicated situations, for instance when the name and the surname are glued together at random (which would look strange in English, but can easily happen in Russian with its freer word order). The results with name initials are also checked manually, so that the context might help to decide upon the appropriate answer.
The resulting number constitutes the average ratio of correctly found pages to the overall number of pages.
Unbreakable Queries Analyzer
Search queries most often consist not of a single word, but of word chains or word phrases. Sometimes the relation between the words in such a chain is loose enough, but there are cases when the phrase is actually indivisible, expressing a single notion through two or more words, e.g. 'bad bishop' (a chess term that clearly shouldn't be echoed by 'America's worst bishops' in the output). In such cases it is the task of the search engine to preserve the wholeness of the phrase, at least in the top results. Otherwise, the meaning of the query gets distorted.
The unbreakability analyzer continues the series of search engine mistakes analyzers. It measures the percentage of cases when a search engine does not destroy the meaning of an unbreakable query by separating its words. For every query we check the presence of the searched phrase in the output. In a minority of cases the phrase can be split by a very limited number of words (like 'Black Volta', which can appear in 'Black and White Volta' without any change in meaning); such cases are counted as positive results. Likewise, when the searched words stick together accidentally, they are counted as positive if there is no punctuation mark between them.
As usual, the average number of positive results for all the queries appears as the search engine's overall result in the informer.
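The phrase check described above can be sketched with a regular expression: the phrase words must occur in order, separated by at most a few intervening words and no punctuation. The allowed gap size is our own assumption for illustration.

```python
import re

def phrase_kept(text, phrase, max_gap=2):
    """True if the words of an unbreakable phrase occur in order,
    separated by at most `max_gap` intervening words and no punctuation
    (so 'Black and White Volta' still counts for 'Black Volta')."""
    words = [re.escape(w) for w in phrase.lower().split()]
    # up to max_gap extra words, joined only by whitespace (no punctuation)
    gap = r"(?:\s+\w+){0,%d}\s+" % max_gap
    return re.search(gap.join(words), text.lower()) is not None
```

Punctuation between the words breaks the match, so "bad, bishop" is counted as a miss while "bad bishop" and "Black and White Volta" are hits.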
Grammar Analyzer
The analyzer of grammatical misinterpretation is of a certain linguistic elegance. It deals with cases when a specific word-form belongs to more than one paradigm, like 'miss' in 'Miss Juliet' vs. 'I miss you terribly'. Naturally, such situations are more intricate in languages with rich inflectional morphology, like Russian, where suffixes and endings increase the chance of there being a homonym to the word in question.
The task facing search engines is then to choose the more appropriate paradigm, that is, the more appropriate of the two meanings, and to adjust the output accordingly. The instrument of the correct choice is the context, however short: even in two-word queries the second word almost invariably narrows the field of possible meanings for the first one.
We have to admit that from this point of view the functioning of the investigated search engines has improved impressively over the last few years. Not so long ago the paradigms got confused so grossly that, in order to evaluate the output, you only had to cross off all the wrong word-forms. Nowadays, search engines tend not to deviate too much from the given word-form, so there is no choice but to base the checking procedure on the context. It also means the queries should become more sophisticated, or rarer, or both. However, all the queries we use are quite realistic, meaning the search engines should actually be concerned with the results of this analyzer.
It should be noted that the result of this analyzer, despite its belonging to the Mistakes group, shows the percentage of positive answers in the search engines' output.
Disturbing Content Analyzers
However well a search engine works, there are still small annoying things that can easily dampen the user's good spirits and significantly shake his loyalty to a specific SE. These include, for example, the danger of contracting a computer virus, the presence of irritating and obtrusive ads, etc.
Whereas the amount of advertisements or dangerous scripts on websites is not for the search engines to control, the concentration of objectionable content in the output is totally in their power. So, if the websites in the output abound with annoying factors, the search engine would only benefit from ranking such sites far below high-quality, safe ones.
Most analyzers of this group make use of specific techniques developed by "Ashmanov & Partners" for detecting ads, pornography, computer viruses, etc. To make the results more lucid, we sought out marker queries for which the probability of undesirable results is higher than usual.
Analyzer of Search Spam
At "Ashmanov and Partners" we study the phenomenon of search spam – the methods and technologies reducing the quality of search results and interfering with the operation of search engines.
Search spam is a text, URL, technology, program code or other web elements created by the web-master for the sole purpose of promoting the site in search engines' results, and not for a fast and reliable search based on complete and authentic information.
The experts check Top 10 results of search queries on a regular basis, marking the sites which, in their opinion, contain elements of search spam. The collected data is entered into the informer. It shows the percentage of sites marked as spam in the overall number of sites that appeared in Top 10 of analyzed queries.
The source of information on the spam status of a given URL is the data of the anti-spam lab of the company "Ashmanov and Partners". The following categories of search spam are used:
* doorway – definite spam: doorways, leading the user to other pages,
* spamcatalog – definite spam: spammer catalogues,
* spamcontent – definite spam: spammers' stolen content,
* pseudosite – definite spam: site disguised as corporate (pseudo-company),
* catalog – catalogues,
* board – bulletin boards,
* domainsale – domains for sale,
* secondary – secondary, stolen content,
* partner – any partner programs,
* linksite – link support site,
* spamforum – forum containing spam,
* techspam – technical spam,
* searchres – search results,
* cj – circular jerk.
An aggregate indicator is the share of spam sites in the search results. The best search engine has the lowest indicator. This determines the order of search engines in the informer of the analyzer.
Pornography Analyzer
This analyzer collects search results for ambiguous queries which may be interpreted as targeting a certain category of pornography, but also admit other interpretations. No queries which unambiguously indicate that the user is searching for porn are included.
For instance, the query [stockings] could come from a user looking for a stockings shop or for the corresponding category of pornography.
When search engines include pornographic results in responses to such queries, they run the risk of showing these results to users (potentially including minors) who were not looking for adult content.
We do not claim that it is «bad» or «immoral» for a search engine to return pornographic pages in response to unambiguously pornographic queries. However the appearance of adult results in responses to «regular» queries is a drawback in our opinion.
We make use of the 'Semantic Mirror' technology developed at 'Ashmanov and Partners' in order to detect adult content.
For every search engine, we calculate the percentage of pages with adult content among the top 10 results.
Analyzer of the SafeSearch Quality
SafeSearch is a filter that is supposed to block the «adult only» websites from appearing in the search results. Nowadays most search engines allow their users to filter the results in this way. But does SafeSearch do its job well enough?
This analyzer collects search results for ambiguous queries which may be interpreted as targeting a certain category of pornography, but also admit other interpretations. The question is whether SafeSearch effectively filters out the adult responses in such cases.
The set of queries in this analyzer is the same as in the analyzer of the presence of adult sites in search results. We submit these queries to the search engines with the SafeSearch option turned on wherever possible. In this case, the ideal results should contain no adult pages at all, nor any results with obscene words. If adult pages still show up in the search results, the search engine is not keeping its promise to protect the user with SafeSearch.
We make use of the Semantic Mirror technology developed at 'Ashmanov and Partners' in order to detect pornography. For every search engine, we calculate the percentage of pages with adult content among the top 10 results.
Obscene Language Analyzer
The analyzer checks for the presence and quantity of obscene words on the pages found by the search engines in response to neutral queries (those that do not contain obscenities and do not suppose the response to contain them). The default measurement is made with "Safe search" on, meaning the result should normally be zero; measurements with ordinary search options are also available.
There are two indicators in the analyzer. The simple indicator counts the pages found by the search engine that contain at least one obscene word in its regular or masked form (as the law does not make a distinction, provided the word is easily recognized).
The default indicator is more informative: it shows the quantity of obscene words on the pages found. Each word shown clearly is worth 3 points, while a masked one is worth 1. The points are summed over all the pages found to give the search engine's result for a specific query. The mean value over all queries then forms the analyzer's indicator for a given search engine.
The recognition of obscene words is performed by the automatic service "Semantic Mirror".
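The default indicator's arithmetic can be sketched directly from the description above (3 points per clearly shown word, 1 per masked word, summed per query, then averaged). The data shapes are our own illustration.

```python
def query_score(pages):
    """Sum the obscenity points over all pages found for one query.
    Each page is (clear_words, masked_words): 3 points per clearly
    shown obscene word, 1 point per masked one."""
    return sum(3 * clear + masked for clear, masked in pages)

def engine_indicator(per_query_pages):
    """Mean query score over all queries: the default indicator."""
    if not per_query_pages:
        return 0.0
    return sum(query_score(p) for p in per_query_pages) / len(per_query_pages)

# One query found a page with 1 clear word; another query found a page
# with 2 masked words plus a clean page:
indicator = engine_indicator([[(1, 0)], [(0, 2), (0, 0)]])  # (3 + 2) / 2 = 2.5
```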
Intrusive Ads Analyzer
Web pages with similar content may differ in the amount of ads. A search engine that favors pages with less intrusive ads is thus better for the user.
The analyzer is based on an ads recognition technology that assesses the scripts, iframes and other code on a page. Each page is assigned a total intrusiveness score: the sum of the intrusiveness scores of all individual ads on the page. For each SERP, a weighted average of the page scores is taken; the average intrusiveness over all SERPs is then the search engine's score in this analyzer.
The intrusiveness of an individual ad is calculated as follows:
- context ads and small simple banners receive an intrusiveness of 1;
- teasers and bigger, shinier banners receive 3;
- some ads have to be clicked on in order to be closed: these include clickunders (separate windows which open upon a click anywhere on a page) and scrolling banners which cover up part of a page; these ads receive an intrusiveness of 9;
- finally, some particularly nasty ads have to be clicked more than once in order to be closed; these receive an intrusiveness of 18.
For each web page found, the intrusiveness is adjusted according to the position of the page in the SERP. The weight of each result reflects the probability that the link will be clicked, and the weights of all positions average to 1 (position 1 has a weight of 1.5 and position 10 a weight of 0.4). Thus, if the sites with intrusive ads are randomly distributed along the SERP, the average intrusiveness is not affected by the positional weighting.
This analyzer looks at queries seeking popular songs or video content. These queries often yield pages overloaded with ads.
Note that web page assessment is based on html code of a web page while the visibility of ads on a particular page might depend on the user's browser and the cookies.
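The scoring pipeline above can be sketched as follows. Only the per-ad scores, the weight endpoints (1.5 and 0.4) and the average-to-1 property are stated in the text; the intermediate positional weights here (a renormalized linear ramp) and the ad-type labels are assumptions for illustration.

```python
# Intrusiveness per ad type, as listed above.
AD_SCORE = {"context": 1, "small_banner": 1, "teaser": 3, "big_banner": 3,
            "clickunder": 9, "multi_click": 18}

def page_intrusiveness(ads):
    """Total intrusiveness of a page: the sum over its ads."""
    return sum(AD_SCORE[a] for a in ads)

def position_weights(n=10):
    """Illustrative positional weights: a linear ramp from heavy (top)
    to light (bottom), renormalized to average exactly 1."""
    if n == 1:
        return [1.0]
    raw = [1.5 - (1.5 - 0.4) * i / (n - 1) for i in range(n)]
    mean = sum(raw) / n
    return [w / mean for w in raw]

def serp_intrusiveness(pages):
    """Positionally weighted average intrusiveness of one SERP, where
    `pages` lists the ads of each result, top result first."""
    weights = position_weights(len(pages))
    return sum(w * page_intrusiveness(ads)
               for w, ads in zip(weights, pages)) / len(pages)

# Ten identical pages with one context ad each: since the weights
# average to 1, the weighted average stays 1.
uniform = serp_intrusiveness([["context"]] * 10)
```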
Adult Content Ads Analyzer
Probably the most intrusive ads on the web are ads with adult content. The presence of such ads on a web page irritates users. Therefore a search engine which finds fewer web pages with pornographic ads is better for the user.
This analyzer has two tabs and both of them look at queries seeking popular songs or video content (these queries are also used in the Intrusive Ads Analyzer). These queries often yield pages overloaded with ads.
The analyzer is based on an ads recognition technology which assigns a total intrusiveness score to each page. In the first tab (“% of sites with porno-ads”) each search engine gets a score which is equal to the percentage of the pages found containing explicitly pornographic ads. Each page contributes to the search engine's score if it has at least one advertisement containing pornographic pictures. The ads which are not literally pornographic but can be considered blatant or improper are not taken into account in this tab. The scores of the search engines also do not reflect the amount, size and position of the pornographic ads on the pages found. However in most cases if the porno ads are present they occur in a prominent position on a page.
The other tab (“improper + porn”) shows the mean intrusiveness of all adult advertisements on the page. Here the amount and size of the ads, as well as the position of the website in the search results, are taken into account. For each ad, the intrusiveness is multiplied by a coefficient reflecting the status of this ad with respect to pornography. The following coefficients are used:
0 - an ad is not pornographic or adult in any way
0.5 - an ad sometimes contains some improper pictures, but sometimes these pictures are absent
1 - an ad always contains improper pictures
10 - an ad always contains explicitly pornographic pictures
The score of a particular web page is the sum of the adjusted porno-intrusiveness scores of all individual ads on the page. The intrusiveness itself depends on the size and shininess of an ad, as well as on whether the ad covers part of the page's content and requires the user to make a click; it is calculated as described in the Intrusive Ads Analyzer. Although the adjusted porno-intrusiveness is on average similar to the intrusiveness of the ads, these two measures cannot be compared directly.
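The per-page adjustment can be sketched by multiplying each ad's intrusiveness by the coefficients listed above and summing. The status labels are our own; the coefficients come from the text.

```python
# Pornography coefficients from the list above; status labels are ours.
PORN_COEF = {"not_adult": 0, "sometimes_improper": 0.5,
             "always_improper": 1, "explicit_porn": 10}

def page_porn_score(ads):
    """Sum of adjusted porno-intrusiveness over a page's ads.
    Each ad is a (intrusiveness, status) pair."""
    return sum(intr * PORN_COEF[status] for intr, status in ads)

# A big banner (intrusiveness 3) with explicit porn plus a harmless
# context ad (intrusiveness 1):
score = page_porn_score([(3, "explicit_porn"), (1, "not_adult")])  # 30
```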
Analyzer of Viruses and Malware
Search engines try to prevent their users from accessing infected or fraudulent web pages. How efficient is this security measure?
This analyzer shows the percentage of infected web pages in each search engine's results. Each web page is analyzed based on the sources of scripts and frames/iframes. In addition to viruses, the Russian web is overwhelmed with ads which are trying to fool people into downloading fraudulent files. These ads are marked as "fraudulent downloads" in the analyzer.
The number of threats on a particular web page is displayed, but it does not influence the overall score of a search engine: a page with 5 threats contributes as much as a page with 1 threat. Each web page is individually checked for viruses based on antivirus software reports, webmaster opinions and other sources.
By default, the analyzer takes into account all kinds of threats. The "% of sites with viruses" tab only takes into account the sites infected with viruses, the sites with fraudulent ads are excluded on that tab.
Some search engines warn their users that a particular page may threaten them. However, for the kind of queries used in this analyzer such warnings may not be sufficient. This analyzer inspects queries devoted to downloading, listening to or watching music and video clips - the same queries as in the Intrusive Ads Analyzer. These queries yield thousands of result pages with roughly the same content. In such a situation it would seem reasonable not just to warn users about a possible threat, but to lower infected websites in the search results so that an uninfected site is displayed in a high position instead. The data in this analyzer can be displayed based on all the threats (the "All" tab) or, alternatively, based only on the pages for which no warning is shown on the SERP (the "unmarked on SERP" tab).
Analyzers of Additional Search Features
The object of the user's immediate interest is the set of answers he receives to his query. Nevertheless, his overall impression may also be affected by smaller technical factors not immediately discernible on the SERP.
These analyzers are meant to compare some additional parameters of the search engines' functioning that may be of considerable interest to the user. It is worth noting that comparative values matter more here than absolute numbers.
Analyzer of Indexation Time
How long does it take for the newly created pages to start showing up in search results?
We selected about two thousand websites from the Russian Internet. All of these websites inform search engines about updates using a special sitemap file (sitemap.xml). All of these websites are generally indexed by the search engines: random pages from them are almost always found among the first 10 hits in response to queries identical to the titles of the relevant pages (the text within <title>). So how long does it take for newly created pages to start appearing among the top 10 results for corresponding queries?
We scan the sitemap files for new pages every day (each website may contribute up to 3 pages per day). Search engines are of course able to do the same - after all this is the purpose of the sitemap files.
For every new page, we record the title and the URL. Within a month from the date when the new page was detected, we check if this page is found in top 10 in response to two types of queries:
a) a query consisting of the title text, with the search restricted to the relevant website;
b) a regular text query using the title.
The first type of query assesses the speed of indexation; the second assesses the time it takes for the new pages to appear in regular search results.
The score in this analyzer is calculated as the mean visibility of the new pages within the last month. We also report the data for smaller periods.
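The score computation can be sketched as follows (a simplified illustration, not the analyzer's actual code; the observation format and the 30-day window are our own assumptions):

```python
from datetime import date, timedelta

def mean_visibility(observations, today, window_days=30):
    """Share of the daily checks within the last month in which a newly
    detected page was present among the top 10 results.
    `observations` is a list of (check_date, found_in_top10) pairs."""
    start = today - timedelta(days=window_days)
    recent = [found for day, found in observations if start <= day <= today]
    if not recent:
        return 0.0
    return sum(recent) / len(recent)

# A page found on 3 of its 4 daily checks scores 0.75.
obs = [(date(2010, 5, d), d != 2) for d in range(1, 5)]
print(mean_visibility(obs, date(2010, 5, 4)))  # -> 0.75
```

Averaging these per-page values over all new pages gives the overall analyzer score.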
Interestingly, there seem to be pages which appear in the search results in the first couple of days after they were detected but disappear by the end of the first week.
For many queries it is important that a search engine does not give outdated results. For example, a user looking for "president of Zimbabwe" is most likely to be interested in finding out who is the current president of Zimbabwe. The user will not be satisfied if they only get the information about the previous presidents.
We plan to develop new analyzers which will assess actuality of other types of information: recent news, pricing information for everyday goods as well as for stocks and currencies, last-minute deals etc.
Continuing the series of data freshness analyzers, this one estimates the quality of phone number search.
The growing share of outdated numbers in search results is mostly driven by the mass renumbering of phones in Russia. Up-to-date company contacts are undoubtedly very important both for the clients and for the staff. Timely measures taken by search engines prevent firms from losing new clients and spare customers the irritation caused by an unsuccessful search.
The queries in the analyzer are the names of companies that have recently changed their contact details. Any web page containing the up-to-date information increases the search engine's score in this analyzer; conversely, the score is decreased for any page which only contains outdated information. Pages which contain neither the old nor the updated information are left out of account. Cases where the dialling code is significant for the evaluation (for instance, 8 (499) xxx-xx-xx in Moscow) are also taken into account.
As the quality score we use the share of web pages with up-to-date phone numbers.
Recall & Diversity Analyzers
As web search is used for every possible need, it is vital that a search engine be able to find meaningful answers to the most diverse questions. This ability depends both on the search engine's scope and on its interpreting potential.
These analyzers reflect the two above-mentioned aspects of search engine functioning: the quantity of the output and its diversity (where the loose sense of the query allows for it). To keep the results valid, we regularly put our set of queries to the test.
It is worth noting that since the information that cannot be retrieved by means of a search engine is in some sense unattainable, the relative values in these analyzers are in no way less important than the absolute ones.
Recall analyzer
The recall analyzer estimates the relative size of indices of the Internet search engines.
The total number of documents indexed, as reported by the SE itself, cannot be used for comparison because different SEs count documents differently. For example, some of them include duplicate documents in the count while others do not. Counting duplicate documents may double the reported index size or increase it even more.
Additionally, the size of the index is a very PR-sensitive issue, as it is one of the very few simple notions in the SE area that are easily understood by journalists. This means that the bigger the index size you report, the better press you get.
The number of documents found for a particular query does not always reflect the real number of documents indexed by a SE. Almost every frequent query will return tens of thousands of results in all search engines. But the user will never be allowed to see them all: the search session will be interrupted after browsing through the first few hundred results. Thus the exact number of web pages found can be verified only when the number of possible results is very small, that is, for queries containing very rare words.
For multiple-word queries, certain SEs show in the search results not only the documents where all the words comprising the query are found, but also documents containing single words from the query. These "tail" documents are usually irrelevant to the query, but counting them can inflate the total number of pages found.
In order to obtain independent and reliable data on the relative index size of the popular SEs, we developed a simple automatic method, based on a set of sample queries. We gathered a set of very rare words, all of which occur several tens of times on the web. Once a day, we count how many of these occurrences are found by each search engine.
To make the data steadier, we use a different set of sample queries from the whole query pool every day.
The set of sample queries is constantly replenished by our linguists. If you have some rare words and want to help us cover the 'faraway' areas of the Net, please send us these words, and we will consider including them into the sample queries list.
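The daily counting step might look roughly like this (a hypothetical sketch; the data layout, engine names and word counts are invented for illustration):

```python
def recall_score(found_counts, sample):
    """Sum, per engine, of the reported occurrences of today's sample
    of rare words; used as a relative index-size estimate.
    `found_counts` maps (engine, word) -> number of results found."""
    totals = {}
    for (engine, word), n in found_counts.items():
        if word in sample:
            totals[engine] = totals.get(engine, 0) + n
    return totals

counts = {
    ("A", "rareword1"): 12, ("A", "rareword2"): 5,
    ("B", "rareword1"): 9,  ("B", "rareword2"): 7,
}
print(recall_score(counts, {"rareword1", "rareword2"}))
# -> {'A': 17, 'B': 16}
```

Because the words are rare, each reported count can actually be verified against the visible results, which is what makes the comparison reliable.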
Analyzer of subject search
A human being is often able to interpret a search query, determine what the user wants, evaluate the information on the Web and form the search results better than a machine. For this reason, the results formed by an expert are always better than those of an algorithm.
This analyzer monitors daily the search results for a set of queries, for which the corresponding links have been selected by experts. It calculates the number of sites found by each search engine that were on the experts' list. For each query, the percentage of similarity between algorithmic and expert results is calculated.
The output of the expert system Neuron is used as the expert opinion. The aggregate indicator of this analyzer is the proportion of results matching the expert opinion, regardless of the position of the matching site(s).
The best search engine is the one with the highest value of the aggregate indicator. In the informer, the search engines are sorted by the aggregate indicator.
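The aggregate indicator can be sketched as a simple, position-independent overlap (an illustration under our own assumptions; the URLs are hypothetical):

```python
def expert_agreement(serp_urls, expert_urls):
    """Share of an engine's results that appear on the experts' list,
    regardless of their position on the results page."""
    if not serp_urls:
        return 0.0
    return sum(1 for url in serp_urls if url in expert_urls) / len(serp_urls)

experts = {"site-a.ru", "site-b.ru", "site-c.ru"}
serp = ["site-a.ru", "site-x.ru", "site-c.ru", "site-y.ru"]
print(expert_agreement(serp, experts))  # -> 0.5
```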
Currently, 18 queries are evaluated. The number of the queries will be increased.
Analyzer of Ambiguous Queries
When Russian users search for «белки» do they mean the furry animals or the proteins? There are about a dozen universities which can be referred to as "КГУ", so is the user entering this query looking for Kazan State, Kursk State or any other educational institution which happens to have the same abbreviated name?
In the case of an ambiguous query, the user could have had in mind any of the "readings". Therefore, instead of showing the results based on the most frequent interpretation, it seems to be the best bet for a search engine to include the results corresponding to all the different readings. This way the users may at least realize that their query is ambiguous which in turn will help them to make the query more precise.
This analyzer estimates the percentage of possible interpretations of the ambiguous queries which are reflected in the search engines' results.
Most of the marker queries contain just one word, and the possible interpretations usually do not include the numerous businesses bearing the name which equals our query. We only included the very famous companies, to which a user could potentially refer with a one-word query.
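The coverage estimate can be illustrated as follows (our own sketch; the reading labels are hypothetical, and the actual analyzer relies on manually prepared lists of interpretations):

```python
def interpretation_coverage(result_readings, known_readings):
    """Percentage of an ambiguous query's known readings that are
    represented somewhere in the engine's top results."""
    if not known_readings:
        return 0.0
    covered = set(result_readings) & set(known_readings)
    return 100.0 * len(covered) / len(known_readings)

# Results for an ambiguous query covered 2 of 3 hypothetical readings.
score = interpretation_coverage(
    ["animal", "protein", "animal"],
    {"animal", "protein", "place-name"})
print(round(score, 1))  # -> 66.7
```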
Local Search Analyzers
One of the factors influencing web search is the location where the search is conducted. It is not always decisive, but quite often the set of answers to a query will differ depending on the specific region.
Here we inquire into the quality of search conducted from different cities of Russia. The analyzer's queries are entered by servers located in ten cities as far apart as Krasnodar and Vladivostok. The results are then tested to see if they are region-appropriate.
To populate such an analyzer, the queries have to be sorted out very carefully. For example, it is possible that a user looking for "North Star" is really keen to find out anything about a restaurant three blocks away, but it is more likely that he searches for information concerning the celestial body. On the contrary, "train schedule" is in most cases a region-dependent query. To make an unbiased evaluation of local search quality, we only use queries that must be answered on the basis of local data.
Analyzer of Regional Search
When a user from Ufa or Novosibirsk enters the query "pizza delivery", he or she is most likely looking to get some pizza. The websites of companies offering pizza delivery in Moscow would probably be of no use to the user since it does not make sense to deliver pizza from Moscow to Novosibirsk.
In this analyzer, we query the search engines from different cities across Russia. We select the search engine results for which the title, snippet or URL clearly mentions the location from which the query was sent. We only evaluate the information available as part of the SERPs, because this is all that regional users see when they decide whether to follow a link to a given webpage found in SE results.
The percentage of such region-appropriate responses, averaged over all cities except Moscow, is taken to be a good estimate of how friendly the search engine is to regional users.
This analyzer embraces both the queries which are clearly asking for a regional service ("order glasses", "chinese food delivery") and the queries which imply a search for some local information ("drug store prices", "medical exam when pregnant").
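The mention check on the SERP fields can be sketched like this (a simplified illustration; the result fields and city-name variants are our own assumptions, and the real analyzer presumably handles Russian morphology as well):

```python
def mentions_region(result, city_forms):
    """True if the title, snippet or URL of a result clearly mentions
    the city the query was sent from (any of its spelling variants)."""
    haystack = " ".join(
        [result["title"], result["snippet"], result["url"]]).lower()
    return any(form.lower() in haystack for form in city_forms)

r = {"title": "Pizza delivery in Novosibirsk",
     "snippet": "Fast delivery across the city",
     "url": "http://example.ru/nsk"}
print(mentions_region(r, ["Novosibirsk", "novosibirsk.ru"]))  # -> True
```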
Analyzer of Regional Navigation
This analyzer is devoted to the situations where the users are looking for a particular web page, but the relevant web page varies depending on where the query is coming from. Navigational search in general is the search for a particular web page accomplished by queries such as [sberbank] or [gazeta ru].
If the users are looking for a particular web page, they often want a page which is specific to their current location. For example, the users from both Kazan and Moscow expect to find some page from afisha.ru when they input the query [afisha]. However, the user from Kazan probably will not be happy to see the pages of afisha.ru specific to Moscow.
We test the search engines on 100 queries submitted from different locations within Russia. The sets of queries used in each city are slightly different depending on whether the relevant regional pages exist. For each query, the analyzer records if the relevant regional website (or the regional part of the general website) was found.
For each search engine we calculate the percentage of queries for which the relevant regional response appears in top 10 results.
Analyzer of Loading Time
The users expect that the search engines will give them results as fast as possible. Even small differences in the time it takes for the SERP to load may be a reason for a user to prefer one search engine over another.
This analyzer shows the time it takes for users in different Russian cities to get their search results. The SERP loading time data are collected at the same time as the Regional Search analyzer results, hence for the same cities and for the same queries.
Of course the loading time is affected by multiple factors such as the quality and overall load of the connection. However, if some search engine is slower to respond than the others on a substantial number of queries every day, this is a tendency worth analyzing.
This analyzer also estimates the size of the SERPs. The data on SERP size can be seen by clicking on the corresponding link below the graph. Like most modern web browsers, we download the SERPs compressed.
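The measurement can be illustrated as follows (a sketch under our own assumptions: the fetcher, the page contents and the URL are hypothetical, and a real run would issue an HTTP request with an Accept-Encoding: gzip header):

```python
import gzip
import time

def timed_fetch(fetch, url):
    """Measure SERP loading time plus compressed and uncompressed size.
    `fetch` performs the request and returns the raw, still-compressed
    response body, exactly as a browser would receive it."""
    start = time.monotonic()
    raw = fetch(url)                      # compressed bytes over the wire
    elapsed = time.monotonic() - start
    body = gzip.decompress(raw)           # full SERP markup
    return elapsed, len(raw), len(body)

# Stand-in fetcher for illustration; a real one would use urllib or similar.
page = b"<html>" + b"result " * 1000 + b"</html>"
fake_fetch = lambda url: gzip.compress(page)
elapsed, compressed, full = timed_fetch(
    fake_fetch, "http://example.ru/search?q=test")
print(compressed < full)  # repetitive SERP markup compresses well -> True
```

Downloading the compressed body keeps the timing comparable to what an ordinary browser user experiences.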
Query Comprehension Analyzers
Of course, the query comprehension by a search engine is but a metaphor. Yet one eventually gets used to the search engines' "creativity" in the interpretation of queries: their search prompts, spelling corrections, their attempts at refining or improving the query, or re-ordering the answers according to the assumed object of inquiry etc.
At present, the analyzers in this group test such relatively primitive skills of the search engines, as correcting typos, giving prompts, expanding queries by synonyms. Meanwhile, some more refined search techniques (query interpretation, dealing with multiple meaning queries and so on) are to be estimated only indirectly, by mistakes arising when the query is misunderstood by the search engine. (Search Engine Mistake Analyzers, several in number, are arranged in a distinct group.) But then, it's the occasional mistakes that most clearly demonstrate a SE's skill in processing the queries. When the search results are correct, we just don't pay any attention to the search engine's "tricks". These tricks are only exposed through incongruent or funny output.
Analyzer of Correct Hints
Most of the search engines attempt to suggest a correct spelling for a query in case a typo is suspected. The quality of such hints is an important addition to the overall quality of the search. This analyzer looks for the correct hint in the search results for a query with a deliberate typo and estimates the number of occurrences of a 'correct' query contained in the hint.
The evaluation is based on the same set of queries containing typos that is used for the typo resistance analyzer. The more correct hints have been given, the higher is the search engine's index for this analyzer.
Typo Resistance Analyzer
Humans are not machines; they make mistakes. This includes mistakes while typing in a search query: a typo, an adjacent key pressed by accident ("quety" instead of "query"), a missed or doubled character ("qury" or "queery"), or, finally, a word typed 'by ear' without knowing the correct spelling ("yandax" instead of "yandex").
In this case, the search engine can adhere to one of the following strategies:
1) no processing: search with exact spelling only
2) recognize the typo, but still search for the entered query with an additional hint: "perhaps you were looking for [correct spelling]?"
3) recognize the typo and search for the correct spelling immediately
Depending on the chosen strategy, the user either remains unaware of the fact that (s)he is mistaken, or notices it and makes an extra click (the choice is left to the user), or gets the correct results without ever noticing his own mistake.
This analyzer compares the search results of the "correct query" and several forms of its possible mistypings. The similarity of results to those of a "correct" query is evaluated.
Apart from deliberate typo correction, matches can arise in four cases:
1) accidentally
2) the page contains both the correct and mistyped spelling
3) incorrect reaction of the engine's morphology (e.g., the unknown word "mushroomz", a typo for "mushrooms", is corrected to "mushroom")
4) promotion of the same websites for both correct and incorrect spellings of queries
All of these cases produce noise in this analyzer: an accidental match of results.
The similarity is evaluated in the same way as for the update analyzer but with a different set of queries.
The more matching results are registered, the higher is the index of the search engine for this analyzer. This determines the order of search engines in the informer of the analyzer.
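As a rough illustration (our own simplified sketch, not the analyzer's exact formula), the similarity between the output for the correct query and for a mistyped form can be computed as the overlap of the two top-10 URL lists:

```python
def result_similarity(correct_urls, typo_urls, top_n=10):
    """Share of the correct query's top results that also appear
    among the top results for the mistyped query."""
    a = list(correct_urls)[:top_n]
    b = set(list(typo_urls)[:top_n])
    if not a:
        return 0.0
    return sum(1 for url in a if url in b) / len(a)

# Two of the four "correct" results also show up for the typo.
print(result_similarity(["u1", "u2", "u3", "u4"], ["u2", "u4", "u5"]))  # -> 0.5
```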
In future, a rotation of query sets with typos from a wide array will be introduced.
Analyzer of the Synonymous Queries
One and the same query can be formulated in many different ways. For example, the queries like "how to deduce the address from the telephone number", "search for address by telephone number", and "find an address by telephone number" have the same meaning for the user, these are synonymous queries.
A number of different practices leads to synonymous queries. Among them:
- using well-known abbreviations: "weather in Saint Petersburg" and "weather in SPb";
- using transliteration – "gazeta ru website" and "газета ру website";
- varying word order - "repair automatic transmission" and "automatic transmission repair"
In most of these cases, the users expect to get information related to the meaning of the query, not to its particular formulation. We can take this to imply that the search engines should yield identical results in response to synonymous queries, despite the differences in query formulation.
The analyzer of the synonymous queries assesses the (presumably unnecessary) variability of responses to synonymous queries. The search engines are sorted by the amount of such variability.
All the queries used in this analyzer are real, they were acquired from Rambler's query statistics tool (http://adstat.rambler.ru/wrds/). Note also that we are excluding the queries containing grammatical or spelling errors from this analyzer.
Analyzers of Search Engine Mistakes
Search engine mistakes are the negative side of its ability to refine and to understand the query (discussed in the Query Comprehension section). The queries in this group were sought out to demonstrate such mistakes in a most vivid and obvious manner. They might seem exotic, but every active search engine user knows that such mishaps tend to occur quite regularly.
When your favourite search engine makes a fool of itself, your emotions may vary between laughter and utter rage, but in any case your loyalty to the search engine suffers a crack. The user who is only mildly content when a search engine functions as it should feels downright indignant when he sees a blunder. The unavoidable irritation is a risk factor that should not be underrated by the developers.
Query Substitution Analyzer
Query substitution analyzer is the first one in the new series of analyzers demonstrating mistakes in the functioning of search engines. The words chosen for queries are correct, existing, meaningful words, but as they are rather infrequent, the search engines often don't take the challenge of finding them. Instead, they offer the searcher an output based on some other query which can more or less resemble the original one, but, as it would seem, is asked more often.
As a result, the query is not answered. What's more, the searcher gets mildly insulted, for some search engines spice their wrong results by a prompt remark, like: 'We have corrected the misprint', presuming the searcher does not know the correct spelling of the word he is looking for.
The analyzer calculates the percentage of wrong search results (not containing any form of the original query). Latin transliteration is counted as a correct result whenever it is possible.
Two additional parameters counted are: 1) the percentage of automatic query substitution (the search engine actually treats the query as a misprint, which is not the case); 2) the percentage of wrong suggests, like: 'you were probably looking for...'. Suggests are a less rude form of interference with the searcher's wishes. Still, when someone suspects that your search for 'dyery' was probably made under the spell of insanity and your real purpose was to find a 'diary', it is bound to drive you nuts.
Far be it from us to claim that all query corrections and suggests are an outright evil. There are many cases when they really serve their purpose (as you can see in our Typo Resistance Analyzer). But the search developers have to be extremely careful when applying such strong measures, for their misuse might harm the search engine's image.
Phrasal Queries Substitution Analyzer
The next one in the series of Search Engine Mistakes is the Phrase Substitution Analyzer. As in the Query Substitution Analyzer, the queries here are existing and meaningful, but infrequent language units. They are just longer units: phrases instead of words. One would therefore think they should be found more easily, since the meaning of the second word specifies the meaning of the first, rare one. In this case any replacement, or even a suggest, seems absolutely senseless: it will prevent the user from finding what he is looking for and will probably irritate him. That is why we opted to check this aspect of search engine functioning.
The analyzer calculates the percentage of 'bad' snippets, i.e. snippets that do not contain the words of the query, because this is often a sign that the query words have been replaced. In some cases it has proved useful to check the snippets for only part of the query; this was done when insisting on both words could distort the control procedure. Synonyms and transliterations are counted as correct responses whenever possible.
Two additional metrics are the percentage of explicit query replacement and the percentage of replacement with suggest.
Names Search Analyzer
Trivial as it seems to look for some information about a person by entering his or her name, it often proves tricky: instead of finding the person needed, we stumble upon his famous namesakes.
One of the most persistent search engine mistakes is to make a mixture of names and surnames (like splitting 'Richard Shakespeare' into 'Richard III' and 'William Shakespeare'), so the queries in the analyzer were chosen with a view to making such results more probable. We only used the names of some real people we found on the web, but their surnames either coincided with those of some hot media personalities, or were in some other ways non-standard and difficult for a search engine to find.
The analyzer automatically divides the names it finds in the snippets and titles of the first output page into hits and misses, according to whether the right person was found. Then we perform a manual check. It is necessary for disentangling several complicated situations, like for instance, when the name and the surname are glued together at random (which would look strange in English, but can easily happen in Russian with its freer word order). The results with name initials are also checked 'manually', so that the context might help to decide upon the appropriate answer.
The resulting number constitutes the average ratio of correctly found pages to the overall number of pages.
Unbreakable Queries Analyzer
Search queries most often consist not of a single word, but of word chains or word phrases. Sometimes the relation between the words in such a chain is loose enough, but there are cases when the phrase is actually indivisible, expressing a single notion through two or more words, e.g. 'bad bishop' (a chess term that clearly shouldn't be echoed by 'America's worst bishops' in the output). In such cases it is the task of the search engine to preserve the wholeness of the phrase, at least in the top results. Otherwise, the meaning of the query gets distorted.
The unbreakability analyzer continues the series of search engine mistakes analyzers. It measures the percentage of cases when a search engine does not destroy the meaning of an unbreakable query by separating the words. For every query we check the presence of the searched phrase in the output. In a minority of cases the phrase can be split by a very limited number of words (like 'Black Volta', which can appear in 'Black and White Volta' without any change in meaning). Such cases are counted as positive results. Also, when the searched words stick together accidentally, they are counted as positive if there is no punctuation mark between them.
As usual, the average number of positive results for all the queries appears as the search engine's overall result in the informer.
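A hypothetical sketch of such a check (the two-word gap limit follows the 'Black and White Volta' example; the tokenization and matching details are our own assumptions, not the analyzer's implementation):

```python
import re

def phrase_intact(snippet, phrase, max_gap=2):
    """True if the words of an unbreakable phrase occur in order in the
    snippet, separated by at most `max_gap` words and no punctuation
    (a punctuation mark between the words breaks the match)."""
    words = [re.escape(w) for w in phrase.lower().split()]
    gap = r"(?:\s+\w+){0,%d}\s+" % max_gap
    return re.search(gap.join(words), snippet.lower()) is not None

print(phrase_intact("the black volta river", "black volta"))    # -> True
print(phrase_intact("black and white volta", "black volta"))    # -> True
print(phrase_intact("black sea. Volta region", "black volta"))  # -> False
```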
Grammar Analyzer
The analyzer of grammatical misinterpretation is of a certain linguistic elegance. It deals with such cases when a specific word-form belongs to more than one paradigm, like 'miss' in 'Miss Juliet' vs. 'I miss you terribly'. Naturally, such situations are more intricate in languages with a rich inflectional morphology, like Russian, where a suffix or an ending increase the chance of there being a homonym to the word in question.
The task facing search engines is then to choose the more appropriate of the two paradigms, i.e. the more appropriate meaning, and to adjust the output accordingly. The instrument of the correct choice is the context, however short (even in two-word queries the second word almost invariably narrows the field of possible meanings for the first one).
We have to admit that from this point of view the functioning of the investigated search engines has improved impressively over the last few years. Not so long ago the paradigms got confused so grossly that in order to evaluate the output you only had to cross off all the wrong word-forms. Nowadays, the search engines tend not to deviate too much from the given word-form, so there is no choice but to base the checking procedure on the context. It also means the queries should become more sophisticated, or rare, or both. However, all the queries we use are quite realistic ones, meaning the search engines should actually be concerned with the results of the analyzer.
It is to be noted that the result of the analyzer, despite it belonging to the Mistakes group, shows the percentage of positive answers among the search engines' output.
Disturbing Content Analyzers
However well the search engine works, there are still small annoying things that can easily dampen the user's good spirits and significantly shake his loyalty to a specific SE. These include, for example, the danger of contracting a computer virus, the presence of irritating and obtrusive ads, etc.
Whereas the amount of advertisements or dangerous scripts on websites is not for the search engines to control, the concentration of objectionable content in the output is totally in their power. So, if the websites in the output abound with annoying factors, the search engine would only benefit from ranking such sites far below high-quality safe ones.
Most analyzers of this group make use of specific techniques developed by "Ashmanov & Partners" for detecting ads, pornography, computer viruses etc. To make the results more telling, we selected marker queries for which the probability of undesirable results is higher than usual.
Analyzer of Search Spam
At "Ashmanov and Partners" we study the phenomenon of search spam – the methods and technologies reducing the quality of search results and interfering with the operation of search engines.
Search spam is a text, URL, technology, program code or other web elements created by the web-master for the sole purpose of promoting the site in search engines' results, and not for a fast and reliable search based on complete and authentic information.
The experts check Top 10 results of search queries on a regular basis, marking the sites which, in their opinion, contain elements of search spam. The collected data is entered into the informer. It shows the percentage of sites marked as spam in the overall number of sites that appeared in Top 10 of analyzed queries.
The source of information on the spam status of a given URL is the data of the anti-spam lab of the company "Ashmanov and Partners". The following categories of search spam are used:
* doorway – definite spam: doorways, leading the user to other pages,
* spamcatalog – definite spam: spammer catalogues,
* spamcontent – definite spam: spammers' stolen content,
* pseudosite – definite spam: site disguised as corporate (pseudo-company),
* catalog – catalogues,
* board – bulletin boards,
* domainsale – domains for sale,
* secondary – secondary, stolen content,
* partner – any partner programs,
* linksite – link support site,
* spamforum – forum containing spam,
* techspam – technical spam,
* searchres – search results
* cj – circular jerk
An aggregate indicator is the share of spam sites in the search results. The best search engine has the lowest indicator. This determines the order of search engines in the informer of the analyzer.
Pornography Analyzer
This analyzer collects search results for ambiguous queries which may be interpreted as targeting a certain category of pornography, but also admit other interpretations. No queries which unambiguously indicate that the user is searching for porn are included.
For instance, the query [stockings] could come from a user looking for a stockings shop or for the corresponding category of pornography.
When search engines include pornographic results in responses to such queries, they run the risk of showing these results to the users (potentially including minors) which were not looking for adult content.
We do not claim that it is «bad» or «immoral» for a search engine to return pornographic pages in response to unambiguously pornographic queries. However the appearance of adult results in responses to «regular» queries is a drawback in our opinion.
We make use of the 'Semantic Mirror' technology developed at 'Ashmanov and Partners' in order to detect adult content.
For every search engine, we calculate the percentage of pages with adult content among the top 10 results.
Analyzer of the SafeSearch Quality
SafeSearch is a filter that is supposed to block the «adult only» websites from appearing in the search results. Nowadays most search engines allow their users to filter the results in this way. But does SafeSearch do its job well enough?
This analyzer collects search results for ambiguous queries which may be interpreted as targeting a certain category of pornography, but also admit other interpretations. The question is whether SafeSearch effectively filters out the adult responses in such cases.
The set of queries in this analyzer is the same as in the analyzer of the presence of adult sites in search results. We submit these queries to the search engines with the SafeSearch option turned on wherever possible. In this case, the ideal results should contain no adult pages at all, nor any results with obscene words. If adult pages still show up in the search results, the search engine is not keeping its promise to protect the user with SafeSearch.
We make use of the Semantic Mirror technology developed at 'Ashmanov and Partners' in order to detect pornography. For every search engine, we calculate the percentage of pages with adult content among the top 10 results.
Obscene Language Analyzer
The Analyzer checks the presence and the quantity of obscene words on the pages found by the search engines in response to neutral queries (those that neither contain obscenities nor imply that the response should contain them). By default the measurements are taken with "Safe search" on, so the result should normally be zero; measurements with ordinary search options are also available.
There are two indicators in the analyzer. The simple indicator counts the pages found by the search engine that contain at least one obscene word in its regular or masked form (as the law does not distinguish between them, provided the word is easily recognized).
The default indicator is more informative: it shows the quantity of the obscene words in the pages found. Each word, shown clearly, is worth 3 points, while a masked one is worth 1. The points are summed for all the pages found to give the result of the search engine for a specific query. The mean value for all queries is then calculated to form the analyzer indicator for a given search engine.
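The scoring above can be sketched as follows; the word counts stand in for the output of the "Semantic Mirror" service and are hypothetical:

```python
# Default obscenity indicator: a clearly shown obscene word scores 3
# points, a masked one scores 1; points are summed over the pages of a
# SERP and the per-query totals are averaged over all queries.

CLEAR, MASKED = 3, 1

def page_score(clear_words, masked_words):
    return CLEAR * clear_words + MASKED * masked_words

def engine_score(queries):
    """queries: list of SERPs; each SERP is a list of (clear, masked) counts."""
    per_query = [sum(page_score(c, m) for c, m in serp) for serp in queries]
    return sum(per_query) / len(per_query)

# Two queries: one clean SERP, one with 1 clear + 2 masked occurrences.
queries = [
    [(0, 0), (0, 0)],
    [(1, 2), (0, 0)],
]
print(engine_score(queries))  # (0 + 5) / 2 = 2.5
```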
The recognition of obscene words is performed by the automatic service "Semantic Mirror".
Intrusive Ads Analyzer
Web pages with similar content may differ in the amount of ads. A search engine that favors pages with less intrusive ads is thus better for the user.
The analyzer is based on an ads recognition technology assessing the scripts, iframes and other code on the page. Each page is assigned a total intrusiveness score: the sum of the intrusiveness scores of all individual ads on the page. For each SERP a weighted average of the page scores is taken, and the average over all SERP's is then the score of a search engine in this analyzer.
The intrusiveness of an individual ad is calculated as follows:
- context ads and small simple banners receive intrusiveness of 1
- teasers and bigger shiny banners receive 3
- some ads have to be clicked on in order to be closed. These include clickunders (separate windows which open upon a click anywhere on a page) and scrolling banners which cover up a part of a page. These ads receive intrusiveness of 9.
- finally, some particularly nasty ads have to be clicked more than once in order to be closed. These receive intrusiveness of 18.
For each web page found, the intrusiveness is adjusted according to the position of the page in the SERP. The weight of each result reflects the probability that the link will be clicked, and the weights of all positions average to 1 (position 1 has the weight of 1.5 and position 10 has the weight of 0.4). Thus, if the sites with intrusive ads are randomly distributed along the SERP, the average intrusiveness won't be affected by the positional weight.
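The positional weighting can be sketched like this. Only the endpoint weights (1.5 for position 1, 0.4 for position 10) and the mean of 1 are stated above, so the linear interpolation between them, renormalized to a mean of exactly 1, is our assumption:

```python
# Positional weighting of page intrusiveness scores within a SERP.
# The exact intermediate weights are not published; we interpolate
# linearly between 1.5 and 0.4 and renormalize to a mean of 1.

def positional_weights(n=10, first=1.5, last=0.4):
    raw = [first + (last - first) * i / (n - 1) for i in range(n)]
    mean = sum(raw) / n
    return [w / mean for w in raw]

def serp_intrusiveness(page_scores):
    """page_scores: intrusiveness of the pages at positions 1..10."""
    weights = positional_weights(len(page_scores))
    return sum(w * s for w, s in zip(weights, page_scores)) / len(page_scores)

# A SERP where every page has the same score is unaffected by the
# weighting, as the text above requires for a random distribution:
print(serp_intrusiveness([3] * 10))  # 3.0
```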
This analyzer looks at queries seeking popular songs or video content. These queries often yield pages overloaded with ads.
Note that web page assessment is based on html code of a web page while the visibility of ads on a particular page might depend on the user's browser and the cookies.
Adult Content Ads Analyzer
Probably the most intrusive ads on the web are those with adult content. The presence of such ads on a web page irritates users. Therefore a search engine which finds fewer web pages with pornographic ads is better for the user.
This analyzer has two tabs and both of them look at queries seeking popular songs or video content (these queries are also used in the Intrusive Ads Analyzer). These queries often yield pages overloaded with ads.
The analyzer is based on an ads recognition technology which assigns a total intrusiveness score to each page. In the first tab (“% of sites with porno-ads”) each search engine gets a score which is equal to the percentage of the pages found containing explicitly pornographic ads. Each page contributes to the search engine's score if it has at least one advertisement containing pornographic pictures. The ads which are not literally pornographic but can be considered blatant or improper are not taken into account in this tab. The scores of the search engines also do not reflect the amount, size and position of the pornographic ads on the pages found. However in most cases if the porno ads are present they occur in a prominent position on a page.
The other tab (“improper + porn”) shows the mean intrusiveness of all adult advertisements on the page. Here the amount and size of the ads, as well as the position of the website in search results is taken into account. For each ad, the intrusiveness is multiplied by a coefficient reflecting the status of this ad with respect to pornography. The following coefficients are used:
0 - an ad is not pornographic or adult in any way
0.5 - an ad sometimes contains some improper pictures, but sometimes these pictures are absent
1 - an ad always contains improper pictures
10 - an ad always contains explicitly pornographic pictures
The score of a particular web page is a sum of the adjusted porno-intrusiveness scores of all individual ads on the page. The intrusiveness itself depends on the size and shininess of an ad as well as on whether the ad covers part of the page's content and requires the user to make a click. You can read more about how the intrusiveness score is calculated here. Although the adjusted porno-intrusiveness is on average similar to the intrusiveness of the ads, these two measures cannot be compared directly.
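A minimal sketch of the per-page "improper + porn" score under the coefficients above; the ads listed are hypothetical:

```python
# Each ad's intrusiveness is multiplied by its pornography coefficient
# and the adjusted scores are summed over the page.

PORN_COEFF = {"none": 0, "sometimes": 0.5, "always": 1, "explicit": 10}

def page_porn_score(ads):
    """ads: list of (intrusiveness, category) pairs for one page."""
    return sum(intr * PORN_COEFF[cat] for intr, cat in ads)

ads = [
    (1, "none"),       # plain context ad: contributes 0
    (3, "sometimes"),  # teaser with occasional improper pictures: 1.5
    (9, "explicit"),   # clickunder with pornographic pictures: 90
]
print(page_porn_score(ads))  # 91.5
```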
Analyzer of Viruses and Malware
Search engines try to prevent their users from accessing infected or fraudulent web pages. How efficient is this security measure?
This analyzer shows the percentage of infected web pages in each search engine's results. Each web page is analyzed based on the sources of scripts and frames/iframes. In addition to viruses, the Russian web is overwhelmed with ads which are trying to fool people into downloading fraudulent files. These ads are marked as "fraudulent downloads" in the analyzer.
The number of threats on a particular web page is displayed, but it does not influence the overall score of a search engine - a page with 5 threats contributes as much as a page with 1 threat. Each web page is searched individually for viruses based on antivirus software reports, webmaster opinions and other sources.
By default, the analyzer takes into account all kinds of threats. The "% of sites with viruses" tab only takes into account the sites infected with viruses, the sites with fraudulent ads are excluded on that tab.
Some search engines warn their users that a particular page may threaten them. However, for the kind of queries used in this analyzer such warnings may not be sufficient. This analyzer inspects queries devoted to downloading, listening to or watching music and video clips - the same queries as in the intrusive ads analyzer. These queries yield thousands of result pages with roughly the same content. In such a situation it would seem reasonable not just to warn the users about a possible threat, but to lower the infected websites in the search results so that an uninfected site is displayed in a high position instead. The data in this analyzer can be displayed based on all the threats (the "All" tab) or, alternatively, based only on the pages for which no warning is shown on the SERP (the "unmarked on SERP" tab).
Analyzers of Additional Search Features
The object of the user's immediate interest is the set of answers he receives to his query. Nevertheless, his overall impression may well be affected by smaller technical factors, not immediately discernible on the SERP.
These analyzers compare some additional parameters of the search engines' functioning that may be of considerable interest for the user. It's worth noting that comparative values bear more importance here than the absolute numbers.
Analyzer of Loading Time
The users expect the search engines to give them results as fast as possible. Even small differences in the time it takes for the SERP to load may be a reason for some users to prefer one search engine over another.
This analyzer shows the time it takes for users in different Russian cities to get their search results. The SERP loading time data are collected at the same time as the Regional Search analyzer results, hence for the same cities and for the same queries.
Of course the loading time is affected by multiple factors such as the quality and overall usage of the connection. However, if some search engine is slower to respond than the others on a substantial amount of queries every day, this is a tendency to analyze.
This analyzer also estimates the size of SERP's. The data on SERP size can be seen by clicking on the corresponding link below the graph. Like most modern web browsers, we download the compressed SERP's.
Analyzer of Indexation Time
How long does it take for the newly created pages to start showing up in search results?
We selected about two thousand websites from the Russian Internet. All of these websites inform the search engines about the updates using a special sitemap file (sitemap.xml). All of these websites are generally indexed by the search engines: this is confirmed by the fact that the random pages from these websites are almost always found among the first 10 hits in response to the queries identical to the titles of the relevant pages (the text within <title>). So how long does it take for the newly created pages to start appearing among top 10 results for corresponding queries?
We scan the sitemap files for new pages every day (each website may contribute up to 3 pages per day). Search engines are of course able to do the same - after all this is the purpose of the sitemap files.
For every new page, we record the title and the URL. Within a month from the date when the new page was detected, we check if this page is found in top 10 in response to two types of queries:
a) a query consisting of the title text, with the search restricted to the relevant website;
b) a regular text query using the title.
The first type of query assesses the speed of indexation; the second assesses the time it takes for the new pages to become visible in regular search results.
The score in this analyzer is calculated as the mean visibility of the new pages within the last month. We also report the data for smaller periods.
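The mean-visibility score can be sketched as follows; the daily top-10 observations below are hypothetical:

```python
# Mean visibility of new pages: the share of daily checks, over all
# monitored new pages, in which the page was found in the top 10.

def mean_visibility(observations):
    """observations: {url: [bool, ...]} daily top-10 checks per new page."""
    flat = [hit for days in observations.values() for hit in days]
    return 100.0 * sum(flat) / len(flat)

observations = {
    "site.example/new-1": [False, True, True, True],
    "site.example/new-2": [False, False, True, True],
}
print(mean_visibility(observations))  # 5 of 8 checks -> 62.5
```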
Interestingly, there seem to be pages which appear in the search results in the first couple of days after they were detected but disappear by the end of the first week.
Index Size Analyzer
The size of the index is a very PR-sensitive issue, as it is one of the very few simple notions in the SE area easily understandable by journalists. This means that the bigger the index database size you report, the better press you get.
The number of documents found for a particular query does not always reflect the real number of documents indexed by a SE. Almost every frequent query will return tens of thousands of results in all search engines. But the user will never be allowed to see them all: the search session will be interrupted after browsing through the first few hundred results. Thus the exact number of web pages found can be verified only when the number of possible results is very small, that is, for queries containing very rare words.
For multiple-word queries, certain SEs show in the search results not only the documents containing all the words of the query, but also documents containing only some of them. These "tail" documents are usually irrelevant to the query, but counting them increases the total number of pages found.
In order to obtain independent and reliable data on the relative index size of the popular SEs, we developed a simple automatic method based on a set of sample queries. We gathered a set of very rare words, each of which occurs several tens of times on the web. Once a day, we count how many of these occurrences are found by each search engine.
To make the data steadier, we use a different set of sample queries from the whole query pool every day.
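The rare-word method can be sketched as follows; the occurrence counts are hypothetical:

```python
# Relative index size estimate: for each rare word, the share of its
# known web occurrences that the engine finds, averaged over the sample.

def relative_index_size(found_counts, known_counts):
    """Mean share of the known occurrences of rare words found by an engine."""
    shares = [min(f, k) / k for f, k in zip(found_counts, known_counts)]
    return 100.0 * sum(shares) / len(shares)

known = [40, 25, 60]  # known occurrences of three rare words on the web
found = [30, 25, 30]  # occurrences found by the engine
print(relative_index_size(found, known))  # (0.75 + 1.0 + 0.5) / 3 -> 75.0
```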
The set of sample queries is constantly replenished by our linguists. If you have some rare words and want to help us cover the 'faraway' areas of the Net, please send us these words, and we will consider including them into the sample queries list.
A human being is often able to interpret a search query, determine what the user wants, evaluate the information on the Web and compile the search results better than a machine. For this reason, results compiled by an expert are, as a rule, better than those of an algorithm.
This analyzer monitors daily the search results for a set of queries, for which the corresponding links have been selected by experts. It calculates the number of sites found by each search engine that were on the experts' list. For each query, the percentage of similarity between algorithmic and expert results is calculated.
As an expert opinion, the output of the expert system Neuron is used. The aggregate indicator of this analyzer is the proportion of results matching the expert opinion, regardless of the position of the matching site(s).
The best search engine is the one with highest value of aggregate indicator. In the informer, the search engines are sorted by the aggregate indicator.
Currently, 18 queries are evaluated. The number of the queries will be increased.
Analyzer of Ambiguous Queries
When Russian users search for «белки» do they mean the furry animals or the proteins? There is about a dozen universities which can be referred to as "КГУ", so is the user entering this query looking for Kazan State, Kursk State or any other educational institution which happens to have the same abbreviated name?
In the case of an ambiguous query, the user could have had in mind any of the "readings". Therefore, instead of showing the results based on the most frequent interpretation, it seems to be the best bet for a search engine to include the results corresponding to all the different readings. This way the users may at least realize that their query is ambiguous which in turn will help them to make the query more precise.
This analyzer estimates the percentage of possible interpretations of the ambiguous queries which are reflected in the search engines' results.
Most of the marker queries contain just one word, and the possible interpretations usually do not include the numerous businesses bearing the name which equals our query. We only included the very famous companies, to which a user could potentially refer with a one-word query.
Local Search Analyzers
One of the different factors influencing the web search is the location where the search is conducted. Not that it is of overall importance, but quite often the set of answers to a query will differ according to the specific region.
Here we inquire into the quality of search conducted from different cities of Russia. The analyzer's queries are entered by servers located in ten cities as far apart as Krasnodar and Vladivostok. The results are then tested to see if they are region-appropriate.
To fill such an analyzer, the queries have to be sorted out very carefully. For example, it is possible that a user looking for a "North Star" is keen to find out about a restaurant three blocks away, but it is more likely that he is searching for information concerning the celestial body. By contrast, "train schedule" is in most cases a region-dependent query. To make an unbiased evaluation of the local search quality, we only use queries that must be answered on the basis of local data.
Analyzer of Regional Search
When a user from Ufa or Novosibirsk enters the query "pizza delivery", he or she is most likely looking to get some pizza. The websites of companies offering pizza delivery in Moscow would probably be of no use to the user since it does not make sense to deliver pizza from Moscow to Novosibirsk.
In this analyzer, we query the search engines from different cities across Russia. We select the search engine results for which the title, snippet or URL clearly mentions the location from where the query was sent. We only evaluate the information available as part of SERP's because this is all that the regional users see when they make decisions as to whether they want to follow a link to a given webpage found in SE results.
The percentage of regional responses, averaged over all cities except Moscow, is taken as a good estimate of how friendly the search engine is to the regional users.
This analyzer embraces both the queries which are clearly asking for a regional service ("order glasses", "chinese food delivery") and the queries which imply a search for some local information ("drug store prices", "medical exam when pregnant").
Analyzer of Regional Navigation
This analyzer is devoted to the situations where the users are looking for a particular web page, but the relevant web page varies depending on where the query is coming from. Navigational search in general is the search for a particular web page accomplished by queries such as [sberbank] or [gazeta ru].
If the users are looking for a particular web page, they often want a page which is specific to their current location. For example, the users from both Kazan and Moscow expect to find some page from afisha.ru when they input the query [afisha]. However, the user from Kazan probably will not be happy to see the pages of afisha.ru specific to Moscow.
We test the search engines on 100 queries submitted from different locations within Russia. The sets of queries used in each city are slightly different depending on whether the relevant regional pages exist. For each query, the analyzer records if the relevant regional website (or the regional part of the general website) was found.
For each search engine we calculate the percentage of queries for which the relevant regional response appears in top 10 results.
Query Comprehension Analyzers
Of course, the query comprehension by a search engine is but a metaphor. Yet one eventually gets used to the search engines' "creativity" in the interpretation of queries: their search prompts, spelling corrections, their attempts at refining or improving the query, or re-ordering the answers according to the assumed object of inquiry etc.
At present, the analyzers in this group test such relatively primitive skills of the search engines as correcting typos, giving prompts, and expanding queries with synonyms. Meanwhile, some more refined search techniques (query interpretation, dealing with multiple-meaning queries and so on) can only be estimated indirectly, through the mistakes arising when a query is misunderstood by the search engine. (The Search Engine Mistake Analyzers, several in number, are arranged in a distinct group.) But then, it is the occasional mistakes that most clearly demonstrate a SE's skill in processing queries. When the search results are correct, we just don't pay any attention to the search engine's "tricks". These tricks are only exposed through incongruous or funny output.
Analyzer of Correct Hints
Most of the search engines attempt to suggest a correct spelling for a query in case a typo is suspected. The quality of such hints is an important addition to the overall quality of the search. This analyzer looks for the correct hint in the search results for a query with a deliberate typo and estimates the number of occurrences of a 'correct' query contained in the hint.
The evaluation is based on the same set of queries containing typos that is used for the typo resistance analyzer. The more correct hints have been given, the higher is the search engine's index for this analyzer.
Typo Resistance Analyzer
Humans are not machines: they make mistakes. These include mistakes made while typing a search query: an adjacent key pressed by accident ("quety" instead of "query"), a doubled or a missing character ("queery" or "qury"), or, after all, a word typed 'by ear' without knowing the correct spelling ("yandax" instead of "yandex").
In this case, the search engine can adhere to one of the following strategies:
1) no processing: search with exact spelling only
2) recognize the typo, but still search for the entered query with an additional hint: "perhaps you were looking for [correct spelling]?"
3) recognize the typo and search for the correct spelling immediately
Depending on the chosen strategy, the user either remains unaware of the mistake, or notices it and makes an extra click, or gets the correct results without ever noticing his own error.
This analyzer compares the search results of the "correct query" and several forms of its possible mistypings. The similarity of results to those of a "correct" query is evaluated.
Apart from deliberate typo correction, matches can arise in four cases:
1) accidentally;
2) the page contains both the correct and the mistyped spelling;
3) incorrect reaction of the engine's morphology (e.g., the unknown word "mushroomz", a typo for "mushrooms", is corrected to "mushroom");
4) promotion of the same websites for both the correct and the incorrect spelling of a query.
All of these cases produce noise in this analyzer: an accidental match of results.
The similarity is evaluated in the same way as for the update analyzer but with a different set of queries.
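One plausible similarity measure, sketched here since the exact metric of the update analyzer is not specified above, is the share of the mistyped query's top-10 URLs that also occur in the correct query's top 10:

```python
# Overlap-based similarity between the SERP for the correct query and
# the SERP for its mistyped form. This particular metric is an
# illustrative assumption, not the analyzer's published formula.

def serp_similarity(correct_urls, typo_urls):
    correct = set(correct_urls)
    overlap = sum(1 for url in typo_urls if url in correct)
    return 100.0 * overlap / len(typo_urls)

correct = ["a", "b", "c", "d"]
typo = ["a", "b", "x", "y"]
print(serp_similarity(correct, typo))  # 2 of 4 overlap -> 50.0
```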
The more matching results are registered, the higher is the index of the search engine for this analyzer. This determines the order of search engines in the informer of the analyzer.
In future, a rotation of query sets with typos from a wide array will be introduced.
Analyzer of the Synonymous Queries
One and the same query can be formulated in many different ways. For example, queries like "how to deduce the address from the telephone number", "search for address by telephone number", and "find an address by telephone number" have the same meaning for the user: these are synonymous queries.
A number of different practices leads to synonymous queries. Among them:
- using well-known abbreviations: "weather in Saint Petersburg" and "weather in SPb";
- using transliteration: "gazeta ru website" and "газета ру website";
- varying word order: "repair automatic transmission" and "automatic transmission repair".
In most of these cases, the users expect to get information related to the meaning of the query, not to its particular formulation. We can take this to imply that the search engines should yield identical results in response to synonymous queries, despite the differences in query formulation.
The analyzer of the synonymous queries assesses the (presumably unnecessary) variability of responses to synonymous queries. The search engines are sorted by the amount of such variability.
All the queries used in this analyzer are real; they were acquired from Rambler's query statistics tool (http://adstat.rambler.ru/wrds/). Note also that we exclude queries containing grammatical or spelling errors from this analyzer.
Analyzers of Search Engine Mistakes
Search engine mistakes are the negative side of its ability to refine and to understand the query (discussed in the Query Comprehension section). The queries in this group were sought out to demonstrate such mistakes in a most vivid and obvious manner. They might seem exotic, but every active search engine user knows that such mishaps tend to occur quite regularly.
When your favourite search engine makes a fool of itself, your emotions may vary between laughter and utter rage, but in any case your loyalty to it suffers a crack. The user who is only mildly content when a search engine functions as it should feels downright indignant when he sees a blunder. The unavoidable irritation is a risk factor that should not be underrated by the developers.
Query Substitution Analyzer
Query substitution analyzer is the first one in the new series of analyzers demonstrating mistakes in the functioning of search engines. The words chosen for queries are correct, existing, meaningful words, but as they are rather infrequent, the search engines often don't take the challenge of finding them. Instead, they offer the searcher an output based on some other query which can more or less resemble the original one, but, as it would seem, is asked more often.
As a result, the query is not answered. What's more, the searcher gets mildly insulted, for some search engines spice their wrong results with a prompt remark like 'We have corrected the misprint', presuming the searcher does not know the correct spelling of the word he is looking for.
The analyzer calculates the percentage of wrong search results (not containing any form of the original query). Latin transliteration is counted as a correct result whenever it is possible.
Two additional parameters counted are: 1) the percentage of automatic query substitution (the search engine actually treats the query as a misprint, which is not the case); 2) the percentage of wrong suggests, like 'you were probably looking for...'. Suggests are a less rude form of interference with the searcher's wishes. Still, when someone suspects that your search for 'dyery' was probably made under the spell of insanity and your real purpose was to find a 'diary', it is bound to drive you nuts.
Far be it from us to claim that all query corrections and suggests are an absolute evil. There are many cases when they really serve their purpose (as you can see in our Typo Resistance Analyzer). But search developers have to be extremely careful when applying such strong means, for their misuse might harm the search engine's image.
Phrasal Queries Substitution Analyser
Next in the series of Search Engine Mistakes is the Phrase Substitution Analyzer. As in the Query Substitution Analyzer, the queries here are existing and meaningful, but infrequent language units. They are just longer units - phrases instead of words. So, one would think, they should be found more easily, since the meaning of the second word specifies the meaning of the first, rare one. In this case any replacement, or even a suggest, seems absolutely senseless: it will prevent the user from finding what he is looking for and will probably irritate him. That is why we opted for checking this parameter of search engine functioning.
The analyzer calculates the percentage of 'bad' snippets, i.e. snippets that do not contain the words of the query, because this is often a sign that the query words have been replaced. In some cases it has proved useful to check the snippets for only a part of the query; this was done when insisting on both words could distort the control procedure. Synonyms and transliterations are counted as correct responses, whenever possible.
Two additional metrics are the percentage of explicit query replacement and the percentage of replacement with suggest.
Names Search Analyzer
Trivial as it seems to look for some information about a person by entering his or her name, it often proves tricky: instead of finding the person needed, we stumble upon his famous namesakes.
One of the most persistent search engine mistakes is to make a mixture of names and surnames (like splitting 'Richard Shakespeare' into 'Richard III' and 'William Shakespeare'), so the queries in the analyzer were chosen with a view to making such results more probable. We only used the names of some real people we found on the web, but their surnames either coincided with those of some hot media personalities, or were in some other ways non-standard and difficult for a search engine to find.
The analyzer automatically divides the names it finds in the snippets and titles of the first output page into hits and misses, according to whether the right person was found. Then we perform a manual check. It is necessary for disentangling several complicated situations, like, for instance, when the name and the surname are glued together at random (which would look strange in English, but can easily happen in Russian with its freer word order). The results with name initials are also checked 'manually', so that the context might help to decide upon the appropriate answer.
The resulting number constitutes the average ratio of correctly found pages to the overall number of pages.
Unbreakable Queries Analyzer
Search queries most often consist not of a single word, but of word chains or word phrases. Sometimes the relation between the words in such a chain is loose enough, but there are cases when the phrase is actually indivisible, expressing a single notion through two or more words, e.g. 'bad bishop' (a chess term that clearly shouldn't be echoed by 'America's worst bishops' in the output). In such cases it is the task of the search engine to preserve the wholeness of the phrase, at least in the top results. Otherwise, the meaning of the query gets distorted.
The unbreakability analyzer continues the series of search engine mistakes analyzers. It measures the percentage of cases when a search engine does not destroy the meaning of an unbreakable query by separating its words. For every query we check the presence of the searched phrase in the output. In a minority of cases the phrase can be split by a very limited number of words (like 'Black Volta', which can appear in 'Black and White Volta' without any change in meaning); such cases are counted as positive results. When the searched words stick together accidentally, they are also counted as positive, provided there is no punctuation mark between them.
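A minimal sketch of this check, assuming snippets are plain strings: the bounded-gap regular expression stands in for real word-form matching, and the `max_gap` parameter is our own illustrative choice.

```python
import re

# Sketch of the unbreakability check: the phrase counts as preserved if
# its words appear in order, separated only by whitespace and at most
# `max_gap` intervening plain words - so punctuation between the query
# words breaks the match, as the analyzer description requires.
def phrase_preserved(snippet: str, phrase: str, max_gap: int = 2) -> bool:
    parts = [r"\b%s\b" % re.escape(w) for w in phrase.lower().split()]
    # Up to max_gap filler words, then mandatory whitespace before the
    # next query word; a comma or period in between makes the match fail.
    filler = r"(?:\s+[a-z]+){0,%d}\s+" % max_gap
    pattern = filler.join(parts)
    return re.search(pattern, snippet.lower()) is not None
```

So 'black and white volta' would still count for the query [black volta], while 'black. Volta' would not.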
As usual, the average number of positive results for all the queries appears as the search engine's overall result in the informer.
Grammar Analyzer
The analyzer of grammatical misinterpretation is of a certain linguistic elegance. It deals with cases when a specific word-form belongs to more than one paradigm, like 'miss' in 'Miss Juliet' vs. 'I miss you terribly'. Naturally, such situations are more intricate in languages with a rich inflectional morphology, like Russian, where a suffix or an ending increases the chance of there being a homonym of the word in question.
The task facing search engines is then to choose the more appropriate paradigm - the more appropriate meaning of the two - and to adjust the output accordingly. The instrument of the correct choice is the context, however short (even in two-word queries the second word almost invariably narrows the field of possible meanings of the first one).
We have to admit that, from this point of view, the functioning of the investigated search engines has improved impressively over the last few years. Not so long ago the paradigms got confused so grossly that, in order to evaluate the output, you only had to cross off all the wrong word-forms. Nowadays the search engines tend not to deviate too much from the given word-form, so there is no choice but to base the checking procedure on the context. It also means the queries have to become more sophisticated, or rarer, or both. However, all the queries we use are quite realistic, meaning the search engines should actually be concerned with the results of the analyzer.
It is to be noted that the result of the analyzer, despite it belonging to the Mistakes group, shows the percentage of positive answers among the search engines' output.
Disturbing Content Analyzers
However well the search engine works, there are still small annoying things that can easily damp the user's good spirits and significantly shatter his loyalty to a specific SE. These include, for example, the danger of contracting a computer virus, the presence of irritating and obtrusive ads, etc.
Whereas the amount of advertisements or dangerous scripts on websites is not for the search engines to control, the concentration of objectionable content in the output is totally in their power. So, if the websites in the output abound with annoying factors, the search engine would only benefit from ranking such sites far below high-quality, safe ones.
Most analyzers of this group make use of specific techniques developed by "Ashmanov & Partners" for detecting ads, pornography, computer viruses etc. To make the results more lucid, we sought out marker queries for which the probability of undesirable results was higher than usual.
Analyzer of Search Spam
At "Ashmanov and Partners" we study the phenomenon of search spam – the methods and technologies reducing the quality of search results and interfering with the operation of search engines.
Search spam is a text, URL, technology, program code or other web elements created by the web-master for the sole purpose of promoting the site in search engines' results, and not for a fast and reliable search based on complete and authentic information.
The experts check Top 10 results of search queries on a regular basis, marking the sites which, in their opinion, contain elements of search spam. The collected data is entered into the informer. It shows the percentage of sites marked as spam in the overall number of sites that appeared in Top 10 of analyzed queries.
The source of information on the spam status of a given URL is the data of the anti-spam lab of the company "Ashmanov and Partners". The following categories of search spam are used:
* doorway – definite spam: doorways, leading the user to other pages,
* spamcatalog – definite spam: spammer catalogues,
* spamcontent – definite spam: spammers' stolen content,
* pseudosite – definite spam: site disguised as corporate (pseudo-company),
* catalog – catalogues,
* board – bulletin boards,
* domainsale – domains for sale,
* secondary – secondary, stolen content,
* partner – any partner programs,
* linksite – link support site,
* spamforum – forum containing spam,
* techspam – technical spam,
* searchres – search results,
* cj – circular jerk.
An aggregate indicator is the share of spam sites in the search results. The best search engine has the lowest indicator. This determines the order of search engines in the informer of the analyzer.
Pornography Analyzer
This analyzer collects search results for ambiguous queries which may be interpreted as targeting a certain category of pornography, but also admit other interpretations. No queries which unambiguously indicate that the user is searching for porn are included.
For instance, the query [stockings] could come from a user looking for a stockings shop or for the corresponding category of pornography.
When search engines include pornographic results in responses to such queries, they run the risk of showing these results to users (potentially including minors) who were not looking for adult content.
We do not claim that it is «bad» or «immoral» for a search engine to return pornographic pages in response to unambiguously pornographic queries. However the appearance of adult results in responses to «regular» queries is a drawback in our opinion.
We make use of the 'Semantic Mirror' technology developed at 'Ashmanov and Partners' in order to detect adult content.
For every search engine, we calculate the percentage of pages with adult content among the top 10 results.
Analyzer of the SafeSearch Quality
SafeSearch is a filter that is supposed to block the «adult only» websites from appearing in the search results. Nowadays most search engines allow their users to filter the results in this way. But does SafeSearch do its job well enough?
This analyzer collects search results for ambiguous queries which may be interpreted as targeting a certain category of pornography, but also admit other interpretations. The question is whether SafeSearch effectively filters out the adult responses in such cases.
The set of queries in this analyzer is the same as in the analyzer of the presence of adult sites in search results. We submit these queries to the search engines with the SafeSearch option turned on wherever possible. In this case, the ideal results should contain no adult pages at all, nor any results with obscene words. If adult pages still show up in the search results, the search engine is not keeping its promise to protect the user with SafeSearch.
We make use of the Semantic Mirror technology developed at 'Ashmanov and Partners' in order to detect pornography. For every search engine, we calculate the percentage of pages with adult content among the top 10 results.
Obscene Language Analyzer
The analyzer checks for the presence and quantity of obscene words on the pages found by the search engines in response to neutral queries (those that do not contain obscenities and do not suppose the response to contain them). The default measurement is made with "Safe search" on, meaning the result should normally be zero; measurements with ordinary search options are also available.
There are two indicators in the analyzer. The simple indicator counts the pages found by the search engine that contain at least one obscene word in its regular or masked form (as the law does not make a difference, provided the word is easily recognized).
The default indicator is more informative: it shows the quantity of obscene words on the pages found. Each word shown openly is worth 3 points, while a masked one is worth 1. The points are summed over all the pages found to give the search engine's result for a specific query. The mean value over all queries is then calculated to form the analyzer indicator for a given search engine.
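The point scheme above (3 points per openly written obscene word, 1 per masked one, summed per query and averaged over queries) can be sketched as follows. The detector itself is assumed to be external - the real project uses the "Semantic Mirror" service for recognition - and the function names are ours.

```python
# Sketch of the default obscenity indicator. Each page is represented by
# a (clear_words, masked_words) pair of counts, assumed to come from an
# external obscenity detector.
CLEAR_POINTS = 3   # a plainly written obscene word
MASKED_POINTS = 1  # a masked/obfuscated form

def page_score(clear_words: int, masked_words: int) -> int:
    return CLEAR_POINTS * clear_words + MASKED_POINTS * masked_words

def query_score(pages: list) -> int:
    """Sum of page scores over all pages found for one query."""
    return sum(page_score(c, m) for c, m in pages)

def engine_indicator(per_query_pages: list) -> float:
    """Mean query score over all queries = the analyzer indicator."""
    return sum(query_score(p) for p in per_query_pages) / len(per_query_pages)
```

With SafeSearch on, the expected indicator for a well-behaved engine is simply zero.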
The recognition of obscene words is performed by the automatic service "Semantic Mirror".
Intrusive Ads Analyzer
Web pages with similar content may differ in the amount of ads. A search engine that favors pages with less intrusive ads is thus better for the user.
The analyzer is based on an ads recognition technology assessing the scripts, iframes and other code on the page. Each page is assigned a total intrusiveness score. For each SERP a weighted average of the intrusiveness scores is taken. The average intrusiveness over all SERPs is then the score of a search engine in this analyzer.
The intrusiveness of a particular web page is the sum of the intrusiveness scores of all individual ads on that page. The intrusiveness of an ad is calculated as follows:
- context ads and small simple banners receive intrusiveness of 1
- teasers and bigger shiny banners receive 3
- some ads have to be clicked on in order to be closed. These include clickunders (separate windows which open upon a click anywhere on a page) and scrolling banners which cover up a part of a page. These ads receive intrusiveness of 9.
- finally, some particularly nasty ads have to be clicked more than once in order to be closed. These receive intrusiveness of 18.
For each web page found, the intrusiveness is adjusted according to the position of the page in the SERP. The weight of each result reflects the probability that the link will be clicked, and the weights of all positions average to 1 (position 1 has the weight of 1.5 and position 10 has the weight of 0.4). Thus, if the sites with intrusive ads are randomly distributed along the SERP, the average intrusiveness won't be affected by the positional weight.
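The positional weighting can be sketched as follows. Only the endpoints are stated in the text (1.5 at position 1, 0.4 at position 10), together with the constraint that the ten weights average to 1; the linear interpolation with renormalization used here is our assumption, not the published curve.

```python
# Sketch of positionally weighted SERP intrusiveness. The endpoint
# weights come from the description; the intermediate shape (linear,
# then renormalized so the weights average exactly to 1) is assumed.
def positional_weights(n: int = 10, top: float = 1.5, bottom: float = 0.4) -> list:
    raw = [top + (bottom - top) * i / (n - 1) for i in range(n)]
    mean = sum(raw) / n
    return [w / mean for w in raw]  # weights now average to 1

def serp_intrusiveness(page_scores: list) -> float:
    """Weighted average of per-page intrusiveness over one SERP."""
    w = positional_weights(len(page_scores))
    return sum(wi * s for wi, s in zip(w, page_scores)) / len(page_scores)
```

Because the weights average to 1, a SERP whose intrusive sites are randomly distributed gets the same score as with no weighting at all, exactly as the text claims.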
This analyzer looks at queries seeking popular songs or video content. These queries often yield pages overloaded with ads.
Note that web page assessment is based on the HTML code of a web page, while the visibility of ads on a particular page might depend on the user's browser and cookies.
Adult Content Ads Analyzer
Probably the most intrusive ads on the web are those with adult content. The presence of such ads on a web page irritates users. Therefore a search engine which finds fewer web pages with pornographic ads is better for the user.
This analyzer has two tabs and both of them look at queries seeking popular songs or video content (these queries are also used in the Intrusive Ads Analyzer). These queries often yield pages overloaded with ads.
The analyzer is based on an ads recognition technology which assigns a total intrusiveness score to each page. In the first tab (“% of sites with porno-ads”) each search engine gets a score which is equal to the percentage of the pages found containing explicitly pornographic ads. Each page contributes to the search engine's score if it has at least one advertisement containing pornographic pictures. The ads which are not literally pornographic but can be considered blatant or improper are not taken into account in this tab. The scores of the search engines also do not reflect the amount, size and position of the pornographic ads on the pages found. However in most cases if the porno ads are present they occur in a prominent position on a page.
The other tab (“improper + porn”) shows the mean intrusiveness of all adult advertisements on the page. Here the amount and size of the ads, as well as the position of the website in search results is taken into account. For each ad, the intrusiveness is multiplied by a coefficient reflecting the status of this ad with respect to pornography. The following coefficients are used:
0 - an ad is not pornographic or adult in any way
0.5 - an ad sometimes contains some improper pictures, but sometimes these pictures are absent
1 - an ad always contains improper pictures
10 - an ad always contains explicitly pornographic pictures
The score of a particular web page is the sum of the adjusted porno-intrusiveness scores of all individual ads on the page. The intrusiveness itself depends on the size and shininess of an ad, as well as on whether the ad covers part of the page's content and requires the user to make a click (see the Intrusive Ads Analyzer above for how the intrusiveness score is calculated). Although the adjusted porno-intrusiveness is on average similar to the intrusiveness of the ads, these two measures cannot be compared directly.
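Putting the coefficients together, a page's adjusted porno-intrusiveness might be computed as below. The status labels and function names are ours, introduced for illustration; the intrusiveness values follow the Intrusive Ads Analyzer scheme (1, 3, 9, 18) and the coefficients are those listed above.

```python
# Sketch of the adjusted porno-intrusiveness of a page: each ad's
# intrusiveness is multiplied by a coefficient reflecting its pornography
# status, then the products are summed over all ads on the page.
PORN_COEFF = {
    "clean": 0.0,      # not pornographic or adult in any way
    "sometimes": 0.5,  # improper pictures appear only occasionally
    "improper": 1.0,   # always contains improper pictures
    "porn": 10.0,      # always explicitly pornographic
}

def page_porn_score(ads: list) -> float:
    """ads: (intrusiveness, status) pairs for every ad on the page."""
    return sum(intr * PORN_COEFF[status] for intr, status in ads)
```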
Analyzer of Viruses and Malware
Search engines try to prevent their users from accessing infected or fraudulent web pages. How efficient is this security measure?
This analyzer shows the percentage of infected web pages in each search engine's results. Each web page is analyzed based on the sources of scripts and frames/iframes. In addition to viruses, the Russian web is overwhelmed with ads which are trying to fool people into downloading fraudulent files. These ads are marked as "fraudulent downloads" in the analyzer.
The number of threats on a particular web page is displayed, but it does not influence the overall score of a search engine: a page with 5 threats contributes as much as a page with 1 threat. Each web page is individually checked for viruses based on antivirus software reports, webmaster opinions and other sources.
By default, the analyzer takes into account all kinds of threats. The "% of sites with viruses" tab only takes into account the sites infected with viruses, the sites with fraudulent ads are excluded on that tab.
Some search engines warn their user that a particular page may threaten them. However for the kind of queries used in this analyzer such warnings may not be sufficient. This analyzer inspects the queries devoted to downloading, listening to or watching music and video clips - same queries as in the intrusive ads analyzer. These queries yield thousands of result pages with roughly the same content. In such a situation it would seem reasonable to not just warn the users about a possible threat, but to lower the infected websites in search results so that an uninfected site is displayed on a high position instead. The data in this analyzer can be displayed based on all the threats (the "All" tab) or, alternatively, it can be displayed based only on the pages for which no warning is shown on the SERP (the "unmarked on SERP" tab).
Analyzers of Additional Search Features
The object of the user's immediate interest is the set of answers he receives to his query. Nevertheless, his overall impression may well be affected by smaller technical factors, not immediately discernible on the SERP.
These analyzers are meant to compare some additional parameters of the search engines' functioning that may be of considerable interest for the user. It is worth noting that comparative values matter more here than the absolute numbers.
Analyzer of Loading Time
The users expect that the search engines will give them the results as fast as possible. Even small differences in the time it takes for the SERP to load may be a reason for some user to prefer one search engine over the other.
This analyzer shows the time it takes for users in different Russian cities to get their search results. The SERP loading time data are collected at the same time as the Regional Search analyzer results, hence for the same cities and for the same queries.
Of course the loading time is affected by multiple factors such as the quality and overall usage of the connection. However, if some search engine is slower to respond than the others on a substantial amount of queries every day, this is a tendency to analyze.
This analyzer also estimates the size of SERPs. The data on SERP size can be seen by clicking on the corresponding link below the graph. Just as most modern web browsers do, we download the compressed SERPs.
Analyzer of Indexation Time
How long does it take for the newly created pages to start showing up in search results?
We selected about two thousand websites from the Russian Internet. All of these websites inform the search engines about the updates using a special sitemap file (sitemap.xml). All of these websites are generally indexed by the search engines: this is confirmed by the fact that the random pages from these websites are almost always found among the first 10 hits in response to the queries identical to the titles of the relevant pages (the text within <title>). So how long does it take for the newly created pages to start appearing among top 10 results for corresponding queries?
We scan the sitemap files for new pages every day (each website may contribute up to 3 pages per day). Search engines are of course able to do the same - after all this is the purpose of the sitemap files.
For every new page, we record the title and the URL. Within a month from the date when the new page was detected, we check if this page is found in top 10 in response to two types of queries:
a) a query which mentions the text of the title and restricts the search to the relevant website;
b) a regular text query using the title.
The first type of query assesses the speed of indexation; the second assesses the time it takes for the new pages to appear in regular search results.
The score in this analyzer is calculated as the mean visibility of the new pages within the last month. We also report the data for smaller periods.
Interestingly, there seem to be pages which appear in the search results in the first couple of days after they were detected but disappear by the end of the first week.
Analyzer of Regional Search
This analyzer embraces both the queries which are clearly asking for a regional service ("order glasses", "chinese food delivery") and the queries which imply a search for some local information ("drug store prices", "medical exam when pregnant").
This analyzer is devoted to the situations where the users are looking for a particular web page, but the relevant web page varies depending on where the query is coming from. Navigational search in general is the search for a particular web page accomplished by queries such as [sberbank] or [gazeta ru].
If the users are looking for a particular web page, they often want a page which is specific to their current location. For example, the users from both Kazan and Moscow expect to find some page from afisha.ru when they input the query [afisha]. However, the user from Kazan probably will not be happy to see the pages of afisha.ru specific to Moscow.
We test the search engines on 100 queries submitted from different locations within Russia. The sets of queries used in each city are slightly different depending on whether the relevant regional pages exist. For each query, the analyzer records if the relevant regional website (or the regional part of the general website) was found.
For each search engine we calculate the percentage of queries for which the relevant regional response appears in top 10 results.
Query Comprehension Analyzers
Of course, the query comprehension by a search engine is but a metaphor. Yet one eventually gets used to the search engines' "creativity" in the interpretation of queries: their search prompts, spelling corrections, their attempts at refining or improving the query, or re-ordering the answers according to the assumed object of inquiry etc.
At present, the analyzers in this group test such relatively primitive skills of the search engines as correcting typos, giving prompts, and expanding queries with synonyms. Meanwhile, some more refined search techniques (query interpretation, dealing with multiple-meaning queries and so on) can only be estimated indirectly, through the mistakes arising when the query is misunderstood by the search engine. (The Search Engine Mistakes Analyzers, several in number, are arranged in a distinct group.) But then, it is the occasional mistakes that most clearly demonstrate a SE's skill in processing queries. When the search results are correct, we just do not pay any attention to the search engine's "tricks"; these tricks are only exposed through incongruous or funny output.
Analyzer of Correct Hints
Most of the search engines attempt to suggest a correct spelling for a query in case a typo is suspected. The quality of such hints is an important addition to the overall quality of the search. This analyzer looks for the correct hint in the search results for a query with a deliberate typo and estimates the number of occurrences of a 'correct' query contained in the hint.
The evaluation is based on the same set of queries containing typos that is used for the typo resistance analyzer. The more correct hints have been given, the higher is the search engine's index for this analyzer.
Typo Resistance Analyzer
Humans are not machines: they make mistakes. This includes mistakes made while typing in a search query: a neighbouring key pressed by accident ("quety" instead of "query"), a doubled or missing character ("queery" or "qury"); after all, the user may simply type the word 'by ear', not knowing the correct spelling ("yandax" instead of "yandex").
In this case, the search engine can adhere to one of the following strategies:
1) no processing: search with exact spelling only
2) recognize the typo, but still search for the entered query with an additional hint: "perhaps you were looking for [correct spelling]?"
3) recognize the typo and search for the correct spelling immediately
Depending on the chosen strategy, the user either remains unaware of the fact that (s)he is mistaken, or notices it and makes an extra click (up to the user), or gets the correct results without ever noticing his own mistake.
This analyzer compares the search results of the "correct query" and several forms of its possible mistypings. The similarity of results to those of a "correct" query is evaluated.
Apart from deliberate typo correction, matches can arise in four cases:
1) accidentally
2) the page contains both the correct and mistyped spelling
3) incorrect reaction of the engine's morphology (e.g., the unknown word "mushroomz" which is a typo of "mushrooms" is corrected to "mushroom")
4) promotion of the same websites both for correct and incorrect spelling of queries
All of these cases produce noise in this analyzer: an accidental match of results.
The similarity is evaluated in the same way as for the update analyzer but with a different set of queries.
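The text does not spell out the update analyzer's similarity measure, so as a stand-in, for illustration only, one can compare the top-10 URL sets of the correct query and the mistyped one.

```python
# Illustrative stand-in for the result-similarity measure: the fraction
# of the correct query's top results that also appear for the typo.
# The actual update-analyzer method may differ.
def serp_similarity(correct_urls: list, typo_urls: list) -> float:
    correct, typo = set(correct_urls), set(typo_urls)
    if not correct:
        return 0.0
    return len(correct & typo) / len(correct)
```

Under strategy 1 (exact search) the similarity would typically be low; under strategy 3 (silent correction) it would approach 1.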
The more matching results are registered, the higher is the index of the search engine for this analyzer. This determines the order of search engines in the informer of the analyzer.
In future, a rotation of query sets with typos from a wide array will be introduced.
Analyzer of the Synonymous Queries
One and the same query can be formulated in many different ways. For example, queries like "how to deduce the address from the telephone number", "search for address by telephone number", and "find an address by telephone number" have the same meaning for the user: these are synonymous queries.
A number of different practices lead to synonymous queries. Among them:
- using well-known abbreviations: "weather in Saint Petersburg" and "weather in SPb";
- using transliteration – "gazeta ru website" and "газета ру website";
- varying word order – "repair automatic transmission" and "automatic transmission repair".
In most of these cases, the users expect to get information related to the meaning of the query, not to its particular formulation. We can take this to imply that the search engines should yield identical results in response to synonymous queries, despite the differences in query formulation.
The analyzer of the synonymous queries assesses the (presumably unnecessary) variability of responses to synonymous queries. The search engines are sorted by the amount of such variability.
All the queries used in this analyzer are real, they were acquired from Rambler's query statistics tool (http://adstat.rambler.ru/wrds/). Note also that we are excluding the queries containing grammatical or spelling errors from this analyzer.
Analyzers of Search Engine Mistakes
Search engine mistakes are the negative side of its ability to refine and to understand the query (discussed in the Query Comprehension section). The queries in this group were sought out to demonstrate such mistakes in a most vivid and obvious manner. They might seem exotic, but every active search engine user knows that such mishaps tend to occur quite regularly.
When your favourite search engine makes a fool of itself, your emotions may vary between laughter and utter rage, but in any case your loyalty to it suffers a crack. The user who is only mildly content when a search engine functions as it should feels downright indignant when he sees a blunder. The resulting irritation is a risk factor that should not be underrated by the developers.
Query Substitution Analyzer
The Query Substitution Analyzer is the first in the new series of analyzers demonstrating mistakes in the functioning of search engines. The words chosen for queries are correct, existing, meaningful words, but as they are rather infrequent, the search engines often do not take up the challenge of finding them. Instead, they offer the searcher an output based on some other query which may more or less resemble the original one but, as it would seem, is asked more often.
As a result, the query is not answered. What is more, the searcher gets mildly insulted, for some search engines spice their wrong results with a prompt remark, like 'We have corrected the misprint', presuming the searcher does not know the correct spelling of the word he is looking for.
The analyzer calculates the percentage of wrong search results (not containing any form of the original query). Latin transliteration is counted as a correct result whenever it is possible.
Two additional parameters counted are: 1) the percentage of automatic query substitution (the search engine actually treats the query as if it were a misprint, which is not the case); 2) the percentage of wrong suggests, like 'you were probably looking for...'. Suggests are a less rude form of interference with the searcher's wishes. Still, when someone suspects that your search for 'dyery' was probably made under the spell of insanity and that your real purpose was to find a 'diary', it is bound to drive you nuts.
Far be it from us to claim that all query corrections and suggests are an unconditional evil. There are many cases when they really serve their purpose (as you can see in our Typo Resistance Analyzer). But search developers have to be extremely careful when applying such strong means, for their misuse might harm the search engine's image.
Phrasal Queries Substitution Analyser
Next one in the series of Search Engine Mistakes is the Phrase Substitution Analyzer. As also in the Query Substitution Analyzer, the queries here are existing and meaningful, but infrequent language units.They are just longer units - phrases instead of words. So, one would think, they must be found more easily, since the meaning of the second word specifies the meaning of the first, rare one. In this case any replacement or even suggest seems absolutely senseless. It will prevent the user from finding what he is looking for and possibly irritate him. That's why we opted for checking this parameter of search engine functioning.
The analyzer calculates the percentage of 'bad' snippets, i.e. such snippets as do not contain the words of the query, because often it is a sign that the query words have been replaced. In some cases it has proved useful to check the snippets for some part of the query only. It was done when the perseverance on both words could distort the control procedure. Synonyms and transliterations are counted as correct responses, whenever possible.
Two additional metrics are the percentage of explicit query replacement and the percentage of replacement with suggest.
Names Search Analyzer
Trivial as it seems to look for some information about a person by entering his or her name, it often proves tricky: instead of finding the person needed, we stumble upon his famous namesakes.
One of the most persistent search engine mistakes is to make a mixture of names and surnames (like splitting 'Richard Shakespeare' into 'Richard III' and 'William Shakespeare'), so the queries in the analyzer were chosen with a view to making such results more probable. We only used the names of some real people we found on the web, but their surnames either coincided with those of some hot media personalities, or were in some other ways non-standard and difficult for a search engine to find.
The analyzer automatically divides the names it finds in the snippets and titles of the first output page into hits and misses, according to whether the right person was found. Then we perform a manual check. It is necessary for disentangling several complicated situations, like for instance, when the name and the surname are glued together at random (which would look strange in English, but can easily happen in Russian with its freer word-order). The results with name initials are also checked 'manually', so that the context might help to decide upon the approrpiate answer.
The resulting number constitutes the average ratio of correctly found pages to the overall number of pages.
Unbreakable Queries Analyzer
Search queries most often consist not of a single word, but of word chains or word phrases. Sometimes the relation between the words in such a chain is loose enough, but there are cases when the phrase is actually indivisible, expressing a single notion through two or more words, e.g. 'bad bishop' (a chess term that clearly shouldn't be echoed by 'America's worst bishops' in the output). In such cases it is the task of the search engine to preserve the wholeness of the phrase, at least in the top results. Otherwise, the meaning of the query gets distorted.
The unbreakability analyzer continues the series of search engine mistakes analyzers. It measures the percentage of cases when a search engine does not destroy the meaning of an unbreakable query by separating the words. For every query we check the presence of the searched phrase in the output. In a minority of cases the phrase can be split by a very limited number of words (like 'Black Volta', which can appear in 'Black and White Volta' without any change in meaning). Such cases are counted as positive results. Likewise, when the searched words accidentally stick together, they are counted as positive if there is no punctuation mark between them.
As usual, the average number of positive results for all the queries appears as the search engine's overall result in the informer.
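A rough sketch of the phrase check, assuming a simple regular expression over the output text (the gap limit of two intervening words and the function name are illustrative assumptions):

```python
import re

def phrase_preserved(snippet, first, second, max_gap=2):
    """True if the unbreakable phrase survives in the snippet: the two
    words appear in order, separated by at most `max_gap` intervening
    words and by no punctuation mark."""
    gap = r"(?:\s+\w+){0,%d}" % max_gap
    pattern = re.compile(
        re.escape(first) + gap + r"\s*" + re.escape(second),
        re.IGNORECASE,
    )
    return bool(pattern.search(snippet))
```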
Grammar Analyzer
The analyzer of grammatical misinterpretation is of a certain linguistic elegance. It deals with cases when a specific word-form belongs to more than one paradigm, like 'miss' in 'Miss Juliet' vs. 'I miss you terribly'. Naturally, such situations are more intricate in languages with a rich inflectional morphology, like Russian, where a suffix or an ending increases the chance of there being a homonym to the word in question.
The task facing search engines is then to choose the more appropriate paradigm, i.e. the more appropriate meaning of the two, and to adjust the output accordingly. The instrument of the correct choice is the context, however short: even in two-word queries the second word almost invariably narrows the field of possible meanings for the first one.
We have to admit that from this point of view the functioning of the investigated search engines has improved impressively over the last few years. Not so long ago the paradigms got confused so grossly that, in order to evaluate the output, you only had to cross off all the wrong word-forms. Nowadays the search engines tend not to deviate too much from the given word-form, so there is no choice but to base the checking procedure on the context. It also means the queries have to become more sophisticated, or rarer, or both. However, all the queries we use are quite realistic, meaning the search engines should actually be concerned with the results of the analyzer.
It is to be noted that the result of this analyzer, despite its belonging to the Mistakes group, shows the percentage of positive answers in the search engines' output.
Disturbing Content Analyzers
However well a search engine may work, there are still small annoying things that can easily dampen the user's good spirits and significantly shake his loyalty to a specific SE. Among them are, e.g., the danger of contracting a computer virus, the presence of irritating and obtrusive ads, etc.
Whereas the amount of advertisements or dangerous scripts on websites is not for the search engines to control, the concentration of objectionable content in the output is entirely in their power. So, if the websites in the output abound with annoying factors, a search engine would only benefit from ranking such sites far below high-quality, safe ones.
Most analyzers of this group make use of specific techniques developed by "Ashmanov & Partners" for detecting ads, pornography, computer viruses, etc. To make the results more revealing, we selected the queries so that the probability of undesirable results was higher than usual.
Analyzer of Search Spam
At "Ashmanov and Partners" we study the phenomenon of search spam – the methods and technologies reducing the quality of search results and interfering with the operation of search engines.
Search spam is a text, URL, technology, program code or other web elements created by the web-master for the sole purpose of promoting the site in search engines' results, and not for a fast and reliable search based on complete and authentic information.
The experts check Top 10 results of search queries on a regular basis, marking the sites which, in their opinion, contain elements of search spam. The collected data is entered into the informer. It shows the percentage of sites marked as spam in the overall number of sites that appeared in Top 10 of analyzed queries.
The source of information on the spam status of a given URL is the data of the anti-spam lab of the company "Ashmanov and Partners". The following categories of search spam are used:
* doorway – definite spam: doorways, leading the user to other pages,
* spamcatalog – definite spam: spammer catalogues,
* spamcontent – definite spam: spammers' stolen content,
* pseudosite – definite spam: site disguised as corporate (pseudo-company),
* catalog – catalogues,
* board – bulletin boards,
* domainsale – domains for sale,
* secondary – secondary, stolen content,
* partner – any partner programs,
* linksite – link support site,
* spamforum – forum containing spam,
* techspam – technical spam,
* searchres – search results
* cj – 'circle jerk' (circular link-exchange scheme)
An aggregate indicator is the share of spam sites in the search results. The best search engine has the lowest indicator. This determines the order of search engines in the informer of the analyzer.
Pornography Analyzer
This analyzer collects search results for ambiguous queries which may be interpreted as targeting a certain category of pornography, but also admit other interpretations. No queries which unambiguously indicate that the user is searching for porn are included.
For instance, the query [stockings] could come from a user looking for a stockings shop or for the corresponding category of pornography.
When search engines include pornographic results in responses to such queries, they run the risk of showing these results to users (potentially including minors) who were not looking for adult content.
We do not claim that it is «bad» or «immoral» for a search engine to return pornographic pages in response to unambiguously pornographic queries. However, the appearance of adult results in responses to «regular» queries is, in our opinion, a drawback.
We make use of the 'Semantic Mirror' technology developed at 'Ashmanov and Partners' in order to detect adult content.
For every search engine, we calculate the percentage of pages with adult content among the top 10 results.
Analyzer of the SafeSearch Quality
SafeSearch is a filter that is supposed to block the «adult only» websites from appearing in the search results. Nowadays most search engines allow their users to filter the results in this way. But does SafeSearch do its job well enough?
This analyzer collects search results for ambiguous queries which may be interpreted as targeting a certain category of pornography, but also admit other interpretations. The question is whether SafeSearch effectively filters out the adult responses in such cases.
The set of queries in this analyzer is the same as in the analyzer of the presence of adult sites in search results. We submit these queries to the search engines with the SafeSearch option turned on wherever possible. In this case, the ideal results should contain no adult pages at all, nor any results with obscene words. If adult pages still show up in the search results, the search engine is not keeping its promise to protect the user with SafeSearch.
We make use of the Semantic Mirror technology developed at 'Ashmanov and Partners' in order to detect pornography. For every search engine, we calculate the percentage of pages with adult content among the top 10 results.
Obscene Language Analyzer
The Analyzer checks for the presence and quantity of obscene words on the pages found by the search engines in response to neutral queries (those that do not contain obscenities and do not suppose the response to contain them). By default the measurement is made with "Safe search" on, meaning the result should normally be zero; measurements with ordinary search options are also available.
There are two indicators in the analyzer. The simple indicator counts the pages, found by the search engine, that contain at least one obscene word in its regular or masked form (the law makes no distinction, provided the word is easily recognized).
The default indicator is more informative: it shows the quantity of obscene words on the pages found. Each word shown clearly is worth 3 points, while a masked one is worth 1. The points are summed over all the pages found to give the search engine's result for a specific query. The mean value over all queries then forms the analyzer indicator for a given search engine.
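The scoring scheme above can be sketched as follows (the data layout, with each page given as a pair of clear and masked word counts, is our assumption):

```python
def query_score(pages):
    """Score for one query: each page is a (clear, masked) pair giving
    the counts of clearly written and masked obscene words; a clear
    word is worth 3 points, a masked one 1 point."""
    return sum(3 * clear + masked for clear, masked in pages)

def analyzer_indicator(per_query_pages):
    """Mean of the per-query scores over the whole query set."""
    if not per_query_pages:
        return 0.0
    return sum(query_score(p) for p in per_query_pages) / len(per_query_pages)
```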
The recognition of obscene words is performed by the automatic service "Semantic Mirror".
Intrusive Ads Analyzer
Web pages with similar content may differ in the amount of ads. A search engine that favors pages with less intrusive ads is thus better for the user.
The analyzer is based on an ads recognition technology assessing the scripts, iframes and other code on the page. Each page is assigned a total intrusiveness score: the intrusiveness of a particular web page is the sum of the intrusiveness scores of all individual ads on this page. For each SERP a weighted average of the intrusiveness scores is taken; the average intrusiveness over all SERPs is then the score of a search engine in this analyzer. The intrusiveness of an individual ad is calculated as follows:
- context ads and small simple banners receive intrusiveness of 1
- teasers and bigger shiny banners receive 3
- some ads have to be clicked on in order to be closed. These include clickunders (separate windows which open upon a click anywhere on a page) and scrolling banners which cover up a part of a page. These ads receive intrusiveness of 9.
- finally, some particularly nasty ads have to be clicked more than once in order to be closed. These receive intrusiveness of 18.
For each web page found, the intrusiveness is adjusted according to the position of the page in the SERP. The weight of each result reflects the probability that the link will be clicked, and the weights of all positions average to 1 (position 1 has the weight of 1.5 and position 10 has the weight of 0.4). Thus, if the sites with intrusive ads are randomly distributed along the SERP, the average intrusiveness won't be affected by the positional weight.
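The weighting scheme can be sketched as follows; note that only the extreme weights (1.5 and 0.4) and the mean weight of 1 are stated above, so the linear interpolation between them is our assumption:

```python
def serp_intrusiveness(page_scores):
    """Positionally weighted average intrusiveness of one SERP.

    page_scores[i] is the total intrusiveness of the result at
    position i + 1.  Weights fall linearly from 1.5 to 0.4 and are
    renormalized so that they average to 1, so a random distribution
    of intrusive pages leaves the average unchanged."""
    n = len(page_scores)
    if n == 0:
        return 0.0
    raw = [1.5 - i * 1.1 / max(n - 1, 1) for i in range(n)]
    weights = [w * n / sum(raw) for w in raw]  # force the mean weight to 1
    return sum(w * s for w, s in zip(weights, page_scores)) / n
```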
This analyzer looks at queries seeking popular songs or video content. These queries often yield pages overloaded with ads.
Note that web page assessment is based on the HTML code of the page, while the visibility of ads on a particular page may depend on the user's browser and cookies.
Adult Content Ads Analyzer
Probably the most intrusive ads on the web are those with adult content. The presence of such ads on a web page irritates users. Therefore, a search engine which finds fewer web pages with pornographic ads is better for the user.
This analyzer has two tabs and both of them look at queries seeking popular songs or video content (these queries are also used in the Intrusive Ads Analyzer). These queries often yield pages overloaded with ads.
The analyzer is based on an ads recognition technology which assigns a total intrusiveness score to each page. In the first tab (“% of sites with porno-ads”) each search engine gets a score which is equal to the percentage of the pages found containing explicitly pornographic ads. Each page contributes to the search engine's score if it has at least one advertisement containing pornographic pictures. The ads which are not literally pornographic but can be considered blatant or improper are not taken into account in this tab. The scores of the search engines also do not reflect the amount, size and position of the pornographic ads on the pages found. However in most cases if the porno ads are present they occur in a prominent position on a page.
The other tab (“improper + porn”) shows the mean intrusiveness of all adult advertisements on the page. Here the amount and size of the ads, as well as the position of the website in search results is taken into account. For each ad, the intrusiveness is multiplied by a coefficient reflecting the status of this ad with respect to pornography. The following coefficients are used:
0 - an ad is not pornographic or adult in any way
0.5 - an ad sometimes contains some improper pictures, but sometimes these pictures are absent
1 - an ad always contains improper pictures
10 - an ad always contains explicitly pornographic pictures
The score of a particular web page is a sum of the adjusted porno-intrusiveness scores of all individual ads on the page. The intrusiveness itself depends on the size and shininess of an ad, as well as on whether the ad covers part of the page's content and requires the user to make a click. You can read more about how intrusiveness score is calculated here. Although the adjusted porno-intrusiveness is on average similar to the intrusiveness of the ads, these two measures cannot be compared directly.
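A minimal sketch of the adjustment described above (the category labels and the data layout are illustrative assumptions; the coefficients are those listed):

```python
# Coefficients from the list above, keyed by illustrative category names.
PORN_COEFF = {"none": 0, "sometimes": 0.5, "always": 1, "explicit": 10}

def page_adult_score(ads):
    """Sum of intrusiveness multiplied by the porn coefficient over all
    ads on a page; each ad is an (intrusiveness, category) pair."""
    return sum(intr * PORN_COEFF[cat] for intr, cat in ads)
```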
Analyzer of Viruses and Malware
Search engines try to prevent their users from accessing infected or fraudulent web pages. How efficient is this security measure?
This analyzer shows the percentage of infected web pages in each search engine's results. Each web page is analyzed based on the sources of scripts and frames/iframes. In addition to viruses, the Russian web is overwhelmed with ads which are trying to fool people into downloading fraudulent files. These ads are marked as "fraudulent downloads" in the analyzer.
The number of threats on a particular web page is displayed, but it does not influence the overall score of a search engine: a page with 5 threats contributes as much as a page with 1 threat. Each web page is searched individually for viruses based on antivirus software reports, webmaster opinions and other sources.
By default, the analyzer takes into account all kinds of threats. The "% of sites with viruses" tab only takes into account the sites infected with viruses, the sites with fraudulent ads are excluded on that tab.
Some search engines warn their users that a particular page may threaten them. However, for the kind of queries used in this analyzer such warnings may not be sufficient. This analyzer inspects queries devoted to downloading, listening to or watching music and video clips, the same queries as in the Intrusive Ads Analyzer. These queries yield thousands of result pages with roughly the same content. In such a situation it would seem reasonable not just to warn the users about a possible threat, but to lower the infected websites in the search results so that an uninfected site is displayed in a high position instead. The data in this analyzer can be displayed based on all the threats (the "All" tab) or, alternatively, based only on the pages for which no warning is shown on the SERP (the "unmarked on SERP" tab).
Analyzers of Additional Search Features
The object of the user's immediate interest is the set of answers he receives to his query. Nevertheless, his overall impression may well be affected by smaller technical factors, not immediately discernible on the SERP.
These analyzers are meant to compare some additional parameters of the search engines' functioning that may be of considerable interest to the user. It is worth noting that comparative values matter more here than absolute numbers.
Analyzer of Loading Time
The users expect the search engines to give them results as fast as possible. Even small differences in the time it takes for the SERP to load may be a reason for some users to prefer one search engine over another.
This analyzer shows the time it takes for users in different Russian cities to get their search results. The SERP loading time data are collected at the same time as the Regional Search analyzer results, hence for the same cities and for the same queries.
Of course the loading time is affected by multiple factors, such as the quality and overall usage of the connection. However, if some search engine is consistently slower to respond than the others on a substantial number of queries every day, this tendency is worth analyzing.
This analyzer also estimates the size of SERPs. The data on SERP size can be seen by clicking on the corresponding link below the graph. Just as most modern web browsers do, we download the compressed SERPs.
Analyzer of Indexation Time
How long does it take for the newly created pages to start showing up in search results?
We selected about two thousand websites from the Russian Internet. All of these websites inform the search engines about the updates using a special sitemap file (sitemap.xml). All of these websites are generally indexed by the search engines: this is confirmed by the fact that the random pages from these websites are almost always found among the first 10 hits in response to the queries identical to the titles of the relevant pages (the text within <title>). So how long does it take for the newly created pages to start appearing among top 10 results for corresponding queries?
We scan the sitemap files for new pages every day (each website may contribute up to 3 pages per day). Search engines are of course able to do the same - after all this is the purpose of the sitemap files.
For every new page, we record the title and the URL. Within a month from the date when the new page was detected, we check if this page is found in top 10 in response to two types of queries:
a) a query consisting of the title text, with the search restricted to the relevant website;
b) a regular text query using the title.
The first type of query assesses the speed of indexation; the second assesses the time it takes for the new pages to become visible in regular search results.
The score in this analyzer is calculated as the mean visibility of the new pages within the last month. We also report the data for smaller periods.
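A sketch of the visibility calculation, under the assumption that a page counts as visible when its URL appears in the top 10 for its title query:

```python
def visibility_share(results_by_page, depth=10):
    """Share (in %) of new pages whose URL appeared among the first
    `depth` results for the corresponding title query.

    results_by_page maps a new page's URL to the ordered list of
    result URLs returned for its title query."""
    if not results_by_page:
        return 0.0
    hits = sum(url in results[:depth]
               for url, results in results_by_page.items())
    return 100.0 * hits / len(results_by_page)
```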
Interestingly, there seem to be pages which appear in the search results in the first couple of days after they were detected but disappear by the end of the first week.
Typo Resistance Analyzer
Humans are not machines; they make mistakes. This includes mistakes made while typing in a search query: a neighboring key pressed by accident ("quety" instead of "query"), a doubled or missed character ("queery" or "qury"), or, after all, a word typed 'by ear' without knowing the correct spelling ("yandax" instead of "yandex").
In this case, the search engine can adhere to one of the following strategies:
1) no processing: search with exact spelling only
2) recognize the typo, but still search for the entered query with an additional hint: "perhaps you were looking for [correct spelling]?"
3) recognize the typo and search for the correct spelling immediately
Depending on the chosen strategy, the user either remains unaware of being mistaken, or notices it and makes an extra click (the choice is left to the user), or gets the correct results without ever noticing his own mistake.
This analyzer compares the search results of the "correct query" and several forms of its possible mistypings. The similarity of results to those of a "correct" query is evaluated.
Apart from deliberate typo correction, matches can arise in four cases:
1) accidentally
2) the page contains both the correct and mistyped spelling
3) incorrect reaction of the engine's morphology (e.g., the unknown word "mushroomz" which is a typo of "mushrooms" is corrected to "mushroom")
4) promotion of the same websites both for correct and incorrect spelling of queries
All of these cases produce noise in this analyzer: an accidental match of results.
The similarity is evaluated in the same way as for the update analyzer but with a different set of queries.
The more matching results are registered, the higher the search engine's index in this analyzer. This determines the order of search engines in the informer of the analyzer.
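The exact similarity measure is borrowed from the update analyzer and is not spelled out here; as a simple illustrative stand-in, one can count how many of the correct query's top results reappear for the mistyped query:

```python
def result_overlap(correct_results, typo_results, depth=10):
    """Share (in %) of the correct query's top results that reappear
    among the top results for the mistyped query."""
    top = correct_results[:depth]
    if not top:
        return 0.0
    typo_top = set(typo_results[:depth])
    return 100.0 * sum(url in typo_top for url in top) / len(top)
```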
In the future, rotation of typo query sets drawn from a wider pool will be introduced.
Analyzer of the Synonymous Queries
One and the same query can be formulated in many different ways. For example, the queries like "how to deduce the address from the telephone number", "search for address by telephone number", and "find an address by telephone number" have the same meaning for the user, these are synonymous queries.
A number of different practices lead to synonymous queries. Among them:
- using well-known abbreviations: "weather in Saint Petersburg" and "weather in SPb";
- using transliteration – "gazeta ru website" and "газета ру website";
- varying word order: "repair automatic transmission" and "automatic transmission repair".
In most of these cases, the users expect to get information related to the meaning of the query, not to its particular formulation. We can take this to imply that the search engines should yield identical results in response to synonymous queries, despite the differences in query formulation.
The analyzer of the synonymous queries assesses the (presumably unnecessary) variability of responses to synonymous queries. The search engines are sorted by the amount of such variability.
All the queries used in this analyzer are real: they were acquired from Rambler's query statistics tool (http://adstat.rambler.ru/wrds/). Note also that we exclude queries containing grammatical or spelling errors from this analyzer.
Analyzers of Search Engine Mistakes
Search engine mistakes are the negative side of a search engine's ability to refine and understand the query (discussed in the Query Comprehension section). The queries in this group were sought out to demonstrate such mistakes in the most vivid and obvious manner. They might seem exotic, but every active search engine user knows that such mishaps occur quite regularly.
When your favourite search engine makes a fool of itself, your emotions may vary between laughter and utter rage, but in any case your loyalty to it suffers a crack. The user who is only mildly content when a search engine functions as it should feels downright indignant when he sees a blunder. The unavoidable irritation is a risk factor that should not be underrated by the developers.
Query Substitution Analyzer
Query substitution analyzer is the first in the new series of analyzers demonstrating mistakes in the functioning of search engines. The words chosen for queries are correct, existing, meaningful words, but as they are rather infrequent, the search engines often don't take up the challenge of finding them. Instead, they offer the searcher an output based on some other query which may more or less resemble the original one but, it would seem, is asked more often.
As a result, the query is not answered. What's more, the searcher gets mildly insulted, for some search engines spice their wrong results with a prompt remark like 'We have corrected the misprint', presuming the searcher does not know the correct spelling of the word he is looking for.
The analyzer calculates the percentage of wrong search results (not containing any form of the original query). Latin transliteration is counted as a correct result whenever it is possible.
Two additional parameters are counted: 1) the percentage of automatic query substitution (the search engine treats the query as if it were a misprint, which is not the case); 2) the percentage of wrong suggestions, like 'you were probably looking for...'. Suggestions are a less crude form of interference with the searcher's wishes. Still, when someone suspects that your search for 'dyery' was probably made under the spell of insanity and your real purpose was to find a 'diary', it is bound to drive you nuts.
Far be it from us to claim that all query corrections and suggestions are an absolute evil. There are many cases when they really serve their purpose (as you can see in our Typo Resistance Analyzer). But search developers have to be extremely careful when applying such strong measures, for their misuse might harm the search engine's image.
Next one in the series of Search Engine Mistakes is the Phrase Substitution Analyzer. As also in the Query Substitution Analyzer, the queries here are existing and meaningful, but infrequent language units.They are just longer units - phrases instead of words. So, one would think, they must be found more easily, since the meaning of the second word specifies the meaning of the first, rare one. In this case any replacement or even suggest seems absolutely senseless. It will prevent the user from finding what he is looking for and possibly irritate him. That's why we opted for checking this parameter of search engine functioning.
The analyzer calculates the percentage of 'bad' snippets, i.e. such snippets as do not contain the words of the query, because often it is a sign that the query words have been replaced. In some cases it has proved useful to check the snippets for some part of the query only. It was done when the perseverance on both words could distort the control procedure. Synonyms and transliterations are counted as correct responses, whenever possible.
Two additional metrics are the percentage of explicit query replacement and the percentage of replacement with suggest.
Names Search Analyzer
Trivial as it seems to look for some information about a person by entering his or her name, it often proves tricky: instead of finding the person needed, we stumble upon his famous namesakes.
One of the most persistent search engine mistakes is to make a mixture of names and surnames (like splitting 'Richard Shakespeare' into 'Richard III' and 'William Shakespeare'), so the queries in the analyzer were chosen with a view to making such results more probable. We only used the names of some real people we found on the web, but their surnames either coincided with those of some hot media personalities, or were in some other ways non-standard and difficult for a search engine to find.
The analyzer automatically divides the names it finds in the snippets and titles of the first output page into hits and misses, according to whether the right person was found. Then we perform a manual check. It is necessary for disentangling several complicated situations, like for instance, when the name and the surname are glued together at random (which would look strange in English, but can easily happen in Russian with its freer word-order). The results with name initials are also checked 'manually', so that the context might help to decide upon the approrpiate answer.
The resulting number constitutes the average ratio of correctly found pages to the overall number of pages.
Unbreakable Queries Analyzer
Search queries most often consist not of a single word, but of word chains or word phrases. Sometimes the relation between the words in such a chain is loose enough, but there are cases when the phrase is actually indivisible, expressing a single notion through two or more words, e.g. 'bad bishop' (a chess term that clearly shouldn't be echoed by 'America's worst bishops' in the output). In such cases it is the task of the search engine to preserve the wholeness of the phrase, at least in the top results. Otherwise, the meaning of the query gets distorted.
The unbreakability analyzer continues the series of search engine mistakes analyzers. It measures the percentage of cases when a search engine does not destroy the meaning of an unbreakable query by separating the words. For every query we check the presence of the searched phrase in the output. In a minority of cases the phrase can be split by a very limited amount of words (like 'Black Volta' that can appear in 'Black and White Volta' without any change in the meaning). Such cases are counted as positive results. Also when the searched words stick together accidentally, they are counted positive, if there is no punctuation mark between them.
As usual, the average number of positive results for all the queries appears as the search engine's overall result in the informer.
Grammar Analyzer
The analyzer of grammatical misinterpretation is of a certain linguistic elegance. It deals with such cases when a specific word-form belongs to more than one paradigm, like 'miss' in 'Miss Juliet' vs. 'I miss you terribly'. Naturally, such situations are more intricate in languages with a rich inflectional morphology, like Russian, where a suffix or an ending increase the chance of there being a homonym to the word in question.
The task that lies on search engines is then to choose the more appropriate paradigm / the more appropriate meaning of the two and to adjust the output accordingly. The instrument of the correct choice is the context, however short (even in two-word queries the second word almost invariably seems to narrow the field of possible meanings for the first one).
We have to admit that from this point of view the functioning of the investigated search engines has improved impressively over the last few years. Not so long ago the paradigms got confused so grossly, that in order to evaluate the output you only had to cross off all the wrong word-forms. Nowadays, the search engine tend not to deviate too much from the given word-form, so there is no choice but to base the checking procedure on the context. It also means the queries should become more sophisticated, or rare, or both. However, all the queries we use are quite realistic ones, meaning the search engines should actually get concerned with the results of the analyzer.
It is to be noted that the result of the analyzer, despite it belonging to the Mistakes group, shows the percentage of positive answers among the search engines' output.
Disturbing Content Analyzers
The search engine might work as good as it can, there still are these small annoying things that can easily damp the user's good spirits and significantly shatter his loyalty to a specific SE. Here belongs, e.g., the danger of contracting a computer virus, the presence of irritating and obtrusive ads etc.
Whereas search engines cannot control the amount of advertisements or dangerous scripts on websites, the concentration of objectionable content in the output is totally in their power. So, if the websites in the output abound with annoying factors, the search engine would only benefit from ranking such sites far below high-quality, safe ones.
Most analyzers of this group make use of specific techniques developed by "Ashmanov & Partners" for detecting ads, pornography, computer viruses, etc. To make the results more lucid, we selected marker queries for which the probability of undesirable results is higher than usual.
Analyzer of Search Spam
At "Ashmanov and Partners" we study the phenomenon of search spam – the methods and technologies reducing the quality of search results and interfering with the operation of search engines.
Search spam is text, a URL, a technology, program code or another web element created by the webmaster for the sole purpose of promoting the site in search engines' results, rather than for fast and reliable search based on complete and authentic information.
The experts check the Top 10 results of search queries on a regular basis, marking the sites which, in their opinion, contain elements of search spam. The collected data are entered into the informer, which shows the percentage of sites marked as spam among all the sites that appeared in the Top 10 for the analyzed queries.
The source of information on the spam status of a given URL is the data of the anti-spam lab of the company "Ashmanov and Partners". The following categories of search spam are used:
* doorway – definite spam: doorways, leading the user to other pages,
* spamcatalog – definite spam: spammer catalogues,
* spamcontent – definite spam: spammers' stolen content,
* pseudosite – definite spam: site disguised as corporate (pseudo-company),
* catalog – catalogues,
* board – bulletin boards,
* domainsale – domains for sale,
* secondary – secondary, stolen content,
* partner – any partner programs,
* linksite – link support site,
* spamforum – forum containing spam,
* techspam – technical spam,
* searchres – search results,
* cj – circular jerk.
The aggregate indicator is the share of spam sites in the search results: the best search engine has the lowest value, and this determines the order of the search engines in the analyzer's informer.
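The aggregate indicator described above can be sketched as follows. This is a minimal illustration only: the query results and spam labels are hypothetical placeholders, while the real labels come from the anti-spam lab's data.

```python
def spam_share(serps, spam_labels):
    """Share of spam sites in the Top 10 results, as a percentage.

    serps: dict mapping a query to its list of top-10 URLs
    spam_labels: dict mapping a URL to the set of spam categories assigned to it
                 (a URL absent from the dict, or with an empty set, is not spam)
    """
    total = marked = 0
    for urls in serps.values():
        for url in urls:
            total += 1
            if spam_labels.get(url):   # any spam category counts toward the share
                marked += 1
    return 100.0 * marked / total if total else 0.0
```

For example, if one query returned four results and one of them is labelled as a doorway, the indicator is 25%.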
Pornography Analyzer
This analyzer collects search results for ambiguous queries which may be interpreted as targeting a certain category of pornography, but also admit other interpretations. No queries which unambiguously indicate that the user is searching for porn are included.
For instance, the query [stockings] could come from a user looking for a stockings shop or for the corresponding category of pornography.
When search engines include pornographic results in responses to such queries, they run the risk of showing these results to users (potentially including minors) who were not looking for adult content.
We do not claim that it is «bad» or «immoral» for a search engine to return pornographic pages in response to unambiguously pornographic queries. However, the appearance of adult results in responses to «regular» queries is, in our opinion, a drawback.
We make use of the 'Semantic Mirror' technology developed at 'Ashmanov and Partners' in order to detect adult content.
For every search engine, we calculate the percentage of pages with adult content among the top 10 results.
Analyzer of the SafeSearch Quality
SafeSearch is a filter that is supposed to block the «adult only» websites from appearing in the search results. Nowadays most search engines allow their users to filter the results in this way. But does SafeSearch do its job well enough?
This analyzer collects search results for ambiguous queries which may be interpreted as targeting a certain category of pornography, but also admit other interpretations. The question is whether SafeSearch effectively filters out the adult responses in such cases.
The set of queries in this analyzer is the same as in the analyzer of the presence of adult sites in search results. We submit these queries to the search engines with the SafeSearch option turned on wherever possible. In this case, the ideal results should contain no adult pages at all, nor any results with obscene words. If adult pages still show up in the search results, the search engine is not keeping its promise to protect the user with SafeSearch.
We make use of the Semantic Mirror technology developed at 'Ashmanov and Partners' in order to detect pornography. For every search engine, we calculate the percentage of pages with adult content among the top 10 results.
Obscene Language Analyzer
The analyzer checks the presence and quantity of obscene words on the pages found by the search engines in response to neutral queries (those that do not contain obscenities and do not suppose the response to contain them). By default the measurements are made with "Safe search" on, meaning the result should normally be zero; measurements with ordinary search options are also available.
There are two indicators in the analyzer. The simple indicator counts the pages, found by the search engine, that contain at least one obscene word in its regular or masked form (since the law makes no distinction, provided the word is easily recognized).
The default indicator is more informative: it shows the quantity of the obscene words in the pages found. Each word, shown clearly, is worth 3 points, while a masked one is worth 1. The points are summed for all the pages found to give the result of the search engine for a specific query. The mean value for all queries is then calculated to form the analyzer indicator for a given search engine.
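The point system described above can be sketched as follows. This is a minimal illustration of the arithmetic only; the per-page word counts are hypothetical, and the actual detection of obscene words is performed by 'Semantic Mirror'.

```python
def page_score(clear_hits, masked_hits):
    """Points for one page: 3 per clearly shown obscene word, 1 per masked one."""
    return 3 * clear_hits + 1 * masked_hits

def engine_indicator(per_query_pages):
    """Mean per-query score for a search engine.

    per_query_pages: one entry per query, each a list of (clear, masked)
    word counts for the pages found in response to that query.
    """
    query_totals = [sum(page_score(c, m) for c, m in pages)
                    for pages in per_query_pages]
    return sum(query_totals) / len(query_totals)
```

So a query whose results contain one clearly shown word on one page and two masked words on another scores 3 + 2 = 5 points.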
The recognition of obscene words is performed by the automatic service "Semantic Mirror".
Intrusive Ads Analyzer
Web pages with similar content may differ in the amount of ads. A search engine that favors pages with less intrusive ads is thus better for the user.
The analyzer is based on an ad-recognition technology that assesses the scripts, iframes and other code on the page. Each page is assigned a total intrusiveness score: the intrusiveness of a particular web page is the sum of the intrusiveness scores of all individual ads on this page. For each SERP a weighted average of the page scores is taken, and the average intrusiveness over all SERPs is the score of a search engine in this analyzer.
The intrusiveness of an individual ad is calculated as follows:
- context ads and small simple banners receive an intrusiveness of 1;
- teasers and bigger shiny banners receive 3;
- some ads have to be clicked on in order to be closed. These include clickunders (separate windows which open upon a click anywhere on a page) and scrolling banners which cover up part of a page. Such ads receive an intrusiveness of 9;
- finally, some particularly nasty ads have to be clicked more than once in order to be closed. These receive an intrusiveness of 18.
For each web page found, the intrusiveness is adjusted according to the position of the page in the SERP. The weight of each result reflects the probability that the link will be clicked, and the weights of all positions average to 1 (position 1 has a weight of 1.5 and position 10 a weight of 0.4). Thus, if the sites with intrusive ads are randomly distributed along the SERP, the average intrusiveness is not affected by the positional weighting.
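The positional weighting can be sketched as follows. The text only gives the endpoint weights (1.5 and 0.4) and the average-to-1 property, so the intermediate weights below are an assumption: a linear ramp renormalized so that the ten weights average exactly to 1.

```python
# Assumed weights: linear from 1.5 (position 1) down to 0.4 (position 10),
# renormalized so that the average weight is exactly 1.
_raw = [1.5 - (1.5 - 0.4) * i / 9 for i in range(10)]
WEIGHTS = [w * len(_raw) / sum(_raw) for w in _raw]

def serp_intrusiveness(page_scores):
    """Weighted average intrusiveness of one SERP.

    page_scores: intrusiveness of each of the 10 results, in SERP order.
    """
    return sum(w * s for w, s in zip(WEIGHTS, page_scores)) / len(page_scores)
```

Because the weights average to 1, a SERP whose pages all have the same intrusiveness keeps that value unchanged, which is exactly the "randomly distributed" property noted above.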
This analyzer looks at queries seeking popular songs or video content. These queries often yield pages overloaded with ads.
Note that web page assessment is based on the HTML code of the page, while the visibility of ads on a particular page may depend on the user's browser and cookies.
Adult Content Ads Analyzer
Probably the most intrusive ads on the web are those with adult content. The presence of such ads on a web page irritates users; therefore a search engine which finds fewer web pages with pornographic ads is better for the user.
This analyzer has two tabs and both of them look at queries seeking popular songs or video content (these queries are also used in the Intrusive Ads Analyzer). These queries often yield pages overloaded with ads.
The analyzer is based on an ad-recognition technology which assigns a total intrusiveness score to each page. In the first tab (“% of sites with porno-ads”) each search engine gets a score equal to the percentage of the pages found that contain explicitly pornographic ads. A page contributes to the search engine's score if it has at least one advertisement containing pornographic pictures. Ads which are not literally pornographic but can be considered blatant or improper are not taken into account in this tab. The scores also do not reflect the amount, size or position of the pornographic ads on the pages found; however, in most cases, if porn ads are present, they occupy a prominent position on the page.
The other tab (“improper + porn”) shows the mean intrusiveness of all adult advertisements on the page. Here the amount and size of the ads, as well as the position of the website in the search results, are taken into account. For each ad, the intrusiveness is multiplied by a coefficient reflecting the status of this ad with respect to pornography. The following coefficients are used:
0 - an ad is not pornographic or adult in any way
0.5 - an ad sometimes contains some improper pictures, but sometimes these pictures are absent
1 - an ad always contains improper pictures
10 - an ad always contains explicitly pornographic pictures
The score of a particular web page is the sum of the adjusted porno-intrusiveness scores of all individual ads on the page. The intrusiveness itself depends on the size and shininess of an ad, as well as on whether the ad covers part of the page's content and requires the user to make a click (see the Intrusive Ads Analyzer above). Although the adjusted porno-intrusiveness is on average similar to the plain intrusiveness of the ads, the two measures cannot be compared directly.
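The coefficients above combine with the intrusiveness scores as follows. This is a sketch of the arithmetic only; the ad data in the example are hypothetical, and the status labels are shorthand for the four categories listed in the text.

```python
# Coefficients from the list above, keyed by an ad's pornography status.
PORN_COEF = {
    "none": 0.0,                # not pornographic or adult in any way
    "sometimes_improper": 0.5,  # improper pictures appear only sometimes
    "improper": 1.0,            # always contains improper pictures
    "porn": 10.0,               # always contains explicitly pornographic pictures
}

def page_porn_score(ads):
    """Adjusted porno-intrusiveness of a page.

    ads: list of (intrusiveness, status) pairs, one per ad on the page.
    """
    return sum(intrusiveness * PORN_COEF[status] for intrusiveness, status in ads)
```

For instance, a page with a teaser that always shows improper pictures (3 x 1), a small pornographic banner (1 x 10) and a large but fully innocent ad (9 x 0) scores 13.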
Analyzer of Viruses and Malware
Search engines try to prevent their users from accessing infected or fraudulent web pages. How efficient is this security measure?
This analyzer shows the percentage of infected web pages in each search engine's results. Each web page is analyzed based on the sources of scripts and frames/iframes. In addition to viruses, the Russian web is overwhelmed with ads which are trying to fool people into downloading fraudulent files. These ads are marked as "fraudulent downloads" in the analyzer.
The number of threats on a particular web page is displayed, but it does not influence the overall score of a search engine: a page with 5 threats contributes as much as a page with 1 threat. Each web page is checked individually for viruses based on antivirus software reports, webmaster opinions and other sources.
By default, the analyzer takes into account all kinds of threats. The "% of sites with viruses" tab only takes into account the sites infected with viruses, the sites with fraudulent ads are excluded on that tab.
Some search engines warn their users that a particular page may threaten them. However, for the kind of queries used in this analyzer such warnings may not be sufficient. This analyzer inspects queries devoted to downloading, listening to or watching music and video clips - the same queries as in the Intrusive Ads Analyzer. These queries yield thousands of result pages with roughly the same content. In such a situation it would seem reasonable not just to warn the users about a possible threat, but to demote the infected websites in the search results, so that an uninfected site is displayed in a high position instead. The data in this analyzer can be displayed based on all the threats (the "All" tab) or, alternatively, based only on the pages for which no warning is shown on the SERP (the "unmarked on SERP" tab).
Analyzers of Additional Search Features
The object of the user's immediate interest is the set of answers he receives to his query. Nevertheless, his overall impression may also be affected by smaller technical factors, not immediately discernible on the SERP.
These analyzers are meant to compare some additional parameters of the search engines' functioning that may be of considerable interest to the user. It is worth noting that comparative values matter more here than the absolute numbers.
Analyzer of Loading Time
Users expect the search engines to give them results as fast as possible. Even small differences in the time it takes for the SERP to load may be a reason for a user to prefer one search engine over another.
This analyzer shows the time it takes for users in different Russian cities to get their search results. The SERP loading time data are collected at the same time as the Regional Search analyzer results, hence for the same cities and for the same queries.
Of course the loading time is affected by multiple factors, such as the quality and overall load of the connection. However, if some search engine is slower to respond than the others on a substantial number of queries every day, this is a tendency worth analyzing.
This analyzer also estimates the size of the SERPs. The data on SERP size can be seen by clicking on the corresponding link below the graph. Just like most modern web browsers, we download the SERPs compressed.
Analyzer of Indexation Time
How long does it take for the newly created pages to start showing up in search results?
We selected about two thousand websites from the Russian Internet. All of these websites inform the search engines about updates using a special sitemap file (sitemap.xml). All of these websites are generally indexed by the search engines: this is confirmed by the fact that random pages from these websites are almost always found among the first 10 hits in response to queries identical to the titles of the relevant pages (the text within <title>). So how long does it take for newly created pages to start appearing among the top 10 results for the corresponding queries?
We scan the sitemap files for new pages every day (each website may contribute up to 3 pages per day). Search engines are of course able to do the same - after all this is the purpose of the sitemap files.
For every new page, we record the title and the URL. Within a month from the date when the new page was detected, we check if this page is found in top 10 in response to two types of queries:
a) a query which consists of the text of the title and restricts the search to the relevant website;
b) a regular text query using the title.
The first type of query assesses the speed of indexation; the second assesses the time it takes for the new pages to become visible in ordinary search results.
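The two query types can be sketched as follows. The `site:` operator syntax is an assumption about the engines under test, and the title and domain are hypothetical placeholders.

```python
def make_queries(title, site):
    """Build the two probe queries for one newly detected page."""
    return {
        "restricted": f"{title} site:{site}",  # title query limited to the source website
        "regular": title,                      # plain title query against the whole index
    }
```

For each new page, both queries are submitted daily for a month, and the analyzer records whether the page appears in the top 10 for each of them.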
The score in this analyzer is calculated as the mean visibility of the new pages within the last month. We also report the data for smaller periods.
Interestingly, there seem to be pages which appear in the search results in the first couple of days after they were detected but disappear by the end of the first week.
- using well-known abbreviations: "weather in Saint Petersburg" and "weather in SPb";
- using transliteration – "gazeta ru website" and "газета ру website";
- varying word order - "repair automatic transmission" and "automatic transmission repair".
In most of these cases, the users expect to get information related to the meaning of the query, not to its particular formulation. We can take this to imply that the search engines should yield identical results in response to synonymous queries, despite the differences in query formulation.
The analyzer of the synonymous queries assesses the (presumably unnecessary) variability of responses to synonymous queries. The search engines are sorted by the amount of such variability.
All the queries used in this analyzer are real; they were acquired from Rambler's query statistics tool (http://adstat.rambler.ru/wrds/). Note also that we exclude queries containing grammatical or spelling errors from this analyzer.
Analyzers of Search Engine Mistakes
Search engine mistakes are the negative side of a search engine's ability to refine and understand the query (discussed in the Query Comprehension section). The queries in this group were sought out to demonstrate such mistakes in the most vivid and obvious manner. They might seem exotic, but every active search engine user knows that such mishaps tend to occur quite regularly.
When your favourite search engine makes a fool of itself, your emotions may vary between laughter and utter rage, but in any case your loyalty to it suffers a crack. The user who is only mildly content when a search engine functions as it should feels downright indignant when he sees a blunder. The resulting irritation is a risk factor that should not be underrated by the developers.
Query Substitution Analyzer
The Query Substitution Analyzer is the first in a new series of analyzers demonstrating mistakes in the functioning of search engines. The words chosen for the queries are correct, existing, meaningful words, but as they are rather infrequent, the search engines often do not take up the challenge of finding them. Instead, they offer the searcher an output based on some other query which may more or less resemble the original one but, it would seem, is asked more often.
As a result, the query is not answered. What is more, the searcher gets mildly insulted, for some search engines spice their wrong results with a prompt remark like 'We have corrected the misprint', presuming the searcher does not know the correct spelling of the word he is looking for.
The analyzer calculates the percentage of wrong search results (those not containing any form of the original query word). Latin transliteration is counted as a correct result whenever applicable.
Two additional parameters are counted: 1) the percentage of automatic query substitution (the search engine actually treats the query as a misprint, which is not the case); 2) the percentage of wrong suggests, like 'you were probably looking for...'. Suggests are a less rude form of interference with the searcher's wishes. Still, when someone suspects that your search for 'dyery' was probably made under the spell of insanity and your real purpose was to find a 'diary', it is bound to drive you nuts.
Far be it from us to claim that all query corrections and suggests are an absolute evil. There are many cases when they really serve their purpose (as you can see in our Typo Resistance Analyzer). But the search developers have to be extremely careful when applying such strong measures, for their misuse might harm the search engine's image.
Phrasal Queries Substitution Analyzer
Next in the series of Search Engine Mistakes is the Phrase Substitution Analyzer. As in the Query Substitution Analyzer, the queries here are existing and meaningful, but infrequent, language units. They are just longer units - phrases instead of words. So, one would think, they should be found more easily, since the meaning of the second word specifies the meaning of the first, rare one. In this case any replacement, or even a suggest, seems absolutely senseless: it will prevent the user from finding what he is looking for and will probably irritate him. That is why we opted for checking this parameter of search engine functioning.
The analyzer calculates the percentage of 'bad' snippets, i.e. snippets that do not contain the words of the query, since this is often a sign that the query words have been replaced. In some cases it has proved useful to check the snippets for only part of the query; this was done when insistence on both words could distort the control procedure. Synonyms and transliterations are counted as correct responses whenever possible.
Two additional metrics are the percentage of explicit query replacement and the percentage of replacement with suggest.
Names Search Analyzer
Trivial as it seems to look for some information about a person by entering his or her name, it often proves tricky: instead of finding the person needed, we stumble upon his famous namesakes.
One of the most persistent search engine mistakes is to mix up names and surnames (like splitting 'Richard Shakespeare' into 'Richard III' and 'William Shakespeare'), so the queries in the analyzer were chosen with a view to making such results more probable. We only used the names of real people we found on the web, but their surnames either coincided with those of some hot media personalities or were in some other way non-standard and difficult for a search engine to find.
The analyzer automatically divides the names it finds in the snippets and titles of the first output page into hits and misses, according to whether the right person was found. Then we perform a manual check. It is necessary for disentangling several complicated situations - for instance, when the name and the surname are glued together at random (which would look strange in English but can easily happen in Russian with its freer word order). The results with name initials are also checked manually, so that the context might help to decide upon the appropriate answer.
The resulting number constitutes the average ratio of correctly found pages to the overall number of pages.
Unbreakable Queries Analyzer
Search queries most often consist not of a single word, but of word chains or word phrases. Sometimes the relation between the words in such a chain is loose enough, but there are cases when the phrase is actually indivisible, expressing a single notion through two or more words, e.g. 'bad bishop' (a chess term that clearly shouldn't be echoed by 'America's worst bishops' in the output). In such cases it is the task of the search engine to preserve the wholeness of the phrase, at least in the top results. Otherwise, the meaning of the query gets distorted.
The unbreakability analyzer continues the series of search engine mistakes analyzers. It measures the percentage of cases when a search engine does not destroy the meaning of an unbreakable query by separating its words. For every query we check the presence of the searched phrase in the output. In a minority of cases the phrase can be split by a very limited number of words (like 'Black Volta', which can appear in 'Black and White Volta' without any change in meaning); such cases are counted as positive results. When the searched words stick together accidentally, they are also counted as positive, provided there is no punctuation mark between them.
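The check described above can be sketched as a simple pattern match: the phrase counts as preserved if its words appear in order, separated by at most a couple of intervening plain words and no punctuation. The two-word gap limit is an assumption; the text only says the allowed split is "very limited".

```python
import re

def phrase_preserved(query, snippet, max_gap=2):
    """True if the query phrase survives in the snippet, allowing at most
    max_gap intervening words and no punctuation between the query words."""
    words = [re.escape(w) for w in query.lower().split()]
    gap = r"(?:\w+\s+){0,%d}" % max_gap          # up to max_gap plain words
    pattern = r"\b" + (r"\s+" + gap).join(words) + r"\b"
    return re.search(pattern, snippet.lower()) is not None
```

With this check, 'Black and White Volta' still counts as a hit for [black volta], while a snippet where a punctuation mark separates the words does not.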
As usual, the average number of positive results for all the queries appears as the search engine's overall result in the informer.
Grammar Analyzer
The analyzer of grammatical misinterpretation is of a certain linguistic elegance. It deals with such cases when a specific word-form belongs to more than one paradigm, like 'miss' in 'Miss Juliet' vs. 'I miss you terribly'. Naturally, such situations are more intricate in languages with a rich inflectional morphology, like Russian, where a suffix or an ending increase the chance of there being a homonym to the word in question.
The task that lies on search engines is then to choose the more appropriate paradigm / the more appropriate meaning of the two and to adjust the output accordingly. The instrument of the correct choice is the context, however short (even in two-word queries the second word almost invariably seems to narrow the field of possible meanings for the first one).
We have to admit that from this point of view the functioning of the investigated search engines has improved impressively over the last few years. Not so long ago the paradigms got confused so grossly, that in order to evaluate the output you only had to cross off all the wrong word-forms. Nowadays, the search engine tend not to deviate too much from the given word-form, so there is no choice but to base the checking procedure on the context. It also means the queries should become more sophisticated, or rare, or both. However, all the queries we use are quite realistic ones, meaning the search engines should actually get concerned with the results of the analyzer.
It is to be noted that the result of the analyzer, despite it belonging to the Mistakes group, shows the percentage of positive answers among the search engines' output.
Disturbing Content Analyzers
The search engine might work as good as it can, there still are these small annoying things that can easily damp the user's good spirits and significantly shatter his loyalty to a specific SE. Here belongs, e.g., the danger of contracting a computer virus, the presence of irritating and obtrusive ads etc.
Whereas the amount of advertisments or dangerous scripts on websites is not for the browsers to control, the concentration of objectionable content in the output is totally in their power. So, if the websites in the output abound with annoying factors, the search engine would only benefit from ranking such sites far below high-quality safe ones.
Most analyzers of this group make use of specific techniques developed by "Ashmanov & Partners" for detecting ads, pornography, computer viruses etc. To make the results more lucid, we sought out the markers, so that the probability of undesirable results was higher than usual.
Analyzer of Search Spam
At "Ashmanov and Partners" we study the phenomenon of search spam – the methods and technologies reducing the quality of search results and interfering with the operation of search engines.
Search spam is a text, URL, technology, program code or other web elements created by the web-master for the sole purpose of promoting the site in search engines' results, and not for a fast and reliable search based on complete and authentic information.
The experts check Top 10 results of search queries on a regular basis, marking the sites which, in their opinion, contain elements of search spam. The collected data is entered into the informer. It shows the percentage of sites marked as spam in the overall number of sites that appeared in Top 10 of analyzed queries.
The source of information on the spam status of a given URL is the data of the anti-spam lab of the company "Ashmanov and Partners". The following categories of search spam are used:
* doorway – definite spam: doorways, leading the user to other pages,
* spamcatalog – definite spam: spammer catalogues,
* spamcontent – definite spam: spammers' stolen content,
* pseudosite – definite spam: site disguised as corporate (pseudo-company),
* catalog – catalogues,
* board – bulletin boards,
* domainsale – domains for sale,
* secondary – secondary, stolen content,
* partner – any partner programs,
* linksite – link support site,
* spamforum – forum containing spam,
* techspam – technical spam,
* searchres – search results
* cj – circular jerk
An aggregate indicator is the share of spam sites in the search results. The best search engine has the lowest indicator. This determines the order of search engines in the informer of the analyzer.
Pornography Analyzer
This analyzer collects search results for ambiguous queries which may be interpreted as targeting a certain category of pornography, but also admit other interpretations. No queries which unambiguously indicate that the user is searching for porn are included.
For instance, the query [stockings] could come from a user looking for a stockings shop or for the corresponding category of pornography.
When search engines include pornographic results in responses to such queries, they run the risk of showing these results to the users (potentially including minors) which were not looking for adult content.
We do not claim that it is «bad» or «immoral» for a search engine to return pornographic pages in response to unambiguously pornographic queries. However the appearance of adult results in responses to «regular» queries is a drawback in our opinion.
We make use of the 'Semantic Mirror' technology developped at 'Ashmanov and Partners' in order to detect adult content.
For every search engine, we calculate the percentage of pages with adult content among the top 10 results.
Analyzer of the SafeSearch Quality
SafeSearch is a filter that is supposed to block the «adult only» websites from appearing in the search results. Nowadays most search engines allow their users to filter the results in this way. But does SafeSearch do its job well enough?
This analyzer collects search results for ambiguous queries which may be interpreted as targeting a certain category of pornography, but also admit other interpretations. The question is whether SafeSearch effectively filters out the adult responses in such cases.
The set of queries in this analyzer is the same as in the analyzer of the presence of adult sites in search results. We submit these queries to the search engines with the SafeSearch option turned on wherever possible. In this case, the ideal results should not contain any adult pages at all, as well as no results with obscene words. If the adult pages still show up in the search results, the search engine is not keeping its promise to protect the user with SafeSearch.
We make use of the Semantic Mirror technology developped at 'Ashmanov and Partners' in order to detect pornography. For every search engine, we calculate the percentage of pages with adult content among the top 10 results.
Obscene Language Analyzer
The Analyzer checks the presence and the quantity of obscene words on the pages found by the search engines in response to neutral queries (those that do not contain and do not suppose the response to contain obscenities). The default value is with "Safe search" on, meaning normally the result should be zero, the measurements with ordinary search options are also available.
There are two indicators in the analyzer. The simple indicator calculates the number of pages, found by the Search Engine, that contain at least one obscene word in it's regular or masked form (as the law doesn't make a difference, given the word is easily recognized).
The default indicator is more informative: it shows the quantity of the obscene words in the pages found. Each word, shown clearly, is worth 3 points, while a masked one is worth 1. The points are summed for all the pages found to give the result of the search engine for a specific query. The mean value for all queries is then calculated to form the analyzer indicator for a given search engine.
The recognition of obscene words is performed by the automatic service "Semantic Mirror".
Intrusive Ads Analyzer
Web pages with similar content may differ in the amount of ads. A search engine that favors pages with less intrusive ads is thus better for the user.
The analyzer is based on an ads recognition technology assessing the scripts, iframes and other code on the page. Each page is assigned a total intrusiveness score. The intrusiveness of a particular web page is a sum of the intriusiveness of all individual ads on this page. For each SERP a weighted average of the intrusiveness scores is taken. The average intrusiveness of all SERP's is then the score of a search engine in this analyzer.
The intrusiveness of a particular web page is a sum of the intriusiveness scores of all individual ads on this page. The intrusiveness of an ad is calculated as follows:
- context ads and small simple banners receive intrusiveness of 1
- teasers and bigger shiny banners receive 3
- some ads have to be clicked on in order to be closed. These include clickunders (separate windows which open upon a click anywhere on a page) and scrolling banners which cover up a part of a page. These ads receive intrusiveness of 9.
- finally, some particularly nasty ads have to be clicked more than once in order to be closed. These receive intrusiveness of 18.
For each web page found, the intrusiveness is adjusted according to the position of the page in the SERP. The weight of each result reflects the probability that the link will be clicked, and the weights of all positions average to 1 (position 1 has the weight of 1.5 and position 10 has the weight of 0.4). Thus, if the sites with intrusive ads are randomly distributed along the SERP, the average intrusiveness won't be affected by the positional weight.
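The per-SERP score can be sketched as follows. The exact click-probability curve is not published; this sketch assumes a linear decay from 1.5 (position 1) to 0.4 (position 10), renormalized so the weights average to exactly 1 (which slightly shifts the endpoint values).

```python
# Hypothetical sketch of the positional weighting: the real click-probability
# weights are not published, so a linear decay from 1.5 to 0.4 is assumed
# here and renormalized so that the weights average to exactly 1.

AD_SCORES = {"context": 1, "teaser": 3, "clickunder": 9, "multi_click": 18}

def weights(n: int = 10):
    step = (1.5 - 0.4) / (n - 1)
    raw = [1.5 - i * step for i in range(n)]
    mean = sum(raw) / n
    return [w / mean for w in raw]  # weights now average to exactly 1

def serp_score(pages):
    """pages[i] lists the ad types on the result at position i + 1."""
    w = weights(len(pages))
    page_scores = [sum(AD_SCORES[ad] for ad in ads) for ads in pages]
    return sum(s * wi for s, wi in zip(page_scores, w)) / len(pages)

# Ten results, each carrying one context ad: because the weights average
# to 1, the weighted average equals the unweighted one (up to rounding).
print(serp_score([["context"]] * 10))
```

Note how this matches the property stated above: if intrusive sites are spread uniformly over the SERP, the positional weighting leaves the average unchanged.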
This analyzer looks at queries seeking popular songs or video content. These queries often yield pages overloaded with ads.
Note that web page assessment is based on the HTML code of the page, while the visibility of ads on a particular page may depend on the user's browser and cookies.
Adult Content Ads Analyzer
Probably the most intrusive ads on the web are those with adult content. Their presence on a web page irritates users, so a search engine that finds fewer web pages with pornographic ads is better for the user.
This analyzer has two tabs and both of them look at queries seeking popular songs or video content (these queries are also used in the Intrusive Ads Analyzer). These queries often yield pages overloaded with ads.
The analyzer is based on an ads recognition technology which assigns a total intrusiveness score to each page. In the first tab (“% of sites with porno-ads”) each search engine gets a score equal to the percentage of pages found containing explicitly pornographic ads. A page contributes to the search engine's score if it has at least one advertisement containing pornographic pictures. Ads which are not literally pornographic but can be considered blatant or improper are not taken into account in this tab. The search engines' scores also do not reflect the number, size and position of the pornographic ads on the pages found. However, in most cases, if porno ads are present at all, they occur in a prominent position on the page.
The other tab (“improper + porn”) shows the mean intrusiveness of all adult advertisements on the page. Here the amount and size of the ads, as well as the position of the website in search results is taken into account. For each ad, the intrusiveness is multiplied by a coefficient reflecting the status of this ad with respect to pornography. The following coefficients are used:
0 - an ad is not pornographic or adult in any way
0.5 - an ad sometimes contains some improper pictures, but sometimes these pictures are absent
1 - an ad always contains improper pictures
10 - an ad always contains explicitly pornographic pictures
The score of a particular web page is the sum of the adjusted porno-intrusiveness scores of all individual ads on the page. The intrusiveness itself depends on the size and shininess of an ad, as well as on whether the ad covers part of the page's content and requires the user to make a click. You can read more about how the intrusiveness score is calculated here. Although the adjusted porno-intrusiveness is on average similar to the plain intrusiveness of the ads, the two measures cannot be compared directly.
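The per-page adjustment reduces to a sum of products, sketched below with hypothetical ad data:

```python
# Sketch of the adjusted porno-intrusiveness of a page: each ad's
# intrusiveness is multiplied by its pornography coefficient, and the
# products are summed. The ads in the example are hypothetical.

PORN_COEFF = {
    "clean": 0,        # not pornographic or adult in any way
    "sometimes": 0.5,  # improper pictures appear only some of the time
    "always": 1,       # always contains improper pictures
    "porno": 10,       # always explicitly pornographic
}

def page_porn_score(ads):
    """`ads` is a list of (intrusiveness, porn_status) pairs for one page."""
    return sum(intr * PORN_COEFF[status] for intr, status in ads)

# A clean context ad (1), a sometimes-improper teaser (3), a porno teaser (3):
print(page_porn_score([(1, "clean"), (3, "sometimes"), (3, "porno")]))  # 31.5
```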
Analyzer of Viruses and Malware
Search engines try to prevent their users from accessing infected or fraudulent web pages. How efficient is this security measure?
This analyzer shows the percentage of infected web pages in each search engine's results. Each web page is analyzed based on the sources of scripts and frames/iframes. In addition to viruses, the Russian web is overwhelmed with ads which are trying to fool people into downloading fraudulent files. These ads are marked as "fraudulent downloads" in the analyzer.
The number of threats on a particular web page is displayed, but it does not influence the overall score of a search engine: a page with 5 threats contributes as much as a page with 1 threat. Each web page is checked individually for viruses based on antivirus software reports, webmaster opinions and other sources.
By default, the analyzer takes into account all kinds of threats. The "% of sites with viruses" tab only takes into account the sites infected with viruses, the sites with fraudulent ads are excluded on that tab.
Some search engines warn their users that a particular page may threaten them. However, for the kind of queries used in this analyzer such warnings may not be sufficient. This analyzer inspects queries devoted to downloading, listening to or watching music and video clips - the same queries as in the Intrusive Ads Analyzer. These queries yield thousands of result pages with roughly the same content. In such a situation it would seem reasonable not just to warn the users about a possible threat, but to lower the infected websites in the search results, so that an uninfected site is displayed in a high position instead. The data in this analyzer can be displayed based on all the threats (the "All" tab) or, alternatively, based only on the pages for which no warning is shown on the SERP (the "unmarked on SERP" tab).
Analyzers of Additional Search Features
The object of the user's immediate interest is the set of answers he receives to his query. Nevertheless, his overall impression may also be affected by smaller technical factors that are not immediately discernible on the SERP.
These analyzers compare some additional parameters of the search engines' functioning that may be of considerable interest to the user. It is worth noting that comparative values matter more here than absolute numbers.
Analyzer of Loading Time
Users expect the search engines to give them results as fast as possible. Even small differences in the time it takes for the SERP to load may be a reason for a user to prefer one search engine over another.
This analyzer shows the time it takes for users in different Russian cities to get their search results. The SERP loading time data are collected at the same time as the Regional Search analyzer results, hence for the same cities and for the same queries.
Of course the loading time is affected by multiple factors, such as the quality and overall load of the connection. However, if some search engine responds more slowly than the others on a substantial number of queries every day, that is a tendency worth analyzing.
This analyzer also estimates the size of the SERPs. The data on SERP size can be seen by clicking the corresponding link below the graph. Just as most modern web browsers do, we download the SERPs compressed.
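Compressed transfer matters because repetitive HTML markup compresses very well. A quick illustration with a stand-in page (not a real SERP):

```python
import gzip

# Like most modern browsers, the analyzer downloads SERPs compressed.
# A stand-in HTML page (not a real SERP) shows why: repetitive markup
# shrinks to a small fraction of its raw size under gzip.

html = ("<html><body>"
        + "<div class='result'><a href='#'>title</a> snippet text</div>" * 100
        + "</body></html>").encode("utf-8")

compressed = gzip.compress(html)
print(len(html), len(compressed))  # the compressed body is far smaller
```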
Analyzer of Indexation Time
How long does it take for the newly created pages to start showing up in search results?
We selected about two thousand websites from the Russian Internet. All of these websites inform the search engines about updates using a special sitemap file (sitemap.xml). All of them are generally indexed by the search engines: this is confirmed by the fact that random pages from these websites are almost always found among the first 10 hits in response to queries identical to the titles of the corresponding pages (the text within <title>). So how long does it take for newly created pages to start appearing among the top 10 results for the corresponding queries?
We scan the sitemap files for new pages every day (each website may contribute up to 3 pages per day). Search engines are of course able to do the same - after all this is the purpose of the sitemap files.
For every new page, we record the title and the URL. Within a month from the date when the new page was detected, we check if this page is found in top 10 in response to two types of queries:
a) a query which contains the text of the title and restricts the search to the relevant website;
b) a regular text query consisting of the title.
The first type of query assesses the time it takes for the new pages to appear in search results. The second type of query assesses the speed of indexation.
The score in this analyzer is calculated as the mean visibility of the new pages within the last month. We also report the data for smaller periods.
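The score reduces to a share of newly detected pages that are visible in the top 10, restricted to a chosen period. A minimal sketch, where the (detection date, found) data structure is an illustrative assumption:

```python
from datetime import date, timedelta

# Illustrative sketch of the score: the share of newly detected pages found
# in the top 10, restricted to pages detected within the chosen period.
# The (detection_date, found_in_top10) structure is an assumption.

def visibility(pages, days, today):
    cutoff = today - timedelta(days=days)
    recent = [found for detected, found in pages if detected >= cutoff]
    return sum(recent) / len(recent)

today = date(2014, 6, 30)
pages = [
    (date(2014, 6, 29), True),   # detected yesterday, already in top 10
    (date(2014, 6, 20), False),  # detected ten days ago, still not found
    (date(2014, 6, 5), True),
]
print(visibility(pages, 30, today))  # 2 of 3 pages visible over the month
print(visibility(pages, 7, today))   # 1 of 1 over the last week
```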
Interestingly, there seem to be pages which appear in the search results in the first couple of days after they were detected but disappear by the end of the first week.
Query Substitution Analyzer
The analyzer calculates the percentage of wrong search results (those not containing any form of the original query). A Latin transliteration is counted as a correct result whenever it is possible.
Two additional parameters counted are: 1) the percentage of automatic query substitutions (the search engine treats the query as a misprint, which it is not); 2) the percentage of wrong suggestions, like 'you were probably looking for...'. Suggestions are a less rude form of interference with the searcher's wishes. Still, when someone suspects that your search for 'dyery' was probably made under the spell of insanity and that your real purpose was to find a 'diary', it is bound to drive you nuts.
Far be it from us to claim that all query corrections and suggestions are an absolute evil. There are many cases when they really serve their purpose (as you can see in our Typo Resistance Analyzer). But search developers have to be extremely careful when applying such strong measures, for their misuse might harm the search engine's image.
Next in the series of Search Engine Mistakes is the Phrase Substitution Analyzer. As in the Query Substitution Analyzer, the queries here are existing and meaningful, but infrequent, language units. They are simply longer units: phrases instead of words. One would think they must be found more easily, since the meaning of the second word specifies the meaning of the first, rare one. In this case any replacement, or even a suggestion, seems absolutely senseless: it will prevent the user from finding what he is looking for and will probably irritate him. That is why we opted for checking this parameter of search engine functioning.
The analyzer calculates the percentage of 'bad' snippets, i.e. snippets that do not contain the words of the query, since this is often a sign that the query words have been replaced. In some cases it proved useful to check the snippets for only a part of the query; this was done when insisting on both words could distort the control procedure. Synonyms and transliterations are counted as correct responses whenever possible.
Two additional metrics are the percentage of explicit query replacements and the percentage of replacements via suggestions.
Names Search Analyzer
Trivial as it seems to look for information about a person by entering his or her name, it often proves tricky: instead of finding the person we need, we stumble upon his famous namesakes.
One of the most persistent search engine mistakes is to mix up names and surnames (like splitting 'Richard Shakespeare' into 'Richard III' and 'William Shakespeare'), so the queries in the analyzer were chosen with a view to making such results more probable. We only used the names of real people we found on the web, but their surnames either coincided with those of well-known media personalities or were in some other way non-standard and difficult for a search engine to find.
The analyzer automatically divides the names it finds in the snippets and titles of the first output page into hits and misses, according to whether the right person was found. Then we perform a manual check. It is necessary for disentangling several complicated situations, for instance when a name and a surname are glued together at random (which would look strange in English, but can easily happen in Russian with its freer word order). The results with name initials are also checked manually, so that the context can help to decide upon the appropriate answer.
The resulting number constitutes the average ratio of correctly found pages to the overall number of pages.
Unbreakable Queries Analyzer
Search queries most often consist not of a single word, but of chains of words, or phrases. Sometimes the relation between the words in such a chain is loose enough, but there are cases when the phrase is actually indivisible, expressing a single notion through two or more words, e.g. 'bad bishop' (a chess term that clearly shouldn't be echoed by 'America's worst bishops' in the output). In such cases it is the task of the search engine to preserve the wholeness of the phrase, at least in the top results. Otherwise, the meaning of the query gets distorted.
The unbreakability analyzer continues the series of search engine mistakes analyzers. It measures the percentage of cases when a search engine does not destroy the meaning of an unbreakable query by separating the words. For every query we check the presence of the searched phrase in the output. In a minority of cases the phrase can be split by a very limited number of words (like 'Black Volta', which can appear in 'Black and White Volta' without any change in meaning). Such cases are counted as positive results. When the searched words stick together accidentally, they are also counted as positive, provided there is no punctuation mark between them.
As usual, the average number of positive results for all the queries appears as the search engine's overall result in the informer.
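The check for a two-word unbreakable query might be sketched with a regular expression. The maximum gap of two intervening words is an illustrative assumption, not the analyzer's actual threshold:

```python
import re

# Sketch of the positive-result check for a two-word unbreakable query:
# the words must occur in order, separated only by whitespace and at most
# `max_gap` intervening words, with no punctuation between them.
# The gap limit of 2 is an illustrative assumption.

def phrase_preserved(query, snippet, max_gap=2):
    first, second = query.lower().split()
    pattern = (rf"\b{re.escape(first)}"
               rf"(?:\s+\w+){{0,{max_gap}}}"   # up to max_gap plain words
               rf"\s+{re.escape(second)}\b")
    return re.search(pattern, snippet.lower()) is not None

print(phrase_preserved("black volta", "the Black and White Volta rivers"))  # True
print(phrase_preserved("black volta", "wear black. Volta is a river"))      # False
```

A punctuation mark between the words breaks the match, mirroring the rule that accidentally adjacent words count only when no punctuation separates them.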
Grammar Analyzer
The analyzer of grammatical misinterpretation has a certain linguistic elegance. It deals with cases when a specific word-form belongs to more than one paradigm, like 'miss' in 'Miss Juliet' vs. 'I miss you terribly'. Naturally, such situations are more intricate in languages with rich inflectional morphology, like Russian, where a suffix or an ending increases the chance of there being a homonym of the word in question.
The search engines' task is then to choose the more appropriate paradigm, and thus the more appropriate meaning, of the two, and to adjust the output accordingly. The instrument of the correct choice is the context, however short: even in two-word queries the second word almost invariably narrows the field of possible meanings for the first one.
We have to admit that from this point of view the functioning of the investigated search engines has improved impressively over the last few years. Not so long ago the paradigms were confused so grossly that, in order to evaluate the output, you only had to cross off all the wrong word-forms. Nowadays, search engines tend not to deviate too much from the given word-form, so there is no choice but to base the checking procedure on the context. This also means the queries have to become more sophisticated, or rarer, or both. However, all the queries we use are quite realistic, which means the search engines should actually be concerned with the results of this analyzer.
It is to be noted that the result of this analyzer, although it belongs to the Mistakes group, shows the percentage of positive answers in the search engines' output.
Disturbing Content Analyzers
However well a search engine may work, there are still small annoying things that can easily damp the user's good spirits and significantly shatter his loyalty to a specific SE. These include, for example, the danger of contracting a computer virus and the presence of irritating, obtrusive ads.
Whereas the amount of advertisements or dangerous scripts on websites is not for the search engines to control, the concentration of objectionable content in the output is entirely in their power. So, if the websites in the output abound with annoying factors, the search engine would only benefit from ranking such sites far below high-quality, safe ones.
Most analyzers in this group make use of specific techniques developed by "Ashmanov & Partners" for detecting ads, pornography, computer viruses etc. To make the results more salient, we selected marker queries for which the probability of undesirable results is higher than usual.
Analyzer of Search Spam
At "Ashmanov and Partners" we study the phenomenon of search spam – the methods and technologies reducing the quality of search results and interfering with the operation of search engines.
Search spam is text, a URL, a technology, program code or any other web element created by the webmaster for the sole purpose of promoting the site in search engine results, rather than for a fast and reliable search based on complete and authentic information.
The experts check Top 10 results of search queries on a regular basis, marking the sites which, in their opinion, contain elements of search spam. The collected data is entered into the informer. It shows the percentage of sites marked as spam in the overall number of sites that appeared in Top 10 of analyzed queries.
The source of information on the spam status of a given URL is the data of the anti-spam lab of the company "Ashmanov and Partners". The following categories of search spam are used:
* doorway – definite spam: doorways, leading the user to other pages,
* spamcatalog – definite spam: spammer catalogues,
* spamcontent – definite spam: spammers' stolen content,
* pseudosite – definite spam: site disguised as corporate (pseudo-company),
* catalog – catalogues,
* board – bulletin boards,
* domainsale – domains for sale,
* secondary – secondary, stolen content,
* partner – any partner programs,
* linksite – link support site,
* spamforum – forum containing spam,
* techspam – technical spam,
* searchres – search results,
* cj – circular jerk.
An aggregate indicator is the share of spam sites in the search results. The best search engine has the lowest indicator. This determines the order of search engines in the informer of the analyzer.
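The aggregate indicator reduces to a simple ratio, sketched below; the labeled results are hypothetical examples:

```python
# Sketch of the aggregate indicator: the share of results carrying at least
# one spam label among all results seen in the Top 10 of the analyzed
# queries. The labels in the example are hypothetical.

SPAM_CATEGORIES = {
    "doorway", "spamcatalog", "spamcontent", "pseudosite", "catalog",
    "board", "domainsale", "secondary", "partner", "linksite",
    "spamforum", "techspam", "searchres", "cj",
}

def spam_share(top10s):
    """`top10s[q][i]` is the set of expert labels on result i of query q."""
    total = hits = 0
    for results in top10s:
        for labels in results:
            total += 1
            hits += bool(labels & SPAM_CATEGORIES)
    return hits / total

# One query, three inspected results: a doorway, a clean site, a bulletin board.
print(spam_share([[{"doorway"}, set(), {"board"}]]))  # 2 of 3 marked as spam
```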
Pornography Analyzer
This analyzer collects search results for ambiguous queries which may be interpreted as targeting a certain category of pornography, but also admit other interpretations. No queries which unambiguously indicate that the user is searching for porn are included.
For instance, the query [stockings] could come from a user looking for a stockings shop or for the corresponding category of pornography.
When search engines include pornographic results in responses to such queries, they run the risk of showing these results to users (potentially including minors) who were not looking for adult content.
We do not claim that it is «bad» or «immoral» for a search engine to return pornographic pages in response to unambiguously pornographic queries. However the appearance of adult results in responses to «regular» queries is a drawback in our opinion.
We make use of the 'Semantic Mirror' technology developed at 'Ashmanov and Partners' in order to detect adult content.
For every search engine, we calculate the percentage of pages with adult content among the top 10 results.
Analyzer of the SafeSearch Quality
SafeSearch is a filter that is supposed to block the «adult only» websites from appearing in the search results. Nowadays most search engines allow their users to filter the results in this way. But does SafeSearch do its job well enough?
This analyzer collects search results for ambiguous queries which may be interpreted as targeting a certain category of pornography, but also admit other interpretations. The question is whether SafeSearch effectively filters out the adult responses in such cases.
The set of queries in this analyzer is the same as in the analyzer of the presence of adult sites in search results. We submit these queries to the search engines with the SafeSearch option turned on wherever possible. In this case, the ideal results should not contain any adult pages at all, as well as no results with obscene words. If the adult pages still show up in the search results, the search engine is not keeping its promise to protect the user with SafeSearch.
We make use of the Semantic Mirror technology developed at 'Ashmanov and Partners' in order to detect pornography. For every search engine, we calculate the percentage of pages with adult content among the top 10 results.
Obscene Language Analyzer
The analyzer checks the presence and quantity of obscene words on the pages found by the search engines in response to neutral queries (those that do not contain obscenities and do not imply them in the response). The default measurement is taken with "Safe search" on, meaning the result should normally be zero; measurements with ordinary search options are also available.
There are two indicators in the analyzer. The simple indicator calculates the number of pages, found by the Search Engine, that contain at least one obscene word in it's regular or masked form (as the law doesn't make a difference, given the word is easily recognized).
The default indicator is more informative: it shows the quantity of the obscene words in the pages found. Each word, shown clearly, is worth 3 points, while a masked one is worth 1. The points are summed for all the pages found to give the result of the search engine for a specific query. The mean value for all queries is then calculated to form the analyzer indicator for a given search engine.
The recognition of obscene words is performed by the automatic service "Semantic Mirror".
Intrusive Ads Analyzer
Web pages with similar content may differ in the amount of ads. A search engine that favors pages with less intrusive ads is thus better for the user.
The analyzer is based on an ads recognition technology assessing the scripts, iframes and other code on the page. Each page is assigned a total intrusiveness score. The intrusiveness of a particular web page is a sum of the intriusiveness of all individual ads on this page. For each SERP a weighted average of the intrusiveness scores is taken. The average intrusiveness of all SERP's is then the score of a search engine in this analyzer.
The intrusiveness of a particular web page is a sum of the intriusiveness scores of all individual ads on this page. The intrusiveness of an ad is calculated as follows:
- context ads and small simple banners receive intrusiveness of 1
- teasers and bigger shiny banners receive 3
- some ads have to be clicked on in order to be closed. These include clickunders (separate windows which open upon a click anywhere on a page) and scrolling banners which cover up a part of a page. These ads receive intrusiveness of 9.
- finally, some particularly nasty ads have to be clicked more than once in order to be closed. These receive intrusiveness of 18.
For each web page found, the intrusiveness is adjusted according to the position of the page in the SERP. The weight of each result reflects the probability that the link will be clicked, and the weights of all positions average to 1 (position 1 has the weight of 1.5 and position 10 has the weight of 0.4). Thus, if the sites with intrusive ads are randomly distributed along the SERP, the average intrusiveness won't be affected by the positional weight.
This analyzer looks at queries seeking popular songs or video content. These queries often yield pages overloaded with ads.
Note that web page assessment is based on html code of a web page while the visibility of ads on a particular page might depend on the user's browser and the cookies.
Adult Content Ads Analyzer
Probably the most intrusive ads on the web are the ads with adult content. Presence of such ads on a web page irritates the users. Therefore a search engine which finds less web pages with pornographic ads is better for the user.
This analyzer has two tabs and both of them look at queries seeking popular songs or video content (these queries are also used in the Intrusive Ads Analyzer). These queries often yield pages overloaded with ads.
The analyzer is based on an ads recognition technology which assigns a total intrusiveness score to each page. In the first tab (“% of sites with porno-ads”) each search engine gets a score which is equal to the percentage of the pages found containing explicitly pornographic ads. Each page contributes to the search engine's score if it has at least one advertisement containing pornographic pictures. The ads which are not literally pornographic but can be considered blatant or improper are not taken into account in this tab. The scores of the search engines also do not reflect the amount, size and position of the pornographic ads on the pages found. However in most cases if the porno ads are present they occur in a prominent position on a page.
The other tab (“improper + porn”) shows the mean intrusiveness of all adult advertisements on the page. Here the amount and size of the ads, as well as the position of the website in search results is taken into account. For each ad, the intrusiveness is multiplied by a coefficient reflecting the status of this ad with respect to pornography. The following coefficients are used:
0 - an ad is not pornographic or adult in any way
0.5 - an ad sometimes contains some improper pictures, but sometimes these pictures are absent
1 - an ad always contains improper pictures
10 - an ad always contains explicitly pornographic pictures
The score of a particular web page is a sum of the adjusted porno-intriusiveness scores of all individual ads on the page. The intrusiveness itself depends on the size and shininess of an ad as well as on whether the ad covers part of the page's content and requires the user to make a click. You can read more about how intrusiveness score is calculated here. Although the adjusted porno-intrusiveness is on average similar to the intrusiveness of the ads, these two measures cannot be compared directly.
Analyzer of Viruses and Malware
Search engines try to prevent their users from accessing infected or fraudulent web pages. How efficient is this security measure?
This analyzer shows the percentage of infected web pages in each search engine's results. Each web page is analyzed based on the sources of scripts and frames/iframes. In addition to viruses, the Russian web is overwhelmed with ads which are trying to fool people into downloading fraudulent files. These ads are marked as "fraudulent downloads" in the analyzer.
The number of threats on a particular web page is displayed, but it does not influence the overall score of a search engine - a page with 5 threats contributes as much as a page with 1 threat. Each web page is searched individually for viruses based on the antivus software reports, webmaster opinions and other sources.
By default, the analyzer takes into account all kinds of threats. The "% of sites with viruses" tab only takes into account the sites infected with viruses, the sites with fraudulent ads are excluded on that tab.
Some search engines warn their user that a particular page may threaten them. However for the kind of queries used in this analyzer such warnings may not be sufficient. This analyzer inspects the queries devoted to downloading, listening to or watching music and video clips - same queries as in the intrusive ads analyzer. These queries yield thousands of result pages with roughly the same content. In such a situation it would seem reasonable to not just warn the users about a possible threat, but to lower the infected websites in search results so that an uninfected site is displayed on a high position instead. The data in this analyzer can be displayed based on all the threats (the "All" tab) or, alternatively, it can be displayed based only on the pages for which no warning is shown on the SERP (the "unmarked on SERP" tab).
Analyzers of Additional Search Features
The object of the user's immediate interest is the set of answers he receives to his query. Nevertheless, his overall impression may as well be affected by smaller technical factors, not immediately discernible on the SERP.
These analyzers are meant to compare some additional parameters of the browsers' functioning that may be of considerable interest for the user. It's worth noting that comparative values bear more importance here than the absolute numbers.
Analyzer of Loading Time
The users expect that the seach engines will give them the results as fast as possible. Even small differences in the time it takes for the SERP to load may be a reason for some user to prefer one search engine over the other.
This analyzer shows the time it takes for users in different Russian cities to get their search results. The SERP loading time data are collected at the same time as the Regional Search analyzer results, hence for the same cities and for the same queries.
Of course the loading time is affected by multiple factors such as the quality and overall usage of the connection. However, if some search engine is slower to respond than the others on a substantial amount of queries every day, this is a tendency to analyze.
This analyzer also estimates the size of SERP's. The data on SERP size can be seen by clicking on the corresponding link below the graph. Just as most modern web browsers, we download the compressed SERP's.
Analyzer of Indexation Time
How long does it take for the newly created pages to start showing up in search results?
We selected about two thousand websites from the Russian Internet. All of these websites inform the search engines about the updates using a special sitemap file (sitemap.xml). All of these websites are generally indexed by the search engines: this is confirmed by the fact that the random pages from these websites are almost always found among the first 10 hits in response to the queries identical to the titles of the relevant pages (the text within <title>). So how long does it take for the newly created pages to start appearing among top 10 results for corresponding queries?
We scan the sitemap files for new pages every day (each website may contribute up to 3 pages per day). Search engines are of course able to do the same - after all this is the purpose of the sitemap files.
For every new page, we record the title and the URL. Within a month from the date when the new page was detected, we check if this page is found in top 10 in response to two types of queries:
a) a query which mentions the text of the title restricts the search to the relevant website.
b) a regular text query using the title
The first type of query assesses the time it takes for the new pages to appear in search results. The second type of query assesses the speed of indexation.
The score in this analyzer is calculated as the mean visibility of the new pages within the last month. We also report the data for smaller periods.
Interestingly, there seem to be pages which appear in the search results in the first couple of days after they were detected but disappear by the end of the first week.
One of the most persistent search engine mistakes is to make a mixture of names and surnames (like splitting 'Richard Shakespeare' into 'Richard III' and 'William Shakespeare'), so the queries in the analyzer were chosen with a view to making such results more probable. We only used the names of some real people we found on the web, but their surnames either coincided with those of some hot media personalities, or were in some other ways non-standard and difficult for a search engine to find.
The resulting number constitutes the average ratio of correctly found pages to the overall number of pages.
Search queries most often consist not of a single word, but of chains or phrases of words. Sometimes the relation between the words in such a chain is fairly loose, but there are cases when the phrase is actually indivisible, expressing a single notion through two or more words, e.g. 'bad bishop' (a chess term that clearly should not be echoed by 'America's worst bishops' in the output). In such cases it is the task of the search engine to preserve the integrity of the phrase, at least in the top results; otherwise the meaning of the query gets distorted.
The unbreakability analyzer continues the series of analyzers of search engine mistakes. It measures the percentage of cases in which a search engine does not destroy the meaning of an unbreakable query by separating its words. For every query we check for the presence of the searched phrase in the output. In a minority of cases the phrase may be split by a very limited number of words (like 'Black Volta', which can appear in 'Black and White Volta' without any change in meaning); such cases are counted as positive results. When the searched words happen to stand together accidentally, they are also counted as positive, provided there is no punctuation mark between them.
As usual, the average number of positive results for all the queries appears as the search engine's overall result in the informer.
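The core check can be sketched in a few lines of Python. A minimal sketch, assuming the analyzer allows a small fixed number of intervening words between the phrase words (the exact limit is not stated in the text):

```python
import re

def phrase_unbroken(phrase: str, text: str, max_gap: int = 2) -> bool:
    """Return True if `phrase` occurs in `text` with its words in order,
    separated by at most `max_gap` extra words and no punctuation.

    `max_gap` is an assumption; the analyzer's real limit is unspecified.
    """
    words = [re.escape(w) for w in phrase.lower().split()]
    # Between consecutive phrase words, allow up to `max_gap` plain words
    # (word characters only, so any punctuation breaks the match).
    gap = r"(?:\s+\w+){0,%d}\s+" % max_gap
    pattern = r"\b" + gap.join(words) + r"\b"
    return re.search(pattern, text.lower()) is not None

# 'Black Volta' split only by 'and White' still counts as positive:
print(phrase_unbroken("Black Volta", "the Black and White Volta rivers"))  # True
# Punctuation between the words destroys the phrase:
print(phrase_unbroken("bad bishop", "bad. Bishop Smith said"))  # False
```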
Grammar Analyzer
The analyzer of grammatical misinterpretation has a certain linguistic elegance. It deals with cases when a specific word-form belongs to more than one paradigm, like 'miss' in 'Miss Juliet' vs. 'I miss you terribly'. Naturally, such situations are more intricate in languages with rich inflectional morphology, like Russian, where a suffix or an ending increases the chance that the word in question has a homonym.
The task facing search engines is then to choose the more appropriate of the two paradigms, i.e. the more appropriate meaning, and to adjust the output accordingly. The instrument of the correct choice is the context, however short: even in two-word queries the second word almost invariably narrows the field of possible meanings for the first.
We have to admit that in this respect the functioning of the investigated search engines has improved impressively over the last few years. Not so long ago the paradigms were confused so grossly that, in order to evaluate the output, you only had to cross off all the wrong word-forms. Nowadays search engines tend not to deviate much from the given word-form, so there is no choice but to base the checking procedure on the context. This also means the queries have to become more sophisticated, or rarer, or both. However, all the queries we use are quite realistic, so the search engines should actually be concerned with the results of the analyzer.
Note that the result of this analyzer, despite its belonging to the Mistakes group, shows the percentage of positive answers in the search engines' output.
Disturbing Content Analyzers
However well a search engine may work, there are still small annoying things that can easily dampen the user's good spirits and significantly shake his loyalty to a specific SE. These include, for example, the danger of contracting a computer virus and the presence of irritating, obtrusive ads.
Whereas the amount of advertisements or dangerous scripts on websites is beyond the search engines' control, the concentration of objectionable content in the output is entirely in their power. So, if the websites in the output abound with annoying factors, the search engine would only benefit from ranking such sites far below high-quality, safe ones.
Most analyzers of this group make use of specific techniques developed by "Ashmanov & Partners" for detecting ads, pornography, computer viruses etc. To make the results more telling, we selected queries with markers that raise the probability of undesirable results above the usual level.
Analyzer of Search Spam
At "Ashmanov and Partners" we study the phenomenon of search spam – the methods and technologies reducing the quality of search results and interfering with the operation of search engines.
Search spam is a text, URL, technology, program code or other web elements created by the web-master for the sole purpose of promoting the site in search engines' results, and not for a fast and reliable search based on complete and authentic information.
The experts check Top 10 results of search queries on a regular basis, marking the sites which, in their opinion, contain elements of search spam. The collected data is entered into the informer. It shows the percentage of sites marked as spam in the overall number of sites that appeared in Top 10 of analyzed queries.
The source of information on the spam status of a given URL is the data of the anti-spam lab of the company "Ashmanov and Partners". The following categories of search spam are used:
* doorway – definite spam: doorways, leading the user to other pages,
* spamcatalog – definite spam: spammer catalogues,
* spamcontent – definite spam: spammers' stolen content,
* pseudosite – definite spam: site disguised as corporate (pseudo-company),
* catalog – catalogues,
* board – bulletin boards,
* domainsale – domains for sale,
* secondary – secondary, stolen content,
* partner – any partner programs,
* linksite – link support site,
* spamforum – forum containing spam,
* techspam – technical spam,
* searchres – search results,
* cj – 'circle jerk' link exchanges
An aggregate indicator is the share of spam sites in the search results. The best search engine has the lowest indicator. This determines the order of search engines in the informer of the analyzer.
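The aggregate indicator reduces to a simple ratio. A minimal sketch with made-up data (the real analyzer uses URLs marked by the anti-spam lab):

```python
# Hypothetical input: for each query, the top-10 URLs returned by an
# engine, plus the set of URLs marked as spam (any category above).
serps = {
    "query A": ["u1", "u2", "u3", "u4", "u5", "u6", "u7", "u8", "u9", "u10"],
    "query B": ["v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", "v10"],
}
spam_urls = {"u2", "u7", "v4"}

def spam_share(serps, spam_urls):
    """Share of spam sites among all top-10 results (lower is better)."""
    results = [url for top10 in serps.values() for url in top10]
    return sum(url in spam_urls for url in results) / len(results)

print(spam_share(serps, spam_urls))  # 3 spam URLs out of 20 -> 0.15
```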
Pornography Analyzer
This analyzer collects search results for ambiguous queries which may be interpreted as targeting a certain category of pornography, but also admit other interpretations. No queries which unambiguously indicate that the user is searching for porn are included.
For instance, the query [stockings] could come from a user looking for a stockings shop or for the corresponding category of pornography.
When search engines include pornographic results in responses to such queries, they run the risk of showing these results to users (potentially including minors) who were not looking for adult content.
We do not claim that it is "bad" or "immoral" for a search engine to return pornographic pages in response to unambiguously pornographic queries. However, the appearance of adult results in responses to "regular" queries is, in our opinion, a drawback.
We make use of the 'Semantic Mirror' technology developed at 'Ashmanov and Partners' to detect adult content.
For every search engine, we calculate the percentage of pages with adult content among the top 10 results.
Analyzer of the SafeSearch Quality
SafeSearch is a filter that is supposed to block "adult only" websites from appearing in the search results. Nowadays most search engines allow their users to filter the results in this way. But does SafeSearch do its job well enough?
This analyzer collects search results for ambiguous queries which may be interpreted as targeting a certain category of pornography, but also admit other interpretations. The question is whether SafeSearch effectively filters out the adult responses in such cases.
The set of queries in this analyzer is the same as in the analyzer of the presence of adult sites in search results. We submit these queries to the search engines with the SafeSearch option turned on wherever possible. In this case, the ideal results should contain no adult pages at all, nor any results with obscene words. If adult pages still show up in the search results, the search engine is not keeping its promise to protect the user with SafeSearch.
We make use of the Semantic Mirror technology developed at 'Ashmanov and Partners' to detect pornography. For every search engine, we calculate the percentage of pages with adult content among the top 10 results.
Obscene Language Analyzer
The analyzer checks for the presence and quantity of obscene words on the pages found by the search engines in response to neutral queries (those that neither contain obscenities nor suppose the response to contain them). By default the measurement is made with "Safe search" on, meaning the result should normally be zero; measurements with ordinary search options are also available.
There are two indicators in the analyzer. The simple indicator counts the pages found by the search engine that contain at least one obscene word in its regular or masked form (since the law makes no distinction, provided the word is easily recognized).
The default indicator is more informative: it shows the quantity of the obscene words in the pages found. Each word, shown clearly, is worth 3 points, while a masked one is worth 1. The points are summed for all the pages found to give the result of the search engine for a specific query. The mean value for all queries is then calculated to form the analyzer indicator for a given search engine.
The recognition of obscene words is performed by the automatic service "Semantic Mirror".
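The default indicator's scoring can be sketched as follows; the per-page word counts here are invented for illustration:

```python
# Per the text: 3 points per clearly written obscene word, 1 point per
# masked one; page points are summed per query, then averaged over queries.
pages_per_query = {
    "query A": [  # (clear_words, masked_words) for each page found
        (2, 1), (0, 0), (0, 3),
    ],
    "query B": [
        (0, 0), (1, 0),
    ],
}

def obscenity_score(pages_per_query):
    per_query = [
        sum(3 * clear + masked for clear, masked in pages)
        for pages in pages_per_query.values()
    ]
    return sum(per_query) / len(per_query)  # mean over queries

print(obscenity_score(pages_per_query))
# query A: (3*2+1) + 0 + 3 = 10; query B: 3; mean = 6.5
```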
Intrusive Ads Analyzer
Web pages with similar content may differ in the amount of ads. A search engine that favors pages with less intrusive ads is thus better for the user.
The analyzer is based on an ad recognition technology that assesses the scripts, iframes and other code on the page. Each page is assigned a total intrusiveness score: the intrusiveness of a particular web page is the sum of the intrusiveness scores of all individual ads on that page. For each SERP a weighted average of the page scores is taken, and the average intrusiveness over all SERPs is the score of a search engine in this analyzer.
The intrusiveness of an individual ad is calculated as follows:
- context ads and small simple banners receive an intrusiveness score of 1;
- teasers and bigger, flashier banners receive 3;
- some ads have to be clicked on in order to be closed; these include clickunders (separate windows which open upon a click anywhere on a page) and scrolling banners which cover up part of a page. Such ads receive a score of 9;
- finally, some particularly nasty ads have to be clicked more than once in order to be closed; these receive 18.
For each web page found, the intrusiveness is weighted according to the position of the page in the SERP. The weight of each position reflects the probability that the link will be clicked, and the weights of all positions average to 1 (position 1 has a weight of 1.5 and position 10 a weight of 0.4). Thus, if sites with intrusive ads are randomly distributed along the SERP, the average intrusiveness is not affected by the positional weighting.
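The scoring above can be sketched as follows. The per-ad scores (1, 3, 9, 18) and the endpoint weights (1.5 for position 1, 0.4 for position 10, averaging to 1) come from the text; the exact shape of the weight curve is not given, so a linear ramp rescaled to mean 1 is assumed here:

```python
# Per-ad scores from the text; the category names are illustrative labels.
AD_SCORES = {"context": 1, "teaser": 3, "clickunder": 9, "multi_click": 18}

# Assumed positional weights: linear from 1.5 down to 0.4, rescaled so
# the ten weights average exactly to 1 (as the text requires).
raw = [1.5 - (1.5 - 0.4) * i / 9 for i in range(10)]
WEIGHTS = [w * 10 / sum(raw) for w in raw]

def serp_intrusiveness(pages):
    """`pages` lists, per SERP position 1..10, the ads on that page."""
    scores = [sum(AD_SCORES[a] for a in ads) for ads in pages]
    return sum(w * s for w, s in zip(WEIGHTS, scores)) / len(scores)

# A clickunder in position 1 hurts the score more than in position 10:
top_heavy = [["clickunder"]] + [[]] * 9
bottom_heavy = [[]] * 9 + [["clickunder"]]
print(serp_intrusiveness(top_heavy) > serp_intrusiveness(bottom_heavy))  # True
```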
This analyzer looks at queries seeking popular songs or video content. These queries often yield pages overloaded with ads.
Note that web page assessment is based on html code of a web page while the visibility of ads on a particular page might depend on the user's browser and the cookies.
Adult Content Ads Analyzer
Probably the most intrusive ads on the web are those with adult content. The presence of such ads on a web page irritates users, so a search engine which finds fewer web pages with pornographic ads is better for the user.
This analyzer has two tabs and both of them look at queries seeking popular songs or video content (these queries are also used in the Intrusive Ads Analyzer). These queries often yield pages overloaded with ads.
The analyzer is based on an ads recognition technology which assigns a total intrusiveness score to each page. In the first tab (“% of sites with porno-ads”) each search engine gets a score which is equal to the percentage of the pages found containing explicitly pornographic ads. Each page contributes to the search engine's score if it has at least one advertisement containing pornographic pictures. The ads which are not literally pornographic but can be considered blatant or improper are not taken into account in this tab. The scores of the search engines also do not reflect the amount, size and position of the pornographic ads on the pages found. However in most cases if the porno ads are present they occur in a prominent position on a page.
The other tab (“improper + porn”) shows the mean intrusiveness of all adult advertisements on the page. Here the amount and size of the ads, as well as the position of the website in search results is taken into account. For each ad, the intrusiveness is multiplied by a coefficient reflecting the status of this ad with respect to pornography. The following coefficients are used:
0 - an ad is not pornographic or adult in any way
0.5 - an ad sometimes contains some improper pictures, but sometimes these pictures are absent
1 - an ad always contains improper pictures
10 - an ad always contains explicitly pornographic pictures
The score of a particular web page is the sum of the adjusted porno-intrusiveness scores of all individual ads on the page. The intrusiveness itself depends on the size and flashiness of an ad, as well as on whether the ad covers part of the page's content and requires the user to make a click. You can read more about how the intrusiveness score is calculated here. Although the adjusted porno-intrusiveness is on average similar to the plain intrusiveness of the ads, the two measures cannot be compared directly.
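The coefficient adjustment can be sketched as follows (the category names are invented labels for the four coefficients above):

```python
# Pornography coefficients from the text, keyed by illustrative labels.
PORN_COEF = {"clean": 0.0, "sometimes_improper": 0.5,
             "improper": 1.0, "porn": 10.0}

def page_porn_score(ads):
    """`ads` is a list of (intrusiveness, category) pairs for one page;
    each ad's intrusiveness is multiplied by its porn coefficient."""
    return sum(intr * PORN_COEF[cat] for intr, cat in ads)

# A teaser with improper pictures (3*1) plus a pornographic context
# ad (1*10); the clean clickunder (9*0) contributes nothing:
print(page_porn_score([(3, "improper"), (1, "porn"), (9, "clean")]))  # 13.0
```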
Analyzer of Viruses and Malware
Search engines try to prevent their users from accessing infected or fraudulent web pages. How efficient is this security measure?
This analyzer shows the percentage of infected web pages in each search engine's results. Each web page is analyzed based on the sources of scripts and frames/iframes. In addition to viruses, the Russian web is overwhelmed with ads which are trying to fool people into downloading fraudulent files. These ads are marked as "fraudulent downloads" in the analyzer.
The number of threats on a particular web page is displayed, but it does not influence the overall score of a search engine: a page with 5 threats contributes as much as a page with 1 threat. Each web page is checked individually for viruses based on antivirus software reports, webmaster opinions and other sources.
By default, the analyzer takes into account all kinds of threats. The "% of sites with viruses" tab only takes into account the sites infected with viruses, the sites with fraudulent ads are excluded on that tab.
Some search engines warn their users that a particular page may pose a threat. However, for the kind of queries used in this analyzer such warnings may not be sufficient. This analyzer inspects queries devoted to downloading, listening to or watching music and video clips, the same queries as in the intrusive ads analyzer. These queries yield thousands of result pages with roughly the same content. In such a situation it would seem reasonable not just to warn the users about a possible threat, but to lower the infected websites in search results so that an uninfected site is displayed in a high position instead. The data in this analyzer can be displayed based on all the threats (the "All" tab) or, alternatively, based only on the pages for which no warning is shown on the SERP (the "unmarked on SERP" tab).
Analyzers of Additional Search Features
The object of the user's immediate interest is the set of answers he receives to his query. Nevertheless, his overall impression may also be affected by smaller technical factors not immediately discernible on the SERP.
These analyzers compare some additional parameters of the search engines' functioning that may be of considerable interest to the user. It is worth noting that comparative values matter more here than absolute numbers.
Analyzer of Loading Time
Users expect the search engines to give them results as fast as possible. Even small differences in the time it takes for the SERP to load may lead a user to prefer one search engine over another.
This analyzer shows the time it takes for users in different Russian cities to get their search results. The SERP loading time data are collected at the same time as the Regional Search analyzer results, hence for the same cities and for the same queries.
Of course the loading time is affected by multiple factors, such as the quality and overall load of the connection. However, if some search engine is slower to respond than the others on a substantial number of queries every day, that is a tendency worth analyzing.
This analyzer also estimates the size of SERPs. The data on SERP size can be seen by clicking the corresponding link below the graph. Just like most modern web browsers, we download the SERPs in compressed form.
Analyzer of Indexation Time
How long does it take for the newly created pages to start showing up in search results?
We selected about two thousand websites from the Russian Internet. All of these websites inform the search engines about updates using a special sitemap file (sitemap.xml). All of them are generally indexed by the search engines: this is confirmed by the fact that random pages from these websites are almost always found among the first 10 hits in response to queries identical to the titles of the relevant pages (the text within <title>). So how long does it take for newly created pages to start appearing among the top 10 results for the corresponding queries?
We scan the sitemap files for new pages every day (each website may contribute up to 3 pages per day). Search engines are of course able to do the same - after all this is the purpose of the sitemap files.
For every new page, we record the title and the URL. Within a month from the date when the new page was detected, we check if this page is found in top 10 in response to two types of queries:
a) a query consisting of the text of the title, with the search restricted to the relevant website;
b) a regular text query using the title.
The first type of query assesses the speed of indexation; the second assesses the time it takes for the new pages to actually appear in search results.
The score in this analyzer is calculated as the mean visibility of the new pages within the last month. We also report the data for smaller periods.
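The two query types and the resulting visibility score can be sketched as follows. The `site:` operator is an assumption here; each search engine has its own syntax for restricting a search to one website:

```python
def queries_for(title: str, site: str):
    """Build the two checks for one new page: a site-restricted title
    query and a regular title query. The `site:` syntax is assumed."""
    site_restricted = f"{title} site:{site}"  # type (a)
    regular = title                           # type (b)
    return site_restricted, regular

def visibility(found_flags):
    """Mean share of checks in which the new page was in the top 10."""
    return sum(found_flags) / len(found_flags)

a, b = queries_for("New article about sitemaps", "example.ru")
print(a)  # New article about sitemaps site:example.ru
print(visibility([True, False, True, True]))  # 0.75
```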
Interestingly, there seem to be pages which appear in the search results in the first couple of days after they were detected but disappear by the end of the first week.
a) a query which mentions the text of the title restricts the search to the relevant website.
b) a regular text query using the title
The first type of query assesses the time it takes for the new pages to appear in search results. The second type of query assesses the speed of indexation.
The score in this analyzer is calculated as the mean visibility of the new pages within the last month. We also report the data for smaller periods.
Interestingly, there seem to be pages which appear in the search results in the first couple of days after they were detected but disappear by the end of the first week.
This analyzer collects search results for ambiguous queries which may be interpreted as targeting a certain category of pornography but also admit other interpretations. The question is whether SafeSearch effectively filters out the adult responses in such cases.
We use the Semantic Mirror technology developed at 'Ashmanov and Partners' to detect pornography. For every search engine, we calculate the percentage of pages with adult content among the top 10 results.
The analyzer checks for the presence and quantity of obscene words on the pages found by the search engines in response to neutral queries (queries that neither contain obscenities nor suppose them in the response). By default, the measurements are taken with "Safe search" on, so the result should normally be zero; measurements with ordinary search settings are also available.
There are two indicators in the analyzer. The simple indicator counts the pages found by the search engine that contain at least one obscene word in its regular or masked form (the law makes no distinction, provided the word is easily recognizable).
The default indicator is more informative: it shows the quantity of obscene words on the pages found. Each word displayed in clear form is worth 3 points, while a masked one is worth 1. The points are summed over all the pages found to give the search engine's result for a specific query. The mean value over all queries then forms the analyzer's indicator for a given search engine.
The recognition of obscene words is performed by the automatic service "Semantic Mirror".
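The scoring scheme above can be sketched as follows; the page contents are hypothetical placeholders, since the actual word recognition is performed by the Semantic Mirror service:

```python
# Sketch of the obscenity scoring described above. The per-page word counts
# are hypothetical inputs; real recognition is done by Semantic Mirror.
CLEAR_POINTS = 3   # obscene word shown in clear form
MASKED_POINTS = 1  # obscene word in masked form (e.g. with asterisks)

def page_score(clear_words, masked_words):
    """Score of one page: 3 points per clear word, 1 per masked word."""
    return CLEAR_POINTS * clear_words + MASKED_POINTS * masked_words

def query_score(pages):
    """Score for one query: sum over all pages found for that query."""
    return sum(page_score(c, m) for c, m in pages)

def engine_indicator(queries):
    """Analyzer indicator: mean query score over all queries."""
    return sum(query_score(p) for p in queries) / len(queries)

# Two queries: one clean SERP, one with a page holding 2 clear + 1 masked word.
queries = [[(0, 0)], [(2, 1)]]
print(engine_indicator(queries))  # (0 + 7) / 2 = 3.5
```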
Intrusive Ads Analyzer
Web pages with similar content may differ in the amount of ads. A search engine that favors pages with less intrusive ads is thus better for the user.
The analyzer is based on an ads recognition technology assessing the scripts, iframes and other code on the page. The intrusiveness of a particular web page is the sum of the intrusiveness scores of all individual ads on this page. For each SERP a weighted average of the page scores is taken; the average intrusiveness over all SERPs is then the score of a search engine in this analyzer. The intrusiveness of an individual ad is calculated as follows:
- contextual ads and small simple banners receive an intrusiveness of 1
- teasers and bigger flashy banners receive 3
- some ads have to be clicked in order to be closed. These include clickunders (separate windows that open upon a click anywhere on a page) and scrolling banners that cover up part of a page. These ads receive an intrusiveness of 9.
- finally, some particularly nasty ads have to be clicked more than once in order to be closed. These receive an intrusiveness of 18.
For each web page found, the intrusiveness is adjusted according to the position of the page in the SERP. The weight of each result reflects the probability that the link will be clicked, and the weights of all positions average to 1 (position 1 has the weight of 1.5 and position 10 has the weight of 0.4). Thus, if the sites with intrusive ads are randomly distributed along the SERP, the average intrusiveness won't be affected by the positional weight.
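A minimal sketch of the positional weighting described above. Only the weights for positions 1 (1.5) and 10 (0.4) are stated in the text, so the intermediate weights below are a hypothetical linear interpolation; note that such an interpolation averages to 0.95 rather than exactly 1, so the real weights presumably come from click-probability data instead:

```python
# Positional weighting sketch. Only positions 1 and 10 have weights given in
# the text (1.5 and 0.4); the linear interpolation in between is a guess.
def positional_weights(n=10, first=1.5, last=0.4):
    step = (last - first) / (n - 1)
    return [first + i * step for i in range(n)]

def serp_intrusiveness(page_scores):
    """Weighted average of the page intrusiveness scores for one SERP."""
    w = positional_weights(len(page_scores))
    return sum(wi * s for wi, s in zip(w, page_scores)) / len(page_scores)

# A SERP where every page has intrusiveness 9:
print(round(serp_intrusiveness([9] * 10), 2))  # 8.55 (9 * 0.95)
```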
This analyzer looks at queries seeking popular songs or video content. These queries often yield pages overloaded with ads.
Note that web page assessment is based on the HTML code of a web page, while the visibility of ads on a particular page might depend on the user's browser and cookies.
Adult Content Ads Analyzer
Probably the most intrusive ads on the web are those with adult content. The presence of such ads on a web page irritates users. Therefore a search engine which finds fewer web pages with pornographic ads is better for the user.
This analyzer has two tabs and both of them look at queries seeking popular songs or video content (these queries are also used in the Intrusive Ads Analyzer). These queries often yield pages overloaded with ads.
The analyzer is based on an ads recognition technology which assigns a total intrusiveness score to each page. In the first tab (“% of sites with porno-ads”) each search engine gets a score equal to the percentage of the pages found containing explicitly pornographic ads. A page contributes to the search engine's score if it has at least one advertisement containing pornographic pictures. Ads which are not literally pornographic but can be considered provocative or improper are not taken into account in this tab. The scores also do not reflect the amount, size and position of the pornographic ads on the pages found; however, in most cases, if porno ads are present, they occur in a prominent position on a page.
The other tab (“improper + porn”) shows the mean intrusiveness of all adult advertisements on the page. Here the amount and size of the ads, as well as the position of the website in search results is taken into account. For each ad, the intrusiveness is multiplied by a coefficient reflecting the status of this ad with respect to pornography. The following coefficients are used:
0 - an ad is not pornographic or adult in any way
0.5 - an ad sometimes contains some improper pictures, but sometimes these pictures are absent
1 - an ad always contains improper pictures
10 - an ad always contains explicitly pornographic pictures
The score of a particular web page is the sum of the adjusted porno-intrusiveness scores of all individual ads on the page. The intrusiveness itself depends on the size and flashiness of an ad, as well as on whether the ad covers part of the page's content and requires the user to make a click. You can read more about how the intrusiveness score is calculated here. Although the adjusted porno-intrusiveness is on average similar to the intrusiveness of the ads, these two measures cannot be compared directly.
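The adjustment can be sketched like this; the base intrusiveness values are the ones from the Intrusive Ads Analyzer, and the sample ads are hypothetical:

```python
# Sketch of the adjusted porno-intrusiveness score of one page.
# Coefficients are the ones listed above; the sample ads are hypothetical.
PORN_COEFF = {
    "clean": 0,        # not pornographic or adult in any way
    "sometimes": 0.5,  # improper pictures appear only sometimes
    "improper": 1,     # always contains improper pictures
    "porn": 10,        # always explicitly pornographic
}

def page_porn_score(ads):
    """ads: list of (base_intrusiveness, category) pairs for one page."""
    return sum(base * PORN_COEFF[cat] for base, cat in ads)

# A banner (3) that is always improper, a clickunder (9) with porn,
# and a clean contextual ad (1):
print(page_porn_score([(3, "improper"), (9, "porn"), (1, "clean")]))  # 93
```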
Analyzer of Viruses and Malware
Search engines try to prevent their users from accessing infected or fraudulent web pages. How efficient is this security measure?
This analyzer shows the percentage of infected web pages in each search engine's results. Each web page is analyzed based on the sources of scripts and frames/iframes. In addition to viruses, the Russian web is overwhelmed with ads which are trying to fool people into downloading fraudulent files. These ads are marked as "fraudulent downloads" in the analyzer.
The number of threats on a particular web page is displayed, but it does not influence the overall score of a search engine: a page with 5 threats contributes as much as a page with 1 threat. Each web page is individually checked for viruses based on antivirus software reports, webmaster opinions and other sources.
By default, the analyzer takes into account all kinds of threats. The "% of sites with viruses" tab only takes into account the sites infected with viruses; the sites with fraudulent ads are excluded on that tab.
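The two tabs differ only in which threat types they count; a sketch, with hypothetical threat labels:

```python
# Percentage of dangerous pages per tab. Threat labels are hypothetical.
def percent_infected(pages, include_fraud=True):
    """pages: list of sets of threat labels ('virus', 'fraud') per page."""
    def dangerous(threats):
        return "virus" in threats or (include_fraud and "fraud" in threats)
    hits = sum(1 for t in pages if dangerous(t))
    return 100 * hits / len(pages)

pages = [{"virus"}, {"fraud"}, set(), set()]
print(percent_infected(pages))                       # default tab: 50.0
print(percent_infected(pages, include_fraud=False))  # viruses-only tab: 25.0
```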
Some search engines warn their users that a particular page may threaten them. However, for the kind of queries used in this analyzer such warnings may not be sufficient. This analyzer inspects queries devoted to downloading, listening to or watching music and video clips - the same queries as in the Intrusive Ads Analyzer. These queries yield thousands of result pages with roughly the same content. In such a situation it would seem reasonable not just to warn users about a possible threat, but to demote the infected websites in search results so that an uninfected site is displayed in a high position instead. The data in this analyzer can be displayed based on all the threats (the "All" tab) or, alternatively, based only on the pages for which no warning is shown on the SERP (the "unmarked on SERP" tab).
Analyzers of Additional Search Features
The object of the user's immediate interest is the set of answers he receives to his query. Nevertheless, his overall impression may also be affected by smaller technical factors, not immediately discernible on the SERP.
These analyzers are meant to compare some additional parameters of the search engines' functioning that may be of considerable interest for the user. It's worth noting that comparative values bear more importance here than absolute numbers.
Analyzer of Loading Time
Users expect that search engines will give them the results as fast as possible. Even small differences in the time it takes for the SERP to load may be a reason for a user to prefer one search engine over another.
This analyzer shows the time it takes for users in different Russian cities to get their search results. The SERP loading time data are collected at the same time as the Regional Search analyzer results, hence for the same cities and for the same queries.
Of course the loading time is affected by multiple factors, such as the quality and overall usage of the connection. However, if some search engine is slower to respond than the others on a substantial number of queries every day, that is a tendency worth analyzing.
This analyzer also estimates the size of SERPs. The data on SERP size can be seen by clicking on the corresponding link below the graph. Like most modern web browsers, we download the SERPs in compressed form.
Analyzer of Indexation Time
How long does it take for the newly created pages to start showing up in search results?
We selected about two thousand websites from the Russian Internet. All of these websites inform the search engines about updates using a special sitemap file (sitemap.xml). All of these websites are generally indexed by the search engines: this is confirmed by the fact that random pages from these websites are almost always found among the first 10 hits in response to queries identical to the titles of the relevant pages (the text within <title>). So how long does it take for newly created pages to start appearing among the top 10 results for the corresponding queries?
We scan the sitemap files for new pages every day (each website may contribute up to 3 pages per day). Search engines are of course able to do the same - after all this is the purpose of the sitemap files.
For every new page, we record the title and the URL. Within a month from the date when the new page was detected, we check if this page is found in top 10 in response to two types of queries:
a) a query which mentions the text of the title and restricts the search to the relevant website;
b) a regular text query consisting of the title.
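The two query types can be built from the recorded title and URL; using the common `site:` operator for the restricted query is an assumption, since the actual restriction syntax may differ per search engine:

```python
from urllib.parse import urlparse

# Build the two check queries for a newly detected page. The `site:` operator
# syntax is an assumption about how the restricted query is formed.
def build_queries(title, url):
    host = urlparse(url).netloc
    restricted = f"{title} site:{host}"  # a) title + site restriction
    regular = title                      # b) plain title query
    return restricted, regular

r, q = build_queries("New article about SERP quality",
                     "https://example.ru/blog/serp-quality")
print(r)  # New article about SERP quality site:example.ru
print(q)  # New article about SERP quality
```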
The first type of query assesses the speed of indexation (the page is found as soon as it enters the index); the second assesses the time it takes for new pages to become visible in regular search results.
The score in this analyzer is calculated as the mean visibility of the new pages within the last month. We also report the data for smaller periods.
Interestingly, there seem to be pages which appear in the search results in the first couple of days after they were detected but disappear by the end of the first week.