Let me start with a few news items about the Internet. The first reports that Baidu’s share of web searches reached 73.2%: Baidu handled 109.6 billion web search requests, and its share rose 0.6 percentage points from last year. After the report came out, many blogs used this number to attack Google, saying that it was doing badly.
The second article uses the same data. It reports that Baidu’s search requests increased by 0.5 percentage points, Google’s increased by 3.5 percentage points, and Google was the fastest-growing search engine.
Both articles use data, and the same data at that, yet a reader who sees only the first or only the second will draw completely different conclusions. Now look at the third article, which is also about market share: Baidu’s share dropped by 2.1%, Google’s rose by 5.6%, and the gap between the two narrowed to 7.7%. All three items report numbers about search engine market share, but someone unfamiliar with the search market who reads them together will end up thoroughly confused.
Why do three news articles quote the same data and reach different results? Below, Lu Songsong discusses several principles for analyzing data.
First, it is meaningless to look at a piece of data in isolation.
Returning to the Baidu and Google example: the first article says that Baidu’s market share rose by 0.6%. Read alone, that suggests Baidu is growing and therefore that Google is shrinking. The second report is more complete: the shares of other search engines are falling while both Google and Baidu are growing, and Google is growing faster. This is why a number cannot be read in isolation.
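The point can be made concrete with a little arithmetic. The sketch below uses invented shares for three hypothetical players (`A`, `B`, `Others`), not the figures from the news items, and reports each change both in percentage points and as relative growth. The function name `share_changes` is likewise only an illustration.

```python
def share_changes(previous, current):
    """Return {name: (percentage-point change, relative change in %)}."""
    changes = {}
    for name in previous:
        point_change = current[name] - previous[name]
        relative_change = point_change / previous[name] * 100
        changes[name] = (round(point_change, 2), round(relative_change, 2))
    return changes


# Hypothetical market shares in percent, invented for illustration only.
previous = {"A": 70.0, "B": 20.0, "Others": 10.0}
current = {"A": 71.0, "B": 23.0, "Others": 6.0}

changes = share_changes(previous, current)
for name, (points, relative) in changes.items():
    print(f"{name}: {points:+.2f} points, {relative:+.2f}% relative")
```

In this made-up case both A and B gain share, and the loss falls entirely on the others. A headline that quotes only A's gain would hide that, which is the same trap as reading the first news item alone.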
For example, it would be unreasonable to compare Sohu and Sina as wholes. Sohu has online games, wireless, and advertising businesses, while Sina mainly focuses on wireless and advertising. Setting three lines of business against two, a 3:2 comparison, is obviously unfair. It is more reasonable to compare them one line of business at a time.
Second, the definitions behind the data must be comparable.
In the search engine market share examples earlier in this article, some shares are measured by search requests and some by revenue. Comparing figures measured in different ways makes no sense. If a figure is not self-explanatory, look up how it is defined. Even under the same nominal definition, different companies will report different results. The important thing is to make sure the basis of measurement is consistent whenever you compare.
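One way to enforce this habit is to carry the basis of measurement along with every figure and refuse to compare figures whose bases differ. The following is a minimal sketch; the `ShareFigure` type, the `share_gap` function, and all the numbers are hypothetical and are not taken from any of the reports.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShareFigure:
    company: str
    basis: str    # what the share is measured by, e.g. "requests" or "revenue"
    share: float  # market share in percent


def share_gap(a: ShareFigure, b: ShareFigure) -> float:
    """Return a.share - b.share, refusing to mix measurement bases."""
    if a.basis != b.basis:
        raise ValueError(
            f"cannot compare {a.company} by {a.basis} with {b.company} by {b.basis}"
        )
    return round(a.share - b.share, 2)


# Hypothetical figures, invented for illustration only.
x_by_requests = ShareFigure("X", "requests", 70.0)
y_by_requests = ShareFigure("Y", "requests", 25.0)
y_by_revenue = ShareFigure("Y", "revenue", 35.0)

print(share_gap(x_by_requests, y_by_requests))  # same basis: a meaningful gap

try:
    share_gap(x_by_requests, y_by_revenue)  # mixed bases: rejected
except ValueError as err:
    print("refused:", err)
```

The check is deliberately strict. It cannot tell whether two sources that both say "requests" counted them the same way, so the definition still has to be read by a person.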
Third, watch for differences in data collection methods.
After a hot news story, surveys often appear on various websites, such as the polls asking users whether they would uninstall 360 or QQ. The results often fail to reflect the real situation, because generally only the people who care about the story come to vote and say what they think of it. Their votes are then presented as the opinion of the whole population, which leads another group of people, who do not know the facts, to follow the apparent mainstream view. The results of online surveys can easily be used by vendors to promote themselves and attack their competitors.
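The effect of self-selected voters can be sketched with simple arithmetic. In the example below, every number is invented: a small group that follows the story closely votes often and leans one way, while the much larger remainder rarely votes. The function name `poll_vs_population` and the group figures are assumptions for illustration only.

```python
def poll_vs_population(groups):
    """Compare true support with the support seen among self-selected voters.

    groups: list of (size, support_rate, vote_rate) tuples.
    Returns (true_support, poll_support) as fractions between 0 and 1.
    """
    total = sum(size for size, _, _ in groups)
    true_support = sum(size * support for size, support, _ in groups) / total
    voters = sum(size * vote for size, _, vote in groups)
    poll_support = (
        sum(size * support * vote for size, support, vote in groups) / voters
    )
    return true_support, poll_support


# Hypothetical population, invented for illustration only.
groups = [
    (1000, 0.80, 0.50),  # people who follow the story: 80% agree, half of them vote
    (9000, 0.40, 0.02),  # everyone else: 40% agree, only 2% vote
]

true_support, poll_support = poll_vs_population(groups)
print(f"whole population: {true_support:.1%}")
print(f"online poll:      {poll_support:.1%}")
```

With these made-up inputs the poll shows a clear majority, even though fewer than half of the whole population agrees. Nothing in the poll result itself reveals the gap, so the way the data was collected has to be checked separately.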
Therefore, it is best to ask several questions whenever you encounter a number, and not to use it directly. First find out where the data comes from, how it was obtained, what it means and how it is defined, and whether anything has been left out. As with the earlier search engine market share example, you can draw the correct conclusion only after you understand which share is being reported and by what criteria.
This is an original article. If you reprint it, please credit Lu Songsong’s blog.
Thanks to Lu Songsong for his contribution