H1N1 has been making a lot of noise recently, and everyone is aware of it. Just a few days ago, even Baidu seemed to catch a case of the H1N1 virus.
In terms of update speed, Baidu indexes portal news sites and other heavily updated websites very quickly, usually the same day. For lightly updated sites, such as hospital websites, the difference between the two engines is large. For a high-weight site that has submitted a sitemap and pings, Google can index a freshly published article within minutes and it becomes findable in Google search almost immediately. Baidu, by contrast, indexes slowly: it usually only crawls the homepage, the title, or the directory pages, and very few of the article content pages.
Based on my observation of the new website www.wznanke.com, which is mainly a medical-services site, the indexed snapshots suggest Baidu pays especially high attention to the homepage. While the fixed content of the homepage was being revised and was not yet complete, I relied first on external soft-article links, supplemented by Baidu Zhidao and Tieba. Judging from analysis of similar medical websites, it usually takes less than a week to get indexed by Baidu. If you search Baidu for the titles of articles published in the past half month, the top results are basically reprints or scraped copies from portal sites; most articles on hospital sites are duplicates of each other, so their visibility in search engines is very small. So how can a site like Wanzhong Men's Network, with few professional content updates and obvious industry characteristics, increase its traffic? If Baidu has also unfortunately been infected by the H1N1 virus, perhaps we can diagnose some of the causes!
1. Simulated crawl analysis
(1) Baidu's crawler record on the site for May 16 shows:
#Software: Microsoft Internet Information Services 6.0
#Version: 1.0
#Date: 2009-05-16 14:42:56
#Fields: date time s-sitename s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) sc-status sc-substatus sc-win32-status
2009-05-16 14:42:55 W3SVC490114653 61.129.14.17 GET /robots.txt - 80 - 61.135.190.55 Baiduspider+(+http://www.baidu.com/search/spider.htm) 404 0 64
First, the crawler found the navigation information at the top of the homepage, requested robots.txt, received a 404, and paused. Since the site's internal pages were not yet complete, Baidu waited a long time after reading the homepage before visiting any internal pages. According to the simulation, Baidu's first effective visit was:
2009-05-16 01:23:32 W3SVC490114653 61.129.14.17 GET /index.htm - 80 - 61.135.162.212 Baiduspider+(+http://www.baidu.com/search/spider.htm) 200 0 0
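The 404 on /robots.txt in the first log entry above indicates the site simply had no robots file. A minimal, fully permissive robots.txt (an illustrative example, not taken from the original site) would return 200 to the crawler instead of an error:

```
User-agent: *
Disallow:
```

Placing this file at the web root costs nothing and removes one needless error from the crawler's first impression of the site.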
Second, Baidu's next read is likely to be the homepage again, which can be confirmed by the snapshot shown for a site:wznanke.com query. Note, however, that on this second homepage crawl, robots.txt was not re-read (per the simulated crawler record):
2009-05-16 08:24:26 W3SVC490114653 61.129.14.17 GET /index.htm - 80 - 61.135.162.212 Baiduspider+(+http://www.baidu.com/search/spider.htm) 200 0 0
Next, Baidu is likely to follow more of the links on the homepage. Since the site is still being built out, the internal link structure should be completed and dead links avoided. The simulated crawler record shows:
2009-05-16 08:26:01 W3SVC490114653 61.129.14.17 GET /remensousuo/RuHeJianFei/index.htm - 80 - 61.135.162.212 Baiduspider+(+http://www.baidu.com/search/spider.htm) 200 0 0
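Pulling Baiduspider entries out of these IIS logs by hand gets tedious. A small sketch like the following (names and structure are my own, not from the original article) parses the W3C extended log format shown above and yields each Baidu visit with its URL and status code:

```python
# Field order taken from the #Fields: header in the log excerpt above.
FIELDS = [
    "date", "time", "s-sitename", "s-ip", "cs-method", "cs-uri-stem",
    "cs-uri-query", "s-port", "cs-username", "c-ip", "cs(User-Agent)",
    "sc-status", "sc-substatus", "sc-win32-status",
]

def parse_line(line):
    """Parse one IIS W3C log line into a dict, or return None for
    comment lines and lines that do not match the field count.
    IIS replaces spaces in the user-agent with '+', so a plain
    whitespace split is safe here."""
    if line.startswith("#"):
        return None
    parts = line.split()
    if len(parts) != len(FIELDS):
        return None
    return dict(zip(FIELDS, parts))

def baidu_hits(lines):
    """Yield (timestamp, url, status) for every Baiduspider request."""
    for line in lines:
        rec = parse_line(line)
        if rec and "Baiduspider" in rec["cs(User-Agent)"]:
            yield (rec["date"] + " " + rec["time"],
                   rec["cs-uri-stem"], rec["sc-status"])
```

Feeding it the log lines from this article immediately surfaces the robots.txt 404 and the later 200 responses, which is exactly the per-stage picture discussed here.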
By observing what content Baiduspider crawls at each stage, we can adjust the relevant layout of the site accordingly. New sites in particular will not be indexed quickly: only when your site has gained a certain weight with the search engine and acquired some good-quality backlinks will Baidu raise its crawl threshold for the site. As the threshold rises, Baidu begins to include the content pages, and the site receives more traffic from Baidu.
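The advice above about completing internal links and rejecting dead links can be partly automated. A sketch of the first half of that check, using only the standard library (the class and function names are my own illustration, not from the article): extract the internal links from a page's HTML, after which each one can be requested and any non-200 response flagged.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect hrefs from <a> tags, keeping only internal
    (relative) links; absolute URLs, anchors, and mailto links
    are skipped. Note: absolute URLs pointing back at the same
    site would also be skipped by this simple scheme test."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href and not href.startswith(("http://", "https://",
                                             "#", "mailto:")):
                self.links.append(href)

def internal_links(html):
    parser = LinkCollector()
    parser.feed(html)
    return parser.links
```

Each collected path could then be fetched (for example with urllib.request) against the site's base URL; any link that does not come back 200 is a candidate dead link to fix before the spider finds it.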
Webmasters whose site content is still being completed and who are eager to be indexed by Baidu should not blindly submit to the major search engines. In my own case, I first promoted the site through articles carrying external links, made use of the high-weight Baidu properties Baidu Space, Tieba, and Zhidao, and also concentrated on writing articles of reasonable quality in the communities of portal sites that Baidu visits frequently and updates quickly, such as Sina, NetEase, and Tom. This way, Baidu treats the site as having higher external weight and indexes it of its own accord.
Of course, Baidu will not simply remain the spreader of the H1N1 virus. As long as we find the magic weapon of Baidu's indexing rules, this H1N1 virus too can be wiped out by webmasters everywhere.
This article was kindly contributed by the webmaster of www.wznanke.com. Contact QQ: 309067036