Yesterday I analyzed the IIS logs. Fortunately, spiders from Baidu, Google, and Yahoo all came to crawl, so it seems the optimization was successful. Pages that had never been crawled before were finally crawled and indexed by Google after some external-link guidance. But I also found a problem: there are a lot of 404 responses in Googlebot's crawl records, which is not a good thing. It means I have not cleaned up the site and there are many dead links. I then logged into Google Webmaster Tools to analyze the site and, oh my God, there were 210 dead links. I suspect Google does not rate my page quality very highly, but checking that many 404 pages one by one, let alone fixing them, would be far too much trouble. Then I thought of robots.txt.
Because the dead links here basically all end in .asp, such a large batch of 404 pages can be handled with a rule like this:
User-agent: Googlebot
Disallow: /*.asp$
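For reference, the * and $ here are pattern-matching extensions that Googlebot understands (they are not part of the original robots.txt standard): * matches any sequence of characters and $ anchors the pattern to the end of the URL, so the rule blocks every URL ending in .asp. A slightly fuller sketch of the same file, assuming (hypothetically) that all other spiders should keep crawling normally, would be:

# Block Googlebot from every URL ending in .asp
User-agent: Googlebot
Disallow: /*.asp$

# All other spiders may crawl everything
User-agent: *
Disallow: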
This morning I analyzed the logs of last night's Google crawl, and as expected Googlebot no longer paid any attention to the pages ending in .asp.
If a large number of dead links do not follow a regular URL pattern, robots.txt is not the right tool; the alternative is to set up a custom 404 page by hand. The control panel provided by the hosting company usually offers a 404-page setting. If the site is a .NET application, you can configure the error page in web.config; I log in to the server directly and change the 404 response page that IIS serves. In short, a good 404 page helps guide visitors on to other useful pages instead of losing them.
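As a rough sketch of the web.config approach (the page names Error.aspx and NotFound.aspx are placeholders; your site will have its own), the customErrors section looks something like this:

<configuration>
  <system.web>
    <!-- mode="On" shows the custom pages to all visitors, not just remote ones -->
    <customErrors mode="On" defaultRedirect="~/Error.aspx">
      <!-- send missing pages to a hypothetical NotFound.aspx -->
      <error statusCode="404" redirect="~/NotFound.aspx" />
    </customErrors>
  </system.web>
</configuration>

One thing worth checking afterwards is that the custom page still returns a real 404 status code rather than a 302 redirect followed by 200; otherwise search engines may treat the dead links as normal pages (a "soft 404").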
This article is published by Koushuiyu Web Tutorial Network (http://www.koushuiyu.cn). Please indicate the source when reprinting, thank you!