Most webmasters know how important good search engine rankings are to a website, so they do everything they can to please the search engines, treating spiders like emperors in the hope of winning their favor and improving the site's ranking. In reality, however, waiting on the spiders hand and foot does not by itself earn a good ranking. Why? Because spiders have no human emotions: no matter how royally you treat them, they show you no special favor and simply crawl as they please. In website optimization, serving the spiders better does not automatically mean better results; you have to know what to give up, and you have to learn the skill of blocking spiders from certain places. For example, besides keeping spiders out of the ADMIN and DATA directories, properly blocking them from a few other directories is also very beneficial. Let's analyze several techniques for blocking spiders.
1: The image and template directories can be blocked
Many webmasters grab the same stock images from the Internet and apply ready-made templates, so these images and templates are already flooding the web. If you then let spiders crawl the same tired material on your site, search engines are likely to take a dim view of it: the site can be labeled as imitative or even as cheating, and winning the search engines' favor becomes much harder. For this reason the IMAGES directory, and usually the template directory as well, can be blocked.
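As a minimal sketch, assuming the images and templates live under /images/ and /templates/ (the actual directory names depend on your website building program), the corresponding robots.txt rules would look like this:

User-agent: *
Disallow: /images/
Disallow: /templates/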
2: The cache directory can be blocked to prevent repeated indexing
Spiders are greedy: whatever you feed them, they will take, whether it is real content or not. For example, a spider will index the contents of the website's cache directory, and that content inevitably duplicates pages already on the site. If there is too much duplication, Baidu's algorithm may conclude that your website is cheating and may even lower your site's weight, which has a serious impact on the site. Every website building program uses a different cache directory, so you need to block the cache directory that corresponds to your particular program.
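The exact path varies by program, so the following is only an illustration; assuming your program writes its page cache to a /cache/ directory, the rule would be:

User-agent: *
Disallow: /cache/

Check your own program's documentation for the real cache path before copying a rule like this.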
3: The CSS directory and some RSS pages need to be blocked
The CSS directory is of no use to spiders; crawling it only interferes with the search engine algorithm's judgment, so it can be blocked through the robots.txt file. In addition, the RSS pages generated by many website building programs are themselves a form of duplicate content and can likewise lead search engines to misjudge the site, so both should be blocked. Blocking like this may look disrespectful to the spiders, but it is like good medicine: bitter in the mouth yet good for the illness, just as honest advice grates on the ear yet benefits one's conduct.
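A minimal sketch, assuming the stylesheets sit under /css/ and the program exposes its RSS output at /feed/ (both paths are assumptions and differ between programs):

User-agent: *
Disallow: /css/
Disallow: /feed/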
4: If the same page has both a static and a dynamic URL, block the dynamic one
Generally speaking, a website's static pages are easily indexed by search engines (strictly speaking, crawling and indexing are two different things). Besides static pages, most websites also have dynamic pages: for example, www.XXXX/1.html and www.xxxxx/asp?id=1 point to the same page. If the dynamic version is not blocked, spiders will crawl both URLs, and when the search engine algorithm finds two identical pages it will suspect your website of cheating and put it under closer scrutiny, which hurts the site's ranking. The correct approach is therefore to block the dynamic pages first.
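Mainstream engines such as Baidu and Google support the * wildcard in robots.txt, so one common way to block dynamic URLs is to disallow anything containing a question mark; whether this pattern fits depends on how your dynamic URLs are actually formed, so treat it as a sketch:

User-agent: *
Disallow: /*?*

Only do this once the static versions of those pages really exist, otherwise you are blocking the only crawlable copy of the content.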
5: Content related to website security and privacy must be blocked
As mentioned at the beginning of this article, the ADMIN and DATA directories are tied to the site's security and privacy. Exposing them to spiders brings no benefit and may even open extra channels for attack, so security-related directories, such as the database directory, the website log directory, and the backup directory, all need to be blocked. In addition, some webmasters back up the website and download the backup, but then forget to delete the backup file from the server; this can easily lead to spiders crawling it repeatedly and also makes it easy for hackers to grab it. It is therefore well worth using the robots.txt file to block files such as RAR and ZIP archives; at the very least it strengthens the site's security.
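A sketch, again with assumed directory names (use whatever your own program actually uses); the $ sign, supported by Baidu and Google, anchors the match to the end of the URL so that only the archive files themselves are blocked:

User-agent: *
Disallow: /admin/
Disallow: /data/
Disallow: /logs/
Disallow: /backup/
Disallow: /*.rar$
Disallow: /*.zip$

In a real robots.txt, all of these Disallow lines, including the ones from the earlier examples, would sit together under a single User-agent: * group rather than being split into separate groups.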
All in all, blindly treating spiders as emperors usually amounts to flattery in the wrong place; easing the spiders' workload through appropriate optimization and blocking is the greatest flattery of all, and it is also the way to raise the level of your website optimization. Source of this article: http://www.wowawowa.cn/ (Wowawowa Weight Loss Network). First published on A5; please credit the source when reprinting, thank you!