1. Why is space so important?
A search engine crawler (Baidu's crawler is called a spider; Google's is called a robot) is an automated program that follows URLs to fetch the pages they point to. It collects URLs, downloads the corresponding pages, and records every link it finds on each page, both internal and external. The newly found links are then crawled in turn, and the fetched pages are saved on the search engine's servers as plain text.
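To make that crawl-and-collect loop concrete, here is a minimal Python sketch of the same idea: fetch one page, harvest every link on it (internal and external alike), and save the list as plain text. The start URL and output filename are placeholders, and a real spider is of course far more elaborate.

```python
from urllib.request import urlopen, Request
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect every href on a page, internal and external alike."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

# Placeholder start URL; a real spider begins from submitted/known links.
url = "https://example.com/"
req = Request(url, headers={"User-Agent": "toy-crawler/0.1"})
html = urlopen(req, timeout=10).read().decode("utf-8", errors="replace")

parser = LinkCollector()
parser.feed(html)

# Save the harvested URLs as plain text, as described above.
with open("collected_urls.txt", "w") as f:
    f.write("\n".join(parser.links))
```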
Inclusion involves two steps: first, the crawler collects links by crawling some page (for example, the link you submitted to the search engine); second, it crawls your page and downloads it. The results are then stored on three kinds of servers: 1. the cache server (snapshots), 2. the SITE server (inclusion), 3. the index server (ranking). These are not the same machine, which is why snapshot dates differ and why the data can fall out of sync. For example, a SITE query on our domain may show no homepage while searching the domain directly does return one; that simply means the servers have not yet synchronized.
Why does the stability of the space matter so much? Because search engine crawlers simulate real users' browsing behavior when fetching site content. If the server is unstable or pages open very slowly, the crawler will repeatedly lose data or fail to fetch content and will lose interest in the site. Wu Xun therefore reminds SEOers that server instability has a direct negative impact on SEO.
2. So how should we prevent these problems?
1. Back up the website data (web page files and database) regularly: dump the database, pack the website files, and download both to your local machine (see the backup sketch after this list). If the site is ever hacked, you can then restore the data directly, change the FTP password, server password, and space control-panel password, and temporarily revoke write permission on the website folders. The more complex the FTP password, the better!
2. A space that takes more than 6 seconds to open is very harmful to SEO. If the site has too many images or too much Flash, compress the images to 50KB or less, avoid Flash wherever possible, and enable the server's compressed-transmission function. External calls are another cause: if you embed content such as a weather forecast and the site being called is slow, your own site will open slowly too. One online-messaging widget and one statistics script are enough; any more will also drag down the opening speed. Remember: the larger the embedded code, the slower the page opens! If none of the above is the cause, the space or server itself is most likely slow; talk to the hosting provider or data center, and if the problem cannot be solved, switch decisively. When changing the space or server, keep a few points in mind:
First, transfer the data (web page files and database) to the new space before doing anything else.
Second, test the speed of the new space or server before transferring.
Third, debug first on a second-level domain you enable yourself, or on the temporary third-level domain the new space provider gives you.
Fourth, change the domain's DNS resolution at the time of day when user visits are fewest.
Fifth, after changing the DNS, keep the original space running and its data intact for at least 24 hours; do not close it or clear it. DNS changes take anywhere from 5 minutes to 24 hours to take effect worldwide: many returning users still have the old IP cached, propagation time differs by region, and the spider keeps its own DNS cache too. The second sketch after this list shows one way to check both the DNS cutover and the new server's speed.
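As promised in point 1, here is a minimal backup sketch in Python, assuming a typical php+MySQL host with shell access: it archives the web files and dumps the database into dated files you can download and keep locally. The paths, database name, and user are placeholders; `mysqldump` must be installed, with the password supplied via `~/.my.cnf` rather than hard-coded.

```python
import datetime
import shutil
import subprocess

# Placeholder paths and names -- adjust to your own host.
SITE_ROOT = "/var/www/mysite"
DB_NAME = "mysite_db"
DB_USER = "backup_user"

stamp = datetime.date.today().isoformat()

# 1. Pack the web files into a dated archive for local download.
shutil.make_archive(f"site-files-{stamp}", "zip", SITE_ROOT)

# 2. Dump the database next to it (mysqldump must be installed;
#    keep the password in ~/.my.cnf instead of hard-coding it).
with open(f"db-backup-{stamp}.sql", "w") as dump:
    subprocess.run(["mysqldump", "-u", DB_USER, DB_NAME],
                   stdout=dump, check=True)
```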
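And here is the migration-check sketch referenced in point 5: it asks whether the domain already resolves to the new server's IP from this machine, then times a full page fetch against the 6-second rule of thumb mentioned earlier. The domain and IP are placeholders.

```python
import socket
import time
from urllib.request import urlopen

DOMAIN = "www.example.com"   # placeholder: your own domain
NEW_IP = "203.0.113.10"      # placeholder: the new server's IP

# Has the DNS change reached this machine's resolver yet?
resolved = socket.gethostbyname(DOMAIN)
status = "new server" if resolved == NEW_IP else "still the old server"
print(f"{DOMAIN} resolves to {resolved} ({status})")

# Time a full page fetch against the 6-second rule of thumb.
start = time.monotonic()
urlopen(f"http://{DOMAIN}/", timeout=15).read()
elapsed = time.monotonic() - start
print(f"Page fetched in {elapsed:.1f}s" + (" -- too slow!" if elapsed > 6 else ""))
```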
3. How should we choose a suitable space?
First, it must support pseudo-static URLs. Most website source code today is dynamic and relies on pseudo-static rewriting, so this is a must; the first sketch after this list illustrates the idea.
Second, it should offer IIS log access, ideally generating one log file per hour. To understand how crawlers move through the site you must read the IIS logs; see the log-parsing sketch after this list.
Third, it should support php+MySQL, since that is what most webmasters' site source code runs on.
Fourth, the space's control panel must support online compression and decompression. Without it, uploading files and restoring backups costs a great deal of time.
Fifth, it must support 301 redirects and custom 404 error pages. A 301 redirect concentrates or transfers the site's weight, and a proper 404 page is friendly to both users and crawlers; the last sketch after this list shows how to verify both.
Sixth, it is best to have no limit on IIS concurrent connections. A space that caps IIS concurrency will be paralyzed the moment it comes under a concurrency (thread) attack.
Seventh, when problems occur, the provider's technical staff should be able to resolve them within about 12 hours.
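On the first point: "pseudo-static" simply means a URL that looks like a static .html file but is generated dynamically. On a php+MySQL host this is normally done with the server's rewrite rules, which is exactly what the space must support; the toy Flask sketch below shows the same idea in Python, with the route and content invented purely for illustration.

```python
# pip install flask
from flask import Flask

app = Flask(__name__)

# The URL looks like a static .html file, but the response is built
# dynamically from the id -- that is all "pseudo-static" means.
@app.route("/article/<int:article_id>.html")
def article(article_id):
    return f"<h1>Article {article_id}</h1><p>Rendered dynamically.</p>"

if __name__ == "__main__":
    app.run(port=8000)
```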
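On the second point, here is a small Python sketch for reading crawler activity out of an IIS log: it counts requests whose user-agent matches the major spiders. The log filename is a placeholder (IIS names its W3C logs along the lines of u_ex<date>.log), and matching by substring keeps the sketch independent of the exact field layout.

```python
from collections import Counter

# User-agent substrings that identify the major crawlers.
SPIDERS = ["Baiduspider", "Googlebot", "bingbot"]

hits = Counter()
# Placeholder filename; IIS writes W3C logs as plain text, one request per line.
with open("u_ex230101.log", encoding="utf-8", errors="replace") as log:
    for line in log:
        if line.startswith("#"):      # skip the W3C header lines
            continue
        for spider in SPIDERS:
            if spider in line:
                hits[spider] += 1

for spider, count in hits.most_common():
    print(f"{spider}: {count} requests")
```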
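And on the fifth point, a quick way to verify both behaviors with Python's standard library, assuming your own domain in place of the placeholder: the bare domain should answer 301 with a Location header pointing at the preferred host, and a nonexistent path should answer a genuine 404 rather than a "soft" 200 page.

```python
import http.client

DOMAIN = "example.com"   # placeholder: test your own domain

# 1. The bare domain should answer 301 and point at the preferred host.
conn = http.client.HTTPConnection(DOMAIN, timeout=10)
conn.request("GET", "/")
resp = conn.getresponse()
print(resp.status, resp.getheader("Location"))   # expect 301 + the www URL

# 2. A nonsense path should answer a real 404, not a "soft" 200 page.
conn = http.client.HTTPConnection("www." + DOMAIN, timeout=10)
conn.request("GET", "/no-such-page-xyz")
print(conn.getresponse().status)                 # expect 404
```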
Editor in charge: Chen Long. Author: Wu Xun (from his personal space).