Everyone doing SEO should know about duplicate content. As the name suggests, it means the content of a page is repeated, or largely repeated, elsewhere; such pages are also called duplicate web pages or duplicate content pages. Given the Internet environment in China, plagiarism and scraping are quite common, especially since so many CMSs have been released in recent years: more and more webmasters simply register a domain, upload a CMS program, and start building a site by scraping content. This approach is not advisable. Users are put off by large amounts of repeated content, and since search engines are built around user experience, they do not like it either.
Some websites, due to problems in their own programs, allow the same page to be reached through different URLs. I covered this in an earlier article on URL canonicalization, so I won't repeat the details here; it is worth stressing, though, that URL canonicalization is essential for any SEO work. Search engines do not like duplicate content. Their programs will automatically decide which version is the original and ignore the other pages, but for the search engine this wastes both bandwidth and crawling time, and for the site administrator, multiple URLs not only disperse link weight and lower rankings but also carry the risk of a penalty. Remember, too, that a spider is just a program: the canonical URL it picks on its own may not be the one we want.

As for whether duplicate content pages are punished, there has been a long-running debate in the SEO industry. Personally, I think they are. Google's official webmaster guidelines state that duplicate content pages will not be penalized as such, but they also warn against creating large numbers of them, since this hurts ranking. Baidu has been clearer: if most of the content on your pages duplicates content that already exists on the Internet, your site is likely to be abandoned by Baidu. When duplicate URLs do get indexed, returning the same content several times in the results seriously hurts the user experience, and user experience is the core of what search engines optimize for: they do not want their result lists to repeat themselves. A search engine is only willing to list one copy of duplicated content; other duplicate pages are demoted or deleted from the index outright.
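To make the canonicalization problem concrete, here is a minimal Python sketch, purely illustrative: the hostname, the parameter names such as `sessionid` and `utm_source`, and the normalization rules are my own assumptions, not anything a search engine publishes. It shows how several URL variants that all serve the same page can be collapsed to one canonical form:

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical query parameters to drop; a real site would
# tailor this list to its own URL scheme.
TRACKING_PARAMS = {"utm_source", "utm_medium", "sessionid", "ref"}

def normalize_url(url: str) -> str:
    """Map URL variants that serve the same page to one canonical form."""
    scheme, netloc, path, query, _fragment = urlsplit(url)
    netloc = netloc.lower()
    # Treat "/index.html" and a trailing "/" the same as the bare path.
    if path.endswith("/index.html"):
        path = path[: -len("index.html")]
    path = path.rstrip("/") or "/"
    # Drop tracking/session parameters and sort the rest for stability.
    kept = sorted((k, v) for k, v in parse_qsl(query) if k not in TRACKING_PARAMS)
    return urlunsplit((scheme.lower(), netloc, path, urlencode(kept), ""))

variants = [
    "http://Example.com/news/index.html?utm_source=feed",
    "http://example.com/news/",
    "http://example.com/news?sessionid=abc123",
]
print({normalize_url(u) for u in variants})  # all three collapse to one URL
```

Doing this server-side, with 301 redirects from the variants to the chosen form, saves the spider from having to guess which version is canonical.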
While reading "Website Traffic Speed Up, Second Edition" these days, I came across another source of duplicate content: product sellers and agents copying product information from the manufacturer's website. This is fine as far as the manufacturer is concerned, since manufacturers generally consent, but it creates a problem: the same content appears on many different pages, which search engines dislike. To make their products easier for customers, these sites may also offer printer-friendly versions of each page; if those URLs are not handled properly, they become duplicate content pages as well.
Another situation is the spider trap I mentioned in an article the day before yesterday: some e-commerce sites use session IDs to give each visitor a different identifier. Since the spider receives a new session ID on every visit, the same page gets crawled under many URLs, producing duplicate content pages; for details, see that article on avoiding spider traps. When a search engine decides whether a page is a copy, it applies a set of detection algorithms, and these duplicate-detection mechanisms differ from engine to engine. Because sites carry different weight, an engine can even mistake the real original for the copy and the copy for the original. This is especially common with very high-weight sites on Baidu: even if this article of mine is indexed by Baidu right after I publish it, if Sina reprints it, Baidu may still judge that I am the one who reprinted.
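The actual duplicate-detection algorithms of search engines are proprietary, but a classic technique described in the literature is w-shingling combined with Jaccard similarity. The following toy Python sketch, with made-up example sentences and not any engine's real mechanism, shows the idea: near-identical pages share almost all of their shingles, so their similarity score is close to 1.

```python
def shingles(text: str, w: int = 4) -> set:
    """w-word shingles: contiguous word windows used for near-duplicate detection."""
    words = text.lower().split()
    return {" ".join(words[i:i + w]) for i in range(len(words) - w + 1)}

def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two shingle sets; 1.0 means identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

original = "search engines try to list only one copy of duplicate content"
reprint  = "search engines try to list only one copy of duplicate content today"
print(jaccard(shingles(original), shingles(reprint)))  # close to 1.0
```

A real engine would compare billions of pages, so it would fingerprint the shingle sets (e.g. with MinHash) rather than intersect them directly; the comparison logic, though, is the same in spirit.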
Beyond the body content, duplicate content pages can also involve repeated titles, repeated structures, repeated templates, and so on. Everyone doing SEO now knows how important a page title is, so before you settle on one, search Baidu and Google to see whether the same or a very similar title already exists, and try not to repeat it. Structural duplication commonly comes from CMS systems, site-building programs, forum software, and the like: because these programs are so widely used, their URL structures overlap heavily, so study yours and try to make your URL structure as unique as possible. What about template duplication? The barrier to building a website keeps getting lower, and many people upload a program, apply the default template, and never touch it again. Content matters most, but I still recommend modifying the default template: because it is used so widely, its layout and the HTML and CSS code inside it often duplicate countless other sites. It is true that search engines strip out the HTML when judging a page, but for those of us doing SEO, modifying the template is still worth the effort.
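To illustrate the point that the HTML wrapper is stripped before pages are compared, here is a small Python sketch (the two example pages are invented) that extracts the visible text from two differently-templated pages and finds the text identical, which is exactly why a changed template alone does not make duplicated content unique:

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect visible text, ignoring tags plus script/style contents."""
    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0
    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1
    def handle_data(self, data):
        if not self._skip and data.strip():
            self.parts.append(data.strip())

def visible_text(html: str) -> str:
    p = TextExtractor()
    p.feed(html)
    return " ".join(p.parts)

# Same text, different markup: a <div>-based and a <table>-based template.
page_a = "<html><body><div class='tpl'><h1>Title</h1><p>Same story.</p></div></body></html>"
page_b = "<html><body><table><tr><td><h1>Title</h1><p>Same story.</p></td></tr></table></body></html>"
print(visible_text(page_a) == visible_text(page_b))  # True: markup differs, text matches
```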
There are also mirror websites, which have been covered elsewhere on this blog; you can search for those posts, so I won't describe them here. Nor will I dwell further on duplicate content pages caused by the reprinting and plagiarism of articles. One more thing to avoid is pages with too little content. Some sites have very little substantive content on their content pages, while every page inevitably shares common parts such as the navigation bar and the page footer; if the substantive content amounts to less than these shared parts, the page may also be judged a duplicate content page. A related detail: some sites end up with blank pages through negligence or other causes, and a large number of blank pages will likewise be mistaken for duplicate content.
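The "substantive content versus shared template parts" idea can be sketched as a crude ratio check in Python. The common blocks and the example pages below are purely hypothetical, not anything a search engine discloses; the point is only that when the unique text is a small fraction of the page, little distinguishes it from its siblings:

```python
# Hypothetical template blocks that appear on every page of the site.
COMMON_BLOCKS = ["Home News Products About Contact", "Copyright 2010 example.com"]

def substantive_ratio(page_text: str) -> float:
    """Fraction of the page text that is NOT shared template boilerplate."""
    body = page_text
    for block in COMMON_BLOCKS:
        body = body.replace(block, "")
    return len(body.strip()) / max(len(page_text), 1)

# A thin page: one short sentence wrapped in nav and footer text.
thin_page = "Home News Products About Contact Short note. Copyright 2010 example.com"
print(round(substantive_ratio(thin_page), 2))  # a low ratio, mostly boilerplate
```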
Then there are service- or product-type websites that operate by region. Some products or services cover only a small range, or are identical across regions, so these pages change only the region name and a few other parameters while most of the content stays the same. News sites that generate page content from RSS feeds are likewise prone to flooding the index with near-identical pages. Some sites generated static HTML files before a redesign and never deleted them afterward; if the structure changed but the content did not, those leftover HTML files can also produce highly duplicated content. Similarly, poorly configured summaries can create duplicate content pages: to improve the user experience, more and more sites, news sites in particular, now enable article summary functions, and while these make browsing convenient, they can also be mistaken for duplicate content. A final, less common case is the same site being reachable over both http and https. Website optimization lives in the details, and duplicate content will hurt a site's ranking to some degree.

Source of this article: Shenzhen SEO, http://www.zhsem.com/. Please credit the source when reprinting, thank you!
From the personal space of the author, Xiao Wuming.