On April 13, 2010, in the Baidu Tieba Webmaster Club, ZAC, a well-known Chinese SEO expert, asked in the post "Asking on behalf of others: original content cannot be identified": "My own original content often ranks worse than reprinted or plagiarized copies of it. What can a webmaster do to prevent or improve this? My website publishes original content every day, and Baidu indexes it every day. However, once other people's reprints are indexed, my articles can no longer be found in search. I persisted for nearly 4 times. It is my original work, but Baidu still dropped me to beyond 500th place!"
That was two years ago, when ZAC, speaking for webmasters, held a dialogue with Lee, who spoke for Baidu, on how original content gets indexed.
More than two years have passed, and the situation described in that question has not changed; it has even worsened. Pages that are "copied and collected pseudo-originals" of valuable original content are readily recommended to searchers by Baidu web search through keyword indexing, while the website that first published the content goes unnoticed. This objectively condones the proliferation of so-called SEO that is built on "copying and collecting pseudo-originals" and exploits Baidu's shortcomings.
Unsurprisingly, at Baidu's "Webmaster Clinic Open Day" event on August 10 this year, the issue of originality identification became a question that webmasters and SEOERs kept asking Baidu search engineer Lee.
Also unsurprisingly, Baidu search engineer Lee's answer was a replica of his answer two years ago: "Well, it can only be said that Baidu's strategy is not perfect yet, and we have been improving it. We are designing a relatively complete set of original identification algorithms."
People who follow Baidu news can easily see that Lee's answer, "We are designing a relatively complete set of original identification algorithms", is a direct response to the announcement "Measures against low-quality websites have taken effect", which the Baidu web search anti-fraud team published on July 2 and which targeted pseudo-original and non-original websites. Everyone still remembers the grand promise made in that announcement: "For webmasters who provide high-quality, original resources, because we reduce or even eliminate the rankings of low-quality sites, you will get more traffic from Baidu."
But less than two months later, Baidu search engineer Lee's answer flatly contradicted the Baidu web search anti-fraud team's statement, which was really shocking.
Moreover, when faced with the question of identifying "original content" on two occasions two years apart, Lee dodged both times by changing the subject. Two years ago, his answer was: "From the perspective of user experience, some reprints may not be worse than the original... it is just that domestic reprints often have the beginning and end cut off, which hurts the original author even more." That answer was aimed more at the problem of irregular reprinting practices in China. This year, his answer was: "More than 80% of the complaints (Baidu has received) that claim to be original are invalid, and there are even a large number of websites that claim that old Chinese medicine doctors can cure terminal diseases in 3-5 days. The entire content is unreadable, and they claim to be high-quality websites."
It is undeniable that everything Lee said is fact, but an accumulation of true details does not add up to the true whole. The existence of these common situations does not mean that there is no high-quality original content on the Chinese Internet, nor is it a justification for Baidu's inability to identify the website that first published a piece. As the saying goes, "If you don't have a diamond drill, don't take on porcelain work." Lee's statements only prove that Baidu's ability to identify original content and remove duplicate pages has not improved at all.
It must be emphasized that, having understood that weak identification of original pages is a shortcoming of all search engines, many grassroots original authors have added a copyright statement at the end of each article that gives the first-publication URL. At the same time, they use "content synchronization", submitting their articles to high-quality industry websites, to guide search engines and reposting webmasters back to the source. Most of the links obtained this way are plain-text links, but Baidu search engineer Lee gave these authors confidence when he said: "Let's clarify the question: can links in the form of plain text (non-tags) be recognized and processed? The answer is yes. Search engine spiders need to discover and crawl links on the Internet in a timely manner. It doesn't matter what form the link is in."
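Lee's remark about plain-text links can be illustrated with a minimal sketch. It assumes a crawler that treats any URL-shaped string in a page's text as a link candidate. The pattern and the function name `extract_plain_text_urls` are hypothetical and say nothing about how Baidu's spider actually works.

```python
import re

# Hypothetical pattern: an http(s) URL running up to whitespace
# or a character that commonly closes a quotation or bracket.
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+")


def extract_plain_text_urls(text: str) -> list[str]:
    """Return URL-shaped strings found in raw text, in order, without duplicates."""
    seen = set()
    urls = []
    for match in URL_PATTERN.finditer(text):
        # Drop sentence punctuation that happens to follow the URL.
        url = match.group(0).rstrip(".,;:!?")
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


# A copyright footer of the kind described above, with an unlinked source URL.
footer = (
    "Please indicate the original source when reprinting: "
    "http://www.gouyn12.com/cnnet/327.html ) ."
)
print(extract_plain_text_urls(footer))
```

A crawler that collects such candidates could follow them even when the reposting site never wraps the address in an anchor tag, which is the case these authors were counting on.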
What disappoints these webmasters is this: a large number of authoritative submission and reprint websites in the industry do not "cut off the beginning and end" as Lee described, and the pages on these high-weight websites are generated and indexed by search engines significantly earlier than those on the "copying and collecting pseudo-originals" websites. Even so, a large number of original first-publication pages are still ignored by Baidu, while the rankings of the "copying and collecting pseudo-originals" websites remain high. Many of those pages excerpt an arbitrary part of the article and fail to convey its theme in full, so they cannot meet the "better user experience" standard that Baidu advertises.
It must be noted that although identifying original pages has always been a weakness of search engines, not every search engine performs as poorly as Baidu when many high-weight URLs point to the original first-publication page. As Wang Tong, a well-known domestic SEOER, has said, amid the proliferation of "copying and collecting pseudo-originals" on the Chinese Internet, Google does recognize a first-publication page that is pointed to by copyright-statement URLs (judged in addition by standards such as release time, link breadth, and the page weight of the linking websites). Google has not suffered a complete failure like Baidu, which claims to "know Chinese best": on Baidu, the top results for popular related searches are occupied by copied and collected pseudo-original pages, and the original first-publication page disappears without a trace.
This shows that Baidu, which "knows Chinese best", has not completed the work that must be done before recommending URLs to searchers through keyword indexing: identifying originals and removing duplicate pages, so that high-quality information pages and important supplementary pages can be picked out for priority recommendation. The reason is that its technical level is very low and urgently needs to catch up, and Lee's statements are just a continual search for excuses on Baidu's behalf.
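To make the idea of "identifying originals before recommending URLs" concrete, here is a rough sketch that uses only two of the signals named above: copyright-statement URLs found in the copies, and release time. It is an illustration under stated assumptions, not Google's or Baidu's algorithm. The `Page` type, its fields, and `pick_likely_original` are hypothetical names, and link breadth and page weight are deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    first_seen: int  # hypothetical crawl timestamp, e.g. Unix seconds
    text: str        # page body, possibly containing a plain-text source URL


def pick_likely_original(cluster: list[Page]) -> Page:
    """From a group of duplicate pages, pick the one most often cited by the others.

    Ties are broken in favor of the page that was seen first.
    """
    def citations(candidate: Page) -> int:
        # Naive substring check; a real system would normalize URLs first.
        return sum(
            1
            for other in cluster
            if other is not candidate and candidate.url in other.text
        )

    # Sort key: more citations first, then earlier first_seen.
    return min(cluster, key=lambda page: (-citations(page), page.first_seen))


pages = [
    Page("http://copy.example/a", 300, "article body"),
    Page("http://origin.example/327.html", 100, "article body"),
    Page(
        "http://reprint.example/b",
        200,
        "article body Source: http://origin.example/327.html",
    ),
]
print(pick_likely_original(pages).url)
```

Even this naive version would favor a first-publication page that honest reprints point back to, which is the behavior the grassroots authors described above were hoping for.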
Moreover, Google's comparatively better performance in identifying the first-publication address suggests that Baidu does not care about the first-publication page. It only cares about having more original content, and it lacks proper copyright awareness. I think this is the main reason Baidu's algorithm for original websites has lagged behind for so long: it is not that Baidu cannot, but that it will not.
No wonder, as soon as Wang Tong, a well-known domestic SEOER, said that "Baidu's 628 adjustment is to crack down on original websites", many webmasters and SEOERs felt sad.
In fact, a major improvement in the technology for identifying original first-publication pages would greatly strengthen a search engine's anti-cheating ability, directly frustrate the schemes of SEOERs who deceive search engines in various ways for profit, and give confidence to those who are seriously committed to producing high-quality original content.
Only when Baidu respects, through practical action, the work of the many small and medium-sized original-content webmasters and encourages them to keep applying their intelligence and talent to original work can more of the webmasters and SEOERs who are obsessed with "copying and collecting pseudo-originals" be guided to devote their energy to the original content that "best reflects the core value of the website". For Baidu, this step is difficult, but it is a major move that benefits the future development of search engines.
Moreover, I have to remind Baidu web search that solving the "original content indexing problem" reported by webmasters as soon as possible, with a more reasonable algorithm, is not a gift from Baidu to the many grassroots webmasters (well-known websites pay Baidu no mind at all; Taobao blocks Baidu outright). It is a "basic obligation" that Baidu must fulfill under the current Copyright Law and other relevant laws. Baidu's people should not think too highly of themselves.
Which way to go? The road is at your feet, and it all depends on the search engine's own choice. (This article was originally published by gouyn12. All rights reserved; the author is responsible for its content. When reprinting, please indicate the original source of the article in the form of a link: http://www.gouyn12.com/cnnet/327.html)
(Editor: Chen Long)