Trilegangers, a Ukrainian website specializing in 3D models of the human body, was recently hit by an unprecedented surge of crawler traffic that took its server offline. The website supplies 3D artists and game developers with a large library of human body 3D model data, and it ran into trouble because OpenAI's crawler, GPTBot, crawled it so heavily. The incident exposed the threat that web crawlers can pose to website operations and triggered wide discussion about the balance between AI technology and copyright protection.
According to Trilegangers staff, the website's usage agreement explicitly prohibits unauthorized crawling and use, but its robots.txt file was not correctly configured to keep crawlers out, and the server was overloaded as a result. According to the server log, GPTBot sent tens of thousands of requests from more than 600 different IP addresses, leaving the website unable to function normally, much as it would under a distributed denial-of-service (DDoS) attack. The outage disrupted the site's normal operation and caused considerable inconvenience to its users.
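The kind of log analysis described above can be sketched in a few lines of Python. This is a minimal illustration, not the method Trilegangers used: it assumes the server writes the common "combined" access log format, the function name `summarize_crawler` is hypothetical, and the sample log lines are invented (they use reserved documentation IP ranges and an illustrative GPTBot user-agent string).

```python
import re
from collections import Counter

# Assumed "combined" log format:
# ip - - [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(\S+) \S+ \S+ \[[^\]]*\] "[^"]*" \d{3} \S+ "[^"]*" "([^"]*)"$'
)

def summarize_crawler(lines, token="GPTBot"):
    """Count requests and distinct client IPs whose user agent contains token."""
    requests = 0
    ips = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # skip lines that do not fit the assumed format
        ip, user_agent = match.groups()
        if token.lower() in user_agent.lower():
            requests += 1
            ips[ip] += 1
    return requests, len(ips), ips.most_common(3)

# Invented sample data for illustration only.
GPTBOT_UA = ("Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; "
             "compatible; GPTBot/1.0; +https://openai.com/gptbot)")
SAMPLE = [
    f'192.0.2.10 - - [10/Jan/2025:12:00:00 +0000] "GET /models/1 HTTP/1.1" 200 5120 "-" "{GPTBOT_UA}"',
    f'192.0.2.10 - - [10/Jan/2025:12:00:01 +0000] "GET /models/2 HTTP/1.1" 200 4096 "-" "{GPTBOT_UA}"',
    f'198.51.100.7 - - [10/Jan/2025:12:00:01 +0000] "GET /models/3 HTTP/1.1" 200 4096 "-" "{GPTBOT_UA}"',
    '203.0.113.5 - - [10/Jan/2025:12:00:02 +0000] "GET / HTTP/1.1" 200 1024 "-" "Mozilla/5.0 Firefox/133.0"',
    "this line is malformed and is ignored",
]

if __name__ == "__main__":
    total, unique_ips, top = summarize_crawler(SAMPLE)
    print(f"GPTBot requests: {total}, distinct IPs: {unique_ips}, busiest: {top}")
```

Matching on the user-agent string is only a first approximation, since a user agent can be spoofed; confirming the source would require checking the addresses against the IP ranges the crawler operator publishes.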
OpenAI states in its crawler documentation that a website that does not want GPTBot to crawl its content must say so in its robots.txt file. Trilegangers was not aware of this, which led to the current predicament. Although robots.txt files are not required by law, GPTBot's crawling may still violate relevant regulations if the website has stated that unauthorized use is prohibited. The incident reminds website operators of the importance of technical settings and raises questions about the ethics of how AI technology is applied.
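To make the robots.txt point concrete, here is a small Python sketch that checks whether a given robots.txt would turn GPTBot away. The rules shown are an assumed example, not Trilegangers' actual file, and the helper `is_allowed` and the example.com URL are hypothetical; the check uses the standard library's `urllib.robotparser`.

```python
from urllib.robotparser import RobotFileParser

# Assumed example rules: block GPTBot everywhere, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

def is_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

if __name__ == "__main__":
    url = "https://example.com/models/1"  # placeholder URL
    print("GPTBot allowed:", is_allowed(ROBOTS_TXT, "GPTBot", url))
    print("Other bot allowed:", is_allowed(ROBOTS_TXT, "SomeOtherBot", url))
    # An empty robots.txt contains no rules, so nothing is blocked.
    print("GPTBot allowed with empty file:", is_allowed("", "GPTBot", url))
```

The last check illustrates the problem described above: a missing or empty rule set places no restriction on a crawler. Note also that robots.txt is only a request, and it works only for crawlers that choose to honor it.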
In addition, because the website runs on Amazon AWS servers, Trilegangers' bandwidth and traffic consumption also rose sharply, adding cost pressure. In response to the emergency, Trilegangers has set up a correct robots.txt file and used Cloudflare to block multiple crawlers, including GPTBot. These measures are expected to ease the load on the server and keep the website operating normally. The lesson is also a useful reference for other websites.
The incident has drawn attention to the behavior of web crawlers, especially as AI technology continues to develop. As that technology advances, crawler behavior will become more complex and harder to detect, and finding a balance between technological development and copyright protection will be an important problem to solve. The incident is not only a challenge for Trilegangers but also a warning to the entire Internet industry.