Ever since the SEOTcs system's SEO scoring algorithm was updated on November 24, one problem has been bothering me. The Java data job frequently reports the following error while it runs:
"2011-12-03 18:00:32 DefaultHttpClient [INFO] I/O exception (java.net.SocketException) caught when processing request: Connection reset by peer: socket write error
2011-12-03 18:00:32 DefaultHttpClient [INFO] Retrying request"…
To get to the bottom of it, I searched Chinese and English websites, every corner I could find, and learned why this happens. The exception can occur on either the client side or the server side. What causes it? There are two causes:
1. One end closes its Socket (either deliberately or because it exited abnormally) while the other end keeps sending data. The first packet sent after that triggers the exception (Connection reset by peer).
2. One end exits without closing the connection. If the other end is reading from the connection at that point, the exception (Connection reset) is thrown. Put simply, both cases come from reading or writing after the connection has already been broken.
So I naively assumed that setting some socket timeouts would solve it.
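The exact settings are not reproduced in this post. As a rough sketch only, this is what setting a connect timeout and a read timeout looks like with the JDK's own HttpURLConnection, not the DefaultHttpClient seen in the log. The class name, the URL and both timeout values are assumptions made up for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

public class TimeoutSketch {
    // Hypothetical values; the post does not say which timeouts were used.
    static final int CONNECT_TIMEOUT_MS = 5_000;
    static final int READ_TIMEOUT_MS = 10_000;

    /** Prepares (but does not yet connect) an HTTP connection with both timeouts set. */
    static HttpURLConnection open(String url) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setConnectTimeout(CONNECT_TIMEOUT_MS); // max wait to establish the connection
            conn.setReadTimeout(READ_TIMEOUT_MS);       // max wait for data once connected
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // No request is sent here; the connection object is only configured.
        HttpURLConnection conn = open("http://example.com/");
        System.out.println(conn.getConnectTimeout() + " " + conn.getReadTimeout());
    }
}
```

Timeouts of this kind only bound how long a call waits for a connection or for data.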
But after setting them, the situation was exactly the same.
This problem troubled me for several days. Every day I thought it over and ran comparison tests to track down the code responsible. I couldn't help wondering: with the same number of keywords, why had earlier batch queries for ranking data run without errors, while errors were now being reported all the time? Was the website behind the requested interface blocking our server's IP? That explanation was not convincing. It had to be that somewhere in the program a connection was not being released properly!
Guided by this idea, and after several more days of hard work and experimenting, today I finally found the root of the problem: it is caused by the timer method! Here is what happened. Over the past few days I manually triggered some batch tasks and found that whenever the filter ranking value was 100, the error java.net.SocketException: Connection reset was thrown over and over, flooding the screen. I then carefully compared this against the timer code.
Finally it dawned on me: yes, the problem is here! Let me walk through the analysis:
The function returns a value, and one particular value is a critical value. My timer method checks the result: if it is the critical value, the timer forces the function to keep re-executing for up to 10 seconds. The function fetches one specific piece of data from a page's source code, so every call takes tens of milliseconds and amounts to opening a socket connection. Because it kept returning the critical value, it kept opening socket connections for the full 10 seconds. At about 80 ms per call (which is what I measured in testing), 10 * 1000 / 80 = 125 socket connections are established in those 10 seconds, that is, 12.5 per second. On top of that, since this is a filtering program, several critical values tend to appear back to back, so within a few seconds the number of socket connections to the same page soars into the hundreds or even thousands, leaving far too many request connections waiting to be processed.
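To make the arithmetic above concrete, here is a small simulation of that retry loop. It is only a sketch, not the job's real code: the class and method names are invented, the critical value is assumed to be the ranking value 100 mentioned earlier, and time is simulated (each call "costs" 80 ms) so that the count is deterministic and no real sockets are opened.

```java
import java.util.function.IntSupplier;

public class TimerRetrySketch {
    static final int CRITICAL_VALUE = 100;  // assumption: the filter ranking value 100
    static final long WINDOW_MS = 10_000;   // the timer's 10-second window
    static final long CALL_COST_MS = 80;    // measured cost of one call, per the post

    /**
     * Keeps calling fetch while it returns the critical value and the
     * simulated window has not elapsed. Returns the number of calls made,
     * which in the real job equals the number of socket connections opened.
     */
    static int countConnections(IntSupplier fetch) {
        long elapsedMs = 0;
        int calls = 0;
        while (elapsedMs < WINDOW_MS) {
            int value = fetch.getAsInt(); // real job: one socket connection per call
            calls++;
            elapsedMs += CALL_COST_MS;    // simulated time spent in the call
            if (value != CRITICAL_VALUE) {
                break;                    // a normal value ends the retrying at once
            }
        }
        return calls;
    }

    public static void main(String[] args) {
        int stuck = countConnections(() -> CRITICAL_VALUE);
        System.out.println(stuck);                        // 125
        System.out.println(stuck / (WINDOW_MS / 1000.0)); // 12.5
        System.out.println(countConnections(() -> 42));   // 1
    }
}
```

A keyword that returns a normal value costs one connection, while one stuck on the critical value costs 125, and that is for a single keyword. Capping the number of attempts or pausing between them would be one way to bound this count.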
Why did I use a timer to run the method several times in the first place? The reason was to obtain a stable data value. Looking back now, though, the side effect was very costly and should not be underestimated. After several days of analysis and testing, the culprit was finally found. With the problem solved, I feel relieved and can sleep peacefully...