Life is fantastic!~
"Did you know all your doors were locked?" - Riddick (The Chronicles of Riddick)
A collection of sample crawlers and simulated-login programs. Some of the simulated logins are based on selenium; others are based on reverse engineering the site's JavaScript. The collection is continuously updated. If you have any questions, open an issue. Pull requests are welcome, and those that pass testing will be merged directly. All programs here are written in Python 3.
:-)
Simulated login is generally done either by logging in directly or with selenium + webdriver. Some websites, such as Qzone (QQ space) and bilibili, are very difficult to log in to directly, but relatively easy to log in to with selenium.
Although selenium is used for the login itself, it does not have to be used for everything after that. For efficiency, we can save the cookies obtained after logging in and then use requests or scrapy to collect the data, which keeps data collection fast.
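The cookie reuse described above might look roughly like the following. This is a minimal sketch, not code taken from this repository: the helper names `save_cookies`, `load_cookies`, and `apply_cookies` and the file name `cookies.json` are hypothetical. It assumes the cookies come from selenium's `driver.get_cookies()`, which returns a list of dicts, and that the session is a `requests.Session`.

```python
import json
from pathlib import Path


def save_cookies(cookies, path):
    """Write the list of cookie dicts from driver.get_cookies() to a JSON file."""
    Path(path).write_text(json.dumps(cookies), encoding="utf-8")


def load_cookies(path):
    """Read back the cookie list written by save_cookies()."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def apply_cookies(session, cookies):
    """Copy selenium-style cookie dicts into the session's cookie jar."""
    for cookie in cookies:
        session.cookies.set(
            cookie["name"],
            cookie["value"],
            domain=cookie.get("domain", ""),
            path=cookie.get("path", "/"),
        )
    return session


# Hypothetical usage (needs selenium, requests, and a browser driver):
#
#   from selenium import webdriver
#   import requests
#
#   driver = webdriver.Chrome()
#   driver.get("https://example.com/login")   # placeholder URL
#   ...                                        # complete the login in the browser
#   save_cookies(driver.get_cookies(), "cookies.json")
#   driver.quit()
#
#   session = apply_cookies(requests.Session(), load_cookies("cookies.json"))
#   response = session.get("https://example.com/profile")  # placeholder URL
```

Saving the cookies to a file means the slow browser login only has to be repeated when the cookies expire. Some sites also check headers such as `User-Agent`, so the session may need those set to match the browser that logged in.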
Chrome, Firefox
Click here to view the test images
@deepforce | @cclauss | ksoeasyxiaosi | JasonJunJun | MediocrityXT
Everyone is welcome to contribute and help improve the project: one person can go fast, but a group of people can go farther.