1.0.0

WeScraper (WEchat SCRAPER)

This tool uses Python 2.7 and Scrapy to search for articles from WeChat public accounts.

Tutorial

Direct query from command line

Install Scrapy and query directly.

 pip install scrapy
python wescraper/scraper.py account liriansu miawu > we.json # search for public accounts related to liriansu and miawu
python wescraper/scraper.py key-day liriansu miawu > we.json # search for articles related to liriansu and miawu (within the past day)
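
The same command-line query can be driven from a script. The following is a minimal sketch, assuming the scraper prints JSON to standard output (as the redirect to we.json suggests). The helper names build_command and run_query are hypothetical and are not part of wescraper.

```python
import json
import subprocess


def build_command(query_type, keywords,
                  python="python", script="wescraper/scraper.py"):
    """Return the argument list for one scraper invocation."""
    return [python, script, query_type] + list(keywords)


def run_query(query_type, keywords):
    """Run the scraper and parse its standard output as JSON.

    Assumption: standard output contains only the JSON result.
    """
    output = subprocess.check_output(build_command(query_type, keywords))
    return json.loads(output.decode("utf-8"))


# Usage (requires Scrapy and a checkout of wescraper):
#   accounts = run_query("account", ["liriansu", "miawu"])
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when a keyword contains spaces.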

Web Server query

Install Scrapy and Tornado and query through the local server:

 pip install scrapy tornado
python wescraper/server.py

Once the server is running, you can fetch the article list for WeChat public accounts at http://localhost/account/foo/bar/baz...

You can also query public account articles by keyword at http://localhost/key-year/foo/bar/baz...
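
The URL scheme above can be sketched as a small client. This is only an illustration: it assumes the server listens on the default HTTP port of localhost and responds with JSON, neither of which is confirmed here. The names build_url and fetch are hypothetical.

```python
import json

try:
    from urllib.parse import quote      # Python 3
    from urllib.request import urlopen
except ImportError:
    from urllib import quote            # Python 2.7
    from urllib2 import urlopen


def build_url(query_type, keywords, base="http://localhost"):
    """Join the query type and the percent-encoded keywords into one URL."""
    parts = [quote(keyword, safe="") for keyword in keywords]
    return "/".join([base, query_type] + parts)


def fetch(query_type, keywords):
    """Request the URL and decode the body as JSON (assumed response format)."""
    response = urlopen(build_url(query_type, keywords))
    try:
        return json.loads(response.read().decode("utf-8"))
    finally:
        response.close()


# Usage (requires a running server.py):
#   articles = fetch("key-year", ["foo", "bar"])
```

Percent-encoding each keyword keeps characters such as spaces or slashes from being read as path separators.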

Calling from Python code

See the scraper.py source code.

Detailed description

For configurable parameters, see config.py.
By default, a public account query returns the first account in the result list.
This tool may get banned. For workarounds, see "Avoiding getting banned" in the Scrapy documentation (in general, changing the IP address solves the problem).
cookie.py maintains a cookie pool and randomly selects n cookies for each access. If a cookie is banned, it is replaced with a new one.
You are welcome to modify this code. Remember to run the unit tests: python wescraper/test/test.py
This tool relies entirely on Sogou WeChat search to find and crawl articles. If the Sogou WeChat search interface changes, crawling may fail.
Python is great!
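
To illustrate the cookie pool idea mentioned above, here is a minimal sketch. It is not the code in cookie.py: the class name CookiePool, its methods, and the make_cookie callable are all hypothetical, and how a real cookie is obtained is left to the caller.

```python
import random


class CookiePool(object):
    """Hypothetical pool that hands out random cookies and replaces banned ones."""

    def __init__(self, make_cookie, size):
        self._make_cookie = make_cookie  # callable that returns a fresh cookie
        self._cookies = [make_cookie() for _ in range(size)]

    def pick(self, n):
        """Return n cookies drawn at random from different pool slots."""
        return random.sample(self._cookies, n)

    def replace(self, banned):
        """Swap a banned cookie for a freshly created one."""
        index = self._cookies.index(banned)
        self._cookies[index] = self._make_cookie()
```

A caller would pick a few cookies per request and call replace whenever a response indicates that one of them has been banned.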