Secondary development and PR submissions are welcome!
A Python crawler that fetches the historical articles and content of a designated WeChat public account, with support for filtering articles by keyword.
Press F12 to open the interface shown below and switch to the [Network] tab.
(1) The historical articles of a public account are fetched in pages; one page generally contains 5-10 articles (see the request sketch after this list).
(2) The smaller the page number, the more recent the articles: page 0 holds the latest ones.
(3) It is recommended to start from page 0.
(4) The number of pages to crawl must not be 0, otherwise the result will be empty.
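As a rough illustration of this paging scheme, the sketch below fetches one page of the article list. It assumes the mp.weixin.qq.com cgi-bin/appmsg endpoint that crawlers of this kind commonly use; the token, cookie, and fakeid values are placeholders you would copy from the requests visible in the [Network] tab, and the exact parameters used by this project may differ.

```python
import requests

# Placeholders: copy your own values from the [Network] tab after
# logging in to mp.weixin.qq.com (the F12 step above).
TOKEN = "your_token"
COOKIE = "your_cookie"
FAKEID = "target_account_fakeid"  # identifies the target public account

def fetch_page(page: int, per_page: int = 5) -> list:
    """Fetch one page of historical articles; page 0 holds the newest ones."""
    params = {
        "action": "list_ex",
        "begin": page * per_page,  # larger pages reach further back in time
        "count": per_page,
        "fakeid": FAKEID,
        "type": "9",
        "token": TOKEN,
        "lang": "zh_CN",
        "f": "json",
        "ajax": "1",
    }
    headers = {"Cookie": COOKIE, "User-Agent": "Mozilla/5.0"}
    resp = requests.get("https://mp.weixin.qq.com/cgi-bin/appmsg",
                        params=params, headers=headers, timeout=10)
    return resp.json().get("app_msg_list", [])
```

Requesting pages 0 through n-1 in a loop is why a page count of 0 yields an empty result.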
Enter the desired file name and select the save location.
(1) Function: filters articles by keyword, keeping only those whose titles contain a keyword. If left blank, all articles are retrieved.
(2) Format: keyword1；keyword2；keyword3
Separate keywords with the Chinese (full-width) semicolon, and do not add a semicolon after the last keyword (see the sketch below).
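A minimal sketch of how such a filter can work, assuming each crawled article is a dict with a title field (the function names here are illustrative, not this project's actual API):

```python
def parse_keywords(raw: str) -> list[str]:
    """Split the input on the full-width Chinese semicolon and drop blanks."""
    return [k.strip() for k in raw.split("；") if k.strip()]

def filter_by_title(articles: list[dict], keywords: list[str]) -> list[dict]:
    """Keep articles whose title contains any keyword; no keywords keeps all."""
    if not keywords:
        return articles
    return [a for a in articles if any(k in a.get("title", "") for k in keywords)]
```

For example, parse_keywords("keyword1；keyword2") returns ["keyword1", "keyword2"], and an empty input keeps every article, matching the behavior described in (1).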
(1) The program creates a folder named <save file name>_<current date> in the selected storage location and saves the crawled content in that folder (a naming sketch follows this list).
(2) The contents of the raw folder are cache files generated during crawling and can be deleted.
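For illustration only, the output folder could be assembled as below; the exact date format the program uses is an assumption, as is creating the raw cache directory inside it:

```python
import os
from datetime import date

def make_output_dir(base_dir: str, save_name: str) -> str:
    """Create <save file name>_<today's date> under the chosen location."""
    folder = f"{save_name}_{date.today():%Y%m%d}"  # date format is an assumption
    path = os.path.join(base_dir, folder)
    os.makedirs(os.path.join(path, "raw"), exist_ok=True)  # raw/ holds cache files
    return path
```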