Features:
******************************************
1. Developed in ASP.NET and runs under IIS.
2. It connects automatically to an existing website system according to the database storage settings, and integrates seamlessly with that system to supplement or replace its collection program.
3. Collection scheduling (scheduled tasks): each collection rule can be given a time at which it runs repeatedly, and multiple collection tasks can run at the same time. When the set time is reached, the collection program runs automatically in the background of the Web server, so updates happen automatically with no manual intervention.
4. It can automatically categorize the collected information. When the target category does not exist, it can be created automatically. Target categories can also be merged with the website's current content categories through category mapping, so there is no need to create a collection task for each category.
5. Collection rules are simple to set up and easy to understand. The program has two running modes: running in the foreground, or running on a schedule in the background.
6. It can collect multi-level web pages in depth, such as paginated content, information that is partly located on other pages, serialized novels, and other kinds of information with a master-detail table relationship.
7. Original resumable (breakpoint-resume) collection: the collection program collects only when the target website has been updated, and collects only the updated part, which makes it highly efficient. This function is particularly useful for serialized websites, such as serialized novels and TV series.
8. Automatically downloads related external files (pictures, Flash, downloadable files, etc.) to the local server, or replaces their links with remote paths, so they do not have to be uploaded to the server manually.
9. Supports the definition of collection models. You can define any data items to be collected as needed. Each model can also contain sub-models.
10. Automatically identifies the page encoding of most collected sites, for example the common GB2312, GBK, UTF-8, windows-1252, and iso646-us.
11. Supports collecting thumbnails and other additional information from the list page.
12. Multi-threaded asynchronous collection, with high collection efficiency and low server resource consumption.
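
The resumable collection described in feature 7 can be pictured with a short sketch. It is written in Python purely for illustration and is not the program's ASP.NET code. The names new_items, collect_incrementally, seen_urls and fetch are hypothetical, and the sketch assumes that "updated" simply means "a URL on the list page that has not been collected before".

```python
def new_items(listed_urls, seen_urls):
    """Return the URLs from a list page that were not collected before, in page order."""
    fresh = []
    for url in listed_urls:
        # Skip anything already collected, and skip duplicates within this page.
        if url not in seen_urls and url not in fresh:
            fresh.append(url)
    return fresh


def collect_incrementally(listed_urls, seen_urls, fetch):
    """Fetch only the new URLs and record each one as collected.

    `fetch` is any callable that takes a URL; it stands in for the real
    downloader, which this sketch does not model.
    """
    collected = []
    for url in new_items(listed_urls, seen_urls):
        collected.append(fetch(url))
        # Record progress immediately so an interrupted run can resume here.
        seen_urls.add(url)
    return collected
```

If seen_urls is kept in persistent storage between runs, a second run over an unchanged list page fetches nothing, which is the behaviour that makes this approach cheap for serialized content.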
v1.5.4
Improvement: Fixed a problem where collection could not be stopped on automatic restart once the collection URL queue exceeded 5000 entries. 2008-2-29
Improvement: The advanced filtering settings for collection items now support replacement. To use it, add "[to]" after the original filter rule. 2008-2-29
Added: A collection time interval setting, to avoid putting heavy load on the server of the site being collected.
Added: Support for collection sites that require login verification; the login address and the verification address must be set. 2008-3-1
Added: Pagination for list pages that are submitted through JavaScript (POST). Usage: append the parameters "?fc_action=post&parameter1={$pageid}" to the submission address. If the submission address already contains "?", append "&fc_action=post&parameter1={$pageid}" instead. 2008-3-1
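
One way to read the fc_action=post convention above is sketched below in Python. It is only an illustration, not the program's own code. The function name build_post_request is hypothetical, and the sketch assumes that query parameters placed before fc_action=post stay in the address, while those placed after it are sent as POST form fields with {$pageid} replaced by the page number. The program's real handling may differ.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def build_post_request(address, page_id):
    """Split a submission address into (target_url, form_fields) for one page.

    Assumption: `fc_action=post` is only a marker. Parameters before it remain
    in the query string; parameters after it become the POST body.
    """
    parts = urlsplit(address)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    if ("fc_action", "post") not in pairs:
        raise ValueError("address does not carry fc_action=post")
    marker = pairs.index(("fc_action", "post"))
    # Parameters that were already part of the submission address.
    kept_query = urlencode(pairs[:marker])
    # Parameters to submit, with the page placeholder filled in.
    form_fields = {
        name: value.replace("{$pageid}", str(page_id))
        for name, value in pairs[marker + 1:]
    }
    target = urlunsplit((parts.scheme, parts.netloc, parts.path, kept_query, ""))
    return target, form_fields
```

For example, calling build_post_request("http://example.com/list.aspx?cat=3&fc_action=post&page={$pageid}", 2) gives the target "http://example.com/list.aspx?cat=3" and the form fields {"page": "2"}. The host, path, and field names here are made up.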