This system uses a Python + Selenium crawler to collect recruitment data from the BOSS Zhipin (BOSS直聘) website and stores the collected data in a MySQL database. The stored data is then cleaned, which includes deduplication, unifying field types and content, and deleting irrelevant records. The cleaned data is analyzed in several ways: the number of postings for a given type of position is broken down by education requirement, work experience, company type, company size, city distribution, and so on; salary levels for a given type of position are analyzed by education, work experience, company type, and company size; and high-frequency skill words appearing in postings of a given type are counted and combined to derive the skills that should be mastered. Finally, to present the results intuitively, a recruitment data visual analysis system was designed and implemented that displays the analysis results as charts. Technically, the backend is built with the SpringBoot framework and exposes a RESTful API to the frontend; the frontend interface is built with Vue + Element-UI, and the charts are generated with the v-charts + ECharts chart libraries.
Import the crawler program in the bosszp-spider directory into PyCharm, open the spiderMain file, and find the main function. In the line spiderObj = spider('copywriting', city, 1), replace 'copywriting' with the position to be crawled. Then open a terminal, change to the Google Chrome installation directory, and run ./chrome.exe --remote-debugging-port=9222. In the Chrome instance that starts, open the BOSS Zhipin website and scan the QR code to log in. After completing these steps, the crawler program can be run.
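As a quick sanity check that Selenium can drive the Chrome instance started above, the following is a minimal sketch (separate from the project's spiderMain) that attaches to the remote-debugging port and opens the BOSS Zhipin home page; only the port number comes from the command above, the rest is an illustrative assumption.

```python
# Minimal sketch: attach Selenium to the Chrome instance that was started with
# --remote-debugging-port=9222 and is already logged in to BOSS Zhipin.
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

options = Options()
# Reuse the running, logged-in browser instead of launching a new one
options.add_experimental_option("debuggerAddress", "127.0.0.1:9222")
driver = webdriver.Chrome(options=options)

driver.get("https://www.zhipin.com/")  # BOSS Zhipin home page
print(driver.title)                    # confirms that the attach succeeded
```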
In the nginx configuration, find listen 80, then add or replace the following configuration below it:
```nginx
listen 80;
server_name localhost;

sendfile on;
keepalive_timeout 65;
charset utf-8;
#access_log logs/host.access.log main;

location / {
    add_header 'Access-Control-Allow-Origin' $http_origin;
    add_header 'Access-Control-Allow-Credentials' 'true';
    add_header 'Access-Control-Allow-Methods' 'GET, POST, OPTIONS';
    add_header 'Access-Control-Allow-Headers' 'DNT,web-token,app-token,Authorization,Accept,Origin,Keep-Alive,User-Agent,X-Mx-ReqToken,X-Data-Type,X-Auth-Token,X-Requested-With,If-Modified-Since,Cache-Control,Content-Type,Range';
    add_header 'Access-Control-Expose-Headers' 'Content-Length,Content-Range';
    if ($request_method = 'OPTIONS') {
        add_header 'Access-Control-Max-Age' 1728000;
        add_header 'Content-Type' 'text/plain; charset=utf-8';
        add_header 'Content-Length' 0;
        return 204;
    }
    root /upload/;
    index index.html index.htm;
}

# URLs to be forwarded to the backend
location ^~ /apm/ {
    proxy_pass http://localhost:8890/;
}
location ^~ /apj/ {
    proxy_pass http://localhost:8890/admin/;
}
```
Use IDEA to import the backend code in the analyze directory. After all dependencies have been downloaded, modify the configuration in the application.yml file to match your own environment. Next, use Navicat to create a database named bosszp and import the bosszp.sql file located at the same level as the configuration file. After the database tables have been imported, use Navicat to import the collected recruitment data into the job table of the newly created database. Before running the backend code, the data in the database needs to be cleaned: first deduplicate the records and delete irrelevant data, then classify each position according to the keywords appearing in the job name, and finally unify the field types and content. Two processed example records are shown below (only the processed fields are displayed), followed by a minimal illustrative cleaning sketch.
address | handledAddress | transformAddress | type | handledType | dist |
---|---|---|---|---|---|
Beijing | Beijing-Shunyi District | Beijing | Operation and maintenance engineer | operationsEngineer | Shunyi District |
Shenzhen | Shenzhen-Longgang District | Shenzhen | Operation and maintenance engineer | operationsEngineer | Longgang District |
workTag | handledWorkTag | salary | handledSalary | avgSalary | salaryMonth |
---|---|---|---|---|---|
["Server Configuration", "Multiple Processes", "Multiple Threads", "Linux", "Algorithm Basics", "Data Structure", ""] | Server configuration multi-process multi-thread Linux algorithm basic data structure | [9000, 11000] | 9-11K/month | 10000 | 0 salary |
["Python", "Java", "Go", "TypeScript", "Distributed Technology", "Container Technology", "", ""] | Python Java Go TypeScript distributed technology container technology | [15000, 25000] | 15-25K/month·13 salary | 20000 | 13 salary |
companyTags | handledCompanyTags | companyPeople | handledCompanyPeople |
---|---|---|---|
none | | [0, 20] | 0-20 people |
["Regular physical examination", "Supplementary medical insurance", "Snacks and afternoon tea", "Employee travel", "Overtime allowance", "Stock options", "Meal allowance", "Holiday benefits", "Year-end bonus", "Five Insurance and gold"] | Regular physical examination, supplementary medical insurance, snacks, afternoon tea, employee travel and overtime subsidy, stock options, meal supplement, holiday benefits, year-end bonus, five insurances and one fund | [0, 10000] | More than 10,000 people |
After the data processing is completed, the backend data preparation is done. Finally, start the main program of the backend code; if no exceptions are thrown, the backend is running successfully.
First, use the npm command to install the yarn package manager globally. Then use WebStorm to import the frontend code in the recruitment-data-analysis directory. After the import is complete, run yarn install to install the required modules, and then run yarn run build to package the project. Packaging generates a dist folder; copy all of its files into the upload folder created above. Once this is done, the frontend can be accessed locally on Windows 11 at http://localhost/.