The aim of this project was to use data analytics to determine the best playing 11 for a cricket tournament, specifically the ICC Men's T20 World Cup 2022. The project involved scraping data from the ESPN Cricinfo website, transforming it with Python and Pandas, and building interactive dashboards in Power BI to visualize player performance.
Web Scraping Data from ESPN Cricinfo:
Used the third-party web scraper "Bright Data" to gather match data, results, and player batting and bowling data from the ESPN Cricinfo website.
Stored the scraped data as JSON for further processing.
Data Transformation and Conversion:
Used Python and Pandas to transform the JSON data into CSV format.
Shaped the resulting tables so they could be loaded directly into Power BI, which simplified joining them.
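The JSON-to-CSV step can be sketched roughly as follows. This is a minimal illustration, not the project's actual script: the field names (`match`, `batsmanName`, `runs`, `balls`, `out/not_out`), the sample records, and the helper `batting_json_to_frame` are all hypothetical, since the real layout depends on how the scraper was configured.

```python
import json

import pandas as pd

# Hypothetical sample of scraped batting records. Real field names and
# values depend on the scraper configuration and are assumed here.
raw_json = """
[
  {"match": "Team A Vs Team B", "batsmanName": "Player One (c)",
   "runs": "45", "balls": "30", "out/not_out": "out"},
  {"match": "Team A Vs Team B", "batsmanName": "Player Two",
   "runs": "12", "balls": "15", "out/not_out": "not_out"}
]
"""


def batting_json_to_frame(text: str) -> pd.DataFrame:
    """Parse scraped JSON text into a cleaned DataFrame."""
    df = pd.DataFrame(json.loads(text))
    # Scraped values arrive as strings, so convert the numeric columns.
    for col in ("runs", "balls"):
        df[col] = pd.to_numeric(df[col], errors="coerce").fillna(0).astype(int)
    # Strip the captain marker so player names match across tables.
    df["batsmanName"] = (
        df["batsmanName"].str.replace(r"\s*\(c\)", "", regex=True).str.strip()
    )
    return df


batting = batting_json_to_frame(raw_json)
# to_csv() with no path returns the CSV as a string; pass a file path
# instead to write a file that Power BI can import.
csv_text = batting.to_csv(index=False)
print(csv_text)
```

Keeping player names identical across the batting, bowling, and match tables is what makes the later joins in Power BI straightforward.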
Power BI Dashboard Creation:
Utilized Power Query in Power BI to further transform and clean the data for analysis.
Built interactive dashboards with charts and measures for each player category, such as power hitters, middle-order batsmen, and bowlers.
These dashboards provided valuable insights into player performance, team strengths, and areas of improvement.
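In Power BI such measures would normally be written in DAX. To keep the examples in one language, the sketch below computes comparable batting measures in Pandas. The column names, the sample innings, and the choice of measures (strike rate, batting average, boundary percentage) are assumptions for illustration, not the project's confirmed measures.

```python
import pandas as pd

# Hypothetical per-innings batting rows; "out" is 1 when the batter was dismissed.
innings = pd.DataFrame({
    "batsmanName": ["Player One", "Player One", "Player Two", "Player Two"],
    "runs": [45, 30, 12, 60],
    "balls": [30, 20, 15, 40],
    "fours": [4, 3, 1, 5],
    "sixes": [2, 1, 0, 3],
    "out": [1, 0, 1, 1],
})


def batting_measures(df: pd.DataFrame) -> pd.DataFrame:
    """Aggregate innings per player and derive common batting measures."""
    cols = ["runs", "balls", "fours", "sixes", "out"]
    totals = df.groupby("batsmanName")[cols].sum()
    # Strike rate: runs scored per 100 balls faced.
    totals["strike_rate"] = totals["runs"] / totals["balls"] * 100
    # Batting average: runs per dismissal (NaN if never dismissed).
    totals["average"] = totals["runs"] / totals["out"].where(totals["out"] > 0)
    # Boundary percentage: share of runs that came from fours and sixes.
    boundary_runs = totals["fours"] * 4 + totals["sixes"] * 6
    totals["boundary_pct"] = boundary_runs / totals["runs"] * 100
    return totals.round(2)


measures = batting_measures(innings)
print(measures[["strike_rate", "average", "boundary_pct"]])
```

A high strike rate combined with a high boundary percentage is one plausible way to flag a power hitter, while a higher average suits an anchor or middle-order role.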
Forming the Best Playing 11:
Selected the playing 11 by matching the requirements for each role against the player insights obtained from the dashboards.
The final playing 11 was chosen to maximize the team's chances of success in the ICC Men's T20 World Cup 2022.
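One way to express role-based selection in code is to filter candidates by per-role thresholds and keep the top-ranked players for each slot. The sketch below assumes hypothetical thresholds, slot counts, player names, and a hypothetical helper `select_players`; the project's actual requirements are not stated in this summary.

```python
import pandas as pd

# Hypothetical candidate pool with precomputed measures.
candidates = pd.DataFrame({
    "name": ["Opener A", "Opener B", "Opener C",
             "Bowler D", "Bowler E", "Bowler F"],
    "role": ["opener", "opener", "opener", "bowler", "bowler", "bowler"],
    "average": [35.0, 40.0, 28.0, None, None, None],
    "strike_rate": [150.0, 135.0, 142.0, None, None, None],
    "economy": [None, None, None, 6.5, 7.8, 6.9],
})

# Hypothetical requirements per role: a filter, a ranking column, and
# the number of slots to fill. All thresholds are illustrative only.
RULES = {
    "opener": {
        "filter": lambda d: (d["average"] >= 30) & (d["strike_rate"] >= 140),
        "sort": "strike_rate", "ascending": False, "slots": 1,
    },
    "bowler": {
        "filter": lambda d: d["economy"] < 7,
        "sort": "economy", "ascending": True, "slots": 2,
    },
}


def select_players(df: pd.DataFrame, rules: dict) -> list:
    """Pick the best-ranked players per role who meet that role's thresholds."""
    picks = []
    for role, rule in rules.items():
        pool = df[df["role"] == role]
        pool = pool[rule["filter"](pool)]
        ranked = pool.sort_values(rule["sort"], ascending=rule["ascending"])
        picks.append(ranked.head(rule["slots"]))
    return pd.concat(picks)["name"].tolist()


selected = select_players(candidates, RULES)
print(selected)
```

Extending `RULES` with the remaining roles, with slot counts summing to eleven, would produce a full side under the same approach.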
Technologies Used:
Web Scraping: Bright Data (Third-party Web Scraper)
Programming Language: Python
Data Manipulation: Pandas
Data Visualization: Power BI
Outcomes:
The project's interactive dashboards provided actionable insights that cricket team management, coaches, and enthusiasts could use to strategize and select the best playing 11 for the ICC Men's T20 World Cup 2022.