The Smart & Universal Web Scrapper is an intelligent data extraction tool powered by Generative AI. It simplifies the process of scraping data from any website by allowing users to provide the website link and the required data fields. With its versatile capabilities, this tool can extract data seamlessly and present it in a tabular format, which can be downloaded in various formats such as Excel, JSON, or Markdown. Its smart, user-friendly interface ensures efficient and accurate data extraction for all your web scraping needs.
Python:
Python is a popular, versatile programming language known for its simplicity and readability. It is widely used for various applications, including web development, data analysis, machine learning, and automation tasks. Python's extensive ecosystem of libraries and frameworks makes it a powerful tool for developers.
LLaMA 3.1 (70b):
LLaMA (Lean Large-Language Model) is a family of large language models developed by Meta AI. The 3.1 (70b) version refers to a specific model variant with 70 billion parameters. Large language models like LLaMA are trained on vast amounts of text data, allowing them to understand and generate human-like text for various natural language processing tasks.
Groq API:
Groq API provides access to Groq's powerful AI inference platform. It enables developers to leverage their advanced hardware and software for rapid and efficient AI model execution.
Streamlit:
Streamlit is an open-source Python library that simplifies the process of building interactive data visualization and machine learning web applications. It allows developers to create user interfaces by writing Python scripts, making it easier to share data-driven applications with others.
Fork or clone this repository to your local machine using Git.
Install the necessary libraries.
pip install -r requirements.txt
Create a .env
file in your project directory and add any required API keys (e.g., Google API key, Groq API KEY).
streamlit run app.py
GNU General Public License v3.0