GitHub - GitProSolutions/BookScraperExcel: This project is a Python web scraper that can extract book data from Amazon.com, store the data in an Excel file, and handle blocks by rotating through a list of proxies.

# BookScraperExcel

Project Description

BookScraperExcel is a Python web scraper that extracts book data from Amazon.com. It uses requests to make HTTP requests, beautifulsoup4 to parse the HTML content, and pandas to export the data to an Excel file, along with the standard-library random and time modules.
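As a rough illustration of the parse-and-export flow described above, here is a minimal sketch. The sample markup, the CSS selectors, the field names, and the helper names (`text_or_none`, `parse_books`, `export_to_excel`) are assumptions made for this example; real Amazon pages use different, frequently changing markup, and the project's own code may be organized differently.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a results page. Real Amazon HTML differs,
# so every selector below is a placeholder.
SAMPLE_HTML = """
<div class="book"><h2>Example Title One</h2><span class="author">A. Writer</span><span class="price">$12.99</span></div>
<div class="book"><h2>Example Title Two</h2><span class="author">B. Author</span></div>
"""


def text_or_none(parent, selector):
    """Return the stripped text of the first match, or None if nothing matches."""
    node = parent.select_one(selector)
    return node.get_text(strip=True) if node else None


def parse_books(html):
    """Turn one page of HTML into a list of dictionaries, one per book."""
    soup = BeautifulSoup(html, "html.parser")
    return [
        {
            "title": text_or_none(card, "h2"),
            "author": text_or_none(card, "span.author"),
            "price": text_or_none(card, "span.price"),
        }
        for card in soup.select("div.book")
    ]


def export_to_excel(books, path="books.xlsx"):
    """Write the rows to an Excel file. pandas needs openpyxl to write .xlsx."""
    import pandas as pd  # imported here so parsing can be tried without pandas

    pd.DataFrame(books).to_excel(path, index=False)


books = parse_books(SAMPLE_HTML)
```

Calling `export_to_excel(books)` would then write one row per dictionary, with the dictionary keys as column headers.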

The scraper handles blocked requests by rotating through a list of proxies, which makes it more reliable. It also exposes the scraped data through an API class, which returns the book data as a list of dictionaries and can export it to an Excel file for further analysis or use.
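One way proxy rotation of this kind can work is sketched below. This is not the project's actual code: `fetch_with_rotation`, `default_fetch`, `BlockedError`, the retry count, the delay range, and the rule that any non-200 response counts as a block are all assumptions chosen for illustration.

```python
import random
import time


class BlockedError(Exception):
    """Raised when every attempt was blocked or failed."""


def default_fetch(url, proxy):
    """Fetch a URL through one proxy and return (status_code, body)."""
    import requests  # imported lazily so the rotation logic runs without it

    response = requests.get(
        url,
        proxies={"http": proxy, "https": proxy},
        headers={"User-Agent": "Mozilla/5.0"},
        timeout=10,
    )
    return response.status_code, response.text


def fetch_with_rotation(url, proxies, fetch=default_fetch,
                        max_attempts=5, delay_range=(1.0, 3.0)):
    """Try proxies in shuffled order until one returns HTTP 200."""
    if not proxies:
        raise ValueError("at least one proxy is required")
    candidates = list(proxies)
    random.shuffle(candidates)  # avoid always hitting the same proxy first
    for attempt in range(max_attempts):
        proxy = candidates[attempt % len(candidates)]
        try:
            status, body = fetch(url, proxy)
        except Exception:
            # Assumption: a connection error is treated like a block.
            status, body = None, ""
        if status == 200:
            return body
        # Pause for a random interval before switching to the next proxy.
        time.sleep(random.uniform(*delay_range))
    raise BlockedError(f"no proxy succeeded after {max_attempts} attempts")
```

Passing the fetch function as a parameter keeps the rotation logic separate from the HTTP call, so it can be tried with a stub before pointing it at a real site.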

How to Install and Run the Project

Clone the repository: git clone https://github.com/GitProSolutions/BookScraperExcel.git
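Install the required dependencies by running pip install -r requirements.txt in the project directory. The repository's file listing shows only LICENSE, Main.py, and README.md, so if requirements.txt is missing, install the libraries directly: pip install requests beautifulsoup4 pandas
Run the Main.py file (the repository lists it with a capital M) in a Python environment with the required dependencies installed.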

How to Use the Project

In the Main.py file, set the BASE_URL variable to the URL of the Amazon search results page you want to scrape.
Set the num_pages variable to the number of result pages you want to scrape.
If desired, change the API_KEY and API_SECRET variables to your own values.
Run the Main.py file and wait for the book data to be scraped and exported to an Excel file.
To use the API, make a GET request to the following URL: http://localhost:5000/api/v1/books?key=YOUR_API_KEY. Replace YOUR_API_KEY with the value of the API_KEY variable in the Main.py file.
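The steps above can be sketched in code as follows. The search URL is a placeholder, and two things here are assumptions not stated in this README: that result pages are selected with a `page` query parameter, and that the API responds with JSON. The helper names (`build_page_urls`, `build_api_url`, `fetch_books`) are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

BASE_URL = "https://www.amazon.com/s?k=python+books"  # placeholder search URL
num_pages = 3
API_KEY = "YOUR_API_KEY"


def build_page_urls(base_url, num_pages):
    """Return one URL per results page by setting a `page` query parameter."""
    parts = urlsplit(base_url)
    # Drop any existing page value so it is not duplicated.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != "page"]
    return [
        urlunsplit(parts._replace(query=urlencode(query + [("page", str(n))])))
        for n in range(1, num_pages + 1)
    ]


def build_api_url(api_key, endpoint="http://localhost:5000/api/v1/books"):
    """Build the GET URL for the books endpoint, URL-encoding the key."""
    return f"{endpoint}?{urlencode({'key': api_key})}"


def fetch_books(api_key):
    """Request the book list from the locally running API.

    Assumption: the endpoint answers with JSON that decodes to a list of dicts.
    """
    import requests  # imported lazily; only needed for the live call

    response = requests.get(build_api_url(api_key), timeout=10)
    response.raise_for_status()
    return response.json()


page_urls = build_page_urls(BASE_URL, num_pages)
api_url = build_api_url(API_KEY)
```

`fetch_books(API_KEY)` only succeeds while Main.py is running and serving on port 5000.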

Credits

This project was built by GitProSolutions as a learning exercise in web scraping, API development, and Python programming.

License

This project is licensed under the MIT License. See the LICENSE file for more details.
