Introduction
In this tutorial, you'll learn how to build a web scraper that monitors Amazon phone deals during major sales events like the Spring Sale. This practical tool will help you track price drops and find the best deals from brands like Apple, Samsung, and Motorola. We'll use Python with BeautifulSoup and requests libraries to extract deal information from Amazon's product pages.
Prerequisites
- Python 3.7+ installed on your system
- Basic understanding of Python programming concepts
- Knowledge of HTML structure and CSS selectors
- Installed libraries: requests, beautifulsoup4, pandas
Step-by-Step Instructions
1. Setting Up Your Development Environment
1.1 Install Required Libraries
First, we need to install the necessary Python libraries for web scraping and data handling. Open your terminal and run:
pip install requests beautifulsoup4 pandas
This installs the libraries needed for making HTTP requests, parsing HTML, and organizing data in tabular format.
1.2 Create Project Structure
Create a new directory for your project and set up the basic file structure:
mkdir amazon_deal_scraper
cd amazon_deal_scraper
touch scraper.py
touch requirements.txt
The scraper.py file will contain our main scraping logic, and requirements.txt will list our dependencies.
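For example, requirements.txt might pin the three dependencies used in this tutorial. The version bounds below are illustrative — record whatever `pip freeze` reports on your machine:

```text
requests>=2.28
beautifulsoup4>=4.11
pandas>=1.5
```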
2. Building the Core Scraper Functionality
2.1 Create Basic Scraper Class
Let's start by creating the main scraper class that will handle all our scraping operations:
import requests
from bs4 import BeautifulSoup
import time
import pandas as pd

class AmazonDealScraper:
    def __init__(self):
        self.session = requests.Session()
        # Set a user agent to mimic a real browser
        self.session.headers.update({
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36'
        })
        self.deals = []

    def get_product_page(self, url):
        try:
            # A timeout keeps a stalled connection from hanging the scraper
            response = self.session.get(url, timeout=10)
            response.raise_for_status()  # Raises an HTTPError for bad responses
            return response.text
        except requests.RequestException as e:
            print(f"Error fetching page: {e}")
            return None
The User-Agent header is crucial because Amazon commonly blocks requests that lack a browser-like User-Agent. The session object maintains cookies and reuses connections across requests for efficiency.
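One optional hardening step (not part of the scraper above) is rotating among several browser-like User-Agent strings so consecutive requests don't all share one fingerprint. A minimal sketch — the strings in the pool are illustrative examples:

```python
import random

# A small pool of browser-like User-Agent strings (illustrative values).
USER_AGENTS = [
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36',
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1 Safari/605.1.15',
]

def random_user_agent():
    """Pick a User-Agent at random; call this before each request to vary the fingerprint."""
    return random.choice(USER_AGENTS)

print(random_user_agent().startswith('Mozilla/5.0'))  # → True
```

In the scraper, you would update `self.session.headers` with the chosen string before each `get_product_page` call.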
2.2 Parse Product Information
Now we need to extract key information from product pages:
# Add this method to the AmazonDealScraper class
def parse_product(self, html_content, product_url):
    soup = BeautifulSoup(html_content, 'html.parser')

    # Extract product title
    title_element = soup.find('span', {'id': 'productTitle'})
    title = title_element.get_text().strip() if title_element else 'Title not found'

    # Extract current price
    price_element = soup.find('span', {'class': 'a-price-whole'})
    price = price_element.get_text().strip() if price_element else 'Price not found'

    # Extract original price (if on sale)
    original_price_element = soup.find('span', {'class': 'a-price a-text-price'})
    original_price = original_price_element.get_text().strip() if original_price_element else 'No original price'

    # Extract discount percentage
    discount_element = soup.find('span', {'class': 'savingsPercentage'})
    discount = discount_element.get_text().strip() if discount_element else 'No discount'

    # Extract brand information
    brand_element = soup.find('a', {'id': 'bylineInfo'})
    brand = brand_element.get_text().strip() if brand_element else 'Brand not found'

    product_data = {
        'title': title,
        'price': price,
        'original_price': original_price,
        'discount': discount,
        'brand': brand,
        'url': product_url,
        'scraped_at': time.strftime('%Y-%m-%d %H:%M:%S')
    }
    return product_data
Each lookup targets a specific element ID or class in Amazon's product page markup. Understanding the page's HTML structure is key to successful scraping, and keep in mind that Amazon changes its markup regularly, so these selectors may need periodic updating.
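You can verify these lookups against a tiny synthetic page. The markup below only mimics the IDs and classes used above — real Amazon pages are far larger and change often:

```python
from bs4 import BeautifulSoup

# Minimal HTML that imitates the elements parse_product looks for.
sample_html = """
<html><body>
  <span id="productTitle"> Example Phone 128GB </span>
  <span class="a-price-whole">499</span>
  <span class="savingsPercentage">-17%</span>
</body></html>
"""

soup = BeautifulSoup(sample_html, 'html.parser')
title = soup.find('span', {'id': 'productTitle'}).get_text().strip()
price = soup.find('span', {'class': 'a-price-whole'}).get_text().strip()
discount = soup.find('span', {'class': 'savingsPercentage'}).get_text().strip()

print(title)     # → Example Phone 128GB
print(price)     # → 499
print(discount)  # → -17%
```

Testing selectors against saved or synthetic HTML like this is much faster than re-fetching live pages while you iterate.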
3. Implementing Deal Monitoring
3.1 Create Deal Monitoring Function
Next, we'll implement a function to monitor multiple products:
# These methods also belong to the AmazonDealScraper class
def monitor_deals(self, product_urls):
    print(f"Starting to monitor {len(product_urls)} products...")
    for url in product_urls:
        print(f"Scraping: {url}")
        html_content = self.get_product_page(url)
        if html_content:
            product_data = self.parse_product(html_content, url)
            self.deals.append(product_data)
            # Add a small delay to be respectful to Amazon's servers
            time.sleep(1)
        else:
            print(f"Failed to scrape: {url}")
    return self.deals

def save_to_csv(self, filename='amazon_deals.csv'):
    df = pd.DataFrame(self.deals)
    df.to_csv(filename, index=False)
    print(f"Deals saved to {filename}")
The 1-second delay between requests reduces the chance of being blocked by Amazon's anti-bot measures; for larger batches, longer randomized delays are safer.
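A fixed 1-second pause is easy for rate limiters to fingerprint; a randomized delay varies the request rhythm. A small sketch — the 1–3 second bounds are illustrative, not a recommendation from Amazon:

```python
import random

def polite_delay(minimum=1.0, maximum=3.0):
    """Return a random pause length, in seconds, between minimum and maximum."""
    return random.uniform(minimum, maximum)

# In monitor_deals, time.sleep(1) would become time.sleep(polite_delay()).
print(1.0 <= polite_delay() <= 3.0)  # → True
```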
3.2 Main Execution Function
Finally, let's create the main function that ties everything together:
def main():
    # Sample product URLs — these are placeholders; replace every entry with a
    # real Amazon product link (e.g. iPhone, Samsung Galaxy, Motorola listings)
    product_urls = [
        'https://www.amazon.com/dp/B09G9JQX8Z',  # placeholder
        'https://www.amazon.com/dp/B09G9JQX8Z',  # placeholder
        'https://www.amazon.com/dp/B09G9JQX8Z',  # placeholder
        # Add more URLs as needed
    ]

    scraper = AmazonDealScraper()
    deals = scraper.monitor_deals(product_urls)

    if deals:
        scraper.save_to_csv('spring_sale_deals.csv')
        print(f"Successfully scraped {len(deals)} deals")
    else:
        print("No deals were scraped")

if __name__ == '__main__':
    main()
This script will scrape multiple product pages and save the results to a CSV file for easy analysis.
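Note that the scraped prices land in the CSV as strings (e.g. "1,299."), so any numeric analysis needs a cleaning step first. A minimal sketch — the sample strings are assumptions about Amazon's US formatting, which varies by locale:

```python
def parse_price(text):
    """Convert a scraped price string like '$1,299.00' to a float, or None if unparseable."""
    cleaned = text.replace('$', '').replace(',', '').strip().rstrip('.')
    try:
        return float(cleaned)
    except ValueError:
        return None  # e.g. the 'Price not found' fallback string

print(parse_price('$1,299.00'))        # → 1299.0
print(parse_price('499'))              # → 499.0
print(parse_price('Price not found'))  # → None
```

With prices as floats, the pandas DataFrame can be sorted or filtered numerically before saving.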
4. Advanced Features and Enhancements
4.1 Add Email Notifications
For a more practical application, we can add email alerts when deals drop below a certain threshold:
import smtplib
from email.mime.text import MIMEText

# Add this method to the AmazonDealScraper class
def send_email_alert(self, subject, message):
    # Email configuration (replace with your email settings)
    smtp_server = "smtp.gmail.com"
    port = 587
    sender_email = "[email protected]"
    password = "your_app_password"
    receiver_email = "[email protected]"

    msg = MIMEText(message)
    msg['Subject'] = subject
    msg['From'] = sender_email
    msg['To'] = receiver_email

    try:
        server = smtplib.SMTP(smtp_server, port)
        server.starttls()
        server.login(sender_email, password)
        server.sendmail(sender_email, receiver_email, msg.as_string())
        server.quit()
        print("Email alert sent successfully")
    except Exception as e:
        print(f"Failed to send email: {e}")
This enhancement makes the scraper more actionable by automatically notifying you of significant deals.
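You also need a rule deciding when an alert fires. One simple option is a threshold on the scraped discount string — the `-23%` format and the 20% cutoff below are illustrative assumptions:

```python
def should_alert(discount_text, threshold=20):
    """Return True if a scraped discount like '-23%' meets or beats the threshold."""
    try:
        percent = abs(int(discount_text.strip().rstrip('%')))
    except ValueError:
        return False  # e.g. the 'No discount' fallback string

    return percent >= threshold

print(should_alert('-23%'))         # → True
print(should_alert('-10%'))         # → False
print(should_alert('No discount'))  # → False
```

In `monitor_deals`, you could call `self.send_email_alert(...)` whenever `should_alert(product_data['discount'])` returns True.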
4.2 Add Price Comparison Logic
Enhance the scraper to compare current prices against historical data:
def compare_with_history(self, current_price, historical_data):
    # Simple comparison logic
    if historical_data and current_price < min(historical_data):
        return True  # Price dropped below historical minimum
    return False
This feature helps identify when deals are particularly good compared to previous pricing.
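With a short price history, the check behaves like this — the prices below are made-up sample data, and the function is shown standalone (outside the class) purely for a quick demonstration:

```python
def compare_with_history(current_price, historical_data):
    """Return True when the current price beats every price in the history."""
    if historical_data and current_price < min(historical_data):
        return True  # Price dropped below historical minimum
    return False

history = [499.0, 479.0, 459.0]  # hypothetical prices from earlier scraping runs
print(compare_with_history(449.0, history))  # → True  (new all-time low)
print(compare_with_history(469.0, history))  # → False (459.0 was lower)
print(compare_with_history(449.0, []))       # → False (no history yet)
```

In practice, the historical data could come from previous runs' CSV files loaded with pandas.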
Summary
In this tutorial, you've learned how to build a comprehensive Amazon phone deal scraper that monitors product prices during major sales events. You've created a class-based scraper that handles multiple products, parses key deal information, and saves results to CSV format. The enhanced version includes email alerts and price comparison features that make it more practical for real-world use. This tool is particularly useful for tracking the best deals during Amazon's Spring Sale, helping you make informed purchasing decisions when major brands like Apple, Samsung, and Motorola offer significant discounts.



