Introduction
Smartwatches and smart rings have become essential tools for health monitoring and fitness tracking. In this tutorial, you'll learn how to work with wearable device data using Python to analyze sleep patterns, step counts, and activity metrics. We'll build a data processing pipeline that can handle JSON data from wearable devices, extract meaningful insights, and visualize trends over time.
Prerequisites
- Basic Python knowledge and familiarity with data analysis libraries
- Python 3.7+ installed on your system
- Required packages: pandas, matplotlib, seaborn, requests
- Sample wearable data in JSON format (we'll generate mock data for this tutorial)
Step-by-Step Instructions
Step 1: Setting Up Your Development Environment
Install Required Libraries
First, we need to install the necessary Python libraries for data processing and visualization. Open your terminal or command prompt and run:
pip install pandas matplotlib seaborn requests
This installs the core libraries we'll use: pandas for data manipulation, matplotlib and seaborn for visualization, and requests for handling API calls if needed.
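Once the install finishes, a quick sanity check that each library imports and reports a version can save debugging time later. This is an optional sketch, not part of the pipeline itself:

```python
# Optional sanity check: confirm the installs worked before moving on
import pandas as pd
import matplotlib
import seaborn as sns
import requests

for name, module in [("pandas", pd), ("matplotlib", matplotlib),
                     ("seaborn", sns), ("requests", requests)]:
    print(f"{name} {module.__version__}")
```

If any import fails here, fix the environment before continuing with the tutorial.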
Step 2: Creating Sample Wearable Data
Generate Mock Data Structure
Before analyzing real wearable data, let's create sample data that mimics what a smartwatch or smart ring would produce:
import json
import pandas as pd
import numpy as np
from datetime import datetime, timedelta
# Generate sample wearable data
sample_data = {
    "device_id": "WR-123456",
    "device_type": "smart_ring",
    "user_id": "user_001",
    "data": []
}
# Create 30 days of sample data
start_date = datetime.now() - timedelta(days=30)
for i in range(30):
    date = start_date + timedelta(days=i)
    sample_data["data"].append({
        "date": date.strftime("%Y-%m-%d"),
        # Cast numpy integers to plain int: json.dump cannot serialize np.int64
        "steps": int(np.random.randint(2000, 15000)),
        "sleep_hours": round(float(np.random.uniform(5, 9)), 1),
        "heart_rate": int(np.random.randint(60, 100)),
        "activity_level": np.random.choice(["low", "moderate", "high"], p=[0.3, 0.5, 0.2])
    })
# Save to JSON file
with open('wearable_data.json', 'w') as f:
    json.dump(sample_data, f, indent=2)
print("Sample wearable data generated successfully!")
This code creates realistic sample data that includes daily step counts, sleep duration, heart rate, and activity levels for a 30-day period. The data structure mimics what you'd receive from actual wearable devices.
Step 3: Loading and Exploring Wearable Data
Reading JSON Data into Pandas DataFrame
Now let's load the sample data and explore its structure:
import pandas as pd
import json
# Load the wearable data
with open('wearable_data.json', 'r') as f:
    data = json.load(f)
# Convert to DataFrame
df = pd.DataFrame(data['data'])
# Convert date column to datetime
df['date'] = pd.to_datetime(df['date'])
# Display basic information about the dataset
print("Dataset Info:")
print(df.info())
print("\nFirst 5 rows:")
print(df.head())
print("\nDataset shape:")
print(df.shape)
print("\nStatistical summary:")
print(df.describe())
This step is crucial because it allows us to understand our data structure before performing any analysis. The DataFrame format makes it easy to apply various analytical operations and visualizations.
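One concrete benefit of converting the date column to datetime is the `.dt` accessor, which unlocks calendar-aware features. Here's a minimal sketch on a toy three-row frame standing in for the loaded data:

```python
import pandas as pd

# Toy frame standing in for the loaded wearable data
df = pd.DataFrame({
    "date": pd.to_datetime(["2024-01-01", "2024-01-06", "2024-01-07"]),
    "steps": [8000, 12000, 4000],
})

# The .dt accessor derives calendar features from a datetime column
df["day_of_week"] = df["date"].dt.day_name()
df["is_weekend"] = df["date"].dt.dayofweek >= 5  # Monday=0, so 5/6 are Sat/Sun

print(df[["date", "day_of_week", "is_weekend"]])
```

Features like `is_weekend` are useful later for comparing weekday and weekend activity patterns.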
Step 4: Data Cleaning and Preprocessing
Handling Missing Values and Data Types
Real-world wearable data often contains inconsistencies. Let's clean our data:
# Check for missing values
print("Missing values per column:")
print(df.isnull().sum())
# Verify data types
print("\nData types:")
print(df.dtypes)
# Convert any necessary columns
df['steps'] = pd.to_numeric(df['steps'], errors='coerce')
# Remove any rows with missing data (optional approach)
# df = df.dropna()
# Fill missing values with median for numeric columns
numeric_columns = ['steps', 'sleep_hours', 'heart_rate']
for col in numeric_columns:
    if df[col].isnull().sum() > 0:
        # Assign back instead of fillna(inplace=True) on a column,
        # which is deprecated chained assignment in recent pandas
        df[col] = df[col].fillna(df[col].median())
print("\nCleaned data shape:")
print(df.shape)
print("\nMissing values after cleaning:")
print(df.isnull().sum())
Proper data cleaning ensures that our analysis is accurate and reliable. Missing values can skew results, so we either remove them or fill them with appropriate values like medians or means.
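For ordered daily readings, another option worth knowing is linear interpolation, which fills a gap from its neighbours rather than a global median. A minimal sketch on a toy series with one missing reading:

```python
import pandas as pd
import numpy as np

# Toy series with one gap, standing in for a column of daily readings
sleep = pd.Series([7.0, np.nan, 8.0, 6.5])

# Linear interpolation respects the local trend around the gap,
# which often suits time-series data better than a global median
filled = sleep.interpolate(method="linear")
print(filled.tolist())  # the NaN becomes the midpoint of its neighbours
```

Which strategy is right depends on the metric: interpolation suits slowly varying signals like sleep duration, while a median fill is safer for noisy ones.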
Step 5: Analyzing Activity Trends
Creating Activity Metrics and Insights
Let's extract meaningful insights from our wearable data:
# Calculate daily averages
avg_steps = df['steps'].mean()
avg_sleep = df['sleep_hours'].mean()
avg_heart_rate = df['heart_rate'].mean()
print(f"Average daily steps: {avg_steps:.0f}")
print(f"Average sleep hours: {avg_sleep:.1f}")
print(f"Average heart rate: {avg_heart_rate:.0f}")
# Identify peak activity days
peak_activity_days = df.nlargest(5, 'steps')
print("\nTop 5 most active days:")
print(peak_activity_days[['date', 'steps']])
# Analyze sleep patterns
sleep_analysis = df.groupby('activity_level')['sleep_hours'].agg(['mean', 'std']).round(2)
print("\nSleep analysis by activity level:")
print(sleep_analysis)
This analysis helps identify patterns in user behavior. For example, we can see if users sleep more on low activity days or if there's a correlation between activity levels and sleep quality.
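The correlation question can be answered directly with `DataFrame.corr()`. A minimal sketch on a small illustrative frame (in the tutorial this would be the full 30-day DataFrame):

```python
import pandas as pd

# Small illustrative frame; values chosen so the relationships are visible
df = pd.DataFrame({
    "steps": [3000, 6000, 9000, 12000],
    "sleep_hours": [6.0, 6.5, 7.5, 8.0],
    "heart_rate": [95, 85, 75, 70],
})

# Pairwise Pearson correlations between the numeric metrics
corr = df[["steps", "sleep_hours", "heart_rate"]].corr()
print(corr.round(2))
```

Values near +1 or -1 indicate strong linear relationships; with random mock data like ours, expect correlations near zero.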
Step 6: Visualizing Wearable Data
Creating Comprehensive Data Visualizations
Visualizations make it easy to understand trends in wearable data:
import matplotlib.pyplot as plt
import seaborn as sns
# Set style for better-looking plots
sns.set_style("whitegrid")
plt.figure(figsize=(15, 10))
# Plot 1: Steps over time
plt.subplot(2, 2, 1)
plt.plot(df['date'], df['steps'], marker='o', linewidth=1, markersize=3)
plt.title('Daily Steps Over Time')
plt.xlabel('Date')
plt.ylabel('Steps')
plt.xticks(rotation=45)
# Plot 2: Sleep hours over time
plt.subplot(2, 2, 2)
plt.plot(df['date'], df['sleep_hours'], marker='s', linewidth=1, markersize=3, color='orange')
plt.title('Daily Sleep Hours Over Time')
plt.xlabel('Date')
plt.ylabel('Sleep Hours')
plt.xticks(rotation=45)
# Plot 3: Heart rate distribution
plt.subplot(2, 2, 3)
sns.histplot(df['heart_rate'], kde=True, bins=20)
plt.title('Heart Rate Distribution')
plt.xlabel('Heart Rate (bpm)')
plt.ylabel('Frequency')
# Plot 4: Activity level distribution
plt.subplot(2, 2, 4)
activity_counts = df['activity_level'].value_counts()
plt.pie(activity_counts.values, labels=activity_counts.index, autopct='%1.1f%%')
plt.title('Activity Level Distribution')
plt.tight_layout()
plt.savefig('wearable_analysis.png', dpi=300, bbox_inches='tight')
plt.show()
print("Visualizations saved as wearable_analysis.png")
These visualizations provide immediate insights into user patterns and trends, making it easier to understand health and fitness data at a glance.
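Daily step counts are noisy, so a rolling average often makes the trend line far more readable. A minimal sketch, using a synthetic series in place of the real `df['steps']` column:

```python
import pandas as pd
import numpy as np

# Synthetic noisy daily steps standing in for the real column
rng = np.random.default_rng(0)
steps = pd.Series(rng.integers(2000, 15000, size=30))

# A 7-day rolling mean smooths day-to-day noise;
# min_periods=1 keeps the first week defined instead of NaN
rolling = steps.rolling(window=7, min_periods=1).mean()
print(rolling.tail())
```

Plotting `rolling` on top of the raw series (e.g. in the first subplot above) makes weekly trends stand out without hiding the daily variation.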
Step 7: Exporting Analysis Results
Creating Summary Reports
Finally, let's create a summary report of our findings:
# Create summary report
summary_report = {
    "device_id": data['device_id'],
    "analysis_period": f"{df['date'].min().strftime('%Y-%m-%d')} to {df['date'].max().strftime('%Y-%m-%d')}",
    "metrics": {
        "average_steps": int(round(avg_steps)),
        "average_sleep_hours": round(avg_sleep, 1),
        "average_heart_rate": int(round(avg_heart_rate)),
        "total_days": len(df),
        "most_active_day": peak_activity_days.iloc[0]['date'].strftime('%Y-%m-%d'),
        # Cast to plain int: json.dump cannot serialize numpy integers
        "highest_steps": int(peak_activity_days.iloc[0]['steps'])
    },
    "recommendations": []
}
# Add recommendations based on analysis
if avg_sleep < 7:
    summary_report['recommendations'].append("Consider improving sleep habits for better health")
if avg_steps < 5000:
    summary_report['recommendations'].append("Try to increase daily step count for better fitness")
# Save summary report
with open('wearable_summary_report.json', 'w') as f:
    json.dump(summary_report, f, indent=2)
print("Summary report generated successfully!")
print(json.dumps(summary_report, indent=2))
This structured report makes it easy to share insights with users or stakeholders, providing actionable recommendations based on their wearable data.
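Alongside the JSON report, a CSV export of the cleaned DataFrame is handy for stakeholders who work in spreadsheets. A minimal sketch (writing to an in-memory buffer here; swap in a filename like `'wearable_data.csv'` in practice):

```python
import pandas as pd
from io import StringIO

# Toy frame standing in for the cleaned wearable DataFrame
df = pd.DataFrame({
    "date": ["2024-01-01", "2024-01-02"],
    "steps": [8000, 12000],
})

# index=False drops pandas' row index from the exported file
buf = StringIO()
df.to_csv(buf, index=False)
print(buf.getvalue())
```

The same `to_csv` call accepts a file path directly, so `df.to_csv('wearable_data.csv', index=False)` produces a shareable file in one line.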
Summary
In this tutorial, we've built a complete workflow for analyzing wearable device data. We started by generating sample data that mimics real smartwatch and smart ring information, then loaded and cleaned the data using pandas. We performed meaningful analysis on activity trends, sleep patterns, and heart rate metrics, and created comprehensive visualizations to display insights. Finally, we exported a structured summary report with actionable recommendations. This pipeline can be extended to work with real wearable APIs, incorporate more sophisticated analysis techniques, or integrate with cloud services for scalable data processing.



