GM, Ford, and Stellantis have cut 20,000 white-collar jobs. AI is about to accelerate the trend.

Learn how to analyze workforce data and employment trends using Python. This tutorial teaches you to build a small sample dataset, clean it, and visualize the kind of employment patterns that AI adoption is changing in industries like automotive manufacturing.

Introduction

As major automakers like General Motors, Ford, and Stellantis cut thousands of white-collar jobs, artificial intelligence is becoming a key driver of these changes. This tutorial will teach you how to use Python to analyze workforce data, a skill that's increasingly important for understanding how AI impacts employment. You'll build a small sample dataset, clean it, and visualize employment trends using popular Python libraries.

Prerequisites

To follow this tutorial, you'll need:

A computer with internet access
Python installed (version 3.9 or higher recommended; recent pandas and matplotlib releases no longer support older versions such as 3.6)
Basic understanding of Python programming concepts
Access to a code editor or IDE (like VS Code or Jupyter Notebook)

Why these prerequisites? Python is the go-to language for data analysis, and having a basic understanding of it will help you grasp the concepts faster. We'll be using libraries like pandas for data manipulation and matplotlib for visualization.

Step-by-step Instructions

1. Install Required Python Libraries

First, we need to install the necessary Python packages. Open your terminal or command prompt and run:

pip install pandas matplotlib numpy

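Why? These libraries are essential for data analysis. pandas helps us work with structured data and matplotlib creates visualizations. numpy handles the numerical operations underneath pandas; this tutorial never imports it directly, and pip installs it as a pandas dependency either way.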
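
To confirm the installation worked, a quick check like the one below can help. It is a minimal sketch that only imports the three packages and prints their versions; the `versions` dictionary is an illustrative name, not part of the tutorial script.

```python
import matplotlib
import numpy as np
import pandas as pd

# Collect the installed version of each package (illustrative helper dict)
versions = {
    "pandas": pd.__version__,
    "matplotlib": matplotlib.__version__,
    "numpy": np.__version__,
}

for name, version in versions.items():
    print(f"{name}: {version}")
```

If any import fails with ModuleNotFoundError, the pip command above most likely ran against a different Python installation than the one executing the script.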

2. Create a Sample Dataset

Let's create a simple dataset that mimics the employment data from the automakers. Create a new Python file (e.g., workforce_analysis.py) and add the following code:

import pandas as pd

# Create sample data
sample_data = {
    'Company': ['GM', 'Ford', 'Stellantis', 'GM', 'Ford', 'Stellantis'],
    'Year': [2020, 2020, 2020, 2023, 2023, 2023],
    'White_Collar_Employees': [15000, 20000, 10000, 12000, 16000, 8000],
    'AI_Adoption_Level': ['Low', 'Low', 'Low', 'High', 'High', 'High']
}

# Create DataFrame
df = pd.DataFrame(sample_data)
print(df)

Why? This creates a simple dataset that represents employment numbers and AI adoption levels for different companies over time. The figures are invented for illustration and are not the automakers' actual headcounts, but they simulate the kind of data that companies analyze to make workforce decisions.

3. Load and Explore the Data

Now let's expand our script to rebuild the dataset in memory and explore it:

import pandas as pd
import matplotlib.pyplot as plt

# Sample data (as above)
sample_data = {
    'Company': ['GM', 'Ford', 'Stellantis', 'GM', 'Ford', 'Stellantis'],
    'Year': [2020, 2020, 2020, 2023, 2023, 2023],
    'White_Collar_Employees': [15000, 20000, 10000, 12000, 16000, 8000],
    'AI_Adoption_Level': ['Low', 'Low', 'Low', 'High', 'High', 'High']
}

df = pd.DataFrame(sample_data)

# Display basic information about the dataset
print("Dataset Info:")
df.info()  # info() prints its report directly and returns None, so no print() is needed
print("\nDataset Description:")
print(df.describe())
print("\nFirst few rows:")
print(df.head())

Why? Understanding your data is crucial. The info() method shows each column's data type and non-null count, which reveals missing values, while describe() gives statistical summaries of the numeric columns. head() lets us see the first few rows.
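
Because describe() only summarizes numeric columns by default, text columns such as Company need a separate look. The sketch below is one possible way to do that with the same sample data; `counts` and `totals` are illustrative names.

```python
import pandas as pd

# Same invented sample data as in the tutorial
df = pd.DataFrame({
    "Company": ["GM", "Ford", "Stellantis", "GM", "Ford", "Stellantis"],
    "Year": [2020, 2020, 2020, 2023, 2023, 2023],
    "White_Collar_Employees": [15000, 20000, 10000, 12000, 16000, 8000],
    "AI_Adoption_Level": ["Low", "Low", "Low", "High", "High", "High"],
})

# How many rows does each company contribute?
counts = df["Company"].value_counts()
print(counts)

# Total sample headcount per year across the three companies
totals = df.groupby("Year")["White_Collar_Employees"].sum()
print(totals)
```

With the sample figures, every company appears twice and the combined headcount falls from 45,000 in 2020 to 36,000 in 2023.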

4. Clean and Process the Data

Let's clean our data by checking for any inconsistencies:

# Check for missing values
print("Missing values:")
print(df.isnull().sum())

# Check for duplicate rows
print("\nDuplicate rows:")
print(df.duplicated().sum())

# Convert AI_Adoption_Level to numeric for analysis
# Mapping: Low = 1, High = 2
df['AI_Level_Numeric'] = df['AI_Adoption_Level'].map({'Low': 1, 'High': 2})

print("\nDataset after cleaning:")
print(df)

Why? Data cleaning ensures our analysis is accurate. We check for missing values and duplicates, which can skew our results. Converting text categories to numbers makes it easier to perform mathematical operations.
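
One caveat with map(): any label missing from the dictionary silently becomes NaN. The sketch below shows a possible check for that. The "Medium" label is hypothetical and does not appear in the tutorial's data; it is there only to trigger the case.

```python
import pandas as pd

mapping = {"Low": 1, "High": 2}

# Hypothetical raw column: "Medium" is not in the tutorial's data and is
# included only to show what happens to a label the mapping does not cover.
raw = pd.Series(["Low", "High", "Medium", "Low"], name="AI_Adoption_Level")

numeric = raw.map(mapping)

# Labels that were present in the input but came out as NaN after mapping
unmapped = raw[numeric.isna() & raw.notna()].unique()
if len(unmapped) > 0:
    print(f"Unmapped labels: {list(unmapped)}")
```

Running a check like this right after the mapping step makes it easier to catch typos or new categories before they distort later averages.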

5. Visualize Employment Trends

Now let's create visualizations to understand the employment trends:

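# Create a bar chart showing the most recent employee numbers by company.
# Plotting all six rows would stack each company's 2020 and 2023 bars on top
# of each other, so keep only the latest year.
latest_year = df['Year'].max()
latest = df[df['Year'] == latest_year]
plt.figure(figsize=(10, 6))
plt.bar(latest['Company'], latest['White_Collar_Employees'], color=['red', 'blue', 'green'])
plt.title(f'White-Collar Employees by Company ({latest_year})')
plt.xlabel('Company')
plt.ylabel('Number of Employees')
plt.show()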

# Create a line chart showing employee trends over time
plt.figure(figsize=(10, 6))
for company in df['Company'].unique():
    company_data = df[df['Company'] == company]
    plt.plot(company_data['Year'], company_data['White_Collar_Employees'], marker='o', label=company)

plt.title('White-Collar Employee Trends Over Time')
plt.xlabel('Year')
plt.ylabel('Number of Employees')
plt.legend()
plt.show()

Why? Visualizations help us understand patterns quickly. The bar chart shows current employee numbers, while the line chart shows how these numbers have changed over time. These patterns are what companies analyze when making workforce decisions.
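
To compare both years side by side in a single chart, one option is to pivot the data so each year becomes its own column and let pandas draw grouped bars. This is a sketch under that assumption; the `wide` name and the output file name are illustrative choices.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Same invented sample data as in the tutorial
df = pd.DataFrame({
    "Company": ["GM", "Ford", "Stellantis", "GM", "Ford", "Stellantis"],
    "Year": [2020, 2020, 2020, 2023, 2023, 2023],
    "White_Collar_Employees": [15000, 20000, 10000, 12000, 16000, 8000],
})

# One row per company, one column per year
wide = df.pivot(index="Company", columns="Year", values="White_Collar_Employees")

fig, ax = plt.subplots(figsize=(10, 6))
wide.plot(kind="bar", ax=ax, rot=0)  # one group of bars per company
ax.set_title("White-Collar Employees by Company and Year (sample data)")
ax.set_xlabel("Company")
ax.set_ylabel("Number of Employees")
ax.legend(title="Year")

# Assumed output file name; use plt.show() instead in an interactive session
fig.savefig("employees_by_company_and_year.png")
```

Grouped bars make the drop between the two years visible for each company without needing a separate line chart.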

6. Analyze AI Impact on Workforce

Let's calculate how AI adoption might affect employment numbers:

# Calculate average employees by AI level
avg_employees_by_ai = df.groupby('AI_Adoption_Level')['White_Collar_Employees'].mean()
print("Average employees by AI adoption level:")
print(avg_employees_by_ai)

# Calculate percentage change in employees
print("\nPercentage change in employees by company:")
for company in df['Company'].unique():
    # Sort by year so the first and last rows are the earliest and latest figures
    company_data = df[df['Company'] == company].sort_values('Year')
    if len(company_data) > 1:
        initial = company_data.iloc[0]['White_Collar_Employees']
        final = company_data.iloc[-1]['White_Collar_Employees']
        change = ((final - initial) / initial) * 100
        print(f"{company}: {change:.2f}%")

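Why? This analysis shows how workforce numbers differ across AI adoption levels and over time. It establishes correlation at most, not causation: in this sample, every 2020 row is 'Low' and every 2023 row is 'High', so the effect of AI adoption cannot be separated from anything else that changed between those years. In the real world, companies use similar calculations to justify workforce decisions.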
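
The same percentage change can be computed without a loop by pivoting the data first. The sketch below assumes exactly one row per company and year, as in the sample data; `wide`, `pct_change`, and `overlap` are illustrative names.

```python
import pandas as pd

# Same invented sample data as in the tutorial
df = pd.DataFrame({
    "Company": ["GM", "Ford", "Stellantis", "GM", "Ford", "Stellantis"],
    "Year": [2020, 2020, 2020, 2023, 2023, 2023],
    "White_Collar_Employees": [15000, 20000, 10000, 12000, 16000, 8000],
    "AI_Adoption_Level": ["Low", "Low", "Low", "High", "High", "High"],
})

# One row per company, one column per year
wide = df.pivot(index="Company", columns="Year", values="White_Collar_Employees")

first_year = wide.columns.min()
last_year = wide.columns.max()

# Percentage change from the earliest to the latest year, per company
pct_change = (wide[last_year] - wide[first_year]) / wide[first_year] * 100
print(pct_change.round(2))

# Cross-tabulate year against AI level to see how the two columns overlap
overlap = pd.crosstab(df["Year"], df["AI_Adoption_Level"])
print(overlap)
```

With the sample figures, all three companies show a 20% decline. The cross-tabulation also shows that each year maps to exactly one AI level, which is why this dataset cannot attribute the decline to AI adoption specifically.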

7. Save Your Analysis

Finally, let's save our cleaned dataset and analysis:

# Save cleaned dataset to CSV
df.to_csv('cleaned_workforce_data.csv', index=False)
print("\nCleaned dataset saved to 'cleaned_workforce_data.csv'")

# Save the analysis results
with open('workforce_analysis_results.txt', 'w') as f:
    f.write("Workforce Analysis Results\n")
    # to_string() writes only the values, without the Series' "Name"/"dtype" footer
    f.write(f"Average employees by AI level:\n{avg_employees_by_ai.to_string()}\n\n")
    f.write("Percentage change by company:\n")
    for company in df['Company'].unique():
        # Sort by year so the first and last rows are the earliest and latest figures
        company_data = df[df['Company'] == company].sort_values('Year')
        if len(company_data) > 1:
            initial = company_data.iloc[0]['White_Collar_Employees']
            final = company_data.iloc[-1]['White_Collar_Employees']
            change = ((final - initial) / initial) * 100
            f.write(f"{company}: {change:.2f}%\n")

print("\nAnalysis results saved to 'workforce_analysis_results.txt'")

Why? Saving your work ensures you don't lose your analysis and can share results with others. This is crucial in professional environments where data analysis often leads to important business decisions.
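
Before sharing a saved file, it can be worth reading it back to confirm nothing was lost. The sketch below does a round trip through a temporary directory so it leaves no files behind; in your own script you would read 'cleaned_workforce_data.csv' directly. The `same_shape` and `same_counts` names are illustrative.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Same invented sample data as in the tutorial
df = pd.DataFrame({
    "Company": ["GM", "Ford", "Stellantis", "GM", "Ford", "Stellantis"],
    "Year": [2020, 2020, 2020, 2023, 2023, 2023],
    "White_Collar_Employees": [15000, 20000, 10000, 12000, 16000, 8000],
    "AI_Adoption_Level": ["Low", "Low", "Low", "High", "High", "High"],
})

# Write the CSV to a temporary directory and read it straight back
with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = Path(tmp_dir) / "cleaned_workforce_data.csv"
    df.to_csv(csv_path, index=False)
    restored = pd.read_csv(csv_path)

# Compare the restored data with the original
same_shape = restored.shape == df.shape
same_counts = (
    restored["White_Collar_Employees"].tolist()
    == df["White_Collar_Employees"].tolist()
)
print(f"Same shape: {same_shape}, same employee counts: {same_counts}")
```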

Summary

In this tutorial, you've learned how to analyze workforce data using Python. You've created a sample dataset, cleaned and processed it, visualized employment trends, and compared workforce numbers across AI adoption levels and over time. These skills are essential for understanding how technology, including AI, is reshaping employment patterns in industries like automotive manufacturing.

As automakers like GM, Ford, and Stellantis continue to cut jobs, data analysis skills become increasingly valuable. This tutorial provides a foundation for understanding employment trends and how AI adoption affects workforce decisions. You can expand this analysis by using real datasets from government employment statistics or company reports to gain deeper insights into the changing nature of work.