I Tried DoorDash’s Tasks App and Saw the Bleak Future of AI Gig Work

Build a small data labeling system that simulates the gig work model behind AI training platforms such as DoorDash's Tasks app. The system assigns labeling tasks to workers and records the labels they submit.

Introduction

In this tutorial, you'll build a simple data labeling system that mimics how gig workers contribute to AI training by producing labeled datasets. It is a simplified simulation rather than a reconstruction of any real platform, but it shows in miniature how training data is collected and recorded, which helps in understanding where AI-powered gig work is heading.

Prerequisites

Basic Python knowledge
Understanding of machine learning concepts
Python libraries: pandas and pillow (numpy and requests are also installed in step 1, but the code in this tutorial does not call them)
Basic understanding of image processing and data labeling

Step-by-Step Instructions

1. Set Up Your Development Environment

First, create a virtual environment and install the required packages. This keeps the project's dependencies isolated from other Python projects on your machine.

python -m venv ai_tasks_env
source ai_tasks_env/bin/activate  # On Windows: ai_tasks_env\Scripts\activate
pip install pandas numpy pillow requests
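
Before moving on, it can help to confirm that the packages are importable from the activated environment. The snippet below is a minimal sketch using only the standard library; the helper name missing_modules and the REQUIRED_MODULES list are illustrative, not part of any package. Note that the pillow distribution is imported under the name PIL.

```python
import importlib.util

# Import names, not pip names: the "pillow" distribution is imported as "PIL".
REQUIRED_MODULES = ["pandas", "numpy", "PIL", "requests"]


def missing_modules(module_names):
    """Return the module names that cannot be found in the active environment."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

If anything is reported missing, re-run the pip install command with the virtual environment activated.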

2. Create the Data Labeling Structure

Next, we'll create a basic data structure to represent the labeling tasks that gig workers would encounter. This is a simplified model of how a platform like DoorDash's Tasks app might organize work.

import pandas as pd
import numpy as np
from PIL import Image
import os

class TaskData:
    """Stores labeling tasks in a DataFrame, one row per task."""

    COLUMNS = ['task_id', 'image_path', 'label', 'status', 'worker_id']

    def __init__(self):
        self.tasks = pd.DataFrame(columns=self.COLUMNS)

    def add_task(self, image_path, label, worker_id=None):
        # Task IDs are sequential, starting at 1
        task_id = len(self.tasks) + 1
        # Append the row in place; concatenating onto an empty DataFrame
        # can trigger a FutureWarning in recent pandas versions
        self.tasks.loc[len(self.tasks)] = [task_id, image_path, label, 'pending', worker_id]
        return task_id

    def get_pending_tasks(self):
        return self.tasks[self.tasks['status'] == 'pending']

    def mark_completed(self, task_id, worker_id, label):
        # Update status, worker, and label for the matching row in one assignment
        mask = self.tasks['task_id'] == task_id
        self.tasks.loc[mask, ['status', 'worker_id', 'label']] = ['completed', worker_id, label]
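
The class above trusts callers to move a task from pending to completed exactly once. As a hedged illustration of how that lifecycle could be enforced, here is a plain-Python sketch that works on a single task dictionary. The ALLOWED_TRANSITIONS table and the change_status helper are hypothetical names introduced for this example; the tutorial's own code only ever uses the two statuses shown.

```python
# Hypothetical transition table; the tutorial itself only uses these two statuses.
ALLOWED_TRANSITIONS = {
    "pending": {"completed"},
    "completed": set(),
}


def change_status(task, new_status):
    """Update task['status'] in place, rejecting moves the table does not allow."""
    current = task["status"]
    if new_status not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(
            f"Cannot move task {task['task_id']} from {current!r} to {new_status!r}"
        )
    task["status"] = new_status
    return task


task = {"task_id": 1, "image_path": "sample_images/laundry_0.jpg", "status": "pending"}
change_status(task, "completed")
print(task["status"])
```

A guard like this would stop a completed task from being relabeled by accident, at the cost of one extra lookup per update.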

3. Generate Sample Images for Training

We'll generate solid-color placeholder images that stand in for the photos gig workers might be asked to label. These files simulate the raw input to the training data collection process.

def create_sample_images(directory='sample_images'):
    # Create the output directory if it does not already exist
    os.makedirs(directory, exist_ok=True)

    # One solid-color 224x224 placeholder per label; the color varies with the index
    for i, label in enumerate(['laundry', 'scrambled_eggs', 'park_walk']):
        img = Image.new('RGB', (224, 224), color=(i * 50, i * 30, i * 20))
        filename = f'{label}_{i}.jpg'
        img.save(os.path.join(directory, filename))
        print(f'Created {filename}')

create_sample_images()

4. Implement Task Assignment Logic

This step simulates how tasks are assigned to gig workers. We'll create a simple assigner that hands a task to an available worker and counts how many tasks each worker completes. Worker skill levels are not modeled in this class.

class TaskAssigner:
    def __init__(self, worker_pool):
        # Each worker is a dict with 'id', 'status', and 'completed_tasks' keys
        self.workers = worker_pool

    def assign_task(self, task_data, worker_id=None):
        # Assign to the requested worker if one is given,
        # otherwise to the first available worker
        for worker in self.workers:
            if worker_id is not None and worker['id'] != worker_id:
                continue
            if worker['status'] == 'available':
                worker['status'] = 'busy'
                worker['current_task'] = task_data
                return worker['id']
        return None

    def complete_task(self, worker_id, label):
        # The label is accepted for symmetry with mark_completed but is not stored here
        for worker in self.workers:
            if worker['id'] == worker_id:
                worker['status'] = 'available'
                worker['current_task'] = None
                worker['completed_tasks'] += 1
                break
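
The assigner above looks only at availability. If skill levels were to influence assignment, one possible extension is sketched below. It assumes each worker dictionary carries a hypothetical 'skills' mapping from skill name to a numeric level; neither that field nor the pick_worker function exists in the tutorial's code.

```python
def pick_worker(workers, required_skill):
    """Return the available worker with the highest level in required_skill, or None."""
    candidates = [
        w for w in workers
        if w["status"] == "available" and required_skill in w.get("skills", {})
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda w: w["skills"][required_skill])


# Hypothetical worker pool with an assumed 'skills' field
workers = [
    {"id": 1, "status": "available", "skills": {"image_labeling": 2}},
    {"id": 2, "status": "busy", "skills": {"image_labeling": 5}},
    {"id": 3, "status": "available", "skills": {"image_labeling": 4}},
]

chosen = pick_worker(workers, "image_labeling")
print(chosen["id"] if chosen else None)  # prints 3: worker 2 is more skilled but busy
```

Picking the highest-rated available worker is only one policy; a real platform might also weigh fairness, location, or past accuracy.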

5. Simulate Worker Interaction

Now we'll simulate how a gig worker would interact with the tasks, including viewing, labeling, and submitting work.

def simulate_worker_interaction(task_data, assigner, worker_id):
    print(f'\nWorker {worker_id} requesting a task')

    # Get pending tasks
    pending = task_data.get_pending_tasks()
    if len(pending) == 0:
        print('No pending tasks available')
        return

    # Take the first pending task and mark a worker as busy with it
    first_task = pending.iloc[0]
    task_id = first_task['task_id']
    assigned_id = assigner.assign_task(first_task, worker_id)
    if assigned_id is None:
        print('No worker is available')
        return
    print(f'Assigned task {task_id} to worker {assigned_id}')

    # Simulate worker labeling (blocks until a label is typed in the terminal)
    worker_label = input(f'Worker {assigned_id}, please label task {task_id}: ').strip()

    # Record the label on the task
    task_data.mark_completed(task_id, assigned_id, worker_label)
    print(f'Task {task_id} completed with label: {worker_label}')

    # Free the worker and count the completed task
    assigner.complete_task(assigned_id, worker_label)
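
Because labels are typed as free text, two workers may enter the same answer in different forms, such as "Scrambled Eggs" and "scrambled_eggs". The sketch below shows one way such input could be normalized before it is stored or compared. The function names normalize_label and labels_match are illustrative assumptions, and the normalization rules are a simple choice rather than a standard.

```python
import re


def normalize_label(raw):
    """Lowercase, trim, and collapse runs of spaces or hyphens into one underscore."""
    return re.sub(r"[\s\-]+", "_", raw.strip().lower())


def labels_match(expected, submitted):
    """Compare two labels after normalizing both."""
    return normalize_label(expected) == normalize_label(submitted)


print(normalize_label("  Scrambled Eggs "))  # prints scrambled_eggs
print(labels_match("park_walk", "Park-Walk"))  # prints True
```

Normalizing at submission time keeps the stored labels consistent, which matters once the data is used for training.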

6. Build the Complete Simulation

Finally, we'll put everything together to create a complete simulation that demonstrates how the gig work system functions.

def main_simulation():
    # Initialize data and workers
    task_data = TaskData()
    workers = [
        {'id': 1, 'status': 'available', 'completed_tasks': 0},
        {'id': 2, 'status': 'available', 'completed_tasks': 0},
        {'id': 3, 'status': 'available', 'completed_tasks': 0}
    ]
    
    assigner = TaskAssigner(workers)
    
    # Add sample tasks. The label given here is only a placeholder;
    # mark_completed replaces it with the label the worker submits.
    sample_tasks = [
        ('sample_images/laundry_0.jpg', 'laundry'),
        ('sample_images/scrambled_eggs_1.jpg', 'scrambled_eggs'),
        ('sample_images/park_walk_2.jpg', 'park_walk')
    ]

    for img_path, label in sample_tasks:
        task_data.add_task(img_path, label)
    
    print('Starting AI Training Task Simulation')
    print(f'Total tasks: {len(task_data.tasks)}')
    
    # Simulate one interaction per worker in the pool
    for worker in workers:
        simulate_worker_interaction(task_data, assigner, worker['id'])
        
    print('\nFinal Task Status:')
    print(task_data.tasks)
    
    print('\nWorker Statistics:')
    for worker in workers:
        print(f'Worker {worker["id"]}: {worker["completed_tasks"]} tasks completed')

# Run the simulation
if __name__ == '__main__':
    main_simulation()
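
Once tasks are completed, the labeled rows would typically be handed off for training. The following sketch writes only the completed rows as CSV text using the standard library. It assumes the rows arrive as a list of dictionaries with the same keys as the task table (one way to obtain them would be task_data.tasks.to_dict('records')); the function name completed_tasks_csv and the FIELDS list are hypothetical.

```python
import csv
import io

# Columns to export; 'status' is dropped because every exported row is completed
FIELDS = ["task_id", "image_path", "label", "worker_id"]


def completed_tasks_csv(rows):
    """Return CSV text containing only the rows whose status is 'completed'."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    for row in rows:
        if row["status"] == "completed":
            writer.writerow(row)
    return buffer.getvalue()


rows = [
    {"task_id": 1, "image_path": "sample_images/laundry_0.jpg",
     "label": "laundry", "status": "completed", "worker_id": 2},
    {"task_id": 2, "image_path": "sample_images/park_walk_2.jpg",
     "label": "park_walk", "status": "pending", "worker_id": None},
]
print(completed_tasks_csv(rows))
```

Returning a string keeps the example easy to test; writing the same text to a file is a small change.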

Summary

This tutorial built a simplified simulation of how AI training platforms like DoorDash's Tasks app might function. You saw how tasks are structured, assigned to workers, and recorded once labeled. The system mimics real-world gig work in which workers contribute to AI training by labeling data. It also illustrates how AI development relies on human-in-the-loop systems and how gig economy platforms are evolving to support AI training infrastructure.

The key takeaway is that these systems represent a new form of work where human labor directly contributes to AI model development, creating both opportunities and challenges for gig workers in the AI era.