Penemue raises €1.7M to scale AI hate speech detection

April 3, 2026

Learn to build a real-time hate speech detection system using Python and transformer models, similar to Penemue's AI platform for identifying online hate and digital violence.

Introduction

In the wake of Penemue's €1.7M funding to scale AI hate speech detection, this tutorial will guide you through building a real-time hate speech detection system using Python and machine learning. This practical implementation will help you understand how AI systems can identify online hate, digital violence, and disinformation across multiple languages. You'll learn to create a scalable solution that can process text streams and classify content with confidence scores.

Prerequisites

  • Python 3.8 or higher installed on your system
  • Basic understanding of machine learning concepts
  • Knowledge of Python libraries: scikit-learn, transformers, pandas, numpy
  • Access to a command line or terminal
  • Internet connection for downloading model files

Step-by-Step Instructions

1. Set up your Python environment

First, create a virtual environment to isolate your project dependencies:

python -m venv hate_detector_env
source hate_detector_env/bin/activate  # On Windows: hate_detector_env\Scripts\activate

Install required packages:

pip install transformers torch scikit-learn pandas numpy

Why: Creating a virtual environment ensures you don't interfere with system-wide packages and can manage dependencies effectively for your hate speech detection system.

2. Initialize the hate speech detection model

We'll use a pre-trained transformer model fine-tuned for hate speech detection. Create a Python file called hate_detector.py:

from transformers import pipeline

# Initialize the hate speech detection pipeline.
# return_all_scores is deprecated in recent transformers releases;
# top_k=None returns the score for every label instead.
classifier = pipeline(
    "text-classification",
    model="cardiffnlp/twitter-roberta-base-hate-latest",
    top_k=None
)

Why: The RoBERTa model is specifically trained on social media text and is effective at detecting hate speech patterns in online content.

3. Create a text classification function

Add this function to your hate_detector.py file:

def classify_text(text):
    """Classify text for hate speech with confidence scores."""
    results = classifier(text)

    # Pipelines may wrap a single input's scores in an extra list; unwrap it
    if results and isinstance(results[0], list):
        results = results[0]

    classification = []
    for result in results:
        label = result['label']
        score = result['score']

        # Some checkpoints ship without readable label names; map the
        # generic LABEL_0/LABEL_1 ids to something interpretable
        if label == 'LABEL_0':
            label = 'NOT_HATE'
        elif label == 'LABEL_1':
            label = 'HATE'

        classification.append({
            'label': label,
            'confidence': score
        })

    return classification

Why: This function transforms the model's raw output into a more interpretable format, making it easier to understand detection results and confidence levels.
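Once the scores are in this shape, downstream code usually just needs the most likely label. A minimal sketch of that post-processing step, using a mocked classification result (the scores are illustrative, so no model download is needed to try it):

```python
def top_label(classification):
    """Return the label with the highest confidence from classify_text output."""
    return max(classification, key=lambda c: c['confidence'])['label']

# Mocked output in the same shape classify_text produces
mocked = [
    {'label': 'NOT_HATE', 'confidence': 0.93},
    {'label': 'HATE', 'confidence': 0.07},
]
print(top_label(mocked))  # NOT_HATE
```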

4. Implement real-time processing

Add a function to process multiple texts:

def process_text_stream(texts):
    """Process a list of texts and return classification results"""
    results = []
    
    for text in texts:
        classification = classify_text(text)
        results.append({
            'text': text,
            'classification': classification
        })
    
    return results

Why: This allows you to process batches of text, simulating how Penemue might handle real-time streams from social media platforms or news feeds.
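Looping one text at a time is the simplest approach, but Hugging Face pipelines also accept a list of strings, which is typically faster on GPU. A small helper for chunking a stream into fixed-size batches before handing each batch to the pipeline; `batched` and the batch size here are illustrative additions, not part of the original code:

```python
def batched(texts, batch_size=16):
    """Yield successive fixed-size batches from a list of texts."""
    for i in range(0, len(texts), batch_size):
        yield texts[i:i + batch_size]

# Each batch could then be passed to the pipeline in one call
batches = list(batched([f"text {i}" for i in range(5)], batch_size=2))
print([len(b) for b in batches])  # [2, 2, 1]
```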

5. Create a sample dataset for testing

Add this test data to your file:

# Sample texts for testing
sample_texts = [
    "I love this new product! It's amazing!",
    "You are such an idiot. This is terrible.",
    "The weather today is beautiful.",
    "This policy is discriminatory and harmful.",
    "I hate when people are rude.",
    "We should all respect each other regardless of background.",
    "This is the worst experience ever.",
    "People with different opinions should be respected.",
    "I can't believe how stupid this is.",
    "We need to work together to solve problems."
]

Why: Having a diverse set of sample texts helps verify that your system can correctly identify both hate speech and neutral content.

6. Run the detection system

Add this code to execute your system:

# Process sample texts
results = process_text_stream(sample_texts)

# Display results
for result in results:
    print(f"Text: {result['text']}")
    for classification in result['classification']:
        print(f"  Label: {classification['label']}, Confidence: {classification['confidence']:.4f}")
    print()

Why: This execution demonstrates how your system processes different types of text and provides confidence scores for each classification, similar to how Penemue's system would work with real-time data streams.
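Confidence scores are only useful if something acts on them. One common pattern is to map scores to moderation actions with thresholds; the cut-offs below are assumptions for illustration and would need tuning against labelled data:

```python
HATE_THRESHOLD = 0.8  # assumed cut-off; tune on a validation set

def moderation_decision(classification, threshold=HATE_THRESHOLD):
    """Map classify_text output to a simple moderation action."""
    scores = {c['label']: c['confidence'] for c in classification}
    hate_score = scores.get('HATE', 0.0)
    if hate_score >= threshold:
        return 'flag'
    if hate_score >= 0.5:
        return 'review'  # borderline: route to a human moderator
    return 'allow'

print(moderation_decision([{'label': 'HATE', 'confidence': 0.91},
                           {'label': 'NOT_HATE', 'confidence': 0.09}]))  # flag
```

Keeping a "review" band between the two thresholds is how many moderation systems balance false positives against missed hate speech.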

7. Extend for multilingual support

To support multiple languages, modify your approach:

def multilingual_classifier(text, language='en'):
    """Classify text in multiple languages."""
    # Simplified routing - in practice you'd detect the language
    # automatically and maintain one model per supported language
    if language == 'de':  # German
        model_name = "oliverguhr/german-hate-speech-bert"
    else:  # Default to English
        model_name = "cardiffnlp/twitter-roberta-base-hate-latest"

    # Note: building the pipeline on every call reloads the model
    # weights - cache one pipeline per model name in production
    local_classifier = pipeline(
        "text-classification",
        model=model_name,
        top_k=None
    )

    return local_classifier(text)

Why: Real-world applications like Penemue must handle multiple languages, so this extension prepares your system for broader deployment across different linguistic contexts.
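Re-initializing the pipeline on every call reloads the model weights, which is slow. Caching one pipeline per model name avoids this; a sketch using functools.lru_cache, with a stand-in loader (a hypothetical placeholder, not the real pipeline call) so the caching behaviour is visible without downloading any weights:

```python
from functools import lru_cache

LOAD_COUNT = {'n': 0}

@lru_cache(maxsize=4)
def get_classifier(model_name):
    """Stand-in for pipeline(...) - in the real system this loads the model."""
    LOAD_COUNT['n'] += 1
    return f"classifier-for-{model_name}"

get_classifier("en-model")
get_classifier("en-model")  # served from the cache, no reload
get_classifier("de-model")
print(LOAD_COUNT['n'])  # 2 - each model loaded exactly once
```

In the real system, the body of `get_classifier` would call `pipeline("text-classification", model=model_name, top_k=None)`.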

8. Add logging and metrics

Enhance your system with logging capabilities:

import logging

# Configure logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

def classify_with_logging(text):
    """Classify text with logging for monitoring"""
    try:
        results = classify_text(text)
        logger.info(f"Processed text: {text[:50]}...")
        logger.info(f"Classification results: {results}")
        return results
    except Exception as e:
        logger.error(f"Error processing text: {e}")
        return None

Why: Production systems require monitoring and logging to track performance, identify issues, and maintain system health - essential for real-time applications like Penemue's platform.
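Beyond logging, a production system needs quantitative metrics. Since scikit-learn is already a prerequisite, its classification_report gives per-class precision, recall, and F1 once you have gold labels; the labels and predictions below are hypothetical, for illustration only:

```python
from sklearn.metrics import classification_report

# Hypothetical gold labels and system predictions for a small evaluation set
y_true = ['HATE', 'NOT_HATE', 'NOT_HATE', 'HATE', 'NOT_HATE']
y_pred = ['HATE', 'NOT_HATE', 'HATE', 'HATE', 'NOT_HATE']

print(classification_report(y_true, y_pred, digits=3))
```

Recall on the HATE class is usually the metric to watch in moderation settings, since missed hate speech is typically costlier than a false flag routed to human review.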

Summary

This tutorial has walked you through creating a hate speech detection system using transformer models, similar to the technology behind Penemue's platform. You've learned to set up the environment, implement text classification, process real-time streams, and extend functionality for multilingual support. The system provides confidence scores for each classification, which is crucial for distinguishing between hate speech and legitimate discourse. While this implementation focuses on English, the framework can be extended to support the 89 languages mentioned in Penemue's coverage. This hands-on approach gives you practical experience in building scalable AI systems for content moderation and digital violence detection.

Source: TNW Neural
