Skip to content

CPU Isolation and Task Affinity for Multicore Optimization

Introduction

Raspberry Pi 4 and 5 come with quad-core ARM processors, but by default, the Linux kernel distributes processes across all cores dynamically. While this works well for general-purpose computing, certain workloads benefit from dedicating specific CPU cores to specific tasks—eliminating context switching, reducing cache misses, and ensuring predictable performance.

This guide covers advanced CPU management techniques:

  • CPU Isolation: Reserving cores exclusively for specific processes
  • Task Affinity: Pinning processes to specific cores
  • IRQ Affinity: Routing hardware interrupts to designated cores
  • Cgroups v2: Resource control and CPU quotas
  • Scheduler Policies: Real-time scheduling (SCHED_FIFO, SCHED_RR, SCHED_DEADLINE)

These techniques are valuable for:

  • Real-time applications: Audio processing, robotics control, video streaming
  • Network appliances: Routers, firewalls, VPN gateways
  • High-performance computing: Scientific simulations, data processing
  • Edge AI: Separating inference from data collection
  • Low-latency systems: Trading bots, sensor fusion, motor control

Understanding CPU Architecture

Raspberry Pi CPU Topology

Check your CPU configuration:

# View CPU information
lscpu

# Output for Raspberry Pi 5:
# Architecture:        aarch64
# CPU(s):              4
# On-line CPU(s) list: 0-3
# Model name:          ARM Cortex-A76
# Thread(s) per core:  1
# Core(s) per socket:  4
# Socket(s):           1

Key Points: - Raspberry Pi 4: 4× Cortex-A72 cores (1.5 GHz, can overclock to 2.0 GHz) - Raspberry Pi 5: 4× Cortex-A76 cores (2.4 GHz) - No SMT/Hyper-Threading: Each CPU number = one physical core - NUMA: Single memory domain (uniform memory access)

CPU Numbering

1
2
3
4
5
6
# List all CPUs and their current frequency
cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_cur_freq

# Check CPU topology
ls /sys/devices/system/cpu/cpu0/topology/
# core_id  core_siblings  core_siblings_list  physical_package_id  thread_siblings

CPUs are numbered 0-3: - CPU0: Often handles most system interrupts by default - CPU1-3: Available for workload distribution

CPU Isolation with isolcpus

CPU isolation prevents the kernel scheduler from automatically placing processes on designated cores. Isolated cores remain idle unless explicitly assigned tasks.

Basic Isolation

Edit /boot/firmware/cmdline.txt (or /boot/cmdline.txt on older systems):

1
2
3
4
5
# Before (one long line):
console=serial0,115200 console=tty1 root=PARTUUID=... rootfstype=ext4 rootwait

# After - isolate CPU2 and CPU3:
console=serial0,115200 console=tty1 root=PARTUUID=... rootfstype=ext4 rootwait isolcpus=2,3

Important: This is a single line. Add isolcpus=2,3 at the end.

Reboot and verify:

sudo reboot

# After reboot, check isolated CPUs
cat /sys/devices/system/cpu/isolated
# Output: 2-3

# Check online CPUs available to scheduler
cat /sys/devices/system/cpu/present
# Output: 0-3 (all present)

# Only CPU0 and CPU1 will show activity in general processes
top
# Press '1' to see per-CPU usage

Advanced Isolation Options

1
2
3
4
5
6
7
8
# Isolate with domain isolation (prevents load balancing)
isolcpus=domain,2,3

# Isolate with managed IRQ exclusion
isolcpus=domain,managed_irq,2,3

# Multiple ranges
isolcpus=1-3  # Isolate CPU1, CPU2, CPU3; only CPU0 for general use

Isolation Modes: - domain: Exclude from SMP load balancing domains - managed_irq: Don't route managed interrupts to isolated CPUs - Without options: Basic isolation (processes can still be manually assigned)

Testing Isolation

1
2
3
4
5
6
# Run stress test on non-isolated CPUs
stress-ng --cpu 2 --timeout 60s &

# Monitor CPU usage
htop
# CPU2 and CPU3 should remain at 0% usage

Task Affinity with taskset

Task affinity pins processes to specific CPU cores, preventing migration.

CPU Mask Format

CPUs can be specified as: - List: 0,2 (CPU0 and CPU2) - Range: 0-3 (CPU0 through CPU3) - Hex mask: 0xF (binary 1111 = all 4 CPUs)

1
2
3
4
5
6
7
8
# CPU mask calculation:
# CPU0 = bit 0 = 0x1
# CPU1 = bit 1 = 0x2
# CPU2 = bit 2 = 0x4
# CPU3 = bit 3 = 0x8
# CPU0+CPU1 = 0x3
# CPU2+CPU3 = 0xC
# All CPUs = 0xF

Pin Existing Process

# Find process PID
pidof firefox
# Output: 1234

# Check current affinity
taskset -p 1234
# Output: pid 1234's current affinity mask: f (all CPUs)

# Pin to CPU2 only
sudo taskset -p -c 2 1234
# Output: pid 1234's current affinity mask: 4
# pid 1234's new affinity mask: 4

# Pin to CPU2 and CPU3
sudo taskset -p -c 2,3 1234

# Verify
taskset -p 1234
# Output: c (binary 1100 = CPU2 and CPU3)

Launch Process with Affinity

# Run on CPU3 only
taskset -c 3 ./my_program

# Run on CPU2 and CPU3
taskset -c 2,3 nice -n -10 ./high_priority_task

# Run CPU-intensive task on isolated CPU
taskset -c 2 stress-ng --cpu 1 --timeout 60s

# Using hex mask
taskset 0x4 ./program  # CPU2 only (binary 0100)

Affinity for All Threads

1
2
3
4
5
# Pin process and all its threads
taskset -a -c 2,3 ./multithreaded_app

# Pin existing process tree
taskset -a -p -c 2,3 <PID>

Practical Example: Network Server

#!/bin/bash
# Pin network server to isolated CPUs

# Isolate CPU2 and CPU3 (set in /boot/firmware/cmdline.txt)
# Then run server:

# Start nginx on CPU2
taskset -c 2 nginx

# Start redis on CPU3
taskset -c 3 redis-server /etc/redis/redis.conf

# Database on CPU0-1 (non-isolated, for balanced I/O)
taskset -c 0,1 mongod --config /etc/mongod.conf

IRQ Affinity

Hardware interrupts (IRQs) can cause jitter on isolated CPUs. Route interrupts to specific cores.

View Current IRQ Assignment

# List all IRQs and their CPU affinity
cat /proc/interrupts

# Example output:
#            CPU0       CPU1       CPU2       CPU3
#  30:      12345       6789       4567       2345  GIC-0  27 Level     arch_timer
#  34:       1234          0          0          0  GIC-0  65 Level     fe00b880.mailbox
#  39:      56789      23456      12345       6789  GIC-0  189 Level    mmc1

# Check affinity of specific IRQ (e.g., IRQ 39)
cat /proc/irq/39/smp_affinity
# Output: f (hex) = 1111 (binary) = all CPUs

cat /proc/irq/39/smp_affinity_list
# Output: 0-3

Set IRQ Affinity

# Pin IRQ 39 to CPU0 only
echo 1 | sudo tee /proc/irq/39/smp_affinity
# 1 = binary 0001 = CPU0

# Pin to CPU0 and CPU1
echo 3 | sudo tee /proc/irq/39/smp_affinity
# 3 = binary 0011 = CPU0,CPU1

# Using list format (easier)
echo 0,1 | sudo tee /proc/irq/39/smp_affinity_list

# Pin Ethernet interrupts to CPU0
for irq in $(grep eth0 /proc/interrupts | cut -d: -f1); do
    echo 0 | sudo tee /proc/irq/$irq/smp_affinity_list
done

Disable IRQ Balancing

The irqbalance daemon automatically distributes IRQs. Disable it for manual control:

1
2
3
4
5
6
# Stop and disable irqbalance
sudo systemctl stop irqbalance
sudo systemctl disable irqbalance

# Verify it's stopped
systemctl status irqbalance

Automated IRQ Setup Script

#!/bin/bash
# irq_setup.sh - Pin all IRQs to CPU0 and CPU1

# Disable IRQ balancing
systemctl stop irqbalance

# Route all IRQs to CPU0 and CPU1
for irq in /proc/irq/*/smp_affinity_list; do
    echo "0,1" > "$irq" 2>/dev/null || true
done

# Verify
echo "IRQ affinity set:"
grep -H . /proc/irq/*/smp_affinity_list 2>/dev/null | head -n 10

Make it persistent:

# Create systemd service
sudo tee /etc/systemd/system/irq-affinity.service <<EOF
[Unit]
Description=Set IRQ Affinity
After=network.target

[Service]
Type=oneshot
ExecStart=/usr/local/bin/irq_setup.sh
RemainAfterExit=yes

[Install]
WantedBy=multi-user.target
EOF

sudo chmod +x /usr/local/bin/irq_setup.sh
sudo systemctl enable irq-affinity
sudo systemctl start irq-affinity

Cgroups v2 CPU Management

Cgroups provide fine-grained resource control. Raspberry Pi OS uses cgroups v2 by default (kernel 5.10+).

Verify Cgroups v2

# Check cgroup version
mount | grep cgroup
# Should see: cgroup2 on /sys/fs/cgroup type cgroup2

# Enable CPU controller if not active
cat /sys/fs/cgroup/cgroup.controllers
# Should include: cpu memory pids

# If cpu is missing, enable it
echo "+cpu +cpuset" | sudo tee /sys/fs/cgroup/cgroup.subtree_control

Create CPU-Limited Cgroup

# Create cgroup for limited tasks
sudo mkdir -p /sys/fs/cgroup/limited_cpu

# Set CPU quota: 50% of one core
# cpu.max format: $MAX $PERIOD (microseconds)
# 50000 100000 = 50% of one core
echo "50000 100000" | sudo tee /sys/fs/cgroup/limited_cpu/cpu.max

# Restrict to CPU0 and CPU1 only
echo "0-1" | sudo tee /sys/fs/cgroup/limited_cpu/cpuset.cpus

# Add process to cgroup
echo $PID | sudo tee /sys/fs/cgroup/limited_cpu/cgroup.procs

# Verify
cat /sys/fs/cgroup/limited_cpu/cpu.max
# Output: 50000 100000

cat /sys/fs/cgroup/limited_cpu/cpuset.cpus
# Output: 0-1

CPU Weight (Proportional Share)

# Create two cgroups with different CPU weights
sudo mkdir -p /sys/fs/cgroup/high_priority
sudo mkdir -p /sys/fs/cgroup/low_priority

# High priority gets 2× CPU time of low priority
echo "200" | sudo tee /sys/fs/cgroup/high_priority/cpu.weight
echo "100" | sudo tee /sys/fs/cgroup/low_priority/cpu.weight

# Assign processes
echo $PID1 | sudo tee /sys/fs/cgroup/high_priority/cgroup.procs
echo $PID2 | sudo tee /sys/fs/cgroup/low_priority/cgroup.procs

Systemd Integration

Systemd services can use cgroup settings directly:

# Create service with CPU affinity
sudo tee /etc/systemd/system/isolated_task.service <<EOF
[Unit]
Description=Isolated CPU Task
After=network.target

[Service]
Type=simple
ExecStart=/usr/local/bin/my_realtime_app
Restart=always

# CPU affinity
CPUAffinity=2 3

# CPU quota: 150% (1.5 cores)
CPUQuota=150%

# Real-time priority
Nice=-10

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl daemon-reload
sudo systemctl start isolated_task
sudo systemctl status isolated_task

Monitoring Cgroup Usage

# View CPU stats for cgroup
cat /sys/fs/cgroup/limited_cpu/cpu.stat
# Output:
# usage_usec 1234567
# user_usec 1000000
# system_usec 234567
# nr_periods 1234
# nr_throttled 56
# throttled_usec 78901

# Check which CPUs are being used
cat /sys/fs/cgroup/limited_cpu/cpuset.cpus.effective

Scheduler Policies and Priorities

Linux supports multiple scheduling policies. Default is CFS (Completely Fair Scheduler), but real-time policies provide deterministic behavior.

Scheduling Policies

Policy Description Priority Range Use Case
SCHED_OTHER Default CFS Nice -20 to 19 General purpose
SCHED_FIFO Real-time FIFO 1-99 (higher = more priority) Hard real-time
SCHED_RR Real-time round-robin 1-99 Real-time with time slicing
SCHED_DEADLINE Deadline scheduling N/A Periodic tasks
SCHED_BATCH Batch processing Nice -20 to 19 CPU-intensive background
SCHED_IDLE Lowest priority N/A Very low priority tasks

View Current Policy

1
2
3
4
5
6
7
# Check policy of current shell
chrt -p $$
# Output: pid 1234's current scheduling policy: SCHED_OTHER
#         pid 1234's current scheduling priority: 0

# Check policy of process
chrt -p $(pidof my_app)

Set Real-Time Policy (SCHED_FIFO)

# Launch with SCHED_FIFO priority 50
sudo chrt -f 50 ./my_realtime_app

# Change existing process to SCHED_FIFO
sudo chrt -f -p 50 $(pidof my_app)

# Verify
chrt -p $(pidof my_app)
# Output: pid 5678's current scheduling policy: SCHED_FIFO
#         pid 5678's current scheduling priority: 50

Round-Robin Real-Time (SCHED_RR)

1
2
3
4
5
# SCHED_RR allows time-slicing among same-priority tasks
sudo chrt -r 50 ./my_app

# Set time slice (quantum) - requires RT kernel patch
# Default is usually 100ms

Deadline Scheduling (SCHED_DEADLINE)

For periodic tasks with strict timing requirements:

1
2
3
4
5
6
# Run task every 10ms with 5ms runtime budget
# Format: -D --sched-runtime <ns> --sched-deadline <ns> --sched-period <ns>
sudo chrt -d --sched-runtime 5000000 --sched-deadline 10000000 --sched-period 10000000 ./periodic_task

# Example: 1kHz control loop (1ms period, 0.5ms runtime)
sudo chrt -d --sched-runtime 500000 --sched-deadline 1000000 --sched-period 1000000 ./control_loop

Nice Values (SCHED_OTHER)

1
2
3
4
5
6
7
8
9
# Lower nice = higher priority (-20 to 19)
nice -n -10 ./important_task
nice -n 10 ./background_task

# Change existing process
sudo renice -n -15 -p $(pidof important_app)

# Run at lowest priority
nice -n 19 ./batch_job

RT Throttling (Safety Mechanism)

Real-time processes can starve the system. Linux has RT throttling:

# Check RT throttling settings
cat /proc/sys/kernel/sched_rt_period_us
# Output: 1000000 (1 second)

cat /proc/sys/kernel/sched_rt_runtime_us
# Output: 950000 (0.95 seconds)

# This means RT tasks can use max 95% CPU time
# Remaining 5% reserved for non-RT (safety)

# Disable throttling (dangerous!)
echo -1 | sudo tee /proc/sys/kernel/sched_rt_runtime_us

# Enable full RT usage (still risky)
echo 1000000 | sudo tee /proc/sys/kernel/sched_rt_runtime_us

Practical Use Cases

Case 1: Real-Time Audio Processing

Isolate CPU3 for audio, prevent jitter:

# 1. Isolate CPU3 in /boot/firmware/cmdline.txt
# isolcpus=3

# 2. Route audio IRQs to CPU0
#!/bin/bash
for irq in $(grep -i audio /proc/interrupts | cut -d: -f1); do
    echo 0 | sudo tee /proc/irq/$irq/smp_affinity_list
done

# 3. Run audio app on CPU3 with SCHED_FIFO
sudo chrt -f 80 taskset -c 3 jackd -d alsa -r 48000 -p 128

# 4. Monitor latency
cat /proc/interrupts | grep -i audio

Case 2: Network Router/Firewall

Dedicate CPU0-1 to network, CPU2-3 to applications:

# /boot/firmware/cmdline.txt
isolcpus=2,3

# IRQ setup script
#!/bin/bash
# Pin all network IRQs to CPU0-1
for irq in $(grep -E 'eth0|wlan0' /proc/interrupts | cut -d: -f1); do
    echo 0,1 | sudo tee /proc/irq/$irq/smp_affinity_list
done

# iptables/nftables on CPU0-1
taskset -c 0,1 systemctl restart nftables

# Application servers on CPU2-3
taskset -c 2,3 nginx
taskset -c 2,3 php-fpm

Case 3: Machine Learning Inference

Separate data acquisition from inference:

# CPU0-1: Camera capture and preprocessing
# CPU2-3: TensorFlow Lite inference

# Capture process
taskset -c 0,1 nice -n -5 python3 camera_capture.py &

# Inference on isolated cores
sudo chrt -f 50 taskset -c 2,3 python3 inference.py &

# Monitor
watch -n 1 'mpstat -P ALL 1 1'

Case 4: High-Performance Computing

Balance I/O and computation:

# Create cgroups
sudo mkdir -p /sys/fs/cgroup/compute
sudo mkdir -p /sys/fs/cgroup/io

# Compute: CPU2-3, 200% quota, high priority
echo "2-3" | sudo tee /sys/fs/cgroup/compute/cpuset.cpus
echo "200000 100000" | sudo tee /sys/fs/cgroup/compute/cpu.max
echo "200" | sudo tee /sys/fs/cgroup/compute/cpu.weight

# I/O: CPU0-1, 150% quota, normal priority
echo "0-1" | sudo tee /sys/fs/cgroup/io/cpuset.cpus
echo "150000 100000" | sudo tee /sys/fs/cgroup/io/cpu.max
echo "100" | sudo tee /sys/fs/cgroup/io/cpu.weight

# Run compute tasks
echo $PID_COMPUTE | sudo tee /sys/fs/cgroup/compute/cgroup.procs

# Run I/O tasks
echo $PID_IO | sudo tee /sys/fs/cgroup/io/cgroup.procs

Case 5: Kubernetes Node Optimization

Reserve CPU for system pods:

# kubelet configuration
# /var/lib/kubelet/config.yaml
systemReserved:
  cpu: "500m"
  memory: "512Mi"
kubeReserved:
  cpu: "500m"
  memory: "512Mi"

# Pin kubelet to CPU0
# /etc/systemd/system/kubelet.service.d/10-cpu-affinity.conf
[Service]
CPUAffinity=0

# Application pods use CPU1-3
# In pod spec:
spec:
  containers:
  - name: app
    resources:
      requests:
        cpu: "1000m"
      limits:
        cpu: "2000m"

Monitoring and Verification

Real-Time CPU Usage

# Per-CPU usage
mpstat -P ALL 1

# With task affinity info
htop
# Press F2 → Display options → Enable "PROCESSOR"

# Detailed per-process CPU usage
pidstat -u -p ALL 1

# Show CPU affinity in ps
ps -eo pid,comm,psr,args
# PSR column shows current CPU

Measure Context Switches

1
2
3
4
5
6
7
8
9
# Context switches per second
vmstat 1
# Look at "cs" column

# Per-process context switches
pidstat -w 1

# Check voluntary vs involuntary
cat /proc/$PID/status | grep ctxt

Latency Testing

# Install rt-tests
sudo apt install rt-tests

# Cyclictest - measure real-time latencies
sudo cyclictest -p 80 -t 4 -n -i 1000 -l 100000
# -p 80: priority 80
# -t 4: 4 threads (one per CPU)
# -i 1000: 1000µs interval
# -l 100000: 100,000 iterations

# Pin to specific CPUs
sudo taskset -c 2,3 cyclictest -p 80 -t 2 -n -i 1000 -l 100000

# Output shows min/avg/max latencies per CPU

System-Wide Profiling

# Install perf
sudo apt install linux-perf

# Profile all CPUs for 10 seconds
sudo perf stat -a -e cycles,instructions,cache-misses sleep 10

# Profile specific CPU
sudo perf stat -C 2 -e cycles,instructions sleep 10

# Profile process with affinity info
sudo perf record -a -g -- taskset -c 2 ./my_app
sudo perf report

Troubleshooting

Issue: Isolated CPUs Still Show Activity

# Check for kernel threads on isolated CPUs
ps -eLo psr,comm | grep -E '^\s*[23]'

# Move kernel threads away from isolated CPUs
for pid in $(pgrep -f 'ksoftirqd|kworker'); do
    sudo taskset -p -c 0,1 $pid 2>/dev/null || true
done

# Disable specific kernel threads
echo 0 | sudo tee /sys/devices/virtual/workqueue/*/cpumask

Issue: Real-Time Task Causes System Freeze

# RT task consumed all CPU, system unresponsive
# Solution: Ensure RT throttling is enabled

# Check current settings
cat /proc/sys/kernel/sched_rt_runtime_us

# If -1 (disabled), re-enable with 95% limit
echo 950000 | sudo tee /proc/sys/kernel/sched_rt_runtime_us

# Alternative: Use SCHED_RR instead of SCHED_FIFO
sudo chrt -r 50 ./my_app

Issue: Cgroup CPU Quota Not Enforced

1
2
3
4
5
6
7
8
9
# Verify CPU controller is enabled
cat /sys/fs/cgroup/cgroup.controllers
# Should include "cpu"

# Enable CPU controller in subtree
echo "+cpu" | sudo tee /sys/fs/cgroup/cgroup.subtree_control

# Check if process is in correct cgroup
cat /proc/$PID/cgroup

Issue: IRQ Affinity Resets After Reboot

# irqbalance might be re-enabling
sudo systemctl disable irqbalance

# Create persistent script
sudo tee /etc/rc.local <<'EOF'
#!/bin/bash
for irq in /proc/irq/*/smp_affinity_list; do
    echo "0,1" > "$irq" 2>/dev/null || true
done
exit 0
EOF

sudo chmod +x /etc/rc.local

Performance Benchmarks

Before Optimization (Default)

1
2
3
4
5
6
7
8
# All CPUs running general workload
stress-ng --cpu 4 --timeout 60s &
hackbench -g 10 -l 1000

# Typical results:
# Hackbench: 5.2 seconds
# Context switches: 15,000/sec
# Cyclictest max latency: 250µs

After CPU Isolation

1
2
3
4
5
6
7
8
9
# CPU2-3 isolated, dedicated to real-time task
# CPU0-1 handle general workload

stress-ng --cpu 2 --timeout 60s &  # On CPU0-1
sudo chrt -f 80 taskset -c 2,3 cyclictest -t 2 -n -i 1000 -l 100000

# Improved results:
# Cyclictest max latency: 45µs (82% reduction)
# Jitter: <10µs (vs >100µs before)

Network Throughput Comparison

# Before: IRQs spread across all CPUs
iperf3 -c server -t 60
# ~800 Mbps, high CPU usage on all cores

# After: IRQs on CPU0-1, iperf on CPU2-3
for irq in $(grep eth0 /proc/interrupts | cut -d: -f1); do
    echo 0,1 | sudo tee /proc/irq/$irq/smp_affinity_list
done
taskset -c 2,3 iperf3 -c server -t 60
# ~940 Mbps, CPU2-3 at 100%, CPU0-1 at 30%

Best Practices

1. Start Conservative

1
2
3
4
# Don't isolate too many cores
# Leave at least 2 cores for system (especially on Pi 4)
# Good: isolcpus=2,3 (on 4-core system)
# Bad: isolcpus=1,2,3 (only 1 core for OS)

2. Combine Techniques

1
2
3
4
5
6
# Layer optimizations for best results:
# 1. CPU isolation (isolcpus)
# 2. IRQ affinity
# 3. Task affinity (taskset)
# 4. Scheduler policy (chrt)
# 5. Cgroup limits (optional)

3. Monitor First, Optimize Later

1
2
3
4
5
6
7
# Establish baseline with monitoring tools
mpstat -P ALL 1 &
pidstat -u 1 &
vmstat 1 &

# Run workload and identify bottlenecks
# Then apply targeted optimizations

4. Test Thoroughly

1
2
3
4
5
6
7
8
# Stress test your configuration
stress-ng --cpu 4 --io 4 --vm 2 --vm-bytes 512M --timeout 300s

# Check for system stability
dmesg | grep -i 'hung\|rcu\|stall'

# Measure latency under load
sudo cyclictest -p 80 -t 4 -n -i 1000 -l 1000000 -m

5. Document Your Setup

# Create README with your configuration
cat > /root/cpu_config.txt <<EOF
CPU Configuration for this system:
- Isolated CPUs: 2,3
- IRQ affinity: CPU 0,1
- Real-time apps: CPU 2,3 (SCHED_FIFO priority 70-80)
- General apps: CPU 0,1
- RT throttling: Enabled (95%)

Services:
- nginx: CPU 2,3
- database: CPU 0,1
- monitoring: CPU 0

Restore with:
- /usr/local/bin/irq_setup.sh
- systemctl start cpu-optimization.service
EOF

Complete Setup Script

#!/bin/bash
# complete_cpu_optimization.sh
# Complete CPU optimization for Raspberry Pi 4/5

set -e

echo "=== CPU Optimization Setup ==="

# 1. Check current configuration
echo "Current CPUs: $(nproc)"
echo "Isolated CPUs: $(cat /sys/devices/system/cpu/isolated 2>/dev/null || echo 'none')"

# 2. Disable IRQ balancing
echo "Stopping irqbalance..."
systemctl stop irqbalance 2>/dev/null || true
systemctl disable irqbalance 2>/dev/null || true

# 3. Set IRQ affinity (all IRQs to CPU0-1)
echo "Setting IRQ affinity to CPU0-1..."
for irq in /proc/irq/*/smp_affinity_list; do
    echo "0,1" > "$irq" 2>/dev/null || true
done

# 4. Create systemd service for persistent IRQ affinity
cat > /etc/systemd/system/cpu-optimization.service <<'EOF'
[Unit]
Description=CPU Optimization (IRQ affinity)
After=network.target

[Service]
Type=oneshot
ExecStart=/bin/bash -c 'for irq in /proc/irq/*/smp_affinity_list; do echo "0,1" > "$irq" 2>/dev/null || true; done'
RemainAfterExit=yes

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
systemctl enable cpu-optimization
systemctl start cpu-optimization

# 5. Verify
echo ""
echo "=== Verification ==="
echo "IRQ affinity sample:"
head -n 5 /proc/irq/*/smp_affinity_list

echo ""
echo "=== Setup complete! ==="
echo "To enable CPU isolation, add to /boot/firmware/cmdline.txt:"
echo "  isolcpus=2,3"
echo "Then reboot."

Summary

This guide covered comprehensive CPU management techniques:

✅ CPU Isolation

  • Reserve cores exclusively for critical tasks
  • Reduce scheduler overhead and context switches
  • Set via isolcpus boot parameter

✅ Task Affinity

  • Pin processes to specific CPUs with taskset
  • Improve cache locality and reduce migrations
  • Apply to running processes or at launch

✅ IRQ Affinity

  • Route hardware interrupts to designated cores
  • Reduce jitter on isolated CPUs
  • Disable irqbalance for manual control

✅ Cgroups v2

  • Fine-grained CPU quotas and shares
  • Restrict processes to specific cores
  • Integrated with systemd services

✅ Scheduler Policies

  • Real-time scheduling (SCHED_FIFO, SCHED_RR)
  • Deadline scheduling for periodic tasks
  • Priority tuning with chrt and nice values

✅ Practical Applications

  • Real-time audio: <50µs latency
  • Network appliances: Dedicated interrupt handling
  • Edge AI: Separate inference from I/O
  • HPC: Balanced compute and I/O workloads

Next Steps

Advanced Topics to Explore:

  1. PREEMPT_RT Kernel: Full real-time Linux for <10µs latencies
  2. NUMA Optimization: On multi-socket systems (not applicable to Pi, but good to know)
  3. CPU Frequency Scaling: Governor tuning for performance vs power
  4. Memory Affinity: Combine with numactl on larger systems
  5. Container Optimization: Apply these techniques to Docker/Kubernetes

Related Raspberry Pi Guides:

With proper CPU management, you can achieve near-deterministic performance on Raspberry Pi, making it viable for production real-time systems, edge computing, and high-performance network appliances.