CPU Isolation and Task Affinity for Multicore Optimization
Introduction
Raspberry Pi 4 and 5 come with quad-core ARM processors, and by default the Linux kernel distributes processes across all cores dynamically. This works well for general-purpose computing, but certain workloads benefit from dedicating specific CPU cores to specific tasks: fewer context switches, fewer cache misses, and more predictable performance.
This guide covers advanced CPU management techniques:
- CPU Isolation: Reserving cores exclusively for specific processes
- Task Affinity: Pinning processes to specific cores
- IRQ Affinity: Routing hardware interrupts to designated cores
- Cgroups v2: Resource control and CPU quotas
- Scheduler Policies: Real-time scheduling (SCHED_FIFO, SCHED_RR, SCHED_DEADLINE)
These techniques are valuable for:
- Real-time applications: Audio processing, robotics control, video streaming
- Network appliances: Routers, firewalls, VPN gateways
- High-performance computing: Scientific simulations, data processing
- Edge AI: Separating inference from data collection
- Low-latency systems: Trading bots, sensor fusion, motor control
Understanding CPU Architecture
Raspberry Pi CPU Topology
Check your CPU configuration:
| # View CPU information
lscpu
# Output for Raspberry Pi 5:
# Architecture: aarch64
# CPU(s): 4
# On-line CPU(s) list: 0-3
# Model name: ARM Cortex-A76
# Thread(s) per core: 1
# Core(s) per socket: 4
# Socket(s): 1
|
Key Points:
- Raspberry Pi 4: 4× Cortex-A72 cores (1.5 GHz, can overclock to 2.0 GHz)
- Raspberry Pi 5: 4× Cortex-A76 cores (2.4 GHz)
- No SMT/Hyper-Threading: Each CPU number = one physical core
- NUMA: Single memory domain (uniform memory access)
CPU Numbering
| # List all CPUs and their current frequency
cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_cur_freq
# Check CPU topology
ls /sys/devices/system/cpu/cpu0/topology/
# core_id core_siblings core_siblings_list physical_package_id thread_siblings
|
CPUs are numbered 0-3:
- CPU0: Often handles most system interrupts by default
- CPU1-3: Available for workload distribution
CPU Isolation with isolcpus
CPU isolation prevents the kernel scheduler from automatically placing processes on designated cores. Isolated cores stay idle, apart from per-CPU kernel threads and interrupts, unless you explicitly assign tasks to them.
Basic Isolation
Edit /boot/firmware/cmdline.txt (or /boot/cmdline.txt on older systems):
| # Before (one long line):
console=serial0,115200 console=tty1 root=PARTUUID=... rootfstype=ext4 rootwait
# After - isolate CPU2 and CPU3:
console=serial0,115200 console=tty1 root=PARTUUID=... rootfstype=ext4 rootwait isolcpus=2,3
|
Important: This is a single line. Add isolcpus=2,3 at the end.
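Because the file must stay a single line, it can help to build the new line in a variable before writing anything. The following is a minimal sketch of that idea: `with_isolcpus` is a hypothetical helper name, and the sample command line is made up for illustration. It drops any existing `isolcpus=` entry and appends the new one, so running it twice does not duplicate the parameter.

```shell
# Rebuild a kernel command line with exactly one isolcpus= entry.
# with_isolcpus is a hypothetical helper, not a standard tool.
with_isolcpus() {
    local cmdline="$1" cpus="$2" word out=""
    # Unquoted on purpose: split the command line into words
    for word in $cmdline; do
        case "$word" in
            isolcpus=*) ;;                     # drop the old setting
            *) out="${out:+$out }$word" ;;     # keep everything else
        esac
    done
    printf '%s isolcpus=%s\n' "$out" "$cpus"
}

# Sample input (illustrative values only)
with_isolcpus "console=tty1 rootfstype=ext4 rootwait isolcpus=3" "2,3"
# prints: console=tty1 rootfstype=ext4 rootwait isolcpus=2,3
```

To apply it, you would pass the contents of your own cmdline.txt as the first argument, review the result, and write it back only after making a backup copy of the file.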
Reboot and verify:
| sudo reboot
# After reboot, check isolated CPUs
cat /sys/devices/system/cpu/isolated
# Output: 2-3
# Check which CPUs are present (isolation does not take them offline)
cat /sys/devices/system/cpu/present
# Output: 0-3 (all present)
# Only CPU0 and CPU1 will show activity from general processes
top
# Press '1' to see per-CPU usage
|
Advanced Isolation Options
| # Isolate with domain isolation (prevents load balancing)
isolcpus=domain,2,3
# Isolate with managed IRQ exclusion
isolcpus=domain,managed_irq,2,3
# Multiple ranges
isolcpus=1-3 # Isolate CPU1, CPU2, CPU3; only CPU0 for general use
|
Isolation Modes:
- domain: Exclude from SMP load balancing domains (tasks pinned to several isolated CPUs are not balanced among them)
- managed_irq: Don't route managed interrupts to isolated CPUs
- Without flags: Same as domain, which is the default (processes can still be manually assigned)
Testing Isolation
| # Run stress test on non-isolated CPUs
stress-ng --cpu 2 --timeout 60s &
# Monitor CPU usage
htop
# CPU2 and CPU3 should remain at or near 0% usage
|
Task Affinity with taskset
Task affinity pins processes to specific CPU cores, preventing migration.
CPUs can be specified as:
- List: 0,2 (CPU0 and CPU2)
- Range: 0-3 (CPU0 through CPU3)
- Hex mask: 0xF (binary 1111 = all 4 CPUs)
| # CPU mask calculation:
# CPU0 = bit 0 = 0x1
# CPU1 = bit 1 = 0x2
# CPU2 = bit 2 = 0x4
# CPU3 = bit 3 = 0x8
# CPU0+CPU1 = 0x3
# CPU2+CPU3 = 0xC
# All CPUs = 0xF
|
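The mask arithmetic above can be sketched as a small shell function. `cpus_to_mask` is a hypothetical helper name (it is not part of util-linux); it accepts the same list and range syntax as `taskset -c` and prints the equivalent hex mask.

```shell
# Convert a CPU list such as "0,2-3" into a hex affinity mask.
# cpus_to_mask is a hypothetical helper, not a util-linux command.
cpus_to_mask() {
    local mask=0 part lo hi cpu old_ifs="$IFS"
    IFS=','
    for part in $1; do
        lo="${part%-*}"    # text before the dash (or the whole item)
        hi="${part#*-}"    # text after the dash (or the whole item)
        cpu="$lo"
        while [ "$cpu" -le "$hi" ]; do
            mask=$((mask | (1 << cpu)))   # set the bit for this CPU
            cpu=$((cpu + 1))
        done
    done
    IFS="$old_ifs"
    printf '%x\n' "$mask"
}

cpus_to_mask 0,1    # prints 3
cpus_to_mask 2-3    # prints c
cpus_to_mask 0-3    # prints f
```

In practice the list form (`taskset -c 2,3`) is usually easier to read; a helper like this is mainly useful for files that only accept masks, such as smp_affinity.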
Pin Existing Process
| # Find process PID
pidof firefox
# Output: 1234
# Check current affinity
taskset -p 1234
# Output: pid 1234's current affinity mask: f (all CPUs)
# Pin to CPU2 only
sudo taskset -p -c 2 1234
# Output: pid 1234's current affinity list: 0-3
# pid 1234's new affinity list: 2
# Pin to CPU2 and CPU3
sudo taskset -p -c 2,3 1234
# Verify
taskset -p 1234
# Output: c (binary 1100 = CPU2 and CPU3)
|
Launch Process with Affinity
| # Run on CPU3 only
taskset -c 3 ./my_program
# Run on CPU2 and CPU3 (a negative nice value requires root)
sudo taskset -c 2,3 nice -n -10 ./high_priority_task
# Run CPU-intensive task on isolated CPU
taskset -c 2 stress-ng --cpu 1 --timeout 60s
# Using hex mask
taskset 0x4 ./program # CPU2 only (binary 0100)
|
Affinity for All Threads
| # Threads created by a pinned process inherit its affinity
taskset -c 2,3 ./multithreaded_app
# Pin every thread of a running process (-a only applies together with -p)
sudo taskset -a -p -c 2,3 <PID>
|
Practical Example: Network Server
| #!/bin/bash
# Pin network server to isolated CPUs
# Isolate CPU2 and CPU3 (set in /boot/firmware/cmdline.txt)
# Then run server:
# Start nginx on CPU2
taskset -c 2 nginx
# Start redis on CPU3
taskset -c 3 redis-server /etc/redis/redis.conf
# Database on CPU0-1 (non-isolated, for balanced I/O)
taskset -c 0,1 mongod --config /etc/mongod.conf
|
IRQ Affinity
Hardware interrupts (IRQs) can cause jitter on isolated CPUs. Route interrupts to specific cores.
View Current IRQ Assignment
| # List all IRQs and their CPU affinity
cat /proc/interrupts
# Example output:
# CPU0 CPU1 CPU2 CPU3
# 30: 12345 6789 4567 2345 GIC-0 27 Level arch_timer
# 34: 1234 0 0 0 GIC-0 65 Level fe00b880.mailbox
# 39: 56789 23456 12345 6789 GIC-0 189 Level mmc1
# Check affinity of specific IRQ (e.g., IRQ 39)
cat /proc/irq/39/smp_affinity
# Output: f (hex) = 1111 (binary) = all CPUs
cat /proc/irq/39/smp_affinity_list
# Output: 0-3
|
Set IRQ Affinity
| # Pin IRQ 39 to CPU0 only
echo 1 | sudo tee /proc/irq/39/smp_affinity
# 1 = binary 0001 = CPU0
# Pin to CPU0 and CPU1
echo 3 | sudo tee /proc/irq/39/smp_affinity
# 3 = binary 0011 = CPU0,CPU1
# Using list format (easier)
echo 0,1 | sudo tee /proc/irq/39/smp_affinity_list
# Pin Ethernet interrupts to CPU0
for irq in $(grep eth0 /proc/interrupts | cut -d: -f1); do
echo 0 | sudo tee /proc/irq/$irq/smp_affinity_list
done
|
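Before writing to /proc/irq, it can be useful to see which IRQ numbers a device name actually matches. The sketch below assumes the /proc/interrupts layout shown earlier; `irqs_for_device` is a hypothetical helper, and the optional second argument lets it read a saved copy instead of the live file. It only prints numbers and changes nothing.

```shell
# Print the IRQ numbers whose /proc/interrupts line mentions a device name.
# irqs_for_device is a hypothetical helper; it does not modify any affinity.
irqs_for_device() {
    local name="$1" file="${2:-/proc/interrupts}"
    awk -v name="$name" '
        # Only numbered IRQ lines ("39:"), not summary rows such as "IPI0:"
        $1 ~ /^[0-9]+:$/ && index($0, name) {
            sub(/:$/, "", $1)
            print $1
        }' "$file"
}

# Demonstration on a saved sample (values copied from the example output above)
sample=$(mktemp)
cat > "$sample" <<'EOF'
           CPU0       CPU1       CPU2       CPU3
 34:       1234          0          0          0     GIC-0  65 Level     fe00b880.mailbox
 39:      56789      23456      12345       6789     GIC-0 189 Level     mmc1
EOF
irqs_for_device mmc1 "$sample"    # prints 39
rm -f "$sample"
```

Called with only a device name, the function reads the live /proc/interrupts, and its output can feed the same `for irq in ...` loop used above.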
Disable IRQ Balancing
The irqbalance daemon automatically distributes IRQs. Disable it for manual control:
| # Stop and disable irqbalance
sudo systemctl stop irqbalance
sudo systemctl disable irqbalance
# Verify it's stopped
systemctl status irqbalance
|
Automated IRQ Setup Script
| #!/bin/bash
# irq_setup.sh - Pin all IRQs to CPU0 and CPU1
# Disable IRQ balancing
systemctl stop irqbalance
# Route all IRQs to CPU0 and CPU1
for irq in /proc/irq/*/smp_affinity_list; do
echo "0,1" > "$irq" 2>/dev/null || true
done
# Verify
echo "IRQ affinity set:"
grep -H . /proc/irq/*/smp_affinity_list 2>/dev/null | head -n 10
|
Make it persistent:
| # Create systemd service
sudo tee /etc/systemd/system/irq-affinity.service <<EOF
[Unit]
Description=Set IRQ Affinity
After=network.target
[Service]
Type=oneshot
ExecStart=/usr/local/bin/irq_setup.sh
RemainAfterExit=yes
[Install]
WantedBy=multi-user.target
EOF
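# Install the script from the previous step (assumes irq_setup.sh is in the current directory)
sudo install -m 755 irq_setup.sh /usr/local/bin/irq_setup.sh
sudo systemctl daemon-reload
sudo systemctl enable irq-affinity
sudo systemctl start irq-affinity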
|
Cgroups v2 CPU Management
Cgroups provide fine-grained resource control. Raspberry Pi OS uses cgroups v2 by default (kernel 5.10+).
Verify Cgroups v2
| # Check cgroup version
mount | grep cgroup
# Should see: cgroup2 on /sys/fs/cgroup type cgroup2
# List the controllers available at the root of the hierarchy
cat /sys/fs/cgroup/cgroup.controllers
# Should include: cpuset cpu (other controllers vary by system)
# Enable cpu and cpuset for child cgroups if they are not already enabled
echo "+cpu +cpuset" | sudo tee /sys/fs/cgroup/cgroup.subtree_control
|
Create CPU-Limited Cgroup
| # Create cgroup for limited tasks
sudo mkdir -p /sys/fs/cgroup/limited_cpu
# Set CPU quota: 50% of one core
# cpu.max format: $MAX $PERIOD (microseconds)
# 50000 100000 = 50% of one core
echo "50000 100000" | sudo tee /sys/fs/cgroup/limited_cpu/cpu.max
# Restrict to CPU0 and CPU1 only
echo "0-1" | sudo tee /sys/fs/cgroup/limited_cpu/cpuset.cpus
# Add process to cgroup
echo $PID | sudo tee /sys/fs/cgroup/limited_cpu/cgroup.procs
# Verify
cat /sys/fs/cgroup/limited_cpu/cpu.max
# Output: 50000 100000
cat /sys/fs/cgroup/limited_cpu/cpuset.cpus
# Output: 0-1
|
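The cpu.max arithmetic can be sketched as a one-line calculation. `cpu_max_for_percent` is a hypothetical helper name; it takes a percentage of one core and an optional period in microseconds (defaulting to the 100000 used above) and prints the string to write into cpu.max.

```shell
# Build a cpu.max value from a percentage of one core.
# cpu_max_for_percent is a hypothetical helper; the period is in microseconds.
cpu_max_for_percent() {
    local percent="$1" period="${2:-100000}"
    echo "$((percent * period / 100)) $period"
}

cpu_max_for_percent 50     # prints: 50000 100000  (half a core)
cpu_max_for_percent 150    # prints: 150000 100000 (one and a half cores)
```

Values above 100 are meaningful only when the cgroup is allowed to run on more than one CPU.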
CPU Weight (Proportional Share)
| # Create two cgroups with different CPU weights
sudo mkdir -p /sys/fs/cgroup/high_priority
sudo mkdir -p /sys/fs/cgroup/low_priority
# High priority gets 2× CPU time of low priority
echo "200" | sudo tee /sys/fs/cgroup/high_priority/cpu.weight
echo "100" | sudo tee /sys/fs/cgroup/low_priority/cpu.weight
# Assign processes
echo $PID1 | sudo tee /sys/fs/cgroup/high_priority/cgroup.procs
echo $PID2 | sudo tee /sys/fs/cgroup/low_priority/cgroup.procs
|
Systemd Integration
Systemd services can use cgroup settings directly:
| # Create service with CPU affinity
sudo tee /etc/systemd/system/isolated_task.service <<EOF
[Unit]
Description=Isolated CPU Task
After=network.target
[Service]
Type=simple
ExecStart=/usr/local/bin/my_realtime_app
Restart=always
# CPU affinity
CPUAffinity=2 3
# CPU quota: 150% (1.5 cores)
CPUQuota=150%
# Raised priority under the default scheduler (Nice is not a real-time setting)
Nice=-10
[Install]
WantedBy=multi-user.target
EOF
sudo systemctl daemon-reload
sudo systemctl start isolated_task
sudo systemctl status isolated_task
|
Monitoring Cgroup Usage
| # View CPU stats for cgroup
cat /sys/fs/cgroup/limited_cpu/cpu.stat
# Output:
# usage_usec 1234567
# user_usec 1000000
# system_usec 234567
# nr_periods 1234
# nr_throttled 56
# throttled_usec 78901
# Check which CPUs are being used
cat /sys/fs/cgroup/limited_cpu/cpuset.cpus.effective
|
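One way to read cpu.stat is as a throttling ratio: the share of enforcement periods in which the cgroup hit its quota. The sketch below is an illustration under that interpretation; `throttle_percent` is a hypothetical helper, and the sample file reuses the numbers from the output above.

```shell
# Report the percentage of periods in which a cgroup was throttled.
# throttle_percent is a hypothetical helper; pass it the path to a cpu.stat file.
throttle_percent() {
    awk '
        $1 == "nr_periods"   { periods = $2 }
        $1 == "nr_throttled" { throttled = $2 }
        END {
            if (periods > 0) printf "%.1f\n", 100 * throttled / periods
            else print "0.0"
        }' "$1"
}

# Demonstration with the sample values shown above
sample=$(mktemp)
printf 'usage_usec 1234567\nnr_periods 1234\nnr_throttled 56\nthrottled_usec 78901\n' > "$sample"
throttle_percent "$sample"    # prints 4.5
rm -f "$sample"
```

A persistently high ratio suggests the quota in cpu.max is too tight for the workload.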
Scheduler Policies and Priorities
Linux supports multiple scheduling policies. The default is the fair scheduler (CFS, replaced by EEVDF in kernel 6.6), but real-time policies provide more deterministic behavior.
Scheduling Policies
| Policy | Description | Priority Range | Use Case |
| SCHED_OTHER | Default CFS | Nice -20 to 19 | General purpose |
| SCHED_FIFO | Real-time FIFO | 1-99 (higher = more priority) | Hard real-time |
| SCHED_RR | Real-time round-robin | 1-99 | Real-time with time slicing |
| SCHED_DEADLINE | Deadline scheduling | N/A | Periodic tasks |
| SCHED_BATCH | Batch processing | Nice -20 to 19 | CPU-intensive background |
| SCHED_IDLE | Lowest priority | N/A | Very low priority tasks |
View Current Policy
| # Check policy of current shell
chrt -p $$
# Output: pid 1234's current scheduling policy: SCHED_OTHER
# pid 1234's current scheduling priority: 0
# Check policy of process
chrt -p $(pidof my_app)
|
Set Real-Time Policy (SCHED_FIFO)
| # Launch with SCHED_FIFO priority 50
sudo chrt -f 50 ./my_realtime_app
# Change existing process to SCHED_FIFO
sudo chrt -f -p 50 $(pidof my_app)
# Verify
chrt -p $(pidof my_app)
# Output: pid 5678's current scheduling policy: SCHED_FIFO
# pid 5678's current scheduling priority: 50
|
Round-Robin Real-Time (SCHED_RR)
| # SCHED_RR allows time-slicing among same-priority tasks
sudo chrt -r 50 ./my_app
# The RR time slice is tunable at runtime, no RT patch needed (default 100 ms)
cat /proc/sys/kernel/sched_rr_timeslice_ms
|
Deadline Scheduling (SCHED_DEADLINE)
For periodic tasks with strict timing requirements:
| # Run task every 10ms with 5ms runtime budget
# Format: chrt -d --sched-runtime <ns> --sched-deadline <ns> --sched-period <ns> 0 <command>
# (the priority argument must be 0 for SCHED_DEADLINE)
sudo chrt -d --sched-runtime 5000000 --sched-deadline 10000000 --sched-period 10000000 0 ./periodic_task
# Example: 1kHz control loop (1ms period, 0.5ms runtime)
sudo chrt -d --sched-runtime 500000 --sched-deadline 1000000 --sched-period 1000000 0 ./control_loop
|
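The kernel rejects SCHED_DEADLINE parameters unless runtime <= deadline <= period, so it can save a failed launch to check the numbers first. The following is a rough sketch of that check; `dl_utilization` is a hypothetical helper that validates the ordering and prints the share of one CPU the task reserves, as an integer percentage.

```shell
# Validate SCHED_DEADLINE parameters (all in nanoseconds) and print the
# CPU share they reserve. dl_utilization is a hypothetical helper.
dl_utilization() {
    local runtime="$1" deadline="$2" period="$3"
    if [ "$runtime" -le 0 ] || [ "$runtime" -gt "$deadline" ] || [ "$deadline" -gt "$period" ]; then
        echo "invalid: need 0 < runtime <= deadline <= period" >&2
        return 1
    fi
    echo "$((100 * runtime / period))"
}

dl_utilization 5000000 10000000 10000000    # prints 50 (5 ms every 10 ms)
dl_utilization 500000 1000000 1000000       # prints 50 (0.5 ms every 1 ms)
```

Both examples above reserve half of one CPU; the sum of reservations across all deadline tasks also has to fit within what the kernel's admission control allows.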
Nice Values (SCHED_OTHER)
| # Lower nice = higher priority (-20 to 19); negative values require root
sudo nice -n -10 ./important_task
nice -n 10 ./background_task
# Change existing process
sudo renice -n -15 -p $(pidof important_app)
# Run at lowest priority
nice -n 19 ./batch_job
|
RT Throttling (Safety Mechanism)
Real-time processes can starve the system. Linux has RT throttling:
| # Check RT throttling settings
cat /proc/sys/kernel/sched_rt_period_us
# Output: 1000000 (1 second)
cat /proc/sys/kernel/sched_rt_runtime_us
# Output: 950000 (0.95 seconds)
# This means RT tasks can use max 95% CPU time
# Remaining 5% reserved for non-RT (safety)
# Disable throttling (dangerous!)
echo -1 | sudo tee /proc/sys/kernel/sched_rt_runtime_us
# Setting the runtime equal to the period also lets RT tasks use 100% (still risky)
echo 1000000 | sudo tee /proc/sys/kernel/sched_rt_runtime_us
|
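The relationship between the two sysctl values can be sketched as a small calculation. `rt_share_percent` is a hypothetical helper; it takes the runtime and period in microseconds and prints the percentage of CPU time available to real-time tasks, treating -1 as unlimited.

```shell
# Compute the CPU share available to RT tasks from the two sysctl values.
# rt_share_percent is a hypothetical helper; arguments are in microseconds.
rt_share_percent() {
    local runtime="$1" period="$2"
    if [ "$runtime" -lt 0 ]; then
        echo "unlimited"    # -1 means throttling is disabled
    else
        echo "$((100 * runtime / period))"
    fi
}

rt_share_percent 950000 1000000    # prints 95
rt_share_percent -1 1000000        # prints unlimited
```

On a live system the arguments would come from the two files under /proc/sys/kernel shown above.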
Practical Use Cases
Case 1: Real-Time Audio Processing
Isolate CPU3 for audio, prevent jitter:
| # 1. Isolate CPU3 in /boot/firmware/cmdline.txt
# isolcpus=3
# 2. Route audio IRQs to CPU0
#!/bin/bash
for irq in $(grep -i audio /proc/interrupts | cut -d: -f1); do
echo 0 | sudo tee /proc/irq/$irq/smp_affinity_list
done
# 3. Run audio app on CPU3 with SCHED_FIFO
sudo chrt -f 80 taskset -c 3 jackd -d alsa -r 48000 -p 128
# 4. Confirm audio interrupts are counted on CPU0 only (interrupt counts, not latency)
grep -i audio /proc/interrupts
|
Case 2: Network Router/Firewall
Dedicate CPU0-1 to network, CPU2-3 to applications:
| # /boot/firmware/cmdline.txt
isolcpus=2,3
# IRQ setup script
#!/bin/bash
# Pin all network IRQs to CPU0-1
for irq in $(grep -E 'eth0|wlan0' /proc/interrupts | cut -d: -f1); do
echo 0,1 | sudo tee /proc/irq/$irq/smp_affinity_list
done
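# nftables rules execute in the kernel on the CPU that services the NIC interrupt,
# so the IRQ pinning above is what keeps filtering on CPU0-1 (taskset has no effect here)
sudo systemctl restart nftables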
# Application servers on CPU2-3
taskset -c 2,3 nginx
taskset -c 2,3 php-fpm
|
Case 3: Machine Learning Inference
Separate data acquisition from inference:
| # CPU0-1: Camera capture and preprocessing
# CPU2-3: TensorFlow Lite inference
# Capture process
taskset -c 0,1 nice -n -5 python3 camera_capture.py &
# Inference on isolated cores
sudo chrt -f 50 taskset -c 2,3 python3 inference.py &
# Monitor
watch -n 1 'mpstat -P ALL 1 1'
|
Case 4: Balanced Compute and I/O
Balance I/O and computation:
| # Create cgroups
sudo mkdir -p /sys/fs/cgroup/compute
sudo mkdir -p /sys/fs/cgroup/io
# Compute: CPU2-3, 200% quota, high priority
echo "2-3" | sudo tee /sys/fs/cgroup/compute/cpuset.cpus
echo "200000 100000" | sudo tee /sys/fs/cgroup/compute/cpu.max
echo "200" | sudo tee /sys/fs/cgroup/compute/cpu.weight
# I/O: CPU0-1, 150% quota, normal priority
echo "0-1" | sudo tee /sys/fs/cgroup/io/cpuset.cpus
echo "150000 100000" | sudo tee /sys/fs/cgroup/io/cpu.max
echo "100" | sudo tee /sys/fs/cgroup/io/cpu.weight
# Run compute tasks
echo $PID_COMPUTE | sudo tee /sys/fs/cgroup/compute/cgroup.procs
# Run I/O tasks
echo $PID_IO | sudo tee /sys/fs/cgroup/io/cgroup.procs
|
Case 5: Kubernetes Node Optimization
Reserve CPU for system pods:
| # kubelet configuration
# /var/lib/kubelet/config.yaml
systemReserved:
cpu: "500m"
memory: "512Mi"
kubeReserved:
cpu: "500m"
memory: "512Mi"
# Pin kubelet to CPU0
# /etc/systemd/system/kubelet.service.d/10-cpu-affinity.conf
[Service]
CPUAffinity=0
# Application pods use CPU1-3
# In pod spec:
spec:
containers:
- name: app
resources:
requests:
cpu: "1000m"
limits:
cpu: "2000m"
|
Monitoring and Verification
Real-Time CPU Usage
| # Per-CPU usage
mpstat -P ALL 1
# With task affinity info
htop
# Press F2 → Columns → add "PROCESSOR" to the active columns
# Detailed per-process CPU usage
pidstat -u -p ALL 1
# Show the CPU each process last ran on
ps -eo pid,comm,psr,args
# PSR column shows that CPU; use taskset -p <PID> to see the allowed set
|
Measure Context Switches
| # Context switches per second
vmstat 1
# Look at "cs" column
# Per-process context switches
pidstat -w 1
# Check voluntary vs involuntary
cat /proc/$PID/status | grep ctxt
|
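The two counters in /proc/$PID/status can be pulled out with a short awk filter. This is a minimal sketch: `ctxt_switches` is a hypothetical helper, and the sample file below uses made-up numbers. A high involuntary count generally means the task is being preempted, which is what pinning and isolation aim to reduce.

```shell
# Print voluntary and involuntary context switch counts from a status file.
# ctxt_switches is a hypothetical helper; it defaults to the calling process.
ctxt_switches() {
    local file="${1:-/proc/self/status}"
    awk -F':[ \t]*' '
        $1 == "voluntary_ctxt_switches"    { v = $2 }
        $1 == "nonvoluntary_ctxt_switches" { n = $2 }
        END { printf "voluntary=%d involuntary=%d\n", v, n }' "$file"
}

# Demonstration with illustrative numbers
sample=$(mktemp)
printf 'Name:\tmy_app\nvoluntary_ctxt_switches:\t150\nnonvoluntary_ctxt_switches:\t12\n' > "$sample"
ctxt_switches "$sample"    # prints voluntary=150 involuntary=12
rm -f "$sample"
```

For a running process you would pass /proc/<PID>/status and compare the numbers before and after pinning.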
Latency Testing
| # Install rt-tests
sudo apt install rt-tests
# Cyclictest - measure real-time latencies
sudo cyclictest -p 80 -t 4 -a -n -i 1000 -l 100000
# -p 80: priority 80
# -t 4: 4 threads
# -a: pin thread N to CPU N (one thread per CPU)
# -i 1000: 1000µs interval
# -l 100000: 100,000 iterations
# Pin the measurement threads to specific CPUs
sudo cyclictest -p 80 -t 2 -a 2,3 -n -i 1000 -l 100000
# Output shows min/avg/max latencies per CPU
|
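When comparing runs, the figure that usually matters is the worst-case latency across all threads. The sketch below assumes cyclictest's per-thread summary lines start with `T:` and contain a `Max:` label followed by the value in microseconds; the two sample lines are illustrative, not measured. `worst_latency` is a hypothetical helper that reads such output on standard input.

```shell
# Print the largest "Max:" value found in cyclictest summary lines on stdin.
# worst_latency is a hypothetical helper; the line format is an assumption.
worst_latency() {
    awk '
        /^T:/ {
            for (i = 1; i < NF; i++)
                if ($i == "Max:" && $(i + 1) + 0 > worst)
                    worst = $(i + 1) + 0
        }
        END { print worst + 0 }'
}

# Illustrative input, not real measurements
printf '%s\n' \
    'T: 0 ( 1201) P:80 I:1000 C: 100000 Min:      4 Act:    6 Avg:    5 Max:      38' \
    'T: 1 ( 1202) P:80 I:1500 C:  66667 Min:      4 Act:    7 Avg:    6 Max:      45' |
    worst_latency    # prints 45
```

Saving the cyclictest output to a file and piping it through a filter like this makes before/after comparisons easier to script.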
System-Wide Profiling
| # Install perf
sudo apt install linux-perf
# Profile all CPUs for 10 seconds
sudo perf stat -a -e cycles,instructions,cache-misses sleep 10
# Profile specific CPU
sudo perf stat -C 2 -e cycles,instructions sleep 10
# Profile process with affinity info
sudo perf record -a -g -- taskset -c 2 ./my_app
sudo perf report
|
Troubleshooting
Issue: Isolated CPUs Still Show Activity
| # Check for kernel threads on isolated CPUs
ps -eLo psr,comm | grep -E '^\s*[23]\s'
# Move unbound kernel threads away from isolated CPUs
# (per-CPU threads such as ksoftirqd/2 cannot be moved; those attempts fail harmlessly)
for pid in $(pgrep 'ksoftirqd|kworker'); do
sudo taskset -p -c 0,1 $pid 2>/dev/null || true
done
# Restrict unbound workqueues to CPU0-1 (hex mask 3)
echo 3 | sudo tee /sys/devices/virtual/workqueue/cpumask
|
Issue: Real-Time Task Causes System Freeze
| # RT task consumed all CPU, system unresponsive
# Solution: Ensure RT throttling is enabled
# Check current settings
cat /proc/sys/kernel/sched_rt_runtime_us
# If -1 (disabled), re-enable with 95% limit
echo 950000 | sudo tee /proc/sys/kernel/sched_rt_runtime_us
# Note: SCHED_RR only time-slices among RT tasks of equal priority;
# it does not stop RT tasks from starving normal tasks, so keep throttling enabled
|
Issue: Cgroup CPU Quota Not Enforced
| # Verify CPU controller is enabled
cat /sys/fs/cgroup/cgroup.controllers
# Should include "cpu"
# Enable CPU controller in subtree
echo "+cpu" | sudo tee /sys/fs/cgroup/cgroup.subtree_control
# Check if process is in correct cgroup
cat /proc/$PID/cgroup
|
Issue: IRQ Affinity Resets After Reboot
| # irqbalance might be re-enabling
sudo systemctl disable irqbalance
# Create persistent script
sudo tee /etc/rc.local <<'EOF'
#!/bin/bash
for irq in /proc/irq/*/smp_affinity_list; do
echo "0,1" > "$irq" 2>/dev/null || true
done
exit 0
EOF
sudo chmod +x /etc/rc.local
|
Before Optimization (Default)
| # All CPUs running general workload
stress-ng --cpu 4 --timeout 60s &
hackbench -g 10 -l 1000
# Typical results:
# Hackbench: 5.2 seconds
# Context switches: 15,000/sec
# Cyclictest max latency: 250µs
|
After CPU Isolation
| # CPU2-3 isolated, dedicated to real-time task
# CPU0-1 handle general workload
stress-ng --cpu 2 --timeout 60s & # On CPU0-1
sudo cyclictest -p 80 -t 2 -a 2,3 -n -i 1000 -l 100000
# Improved results:
# Cyclictest max latency: 45µs (82% reduction)
# Jitter: <10µs (vs >100µs before)
|
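The percentage quoted above follows from the two maximum latencies. As a quick check of the arithmetic, with `percent_reduction` as a hypothetical helper using integer math:

```shell
# Percentage reduction from a "before" value to an "after" value.
# percent_reduction is a hypothetical helper; the result is truncated to an integer.
percent_reduction() {
    local before="$1" after="$2"
    echo "$(( (before - after) * 100 / before ))"
}

percent_reduction 250 45    # prints 82
```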
Network Throughput Comparison
| # Before: IRQs spread across all CPUs
iperf3 -c server -t 60
# ~800 Mbps, high CPU usage on all cores
# After: IRQs on CPU0-1, iperf on CPU2-3
for irq in $(grep eth0 /proc/interrupts | cut -d: -f1); do
echo 0,1 | sudo tee /proc/irq/$irq/smp_affinity_list
done
taskset -c 2,3 iperf3 -c server -t 60
# ~940 Mbps, CPU2-3 at 100%, CPU0-1 at 30%
|
Best Practices
1. Start Conservative
| # Don't isolate too many cores
# Leave at least 2 cores for system (especially on Pi 4)
# Good: isolcpus=2,3 (on 4-core system)
# Bad: isolcpus=1,2,3 (only 1 core for OS)
|
2. Combine Techniques
| # Layer optimizations for best results:
# 1. CPU isolation (isolcpus)
# 2. IRQ affinity
# 3. Task affinity (taskset)
# 4. Scheduler policy (chrt)
# 5. Cgroup limits (optional)
|
3. Monitor First, Optimize Later
| # Establish baseline with monitoring tools
mpstat -P ALL 1 &
pidstat -u 1 &
vmstat 1 &
# Run workload and identify bottlenecks
# Then apply targeted optimizations
|
4. Test Thoroughly
| # Stress test your configuration
stress-ng --cpu 4 --io 4 --vm 2 --vm-bytes 512M --timeout 300s
# Check for system stability
dmesg | grep -i 'hung\|rcu\|stall'
# Measure latency under load
sudo cyclictest -p 80 -t 4 -n -i 1000 -l 1000000 -m
|
5. Document Your Setup
| # Create README with your configuration
cat > /root/cpu_config.txt <<EOF
CPU Configuration for this system:
- Isolated CPUs: 2,3
- IRQ affinity: CPU 0,1
- Real-time apps: CPU 2,3 (SCHED_FIFO priority 70-80)
- General apps: CPU 0,1
- RT throttling: Enabled (95%)
Services:
- nginx: CPU 2,3
- database: CPU 0,1
- monitoring: CPU 0
Restore with:
- /usr/local/bin/irq_setup.sh
- systemctl start cpu-optimization.service
EOF
|
Complete Setup Script
| #!/bin/bash
# complete_cpu_optimization.sh
# Complete CPU optimization for Raspberry Pi 4/5
set -e
echo "=== CPU Optimization Setup ==="
# 1. Check current configuration
echo "Current CPUs: $(nproc)"
echo "Isolated CPUs: $(cat /sys/devices/system/cpu/isolated 2>/dev/null || echo 'none')"
# 2. Disable IRQ balancing
echo "Stopping irqbalance..."
systemctl stop irqbalance 2>/dev/null || true
systemctl disable irqbalance 2>/dev/null || true
# 3. Set IRQ affinity (all IRQs to CPU0-1)
echo "Setting IRQ affinity to CPU0-1..."
for irq in /proc/irq/*/smp_affinity_list; do
echo "0,1" > "$irq" 2>/dev/null || true
done
# 4. Create systemd service for persistent IRQ affinity
cat > /etc/systemd/system/cpu-optimization.service <<'EOF'
[Unit]
Description=CPU Optimization (IRQ affinity)
After=network.target
[Service]
Type=oneshot
ExecStart=/bin/bash -c 'for irq in /proc/irq/*/smp_affinity_list; do echo "0,1" > "$irq" 2>/dev/null || true; done'
RemainAfterExit=yes
[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
systemctl enable cpu-optimization
systemctl start cpu-optimization
# 5. Verify
echo ""
echo "=== Verification ==="
echo "IRQ affinity sample:"
grep -H . /proc/irq/*/smp_affinity_list 2>/dev/null | head -n 5
echo ""
echo "=== Setup complete! ==="
echo "To enable CPU isolation, add to /boot/firmware/cmdline.txt:"
echo " isolcpus=2,3"
echo "Then reboot."
|
Summary
This guide covered comprehensive CPU management techniques:
✅ CPU Isolation
- Reserve cores exclusively for critical tasks
- Reduce scheduler overhead and context switches
- Set via the isolcpus boot parameter
✅ Task Affinity
- Pin processes to specific CPUs with taskset
- Improve cache locality and reduce migrations
- Apply to running processes or at launch
✅ IRQ Affinity
- Route hardware interrupts to designated cores
- Reduce jitter on isolated CPUs
- Disable irqbalance for manual control
✅ Cgroups v2
- Fine-grained CPU quotas and shares
- Restrict processes to specific cores
- Integrated with systemd services
✅ Scheduler Policies
- Real-time scheduling (SCHED_FIFO, SCHED_RR)
- Deadline scheduling for periodic tasks
- Priority tuning with chrt and nice values
✅ Practical Applications
- Real-time audio: <50µs latency
- Network appliances: Dedicated interrupt handling
- Edge AI: Separate inference from I/O
- HPC: Balanced compute and I/O workloads
Next Steps
Advanced Topics to Explore:
- PREEMPT_RT Kernel: Full real-time Linux for <10µs latencies
- NUMA Optimization: On multi-socket systems (not applicable to Pi, but good to know)
- CPU Frequency Scaling: Governor tuning for performance vs power
- Memory Affinity: Combine with numactl on larger systems
- Container Optimization: Apply these techniques to Docker/Kubernetes
With proper CPU management, you can achieve near-deterministic performance on Raspberry Pi, making it viable for production real-time systems, edge computing, and high-performance network appliances.