How to configure Linux/CPU better for running large-scale software (NUMA)


I am doing performance analysis on Linux for large-scale, memory-intensive programs (tens of gigabytes of memory).

I am wondering whether it is possible to configure Linux or the hardware to be better suited to running such large programs, but I am not familiar with this area.

Does anybody have pointers on how to configure:

  1. the memory allocation strategy of the OS
  2. the cache configuration of the CPU
  3. anything else...

Any comments are appreciated.

This is the typical CPU model (4 Opteron processors, each dual-core), as shown in /proc/cpuinfo:

processor       : 3
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 65
model name      : Dual-Core AMD Opteron(tm) Processor 2218
stepping        : 2
cpu MHz         : 2600.000
cache size      : 1024 KB
physical id     : 1
siblings        : 2
core id         : 1
cpu cores       : 2
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow pni cx16 lahf_lm cmp_legacy svm extapic cr8_legacy
bogomips        : 5200.09
TLB size        : 1088 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp tm stc

There is 1 answer

Brian Cain On

Useful for investigating memory / caching on a multi-socket system:

  • hwloc's lstopo (example):

    lstopo
    
  • numactl / libnuma (but only if it really is a NUMA system)

    numactl --hardware
    numactl --show
    
  • sysfs, procfs:

    sudo grep . /sys/devices/system/cpu/cpu*/cpufreq/*
    grep . /sys/devices/system/cpu/cpu*/topology/physical_package_id
    sudo grep . /proc/irq/*/smp_affinity # compare w/ /proc/interrupts
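
If `numactl` is not installed, the node-to-CPU mapping can usually be read straight from sysfs as well. The following is a minimal sketch, assuming the kernel exposes `/sys/devices/system/node/node*/cpulist` (it should on a NUMA-enabled kernel). The function name `list_numa_nodes` and its optional directory argument are made up for illustration and are not part of any tool.

```shell
#!/bin/sh
# Print each NUMA node together with the CPUs that belong to it.
# The optional first argument overrides the sysfs node directory
# (hypothetical convenience, handy for trying this on a copied tree).
list_numa_nodes() {
    node_dir="${1:-/sys/devices/system/node}"
    found=0
    for n in "$node_dir"/node[0-9]*; do
        # Skip the unexpanded glob or any node without a readable cpulist.
        [ -r "$n/cpulist" ] || continue
        printf '%s: cpus %s\n' "$(basename "$n")" "$(cat "$n/cpulist")"
        found=1
    done
    if [ "$found" -eq 0 ]; then
        echo "no NUMA node information found under $node_dir"
    fi
}

list_numa_nodes "$@"
```

If more than one node shows up, the node numbers can be passed to `numactl` to experiment with placement, for example `numactl --cpunodebind=0 --membind=0 ./your_program` to keep a process and its memory on node 0, or `numactl --interleave=all ./your_program` to spread pages across all nodes (`./your_program` is a placeholder). Which policy helps, if any, depends on the workload, so it is worth measuring both.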