"Illegal Instruction" Error Running ollama on Debian 12 (bookworm) x86_64


I'm encountering an "Illegal Instruction" error when trying to run the ollama program on my Debian 12 (bookworm) system with an x86_64 architecture. This issue occurs both as a regular user and as root.

System Information
OS: Debian 12 (bookworm)
Kernel: Linux debian 6.1.0-17-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.69-1 (2023-12-30) x86_64 GNU/Linux
CPU: Intel(R) Celeron(R) CPU G1840 @ 2.80GHz
Issue Details

Whenever I attempt to run ollama or any related commands (e.g., ollama --version), I receive the "Illegal Instruction" error. I've followed various troubleshooting steps to diagnose the problem, including reinstalling ollama, verifying CPU architecture, and checking for missing dependencies.

Troubleshooting Steps Taken
  1. Verified that my CPU architecture is x86_64.
  2. Reinstalled ollama using the official installation script.
  3. Checked for missing dependencies using ldd.
Additional Information
  • The ollama binary is identified as an ELF 64-bit LSB executable, x86-64.
  • The system's PATH variable includes /usr/sbin, ensuring that the usermod command is available.
  • The system is running Debian 12 (bookworm).
Questions
  1. What could be causing the "Illegal Instruction" error when running ollama on Debian 12 (bookworm)?
  2. Are there any specific compatibility issues with Debian 12 (bookworm) or the x86_64 architecture that I should be aware of?
  3. Are there any further diagnostic steps I can take to pinpoint the cause of this issue?

I expected ollama and related commands (e.g., ollama --version) to run without errors and print their usual output or version information. Instead, every invocation fails with "Illegal Instruction".

There is 1 answer

Peter Cordes

You need a version of your program built to not require new CPU features.

Over the years, many extensions have been added to x86-64 that only newer CPUs support, notably AVX and other SIMD extensions. Your Celeron G1840 has SSE4.2 but not AVX or later (https://www.intel.com/content/www/us/en/products/sku/80800/intel-celeron-processor-g1840-2m-cache-2-80-ghz/specifications.html). The CPU cores are the Haswell microarchitecture, but because it's a Pentium/Celeron, Intel crippled it by disabling AVX/AVX2/FMA and BMI1/2.

Only Ice Lake and later Pentium/Celeron CPUs have those -march=x86-64-v3 features, unfortunately.
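To see which of these extensions the kernel reports for your CPU, you can read the flags line of /proc/cpuinfo. Here is a minimal Python sketch of that check. The function names are made up for illustration, and the list covers only the features named above (using Linux's lowercase flag spellings), not the full x86-64-v3 set.

```python
# Features the answer names as disabled on the Celeron G1840,
# spelled the way Linux lists them on the "flags" line of /proc/cpuinfo.
WANTED = ("avx", "avx2", "fma", "bmi1", "bmi2")


def parse_flags(cpuinfo_text):
    """Return the set of feature flags from the first 'flags' line."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_features(cpuinfo_text, wanted=WANTED):
    """List the wanted features that the flags line does not mention."""
    flags = parse_flags(cpuinfo_text)
    return [name for name in wanted if name not in flags]


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        print("could not read /proc/cpuinfo (not Linux?)")
    else:
        missing = missing_features(text)
        print("missing:", " ".join(missing) if missing else "none")
```

If the list printed on your machine is non-empty, a binary compiled to assume those features can die with SIGILL as soon as it executes one of the corresponding instructions.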


Your build of ollama probably assumes some CPU features your CPU doesn't have (check the flags line in /proc/cpuinfo to see what it does have). To find the faulting instruction, run gdb ollama and then, inside GDB, use run (or run -foo /path if you need to pass args).

When GDB stops at SIGILL, use disas $pc, +15 to disassemble up to 15 bytes (the maximum length of one x86 instruction) starting at the fault address, and look at the first instruction to find out what faulted. If its mnemonic starts with a v, like vmovdqa (%rdi), %xmm0, it's an AVX instruction.
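The same check can be done non-interactively. The sketch below builds a gdb command line using GDB's -batch, -ex and --args options: it runs the program, and once GDB stops on the signal, prints the single instruction at $pc with x/i. The helper names are hypothetical, and it assumes gdb is installed and ollama is on your PATH.

```python
import subprocess


def gdb_command(program, *args):
    """Build a gdb invocation that runs `program` and then
    disassembles one instruction at the address where it stopped."""
    return [
        "gdb", "-batch",    # exit after executing the -ex commands
        "-ex", "run",       # start the program; GDB stops on SIGILL
        "-ex", "x/i $pc",   # show the instruction at the fault address
        "--args", program, *args,
    ]


def run_under_gdb(program, *args):
    """Run the command above and return everything gdb printed to stdout."""
    result = subprocess.run(
        gdb_command(program, *args),
        capture_output=True,
        text=True,
    )
    return result.stdout
```

For example, print(run_under_gdb("ollama", "--version")) should end with a line marked => that shows the faulting instruction, provided the program really does stop on SIGILL. If it exits normally instead, there is no $pc to examine and GDB reports an error for the second command.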

Your OS can't really do anything about that; Debian 12's kernel is new enough that if your CPU supported AVX (or AVX-512), it would have enabled it.


Skylake had an erratum where Pentium/Celeron models report BMI1/2 as available in CPUID, but fault on them anyway, so programs that try to detect what the CPU supports will be misled.

But that CPUID bug was new in Skylake so it won't affect you; I remember people (including myself) being surprised that there even were Skylake CPUs without BMI extensions, or worried that all SKL models lacked BMI. Also it was fixed by a microcode update, exactly the kind of thing that updateable microcode can fix easily.

Perhaps Intel disabled AVX on those CPUs by disabling decode of VEX prefixes in the machine code, which BMI instructions also depend on. If so, disabling BMI might have been just an acceptable casualty, not really a goal. (I don't know how non-VEX BMI instructions execute on those CPUs. That's actually just tzcnt; lzcnt and popcnt have their own feature bits. And tzcnt works the same as bsf for non-zero inputs, and most compiler-generated code doesn't rely on the input=0 behaviour.)
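To make the tzcnt / bsf point concrete, here is a small Python model of the two instructions' results. It is only an illustration of the semantics, not how either is implemented. One loud assumption: for a zero input, the model leaves bsf's destination unchanged. That is what AMD documents and what CPUs do in practice, but Intel's manual calls the result undefined.

```python
def lowest_set_bit_index(x):
    """Index of the least-significant 1 bit of a non-zero integer."""
    return (x & -x).bit_length() - 1


def tzcnt(src, width=64):
    """Model of tzcnt: count trailing zeros; a zero input gives the operand width."""
    src &= (1 << width) - 1
    return width if src == 0 else lowest_set_bit_index(src)


def bsf(src, old_dest, width=64):
    """Model of bsf: same answer for non-zero inputs.
    Assumption: a zero input leaves the destination register unchanged."""
    src &= (1 << width) - 1
    return old_dest if src == 0 else lowest_set_bit_index(src)
```

The two functions agree for every non-zero src and differ only when src is 0, which is why code that never feeds zero to tzcnt still gets correct results when a CPU without BMI1 executes it as bsf.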