Trouble understanding registers x86

I've been trying to teach myself how to accomplish certain tasks in assembly.

Right now, I am working on trying to detect palindromes. I know I could use a stack, or possibly compare strings using Irvine's library, but I'm trying to do it via registers.

The problem is, when it comes to using registers, I'm more than a bit confused.

The following compiles, but when I get to the CMP line, the program breaks and gives me this message:

Unhandled exception at 0x004033FC in Project.exe: 0xC0000005: Access violation reading location 0x0000000F.

I'm assuming it has something to do with how I set the registers, but even using the registers while debugging isn't helping me much.

Any help would be appreciated.

INCLUDE Irvine32.inc

.data

enteredWord BYTE "Please enter the string to check: ", 0
presetWord BYTE "Step on no pets", 0

isAPalindrome BYTE "The word is a palindrome. ", 0
isNotAPalindrome BYTE "The word is not a palindrome. ", 0

.code
main proc
mov ecx, SIZEOF presetWord - 1
mov esi,OFFSET presetWord

checkWord:
MOV eax,[esi]
CMP [ecx],eax
JNE NOTPALIN

inc esi
dec ecx
loop checkWord
mov edx, offset isAPalindrome
call WriteString
jmp _exit
main endp

NOTPALIN PROC
mov edx, offset isNotAPalindrome
call WriteString
ret
NOTPALIN endp


_exit:
exit


end main

There is 1 answer

Answer by Ped7g:

A CPU register is a small piece of storage located directly inside the CPU core. It holds a fixed number of bits (0/1): on a 64-bit x86 CPU the general-purpose registers are 64 bits wide and go by the names rax, rcx, rdx, rbx, ...

ecx is the lower 32 bits of rcx (the upper 32 bits have no name of their own and are reachable only through instructions using rcx). The lower 16 bits are accessible as cx, which is in turn composed of two 8-bit parts: ch (upper) and cl (lower).
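To see those sub-registers for yourself, here is a minimal sketch, assuming the same 32-bit MASM + Irvine32 setup as in the question. The value 11223344h is arbitrary and chosen only so that each byte is easy to recognize; WriteHex and Crlf are Irvine32 procedures.

```asm
INCLUDE Irvine32.inc

.code
main PROC
    mov   ecx, 11223344h   ; fill all 32 bits of ecx
    movzx eax, cx          ; cx = low 16 bits of ecx = 3344h
    call  WriteHex         ; prints 00003344
    call  Crlf
    movzx eax, ch          ; ch = upper byte of cx = 33h
    call  WriteHex         ; prints 00000033
    call  Crlf
    movzx eax, cl          ; cl = lower byte of cx = 44h
    call  WriteHex         ; prints 00000044
    call  Crlf
    exit
main ENDP
END main
```

cx, ch and cl are not separate storage: they are just names for parts of the same 32 bits held in ecx.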

So when you use ecx, you can set each of its 32 bits to either 0 or 1. Those bits can be interpreted as an unsigned number from 0 to 2^32-1 (in hex 0 .. 0xFFFFFFFF), or as a signed number from -2^31 to +2^31-1 (0x80000000 .. 0x7FFFFFFF). Or you can interpret the bits in any way you wish and write your code accordingly.
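A small sketch of "same bits, different interpretation", again assuming Irvine32: WriteDec prints eax as an unsigned number, WriteInt prints the very same bits as a signed number.

```asm
INCLUDE Irvine32.inc

.code
main PROC
    mov  eax, 0FFFFFFFFh   ; all 32 bits set to 1
    call WriteDec          ; unsigned view: prints 4294967295
    call Crlf
    call WriteInt          ; signed view of the same bits: prints -1
    call Crlf
    exit
main ENDP
END main
```

The register content never changes between the two calls; only the code reading it decides what the bits mean.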

In your code you can use three common ways of interpreting the bits in a CPU register.

; EBX as memory address:
mov   ebx,OFFSET presetWord  ; some address into memory (32b unsigned number)
; ECX as numeric value ("unsigned long" in C++)
mov   ecx,SIZEOF presetWord - 1  ; 15
; AL as ASCII character (extended 8 bit)
mov   al,[ebx]      ; also shows how memory is referenced by address
; AL == 83 == 'S' => value of memory at address "presetWord"

In your example, cmp [ecx],eax references memory at address 15 (the 0x0000000F from your exception message), which is, fortunately for you, an illegal address, so it crashes. Had you accidentally used an address that is legal for your process (but not the one you really wanted), the code would silently continue with an unexpected result.

You probably meant cmp [esi+ecx],eax, which references memory at address presetWord+15. With ecx = SIZEOF presetWord - 1 that is the terminating zero (the last char, 's', sits at presetWord+14), and even that holds only for the first iteration: inc esi then moves esi to presetWord+1 (the second char), so esi+ecx is no longer an offset from the start of the string.

And you probably wanted to compare single characters, so you should change eax to al to fetch and compare one byte at a time, because the string is encoded in ASCII (8 bits per char). eax would fit a UTF-32 encoding.


To check for a palindrome you may want to load one register ("r1") with the address of the first char and another register ("r2") with the address (!) of the last char, and then run this loop:

  • if (r2 <= r1) -> exit with true (all important chars compared) (check addresses as unsigned numbers)
  • here addresses r1 < r2 -> now compare characters
  • if (byte [r1] != byte [r2]) exit false
  • ++r1, --r2 -> adjust the addresses to point at second/second last chars
  • loop to beginning
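The steps above could be sketched in MASM roughly like this. It is only one possible way to write it, assuming Irvine32 and reusing the data from the question; using esi as "r1" and edi as "r2" is my choice, and the labels checkWord, allMatched, foundMismatch and showResult are made up for this sketch.

```asm
INCLUDE Irvine32.inc

.data
presetWord       BYTE "Step on no pets", 0
isAPalindrome    BYTE "The word is a palindrome. ", 0
isNotAPalindrome BYTE "The word is not a palindrome. ", 0

.code
main PROC
    mov  esi, OFFSET presetWord      ; r1: address of the first char
    mov  edi, esi
    add  edi, SIZEOF presetWord - 2  ; r2: address of the last char (the one before the 0)

checkWord:
    cmp  edi, esi
    jbe  allMatched                  ; r2 <= r1 (unsigned compare): every pair was checked
    mov  al, [esi]                   ; fetch a single byte from the front
    cmp  al, [edi]                   ; compare with the byte from the back
    jne  foundMismatch
    inc  esi                         ; ++r1
    dec  edi                         ; --r2
    jmp  checkWord

allMatched:
    mov  edx, OFFSET isAPalindrome
    jmp  showResult
foundMismatch:
    mov  edx, OFFSET isNotAPalindrome
showResult:
    call WriteString
    exit
main ENDP
END main
```

Both outcomes end up in the same WriteString call, so no separate procedure is needed, and jbe (rather than jle) keeps the address comparison unsigned.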

This will produce "false" for presetWord, as 'S' != 's', so you may want to introduce case insensitivity to the if (byte [r1]... part, but I would first make it work without that.
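If you later want the case-insensitive variant, one simple sketch is to set bit 5 (20h) in both bytes before comparing, which turns 'S' (53h) into 's' (73h) and leaves lowercase letters and spaces unchanged. This is an assumption-laden shortcut: it is only safe when the string contains nothing but ASCII letters and spaces, because some unrelated characters (for example '@' and '`') would also compare as equal. The register choice (esi/edi) and the labels are again made up for the sketch.

```asm
INCLUDE Irvine32.inc

.data
presetWord       BYTE "Step on no pets", 0
isAPalindrome    BYTE "The word is a palindrome. ", 0
isNotAPalindrome BYTE "The word is not a palindrome. ", 0

.code
main PROC
    mov  esi, OFFSET presetWord      ; address of the first char
    mov  edi, esi
    add  edi, SIZEOF presetWord - 2  ; address of the last char

checkWord:
    cmp  edi, esi
    jbe  allMatched                  ; addresses met or crossed: done
    mov  al, [esi]
    mov  ah, [edi]
    or   al, 20h                     ; fold ASCII letters to lowercase (letters/spaces only!)
    or   ah, 20h
    cmp  al, ah
    jne  foundMismatch
    inc  esi
    dec  edi
    jmp  checkWord

allMatched:
    mov  edx, OFFSET isAPalindrome
    jmp  showResult
foundMismatch:
    mov  edx, OFFSET isNotAPalindrome
showResult:
    call WriteString                 ; for "Step on no pets" this prints the palindrome message
    exit
main ENDP
END main
```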


While debugging, you should be able to recognize the "class" of some of the numbers in registers. If you load a size into a register, it will very likely be some small number, like 0000000F (15). An address will very likely be some large number like 8040506E. ASCII characters used as single chars should show up as something like 20 - 7F in common cases. But if you do mov al,..., the debugger still displays the whole eax, and the upper three bytes keep their previous values. For example, reading a space character into al while eax holds 12345678 will change the value of eax to 12345620 (space ' ' == 0x20 in ASCII).
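You can reproduce that last example without a debugger. A minimal sketch, assuming Irvine32 (WriteHex prints the full 32-bit eax in hex):

```asm
INCLUDE Irvine32.inc

.code
main PROC
    mov  eax, 12345678h    ; previous content of eax
    mov  al, ' '           ; writes only the lowest byte (20h)
    call WriteHex          ; prints 12345620: upper three bytes untouched
    call Crlf
    exit
main ENDP
END main
```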

You can also use the memory view to check the content of a particular address in memory. If you change that cmp to cmp [esi+ecx],eax, for example, and watch the address esi+ecx in the memory view, you can check on every iteration whether it really points at the char you expect. Note that loop itself also decrements ecx, so together with dec ecx the counter drops by two per iteration.

All of this is visible and checkable in the debugger. It is sometimes a bit tedious, but often easier than asking on SO or just thinking about the source code, especially if you have been stuck for a while.


Finally ... why even registers? Because computer memory is a separate chip. It may look innocent, but an instruction like mov al,[presetWord] may actually stall for hundreds of CPU cycles while the CPU waits for the memory chip to read the content and send it over the bus wires. al and ecx, on the other hand, are directly inside the CPU, accessible in the same cycle in which the CPU needs them.

So you may want to keep values in registers if you use them often in your calculation, to avoid waiting on memory (although once the memory content is cached by the L0/1/2/3 caches, the "hundreds" of cycles become a reasonable amount, sometimes even 0 cycles with a cache level directly on the CPU chip). But you want to access memory in a predictable pattern (so the cache can read ahead) and in reasonable amounts (caches usually work with sizes like 16-32B up to 4-8k, depending on their level). If a couple of instructions touch something like 16 different 8k memory pages, you may run out of available cache lines, and then at least one access will incur a full stall, waiting for a real memory read.