I am trying to write raw machine code bytes as 0s and 1s into a text file, and execute it as such through the BIOS.
I have some problems understanding, however, how addressing, multiplying, offsets, operands, and instructions work in combination, i.e. the difference between MOV AL, 07 and MOV BL, AL.
I mean it makes sense in Assembly, but in machine code it becomes highly difficult to get the idea of parameters.
So what I want to know is this: How can I better understand this? There are no tutorials I've found that accurately explain/describe the 0s and 1s from instructions in combinatorial correlations or connections between data passing, MMIO, addressing modes, arithmetic, and the like.
On this site http://ref.x86asm.net/coder32.html#x00 it tries, but I don't understand this.
EXAMPLE: Say I want to move 5 into AL ... would I specify the literal '5' in binary as part of the opcode, chained onto the binary for the MOV/AL instruction, or would I have one fixed binary code for each instruction, regardless of value? That is what I want to know ... how to understand how machine code is written.
There is (mostly) a one-to-one mapping between assembler mnemonics and machine instructions. You can find these mappings in the Intel Software Developers Manual, Volume 2, which contains the complete x86 16-, 32- and 64-bit instruction sets. You'll probably want to start with Chapter 2: Instruction Format which describes the translations you're trying to come up with.
In the case of mov al, 5 it's just as you say: you put the literal there. The instruction in machine code is b0 05, since that's the MOV r8, imm8 form of the MOV instruction (the register number is folded into the opcode byte, and the immediate byte follows it). For mov bl, al, you'd want the MOV r/m8, r8 form, which in your case would encode to 88 c3. The c3 you can look up in Table 2-2, 32-Bit Addressing Forms with the ModR/M Byte, where you'll see it at the intersection of the BL row and the AL column. (There's a 16-bit table, too, if that's the mode you're in; the value in this case is the same.)
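To make the two encodings concrete, here is a minimal Python sketch that builds them by hand and prints them both as hex and as 0s and 1s. The helper names (REG8, encode_mov_r8_imm8, encode_mov_r8_r8, as_bits) are made up for illustration. It assumes only the register-to-register and register-immediate 8-bit forms discussed above, with no prefixes and no memory operands.

```python
# 3-bit register numbers for the 8-bit registers, in the order the
# Intel manual uses (AL=0, CL=1, DL=2, BL=3, AH=4, CH=5, DH=6, BH=7).
REG8 = {"al": 0, "cl": 1, "dl": 2, "bl": 3, "ah": 4, "ch": 5, "dh": 6, "bh": 7}


def encode_mov_r8_imm8(dst, imm):
    """MOV r8, imm8: opcode B0 + register number, then the immediate byte."""
    return bytes([0xB0 + REG8[dst], imm & 0xFF])


def encode_mov_r8_r8(dst, src):
    """MOV r/m8, r8 (opcode 88 /r), restricted to a register destination.

    ModR/M layout: mod (2 bits) | reg (3 bits) | r/m (3 bits).
    mod = 11 means the r/m field names a register, not a memory operand.
    For this form, reg holds the source and r/m holds the destination.
    """
    modrm = (0b11 << 6) | (REG8[src] << 3) | REG8[dst]
    return bytes([0x88, modrm])


def as_hex(code):
    return " ".join(f"{b:02x}" for b in code)


def as_bits(code):
    return " ".join(f"{b:08b}" for b in code)


for text, code in [
    ("mov al, 5", encode_mov_r8_imm8("al", 5)),
    ("mov bl, al", encode_mov_r8_r8("bl", "al")),
]:
    print(f"{text:<12} {as_hex(code)}   {as_bits(code)}")
```

So the literal 5 is its own byte after the opcode, while in the register-to-register case both operands are packed into the single ModR/M byte: c3 is 11 000 011, i.e. mod = 11, reg = AL (000), r/m = BL (011).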