I was reading on Wikipedia about machine code, microcode and bytecode.
It seems that microcode is something more low-level than machine code, while bytecode seems to be more high-level.
I didn't really get how something can be more low-level than machine code, and more generally, how both of them relate to machine code.
What is the difference between bytecode and microcode?
2 Answers
In the context of those three, machine code is the instruction set that a processor runs: the instructions that are published and that some of us learn to program in (in assembly language, with an assembler being a tool that converts the assembly into machine code).
0: 66 b8 05 00 mov ax,0x5
Not all processors are microcoded; in fact fewer than you might think, next to none. It is more of a CISC thing than a RISC thing, although AFAIK microcode has been used on some RISCs. When you think microcoded, basically think x86.
Something like moving an immediate into a general purpose register doesn't seem like a problem, but then look at some other instructions that allow you to have memory operands, like these:
0: 66 b8 05 00 mov ax,0x5
4: 66 67 8b 07 mov ax,WORD PTR [bx]
8: 66 67 8b 47 05 mov ax,WORD PTR [bx+0x5]
Steps for the third one:
- read the contents of bx from the register file
- add 5 to that value
- perform a word sized load from memory using that address
- save that value in ax in the register file
That is a bunch of steps which, on a simple load-and-store machine, would be done using more instructions. Some pseudocode:
add r3,r2,#5    ; r2 holds bx: compute the address bx+5
load r4,[r3]    ; word-sized load from that address
mov r1,r4       ; put the loaded word in r1 (standing in for ax)
The underlying processor, if there is one, may be a bit-slice design or a VLIW (very long instruction word) machine or something created specifically for the task.
There is no reason to expect otherwise, and we are pretty sure it is the case, that from one generation of x86 to another the underlying microengines have changed, perhaps as a complete replacement, perhaps not.
Think of it like EBCDIC vs ASCII vs Unicode, etc. The basic alphabet, capital A-Z, lower case a-z and 0-9 plus some others, can be represented digitally using different coding schemes, and only some folks have to know the scheme; the rest of us just type these characters into an edit box on a web page and don't have to know how it works.
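For instance (these are just the standard code points, nothing specific to this question), the same capital letter is stored as a different byte value under each scheme:

#include <stdio.h>

int main(void)
{
    /* The same capital letter 'A' under three coding schemes */
    printf("ASCII   'A' = 0x%02X\n", 0x41);   /* 65  */
    printf("EBCDIC  'A' = 0x%02X\n", 0xC1);   /* 193 */
    printf("Unicode 'A' = U+0041\n");         /* same abstract character */
    return 0;
}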
Processors that have survived multiple generations can easily be re-designed while retaining compatibility with the past. If you take 100 programmers and give them some detailed description of a programming problem, you will get somewhere between 1 and 100 different solutions that all perform the same task. Some might be better than others, but assuming all are bug free, they are all valid solutions. Take 100 RTL engineers, give them an instruction set and a processor bus specification, and you get between 1 and 100 different solutions, any of which could technically be used to implement that processor. There is no reason that, after one design has been used for a couple of years, a completely new design might not be wanted. Or you could do the Intel thing: team A builds every other processor and team B builds the others (they may or may not do this now, but did for some period), so every other one might be based on improvements to a prior design, while one to the next might be completely different underneath. And this has nothing yet to do with microcoding or not.
You can certainly make an x86 that is not microcoded, as well as a MIPS that is. Look up the LC-3, a very, very simple instruction set that can be implemented in about a page of Verilog, yet the university creators have at least one microcoded solution, which is crazy huge, for demonstration and educational purposes. Historically, as well as today, it takes a lot of work to build a chip, and every spin costs a lot (think ones to tens of millions of dollars per spin), so the fewer spins it takes to get something you can sell, the better.

Say I wanted to make a washing machine controller, and for mass production cost savings I want as much as I can get into one chip on the controller board. I can choose to make it all logic, not programmable. But if there are any bugs or features to be changed after the chip is in production, it's a chip spin, and we toss the boards built with the prior chip. If we make some percentage of it programmable, the odds of having to recall or toss product on the shelves in order to fix bugs or add features go down. Sometimes it's even cheaper to produce; a programmable solution doesn't always have to be larger than a non-programmable one, it depends on the problem being solved. Experience in that field guides those decisions. For it to pay off to make your own chip for a product like that, you have to get your returns on volume: spend hundreds of thousands to millions to produce a controller chip so that each board is a couple of bucks cheaper than building it with a microcontroller and a few other chips.
Older generation processors like the 6502, 8086 and their predecessors tried to do more per instruction, as shown above. Instruction sets have evolved (or have they? Look at the PDP-8 and think about your question and CISC, RISC and the others.) So you get microcoding as a compromise: a balance between getting a product you can sell without too many chip spins and still having all those features. But then you started to see RISC, VLIW and other solutions where you weren't really looking at a microengine used as a general purpose processor; you were looking at designs that used more instructions, but instructions that were simpler to implement and could execute faster. So there is a design choice with pros and cons, and as we know both solutions are capable of general purpose computing, or special purpose computing. Some of those designs and companies have survived a number of generations, not necessarily because their instruction set is better or worse; some of it is marketing, some is being at the right place at the right time to take over a market and control it such that others cannot penetrate it, etc.
Go to the visual6502 page and look at some of the things they talk about there. That processor was designed around a ROM; think of it as a ROM-based state machine where your opcode is essentially a pointer into the ROM, and the ROM contains the sub-commands that drive the state machine, or microengine if you will. Maybe then microcode, and breaking a machine code instruction down into smaller steps, will make sense. AMD had a processor called the AMD 29000, and to some extent that processor became the microengine behind the first/early AMD x86 clones. Transmeta took another approach: at runtime it converted the x86 code into VLIW code. Why they didn't just sell it as a VLIW processor I don't know; their goal was to make an x86 clone, not some new processor, I guess.
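To make the ROM-based state machine idea a bit more concrete, here is a minimal sketch in C. Nothing below is real microcode; the micro-op names, opcode numbers and table layout are all invented purely to show how a machine-code opcode can act as an index into a ROM of sub-commands:

#include <stdio.h>

/* Hypothetical micro-operations -- invented for illustration only */
enum uop { U_END, U_FETCH_IMM, U_READ_BX, U_ADD_IMM, U_LOAD_MEM, U_WRITE_AX };

static const char *uop_name[] = {
    "END", "FETCH_IMM", "READ_BX", "ADD_IMM", "LOAD_MEM", "WRITE_AX"
};

/* "Microcode ROM": the machine-code opcode selects a row, and the row is the
   sequence of sub-commands the microengine steps through */
static const enum uop rom[][8] = {
    /* opcode 0: mov ax,imm       */ { U_FETCH_IMM, U_WRITE_AX },
    /* opcode 1: mov ax,[bx]      */ { U_READ_BX, U_LOAD_MEM, U_WRITE_AX },
    /* opcode 2: mov ax,[bx+imm]  */ { U_READ_BX, U_FETCH_IMM, U_ADD_IMM, U_LOAD_MEM, U_WRITE_AX },
};

static void execute(int opcode)
{
    /* one machine instruction = one walk through its row of sub-commands */
    for (const enum uop *u = rom[opcode]; *u != U_END; u++)
        printf("  %s\n", uop_name[*u]);   /* real hardware would drive control lines here */
}

int main(void)
{
    printf("mov ax,[bx+5] breaks down into:\n");
    execute(2);
    return 0;
}

Running it prints the same sort of step-by-step breakdown listed earlier for mov ax,WORD PTR [bx+0x5].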
Languages like C were supposed to solve the problem of getting the various processor-based systems on the Arpanet to talk to each other without having to re-write the communication code (well, write it from scratch to a spec) at each university. What if you could make a higher level language for which you implement a backend per target, so the same high level programs can run on each target? While C is still very relevant for obvious reasons, Pascal, Java, Python and others came along with an even more interesting notion: take the high level language and compile it into a machine code, but a machine code for which you then write a target-specific virtual machine. It is usually a stack based solution that most processors can implement easily, not necessarily efficiently, but with a good chance of success in creating a virtual machine for each target platform. You don't have to carry the source code around any more if you don't want to; you can carry the bytecode around, already compiled. It is not much different from an instruction set simulator that lets you run, say, ARM programs on an x86, or 6502 arcade games like Asteroids on your ARM based phone or x86 based laptop, except that the bytecode instruction set is somewhat designed for that emulator/virtual machine and not necessarily designed like a real machine language.
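To make the "stack based solution" idea concrete, here is a toy bytecode interpreter in C. The opcode values are made up for this sketch (they are not Java's, Python's or anyone else's); the point is just that the same few bytes of already-compiled program run anywhere this little virtual machine has been ported:

#include <stdint.h>
#include <stdio.h>

/* Made-up single-byte opcodes for a toy stack machine */
enum { OP_HALT = 0x00, OP_PUSH = 0x01, OP_ADD = 0x02, OP_PRINT = 0x03 };

static void run(const uint8_t *code)
{
    int32_t stack[64];
    int sp = 0;                              /* next free stack slot */
    size_t pc = 0;                           /* bytecode program counter */

    for (;;) {
        uint8_t op = code[pc++];             /* fetch the next bytecode */
        switch (op) {
        case OP_PUSH:  stack[sp++] = (int8_t)code[pc++]; break;  /* operand byte follows */
        case OP_ADD:   sp--; stack[sp - 1] += stack[sp];  break;
        case OP_PRINT: printf("%d\n", stack[sp - 1]);     break;
        case OP_HALT:  return;
        }
    }
}

int main(void)
{
    /* push 2; push 3; add; print; halt -- the same bytes run on any host
       that has a port of this little interpreter, whatever its native
       machine code happens to be */
    const uint8_t program[] = { OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PRINT, OP_HALT };
    run(program);
    return 0;
}

A real VM adds many more opcodes, a verifier, a garbage collector and so on, but the fetch/decode/dispatch loop has the same shape.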
The way Java bytecode is handled, for example, is that you can basically build yourself a lookup table for each bytecode instruction and implement it in native instructions (where possible; some of the instructions need/want system calls, which you can still do in native code). Here again, think about the 6502, 8080/Z80, 8086: you had opcodes that were a byte, and the byte itself didn't necessarily map logically in a way you could pull apart to see the operands; it was a byte you used against a lookup table. Look at MIPS, ARM, RISC-V and others and you don't really have an opcode in that sense (MIPS and RISC-V try, but if you really look it's incomplete); you have a much larger instruction where portions of that instruction drive the state machine (more) directly (ideally). In the opcode based designs, the opcode is looked up or mapped into a table that drives the state machine indirectly from the bits of the opcode/instruction. Yes, there is some blurring between those: RISC has some opcode bits, and CISC may have some bits that can be used more directly.
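A rough sketch of that contrast, with the caveat that the handler table is trivial and the bit-field layout is simplified rather than an exact MIPS/ARM/RISC-V encoding: a byte opcode is opaque and goes through a table, while a fixed-width RISC-style word carries fields you can mask out directly:

#include <stdint.h>
#include <stdio.h>

/* Byte-oriented style (6502/8080/Z80/8086 opcodes, or a bytecode VM):
   the byte itself is opaque, it is simply used as an index into a table */
static void do_nop(void) { puts("nop"); }
static void do_add(void) { puts("add"); }

typedef void (*handler_t)(void);
static handler_t opcode_table[256] = { [0x00] = do_nop, [0x01] = do_add };

/* RISC-style fixed-width word (field layout simplified, not an exact
   MIPS/ARM/RISC-V encoding): fields are masked straight out of the
   instruction and can drive the register file and ALU more directly */
static void decode_word(uint32_t insn)
{
    unsigned opcode = (insn >> 26) & 0x3f;    /* operation selector       */
    unsigned rs     = (insn >> 21) & 0x1f;    /* source register number   */
    unsigned rt     = (insn >> 16) & 0x1f;    /* destination register     */
    unsigned imm    =  insn        & 0xffff;  /* immediate value          */
    printf("op=%u rs=%u rt=%u imm=%u\n", opcode, rs, rt, imm);
}

int main(void)
{
    opcode_table[0x01]();        /* the lookup table drives the behaviour */
    decode_word(0x20450005);     /* the fields are read out directly      */
    return 0;
}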
Another use for bytecode we have seen is in compiler infrastructures like LLVM, including just-in-time compilation. LLVM started off that way and still uses it as its backbone: the compiler front ends create bytecode, that bytecode can be optimized at that level, and then ultimately, either just in time or while building your project, the bytecode is compiled, if you will, into the target assembly language or machine code. The bytecode is a nice intermediate abstraction layer between all the possible front-end languages and all the implemented backend targets.
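As a toy picture of that layering (the "IR" below is invented and looks nothing like real LLVM IR or bitcode), any front end can emit the shared intermediate form and any backend can lower it without knowing which front end produced it:

#include <stdio.h>

/* Invented intermediate representation shared by the toy front end and backend */
enum ir_op { IR_CONST, IR_ADD, IR_RET };
struct ir_insn { enum ir_op op; int value; };

/* "Front end": compile the expression a+b into the shared IR */
static int frontend_compile(int a, int b, struct ir_insn *out)
{
    out[0] = (struct ir_insn){ IR_CONST, a };
    out[1] = (struct ir_insn){ IR_CONST, b };
    out[2] = (struct ir_insn){ IR_ADD,   0 };
    out[3] = (struct ir_insn){ IR_RET,   0 };
    return 4;
}

/* "Backend": lower the same IR to a made-up target's assembly text.
   A second backend for a different target would reuse the IR untouched. */
static void backend_emit(const struct ir_insn *ir, int n)
{
    for (int i = 0; i < n; i++) {
        switch (ir[i].op) {
        case IR_CONST: printf("    push %d\n", ir[i].value); break;
        case IR_ADD:   printf("    add\n");                  break;
        case IR_RET:   printf("    ret\n");                  break;
        }
    }
}

int main(void)
{
    struct ir_insn ir[8];
    int n = frontend_compile(2, 3, ir);   /* front end -> IR      */
    backend_emit(ir, n);                  /* IR -> target "asm"   */
    return 0;
}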
As with the RISC vs CISC comparison, I am generalizing: you can certainly compile Java into native target code, and likewise you could make a C compiler/toolchain that outputs bytecode for one of these virtual machines.
Machine code: what processors operate on, what they understand; the instructions that tell them to do something.
Microcode: some processor designs implement machine code by essentially emulating the machine code instructions using another, ideally simpler, instruction set; microcode running on a microengine.
Bytecode: an ideally generic instruction set that can be emulated in a virtual machine, such that a compiled program can be delivered and used on otherwise incompatible systems. You only have to develop the program once for the VM and deliver the same binary; you don't have to develop a version for each target operating system/instruction set. The term is also used for the way compiled-language front ends are abstracted from the possible target backends in a toolchain.
See Wikipedia's webpage for any of those terms; in the right column it says:
Source code is any collection of code, possibly with comments, written using a human-readable programming language, usually as plain text. Microcode translates machine instructions, state machine data or other input into sequences of detailed circuit-level operations. It separates the machine instructions from the underlying electronics so that instructions can be designed and altered more freely.
One analogy is that machine code tells the CPU which microcode (subroutines) to use.
Almost half a century ago I knew the Intel 8080 off by heart; it was human readable, at least to some people. Modern processors have a much larger instruction set, and I think it's fair to say that hardly anyone is familiar with the entire machine code instruction set of their favorite processor of today.
For a particular processor architecture the machine code mostly overlaps between vendors' processors; for example, you can run most Intel-optimized machine code on AMD processors (and vice versa), but the microcode of each isn't cross-compatible.
There are programs which rely on a specific processor and don't have hooks to allow different subroutines to be chosen depending on which processor is being used. Often these are drivers, or highly optimized programs designed to execute as quickly as possible (or to occupy the least memory, or some other tradeoff). These programs (or drivers) are usually available in multiple versions to accommodate a manufacturer's specific version of a CPU. This is especially common with ARM CPUs, where each core supports a different set of instructions and memory is limited (so accommodating a wide range of processors is impractical, and pointless in an embedded system).
The above is intended to be an easy-to-understand explanation; clicking on the links above provides a more exact and in-depth explanation.