When paging is enabled by setting the paging bit in CR0 to 1, all pointers (including EIP) are now interpreted as virtual rather than physical addresses. Unless the region of memory which the CPU is currently executing from is "identity mapped" (virtual addresses are mapped to identical physical addresses), it seems that this would cause the CPU to do what amounts to an "unconditional jump" -- it should start executing code from a different (physical) address.
Does this actually happen? It seems it would be very tricky to get OS startup code to work reliably with this behavior. Or do all protected-mode OSs identity-map their own kernel code?
Yes and No
Yes, in the informal sense: once paging is on, the MMU translates linear (virtual) addresses to physical addresses, and the CPU fetches instructions by virtual address. Suppose we switch on paging while executing an instruction at address 4000h, and the next instruction is at 4003h. If 4003h is translated to 8003h, execution effectively jumps from 4000h to 8003h. So we have to map the page we are currently executing from, or we won't know where the CPU will fetch code from.
No, in the technical sense: this is not a jump, since the CPU does not see any jump instruction with all its side effects (like discarding OoO instructions). Furthermore, the CPU accesses memory only after the whole cache hierarchy has missed, meaning that you could still be executing instructions from 4003h even if the page is mapped to a different address.
So, do or don't we need an identity map?
Yes, we need it, but not a full identity map. I usually identity-map only pages 7 and 8 (corresponding to the linear range 7000h-8fffh), for example.
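To make the translation concrete, here is a minimal user-space sketch in C that models a 32-bit, 4 KiB-page walk over the first 4 MiB. It is only a simulation under simplifying assumptions: the names `paging_reset`, `map_page`, `translate` and the `NO_MAPPING` sentinel are made up for illustration, and the page directory entry only carries a present bit instead of the physical address of the page table that real hardware requires.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_PRESENT 0x1u
#define PAGE_WRITE   0x2u
#define NO_MAPPING   0xFFFFFFFFu  /* hypothetical sentinel standing in for a page fault */

/* Simplified model: one page directory, one page table covering 0..4 MiB. */
static uint32_t page_dir[1024];
static uint32_t page_table0[1024];

/* Clear everything and mark only directory entry 0 as present.
 * On real hardware this entry would also hold the table's physical address. */
void paging_reset(void) {
    memset(page_dir, 0, sizeof page_dir);
    memset(page_table0, 0, sizeof page_table0);
    page_dir[0] = PAGE_PRESENT | PAGE_WRITE;
}

/* Map one 4 KiB linear page to one 4 KiB physical frame. */
void map_page(uint32_t linear, uint32_t physical) {
    assert((linear & 0xFFFu) == 0 && (physical & 0xFFFu) == 0);
    assert(linear < 0x400000u);  /* only the first 4 MiB is modelled */
    page_table0[linear >> 12] = physical | PAGE_PRESENT | PAGE_WRITE;
}

/* Walk the two levels the way the MMU would: directory index from bits
 * 31..22, table index from bits 21..12, byte offset from bits 11..0. */
uint32_t translate(uint32_t linear) {
    uint32_t pde = page_dir[linear >> 22];
    if (!(pde & PAGE_PRESENT))
        return NO_MAPPING;
    uint32_t pte = page_table0[(linear >> 12) & 0x3FFu];
    if (!(pte & PAGE_PRESENT))
        return NO_MAPPING;
    return (pte & 0xFFFFF000u) | (linear & 0xFFFu);
}
```

After `paging_reset()`, calling `map_page(0x7000, 0x7000)` and `map_page(0x8000, 0x8000)` identity-maps pages 7 and 8, so `translate(0x7003)` gives back `0x7003`. Calling `map_page(0x4000, 0x8000)` instead reproduces the informal "jump": `translate(0x4003)` yields `0x8003`.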
Comparing enabling paging with enabling protected mode shows how different they are. Paging takes effect immediately, so you need to create all the page tables before you activate it, and you need at least one identity-mapped page to handle your currently running code without relying on the caches.
Enabling protected mode, by contrast, is "easier": you can even create the GDT entries after you have entered protected mode, and you control when they are first used by reloading a segment register (usually CS, with a jump).
Actually, you don't strictly need an identity page if you know what you are doing (say, by duplicating your code or by using some hardware memory aliasing), but this is very context-specific; in the general case it just makes things needlessly complicated.
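As a side note on the GDT entries mentioned above, the following C sketch shows how an 8-byte segment descriptor could be packed from its fields. The helper name `gdt_entry` is hypothetical; the bit layout follows the standard x86 descriptor format, and the access and flag values in the note below are the usual ones for flat 32-bit segments.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a segment descriptor:
 *   bits  0..15  limit[15:0]
 *   bits 16..39  base[23:0]
 *   bits 40..47  access byte
 *   bits 48..51  limit[19:16]
 *   bits 52..55  flags (granularity, size, ...)
 *   bits 56..63  base[31:24]
 */
uint64_t gdt_entry(uint32_t base, uint32_t limit, uint8_t access, uint8_t flags) {
    assert(limit <= 0xFFFFFu);  /* the limit field is 20 bits wide */
    assert(flags <= 0xFu);      /* the flags field is 4 bits wide */

    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFFu);
    d |= (uint64_t)(base & 0xFFFFFFu) << 16;
    d |= (uint64_t)access << 40;
    d |= (uint64_t)((limit >> 16) & 0xFu) << 48;
    d |= (uint64_t)(flags & 0xFu) << 52;
    d |= (uint64_t)(base >> 24) << 56;
    return d;
}
```

For a flat 4 GiB code segment, `gdt_entry(0, 0xFFFFF, 0x9A, 0xC)` produces `0x00CF9A000000FFFF`; the matching data segment uses access byte `0x92`.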