How prompt is x86 at setting the page dirty bit?


From a software point of view, what is the latency between an instruction that dirties a memory page and when the core actually marks the page dirty in the Page Table Entry (PTE)?

In other words, if an instruction dirties a page, can the very next instruction read the PTE and see the dirty bit set?

I don't care about the actual elapsed cycles, only whether there is a software-visible window in which the dirty bit is not yet set. I can't find any guarantees in the reference manuals.


There are 3 answers

2
Alexey Frunze (best answer)

From AMD's manual (circa 2005), Volume 2: System Programming:

5.4 Page-Translation-Table Entry Fields ... Dirty (D) Bit. Bit 6. This bit is only present in the lowest level of the page-translation hierarchy. It indicates whether the page-translation table or physical page to which this entry points has been written. The D bit is set to 1 by the processor the first time there is a write to the physical page.

Ditto from Intel (circa 2006), Volume 3-A: System Programming Guide, Part 1:

3.7.6 Page-Directory and Page-Table Entries ... Dirty (D) flag, bit 6 Indicates whether a page has been written to when set. (This flag is not used in page-directory entries that point to page tables.) Memory management software typically clears this flag when a page is initially loaded into physical memory. The processor then sets this flag the first time a page is accessed for a write operation.

UPDATE:

From the latest Intel manual (vol 3A, System Programming Guide):

8.1.2.1 Automatic Locking The operations on which the processor automatically follows the LOCK semantics are as follows: ... When updating page-directory and page-table entries — When updating page-directory and page-table entries, the processor uses locked cycles to set the accessed and dirty flag in the page-directory and page-table entries.

From the rest of the text in sections 8.1 and 8.2 it follows that once the CPU sets the dirty bit using the locked operation, the other CPUs should start seeing the updated value.

Of course, you may still have a race condition: you first read the dirty bit as 0 on one CPU (or in one of its threads), and later another CPU (or another thread on the same CPU) causes the bit to be set to 1. But that is nothing unusual.
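To make the race concrete, here is a minimal user-space sketch in C11. It only simulates a page-table entry with an atomic 64-bit variable; the names `model_hw_set_dirty` and `pte_test_and_clear_dirty` are hypothetical and not taken from either manual. The atomic OR stands in for the processor's locked update, and the atomic AND shows one way software could test and clear the bit without losing a concurrent update.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* The D flag is bit 6 of a last-level page-table entry. */
#define PTE_DIRTY (UINT64_C(1) << 6)

/* Assumption: models the processor's locked update. An atomic OR sets D
 * without disturbing any other bit of the (simulated) entry. */
static inline void model_hw_set_dirty(_Atomic uint64_t *pte)
{
    assert(pte != NULL);
    atomic_fetch_or(pte, PTE_DIRTY);
}

/* Software side: fetch the old value and clear D in one atomic step.
 * A plain load followed by a separate store could overwrite a D update
 * that lands between the two. Returns true if D was set beforehand. */
static inline bool pte_test_and_clear_dirty(_Atomic uint64_t *pte)
{
    assert(pte != NULL);
    uint64_t old = atomic_fetch_and(pte, ~PTE_DIRTY);
    return (old & PTE_DIRTY) != 0;
}
```

In a real kernel the clear would also be followed by a TLB invalidation for that page (for example `INVLPG`), since the processor may otherwise keep using a cached translation that already records the page as dirty.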

0
geppy

AMD64 Architecture Programmer’s Manual Volume 2: System Programming (revision 3.22, Sept. 2012)

In general, Dirty bit updates are ordered with respect to other loads and stores, although not necessarily with respect to accesses to WC memory; in particular, they may not cause WC buffers to be flushed. However, to ensure compatibility with future processors, a serializing operation should be inserted before reading the D bit.

(Emphasis mine.)
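The recommendation quoted above could be sketched roughly as follows. This is an illustration under assumptions, not code from the manual: `serialize_cpu` and `pte_dirty_after_serialize` are hypothetical names, `CPUID` is used because it is a serializing instruction available at any privilege level, and the pointer is assumed to already refer to the last-level entry (a real kernel would find it by walking the page tables).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* The D flag is bit 6 of a last-level page-table entry. */
#define PTE_DIRTY (UINT64_C(1) << 6)

/* Execute a serializing instruction. On non-x86 builds this is only a
 * compiler barrier, so that the sketch still compiles there. */
static inline void serialize_cpu(void)
{
#if defined(__x86_64__) || defined(__i386__)
    uint32_t eax = 0, ebx, ecx = 0, edx;
    __asm__ volatile("cpuid"
                     : "+a"(eax), "=b"(ebx), "+c"(ecx), "=d"(edx)
                     :
                     : "memory");
    (void)ebx;
    (void)edx;
#else
    __asm__ volatile("" ::: "memory");
#endif
}

/* Serialize first, then read the entry and test the D flag. */
static inline bool pte_dirty_after_serialize(const volatile uint64_t *pte)
{
    assert(pte != NULL);
    serialize_cpu();
    return (*pte & PTE_DIRTY) != 0;
}
```

The `volatile` qualifier keeps the compiler from reusing a value it read before the serializing instruction; the `"memory"` clobber keeps it from moving other memory accesses across it.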

0
A Koscianski

According to page 2033 of this document, Intel x86 processors cache information about the page tables. The text states that if software clears the dirty bit, the CPU may still see it as 1.

Now, for the question: if the CPU caches the dirty bit, the update to the PTE (page-table entry) might not take place immediately. It could be delayed by a cache write-back policy.

Page 1651 of the same document describes the WBINVD instruction, which flushes the internal caches. It does not say whether this includes all data cached by the CPU.