- journal mode
data=journal mode provides full data and metadata journaling. All new data is written to the journal first, and then to its final location.
In the event of a crash, the journal can be replayed, bringing both data and metadata into a consistent state. This mode is the slowest, except when data needs to be read from and written to disk at the same time, where it outperforms all other modes. Enabling this mode will disable delayed allocation and O_DIRECT support.
I have a few questions about this:
With data=journal configured, when the user calls write(), does write() return only after the data has been successfully written to the journal, or does it return success as soon as the data is in the page cache? If it is the latter, the journal is committed asynchronously, which would mean that the ext4 journal only guarantees the consistency of the file system itself and does not guarantee that user data will not be lost. Is that right?
If ext4 commits the journal asynchronously, what triggers a journal commit?
Is there any other file system that commits the journal synchronously, before write() returns success?
From my local experiments, I infer that the journal is committed asynchronously. I used a separate SSD partition as the journal device (journal_dev). When I wrote files with fio, the I/O on the journal device was intermittent rather than continuous.
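A rough way to probe the first question from user space is to time write() and a following fsync() separately. The sketch below is only an illustration: the function name, the default size, and the path are hypothetical, and the path would need to point at a file on the ext4 file system under test. If write() returns much sooner than fsync(), that is consistent with write() returning once the data is in the page cache, although it does not by itself show what the journal is doing.

```python
import os
import time


def time_write_and_fsync(path, size=4 * 1024 * 1024):
    """Return (seconds spent in write(), seconds spent in fsync()).

    Hypothetical helper for illustration; `path` should live on the
    file system being tested.
    """
    data = memoryview(b"\0" * size)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        t0 = time.perf_counter()
        offset = 0
        while offset < size:
            # write() may accept fewer bytes than requested, so loop.
            offset += os.write(fd, data[offset:])
        t1 = time.perf_counter()
        # fsync() blocks until the kernel reports the data as stable.
        os.fsync(fd)
        t2 = time.perf_counter()
    finally:
        os.close(fd)
    return t1 - t0, t2 - t1
```

Timings from a single run are noisy, so it is probably worth repeating the measurement several times and comparing the distributions rather than individual values.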
[...] open()). [...] commit= in https://www.kernel.org/doc/Documentation/filesystems/ext4.txt) and probably before any pending sync/fsync etc. are allowed to complete. If you were to pass O_SYNC to open(), or to do an additional fsync, you will learn about when your write made it to stable media as far as the kernel can know.
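As a minimal sketch of the two options mentioned above, the first helper below calls fsync() after writing, and the second opens the file with O_SYNC so that each write() itself waits. The helper names and the way the file is opened are assumptions made for this example, and Python is used only for brevity; the os module calls map onto the open(2), write(2) and fsync(2) system calls.

```python
import os


def _write_all(fd, data):
    """Write every byte of `data`, looping because write() may be partial."""
    view = memoryview(data)
    while len(view) > 0:
        written = os.write(fd, view)
        view = view[written:]


def write_then_fsync(path, data):
    """Hypothetical helper: buffered write followed by an explicit fsync()."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        _write_all(fd, data)
        # Returns only once the kernel reports the data as on stable media.
        os.fsync(fd)
    finally:
        os.close(fd)


def write_with_o_sync(path, data):
    """Hypothetical helper: O_SYNC makes each write() wait for stable media."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_SYNC
    fd = os.open(path, flags, 0o644)
    try:
        _write_all(fd, data)
    finally:
        os.close(fd)
```

In both cases "stable" means as far as the kernel can know; a device that acknowledges writes it has not actually persisted is outside what these calls can detect.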