Motivation
We are working on a white-box checking algorithm of Snapshot Isolation (SI): given an execution of a database, to check whether it satisfies SI.
The SI checking problem is NP-hard for general executions. So it is desirable to make use of the knowledge of how SI is actually implemented in databases.
The insight is that most databases, especially distributed databases, implement SI following a generic protocol using start-timestamps and commit-timestamps. With these timestamps of transactions in an execution, the SI checking problem becomes solvable in polynomial time. Therefore, we want to obtain these timestamps when generating executions.
It is crucial for us to really understand the meaning of the start-timestamps and commit-timestamps in the database under testing. We must be very sure that we have obtained the right timestamps in the right way.
That is why we ask for help here.
Background
We are digging into the implementation of snapshot isolation of MongoDB, especially into the use of timestamps in transactions.
Consider the classic description of start-timestamp and commit-timestamp in implementing Snapshot Isolation, quoted from the paper; Section 4.2:
For start-timestamp: A transaction executing with Snapshot Isolation always reads data from a snapshot of the (committed) data as of the time the transaction started, called its Start-Timestamp. This time may be any time before the transaction’s first Read.
For commit-timestamp: When the transaction
T1
is ready to commit, it gets a Commit-Timestamp, which is larger than any existing Start-Timestamp or Commit-Timestamp. WhenT1
commits, its changes become visible to all transactions whose Start-Timestamps are larger thanT1
's Commit-Timestamp.
For conflict detection: The transaction
T1
successfully commits only if no other transactionT2
with a Commit-Timestamp inT1
's execution interval [Start-Timestamp, Commit-Timestamp] wrote data thatT1
also wrote. Otherwise,T1
will abort. This feature, called First-committer-wins prevents lost updates.
Our Problem
How can we obtain such start-timestamp and commit-timestamp of a transaction in MongoDB from, e.g., database logs?
Our Solution
Environment
- MongoDB v7.0.2
- Python driver v4.1.1
- Sharded cluster deployment
sh.shardCollection("<database>.<collection>", { <shard key field> : "hashed" } )
Run transactions in MongoDB
We use a simpler version of the official example which uses the with_transaction
API.
from pymongo import MongoClient
from pymongo.read_concern import ReadConcern
from pymongo.write_concern import WriteConcern
client = MongoClient(host="10.206.0.12", port=27017)
def callback(session):
collection_one = session.client.mydb1.foo
collection_one.insert_one({"abc": 1}, session=session)
with client.start_session() as session:
session.with_transaction(
callback,
read_concern=ReadConcern("snapshot"),
write_concern=WriteConcern("majority")
)
To obtain the start-timestamp
According to mongodb-setting-systemLog.verbosity @ docs, we provide the following configure file (srs.conf
)
sharding:
clusterRole: shardsvr
replication:
replSetName: rs2
net:
bindIp: 0.0.0.0
storage:
oplogMinRetentionHours: 48
systemLog:
destination: file
logAppend: true
component:
transaction:
verbosity: 1
when starting a mongod using the command
mongod --fork --logpath /root/mongo-config/mongo-srs.log --config srs.conf
The option systemLog.component.transaction.verbosity
enables MongoDB to log the start-timestamp into the log file /root/mongo-config/mongo-srs.log
which looks like:
{
...
"c":"TXN",
"ctx":"conn80",
"msg":"transaction",
"attr":{
"parameters":{
"lsid":{
"id":{
"$uuid":"d25844f9-b25b-4ed3-8734-cccf8a4c584a"
},
...
},
"txnNumber":1,
"readConcern":{
"level":"snapshot",
"atClusterTime":{
"$timestamp":{
"t":1696995553,
"i":2
}
},
"provenance":"clientSupplied"
}
},
"readTimestamp":"Timestamp(1696995553, 2)",
"terminationCause":"committed",
...
}
}
readTimestamp
(Timestamp(1696995553, 2)
) is the start-timestamp of the transaction with lsid
and txnNumber
.
Question 1
- Have we obtained the start-timestamp correctly?
- Furthermore, what is the meaning of
atClusterTime
in the log above? Is it always the same withreadTimestamp
for multi-document transactions?
To obtain the commit-timestamp
The commit-timestamp of transactions are in the oplog.rs
collection of the local
database managed by MongoDB.
{
lsid: {
id: new UUID("d25844f9-b25b-4ed3-8734-cccf8a4c584a"),
uid: Binary(Buffer.from("e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855", "hex"), 0)
},
txnNumber: Long("1"),
op: 'c',
ns: 'admin.$cmd',
o: {
applyOps: [
{
op: 'i',
ns: 'mydb1.foo',
ui: new UUID("d9490802-b1bc-431d-90ac-db2a535ecc91"),
o: { _id: ObjectId("652618e4b4c5bae6e2da451c"), abc: 1 },
o2: { abc: 1, _id: ObjectId("652618e4b4c5bae6e2da451c") }
}
]
},
ts: Timestamp({ t: 1696995556, i: 3 }),
t: Long("15"),
v: Long("2"),
wall: ISODate("2023-10-11T03:39:16.566Z"),
prevOpTime: { ts: Timestamp({ t: 0, i: 0 }), t: Long("-1") }
}
ts
(Timestamp({ t: 1696995556, i: 3 })
) is the commit-timestamp of the transaction with lsid
and txnNumber
.
Question 2
- Have we obtained the commit-timestamp correctly?
About Read-Only Transactions
We can obtain the start-timestamp of read-only transactions in the way as described above. We do not find commit-timestamp of read-only transactions in oplog.rs
.
Question 3
Do read-only transactions have commit-timestamps in MongoDB? If so, how to obtain them?