I am trying to implement a database index based on the data structure (Blink tree) and algorithms suggested by Lehman and Yao in this paper. In page 2, the authors state that:
The disk is partitioned in sections of fixed size (physical pages; in this paper, these correspond to the nodes of the tree). These are the only units that can be read or written by a process. [emphasis mine] (...)
(...) a process is allowed to lock and unlock a disk page. This lock gives that process exclusive modification rights to that page; also, a process must have a page locked in order to modify that page. (...) Locks do not prevent other processes from reading the locked page. [emphasis mine]
I am not completely sure my interpretation is correct (I am not used to reading academic papers), but I think it can be concluded from the emphasized sentences that the authors mean the operations that read and write a page are assumed to be "atomic", in the sense that, if a process A has already begun reading (resp. writing) a page, another process B may not begin writing (resp. reading) that same page until A is done performing its read (resp. write) operation. Multiple processes simultaneously reading the same page is, of course, a legitimate condition, as is having multiple processes simultaneously performing arbitrary operations on exclusively different pages (process A on page P, process B on page Q, process C on page R, etc.).
Is my interpretation correct?
Can I assume POSIX' read()
and write()
system calls are "atomic" in the sense described above? Can I rely on these system calls having some internal logic to determine whether a specfic read()
or write()
call should be temporarily blocked based on the position of the file descriptor and the specified size of the chunk to be read or written?
If the answer to the above questions is "No", how should I roll my own locking mechanism?
See Question&Answers more detail:
os 与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…