论文标题
一种自适应方法,用于可回收的相互膨胀
An Adaptive Approach to Recoverable Mutual Exlcusion
论文作者
论文摘要
相互排除(ME)是处理并发系统中冲突的最常用技术之一。传统上,相互排斥算法的设计是在以下假设的那样,即在获取/释放锁或执行其关键部分时不会失败。但是,在现实生活中确实发生了故障,可能使锁处于不一致的状态。这引起了\ emph {可回收相互排除(RME)}的问题,该问题涉及设计一种可以忍受失败的相互排除算法,同时保持安全性和livesice性能。 任何ME算法(包括RME算法)的性能的重要度量之一是\ emph {远程内存引用(rmrs)}的数量(用于获取和释放锁以及失败后的锁定结构)。最著名的RME算法解决了$ n $过程中的$ n $进程的问题,该问题由$ \ Mathcal {O}给出,无论系统中的故障数量如何,$ \ MATHCAL {O}(\ frac {\ frac {\ log n} {\ log n} {\ log n} {\ log n})$。 在这项工作中,我们提出了一种用于解决RME问题的新算法,其RMR复杂性逐渐\ emph {appapts}对系统“最近”中发生的故障数量。在没有故障的情况下,我们的算法仅生成$ \ MATHCAL {O}(1)$ rmrs。此外,其RMR复杂性由$ \ Mathcal {o}(\ min \ {\ sqrt {f},\ frac {\ log n} {\ log n} {\ log log \ log \ log n} \})$在过去的“最新”中,$ f $是$ f $的损失总数。除了读取和写入说明外,我们的算法还使用比较swap(\ cas {})和获取和存储(\ fas {})硬件指令,这两种说明通常在大多数现代处理器中可用。
Mutual exclusion (ME) is one of the most commonly used techniques to handle conflicts in concurrent systems. Traditionally, mutual exclusion algorithms have been designed under the assumption that a process does not fail while acquiring/releasing a lock or while executing its critical section. However, failures do occur in real life, potentially leaving the lock in an inconsistent state. This gives rise to the problem of \emph{recoverable mutual exclusion (RME)} that involves designing a mutual exclusion algorithm that can tolerate failures, while maintaining safety and liveness properties. One of the important measures of performance of any ME algorithm, including an RME algorithm, is the number of \emph{remote memory references (RMRs)} made by a process (for acquiring and releasing a lock as well as recovering the lock structure after a failure). The best known RME algorithm solves the problem for $n$ processes in sub-logarithmic number of RMRs, given by $\mathcal{O}(\frac{\log n}{\log \log n})$, irrespective of the number of failures in the system. In this work, we present a new algorithm for solving the RME problem whose RMR complexity gradually \emph{adapts} to the number of failures that have occurred in the system "recently". In the absence of failures, our algorithm generates only $\mathcal{O}(1)$ RMRs. Furthermore, its RMR complexity is given by $\mathcal{O}(\min\{ \sqrt{F}, \frac{\log n}{\log \log n} \})$ where $F$ is the total number of failures in the "recent" past. In addition to read and write instructions, our algorithm uses compare-and-swap (\CAS{}) and fetch-and-store (\FAS{}) hardware instructions, both of which are commonly available in most modern processors.
