Concurrency Control

date
Jun 20, 2021
slug
comp90020-cc
status
Published
tags
Programming
COMP90020
summary
type
Page
Year
 

What is Transaction

  • A transaction is a sequence of events that are performed as one indivisible unit
  • A transaction either commits ( all of its events are performed at once ) or aborts ( none of its events take effect ) .
 

ACID property

  • Atomicity : either all events of a transaction (Tx) take effect or none of them ( all or nothing).
  • Consistency : if a Tx commits, the result is a valid configuration of the system. In other words, each successful Tx commits only legal results.
  • Isolation: Events within a Tx must be hidden from other Txs running concurrently
  • Durability : Once a Tx has committed, the system must guarantee that this Tx survives any subsequent malfunctions.
  • Several challenge in guaranteeing the ACID property
    • Failures of disks, servers and communication in both clients and / or server
    • Concurrency control: a server deals with multiple Txs ( from different clients ) at the same time
    • Transactions may be distributed across multiple servers
    •  

Consistency model

一致性模型. 从根结点开始, 左半部分是属于ACID中的Isolation, 右半部分属于CAP中的Consistency
一致性模型. 从根结点开始, 左半部分是属于ACID中的Isolation, 右半部分属于CAP中的Consistency
  • Unavailable: 当出现⽹络隔离等问题的时候,为了保证数据的⼀致性,不提供服务。
  • Sticky Available: 即使⼀些节点出现问题,在⼀些还没出现故障的节点,仍然 保证可⽤,但需要保证 client 的操作是⼀致的。
  • Highly Available: 就是⽹络全挂掉,在没有出现问题的节点上⾯,仍然可⽤。
粉色框中的模型属于Unavailable, 黄色框中的模型属于Sticky Available, 蓝色框中的属于Highly Available
 

Consistency phenomena

  • Lost update
    • (1) Tx T transfers from a to b, Tx U transfers from c to b, each wants to increase the b’s balance by 10 % . Initially b has a balance of 200. (2) Should be 200 → 220 → 242. However, in this case b ends u p with balance of 220.
      (1) Tx T transfers from a to b, Tx U transfers from c to b, each wants to increase the b’s balance by 10 % . Initially b has a balance of 200. (2) Should be 200 → 220 → 242. However, in this case b ends u p with balance of 220.
  • Inconsistent retrieval
    • (1) Both accounts a, b has a balance of 200 (2) Tx V moves 100 from a to b ; Tx W calculates the sum of two accounts (3) Wrong result ( 300 instead of 400 ) .
      (1) Both accounts a, b has a balance of 200 (2) Tx V moves 100 from a to b ; Tx W calculates the sum of two accounts (3) Wrong result ( 300 instead of 400 ) .
  • Dirty read
    • Dirty Read 出现在⼀个事务读取到了另⼀个还未提交事务的修改数据
      Dirty Read 出现在⼀个事务读取到了另⼀个还未提交事务的修改数据
 

Methods for Concurrency Control

 

Serial equivalence

  • Serial equivalence could allow concurrency but guaranteeing no conflicts
  • What to do:
    • At commit point of a Tx T, check for serial equivalence with all other Txs ( with which T has overlap ) .
    • If not serially equivalent: Abort T && Roll back ( undo ) any write that T did to server objects
  • Conflict operations
    • Two operations are said to be conflicting operations if their combined effect depends on the order they are executed
      • read(x) and write(x)
      • write(x) and read(x)
      • write(x) and write(x)
      • 上述三种情况等于: (read(x) read(x) ) (read/write on different objects)
  • Checking for serial equivalence
    • 简而言之, 就是Txs之间执行conflict operation的顺序要始终保持一致
    • e.g. 前面lost update的例子中, 我们可以知道执行conflict operation的顺序没有始终保持一致, 黄色的conflict operation pair是先Tx T 再 Tx U , 而其余两个则是相反, 因此这个不是serial equivalence, 无法满足我们的no conflict需求.
      • write-write(U,T) should be in blue.
        write-write(U,T) should be in blue.
    • 下图是对应的serially equivalent version
      • notion image
 

Pessimistic (Two-phase locking & Read-Write lock)

  • prevent transactions from accessing the same object (e. g. locking)
  • The following contents only consider exclusive clock and not shared lock. You could find more about two phase locking model with shared lock here.
  • Two-phase locking
    • A Tx has two phases
      • Growing phase: only acquire locks
      • Shrinking phase: only releases locks
    • A Tx cannot acquire any locks after it has started releasing
  • Two-phase locking guarantees serial equivalence
    • notion image
  • String two-phase locking: release locks only at commit point (commit or abort)
    • String two-phase locking prevents dirty reads and premature writes
  • 但是, 悲观锁降低了系统的并发量, 而现实中, 有些系统中大部分是读操作 (如浏览网页). 这时我们引入 read-write lock
  • Read-Write locks
    • Each object has a lock that can be held in one of two modes
      • Read mode: multiple Txs allowed in
      • Write mode: exclusive lock
    • 读与写, 写与写 之间是incompatible的
      • X 表明 incompatibility, 也就是说第一种锁(左列)会阻塞第二种锁(顶行)获取同一对象
        X 表明 incompatibility, 也就是说第一种锁(左列)会阻塞第二种锁(顶行)获取同一对象
    • RW锁有三种
      • Read-preferring RW locks. 写锁只有在没有读锁占用的时候才能够获取锁. 这种情况允许最大量的并发量. 但是由于read-read lock是compatibility的, 很有可能会持续有新的reader来占用这个object, 最终导致 write-starvation的情况出现
      • Write-preferring RW locks. 读锁只有在写锁已经释放对象并且排队队列中没有写锁的时候才能获取锁. 这种锁可以避免write-starvation, 但是会相应地减少并发量
      • Unspecified priority RW locks. 读写平等
    • 你可以从这里这里获取更多关于Java锁的知识
  • Deadlock - the downside of locking
    • 死锁发生必须满足的条件 (general)
      • 互斥使用(资源独占):一个资源每次只能给一个进程使用
      • 占有且等待(请求和保持,部分分配):进程在申请新的资源的同时保持对原有资源的占有
      • 不可抢占(不可剥夺):资源申请者不能强行的从资源占有者手中夺取资源,资源只能由占有者自愿释放
      • 循环等待:存在一个进程等待队列 ,其中等待占有的资源,等待占有的资源,…,等待占有的资源,形成一个进程等待环路。
    • Three necessary conditions for a deadline to occur (from slides)
      • Some objects are accessed in exclusive lock modes
      • Transactions holding locks cannot be preempted
      • There is a circular wait ( cycle ) in the Wait-for graph
    • 死锁预防
      • 破坏任一上述死锁形成的条件
      • 如: 引入timeout, abort Tx if lock cannot be acquired within timeout; 允许Tx对锁进行抢占, 等
    • 死锁避免
      • 当系统每次分配资源给进程(transaction)时候, 对剩余资源进行计算, 如果新的进程要求的资源大于剩余资源, 则不分配
    • 死锁检测
      • keep track of Wait-for graph → 寻找是否有cycle的存在 → 如果存在则随机终止一个transaction来破环
 

Optimistic (Timestamp ordering)

  • allow transactions to write, check later, if conflicts arise, abort Tx(s) and roll back update
    • 例子: CAS, 你可以从这里阅读更多关于CAS存在的问题 (ABA问题, 循环时间长开销大, 只能保证一个共享变量的原子操作)
  • Abort可能会导致其他的Tx发生dirty read, 这样这些Tx也需要被abort, 同样地, 与这些Tx相关的Tx也因为可能发生脏读需要被abort
    • 这种情况被称为Cascading aborts.
  • Timestamp ordering
    • Algorithm of timestamp ordering.
      Algorithm of timestamp ordering.
      Timestamp ordering [点击►展开查看中译版]
      • 所有的Tx组成一个序列, 每个Tx都有一个时间戳 , 用来标识Tx在序列中的位置
      • 每个对象x都有读和写两个时间戳, 当有Tx读或者写的时候就会更新相应的时间戳
        • RTS(x): x记录到的最大的Tx的时间戳
        • WTS(x): x记录到的最大的Tx的时间戳
      • 如果 想要读x, 那么
        • 如果 TS(i) WTS(x), 允许读, 然后把RTS(x)更新为RTS(x)和TS(i)中的较大者.
        • 否则, 不允许读, 同时终止
      • 如果想要写, 那么
        • 如果 TS(i) WTS(x)和RTS(x)中的较大者, 则允许写, 同时把WTS(x)更新为WTS(x)和TS(i)中的较大者
        • 否则, 不允许写, 同时终止
Example of timestamp ordering
Example of timestamp ordering
 

Kung-Robinson method

  • Another general approach to concurrency control is to maintain multiple versions of the objects during the execution of Txs. Kung-Robinson method is one of Multi-version approach
  • Strategy
    • The model has three phases for each transaction:
      notion image
  • Validation
    • notion image
      你可以点击这里查看更为详细且带有图例说明的validation
 

Nested Transactions

notion image
  • Properties
    • ACID also apply to subTx: subTxs appear to be atomic to their parents.
    • Each subTx can fail independently of its parent and of other subTxs.
    • Rules for committing nested Txs
      • A Tx may commit or abort after its child Txs have completed
      • When a subTx completes, it commit provisionally commit or abort. Decision to abort is final.
      • When a parent aborts, all its subTxs are aborted.
      • When a subTx aborts, the parent can decide whether to abort or not.
  • Advantages
    • SubTxs at the same level ( and their descendants ) may run concurrently.
      • Example: balanceTotal can invoke getBalance on different accounts as subTxs.
    • SubTxs can commit or abort independently. This provides flexibility and robust depending on the application.
      • Deliver a mail to a mailing list, each subTx delivers the mail to one recipient. If one or more fail, the parent could still record the fact and commit.
 

Strict executions of transactions

Generally, it is required that transactions delay both their read and write operations so as to avoid both dirty reads and premature writes. The executions of transactions are called strict if the service delays both read and write operations on an object until all transactions that previously wrote that object have either committed or aborted. The strict execution of transactions enforces the desired property of isolation. [Coulouris pp.690]

© wongchihaul 2021 - 2024