Time and Global State

date

Jun 20, 2021

slug

comp90020-tngs

status

Published

1. Clock Synchronization

1.1 Clocks

hardware clock

software clock = scaled factor * hardware clock + offset

skew

diff between the reading of any 2 clocks

drift

diff in the rate at which 2 clocks count time

clock drift rate

diff in rate between a perfect reference clock and a physical clock

A synchronous distributed system is one in which the following bounds are defined:

Time to execute step of a process has known lower and upper bounds
Each message transmitted is received within a known bounded time
Each process has a local clock whose drift rate has a known bound

1.2 Cristian's Method

Cristian's algorithm works between a process P, and a time server S connected to a time reference source. Put simply:

P requests the time from S
After receiving the request from P, S prepares a response and appends the time T from its own clock.
P then sets its time to be T + RTT/2

This method assumes that the RTT is split equally between request and response, which may not always be the case but is a reasonable assumption on a LAN connection.

Further accuracy can be gained by making multiple requests to S and using the response with the shortest RTT.

We can estimate the accuracy of the system as follows. Let min be the minimum time to transmit a message one-way. The earliest point at which S could have placed the time T, was min after P sent its request. Therefore, the time at S, when the message by P is received, is in the range (T + min) to (T + RTT - min). The width of this range is (RTT - 2*min). This gives an accuracy of (RTT/2 - min).

1.3 Berkeley Algorithm

Master + Slaves

Unlike Cristian's algorithm, the server process in the Berkeley algorithm, called the leader, periodically polls other follower processes. Generally speaking, the algorithm is:

A leader is chosen via an election process such as Chang and Roberts algorithm.

The leader polls the followers who reply with their time in a similar way to Cristian's algorithm.

The leader observes the round-trip time (RTT) of the messages and estimates the time of each follower and its own.

The leader then averages the clock times, ignoring any values it receives far outside the values of the others.

Instead of sending the updated current time back to the other process, the leader then sends out the amount (positive or negative) that each follower must adjust its clock. This avoids further uncertainty due to RTT at the follower processes.

With this method the average cancels out individual clock's tendencies to drift.

Computer systems normally avoid rewinding their clock when they receive a negative clock alteration from the leader. Doing so would break the property of monotonic time, which is a fundamental assumption in certain algorithms in the system itself or in programs such as make. A simple solution to this problem is to halt the clock for the duration specified by the leader, but this simplistic solution can also cause problems, although they are less severe. For minor corrections, most systems slow the clock (known as "clock slew"), applying the correction over a longer period of time.

Often, any client whose clock differs by a value outside of a given tolerance is disregarded when averaging the results. This prevents the overall system time from being drastically skewed due to one erroneous clock.

2. Logical Time

2.1 Happened-Before Relation

a, b, c 是三个事件, global happened-before定义如下:

If process p: , then .

如果存在进程p, a发生在b之前, 那么在全局上, a都发生在b之前.

For any message m: send(m) receive(m)

对于任一msg, 先有的send事件, 再有的receive事件

If and , then

is partial order

If two events and happen in different processes which do not exchange messages, then and are not ordered with respect that is neither nor holds

如果事件和在不同的进程上进行, 同时它们之间没有信息交换, 那么和之间不是partial order, 即, 和都不成立.

2.2 Lamport's Logical Clocks

每个process 都会有一个logical lock , 会被初始化成为0

LC1:

上的在每个属于的事件发生前都会增加1

LC2:

如果上的事件 send(m), 那么消息m包含了a的时间戳
当接收到消息m的时候,先将自己的然后再增加1, 此时的事件为receive(m)

Lamport's Logical Clocks obey Clock Condition

Clock Condition:

For any events a, b: if , then

From the definition of , the Clock Condition is satisfied if the following holds:

If are events in process and , then

satisified by LC1

If is the sending of a message by process and is the receipt of that message by process , then

satisfied by LC2

Notes: does not imply

2.3 Vector Clocks

每个process都有一个vector, 里面包含了所在system里面所有process(假设总数为N, 包括自己)的logical clock

Rules for updating clocks

VC1: set = 0 for
VC2: before timestamps an envent it sets
VC3: piggybacks on every message it sends
VC4: when receives in a message, it sets , for

and adds 1 before the next event to its own element using VC2

Comparison

Happened-Before:

Concurrent: if neither nor

2.4 Summary

happened-before relation is a partial order on events that reflects a flow of information between them

Lamport clocks are counters that are updated according to the happened-before relationship between events

Vector clocks are an improvement on Lamport clocks: two events are ordered by happened-before or are concurrent by comparing their vector timestamps

3. Global States

3.1 Definitions

History of process :

A prefix of a history:

A global hisotry of a system of N processes to :

State:

To capture messages in the communication channel, each process records sending or receipt of messages as part of their state

3.2 Cuts

A cut of the system's execution

Subset of its global history that is a uniion of prefixes of process histories

Frontier of a cut

The last event in each process

Consistent cut

For all events and : if and then

3.3 Global States

global state is a set of states of the individual processes:

The state of corresponds to the state of immediately after the last event before the cut.

consistent global state: a global state the corresponds to a consistent cut

3.4 Runs and Linearizations

The execution of a distributed system can be seen as a series of transitions between global states

Each transitions corresponds to one event in one precess

If two events occur simultaneously they myst be concurrent and we can place them in either order.

Run

Total ordering of all the events in a global history that is consistent with each process' local history ordering ()

Linearisation (consistent run)

Ordering of events in a global history that is consistent with the happened-before relation
Only pass thru consistent global states

Reachable State

A state S' is reachable from another S if there is a linearisation thru these states.

3.5 Snapshots

A consistent global state is recorded in a snapshot of an execution of a distributed algorithm

A snapshot consists of

a local snapshot of the state of each process
the channel state of messages in transit for each channel

Use of Snapshots

Restarting after a failure
Off-line determination of stable properties, which remain true as soon as they have become true such as deadlocks, garbage objects
Debugging

💡

FYI:

The execution of a distributed system evolves thru consistent global states

The global state changes whenever an event happens

Process sends msg
Process receives msg
Process takes a step

Moving from (consistent global) state to (consistent global) state obeys happened-before relation

3.6 Chandy-Lamport Snapshot Algorithm

Uses a control message called a marker

Maker separates messages in the channels

A process records its state, then it sends a marker along all of its outgoing channels before sending out any more messages
A marker separates the messages in the channel into those to be included in the snapshot from those not to be recorded in the snapshot
A process must record its state at once when it receives a marker on any of its incoming channels

Assumption

Neither channels nor processes fail, i.e., communication is reliable
Channels are unidirectional and messages arrive in order (FIFO)
There is a path between any two processes, i.e., strongly connected network
Any process can initiate the snapshot algorithm
The processes can continue to work while the snapshot takes place
Processes can distinguish basic messages of the underlying distributed algorithm and control messages of the snapshot algorithm.

Pseudocode of Chandy-Lamport Snapshot Algorithm

Illustration of Chandy-Lamport Snapshot Algorithm