Both specifications and implementations can be represented by means of models of concurrency such as labelled transition systems (LTSs) or process graphs. I present the definitions below.
Another way to represent concurrent systems is by means of process algebraic expressions.
These concepts will be illustrated by means of an example (see below). It shows how an implementation could be given as a parallel composition of components, enforcing synchronisation on certain actions.
When both specifications and implementations are given as LTSs, we need a criterion to determine if they match. I will discuss a criterion based on bisimulation, that matches the states of the related LTSs. I show why it is useful here to abstract from internal actions of the implementation.
When the specification and implementation are given as process expressions, to decide whether they match, one considers the meaning, or semantics of these expressions. Often, this meaning is given by mapping them to LTSs.
J.C.M. Baeten & W.P. Weijland (1990): Process Algebra, Cambridge University Press
or
W.J. Fokkink (2000): Introduction to Process Algebra, Texts in Theoretical Computer Science, An EATCS Series, Springer.
The syntax of ACP, the Algebra of Communicating Processes,
features the operations
Here are the definitions of these operators, in terms of process
graphs extended with a predicate ✔ that signals successful termination.
There are two runners R1 and R1
participating in the relay race, and while running they carry a baton.
Runner 1 starts running, and at the halfway-point passes the baton to
runner 2, who is waiting there. Then runner 2 runs the rest of the
race and delivers the baton at the finish.
In this course I will present a lattice of semantic equivalence
relations on process graphs. These represent different points of view
on whether two processes are roughly the same. In such a lattice,
equivalences are classified with respect to their discriminating power.
Two processes p and q are partial trace
equivalent, notation P =PT Q, if they have the
same partial traces. Here a partial trace of a process p
is the sequence of transition labels encountered on a path in the
process graph of p, starting from the initial state. Here a path can
be defined as an alternating sequence of states and transitions, where
each transition goes from the state before it to the state after it.
Processes p and q are completed trace equivalent,
notation P =CT Q, if moreover they have the same
complete traces. Here a completed trace is a state that stems from a
maximal path; one that cannot be prolonged at the end.
A completed trace of p is the sequence of labels of the
transitions in a complete path. We write CTfin(p) for the set of
finite completed traces of process p, and CT∞(p)
for the set of all finite completed traces of p together with
its infinite traces. When writing just CT(p) it depends on
context which of the two is meant (i.e. this differs from one paper to
the next). I this class I defined CT(p) to mean CT∞(p).
Partial traces are defined likewise, but can end anywhere.
We write PTfin(p) for the set of
finite partial traces of process p,
and PT∞(p)
for the set of all finite partial traces of p together with
its infinite traces. Normally, PT(p) means PTfin(p).
Now processes p and q are infinitary completed trace equivalent,
notation P =CT∞ Q,
if CT∞(p) = CT∞(q).
It is a theorem that this implies that also PT∞(p) = PT∞(q).
Processes p and q are finitary completed trace equivalent,
notation P =CTfin Q,
if CTfin(p) = CTfin(q) and PTfin(p) = PTfin(q).
Here the last requirement cannot be skipped.
Likewise, processes p and q are infinitary partial trace equivalent,
notation P =PT∞ Q,
if PT∞(p) = PT∞(q).
They are finitary partial trace equivalent,
notation P =PT Q,
if PT(P) = PT(Q).
Together, these four equivalence notion form a square when ordered w.r.t. discriminating power.
The atomic actions of ACP consists of
all a, b, c, etc. from a given set A
of visible actions, and one special action τ, that is meant to be
internal and invisible to the outside world. For each application, a
partial communication function γ: AxA -> A is
chosen that tells for each two visible actions a and b whether
they synchronise (namely if γ is defined), and if so, what is
result of their synchronisation: γ(a,b).
The communication function is required to be commutative and associative.
The invisible action cannot take part in synchronisations.
It is homework to write down a formal definition.
root is labelled with ✔ if
either IG or IH is.
Intuitively, G||H allows all possible interleavings of actions
from G and actions from H. In addition, it enables
actions to synchronise with their communication partners.
It is used to remove the remnants of unsuccessful communication, so
that the synchronisation that is enabled by parallel composition, is enforced.
The semantics presented above is of the denotational kind.
Here "denotational" entails that each constant denotes a process graph (up to isomorphism)
and each ACP operator denotes an operation on process graphs
(creating a new graph out of one or two argument graphs).
Here is the example of the relay race presented in class.
We can describe the behaviour of the two runners by the ACP expressions:
give is the finish of runner 1 at the halfway-point,
where R1 gives the baton to R2,
receive is the action of runner 2 to accept the baton from runner 1 and start running,
and finish is the action of runner 2 of arriving at the finish.
An ACP expression describing the whole interaction is
Lecture 2
Tuesday 2-6-2020: Spectrum of semantic equivalences
I will discuss the concept of an equivalence relation, and recall
that any equivalence relation on a set D gives rise to a partitioning
of D into equivalence classes (see Homework 1). Now one equivalence
relation ≡ is called finer (or more discriminating)
than another equivalence relation ~ if each ≡-equivalence class
is included in a ~-equivalence class, i.e. when p≡q
implies p~q for all p,q in D. Conversely, ~ is
called coarser than ≡.
As part of this lattice of semantic equivalences I present 8=2x2x2
versions of trace semantics, all defined on process graphs
G=(S,T,I) that are not equipped with a termination
predicate (=final states).
For process graphs without a termination predicate, a path of a
process p is an alternating sequence of states and transitions,
starting from the state p and either infinite or ending in a
state, such that each transition goes from the state before it to the
state after it. It is complete if it is either infinite or ends
in a state from which no further transitions are possible.
[Later in the course we will encounter alternative definitions of completeness.]
One of the things we learned from the example of the relay race
is that we need an equivalence relation on processes that abstracts
from the internal action τ.
To formally prove that the implementation matches this specification,
we need a semantic equivalence that makes start.τ.finish equivalent with
start.finish. Such equivalences are called weak
or abstract. In contrast, an equivalence that treats τ
like any other action is strong.
The definitions of partial and completed trace equivalence we encountered above can all be regarded as definitions of strong partial trace equivalence and strong partial completed trace equivalence, as long as we treat τ like any other action. To arrive at definitions of weak partial trace equivalence (=WPT or =WPT∞), we define a weak partial trace of a process p as a strong partial trace of p from which all the occurrences of τ have been omitted. Now two processes are called (finitary or infinitary) weak partial trace equivalent if they have the same (finite or possibly infinite) weak partial traces. Weak completed trace equivalence, and its infinitary version, can be defined in the same vein.
In addition, two processes are called language equivalent if they have the same traces leading to terminating states. In automata theory such traces are called words, accepted by a process p. The set of all words accepted by p is the language accepted by p. It turns out that language equivalence is coarser than partial trace equivalence, whereas completed trace equivalence is finer than partial trace equivalence.
My paper The Linear Time - Branching Time Spectrum I is a small encyclopedia of (strong, or concrete) equivalence relations; language equivalence appears in Section 19.
I discuss why one would want to distinguish the processes b + c and τ.b+τ.c. The reason is that in a context of an environment that admits synchronisation on b, but not on c, the first process always performs the (weak) trace b, whereas the second has the possibility to reach a state of deadlock after doing the internal step τ. Deadlock is a (bad) state of a system, in which no further actions are possible, and the system does not terminate successfully.
For the same reason, one would want to distinguish the systems Q := τ.(b + c) and P := τ.b+τ.c. However, these two systems are equivalent under each of the 8 variants of trace equivalence discussed above. For this reason we need equivalence relations on process graphs or labelled transition systems that are finer than partial or completed trace equivalence, in the sense that they make more distinctions.
The definition of bisimulation equivalence, also called bisimilarity, can be found in this handout, which you should read entirely, except for the sections "Modal logic", "Non-well-founded sets" and "Concurrency". In that handout, LTSs and process graphs have an extra component ⊨ that is used to tell whether certain predicates, taken from a set P, hold in certain states of our processes. In this course, we typically consider only one predicate, namely ✔, telling that successful termination is possible in a state. For this reason, a quadruple (S,I,→,⊨) is presented as (S,I,→,✔). More commonly, one ignores the component ⊨ or ✔ completely.
On the set of all process graphs we can study multiple equivalence relations and order them w.r.t. their discriminating power. I drew a spectrum of equivalence relations, with finer ones located above coarser ones.
In this spectrum I located partial trace equivalence, completed trace equivalence and bisimulation equivalence, and treated examples showing their differences. Bisimulation equivalence is at the top of the spectrum and partial trace equivalence near the bottom. Completed trace equivalence is somewhere in between. Bisimulation equivalence is called a branching time equivalence, because it takes into account at which point two different traces branch off. Partial and completed trace equivalence on the other hand are linear time equivalences, as they only focus on the linear executions of a process, without taking branching into account.
The spectrum of weak trace equivalences is the same as in the strong case: weak completed trace equivalence is finer (more discriminating) than weak partial trace equivalence. A weak version of bisimilarity is even finer.
Defining a weak form of bisimilarity is relatively tricky. As we saw by example, just contracting all τ-moves in a labelled transition system doesn't yield what we want to get. There are several, subtly different, approaches in the literature. The most prominent ones are weak bisimilarity and branching bisimilarity, both defined in the above-linked handout.
A colouring of a labelled transition system is just a function that associates a colour with every state. (Thus a colouring induces an equivalence relation on states, and any equivalence relation induces a colouring.) Now given a colouring, the potentials of a state s of a certain colour, are all pairs (a,C) with a an action and C a colour, such that s can do some inert transitions followed by an non-inert a-transition to a state with colour C. Here a transition is inert if it is labelled τ and moreover goes between two states of the same colour. Thus if (a,C) is a potential of a state s, then either a≠τ or C is not the colour of s. The colouring is consistent if all states of the same colour also have the same potentials. Now one can show that two states are branching bisimulation equivalent iff there exists a consistent colouring that gives them the same colour.
Strong bisimilarity can be characterised in the same way; the only difference is that no actions count as inert. Thus a potential (a,C) simply means one can do an a-transition to a state with colour C.
An alternative approach is to introduce variables X,Y,Z to the language, as well as recursion constructs <X|S>. Here S is a set of recursive equations X = PX, and <X|S> specifies "the process X that satisfies the equations S". We normally restrict attention to the guarded case, in which there is only one process X satisfying S.
Here are the definitions, in terms of process graphs without predicates and without a notion of final states, of the operators mentioned above:
The main differences between ACP and CCS are:
In the operational framework we drop the distinction between processes and states. There will be one humongous LTS (without initial state) whose states are all possible closed process expressions, and the transitions between states are given by certain operational rules. To get an idea of the difference between process graphs and LTSs, read the definitions in the bisimulation handout, and especially note how Definition 3 of bisimulation avoids having the clause on initial states.
Today I introduce the approach of structural operational semantics (SOS), by giving the structural operational semantics of CCS. Read the first section of this handout. In SOS, the outgoing transitions of a composed process will be determined out of the outgoing transitions of its argument processes without at that time considering transitions occurring further on. What is called $\textbf{fix}_XS$ in the handout is called <X|S> in class. The rule for recursion involves substitution.
To formulate a canonical safety property, assume that our alphabet of visible actions contains one specific action $b$, whose occurrence is bad. The canonical safety property now says that $b$ will never happen.
Definition: A process $p$ satisfies the canonical safety property, notation $P \models \textit{safety}(b)$, if no trace of $P$ contains the action $b$.
In this definition it makes no difference whether "trace" refers to a partial or a complete trace, whether infinite traces are taken into account or not, or whether we use strong or weak traces. I often use words like "trace" instead of "possibly infinite strong partial trace" when the choice of adjectives is irrelevant.
To arrive at a general concept of safety property for labelled transition systems, assume that some notion of bad is defined. Now, to judge whether a process $P$ satisfies this safety property, one should judge whether $P$ can reach a state in which one would say that something bad had happened. But all observable behaviour of p that is recorded in a process graph until one comes to such a verdict, is the sequence of visible actions performed until that point. Thus the safety property is completely determined by the set of sequences of visible actions that, when performed by $P$, lead to such a judgement. Therefore one can just as well define the concept of a safety property in terms of such a set.
Definition: A safety property of processes in an LTS is given by a set $B \subseteq A^*$. A process $P$ satisfies this safety property, notation $P \models \textit{safety}(B)$, when $\textit{WPT}(P) \cap B = \emptyset$.
The following theorem says that weak partial trace equivalence preserves safety properties.
Theorem: If $P =_{\textit{WPT}} Q$ and $P \models \textit{safety}(B)$ for some $B \subseteq A^*$, then also $Q \models \textit{safety}(B)$.
Thus, if two processes have the same weak partial (finite) traces then they satisfy the same safety properties. The reserve also holds: if two processes satisfy the same safety properties, then they have the same weak partial traces. In class I sketched a proof of these (obvious) statements.
A liveness property says that something good will happen eventually. The terminology "liveness" used here stems from Lamport 1977, who was inspired by the older use of the term "liveness" in Petri nets, whose meaning however is different. The following concepts are important when formulating (or proving) liveness properties.
WPTfin | |||
P | normal progress | a.g.0 | $\{ag,a,\epsilon\}$ |
Q | fair divergence | a.Δg.0 | $\{ag,a,\epsilon\}$ |
R | livelock | a.(g.0+ τ.Δ0) | $\{ag,a,\epsilon\}$ |
S | deadlock | a.(g.0 + τ.0) | $\{ag,a,\epsilon\}$ |
In the above table, the unary operator $\Delta$ appends a $\tau$-loop at the beginning of its argument. Formally, $\Delta G$ is the process graph created out of $G$ by adding a fresh initial state $I$, together with a fresh $\tau$-loop I --$\tau$--> I, and a transition I --$\alpha$--> s for each transition IG --$\alpha$--> s from $G$. Process graphs of the 4 processes in the above table are drawn in class. If we assume fairness, "normal progress" may be identified with "fair divergence", but in the absence of a fairness assumption only "normal progress" satisfies the liveness property that g will happen eventually. In weak partial trace semantics, deadlock, livelock, divergence and normal progress are all identified. The weak partial traces of all four processes are the same. This week it is homework to invent definitions of completed trace equivalence that make different choices in this matter.
In CSP, as well as in David Walker's version of weak bisimulation semantics, livelock and divergence are identified, and distinguished from deadlock and normal progress, thereby implicitly assuming choices to be unfair (a safe worst case scenario). In weak and branching bisimulation semantics as used in CCS and ACP, choices are assumed to be fair, and fair divergence is identified with normal progress, and distinguished from livelock. Moreover, there is no distinction between deadlock and livelock. This helps in the verification of communication protocols, where you try to send a message as long as needed until it succeeds and assume it will eventually arrive correctly.
Divergence-preserving branching bisimilarity ($BB^\Delta$) is the finest useful weak equivalence found in the literature. It distinguishes all four modes of progress: deadlock, livelock, possibly fair divergence and normal progress. It can be defined in terms of consistent colourings:
A colouring of a labelled transition system is just a function that associates a colour with every state. (Thus a colouring induces an equivalence relation on states, and any equivalence relation induces a colouring.) Now given a colouring, the potentials of a state s of a certain colour, are all pairs (a,C) with a an action and C a colour, such that s can do some inert transitions followed by an non-inert a-transition to a state with colour C. Here a transition is inert if it is labelled τ and moreover goes between two states of the same colour. Thus if (a,C) is a potential of a state s, then either a≠τ or C is not the colour of s. The colouring is consistent if all states of the same colour also have the same potentials. Now one can show that two states are branching bisimulation equivalent iff there exists a consistent colouring that gives them the same colour. (Until here repeated from last week.)
State s is internally divergent w.r.t. a colouring if there is an infinite path of τ-transitions starting from s, only passing through states with the same colour as s. The colouring preserves internal divergence if for any colour either all states of that colour are divergent w.r.t. the colouring, or none are. Now two states are divergence-preserving branching bisimulation equivalent if they receive the same colour by some consistent colouring that preserves internal divergence.
State s is divergent if there is an infinite path of τ-transitions starting from s. A colouring preserves divergence if for any colour either all states of that colour are divergent, or none are. Now two states are weakly divergence preserving branching bisimilar if they receive the same colour by some consistent colouring that preserves divergence. In class I sketch a proof that this notion is weaker than (or possibly equal to) divergence preserving branching bisimilarity, in the sense that when two processes are divergence preserving branching bisimilar, then they are surely weakly divergence preserving branching bisimilar.
Divergence-preserving weak bisimilarity ($WB^\Delta$) is defined much simpler: A divergence-preserving weak bisimulation is one that relates divergent states to divergent states only. Here a state is divergent if it is the start of an infinite path, all of whose transitions are labelled τ.
An equivalence ~ is a congruence for a language L if P ~ Q implies that C[P] ~ C[Q] for every context C[ ]. Here a context C[ ] is an L-expression with a hole in it, and C[P] is the result of plugging in P for the hole.
It is not hard to show that (strong) partial trace equivalence is a congruence for encapsulation. This follows because the partial traces of ∂H(P) are completely determined by the partial traces of the process P itself (namely by removing all partial traces with actions from H). The same argument can be used to see that partial trace equivalence is a congruence for all operators of CCS and ACP.
Partial trace equivalence is inadequate to deal with deadlock. Adding deadlock information to partial trace semantics yields completed trace semantics. However, completed trace equivalence is not a congruence for all operators. It fails for the restriction operator of CCS, which is the same as the encapsulation operator of ACP. It also fails for the parallel composition of CSP. A counterexample: the processes $a.b + a.c$ and $a.(b+c)$ are completed trace equivalent. However, after placing them in a context $\partial_H(\_\!\_)$ with $H=\{c\}$, they are not. For $\partial_H(a.b + a.c)$ has a completed trace $a$, yet $\partial_H(a.(b+c))$ does not. This operator corresponds to placing the processes in an environment that blocks, or refuses to synchronise on, the action $c$. Strong bisimulation equivalence on the other hand is a congruence for all operators of ACP, CCS and CSP. This is a reason to use bisimulation rather then partial or completed trace equivalence.
The congruence closure ~c of ~ (w.r.t. a language L) is defined by
P ~c Q if and only if C[P] ~ C[Q] for any L-context C[ ].
It can be shown that ~c is the coarsest congruence finer than ~. (Note that there are three statements in that last sentence.)The congruence closure of strong completed trace equivalence can be characterised as strong failures equivalence (not defined yet). It is finer than strong completed trace equivalence, but coarser than strong bisimilarity. A similar thing applies to weak equivalences.
Expansion Theorem for CCS: Let $P := \sum_{i\in I}\alpha_i.P_i$
and $Q := \sum_{j\in J}\beta_j.Q_j$. Then
$$P | Q = \sum_{i\in I} \alpha_i (P_i | Q) + \sum_{j\in J} \beta_j (P | Q_j) +
\sum_{\begin{array}[t]{c}i\in I, j \in J \\ \alpha_i =
\bar\beta_j\end{array}}\tau.(P_i | Q_j).$$
By means of this theorem, and a similar treatment of the restriction
and renaming operators of CCS,
any CCS expression $P$ (with only guarded recursion) can be rewritten into
a bisimulation equivalent CCS
expression of the form $\sum_{i\in I}\alpha_i.P_i$. Such an
expression is called a head normal form.
In fact, this can be done in such a way, that the $P_i$ are exactly
those processes with $P \stackrel{\alpha_i}{\Longrightarrow} P_i$.
If there is no recursion at all, this process can be iterated, and any
recursion-free CCS expression
can be rewritten in to a Basic CCS expression: one built solely
from 0, $+$ and action prefixing.
The above complete equational axiomatisation of (strong)
bisimulation equivalence for
basic CCS now extends to a complete equational axiomatisation of
all of recursion-free CCS.
When using equational reasoning to verify that a specification is
equivalent to an implementation we use, in addition to the machinery
of equational logic, the recursive definition principle (RDP)
and the recursive specification principle (RSP).
RDP says that a system satisfies its recursive definition; RSP says
that two processes satisfying the same guarded recursive
equation, must be equal. It holds for strong bisimilarity, and for strong
finitary partial trace equivalence.
$$\begin{array}{ccc}
\langle X|S \rangle = \langle S_X|S \rangle && \mbox{(RDP)} \\
S \Rightarrow X = \langle X|S \rangle & \mbox{provided $S$ is guarded} & \mbox{(RSP)} \\
\end{array}$$
In the above formalisations of RDP and RSP, $S$ is set of equations
$\{X = S_X \mid X \in V_S\}$, a recursive specification.
$\langle X|S \rangle$ is a syntactic construct, denoting "the process
$X$ as specified by the equations $S$". Moreover, $\langle S_X|S
\rangle$ is a shorthand notation. It denotes the term $S_X$, that is,
the right-hand side of the equation for $X$ in the recursive
specification $S$, in which the process $\langle Y|S \rangle$ is
substituted for each free occurrence of a variable $Y \in V_S$.
Now RDP ought to be clear. To interpret RSP, it is important that
axioms with free variables allow us to substitute any processes for
those free variables. So if we substitute $P$ for $X$, and other
processes for the other variables, then the antecedent of the
implications say that $P$ is the $X$-component of a solution of the
equations $S$. Now RSP says that in such a case $P$ must equal the
canonical solution $\langle X|S \rangle $ of $S$. So if we have two
processes $P$ and $Q$ that are both $X$-components of solutions of
$S$, they must both equal $\langle X|S \rangle$, and hence each other.
It turns out that RDP and RSP, together with the basic axioms listed
above, and the axioms that enable head normalisation, together form a
complete axiomatisation for guarded processes. This means that any
true equation between guarded processes can be derived from these axioms.
Likewise, in CCS, weak branching bisimulation equivalence $=_{WB}$ is
replaced by weak bisimulation congruence $=_{WB}^c$
It is defined as the coarsest congruence contained in $=_{WB}$,
obtained as the congruence closure of $=_{WB}$ (see above).
It turns out that $P =_{WB}^c Q$ iff for
all processes $R$ we have $P+R =_{WB} Q+R$.
Thus it suffices to take the congruence closure w.r.t. the +-operator;
the result then turns out to be a congruence for the other operators
as well.
In fact, the same can be done for all other equivalences we
encountered (such as weak and strong completed trace equivalence)
and in class I indicate the resulting congruence closures
on the linear time - branching time spectrum, our map of equivalences,
sorted by the "finer than"-relation.
Two processes P and Q are rooted branching bisimulation (RBB) equivalent iff
RBB equivalence equals branching bisimulation congruence; it is finer,
i.e. more discriminating, than BB equivalence,
and it is a congruence for all operators of ACP, CCS and CSP.
Similarly, I present another characterisation of $=_{WB}^c$, and
illustrate it in class with examples:
Two processes P and Q are RWB equivalent iff
RWB equivalence equals weak bisimulation congruence; it is finer,
i.e. more discriminating, than WB equivalence,
and it is a congruence for all operators of ACP, CCS and CSP.
These above two rooted equivalences can be given a uniform treatment.
Define a rooted weak or branching bisimulation to be one that
relates roots (= initial states) with roots only.
Now two processes are RBB (resp. RWB) equivalent iff
after applying
an operation (root-unwinding) that preserves strong bisimilarity and
removes any loops going through the initial states they are
related by a such a rooted weak or branching bisimulation.
Delay bisimilarity and η-bisimilarity are semantic equivalences
intermediate between weak and branching bisimilarity, defined in the handouts. Rooted
delay bisimilarity is axiomatised by (T1) and (T2); rooted
η-bisimilarity by (T1) and (T3). Rooted branching bisimilarity is
completely axiomatised by
$$\alpha.(\tau.(P+Q)+Q) = \alpha.(P+Q)$$
on top the of axioms for strong bisimilarity. This axiom is a
strengthening of (T1). There does not exists a congruence that is
axiomatised by (T1) plus the axioms of strong bisimilarity alone.
In this direction, rooted branching bisimilarity is at good as it gets.
The recursive specification $X = \tau.X$ has a unique solution up to
strong bisimilarity, but not up to rooted weak or branching bisimilarity.
So to make RSP work optimally, we need a more liberal concept of
guardedness for strong bisimilarity (or other strong equivalences)
than for weak bisimilarity (or other weak equivalences).
In other words, we need a stronger, or less liberal, concept
of unguardedness for strong bisimilarity.
We call a recursive specification strongly unguarded if a
recursive variable $X$ occurs in the body of a recursive equation, but
not in a subterm of the form $\alpha.P'$, that is, not guarded by
$\alpha$, for any action $\alpha \in Act = A\cup \{\tau\}$.
It is weakly unguarded if a
recursive variable $X$ occurs in the body of a recursive equation, but
not in a subterm of the form $a.P'$, that is, not guarded by
$a$, for any visible action $a \in A$.
Note that, in line with the meaning of the words "strong" and "weak",
a strongly unguarded recursive specification is certainly also weakly unguarded.
This wouldn't have worked out so nicely when talking about "guarded"
rather than "unguarded" recursive specifications.
RSP is a sound principle for strong bisimulation semantics when
interpreting "guarded" as "not strongly unguarded". But to make it a
sound principle for weak or branching bisimulation semantics, we need
to interpret "guarded" as "not weakly unguarded".
Again, it can be shown that for weak and branching bisimilarity, the
axioms listed above, and the axioms that enable head normalisation,
together with RDP and RSP, form a complete axiomatisation for processes
without weakly unguarded recursion.
Weak bisimulation congruence, the congruence closure of weak bisimilarity,
is the default semantic equivalence used in the language CCS.
It can be characterised as rooted weak
bisimilarity:
two processes are weak bisimulation congruent, iff after applying
an operation (root-unwinding) that preserves strong bisimilarity and
removes any loops going through the initial states they are
related by a rooted weak bisimulation, one that relates initial
states with initial states only.
Likewise, branching bisimulation congruence, the congruence closure of branching bisimilarity,
is the default semantic equivalence used in the language ACP.
It can be characterised as rooted branching bisimilarity.
The language CSP follows the other approach: it drops the offending
operator +. Its operational semantics is given here.
The syntax of CSP, the theory of Communicating Sequential Processes,
features the operations
Here are the definitions, in terms of process graphs without
predicates and without a notion of final states, of the operators
mentioned above:
Lecture 7
Tuesday 23-6-2020 : Congruence properties of weak equivalences
When trying to apply the above axiomatisation approach to weak bisimulation equivalence $=_{WB}$, we get stuck, because
$=_{WB}$ fails to be a congruence for the +-operator, and hence fails
the last law of equational logic.
A counterexample: the processes $\tau.0$ and $0$ are weakly bisimilar,
but after placing them in a context $a.0 + \_$ they are not.
The same problem pertains to branching bisimilarity, and to most other
weak equivalences, but not to weak trace equivalence.
This problem can be solved by
Whereas the process algebras ACP and CCS follow the second approach,
the process algebra CSP follows the last approach.
Rooted weak and branching bisimilarity
In ACP, branching bisimulation equivalence $=_{BB}$ is
replaced by branching bisimulation congruence $=_{BB}^c$
It is defined as the coarsest congruence contained in $=_{BB}$,
obtained as the congruence closure of $=_{BB}$ (see above).
It turns out that $P =_{BB}^c Q$ iff for
all processes $R$ we have $P+R =_{BB} Q+R$.
Thus it suffices to take the congruence closure w.r.t. the +-operator;
the result then turns out to be a congruence for the other operators as well.
I also present another characterisation of $=_{BB}^c$, and
illustrate it in class with examples:
Thus a "rooted branching bisimulation" acts in the first step like a
strong bisimulation, and after that like a branching bisimulation.
Crucial here is that the mimicked a-step is not optional.
Equational axiomatisations for rooted weak and branching bisimilarity
Above I presented a complete equational axiomatisation
of strong bisimilarity on recursion-free CCS.
To axiomatise rooted weak bisimilarity $=_{WB}^c$ we need three extra axioms:
$$\begin{array}{c}
(\mbox{T1}) & \alpha.\tau.P = \alpha.P \\
(\mbox{T2}) & \tau.P = \tau.P + P \\
(\mbox{T3}) & \alpha.(\tau.P + Q) = \alpha.(\tau.P + Q) +\alpha.P\\
\end{array}$$
The latter two describe a kind of transitive closure on the LTS: if we
have two transitions
p --τ--> q --a--> r or p --a--> q --τ--> r,
then we obtain weakly bisimulation congruent systems by adding a
transition p --a--> r.
Strongly versus weakly unguarded recursion
Recall the principles RDP and RSP that say, together, that guarded
recursive specifications have unique solutions. RDP says that each
recursive specification has a solution, and RSP says that if a
guarded recursive equation has multiple solutions, they must be
semantically equivalent. For RSP guardedness is a necessary side
condition. The recursive specification $X = X$, for instance, is not
guarded, and has many solutions. In fact, each process is a
solution of this specification. The recursive specification $X = a.b + X$
also have many solutions, but not each process is a solution.
In fact, its solutions, up to strong bisimilarity, are the processes
$a.b + P$, for arbitrary $P$.
Lecture 8
Monday 29-6-2020 : CSP, and Structural operational semantics
CSP
Recall that weak and branching bisimilarity fail to be congruences for the choice
operator (+) of ACP or CCS.
There are two ways to remedy such a situation: to change the equivalence
(by taking instead its congruence closure) or to change the language
(by dropping the offending operator).
The atomic actions of CSP consists of
all a, b, c, etc. from a given set A of
visible actions. The syntax of CSP does not provide for invisible
actions, although τ does occur in the process graph semantics of
CSP; it is introduced by internal choice and by concealment.
It turns out that weak and branching bisimulation equivalence are
congruences for all operators of CSP.
Unlike the parallel compositions of CCS and ACP, the parallel
composition of CSP not only allows, but also enforces synchronisation.
For this reason CSP does not need a separate restriction or
encapsulation operator.
Structural Operational Semantics
Today I present the theory of structural operational semantics (SOS),
as described in this handout.
We have seen this framework in action already when I presented the
structural operational semantics of CCS, given here.
Note that in class I call processes $P$ and $Q$, and in the latter
handout I use the letters $E$ and $F$. These stand for
expressions that may contain free variables. Processes are
closed expressions, which may not contain free variables.
In the new handout I use variables $x$ and $y$ instead of processes
$P$ and $Q$ or expressions $E$ and $F$. It's all the same, as one may
substitute processes or general expressions for these variables.
Today I present the SOS of CCSP, also given here.
I also cover the use of negative premises in structural operational semantics (see the handout above) and the GSOS format with negative premises, using the priority operator that is sometimes added to ACP as an example.
Finally, I explain a theorem in Structured Operational Semantics, saying that strong bisimulation is a congruence as soon as all operators concerned have operational rules in the GSOS format. See the handout on structural operational semantics for details. In class I cover few examples illustrating a failure of bisimulation being a congruence when we deviate from this format.
As a consequence of this result, strong bisimilarity is a congruence for all operators of ACP, CCS and CSP.
In the lecture I illustrate this on an example. It also illustrates that each use of negation in the an HML formula correspond with a switch of initiative from one side to the other in the bisimulation game.
By the way: Please read this small fragment of the wikipedia page for bisimulation for a Game theoretical characterisation of strong bisimilarity. (This should have been mentioned in Lecture 3, but then I forgot.)
Above, the red connectives and modalities are redundant, because they are expressible in the green connectives and negation. Likewise, the green connectives are expressible in terms of the red ones and negation. Finally, if we keep both the red and the green connectives and modalities, then negation is redundant. It is the latter form we need to extend HML into the modal $\mu$-calculus. It additionally has modalities $\mu X.\phi(X)$ and $\nu X.\phi(X)$. Here $\phi(X)$ is a modal $\mu$-formula that may contain the variable $X$. $\mu X.\phi(X)$ denotes the least fixed point of the equation $X=\phi(X)$, and $\nu X.\phi(X)$ the greatest. For these fixed points to exist, the formulas $\phi(X)$ need to be monotone in $X$, and to achieve that we needed to eliminate negation.
A formal denotational semantics of the modal $\mu$-calculus can be found in wikipedia. There, negation is not eliminated, but instead a variable needs to be under an even number of negation operators. This treatment is equivalent. In the lecture I merely mentioned the modal $\mu$-calculus, so that you know it exists. For now it is not covered material, on which I may ask exam questions.
P ⊨ $\top$ | ||
P ⊭ $\bot$ | ||
P ⊨ Φ∧Ψ | iff | P ⊨ Φ and P ⊨ Ψ |
P ⊨ Φ∨Ψ | iff | P ⊨ Φ or P ⊨ Ψ |
P ⊨ [K]Φ | iff | ∀Q∈{P':P--a-->P' and a∈K} . Q ⊨ Φ |
P ⊨ <K>Φ | iff | ∃Q∈{P':P--a-->P' and a∈K} . Q ⊨ Φ |
Preorders are used just like equivalence relations to compare specifications and implementations. I normally write the implementation on the right: $$\textit{Spec} \sqsubseteq \textit{Impl}$$ For each preorder $\sqsubseteq$ there exists an associated equivalence relation $\equiv$, called its kernel. It is defined by $$P \equiv Q ~~\Leftrightarrow~~ (P \sqsubseteq Q \wedge Q \sqsubseteq P).$$ Preorders play the same role as semantic equivalence relations, and can be ordered in a spectrum. In fact, for almost every semantic equivalence $\equiv$, there exists an associated preorder with the property that $\equiv$ is its kernel. The partial trace preorder for instance is defined by $$P \sqsubseteq_{\it PT} Q ~~\Leftrightarrow~~ PT(P) \supseteq PT(Q)$$ Half of the scientists orient the symbol $\sqsubseteq_{\it PT}$ in the other direction, so that it aligns with the direction of the set-inclusion symbol $\subseteq$. I aligned this course reversely. The process $Q$ in $P \sqsubseteq Q$ is called a refinement of $P$. It is a step closer to the implementation. We require that all good properties of process $P$ are shared by $Q$, although it is acceptable if $Q$ has extra good properties, not required by $P$. Most of the time, our properties will be trace based, such as
In such cases, a process is defined to have the good property $\phi$ iff each of its (complete) traces has the good property $\phi$. Thus if process $Q$ has only a subset of the traces of $P$, it surely has all the good properties of $P$. One has to be careful with this type of reasoning, and not jump to the conclusion that a process that cannot do anything is surely the best of all. But it motivates the orientation of the symbol $\sqsubseteq$, with the improved process on the right.
When adding simulation equivalence to the spectrum of equivalence relations that had already partial trace equivalence, completed trace equivalence and bisimulation equivalence in it, it turns out that simulation equivalence is finer that =PT, coarser than =B, and incomparable with =CT.
Let $L$ be the fragment of the infinitary Hennessy-Milner logic, given by the syntax \[\varphi ::= {\color{green} \top} \mid {\color{green} \bigwedge_{i\in I} \varphi_i} \mid {\color{green} \langle \alpha \rangle \color{green}\phi} \] This logic yields a modal characterisation of (strong) simulation equivalence, just like the full Hennessy-Milner logic yields a model characterisation of (strong) bisimulation equivalence: \[ P =_ S Q ~~\Leftrightarrow~~ \forall \varphi \in L.~ (P \models \varphi \Leftrightarrow Q \models \varphi). \]
[BK08] Christel Baier and Joost-Pieter Katoen. Principles of model checking. MIT Press, 2008. ISBN 978-0-262-02649-9
I will not say much about CTL*, a logic that combines the expressive power of LTL and CTL.
To apply CTL or LTL to labelled transition systems we need a translation $\eta$ that maps
(states in) labelled transition systems to (states in) Kripke structures. We then define
validity of an LTL or CTL formula $\phi$ on a state $p$ in an LTL by $$p \models_\eta \phi \Leftrightarrow \eta(p) \models \phi$$
A suitable translation $\eta$ introduces a new state halfway each
non-τ transition, and transfers the transition label to that state.
Two processes (states in a deadlock-free LTS) satisfy the same LTL
formulas if they have the same infinite traces. For finitely-branching
processes, we even have "iff".
Two processes (states in a deadlock-free LTS) satisfy the same LTL-X
formulas if they have the same weak infinite traces (obtained by
leaving out τ's).
Two processes (states in a deadlock-free LTS) satisfy the same CTL
formulas if they are strongly bisimilar. For finitely-branching processes, we even have "iff".
For general processes, we also have "iff", provided that we use CTL
with infinite conjunctions.
Two processes (states in a deadlock-free LTS) satisfy the same CTL-X
formulas if they are divergence-preserving branching bisimulation equivalent.
We even have "iff", provided that we use CTL-X
with infinite conjunctions.
I also extend the above results to LTSs that may
have deadlocks (states without outgoing transitions).
There are several approaches found in the literature:
Two processes $P$ and $Q$ (in an arbitrary LTS) satisfy the same LTL
formulas if P =CT∞ Q.
For finitely-branching processes, we even have "iff".
Two processes $P$ and $Q$ (in an arbitrary LTS) satisfy the same LTL-X
formulas if P =WCT∞ Q, that is,
if they have the same (possibly infinite) weak completed traces (obtained by
leaving out τ's).
Two processes $P$ and $Q$ (in an arbitrary LTS) satisfy the same CTL
formulas if they are strongly bisimilar. For finitely-branching processes, we even have "iff".
For general processes, we also have "iff", provided that we use CTL
with infinite conjunctions.
Two processes $P$ and $Q$ (in an arbitrary LTS) satisfy the same CTL-X
formulas if they are divergence-preserving branching bisimilar.
We even have "iff", provided that we use CTL-X
with infinite conjunctions.
The last result, however, only works with approach 3 or 4 above.
Otherwise we would get a weaker form of divergence-preserving
branching bisimilarity, which would identify deadlock and livelock.
The complexity of the original algorithm, by Kanellakis & Smolka,
is $O(m.(n+l))$, where
$m$ is the number of transitions, $n$ is the number of states, and $l$
is the number of different action labels in a finite system (when
assuming n < m+2, which can be achieved by collapsing deadlock states).
(Often $l$ is taken to be a constant, and not counted.)
The algorithm has been improved by Paige & Tarjan to reach complexity $O(m.(log(n)+log(l))$.
Note however, that the complexity of deciding bisimilarity is
EXPTIME when the input is a process algebra expression (of a not too complex kind).
This is because an LTS can be exponentially larger than its generating
expression.
The partition refinement algorithm works by first considering all
processes equivalent, and putting them into 1 big equivalence class.
Then we keep finding reasons for cutting an equivalence class into
two smaller equivalence classes (red and blue), namely
when we find evidence that none of the process in red are
equivalent to processes in blue. When we run out of reasons to
further cut equivalence classes in two, the equivalence relation we are left
with is strong bisimilarity.
Here LTL-X is what remains of LTL after throwing out
the next-state operator X. It can be argued that this operator is not
compatible with abstraction from internal actions.
Now we have the following extensions of the results above:
To be precise, in the
semantics of CTL, the paths chosen in Causes 10-15 should be
completed paths rather then infinite paths.
In the
semantics of LTL the words may be finite sequences and the only
definition that needs to be amended is that of the next-state
operator. Let's use the convention that if $w$ is the empty word,
then $w \models {\bf X} \psi$ is always false. (A consequence of
this decision is that we loose the self-duality of $\bf X$.)
Lecture 11
Tuesday 14-7-2020: Deciding bisimilarity
Today I explain the partition refinement algorithm for deciding
(strong) bisimulation equivalence on finite-state processes. As
supporting material I hand out
a copy of Sections 2.1, 2.2 and 3.1 of Chapter 6 of the Handbook of Process Algebra.
The partition refinement algorithm for strong bisimilarity treated
above can be easily adapted to deal with branching bisimilarity.
Please read Sections 1, 2, 3 and 6 of
An O(m log n) Algorithm for Branching Bisimulation.
Or (better) Section 1 (until "Another important insight"), 2, 3 (until
"In order to see that the time complexity") and 7 of
the published version.
The key idea is that τ-moves between
states within the same block are skipped over, whereas τ-moves
between different blocks are treated as if τ is any normal action.
For the rest it resembles the partition refinement algorithm for strong bisimilarity.
There are two crucial differences with the partition refinement algorithm for strong bisimilarity:
That book uses equational reasoning (applying axioms) to show that the implementation is rooted branching bisimilar to the specification. This mode of proving things is called equivalence checking. Another form of equivalence checking is to directly show the rooted branching bisimilarity of the transition systems of Spec and Impl, for instance using the partition refinement algorithm. An alternative to equivalence checking is model checking. This also involves modelling the (implementation of) the protocol in a process algebra or similar formalism, but instead of developing an abstract specification, one formalises intended properties in terms of temporal logics, like CTL or LTL. Then one applies techniques to directly show that the (transition system generated by the) implementation of the protocol satisfies its temporal formula. Out of the 8 major toolsets for system verification surveyed here, 7 employ model checking and 4 employ equivalence checking. Two of the latter use rooted branching bisimulation as their primary equivalence. A more complete overview of tools is found on pages 92-95 of the above book.
Of course equivalence checking and model checking can be combined: there are theorems that tell us that if Spec ~ Impl for a suitable equivalence ~, and Spec satisfies a temporal formula of a particular form, then also Impl satisfies this formula. Using such results, we can first use equivalence checking or partition refinement to replace a system with many states by a smaller system that is semantically equivalent. Then we use model checking to check that the smaller system satisfies a correctness property, given by a temporal formula. We then know that also the original system satisfies this correctness property.
The correctness of the alternating bit protocol (ABP) hinges on a fairness assumption, saying that if you try to send a message infinitely often, sooner or later it will come through correctly. Without relying on such an assumption, the protocol is obviously incorrect, on grounds that there can be no way to ensure correct message delivery. Fairness assumptions are important in verification. If we refuse to make them we declare any protocol like the ABP a priori incorrect, and miss an opportunity to check the protocol logic for flaws that have nothing to do with the unreliable nature of the communication channel.
A strong fairness assumption is already incorporated in the notions of (rooted) branching (or weak) bisimulation equivalence seen before. No fairness assumption is incorporated in divergence-preserving (rooted) branching (or weak) bisimulation equivalence (see lecture 5). Those equivalences would not be suitable for the verification of the ABP.
Orthogonal to this classification is a spectrum that ranges from interleaving semantics, in which parallelism is treated by arbitrary interleaving of atomic actions, and causality respecting semantics, in which causal relations between actions are explicitly taken into account. A typical equation that holds in interleaving semantics but not in causal semantics is a||b = ab+ba. It is rejected in causal semantics because in a||b the two actions a and b are causally independent, whereas in ab+ba either a depends on b or vice versa. A causal version of partial trace equivalence can be obtained by taking partial orders of actions instead of sequences. This is why causal semantics are sometimes called partial order semantics.
Between interleaving and causal semantics lays step semantics. It differs from interleaving semantics by explicitly taking into account the possibility that two concurrent actions a and b by pure chance happen at the exact same (real-time) moment. Here a||b differs from ab+ba because only the former has a trace where a and b happen at exactly then same time.
Between step semantics and causal semantics lays interval semantics. It takes causality into account only to the extend that it manifest itself by durational actions overlapping in time.
One argument to use at least interval semantics (and possibly causal semantics) instead of interleaving semantics is real-time consistency. If we imagine that actions take time to execute, interleaving semantics may equate processes that have different running times (it is not real-time consistent), whereas this cannot happen in interval or causal semantics. An other argument is action refinement: we want that two equivalent processes remain equivalent after systematically splitting all occurrences of an action a into a1;a2, or refining a into any more complicated process. This works for interval or causal semantics, but not for interleaving semantics.
A map of the equivalences discussed above appears on page 3 of this paper. In this class I will not treat the right-most column of that diagram. Please also read the text of page 3 below that diagram. And read section 1.2 (on pages 4-5), listing reasons for preferring one semantic equivalence over another.
Petri nets can be translated to process graphs by taking as states their markings, the initial marking being the initial state, and as process-graph transitions the transition firing relation between markings. A theorem states that translating an ACP or CCS expression into a net, and then translating further into a process graph, yields the same graph, up to strong bisimilarity, as directly interpreting the expression as a process graph (using the standard denotational or operational semantics).
On Petri nets and event structures it is common to not use interleaving equivalences that identify a|b and ab+ba. Reasons to distinguish these processes are preservation of semantic equivalence under action refinement and real-time consistency; see above.
In the same spirit, a common and simple kind of safety property can be modelled in LTL as $\neg$Fb, or equivalently G$\neg$b. It says that there is no path from the initial state to a state where b holds. Here $b$ is an atomic proposition assigned to those states in the Kripke structure that are deemed "bad".
Safety properties are often established by means of invariants, properties of states that hold in the initial state, and are preserved under doing a transition. Being "not a bad state" could be such an invariant. Note that if the initial state is "not bad", and each transition that starts in a "not bad" state also ends in a "not bad" state, then indeed it follows that no bad state in reachable, and $\neg$Fb holds. In the formulation of invariants, states are often determined by the values of certain variables, and the invariants are formulated in terms of these variables. The bad thing might be that variable $x$ exceeds the value 137.
Whereas normally this course focuses on reactive systems, where visible actions model a synchronisation between the modelled system and its environment, today I focus on closed systems, which are not supposed to be influenced by their environment. In a reactive system, represented as a process graph, the labelling of the transitions provides an interface for the environment to interact with the system; the environment (or user) can either block or allow a certain action, and thereby steer the system when it has a choice between multiple actions. In Kripke structures the transitions are not labelled, and it is much more common to study closed systems. If in this setting one wants to model the interaction of our system with a user, it is more appropriate to model the user separately, and put this user model in a parallel composition with the system under investigation.
Whether liveness properties of systems hold depends often on whether or not we make appropriate progress and fairness assumptions. I discuss 5 such assumptions: progress, justness, weak fairness, strong fairness and full fairness. Please read page 7 of this paper for an informal description of the first four of these assumptions, and an illustrating example.
A linear-time property is a property of paths. LTL is a logic that specifies properties of paths in Kripke structure. Therefore any LTL formula models a linear-time property. A good example of a linear-time property is given by the LTL formula Fg discussed above. It says that a state that is labelled by the atomic proposition $g$ occurs on this path. But we can talk about linear-time properties also if they are specified informally, or using some other formalism.
Now I define what it means for a process $P$ to satisfy the linear-time property $\varphi$ under the completeness criterion $C$. This is denoted $P \models^C \varphi$. It holds if all complete paths of $P$ satisfy $\varphi$. Here a complete path is one that is selected as being complete by the completeness criterion $C$.
In a Kripke structure one normally assumes that there are no states without outgoing transitions. In the world of LTSs without a termination predicate, this condition is called deadlock-free, for a state without outgoing transitions is regarded as a deadlock. For such Kripke structures, the default completeness criterion says that a path is complete if and only if is infinite. If no other completeness criterion is discussed, this is always the intended one. Under this completeness criterion, formulas $\varphi$ holds for process $P$ iff it holds for all infinite paths of $P$. Naturally, $P \models^C \varphi$ can then be abbreviated by $P \models \varphi$. The superscript $C$, and the entire concept of a completeness criterion, need to come up only when we want to discuss alternative completeness criteria.
In LTSs, or in Kripke structures that are explicitly allowed to contain deadlock states, the default completeness criterion says that a path is complete if it is either infinite or ends in a state from which no further transitions are possible. This is the completeness criterion used when defining completed trace semantics―see Lecture 2.
We say that completeness criterion $D$ is stronger than completeness criterion $C$ if it rules our more paths as incomplete, that is, if the set of complete paths according to $D$ is a subset of the set of complete paths according to $C$. If $D$ is stronger than $C$, then $P \models^C \varphi$ implies $P \models^D \varphi$, but not necessarily vice versa. For to check $P \models^D \varphi$, we have to check that $\varphi$ holds for all complete paths, and under $D$ there are fewer complete paths than under $C$, so there is less to check.
Now progress, justness, weak fairness and strong fairness can be seen as completeness criteria, ruling out some paths from consideration when determining whether a process satisfies a property. They get stronger in the listed order, so $P \models^{\it SF} \varphi$ implies $P \models^{\it WF} \varphi$, which implies $P \models^J \varphi$, which implies $P \models^{\it Pr} \varphi$, for all processes $P$ and properties $\varphi$. Here $\it SF$, $\it WF$, $J$ and $\it Pr$ are the completeness criteria formalising those assumptions.
The strongest completeness criterion thinkable (let me call it $\infty$ in this paragraph only) says that no path is complete. Naturally, we now have $P \models^\infty \varphi$ for all processes $P$ and properties $\varphi$. This is not a useful completeness criterion, so we now forget it again.
The weakest completeness criterion possible (let me call it $\emptyset$)
says that all paths are complete. This is in fact a useful
completeness criterion. Note that $P \models^{\it Pr} \varphi$ implies
$P \models^\emptyset \varphi$, so we have already a hierarchy of 5
completeness criteria.
A task $T$ is enabled is a state $s\in S$, if $s$ has an outgoing
transition that belongs to that task. Task $T$ is perpetually enabled
on a path $\pi$ if it is enabled in every state of $\pi$.
Task $T$ occurs in $\pi$ if $\pi$ contains a transition that
belongs to $T$.
A finite prefix of a path $\pi$ is the part of $\pi$ from its
beginning until a given state. A suffix of a path $\pi$ is the
path that remains after you remove a finite prefix of it.
Note that if $\pi$ is infinite, then all its suffixes are infinite as well.
A path $\pi$ is called weakly fair if, for every suffix $\pi'$ of
$\pi$, each task that is perpetually enabled on $\pi'$, occurs in $\pi'$.
Equivalently: A path $\pi$ is weakly fair if each task
that from some state onwards is perpetually enabled on $\pi$, occurs
infinitely often in $\pi$.
Weak fairness is a completeness criterion: the weakly fair paths are
the complete ones.
Weak fairness as a property of paths can be expressed by the LTL
formulas $${\bf G}({\bf G}(\textit{enabled}(T)) \Rightarrow {\bf F} (\textit{occurs}(T)))$$
for each task $T$. Hence $\textit{enabled}(T)$ is an atomic
proposition that is assigned to a state from which a transition from
$T$ is enabled. Likewise $\textit{occurs}(T)$ holds for a Kripke state modelling an
LTS transition from $T$.
The above formula can equivalently be written as
$${\bf G}(\neg {\bf G}(\textit{enabled}(T) \wedge \neg \textit{occurs}(T)))$$
or
$${\bf FG}(\textit{enabled}(T)) \Rightarrow {\bf GF}(\textit{occurs}(T)).$$
The statement $P \models^{\it WF} \varphi$ is equivalent to
$P \models \chi \Rightarrow \varphi$, where $\chi$ is the conjunction
of the formulas above, one for each task $T$.
A path $\pi$ is called strongly fair if, for every suffix $\pi'$ of
$\pi$, each task that is relentlessly enabled on $\pi'$, occurs in $\pi'$.
Equivalently: A path $\pi$ is strongly fair if each task that
is relentlessly enabled on $\pi$ occurs infinitely often in $\pi$.
Also strong fairness is a completeness criterion: the strongly fair
paths are the complete ones.
Note that a task that is perpetually enabled is certainly relentlessly
enabled. Hence a strongly fair path is surely also weakly fair.
Thus strong fairness is a stronger completeness criterion than weak fairness.
As an example, take the alternating bit protocol (ABP) encountered in
Lecture 12. A property that this protocol should
satisfy is that each message $d\in D$ that is offered by the user of
the protocol on channel $A$ will eventually be delivered to the user at
the other end via channel $C$.
In LTL, implicitly applying the translation from process graphs to
Kripke structures discussed in Lecture 10,
Let us consider the successful transmission of a message over the
unreliable channel to be a task. If we look at the process graph of
the ABP, there are infinite paths that get stuck in loops, and in
which (from some point onwards) the successful sending of a message
does not occur. Looking at the process graph, we see that our task
is relentlessly enabled on such a path, but not perpetually enabled.
Hence such a path is weakly fair but not strongly fair.
It follows that the ABP is correct when we make a strong fairness
assumption, but not when we only make a weak fairness assumption.
Strong fairness can be expressed in LTL as
$${\bf G}({\bf GF}(\textit{enabled}(T)) \Rightarrow {\bf F} (\textit{occurs}(T)))$$
The above formula can equivalently be written as
$${\bf G}(\neg {\bf G}({\bf F}(\textit{enabled}(T)) \wedge \neg \textit{occurs}(T)))$$
or
$${\bf GF}(\textit{enabled}(T)) \Rightarrow {\bf GF}(\textit{occurs}(T)).$$
To model local fairness assumptions, we need to explicitly declare our tasks.
This means that we move from process graphs or Kripke structures to
the extension of these structures with the tasks as an extra
component.
A global fairness assumption, on the other hand, is independent of the
system under investigation, and instead considers all choices of a
given type to be fair. Here we work with models of concurrency that
are equipped with some structure, and when translating those models to
process graphs or Kripke structures, the Tasks are somehow extracted
from the structure of the original model.
An example is fairness of actions. Here the underlying model of
concurrency consists of process graphs, where the transitions are
labelled with actions. Now there is a task for each action,
and a transition belongs to that task if it has the corresponding label.
Another example is fairness of transitions.
Here each transition constitutes a singleton task.
A third example is fairness of instructions.
Here the underlying model is a programming language, where each
program is translated to a process graph or Kripke structure.
Now there is a task for each line of code (instruction) in the source program,
and a transition in the process graph belongs to that task iff it
stems from that instruction.
There are more such methods, but I will skip them here.
When dealing with weak or strong fairness, it is often common to put
each transition in at least one task. One way to accomplish this is
to create one special task that contains all transitions.
In this situation, a weakly (or strongly) fair path is certainly progressing.
Namely, if a path $\pi$ is not progressing (fails the progress
condition), then it ends in a state with an outgoing transition.
Hence a task $T$ containing that transition is perpetually enabled in
the suffix $\pi'$ of $\pi$ consisting of its last state only. Now weak
fairness prescribes that task $T$ occurs in $\pi'$, which it clearly doesn't.
So $\pi$ is not weakly fair.
This arguments shows that weak (as well as strong) fairness is a stronger
completeness criterion than progress.
A progress assumption is build in in the logics CTL and LTL (for they
work with infinite paths rather than partial paths).
Now if we considered $X$ it is own right, the assumption of progress
would guarantee that $X$ will eventually do an $a$-action. It cannot
stay in its initial state.
However, in the context with the parallel process $Y$, the composition
$X||Y$ is not guaranteed to ever do an $a$, at least when only
assuming progress. For the infinite path that consists of
$b$-transitions only does satisfy the progress condition.
Now justness allows us to apply progress locally, and conclude
that also $X||Y$ must eventually do an $a$-action, at least when
assuming that $X$ and $Y$ are parallel components that do not
interact with each other in any way.
Formally, justness can be formulated in terms of resources
needed by transitions. These resources could be components in a parallel
composition that give rise to the transition. The process $a|\bar a$
for instance has two components, R(ight) and L(eft).
The $a$-transitions needs one recourse, namely L. Likewise, the $\bar
a$-transition only needs R. However, the $\tau$-transitions needs
two resources, namely L and R. In the setting of Petri nets, the
resources needed by a transition are the tokens that need be in its preplaces.
Now a path is called just if for each transition that is at
some point enabled (meaning it is an outgoing transition of a state in
the path), one of its resources is actually used for a transition that
occurs in the path (past that state).
The rough idea is that if all resources for a transition are present,
and none is ever used for anything else, then the transition ought to occur.
To this end we only have to assume progress. Then we show that the
implementation is branching bisimilar with the specification, and
hence is also assumed to be correct / to satisfy this requirement.
This method allows us to implicitly assume strong fairness.
In fact we can even prove systems correct in cases where strong
fairness is not strong enough. The form of fairness obtained in this
way we call full fairness.
Full fairness cannot be seen as a completeness criterion, because
there are no specific paths that are ruled out. Yet it is a method to
prove liveness properties, and on that basis it can be compared with
strong fairness, and turns out to be strictly stronger.
In CTL, a useful liveness property often has the form
It is called an AGAF property, and says that whenever some
precondition $\varphi$ is met, then eventually we reach a state where
$\psi$ holds. By means of full fairness, we can obtain such a result,
as soon as we have the weaker property
This AGEF property says that whenever some
precondition $\varphi$ is met, it will always remain possible to reach
a state where $\psi$ holds.
The notions of (rooted) branching (or weak) divergence-preserving
bisimulation equivalence do not make fairness of justness assumptions.
Weak fairness
Weak and strong fairness are defined in terms of tasks.
A task is a set of transitions in a Kripke structure or process graph.
To talk about weak and strong fairness, we need a structure
$(S, \rightarrow, I, \mathcal{T})$. Here $(S,\rightarrow, I)$ models
a Kripke structure or process graph, with a set $S$ of state, a
transition relation $\rightarrow$, and an initial state $I\in S$.
The added structure $\mathcal{T}$ is a collection of tasks $T \in \mathcal{T}$,
each being a set of transitions.
Strong fairness
A task $T$ is relentlessly enabled on a path $\pi$ if each
suffix of $\pi$ contains a state in which it is enabled.
This is the case if the task is enabled in infinitely many states of
$\pi$, in a state that occurs infinitely often in $\pi$, or in the
last state of a finite $\pi$.
$\textrm{ABP} \not\models^{\it WF} \textbf{G}(r_A(d) \Rightarrow \textbf{F} s_C(d))$.
Global versus local fairness
A local fairness assumption is tailored to a specific application,
where certain choices are supposed to be fair, but others are not.
Take as example the alternating bit protocol. The choice between a
successful transition of a message over the unreliable channel versus
an unsuccessful transmission can be regarded to be fair. This means
that we count on it not failing all the time. At the same time, there
is a choice between the environment sending us a message to transmit
that is shorter then 12 characters, or is at least 12 characters long.
This choice is not regarded as fair, as we have no clue on what drives
the environment to supply us with certain messages.
Progress
The assumption of progress, for closed systems, says that we
will not stay forever in a state with an outgoing transition.
Seen as a completeness criterion, it doesn't restrict infinite paths,
but declares a finite path to be complete only if it ends in a state
without outgoing transitions.
Justness
Justness is a local progress assumption.
Consider the system X||Y, where X is given by the recursive
specification X=a.X, and Y by Y=b.Y.
This system consists of two parallel components, one of which performs
infinitely many $a$s, and the other infinitely many $b$s.
The process graph of this system has one state, with an $a$-loop and a $b$-loop.
Example
An illustrative example appears in Figure 4 of
this paper (page 13).
That net displays a shop with two clerks I and II and two returning
customers A and B. There are 4 interactions between the clerks and the
customers: I serves A, II serves A, I serves B, II serves B.
In this net there are many traces that describe possible complete runs
of the systems. The stronger our fairness assumption, the fewer runs
are admissible. When assuming nothing, all partial traces model runs.
When assuming progress, this rules out all finite traces; now only
infinite traces model real completed runs. When assuming justness a
run that only contains "I serves A" is ruled out, because the
preconditions of "II serves B" are always met, and these tokens are
never used for anything else. However, a run that alternates between
"I serves A" and "II serves A" is still fine. Such a run is ruled out
by weak fairness however, as in each state "I serves B" is enabled,
yet customer B is never served. Weak fairness, however, allows the run
that alternates between "I serves A" and "I serves B", because "II
serves A" is not enabled in every state (due to customer A leaving
repeatedly). Yet, this run is ruled out by strong fairness.
Full fairness
A very strong fairness assumption is already incorporated in the notions of
(rooted) branching (or weak) bisimulation equivalence seen before.
We saw above that it takes strong fairness to make the alternating bit
protocol correct. Yet, we succeeded in proving it correct by means of
branching bisimilarity. The reasoning goes as follows:
The specification is obviously correct, as it satisfies the requirement
Opinion
My personal opinion is that it is often reasonable to assume (global)
progress. In the absence of that, no non-trivial liveness property can
be proven. When assuming progress I find it reasonable to also assume
justness (globally). However, the use of global (weak or strong)
fairness assumption should not be our default; it should be used only
if for some reason we are convinced that such an assumption is
warranted, or because we can only obtain a meaningful liveness
property conditionally on making such an assumption.
Course evaluation
Please fill in the
course and instructor evaluation forms.
Lecture 15
Tuesday 28-7-2020: Reactive temporal logic
I present "reactive CTL/LTL".
In Lecture 10 we say how to interpret CTL and
LTL on process graphs (or states in labelled transition systems) with deadlocks.
We discussed 4 methods to deal with deadlock. In the context of CTL or
LTL minus the next-state operator, the 3rd or 4th method was the best.
But when using full CTL or LTL, with the next-state operator, the 2nd
method is just as good, and this is what I use today.
In a deadlock state in a Kripke structure (a state without outgoing
transitions), the CTL formula $\neg$EXtrue holds, and in a
state with an outgoing transition it does not hold. Hence this formula
could be used to describe deadlock.
We use the De Nicola-Vaandrager translation from process graphs to Kripke structures. Formally, a process graph is a triple (S,I,→), with S a set (of states), I∈S an initial state, and → a set of triples (s,a,t) with s,t ∈S and a an action, drawn from a set Act. The associated Kripke structure is $(S',I,\rightarrow',\models)$, where $S' := S \cup\{(s,a,t)\in{\rightarrow}\mid a\neq \tau\}$, and ${\rightarrow'} := \{(s,(s,a,t)),((s,a,t),t) \mid (s,a,t)\in S'\} \cup \{(s,t)\mid (s,\tau,t) \in {\rightarrow}\}$. This Kripke structure has the same states and transitions as the process graph, except that a new state $(s,a,t)$ is added halfway a transition from $s$ to $t$ labelled $a\neq\tau$. As atomic propositions we take the set Act of actions, except for $\tau$, and declare that a state satisfies the atomic proposition $a\in Act$ iff that state has the form $(s,a,t)$. So $r \models a$ iff $r=(s,a,t)$.
Now we consider all possible subsets $B$ of $Act\setminus\{\tau\}$. $B$ contains the actions that may be blocked by the environment. We define a $B$-complete path of a process graph as a path that is either infinite or ends in a state of which all outgoing transitions have a label from $B$. A $B$-complete path in a Kripke structure that arises as the De Nicola-Vaandrager translation of a process graph is a path that ends in a state, all of whose successor states are labelled with atomic propositions from $B$. Thus, the De Nicola-Vaandrager translation maps $B$-complete paths in process graphs to $B$-complete paths in the associated Kripke structures. An LTL formula $\phi$ is said to hold for the Kripke structure $K$ when the actions from $B$ might be blocking, notation $K \models_B \phi$, iff each $B$-complete path from the initial state of $K$ satisfies $\phi$. Likewise, $\models_B$ can be defined for CTL by simply replacing all references to infinite paths by $B$-complete paths.
As an example, consider a vending machine that takes in a coin $c$ and produces a pretzel $p$. We assume that taking in the coin requires cooperation from the environment, but producing the pretzel does not. A process algebraic specification is $$VM=c.p.VM$$ In old-fashioned LTL we have $VM \models G(c\Rightarrow Fp)$. This formula says that whenever a coin is inserted, eventually a pretzel is produced. This formula is intuitively true indeed. But we also have $VM \models G(p\Rightarrow Fc)$. This formula says that whenever a pretzel is produced, eventually a new coin will be inserted. This formula is intuitively false. This example shows that old-fashioned LTL is not suitable to describe the behaviour of this vending machine. Instead we use reactive LTL, and take $B=\{c\}$. This choice of $B$ says that the environment may bock the action $c$, namely by not inserting a coin; however, the environment may not block $p$. (Alternatively, one can interpret $B$ as an agreement with the user/environment, namely that the environment will never block actions outside of $B$.) As intuitively expected, we have $VM \models_B G(c\Rightarrow Fp)$ but $VM \not\models_B G(p\Rightarrow Fc)$.
Note that the old-fashioned LTL/CTL interpretation $\models$ is simply $\models_\emptyset$, obtained by taking the empty set of blocking actions.
There are 3 ways to incorporate the above idea.
Thus dealing with liveness properties under justness assumptions (without assuming weak fairness) calls for a semantic equivalence that distinguishes processes that are interleaving bisimilar. In fact, this opens a new dimension in our lattice of equivalences.
I review a few extensions of CCS-like languages in which it is possible to correctly model fair schedulers and mutual exclusion protocols. These come in two types:
In the second class one finds CCS, or labelled transition systems, extended with timeout actions.
Now a data expression is made from these constants and from variables ranging over a particular data type by applying the functions or operators specified by the data structure. For instance append(8+n,q) is a data expression of type queue. It is built from the constant 8 of type integer, the variable n of type integer, the binary operator + of type integer, the variable q of type queues of integers, and the operator append mentioned above.
Instead of atomic actions like a and b, we use actions that model the transmission of data over certain channels. In value-passing ACP there may be actions sc(d) indicating the sending of data d over the channel c. In value-passing CCS, the action a(d) denotes the sending of data d on the channel c. Here c is taken from a set of channels names, where each channel comes with a certain data type, and d is a data expression of that type. The action c(x) represents the receipt of some data on some channel c. Here x is a variable of the data type required by c. The process (c(5).0 | c(x).b(x+2).0)\c) in value-passing CCS, for example, first communicates the value 5 over channel c: the left component sends 5 and the right-hand side receives it, binding it to the variable x. This communication takes the shape of a $\tau$-transitions, abstracting from the transmitted value. Then the right-hand component sends the value 5+2=7 on channel b.
The other two transition rules for parallel composition, and the rules for $+$, are exactly the same as for standard CCS. So are the rules for 0, restriction and renaming, noting that the data carried around by actions is not affected by restriction and renaming operators. The rule for recursion is $$ \frac{E \{\vec{exp}/\vec{x}\} \stackrel{\alpha}{\longrightarrow} Q}{X(\vec{exp}) \stackrel{\alpha}{\longrightarrow} Q}$$ when $X(\vec{x}) = E$ is the defining equation for the recursion constant $X$. The semantics for guards is more difficult, and skipped here, and we do not use assignment in this setting.
The rule for the send prefix now is $$(\xi, \bar{a}(\vec{exp}).P) \stackrel{\bar{a}(\xi(\vec{exp}))}{\longrightarrow} (\xi,P). $$ The difference with the previous semantics is that the vector of expressions $\vec{exp}$ is now converted through the valuation function $\xi$ into a vector of values. Thus the arguments of actions in transition labels have a semantic rather than a syntactic nature.
The rule for the read prefix becomes $$(\xi, a(\vec{x}).P ) \stackrel{a(\vec{v})}{\longrightarrow} (\xi[\vec{x}:=\vec{v}], P).$$ Here $\vec v$ may be any vector of values of the types of the variables $\vec x$. Thus, if we have $a(n).P$, with $n$ a variable for type integer, then there are infinitely many outgoing transitions, namely one for each integer that can be received when reading $x$ on channel $a$. The valuation $\xi[\vec{x}:=\vec{v}]$ is just $\xi$, but updated to now bind the values $\vec v$ to the variables $\vec x$.
In this approach the effort of passing values from a sending to a receiving process is shifted from the communication rule for parallel composition (see above) to the read and send prefixes. The rule for communication is simply $$\frac{(\xi,P) \stackrel{\bar c(\vec{v})}\longrightarrow (\zeta,P') \qquad (\xi,Q) \stackrel{c(\vec v)}\longrightarrow (\zeta,Q')}{(\xi,P | Q) \stackrel{\tau}\longrightarrow (\zeta,P'|Q')}$$ similar to standard CCS. A problem in applying this rule, is that at the sending end we normally have a transition $(\xi,Q) \stackrel{\bar c(\vec v)}\longrightarrow (\xi,Q')$ rather than $(\xi,Q) \stackrel{\bar c(\vec v)}\longrightarrow (\zeta,Q')$. This problem can be solved by postulating that $(\xi,Q')=(\zeta,Q')$ when $\xi(x)=\zeta(x)$ for all data variables $x$ that actually occur in $Q'$. By renaming the bound variables $\vec{x}$ within expressions $a(\vec{x}).P$ one can arrange for these variables not to occur within the parallel process $Q$, and this way the above agreement between $\xi$ and $\zeta$ can be reached.
The other two transition rules for parallel composition, and the rules for $+$, are again exactly the same as for standard CCS, and so are the rules for 0, restriction and renaming. The rule for recursion is $$ \frac{(\zeta,E) \stackrel{\alpha}{\longrightarrow} (\chi,Q)}{ (\xi,X(\vec{exp})) \stackrel{\alpha}{\longrightarrow} (\chi,Q)}$$ when $X(\vec{x}) = E$ is the defining equation for the recursion constant $X$. Here $\zeta$ is a valuation defined on the variables $\vec x$, such that $\zeta(\vec{x}) := \xi(\vec{exp})$.
The rule for guards is $$\frac{\xi(\varphi)={\bf true} \qquad (\xi,P) \stackrel{\alpha}{\longrightarrow} (\zeta,P')} { (\xi, [\varphi]P) \stackrel{\alpha}{\longrightarrow} (\zeta,P')}$$ and the rule of assignments $$ (\xi,[[ x:= exp ]]P) \stackrel{\tau}{\longrightarrow} (\xi[ x:= \xi(exp)], P).$$
The rule for send prefixing is $$ \bar{a}(\vec{exp}).P \stackrel{\bar{a}([[\vec{exp}]])}{\longrightarrow} P$$ where the operator $[[ ... ]]$ converts a closed data expression (or formula) into the data value it denotes; for instance $[[2+5]] = 7$ (or $[[7 > 5]] = \bf true$). The rules for read prefixing are $$ a(\vec{x}).P \stackrel{a(\vec{v})}{\longrightarrow} P\{\vec{v}/\vec{x}\}. $$ As above, $\vec v$ may be any vector of values of the types of the variables $\vec x$. The process expression $P\{\vec{v}/\vec{x}\}$ is the result of substituting (expressions denoting) the values $\vec v$ for the variables $\vec x$ in $P$. Note that since the variables $\vec x$ are bound in $a(\vec{x}).P$, this is a process without free variables, that is, a closed process expression. However, the expression $P$ may contain free occurrences of $\vec x$, and therefore does not count as a closed process expression. It is turned into a closed process expression by means of the substitution $\{\vec{v}/\vec{x}\}$. The rule for communication now is $$\frac{P \stackrel{\bar c(\vec{v})}\longrightarrow P' \qquad Q \stackrel{c(\vec v)}\longrightarrow Q'}{P | Q \stackrel{\tau}\longrightarrow P'|Q'}$$ and also the other two transition rules for parallel composition, and the rules for $+$, are exactly the same as for standard CCS. So are the rules for 0, restriction and renaming. The rule for recursion is $$ \frac{E \{\vec{exp}/\vec{x}\} \stackrel{\alpha}{\longrightarrow} Q}{X(\vec{exp}) \stackrel{\alpha}{\longrightarrow} Q}$$ when $X(\vec{x}) = E$ is the defining equation for the recursion constant $X$, just as in the late semantics. The rule for guards is $$\frac{[[\varphi]]={\bf true} \qquad P \stackrel{\alpha}{\longrightarrow} P'} { [\varphi]P \stackrel{\alpha}{\longrightarrow} P'}$$ and we do not use assignment in this setting.
Rob van Glabbeek | homework | rvg@cse.unsw.edu.au |