# Course notes for COMP6752: Modelling Concurrent Systems

This file describes the covered material.

## Lecture 1 Monday 1-6-2020: Specification and Implementation

Today I introduce the topics to be treated in this course: formalising specifications as well as implementations of concurrent systems, studying the criteria for deciding whether an implementation meets a specification, and techniques for proving this. Look here for the broader context.

Both specifications and implementations can be represented by means of models of concurrency such as labelled transition systems (LTSs) or process graphs. I present the definitions below.

Another way to represent concurrent systems is by means of process algebraic expressions.

These concepts will be illustrated by means of an example (see below). It shows how an implementation could be given as a parallel composition of components, enforcing synchronisation on certain actions.

When both specifications and implementations are given as LTSs, we need a criterion to determine if they match. I will discuss a criterion based on bisimulation, that matches the states of the related LTSs. I show why it is useful here to abstract from internal actions of the implementation.

When the specification and implementation are given as process expressions, to decide whether they match, one considers the meaning, or semantics of these expressions. Often, this meaning is given by mapping them to LTSs.

A process graph is a triple (S,I,→), with S a set (of states), I∈S an initial state, and a set of triples (s,a,t) with s,t ∈S and a an action, drawn from a set Act. An LTS is just a process graph without initial state. (But sometimes LTS is used as synonym for process graph, i.e. with initial state.) We'll also use process graphs with a fourth component ✔⊆S indicating the final states of the process: those in which the system can terminate successfully.
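These definitions can be encoded directly. The following is a minimal sketch; the tuple layout and names are my own choice, not fixed by the course:

```python
# A process graph as a 4-tuple (S, I, T, F): states, initial state,
# transitions (s, a, t) with a drawn from Act, and the set F of states
# marked with the termination predicate ✔.
# An LTS is the same data without the initial state I.

S = {0, 1, 2}
I = 0
T = {(0, "a", 1), (1, "b", 2)}   # triples (source, action, target)
F = {2}                          # successful-termination states

G = (S, I, T, F)
print(G[1])   # 0, the initial state
```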
Specifications and implementations can not only be represented by LTSs or other models of concurrency, but also by process algebraic expressions, where complex processes are built up from constants for atomic actions using operators for sequential, alternative and parallel composition. The most popular process algebraic languages from the literature are CCS (the Calculus of Communicating Systems), CSP (Communicating Sequential Processes) and ACP (the Algebra of Communicating Processes). I present the syntax of ACP, focusing on the partially synchronous parallel composition operator. The corresponding textbook is

J.C.M. Baeten & W.P. Weijland (1990): Process Algebra, Cambridge University Press

or

W.J. Fokkink (2000): Introduction to Process Algebra, Texts in Theoretical Computer Science, An EATCS Series, Springer.

1. ε (successful termination)       (only present in the optional extension ACPε)
2. δ (deadlock, or inaction)
3. a (action constant) for each action a.
4. P.Q (sequential composition)
5. P+Q (summation, choice or alternative composition)
6. P||Q (parallel composition)
7. ∂H(P) (restriction, or encapsulation) for each set of visible actions H.
8. τI(P) (abstraction) for each set of visible actions I       (only present in the optional extension ACPτ).
The atomic actions of ACP consist of all a, b, c, etc. from a given set A of visible actions, and one special action τ, that is meant to be internal and invisible to the outside world. For each application, a partial communication function γ: A×A → A is chosen that tells for each two visible actions a and b whether they synchronise (namely iff γ(a,b) is defined), and if so, what the result of their synchronisation is: γ(a,b). The communication function is required to be commutative and associative. The invisible action cannot take part in synchronisations.

Here are the definitions of these operators, in terms of process graphs extended with a predicate ✔ that signals successful termination.

• ε is the graph with one state and no transitions. This one state is the initial state. It is marked with ✔.
• δ is the graph with one state and no transitions. This one state is the initial state. It is not marked as terminating.
• a is a graph with two states (an initial and a final one) and one transition between them, labelled a. The final state is marked with ✔.
• G.H is the process that first performs G, and upon successful termination of G proceeds with H.
It is homework to write down a formal definition.
• G+H is obtained by taking the union of copies of G and H with disjoint sets of states, and adding a fresh state root, which will be the initial state of G+H. For each transition IG --a--> s in G, where IG denotes the initial state of G, there will be an extra transition root --a--> s, and likewise, for each transition IH --a--> s in H, where IH denotes the initial state of H, there will be an extra transition root --a--> s.
root is labelled with ✔ if either IG or IH is.
• G||H is obtained by taking the Cartesian product of the states of G and H; that is, the states of G||H are pairs (s,t) with s a state from G and t a state from H. The initial state of G||H is the pair of initial states of G and H. A state (s,t) is labelled ✔ if and only if both s and t are labelled ✔. The transitions are
• (s,t) --a--> (s',t) whenever s --a--> s' is a transition in G,
• (s,t) --a--> (s,t') whenever t --a--> t' is a transition in H, and
• (s,t) --c--> (s',t') whenever s --a--> s' is a transition in G and t --b--> t' is a transition in H, and γ(a,b)=c.
Intuitively, G||H allows all possible interleavings of actions from G and actions from H. In addition, it enables actions to synchronise with their communication partners.
• ∂H(G) is just G, but with all transitions labelled with actions from H omitted.
It is used to remove the remnants of unsuccessful communication, so that the synchronisation enabled by parallel composition is enforced.
• τI(G) is just G, but with all actions in I renamed into τ.
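The graph constructions above can be sketched as operations on 4-tuples (S, I, T, F). This is my own encoding; the function names `par`, `encap` and `abstract` are hypothetical, and γ is assumed to be given as a dictionary on pairs of actions:

```python
def par(G, H, gamma):
    """G||H: all interleavings, plus synchronisations given by the
    partial communication function gamma (a dict {(a, b): c})."""
    (SG, IG, TG, FG), (SH, IH, TH, FH) = G, H
    S = {(s, t) for s in SG for t in SH}
    T = {((s, t), a, (s2, t)) for (s, a, s2) in TG for t in SH}   # G moves alone
    T |= {((s, t), a, (s, t2)) for (t, a, t2) in TH for s in SG}  # H moves alone
    T |= {((s, t), gamma[a, b], (s2, t2))                          # synchronisation
          for (s, a, s2) in TG for (t, b, t2) in TH if (a, b) in gamma}
    return (S, (IG, IH), T, {(s, t) for s in FG for t in FH})      # ✔ iff both ✔

def encap(G, H):
    """∂_H(G): remove all transitions labelled with actions in H."""
    S, I, T, F = G
    return (S, I, {(s, a, t) for (s, a, t) in T if a not in H}, F)

def abstract(G, Iset):
    """τ_I(G): rename all actions in I into tau."""
    S, I, T, F = G
    return (S, I, {(s, "tau" if a in Iset else a, t) for (s, a, t) in T}, F)

G = ({0, 1}, 0, {(0, "a", 1)}, {1})
H = ({2, 3}, 2, {(2, "b", 3)}, {3})
GH = par(G, H, {("a", "b"): "c", ("b", "a"): "c"})
print(((0, 2), "c", (1, 3)) in encap(GH, {"a", "b"})[2])   # True
```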

The semantics presented above is of the denotational kind. Here "denotational" entails that each constant denotes a process graph (up to isomorphism) and each ACP operator denotes an operation on process graphs (creating a new graph out of one or two argument graphs).
Here is the example of the relay race presented in class.

There are two runners R1 and R2 participating in the relay race, and while running they carry a baton. Runner 1 starts running, and at the halfway-point passes the baton to runner 2, who is waiting there. Then runner 2 runs the rest of the race and delivers the baton at the finish.
We can describe the behaviour of the two runners by the ACP expressions:

R1 = start.give
R2 = receive.finish
Here start is the action of runner 1 starting the race,
give is the finish of runner 1 at the halfway-point, where R1 gives the baton to R2,
receive is the action of runner 2 to accept the baton from runner 1 and start running,
and finish is the action of runner 2 of arriving at the finish.
An ACP expression describing the whole interaction is
τI(∂H(R1 || R2))
with H={give,receive} and I={hand-over}. Here the communication function γ is given by γ(give,receive) = γ(receive,give) = hand-over. We draw the corresponding process graph in class; after deleting unreachable parts it looks just like the process graph of the expression
start.τ.finish
However, from a very high level of abstraction, where we abstract from the relay in the relay race, and rather focus on the baton, which we see departing by the action start, and subsequently arriving at the finish by the action finish, we may want to specify the overall behaviour of this race as
start.finish
To formally prove that the implementation matches this specification, we need a relation between these processes that abstracts from the internal action τ.
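The whole relay-race construction can be carried out mechanically. The following is a self-contained sketch in my own encoding (state names and the spelling of the actions are my choices): build the product, encapsulate give/receive, abstract hand-over into τ, and compare the weak traces with the specification.

```python
S1, S2 = {"r0", "r1", "r2"}, {"s0", "s1", "s2"}
R1 = {("r0", "start", "r1"), ("r1", "give", "r2")}       # R1 = start.give
R2 = {("s0", "receive", "s1"), ("s1", "finish", "s2")}   # R2 = receive.finish
gamma = {("give", "receive"): "hand-over", ("receive", "give"): "hand-over"}

# Parallel composition: interleavings plus synchronisations.
T = {((s, t), a, (s2, t)) for (s, a, s2) in R1 for t in S2}
T |= {((s, t), a, (s, t2)) for (t, a, t2) in R2 for s in S1}
T |= {((s, t), gamma[a, b], (s2, t2))
      for (s, a, s2) in R1 for (t, b, t2) in R2 if (a, b) in gamma}

T = {(s, a, t) for (s, a, t) in T if a not in {"give", "receive"}}   # ∂_H
T = {(s, "tau" if a == "hand-over" else a, t) for (s, a, t) in T}    # τ_I

# Maximal traces from the initial state (the graph is acyclic).
def traces(s, T):
    succ = [(a, t) for (u, a, t) in T if u == s]
    return [[a] + rest for (a, t) in succ for rest in traces(t, T)] or [[]]

full = {tuple(tr) for tr in traces(("r0", "s0"), T)}
weak = {tuple(a for a in tr if a != "tau") for tr in full}
print(full)   # {('start', 'tau', 'finish')}
print(weak)   # {('start', 'finish')} -- the specification start.finish
```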

## Lecture 2 Tuesday 2-6-2020: Spectrum of semantic equivalences

I will discuss the concept of an equivalence relation, and recall that any equivalence relation on a set D gives rise to a partitioning of D into equivalence classes (see Homework 1). Now one equivalence relation ≡ is called finer (or more discriminating) than another equivalence relation ~ if each ≡-equivalence class is included in a ~-equivalence class, i.e. when p≡q implies p~q for all p,q in D. Conversely, ~ is called coarser than ≡.

In this course I will present a lattice of semantic equivalence relations on process graphs. These represent different points of view on whether two processes are roughly the same. In such a lattice, equivalences are classified with respect to their discriminating power.

As part of this lattice of semantic equivalences I present 8=2x2x2 versions of trace semantics, all defined on process graphs G=(S,I,→) that are not equipped with a termination predicate (=final states).

Two processes p and q are partial trace equivalent, notation p =PT q, if they have the same partial traces. Here a partial trace of a process p is the sequence of transition labels encountered on a path in the process graph of p, starting from the initial state. Here a path can be defined as an alternating sequence of states and transitions, where each transition goes from the state before it to the state after it.

Processes p and q are completed trace equivalent, notation p =CT q, if moreover they have the same completed traces. Here a completed trace is the sequence of transition labels on a maximal path; one that cannot be prolonged at the end.

For process graphs without a termination predicate, a path of a process p is an alternating sequence of states and transitions, starting from the state p and either infinite or ending in a state, such that each transition goes from the state before it to the state after it. It is complete if it is either infinite or ends in a state from which no further transitions are possible. [Later in the course we will encounter alternative definitions of completeness.]

A completed trace of p is the sequence of labels of the transitions in a complete path. We write CTfin(p) for the set of finite completed traces of process p, and CT∞(p) for the set of all finite completed traces of p together with its infinite traces. When writing just CT(p) it depends on context which of the two is meant (i.e. this differs from one paper to the next). In this class I define CT(p) to mean CT∞(p).

Partial traces are defined likewise, but can end anywhere. We write PTfin(p) for the set of finite partial traces of process p, and PT∞(p) for the set of all finite partial traces of p together with its infinite traces. Normally, PT(p) means PTfin(p).

Now processes p and q are infinitary completed trace equivalent, notation p =CT∞ q, if CT∞(p) = CT∞(q). It is a theorem that this implies that also PT∞(p) = PT∞(q). Processes p and q are finitary completed trace equivalent, notation p =CTfin q, if CTfin(p) = CTfin(q) and PTfin(p) = PTfin(q). Here the last requirement cannot be skipped.

Likewise, processes p and q are infinitary partial trace equivalent, notation p =PT∞ q, if PT∞(p) = PT∞(q). They are finitary partial trace equivalent, notation p =PTfin q, if PTfin(p) = PTfin(q).

Together, these four equivalence notions form a square when ordered w.r.t. discriminating power.
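For finite acyclic graphs the finitary trace sets are easy to compute, which makes the difference between the two finitary notions concrete. A sketch in my own encoding (transitions as triples, initial state 0), using the classic pair a.b + a versus a.b:

```python
def paths(s, T):
    """Label sequences of all maximal paths starting in s (acyclic graphs)."""
    succ = [(a, t) for (u, a, t) in T if u == s]
    return [[a] + rest for (a, t) in succ for rest in paths(t, T)] or [[]]

def pt_fin(T):
    """PTfin: every prefix of every path is a partial trace."""
    return {tuple(p[:i]) for p in paths(0, T) for i in range(len(p) + 1)}

def ct_fin(T):
    """CTfin: only the label sequences of maximal (complete) paths."""
    return {tuple(p) for p in paths(0, T)}

P = {(0, "a", 1), (1, "b", 2), (0, "a", 3)}   # a.b + a
Q = {(0, "a", 1), (1, "b", 2)}                # a.b

print(pt_fin(P) == pt_fin(Q))   # True: partial trace equivalent
print(ct_fin(P) == ct_fin(Q))   # False: P has the completed trace ('a',)
```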

One of the things we learned from the example of the
relay race is that we need an equivalence relation on processes that abstracts from the internal action τ. To formally prove that the implementation matches this specification, we need a semantic equivalence that makes start.τ.finish equivalent with start.finish. Such equivalences are called weak or abstract. In contrast, an equivalence that treats τ like any other action is strong.

The definitions of partial and completed trace equivalence we encountered above can all be regarded as definitions of strong partial trace equivalence and strong completed trace equivalence, as long as we treat τ like any other action. To arrive at definitions of weak partial trace equivalence (=WPT or =WPT∞), we define a weak partial trace of a process p as a strong partial trace of p from which all the occurrences of τ have been omitted. Now two processes are called (finitary or infinitary) weak partial trace equivalent if they have the same (finite or possibly infinite) weak partial traces. Weak completed trace equivalence, and its infinitary version, can be defined in the same vein.
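Erasing τ is a one-liner once the strong traces are available. A sketch (the trace sets here are written out by hand, for the relay race from Lecture 1):

```python
def weak(trs):
    """Weak traces: strong traces with all occurrences of tau omitted."""
    return {tuple(a for a in tr if a != "tau") for tr in trs}

impl = {(), ("start",), ("start", "tau"), ("start", "tau", "finish")}
spec = {(), ("start",), ("start", "finish")}
print(weak(impl) == weak(spec))   # True: weak partial trace equivalent
```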

On processes that are equipped with a termination predicate on states (like the accepting states from automata theory), partial traces and completed traces are defined in such a way that we could just as well have thought about the termination predicate that holds in a state s as a special transition marked ✔ that goes from s to a fresh state that is not connected in any other way. Thus, for example, CT(a.(ε+b.δ)) = {a✔, ab}.

In addition, two processes are called language equivalent if they have the same traces leading to terminating states. In automata theory such traces are called words, accepted by a process p. The set of all words accepted by p is the language accepted by p. It turns out that language equivalence is coarser than partial trace equivalence, whereas completed trace equivalence is finer than partial trace equivalence.

My paper The Linear Time - Branching Time Spectrum I is a small encyclopedia of (strong, or concrete) equivalence relations; language equivalence appears in Section 19.

## Lecture 3 Tuesday 9-6-2020: Linear time versus branching time

Today we look at equivalence relations on process graphs or labelled transition systems that are finer than partial or completed trace equivalence, in the sense that they make more distinctions.

I discuss why one would want to distinguish the processes b + c and τ.b+τ.c. The reason is that in a context of an environment that admits synchronisation on b, but not on c, the first process always performs the (weak) trace b, whereas the second has the possibility to reach a state of deadlock after doing the internal step τ. Deadlock is a (bad) state of a system, in which no further actions are possible, and the system does not terminate successfully.

For the same reason, one would want to distinguish the systems Q := τ.(b + c) and P := τ.b+τ.c. However, these two systems are equivalent under each of the 8 variants of trace equivalence discussed above. For this reason we need equivalence relations on process graphs or labelled transition systems that are finer than partial or completed trace equivalence, in the sense that they make more distinctions.

The definition of bisimulation equivalence, also called bisimilarity, can be found in this handout, which you should read entirely, except for the sections "Modal logic", "Non-well-founded sets" and "Concurrency". In that handout, LTSs and process graphs have an extra component ⊨ that is used to tell whether certain predicates, taken from a set P, hold in certain states of our processes. In this course, we typically consider only one predicate, namely ✔, telling that successful termination is possible in a state. For this reason, a quadruple (S,I,→,⊨) is presented as (S,I,→,✔). More commonly, one ignores the component ⊨ or ✔ completely.

On the set of all process graphs we can study multiple equivalence relations and order them w.r.t. their discriminating power. I drew a spectrum of equivalence relations, with finer ones located above coarser ones.

In this spectrum I located partial trace equivalence, completed trace equivalence and bisimulation equivalence, and treated examples showing their differences. Bisimulation equivalence is at the top of the spectrum and partial trace equivalence near the bottom. Completed trace equivalence is somewhere in between. Bisimulation equivalence is called a branching time equivalence, because it takes into account at which point two different traces branch off. Partial and completed trace equivalence on the other hand are linear time equivalences, as they only focus on the linear executions of a process, without taking branching into account.

The spectrum of weak trace equivalences is the same as in the strong case: weak completed trace equivalence is finer (more discriminating) than weak partial trace equivalence. A weak version of bisimilarity is even finer.

Defining a weak form of bisimilarity is relatively tricky. As we saw by example, just contracting all τ-moves in a labelled transition system doesn't yield what we want to get. There are several, subtly different, approaches in the literature. The most prominent ones are weak bisimilarity and branching bisimilarity, both defined in the above-linked handout.

I treat weak and branching bisimulation equivalence, from this handout; read the section "Abstraction". I also characterise branching bisimilarity in terms of consistent colourings:

A colouring of a labelled transition system is just a function that associates a colour with every state. (Thus a colouring induces an equivalence relation on states, and any equivalence relation induces a colouring.) Now given a colouring, the potentials of a state s of a certain colour, are all pairs (a,C) with a an action and C a colour, such that s can do some inert transitions followed by an non-inert a-transition to a state with colour C. Here a transition is inert if it is labelled τ and moreover goes between two states of the same colour. Thus if (a,C) is a potential of a state s, then either a≠τ or C is not the colour of s. The colouring is consistent if all states of the same colour also have the same potentials. Now one can show that two states are branching bisimulation equivalent iff there exists a consistent colouring that gives them the same colour.

Strong bisimilarity can be characterised in the same way; the only difference is that no actions count as inert. Thus a potential (a,C) simply means one can do an a-transition to a state with colour C.
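The colouring characterisation of strong bisimilarity suggests an algorithm: start with one colour, and repeatedly split colour classes whose states have different potentials until the colouring is consistent. The following is a naive sketch of this refinement loop, in my own encoding, applied to the pair τ.(b+c) versus τ.b+τ.c (with τ treated like any other action, as no transitions are inert here):

```python
def strong_bisim_colours(states, T):
    """Coarsest consistent colouring: equal colours iff strongly bisimilar."""
    colour = {s: 0 for s in states}
    while True:
        # signature = old colour plus potentials (a, colour of target)
        sig = {s: (colour[s],
                   frozenset((a, colour[t]) for (u, a, t) in T if u == s))
               for s in states}
        ids = {}
        new = {s: ids.setdefault(sig[s], len(ids)) for s in states}
        if len(set(new.values())) == len(set(colour.values())):
            return colour          # stable: no class was split
        colour = new

states = {"p0", "p1", "p2", "q0", "q1", "q2", "q3", "q4"}
T = {("p0", "tau", "p1"), ("p1", "b", "p2"), ("p1", "c", "p2"),   # tau.(b+c)
     ("q0", "tau", "q1"), ("q1", "b", "q3"),                       # tau.b+tau.c
     ("q0", "tau", "q2"), ("q2", "c", "q4")}
c = strong_bisim_colours(states, T)
print(c["p0"] == c["q0"])   # False: the two processes are not strongly bisimilar
```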

On processes that are equipped with a termination predicate on states (like the accepting states from automata theory), weak and branching bisimilarity (like all other semantic equivalences) are defined in such a way that we could just as well have thought about the termination predicate that holds in a state s as a special transition marked ✔ that goes from s to a fresh state that is not connected in any other way.

## Lecture 4 Monday 15-6-2020: Recursion, CCS, and operational semantics

In order to describe infinite processes in a language like ACP, we add a bunch of constants X,Y,Z to the language, known as process names. Each process name X comes with a defining equation X = PX where PX is any expression in the language, possibly including X, or any other process names. The set of all such defining equations (that are created separately for each application of the language) is called a recursive specification. If every occurrence of a process name in PX lies within the scope of a subexpression a.P' of PX, the recursive specification is guarded. Process names from a guarded recursive specification are a good way to define (possibly infinite) processes. An example is an a-loop, defined by the recursive specification X = a.X.

An alternative approach is to introduce variables X,Y,Z to the language, as well as recursion constructs <X|S>. Here S is a set of recursive equations X = PX, and <X|S> specifies "the process X that satisfies the equations S". We normally restrict attention to the guarded case, in which there is only one process X satisfying S.

The syntax of CCS, the Calculus of Communicating Systems, features the operations
1. 0 (inaction)
2. a.P (action prefix) for each action a.
3. P+Q (summation, choice or alternative composition)
4. P|Q (parallel composition)
5. P\a (restriction) for each action a.
6. P[f] (relabelling) for each function f:A -> A.
The atomic actions of CCS consist of all a, b, c, etc. from a given set A, together with the complementary actions ā for each a in A, and one special action τ, that is meant to be internal and invisible to the outside world. By convention, putting an overbar over a complementary action ā yields a again.

Here are the definitions, in terms of process graphs without predicates and without a notion of final states, of the operators mentioned above:

• 0 is the graph with one state and no transitions. This one state is the initial state.
• a.G is obtained from G by adding a new state, which will be the initial one, and a new a-labelled transition from this new state to the initial state of G (which is no longer initial in a.G).
• G+H is obtained by taking the union of copies of G and H with disjoint sets of states, and adding a fresh state root, which will be the initial state of G+H. For each transition IG --a--> s in G, where IG denotes the initial state of G, there will be an extra transition root --a--> s, and likewise, for each transition IH --a--> s in H, where IH denotes the initial state of H, there will be an extra transition root --a--> s.
We can think of this operator, as unwinding the initial states (roots) of G and H, and then gluing together these roots to form the new initial state of G+H.
• G|H is obtained by taking the Cartesian product of the states of G and H; that is, the states of G|H are pairs (s,t) with s a state from G and t a state from H. The initial state of G|H is the pair of initial states of G and H. The transitions are
• (s,t) --a--> (s',t) whenever s --a--> s' is a transition in G,
• (s,t) --a--> (s,t') whenever t --a--> t' is a transition in H, and
• (s,t) --τ--> (s',t') whenever s --a--> s' is a transition in G and t --ā--> t' is a transition in H.
Intuitively, G|H allows all possible interleavings of actions from G and actions from H. In addition, it enables actions a to synchronise with their complementary actions ā; the result of such a synchronisation is the internal action τ.
• G\a is just G, but with all transitions labelled a or ā omitted.
It is used to remove the remnants of unsuccessful communication, so that the synchronisation enabled by parallel composition is enforced.
• G[f] is the graph G in which each label a is replaced by f(a), and likewise each complementary label ā by the complement of f(a).
Processes with loops are specified by means of recursive specifications, just as for ACP (see above).
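The distinctive operator here is CCS parallel composition, where synchronisation happens between an action and its complement and always yields τ. A sketch in my own encoding, representing the complement ā as the tuple `("bar", a)`:

```python
def comp(a):
    """The complement: a becomes ("bar", a), and ("bar", a) becomes a again."""
    return a[1] if isinstance(a, tuple) else ("bar", a)

def ccs_par(TG, SG, TH, SH):
    """CCS G|H on transition sets TG, TH with state sets SG, SH."""
    T = {((s, t), a, (s2, t)) for (s, a, s2) in TG for t in SH}   # G alone
    T |= {((s, t), a, (s, t2)) for (t, a, t2) in TH for s in SG}  # H alone
    T |= {((s, t), "tau", (s2, t2))                                # a with ā gives tau
          for (s, a, s2) in TG for (t, b, t2) in TH if b == comp(a)}
    return T

T = ccs_par({(0, "a", 1)}, {0, 1}, {(2, ("bar", "a"), 3)}, {2, 3})
print(((0, 2), "tau", (1, 3)) in T)   # True: the handshake produced tau
```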

The main differences between ACP and CCS are:

• ACP makes a distinction between deadlock and successful termination. As a consequence, action prefixing can be replaced by action constants and a general sequential composition. This would not work for CCS, as the behaviour of the sequential composition depends on the successful termination of the first component, which is not modelled in CCS.
• ACP adds two auxiliary operators, the left merge and the communication merge (skipped in my treatment of ACP above), to enable a finite equational axiomatisation of the parallel composition. These operators do not occur in CCS.
• Whereas CCS combines communication and abstraction from internal actions in one operator, in ACP these activities are separated. In CCS the result of any communication is the unobservable action τ. In ACP it is an observable action, from which (in the extended language ACPτ) one can abstract by applying an abstraction operator τI, renaming designated actions into τ.
• CCS adheres to a specific communication format, admitting only handshaking communication, whereas ACP allows a variety of communication paradigms, including ternary communication, through the choice of the communication function γ.

In the denotational semantics we saw before, the denotation of an expression is a process graph (an LTS with initial state) and the meaning of a composed expression like P|Q is a process graph obtained in a particular way from the process graphs of the arguments P and Q.

In the operational framework we drop the distinction between processes and states. There will be one humongous LTS (without initial state) whose states are all possible closed process expressions, and the transitions between states are given by certain operational rules. To get an idea of the difference between process graphs and LTSs, read the definitions in the bisimulation handout, and especially note how Definition 3 of bisimulation avoids having the clause on initial states.

Today I introduce the approach of structural operational semantics (SOS), by giving the structural operational semantics of CCS. Read the first section of this handout. In SOS, the outgoing transitions of a composed process will be determined out of the outgoing transitions of its argument processes without at that time considering transitions occurring further on. What is called $\textbf{fix}_XS$ in the handout is called <X|S> in class. The rule for recursion involves substitution.
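The SOS idea, deriving the transitions of a composed expression from the transitions of its subexpressions, can be sketched for a small CCS fragment (0, prefixing, choice, and process names). The encoding of expressions as tuples and the function name `trans` are my own:

```python
def trans(p, defs):
    """All pairs (a, p') with p --a--> p', derived by the SOS rules."""
    tag = p[0]
    if tag == "0":                       # no rules for 0: no transitions
        return set()
    if tag == "prefix":                  # a.P --a--> P
        _, a, q = p
        return {(a, q)}
    if tag == "+":                       # P+Q inherits the moves of P and of Q
        _, q, r = p
        return trans(q, defs) | trans(r, defs)
    if tag == "name":                    # X = P_X: substitute the body and recurse
        return trans(defs[p[1]], defs)
    raise ValueError(tag)

defs = {"X": ("prefix", "a", ("name", "X"))}   # the a-loop X = a.X
print(trans(("name", "X"), defs))              # {('a', ('name', 'X'))}
```

Note that the rules only look at the outgoing transitions of the arguments, one step at a time; the infinite behaviour of X = a.X arises from repeatedly asking for the next step, never from unfolding the whole graph.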

## Lecture 5 Tuesday 16-6-2020: Deadlock, livelock and divergence

A safety property is any property of processes that says that something bad will never happen. This terminology stems from Lamport 1977.

To formulate a canonical safety property, assume that our alphabet of visible actions contains one specific action $b$, whose occurrence is bad. The canonical safety property now says that $b$ will never happen.

Definition: A process $P$ satisfies the canonical safety property, notation $P \models \textit{safety}(b)$, if no trace of $P$ contains the action $b$.

In this definition it makes no difference whether "trace" refers to a partial or a complete trace, whether infinite traces are taken into account or not, or whether we use strong or weak traces. I often use words like "trace" instead of "possibly infinite strong partial trace" when the choice of adjectives is irrelevant.

To arrive at a general concept of safety property for labelled transition systems, assume that some notion of bad is defined. Now, to judge whether a process $P$ satisfies this safety property, one should judge whether $P$ can reach a state in which one would say that something bad had happened. But all observable behaviour of $P$ that is recorded in a process graph until one comes to such a verdict, is the sequence of visible actions performed until that point. Thus the safety property is completely determined by the set of sequences of visible actions that, when performed by $P$, lead to such a judgement. Therefore one can just as well define the concept of a safety property in terms of such a set.

Definition: A safety property of processes in an LTS is given by a set $B \subseteq A^*$. A process $P$ satisfies this safety property, notation $P \models \textit{safety}(B)$, when $\textit{WPT}(P) \cap B = \emptyset$.
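The definition translates directly into code. A sketch, assuming the finite weak partial traces are already available as a set of tuples (the function names are hypothetical):

```python
def satisfies_safety(wpt, B):
    """P |= safety(B) iff WPT(P) and B are disjoint."""
    return not (wpt & B)

def canonical_safety(wpt, b):
    """The canonical property: the bad action b occurs in no trace."""
    return all(b not in tr for tr in wpt)

wpt = {(), ("a",), ("a", "g")}                        # weak partial traces of a.g.0
print(satisfies_safety(wpt, {("b",), ("a", "b")}))    # True
print(canonical_safety(wpt, "b"))                     # True: b never happens
```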

The following theorem says that weak partial trace equivalence preserves safety properties.

Theorem: If $P =_{\textit{WPT}} Q$ and $P \models \textit{safety}(B)$ for some $B \subseteq A^*$, then also $Q \models \textit{safety}(B)$.

Thus, if two processes have the same weak partial (finite) traces then they satisfy the same safety properties. The reverse also holds: if two processes satisfy the same safety properties, then they have the same weak partial traces. In class I sketched a proof of these (obvious) statements.

A liveness property says that something good will happen eventually. The terminology "liveness" used here stems from Lamport 1977, who was inspired by the older use of the term "liveness" in Petri nets, whose meaning however is different. The following concepts are important when formulating (or proving) liveness properties.

A progress assumption roughly states that if we are in a state with an outgoing transition, we will not stay in that state forever. A fairness assumption roughly says, of a choice between a certain action g and an alternative, that if we encounter this choice infinitely often, sooner or later g will be chosen. A global fairness assumption assumes this for each and every choice in the system. Progress and sometimes fairness assumptions are often made when proving liveness properties. In fact, if we make no progress or fairness assumptions of any kind, no process will satisfy any sensible liveness property.
I discuss deadlock, livelock and (possibly fair) divergence, versus normal progress under a progress assumption. Some formal definitions are in
this handout, which you should read. A deadlock occurs in a state where no further actions are possible. A livelock occurs in a state where no further visible (non-τ) actions are possible, but always an internal action (τ) is possible. A divergence is an infinite sequence of internal actions. A livelock is a special kind of divergence. A divergence that is not a livelock could be regarded as "possibly fair" if one makes an appropriate fairness assumption, because one might escape from it.

|   | mode of progress | process | WPTfin |
|---|---|---|---|
| P | normal progress | a.g.0 | $\{ag, a, \epsilon\}$ |
| Q | fair divergence | a.Δg.0 | $\{ag, a, \epsilon\}$ |
| R | livelock | a.(g.0 + τ.Δ0) | $\{ag, a, \epsilon\}$ |
| S | deadlock | a.(g.0 + τ.0) | $\{ag, a, \epsilon\}$ |

In the above table, the unary operator $\Delta$ prefixes its argument with a $\tau$-loop. Formally, $\Delta G$ is the process graph created out of $G$ by adding a fresh initial state $I$, together with a fresh $\tau$-loop I --$\tau$--> I, and a transition I --$\alpha$--> s for each transition $I_G$ --$\alpha$--> s from $G$, where $I_G$ is the initial state of $G$. Process graphs of the 4 processes in the above table are drawn in class. If we assume fairness, "normal progress" may be identified with "fair divergence", but in the absence of a fairness assumption only "normal progress" satisfies the liveness property that g will happen eventually. In weak partial trace semantics, deadlock, livelock, divergence and normal progress are all identified: the weak partial traces of all four processes are the same. This week it is homework to invent definitions of completed trace equivalence that make different choices in this matter.
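The $\Delta$ operator is easy to implement on the graph representation. A sketch in my own encoding, here on graphs (S, I, T) without the ✔ component:

```python
def delta(G):
    """ΔG: a fresh initial state with a tau-loop, inheriting the initial moves of G."""
    S, I, T = G
    new = ("delta", I)                        # fresh initial state
    T2 = set(T) | {(new, "tau", new)}         # the fresh tau-loop
    T2 |= {(new, a, s) for (u, a, s) in T if u == I}   # copy I_G's outgoing moves
    return (S | {new}, new, T2)

G = ({0, 1}, 0, {(0, "g", 1)})                # the process g.0
S2, I2, T2 = delta(G)
print((I2, "tau", I2) in T2 and (I2, "g", 1) in T2)   # True
```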

In CSP, as well as in David Walker's version of weak bisimulation semantics, livelock and divergence are identified, and distinguished from deadlock and normal progress, thereby implicitly assuming choices to be unfair (a safe worst case scenario). In weak and branching bisimulation semantics as used in CCS and ACP, choices are assumed to be fair, and fair divergence is identified with normal progress, and distinguished from livelock. Moreover, there is no distinction between deadlock and livelock. This helps in the verification of communication protocols, where you try to send a message as long as needed until it succeeds and assume it will eventually arrive correctly.

Divergence-preserving branching bisimilarity ($BB^\Delta$) is the finest useful weak equivalence found in the literature. It distinguishes all four modes of progress: deadlock, livelock, possibly fair divergence and normal progress. It can be defined in terms of consistent colourings:

A colouring of a labelled transition system is just a function that associates a colour with every state. (Thus a colouring induces an equivalence relation on states, and any equivalence relation induces a colouring.) Now given a colouring, the potentials of a state s of a certain colour, are all pairs (a,C) with a an action and C a colour, such that s can do some inert transitions followed by an non-inert a-transition to a state with colour C. Here a transition is inert if it is labelled τ and moreover goes between two states of the same colour. Thus if (a,C) is a potential of a state s, then either a≠τ or C is not the colour of s. The colouring is consistent if all states of the same colour also have the same potentials. Now one can show that two states are branching bisimulation equivalent iff there exists a consistent colouring that gives them the same colour. (Until here repeated from last week.)

State s is internally divergent w.r.t. a colouring if there is an infinite path of τ-transitions starting from s, only passing through states with the same colour as s. The colouring preserves internal divergence if for any colour either all states of that colour are divergent w.r.t. the colouring, or none are. Now two states are divergence-preserving branching bisimulation equivalent if they receive the same colour by some consistent colouring that preserves internal divergence.

State s is divergent if there is an infinite path of τ-transitions starting from s. A colouring preserves divergence if for any colour either all states of that colour are divergent, or none are. Now two states are weakly divergence preserving branching bisimilar if they receive the same colour by some consistent colouring that preserves divergence. In class I sketch a proof that this notion is weaker than (or possibly equal to) divergence preserving branching bisimilarity, in the sense that when two processes are divergence preserving branching bisimilar, then they are surely weakly divergence preserving branching bisimilar.

Divergence-preserving weak bisimilarity ($WB^\Delta$) has a much simpler definition: a divergence-preserving weak bisimulation is one that relates divergent states to divergent states only. Here a state is divergent if it is the start of an infinite path, all of whose transitions are labelled τ.
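
On a finite LTS, the divergent states can be computed as a greatest fixpoint: keep exactly those states that have a τ-step to a state that is itself kept. A sketch, using the same hypothetical triple representation as before:

```python
TAU = "tau"

def divergent_states(states, transitions):
    """Greatest fixpoint: a state is divergent iff it has a tau-step to
    a divergent state.  On a finite LTS this yields exactly the states
    from which an infinite tau-path starts."""
    div = set(states)
    changed = True
    while changed:
        changed = False
        for s in list(div):
            if not any(p == s and a == TAU and q in div
                       for (p, a, q) in transitions):
                div.discard(s)
                changed = True
    return div

# 1 and 2 form a tau-cycle; 3 only has a visible step into it:
trans = {(1, "tau", 2), (2, "tau", 1), (3, "a", 1)}
assert divergent_states({1, 2, 3}, trans) == {1, 2}
```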

## Lecture 6 Monday 22-6-2020 : Compositionality and equational axiomatisations

I define what it means for an equivalence relation to be a congruence for an operator, or, in other words, for the operator to be compositional for the equivalence. This is what makes formal verification of big systems given as parallel compositions feasible.

An equivalence ~ is a congruence for a language L if P ~ Q implies that C[P] ~ C[Q] for every context C[ ]. Here a context C[ ] is an L-expression with a hole in it, and C[P] is the result of plugging in P for the hole.

It is not hard to show that (strong) partial trace equivalence is a congruence for encapsulation. This follows because the partial traces of $\partial_H(P)$ are completely determined by the partial traces of the process $P$ itself (namely by removing all partial traces containing actions from $H$). The same argument can be used to see that partial trace equivalence is a congruence for all operators of CCS and ACP.

Partial trace equivalence is inadequate to deal with deadlock. Adding deadlock information to partial trace semantics yields completed trace semantics. However, completed trace equivalence is not a congruence for all operators. It fails for the restriction operator of CCS, which is the same as the encapsulation operator of ACP. It also fails for the parallel composition of CSP. A counterexample: the processes $a.b + a.c$ and $a.(b+c)$ are completed trace equivalent. However, after placing them in a context $\partial_H(\_\!\_)$ with $H=\{c\}$, they are not. For $\partial_H(a.b + a.c)$ has a completed trace $a$, yet $\partial_H(a.(b+c))$ does not. This operator corresponds to placing the processes in an environment that blocks, or refuses to synchronise on, the action $c$. Strong bisimulation equivalence on the other hand is a congruence for all operators of ACP, CCS and CSP. This is a reason to use bisimulation rather than partial or completed trace equivalence.
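
The counterexample can be replayed mechanically. The sketch below uses a hypothetical representation of finite process trees: a process is a list of (action, subprocess) pairs, with [] for the process 0.

```python
def completed_traces(proc):
    """Completed traces of a finite process tree, given as a list of
    (action, subprocess) pairs; [] is the process 0 without transitions."""
    if not proc:
        return {()}
    return {(a,) + t for (a, sub) in proc for t in completed_traces(sub)}

def encapsulate(proc, H):
    """ACP encapsulation: prune every transition labelled with an action in H."""
    return [(a, encapsulate(sub, H)) for (a, sub) in proc if a not in H]

left  = [("a", [("b", [])]), ("a", [("c", [])])]   # a.b + a.c
right = [("a", [("b", []), ("c", [])])]            # a.(b+c)

# completed trace equivalent ...
assert completed_traces(left) == completed_traces(right) == {("a", "b"), ("a", "c")}
# ... but not after encapsulating c: only the left process can deadlock after a
assert completed_traces(encapsulate(left, {"c"}))  == {("a", "b"), ("a",)}
assert completed_traces(encapsulate(right, {"c"})) == {("a", "b")}
```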

The congruence closure ~c of ~ (w.r.t. a language L) is defined by

P ~c Q if and only if C[P] ~ C[Q] for any L-context C[ ].

It can be shown that ~c is the coarsest congruence finer than ~. (Note that there are three statements in that last sentence.)

The congruence closure of strong completed trace equivalence can be characterised as strong failures equivalence (not defined yet). It is finer than strong completed trace equivalence, but coarser than strong bisimilarity. A similar thing applies to weak equivalences.

I give an introduction to equational logic, and present a complete equational axiomatisation of bisimulation equivalence on the language of (very) basic CCS (only using 0, action prefixing and +). The axioms for BCCS are $$\begin{array}{cc} (P + Q) + R = P + (Q + R) & \mbox{(associativity of the +)} \\ P + Q = Q + P & \mbox{(commutativity of the +)} \\ P + P = P & \mbox{(idempotence of the +)} \\ P + 0 = P & \mbox{(0 is a neutral element of the +)} \\ \end{array}$$ If we want to axiomatise partial trace equivalence instead of bisimilarity, there is an extra axiom $$\alpha.(P+Q) = \alpha.P + \alpha.Q$$ Such equational axiomatisations all use the following rules of equational logic: $$P=P \qquad \frac{P=Q}{Q=P} \qquad \frac{P=Q \quad Q=R}{P=R} \qquad \frac{P=Q}{C[P] = C[Q]} ~\mbox{for any context}~ C[\ ].$$ Here, the equality signs = can be read as bisimilarity or any other appropriate equivalence.
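
For recursion-free BCCS terms, the four axioms are exactly what is absorbed by reading a term as a *set* of (action, subterm) pairs: associativity and commutativity disappear because sets are unordered, idempotence because sets contain no duplicates, and the 0-summand because it contributes nothing. A sketch (the list-of-pairs tree representation is my own; that equal normal forms characterise strong bisimilarity is the standard completeness result for these axioms):

```python
def normal_form(proc):
    """Normalise a finite BCCS tree, given as a list of (action,
    subterm) pairs with [] for 0, modulo the four axioms: a frozenset
    absorbs associativity, commutativity, idempotence and P + 0 = P."""
    return frozenset((a, normal_form(sub)) for (a, sub) in proc)

def strongly_bisimilar(p, q):
    """On recursion-free BCCS terms, bisimilarity is equality of normal forms."""
    return normal_form(p) == normal_form(q)

# a.0 + a.0 = a.0 (idempotence), and the order of summands is irrelevant:
assert strongly_bisimilar([("a", []), ("a", [])], [("a", [])])
assert strongly_bisimilar([("a", []), ("b", [])], [("b", []), ("a", [])])
# but a.b.0 and a.0 are distinguished:
assert not strongly_bisimilar([("a", [("b", [])])], [("a", [])])
```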

When using equational reasoning to verify that a specification is equivalent to an implementation we use, in addition to the machinery of equational logic, the recursive definition principle (RDP) and the recursive specification principle (RSP). RDP says that a system satisfies its recursive definition; RSP says that two processes satisfying the same guarded recursive equation must be equal. It holds for strong bisimilarity, and for strong finitary partial trace equivalence. $$\begin{array}{ccc} \langle X|S \rangle = \langle S_X|S \rangle && \mbox{(RDP)} \\ S \Rightarrow X = \langle X|S \rangle & \mbox{provided S is guarded} & \mbox{(RSP)} \\ \end{array}$$ In the above formalisations of RDP and RSP, $S$ is a set of equations $\{X = S_X \mid X \in V_S\}$, a recursive specification. $\langle X|S \rangle$ is a syntactic construct, denoting "the process $X$ as specified by the equations $S$". Moreover, $\langle S_X|S \rangle$ is a shorthand notation. It denotes the term $S_X$, that is, the right-hand side of the equation for $X$ in the recursive specification $S$, in which the process $\langle Y|S \rangle$ is substituted for each free occurrence of a variable $Y \in V_S$. Now RDP ought to be clear. To interpret RSP, it is important that axioms with free variables allow us to substitute any processes for those free variables. So if we substitute $P$ for $X$, and other processes for the other variables, then the antecedent of the implication says that $P$ is the $X$-component of a solution of the equations $S$. Now RSP says that in such a case $P$ must equal the canonical solution $\langle X|S \rangle$ of $S$. So if we have two processes $P$ and $Q$ that are both $X$-components of solutions of $S$, they must both equal $\langle X|S \rangle$, and hence each other.

It turns out that RDP and RSP, together with the basic axioms listed above, and the axioms that enable head normalisation, together form a complete axiomatisation for guarded processes. This means that any true equation between guarded processes can be derived from these axioms.

## Lecture 7 Tuesday 23-6-2020 : Congruence properties of weak equivalences

When trying to apply the above axiomatisation approach to weak bisimulation equivalence $=_{WB}$, we get stuck, because $=_{WB}$ fails to be a congruence for the +-operator, and hence fails the last law of equational logic. A counterexample: the processes $\tau.0$ and $0$ are weakly bisimilar, but after placing them in a context $a.0 + \_$ they are not. The same problem pertains to branching bisimilarity, and to most other weak equivalences, but not to weak trace equivalence. This problem can be solved by
• giving up on full equational logic,
• giving up on $=_{WB}$
• or giving up on the +.
Whereas the process algebras ACP and CCS follow the second approach, the process algebra CSP follows the last approach.

### Rooted weak and branching bisimilarity

In ACP, branching bisimulation equivalence $=_{BB}$ is replaced by branching bisimulation congruence $=_{BB}^c$. It is defined as the coarsest congruence contained in $=_{BB}$, obtained as the congruence closure of $=_{BB}$ (see above). It turns out that $P =_{BB}^c Q$ iff for all processes $R$ we have $P+R =_{BB} Q+R$. Thus it suffices to take the congruence closure w.r.t. the +-operator; the result then turns out to be a congruence for the other operators as well.

Likewise, in CCS, weak bisimulation equivalence $=_{WB}$ is replaced by weak bisimulation congruence $=_{WB}^c$. It is defined as the coarsest congruence contained in $=_{WB}$, obtained as the congruence closure of $=_{WB}$ (see above). It turns out that $P =_{WB}^c Q$ iff for all processes $R$ we have $P+R =_{WB} Q+R$. Thus it suffices to take the congruence closure w.r.t. the +-operator; the result then turns out to be a congruence for the other operators as well.

In fact, the same can be done for all other equivalences we encountered (such as weak and strong completed trace equivalence) and in class I indicate the resulting congruence closures on the linear time - branching time spectrum, our map of equivalences, sorted by the "finer than"-relation.

I also present another characterisation of $=_{BB}^c$, and illustrate it in class with examples:

Two processes P and Q are rooted branching bisimulation (RBB) equivalent iff

• whenever P --a--> P' then Q --a--> Q' and P' =BB Q'
• and when Q --a--> Q' then P --a--> P' and P' =BB Q'.
Thus a "rooted branching bisimulation" acts in the first step like a strong bisimulation, and after that like a branching bisimulation.

RBB equivalence equals branching bisimulation congruence; it is finer, i.e. more discriminating, than BB equivalence, and it is a congruence for all operators of ACP, CCS and CSP.

Similarly, I present another characterisation of $=_{WB}^c$, and illustrate it in class with examples:

Two processes P and Q are RWB equivalent iff

• whenever P --a--> P' then Q ==>--a-->==> Q' and P' =WB Q'
• and when Q --a--> Q' then P ==>--a-->==> P' and P' =WB Q'.
Crucial here is that the mimicked a-step is not optional.

RWB equivalence equals weak bisimulation congruence; it is finer, i.e. more discriminating, than WB equivalence, and it is a congruence for all operators of ACP, CCS and CSP.

The above two rooted equivalences can be given a uniform treatment. Define a rooted weak or branching bisimulation to be one that relates roots (= initial states) with roots only. Now two processes are RBB (resp. RWB) equivalent iff after applying an operation (root-unwinding) that preserves strong bisimilarity and removes any loops going through the initial states, they are related by such a rooted branching (resp. weak) bisimulation.

### Equational axiomatisations for rooted weak and branching bisimilarity

Above I presented a complete equational axiomatisation of strong bisimilarity on recursion-free CCS. To axiomatise rooted weak bisimilarity $=_{WB}^c$ we need three extra axioms: $$\begin{array}{ll} (\mbox{T1}) & \alpha.\tau.P = \alpha.P \\ (\mbox{T2}) & \tau.P = \tau.P + P \\ (\mbox{T3}) & \alpha.(\tau.P + Q) = \alpha.(\tau.P + Q) +\alpha.P\\ \end{array}$$ The latter two describe a kind of transitive closure on the LTS: if we have two transitions p --τ--> q --a--> r or p --a--> q --τ--> r, then we obtain weak bisimulation congruent systems by adding a transition p --a--> r.

Delay bisimilarity and η-bisimilarity are semantic equivalences intermediate between weak and branching bisimilarity, defined in the handouts. Rooted delay bisimilarity is axiomatised by (T1) and (T2); rooted η-bisimilarity by (T1) and (T3). Rooted branching bisimilarity is completely axiomatised by $$\alpha.(\tau.(P+Q)+Q) = \alpha.(P+Q)$$ on top of the axioms for strong bisimilarity. This axiom is a strengthening of (T1). There does not exist a congruence that is axiomatised by (T1) plus the axioms of strong bisimilarity alone. In this direction, rooted branching bisimilarity is as good as it gets.

### Strongly versus weakly unguarded recursion

Recall the principles RDP and RSP that say, together, that guarded recursive specifications have unique solutions. RDP says that each recursive specification has a solution, and RSP says that if a guarded recursive equation has multiple solutions, they must be semantically equivalent. For RSP, guardedness is a necessary side condition. The recursive specification $X = X$, for instance, is not guarded, and has many solutions. In fact, each process is a solution of this specification. The recursive specification $X = a.b + X$ also has many solutions, but not each process is a solution. In fact, its solutions, up to strong bisimilarity, are the processes $a.b + P$, for arbitrary $P$.

The recursive specification $X = \tau.X$ has a unique solution up to strong bisimilarity, but not up to rooted weak or branching bisimilarity. So to make RSP work optimally, we need a more liberal concept of guardedness for strong bisimilarity (or other strong equivalences) than for weak bisimilarity (or other weak equivalences). In other words, we need a stronger, or less liberal, concept of unguardedness for strong bisimilarity.

We call a recursive specification strongly unguarded if a recursive variable $X$ occurs in the body of a recursive equation, but not in a subterm of the form $\alpha.P'$, that is, not guarded by $\alpha$, for any action $\alpha \in Act = A\cup \{\tau\}$. It is weakly unguarded if a recursive variable $X$ occurs in the body of a recursive equation, but not in a subterm of the form $a.P'$, that is, not guarded by $a$, for any visible action $a \in A$. Note that, in line with the meaning of the words "strong" and "weak", a strongly unguarded recursive specification is certainly also weakly unguarded. This wouldn't have worked out so nicely when talking about "guarded" rather than "unguarded" recursive specifications.
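
These two notions can be checked by a simple recursive walk over the body of an equation. The term representation below is my own: ("var", X), ("prefix", action, term) and ("sum", term, term), with "tau" for the internal action.

```python
TAU = "tau"

def unguarded_vars(term, strong):
    """Variables with an unguarded occurrence in `term`.  With
    strong=True any prefix alpha._ guards its subterm (strong
    unguardedness); with strong=False only a visible-action prefix
    counts as a guard, so tau-prefixes are skipped (weak unguardedness)."""
    kind = term[0]
    if kind == "var":
        return {term[1]}
    if kind == "sum":
        return unguarded_vars(term[1], strong) | unguarded_vars(term[2], strong)
    if kind == "prefix":
        _, action, sub = term
        if strong or action != TAU:
            return set()                # the prefix guards everything below it
        return unguarded_vars(sub, strong)
    raise ValueError(kind)

# X = tau.X is not strongly unguarded, but it is weakly unguarded:
body = ("prefix", TAU, ("var", "X"))
assert unguarded_vars(body, True) == set()
assert unguarded_vars(body, False) == {"X"}
```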

RSP is a sound principle for strong bisimulation semantics when interpreting "guarded" as "not strongly unguarded". But to make it a sound principle for weak or branching bisimulation semantics, we need to interpret "guarded" as "not weakly unguarded".

Again, it can be shown that for weak and branching bisimilarity, the axioms listed above, and the axioms that enable head normalisation, together with RDP and RSP, form a complete axiomatisation for processes without weakly unguarded recursion.

## Lecture 8 Monday 29-6-2020 : CSP, and Structural operational semantics

### CSP

Recall that weak and branching bisimilarity fail to be congruences for the choice operator (+) of ACP or CCS. There are two ways to remedy such a situation: to change the equivalence (by taking instead its congruence closure) or to change the language (by dropping the offending operator).

Weak bisimulation congruence, the congruence closure of weak bisimilarity, is the default semantic equivalence used in the language CCS. It can be characterised as rooted weak bisimilarity: two processes are weak bisimulation congruent, iff after applying an operation (root-unwinding) that preserves strong bisimilarity and removes any loops going through the initial states they are related by a rooted weak bisimulation, one that relates initial states with initial states only. Likewise, branching bisimulation congruence, the congruence closure of branching bisimilarity, is the default semantic equivalence used in the language ACP. It can be characterised as rooted branching bisimilarity.

The syntax of CSP, the theory of Communicating Sequential Processes, features the operations

1. 0 (inaction, originally written STOP)
2. a.P (action prefix, originally written a→P) for each visible action a.
3. P□Q (external choice)
4. P⊓Q (internal choice)
5. P||SQ (parallel composition, enforcing synchronisation over the set S of visible actions)
6. P/a (concealment, originally written P\a) for each action a.
7. P[f] (renaming) for each function f:A -> A.
The atomic actions of CSP consist of all a, b, c, etc. from a given set A of visible actions. The syntax of CSP does not provide for invisible actions, although τ does occur in the process graph semantics of CSP; it is introduced by internal choice and by concealment.

Here are the definitions, in terms of process graphs without predicates and without a notion of final states, of the operators mentioned above:

• 0 is the graph with one state and no transitions. This one state is the initial state.
• a.G is obtained from G by adding a new state, which will be the initial one, and a new a-labelled transition from this new state to the initial state of G (which is no longer initial in a.G).
• G□H is an operator that acts like a parallel composition without synchronisation until a visible action from one of the two components occurs; from that moment onwards, (internal or external) actions from that component only are possible.
• G⊓H is defined to be τ.G + τ.H.
• G||SH is obtained by taking the Cartesian product of the states of G and H; that is, the states of G||SH are pairs (s,t) with s a state from G and t a state from H. The initial state of G||SH is the pair of initial states of G and H. The transitions are
• (s,t) --a--> (s',t) whenever a is not in S and s --a--> s' is a transition in G,
• (s,t) --a--> (s,t') whenever a is not in S and t --a--> t' is a transition in H,
• and (s,t) --a--> (s',t') whenever a is in S and s --a--> s' is a transition in G and t --a--> t' is a transition in H.
Unlike the parallel compositions of CCS and ACP, the parallel composition of CSP not only allows, but also enforces synchronisation. For this reason CSP does not need a separate restriction or encapsulation operator.
• G/a is just G, but with all actions a renamed into τ.
• G[f] is the graph G in which each label a is replaced by f(a).
It turns out that weak and branching bisimulation equivalence are congruences for all operators of CSP.
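
The product construction for ||S can be sketched directly from the three transition rules above. The representation is again my own: a process graph is an initial state plus a set of (source, label, target) triples; since τ is never in S, τ-steps always interleave.

```python
def parallel(g_init, g_trans, h_init, h_trans, S):
    """CSP parallel composition ||_S on process graphs: states are
    pairs, actions in S must synchronise, all other actions (including
    tau) interleave.  Only the reachable part of the product is built."""
    init = (g_init, h_init)
    trans, todo, seen = set(), [init], {init}
    while todo:
        s, t = todo.pop()
        steps = set()
        for (p, a, q) in g_trans:
            if p == s and a not in S:
                steps.add((a, (q, t)))              # G moves alone
            if p == s and a in S:
                for (p2, b, q2) in h_trans:
                    if p2 == t and b == a:
                        steps.add((a, (q, q2)))     # synchronised move
        for (p, a, q) in h_trans:
            if p == t and a not in S:
                steps.add((a, (s, q)))              # H moves alone
        for (a, target) in steps:
            trans.add(((s, t), a, target))
            if target not in seen:
                seen.add(target)
                todo.append(target)
    return init, trans

# G: 0 --a--> 1     H: 0 --a--> 1, 0 --b--> 1     synchronising on {a}
init, trans = parallel(0, {(0, "a", 1)}, 0, {(0, "a", 1), (0, "b", 1)}, {"a"})
assert ((0, 0), "a", (1, 1)) in trans   # a happens in both components at once
assert ((0, 0), "b", (0, 1)) in trans   # b interleaves
```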

### Structural Operational Semantics

Today I present the theory of structural operational semantics (SOS), as described in this handout. We have seen this framework in action already when I presented the structural operational semantics of CCS, given here. Note that in class I call processes $P$ and $Q$, and in the latter handout I use the letters $E$ and $F$. These stand for expressions that may contain free variables. Processes are closed expressions, which may not contain free variables. In the new handout I use variables $x$ and $y$ instead of processes $P$ and $Q$ or expressions $E$ and $F$. It's all the same, as one may substitute processes or general expressions for these variables.

Today I present the SOS of CCSP, also given here.

I also cover the use of negative premises in structural operational semantics (see the handout above) and the GSOS format with negative premises, using the priority operator that is sometimes added to ACP as an example.

Finally, I explain a theorem in structural operational semantics, saying that strong bisimulation is a congruence as soon as all operators concerned have operational rules in the GSOS format. See the handout on structural operational semantics for details. In class I cover a few examples illustrating a failure of bisimulation being a congruence when we deviate from this format.

As a consequence of this result, strong bisimilarity is a congruence for all operators of ACP, CCS and CSP.

## Lecture 9 Tuesday 30-6-2020 : The Hennessy-Milner logic

Today I present the Hennessy-Milner logic (HML), following this handout. (In the section "modal logic" take polymodal logic with an empty set of propositions.) HML is a modal logic. The syntax of HML is given by: $\varphi,\psi ::= {\color{green} \top} \mid {\color{red} \bot} \mid {\color{green} \varphi \wedge \psi} \mid {\color{red} \varphi \vee \color{red}\psi} \mid \neg\varphi \mid {\color{green} \langle \alpha \rangle \color{green}\varphi} \mid \color{red}[\alpha] \varphi$ Infinitary HML (HML$^\infty$) moreover has an infinitary conjunction $\bigwedge_{i\in I}\varphi_i$.
By Theorems 1 and 2 in the handout, two processes are strong bisimulation equivalent iff they satisfy the same infinitary HML-formulas. Thus, to show that two processes are not strongly bisimilar, it suffices to find an infinitary HML-formula that is satisfied by one, but not by the other. Moreover, if we know that at least one of the processes is image-finite, we do not need infinite conjunction.

In the lecture I illustrate this on an example. It also illustrates that each use of negation in an HML formula corresponds to a switch of initiative from one side to the other in the bisimulation game.

By the way: Please read this small fragment of the wikipedia page for bisimulation for a Game theoretical characterisation of strong bisimilarity. (This should have been mentioned in Lecture 3, but then I forgot.)

Above, the red connectives and modalities are redundant, because they are expressible in terms of the green connectives and negation. Likewise, the green connectives are expressible in terms of the red ones and negation. Finally, if we keep both the red and the green connectives and modalities, then negation is redundant. It is the latter form we need to extend HML into the modal $\mu$-calculus. It additionally has the fixed-point operators $\mu X.\varphi(X)$ and $\nu X.\varphi(X)$. Here $\varphi(X)$ is a modal $\mu$-formula that may contain the variable $X$. $\mu X.\varphi(X)$ denotes the least fixed point of the equation $X=\varphi(X)$, and $\nu X.\varphi(X)$ the greatest. For these fixed points to exist, the formulas $\varphi(X)$ need to be monotone in $X$, and to achieve that we needed to eliminate negation.

A formal denotational semantics of the modal $\mu$-calculus can be found on Wikipedia. There, negation is not eliminated, but instead a variable needs to be under an even number of negation operators. This treatment is equivalent. In the lecture I merely mentioned the modal $\mu$-calculus, so that you know it exists. For now it is not covered material on which I may ask exam questions.

A slightly different treatment of HML allows modal operators specifying a set of actions, rather than a single action. The syntax of HML is then given by
Φ ::= $\top$  |  $\bot$  |  Φ∧Φ  |  Φ∨Φ  |  ¬Φ  |  [K]Φ  |  <K>Φ
If a process P has a property Φ, we write P ⊨ Φ. The formal semantics is defined as follows:
• P ⊨ $\top$
• P ⊭ $\bot$
• P ⊨ Φ∧Ψ iff P ⊨ Φ and P ⊨ Ψ
• P ⊨ Φ∨Ψ iff P ⊨ Φ or P ⊨ Ψ
• P ⊨ ¬Φ iff P ⊭ Φ
• P ⊨ [K]Φ iff ∀Q∈{P' : P --a--> P' and a∈K}. Q ⊨ Φ
• P ⊨ <K>Φ iff ∃Q∈{P' : P --a--> P' and a∈K}. Q ⊨ Φ
In this setting it is easy to express deadlock: a process $P$ deadlocks iff $P \models [Act]\bot$, with $Act$ the set of all actions.
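
This variant of HML is straightforward to model-check on a finite LTS by recursion on the formula. In the sketch below (the tuple encoding of formulas and the triple representation of transitions are my own choices), the deadlock formula [Act]⊥ holds in exactly the states without outgoing transitions.

```python
def sat(p, formula, transitions):
    """HML model checking by recursion on the formula.  Formulas are
    tuples: ("top",), ("bot",), ("and", f, g), ("or", f, g),
    ("not", f), ("box", K, f), ("dia", K, f), with K a set of actions."""
    op = formula[0]
    if op == "top":
        return True
    if op == "bot":
        return False
    if op == "and":
        return sat(p, formula[1], transitions) and sat(p, formula[2], transitions)
    if op == "or":
        return sat(p, formula[1], transitions) or sat(p, formula[2], transitions)
    if op == "not":
        return not sat(p, formula[1], transitions)
    K, f = formula[1], formula[2]
    succs = [q for (s, a, q) in transitions if s == p and a in K]
    if op == "box":
        return all(sat(q, f, transitions) for q in succs)    # [K]f
    if op == "dia":
        return any(sat(q, f, transitions) for q in succs)    # <K>f
    raise ValueError(op)

# In the LTS 0 --a--> 1, state 1 is a deadlock and state 0 is not:
Act = {"a"}
deadlock = ("box", Act, ("bot",))
assert sat(1, deadlock, {(0, "a", 1)})
assert not sat(0, deadlock, {(0, "a", 1)})
```
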
I also introduce preorders, relations that are transitive and reflexive. They are like equivalence relations, but without the property of symmetry. I often use the symbol $\sqsubseteq$ for a generic preorder. A specific one, such as the partial trace preorder, is then denoted $\sqsubseteq_{\it PT}$.

Preorders are used just like equivalence relations to compare specifications and implementations. I normally write the implementation on the right: $$\textit{Spec} \sqsubseteq \textit{Impl}$$ For each preorder $\sqsubseteq$ there exists an associated equivalence relation $\equiv$, called its kernel. It is defined by $$P \equiv Q ~~\Leftrightarrow~~ (P \sqsubseteq Q \wedge Q \sqsubseteq P).$$ Preorders play the same role as semantic equivalence relations, and can be ordered in a spectrum. In fact, for almost every semantic equivalence $\equiv$, there exists an associated preorder with the property that $\equiv$ is its kernel. The partial trace preorder for instance is defined by $$P \sqsubseteq_{\it PT} Q ~~\Leftrightarrow~~ PT(P) \supseteq PT(Q)$$ Half of the scientists orient the symbol $\sqsubseteq_{\it PT}$ in the other direction, so that it aligns with the direction of the set-inclusion symbol $\subseteq$; in this course I use the opposite orientation. The process $Q$ in $P \sqsubseteq Q$ is called a refinement of $P$. It is a step closer to the implementation. We require that all good properties of process $P$ are shared by $Q$, although it is acceptable if $Q$ has extra good properties, not required by $P$. Most of the time, our properties will be trace based, such as

there is no trace containing the action "meltdown"
or
in each complete trace, when the action "coin" occurs, it must be followed by "coffee"

In such cases, a process is defined to have the good property $\phi$ iff each of its (complete) traces has the good property $\phi$. Thus if process $Q$ has only a subset of the traces of $P$, it surely has all the good properties of $P$. One has to be careful with this type of reasoning, and not jump to the conclusion that a process that cannot do anything is surely the best of all. But it motivates the orientation of the symbol $\sqsubseteq$, with the improved process on the right.
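
On finite process trees, the partial trace preorder can be checked literally by its definition: Spec ⊑PT Impl iff PT(Spec) ⊇ PT(Impl). A sketch, reusing the hypothetical list-of-pairs tree representation from earlier examples:

```python
def partial_traces(proc):
    """Partial traces of a finite process tree, given as a list of
    (action, subprocess) pairs with [] for 0; every prefix of an
    execution counts, including the empty trace."""
    traces = {()}
    for (a, sub) in proc:
        traces |= {(a,) + t for t in partial_traces(sub)}
    return traces

def refines_pt(spec, impl):
    """Spec is refined by Impl in the partial trace preorder
    iff PT(Spec) is a superset of PT(Impl)."""
    return partial_traces(impl) <= partial_traces(spec)

spec = [("a", [("b", [])]), ("a", [("c", [])])]    # a.b + a.c
impl = [("a", [("b", [])])]                        # a.b
assert refines_pt(spec, impl)        # the implementation drops the c-branch
assert not refines_pt(impl, spec)
```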

I define when one process simulates another. This is like bisimulation, but in the transfer property the initiative is always taken by the simulated process, and the matching move should be taken by the simulating process. Two processes $P$ and $Q$ are simulation equivalent, notation $P =_S Q$, if one simulates the other and vice versa. This is not the same as bisimulation equivalence, for there we insist on a single relation that is at the same time a simulation in both directions. The processes a.b + a.(b+c) and a.(b+c) simulate each other, but are not bisimulation equivalent.
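
On a finite LTS, the greatest simulation can be computed by starting from the full relation and deleting pairs that violate the transfer property until nothing changes. A sketch (representation assumptions as before); it reproduces the example just given:

```python
def greatest_simulation(states, transitions):
    """Greatest fixpoint computation of the simulation preorder: start
    with all pairs and delete (p, q) as soon as p has a step that q
    cannot match into a surviving pair."""
    rel = {(p, q) for p in states for q in states}
    changed = True
    while changed:
        changed = False
        for (p, q) in list(rel):
            for (s, a, p2) in transitions:
                if s != p:
                    continue
                if not any(t == q and b == a and (p2, q2) in rel
                           for (t, b, q2) in transitions):
                    rel.discard((p, q))
                    changed = True
                    break
    return rel

# a.b + a.(b+c) (states p*) versus a.(b+c) (states q*):
trans = {("p0", "a", "p1"), ("p1", "b", "p2"),
         ("p0", "a", "p3"), ("p3", "b", "p4"), ("p3", "c", "p5"),
         ("q0", "a", "q1"), ("q1", "b", "q2"), ("q1", "c", "q3")}
states = {"p0", "p1", "p2", "p3", "p4", "p5", "q0", "q1", "q2", "q3"}
rel = greatest_simulation(states, trans)
assert ("p0", "q0") in rel and ("q0", "p0") in rel   # they simulate each other
```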

When adding simulation equivalence to the spectrum of equivalence relations that already had partial trace equivalence, completed trace equivalence and bisimulation equivalence in it, it turns out that simulation equivalence is finer than =PT, coarser than =B, and incomparable with =CT.

Let $L$ be the fragment of the infinitary Hennessy-Milner logic given by the syntax $\varphi ::= {\color{green} \top} \mid {\color{green} \bigwedge_{i\in I} \varphi_i} \mid {\color{green} \langle \alpha \rangle \varphi}$ This logic yields a modal characterisation of (strong) simulation equivalence, just like the full Hennessy-Milner logic yields a modal characterisation of (strong) bisimulation equivalence: $P =_S Q ~~\Leftrightarrow~~ \forall \varphi \in L.~ (P \models \varphi \Leftrightarrow Q \models \varphi).$

## Lecture 10 Monday 13-7-2020: Temporal logic

In today's lecture we discuss some limitations of HML; some of these limitations are not present when using another logic, namely Computation Tree Logic (CTL). Therefore, we look at the syntax and the semantics of CTL. A couple of examples illustrate the power and expressiveness of CTL. (A good overview of CTL can be found on Wikipedia.) The last logic we discuss and compare today is Linear-time Temporal Logic (LTL). (Both links are required reading.)
LTL and CTL can be shown to be incomparable by proving that there cannot exist an LTL-formula that is equivalent to the CTL-formula AG EF a, and by showing that there cannot exist a CTL-formula equivalent to FG a (see e.g. [BK08]).

[BK08] Christel Baier and Joost-Pieter Katoen. Principles of model checking. MIT Press, 2008. ISBN 978-0-262-02649-9

I will not say much about CTL*, a logic that combines the expressive power of LTL and CTL.

LTL and CTL are defined on Kripke structures, transition systems of which the states rather than the transitions are labelled. Traditionally it is required that each state has at least one outgoing transition. The state-labels are atomic propositions; a state could be labelled with zero, one or more of them. Write $s \models \phi$ if the temporal logic formula $\phi$ (in CTL or LTL) holds for the state $s$.

Two processes (states in a deadlock-free LTS) satisfy the same LTL formulas if they have the same infinite traces. For finitely-branching processes, we even have "iff".

Two processes (states in a deadlock-free LTS) satisfy the same LTL-X formulas if they have the same weak infinite traces (obtained by leaving out τ's).
Here LTL-X is what remains of LTL after throwing out the next-state operator X. It can be argued that this operator is not compatible with abstraction from internal actions.

Two processes (states in a deadlock-free LTS) satisfy the same CTL formulas if they are strongly bisimilar. For finitely-branching processes, we even have "iff". For general processes, we also have "iff", provided that we use CTL with infinite conjunctions.

Two processes (states in a deadlock-free LTS) satisfy the same CTL-X formulas if they are divergence-preserving branching bisimulation equivalent. We even have "iff", provided that we use CTL-X with infinite conjunctions.

To extend these results to LTSs that may contain deadlocks, there are several options:
1. Translate any LTS (possibly with deadlock) into a Kripke structure without deadlock by adding a self-loop to any state in the Kripke structure that otherwise would be a deadlock. Then use the standard interpretation (in terms of infinite paths) of the LTL or CTL operators on deadlock-free Kripke structures.
2. Stick to the translation defined earlier, thereby obtaining Kripke structures that could have deadlocks. Change the interpretation of LTL and CTL by using complete paths instead of infinite paths for the interpretation of formulas. The complete paths are the infinite paths, together with all finite paths ending in a deadlock state.
To be precise, in the semantics of CTL, the paths chosen in Clauses 10-15 should be complete paths rather than infinite paths.
In the semantics of LTL the words may be finite sequences and the only definition that needs to be amended is that of the next-state operator. Let's use the convention that if $w$ is the empty word, then $w \models {\bf X} \psi$ is always false. (A consequence of this decision is that we lose the self-duality of $\bf X$.)
3. Stick to the translation defined earlier, thereby obtaining Kripke structures that could have deadlocks. Change the interpretation of LTL and CTL by using complete paths instead of infinite paths for the interpretation of formulas. Moreover, add a new modality EG$^\infty$φ that requires φ to hold on all states of an infinite path.
4. Translate any LTS (possibly with deadlock) into a Kripke structure without deadlock by adding a new state, labelled by a fresh proposition δ (deadlock), with a self-loop, and adding a transition to this new state from any state in the Kripke structure that otherwise would be a deadlock. Then use the standard interpretation (in terms of infinite paths) of the LTL or CTL operators on deadlock-free Kripke structures.
Now we have the following extensions of the results above:

Two processes $P$ and $Q$ (in an arbitrary LTS) satisfy the same LTL formulas if P =CT Q. For finitely-branching processes, we even have "iff".

Two processes $P$ and $Q$ (in an arbitrary LTS) satisfy the same LTL-X formulas if P =WCT Q, that is, if they have the same (possibly infinite) weak completed traces (obtained by leaving out τ's).

Two processes $P$ and $Q$ (in an arbitrary LTS) satisfy the same CTL formulas if they are strongly bisimilar. For finitely-branching processes, we even have "iff". For general processes, we also have "iff", provided that we use CTL with infinite conjunctions.

Two processes $P$ and $Q$ (in an arbitrary LTS) satisfy the same CTL-X formulas if they are divergence-preserving branching bisimilar. We even have "iff", provided that we use CTL-X with infinite conjunctions.

The last result, however, only works with approach 3 or 4 above. Otherwise we would get a weaker form of divergence-preserving branching bisimilarity, which would identify deadlock and livelock.

## Lecture 11 Tuesday 14-7-2020: Deciding bisimilarity

Today I explain the partition refinement algorithm for deciding (strong) bisimulation equivalence on finite-state processes. As supporting material I hand out a copy of Sections 2.1, 2.2 and 3.1 of Chapter 6 of the Handbook of Process Algebra.

The complexity of the original algorithm, by Kanellakis & Smolka, is $O(m.(n+l))$, where $m$ is the number of transitions, $n$ is the number of states, and $l$ is the number of different action labels in a finite system (when assuming n < m+2, which can be achieved by collapsing deadlock states). (Often $l$ is taken to be a constant, and not counted.) The algorithm has been improved by Paige & Tarjan to reach complexity $O(m.(log(n)+log(l)))$.

Note however, that the complexity of deciding bisimilarity is EXPTIME when the input is a process algebra expression (of a not too complex kind). This is because an LTS can be exponentially larger than its generating expression.

The partition refinement algorithm works by first considering all processes equivalent, and putting them into one big equivalence class. Then we keep finding reasons for cutting an equivalence class into two smaller equivalence classes (red and blue), namely when we find evidence that none of the processes in red are equivalent to any process in blue. When we run out of reasons to further cut equivalence classes in two, the equivalence relation we are left with is strong bisimilarity.
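
A naive version of this algorithm fits in a few lines. This is the quadratic textbook sketch, not the Kanellakis & Smolka or Paige & Tarjan refinement with its clever data structures: a block is split whenever two of its states differ on the (action, target-block) pairs they can reach.

```python
def partition_refinement(states, transitions):
    """Naive partition refinement: start with one block containing all
    states, and split a block whenever two of its states differ on the
    set of (action, target-block) pairs they can reach.  The stable
    partition is strong bisimilarity."""
    blocks = [frozenset(states)]
    while True:
        block_of = {s: b for b in blocks for s in b}
        def signature(s):
            return frozenset((a, block_of[t])
                             for (p, a, t) in transitions if p == s)
        new_blocks = []
        for b in blocks:
            groups = {}
            for s in b:
                groups.setdefault(signature(s), set()).add(s)
            new_blocks.extend(frozenset(g) for g in groups.values())
        if len(new_blocks) == len(blocks):   # no block was split: stable
            return new_blocks
        blocks = new_blocks

# 1 --a--> 2 and 3 --a--> 4: states 1,3 are bisimilar, as are 2,4
blocks = partition_refinement({1, 2, 3, 4}, {(1, "a", 2), (3, "a", 4)})
assert sorted(sorted(b) for b in blocks) == [[1, 3], [2, 4]]
```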

The partition refinement algorithm for strong bisimilarity treated above can be easily adapted to deal with branching bisimilarity. Please read Sections 1, 2, 3 and 6 of An O(m log n) Algorithm for Branching Bisimulation. Or (better) Section 1 (until "Another important insight"), 2, 3 (until "In order to see that the time complexity") and 7 of the published version. The key idea is that τ-moves between states within the same block are skipped over, whereas τ-moves between different blocks are treated as if τ is any normal action. For the rest it resembles the partition refinement algorithm for strong bisimilarity.

There are two crucial differences with the partition refinement algorithm for strong bisimilarity:

1. when checking whether a state in block B has an $\alpha$-transition to block C, it is OK if we can move through block B by doing only $\tau$-transitions, and then reach a state with an $\alpha$-transition to block C.
2. We never check $\tau$-transitions from block B to block B. ($\tau$-transitions to another block are fair game, though).

The complexity is the same as for strong bisimilarity, and thus better than for weak bisimilarity (see below). To obtain this complexity we must maintain a trickier data structure, which is not covered in class.

To decide weak bisimilarity for finite transition systems we first compute a transitive closure: add a transition $p \stackrel{\alpha}\longrightarrow r$ whenever we have $p \stackrel{\tau}\longrightarrow q \stackrel{\alpha}\longrightarrow r$ or $p \stackrel{\alpha}\longrightarrow q \stackrel{\tau}\longrightarrow r$; also add a $\tau$-loop in every state. After this transformation weak bisimilarity coincides with strong bisimilarity, and we can use the partition refinement algorithm for strong bisimilarity to decide the latter. But the complexity of performing the above transitive closure is higher than that of partition refinement; the total complexity of deciding weak bisimilarity is $O(l \cdot n^{2.376})$.
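
The saturation step can be sketched as a naive fixpoint computation; the triple representation with `'tau'` for the internal action is my own convention, and an efficient implementation would use fast matrix multiplication (whence the exponent 2.376).

```python
# A sketch of the saturation ("transitive closure") step described above.
# Transitions are (source, action, target) triples; 'tau' is internal.

def weak_closure(states, transitions, tau='tau'):
    trans = set(transitions)
    trans |= {(s, tau, s) for s in states}      # a tau-loop in every state
    changed = True
    while changed:
        changed = False
        new = set()
        for (p, a, q) in trans:
            for (q2, b, r) in trans:
                if q2 != q:
                    continue
                if a == tau and (p, b, r) not in trans:
                    new.add((p, b, r))          # p --tau--> q --b--> r
                if b == tau and (p, a, r) not in trans:
                    new.add((p, a, r))          # p --a--> q --tau--> r
        if new:
            trans |= new
            changed = True
    return trans
```

On the saturated system, weak bisimilarity coincides with strong bisimilarity, so partition refinement can then be applied to the result.
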

Modulo strong, branching or weak bisimulation, every process graph can be converted into one in which no two states are bisimilar. The states of the canonical representation will be the bisimulation equivalence classes (colours) of the states of the original graph. The transitions between such states are defined in the obvious way, and the initial state is the equivalence class containing the initial state of the original system.
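
Given the partition into bisimilarity classes, the quotient construction is short; the state and transition formats below are my own conventions.

```python
# A sketch of the canonical (quotient) graph: its states are the
# equivalence classes of the original graph's states.

def quotient(initial, transitions, partition):
    # map each state to (the index of) its equivalence class ("colour")
    cls = {s: i for i, block in enumerate(partition) for s in block}
    q_trans = {(cls[p], a, cls[q]) for (p, a, q) in transitions}
    return cls[initial], q_trans
```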

## Lecture 12 Monday 20-7-2020: Alternating bit protocol; tools

In this lecture we model an abstract specification and an implementation of the alternating bit protocol in the process algebra ACP. See pages 71-79 in the book Introduction to Process Algebra by Wan Fokkink.

That book uses equational reasoning (applying axioms) to show that the implementation is rooted branching bisimilar to the specification. This mode of proving things is called equivalence checking. Another form of equivalence checking is to directly show the rooted branching bisimilarity of the transition systems of Spec and Impl, for instance using the partition refinement algorithm. An alternative to equivalence checking is model checking. This also involves modelling the (implementation of) the protocol in a process algebra or similar formalism, but instead of developing an abstract specification, one formalises intended properties in terms of temporal logics, like CTL or LTL. Then one applies techniques to directly show that the (transition system generated by the) implementation of the protocol satisfies its temporal formula. Out of the 8 major toolsets for system verification surveyed here, 7 employ model checking and 4 employ equivalence checking. Two of the latter use rooted branching bisimulation as their primary equivalence. A more complete overview of tools is found on pages 92-95 of the above book.

Of course equivalence checking and model checking can be combined: there are theorems that tell us that if Spec ~ Impl for a suitable equivalence ~, and Spec satisfies a temporal formula of a particular form, then also Impl satisfies this formula. Using such results, we can first use equivalence checking or partition refinement to replace a system with many states by a smaller system that is semantically equivalent. Then we use model checking to check that the smaller system satisfies a correctness property, given by a temporal formula. We then know that also the original system satisfies this correctness property.

The correctness of the alternating bit protocol (ABP) hinges on a fairness assumption, saying that if you try to send a message infinitely often, sooner or later it will come through correctly. Without such an assumption the protocol is obviously incorrect, since over an unreliable channel there is then no way to ensure correct message delivery. Fairness assumptions are important in verification. If we refuse to make them we declare any protocol like the ABP a priori incorrect, and miss an opportunity to check the protocol logic for flaws that have nothing to do with the unreliable nature of the communication channel.

A strong fairness assumption is already incorporated in the notions of (rooted) branching (or weak) bisimulation equivalence seen before. No fairness assumption is incorporated in divergence-preserving (rooted) branching (or weak) bisimulation equivalence (see lecture 5). Those equivalences would not be suitable for the verification of the ABP.

## Lecture 13 Tuesday 21-7-2020: Non-interleaving semantics

I give an overview of semantic equivalence relations between processes that to some extent take causal independence into account. In one dimension, semantic equivalences can be ordered from linear-time to branching-time semantics. Typical examples of linear-time semantics are partial trace and completed trace semantics. Typical branching-time semantics are strong bisimulation and branching bisimulation semantics. The canonical equation that holds in linear-time but not in branching-time semantics is a(b+c) = ab+ac.

Orthogonal to this classification is a spectrum that ranges from interleaving semantics, in which parallelism is treated as arbitrary interleaving of atomic actions, to causality-respecting semantics, in which causal relations between actions are explicitly taken into account. A typical equation that holds in interleaving semantics but not in causal semantics is a||b = ab+ba. It is rejected in causal semantics because in a||b the two actions a and b are causally independent, whereas in ab+ba either a depends on b or vice versa. A causal version of partial trace equivalence can be obtained by taking partial orders of actions instead of sequences. This is why causal semantics are sometimes called partial order semantics.

Between interleaving and causal semantics lies step semantics. It differs from interleaving semantics by explicitly taking into account the possibility that two concurrent actions a and b by pure chance happen at the exact same (real-time) moment. Here a||b differs from ab+ba because only the former has a trace where a and b happen at exactly the same time.

Between step semantics and causal semantics lies interval semantics. It takes causality into account only to the extent that it manifests itself through durational actions overlapping in time.

One argument for using at least interval semantics (and possibly causal semantics) instead of interleaving semantics is real-time consistency. If we imagine that actions take time to execute, interleaving semantics may equate processes that have different running times (it is not real-time consistent), whereas this cannot happen in interval or causal semantics. Another argument is action refinement: we want two equivalent processes to remain equivalent after systematically splitting all occurrences of an action $a$ into $a_1;a_2$, or refining $a$ into any more complicated process. This works for interval and causal semantics, but not for interleaving semantics.

A map of the equivalences discussed above appears on page 3 of this paper. In this class I will not treat the right-most column of that diagram. Please also read the text of page 3 below that diagram. And read section 1.2 (on pages 4-5), listing reasons for preferring one semantic equivalence over another.

I indicate how causal trace semantics can be represented in terms of pomsets: partially ordered multisets. The process $a(b||(c+de))$ for instance has two completed pomsets, which were drawn in class. A normal trace can be regarded as a totally ordered multiset of actions.

### Petri nets

I introduce Petri nets as a model of concurrency that forms an alternative to process graphs or labelled transition systems. I also give an interpretation of the operators of CCS etc. in terms of nets. Thus Petri nets can be used as semantic model of such languages. This paper presents Petri nets, as well as a Petri net interpretation of ACP. Please read page 1 of the introduction, Section 1 which summarises the language ACP, Section 2 until subsection 2.5, and Section 3, presenting the Petri net semantics of ACP.

Petri nets can be translated to process graphs by taking as states their markings, the initial marking being the initial state, and as process-graph transitions the transition firing relation between markings. A theorem states that translating an ACP or CCS expression into a net, and then translating further into a process graph, yields the same graph, up to strong bisimilarity, as directly interpreting the expression as a process graph (using the standard denotational or operational semantics).
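
The marking-graph translation can be sketched as follows. This is a minimal sketch; representing markings and presets/postsets as multisets via `Counter`, and net transitions as (preset, label, postset) tuples, are my own conventions, not the paper's notation.

```python
# A sketch of the translation from Petri nets to process graphs:
# states are markings, edges are firings of net transitions.
from collections import Counter

def marking_graph(initial_marking, net_transitions):
    initial = frozenset(initial_marking.items())
    states, edges = {initial}, set()
    todo = [initial_marking]
    while todo:
        m = todo.pop()
        for pre, label, post in net_transitions:
            if all(m[p] >= n for p, n in pre.items()):   # enabled?
                m2 = (m - pre) + post                    # fire the transition
                key = frozenset(m2.items())
                edges.add((frozenset(m.items()), label, key))
                if key not in states:
                    states.add(key)
                    todo.append(m2)
    return states, edges
```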

On Petri nets and event structures it is common not to use interleaving equivalences that identify a||b and ab+ba. Reasons to distinguish these processes are preservation of semantic equivalence under action refinement and real-time consistency; see above.

## Lecture 14 Monday 27-7-2020: Progress, Justness and Fairness

I recall from Lecture 5 the concepts of safety properties, saying that something bad will never happen, and liveness properties, saying that something good will happen eventually. The terminology "liveness" used here stems from Lamport 1977, who was inspired by the older use of the term "liveness" in Petri nets, whose meaning however is different. Today I will be interested in a common and simple kind of liveness property, saying that a good state in a Kripke structure will be reached. In LTL this property can be modelled as Fg. Here $g$ is an atomic proposition assigned to those states in the Kripke structure that are deemed "good".

In the same spirit, a common and simple kind of safety property can be modelled in LTL as $\neg$Fb, or equivalently G$\neg$b. It says that there is no path from the initial state to a state where b holds. Here $b$ is an atomic proposition assigned to those states in the Kripke structure that are deemed "bad".

Safety properties are often established by means of invariants: properties of states that hold in the initial state, and are preserved under doing a transition. Being "not a bad state" could be such an invariant. Note that if the initial state is "not bad", and each transition that starts in a "not bad" state also ends in a "not bad" state, then indeed no bad state is reachable, and $\neg$Fb holds. In the formulation of invariants, states are often determined by the values of certain variables, and the invariants are formulated in terms of these variables. The bad thing might be that variable $x$ exceeds the value 137.
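
For finite state spaces, instead of proving an invariant inductively one can also verify it by brute force over all reachable states. A sketch, with all names illustrative:

```python
# Check that `invariant` holds in every reachable state (entailing
# G "not bad" whenever the invariant implies "not bad").

def check_invariant(initial, successors, invariant):
    seen, todo = {initial}, [initial]
    while todo:
        s = todo.pop()
        if not invariant(s):
            return False, s           # counterexample state found
        for t in successors(s):
            if t not in seen:
                seen.add(t)
                todo.append(t)
    return True, None
```

For instance, with states given by a variable $x$ that is incremented modulo 100, the invariant $x \leq 137$ holds in every reachable state.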

Whereas normally this course focuses on reactive systems, where visible actions model a synchronisation between the modelled system and its environment, today I focus on closed systems, which are not supposed to be influenced by their environment. In a reactive system, represented as a process graph, the labelling of the transitions provides an interface for the environment to interact with the system; the environment (or user) can either block or allow a certain action, and thereby steer the system when it has a choice between multiple actions. In Kripke structures the transitions are not labelled, and it is much more common to study closed systems. If in this setting one wants to model the interaction of our system with a user, it is more appropriate to model the user separately, and put this user model in a parallel composition with the system under investigation.

Whether liveness properties of systems hold depends often on whether or not we make appropriate progress and fairness assumptions. I discuss 5 such assumptions: progress, justness, weak fairness, strong fairness and full fairness. Please read page 7 of this paper for an informal description of the first four of these assumptions, and an illustrating example.

### Completeness criteria

Recall from Lecture 2 that a path of a process P, modelled as state in an LTS or in a Kripke structure, is an alternating sequence of states and transitions, starting from the state P, and either infinite or ending in a state, such that each transition goes from the state before it to the state after it. Now a completeness criterion is a criterion on the paths of all processes in all LTSs or Kripke structures, that tells for each path whether or not it counts as complete.

A linear-time property is a property of paths. LTL is a logic that specifies properties of paths in Kripke structures. Therefore any LTL formula models a linear-time property. A good example of a linear-time property is given by the LTL formula Fg discussed above. It says that a state labelled by the atomic proposition $g$ occurs on the path. But we can talk about linear-time properties also if they are specified informally, or using some other formalism.

Now I define what it means for a process $P$ to satisfy the linear-time property $\varphi$ under the completeness criterion $C$. This is denoted $P \models^C \varphi$. It holds if all complete paths of $P$ satisfy $\varphi$. Here a complete path is one that is selected as being complete by the completeness criterion $C$.

In a Kripke structure one normally assumes that there are no states without outgoing transitions. In the world of LTSs without a termination predicate, this condition is called deadlock-freedom, for a state without outgoing transitions is regarded as a deadlock. For such Kripke structures, the default completeness criterion says that a path is complete if and only if it is infinite. If no other completeness criterion is discussed, this is always the intended one. Under this completeness criterion, a formula $\varphi$ holds for a process $P$ iff it holds for all infinite paths of $P$. Naturally, $P \models^C \varphi$ can then be abbreviated to $P \models \varphi$. The superscript $C$, and the entire concept of a completeness criterion, need to come up only when we want to discuss alternative completeness criteria.

In LTSs, or in Kripke structures that are explicitly allowed to contain deadlock states, the default completeness criterion says that a path is complete if it is either infinite or ends in a state from which no further transitions are possible. This is the completeness criterion used when defining completed trace semantics; see Lecture 2.

We say that completeness criterion $D$ is stronger than completeness criterion $C$ if it rules out more paths as incomplete, that is, if the set of complete paths according to $D$ is a subset of the set of complete paths according to $C$. If $D$ is stronger than $C$, then $P \models^C \varphi$ implies $P \models^D \varphi$, but not necessarily vice versa. For to check $P \models^D \varphi$, we have to check that $\varphi$ holds for all complete paths, and under $D$ there are fewer complete paths than under $C$, so there is less to check.

Now progress, justness, weak fairness and strong fairness can be seen as completeness criteria, ruling out some paths from consideration when determining whether a process satisfies a property. They get stronger in the listed order, so $P \models^{\it SF} \varphi$ implies $P \models^{\it WF} \varphi$, which implies $P \models^J \varphi$, which implies $P \models^{\it Pr} \varphi$, for all processes $P$ and properties $\varphi$. Here $\it SF$, $\it WF$, $J$ and $\it Pr$ are the completeness criteria formalising those assumptions.

The strongest completeness criterion conceivable (let me call it $\infty$ in this paragraph only) says that no path is complete. Naturally, we then have $P \models^\infty \varphi$ for all processes $P$ and properties $\varphi$. This is not a useful completeness criterion, so we now forget it again.

The weakest completeness criterion possible (let me call it $\emptyset$) says that all paths are complete. This is in fact a useful completeness criterion. Note that $P \models^{\it Pr} \varphi$ implies $P \models^\emptyset \varphi$, so we have already a hierarchy of 5 completeness criteria.

### Weak fairness

Weak and strong fairness are defined in terms of tasks. A task is a set of transitions in a Kripke structure or process graph. To talk about weak and strong fairness, we need a structure $(S, \rightarrow, I, \mathcal{T})$. Here $(S,\rightarrow, I)$ models a Kripke structure or process graph, with a set $S$ of states, a transition relation $\rightarrow$, and an initial state $I\in S$. The added structure $\mathcal{T}$ is a collection of tasks $T \in \mathcal{T}$, each being a set of transitions.

A task $T$ is enabled in a state $s\in S$ if $s$ has an outgoing transition that belongs to that task. Task $T$ is perpetually enabled on a path $\pi$ if it is enabled in every state of $\pi$. Task $T$ occurs in $\pi$ if $\pi$ contains a transition that belongs to $T$.

A finite prefix of a path $\pi$ is the part of $\pi$ from its beginning until a given state. A suffix of a path $\pi$ is the path that remains after you remove a finite prefix of it. Note that if $\pi$ is infinite, then all its suffixes are infinite as well.

A path $\pi$ is called weakly fair if, for every suffix $\pi'$ of $\pi$, each task that is perpetually enabled on $\pi'$, occurs in $\pi'$.

Equivalently: A path $\pi$ is weakly fair if each task that from some state onwards is perpetually enabled on $\pi$, occurs infinitely often in $\pi$.

Weak fairness is a completeness criterion: the weakly fair paths are the complete ones.
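
For ultimately periodic ("lasso") paths, which suffice in finite-state model checking, this can be checked mechanically. A sketch, assuming the repeated part of the path is given as a list of (source, action, target) transitions and each task is a set of such transitions (my own representation):

```python
# Weak fairness on the cycle of a lasso path: a task that is enabled in
# every state of the cycle (hence perpetually enabled from some point on)
# must occur in the cycle (hence occur infinitely often).

def weakly_fair(cycle, tasks):
    cycle_states = [t[0] for t in cycle]
    for task in tasks.values():
        enabled_everywhere = all(
            any(p == s for (p, a, q) in task) for s in cycle_states)
        occurs = any(t in task for t in cycle)
        if enabled_everywhere and not occurs:
            return False
    return True
```

On the one-state graph with an $a$-loop and a $b$-loop, where each loop forms its own task, the path looping on $b$ only is not weakly fair.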

### Strong fairness

A task $T$ is relentlessly enabled on a path $\pi$ if each suffix of $\pi$ contains a state in which it is enabled. This is the case if the task is enabled in infinitely many states of $\pi$, in a state that occurs infinitely often in $\pi$, or in the last state of a finite $\pi$.

A path $\pi$ is called strongly fair if, for every suffix $\pi'$ of $\pi$, each task that is relentlessly enabled on $\pi'$, occurs in $\pi'$.

Equivalently: A path $\pi$ is strongly fair if each task that is relentlessly enabled on $\pi$ occurs infinitely often in $\pi$.

Also strong fairness is a completeness criterion: the strongly fair paths are the complete ones. Note that a task that is perpetually enabled is certainly relentlessly enabled. Hence a strongly fair path is surely also weakly fair. Thus strong fairness is a stronger completeness criterion than weak fairness.
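
On a lasso path, relentless enablement amounts to enablement in some state of the cycle, which gives the following check. A sketch, with the cycle given as a list of (source, action, target) transitions and tasks as sets of such transitions (my own representation):

```python
# Strong fairness on the cycle of a lasso path: a task enabled in some
# state of the cycle (hence relentlessly enabled) must occur in the cycle.

def strongly_fair(cycle, tasks):
    cycle_states = [t[0] for t in cycle]
    for task in tasks.values():
        enabled_somewhere = any(
            any(p == s for (p, a, q) in task) for s in cycle_states)
        occurs = any(t in task for t in cycle)
        if enabled_somewhere and not occurs:
            return False
    return True
```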

As an example, take the alternating bit protocol (ABP) encountered in Lecture 12. A property that this protocol should satisfy is that each message $d\in D$ that is offered by the user of the protocol on channel $A$ will eventually be delivered to the user at the other end via channel $C$. In LTL, implicitly applying the translation from process graphs to Kripke structures discussed in Lecture 10,

$\textbf{G}(r_A(d) \Rightarrow \textbf{F} s_C(d))$.

Let us consider the successful transmission of a message over the unreliable channel to be a task. If we look at the process graph of the ABP, there are infinite paths that get stuck in loops, and in which (from some point onwards) the successful sending of a message does not occur. Looking at the process graph, we see that our task is relentlessly enabled on such a path, but not perpetually enabled. Hence such a path is weakly fair but not strongly fair. It follows that the ABP is correct when we make a strong fairness assumption, but not when we only make a weak fairness assumption.

$\textrm{ABP} \models^{\it SF} \textbf{G}(r_A(d) \Rightarrow \textbf{F} s_C(d))$.
$\textrm{ABP} \not\models^{\it WF} \textbf{G}(r_A(d) \Rightarrow \textbf{F} s_C(d))$.

Strong fairness can be expressed in LTL as $${\bf G}({\bf GF}(\textit{enabled}(T)) \Rightarrow {\bf F} (\textit{occurs}(T)))$$ The above formula can equivalently be written as $${\bf G}(\neg {\bf G}({\bf F}(\textit{enabled}(T)) \wedge \neg \textit{occurs}(T)))$$ or $${\bf GF}(\textit{enabled}(T)) \Rightarrow {\bf GF}(\textit{occurs}(T)).$$

### Global versus local fairness

A local fairness assumption is tailored to a specific application, where certain choices are supposed to be fair, but others are not. Take as example the alternating bit protocol. The choice between a successful and an unsuccessful transmission of a message over the unreliable channel can be regarded as fair. This means that we count on it not failing all the time. At the same time, there is a choice between the environment sending us a message to transmit that is shorter than 12 characters, or one that is at least 12 characters long. This choice is not regarded as fair, as we have no clue what drives the environment to supply us with certain messages.

To model local fairness assumptions, we need to explicitly declare our tasks. This means that we move from process graphs or Kripke structures to the extension of these structures with the tasks as an extra component.

A global fairness assumption, on the other hand, is independent of the system under investigation, and instead considers all choices of a given type to be fair. Here we work with models of concurrency that are equipped with some structure, and when translating those models to process graphs or Kripke structures, the tasks are somehow extracted from the structure of the original model.

An example is fairness of actions. Here the underlying model of concurrency consists of process graphs, where the transitions are labelled with actions. Now there is a task for each action, and a transition belongs to that task if it has the corresponding label.

Another example is fairness of transitions. Here each transition constitutes a singleton task.

A third example is fairness of instructions. Here the underlying model is a programming language, where each program is translated to a process graph or Kripke structure. Now there is a task for each line of code (instruction) in the source program, and a transition in the process graph belongs to that task iff it stems from that instruction.
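
For the first two of these global assumptions, the task extraction is easy to sketch; the triple format for transitions is my own convention.

```python
# Extracting task structures for two global fairness assumptions.

def tasks_by_action(transitions):
    """Fairness of actions: one task per action label."""
    tasks = {}
    for (p, a, q) in transitions:
        tasks.setdefault(a, set()).add((p, a, q))
    return tasks

def tasks_by_transition(transitions):
    """Fairness of transitions: each transition is a singleton task."""
    return {t: {t} for t in transitions}
```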

There are more such methods, but I will skip them here.

### Progress

The assumption of progress, for closed systems, says that we will not stay forever in a state with an outgoing transition. Seen as a completeness criterion, it doesn't restrict infinite paths, but declares a finite path to be complete only if it ends in a state without outgoing transitions.

When dealing with weak or strong fairness, it is common to put each transition in at least one task. One way to accomplish this is to create one special task that contains all transitions. In this situation, a weakly (or strongly) fair path is certainly progressing. Namely, if a path $\pi$ is not progressing (fails the progress condition), then it ends in a state with an outgoing transition. Hence a task $T$ containing that transition is perpetually enabled on the suffix $\pi'$ of $\pi$ consisting of its last state only. Now weak fairness prescribes that task $T$ occurs in $\pi'$, which it clearly doesn't. So $\pi$ is not weakly fair.

This argument shows that weak (as well as strong) fairness is a stronger completeness criterion than progress.

A progress assumption is built into the logics CTL and LTL (for they work with infinite paths rather than partial paths).

### Justness

Justness is a local progress assumption. Consider the system X||Y, where X is given by the recursive specification X=a.X, and Y by Y=b.Y. This system consists of two parallel components, one of which performs infinitely many $a$s, and the other infinitely many $b$s. The process graph of this system has one state, with an $a$-loop and a $b$-loop.

Now if we consider $X$ in its own right, the assumption of progress guarantees that $X$ will eventually do an $a$-action. It cannot stay in its initial state forever. However, in the context of the parallel process $Y$, the composition $X||Y$ is not guaranteed to ever do an $a$, at least when only assuming progress. For the infinite path that consists of $b$-transitions only does satisfy the progress condition.

Now justness allows us to apply progress locally, and conclude that also $X||Y$ must eventually do an $a$-action, at least when assuming that $X$ and $Y$ are parallel components that do not interact with each other in any way.

Formally, justness can be formulated in terms of resources needed by transitions. These resources could be the components in a parallel composition that give rise to the transition. The process $a|\bar a$ for instance has two components, R(ight) and L(eft). The $a$-transition needs one resource, namely L. Likewise, the $\bar a$-transition only needs R. However, the $\tau$-transition needs two resources, namely L and R. In the setting of Petri nets, the resources needed by a transition are the tokens that need to be in its preplaces.

Now a path is called just if for each transition that is at some point enabled (meaning it is an outgoing transition of a state in the path), one of its resources is actually used for a transition that occurs in the path (past that state).

The rough idea is that if all resources for a transition are present, and none is ever used for anything else, then the transition ought to occur.
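
On a lasso path this idea can be checked as follows. A sketch only: the assignment of resource sets to transitions is given as an explicit mapping, which is my own convention, and "enabled at some point" is approximated by enabledness on the cycle.

```python
# A justness check on the cycle of a lasso path: every graph transition
# enabled in a state of the cycle must have a resource that is used by
# some transition occurring in the cycle.

def just(cycle, graph_transitions, resources):
    used = set().union(*(resources[t] for t in cycle))
    for t in graph_transitions:
        enabled = any(t[0] == c[0] for c in cycle)   # enabled on the cycle?
        if enabled and not (resources[t] & used):
            return False           # its resources are never used: unjust
    return True
```

For X||Y with an $a$-loop (resource X) and a $b$-loop (resource Y) on a single state, the path looping on $b$ only is not just.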

### Example

An illustrative example appears in Figure 4 of this paper (page 13). That net displays a shop with two clerks I and II and two returning customers A and B. There are 4 interactions between the clerks and the customers: I serves A, II serves A, I serves B, II serves B. In this net there are many traces that describe possible complete runs of the system. The stronger our fairness assumption, the fewer runs are admissible. When assuming nothing, all partial traces model runs. Assuming progress rules out all finite traces; now only infinite traces model real completed runs. When assuming justness, a run that only contains "I serves A" is ruled out, because the preconditions of "II serves B" are always met, and these tokens are never used for anything else. However, a run that alternates between "I serves A" and "II serves A" is still fine. Such a run is ruled out by weak fairness, though, as in each state "I serves B" is enabled, yet customer B is never served. Weak fairness does allow the run that alternates between "I serves A" and "I serves B", because "II serves A" is not enabled in every state (due to customer A leaving repeatedly). Yet, this run is ruled out by strong fairness.

### Full fairness

A very strong fairness assumption is already incorporated in the notions of (rooted) branching (or weak) bisimulation equivalence seen before. We saw above that it takes strong fairness to make the alternating bit protocol correct. Yet, we succeeded in proving it correct by means of branching bisimilarity. The reasoning goes as follows: The specification is obviously correct, as it satisfies the requirement
$\textbf{G}(r_A(d) \Rightarrow \textbf{F} s_C(d))$.

To this end we only have to assume progress. Then we show that the implementation is branching bisimilar to the specification, and hence also satisfies this requirement. This method allows us to implicitly assume strong fairness. In fact we can even prove systems correct in cases where strong fairness is not strong enough. The form of fairness obtained in this way we call full fairness. Full fairness cannot be seen as a completeness criterion, because there are no specific paths that are ruled out. Yet it is a method to prove liveness properties, and on that basis it can be compared with strong fairness, and turns out to be strictly stronger.

In CTL, a useful liveness property often has the form

$\textbf{AG}(\varphi \Rightarrow \textbf{AF} \psi)$.

It is called an AGAF property, and says that whenever some precondition $\varphi$ is met, then eventually we reach a state where $\psi$ holds. By means of full fairness, we can obtain such a result, as soon as we have the weaker property

$\textbf{AG}(\varphi \Rightarrow \textbf{EF} \psi)$.

This AGEF property says that whenever some precondition $\varphi$ is met, it will always remain possible to reach a state where $\psi$ holds.

The notions of (rooted) branching (or weak) divergence-preserving bisimulation equivalence do not make fairness or justness assumptions.

### Opinion

My personal opinion is that it is often reasonable to assume (global) progress. In the absence of that, no non-trivial liveness property can be proven. When assuming progress I find it reasonable to also assume justness (globally). However, global (weak or strong) fairness assumptions should not be our default; they should be used only if for some reason we are convinced that such an assumption is warranted, or because we can only obtain a meaningful liveness property conditionally on making such an assumption.

## Course evaluation

Please fill in the course and instructor evaluation forms.

## Lecture 15 Tuesday 28-7-2020: Reactive temporal logic

I present "reactive CTL/LTL". In Lecture 10 we saw how to interpret CTL and LTL on process graphs (or states in labelled transition systems) with deadlocks. We discussed 4 methods to deal with deadlock. In the context of CTL or LTL minus the next-state operator, the 3rd or 4th method was the best. But when using full CTL or LTL, with the next-state operator, the 2nd method is just as good, and this is what I use today. In a deadlock state in a Kripke structure (a state without outgoing transitions), the CTL formula $\neg$EXtrue holds, and in a state with an outgoing transition it does not hold. Hence this formula can be used to describe deadlock.

We use the De Nicola-Vaandrager translation from process graphs to Kripke structures. Formally, a process graph is a triple (S,I,→), with S a set (of states), I∈S an initial state, and a set of triples (s,a,t) with s,t ∈S and a an action, drawn from a set Act. The associated Kripke structure is $(S',I,\rightarrow',\models)$, where $S' := S \cup\{(s,a,t)\in{\rightarrow}\mid a\neq \tau\}$, and ${\rightarrow'} := \{(s,(s,a,t)),((s,a,t),t) \mid (s,a,t)\in S'\} \cup \{(s,t)\mid (s,\tau,t) \in {\rightarrow}\}$. This Kripke structure has the same states and transitions as the process graph, except that a new state $(s,a,t)$ is added halfway along a transition from $s$ to $t$ labelled $a\neq\tau$. As atomic propositions we take the set Act of actions, except for $\tau$, and declare that a state satisfies the atomic proposition $a\in Act$ iff that state has the form $(s,a,t)$. So $r \models a$ iff $r=(s,a,t)$.
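
This translation is easy to sketch in code, assuming the (s,a,t) triple representation of transitions; returning the labelling as a dictionary is my own convention.

```python
# A sketch of the De Nicola-Vaandrager translation defined above.

def dnv_translate(states, initial, transitions, tau='tau'):
    midway = {t for t in transitions if t[1] != tau}   # new halfway states
    k_states = set(states) | midway
    k_trans, labels = set(), {}
    for (s, a, t) in transitions:
        if a == tau:
            k_trans.add((s, t))              # tau-steps remain direct
        else:
            k_trans.add((s, (s, a, t)))      # into the halfway state
            k_trans.add(((s, a, t), t))      # and out of it
            labels[(s, a, t)] = {a}          # it satisfies proposition a
    return k_states, initial, k_trans, labels
```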

Now we consider all possible subsets $B$ of $Act\setminus\{\tau\}$. $B$ contains the actions that may be blocked by the environment. We define a $B$-complete path of a process graph as a path that is either infinite or ends in a state of which all outgoing transitions have a label from $B$. A $B$-complete path in a Kripke structure that arises as the De Nicola-Vaandrager translation of a process graph is a path that ends in a state, all of whose successor states are labelled with atomic propositions from $B$. Thus, the De Nicola-Vaandrager translation maps $B$-complete paths in process graphs to $B$-complete paths in the associated Kripke structures. An LTL formula $\phi$ is said to hold for the Kripke structure $K$ when the actions from $B$ might be blocking, notation $K \models_B \phi$, iff each $B$-complete path from the initial state of $K$ satisfies $\phi$. Likewise, $\models_B$ can be defined for CTL by simply replacing all references to infinite paths by $B$-complete paths.

As an example, consider a vending machine that takes in a coin $c$ and produces a pretzel $p$. We assume that taking in the coin requires cooperation from the environment, but producing the pretzel does not. A process algebraic specification is $$VM=c.p.VM$$ In old-fashioned LTL we have $VM \models G(c\Rightarrow Fp)$. This formula says that whenever a coin is inserted, eventually a pretzel is produced. This formula is intuitively true indeed. But we also have $VM \models G(p\Rightarrow Fc)$. This formula says that whenever a pretzel is produced, eventually a new coin will be inserted. This formula is intuitively false. This example shows that old-fashioned LTL is not suitable to describe the behaviour of this vending machine. Instead we use reactive LTL, and take $B=\{c\}$. This choice of $B$ says that the environment may block the action $c$, namely by not inserting a coin; however, the environment may not block $p$. (Alternatively, one can interpret $B$ as an agreement with the user/environment, namely that the environment will never block actions outside of $B$.) As intuitively expected, we have $VM \models_B G(c\Rightarrow Fp)$ but $VM \not\models_B G(p\Rightarrow Fc)$.
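
The role of $B$ here can be made concrete by checking which finite paths count as $B$-complete. A sketch, using a two-state encoding of VM that is my own: state 0 awaits a coin, state 1 owes a pretzel.

```python
# A finite path is B-complete iff all outgoing transitions of its last
# state carry labels from B (the actions the environment may block).

def b_complete(path_states, transitions, B):
    last = path_states[-1]
    return all(a in B for (s, a, t) in transitions if s == last)

VM = [(0, 'c', 1), (1, 'p', 0)]    # my own encoding of VM = c.p.VM
```

With $B=\{c\}$, a path ending in state 0 is complete (the machine may wait for a coin forever), but a path ending in state 1 is not, since the pretzel cannot be blocked. A complete path may thus end right after producing a pretzel with no further coin, which is why $G(p\Rightarrow Fc)$ fails under $\models_B$, while $G(c\Rightarrow Fp)$ holds.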

Note that the old-fashioned LTL/CTL interpretation $\models$ is simply $\models_\emptyset$, obtained by taking the empty set of blocking actions.

There are three ways to incorporate the above idea.

• One is to pick a set $B$ that is fixed for a given application. In case of the vending machine above, we take $B=\{c\}$. Then we can leave out the subscript $B$ from $\models$.
• Alternatively, we use a specification of our system that is neutral on which actions are blocking, and instead use subscripts $B$ with $\models$. A formula $P \models_B \phi$ then simply says that "under the assumption that actions from $B$ may block, process $P$ satisfies formula $\phi$". Thus, information about blocking is taken aboard only when writing down the validity of temporal formulae.
• The third option is to stick to old-fashioned CTL or LTL, and instead add a $\tau$-loop to each state that intuitively has only blocking outgoing actions. This approach says that we assume that any system will make progress, unless we explicitly indicate that it can stay in a state by means of a $\tau$-loop. In this approach our vending machine would have to be adapted to $VM=\tau.VM + c.p.VM$.
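The third option amounts to a simple graph transformation. The sketch below is my own illustration (same assumed dict representation of process graphs as before): every state whose outgoing actions all lie in $B$ receives a $\tau$-loop, after which old-fashioned LTL/CTL over infinite paths can be applied unchanged.

```python
def add_tau_loops(graph, blocked):
    """Add a tau-loop to each state whose outgoing actions all lie in the
    blocking set B (a deadlock state vacuously qualifies), so that
    old-fashioned LTL/CTL over infinite paths can be used as-is."""
    return {s: ts + ([("tau", s)] if all(a in blocked for a, _ in ts) else [])
            for s, ts in graph.items()}

# The vending machine VM = c.p.VM with B = {c}: state s0 (awaiting a coin)
# gets a tau-loop; state s1 (about to produce a pretzel) does not.
VM = {"s0": [("c", "s1")], "s1": [("p", "s0")]}
```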

### Progress, Justness and Fairness for reactive systems

I now generalise the work of Lecture 14 from open systems to reactive systems. In making statements on properties holding for processes, I now use the notation $P \models^C_B \varphi$. Here the statement that process $P$ satisfies property $\varphi$ is qualified in two ways. The subscript $B$ is the set of actions that may be blocked by the environment, and $C$ is the completeness criterion in force when making this statement. Together, $B$ and $C$ determine the set of complete paths used when interpreting that statement:
1. If $C = \emptyset$ then each path is complete.
2. Progress ($C:=\it Pr$) says that a path is complete iff it is infinite or ends in a state where all outgoing transitions have labels from $B$.
3. Justness ($C := J$) says that a path is complete iff for each non-blocking transition that is at some point enabled (meaning it is an outgoing transition of a state in the path), one of its resources is actually used for a transition that occurs in the path past that state.
4. The definitions of weak fairness ($C := \it WF$) and strong fairness ($C := \it SF$) are unchanged, except that a task $T$ counts as enabled in a state $s$ iff a non-blocking transition from $T$ starts at state $s$.

### Liveness properties under the assumption of justness are not preserved under strong bisimilarity

When assuming justness but not weak fairness, there are (strongly) bisimilar systems with different liveness properties. Please read Section 5 of this paper. You might also want to read Sections 2, 3 and 4 of that paper.

Thus dealing with liveness properties under justness assumptions (without assuming weak fairness) calls for a semantic equivalence that distinguishes processes that are interleaving bisimilar. In fact, this opens a new dimension in our lattice of equivalences.

## Lecture 16 Monday 3-8-2020: Mutual exclusion

I explain that no Fair Scheduler can be modelled in CCS/CSP/ACP or Petri nets, at least not without making a weak or strong fairness assumption. This is an example showing that not everything can be done in those formalisms. A fair scheduler is a mechanism for handling requests from two users, such that:
• a user intending to make a new request will succeed in doing so,
• each request made will eventually be granted,
• requests not made will not be granted,
• and between the granting of any two requests, a separating action e occurs.
Since any mutual exclusion algorithm can be used to create a fair scheduler, it follows that these algorithms, too, cannot be correctly modelled in CCS/CSP/ACP or Petri nets, at least not without assuming fairness.

I review a few extensions of CCS-like languages in which it is possible to correctly model fair schedulers and mutual exclusion protocols. These come in two types:

1. formalisms that allow us to express that a write action to a memory cell will never have to compete with, or wait on, a read action to the same memory cell.
2. formalisms that distinguish between quick (or instantaneous) actions and time-consuming activities.
In the first class one finds CCS extended with signals, or, equivalently, Petri nets extended with read arcs. Another extension in this class is CCS extended with a priority mechanism.

In the second class one finds CCS, or labelled transition systems, extended with timeout actions.

## Lecture 17 Tuesday 4-8-2020: Value-passing process algebras

Today I treat value-passing process algebras. These are variants of standard process algebras like CCS, CSP or ACP, with the following modifications.
• Behind the process algebra sits a well-chosen data structure. It consists of data types, like the integers, queues of integers, Booleans, or whatever we need in a given application. Within each data type we provide constants. For instance T and F are two constants of type Boolean, and 0 and 1 are constants of type Integer. The empty queue [] is a constant of type queue. On each data type there are also operators or functions, each with a given arity. For instance, negation is a unary function on the type of the Booleans, whereas conjunction is a binary function on the Booleans. Successor is a unary function on the integers, whereas addition and multiplication are binary functions on the integers. In addition, there can be functions whose arguments belong to different data types and whose output is of yet another type. For instance, "is-empty" could be a function whose single argument is a queue (of integers) and whose output is a Boolean. Append is a binary function whose arguments are an integer and a queue of integers, and whose output is a queue of integers.

Now a data expression is made from these constants and from variables ranging over a particular data type, by applying the functions or operators specified by the data structure. For instance append(8+n,q) is a data expression of type queue of integers. It is built from the constant 8 of type integer, the variable n of type integer, the binary operator + on the integers, the variable q of type queue of integers, and the operator append mentioned above.

• Instead of atomic actions like a and b, we use actions that model the transmission of data over certain channels. In value-passing ACP there may be actions $s_c(d)$ indicating the sending of data $d$ over the channel $c$. In value-passing CCS, the action $\bar{c}(d)$ denotes the sending of data $d$ on the channel $c$. Here $c$ is taken from a set of channel names, where each channel comes with a certain data type, and $d$ is a data expression of that type. The action $c(x)$ represents the receipt of some data on the channel $c$. Here $x$ is a variable of the data type required by $c$. The process $(\bar c(5).0 \mid c(x).\bar b(x+2).0)\backslash c$ in value-passing CCS, for example, first communicates the value 5 over channel $c$: the left component sends 5 and the right-hand component receives it, binding it to the variable $x$. This communication takes the shape of a $\tau$-transition, abstracting from the transmitted value. Then the right-hand component sends the value 5+2=7 on channel $b$.

• Instead of recursive constants X that come with defining equations like X = a.X, we use parametrized recursive constants like X(n,q) where the parameters are variables ranging over certain data types. These parameters may be used in the body of the recursive equations. Such an equation could be $X(n,q) = c(x).X(x,append(n,q))$ for example. Here the process $X$ stores a number and a queue in the variables $n$ and $q$. It can read a number $x$ on channel $c$ and change its state by appending $n$ to its queue and storing $x$ instead of $n$.
• In addition to the usual process algebraic operators there also are guards: expressions like $[x>5].P$. In case the variable $x$ evaluates to a number larger than 5, this guard evaluates to true and the process will behave like $P$. In case the guard evaluates to false the process behaves like the CCS constant 0. Now one can model an "if $x>5$ then $P$ else $Q$" statement like $[x>5].P + [x \leq 5].Q$.
• Occasionally, one finds mixtures between process algebras and imperative programming languages, that also feature assignment. The process $X(n) = \bar{c}(n).[[ n:= n+1 ]] X(n)$, for instance, transmits a constantly increasing sequence of numbers on channel $\bar c$.
I sketch three styles of semantics for such process algebras, which, for lack of universally agreed names, I call types I, II and III. Let's take value-passing CCS, with the syntax
1. 0     (inaction)
2. $\bar{a}(\vec{exp}).P$     (send prefix) for each action/channel $a\in A$
3. $a(\vec{x}).P$     (read prefix) for each action/channel $a\in A$
4. P+Q     (summation, choice or alternative composition)
5. P|Q     (parallel composition)
6. P\a     (restriction) for each action/channel $a\in A$
7. P[f]     (relabelling) for each function $f:A \rightarrow A$.
8. $X(\vec{exp})$     (recursive call) for each declared recursion constant $X$
9. $[\varphi]P$     (guard)
10. $[[ x:= exp ]]P$     (assignment).
In (10) $exp$ is a data expression of the same type as the variable $x$. In (2) and (8), $\vec{exp}$ is a vector (or comma-separated list) of data expressions. With each action $a\in A$ and with each recursion constant $X$ is associated a vector of types, and the types of the expressions $\vec{exp}$ that are given as arguments to the action $\bar a$, or the recursion constant $X$, should match the types prescribed by that action or constant. In (3) $\vec{x}$ is a vector of variables of the vector of types prescribed by $a$. Note that each action can have 0, 1 or more arguments. In (9), $\varphi$ is a formula about the data maintained by a process. The occurrences of the variables $\vec{x}$ in $P$ are considered bound in the process expression $a(\vec{x}).P$. The read prefix acts like a binder, similar to the quantifiers $\exists$ and $\forall$ in predicate logic.

### I. The late semantics

In the structural operational semantics of type I, the rules for send and read prefixing are the same as for action prefixing in standard CCS: $$\bar{a}(\vec{exp}).P \stackrel{\bar{a}(\vec{exp})}{\longrightarrow} P \qquad \mbox{and} \qquad a(\vec{x}).P \stackrel{a(\vec{x})}{\longrightarrow} P$$ whereas the rule for communication now is $$\frac{P \stackrel{\bar c(\vec{exp})}\longrightarrow P' \qquad Q \stackrel{c(\vec x)}\longrightarrow Q'}{P | Q \stackrel{\tau}\longrightarrow P'|Q'\{\vec{exp}/\vec{x}\}}$$ Here $\vec{exp}$ is a list of data expressions of the data types specified by the channel c and $Q'\{\vec{exp}/\vec{x}\}$ denotes the process $Q'$ with the expressions $\vec{exp}$ substituted for all free occurrences of the variables $\vec x$.

The other two transition rules for parallel composition, and the rules for $+$, are exactly the same as for standard CCS. So are the rules for 0, restriction and renaming, noting that the data carried around by actions is not affected by restriction and renaming operators. The rule for recursion is $$\frac{E \{\vec{exp}/\vec{x}\} \stackrel{\alpha}{\longrightarrow} Q}{X(\vec{exp}) \stackrel{\alpha}{\longrightarrow} Q}$$ when $X(\vec{x}) = E$ is the defining equation for the recursion constant $X$. The semantics for guards is more difficult, and skipped here, and we do not use assignment in this setting.

### II. The environment semantics

In the semantics of type II, a state is given by a pair $(\xi,P)$ of a valuation $\xi$ and a process expression $P$. $P$ might contain free variables $x$, and $\xi$ is a partial function that associates values (of the corresponding data type) with those variables. The valuation $\xi$ lifts to expressions. For instance, if $exp$ is the arithmetical expression $n+2$ and $\xi$ is a valuation with $\xi(n)=5$ then $\xi(exp)=7$. It also lifts to formulas. If $\varphi$ is the formula $x>5$ and $\xi(x)=7$ then $\xi(\varphi)=\bf true$.
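The lifting of a valuation from variables to expressions and formulas can be sketched as follows. The representation of expressions as nested tuples is my own choice for this illustration, not part of the course material.

```python
def evaluate(exp, xi):
    """Lift the valuation xi from variables to whole data expressions and
    formulas; expressions are nested tuples like ("+", ("var","n"), ("const",2))."""
    tag = exp[0]
    if tag == "const":
        return exp[1]
    if tag == "var":
        return xi[exp[1]]              # look the variable up in the valuation
    if tag == "+":
        return evaluate(exp[1], xi) + evaluate(exp[2], xi)
    if tag == ">":
        return evaluate(exp[1], xi) > evaluate(exp[2], xi)
    raise ValueError(f"unknown operator {tag}")

# The examples above: with xi(n)=5 we get xi(n+2)=7, and with xi(x)=7
# the formula x>5 evaluates to true.
xi = {"n": 5, "x": 7}
```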

The rule for the send prefix now is $$(\xi, \bar{a}(\vec{exp}).P) \stackrel{\bar{a}(\xi(\vec{exp}))}{\longrightarrow} (\xi,P).$$ The difference with the previous semantics is that the vector of expressions $\vec{exp}$ is now converted through the valuation function $\xi$ into a vector of values. Thus the arguments of actions in transition labels have a semantic rather than a syntactic nature.

The rule for the read prefix becomes $$(\xi, a(\vec{x}).P ) \stackrel{a(\vec{v})}{\longrightarrow} (\xi[\vec{x}:=\vec{v}], P).$$ Here $\vec v$ may be any vector of values of the types of the variables $\vec x$. Thus, if we have $a(n).P$, with $n$ a variable of type integer, then there are infinitely many outgoing transitions, namely one for each integer that can be received on channel $a$. The valuation $\xi[\vec{x}:=\vec{v}]$ is just $\xi$, but updated to now bind the values $\vec v$ to the variables $\vec x$.
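The update $\xi[\vec{x}:=\vec{v}]$ admits a one-line sketch, assuming (as in the previous illustration, and not part of the course material) that valuations are plain dicts:

```python
def update(xi, variables, values):
    """xi[x:=v]: a copy of the valuation xi in which the given values are
    bound to the given variables; the original valuation is left untouched."""
    zeta = dict(xi)
    zeta.update(zip(variables, values))
    return zeta
```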

In this approach the effort of passing values from a sending to a receiving process is shifted from the communication rule for parallel composition (see above) to the read and send prefixes. The rule for communication is simply $$\frac{(\xi,P) \stackrel{\bar c(\vec{v})}\longrightarrow (\zeta,P') \qquad (\xi,Q) \stackrel{c(\vec v)}\longrightarrow (\zeta,Q')}{(\xi,P | Q) \stackrel{\tau}\longrightarrow (\zeta,P'|Q')}$$ similar to standard CCS. A problem in applying this rule is that at the sending end we normally have a transition $(\xi,P) \stackrel{\bar c(\vec v)}\longrightarrow (\xi,P')$ rather than $(\xi,P) \stackrel{\bar c(\vec v)}\longrightarrow (\zeta,P')$. This problem can be solved by postulating that $(\xi,P')=(\zeta,P')$ when $\xi(x)=\zeta(x)$ for all data variables $x$ that actually occur in $P'$. By renaming the bound variables $\vec{x}$ within expressions $c(\vec{x}).Q$ one can arrange for these variables not to occur within the parallel process $P$, and this way the above agreement between $\xi$ and $\zeta$ can be reached.

The other two transition rules for parallel composition, and the rules for $+$, are again exactly the same as for standard CCS, and so are the rules for 0, restriction and renaming. The rule for recursion is $$\frac{(\zeta,E) \stackrel{\alpha}{\longrightarrow} (\chi,Q)}{ (\xi,X(\vec{exp})) \stackrel{\alpha}{\longrightarrow} (\chi,Q)}$$ when $X(\vec{x}) = E$ is the defining equation for the recursion constant $X$. Here $\zeta$ is a valuation defined on the variables $\vec x$, such that $\zeta(\vec{x}) := \xi(\vec{exp})$.

The rule for guards is $$\frac{\xi(\varphi)={\bf true} \qquad (\xi,P) \stackrel{\alpha}{\longrightarrow} (\zeta,P')} { (\xi, [\varphi]P) \stackrel{\alpha}{\longrightarrow} (\zeta,P')}$$ and the rule for assignment is $$(\xi,[[ x:= exp ]]P) \stackrel{\tau}{\longrightarrow} (\xi[ x:= \xi(exp)], P).$$

### III. The substitution semantics

This is by far the most commonly used semantics for process algebra, although type II is more common for imperative programming languages. Type III is somewhat similar to type II, except that the valuation $\xi$ is suppressed. This is made possible by requiring that the process expressions $P$ may not contain free occurrences of data variables.

The rule for send prefixing is $$\bar{a}(\vec{exp}).P \stackrel{\bar{a}([[\vec{exp}]])}{\longrightarrow} P$$ where the operator $[[ ... ]]$ converts a closed data expression (or formula) into the data value it denotes; for instance $[[2+5]] = 7$ (or $[[7 > 5]] = \bf true$). The rule for read prefixing is $$a(\vec{x}).P \stackrel{a(\vec{v})}{\longrightarrow} P\{\vec{v}/\vec{x}\}.$$ As above, $\vec v$ may be any vector of values of the types of the variables $\vec x$. The process expression $P\{\vec{v}/\vec{x}\}$ is the result of substituting (expressions denoting) the values $\vec v$ for the variables $\vec x$ in $P$. Note that since the variables $\vec x$ are bound in $a(\vec{x}).P$, this is a process without free variables, that is, a closed process expression. However, the expression $P$ may contain free occurrences of $\vec x$, and therefore does not count as a closed process expression. It is turned into a closed process expression by means of the substitution $\{\vec{v}/\vec{x}\}$. The rule for communication now is $$\frac{P \stackrel{\bar c(\vec{v})}\longrightarrow P' \qquad Q \stackrel{c(\vec v)}\longrightarrow Q'}{P | Q \stackrel{\tau}\longrightarrow P'|Q'}$$ and also the other two transition rules for parallel composition, and the rules for $+$, are exactly the same as for standard CCS. So are the rules for 0, restriction and renaming. The rule for recursion is $$\frac{E \{\vec{exp}/\vec{x}\} \stackrel{\alpha}{\longrightarrow} Q}{X(\vec{exp}) \stackrel{\alpha}{\longrightarrow} Q}$$ when $X(\vec{x}) = E$ is the defining equation for the recursion constant $X$, just as in the late semantics. The rule for guards is $$\frac{[[\varphi]]={\bf true} \qquad P \stackrel{\alpha}{\longrightarrow} P'} { [\varphi]P \stackrel{\alpha}{\longrightarrow} P'}$$ and we do not use assignment in this setting.
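The substitution $P\{\vec{v}/\vec{x}\}$ can be illustrated on a toy process AST. The representation below is my own, and as a simplification arithmetic is evaluated as soon as an expression becomes closed; the course's substitution is purely syntactic.

```python
def subst_exp(exp, s):
    """Substitute values for variables in a toy data expression: a variable
    is a string, a value an int, and ("+", e1, e2) an addition. As a
    simplification, a sum is evaluated once both arguments are closed."""
    if isinstance(exp, str):
        return s.get(exp, exp)
    if isinstance(exp, tuple) and exp[0] == "+":
        left, right = subst_exp(exp[1], s), subst_exp(exp[2], s)
        if isinstance(left, int) and isinstance(right, int):
            return left + right
        return ("+", left, right)
    return exp

def substitute(proc, s):
    """P{v/x} on a toy AST with ("nil",), ("send", c, exp, P) and
    ("read", c, x, P); the read prefix binds x, so occurrences of x
    inside its scope are not substituted."""
    tag = proc[0]
    if tag == "nil":
        return proc
    if tag == "send":
        _, c, exp, p = proc
        return ("send", c, subst_exp(exp, s), substitute(p, s))
    if tag == "read":
        _, c, x, p = proc
        return ("read", c, x,
                substitute(p, {k: v for k, v in s.items() if k != x}))
    raise ValueError(tag)
```

In the earlier example, after reading 5 on channel $c$ the continuation $\bar b(x+2).0$ becomes $\bar b(7).0$; in this sketch that is `substitute(("send", "b", ("+", "x", 2), ("nil",)), {"x": 5})`.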

It is possible to translate a value-passing process algebra into a standard process algebra, provided it has infinite sums $\sum_{i \in I}P_i$. This way, all the value-passing machinery can be seen as syntactic sugar. The crucial step in this translation is that a read prefix $a(\vec{x}).P$ is translated as $\sum_{\vec{v}}a(\vec v).P\{\vec{v}/\vec{x}\}$, where the sum represents an infinite choice between all expressions obtained from $a(\vec{x}).P$ by substituting concrete values $\vec{v}$ for the variables $\vec{x}$.
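The crucial step of this translation can be illustrated over a finite value domain, so that the sum stays representable as a finite list of summands. This sketch uses textual substitution over process terms written as strings, which is purely illustrative (it is only safe here because the toy body contains no binders for the substituted variable).

```python
def translate_read(chan, var, body, domain):
    """Sketch of translating the read prefix a(x).P into the sum over all
    values v of a(v).P{v/x}, restricted to a finite domain. Substitution
    is textual and assumes `var` is not rebound inside `body`."""
    return [f"{chan}({v}).{body.replace(var, str(v))}" for v in domain]

# a(x).b(x+2).0 over the domain {0,1} unfolds into two summands.
summands = translate_read("a", "x", "b(x+2).0", [0, 1])
```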

## Concluding remark

This is the end of the course COMP6752, Modelling Concurrent Systems. The basics shown here have all kinds of extensions that make them more applicable, for instance with probabilistic choice, or with constructs capturing the passage of time. It was not possible to fit those into the present course. We also didn't cover many techniques for using the basics provided here to model and verify industrial-size systems. Various other courses on offer go further along those lines, and there is a wealth of literature. By all means contact me if you desire more pointers.
Rob van Glabbeek (rvg@cse.unsw.edu.au)