%% LyX 1.5.1 created this file. For more info, see http://www.lyx.org/. %% Do not edit unless you really know what you are doing. \documentclass[10pt,twocolumn,english]{article} \usepackage{ae,aecompl} \usepackage[T1]{fontenc} \usepackage[latin9]{inputenc} \usepackage{geometry} \geometry{verbose,letterpaper,tmargin=1cm,bmargin=2cm,lmargin=1cm,rmargin=1cm,headheight=1cm,headsep=1cm,footskip=1cm} \setcounter{secnumdepth}{4} \setlength{\parskip}{\medskipamount} \setlength{\parindent}{0pt} \usepackage{amsmath} \usepackage{graphicx} \usepackage{amssymb} \IfFileExists{url.sty}{\usepackage{url}} {\newcommand{\url}{\texttt}} \makeatletter %%%%%%%%%%%%%%%%%%%%%%%%%%%%%% LyX specific LaTeX commands. %% Because html converters don't know tabularnewline \providecommand{\tabularnewline}{\\} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%% User specified LaTeX commands. \usepackage{oz} \usepackage{babel} \makeatother \begin{document} \title{\textbf{\emph{Mondex}}\textbf{, an electronic purse :}\\ \textbf{specification and refinement checks}\\ \textbf{with the} \textbf{\emph{Alloy}} \textbf{model-finding method}} \author{\textbf{Tahina Ramananandro}% \thanks{École Normale Supérieure --- 45, rue d'Ulm --- 75005 Paris (France) --- \protect\url{Tahina.Ramananandro@ens.fr}% }} \maketitle \begin{abstract} This paper explains how the \emph{Alloy} model-finding method has been used to check the specification of an electronic purse (also called smart card) system, called the \emph{Mondex} case study, initially written in Z. After describing the payment protocol between two electronic purses, and presenting an overview of the Alloy model-finding method, this paper explains how technical issues about integers and conceptual issues about the object layout in Z have been tackled in Alloy, giving general methods that can be used in most case studies with Alloy. This work has also pointed out some significant bugs in the original Z specification such as reasoning bugs in the proofs, and proposes a way to solve them.\\ \\ \textbf{Keywords :} \emph{Alloy}, formal methods, model-finding, \emph{Mondex} electronic purse, refinement, security properties, software verification, specification. \end{abstract} \section{Introduction} \subsection{The \emph{Mondex} case study} In 1994, National Westminster Bank developed an electronic purse (or \emph{smart card}) system, called \emph{Mondex} \cite{bib:mondexwww,bib:mondexwwwcom}. An electronic purse is a card-sized device intended to replace {}``real'' coins with electronic cash. In contrast to a credit or debit card, an electronic purse stores its balance in itself, thus does not necessarily require any network access to update a remote database during a transaction. So, electronic purses can be used in small stores or shops, such as bakeries, where small amounts of money are involved. But everything regarding cash requires a critically high security level. So, in 1998, National Westminster Bank asked researchers to verify security properties about \emph{Mondex}: \begin{itemize} \item any value must be accounted; in particular, in case of a failed transaction, lost value must be logged (it is necessary, but the converse is not true); \item no money may suddenly appear on a purse without being debited from another purse through an achieved transaction. \end{itemize} In fact, the loss of money is a global property that cannot be considered at the local scale of one purse. This research led to a formal proof by hand of the \emph{Mondex} electronic purse system% \footnote{The whole system has been proved, except cryptographical issues% } with the Z specification language \cite{bib:z,bib:zref}. This proof has been published in 2000 by Susan Stepney, David Cooper and Jim Woodcock \cite{bib:specif}. It has critically helped the \emph{Mondex} system be granted ITSEC security level 6 out of 6. This proof consists in a specification relying on a refinement relation between two models: \begin{itemize} \item the abstract model, a very simple model with an atomic transaction, and each purse storing the amount of its balance and the amount it has lost; \item the concrete model, which corresponds to the actual implementation with a non-atomic transaction protocol based on message exchange through an insecure communications channel. \end{itemize} Several security issues are raised by the \emph{Concrete} protocol: \begin{itemize} \item a purse can be disconnected from the system too early; \item a message can be lost by the communications channel; \item a message can be replayed several times in the communications channel, but has to be read only at most once; \item a message can be read by any purse. \end{itemize} A \emph{Concrete} transaction follows a 5-step protocol: \begin{enumerate} \item The {}``from'' purse receives a initialization message. \item The {}``to'' purse receives a initialization message and sends a request message. \item The {}``from'' purse receives the request message, decreases its balance and sends the value message. \item The {}``to'' purse receives the value message, increases its balance and sends the acknowledgment message. It is done. \item The {}``from'' purse receives the acknowledgment message. It is done. \end{enumerate} If the transaction cannot go on for some reason (for instance if one of the two purses is disconnected too early), then a mechanism of \emph{abortion} is provided (that could occur after a timeout in the real world). Then, in abortion cases where money could be lost, aborting purses have to \emph{log} the transaction details into a private logging archive, so that if a transaction is actually lost, then it has necessarily been logged. Later purses may also copy the contents of their private log to a global archive. So, the system is nondeterministic, insofar as a purse can decide to abort instead of going on with the transaction. But in both cases, the specification assumes that, once purses are connected to the system, they behave correctly and follow the operation protocol. The specification also assumes that messages related to the protocol cannot be forged (they are {}``protected'', for instance cryptographically), they can only be replayed. However, other {}``foreign'' messages can be forged. The proof layout in the Z specification consists in showing that security properties hold for the \emph{Abstract}, then refining the \emph{Abstract} model by the \emph{Concrete}. But, as the \emph{Concrete} model is not constrained enough, refinement is made easier by making it two-step, through a \emph{Between} world which has the same structure as the \emph{Concrete} but is constrained. So: \begin{itemize} \item The \emph{Between} is abstracted by the \emph{Abstract} by computing the values stored by abstract purses corresponding to the \emph{Between}; however, for each purse, those computations may involve several purses because of the logs. This proof is a backwards refinement involving a prophecy variable, \emph{chosenLost}: among the set of transactions for which the {}``from'' purse has already decreased its balance but the {}``to'' purse has not increased its own one yet, no purse having aborted yet, some transactions are chosen in advance to be lost. \item The refinement of the \emph{Between} by the \emph{Concrete} is rather an invariant proof than a refinement proof. The proof layout is a forwards simulation. \end{itemize} % \begin{figure} \begin{centering} \includegraphics[width=1\columnwidth,keepaspectratio]{concrete-proto} \par\end{centering} \caption{\emph{Concrete 5-step protocol, with the statuses of the purses depending on the operations.}} \begin{centering} \emph{Once purses are connected to the system, they are assumed to follow the protocol.} \par\end{centering} \begin{centering} \emph{The central authority sending the} startFrom \emph{and} startTo \emph{messages could correspond to pressing a button to initialize the transaction. It is not modelled: those messages spontaneously appear in the} ether\emph{.} \par\end{centering} \begin{centering} \emph{The statuses} eaTo \emph{and} eaFrom \emph{may be interpreted as a single {}``idle'' status.} \par\end{centering} \end{figure} \subsection{The Alloy model-finding method} \emph{Alloy} \cite{bib:alloy,bib:alloybook}\emph{ }is a modeling method that includes both a modeling language based on first-order logic and relational calculus including transitive closures, and a tool, called \emph{Alloy Analyzer }\cite{bib:aa}% \footnote{\emph{The }Alloy Analyzer\emph{ is the analysis engine for Alloy 3.0, with which the Mondex case study has been tackled. The new version of Alloy, 4.0, is based on another analysis tool, }Kodkod\emph{ \cite{bib:kodkod,bib:kodkodwww}, which is a major improvement in translating specifications to SAT formulae.}% } and based on model-finding through SAT-solving \cite{bib:trans}, to analyze specifications in this language. The analysis consists in \emph{checking} a theorem: the specification is translated into a SAT formula so that an instance of this formula corresponds to a \emph{counterexample} to the theorem being checked. A \emph{model} of an Alloy specification is a set of \emph{atoms}, or objects, satisfying all the \emph{facts}, or axioms, in addition to the theorem being checked. The \emph{scope} of the model is the cardinality of its atoms. All models considered by Alloy analyses are \emph{finite}. % \begin{figure} \noindent \begin{centering} \includegraphics[scale=0.5]{modelfinding} \par\end{centering} \caption{The \emph{Alloy} model-finding method} \end{figure} The Alloy modeling language is based on \emph{relations}. A relation corresponds to a set of \emph{tuples}, a tuple is an ordered combination of atoms. In Alloy, every relation has a fixed \emph{arity}: in every relation, all the tuples have the same number of atoms. Alloy provides the user with a relational calculus close to set theory: + (union), - (difference), \textbf{\&} (intersection)... The cartesian product is denoted \textbf{->} . The most notable operator is the join operator denoted . : given two relations \emph{A} of arity \emph{a} and \emph{B} of arity \emph{b}, then \emph{A}.\emph{B} corresponds to the following set: \[ A.B=\left\{ \left(\vec{a},\vec{b}\right):\exists x,\left(\vec{a},x\right)\in A\wedge\left(x,\vec{b}\right)\in B\right\} \] Special operators are also provided: \textasciitilde{} (for binary relations only) denotes the reciprocal relation (with the tuples turned upside down); \emph{s}\textbf{~<:}\emph{~r} (resp. \emph{r}\textbf{~:>}~s) denotes the restriction of a relation \emph{r} where the first (resp. last) components of its tuples are in the signature \emph{s}. Then, a formula consists in: \begin{itemize} \item a multiplicity formula to denote whether the relation corresponds to a non-empty set (\textbf{some}),\textbf{ }an empty set (\textbf{no}), a singleton (\textbf{one}), a singleton or empty set (\textbf{lone}); \item an inclusion between two relations (\textbf{in}); \item a Boolean combination of formulae: \{...\} (and), \textbf{or}, \textbf{implies}, \textbf{not}...; \item a quantified formula: universal (\textbf{all}), existential (\textbf{some}), existential with unicity condition (\textbf{one}), unicity {}``if it exists'' (\textbf{lone}), universal with negation (\textbf{no}). The \textbf{disj} keyword ensures the quantified variables to denote sets of tuples that are disjoint one to the other. \end{itemize} To construct relations of a given arity, it is necessary to declare unary relations, or \emph{signatures}. Signatures correspond to sets of tuples. A signature is declared by \textbf{sig}~\emph{name}. A signature can also be declared as a subset of an existing signature thanks to the \textbf{in} keyword. But the user can also. Finally, the \textbf{abstract} keyword states that all the atoms of this signature belong to a signature extending it. In fact, a signature declared as \textbf{sig~}\emph{name} is implicitly considered extending the top-level abstract signature \textbf{object}. The user can also declare a relation along with the signature of its first component : for instance, \textbf{sig}~\emph{A~}\{\emph{r}:~\emph{B}\} declares a signature \emph{A} and a binary relation \emph{r} in \emph{A}\textbf{->}\emph{B} (that is, in $A\times B$). Multiplicity keywords as above can also occur, adding constraints on the relation. The user can also declare predicates (\textbf{pred}) and functions (\textbf{fun}, with a {}``typing'' indication about the return value) to factor the code of the specification. The Alloy system is modular: a specification can be split in several module files beginning with \textbf{module}~\emph{name}. A module is then included via \textbf{open}. Modules can take signatures as parameters. {\footnotesize }% \begin{figure} {\footnotesize }\begin{tabular}{lll} \emph{\footnotesize module} & {\footnotesize ::=} & \textbf{\footnotesize module}{\footnotesize ~}\emph{\footnotesize modname~}{\footnotesize {[}}\emph{\footnotesize signame}{\footnotesize {*}]~}\emph{\footnotesize opendir}{\footnotesize {*}~}\emph{\footnotesize declaration{*}}\tabularnewline \emph{\footnotesize opendir} & {\footnotesize ::=} & \textbf{\footnotesize open~}\emph{\footnotesize modname~}{\footnotesize {[}}\emph{\footnotesize signame}{\footnotesize {*}]}\tabularnewline \emph{\footnotesize declaration} & {\footnotesize ::=} & {\footnotesize | }\emph{\footnotesize sigdecl}{\footnotesize{} | }\emph{\footnotesize preddecl }{\footnotesize | }\emph{\footnotesize factdecl }{\footnotesize | }\emph{\footnotesize fundecl}\tabularnewline \emph{\footnotesize sigdecl} & {\footnotesize ::=} & {\footnotesize | }\textbf{\footnotesize abstract}{\footnotesize ?~}\textbf{\footnotesize sig~}\emph{\footnotesize signame~extends}{\footnotesize ?~\{}\emph{\footnotesize args{*}}{\footnotesize \}}\tabularnewline & & {\footnotesize | }\textbf{\footnotesize sig}{\footnotesize ~}\emph{\footnotesize signame}{\footnotesize ~}\textbf{\footnotesize in}{\footnotesize ~}\emph{\footnotesize expr}{\footnotesize ~\{}\emph{\footnotesize args{*}}{\footnotesize \}}\tabularnewline \emph{\footnotesize extends} & {\footnotesize ::=} & \textbf{\footnotesize extends}{\footnotesize ~}\emph{\footnotesize signame}\tabularnewline \emph{\footnotesize args} & {\footnotesize ::=} & \emph{\footnotesize relname}{\footnotesize{} : }\emph{\footnotesize mult}{\footnotesize ? }\emph{\footnotesize signame sigprodend{*}}{\footnotesize{} ,}\tabularnewline \emph{\footnotesize sigprodend} & {\footnotesize ::=} & \textbf{\footnotesize ->}\emph{\footnotesize mult}{\footnotesize ?}\emph{\footnotesize{} signame}\tabularnewline \emph{\footnotesize preddecl} & {\footnotesize ::=} & \textbf{\footnotesize pred}{\footnotesize ~}\emph{\footnotesize predname}{\footnotesize ~(}\emph{\footnotesize args}{\footnotesize {*})}\emph{\footnotesize ~andformula}\tabularnewline \emph{\footnotesize factdecl} & {\footnotesize ::=} & \textbf{\footnotesize fact}{\footnotesize ~}\emph{\footnotesize factname~andformula}\tabularnewline \emph{\footnotesize fundecl} & {\footnotesize ::=} & \textbf{\footnotesize fun}{\footnotesize ~}\emph{\footnotesize funname}{\footnotesize ~(}\emph{\footnotesize args{*}}{\footnotesize )~:~}\emph{\footnotesize signame~sigprodend{*}}{\footnotesize ~\{}\emph{\footnotesize expr}{\footnotesize \}}\tabularnewline \emph{\footnotesize expr} & {\footnotesize ::=} & {\footnotesize | }\emph{\footnotesize relname}{\footnotesize{} | }\emph{\footnotesize funname}{\footnotesize{} (}\emph{\footnotesize expr,}{\footnotesize {*}) | }\emph{\footnotesize expr}{\footnotesize +}\emph{\footnotesize expr}{\footnotesize{} }\tabularnewline & & {\footnotesize | }\emph{\footnotesize expr-expr}{\footnotesize{} | }\emph{\footnotesize expr}\textbf{\footnotesize ->}\emph{\footnotesize expr }{\footnotesize | }\emph{\footnotesize expr}\textbf{\footnotesize \&}\emph{\footnotesize expr}{\footnotesize{} }\tabularnewline & & {\footnotesize | }\emph{\footnotesize expr}{\footnotesize .}\emph{\footnotesize expr}{\footnotesize{} | }\emph{\footnotesize expr}{\footnotesize <:}\emph{\footnotesize expr}{\footnotesize{} | }\emph{\footnotesize expr}{\footnotesize :>}\emph{\footnotesize expr}{\footnotesize{} | ...}\tabularnewline \emph{\footnotesize andformula} & {\footnotesize ::=} & {\footnotesize \{}\emph{\footnotesize formula{*}}{\footnotesize \}}\tabularnewline \emph{\footnotesize formula} & {\footnotesize ::=} & {\footnotesize | }\emph{\footnotesize predname}{\footnotesize (}\emph{\footnotesize expr,{*}}{\footnotesize ) | }\emph{\footnotesize mult~expr}{\footnotesize{} | }\emph{\footnotesize expr~}\textbf{\footnotesize in~}\emph{\footnotesize expr }\tabularnewline & & {\footnotesize | }\emph{\footnotesize quant~}\textbf{\footnotesize disj}{\footnotesize ?}\emph{\footnotesize ~args}{\footnotesize ~}\emph{\footnotesize args}{\footnotesize {*}~}\emph{\footnotesize andformula}{\footnotesize{} | }\emph{\footnotesize andformula}\tabularnewline & & {\footnotesize | }\emph{\footnotesize andformula}{\footnotesize ~}\textbf{\footnotesize implies}{\footnotesize ~}\emph{\footnotesize andformula}\tabularnewline & & {\footnotesize | }\emph{\footnotesize andformula}{\footnotesize ~}\textbf{\footnotesize or}{\footnotesize ~}\emph{\footnotesize andformula}{\footnotesize{} | ...}\tabularnewline \emph{\footnotesize mult} & {\footnotesize ::=} & \textbf{\footnotesize | some | one}{\footnotesize{} | }\textbf{\footnotesize lone | no}\tabularnewline \emph{\footnotesize quant} & {\footnotesize ::=} & {\footnotesize | }\textbf{\footnotesize all}{\footnotesize{} | }\emph{\footnotesize mult}\tabularnewline \end{tabular}{\footnotesize \par} {\footnotesize \caption{The \emph{Alloy} syntax (simplified)} } \end{figure} {\footnotesize \par} \subsection{Overview of the main issues encountered while writing the Alloy specification} The use of the \emph{Alloy} method (the \emph{Alloy} specification language and the \emph{Alloy Analyzer}) raised some issues at different levels, due to the logical conception of the \emph{Alloy} specification language, or to the current implementation of the \emph{Alloy Analyzer}. we solved them in two steps. First, we wrote a preliminary version of the \emph{Mondex} specification in \emph{Alloy} that used to follow the Z specification as close as possible. But this translation introduced several artifacts that did not seem natural to the general ways of writing specifications in Alloy, so that we rewrote the specification to remove those artifacts. This {}``optimisation'' has shown to be more suitable to automated analysis (indeed, analysis times became much shorter). Moreover, it has eventually pointed out some errors in the previous model. On the one hand, we had to tackle rather technical issues about integers: whereas the Z specification frequently uses them, they are not well handled by the implementation of the \emph{Alloy} \emph{Analyzer}. But in fact, not all properties of integers are used, so that there are ways to represent the corresponding data more efficiently than with integers. This offers an interesting way to compute sums of sets of values without recursion on the set of values. On the other hand, we had to tackle a more conceptual issue regarding how Z and Alloy treat the notion of the identity of objects. Whereas Z schemas define records, Alloy specifications define relations between objects that have their own identity. For instance, two abstract purses having the same \emph{balance} and \emph{lost} values are represented by the same record in Z, so they have to be distinguished somehow ; in Alloy, it is the converse : as different objects can have the same properties, constraints have to be added to consider them as records. After tackling those modeling issues, the obtained models have been able to find bugs in the Z specification of the Mondex electronic purse. Those bugs are related to the insufficient formalism of that Z specification. This work has been done within an internship at MIT \cite{bib:mondextech}; the Alloy model files are available on the author's website \cite{bib:mondextechwww}. \section{Representing integers with Alloy} The implementation of integers in the \emph{Alloy Analyzer}% \footnote{The new engine \emph{Kodkod} for Alloy 4.0 models integers in a different way through their binary representations. Few work has been done to translate the models to this new version.% } provides not very efficient analyses. Indeed, the translation of integers and their operations into boolean formulae consumes a lot of time and space, by building the whole arithmetic circuits, and dramatically reduces the definable scope. The idea commonly retained by \emph{Alloy} users, and also by the researchers who develop \emph{Alloy} themselves (within Daniel Jackson's Software Design group) is that for most models written in \emph{Alloy}, integers may be replaced with another representation providing similar properties, and which could fit the model better. This idea holds for the \emph{Mondex} case study, so that author-level encodings may be used, as described in this section. \subsection{Using an order rather than sequence numbers} Sequence numbers are used to distinguish different transactions led by purses. In some way, they represent a \emph{time scale} increasing whenever a transaction begins. It is not specified how this time scale increases: only the comparison relation is used. So, we only need an order to model them. One idea is to use the ordering module provided along with the standard distribution of the \emph{Alloy Analyzer}: \verb+util/ordering+. {\footnotesize \begin{verbatim} sig SEQNO {} open util/ordering [SEQNO] \end{verbatim}}{\footnotesize \par} Moreover, the \emph{Alloy Analyzer} treats this module in an optimized way, in terms of symmetry breakings when building the SAT boolean formula, rather than explicitly defining the order. \subsection{Representing amounts through coins} Even though all the first-order properties of integers are used to model amounts, they are used in a particular way. Comparisons only occur between the pre-state and the post-state of an operation: either a purse decreasing its balance, or the whole global world balance, is concerned. In particular, two balances of \emph{different} purses (associated to different names) are never compared. The solution proposed by members of the SDG group, namely Emina Torlak and Derek Rayside, is to use \emph{sets of coins} to represent an amount. The amount will not be represented by the cardinality of the set, but the coins themselves, as with \emph{real} coins in \emph{non-electronic} purses. \subsubsection{Computing with coins} With this approach, operations are redefined as follows : \begin{itemize} \item The sum of two values is the disjoint union of the corresponding sets of coins. \item The difference of two values is the (set) difference of the corresponding sets of coins, as soon as the set being subtracted is included in the original set. \item The comparison relation is the set inclusion between sets of coins. \end{itemize} Indeed, when a purse decreases its \emph{balance}, it actually \emph{gives away part of it}. So there is how the \emph{Abstract} world can be defined: {\footnotesize \begin{verbatim} sig NAME {} sig Coin {} sig AbPurse { balance : set Coin, lost : set Coin } sig AbWorld { abAuthPurse : NAME -> AbPurse } \end{verbatim}}{\footnotesize \par} This approach allows computing the sum of sets of values through simply \emph{gathering} them with a relational expression. Whereas the Z specification defines a sum of set of values through a recursive definition: {\footnotesize \begin{schema}{Totals} totalBalance, totalLost : (NAME \ffun AbPurse) \fun \num \ST totalBalance (\varnothing) = 0 \\ totalLost (\varnothing) = 0 \\ \forall f : NAME \ffun AbPurse; name : NAME; AbPurse | \\ \t1 name \in \dom f \land \theta AbPurse = f(name) \\ @ \\ \t1 totalBalance (f) = totalBalance (name \ndres f) + balance \\ \t1 \land totalLost (f) = totalLost (name \ndres f) + lost \end{schema}}{\footnotesize \par} in Alloy, one would simply write \verb+S.r+ where \emph{S} is a set of \emph{NAME}s and \emph{r} is a relation that maps a name to some coins. For instance, if \emph{a} is an \emph{AbWorld}, one would simply write \verb+a.abAuthPurse.balance+ to compute the sum of the balances of all abstract purses. \subsubsection{Defining constraints to avoid coin sharing} However, this approach requires to define additional constraints to avoid \emph{coin sharing}, the fact that, for instance, two amounts being added could have common coins. Indeed, such constraints ensure, for instance, the considered sums actually being \emph{disjoint} unions of sets of coins. First constraints are added on the \emph{Abstract} world. They are quite simple to express: \begin{itemize} \item There is no coin common to two purses, regardless of whether it would belong to the \emph{balance} or the \emph{lost} store of either purse. In other words, a coin must belong to at most one purse. \item There is no coin common to the \emph{balance} store and the \emph{lost} store of a purse. In other words, a coin must be either not lost, or lost. \end{itemize} {\footnotesize \begin{verbatim} fact noCoinSharing { all w : AbWorld { no disj n1, n2 : NAME { some n1.(w.abAuthPurse).(balance + lost) & n2.(w.abAuthPurse).(balance + lost) } no p : AbPurse { p in NAME.(w.abAuthPurse) some p.balance & p.lost } } } \end{verbatim}}{\footnotesize \par} These constraints only apply to abstract authentic purses, that is purses actually belonging to an abstract world (although the second could even have been defined for any abstract purse). Then, the Concrete purses also use coins: {\footnotesize \begin{verbatim} sig PayDetails { from, to : NAME, fromSeqNo, toSeqNo : SEQNO, value : set Coin } sig ConPurse { name : NAME, balance : set Coin, pdAuth : PayDetails, exLog : set PayDetails, nextSeqNo : SEQNO, status : STATUS } sig ConWorld { conAuthPurse : NAME -> lone ConPurse, ether : set MESSAGE, archive : NAME -> PayDetails } \end{verbatim}}{\footnotesize \par} Equivalent constraints to avoid coin sharing have to be added to the \emph{Concrete} world. In the first model, we added the following constraints : {\footnotesize \begin{verbatim} fact noCoinSharingConcrete { all p : ConPurse { no p.exLog.value & p.balance -- 1 } all w : ConWorld { no disj n1, n2 : NAME { some n1.(w.conAuthPurse).balance & n2.(w.conAuthPurse).balance -- 2 } no p : ConPurse, pd : PayDetails { p in NAME.(w.conAuthPurse) pd in NAME.archive some p.balance & pd.value -- 3 } } } \end{verbatim}}{\footnotesize \par} \begin{enumerate} \item A purse has no coin common to its balance and a transaction it has logged to its \emph{exLog}. \item Two distinct purses have no coin common to their balances. \item A purse has no coin common to its balance and a transaction that has been logged in the global \emph{archive}. \end{enumerate} But although constraint 2 makes sense, constraints 1 and 3 are too strong. Indeed, as regards constraint 3: \begin{itemize} \item Assume the {}``to'' purse has received the money and sends the acknowledgment message. If the {}``from'' purse aborts before receiving it, logging the transaction into its \emph{exLog,} then this constraint prevents the {}``from'' purse from copying the details relevant to this transaction to the global \emph{archive}. Indeed, the {}``to'' \emph{balance} contains the coins corresponding to the \emph{value} of the transaction. \item Assume the {}``to'' purse has just sent the request message but aborts, then logging the transaction into its \emph{exLog}. If the {}``from'' purse aborts before receiving this message, then it will have kept the coins of the transaction value in its \emph{balance}. Thus, the {}``to'' purse will not be able to copy the details relevant to this transaction to the global \emph{archive}. \end{itemize} In both cases, the corresponding transaction is {}``locked'' in the \emph{exLog}, which consequently cannot clear it through a \emph{ClearExceptionLog} operation. Roughly speaking, the point is to find constraints which could be equivalent to the abstract constraint preventing a coin to be {}``lost and not lost'' at the same time. The solution may be found by referring to the \emph{Abstract/Between} refinement relation. It relies on the definition of two functions : \begin{itemize} \item \emph{definitelyLost} corresponds to the set of details referring to transactions definitely lost, that is either logged by the two purses, or logged by the {}``to'' purse while the {}``from'', having sent the money, is still expecting an acknowledgment. \item \emph{maybeLost} corresponds to critically ambiguous transactions, where the {}``to'' purse expects the value while the {}``from'' purse has already sent it and either expects the acknowledgment or has logged the transaction before the {}``to'' received the value. \end{itemize} {\footnotesize }% \begin{figure} {\footnotesize \begin{verbatim} fun allLogs (c : ConWorld) : ConPurse -> PayDetails { c.archive + (c.conAuthPurse <: exLog.c) } fun authenticFrom (c : ConWorld) : set PayDetails { from.(c.conAuthPurse) } fun authenticTo (c : ConWorld) : set PayDetails { to.(c.conAuthPurse) } fun fromLogged (c : ConWorld) : set PayDetails { authenticFrom (c) & ConPurse.(allLogs (c) & ~from) } fun toLogged (c : ConWorld) : set PayDetails { authenticTo (c) & ConPurse.(allLogs (c) & ~to) } fun toInEpv (c : ConWorld) : set PayDetails { authenticTo (c) & to.status.c.epv & (iden & to.(pdAuth.c)).PayDetails } fun fromInEpr (c : ConWorld) : set PayDetails { authenticFrom (c) & from.status.c.epr & (iden & from.(pdAuth.c)).PayDetails } fun fromInEpa (c : ConWorld) : set PayDetails { authenticFrom (c) & from.status.c.epa & (iden & from.(pdAuth.c)).PayDetails } fun definitelyLost (c : ConWorld) : set PayDetails { toLogged (c) & (fromLogged (c) + fromInEpa (c)) } fun maybeLost (c : ConWorld) : set PayDetails { (fromInEpa (c) + fromLogged (c)) & toInEpv (c) } \end{verbatim}}{\footnotesize \par} {\footnotesize \caption{{\footnotesize Alloy definitions of the }\emph{\footnotesize definitelyLost}{\footnotesize{} and }\emph{\footnotesize maybeLost}{\footnotesize{} sets of transactions}} }{\footnotesize \par} \end{figure} {\footnotesize \par} In both cases, we know that the value has been debited from the {}``from'' balance but not yet credited to the {}``to'' balance. Then, it is sound to replace constraint 3 above with the following one, stating that no coin in the value of a transaction in \emph{definitelyLost} or \emph{maybeLost} may be in a purse \emph{balance} at the same time : {\footnotesize \begin{verbatim} all w : ConWorld { no p : ConPurse { p in NAME.(w.conAuthPurse) some p.balance & (definitelyLost (w) + maybeLost (w)) -- new 3 } } \end{verbatim}}{\footnotesize \par} This constraint prevents a coin from being in a \emph{balance} and a \emph{lost} store at the same time, even if the purses are distinct. As regards constraint 1, it is too strong if the following situation arises : the {}``to'' purse logs the transaction just after sending the request, but the {}``from'' aborts before receiving it (thus it does not log). Then, no money has been sent yet, but the transaction has been logged by the {}``to'' purse. In that case, the {}``to'' purse cannot receive the corresponding coins in a further transaction attempt involving them, because they are already in the logged transaction, even though they are still in the {}``from'' balance. All those situations have been found by counterexamples while trying to rewrite the Alloy specification. Indeed, those constraints were first defined within the \emph{Concrete} world, so that the \emph{Between/Concrete} refinements did not hold. Actually, some constraints in the \emph{Between} were not necessarily kept through operations, so they were too strong. Thus, they caused some legal operations not to arise. The new constraints defined here have been moved to the \emph{Between} world. This has allowed them to be checked as invariants through the \emph{Between/Concrete} refinement. \subsubsection{Redefining \emph{chosenLost} set : coins as a tracking system} Using coins has another interesting effect : they allow to better \emph{track} the amounts through operations. The \emph{Abstract/Between} refinement relies on a prophecy variable, \emph{chosenLost}, gathering the ambiguous (pending) transactions that are chosen \emph{in advance} to be lost. This prophecy variable causes the protocol to be nondeterministic. But this set, used to compute the \emph{lost} values of the abstract purses, is uniquely known for a given \emph{Between} world, as soon as the corresponding \emph{Abstract} world is known. Indeed, thanks to the constraint preventing a coin from belonging to the values of two distinct transactions considered ambiguous, it is possible to determine to which transaction a coin corresponds. It is easily possible to show that the \emph{definitelyLost} and \emph{maybeLost} sets of transactions are disjoint (see definition above in Section ...). Indeed, for the former, the {}``to'' purse has to have logged the transaction, whereas for the latter, the {}``to'' purse has to be still waiting for the value being credited, which means that the transaction is still pending. It is also obvious that coins being accounted in the \emph{Abstract} model correspond to either a concrete \emph{balance}, or a \emph{definitelyLost} or \emph{maybeLost} transaction amount, the latter case including the case of a transaction chosen lost. So there are four solutions: \begin{itemize} \item the coin is in a concrete \emph{balance}: then, it will be accounted into the abstract \emph{balance} of the corresponding purse; \item the coin is in a \emph{definitelyLost} transaction: then, it will be accounted into the \emph{lost} of the {}``from'' purse of this transaction; \item the coin is in a transaction considered \emph{maybeLost}, but not chosen lost: then, it will be accounted into the \emph{balance} of the {}``from'' purse of this transaction; \item the coin is in a transaction considered \emph{maybeLost} and chosen lost: then, it will be accounted into the \emph{lost} of the {}``from'' purse of this transaction. \end{itemize} Then, it is possible to {}``revolve'' this table to define the \emph{chosenLost} set. Just take the transactions of \emph{maybeLost}, the coins of which are in an abstract \emph{lost} : {\footnotesize \begin{verbatim} fun getChosenLost (a : AbWorld, b : BetweenWorld) : PayDetails { NAME.(a.abAuthPurse).lost.(~value :> maybeLost (b)) } \end{verbatim}}{\footnotesize \par} \section{Records in Z, objects in Alloy} \subsection{\label{sub:ident}The identity of objects} A major difference between Z and Alloy is how they represent objects. On the one hand, a Z schema defines records, so that two records having the same values denote the same object. On the other hand, Alloy specifications define relations between atomic objects, each of which has its own identity regardless of how it is related to others. \subsubsection{Simulating objects with Z records and names} In Z, an abstract purse is only a \emph{record} with two fields, \emph{balance} and \emph{lost}. {\footnotesize \begin{schema}{AbPurse} balance,lost: \num \ST balance \geq 0 \\ lost \geq 0 \end{schema}}{\footnotesize \par} So, when two abstract purses have the same \emph{balance} values and the same \emph{lost} values, then it is impossible to distinguish them. Now consider the definition of the \emph{AbWorld} abstract world. It is a set of purses. To distinguish between two purses having the same values, the Z specification introduces \emph{names}. This method is commonly retained in object-oriented Z specifications \cite{bib:ooz}. {\footnotesize \begin{zed} [NAME] \end{zed} \begin{schema}{AbWorld} abAuthPurse: NAME \ffun AbPurse \end{schema}}{\footnotesize \par} In Alloy, the notion of property only corresponds to the way atomic objects are related to each other. That is why names are not necessary: keeping them would introduce an artifact in the Alloy specification. So the purses can simply be defined through the following signature: {\footnotesize \begin{verbatim} sig AbPurse { balance: set Coin, lost: set Coin } \end{verbatim}}{\footnotesize \par} So, an Abstract world is simply a set of purses, a subset of the \verb+AbPurse+ signature. \subsubsection{Simulating records with Alloy objects : canonicalization} Conversely, in Alloy, there is no notion of records and fields, as two distinct objects may be related to the same values by the relations. However, it is possible to simulate Z's behaviour by introducing a notion of records in Alloy. One solution could be to \emph{canonicalize} signatures: that is, to introduce canonicalization constraints which enforce two abstract worlds having the same properties to be equal : The main purpose of this constraint would be to reduce the search space by eliminating redundant cases when analyzing the specification. However, such a canonicalization constraint may be also necessary for the \emph{Abstract} purses, as the refinement relation could --- and does, without this constraint --- give different purses having the same \emph{balance} and \emph{lost} fields. The Z specification also defines {}``true'' records, for instance \emph{TransferDetails} and \emph{PayDetails} which represent respectively abstract and concrete transaction details. {\footnotesize \begin{schema}{TransferDetails} from,to: NAME \\ value: \num \ST value \geq 0 \\ \end{schema} \begin{schema}{PayDetails} TransferDetails \\ fromSeqNo, toSeqNo: \num \ST fromSeqNo \geq 0 \\ toSeqNo \geq 0 \end{schema}}{\footnotesize \par} So, such data types have to be represented in Alloy as records, i.e. with the canonicalization constraint. {\footnotesize \begin{verbatim} sig TransferDetails { from, to : Purse, value : set Coin } sig PayDetails extends TransferDetails { fromSeqNo, toSeqNo : SEQNO } fact payDetailsCanon { no disj p, p' : PayDetails { p'.from = p.from p'.to = p.to p'.fromSeqNo = p.fromSeqNo p'.toSeqNo = p.toSeqNo } } \end{verbatim}}{\footnotesize \par} \subsection{Consequence : existential quantification and constraints} The simulation proofs require to show that for any \emph{Between} operation and \emph{Abstract} post-state, there \emph{exists} an \emph{Abstract} pre-state such that the \emph{Abstract} operation holds (as required by the backwards refinement), and similarly for the \emph{Between/Concrete} refinement proof (but forwards). It is important to understand the notion of existence in the right way. Indeed, in the Z notation, an existential theorem corresponds to the fact that a record with the right field values may be \emph{constructed}. That is, if the theorem is stated in an existential way, the proof will give the witness. But in \emph{Alloy}, existence is the \emph{actual} existence of the corresponding atomic objects in the model. That explains the following behaviour in the \emph{Abstract/Between} refinement. Let us try to show the following predicate for the \emph{Between Abort} operation, using the method of {}``encapsulating'' the \emph{chosenLost} set into a specific signature as a field of this signature: {\footnotesize \begin{verbatim} sig ChosenLost {pd : set PayDetails} assert ReqEx { all b, b' : BetweenWorld, a' : AbWorld, cl' : ChosenLost {  { Rab (a', b', cl'.pd) Req (b, b') } implies some a : AbWorld, cl : ChosenLost { Rab (a, b, cl.pd) AbIgnore (a, a') } } } \end{verbatim}}{\footnotesize \par} Then, a counterexample would come: the model with only one \emph{ChosenLost} object, preventing some cases where the \emph{ChosenLost} must change from the post-state to the pre-state. This is also the reason why a sanity-check property has to be verified through simulating a predicate rather than trying to check an existential assertion. Indeed, if we naively tried to show that there exists a \emph{BetweenWorld}, to show that the constraints are not too strong and allow an object to exist: {\footnotesize \begin{verbatim} assert BetweenEx { some BetweenWorld } \end{verbatim}}{\footnotesize \par} then, the immediate counterexample comes: the empty model, with no atoms at all! A naive idea would be to constrain the Alloy model to match the Z notion of existence, that is to constrain any constructible object to exist. But that idea is very naive, as an immediate problem arises with the \emph{Alloy Analyzer}: the scope dramatically grows. That is why the only solution is to construct the witness in the theorem itself, and to \emph{assume} that an object exists once we have \emph{enough properties} to define it. For instance, an \emph{Abstract} world is completely determined if we know its \emph{abAuthPurse}, that is the set of all its authentic purses and their properties. Thus, we can consider that the \emph{Rab} abstraction relation, which computes the values of \emph{balance} and \emph{lost} fields of the authentic purses of an \emph{Abstract} world abstracting the given \emph{Between} world and the \emph{ChosenLost} variable, constructs an object which has the structure of an \emph{Abstract} world. But assuming the existence of a constrained object does not make sense: thus it is necessary to not define constraints as such, but define them as predicates which will be used as implication hypotheses in assertions. For instance, instead of defining and using the \emph{Abstract} and \emph{Between} worlds as follows: {\footnotesize \begin{verbatim} sig BetweenWorld extends ConWorld {} fact BetweenConstraints {...} assert RabIgnore { all b, b': BetweenWorld, a' : AbWorld, cl' : set PayDetails { { Rab (a', b', cl') Ignore (b, b') } implies some a : AbWorld { Rab (a, b, cl') AbIgnore (a, a') } } } \end{verbatim}}{\footnotesize \par} it is a better idea to define constraints as predicates rather than facts: {\footnotesize \begin{verbatim} sig AbWorld {abAuthPurse : NAME -> AbPurse} pred Abstract (a : AbWorld) { a.abAuthPurse : NAME -> lone AbPurse ... -- and abstract coin sharing constraints } pred Between (b : ConWorld) {...} \end{verbatim}}{\footnotesize \par} This also allows to check constraints (including coin sharing constraints defined in the previous section) as invariants. Then, the abstraction relation could be also defined {}``structurally'', with no references to the {}``constraints'': {\footnotesize \begin{verbatim} pred Rab (a : AbWorld, b : BetweenWorld, cl : set PayDetails) { a.abAuthPurse.AbPurse = b.conAuthPurse.ConPurse -- 1 all n : NAME { n in b.conAuthPurse.ConPurse implies { one n.(a.abAuthPurse) -- 2 n.(a.abAuthPurse).balance = ... n.(a.abAuthPurse).lost = ... } } } \end{verbatim}}{\footnotesize \par} \begin{enumerate} \item The authentic names are the same for the abstract as for the between world. \item for any authentic name, there is exactly one corresponding abstract purse. \end{enumerate} Then, the assertion could be stated as follows: {\footnotesize \begin{verbatim} assert RabIgnore { all b, b': BetweenWorld, a, a' : AbWorld, cl' : set PayDetails { { Abstract (a') Rab (a', b', cl') Ignore (b, b') Rab (a, b, cl') } implies { Abstract (a) -- 1 AbIgnore (a, a') } } } \end{verbatim}}{\footnotesize \par} It is worth noting that multiplicity constraints also have to be defined as additional constraints. Then, the following lemma would avoid conclusion 1 to be checked each time: {\footnotesize \begin{verbatim} assert RabEx { all b : ConWorld, a : AbWorld, cl : set PayDetails { { Rab (a, b, cl) } implies { Abstract (a) } } } \end{verbatim}}{\footnotesize \par} That is, the abstraction relation (provided the \emph{chosenLost} set of transactions consists in only critically ambiguous transactions that may be lost, a constraint that has to be defined in the abstraction relation) always defines an \emph{Abstract} world starting from a \emph{Between}. Or, in other words, any object that would have the same structure of an \emph{Abstract} world but would abstract a given \emph{Between} world through the abstraction relation, automatically verifies the constraints of an \emph{Abstract} world, thus is itself a {}``true'' abstract world. \section{Results} \subsection{Bugs found in the Z specification} The use of the \emph{Alloy Analyzer} gave some counterexamples not related to the way of modeling the \emph{Mondex} specification in \emph{Alloy}. Indeed, some of those counterexamples correspond to real \emph{bugs} in the original Z specification. Those bugs were discovered very early, in analysing the initial specification. However, the optimized specification gave no further bugs. The \emph{Alloy Analyzer} found two bugs related to reasoning errors in the specification. This points out the fact that the proofs led in the Z specification \cite{bib:specif} are not formal enough, as they rely on informal comments that can be only implicitly checked by automated formal methods. Those informal comments can induce a wrong reasoning schema, leading to a proof that is formally valid but useless as it is not the proof of a given theorem: when a theorem is split into lemmas, the link between the theorem and the lemmas is sometimes shown only through informal comments, not through a formal proof. The first bug, about the \emph{Abort} proof schema of the \emph{Abstract/Between} refinement, illustrates the effect of an incorrect informal splitting of a theorem into cases, leading to an incorrect proof of the theorem, whereas the second bug, about the framing schema, points out how a lemma is incorrectly used in the proof of a theorem even though the proof of the lemma itself is correct. The \emph{Alloy Analyzer} also found a bug in the specification itself: missing constraints about authenticity. This notion is important insofar as it prevents money from magically appearing or disappearing during an operation, either because a purse appears or disappears, or because a purse is making a transaction with a non-authentic purse. \subsubsection{Abort proof schema} Mostly, the \emph{Alloy} method allows to directly check the specification without going through intermediate lemmas. But some theorems consumed too much time of analysis, or even did not terminate if directly checked at once. So, for such theorems, we had to go into the proof details. Consider the \emph{Abort} operation on a \emph{Between} world. This operation is triggered by a purse when it decides to get rid of a transaction it is involved in, for instance after a timeout when the other purse has been disconnected too early. In the \emph{Abstract/Between} refinement, this operation refnes \emph{AbIgnore}, the \emph{Abstract} no-op. Indeed, the actual transfer only happens once the {}``to'' purse is credited via the \emph{Val} operation. Thus, any other operation is considered abstractly to be no-op. For this \emph{Abort/AbIgnore} refinement, a check on a scope of 8 did not terminate after 2 days of computation. So, it was necessary to tackle a lemma. The \emph{Abort} operation can be split into three cases: \begin{enumerate} \item when the transaction has gone so far that aborting it leads to definitely losing the money; \item when the transaction has not gone far enough to decide; \item when there was no transaction to abort (the purse was idle). \end{enumerate} Case 3 is easy to separate. Just discriminate on the status of the purse, when the aborting purse has no pending transaction, hence nothing to abort. To distinguish between cases 1 and 2, the Z proof claims that it is enough to discriminate on whether the transaction in progress is in \emph{maybeLost}, that is critically ambiguous, arguing that in this case, the {}``to'' purse is necessarily aborting. Actually, this is false, as the \emph{Alloy Analyzer} generates a counterexample where the transaction in progress is in \emph{maybeLost} but the {}``from'' purse is aborting, not the {}``to''. It is worth noting that a transaction becomes lost only when the {}``to'' purse has logged the transaction. For instance, the {}``from'' purse may abort after having sent the money whereas the {}``to'' purse has still neither received the value nor aborted. The right condition that makes the proof work --- and thus, the theorem hold, as expected --- is that the aborting purse is the {}``to'' purse waiting to be credited. This is actually one of the two cases when the transaction is in progress. The other case is when the aborting purse is the {}``from'' purse expecting the acknowledgment. The latter case never causes money to be lost. The false claim has been present only in the informal text of the proof : it has not been formalized why splitting the proof of \emph{Abort} through that condition worked. That is why this bug has not been found by other methods as of May 2006. \subsubsection{Framing schema for operations that first abort} To make the proof easier, and to avoid showing several times that \emph{Abort} refines \emph{AbIgnore}, it is wise to show that operations that first abort (that is, operations initializing a transaction or a log clear) may be decomposed into elementary operations, the first being \emph{Abort}. The problem is that if such decomposition theorems are tackled with the \emph{Alloy Analyzer}, they generate counterexamples! So there is necessarily a bug in the Z specification. Actually, whereas some elementary operations output specific messages, \emph{Abort} outputs a generic message called $\perp$. The Z proof argues that operations first aborting are defined through a \emph{framing schema} $\Phi$, that is through a definition of the form : {\footnotesize $\exists\Delta ConPurse\bullet\Phi\wedge\left(Abort;ElementaryOp\right)$}{\footnotesize \par} where ; is the \emph{composition} operation. Then, the Z proof argues that this can be decomposed into two parts : {\footnotesize $\left(\exists\Delta ConPurse\bullet\Phi\wedge Abort\right);$}{\footnotesize \par} {\footnotesize $\left(\exists\triangle ConPurse\bullet\Phi\wedge ElementaryOp\right)$}{\footnotesize \par} using a lemma assuming that $\Phi$ is of the following form : {\footnotesize \begin{schema}{ConWorld} conAuthPurse : NAME \pfun ConPurse \end{schema} \begin{schema}{\Phi} \Delta ConWorld \\ \Delta ConPurse \\ n? : NAME \ST n? \in \dom conAuthPurse \\ conAuthPurse ~ n? = \theta ConPurse \\ conAuthPurse' = conAuthPurse \oplus \{ n? \mapsto \theta ConPurse' \} \end{schema}}{\footnotesize \par} But, even though the lemma itself might be true, actually the \emph{process} is wrong because $\Phi$ is actually \emph{not} of the specified form! Actually, the lemma neglects the non-functional fields of $ConWorld$, among which is the \emph{ether}. This means that messages are not handled by this schema. This explains the obtained counterexamples, for which \emph{Abort} and the elementary operation output different messages, so that it is impossible to compose them. One solution is to constrain the generic message $\perp$ to be necessarily in the \emph{Between} \emph{ether}. In that case, the composition does work, as the \emph{Abort} operation does not add any new message to the \emph{ether}. But the lemma would still have to be adapted, for instance by handling some non-functional fields (such as \emph{ether}) and by showing a modified form of this lemma where the first operation does not modify the non-functional fields but the second may do so. \subsubsection{Authenticity} The original Z specification requires that for any {}``from'' purse expecting a request, its \emph{pdAuth}, that is the current transaction details held by the purse, must be \emph{authentic} : its \emph{from} field must match the {}``from'' purse. But, even though a general constraint requires the purse to match either the \emph{from} or the \emph{to} field, there is no more precise constraint for the {}``to'' purse expecting the value, or even the {}``from'' purse expecting the acknowledgment. Due to this lack, trying to check the \emph{Abort/AbIgnore} refinement yields a counterexample. Actually, while trying to check this refinement with the method described above, two counterexamples are (successively) generated in addition to the one related to the \emph{Abort} refinement itself: \begin{itemize} \item the one if the purse holds a \emph{pdAuth} indicating that it is actually the \emph{from} purse, but expecting to be credited (a state in which only a {}``to'' purse can be); \item the other if the purse holds a \emph{pdAuth} indicating that it is actually the \emph{to} purse, but expecting an acknowledgment (a state in which only a {}``from'' purse can be). \end{itemize} This lack of authenticity creates an inconsistency in the actual role played by the purse in the transaction: their status does not match the indication in their transaction information. Adding the corresponding contraints in the \emph{Concrete}, or even in the \emph{Between} world, solves this problem and suppresses these counterexamples. This bug has also been found by other methods like Z/Eves \cite{bib:eves} or KIV \cite{bib:kiv}. \subsection{Scopes and times of checks} The choice of the scope for a theorem is a very tough issue. Indeed, the user has to find a balance between the time they want to spend checking an assertion, and the confidence level they require for it. At least, for each signature, the scope should be as large as the number of quantifications over objects of this signature. Indeed, if the scope is not large enough, then hypotheses may not be able to hold, and the theorem would be trivially true within this scope. It is often admitted that a scope of 8 is reasonable for most models. Actually, as regards the \emph{Mondex} case study: \begin{itemize} \item Given an operation, it is sound to bound the number of abstract or concrete worlds to the number of times they are quantified over in the formula. Indeed, outside the considered operation, states are independent on each other. \item But this reasoning does not apply to purses: whereas it is sound to require at least 2 purses (the {}``from'' and the {}``to''), they \emph{do} depend on other purses because of their local \emph{exLog}. In particular, even the computation of the corresponding abstract \emph{balance} and \emph{lost} does depend on several purses. Moreover, it is also interesting to consider some non-authentic purses. \item No bound on transactions or messages may be found either, for a similar reason. \end{itemize} The problem is that the time of checking exponentially increases with the scope. % \begin{figure} \begin{centering} \includegraphics[width=0.5\columnwidth,angle=-90]{oldgraph} \par\end{centering} \caption{\emph{Time exponentially increases with the scope}} \begin{centering} \emph{This graph was obtained with the first model.} \par\end{centering} \end{figure} Besides scope problems, intensive SAT-solving raises technical issues: \begin{itemize} \item machines have to be powerful enough to be able to tackle the problem, so the times of checks also depend on the speed of the processor and the amount of memory; \item but even on a given machine, the same problem being tackled by different SAT-solvers may take different times, or even crash. \end{itemize} Roughly speaking, SAT-solvings have been tackling from a few seconds to several hours, up to one day, except for the \emph{Abort/Between} refinement which has been stopped after two days of unsuccessful computation. Whereas the scopes have been successively checked for the first model, the final model has been directly checked for a scope of 10 (modulo restrictions for worlds), except for the \emph{Abstract/Between} refinement and the \emph{Between} model consistency where the scope has been limited to 8, as for the first model. It is worth noting that in that case, the times are sensitively longer for the final model than for the first model. On the one hand, this is due to the constraints, which were too strong in the first model, and have been weakened in the final model. On the other hand, it might be also due to the way the \emph{Alloy Analyzer} constructs the search space. Indeed, in the final model, there are almost no facts: all the {}``constraints'' are defined by predicates then used as hypotheses in implication formulae in assertions. Thus, the \emph{Alloy Analyzer} might have to consider \emph{every possible combination} of the atoms to define relations. \subsection{Limits to the use of the \emph{Alloy Analyzer}} Because \emph{Alloy} is based on first-order, even despite transitive closures, some properties such as the finiteness of the set of purses in an Abstract world, have to be dropped. But as regards the \emph{Mondex} case study, finiteness properties may be shown \emph{indirectly} by showing, for instance, that during an operation, such sets are obtained by union or symmetric difference from pre-state sets which are assumed to be finite. This is true following the definition of the operations. But what is more annoying is the \emph{finite scope}. Indeed, the checks led with the \emph{Alloy Analyzer} only show that the theorems hold for a certain number of atoms. More generally, as discussed in \cite{bib:frias}, in no way can the \emph{Alloy Analyzer} be used to give a rigorous proof of the checked theorems. A first attempt could be to try to increase scopes by improving ambient conditions (machines, etc.), or even by using the newer version of the \emph{Alloy Analyzer} based on \emph{Kodkod} currently being developed by SDG. But those methods are still bounded, and do not generalize. We could also try to show a \emph{small model theorem}, a meta-theorem which could in some way {}``compute a minimal scope'', or \emph{threshold}, for signatures. For instance Lee Momtahan's idea \cite{bib:smt} would be to show that, starting from a scope, it is possible to compute a threshold for one signature, for which any greater scope than this threshold would be automatically true, other signatures keeping the same scope. But this approach is still not powerful enough because: \begin{itemize} \item the extended signature may not be quantified over (except skolemizable quantifications); \item only one signature scope may be extended at the same time. \end{itemize} So, it could be wise to get rid of the scope issue and to choose a more direct approach of really \emph{proving} assertions. Then this will require the use of \emph{external} tools, that is other than the \emph{Alloy Analyzer}. It would be also an interesting way to show that the \emph{Alloy} specification language can be tacked with different methods, not only model-finding. \emph{Prioni} \cite{bib:prioni} translates an \emph{Alloy} specification into the input language of the \emph{Athena} \cite{bib:athena} proof assistant, which is based on a logic with powerful relational calculus. But the problem is that \emph{Athena}, as a proof assistant, is not automated enough. It makes sense to consider that the more expressive the logic, the less automated the tool. Then comes up an apparently interesting solution: automated \emph{first-order} theorem provers. Indeed, if finiteness properties are dropped, then it is interesting to point out the fact that the \emph{Mondex} case study can be entirely written as a first-order theory, and even without transitive closures. Actually, any higher-order quantification can be turned into first-order. For instance, to clear a set of transaction details from the logs, the Z specification computes a code, called \emph{clear code}, to represent the set being cleared. So, it is possible to quantify over this clear code instead of the whole set being cleared. Moreover, as operations are considered individually, transitive closures are not useful : there are no theorems to be shown about sequences of operations. \section{Conclusion and related work} The \emph{Alloy} formal method, based on first-order relational logic with transitive closures, allowed to specify the \emph{Mondex} case study almost entirely, that is just dropping the properties about finiteness, even though those properties may be shown indirectly. Then, without those properties, this work shows that the \emph{Mondex} case study can be rewritten as a first-order theory, even without transitive closures. Despite some implementation issues that should be improved in its successor version currently under development by the SDG group, the use of the \emph{Alloy Analyzer} allows to rapidly and efficiently develop a specification ; thanks to \emph{model-finding}, sanity checks are made in a straightforward way. The \emph{Alloy Analyzer} also allowed us to find \emph{bugs} in the original Z specification. Those bugs may be relevant to the specification itself as much as to the proof, or even to informal comments guiding the proof. Those bugs have also been found by other methods such as Z/Eves or KIV, so the \emph{Alloy Analyzer} can fairly compete in finding bugs in specifications. However, beyond finding those bugs, the \emph{Alloy Analyzer} itself does not provide any \emph{proof} of the theorems, as discussed in \cite{bib:frias}, only a confidence level depending on the size of the search space. But although this is often considered to be enough in industrial software verification, it would not fit to try to prove security-sensitive specifications such as the \emph{Mondex} case study. So it is necessary to extend the results obtained with the \emph{Alloy Analyzer}. Lee Momtahan's work upon a small model theorem \cite{bib:smt} could be a first step towards generalizing results given by the model-finder. But its too strong constraints over the specification, requiring signatures to not be quantified at all, do not fit the \emph{Mondex} case study. So, other formal methods have to complete the use of the \emph{Alloy Analyzer}. \emph{Prioni} \cite{bib:prioni} intends to use \emph{Alloy} specifications with the \emph{Athena} proof assistant, which is not fully automatic. But trying to handle \emph{Alloy} models in first-order logic could be also interesting. We have done some first attempts \cite{bib:mondextech}, but only \emph{Abstract} security properties have been shown so far. In fact, to be able to practically use theorem provers, it would be necessary to improve the conception of automated theorem provers, which is the concern raised by competitions such as TPTP \cite{bib:tptp}. For more general cases than \emph{Mondex} which might use transitive closures, Tal Lev Ami's work \cite{bib:tcfol} could represent an interesting first-order logic complement to \emph{Alloy}, as it moreover tries to handle transitive closures. It could be also interesting to develop syntactic analysis of Alloy specifications, or even to automatize relational calculus and reasoning directly at the formula level, which would make the constraint of finite scope irrelevant, as discussed in \cite{bib:frias}. \section{Acknowledgments} I would like to acknowledge Daniel Jackson and all the SDG members for their useful help --- sometimes overnight --- throughout, and even after, this internship, and for letting me discover the spirit of the laboratory, which introduced me to some research topics not tackled in France, through visiting the laboratories and attending talks led by researchers from all over the world. I would also like to acknowledge all the VSR-net members for their attendance on the meetings, in particular Jim Woodcock and Juan Bicarregui for their hospitality and their particularly enjoyable organization of the VSR-net events in England. I would also like to acknowledge Patrick Cousot, the students' advisor within the Computer Science Department of the École Normale Supérieure, for connecting Daniel Jackson and me, and more globally for his reliable pieces of advice about research topics. {\footnotesize \bibliographystyle{alpha} \addcontentsline{toc}{section}{\refname}\bibliography{mondex-alloy} } \end{document}