Research

Publications » Talks »

The research activities I've performed during my PhD thesis (from mid-2000 to beginning of 2004) are in the area of typing of programming languages. This technique allows analyzing a program statically (i.e. without running it) in order to establish some property about its execution. The most common (known as type safety) ensures the absence of runtime errors, but others can be considered, such as security properties. Roughly speaking, typing consists in associating with each program fragment a type, which describes some aspects of its behaviour, and can be deduced from the types of its sub-fragments by a set of logical rules. The determination of the type of each program component can be performed manually by the programmer thanks to annotations in the program's source code (so that the compiler has only to check the validity of provided types), or automatically thanks to a type inference algorithm. Probably, the best known programming languages equipped with a complete type inference algorithm are those of the ML family, such as SML and Caml.

Information flow analysis

Information flow analysis consists in statically determining how a program's outputs are related to its inputs, i.e. how the former depend, directly or indirectly, on the latter. This allows establishing secrecy or integrity properties of a program, i.e. proving that some aspects of its behavior convey no information about those of its inputs deemed ``secret'', or remain independent of those deemed ``unreliable''. These properties are instances of noninterference: they state the absence of certain dependencies.

I have designed, with François Pottier, a type-based information flow analysis for the ML language: a call-by-value lambda-calculus equipped with data structures, references and exceptions. To the best of my knowledge, this is the first type system for information flow analysis dealing with a real-sized programming language whose correctness has been formally proved: the calculus we studied is indeed very close to the core of the functional programming language Caml. Another interest of this work relies in the fact that, in keeping with the ML tradition, our type system has let-polymorphism and type inference, two prominent features absent of most previous works on this topic. In addition to structure, our types describe information flow; polymorphism allows writing code that is generic with respect to both. Type inference is indispensable, because types in information flow systems are verbose, and because information flow often occurs in unexpected ways.

Lastly, I have implemented the Flow Caml system, which is an extension of the OCaml compiler with the previous type system. This allows experimenting our framework on real cases and assessing its practicability. Another interest of Flow Caml relies in the fact that it is one of the first realistic programming language, which features both subtyping and polymorphism, to be implemented.

Constraint-based type systems and type inference

During the 1990s and early 2000s, several researchers have proposed formulations of type systems and type inference algorithms which involve constraints, that means logical assertions about types and type variables. One of the most popular is the parametrized extension of Hindley and Milner's type discipline, HM(X), whose original description is due to Odersky, Sulzmann, and Wehr. As is now widely recognized, the introduction of constraints is beneficial for at least two reasons. First, it is pleasant to reduce type inference to constraint solving, even when working within the basic setting of ML. Second, this allows moving easily to much more general settings, e.g. systems with subtyping.

First and foremost, I have studied structural subtyping. This is a widely used form of subtyping, where two comparable types must have the same shape and can only differ by their atomic leaves (in particular, there is no lowest or greatest type). This form of subtyping naturally arises when extending some unification-based type system with atomic annotations belonging to a poset, in order to perform some static analysis (e.g. uncaught exceptions detection, data or information flow analysis). I have in particular designed a realistic and efficient constraint solving algorithm, for type inference in systems equipped with structural subtyping and polymorphism. I gave a faithful description of it in a paper, together with a formal proof of correctness. The Dalton library is an implementation of this algorithm I have carried out in OCaml. This library comes as a Caml functor parametrized by a series of modules describing the client's type system; hence, I hope it will be a suitable type inference engine for a variety of applications. In particular, I used it in the development of Flow Caml.

I have also be interested in some enrichment of the HM(X) framework. First, I have proposed an extension of HM(X) with bounded existential and universal data-types. This work generalizes Odersky and Läufer's abstract types for ML, by allowing quantifications to be bounded by an arbitrary constraint. Here, type inference can be reduced to solving constraints which involve (restricted) forms of universal quantification and implication. These constructs are generally not handled by existing constraint solver for subtyping; but, I also proposed a realistic constraint solving algorithm for the case of structural subtyping, which handles these non-standard forms.

Following this work, I studied, with François Pottier, type inference for guarded algebraic data-types. Guarded algebraic data types, which subsume the concepts known in the literature as indexed types, guarded recursive data-type constructors, and phantom types, and are closely related to inductive types, have the distinguishing feature that, when typechecking a function defined by cases, every branch must be checked under different typing assumptions. This mechanism allows exploiting the presence of dynamic tests in the code to produce extra static type information. We proposed an extension of HM(X) with pattern matching, guarded algebraic data-types, and polymorphic recursion, and show that, up to a few mandatory type annotations, type inference for this system may be reduced to constraint solving. This has been the starting point of a series of research works leading to the implementation of guarded algebraic data-types in OCaml, in 2012.