[1] 
Vincent Simonet.
Classifying YouTube channels: a practical system.
In Proceedings of the 2nd International Workshop on Web of
Linked Entities (WOLE 2013), in Proceedings of the 22nd International
conference on World Wide Web companion, pages 12951304, Rio de Janeiro
(Brasil), May 2013.
©ACM.
[ bib 
PDF 
At Google's 
At ACM's ]
This paper presents a framework for categorizing channels of videos in a thematic taxonomy with high precision and coverage. The proposed approach consists of three main steps. First, videos are annotated by semantic entities describing their central topics. Second, semantic entities are mapped to categories using a combination of classifiers.Last, the categorization of channels is obtained by combining the results of both previous steps. This framework has been deployed on the whole corpus of YouTube, in 8 languages, and used to build several user facing products. Beyond the description of the framework, this paper gives insight into practical aspects and experience: rationale from product requirements to the choice of the solution, spam filtering, humanbased evaluations of the quality of the results, and measured metrics on the live site.

[2] 
Vincent Simonet and François Pottier.
A constraintbased approach to guarded algebraic data types.
ACM Transactions on Programming Languages and Systems, 29(1),
January 2007.
[ bib 
PostScript 
PDF 
At ACM's ]
We study HMG(X), an extension of the constraintbased type system HM(X) with deep pattern matching, polymorphic recursion, and guarded algebraic data types. Guarded algebraic data types subsume the concepts known in the literature as indexed types, guarded recursive datatype constructors, (firstclass) phantom types, and equality qualified types, and are closely related to inductive types. Their characteristic property is to allow every branch of a case construct to be typechecked under different assumptions about the type variables in scope. We prove that HMG(X) is sound and that, provided recursive definitions carry a type annotation, type inference can be reduced to constraint solving. Constraint solving is decidable, at least for some instances of X, but prohibitively expensive. Effective type inference for guarded algebraic data types is left as an issue for future research.

[3] 
Vincent Simonet and François Pottier.
Constraintbased type inference for guarded algebraic data
types.
Research Report 5462, INRIA, January 2005.
[ bib 
PostScript 
PDF 
At INRIA's ]
Guarded algebraic data types subsume the concepts known in the literature as indexed types, guarded recursive datatype constructors, and firstclass phantom types, and are closely related to inductive types. They have the distinguishing feature that, when typechecking a function defined by cases, every branch may be checked under different assumptions about the type variables in scope. This mechanism allows exploiting the presence of dynamic tests in the code to produce extra static type information.

[4] 
Vincent Simonet.
Inférence de flots d'information pour ML: formalisation et
implantation.
PhD thesis, Université Paris 7, March 2004.
[ bib 
PostScript 
PDF ]
This thesis describes the conception of an information flow analyser for a language of the ML family, from its theoretical foundation to the practical issues. The first part of the dissertation presents the tool that was implemented, Flow Caml, and illustrates its use on concrete example. The second part gives a formalization of the type system featured by Flow Caml, together with a proof of its correctness. This is the first type system for information flow analysis in a realistic programming language that has been formally proved. Lastly, the third part is devoted to the formalization and the proof of an efficient algorithm for type inference in the presence of structural subtyping and polymorphism. An instance of this algorithm is used to synthesize types in Flow Caml.

[5] 
Vincent Simonet.
Type inference with structural subtyping: A faithful
formalization of an efficient constraint solver.
In Atsushi Ohori, editor, Proceedings of the Asian Symposium on
Programming Languages and Systems (APLAS'03), volume 2895 of Lecture
Notes in Computer Science, pages 283302, Beijing, China, November 2003.
SpringerVerlag.
©SpringerVerlag.
[ bib 
PostScript 
PDF 
Full version (PostScript) 
Full version (PDF) 
At Springer's ]
We are interested in type inference in the presence of structural subtyping from a pragmatic perspective. This work combines theoretical and practical contributions: first, it provides a faithful description of an efficient algorithm for solving and simplifying constraints; whose correctness is formally proved. Besides, the framework has been implemented in Objective Caml, yielding a generic type inference engine. Its efficiency is assessed by a complexity result and a series of experiments in realistic cases.

[6] 
Vincent Simonet.
An extension of HM(X) with bounded existential and universal
datatypes.
In Proceedings of the 8th ACM SIGPLAN International Conference
on Functional Programming (ICFP 2003), pages 3950, Uppsala, Sweden, August
2003.
©ACM.
[ bib 
PostScript 
PDF 
Full version (PostScript) 
Full version (PDF) 
At ACM's ]
We propose a conservative extension of HM(X), a generic constraintbased type inference framework, with bounded existential (a.k.a. abstract) and universal (a.k.a. polymorphic) datatypes. In the first part of the article, which remains abstract of the type and constraint language (i.e. the logic X), we introduce the type system, prove its safety and define a type inference algorithm which computes principal typing judgments. In the second part, we propose a realistic constraint solving algorithm for the case of structural subtyping, which handles the nonstandard construct of the constraint language generated by type inference: a form of bounded universal quantification.

[7] 
Vincent Simonet.
The Flow Caml System: documentation and user's manual.
Technical Report 0282, Institut National de Recherche en Informatique
et en Automatique (INRIA), July 2003.
©INRIA.
[ bib 
HTML 
PostScript 
PDF 
At INRIA's ]
Flow Caml is an extension of the Objective Caml language with a type system tracing information flow. Its purpose is basically to allow to write “real” programs and to automatically check that they obey some confidentiality or integrity policy. In Flow Caml, standard ML types are annotated with security levels chosen in a userdefinable lattice. Each annotation gives an approximation of the information that the described expression may convey. Because it has full type inference, the system verifies, without requiring source code annotations, that every information flow caused by the analyzed program is legal with regard to the security policy specified by the programmer.

[8] 
Vincent Simonet.
Flow Caml in a nutshell.
In Graham Hutton, editor, Proceedings of the first APPSEMII
workshop, pages 152165, Nottingham, United Kingdom, March 2003.
[ bib ]
Flow Caml is an extension of the Objective Caml language with a type system tracing information flow. It automatically checks information flow within Flow Caml programs, then translates them to regular Objective Caml code that can be compiled by the ordinary compiler to produce secure programs. In this paper, we give a short overview of this system, from a practical viewpoint.

[9] 
François Pottier and Vincent Simonet.
Information flow inference for ML.
ACM Transactions on Programming Languages and Systems,
25(1):117158, January 2003.
©ACM.
[ bib 
PostScript 
PDF 
At ACM's ]
This paper presents a typebased information flow analysis for a callbyvalue λcalculus equipped with references, exceptions and letpolymorphism, which we refer to as ML. The type system is constraintbased and has decidable type inference. Its noninterference proof is reasonably lightweight, thanks to the use of a number of orthogonal techniques. First, a syntactic segregation between values and expressions allows a lighter formulation of the type system. Second, noninterference is reduced to subject reduction for a nonstandard language extension. Lastly, a semisyntactic approach to type soundness allows dealing with constraintbased polymorphism separately.

[10] 
Vincent Simonet.
Finegrained information flow analysis for a λcalculus
with sum types.
In Proceedings of the 15th IEEE Computer Security Foundations
Workshop (CSFW 15), pages 223237, Cape Breton, Nova Scotia (Canada), June
2002.
©IEEE.
[ bib 
PostScript 
PDF 
Full version (PostScript) 
Full version (PDF) ]
This paper presents a new type system tracing information flow for a λcalculus equipped with polymorphic “let” and with sums (a.k.a. union types or polymorphic variants). The type system allows establishing (weak) noninterference properties. Thanks to original forms of security annotations and constraints, it is more accurate than existing analyses. Through a straightforward encoding into sums, this work also provides a new typebased information flow analysis for programming languages featuring exceptions. From these systems, one may derive constraintbased formulations, in the style of HM(X), which have decidable type inference.

[11] 
François Pottier and Vincent Simonet.
Information flow inference for ML.
In Proceedings of the 29th ACM Symposium on Principles of
Programming Languages (POPL'02), pages 319330, Portland, Oregon, January
2002.
©ACM.
[ bib 
PostScript 
PDF 
Full version (PostScript) 
Full version (PDF) 
At ACM's ]
This paper presents a typebased information flow analysis for a callbyvalue lambdacalculus equipped with references, exceptions and letpolymorphism, which we refer to as Core ML. The type system is constraintbased and has decidable type inference. Its noninterference proof is reasonably lightweight, thanks to the use of a number of orthogonal techniques. First, a syntactic segregation between values and expressions allows a lighter formulation of the type system. Second, noninterference is reduced to subject reduction for a nonstandard language extension. Lastly, a semisyntactic approach to type soundness allows dealing with constraintbased polymorphism separately.

[12]  Vincent Simonet. Inférence de flots d'information pour ML. Master's thesis, DEA « Programmation : Sémantique, Preuves, et Langages », March 2001. In French. [ bib  PostScript  PDF ] 