Intersection types

intro, inheritance, subtypes, polymorphism, types, hiding, self-reference, summary, Q/A, literature

We define our second version of the typed lambda calculus ( λ_∧) as an extension of the first version ( λ_∧), an extension which provides facilities for (ad hoc) overloading polymorphism. Our extension consists of adding a type expression λ_∧ which denotes a finite conjunction of types. Such a conjunction of types, that we will also write as λ_∧ is called an intersection type. The idea is that an expression e of type λ_∧ is correctly typed if e : τ_i for some i in e : τ_i . This is expressed in the type assignment rule given in slide 9-c-intersection.

τ:: = ρ | τ₁ → τ₂ | ∧[ τ₁ .. τ_n ]

∧[ τ₁ .. τ_n ] \leqslant τ_i
Γ\vdash ∧[ σ→ τ₁ .. σ→ τ_n ] \leqslant σ→ ∧[ τ₁ .. τ_n ]

slide: The intersection type calculus

The subtyping rule for intersection types states that any subtype of a type occurring in the intersection type

τ:: = ρ | τ₁ → τ₂ | ∧[ τ₁ .. τ_n ]

∧[ τ₁ .. τ_n ] \leqslant τ_i
Γ\vdash ∧[ σ→ τ₁ .. σ→ τ_n ] \leqslant σ→ ∧[ τ₁ .. τ_n ]

is itself a subtype of the intersection type. In addition we have two subtyping rules without premises, the first of which says that the intersection type itself may be regarded as a subtype of any of its components. In other words, from a typing perspective an intersection type is equal (hence may be replaced by) any of its component types. Also, we may refine a function, with domain σ, which has an intersection type ∧[ τ₁,…, τ_n ] as its range into an intersection type consisting of functions ∧[ τ₁,…, τ_n ] for i = 1..n. Intersection types allow us to express a limited form of overloading, by enumerating a finite collection of possible types. Since the collection of types comprising an intersection type is finite, we do not need a higher order calculus here, although we might have used type abstraction to characterize intersection types.

i = 1..n

slide: Intersection types -- examples

A typical example of an intersection type is presented by the addition operator, overloaded for integers and reals, which we may define as

   
  
  
   
   + : ∧[ Int ×Int → Int, Real ×Real → Real ]

According to our refinement rule, we may specialize an intersection type into any of its components. For example, when we have an intersection type defining a mapping for integers and a mapping for reals, we may choose the one that fits our purposes best. This example illustrates that intersection types may be an important tool for realizing optimizations that depend upon (dynamic) typing. Similarly, we may refine a generic function working on objects into a collection of (specialized) functions by dividing out the range type. See slide 9-ex-intersection. The resulting intersection type itself may subsequently be specialized into one of the component functions. In [CGL93], a similar kind of type is used to model the overloading of methods in objects, that may but need not necessarily be related by inheritance. The idea is to regard message passing to objects as calling a polymorphic function that dispatches on its first argument. When the type of the first argument is compatible with multiple functions (which may happen for methods that are refined in the inheritance hierarchy) the most specific function component is chosen, that is the method with the minimal object type. A similar idea is encountered in CLOS, which allows for the definition of multi-methods for which dynamic dispatching takes place for all arguments. (A problem that occurs in modeling methods as overloaded functions is that the subtyping relation between methods no longer holds, due to the domain contravariance requirement. See [CGL93] for a possible solution.)

Overloading in C++

Although C++ does not provide support for subtyping, it does provide extensive support for function overloading. Given a collection of functions (overloading a particular function name) C++ employs a system of matches to select the function that is most appropriate for a particular call.

Overloaded function selection rules

C++

[1] no or unavoidable conversions -- array->pointer, T -> const T
[2] integral promotion -- char->int, short->int, float->double
[3] standard conversions -- int->double, double->int, derived* -> base*
[4] user-defined conversions -- constructors and operators
[5] ellipsis in function declaration -- ...

Multiple arguments -- intersect rule

better match for at least one argument and at least as good a match for every other argument

slide: Overloading in C++

Matches may involve built-in or user-defined conversions. The general rule underlying the application of conversions is that {\em conversions that are considered less error-prone and surprising are to be preferred over the others}. This rule is reflected in the ordering of the C++ overloading selection rules depicted in slide 9-cc-over. According to the rules, the absence of conversions is to be preferred. For compatibility, with C, array to pointer conversions are applied automatically, and also T to const T conversions are considered as unproblematic. Next, we have the integral promotion rules, allowing for the conversion of char to int and short to int, for example. These conversions are also directly inherited from C, and are safe in the sense that no information loss occurs. Further, we have the standard conversions such as int to double and derived* to base*, user-defined conversions (as determined by the definition of one-argument constructors and conversion operators), and the ... ellipsis notation, which allows us to avoid type-checking in an arbitrary manner. For selecting the proper function from a collection of overloaded functions with multiple arguments, the so-called intersect rule is used, which states that the function is selected with a better match for at least one argument and at least as good a match for every other argument. In the case that no winner can be found because there are multiple candidate functions with an equally good match, the compiler issues an error, as in the example below:

  void f(int, double);
  void f(double, int);
  
  f(1,2.0);  f(int, double);
  f(2.0,1);  f(double,int);
  f(1,1);  error: ambiguous

slide: example

The reason that C++ employs a system of matches based on declarations and actual parameters of functions is that the graph of built-in conversions (as inherited from C) contains cycles. For example, implicit conversions exist from int to double and double to int (although in the latter case the C++ compiler gives a warning). Theoretically, however, the selection of the best function according to the subtype relation would be preferable. However, the notion of best is not unproblematic in itself. For example, consider the definition of the overloaded function f and the classes P and C in slide 9-cc-best.

  class P;
  class C;
  
  void f(P* p) { cout << "f(P*)"; }  (1)
  void f(C* c) { cout << "f(C*)"; }  (2)
  
  class P {
  public:
  virtual void f() { cout << "P::f"; } (3)
  };
  
  class C : public P {
  public:
  virtual void f() { cout << "C::f"; }  (4)
  };

slide: Static versus dynamic selection

What must be considered the best function f, given a choice between (1), (2), (3) and (4)?

  P* p = new P;  static and dynamic P*
  C* c = new C;  static and dynamic C*
  P* pc = new C;  static P*, dynamic C*
  
  f(p);  f(P*)
  f(c);  f(C*)
  f(pc);  f(P*)
  
  p->f();  P::f
  c->f();  C::f
  pc->f();  C::f

In the example given above, we see that for the functions f (corresponding to (1) and (2)) the choice is determined by the static type of the argument, whereas for the member functions f (corresponding to (3) and (4)) the choice is determined by the dynamic type. We have a dilemma. When we base the choice of functions on the dynamic type of the argument, the function subtype refinement rule is violated. On the other hand, adhering to the domain contravariance property seems to lead to ignoring the potentially useful information captured by the dynamic type of the argument.

Intersection types

Overloading in C++

Overloaded function selection rules

Multiple arguments -- intersect rule

f(int, double);

f(double,int);

error: ambiguous

(1)

(2)

(3)

(4)

static and dynamic P*

static and dynamic C*

static P, dynamic C

f(P)*

f(C)*

f(P)*

P::f

C::f

C::f