Union Types with Disjoint Switches

Metadata

Title: Union Types with Disjoint Switches
Venue: ECOOP’22
Area: Intersection and union duality
Source URL: https://i.cs.hku.hk/~bruno/papers/ecoop22switches.pdf
Downloaded PDF: Union-Types-with-Disjoint-Switches.pdf
Extracted assets: _assets/Union-Types-with-Disjoint-Switches/
Pages: 32
Embedded image count reported by PyMuPDF: 0

Why this note exists

This note is part of the HKU PL Group literature knowledge base for work on iso-recursive subtyping, intersection types, union types, disjointness, merge operators, and compositional programming. It is designed for later Coding Agent / Proof Agent use during proof search, design brainstorming, and comparison against prior systems.

Extraction quality note

This is the fast pass generated with pymupdf4llm, not the heavier marker-pdf pass. The original PDF is linked above and kept in _pdfs/. Figures/images detected by PyMuPDF are saved under _assets/. Some mathematical notation may still require checking against the PDF, especially inference rules with complex layouts.

Technical landmarks detected automatically

These are line references into the converted Markdown below. They are meant as a quick navigation layer for rules, lemmas, theorems, proofs, algorithms, examples, and counterexamples.

line 1: Union Types with Disjoint Switches
line 3: Baber Rehman �
line 13: ~~Abstract~~
line 21: Keywords and phrases Union types, switch expression, disjointness, intersection types
line 25: Supplementary Material ECOOP 2022 Artifact Evaluation approved artifact available at : https: //github.com/baberrehman/disjoint-switches
line 29: 1 Introduction
line 39: 25:2 Union Types with Disjoint Switches
line 65: To illustrate the applications of disjoint switches, we show that they provide an alternative to certain forms of overloading , and they enable a simple approach to nullable (or optional)
line 73: 25:4 Union Types with Disjoint Switches
line 75: types . All the results about λu and its extensions have been formalized in the Coq theorem prover. In summary, the contributions of this paper are:
line 79: 2 Overview
line 83: 2.1 Tagged Union Types
line 110: 2.2 Type-directed Elimination forms for Union Types
line 121: Type-directed elimination. Some systems [22] support type-directed elimination of union types. For instance, tostring2 has different behaviors depending on the type of x .
line 143: 25:6 Union Types with Disjoint Switches
line 165: 2.3 Union Types and Disjoint Switches in Ceylon
line 167: The Ceylon language [39] supports type-directed union elimination by a switch expression with branches. The following program is an example with union types using Ceylon’s syntax:
line 206: 2.4 Nullable Types
line 218: 25:8 Union Types with Disjoint Switches
line 221:
line 228: Figure 1 Ceylon’s subtyping hierarchy. Note that Null only has Nothing as its subtype.
line 234: is rejected with a type error, since null cannot have type String . Instead, a type that can have the null value must be defined explicitly in Ceylon using union types:
line 240: Now we cannot call str.size , as str may be null , and size is not defined on null . To get the size of str , we must first check whether str is null or not using disjoint switches:
line 249: Other uses of Union Types. Union types are also useful in many other situations. In Section 2.2 we illustrated a safediv operation, which can be easily encoded in Ceylon as:
line 260: 2.5 Key Ideas in Our Work
line 270: Disjointness. A central concept in the formulation of disjoint switches is disjointness. Our first hurdle was to come up with a suitable formal definition of disjointness. Consider the simple λu switch expression:
line 281: ▶ Definition 1 ( ⊥ -Disjointness) . A ∗ B ::= ∀ C, if C < : A and C < : B then ⌋C ⌊
line 287: ▶ Definition 2 ( ∧ -Disjointness) . A ∗ B ::= ∄ C[◦] , C[◦] < : A and C[◦] < : B.
line 297: 25:10 Union Types with Disjoint Switches
line 301: void do (<Integer & String > | Boolean val) / do something /
line 317: Whereas, the following program will be accepted if we know that Person and Vehicle are disjoint:
line 325: 3 The Union Calculus λu
line 336:
line 343: Figure 2 Syntax and subtyping for λu .
line 345: 3.1 Syntax
line 349: 3.2 Subtyping
line 353: 3.3 Disjointness
line 365: 25:12 Union Types with Disjoint Switches
line 377: - Lemma 3 (Bottom-Like Soundness and Completeness) . ⌋A⌊ if and only if ∀B, A < : B.
line 379: Declarative Disjointness. The declarative definition for disjointness is as follows:
line 381: - Definition 4 ( ⊥ -Disjointness) . A ∗ B ::= ∀ C, if C < : A and C < : B then ⌋C ⌊
line 383: That is, two types are disjoint if all their common subtypes are bottom-like . We give a few examples next, employing a bold font to highlight the types being compared for disjointness:
line 385: 1. A = Int , B = Int → Bool : Int and Int → Bool are disjoint types. All common subtypes of Int and Int → Bool are bottom-like types, including ⊥ and unions of ⊥ types.
line 387: 2. A = Int ∨ Bool , B = ⊥ : Int ∨ Bool and ⊥ are disjoint types. All common subtypes are bottom-like . In general, the type ⊥ (or any bottom-like type) is disjoint to another type.
line 402:
line 406:
line 410:
line 413: Figure 3 Bottom-like types, algorithmic disjointness and typing for λu .
line 419: - Theorem 5 (Soundness and Completeness of Algorithmic Disjointness) . A ∗a B if and only
line 425: 25:14 Union Types with Disjoint Switches
line 427: if A ∗ B.
line 431: ▶ Lemma 6 (Disjointness contravariance) . If A ∗ B and C < : A and D < : B then C ∗ D.
line 433: 3.4 Typing
line 437: 3.5 Operational Semantics
line 439: Now we discuss the small-step operational semantics of λu . An important aspect of this semantics is that union elimination is type-directed : types are used to pick the branch of the switch expression.
line 451: e −→ e[′] (Operational Semantics) e 1 −→ e 1 [′] e −→ e[′] step-appl[step-appr] e 1 e 2 −→ e 1 [′][e] 2 v e −→ v e[′] ( λx.e ) v −→ e [ x � v ][step-beta] e −→ e[′]
line 461: Figure 4 Operational semantics and approximate type definitions for λu .
line 463: 3.6 Type Soundness and Determinism
line 467: - Theorem 7 (Type Preservation) . If Γ ⊢ e : A and e −→ e[′] then Γ ⊢ e[′] : A.
line 469: - Theorem 8 (Progress) . If · ⊢ e : A then either e is a value; or e −→ e[′] for some e[′] .
line 473: ▶ Lemma 9 (Exclusivity of Disjoint Types) . If A ∗ B then ∄ v such that both Γ ⊢ v : A and Γ ⊢ v : B holds.
line 475: - Theorem 10 (Determinism) . If Γ ⊢ e : A and e −→ e 1 and e −→ e 2 then e 1 = e 2 .
line 477: 3.7 An Alternative Specification for Disjointness
line 479: The current definition of disjointness (Definition 4) works well for the calculus presented in this section. But it is not the only possible formulation of disjointness. An equivalent formulation of disjointness is:
line 481: - Definition 11 ( ∧ -Disjointness) . A ∗ B ::= ∄ C[◦] , C[◦] < : A and C[◦] < : B
line 485: For the calculus presented in this section, we prove that the new definition is equivalent to the previous definition of disjointness.
line 491: 25:16 Union Types with Disjoint Switches
line 493: ▶ Lemma 12 (Disjointness Equivalence) . Definition 11 (∧-Disjointness) is sound and complete to Definition 4 (⊥-Disjointness) in λu defined in this section.
line 497: 4 λu with Intersections, Distributive Subtyping and Nominal Types
line 501: 4.1 Syntax, Well-formedness and Ordinary Types
line 514:
line 524:
line 527: Figure 5 Additional syntax and well-formedness.
line 529: 4.2 Distributive Subtyping
line 539: 25:18 Union Types with Disjoint Switches
line 543: (Declarative Subtyping with Distributivity)
line 546:
line 549: Figure 6 Distributive subtyping for λu with intersection types and nominal types.
line 551: states that ( A → B 1) ∧ ( A → B 2) is a subtype of A → ( B 1 ∧ B 2). Rule ds-distarru states that ( A 1 → B ) ∧ ( A 2 → B ) is a subtype of ( A 1 ∨ A 2) → B type. Rule ds-distor distributes intersections over unions.
line 563: ∆ ⊢ A < : B (Algorithmic Subtyping with Distributivity)

Full converted paper text

Union Types with Disjoint Switches

Baber Rehman �

The University of Hong Kong

Xuejing Huang � The University of Hong Kong

Ningning Xie � University of Cambridge

Bruno C. d. S. Oliveira � The University of Hong Kong

Abstract

Union types are nowadays a common feature in many modern programming languages. This paper investigates a formulation of union types with an elimination construct that enables case analysis (or switches) on types. The interesting aspect of this construct is that each clause must operate on disjoint types. By using disjoint switches, it is possible to ensure exhaustiveness (i.e. all possible cases are handled), and that none of the cases overlap. In turn, this means that the order of the cases does not matter and that reordering the cases has no impact on the semantics, helping with program understanding and refactoring. While implemented in the Ceylon language, disjoint switches have not been formally studied in the research literature, although a related notion of disjointness has been studied in the context of disjoint intersection types .

We study union types with disjoint switches formally and in a language independent way. We first present a simplified calculus, called the union calculus ( λu ), which includes disjoint switches and prove several results, including type soundness and determinism . The notion of disjointness in λu is in essence the dual notion of disjointness for intersection types. We then present a more feature-rich formulation of λu , which includes intersection types, distributive subtyping and a simple form of nominal types. This extension reveals new challenges. Those challenges require us to depart from the dual notion of disjointness for intersection types, and use a more general formulation of disjointness instead. Among other applications, we show that disjoint switches provide an alternative to certain forms of overloading , and that they enable a simple approach to nullable (or optional) types . All the results about λu and its extensions have been formalized in the Coq theorem prover.

2012 ACM Subject Classification Theory of computation → Type theory

Keywords and phrases Union types, switch expression, disjointness, intersection types

Digital Object Identifier 10.4230/LIPIcs.ECOOP.2022.25

Supplementary Material ECOOP 2022 Artifact Evaluation approved artifact available at : https: //github.com/baberrehman/disjoint-switches

Acknowledgements We thank the anonymous reviewers for their helpful and constructive comments. This research was funded by the University of Hong Kong and Hong Kong Research Grants Council projects number 17209519, 17209520 and 17209821.

1 Introduction

Most programming languages support some mechanism to express terms with alternative types. Algol 68 [54, 55] included a form of tagged unions for this purpose. With tagged unions an explicit tag distinguishes between different cases in the union type. Such an approach has been adopted by functional languages, like Haskell, ML, or OCaml, which allow tagged unions (or sum types [48]), typically via either algebraic datatypes [15] or variant types [32]. Languages like C or C++ support untagged union types where values of the alternative

© Baber Rehman, Xuejing Huang, Ningning Xie, and Bruno C. d. S. Oliveira; licensed under Creative Commons License CC-BY 4.0

36th European Conference on Object-Oriented Programming (ECOOP 2022). Editors: Karim Ali and Jan Vitek; Article No. 25; pp. 25:1–25:32 Leibniz International Proceedings in Informatics Schloss Dagstuhl – Leibniz-Zentrum für Informatik, Dagstuhl Publishing, Germany

--- end of page.page_number=1 ---

25:2 Union Types with Disjoint Switches

types are simply stored at the same memory location. However, there is no checking of types when accessing values of such untagged types. It is up to the programmer to ensure that the proper values are accessed correctly in different contexts; otherwise the program may produce errors by accessing the value at the incorrect type.

Modern OOP languages, such as Scala 3 [44], Flow [24], TypeScript [13], and Ceylon [39], support a form of untagged union types. In such languages a union type A ∨ B denotes expressions which can have type A or type B . Union types have grown to be quite popular in some of these languages. A simple Google search on questions regarding union types on StackOverflow returns around 6620 results (at the time of writing), many of which arising from TypeScript programmers. Union types can be useful in many situations. For instance, union types provide an alternative to some forms of overloading and they enable an approach to nullable types (or explicit nulls) [34, 43].

To safely access values with union types, some form of elimination construct is needed. Many programming languages often employ a language construct that checks the types of the values at runtime for this purpose. Several elimination constructs for (untagged) union types have also been studied in the research literature [10, 29, 22]. Typically, such constructs take the form of a type-based case analysis expression.

A complication is that the presence of subtyping introduces the possibility of overlapping types . For instance, we may have a Student and a Person , where every student is a person (but not vice-versa). If we try to eliminate a union using such types we can run into situations where the type in one branch can cover a type in a different branch (for instance Person can cover Student ). More generally, types can partially overlap and for some values two branches with such types can apply, whereas for some other values only one branch applies. Therefore, the design of such elimination constructs has to consider what to do in situations where overlapping types arise. A first possibility is to have a non-deterministic semantics , where any of the branches that matches can be taken. However, in practice determinism is a desirable property, so this option is not practical. A second possibility, which is commonly used for overloading, is to employ a best-match semantics , where we attempt to find the case with the type that best matches the value. Yet another option is to use a first-match semantics , which employs the order of the branches in the case. Various existing elimination constructs for unions [10, 22] employ a first-match approach. All of these three options have been explored and studied in the literature.

The Ceylon language [39] is a JVM-based language that aims to provide an alternative to Java. The type system is interesting in that it departs from existing language designs, in particular with respect to union types and method overloading. The Ceylon designers had a few different reasons for this. They wanted to have a fairly rich type system supporting, among others: subtyping ; generics with bounded quantification ; union and intersection types ; and type-inference . The aim was to support most features available in Java, as well as a few new ones. However the Ceylon designers wanted to do this in a principled way, where all the features interacted nicely. A stumbling block towards this goal was Java-style method overloading [2]. The interaction of overloading with other features was found to be challenging. Additionally, overloaded methods with overlapping types make reasoning about the code hard for both tools and humans. Algorithms for finding the best match for an overloaded method in the presence of rich type system features (such as those in Ceylon) are challenging, and not necessarily well-studied in the existing literature. Moreover allowing overlapping methods can make the code harder to reason for humans: without a clear knowledge of how overloading resolution works, programmers may incorrectly assume that a different overloaded method is invoked. Or worse, overloading can happen silently, by simply reusing

--- end of page.page_number=2 ---

B. Rehman, X. Huang, N. Xie, and B. C. d. S. Oliveira

25:3

the same name for a new method. These problems can lead to subtle bugs. For these reasons, the Ceylon designers decided not to support Java-style method overloading.

To counter the absence of overloading, the Ceylon designers turned to union types instead, but in a way that differs from existing approaches. Ceylon includes a type-based switch construct where all the cases must be disjoint . If two types are found to be overlapping, then the program is statically rejected. Many common cases of method overloading, which are clearly not ambiguous, can be modelled using union types and disjoint switches. By using an approach based on disjointness, some use cases for overloading that involve Java-style overloading with overlapping types are forbidden. However, programmers can still resort to creating non-overloaded methods in such a case, which arguably results in code easier to reason about. Disjointness ensures that it is always clear which implementation is selected for an “overloaded” method, and only in such cases overloading is allowed[1] . In the switch construct, the order of the cases does not matter and reordering the cases has no impact on the semantics, which can also aid program understanding and refactoring. Finally, from the language design point of view, it would be strange to support two mechanisms (method overloading and union types), which greatly overlap in terms of functionality.

While implemented in the Ceylon language, disjoint switches have not been studied formally. To our knowledge, the work by Muehlboeck and Tate [42] is the only work where Ceylon’s switch construct and disjointness are mentioned. However, their focus is on algorithmic formulations of distributive subtyping with unions and intersection types. No semantics of the switch construct is given. Disjointness is informally defined in various sentences in the Ceylon documentation. It involves a set of 14 rules described in English [1]. Some of the rules are relatively generic, while others are quite language specific. Interestingly, a notion of disjointness has already been studied in the literature for intersection types [45]. That line of work studies calculi with intersection types and a merge operator [49]. Disjointness is used to prevent ambiguity in merges, which can create values with types such as Int ∧ Bool. Only values with disjoint types can be used in a merge.

In this paper, we study union types with disjoint switches formally and in a language independent way. We present the union calculus ( λu ), which includes disjoint switches and union types. The notion of disjointness in λu is interesting in the sense that it is the dual notion of disjointness for intersection types. We prove several results, including type soundness , determinism and the soundness and completeness of algorithmic formulations of disjointness. We also study several extensions of λu . In particular, the first extension (discussed in Section 4) adds intersection types, nominal types and distributive subtyping to λu . It turns out such extension is non-trivial, as it reveals a challenge that arises for disjointness when combining union and intersection types: the dual notion of disjointness borrowed from disjoint intersection types no longer works, and we must employ a novel, more general, notion instead. Such change also has an impact on the algorithmic formulation of disjointness, which must change as well. We also study two other extensions for parametric polymorphism and a subtyping rule for a class of empty types in the extended version of this paper. We prove that all the extensions retain the original properties of λu . Furthermore, for our subtyping relation in Section 4 we give a sound , complete and decidable algorithmic formulation by extending the algorithmic formulation employing splittable types by Huang and Oliveira [36].

To illustrate the applications of disjoint switches, we show that they provide an alternative to certain forms of overloading , and they enable a simple approach to nullable (or optional)

1 Ceylon does allow dynamic type tests, which in combination with switches can simulate some overlapping.

ECOOP 2022

--- end of page.page_number=3 ---

25:4 Union Types with Disjoint Switches

types . All the results about λu and its extensions have been formalized in the Coq theorem prover. In summary, the contributions of this paper are:

The λu calculus: We present a simple calculus with union types, nullable types and a disjoint switch construct. We then present a richer extension of λu with intersection types, distributive subtyping and nominal types. In addition, in the extended version of the paper, we study extensions with parametric polymorphism and a subtyping rule to detect empty types. All calculi and extensions are type sound and deterministic. Sound, complete and decidable formulations of disjointness and subtyping: We present two formulations of disjointness, which are general and language independent. The second formulation is novel and more general, and can be used in a calculus that includes intersection types as well. We also extend a previous subtyping relation [36] to include nominal types. For both disjointness and subtyping we show that the specifications are sound, complete and decidable and present the corresponding algorithmic formulations. Mechanical formalization: The results about λu and its extensions have been formalized in the Coq theorem prover and can be found in the supplementary materials, together with an extended version of the paper.

2 Overview

This section provides some background on union types and some common approaches to eliminate union types. Then it describes the Ceylon approach to union types, and discusses a few applications of union types. Finally, it presents the key ideas and challenges in our work.

2.1 Tagged Union Types

We start with a brief introduction to union types. An expression has a union type A ∨ B , if it can be considered to have either type A or type B . Many systems model tagged union types (also called sum types or variants types ), where explicit tags are used to construct terms with union types, as in languages with algebraic datatypes [15] or (polymorphic) variants [32]. In their basic form, there are two introduction forms: inj1 : A → A ∨ B turns the type of an expression from A into A ∨ B ; and inj2 : B → A ∨ B turns the type of an expressions from B into A ∨ B . Using tagged union types, we can implement a safe integer division function, as[2] :

String|Intsafediv(x:Int)(y:Int)=
if(y==0)theninj1"Dividedbyzero"elseinj2(x/y)//usestags

Here the intention is to have a safe (integer) division operation that detects division by zero errors, and requires clients of this function to handle such errors. The return type String | Int denotes that the function can either return an error message (a string), or an integer, when division is performed without errors.

Elimination form for tagged union types. Tagged union types are eliminated by some form of case analysis. For consistency with the rest of the paper, we use a syntactic form with switch expressions for such case analysis. For example, the following program tostring has different behaviors depending on the tag of x , where show takes an Int and returns back its string representation.

2 Throughout this paper, we write union types as A | B in code, since this is widely adopted in programming languages (e.g., Ceylon, Scala, and TypeScript), and as A ∨ B in the formal calculi, which is more frequently used in the literature.

--- end of page.page_number=4 ---

B. Rehman, X. Huang, N. Xie, and B. C. d. S. Oliveira

25:5

Stringtostring(x:String|Int)=switch(x)
inj1str->str
inj2num->shownum

2.2 Type-directed Elimination forms for Union Types

While tags are useful to make it explicit which type a value belongs to, they also add clutter in the programs. On the other hand, in systems with subtyping for union types [29, 47, 42], explicit tags are replaced by implicit coercions represented by the two subtyping rules A < : A ∨ B and B < : A ∨ B . In this paper we refer to union types where the explicit tags are replaced by implicit coercions as untagged union types , or simply union types . In those systems, a term of type A or B can be directly used as if it had type A ∨ B , and thus we can write safe division as:

String|Intsafediv2(x:Int)(y:Int)=
if(y==0)then"Dividedbyzero"else(x/y)//notags!

However, now the elimination form of union types cannot rely on explicit tags anymore, and different systems implement elimination forms differently. The most common alternative is to employ types in the elimination form. We review type-directed union elimination next.

Type-directed elimination. Some systems [22] support type-directed elimination of union types. For instance, tostring2 has different behaviors depending on the type of x .

Stringtostring2(x:String|Int)=switch(x)
(y:String)->y
(y:Int)->showy

However, compared to tag-directed elimination, extra care must be taken with typedirected elimination. In particular, while we can easily distinguish tags, ambiguity may arise when types in a union type overlap for type-directed elimination. For example, consider the type Person | Student , where we assume Student is a subtype of Person . With type-directed elimination, we can write:

Boolisstudent(x:Person|Student)=switch(x)
(y:Person)->False
(y:Student)->True

Now it is unclear what happens if we apply isstudent to a term of type Student , as its type matches both branches. In some calculi [29], the choice is not determined in the semantics, in the sense that either branch can be chosen. This leads to a non-deterministic semantics. In some other languages or calculi [22], branches are inspected from top to bottom, and the first one that matches the type gets chosen. However, in those systems, as Person is a supertype of Student , the first branch subsumes the second one and will always get chosen, and so the second branch will never get evaluated! This may be unintentional, and similar programs being accepted can lead to subtle bugs. Even if a warning is given to alert programmers that a case can never be executed, there are other situations where two cases overlap, but neither case subsumes the other. For instance we could have Student and Worker as subtypes of Person . For a person that is both a student and a worker, a switch statement that discriminates between workers and students could potentially choose either branch. However for persons that are only students or only workers, only one branch can be chosen.

ECOOP 2022

--- end of page.page_number=5 ---

25:6 Union Types with Disjoint Switches

Best-match and overloading. Some languages support an alternative to typed-based union elimination via method overloading. Such form is used in, for example, Java [33] and Julia [57]. In Java, we can encode isstudent2 as an overloaded method, which has different behaviors when the type of the argument differs.

booleanisstudent2(Personx){returnFalse;}
booleanisstudent2(Studentx){returnTrue;}

Java resolves overloading by finding and selecting, from all method implementations, the one with the best type signature that describes the argument. If we apply isstudent2 to a term of type Student , the second implementation is chosen, as Student is the best type describing the argument. As we can see, such a best-match strategy eliminates the order-sensitive problem, as it always tries to find the best-match despite the order. That is, in Java the method order does not matter: in this case, we have the method for Person before the one for Student , but Java still finds the one for Student .

However, the best-match strategy can also be confusing, especially when the system features implicit upcasting (e.g., by subtyping). If programmers are not very familiar with how overloading resolution works, they may assume that the wrong implementation is called in their code. For instance, in Java we may write:

Personp=newStudent ();
isstudent2(p);

In this case Java will pick the isstudent2 method with the argument Person , since Java overloading uses the static type ( p has the static type Person ) to resolve overloading. But some programmers may assume that the implementation of the method for Student would be chosen instead, since the person is indeed a student in this case. This can be confusing and lead to subtle bugs.

Moreover, there are other tricky situations that arise when employing a best-match strategy. For example, suppose that the type Pegasus is a subtype of both type Bird and type Horse . If a method isbird is overloaded for Bird and Horse , then which method implementation should we choose when we apply isbird to a term of type Pegasus , the one for Bird , or the one for Horse ? In such case, we have an ambiguity. Things get worse when the type system includes more advanced type system features, such as generics, intersections and union types, or type-inference.

2.3 Union Types and Disjoint Switches in Ceylon

The Ceylon language [39] supports type-directed union elimination by a switch expression with branches. The following program is an example with union types using Ceylon’s syntax:

voidprint(String|Integer|Floatx){
switch(x)
case(isString){print("String:‘‘x‘‘");}
case(isInteger|Float){print("Number:‘‘x‘‘");}
}

For the switch expression, Ceylon enforces static type checking with two guarantees: exhaustiveness , and disjointness . First, Ceylon ensures that all cases in a switch expression are exhaustive . In the above example, x can either be a string, an integer or a floating point number. The types used in the cases do not have to coincide with the types of x . Nevertheless, the combination of all cases must be able to handle all possibilities. If the last case only dealt with Integer (instead of Integer|Float ), then the program would be statically rejected, since no case deals with Float .

--- end of page.page_number=6 ---

B. Rehman, X. Huang, N. Xie, and B. C. d. S. Oliveira

25:7

Second, Ceylon enforces that all cases in a switch expression are disjoint . That is, unlike the approaches described in Section 2.2, in Ceylon, it is impossible to have two branches that match with the input at the same time. For instance, if the first case used the type String | Float instead of String , the program would be rejected statically with an error. Indeed, if the program were to be accepted, then the call print(3.0) would be ambiguous, since there are two branches that could deal with the floating point number. Note that, since the cases in a switch cannot overlap, their order is irrelevant to the program’s behavior and its evaluation result. All of the overlapping examples from the previous section will statically be rejected in similar fashion.

Union types as an alternative to overloading. One motivation for such type-directed union elimination in Ceylon is to model a form of function overloading. The following example, which is adapted from TypeScript’s documentation [3], demonstrates how to define an “overloaded” function padLeft , which adds some padding to a string. The idea is that there can be two versions of padLeft : one where the second argument is a string; and the other where the second argument is an integer:

`String` `space(Integer n){`	`String` `space(Integer n){`
`if (n==0)`	`{ return ""; }`
`else`	`{ return " "+space(n-1); }`
`}`
`String` `padLeft(String v, String	Integer x) {`
`switch (x)`
`case (is`	`String)` `{ return`	`x+v; }`
`case (is`	`Integer) { return`	`space(x)+v; }`
`}`
`print(padLeft("?", 5)); // "`		`?"`
`print(padLeft("World", "Hello`		`")); // "Hello`	`World"`

In padLeft , there are two cases of the switch construct depending on the type of x : the first one appends a string to the left of v , and the other calls function space to generate a string with x spaces, and then append that to v . Although statically x has type String|Integer , as a concrete value it can only be a string or an integer. As such, when values with such types are passed to the function, the corresponding branch is chosen and executed.

2.4 Nullable Types

Besides being used for overloading, union types can be used for other purposes too. Null pointer exceptions (NPEs) are a well-known and tricky problem in many languages. The problem arises when dereferencing a pointer with the null value. For instance, if we have a variable str , which is assigned to null , the the code print(str.size) , in a Java-like language, will raise a null pointer exception. This is because of so-called implicit nulls in Java and other popular languages. With implicit nulls, any variable of a reference type can be null .

An interesting application of union types in Ceylon is to encode nullable types (or optional types ) [34] in a type-safe way. A similar approach to nullable types has also been recently proposed for Scala [43]. In those languages, there is a special type Null , which is inhabited by the null value. Note that Null differs from Nothing (the bottom type in Ceylon), in the sense that Null is inhabited while Nothing is not. To illustrate the subtle difference, Figure 1 presents a part of the subtyping lattice in Ceylon. Anything , the top type in Ceylon, is an enumerated class. Anything is also a supertype of Object , which is the root of primitive types, function types, all interfaces and any user-defined class. Notably, Null is disjoint to Object , and therefore, to all user-defined classes.

In Ceylon the following code:

ECOOP 2022

--- end of page.page_number=7 ---

25:8 Union Types with Disjoint Switches

----- Start of picture text -----
Anything
Null Object
Char Integer
Nothing
----- End of picture text -----

Figure 1 Ceylon’s subtyping hierarchy. Note that Null only has Nothing as its subtype.

Stringstr=null;

is rejected with a type error, since null cannot have type String . Instead, a type that can have the null value must be defined explicitly in Ceylon using union types:

String|Nullstr=null;

Now we cannot call str.size , as str may be null , and size is not defined on null . To get the size of str , we must first check whether str is null or not using disjoint switches:

String|Nullstr=null;
switch(str)
case(isString){print(str.size);}
case(isNull){print ();}

Other uses of Union Types. Union types are also useful in many other situations. In Section 2.2 we illustrated a safediv operation, which can be easily encoded in Ceylon as:

String|Integersafediv3(Integerx,Integery){
if(y==0){return"Dividedbyzero";}
else{return(x/y);}
}

The return value can be a string or an integer, with no explicit tag needed, as union types are implicitly introduced. As long as the declared return type of the function is a supertype of all possible return values, it is valid in Ceylon.

2.5 Key Ideas in Our Work

We first introduce a simplified formulation of λu , which formalizes the basic ideas of union types with disjoint switches similar to those in the Ceylon Language. To our best knowledge, there is yet no formalism of disjoint switches, and λu studies those features formally and precisely. In particular, λu captures the key idea for type-directed elimination of union types in its switch construct in a language independent way, and formally defines disjointness, disjointness and subtyping algorithms, and the operational semantics. The simplified formulation of λu is useful to compare with existing calculi with union types in the literature [41, 9, 30, 29, 47, 18]. Moreover, we study a more fully featured formulation of λu that includes practical extensions, such as intersection types, distributive subtyping, nullable types and a simple form of nominal types. λu is proved to enjoy many desirable properties, such as type soundness, determinism and the soundness/completeness of disjointness and subtyping definitions. All the Ceylon examples in Sections 2.3 and 2.4 can be encoded in λu .

--- end of page.page_number=8 ---

B. Rehman, X. Huang, N. Xie, and B. C. d. S. Oliveira

25:9

Disjointness. A central concept in the formulation of disjoint switches is disjointness. Our first hurdle was to come up with a suitable formal definition of disjointness. Consider the simple λu switch expression:

switchx{
(y:String|Int)->0
(y:Int|Bool)->1
}

Here we wish to determine whether String ∨ Int and Int ∨ Bool are disjoint or not. In other words, we wish to determine whether, for any possible (dynamic) type that x can have, it is unambiguous which branch to choose. In this case, it turns out that there is ambiguity. For instance, if x is an integer, then either branch can be chosen. Thus λu rejects this program with a disjointness error. In this example, the reason to reject the program is basically that Int < : String ∨ Int and Int < : Int ∨ Bool. That is we can find a common subtype (Int) of the types in both branches. Moreover, that subtype can be inhabited by values (integer values in this case). If the only common subtypes of the types in the two branches would be types like ⊥ (which has no inhabitants), then the switch should be safe because we would not be able to find a value for x that would trigger two branches. This observation leads to the notion of disjointness employed in the first variant λu in Section 3. Formally, we have:

▶ Definition 1 ( ⊥ -Disjointness) . A ∗ B ::= ∀ C, if C < : A and C < : B then ⌋C ⌊

Here we use ⌋C ⌊ to denote that type C is equivalent to type ⊥ , or, bottom-like (i.e. C < : ⊥ ). In either definition, Int serves as a counter-example for String ∨ Int and Int ∨ Bool to be disjoint. Thus λu rejects the program above with a disjointness error. It is worth noting that this first notion of disjointness is essentially dual to a definition of disjointness for intersection types in the literature in terms of top-like common supertypes [45].

Disjointness in the presence of intersection types. The variant of λu in Section 3 does not include intersection types. Unfortunately, the disjointness definition above does not work in the presence of intersection types. The reason is simple: with intersection types we can always find common subtypes, such as Int ∧ Bool, which are not bottom-like, and yet they have no inhabitants. That is, Int ∧ Bool is not a subtype of ⊥ , but no value can have both type Int and type Bool. We address this issue by reformulating disjointness in terms of ordinary types [28], which are basically primitive types (such as integers or functions). If we can find common ordinary subtypes between two types, we know that they are not disjoint. Thus the disjointness definition used for formulations of λu with intersection types is:

▶ Definition 2 ( ∧ -Disjointness) . A ∗ B ::= ∄ C[◦] , C[◦] < : A and C[◦] < : B.

Note that here C[◦] is a metavariable denoting ordinary types. Under this definition we can check that Int and Bool are disjoint, since no ordinary type is a subtype of both of these two types. This definition avoids the issue with Definition 1, which would not consider these two types disjoint. Moreover, this definition is a generalization of the previous one, and in the variant with union types only the two definitions coincide.

This new definition requires a different approach to algorithmic disjointness. Our new approach is to use the notion of lowest ordinary subtypes : For any given type, we calculate a finite set to represent all the possible values that can match the type. Then we can easily determine whether two types are disjoint by ensuring that the intersection of their lowest ordinary subtypes is empty.

ECOOP 2022

--- end of page.page_number=9 ---

25:10 Union Types with Disjoint Switches

Distributive Subtyping. In Section 4, we study λu with an enriched distributive subtyping relation inspired by Ceylon programming language. Distributive subtyping is more expressive than standard subtyping and adds significant complexity in the system, in particular for a formulation of algorithmic subtyping and the completeness proof of the disjointness algorithm. Nevertheless, distributive subtyping does not affect the disjointness definition and its algorithm remains the same with and without distributive subtyping. The following code snippet elaborates on the expressiveness of distributive subtyping:

`void do (<Integer & String > | Boolean val) { /* do something */ }`

The function do in above code snippet takes input value of type (Int ∧ String) ∨ Bool. However, we cannot pass a value of type (Int ∨ Bool) ∧ (String ∨ Bool) to the function do : we get a type error if we try to do that in a system with standard subtyping (without distributivity), as standard subtyping fails to identify that the value has a subtype of the expected argument type. Distributive rules enable this subtyping relation. With distributivity of unions over intersections (and vice-versa), the type (Int ∨ Bool) ∧ (String ∨ Bool) is a subtype of (Int ∧ String) ∨ Bool (in particular, by rule ds-distor in Figure 6). As such with distributive subtyping, the following Ceylon program type-checks:

variable<Integer|Boolean >&<String|Boolean >x=true;do(x);

Nominal Types and Other Extensions to λu . We also study several extensions to λu , including nominal types. The extension with nominal types is interesting, since nominal types are highly relevant in practice. We show a sound, complete and decidable algorithmic formulation of subtyping with nominal types by extending an approach by Huang and Oliveira [36]. We show that disjointness can also be employed in the presence of nominal types. This extension rejects ambiguous programs with overlapping nominal types in different branches of switch construct at compile time. It illustrates that disjointness is practically applicable to structural types as well as the nominal types. For example, the following program will statically be rejected in λu with nominal types:

Boolisstudent(x:Person|Student)=switch(x)
(y:Person)->False
(y:Student)->True

Whereas, the following program will be accepted if we know that Person and Vehicle are disjoint:

Boolisvehicle(x:Person|Vehicle)=switch(x)
(y:Person)->False
(y:Vehicle)->True

3 The Union Calculus λu

This section introduces the simplified union calculus λu . The distinctive feature of the λu calculus is a type-based switch expression with disjoint cases, which can be used to eliminate values with union types. In this first formulation of λu we only include the essential features of a calculus with disjoint switches: union types and disjoint switches. Section 4 then presents a richer formulation of λu with several extensions of practical relevance. We adapt the notion of disjointness from previous work on disjoint intersection types [45] to λu , and show that λu is type sound and deterministic.

--- end of page.page_number=10 ---

B. Rehman, X. Huang, N. Xie, and B. C. d. S. Oliveira

25:11

----- Start of picture text -----
Type A, B , C ::= ⊤| ⊥| Int | A → B | A ∨ B | Null
Expr e ::= x | i | λx.e | e 1 e 2 | switch e { ( x : A ) → e 1 , ( y : B ) → e 2 } | null
Value v ::= i | λx.e | null
Context Γ ::= · | Γ , x : A
A < : B (Subtyping)
A < : ⊤ [s-top] Null < : Null [s-null] Int < : Int [s-int]
B 1 < : A 1 A 2 < : B 2 A < : C B < : C
s-arrow s-ora
A 1 → A 2 < : B 1 → B 2 ⊥ < : A [s-bot] A ∨ B < : C
A < : B A < : C
A < : B ∨ C [s-orb] A < : B ∨ C [s-orc]
----- End of picture text -----

Figure 2 Syntax and subtyping for λu .

3.1 Syntax

Figure 2 shows the syntax for λu . Metavariables A , B and C range over types. Types include top ( ⊤ ), bottom ( ⊥ ), integer types (Int), function types ( A → B ), union types ( A ∨ B ) and null types (Null). Metavariable e ranges over expressions. Expressions include variables ( x ), integers ( i ), lambda abstractions ( λx.e ), applications ( e 1 e 2), a novel switch expression (switch e { ( x : A ) → e 1 , ( y : B ) → e 2 } ) and the null expression. The switch expression evaluates a specific branch by matching the types in the cases. Note that, although the switch expression in λu only has two branches, a multi-branch switch can be easily encoded by employing nested switch expressions. We model the two-branch switch for keeping the formalization simple, but no expressive power is lost compared to a multi-branch switch. Metavariable v ranges over values. Values include i , λx.e and null expressions. Finally, a context (Γ) maps variables to their associated types.

3.2 Subtyping

The subtyping rules for λu are shown at the bottom of Figure 2. The rules are standard. Rule s-top states that all types are subtypes of the ⊤ type. Rule s-bot states that ⊥ type is subtype of all types. Rule s-null states that the Null type is a subtype of itself. Rules s-int and s-arrow are standard rules for integers and functions respectively. Functions are contravariant in input types and covariant in output types. Rules s-ora, s-orb, and s-orc deal with the subtyping for union types. Rule s-ora says that the union type of A and B is a subtype of another type C if both A and B are subtypes of C . Rules s-orb and s-orc state if a type is subtype of one of the components of a union type, then it is subtype of the union type. The subtyping relation for λu is reflexive and transitive.

3.3 Disjointness

The motivation for a definition of disjointness based on bottom-like types is basically that in disjoint switches, the selection of branches can be viewed as a type-safe downcast. For instance, recall the example in Section 2.5:

switchx{

ECOOP 2022

--- end of page.page_number=11 ---

25:12 Union Types with Disjoint Switches

(y:String|Int)->0
(y:Int|Bool)->1
}

Here x may have type Int | String | Bool and the two branches in the disjoint switch cover two subtypes String | Int and Int | Bool . When considered together those subtypes cover all possibilities for the value x (i.e. x can be either an integer, a string or a boolean, and the two cases cover all those possibilities). The exhaustiveness of the downcasts is what ensures that the downcasts are type-safe (that is they cannot fail at runtime). However, we also need to ensure that the two cases do not overlap to prevent ambiguity. In essence, in this simple setting of λu , checking that two types do not overlap amounts to check that there are no basic types (like Int or Bool ) in common. In other words the only common subtypes should be bottom-like types.

Bottom-Like Types. Bottom-like types are types that are equivalent (i.e. both supertypes and subtypes) to ⊥ . In λu , there are infinitely many such types, and they all are uninhabited by values. According to the inductive definition shown at the top of Figure 3, they include the bottom type itself (via rule bl-bot) and unions of two bottom-like types (via rule bl-or), e.g. ⊥∨⊥ . The correctness of our definition for bottom-like types is established by the following property:

Lemma 3 (Bottom-Like Soundness and Completeness) . ⌋A⌊ if and only if ∀B, A < : B.

Declarative Disjointness. The declarative definition for disjointness is as follows:

Definition 4 ( ⊥ -Disjointness) . A ∗ B ::= ∀ C, if C < : A and C < : B then ⌋C ⌊

That is, two types are disjoint if all their common subtypes are bottom-like . We give a few examples next, employing a bold font to highlight the types being compared for disjointness:

1. A = Int , B = Int → Bool : Int and Int → Bool are disjoint types. All common subtypes of Int and Int → Bool are bottom-like types, including ⊥ and unions of ⊥ types.

2. A = Int ∨ Bool , B = ⊥ : Int ∨ Bool and ⊥ are disjoint types. All common subtypes are bottom-like . In general, the type ⊥ (or any bottom-like type) is disjoint to another type.

3. A = Int , B = ⊤ : Int and ⊤ are not disjoint types because they share a common subtype Int which is not bottom-like . In general no type is disjoint to ⊤ , except for bottom-like types. Also, one type is not disjoint with itself, unless it is bottom-like .

4. A = Int → Bool , B = String → Char : The types Int → Bool and String → Char are not disjoint, since we can find non-bottom-like types that are subtypes of both types. For instance ⊤→⊥ is a subtype of both types. More generally, any two function types can never be disjoint: it is always possible to find a common subtype, which is not bottom-like .

Disjointness for Intersection Types. In essence, disjointness for λu is dual to the disjointness notion in λi [45], a calculus with disjoint intersection types. In λu , two types are disjoint if they do not share any common subtype which is not bottom-like . While in λi , two types are disjoint if they do not share any common supertype which is not top-like (i.e. equivalent to ⊤ ). While a disjoint switch provides deterministic behavior for downcasting, disjointness in intersection types prevents ambiguity in upcasting. In a type-safe setting, if two values v 1 and v 2 (of type A 1 and A 2) can both be upcasted to type B , then B must be a common supertype of A 1 and A 2. The disjointness restriction of A 1 and A 2 means they cannot have any non-top-like common supertype, so when the two values together upcasted to a type like

--- end of page.page_number=12 ---

B. Rehman, X. Huang, N. Xie, and B. C. d. S. Oliveira

25:13

Figure 3 Bottom-like types, algorithmic disjointness and typing for λu .

Int, only one of them can contribute to the result. Prior work on disjoint intersection types is also helpful to find an algorithmic formulation of disjointness. Declarative disjointness does not directly lead to an algorithm. However, we can find an algorithmic formulation that employs dual rules to those for disjoint intersection types.

Algorithmic Disjointness. We present an algorithmic version of disjointness in the middle of Figure 3. Rules ad-btmr and ad-btml state that the ⊥ type is disjoint to all types. Rules ad-intl and ad-intr state that Int and A → B are disjoint types. Algorithmic disjointness can further be scaled to more primitive disjoint types such as Bool and String by adding more rules similar to rules ad-intl and ad-intr for additional primitive types. Rules ad-null-intl and ad-null-intr state that Null and Int are disjoint types. Similarly, rules ad-null-funl and ad-null-funr state that Null and A → B are disjoint types. Rules ad-orl and ad-orr are two symmetric rules for union types. Any type C is disjoint to an union type A ∨ B if C is disjoint to both A and B . We show that algorithmic disjointness is sound and complete with respect to its declarative specification (Definition 4).

Theorem 5 (Soundness and Completeness of Algorithmic Disjointness) . A ∗a B if and only

ECOOP 2022

--- end of page.page_number=13 ---

25:14 Union Types with Disjoint Switches

if A ∗ B.

A natural property of λu is that if type A and type B are two disjoint types, then subtypes of A are disjoint to subtypes of B . This property dualises the covariance of disjointness property in calculi with disjoint intersection types [4].

▶ Lemma 6 (Disjointness contravariance) . If A ∗ B and C < : A and D < : B then C ∗ D.

3.4 Typing

The typing rules are shown at the bottom of Figure 3. They are mostly standard. An integer has type Int, null has type Null and variable x gets type from the context. Rule typ-app is the standard rule for function application. Similarly, rule typ-sub and rule typ-abs are standard subsumption and abstraction rules respectively. The most interesting and novel rule is for switch expressions (rule typ-switch). It has four conditions. First, Γ ⊢ e : A ∨ B ensures exhaustiveness of the cases in the switch: e must check against the types in the branches of the switch. The next two conditions ensure that branches of case expressions are well-typed and have type C , where the input variable is bound to type A and to type B respectively in the two branches. Finally, A ∗ B guarantees the disjointness of A and B . This forbids overlapping types for the branches of case expressions to avoid non-deterministic results. Since all the branches have type C , the whole switch expression has type C . Note that the two branches can have different return types. For example, if e 1 and e 2 have type Int and String respectively, the whole expression can have type Int ∨ String.

3.5 Operational Semantics

Now we discuss the small-step operational semantics of λu . An important aspect of this semantics is that union elimination is type-directed : types are used to pick the branch of the switch expression.

Figure 4 shows the operational semantics of λu . Rules step-appl, step-appr, and step-beta are the standard call-by-value reduction rules for applications. Of particular interest are rules step-switch, step-switchl, and step-switchr, which reduce the switch expressions. First, rule step-switch reduces the case expression e , until it becomes a value v , at which point we must choose between the two branches of switch . We do so by inspecting the type of v : if the approximate type of v is a subtype of type of the left branch, then rule step-switchl evaluates the left branch of the switch expression, or otherwise if it is a subtype of the type of the right branch, rule step-switchr evaluates the right branch.

Note that the approximate type definition gives only a subtype of the actual type for a lambda value. This works, because the approximate type is only employed to allow the selection of a case with a function type, and in λu two function types can never be disjoint. Therefore, if there is a branch with a function type, then that must be the branch that applies to a lambda value. Note also that the program has been type-checked before hand, so we know that the static type of the value is compatible with the types on the branches. The subtyping condition in rules step-switchl and step-switchr is important, as it provides flexibility for the value to have various subtypes of A and B , instead of strictly having those types. Recall that the typing rule for switch (rule typ-switch) requires that types of left and right branches of a switch expression to be disjoint. This ensures that rules step-switchl and step-switchr cannot overlap, which, as we will see, is important for the operational semantics to be deterministic .

--- end of page.page_number=14 ---

B. Rehman, X. Huang, N. Xie, and B. C. d. S. Oliveira

25:15

e −→ e[′] (Operational Semantics) e 1 −→ e 1 [′] e −→ e[′] step-appl[step-appr] e 1 e 2 −→ e 1 [′][e] 2 v e −→ v e[′] ( λx.e ) v −→ e [ x � v ][step-beta] e −→ e[′]

switch e { ( x : A ) → e 1 , ( y : B ) → e 2 } −→ switch e[′] { ( x : A ) → e 1 , ( y : B ) → e 2 }[step-switch]

⌊v⌋<:A switch v {(x:A)→e_1,(y :B)→e_2} −→e_1[x �_v] step-switchl ⌊v⌋<:B switch v {(x:A)→e_1,(y :B)→e_2} −→e_2[y �_v] step-switchr	Approximate Type ⌊v⌋
	⌊i⌋ = Int ⌊λx.e⌋ = ⊤→⊥ ⌊null⌋ = Null

Figure 4 Operational semantics and approximate type definitions for λu .

3.6 Type Soundness and Determinism

In this section, we prove that λu is type sound and deterministic. Type soundness is established by the type preservation and progress theorems. Type preservation (Theorem 7) states that types are preserved during reduction. Progress (Theorem 8) states that well typed programs never get stuck: a well typed expression e is either a value or it can reduce to some other expression e[′] .

Theorem 7 (Type Preservation) . If Γ ⊢ e : A and e −→ e[′] then Γ ⊢ e[′] : A.
Theorem 8 (Progress) . If · ⊢ e : A then either e is a value; or e −→ e[′] for some e[′] .

Determinism of λu (Theorem 10) ensures that a well-typed program reduces to a unique result. In particular, it guarantees that switch expressions are not order-sensitive: at any time, only one of the rules step-switchl and step-switchr can apply. The determinism of the switch expression relies on an essential property that a value cannot check against two disjoint types (Lemma 9).

▶ Lemma 9 (Exclusivity of Disjoint Types) . If A ∗ B then ∄ v such that both Γ ⊢ v : A and Γ ⊢ v : B holds.

Theorem 10 (Determinism) . If Γ ⊢ e : A and e −→ e 1 and e −→ e 2 then e 1 = e 2 .

3.7 An Alternative Specification for Disjointness

The current definition of disjointness (Definition 4) works well for the calculus presented in this section. But it is not the only possible formulation of disjointness. An equivalent formulation of disjointness is:

Definition 11 ( ∧ -Disjointness) . A ∗ B ::= ∄ C[◦] , C[◦] < : A and C[◦] < : B

According to the new definition, two types are disjoint if they do not have common subtypes that are ordinary . Ordinary types (denoted by C[◦] ) are essentially those types that are primitive, such as integers and functions (see Figure 5 for a formal definition).

For the calculus presented in this section, we prove that the new definition is equivalent to the previous definition of disjointness.

ECOOP 2022

--- end of page.page_number=15 ---

25:16 Union Types with Disjoint Switches

▶ Lemma 12 (Disjointness Equivalence) . Definition 11 (∧-Disjointness) is sound and complete to Definition 4 (⊥-Disjointness) in λu defined in this section.

Why do we introduce the new definition of disjointness? It turns out that the previous definition is not sufficient when the calculus is extended with intersection types. As we will see, the new definition will play an important role in such variant of the calculus.

4 λu with Intersections, Distributive Subtyping and Nominal Types

In this section we extend λu with intersection types, nominal types and an enriched distributive subtyping relation. The study of an extension of λu with intersection types is motivated by the fact that most languages with union types also support intersection types (for example Ceylon, Scala or TypeScript). Furthermore, languages like Ceylon or Scala also support some form of distributive subtyping, as well as nominal types. Therefore it is important to understand whether those extensions can be easily added or whether there are some challenges. As it turns out, adding intersection types does pose a challenge, since the notion of disjointness inspired from disjoint intersection types [45] no longer works. Moreover subtyping relations with distributive subtyping add significant complexity, and we need an extension that supports nominal types as well. We show that desirable properties, including type soundness and determinism, are preserved in the extended version of λu . Moreover we prove that both disjointness and subtyping have sound, complete and decidable algorithms.

4.1 Syntax, Well-formedness and Ordinary Types

The syntax for this section mostly follows from Section 3, with the additional syntax given in Figure 5. The most significant difference and novelty in this section is the addition of intersection types A ∧ B and an infinite set of nominal types. We use metavariable P to stand for nominal types. Expressions are extended with a new expression (new P) to create instances of nominal types. The expression new P is also a value. Context Γ stays the same as in Section 3. We add a new context ∆, to track nominal types and their supertypes. For example, adding P1 ≤ P2 to ∆declares a new nominal type P1 that is a subtype of P2. For a well-formed context, the supertype P2 has to be declared before P1. We also allow to declare a new nominal type P1 with ⊤ as its supertype by adding P1 ≤⊤ to ∆. Metavariable A[◦] , B[◦] and C[◦] ranges over ordinary types [28]. There are four kinds of ordinary types: integers, null, function types and nominal types. Well-formed types and well-formedness of ordinary contexts ∆are shown in Figure 5.

Remark on Nominal Types. Note that our formulation of nominal types is simplified in two ways compared to languages like Java. Firstly, we do not consider arguments when building new expressions (i.e. we do not allow expressions like new Person("John") ). Secondly, we also do not introduce class declarations, which would allow nominal types to be associated with method implementations. We follow a design choice for nominal types similar to Featherweight Java [38]. Featherweight Java uses a fixed size context for nominal types. Diamond inheritance is also not supported in Featherweight Java, and we follow that design choice as well. However, we believe that supporting diamond inheritance in our calculus is relatively easy. These simplifications keep the calculus simple, while capturing the essential features that matter for disjointness and the formalization of disjoint switches. Allowing for a more complete formulation of nominal types can be done in mostly standard ways.

--- end of page.page_number=16 ---

B. Rehman, X. Huang, N. Xie, and B. C. d. S. Oliveira

25:17

----- Start of picture text -----
A, B , C ::= … | A ∧ B | P
A [◦] , B [◦] , C [◦] ::= Int | Null | A → B | P
e ::= … | new P
v ::= … | new P
Γ ::= · | Γ , x : A
∆ ::= · | ∆ , P1 ≤ P2 | ∆ , P ≤⊤
∆ ⊢ A (Well-formed Types)
∆ ⊢⊤ [wft-top] ∆ ⊢⊥ [wft-bot] ∆ ⊢ Int [wft-int] ∆ ⊢ Null [wft-null]
∆ ⊢ A ∆ ⊢ B P ∈ dom ∆ ∆ ⊢ A ∆ ⊢ B
wft-arrow wft-prim wft-or
∆ ⊢ A → B ∆ ⊢ P ∆ ⊢ A ∨ B
∆ ⊢ A ∆ ⊢ B
wft-and
∆ ⊢ A ∧ B
----- End of picture text -----

ok ∆ (Well-formed Nominal Contexts)

Figure 5 Additional syntax and well-formedness.

4.2 Distributive Subtyping

Another interesting feature of this section is the addition of distributive subtyping to λu . Ceylon employs an enriched distributive subtyping relation [42] that is based on the B+ logic [50, 53]. To obtain an equivalent algorithmic formulation of subtyping, we employ the idea of splittable types [36], but extend that algorithm with the Null type and nominal types.

Distributive subtyping relation. Figure 6 shows a declarative version of distributive subtyping for λu with intersection and nominal types. Subtyping includes axioms for reflexivity (rule ds-refl) and transitivity (rule ds-trans). Rules ds-top, ds-bot, ds-arrow, and ds-ora have been discussed in Section 3. Rule ds-prim states that a nominal type is a subtype of type A if it is declared as subtype of A in ∆. Note that A can either be a nominal type or ⊤ under a well-formed context ∆. With the help of rule ds-trans, the subtyping of primitive types can also be constructed indirectly, e.g. P1 ≤⊤, P2 ≤ P1 , P3 ≤ P2 ⊢ P3 ≤ P1. Compared with the algorithmic formulation, having an explicit transitivity rule considerably simplifies the rules for nominal types. Rules ds-orb and ds-orc state that a subpart of a union type is a subtype of whole union type. Rule ds-anda states that a type A is a subtype of the intersection of two types B and C only if A is a subtype of both B and C . Rules ds-andb and ds-andc state that intersection type A 1 ∧ A 2 is a subtype of both A 1 and A 2 separately. Rule ds-distarr distributes function types over intersection types. It

ECOOP 2022

--- end of page.page_number=17 ---

25:18 Union Types with Disjoint Switches

∆ ⊢ A ≤ B

(Declarative Subtyping with Distributivity)

Figure 6 Distributive subtyping for λu with intersection types and nominal types.

states that ( A → B 1) ∧ ( A → B 2) is a subtype of A → ( B 1 ∧ B 2). Rule ds-distarru states that ( A 1 → B ) ∧ ( A 2 → B ) is a subtype of ( A 1 ∨ A 2) → B type. Rule ds-distor distributes intersections over unions.

Algorithmic Subtyping. Distributive rules make it hard to eliminate the transitivity rule. Our algorithmic formulation of distributive subtyping is based on a formulation using splittable types by Huang and Oliveira [36]. The basic idea is to view the distributive rules as some expansion of intersection and union types. For example, rule ds-distarr makes A → B 1 ∧ B 2 and ( A → B 1) ∧ ( A → B 2) mutual subtypes. Thus A → B 1 ∧ B 2 is treated like ( A → B 1) ∧ ( A → B 2) in the three intersection-related rules as-anda, as-andb, and as-andc. Here we use A ≊ B ∧ C to denote that type A can be split into B and C (and therefore, A is equivalent to B ∧ C ) according to the procedure designed by Huang and Oliveira. Union and union-like types (e.g. ( A 1 ∨ A 2) ∧ B ≊ A 1 ∧ B ∨ A 2 ∧ B ) are handled in similar way in rules as-ora, as-orb, and as-orc. For further details of algorithmic subtyping we refer to their paper.

Subtyping Nominal Types. However, Huang and Oliveira’s algorithm does not account for Null and nominal types. We add the nominal context ∆in the subtyping judgment and extend the subtyping algorithm with Null and nominal types. Nominal types are not splittable, and their subtyping relation is defined by the transitive closure of the context.

--- end of page.page_number=18 ---

B. Rehman, X. Huang, N. Xie, and B. C. d. S. Oliveira

25:19

∆ ⊢ A < : B (Algorithmic Subtyping with Distributivity)

∆_⊢B_1 <:_A_1 ∆_⊢A_2 <:_B_2 ∆_⊢A_1 →A_2 <:B_1 →B_2 as-arrow ok ok ∆ ∆_⊢A ∆_⊢A <:A as-refl ok (∆,P2 <:A) ∆,_P2 A_≊_A_1_∨A_2 ∆_⊢A_1 <:B ∆_⊢A_2 <:B ∆_⊢A <:B A_≊_A_1_∨A_2 ∆_⊢B <:A_1 ∆_⊢B <:A as-orb A_≊_A_1_∧A_2 ∆_⊢B <:A_1 ∆_⊢B <:A_2 ∆_⊢B <:A A_≊_A_1_∧A_2 ∆_⊢A_1 <:B ∆_⊢A <:B as-andb	∆_⊢B_1 <:_A_1 ∆_⊢A_2 <:_B_2 ∆_⊢A_1 →A_2 <:B_1 →B_2 as-arrow ok ok ∆ ∆_⊢A ∆_⊢A <:A as-refl ok (∆,P2 <:A) ∆,_P2 A_≊_A_1_∨A_2 ∆_⊢A_1 <:B ∆_⊢A_2 <:B ∆_⊢A <:B A_≊_A_1_∨A_2 ∆_⊢B <:A_1 ∆_⊢B <:A as-orb A_≊_A_1_∧A_2 ∆_⊢B <:A_1 ∆_⊢B <:A_2 ∆_⊢B <:A A_≊_A_1_∧A_2 ∆_⊢A_1 <:B ∆_⊢A <:B as-andb	s-arrow ok ok (∆_,_P2 <:A)	ok	(∆_,P1 <:P2) ∆_⊢_P2 <:P3 ∆,P1 <:P2 ⊢_P1 <:P3 as-primEq P1 =P2 ∆_⊢_P1 <:P3 <:A ⊢_P1 <:P3 as-primNeq as-ora ok ∆ ∆_⊢A ∆_⊢A <:⊤ as-top A_≊_A_1_∨A_2 ∆_⊢B <:A_2 ∆_⊢B <:A as-orc as-anda ok ∆ ∆_⊢A ∆_⊢⊥<_:A as-bot A_≊_A_1_∧A_2 ∆_⊢A_2 <:B ∆_⊢A <:B as-andc


				∆_⊢B_ as-anda _A_≊_A_1_∧A_2
				∆_⊢A _

Figure 7 Algorithmic subtyping for λu with distributivity, intersection and nominal types.

They are supertypes of ⊥ and subtypes of ⊤ , but not related with other primitive types like Int and Null. So for nominal types, we mainly focus on checking the subtyping relationship among them in our algorithm. Given a well-formed context, any nominal type P appears only once in a subtype position as an explicit declaration for P, and its direct supertype, if is not ⊤ , must be declared before P. Thus if ∆ ⊢ P1 < : P2 holds, either P2 is introduced before P1 in ∆, or they are the same type, in which case the goal can be solved by rule as-refl. For the other cases, we recursively search for P1 in all subtype positions of the context ∆ (rule as-primNeq). When we find P1, we check its direct supertype. If it is ⊤ , no other nominal types can be supertypes of P1. So in rule as-primEq, we only consider when the direct supertype is another primitive P2. For P3 to be a supertype of P1, it must either equal to P2, or it is related to P2 by the smaller context. In either case, we can prove that P3 is a supertype of the direct supertype of P1.

Inversion Lemmas for Type Soundness. Having an algorithmic formulation of subtyping is useful to prove several inversion lemmas that are used in the type soundness proof. For instance, it allows us to prove the following lemma:

▶ Lemma 13 (Inversion on Function Types) . If ∆ ⊢ A 1 → A 2 < : B 1 → B 2 then ∆ ⊢ B 1 < : A 1 and ∆ ⊢ A 2 < : B 2 .

While the additional distributive rules make function types more flexible, they retain the contravariance of argument types and covariance of return types. In addition, we show the formulation is sound and complete to the declarative subtyping and it is decidable whether a subtyping judgment holds under a given context.

▶ Lemma 14 (Equivalence of subtyping) . ∆ ⊢ A ≤ B if and only if ∆ ⊢ A < : B.

Lemma 15 (Decidability of subtyping) . ∆ ⊢ A ≤ B is decidable.

ECOOP 2022

--- end of page.page_number=19 ---

25:20 Union Types with Disjoint Switches

4.3 Disjointness Specification

Disjointness is another interesting aspect of the extension of λu . Unfortunately, Definition 4 does not work with intersection types. In what follows, we first explain why Definition 4 does not work, and then we show how to define disjointness in the presence of intersection types.

Bottom-like types, intersection types and disjointness. Recall that disjointness in Section 3 (Definition 4) depends on bottom-like types, where two types are disjoint only if all their common subtypes are bottom-like. However, this definition no longer works with the addition of intersection types. According to the subtyping rule for intersection types, any two types have their intersection as one common subtype. For non-bottom-like types, their intersection is also not bottom-like. For example, type Int and type Bool now have a non-bottom like subtype Int ∧ Bool. In other words, the disjointness definition fails, since we can always find a common non-bottom-like subtype for any two (non-bottom-like) types.

A possible solution: the Ceylon approach. A possible solution for this issue is to add a subtyping rule which makes intersections of disjoint types subtypes of ⊥ .

This rule is adopted by the Ceylon language [42]. With the rule s-disj now the type Int ∧ Bool would be a bottom-like type, and the definition of disjointness used in Section 3 could still work. The logic behind this rule is that if we interpret types as sets of values, and intersection as set intersection, then intersecting disjoint sets is the empty set. In other words, we would get a type that has no inhabitants. For instance the set of all integers is disjoint to the set of all booleans, and the intersection of those sets is empty. However we do not adopt the Ceylon solution here for two reasons:

1. Rule s-disj complicates the system because it adds a mutual dependency between subtyping and disjointness: disjointness is defined in terms of subtyping, and subtyping now uses disjointness as well in rule s-disj. This creates important challenges for the metatheory. In particular, the completeness proof for disjointness becomes quite challenging.

2. Additionally, the assumption that intersections of disjoint types are equivalent to ⊥ is too strong for some calculi with intersection types. If a merge operator [49] is allowed in the calculus, intersection types can be inhabited with values (for example, in λi [45], the type Int ∧ Bool is inhabited by 1 , , True). Considering those types bottom-like would lead to a problematic definition of subtyping, since some bottom-like types (those based on disjoint types) would be inhabited.

For those reasons we adopt a different approach in λu . Nevertheless, in the extended version of the paper we show that it is possible to create an extension of λu that includes (and in fact generalizes) the Ceylon-style rule s-disj.

Disjointness based on ordinary types to the rescue. To solve the problem with the disjointness specification, we resort to the alternative definition of disjointness presented in Section 3.7. Note that now the disjointness definition also contains ∆as an argument to account for nominal types.

Definition 16 ( ∧ -Disjointness) . ∆ ⊢ A ∗ B ::= ∄ C[◦] , ∆ ⊢ C[◦] < : A and ∆ ⊢ C[◦] < : B.

--- end of page.page_number=20 ---

B. Rehman, X. Huang, N. Xie, and B. C. d. S. Oliveira

25:21

Lowest Ordinary Subtypes (LOS) |A| ∆

|LowestOrdinarySubtypes(LOS)|A|∆||||| |---|---|---|---|---| |
|⊤|∆
=
{Int, ⊤→⊥,Null} ∪_dom ∆
|⊥|∆
=
{}
|Int|∆
=
{Int}
|A →B|∆
=
{⊤→⊥}
|A ∨B|∆
=
|A|∆_∪|B|∆
|A ∧B|∆
=
|A|∆_∩|B|∆
|Null|∆
=
{Null}
|P|∆
=
{P} ∪_∆(P)|Nominal Subtypes||∆(A)|| ||⊤|∆
=
{Int, ⊤→⊥,Null} ∪_dom ∆
|⊥|∆
=
{}
|Int|∆
=
{Int}
|A →B|∆
=
{⊤→⊥}
|A ∨B|∆
=
|A|∆_∪|B|∆
|A ∧B|∆
=
|A|∆_∩|B|∆
|Null|∆
=
{Null}
|P|∆
=
{P} ∪_∆(P)||||| ||·(A)
=
{}
(∆_′,P_≤B)(A)
=



{P} ∪_∆′(A) if P_≤A ∈_∆
∆′_(A) otherwise|||| |||ok ∆
∆_⊢_P
∆; Γ_⊢_new P:P ptyp-prim|||

Figure 8 Lowest ordinary subtypes function and additional typing rule for λu with intersection types and nominal types.

Interestingly, while in Section 3 such definition was equivalent to the definition using bottom-like types, this is no longer the case for λu with intersection types. To see why, consider again the types Int and Bool. Int and Bool do not share any common ordinary subtype. Therefore, Int and Bool are disjoint types according to Definition 16. We further illustrate Definition 16 with a few concrete examples:

1. A = Int ∨ Bool , B = ⊥ : Since there is no ordinary type that is a subtype of both Int ∨ Bool and ⊥ , Int ∨ Bool and ⊥ are disjoint types. In general, the ⊥ type is disjoint to all types because ⊥ does not have any ordinary subtype.

2. A = Int ∧ Bool , B = Int ∨ Bool : There is no ordinary type that is a subtype of both Int ∧ Bool and Int ∨ Bool. Therefore, Int ∧ Bool and Int ∨ Bool are disjoint types. In general, an intersection of two disjoint types (Int ∧ Bool in this case) is always disjoint to all types.

4.4 Algorithmic Disjointness

The change in the disjointness specification has a significant impact on an algorithmic formulation. In particular, it is not obvious at all how to adapt the algorithmic formulation in Figure 3. To obtain a sound, complete and decidable formulation of disjointness, we employ the novel notion of lowest ordinary subtypes .

Lowest ordinary subtypes ( |A| ∆ ). Figure 8 shows the definition of lowest ordinary subtypes (LOS) ( |A| ∆). LOS is defined as a function which returns a set of ordinary subtypes of the given input type. No ordinary type is a subtype of ⊥ . The only ordinary subtype of Int is Int itself. The function case is interesting. Since no two functions are disjoint in the calculus proposed in this paper, the case for function types A → B returns ⊤→⊥ . This type is the least ordinary function type, which is a subtype of all function types. Lowest ordinary subtypes of ⊤ are Int, ⊤→⊥ , Null and all the nominal types defined in ∆. In the case of union types A ∨ B , the algorithm collects the LOS of A and B and returns the union of the two sets. For intersection types A ∧ B the algorithm collects the LOS of A and B and returns the intersection of the two sets. The lowest ordinary subtype of Null is Null itself. Finally, the LOS of P is the union of P itself with all subtypes of P defined in ∆. Note that LOS is defined as a structurally recursive function and therefore its decidability is immediate.

ECOOP 2022

--- end of page.page_number=21 ---

25:22 Union Types with Disjoint Switches

Algorithmic disjointness. With LOS, an algorithmic formulation of disjointness is straightforward:

▶ Definition 17. ∆ ⊢ A ∗a B ::= |A| ∆ ∩|B| ∆ = {}.

The algorithmic formulation of disjointness in Definition 17 states that two types A and B are disjoint under the context ∆if they do not have any common lowest ordinary subtypes. In other words, the set intersection of |A| ∆ and |B| ∆ is the empty set. Note that this algorithm is naturally very close to Definition 16.

Soundness and completeness of algorithmic disjointness. Next, we show that disjointness algorithm is sound and complete with respect to disjointness specifications (Theorem 18). Soundness and completeness of LOS are essential to prove Theorem 18. Both of these properties are shown in Lemma 19 and Lemma 20 respectively.

▶ Theorem 18 (Disjointness Equivalence) . ∆ ⊢ A ∗a B if and only if ∆ ⊢ A ∗ B.

▶ Lemma 19 (Soundness of |A| ∆) . ∀ well-formed ∆ and A and B that are well-formed under ∆ , if B ∈|A| ∆ , then ∆ ⊢ B < : A.

▶ Lemma 20 (Completeness of |A| ∆) . ∀ A B[◦] , if ∆ ⊢ B[◦] < : A, then B[◦] ∈|A| ∆ , or B[◦] is an arrow type and ⊤→⊥∈|A| ∆ .

4.5 Typing, Semantics and Metatheory

Both typing and the operational semantics are parameterized by the nominal context ∆. The typing rules are extended with a rule for nominal types rule ptyp-prim as shown at the right side in Figure 8. The typing rule ptyp-prim states that under a well-formed context ∆ and well-formed type P, new P has type P. No additional reduction rule is required because new P is a value. However, the rules step-switchl and step-switchr require ∆because they do a subtyping check. We illustrate the updated rule step-switchl next:

∆ ⊢⌊v⌋ < : A

nstep-switchl

∆ ⊢ switch v { ( x : A ) → e 1 , ( y : B ) → e 2 } −→ e 1[ x � v ]

Rule step-switchr is updated similarly. All the other rules are essentially the same as in Section 3, modulo the extra nominal context ∆.

Example. Assuming a context ∆= Person ≤⊤, Student ≤ Person , Robot ≤⊤ , y : Person | Robot and x : Student, we could write the following two switches:

`switch(y)` `// Accepted!`		`switch (x)` `// Rejected!`
`(z : Person)` `-> False`		`(z : Person)` `-> False`
`(z : Robot)` `-> True`		`(z : Student) -> True`

In the above code, the first switch, using y is accepted, while the second one (using x ) is rejected because the types overlap in that case.

Key Properties. We proved that λu with intersection types, nominal types and subtyping distributivity preserves type soundness and determinism.

Theorem 21 (Type Preservation) . If ∆; Γ ⊢ e : A and e −→ e[′] then ∆; Γ ⊢ e[′] : A.
Theorem 22 (Progress) . If ∆; · ⊢ e : A then either e is a value; or e can take a step to e[′] .
Theorem 23 (Determinism) . If ∆; Γ ⊢ e : A and e −→ e 1 and e −→ e 2 then e 1 = e 2 .

--- end of page.page_number=22 ---

B. Rehman, X. Huang, N. Xie, and B. C. d. S. Oliveira

25:23

Union types. Union types were first introduced by MacQueen et al. [41]. They proposed a typing rule that eliminates unions implicitly. The rule breaks type preservation under the conventional reduction strategy of the lambda calculus. Barbanera et al. [9] solved the problem by reducing all copies of the same redex in parallel. Dunfield and Pfenning [30, 29] took another approach to support mutable references. They restricted the elimination typing rule to only allow a single occurrence of a subterm with a union type when typing an expression. Pierce [47] proposed a novel single-branch case construct for unions. As pointed by Dunfield and Pfenning, compared to the single occurrence approach, the only effect of Pierce’s approach is to make elimination explicit.

Union types and elimination constructs based on types are widely used in the context of XML processing languages [35, 10], and have inspired proposals for object oriented languages [37]. Generally speaking, the elimination constructs in those languages offer a first-match semantics, where cases can overlap and reordering the cases may change the semantics of the program. This is in contrast to our approach. Union types have also been studied in the context of XDuce programming language [35]. XDuce employs regular expression types. Pattern matching can be on expressions and types in XDuce. Expressions are considered as special cases of types. CDuce [10] is an extension of XDuce. Work on the more foundational aspects of CDuce, and in particular on semantic subtyping [31] and set-theoretic types, also employs a form of first-match semantics elimination construct, though in a different form. In particular, work by Castagna et al. [18, 20] proposes a conditional construct that can test whether a value matches a type. If it matches then the first branch is executed and the type for the value is refined. Otherwise, the second branch is executed and the type of the value is refined to be the negation of the type (expressing that the value does not have such type). Union types are also studied in the context of semantic subtyping and object-oriented calculi [6, 5, 27] which focus on designing subtyping algorithms to employ semantic subtyping in OOP. In contrast, we study a deterministic and type-safe switch construct for union elimination.

Muehlboeck and Tate [42] give a general framework for subtyping with intersection and union types. They illustrate the significance of their framework using the Ceylon programming language. The main objective of their work is to define a generic framework for deriving subtyping algorithms for intersection and union types in the presence of various distributive subtyping rules. For instance, their framework could be useful to derive an algorithmic formulation for the subtyping relation presented in Figure 6. They also briefly cover disjointness in their work. As part of their framework, they can also check disjointness given some disjointness axioms. For instance, for λu , such axioms could be similar to rule ad-btmr or rule ad-intl in Figure 3. However, they do not have a formal specification of disjointness. Instead they assume that some sound specification exists and that the axioms respect such specification. If some unsound axioms are given to their framework (say Int ∗a Int) this would lead to a problematic algorithm for checking disjointness. In our work we provide specifications for disjointness together with sound and complete algorithmic formulations. In addition, unlike us, they do not study the semantics of disjoint switch expressions.

Occurrence Typing. Occurrence typing or flow typing [51] specializes or refines the type of variable in a type test. An example of occurrence typing is:

Integeroccurrence(Integer|Stringval){
if(valisInteger){returnval +1;}
else{returntoInt(val)+2;}

ECOOP 2022

--- end of page.page_number=23 ---

25:24 Union Types with Disjoint Switches

In such code, val initially has type Int ∨ String. The conditional checks if the val is of type Int. If the condition succeeds, it is safe to assume that val is of type Int, and the type of val is refined in the branch to be Int. Otherwise, it is safe to assume that val is of type String, in the other branch (and the type is refined accordingly). The motivation to study occurrence typing was to introduce typing in dynamically typed languages. Occurrence typing was further studied by Tobin-Hochstadt and Felleisen [52], which resulted into the development of Typed Racket. Variants of occurrence typing are nowadays employed in mainstream languages such as TypeScript, Ceylon or Flow. Castagna et al. [21] extended occurrence typing to refine the type of generic expressions, not just variables. They also studied the combination with gradual typing. Occurrence typing in a conditional construct, such as the above, provides an alternative means to eliminate union types using a first-match semantics. That is the order of the type tests determines the priority.

Nullable Types. Nullable types are types which may have the null value. Recently, Nieto et al. [43] proposed an approach with explicit nulls in Scala using union types. The Ceylon language has implemented a similar approach for a few years now. However our’s and Ceylon’s approaches are based on disjoint switches to test for nullability, while Nieto et al.’s [43] approach is based on a simplified form of occurrence typing.

Various approaches have been proposed to deal with nullability such as T? in Kotlin [40], Swift [7] and Flow [25]. The Checker Framework [46] is another line of related work to detect null pointer deferences in Java programs. Banerjee et al. [8] proposes an approach to explicitly associate nullable and non-nullable properties with expressions in Java. However, differently from our work, in those approaches nullable types are not encoded with union types. Blanvillain et al. [14] study a notion of match types for type level programming. They also employ a notion of disjointness in match types and can encode nullable types. However, they provide match types at the type level and do not use them for union elimination. Furthermore, they do not study intersection and union types formally. In contrast, we provide a term level switch construct for union elimination.

Disjoint Intersection Types. Disjoint intersection types were first studied by Oliveira et al. [45] in the λi calculus to give a coherent calculus for intersection types with a merge operator. The notion of disjointness used in λu , discussed in Section 3, is inspired by the notion of disjointness of λi . In essence, disjointness in λu is the dual notion: while in λi two types are disjoint if they only have top-like supertypes, in λu two types are disjoint if they only have bottom-like subtypes. Disjoint polymorphism [4] has been studied for calculi with disjoint intersection types.

None of calculi with disjoint intersection types [45, 11, 4, 12] in the literature includes union types. One interesting discovery of our work is that the presence of both intersections and unions in a calculus can affect disjointness. In particular, as we have seen in Section 4, adding intersection types required us to change disjointness. The notion of disjointness that was derived from λi stops working in the presence of intersection types. Interestingly, a similar issue happens when union types are added to a calculus with disjoint intersection types. If disjointness of two types A and B is defined to be that such types can only have top-like types, then adding union types immediately breaks such definition. For example, the types Int and Bool are disjoint but, with union types, Int ∨ Bool is a common supertype that is not top-like. We conjecture that, to add union types to disjoint intersection types, we can use the following definition of disjointness:

--- end of page.page_number=24 ---

B. Rehman, X. Huang, N. Xie, and B. C. d. S. Oliveira

25:25

▶ Definition 24. A ∗ B ::= ∄ C[◦] , A < : C[◦] and B < : C[◦] .

which is, in essence, the dual notion of the definition presented in Section 4. Under this definition Int and Bool would be disjoint since we cannot find a common ordinary supertype (and Int ∨ Bool is a supertype, but it is not ordinary). Furthermore, there should be a dual notion to LOS, capturing the greatest ordinary supertypes. Moreover, if a calculus includes both disjoint switches and a merge operator, then the two notions of disjointness must coexist in the calculus. This will be an interesting path of exploration for future work.

Overloading. Union and intersection types also provide a form of function overloading or ad-hoc polymorphism using the switch and type-based case analysis. A programmer may define the argument type to be a union type. By using type-based case analysis, it is possible to execute different code for each specific type of input. Intersection types have also been studied for function overloading. For example, a function with type Int → Int ∧ Bool → Bool can take input values either of type Int or Bool. In such case, it returns either Int or Bool depending upon the input type. Function overloading [19, 17, 56] has been studied in detail in the literature. Wadler and Blott [56] studied type classes as an alternative way to provide overloading based on parametric polymorphism.

6 Conclusion and Future Work

This work develops the union calculus ( λu ) with union types and a type-based union elimination construct based on disjointness. We presented the operational semantics of the calculus, and showed type-soundness and determinism. Disjointness plays a crucial role for the determinism result, as it ensures that only one branch in the switch elimination construct can apply for any given value. A nice aspect of the work was that we were able to adapt the notion of disjointness used in disjoint intersection types to our variant of λu with union types. We believe that this reinforces fundamental connections between union and intersection types via duality. The addition of intersection types to λu lead to some interesting discoveries. In particular, it showed that the notion of disjointness that we were able to formulate, inspired by the work on disjoint intersection types, breaks. This is not showing that the duality stops working. Instead, it shows that the combination of intersections and unions in the same system affects disjointness. As discussed in Section 5, adding union types to calculi with disjoint intersection types leads to a similar problem, and the solution in λu can inspire solutions for adding union types to disjoint intersection types.

We plan to extend λu for practical programming languages with more advanced features. An interesting line of research for λu is to study the addition of the merge operator, which calculi with disjoint intersection types include. The main challenge is that types such as Int ∧ Bool become inhabited. It could also be interesting to study a variant of λu that uses a best-match approach based on the dynamic type. This would relate to the extensive line of research on multi-methods [23] and multiple dispatching [26]. Finally, a current limitation of our approach is that it relies on a global context for nominal types. This enables some simplifications, since we can search the global nominal environment for subtypes. However this assumption breaks in a setting where new nominal types can be added. Ceylon solves this issue in a modular way using of clauses that enumerate all the possible subtypes that a class can have. It would be interesting to adopt this approach to enable the addition of new nominal types.

ECOOP 2022

--- end of page.page_number=25 ---

25:26 Union Types with Disjoint Switches

References

1 Disjointness in ceylon. URL: http://web.mit.edu/ceylon_v1.3.3/ceylon-1.3.3/doc/en/ spec/html_single .
2 Overloading in ceylon. URL: https://github.com/ceylon/ceylon-spec/issues/73 .
3 Union types in typescript. URL: https://www.typescriptlang.org/docs/handbook/ unions-and-intersections.html .
4 João Alpuim, Bruno C. d. S. Oliveira, and Zhiyuan Shi. Disjoint polymorphism. In European Symposium on Programming (ESOP) , 2017.
5 Davide Ancona and Andrea Corradi. Sound and complete subtyping between coinductive types for object-oriented languages. In European Conference on Object-Oriented Programming , pages 282–307. Springer, 2014.
6 Davide Ancona and Andrea Corradi. Semantic subtyping for imperative object-oriented languages. ACM SIGPLAN Notices , 51(10):568–587, 2016.
7 Inc Apple. Swift language guide, 2021. URL: https://docs.swift.org/swift-book/ LanguageGuide/TheBasics.html .
8 Subarno Banerjee, Lazaro Clapp, and Manu Sridharan. Nullaway: Practical type-based null safety for java. In Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering , pages 740–750, 2019.
9 Franco Barbanera, Mariangiola Dezaniciancaglini, and Ugo Deliguoro. Intersection and union types: syntax and semantics. Information and Computation , 119(2):202–230, 1995.
10 Véronique Benzaken, Giuseppe Castagna, and Alain Frisch. Cduce: an xml-centric generalpurpose language. ACM SIGPLAN Notices , 38(9):51–63, 2003.
11 Xuan Bi, Bruno C. d. S. Oliveira, and Tom Schrijvers. The Essence of Nested Composition. In European Conference on Object-Oriented Programming (ECOOP) , 2018.
12 Xuan Bi, Ningning Xie, Bruno C. d. S. Oliveira, and Tom Schrijvers. Distributive disjoint polymorphism for compositional programming. In Luís Caires, editor, Programming Languages and Systems , pages 381–409, Cham, 2019. Springer International Publishing.
13 Gavin Bierman, Martín Abadi, and Mads Torgersen. Understanding typescript. In European Conference on Object-Oriented Programming , pages 257–281. Springer, 2014.
14 Olivier Blanvillain, Jonathan Immanuel Brachthäuser, Maxime Kjaer, and Martin Odersky. Type-level programming with match types. Proc. ACM Program. Lang. , 6(POPL), jan 2022. doi:10.1145/3498698 .
15 R. M. Burstall, D. B. MacQueen, and D. T. Sannella. Hope: An experimental applicative language. Technical Report CSR-62-80, Computer Science Dept, Univ. of Edinburgh, 1981.
16 Peter Canning, William Cook, Walter Hill, Walter Olthoff, and John C Mitchell. F-bounded polymorphism for object-oriented programming. In Proceedings of the fourth international conference on functional programming languages and computer architecture , pages 273–280, 1989.
17 Luca Cardelli and Peter Wegner. On understanding types, data abstraction, and polymorphism. ACM Computing Surveys (CSUR) , 17(4):471–523, 1985.
18 Giuseppe Castagna and Alain Frisch. A gentle introduction to semantic subtyping. In Proceedings of the 7th ACM SIGPLAN international conference on Principles and practice of declarative programming , pages 198–199, 2005.
19 Giuseppe Castagna, Giorgio Ghelli, and Giuseppe Longo. A calculus for overloaded functions with subtyping. Information and Computation , 117(1):115–135, 1995.
20 Giuseppe Castagna and Victor Lanvin. Gradual typing with union and intersection types. Proceedings of the ACM on Programming Languages , 1(ICFP):1–28, 2017.
21 Giuseppe Castagna, Victor Lanvin, Mickaël Laurent, and Kim Nguyen. Revisiting occurrence typing. arXiv preprint arXiv:1907.05590 , 2019.

--- end of page.page_number=26 ---

B. Rehman, X. Huang, N. Xie, and B. C. d. S. Oliveira

25:27

22 Giuseppe Castagna, Kim Nguyen, Zhiwu Xu, Hyeonseung Im, Sergueï Lenglet, and Luca Padovani. Polymorphic functions with set-theoretic types: Part 1: Syntax, semantics, and evaluation. In Proceedings of the 41st ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages , POPL ’14, page 5–17, New York, NY, USA, 2014. Association for Computing Machinery. doi:10.1145/2535838.2535840 .
23 Craig Chambers. Object-oriented multi-methods in Cecil. In Ole Lehrmann Madsen, editor, ECOOP ’92, European Conference on Object-Oriented Programming, Utrecht, The Netherlands , volume 615, pages 33–56. Springer-Verlag, 1992.
24 Avik Chaudhuri. Flow: a static type checker for javascript. SPLASH-I In Systems, Programming, Languages and Applications: Software for Humanity , 2015.
25 Avik Chaudhuri, Panagiotis Vekris, Sam Goldman, Marshall Roch, and Gabriel Levi. Fast and precise type checking for javascript. Proceedings of the ACM on Programming Languages , 1(OOPSLA):1–30, 2017.
26 Curtis Clifton, Gary T. Leavens, Craig Chambers, and Todd Millstein. Multijava: Modular open classes and symmetric multiple dispatch for java. In Proceedings of the 15th ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications , OOPSLA ’00, page 130–145, 2000.
27 Ornela Dardha, Daniele Gorla, and Daniele Varacca. Semantic subtyping for objects and classes. In Formal Techniques for Distributed Systems , pages 66–82. Springer, 2013.
28 Rowan Davies and Frank Pfenning. Intersection types and computational effects. In Proceedings of the fifth ACM SIGPLAN international conference on Functional programming , pages 198–208, 2000.
29 Joshua Dunfield. Elaborating intersection and union types. Journal of Functional Programming , 24(2-3):133–165, 2014.
30 Joshua Dunfield and Frank Pfenning. Type assignment for intersections and unions in callby-value languages. In International Conference on Foundations of Software Science and Computation Structures , pages 250–266. Springer, 2003.
31 Alain Frisch, Giuseppe Castagna, and Véronique Benzaken. Semantic subtyping. In Proceedings 17th Annual IEEE Symposium on Logic in Computer Science , pages 137–146. IEEE, 2002.
32 Jacques Garrigue. Programming with polymorphic variants. In ML workshop , 1998.
33 James Gosling, Bill Joy, Guy Steele, Gilad Bracha, Alex Buckley, Daniel Smith, and Gavin Bierman. The java language specification, 2021. URL: https://docs.oracle.com/javase/ specs/jls/se14/html/index.html .
34 Eric Gunnerson. Nullable types. In A Programmer’s Guide to C# 5.0 , pages 247–250. Springer, 2012.
35 Haruo Hosoya and Benjamin C Pierce. Xduce: A statically typed xml processing language. ACM Transactions on Internet Technology (TOIT) , 3(2):117–148, 2003.
36 Xuejing Huang and Bruno C d S Oliveira. Distributing intersection and union types with splits and duality (functional pearl). Proceedings of the ACM on Programming Languages , 5(ICFP):1–24, 2021.
37 Atsushi Igarashi and Hideshi Nagira. Union types for object-oriented programming. In Proceedings of the 2006 ACM symposium on Applied computing , pages 1435–1441, 2006.
38 Atsushi Igarashi, Benjamin C. Pierce, and Philip Wadler. Featherweight java: a minimal core calculus for java and gj. ACM Trans. Program. Lang. Syst. , 23(3):396–450, 2001.
39 Gavin King. The ceylon language specification, version 1.0, 2013.
40 Foundation Kotlin. Kotlin programming language, 2021. URL: https://kotlinlang.org/ .
41 David MacQueen, Gordon Plotkin, and Ravi Sethi. An ideal model for recursive polymorphic types. In Proceedings of the 11th ACM SIGACT-SIGPLAN symposium on Principles of programming languages , pages 165–174, 1984.
42 Fabian Muehlboeck and Ross Tate. Empowering union and intersection types with integrated subtyping. Proceedings of the ACM on Programming Languages , 2(OOPSLA):1–29, 2018.

ECOOP 2022

--- end of page.page_number=27 ---

25:28 Union Types with Disjoint Switches

43 Abel Nieto, Yaoyu Zhao, Ondřej Lhoták, Angela Chang, and Justin Pu. Scala with Explicit Nulls. In 34th European Conference on Object-Oriented Programming (ECOOP 2020) , Leibniz International Proceedings in Informatics (LIPIcs), pages 25:1–25:26, 2020.
44 Martin Odersky. Scala 3: A next generation compiler for scala, 2021. URL: https://dotty. epfl.ch .
45 Bruno C. d. S. Oliveira, Zhiyuan Shi, and Joao Alpuim. Disjoint intersection types. In Proceedings of the 21st ACM SIGPLAN International Conference on Functional Programming , pages 364–377, 2016.
46 Matthew M Papi, Mahmood Ali, Telmo Luis Correa Jr, Jeff H Perkins, and Michael D Ernst. Practical pluggable types for java. In Proceedings of the 2008 international symposium on Software testing and analysis , pages 201–212, 2008.
47 Benjamin C Pierce. Programming with intersection types, union types. Technical report, and polymorphism. Technical Report CMU-CS-91-106, Carnegie Mellon University, 1991.
48 Benjamin C. Pierce. Types and Programming Languages . The MIT Press, 1st edition, 2002.
49 John C Reynolds. Preliminary design of the programming language forsythe. 1988.
50 Richard Routley and Robert K Meyer. The semantics of entailment—iii. Journal of philosophical logic , 1(2):192–208, 1972.
51 Sam Tobin-Hochstadt and Matthias Felleisen. The design and implementation of typed scheme. ACM SIGPLAN Notices , 43(1):395–406, 2008.
52 Sam Tobin-Hochstadt and Matthias Felleisen. Logical types for untyped languages. In Proceedings of the 15th ACM SIGPLAN international conference on Functional programming , pages 117–128, 2010.
53 Steffen van Bakel, Mariangiola Dezani-Ciancaglini, Ugo de’Liguoro, and Yoko Motohoma. The minimal relevant logic and the call-by-value lambda calculus. Technical report, Citeseer, 2000.
54 Adriaan Van Wijngaarden, Barry J Mailloux, John EL Peck, Cornelius HA Koster, M Sintzoff, CH Lindsey, LGLT Meertens, and RG Fisker. Report on the algorithmic language algol 68. Numerische Mathematik , 14(1):79–218, 1969.
55 Adriaan van Wijngaarden, Barry James Mailloux, John Edward Lancelot Peck, Cornelis HA Koster, CH Lindsey, M Sintzoff, Lambert GLT Meertens, and RG Fisker. Revised report on the algorithmic language Algol 68 . Springer Science & Business Media, 2012.
56 Philip Wadler and Stephen Blott. How to make ad-hoc polymorphism less ad hoc. In Proceedings of the 16th ACM SIGPLAN-SIGACT symposium on Principles of programming languages , pages 60–76, 1989.
57 Francesco Zappa Nardelli, Julia Belyakova, Artem Pelenitsyn, Benjamin Chung, Jeff Bezanson, and Jan Vitek. Julia subtyping: a rational reconstruction. Proceedings of the ACM on Programming Languages , 2(OOPSLA):1–27, 2018.

A Further Extensions and Discussion

The calculus introduced in Section 3 is a simple foundational lambda calculus with union types, similar to prior work on union types and their elimination forms [10, 29, 22]. In Section 4 we extend λu with various interesting features including intersection types, nominal types and subtyping distributivity, inspired by Ceylon, which has similar features. In this section we discuss two more practical extensions:

Disjoint Polymorphism: The first extension is an extension with a form of disjoint polymorphism [4], which allows the specification of disjointness constraints for type variables. Although Ceylon supports polymorphism, it does not support disjoint polymorphism. The extension with disjoint polymorphism is inspired by the work on disjoint intersection types, where disjoint polymorphism has been proposed to account for disjointness in a polymorphic language.

--- end of page.page_number=28 ---

B. Rehman, X. Huang, N. Xie, and B. C. d. S. Oliveira

25:29

A, B , C ::= … | α | ∀ ( α ∗ G ) .B Lowest Ordinary Subtypes (LOS) |A| ∆;Γ e ::= … | e A | Λ( α ∗ G ) .e … = … v ::= … | Λ( α ∗ G ) .e |∀ ( α ∗ G ) .B| ∆;Γ = {∀ ( α ∗⊥ ) .⊥} G Γ ::=::= …⊤| ⊥| | Γ , Int α ∗ | Null G | A → B |α| ∆;Γ = ( |⊤| ∆;Γ) - ( |G| ∆;Γ) where α ∗ G ∈ Γ | G 1 ∨ G 2 | G 1 ∧ G 2 | ∀ ( α ∗ G ) .B ∆; Γ ⊢ A < : B (Additional subtyping rules) ok ∆ ∆; Γ ⊢ α ∆; Γ ⊢ G 1 < : G 2 ∆; Γ , α ∗ G 2 ⊢ B 1 < : B 2 polys-tvar polys-all ∆; Γ ⊢ α < : α ∆; Γ ⊢∀ ( α ∗ G 1) .B 1 < : ∀ ( α ∗ G 2) .B 2

∆; Γ ⊢ e : A (Additional typing rules)

∆; Γ ⊢ e −→ e[′] (Additional reduction rules) ∆; Γ ⊢ e −→ e[′] ∆; Γ ⊢ e B −→ e[′] B[polystep-tappl] ∆; Γ ⊢ (Λ( α ∗ G ) .e ) B −→ e [ α � B ][polystep-tapp]

Figure 9 Syntax, additional typing, subtyping, and reduction rules for λu with polymorphism.

A Special Subtyping Rule for Empty Types: The second extension that we discuss is an alternative subtyping formulation with a special subtyping rule for empty types, which follows the Ceylon approach.

Note that both extensions above have also been formalized in Coq and proved typesound and deterministic. In addition, we also have a brief discussion about implementation considerations.

A.1 Polymorphism

Polymorphism is an essential feature of almost all the modern programming languages. In this section we discuss an extension of λu with parametric polymorphism along with intersection and nominal types. The interesting aspect about this extension is the presence of disjointness constraints. For example, in λu with polymorphism a polymorphic disjoint switch such as: Γ , α ∗ Int ⊢ switch e { ( x : Int) → true , ( y : α ) → false } is accepted. It is safe to use Int and α in alternative branches in a switch in this example. The disjointness constraint in the context (Γ , α ∗ Int) on type variable α ensures that α must only be instantiated with types disjoint to Int. Thus an instantiation of α with Null or A → B is allowed. Whereas, an instantiation of α with Int is rejected by the type system.

ECOOP 2022

--- end of page.page_number=29 ---

25:30 Union Types with Disjoint Switches

Syntax. Figure 9 shows the extension in the syntax of λu with polymorphism. Types are extended with type variables α and disjoint quantifiers ∀ ( α ∗ G ) .B . ∀ ( α ∗ G ) .B is also an ordinary type. The reader can think of this extension in the context of bounded quantification [17, 16] where bounded quantifiers ( ∀ ( α < : A ) .B ) are replaced by disjoint quantifiers ( ∀ ( α ∗ G ) .B ). Bounded quantification imposes a subtyping restriction on type variables, whereas disjoint quantification imposes disjointness restriction on type variables. Disjoint quantification only allows the instantiation of disjoint types. For example, ∀ ( α < : Int ∨ Bool) .α allows α to be instantiated only with subtypes of Int ∨ Bool and restricts all other types. Whereas, ∀ ( α ∗ Int ∨ Bool) .α restricts all the instantiations of α which share an ordinary subtype with Int ∨ Bool. In other words, the permitted instantiations of α are the types disjoint to Int ∨ Bool. Null is a valid instantiation in this case, while Int is not a valid instantiation.

Expressions are extended with type application e A and type abstraction Λ( α ∗ G ) .e . A type abstraction is also a value. Additionally, context Γ now also contains type variables with their respective disjointness constraints. The disjointness constraint of type variables is restricted to ground types ( G ), which includes all the types except type variables. Ground types are shown at the top left of Figure 9.

Subtyping, Typing and Operational Semantics. Figure 9 shows additional rules in the formalization of λu with polymorphism. Note that subtyping, typing, and reduction relations now have two contexts ∆and Γ. Subtyping is extended for the two newly added types. The subtyping rule for type variables is a special case of reflexivity (rule polys-tvar)). Rule polys-all is interesting. It says that input and output types of two disjoint quantifiers are covariant in the subtype relation. This contrasts with calculi with bounded quantification and disjoint polymorphism [4], where the subtyping between the type bounds of the constraints is contravariant, and the subtyping between the types in the universal quantification body is covariant. Note that in the calculus that we formalized in Coq, we study parametric polymorphism without distributive subtyping rules.

Similarly, typing is extended to assign the type to two newly added expressions. Rule ptyptap is for type applications and rule ptyp-tabs is for type abstractions. Rule polysteptappl is standard reduction rule for type application. Rule polystep-tapp replaces α with type B in expression e .

Disjointness. Disjointness has to be updated to accommodate type variables and disjoint quantifiers. The definition of algorithmic disjointness is roughly the same as discussed in Section 4, except that it takes an additional argument Γ. Context Γ is also an argument of LOS. LOS is extended to handle the additional cases of α and ∀ ( α ∗ G ) .B and is shown at the top right of Figure 9. LOS returns ∀ ( α ∗⊥ ) .⊥ as the least ordinary subtype of ∀ ( α ∗ G ) .B . The type variable case is interesting. It returns the set difference of all ordinary subtypes and LOS of the disjointness constraint of type variable. Note that the disjointness constraint of type variables is restricted to ground types.

▶ Definition 25 (Disjointness) . ∆; Γ ⊢ A ∗ B ::= |A| ∆;Γ ∩|B| ∆;Γ = {}.

Type-safety and Determinism. The extension with disjoint polymorphism retains the properties of type-soundness and determinism. All the metatheory is formalized in Coq theorem prover. Progress and determinism does not require significant changes for this extension. Type preservation requires the preservation of disjointness after substitution and disjointness narrowing along with disjointness weakening. Disjointness substitution states

--- end of page.page_number=30 ---

B. Rehman, X. Huang, N. Xie, and B. C. d. S. Oliveira

25:31

that if two types are disjoint before type substitution, they must be disjoint after type substitution as stated in Lemma 26. The disjointness narrowing relates disjointness and subtyping. It states that it is safe to change the bounds of type variables from subtypes to supertypes as stated in Lemma 27.

▶ Lemma 26 (Disjointness Substitution) . If ∆; Γ , α ∗ G 1 ⊢ B ∗ C and ∆; Γ ⊢ G 2 ∗ G 1 then ∆; Γ[ α � G 2] ⊢ B [ α � G 2] ∗ C [ α � G 2]

▶ Lemma 27 (Disjointness Narrowing) . If ∆; Γ , α ∗ G 1 ⊢ B ∗ C and ∆; Γ ⊢ G 1 < : G 2 then ∆; Γ , α ∗ G 2 ⊢ B ∗ C

A.2 A More General Subtyping Rule for Bottom Types

As discussed in Section 4.3, Ceylon includes the following subtyping rule:

It is possible to support, and in fact generalize, such a rule in λu . The idea is to employ our definition of lowest ordinary subtypes, and add the following rule to λu with intersection types:

Rule s-los is an interesting addition in subtyping of λu . It says that if the LOS returns the empty set for some type A , then A is a subtype of all types. In other words, such type behaves like a bottom-like type. Such rule generalizes the rule s-disj employed in Ceylon, since when A is an intersection type of two disjoint types, we get the empty set. Moreover, adding rule s-los makes rule s-bot redundant as well, since the LOS for the bottom type is also the empty set. It is trivial to prove a lemma which says that ⊥ is a subtype of all types. We drop rule s-bot from the calculus discussed in Section 4 and prove Lemma 28 to show this property instead:

▶ Lemma 28 (Bottom Type Least Subtype) . ⊥ < : A.

A similar lemma can be proved to show that disjoint types are bottom-like (as in rule s-disj), when rule s-los is added to subtyping:

▶ Lemma 29 (Disjont Intersections are Bottom-Like) . If A ∗ B then A ∧ B < : ⊥.

The use of rule s-los instead of rule s-disj also has the advantage that it does not create a mutual dependency between disjointness and subtyping. We can have the definition of disjointness, which depends only on subtyping and ordinary types, and the definition of subtyping, which depends on LOS but not on disjointness.

We have formalized and proved all the metatheory, including type soundness, transitivity of subtyping, soundness and completeness of disjointness and determinism for a variant of λu with intersection types, nominal types, standard subtyping and rule s-los in Coq.

ECOOP 2022

--- end of page.page_number=31 ---

25:32 Union Types with Disjoint Switches

A.3 Implementation of Disjoint Switches

Ceylon code runs on the Java Virtual Machine (JVM). A Ceylon program compiles to JVM bytecode. The final bytecode to which a Ceylon program is compiled to erase annotations for types not supported in the JVM. In particular, union types such as String ∨ Null are erased into Object . Disjoint switches are implemented by type casts. For each branch there is an instanceof to test the type of the branch and select a particular branch. An implementation of the λu calculus could also use a similar approach for compilation. In essence the use of union types and disjoint switches provides an elegant alternative to type-unsafe idioms, based on instanceof tests, that are currently widely used by Java programmers, while keeping comparable runtime performance.

--- end of page.page_number=32 ---

Tony's Wiki

Explorer

Union Types with Disjoint Switches

Union Types with Disjoint Switches

Metadata

Why this note exists

Extraction quality note

Technical landmarks detected automatically

Full converted paper text

Union Types with Disjoint Switches

Baber Rehman �

Abstract

1 Introduction

25:4 Union Types with Disjoint Switches

2 Overview

2.1 Tagged Union Types

2.2 Type-directed Elimination forms for Union Types

25:6 Union Types with Disjoint Switches

2.3 Union Types and Disjoint Switches in Ceylon

2.4 Nullable Types

25:8 Union Types with Disjoint Switches

2.5 Key Ideas in Our Work

▶ Definition 1 ( ⊥ -Disjointness) . A ∗ B ::= ∀ C, if C < : A and C < : B then ⌋C ⌊

▶ Definition 2 ( ∧ -Disjointness) . A ∗ B ::= ∄ C[◦] , C[◦] < : A and C[◦] < : B.

25:10 Union Types with Disjoint Switches

void do (<Integer & String > | Boolean val) { /* do something */ }

3 The Union Calculus λu

3.1 Syntax

3.2 Subtyping

3.3 Disjointness

25:14 Union Types with Disjoint Switches

if A ∗ B.

3.4 Typing

3.5 Operational Semantics

3.6 Type Soundness and Determinism

3.7 An Alternative Specification for Disjointness

25:16 Union Types with Disjoint Switches

4 λu with Intersections, Distributive Subtyping and Nominal Types

4.1 Syntax, Well-formedness and Ordinary Types

4.2 Distributive Subtyping

25:18 Union Types with Disjoint Switches

25:20 Union Types with Disjoint Switches

4.3 Disjointness Specification

4.4 Algorithmic Disjointness

25:22 Union Types with Disjoint Switches

▶ Definition 17. ∆ ⊢ A ∗a B ::= |A| ∆ ∩|B| ∆ = {}.

4.5 Typing, Semantics and Metatheory

5 Related Work

▶ Definition 24. A ∗ B ::= ∄ C[◦] , A < : C[◦] and B < : C[◦] .

6 Conclusion and Future Work

25:26 Union Types with Disjoint Switches

References

25:28 Union Types with Disjoint Switches

A Further Extensions and Discussion

A.1 Polymorphism

25:30 Union Types with Disjoint Switches

A.2 A More General Subtyping Rule for Bottom Types

25:32 Union Types with Disjoint Switches

A.3 Implementation of Disjoint Switches

Graph View

Table of Contents

Backlinks

`void do (<Integer & String > | Boolean val) { /* do something */ }`