Language Design: Four Kinds of Unions
Overview
| | Syntactic Wrapping | No Syntactic Wrapping |
|---|---|---|
| No Runtime Tags | untagged union (Rust union, C union, C++ union) | union type (TypeScript union type) |
| Runtime Tags | discriminated union/tagged union (Rust enum, F# discriminated union) | ? (Algol united mode, Core union, C# nominal type union) |
Upper Left: Untagged Unions
Some languages like Rust, C or C++ provide untagged unions, where the chosen variant has to be specified on creation and access:
union Pet {
cat: Cat,
dog: Dog
}
let pet = Pet { cat: Cat("Molly", 9) }
Values of untagged unions do not contain metadata (runtime tags) to distinguish variants (though such information can be added to the union as an additional union field).
They require that every access is qualified with variant information.
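As a concrete illustration of the above, here is a minimal sketch of an untagged union in actual Rust. The `Cat` and `Dog` payload structs and the `new_cat_pet` helper are hypothetical names chosen for this example, and `#[repr(C)]` is added only to make the layout predictable.

```rust
// Hypothetical payload types; union fields must be `Copy` (or wrapped in
// `ManuallyDrop`), so both derive `Copy` here.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Cat {
    lives: u32,
}

#[derive(Clone, Copy, Debug, PartialEq)]
struct Dog {
    age: u32,
}

// No runtime tag is stored: both fields share the same storage.
#[repr(C)]
union Pet {
    cat: Cat,
    dog: Dog,
}

// Creation names the variant explicitly.
fn new_cat_pet(lives: u32) -> Pet {
    Pet { cat: Cat { lives } }
}

fn main() {
    let pet = new_cat_pet(9);

    // Access names the variant as well, and is `unsafe` because the compiler
    // cannot check that `cat` is the field that was actually written.
    let cat = unsafe { pet.cat };
    println!("{:?}", cat);

    let other = Pet { dog: Dog { age: 3 } };
    println!("{:?}", unsafe { other.dog });
}
```

Nothing in `Pet` records which field is live, so tracking that is left entirely to the programmer.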
Upper Right: Union Types
Other languages provide union types where the definition of the union type (Pet) refers to existing types in scope for its variants (Cat and Dog).1
type Pet = Cat | Dog
let pet: Pet = Cat("Molly", 9)
A value of such a union does not contain metadata (runtime tags) to tell its variants apart.
Lower Left: Discriminated Unions
A “traditional” discriminated union definition as it exists in various languages defines both the enum itself (Pet), as well as its variants (Cat and Dog).
enum Pet {
Cat(name: String, lives: Int),
Dog(name: String, age: Int)
}
let pet: Pet = Cat("Molly", 9)
A discriminated union value contains a tag to allow telling its variants apart – even in cases where the types are the same (such as in Result[String, String]).
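The role of the tag can be sketched in Rust with `Result<String, String>`, where both variants carry the same payload type. The functions `parse_name` and `describe` are hypothetical helpers made up for this example.

```rust
// Both variants carry a `String`; only the runtime tag tells them apart.
fn parse_name(input: &str) -> Result<String, String> {
    if input.is_empty() {
        Err(String::from("name must not be empty"))
    } else {
        Ok(input.to_string())
    }
}

// Matching inspects the tag, not the payload type.
fn describe(result: &Result<String, String>) -> String {
    match result {
        Ok(name) => format!("valid name: {name}"),
        Err(message) => format!("error: {message}"),
    }
}

fn main() {
    println!("{}", describe(&parse_name("Molly")));
    println!("{}", describe(&parse_name("")));
}
```

An `Ok` and an `Err` holding identical strings are still different values, which would not be expressible with an untagged `String | String`.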
Lower Right: ?
Unions with runtime tags but without syntactic wrapping have existed in various languages, though no common, language-spanning name has been established for this concept.
“United modes” in Algol 68:
MODE CAT = STRUCT (STRING name, INT lives);
MODE DOG = STRUCT (STRING name, INT years);
MODE PET = UNION (CAT, DOG);
Unions in Core:
class Cat(name: String, lives: Int)
class Dog(name: String, age: Int)
union Pet of Cat, Dog
“Nominal union types” as proposed for C# 15:
public record Cat(string Name, long Lives);
public record Dog(string Name, long Age);
public union Pet(Cat, Dog);
All these define a union Pet that refers to existing types Cat and Dog.
An instance of Cat can be directly assigned to pet (of union type Pet) – without syntactic wrapping:
// core
let pet: Pet = Cat("Molly", 9)
Intuitively, this works similarly to permits clauses of sealed interfaces in Java in the sense that
sealed interface Pet permits Cat, Dog { ... }
does not define Cat or Dog, but refers to existing Cat and Dog types in scope.2
Benefits of such unions
1. Union variants have types, because they have a “real” class/struct/… declaration. (This fixes a mistake that some languages like Rust or Haskell made with their enum/data types.3,4)
2. Variants can be reference types or value types (as they refer to “real” class or value definitions).
3. No “stutter”, where variant names have to be invented to wrap existing types. (Rust has this issue.)
4. Union values can be passed/created more easily, as no syntactic wrapping is required.
5. Variants can be re-used in different unions.
6. The ability to build ad-hoc unions out of existing types obviates the need for a separate type alias feature.
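For contrast, the following sketch shows in actual Rust the situation the first and third points refer to. An enum variant such as `Pet::Cat` is a constructor rather than a type, so giving "a cat" its own type means declaring a separate struct and wrapping it, which leads to the `Pet::Cat(Cat)` stutter. The names `remaining_lives` and `describe` are hypothetical and exist only for this example.

```rust
// Standalone types, declared so that "a cat" can be named as a type.
#[derive(Debug)]
struct Cat {
    name: String,
    lives: u32,
}

#[derive(Debug)]
struct Dog {
    name: String,
    age: u32,
}

// The variant names repeat the names of the payload types.
#[derive(Debug)]
enum Pet {
    Cat(Cat),
    Dog(Dog),
}

// This signature needs the struct `Cat`; `Pet::Cat` is not a type and
// cannot be used as a parameter type.
fn remaining_lives(cat: &Cat) -> u32 {
    cat.lives
}

fn describe(pet: &Pet) -> String {
    match pet {
        Pet::Cat(cat) => format!("{} has {} lives", cat.name, cat.lives),
        Pet::Dog(dog) => format!("{} is {} years old", dog.name, dog.age),
    }
}

fn main() {
    let molly = Cat { name: String::from("Molly"), lives: 9 };
    println!("{}", remaining_lives(&molly));

    // The value has to be wrapped explicitly to become a `Pet`.
    let pet = Pet::Cat(molly);
    println!("{}", describe(&pet));

    let rex = Pet::Dog(Dog { name: String::from("Rex"), age: 3 });
    println!("{}", describe(&rex));
}
```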
Example for 1., 2., 3.
enum Option[T] { Some(value: T), None }
… would receive little benefit from being written as …
union Option[T] of Some[T], None
value Some[T](value: T)
module None
…, but even trivial types like a JSON representation would benefit.
Instead of …
enum JsonValue {
JsonObject(Map[String, JsonValue]),
JsonArray (Array[JsonValue]),
JsonString(String),
JsonNumber(Float64),
JsonBool (Bool),
JsonNull,
...
}
… one would write (with Array, Float64 and String being existing types in the language):
union JsonValue of
Map[String, JsonValue],
Array[JsonValue],
String,
Float64,
Bool,
JsonNull,
...
module JsonNull
Example for 4.
No wrapping required when passing arguments (unlike “traditional” enum approaches):
fun someValue(value: JsonValue) = ...
someValue(JsonString("test")) // "traditional" approach
someValue("test") // proposed union design
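In languages with “traditional” enums, the wrapping at call sites is often reduced with library-level conversions. The sketch below shows one such workaround in Rust, using `From` implementations and an `impl Into<JsonValue>` parameter. It is not the proposed union design: the reduced `JsonValue` enum and the `render` function are hypothetical, and the string rendering relies on Rust's `Debug` quoting, which only approximates JSON escaping.

```rust
#[derive(Debug, PartialEq)]
enum JsonValue {
    JsonString(String),
    JsonNumber(f64),
    JsonBool(bool),
    JsonNull,
}

// One conversion has to be written per wrapped type.
impl From<&str> for JsonValue {
    fn from(value: &str) -> Self {
        JsonValue::JsonString(value.to_string())
    }
}

impl From<f64> for JsonValue {
    fn from(value: f64) -> Self {
        JsonValue::JsonNumber(value)
    }
}

impl From<bool> for JsonValue {
    fn from(value: bool) -> Self {
        JsonValue::JsonBool(value)
    }
}

// Accepts a `JsonValue` or anything convertible into one.
fn render(value: impl Into<JsonValue>) -> String {
    match value.into() {
        JsonValue::JsonString(s) => format!("{:?}", s), // rough quoting, not full JSON escaping
        JsonValue::JsonNumber(n) => n.to_string(),
        JsonValue::JsonBool(b) => b.to_string(),
        JsonValue::JsonNull => String::from("null"),
    }
}

fn main() {
    // Explicit wrapping, as in the "traditional" approach.
    println!("{}", render(JsonValue::JsonString(String::from("test"))));
    // Conversion through `From`, which hides the wrapping at the call site.
    println!("{}", render("test"));
    println!("{}", render(1.5_f64));
    println!("{}", render(true));
    println!("{}", render(JsonValue::JsonNull));
}
```

The wrapping still happens inside each `From` implementation, and the function signature has to opt in via `impl Into<JsonValue>`, whereas the union design described here needs neither.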
Example for 5.
Consider this class definition:
class Name(name: String)
With the proposed union design, Name can be used multiple times – in different unions (and elsewhere):
union PersonIdentifier of
Name,
... // other identifiers like TaxId, Description, PhoneNumber etc.
union DogTag of
Name,
... // other identifiers like RegId, ...
This kind of union design reduces indirection at use sites and can be used in more scenarios than more “traditional” enums, while not changing their runtime costs or representation.