Bringing Union Types to C#: A Deep Dive into .NET 11's Latest Addition

For years, developers coming from languages like F#, Rust, or TypeScript have looked at C# and wondered why it lacked a first-class way to represent a value that could be one of several different types. Until now, the common workarounds were fragile base-class hierarchies, casts from object, or third-party libraries such as OneOf.

With the .NET 11 preview (and C# 15), the language finally introduces the union keyword. This addition allows developers to model "this OR that" scenarios directly at the type level, significantly improving type safety and reducing the boilerplate associated with the "Result pattern."

Understanding Union Types in C# 15

At its core, a union type allows a single variable to hold a value of one of several predefined types. While functional languages have used these for decades, C#'s implementation focuses on tagged unions (also known as discriminated unions or sum types). A tagged union wraps its value and guarantees that it is exactly one of the allowed types.

The union Keyword in Action

Consider a scenario where you need to support multiple operating systems, each with different data requirements. Previously, you might have used a base class or an enum tag. In C# 15, you can define this concisely:

public record Windows(string Version);
public record Linux(string Distro, string Version);
public record MacOS(string Name, int Version);

// Define the union type
public union SupportedOS(Windows, Linux, MacOS);

Creating an instance is intuitive, supporting both explicit construction and implicit conversion:

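// Implicit conversion from a case type
SupportedOS os = new MacOS("Tahoe", 25);
// Explicit construction through the union's constructor
SupportedOS other = new SupportedOS(new Linux("Ubuntu", "24.04"));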

Exhaustive Pattern Matching

Unions are most useful in combination with switch expressions. Because the compiler knows the complete set of case types, it can enforce exhaustiveness: if you fail to handle one of them, it issues a warning (CS8509). A default discard (_) arm is no longer needed unless you specifically want one.

string GetDescription(SupportedOS os) => os switch
{
    Windows windows => $"Windows {windows.Version}",
    Linux linux => $"{linux.Distro} {linux.Version}",
    MacOS macOS => $"MacOS {macOS.Name} ({macOS.Version})",
}; // No discard case required!
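
For contrast, here is a minimal sketch of the base-class workaround mentioned earlier, written in C# that compiles without the union keyword. The names LegacyOS, LegacyWindows, LegacyLinux, LegacyMacOS, and LegacyDescriber are hypothetical and exist only for this illustration. Because any other code could derive from the abstract base, the compiler cannot prove the switch is exhaustive, so a discard arm is needed to avoid the CS8509 warning:

```csharp
using System;

// Hypothetical pre-union model: an open hierarchy rooted in an abstract record.
public abstract record LegacyOS;
public record LegacyWindows(string Version) : LegacyOS;
public record LegacyLinux(string Distro, string Version) : LegacyOS;
public record LegacyMacOS(string Name, int Version) : LegacyOS;

public static class LegacyDescriber
{
    public static string GetDescription(LegacyOS os) => os switch
    {
        LegacyWindows windows => $"Windows {windows.Version}",
        LegacyLinux linux => $"{linux.Distro} {linux.Version}",
        LegacyMacOS macOS => $"MacOS {macOS.Name} ({macOS.Version})",
        // Required: the hierarchy is open, so new subtypes could appear later.
        _ => throw new ArgumentOutOfRangeException(nameof(os)),
    };
}
```

If a fourth subtype is added later, this version still compiles cleanly and only fails at run time, which is the gap the union declaration closes.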

Under the Hood: Implementation and Boxing

To understand the performance implications, it is helpful to look at how the compiler implements the union keyword. By default, a union is generated as a struct that implements a new IUnion interface and is decorated with a [Union] attribute.

The Default Implementation

The standard generated code looks roughly like this:

[Union]
public struct SupportedOS : IUnion
{
    public object? Value { get; }
    public SupportedOS(Windows value) => this.Value = (object) value;
    public SupportedOS(Linux value) => this.Value = (object) value;
    public SupportedOS(MacOS value) => this.Value = (object) value;
}

Because the inner value is stored as an object, value types (like int or bool) are boxed onto the heap. For many applications, this overhead is negligible. However, for high-performance "hot paths," this can be a significant bottleneck.
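
The boxing cost can be observed without any union syntax. The sketch below uses only standard C# (the BoxingDemo class and its method names are hypothetical): storing an int in an object slot allocates a new box each time, so two boxes of the same number are distinct heap objects, whereas a record is already a reference type and is stored as is.

```csharp
using System;

public record Windows(string Version);

public static class BoxingDemo
{
    // Boxing the same int twice yields two separate heap objects.
    public static bool BoxesAreSameObject(int number)
    {
        object first = number;   // box #1
        object second = number;  // box #2
        return ReferenceEquals(first, second); // false: distinct allocations
    }

    // A reference type is not boxed: the object slot holds the same reference.
    public static bool ReferenceIsPreserved(Windows windows)
    {
        object stored = windows; // no copy, no new allocation
        return ReferenceEquals(stored, windows); // true
    }
}
```

This is why a union whose cases are all records, like SupportedOS, pays no boxing penalty, while a union over int or bool does.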

Avoiding Boxing with Custom Unions

For performance-critical code, C# 15 lets you bypass the default boxing behavior by writing the union type yourself and implementing a TryGetValue pattern. When you provide a TryGetValue method for each case type, the compiler calls those methods in switch expressions instead of reading the boxed Value property.

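[Union]
public struct IntOrBool : IUnion
{
    // Discriminator plus a single int slot; a bool is stored as 0 or 1.
    private readonly bool _isBool;
    private readonly int _value;

    // One constructor per case type, mirroring the generated default above.
    public IntOrBool(int value) { _isBool = false; _value = value; }
    public IntOrBool(bool value) { _isBool = true; _value = value ? 1 : 0; }

    // Succeeds only when the union currently holds an int.
    public bool TryGetValue(out int value)
    {
        value = _value;
        return !_isBool;
    }

    // Succeeds only when the union currently holds a bool.
    public bool TryGetValue(out bool value)
    {
        value = _isBool && _value is 1;
        return _isBool;
    }

    // Boxing fallback for callers that read Value directly.
    public object Value => _isBool ? _value is 1 : _value;
}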
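
To see what the TryGetValue pattern buys you, the sketch below reproduces the same storage layout as a plain struct that compiles without the preview attribute or interface, and consumes it by hand. The type IntOrBoolSketch, its constructors, and the Describe helper are assumptions made for illustration. The article does not show the code the compiler actually emits for a switch expression, so treat Describe as an approximation of the idea and not as the real lowering.

```csharp
using System;

// Hypothetical stand-in for the custom union above, minus [Union] and IUnion.
public readonly struct IntOrBoolSketch
{
    private readonly bool _isBool;
    private readonly int _value;

    public IntOrBoolSketch(int value) { _isBool = false; _value = value; }
    public IntOrBoolSketch(bool value) { _isBool = true; _value = value ? 1 : 0; }

    public bool TryGetValue(out int value)
    {
        value = _value;
        return !_isBool;
    }

    public bool TryGetValue(out bool value)
    {
        value = _isBool && _value is 1;
        return _isBool;
    }
}

public static class IntOrBoolConsumer
{
    // Tries each case in turn; the payload is read without boxing it.
    public static string Describe(IntOrBoolSketch union)
    {
        if (union.TryGetValue(out int number)) return $"int {number}";
        if (union.TryGetValue(out bool flag)) return $"bool {flag}";
        throw new InvalidOperationException("Unreachable for this two-case sketch.");
    }
}
```

The typed out variables select the matching overload, so each case is read straight from the struct's fields. The trade-off is that the author of the union, not the compiler, is responsible for keeping the discriminator and the stored value consistent.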

Community Perspectives and Trade-offs

The introduction of union types has sparked significant discussion among the developer community. While generally welcomed, several points of contention have emerged:

  • The "F# Influence": Many observers noted that C# is effectively absorbing features from F# over time. As one commenter put it:

    "F# has had this for decades, C# is basically just slowly becoming F# with a C-style syntax."

  • Tagged vs. Untagged Unions: A critical technical distinction was raised regarding the terminology. C# is implementing tagged unions (algebraic data types), which differ from the untagged unions found in TypeScript, where a variable can be of type A | B without a wrapper constructor.
  • Performance Concerns: Some developers expressed disappointment that boxing is the default behavior, arguing that the language should have implemented the TryGetValue pattern internally to avoid heap allocations for value types by default.

Looking Ahead

Union types are just the beginning of a broader push toward better exhaustiveness in C#. The language roadmap includes several related proposals:

  1. Closed Enums: Enums that allow switch expressions without a catch-all discard case.
  2. Closed Hierarchies: The ability to mark a class as closed to prevent external derivation, enabling exhaustive matching for class hierarchies.
  3. Union Member Providers: A way to define union members on a separate type from the union itself.
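
The motivation for closed enums can be shown with today's C#. In this sketch (Status and StatusFormatter are hypothetical names), every named member is handled, yet an enum variable can still hold any value of its underlying integer type. The switch is therefore not truly exhaustive: the compiler warns about the unhandled unnamed value, and the expression throws at run time if one arrives.

```csharp
using System;

public enum Status { Active, Inactive }

public static class StatusFormatter
{
    // All named members are covered, but the compiler still warns here because
    // an unnamed value such as (Status)42 is a legal Status.
    public static string Format(Status status) => status switch
    {
        Status.Active => "active",
        Status.Inactive => "inactive",
    };

    // Returns true when Format fails on a value with no named member.
    public static bool FailsOnUnnamedValue()
    {
        try
        {
            Format((Status)42);
            return false;
        }
        catch (InvalidOperationException)
        {
            // On modern .NET this is a SwitchExpressionException,
            // which derives from InvalidOperationException.
            return true;
        }
    }
}
```

A closed enum, as described in the proposal above, would rule out the unnamed value and let the two-arm switch count as complete.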

By integrating these features, C# is moving closer to the safety and expressiveness of functional languages while maintaining the versatility of a general-purpose, object-oriented language.
