Algebraic Data Types

Sherub Thakur
Shuttl Tech
Published in
3 min readNov 3, 2020

--

An algebraic data type is a data type defined out of a combination of two constructions: products and sums. I know you are like well bloody hell mate, what a’ you yappin’ ‘bout? So let’s start with —

An Example

Disclaimer: This section uses python examples but you should able to follow the example in you are familiar with Object-Oriented language.

Let’s say we are asked to model the state of an Oven. Where we define it as either it can be OFF or it can be ON with a temperature value. How will you model this idea?

The usual solution in Object-Oriented World would go something like —

Python Example of modeling the OvenState

Now let’s talk about what are the few problems with this approach.

A calculation of representable states

Let’s say for the simplification that class Temp only has 10 possible values. Now the question is how many possible values can OvenState have.

2 for switch and 10 for Temp => 20 possible states

And how many did we actually need?

1 signifying it is off and 10 for Temp values when it's on => 11

Even in this small example, we have a huge gap between what we want to represent and what our model can represent.

Enter Algebraic Data Types

Here is how you will model the same problem with Algebraic Data Types.

Haskell Example of modeling the OvenState

The | here means or . So we can read it like “the type OvenState is either Off or On with a Temp value.

Now let’s see how many possible states are here.

1 signifying it is off and 10 for Temp values when it's on => 11

In this case, there is no gap between what we want to represent and what the model can represent. What we have effectively done here is made all the impossible states unrepresentable in our model.

Can you think of any benefits that can come from making impossible states unrepresentable?

A caveat

You can represent OvenState in OO languages using inheritance in a way such that it will also represent 11 states.

But that is not how most people think right off the bat (I have asked this same question to a lot of folks and only once a person said that they will represent it using inheritance).

Definition

Now let’s try to get back to the definition.

An algebraic data type is a kind of composite type (a type formed by combining other types). Two common classes of algebraic types that are used in it are product types (think tuples) and sum types (think enums).

The values of a product type typically contain several values, called fields. All values of that type have the same combination of field types. The set of all possible values of a product type is the set-theoretic product, i.e., the Cartesian product, of the sets of all possible values of its field types.

Product types are quite prevalent in a lot of popular languages. The OvenState class is an example of this type.

The values of a sum type are typically grouped into several classes, called variants. Each variant has its own constructor, which takes a specified number of arguments with specified types. The set of all possible values of a sum type is the set-theoretic sum, i.e., the disjoint union, of the sets of all possible values of its variants.

Sum types are quite uncommon (baring Enumeration) but most newer languages are supporting these. The Switch class is an example of sum type. In general, Enumeration types are a special case of sum types in which the constructors take no arguments, as exactly one value is defined for each constructor

Care to guess why Algebraic Data Types are called so?

--

--