Abusing the algebra of algebraic data types – why does this work?

The ‘algebraic’ expression for algebraic data types looks very suggestive to someone with a background in mathematics. Let me try to explain what I mean.

Having defined the basic types

  • Product •
  • Union +
  • Singleton X
  • Unit 1

and using the shorthand X² for X•X and 2X for X+X et cetera, we can then define algebraic expressions for e.g. linked lists:

data List a = Nil | Cons a (List a)

L = 1 + X • L

and binary trees:

data Tree a = Nil | Branch a (Tree a) (Tree a)

T = 1 + X • T²
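To make the correspondence between the expression and the declaration concrete, here is a small Haskell sketch of the list equation spelled out literally, assuming Either stands for +, a pair stands for •, and () stands for 1. The names L, toList and fromList are invented for this illustration.

```haskell
-- L = 1 + X • L, written with Either for +, a pair for •, and () for 1.
newtype L x = L (Either () (x, L x))

-- Converting to and from the built-in list shows the two types carry
-- the same information.
toList :: L x -> [x]
toList (L (Left ()))       = []
toList (L (Right (x, xs))) = x : toList xs

fromList :: [x] -> L x
fromList []       = L (Left ())
fromList (x : xs) = L (Right (x, fromList xs))

main :: IO ()
main = print (toList (fromList "abc"))  -- round trip gives back "abc"
```

The round trip being the identity in both directions is what "L = 1 + X • L" means when read as an isomorphism of types rather than an equation between numbers.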

Now, my first instinct as a mathematician is to go nuts with these expressions, and try to solve for L and T. I could do this through repeated substitution, but it seems much easier to abuse the notation horrifically and pretend I can rearrange it at will. For example, for a linked list:

L = 1 + X • L

(1 - X) • L = 1

L = 1 / (1 - X) = 1 + X + X² + X³ + ...

where I’ve used the power series expansion of 1 / (1 - X) in a totally unjustified way to derive an interesting result, namely that an L type is either Nil, or it contains 1 element, or it contains 2 elements, or 3, etc.
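The repeated-substitution reading of this can be sketched in Haskell by treating L = 1 + X • L as a lazy recursive definition of a list of coefficients. This is only an illustration: the names Series, one, plus, timesX and listSeries are made up here, and every series is assumed to be an infinite list of coefficients, lowest degree first.

```haskell
-- Coefficients of a formal power series in X, lowest degree first.
-- Assumption: every Series used here is an infinite list.
type Series = [Integer]

-- The constant series 1.
one :: Series
one = 1 : repeat 0

-- Coefficientwise addition.
plus :: Series -> Series -> Series
plus = zipWith (+)

-- Multiplying by X shifts every coefficient up by one degree.
timesX :: Series -> Series
timesX s = 0 : s

-- L = 1 + X • L, unfolded lazily one coefficient at a time.
listSeries :: Series
listSeries = one `plus` timesX listSeries

main :: IO ()
main = print (take 6 listSeries)  -- prints [1,1,1,1,1,1]
```

Each coefficient is 1, matching 1 + X + X² + X³ + ...: there is exactly one list shape for each length.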

It gets more interesting if we do it for binary trees:

T = 1 + X • T²

X • T² - T + 1 = 0

T = (1 - √(1 - 4 • X)) / (2 • X)

T = 1 + X + 2 • X² + 5 • X³ + 14 • X⁴ + ...

again, using the power series expansion (done with Wolfram Alpha). This expresses the non-obvious (to me) fact that there is only one binary tree with 1 element, 2 binary trees with two elements (the second element can be on the left or the right branch), 5 binary trees with three elements etc.
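The coefficients can be checked directly by enumerating tree shapes. The Haskell sketch below uses the Tree type from above with () as the element type, so that only the shape matters; the helper name shapes is invented for this illustration.

```haskell
-- The tree type from the question.
data Tree a = Nil | Branch a (Tree a) (Tree a)

-- All tree shapes with exactly n Branch nodes. Using () as the element
-- means two trees differ only if their shapes differ.
shapes :: Int -> [Tree ()]
shapes 0 = [Nil]
shapes n =
  [ Branch () l r
  | k <- [0 .. n - 1]        -- k nodes go into the left subtree
  , l <- shapes k
  , r <- shapes (n - 1 - k)  -- the remaining nodes go into the right subtree
  ]

main :: IO ()
main = print (map (length . shapes) [0 .. 4])  -- prints [1,1,2,5,14]
```

The recursion in shapes mirrors T = 1 + X • T²: a tree is either Nil, or one element together with a pair of smaller trees.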

So my question is – what am I doing here? These operations seem unjustified (what exactly is the square root of an algebraic data type anyway?) but they lead to sensible results. Does the quotient of two algebraic data types have any meaning in computer science, or is it just notational trickery?

And, perhaps more interestingly, is it possible to extend these ideas? Is there a theory of the algebra of types that allows, for example, arbitrary functions on types, or do types require a power series representation? If you can define a class of functions, then does composition of functions have any meaning?
