Wolfram Alpha and some fancy calculators have the ability to manipulate complicated algebraic expressions, such as \(2x - y^2\). Among other things, they generally will be able to perform simplification, expansion, differentiation, and integration.

I’ve always wondered exactly how these things work? After all, doing these things manually tends to require a lot of intution and real human brain-power, which doesn’t seem feasible for a machine. But maybe it’s possibly to devise some kind of rules for the computer to follow such that it can do most of these things without needing any self-awareness.

Now, I don’t actually know how this is usually implemented, if there even is a standard method, but what I came up with certainly works for what I was planning and was a nice little project for a Haskell beginner such as myself.

# How does it work?

I decided first to tackle the problem of expanding expressions, which, I figure, can then be modified slightly to simplify things too. When it comes down to it, expansion is actually quite a trivial thing. Sure, for a human, it would take *forever* to go through each possible expansion of a relatively complicated expression, but for a computer, it’s no trouble at all.

So here’s how it works. First, a set of rules are defined somewhere, telling the program that certain pairs of expressions are equivalent and as such one can be changed into the other. For a small example, consider these rules:

\[ sin(a + b) \rightarrow sin(a)cos(b) + cos(a)sin(b) \] \[ cos(a + b) \rightarrow cos(a)cos(b) - sin(a)sin(b) \] \[ tan(x) \rightarrow \frac{sin(a)}{cos(a)} \]

These look a lot like equations, but more specifically they represent the fact that to expand the expressions on the left, they are to be replaced with the corresponding right-hand-side.

Using just these three rules - iteratively, such that they are applied until they can’t be anymore - the following equalities can be derived:

\[ sin(\pi + 2) = sin(\pi)cos(2) + cos(\pi)sin(2) \] \[ tan(x + y) = \frac{sin(x + y)}{cos(x + y)} = \frac{sin(x)cos(y) + cos(x)sin(y)}{cos(x)cos(y) - sin(x)sin(y)} \]

Anyway, that’s how it works. At least at the highest level. When we start implementing it there will for sure be some little details to explain here and there, but at least we can get started now.

# The implementation

First off, `Data.Maybe`

will need to be imported, for the `fromMaybe`

function, which will be used later. `Data.List`

will also need to be imported, for `intercalate`

.

Now, before we can write these transformations between expressions, we need some way to actually encode the notion of an expression as a Haskell data type.

```
data Term
= Hole String
| Number Double
| Variable String
| Op String Term Term
| Prefix String Term
| Call String [Term]
deriving (Show, Read, Eq)
```

These expressions are defined as a tree-like recursive structure. There are six possible types of expressions: holes, numbers, variables, infix operators, prefix operators, and function calls. All of these are probably obvious as to their meaning except from `Hole`

.

A hole is essentially a variable which can be substituted, for example in the transformation \(sin(a + b) \rightarrow sin(a)cos(b) + cos(a)sin(b)\), \(a\) and \(b\) are holes. A variable which may not be a hole could be \(\pi\), for example. This is obvious to a human, but we must explicitly tell the program which variables can be substituted and which cannot to resolve the ambiguity.

Another thing is that operators will need to be written by hand a lot when writing this program, and it would help if some kind of shorthand was defined:

While not entirely necessary, this will mean that an addition of two variables can be written as:

Now we can define some type aliases:

`Pattern`

is simply an alias for a term, to allow for better understanding of function type signatures which we will define later. Specifically, a pattern will be something on the left-hand-side of a transformation, such as \(sin(a + b)\) from earlier.

`Rule`

is another name for a transformation, but just takes a few less time to type out. It is a mapping from a term to another term, showing what the left-hand-side (the first element) should become when transformed.

Finally, `Match`

is an associative list, mapping strings (which represent holes) with their values. This will be returned from the function which checks for a match between a pattern and a term, and will be used to make a subsitution for the new term.

Another thing to do with these types, before the actual algebra logic is introduced, is to make a function to convert terms to strings. This will be mostly useful when playing around with terms in ghci:

```
isInteger :: Double -> Bool
isInteger n = n == (fromIntegral $ round n)
showTerm :: Term -> String
showTerm (Hole x) = "?" ++ x
showTerm (Number x) = if isInteger x then show $ round x else show x
showTerm (Variable x) = x
showTerm (Op op l r) = "(" ++ showTerm l ++ " " ++ op ++ " " ++ showTerm r ++ ")"
showTerm (Prefix op x) = op ++ showTerm x
showTerm (Call fn args) = fn ++ "(" ++ intercalate ", " (map showTerm args) ++ ")"
```

Pretty standard. Note, a helper function `isInteger`

was defined for stringifying the `Number`

term, because it will have a different representation based on the outcome of this.

With these definitions, it makes sense that a set of expansion rules could be defined as a list of type `[Rule]`

, and indeed this is how I did it. This is an example set of rules, implementing the trig expansions I mentioned at the start:

```
trigExpansionRules :: [Rule]
trigExpansionRules =
[ ((Call "sin" [Hole "a" @+ Hole "b"]), ((Call "sin" [Hole "a"]) @*
(Call "cos" [Hole "b"])) @+ ((Call "cos" [Hole "a"]) @* (Call "sin" [Hole "b"])))
, ((Call "cos" [Hole "a" @+ Hole "b"]), ((Call "cos" [Hole "a"]) @*
(Call "cos" [Hole "b"])) @- ((Call "sin" [Hole "a"]) @* (Call "sin" [Hole "b"])))
, ((Call "tan" [Hole "a"]), (Call "sin" [Hole "a"] @/ Call "cos" [Hole "a"])) ]
```

To say this is messy would be an understatement, but it’s the best we can do with the syntax we defined in the `Term`

definition. A fun thing to do might be to try to implement a parser to clean things up a bit.

## The Matching Algorithm

Matching is, in a sense, the main “substance” of the CAS. It is the functionality which can determine that \(sin(x)\) “matches with” \(sin(a + b)\) and \(x^y\) matches \(sin^2(x)\), while simultaneously extracting the “values” of the holes. For my first example here, \(sin(x)\) and \(sin(a + b)\), the value of the hole \(x\) will be \(a + b\).

We have already defined the necessary types to implement this, hence the signature of this function will be:

As the types imply, this function takes two arguments, a pattern and a term, and returns either `Nothing`

or `Just (some match)`

. Recall that the `Pattern`

type is just an alias for `Term`

, and `Match`

is an alias for `[(String, Term)]`

. So, the first argument will be the pattern to try and match against (\(sin(x)\), going back to my example from above) and the second argument will be the term to try to match with the pattern (\(sin(a + b)\).)

If they do indeed match, the values of the holes will be returned. Otherwise, `Nothing`

.

Here is its definition:

```
match (Hole x) term = Just [(x, term)]
match (Op patternOp patternL patternR) (Op termOp termL termR)
| patternOp /= termOp = Nothing
| otherwise = (++) <$> (match patternL termL) <*> (match patternR termR)
match (Prefix patternOp patternX) (Prefix termOp termX)
| patternOp /= termOp = Nothing
| otherwise = match patternX termX
match (Call patternFn patternArgs) (Call termFn termArgs)
| patternFn /= termFn = Nothing
| length patternArgs /= length termArgs = Nothing
| otherwise = concat <$> sequence (zipWith match patternArgs termArgs)
match a b = if a == b then Just [] else Nothing
```

Once you have a good read through this, it’s pretty intuitive, I think. Basically, there are certain pairs of terms which have their own function clauses - for example two binary operators (`Op`

s) - which require more than a plain equality check between the pattern and the term. For all other pairs, a match can be determined just by checking if `pattern == term`

.

I’ll go through one of the clauses, but they’re all quite similar. If `match`

is called with two `Op`

arguments, it will match if, and only if, the operators are the same and the operands match the respective operands of the other operator term. For example, \(a + b\) won’t match \(a - b\), because they have different operators. So the code will first check if the operators are the same. If they aren’t, it will straight away return `Nothing`

, because it knows at that point that it cannot match. If they are the same, it will have to check that the two left-hand-sides match and the two right-hand-sides also match. The expression…

… will attempt to match the left-hand-sides and the right-hand-sides, then concatenate the two match results or return `Nothing`

if either of them is also `Nothing`

, due to the properties of the `Maybe`

type.

To make sure this works, we can open up a ghci session and try it out on a few expressions:

```
Prelude> match (Hole "a" @+ Hole "b") (Number 1 @+ Variable "x")
Just [("a",Number 1.0),("b",Variable "x")]
Prelude> match (Hole "a" @+ Hole "b") (Number 1 @- Variable "x")
Nothing
Prelude> match
(Call "sin" [Prefix "-" (Hole "a" @* Hole "b")])
(Call "sin" [Prefix "-" (Number 1 @* Number 2)])
Just [("a",Number 1.0),("b",Number 2.0)]
```

Seems to work alright!

## What’s Left?

Now with the matching function working, we can write the rest of the code. The things which need to be done to apply a transformation are:

- Check if the pattern matches the term, creating an associative list to store the values of the holes.
- Substitute these hole values into the holes on the right-hand-side of the transformation rule.
- Replace the term with the new transformed and substituted term from the rule.

We’ve written the function already for part 1, so logically the next thing to do would be the substitution, but instead we’ll take a detour into the functor world.

## A Functor Interface

Terms can act as functors, in a sense.

For example, applying a function *inside* an operator term should surely apply that function to each of the operands; and, applying a function inside a function call term should surely apply that function to each of the arguments. The thing is, the Haskell `Functor`

class only accepts types whose kind is `* -> *`

, meaning they must take a type parameter. `Term`

, however, does not, and as such cannot be a member of the `Functor`

class.

But despite not having the right kind to become a real functor, functor-like functionality will still be useful to us in a few cases to tidy up code, so we can define a function, `applyInside`

, which is analogous to `fmap`

, if it was possible to make `Term`

a regular `Functor`

. This can be defined like so:

```
applyInside :: (Term -> Term) -> Term -> Term
applyInside f (Op op left right) = Op op (f left) (f right)
applyInside f (Prefix op x) = Prefix op (f x)
applyInside f (Call fn args) = Call fn $ map f args
applyInside _ term = term
```

It’s a pretty simple definition. Only three types of term actually need any special treatment — the rest are essentially not functors, and so are just returned as is.

## Substitution

Now, on to substitution!

With our brand new functor-y interface for terms, substitution becomes pratically trivial to implement. A substitution function should take a `Match`

containing the values of the holes, along with a term, and return a new term which is identical to the previous, except all of the holes - where possible - are replaced with their concrete values from the match.

```
substitute :: Match -> Term -> Term
substitute vars term@(Hole x) = fromMaybe term (lookup x vars)
substitute vars term = applyInside (substitute vars) term
```

We can try this out in ghci quickly, just to make sure everything works as it should:

```
Prelude> substitute
[("a", Number 1), ("b", Number 2)]
(Hole "a" @+ Hole "b")
Op "+" (Number 1.0) (Number 2.0)
```

Along with the main functionality of substitution, two “helper” functions can be defined:

```
applyRule :: Rule -> Term -> Term
applyRule rule@(pattern, replacement) term = case match pattern term of
Nothing -> applyInside (applyRule rule) term
Just vars -> substitute vars replacement
applyRules :: [Rule] -> Term -> [Term]
applyRules rules term = nub $ map (\r -> applyRule r term) rules
```

`applyRule`

takes a rule (remember: a mapping from a pattern to a term), and a term, and applies that rule to the term, if possible. If not possible, it will apply the rule inside the term using the functor interface. This is important, as it allows for matching patterns deep inside an expression.

`applyRules`

just runs `applyRule`

for each of the rules in a given list and returns a list of terms corresponding to these applications. `nub`

is also used, to ensure that no duplicates exist.

## Applying Transformations Until They Can’t Be Applied Any More

There’s not a huge amount of use in just applying these transformation rules once. What we really want is a fully expanded expression, meaning the transformations must be applied as many times as possible until they are redundant.

There are two steps to this. First, we can generate an infinite list, where each element contains a number of possible expansions of the input. Each element will be similar to the last, except there may be more possible expansions, meaning some transformations were applied that iteration. This can be implemented like this:

```
expansionIterations :: [Rule] -> Term -> [[Term]]
expansionIterations rules term = map nub $ iterate (>>= applyRules rules) [term]
```

It uses the `iterate`

function, starting from a singleton list containing just the input term. The iteration function is the monadic bind function, right-applied to `applyRules rules`

. This, along with the definition of a list as a monad, gives us something like this:

```
[term] >>= applyRules rules
[term] >>= applyRules rules >>= applyRules rules
[term] >>= applyRules rules >>= applyRules rules >>= applyRules rules
...
```

Remember that, for lists, `(>>=) = concatMap`

. This means that what is happening is that, each iteration, all of the rules are applied to all of the elements currently in the list, creating a list of lists, then flattening that list into a new list which will therefore contain all possible expansions of the previous iteration’s elements.

Ok, so we have an infinite list which contains all possible expansions. But what happens if we want our computations to finish in some finite kind of timeframe?

Simple, all we have to do is find the first iteration which is identical to the previous one. At that point, no transformations were applied last iteration, and due to the pure nature of Haskell functions, we can be certain that none ever will from thereon. This means that we can snip the list off at that point, because the rest of the list will literally be the same element forever afterwards. This can be done with the following function:

```
firstIdempotentIteration :: (Eq a) => [a] -> a
firstIdempotentIteration (x:y:ys)
| x == y = x
| otherwise = firstIdempotentIteration (y:ys)
```

The name, `firstIdempotentIteration`

, means “the first iteration which had no effect”. The way it works is going through the list, two elements at a time, so each pair (overlapping, due to the `x:y:ys`

in the iterative call) is checked. If they are equal, one of them (irrelevant which one) is returned, and otherwise the function is recursively called with `y:ys`

, meaning that the next pair will be checked.

This function is generic in `a`

, just because why not, but in reality we will only be using it specialised to `[[Term]] -> [Term]`

.

## A Nice API

The majority of the CAS is finished, at least for now. Of course there are loads and loads of things which don’t work yet, but I’ll talk about that later. But the main thing which needs doing right now is making an API to which you can give a set of rules and a term and it will give you back a fully expanded term.This can be done like this:

```
expansions :: [Rule] -> Term -> [Term]
expansions rules = firstIdempotentIteration . expansionIterations rules
expand :: [Rule] -> Term -> Term
expand rules = last . expansions rules
```

The first function, `expansions`

, takes a set of rules and a term and returns all (well, almost all - see conclusion) expansions of that term, simply by composing `firstIdempotentIteration`

with `expansionIterations rules`

.

Then, `expand`

builds on top of that by taking some rules and a term and returning just the most expanded term. It is clear from the earlier code that the last element in the list will be the fully expanded one, so it simply just returns the last element from the full expansion list.

And that’s it! You can try it out in ghci if you want, it’s quite fun to put it some complicated trig expressions and see what it does with them.

# Conclusion

So we now have a functional computer algebra system! Well, *sort of*.

There are still actually loads and loads of things left to do. The main thing is that you may have noticed that it doesn’t properly handle multiple holes of the same name. What *should* happen is that, for example, \(sin(a + a)\) would only match things like \(sin(1 + 1)\) but not \(sin(1 + 2)\) but if you try it out you’ll see that both match. I’ll talk about this in the next post.

If you want to see the entire CAS in one file, as opposed to the little snippets in this post, have a look at the GitHub page.