← back

Zac Garby's Website

Computer Algebra System in less than 100 lines

Some fancy calculators have the ability to manipulate complicated algebraic expressions, such as $2x - y^2$, meaning they are able to do some very useful things. Among other things, they generally will be able to perform simplification, expansion, differentiation, and integration.

For a while, I was wondering how they could possibly do this? After all, doing these things manually tends to require a lot of intution and real human brain-power, which doesn't seem feasible for a machine. But maybe it's possibly to devise some kind of rules for the computer to follow such that it can do most of these things without needing any self-awareness.

Now, I don't actually know how this is usually implemented, if there even is a standard method, but what I came up with certainly works for what I was planning and was a nice little project for a Haskell beginner such as myself.

How does it work?

I decided first to tackle the problem of expanding expressions, which, I figure, can then be modified slightly to simplify things too. When it comes down to it, expansion is actually quite a trivial thing. Sure, for a human, it would take forever to go through each possible expansion of a relatively complicated expression, but for a computer, it's no trouble at all.

So here's how it works. First, a set of rules are defined somewhere, telling the program that certain pairs of expressions are equivalent and as such one can be changed into the other. For a small example, consider these rules:

$$ sin(a + b) \rightarrow sin(a)cos(b) + cos(a)sin(b) $$ $$ cos(a + b) \rightarrow cos(a)cos(b) - sin(a)sin(b) $$ $$ tan(x) \rightarrow \frac{sin(a)}{cos(a)} $$

These look a lot like equations, but more specifically they represent the fact that to expand the expressions on the left, they are to be replaced with the corresponding right-hand-side.

Using just these three rules - iteratively, such that they are applied until they can't be anymore - the following equalities can be derived:

$$ sin(\pi + 2) = sin(\pi)cos(2) + cos(\pi)sin(2) $$ $$ tan(x + y) = \frac{sin(x + y)}{cos(x + y)} = \frac{sin(x)cos(y) + cos(x)sin(y)}{cos(x)cos(y) - sin(x)sin(y)} $$

Among an infinite number of others, of course.

Anyway, that's how it works. At least at the highest level. When we start implementing it there will for sure be some little algorithms and such to explain here and there, but at least we can get started now.

The implementation

First off, Data.Maybe will need to be imported, for the fromMaybe function, which will be used later. Data.List will also need to be imported, for intercalate.

import Data.List
import Data.Maybe

Now, before we can write these transformations between expressions, we need some way to actually encode the notion of an expression as a Haskell data type.

data Term
    = Hole String
    | Number Double
    | Variable String
    | Op String Term Term
    | Prefix String Term
    | Call String [Term]
    deriving (Show, Read, Eq)

These expressions are defined as a tree-like recursive structure. There are six possible types of expressions: holes, numbers, variables, infix operators, prefix operators, and function calls. All of these are probably obvious as to their meaning except from Hole. A hole is essentially a variable which can be substituted, for example in the transformation $ sin(a + b) \rightarrow sin(a)cos(b) + cos(a)sin(b) $, $a$ and $b$ are holes. A variable which may not be a hole could be $\pi$, for example. This is obvious to a human, but we must explicitly tell the program which variables can be substituted and which cannot to resolve the ambiguity.

Another thing is that operators will need to be written by hand a lot when writing this program, and it would help if some kind of shorthand was defined:

-- Shorthand for some common operators
(@+) = Op "+"
(@-) = Op "-"
(@*) = Op "*"
(@/) = Op "/"

While not entirely necessary, this will mean that an addition of two variables can be written as:

Variable "a" @+ Variable "b"

Now we can define some type aliases:

type Pattern = Term
type Rule = (Pattern, Term)
type Match = [(String, Term)]

Pattern is simply an alias for a term, to allow for better understanding of function type signatures which we will define later. Specifically, a pattern will be something on the left-hand-side of a transformation, such as $ sin(a + b) $ from earlier.

Rule is another name for a transformation, but just takes a few less key-strokes to type out. It is a mapping from a term to another term, showing what the left-hand-side (the first element) should become when transformed.

Finally, Match is an associative list, mapping strings (which represent holes) with their values. This will be returned from the function to check for a match between a pattern and a term, and will be used to substitute into the new term.

Another thing to do with these types, before the actual algebra logic is introduced, is to make a function to convert terms to strings. This will be mostly useful when playing around with terms in ghci:

isInteger :: Double -> Bool
isInteger n = n == (fromIntegral $ round n)

showTerm :: Term -> String
showTerm (Hole x) = "?" ++ x
showTerm (Number x) = if isInteger x then show $ round x else show x
showTerm (Variable x) = x
showTerm (Op op l r) = "(" ++ showTerm l ++ " " ++ op ++ " " ++ showTerm r ++ ")"
showTerm (Prefix op x) = op ++ showTerm x
showTerm (Call fn args) = fn ++ "(" ++ intercalate ", " (map showTerm args) ++ ")"

Pretty standard. Note, a helper function isInteger was defined for stringifying the Number term, because it will have a different representation based on the outcome of this

With these definitions, it makes sense that a set of expansion rules could be defined as a list of type [Rule], and indeed this is how I did it. This is an example set of rules, implementing the trig expansions I mentioned at the start:

trigExpansionRules :: [Rule]
trigExpansionRules =
    [ ((Call "sin" [Hole "a" @+ Hole "b"]), ((Call "sin" [Hole "a"]) @*
        (Call "cos" [Hole "b"])) @+ ((Call "cos" [Hole "a"]) @* (Call "sin" [Hole "b"])))
    , ((Call "cos" [Hole "a" @+ Hole "b"]), ((Call "cos" [Hole "a"]) @*
        (Call "cos" [Hole "b"])) @- ((Call "sin" [Hole "a"]) @* (Call "sin" [Hole "b"])))
    , ((Call "tan" [Hole "a"]), (Call "sin" [Hole "a"] @/ Call "cos" [Hole "a"])) ]

This looks a bit messy, but it's the best we can do with the syntax we defined in the Term definition. As an exercise to the reader, maybe try to implement a parser so pairs of strings could be entered instead and they would be parsed into the Term type.

The Matching Algorithm

Matching is, in a sense, the main "substance" of the CAS. It is the functionality which can determine that $ sin(x) $ "matches with" $ sin(a + b) $ and $ x^y $ matches $ sin^2(x) $, while simultaneously extracting the "values" of the holes. For my first example here, $ sin(x) $ and $ sin(a + b) $, the value of the hole $x$ will be $ a + b $.

We have already defined the necessary types to implement this, hence the signature of this function will be:

match :: Pattern -> Term -> Maybe Match

As the types imply, this function takes two arguments, a pattern and a term, and returns either Nothing or Just some match. Recall that the Pattern type is just an alias for Term, and Match is an alias for [(String, Term)]. So, the first argument will be the pattern to try and match against ($ sin(x) $, going back to my example from aboe) and the second argument will be the term to try to match with the pattern ($ sin(a + b) $).

If they do indeed match, the values of the holes will be returned. Otherwise, Nothing will be returned.

Here is its definition:

match (Hole x) term = Just [(x, term)]
match (Op patternOp patternL patternR) (Op termOp termL termR)
    | patternOp /= termOp = Nothing
    | otherwise = (++) <$> (match patternL termL) <*> (match patternR termR)
match (Prefix patternOp patternX) (Prefix termOp termX)
    | patternOp /= termOp = Nothing
    | otherwise = match patternX termX
match (Call patternFn patternArgs) (Call termFn termArgs)
    | patternFn /= termFn = Nothing
    | length patternArgs /= length termArgs = Nothing
    | otherwise = concat <$> sequence (zipWith match patternArgs termArgs)
match a b = if a == b then Just [] else Nothing

Once you have a good read through this, it's pretty intuitive, I think. Basically, there are certain pairs of terms which have their own function clauses - for example two binary operators (Ops) - which require more than a plain equality check between the pattern and the term. For all other pairs, a match can be determined just by checking if pattern == term.

I'll go through one of the clauses, but they're all quite similar. If match is called with two Op arguments, it will match if, and only if, the operators are the same and the operands match the respective operands of the other operator term. For example, $a + b$ won't match $a - b$, because they have different operators. So the code will first check if the operators are the same. If they aren't, it will straight away return Nothing, because it knows at that point that it cannot match. If they are the same, it will have to check that the two left-hand-sides match and the two right-hand-sides also match. The expression..

(++) <$> (match patternL termL) <*> (match patternR termR)

.. will attempt to match the left-hand-sides and the right-hand-sides, then concatenate the two match results or return Nothing if either of them is also Nothing, due to the properties of the Maybe type.

To make sure this works, we can open up a ghci session and try it out on a few expressions:

Prelude> match (Hole "a" @+ Hole "b") (Number 1 @+ Variable "x")
Just [("a",Number 1.0),("b",Variable "x")]
Prelude> match (Hole "a" @+ Hole "b") (Number 1 @- Variable "x")
Nothing
Prelude> match
    (Call "sin" [Prefix "-" (Hole "a" @* Hole "b")])
    (Call "sin" [Prefix "-" (Number 1 @* Number 2)])
Just [("a",Number 1.0),("b",Number 2.0)]

Seems to work alright!

The Overview, So Far

Now with the matching function working, we can write the rest of the code. The things which need to be done to apply a transformation are:

  1. Check if the pattern matches the term, creating an associative list to store the values of the holes.
  2. Substitute these hole values into the holes on the right-hand-side of the transformation rule.
  3. Replace the term with the new transformed and substituted term from the rule.

We have written the function already for part 1, so logically the next thing to do would be the substitution - but what's programming without a bit of flawed logic?

A Functor Interface

Terms can act as functors, in a sense.

For example, applying a function inside an operator term should surely apply that function to each of the operands; and, applying a function inside a function call term should surely apply that function to each of the arguments. The thing is, the Haskell Functor class only accepts types whose kind is * -> *, meaning they must take a type parameter. Term, however, does not, and as such cannot be in the Functor class.

But despite not having the right kind to become a real functor, functor-like functionality will still be useful to us in a few cases to tidy up code, so we can define a function, applyInside, which is analogous to fmap, if it was possible to make Term a regular Functor. This can be defined like so:

applyInside :: (Term -> Term) -> Term -> Term
applyInside f (Op op left right) = Op op (f left) (f right)
applyInside f (Prefix op x) = Prefix op (f x)
applyInside f (Call fn args) = Call fn $ map f args
applyInside _ term = term

It's a pretty simple definition. Only three types of term actually need any special treatment — the rest are essentially not functors, and so are just returned as is.

Substitution

With our brand new functor-y interface for terms, substitution becomes trivial to implement. A substitution function should take a Match containing the values of the holes, along with a term, and return a new term which is identical to the previous, except all of the holes - where possible - are replaced with their concrete values from the match.

substitute :: Match -> Term -> Term
substitute vars term@(Hole x) = fromMaybe term (lookup x vars)
substitute vars term = applyInside (substitute vars) term

We can try this out in ghci quickly, just to make sure everything works as it should:

Prelude> substitute
                [("a", Number 1), ("b", Number 2)]
                (Hole "a" @+ Hole "b")
Op "+" (Number 1.0) (Number 2.0)

Along with the main functionality of substitution, two "helper" functions can be defined:

applyRule :: Rule -> Term -> Term
applyRule rule@(pattern, replacement) term = case match pattern term of
    Nothing -> applyInside (applyRule rule) term
    Just vars -> substitute vars replacement

applyRules :: [Rule] -> Term -> [Term]
applyRules rules term = nub $ map (\r -> applyRule r term) rules

applyRule takes a rule (remember: a mapping from a pattern to a term), and a term, and applies that rule to the term, if possible. If not possible, it will apply the rule inside the term using the functor interface. This is important, as it allows for matching patterns deep inside an expression.

applyRules just runs applyRule for each of the rules in a given list and returns a list of terms corresponding to these applications. nub is also used, to ensure that no duplicates exist.

Applying Transformations Until They Can't Be Applied Any More

There's not a huge amount of use in just applying these transformation rules once. What we really want is a fully expanded expression, meaning the transformations must be applied as many times as possible until they are redundant.

There are two steps to this. First, we can generate an infinite list, where each element contains a number of possible expansions of the input. Each element will be similar to the last, except there may be more possible expansions, meaning some transformations were applied that iteration. This can be implemented like this:

expansionIterations :: [Rule] -> Term -> [[Term]]
expansionIterations rules term = map nub $ iterate (>>= applyRules rules) [term]

It uses the iterate function, starting from a singleton list containing just the input term. The iteration function is the monadic bind function, right-applied to applyRules rules. This, with the fact that considering a list as a monad can be thought of as a non-deterministic computation, means it will generate a list which looks something like:

[term] >>= applyRules rules
[term] >>= applyRules rules >>= applyRules rules
[term] >>= applyRules rules >>= applyRules rules >>= applyRules rules
...

Remember that, for lists, (>>=) = concatMap. This means that what is happening is that, each iteration, all of the rules are applied to all of the elements currently in the list, creating a list of lists, then flattening that list into a new list which will therefore contain all possible expansions of the previous iteration's elements.

Ok, so we have an infinite list which contains all possible expansions. But what happens if we want our computations to finish in some finite kind of timeframe?

Simple, all we have to do is find the first iteration which is identical to the previous one. At that point, no transformations were applied last iteration, and due to the pure nature of Haskell functions, we can be certain that none ever will from thereon. This means that we can snip the list off at that point, because the rest of the list will literally be the same element forever afterwards. This can be done with the following function:

firstIdempotentIteration :: (Eq a) => [a] -> a
firstIdempotentIteration (x:y:ys)
    | x == y = x
    | otherwise = firstIdempotentIteration (y:ys)

The name, firstIdempotentIteration, means "the first iteration which had no effect". The way it works is going through the list, two elements at a time, so each pair (overlapping, due to the y:ys in the iterative call) is checked. If they are equal, one of them (irrelevant which one) is returned, and otherwise the function is recursively called with y:ys, meaning that the next pair will be checked.

This function is generic in a, just because why not, but in reality we will only be using it specialised to [[Term]] -> [Term].

A Nice API

The majority of the CAS is finished, at least for now. Of course there are loads and loads of things which don't work yet, but I'll talk about that later. But the main thing which needs doing right now is making an API to which you can give a set of rules and a term and it will give you back a fully expanded term.

This can be done like this:

expansions :: [Rule] -> Term -> [Term]
expansions rules = firstIdempotentIteration . expansionIterations rules

expand :: [Rule] -> Term -> Term
expand rules = last . expansions rules

The first function, expansions, takes a set of rules and a term and returns all (well, almost all - see conclusion) expansions of that term, simply by composing firstIdempotentIteration with expansionIterations rules.

Then, expand builds on top of that by taking some rules and a term and returning just the most expanded term. It is clear from the earlier code that the last element in the list will be the fully expanded one, so it simply just returns the last element from the full expansion list.

And that's it! You can try it out in ghci if you want, it's quite fun to put it some complicated trig expressions and see what it does with them:

Prelude> expand trigExpansionRules (Call "sin"
            [Variable "x" @+ (Call "cos" [Variable "y" @* Variable "z"])])
Call "sin" [Op "+" (Variable "x") (Call "cos" [Op "*"
    (Variable "y") (Variable "z")])]

Conclusion

So we now have a functional computer algebra system! Well, sort of. There are still actually loads and loads of things left to do:

  1. Make a REPL, involving an expression parser, to make it into a proper program which can be used from somewhere other than ghci.
  2. Only allow a match if a hole always refers to the same value. For example $sin(a + a)$ should not match $sin(1 + 2)$.
  3. Simplification. This is a tricky one, because it's not entirely clear to "simplify" something. What is more simple, $sin(a + b)$ or $sin(a)cos(b) + cos(a)sin(b)$? Therefore, it would probably make a lot more sense to keep simplification separate as lots of different functions for different uses, like $trig\_expand(sin(x + y))$.

It's likely that I will address some, if not all of these in future posts, but for now, perhaps you could have a go at implementing some of these things yourself!

If you want to see the entire CAS in one file, as opposed to the little snippets in this post, have a look at the GitHub page.