In my last post, I showed you how you can make a very simplistic computer algebra system which can expand trigonometric expressions (but is trivially expandable to support whichever expressions you like.) It’s probably worth reading that before reading this post, because otherwise it might not make a lot of sense.

The thing was, there were some pretty severe limitations to what it could do. I’ll address one of these in this post and show you how you could ammend that.

# Match-variable Consistency

Suppose you have the transformation rule,

\[ sin^2(x) + cos^2(x) \rightarrow 1 \]

and you enter this into the program and tell it to expand the expression

\[ sin^2(\pi) + cos^2(\pi) \]

You would expect that it is “expanded” into \(1\), and indeed that is what happens. But, what if you ask it to expand this?

\[ sin^2(\pi) + cos^2(2\pi) \]

This clearly shouldn’t match (because \(\pi \neq 2\pi\)), but in its current state, the program will happily tell you that it too is equal to \(1\). The reason for this becomes obvious when you take a look at the current definition of `match`

, which if you recall is the function which is in charge of taking a pattern (the left-hand-side of a rule) and checking if it matches another expression, while simultaneously returning an associative list of the matched “holes” and their corresponding values.

To clarify, suppose it was given the rule I described earlier, \(sin^2(x) + cos^2(x) \rightarrow 1\). The pattern would be \(sin^2(x) + cos^2(x)\), with two holes. The holes denote a variable-like expression - \(x\), in this case - which can be substituted for any other value. This value which is substituted for it in the expression will be considered the hole’s “value”. I hope that makes sense, but if not, have a quick read of the `match`

function section in my last post.

Anyway, this is its implementation at the moment:

```
match :: Pattern -> Term -> Maybe Match
match (Hole x) term = Just [(x, term)]
match (Op patternOp patternL patternR) (Op termOp termL termR)
| patternOp /= termOp = Nothing
| otherwise = (++) <$> (match patternL termL) <*> (match patternR termR)
match (Prefix patternOp patternX) (Prefix termOp termX)
| patternOp /= termOp = Nothing
| otherwise = match patternX termX
match (Call patternFn patternArgs) (Call termFn termArgs)
| patternFn /= termFn = Nothing
| length patternArgs /= length termArgs = Nothing
| otherwise = concat <$> sequence (zipWith match patternArgs termArgs)
match a b = if a == b then Just [] else Nothing
```

The first clause, when the pattern is a `Hole`

type, is where the problem lies. It never checks if a hole is already defined, and as a result, both values will be added to the associative list. This is not a problem if they share the same value, but if they are different, the pattern should not match, as demonstrated above.

I can think of two ways to solve this problem - one is relatively easy and requires a small amount of changed code, and the other is more complicated and, in a way, more clever. Here they are, see if you can guess which is which:

After the match has finished, check the returned associative list element-by-element to ensure that it is consistent. This can be done by going through the list and, for each element, checking if its value is equal to the first entry of the same key in the match result.

A solution involving a new definition of

`Match`

, semigroups, and other things.- Redefining
`Match`

as an ADT with two constructors, one producing an associative list like what a`Just Match`

currently is, and one representing a failed match, similar to`Nothing`

. - Making
`Match`

a semigroup, where the operator is defined such that:

`Match a <> Match b = if conflict a b then NoMatch else Match (a <> b)`

`MoMatch`

otherwise

- Using \(\diamond\) instead of
`++`

in the definition of`match`

to preserve consistency when building results.

- Redefining

So the first one’s a lot simpler and easier but as well as sounding more clever the second option does have certain advantages in terms of efficiency, since it can stop running early when a conflict is found. I’ve decided to show how both of them could be done here, for fun.

## The First One

Here’s my implementation of the first solution. Relatively straightforward changes, as expected.

```
match' :: Pattern -> Term -> Maybe Match
match' (Hole x) term = Just [(x, term)]
match' (Op patternOp patternL patternR) (Op termOp termL termR)
| patternOp /= termOp = Nothing
| otherwise = (++) <$> (match' patternL termL) <*> (match' patternR termR)
match' (Prefix patternOp patternX) (Prefix termOp termX)
| patternOp /= termOp = Nothing
| otherwise = match' patternX termX
match' (Call patternFn patternArgs) (Call termFn termArgs)
| patternFn /= termFn = Nothing
| length patternArgs /= length termArgs = Nothing
| otherwise = concat <$> sequence (zipWith match' patternArgs termArgs)
match' a b = if a == b then Just [] else Nothing
match :: Pattern -> Term -> Maybe Match
match pattern term = case match' pattern term of
Nothing -> Nothing
Just vars -> if consistent vars then Just vars else Nothing
consistent :: Match -> Bool
consistent ms = all check ms
where check (name, val) = (fromJust $ lookup name ms) == val
```

I first renamed `match`

to `match'`

, and replaced all the instances of it inside itself with the new name too. Then, everything works the same, but it leaves the name `match`

for another function. I defined `match`

such that, at an abstract level, it just calls `match'`

and checks for consistency before returning it again. If there is inconsistency, `Nothing`

will be returned instead, as if the match was unsuccessful.

To do this, another function, `consistent`

, had to be defined. It checks if a given match result is consistent with itself, by checking, for all elements, if the value of that element is the same as the value of the first element in the list with the same hole name. This may not be an optimal solution, since it has a complexity of \(O(n^2)\), and to be honest something could probably be done quite easily with a hash-set to get it down to \(O(n)\), but since there will generally be less than, say, 10 match results, this won’t matter.

And with that, the problem is fixed! Nothing else in the code needs to be fixed, because the type signature of `match`

hasn’t changed.

## The Other Way

So going back to the state of the program, I wanted to try the semigroup method because it seems a lot nicer. Also, I figure, the code will be a lot more readable - or, at least the parts that matter should be. This is what needs doing.

- Redefine
`Match`

as an ADT with two options. The first option is`Match`

which takes one parameter - a`Map String Term`

. The other option is`NoMatch`

, which represents a failed match. This type is actually isomorphic to`Maybe Match`

(using the old definition of`Match`

.) - Despite being isomorphic, it is useful to define it as a new type because it allows us to give it a
`Semigroup`

instance. Then, the`<>`

semigroup operator concatenates two matches as you’d expect but if it finds any conflicts then it will return`NoMatch`

. - Rewrite a load of other code to make it work with the new
`Match`

type.

Now let’s get started on the implementation.

First off, we need to import `Data.Map.Strict`

. Plain old `Data.Map`

could also be used, but the strict version is probably more efficient in this situation.

The new definition of `Match`

then looks like

Pretty simple. The first constructor contains a mapping from strings to terms, just like the associative list used to do. The second constructors takes no arguments and represents a failed match. This type derives `Show`

and `Eq`

, which we’ll need to use later. (Though, `Show`

isn’t strictly necessary, but it’s useful for debugging in ghci.)

Then, the semigroup instance of `Match`

can be defined like so:

```
instance Semigroup Match where
-- It can be assumed, when joining two matches, that each individual
-- map is already consistent with itself. This should be the case when
-- a match is generated from the `match` function.
Match a <> Match b
| consistent = Match (a <> b)
| otherwise = NoMatch
where consistent = all checkConsistency (M.toList b)
checkConsistency (k, v) = case M.lookup k a of
Just prev -> v == prev
Nothing -> True
_ <> _ = NoMatch
```

The first clause basically says that to append two matches, you must first check if they are consistent with each other. If they are, then the two maps can be appended with the semigroup `<>`

operator and then wrapped in a `Match`

. If they aren’t consistent, though, `NoMatch`

will be returned to signify the fact that the pattern and the expression are not compatible because they have conflicting hole values.

The consistency of a single element in `b`

can be confirmed by showing that there is no key-value pair in `a`

which conflicts with it. The consistency of `b`

, then, can be confirmed by showing that all elements of it are consistent. This also implicitly confirms the consistency of `a`

- it should be clear that `b`

being consistent with respect to `a`

implies the fact that `a`

is also consistent with respect to `b`

.

The second clause says that if two matches are appended and either one of them (or both) is not a `Match`

(and thus is a `NoMatch`

), then the result is just `NoMatch`

.

We can confirm that this works by opening up the file in ghci and trying a few expressions against it:

```
ghci> Match (M.fromList [("a", Number 1)])
<> Match (M.fromList [("b", Number 2)])
Match {getMatch = fromList [("a",Number 1.0),("b",Number 2.0)]}
ghci> Match (M.fromList [("a", Number 1)])
<> Match (M.fromList [("b", Number 2)])
<> Match (M.fromList [("a", Number 1)])
Match {getMatch = fromList [("a",Number 1.0),("b",Number 2.0)]}
ghci> Match (M.fromList [("a", Number 1)])
<> Match (M.fromList [("b", Number 2)])
<> Match (M.fromList [("a", Number 2)])
NoMatch
```

The first example shows the basic fact that two matches can be successfuly appended. The second example shows that if two values of the same value are present but are consistent with each other, it still matches. The third examples shows that if two conflicting values are present, `NoMatch`

will be returned.

With the new definition of `Match`

, a few other things will have to change - but don’t worry, they will actually become simpler since it encapsulates the idea of a value *maybe* existing with its actual value in one type.

This is the new definition of `match`

. Note that the type signature has changed: the return type is now `Match`

instead of `Maybe Match`

.

```
match :: Pattern -> Term -> Match
match (Hole x) term = Match $ M.fromList [(x, term)]
match (Op patternOp patternL patternR) (Op termOp termL termR)
| patternOp /= termOp = NoMatch
| otherwise = (match patternL termL) <> (match patternR termR)
match (Prefix patternOp patternX) (Prefix termOp termX)
| patternOp /= termOp = NoMatch
| otherwise = match patternX termX
match (Call patternFn patternArgs) (Call termFn termArgs)
| patternFn /= termFn = NoMatch
| length patternArgs /= length termArgs = NoMatch
| otherwise = foldl (<>) (Match M.empty) (zipWith match patternArgs termArgs)
match a b = if a == b then Match M.empty else NoMatch
```

The changes are small. Notably:

- The applicative style is no longer used to join together two matches (e.g. on line 5), because the matches are no longer wrapped up in
`Maybe`

s. - Where
`Just x`

was used before,`Match $ M.fromList x`

is now used. - Where
`Nothing`

was returned previously,`NoMatch`

is now returned.

We’ll also need to make a couple of other tiny changes to the `substitute`

and `applyRule`

functions:

```
substitute :: Map String Term -> Term -> Term
substitute vars term@(Hole x) = fromMaybe term (M.lookup x vars)
substitute vars term = applyInside (substitute vars) term
applyRule :: Rule -> Term -> Term
applyRule rule@(pattern, replacement) term = case match pattern term of
NoMatch -> applyInside (applyRule rule) term
Match vars -> substitute vars replacement
```

The change in `substitute`

is that the first parameter, which used to be a `Match`

, is now a `Map String Term`

. It can’t remain a `Match`

type, because that would allow it to be `NoMatch`

, and then substitute wouldn’t be able to do anything. Since we know, at the time when `substitute`

is called, that the match has succeeded, we can just take the string-term map out of the match and feed it straight to the function. Due to this change, `M.lookup`

has to be used instead of the regular `lookup`

for associative lists.

The changes in `applyRule`

are even simpler - literally, `Nothing`

is changed to `NoMatch`

and `Just`

is changed to `Match`

.

And that’s it! It’s a bit more long winded than the other solution (about 10 more lines of code), but I would say that it is much *nicer*, and I’m sure that you’d agree.

# Conclusion

Now we have a *more* functional CAS than at the end of my last post, so that’s cool. There are a few more problems, so I guess I’ll fix them at some point in the future. I reckon the next thing I’ll do is a parser so I don’t have to type out the ridiculously verbose constructors to make even simple expressions, which will help with writing rule sets as well as actually using the system (from a REPL, most likely).

At that point, it would also be trivial to load rules from files, which would be a nice little addition too.

And again, you can find the full source code on my GitHub.