Like the childcatcher in Chitty-Chitty-Bang-Bang luring kids into captivity with sweets and toys, recruiters to undergraduate Physics like to fool about with soap bubbles and boomerangs, but when the door clangs shut, it's "Right, children, time to learn about partial differentiation!". Me too. Don't say I didn't warn you.
Here's another warning: the following code needs {-# LANGUAGE KitchenSink #-}
, or rather
{-# LANGUAGE TypeFamilies, FlexibleContexts, TupleSections, GADTs, DataKinds,
TypeOperators, FlexibleInstances, RankNTypes, ScopedTypeVariables,
StandaloneDeriving, UndecidableInstances #-}
in no particular order.
Differentiable functors give comonadic zippers
What is a differentiable functor, anyway?
class (Functor f, Functor (DF f)) => Diff1 f where
type DF f :: * -> *
upF :: ZF f x -> f x
downF :: f x -> f (ZF f x)
aroundF :: ZF f x -> ZF f (ZF f x)
data ZF f x = (:<-:) {cxF :: DF f x, elF :: x}
It's a functor which has a derivative, which is also a functor. The derivative represents a one-hole context for an element. The zipper type ZF f x
represents the pair of a one-hole context and the element in the hole.
The operations for Diff1
describe the kinds of navigation we can do on zippers (without any notion of "leftward" and "rightward", for which see my Clowns and Jokers paper). We can go "upward", reassembling the structure by plugging the element in its hole. We can go "downward", finding every way to visit an element in a give structure: we decorate every element with its context. We can go "around",
taking an existing zipper and decorating each element with its context, so we find all the ways to refocus (and how to keep our current focus).
Now, the type of aroundF
might remind some of you of
class Functor c => Comonad c where
extract :: c x -> x
duplicate :: c x -> c (c x)
and you're right to be reminded! We have, with a hop and a skip,
instance Diff1 f => Functor (ZF f) where
fmap f (df :<-: x) = fmap f df :<-: f x
instance Diff1 f => Comonad (ZF f) where
extract = elF
duplicate = aroundF
and we insist that
extract . duplicate == id
fmap extract . duplicate == id
duplicate . duplicate == fmap duplicate . duplicate
We also need that
fmap extract (downF xs) == xs -- downF decorates the element in position
fmap upF (downF xs) = fmap (const xs) xs -- downF gives the correct context
Polynomial functors are differentiable
Constant functors are differentiable.
data KF a x = KF a
instance Functor (KF a) where
fmap f (KF a) = KF a
instance Diff1 (KF a) where
type DF (KF a) = KF Void
upF (KF w :<-: _) = absurd w
downF (KF a) = KF a
aroundF (KF w :<-: _) = absurd w
There's nowhere to put an element, so it's impossible to form a context. There's nowhere to go upF
or downF
from, and we easily find all none of the ways to go downF
.
The identity functor is differentiable.
data IF x = IF x
instance Functor IF where
fmap f (IF x) = IF (f x)
instance Diff1 IF where
type DF IF = KF ()
upF (KF () :<-: x) = IF x
downF (IF x) = IF (KF () :<-: x)
aroundF z@(KF () :<-: x) = KF () :<-: z
There's one element in a trivial context, downF
finds it, upF
repacks it, and aroundF
can only stay put.
Sum preserves differentiability.
data (f :+: g) x = LF (f x) | RF (g x)
instance (Functor f, Functor g) => Functor (f :+: g) where
fmap h (LF f) = LF (fmap h f)
fmap h (RF g) = RF (fmap h g)
instance (Diff1 f, Diff1 g) => Diff1 (f :+: g) where
type DF (f :+: g) = DF f :+: DF g
upF (LF f' :<-: x) = LF (upF (f' :<-: x))
upF (RF g' :<-: x) = RF (upF (g' :<-: x))
The other bits and pieces are a bit more of a handful. To go downF
, we must go downF
inside the tagged component, then fix up the resulting zippers to show the tag in the context.
downF (LF f) = LF (fmap ( (f' :<-: x) -> LF f' :<-: x) (downF f))
downF (RF g) = RF (fmap ( (g' :<-: x) -> RF g' :<-: x) (downF g))
To go aroundF
, we strip the tag, figure out how to go around the untagged thing, then restore the tag in all the resulting zippers. The element in focus, x
, is replaced by its entire zipper, z
.
aroundF z@(LF f' :<-: (x :: x)) =
LF (fmap ( (f' :<-: x) -> LF f' :<-: x) . cxF $ aroundF (f' :<-: x :: ZF f x))
:<-: z
aroundF z@(RF g' :<-: (x :: x)) =
RF (fmap ( (g' :<-: x) -> RF g' :<-: x) . cxF $ aroundF (g' :<-: x :: ZF g x))
:<-: z
Note that I had to use ScopedTypeVariables
to disambiguate the recursive calls to aroundF
. As a type function, DF
is not injective, so the fact that f' :: D f x
is not enough to force f' :<-: x :: Z f x
.
Product preserves differentiability.
data (f :*: g) x = f x :*: g x
instance (Functor f, Functor g) => Functor (f :*: g) where
fmap h (f :*: g) = fmap h f :*: fmap h g
To focus on an element in a pair, you either focus on the left and leave the right alone, or vice versa. Leibniz's famous product rule corresponds to a simple spatial intuition!
instance (Diff1 f, Diff1 g) => Diff1 (f :*: g) where
type DF (f :*: g) = (DF f :*: g) :+: (f :*: DF g)
upF (LF (f' :*: g) :<-: x) = upF (f' :<-: x) :*: g
upF (RF (f :*: g') :<-: x) = f :*: upF (g' :<-: x)
Now, downF
works similarly to the way it did for sums, except that we have to fix up the zipper context not only with a tag (to show which way we went) but also with the untouched other component.
downF (f :*: g)
= fmap ( (f' :<-: x) -> LF (f' :*: g) :<-: x) (downF f)
:*: fmap ( (g' :<-: x) -> RF (f :*: g') :<-: x) (downF g)
But aroundF
is a massive bag of laughs. Whichever side we are currently visiting, we have two choices:
- Move
aroundF
on that side.
- Move
upF
out of that side and downF
into the other side.
Each case requires us to make use of the operations for the substructure, then fix up contexts.
aroundF z@(LF (f' :*: g) :<-: (x :: x)) =
LF (fmap ( (f' :<-: x) -> LF (f' :*: g) :<-: x)
(cxF $ aroundF (f' :<-: x :: ZF f x))
:*: fmap ( (g' :<-: x) -> RF (f :*: g') :<-: x) (downF g))
:<-: z
where f = upF (f' :<-: x)
aroundF z@(RF (f :*: g') :<-: (x :: x)) =
RF (fmap ( (f' :<-: x) -> LF (f' :*: g) :<-: x) (downF f) :*:
fmap ( (g' :<-: x) -> RF (f :*: g') :<-: x)
(cxF $ aroundF (g' :<-: x :: ZF g x)))
:<-: z
where g = upF (g' :<-: x)
Phew! The polynomials are all differentiable, and thus give us comonads.
Hmm. It's all a bit abstract. So I added deriving Show
everywhere I could, and threw in
deriving instance (Show (DF f x), Show x) => Show (ZF f x)
which allowed the following interaction (tidied up by hand)
> downF (IF 1 :*: IF 2)
IF (LF (KF () :*: IF 2) :<-: 1) :*: IF (RF (IF 1 :*: KF ()) :<-: 2)
> fmap aroundF it
IF (LF (KF () :*: IF (RF (IF 1 :*: KF ()) :<-: 2)) :<-: (LF (KF () :*: IF 2) :<-: 1))
:*:
IF (RF (IF (LF (KF () :*: IF 2) :<-: 1) :*: KF ()) :<-: (RF (IF 1 :*: KF ()) :<-: 2))
Exercise Show that the composition of differentiable functors is differentiable, using the chain rule.
Sweet! Can we go home now? Of course not. We haven't differentiated any recursive structures yet.
Making recursive functors from bifunctors
A Bifunctor
, as the existing literature on datatype generic programming (see work by Patrik Jansson and Johan Jeuring, or excellent lecture notes by Jeremy Gibbons) explains at length is a type constructor with two parameters, corresponding to two sorts of substructure. We should be able to "map" both.
class Bifunctor b where
bimap :: (x -> x') -> (y -> y') -> b x y -> b x' y'
We can use Bifunctor
s to give the node structure of recursive containers. Each node has subnodes and elements. These can just be the two sorts of substructure.
data Mu b y = In (b (Mu b y) y)
See? We "tie the recursive knot" in b
's first argument, and keep the parameter y
in its second. Accordingly, we obtain once for all
instance Bifunctor b => Functor (Mu b) where
fmap f (In b) = In (bimap (fmap f) f b)
To use this, we'll need a kit of Bifunctor
instances.
The Bifunctor Kit
Constants are bifunctorial.
newtype K a x y = K a
instance Bifunctor (K a) where
bimap f g (K a) = K a
You can tell I wrote this bit first, because the identifiers are shorter, but that's good because the code is longer.
Variables are bifunctorial.
We need the bifunctors corresponding to one parameter or the other, so I made a datatype to distinguish them, then defined a suitable GADT.
data Var = X | Y
data V :: Var -> * -> * -> * where
XX :: x -> V X x y
YY :: y -> V Y x y
That makes V X x y
a copy of x
and V Y x y
a copy of y
. Accordingly
instance Bifunctor (V v) where
bimap f g (XX x) = XX (f x)
bimap f g (YY y) = YY (g y)
Sums and Products of bifunctors are bifunctors
data (:++:) f g x y = L (f x y) | R (g x y) deriving Show
instance (Bifunctor b, Bifunctor c) => Bifunctor (b :++: c) where
bimap f g (L b) = L (bimap f g b)
bimap f g (R b) = R (bimap f g b)
data (:**:) f g x y = f x y :**: g x y deriving Show
instance (Bifunctor b, Bifunctor c) => Bifunctor (b :**: c) where
bimap f g (b :**: c) = bimap f g b :**: bimap f g c
So far, so boilerplate, but now we can define things like
List = Mu (K () :++: (V Y :**: V X))
Bin = Mu (V Y :**: (K () :++: (V X :**: V X)))
If you want to use these types for actual data and not go blind in the pointilliste tradition of Georges Seurat, use pattern synonyms.
But what of zippers? How shall we show that Mu b
is differentiable? We shall need to show that b
is differentiable in both variables. Clang! It's time to learn about partial differentiation.
Partial derivatives of bifunctors
Because we have two variables, we shall need to be able to talk about