Haskell: Indexed Recursion Schemes
Finding your way back after a recursive descent into madness
Indexing schemes
Today, we are going on a deep dive to show how recursion schemes may be extended by the addition of indexing. Last time, we defined over 30 related functions, and this time we're making indexed variants of all of them, so at least 30 more, plus any supporting functions we might need. It could be a descent into a labyrinth of madness - but don't worry, we will be smart as Theseus, with our own version of Ariadne's golden thread. No matter how deep we go, indices will guide us on our way back.
In a previous article, we developed version
0.0.1
of a birecursion-schemes package, focused on 5 core recursive functions:cata
,ana
,hylo
,iso
, andtrans
, plus their-M
monadic variants. For each, we then derived their birecursivebi-
and direcursivedi-
variants. This library has been updated to0.0.2
following along with today's article.
As we go exploring, we are going to do our best to provide examples to illustrate things, so you should get comfortable and have a snack handy - it's going to be a long one.
As always, we start by asking questions.
What's in an index?
That which we call a rose, by any other index would smell as sweet.
William Shakespeare, almost
So what exactly is an index? As always, etymology is our first resort. The word 'index' (plural 'indices') has its origins in latin, and its meaning remains largely unchanged: literally 'forefinger' as in your index or pointer finger, but also abstractly as 'sign' or 'pointer' or 'indicator'. It is derived from the verb 'indico': prefix 'in-', meaning 'towards', followed by 'dico', meaning 'to point out, make known, set apart'. Note the relationship between 'indices' and 'indico' and 'indicate'
Taken all together, an 'index' is a thing that points you in the right direction to where something can be made known: it guides you in a traversal of knowledge (that's quite the clue as to what we're going to be getting into). This is largely the same as its modern usage, even for niche-specific variants, a bit of linguistic preservation. We have been using indices for millenia to look up more specific data, from catalogues in roman libraries, to the the modern dewey decimal system, and now we are going to use indices to find our way within data structures.
Now, we could just say that an index tells us how to access a element of that data structure, and be done with it - but that isn't the whole story. You see, indexing can be a muddy affair if we are not precise with our terminology - sometimes you have an index
like with Vector
, other times a key
as in Map
, still other times a path
like Json
.
These terms all have colloquial use, but we're going to have to precisely define them - taking into account things like that an index may point to a child of the same type in a recursive structure, or it may point to content of a different type in a container - and that recursive containers have both children and content? It all seems to depend on how and what you are using indexing for - a change of perspective results in a change of emphasis, after all. Given that we're going to be mixing indexing with recursion, it's going to to pay off to be precise, otherwise we're going to get lost in the labyrinth.
Another concern is that sometimes the index is obviously related to the shape of the data structure - a Vector
is indexed by integers that tell you exactly where each element is - but other times, the index is not obviously related to the shape of the data structure at all, and instead it functions more as an arbitrary label - a Map k
is seemingly indexed by whatever you want. We have, seemingly in contention, two different perspectives of what an index is, and our explanation has to handle both.
This last point is interesting enough that it marks our starting point, and by clarifying that last point, we may begin our descent. Our first step is to split the concept of an index into two different groups - structural indices, and profunctoral keys.
Structural indices
Sometimes an index tells you exactly where to go.
In this case, an index is in some sense 'a reduction of the rest of the data structure' - sufficient instructions as to navigate quickly and efficiently to the desired value. The book is the fifth from the right, on the top shelf of the third row. We will call this the 'structural' perspective of indices.
It is not an alien concept, with Connor McBride having written the rather relevant The Derivative of a Regular Type is its Type of One-Hole Contexts. This term 'one-hole context' is an elegant way to refer to 'the rest of the data structure', and so we will adopt the terminology for that reason.
A structural index could be considered the minimal / forgetful 'one-hole context' , with all extraneous data not leading to the hole removed / replaced by ()
- aka Free StructureF ()
where everything not on 'the path' is implicitly ()
. The resulting structure is a spine or path that goes all the way from the hole back up to the top of the original data structure, and contains just enough information to navigate back down to it. I haven't made such lovely graphs as the one-hole-context paper yet, so you'll have to imagine tracing a line from the hole back to the top of the data structure, and clipping off everything else.
The most clear example is the simple List
indexed by Nat
.
data List elem = Nil | Cons elem (List elem)
data Nat = Zero | Succ Nat
-- The index of 'c' is SSZ = 2
abc = Cons 'a' (Cons 'b' (Cons 'c' Nil))
-- Cons () (Cons () Nil)
-- Succ (Succ Zero)
We can clearly see that a Nat
is a List
that has forgotten every element before the one that we are indexing, but also every child list after. This explains why Nat
is isomorphic to [()]
, and why Succ (Succ Zero) == [Unit,Unit]
. Note that we should be able to define such an index for just about any sensible sum of products type.
However, we caution the reader to observe carefully that the derivation is not always so simple - for instance, Zero
does not correspond to Nil
but rather to the singular elem
position within Cons
. Some data structures have multiple functor content points. Take, for instance, a Bilist
which contains two elements per Bicons
.
Note that rather than explicitly fill with ()
, we'll just ignore whatever's there with a blank _
since it's a little easier on the eyes.
data Bit = Zit | Bit -- NOTE: Bit = Biunit aka Bool
data Binat = Bizero Bit | Bisucc Binat
data Bilist a = Binil | Bicons a a (Bilist a)
-- The index of 'c' is SZ(0), and the index of 'd' is SZ(1).
abcdef = Bicons 'a' 'b' (Bicons 'c' 'd' (Bicons 'e' 'f' Binil))
-- Bicons _ _ (Bicons 'c' _ _) -- 'c'
-- Bisucc (Bizero Zit) -- SZ(0)
-- Bicons _ _ (Bicons _ 'd' _) -- 'd'
-- Bisucc (Bizero Bit) -- SZ(1)
Another pitfall is multiple points of recursion - this also affects how we derive our structural index:
data Dir = Left | Right -- NOTE: Dir == Bit
data Tree a = Tip | Node a (Tree a) (Tree a)
data Branch = Leaf | Limb Dir Branch -- NOTE: is just [Dir] or `([Dir], Unit) in disguise
-- The index of 'c' is `[Right]` or `Limb Right Leaf`
treeabc = Node 'a' (Node 'b' Tip Tip) (Node 'c' Tip Tip)
-- Node _ _ (Node 'c' _ _)
-- Limb Right Leaf
In fact we can see both interacting with a tree that contains both two content points and two recursion points.
data Bitree a = Bitip | Binode a a (Bitree a) (Bitree a)
data Bibranch = Bileaf Bit | Bilimb Dir Bibranch
-- The index of 'c' is `([Left], Zit)` or `Bilimb Left (Bileaf Zit)`
bitreeabcdef = Binode 'a' 'b' (Binode 'c' 'd' Bitip Bitip) (Binode 'e' 'f' Bitip Bitip)
-- Binode _ _ (Binode 'c' _ _ _ ) _
-- Bilimb Left (Bileaf Zit)
Okay, something is funny - we started with the statement that an index is based on the structure of the original, and yet the index of a Bitree is list-like! In fact, they're all list-like!
It should be obvious now that indices are related to the form data Way path end = End end | Path path (Way path end)
- and that the index of List
should actually be Way Unit Unit
because it only has one recursion point (hence the first Unit
) and also only one element slot (hence the second). A Bilist
has index Way Unit Bit
, a Tree
has index Way Dir Unit
, and a Bitree
has index Way Dir Bit
. The Bitree
case is especially illuminating because it shows that indexes always have zero or more "branches", followed by exactly one "leaf" - this exactly matches our examples. We can even use Way Void end
to index data structures that explicitly aren't recursive!
What gives!? Well, we've traded off "replacing irrelevant data points with blanks" for "counting the number of positions filled with blanks before our data" It just so happens that with List
, the two concepts coincide - because it has both one content slot, and one recursion point.
Way
is still a reduction of the one-hole context - the rest of the data structure. path
is a reduction of the recursive branches taken, and end
is a reduction of the final branch containing the indexed value. This means that for our purposes, Way
is just a different way of representing Free StructureF ()
(again, with the constraint that everything off-path is replaced by ()
or _
) - one is efficient in the metric of 'shaped like the original data structure' and the other is efficient in the metric of 'has a compact representation'.
This 'filling blanks' vs 'counting blanks' doesn't matter, because we can always go back and forth. We can always count filled blanks, and having a count always tells us how many blanks to fill, and everything after the indexed element is either truncated or ignored.
NOTE: When it comes to an index for a single selected element, we can either fill blanks, or count blanks and it doesn't matter - but there are cases where it may start to matter for efficiency 'how' we reduce the rest of the data structure to an index. When it comes to distributed data and distributed indices, blanks are filled with
proofs
instead of()
, and the difference betweenindex
anddata
becomes more like 'how much do you know'. There is a lot to think about.
For the majority of the article, we will focus on structural indices, as they form the foundation of today's work, and everything we just discussed will come in handy. But first, let us discuss for a moment the other perspective.
Profunctoral keys
Sometimes an index is a label that helps you search, because anything that you can turn into an label, is also label, and thus an index.
Rather than telling you exactly where something is, this type of index helps narrow it down, often in steps. The book is in the fiction section, between "Green" and "Grishom". Note that while structural indices are concrete, profunctoral keys are abstracted. This allows for indexing collections that involve factors like balancing - a profunctoral key doesn't tell you exactly where the book is, because that information is specific to a given library layout. If the library were to reorganize itself, a structural index would have to change - but a profunctoral key would not have to.
This makes profunctoral keys great for organized piles where you don't always know where things are, but you do know where things aren't - and when you know that 'all other values are somewhere else', you then know that 'if there is any value here, it is the one that I seek'.
We call it profunctoral in reference to:
class Profunctor (f :: * -> * -> *) where
promap :: (a -> b) -> (c -> d) -> p b c -> p a d -- aka Profunctor.dimap
premap :: (a -> b) -> p b c -> p a c -- aka Profunctor.lmap
postmap :: (b -> c) -> p a b -> p a c -- aka Profunctor.rmap
Specifically, we can use premap / lmap
to change the type of a key in data structures like Map k
, eg if you have a f :: k2 -> k
you can turn the m :: Map k
into a m2 :: Map k2
, because you can on-the-fly turn all k2
s into k
s immediately before using them using m2 = premap f m
.
Profunctoral keys often work by offloading some of the organizational logic to a constraint, such Ord
, which is why Map
requires such a constraint - it is masking that, inside, it turns k
into something else to navigate (like an Int
). We can even go as far as to use just an Eq
constraint - if Ord k => Map k a
is an organized pile, then Eq k => AssocList k a
is an unorganized pile, and yet you can still, albeit with a loss of efficiency. If a structural index gets its organization factor from the data structure, a profunctoral index gets its organization from the key.
An appropriate example of a profunctoral key might be IntMap
, where the true index isn't Int
but rather (Min Int, Max Int)
, because we start at the top with an index of (-Infinity, Infinity)
and narrow it down via (min,max)
as we descend recursively - our initial "key" gives us one of our bounds, and which branch we take tells us the other. We reduce (min,max)
until we have min == max
because that is where our value is located. It is only when we consider the entire recursive functor head structure of the IntMap that the index sums up the steps that we took to become an Int
.
Profunctoral keys afford a great deal of flexibility and convenience, because they allow you to ignore the precise structure required to reach an element. This in turn mean that the keys can have other desirable properties such as being able to survive edits of a data structure due to the insertion or deletion of other elements, or balancing of the data structure - the same sort of indirection that makes pointers useful in C.
The abstraction of profunctoral keys is not free, however - it must necessarily build off of the structural indices, despite the two seemingly contradictory perspectives. Do not worry - at the end of this article, we will explain this contradiction and reveal how the profunctoral keys have been hidden inside the entire time.
Existing packages
With all of that in mind, we should take a quick look at the existing ecosystem - it will help us decide on how to go about implementing this sort of logic. At this point, we have enough information to more clearly search for existing libraries, modules, and classes - and we find ourselves a scattered but mostly-consistent ecosystem. This could all use a good deal of unification, but for now let us see what we have discovered.
In no particular order, we have:
containers & vector
Several commonly-used modules (such as Map
and Vector
) have functions specific to their data structures.
Vector.imap :: (Int -> a -> b) -> Vector a -> Vector b
Map.mapWithKey :: (k -> a -> b) -> Map k a -> Map k b
If we generalize, we get imap :: (i -> a -> b) -> f a -> f b
, which is a function type that we can use in a class definition.
lens
The lens
package provides a class for indexed functors, using FunctionalDependencies
:
-- lens: Control.Lens.Combinators
class Functor f => FunctorWithIndex i (f :: Type -> Type) | f -> i where
imap :: (i -> a -> b) -> f a -> f b
indexed-traversable
The indexed-traversable
package has a practically identical definition - unsurprising as it was extracted from lens
in the first place.
-- indexed-traversable: Data.Functor.WithIndex
class Functor f => FunctorWithIndex i f | f -> i where
imap :: (i -> a -> b) -> f a -> f b
Notably, it also contains FoldableWithIndex
and TraversableWithIndex
- we'll keep these in mind.
keys
The keys
package is a package that uses TypeFamilies
instead of FunctionalDependencies
. However, it is very minimal, and also deprecated.
-- keys: Data.Key
type family Key (f :: * -> *)
class Functor f => Keyed f where
mapWithKey :: (Key f -> a -> b) -> f a -> f b
mono-traversable-keys
An honorable mention is mono-traversable-keys
, which is keys
for monomorphic functors. It is worth noting for inclusion, but for now we're not going to be exploring monomorphic functors, even though they do relate to recursives through a recursive structure being a monomorphic functor where its elements are the same type as itself.
-- mono-traversable-keys: Data.MonoTraversable.Keys
type family MonoKey key
class MonoFunctor mono => MonoKeyed mono where
omapWithKey :: (MonoKey mono -> Element mono -> Element mono) -> mono -> mono
base
Another honorable mention is Data.Ix
in base
. It isn't exactly what we want but we can sense a relation because the class is used for indexing contiguous arrays while still allowing the index type to be abstracted to anything orderable + rangeable.
-- base: Data.Ix
class Ord a => Ix a where
range :: (a, a) -> [a]
index :: (a, a) -> a -> Int
inRange :: (a, a) -> a -> Bool
Remember IntMap
secretly operating over (min,max)
rather than Int
? That notion is expressed here in class form.
adjunctions
Finally, we have a really interesting class found in adjunctions
- we really won't get anywhere near this for a while, but nonetheless:
-- adjunctions: Data.Functor.Rep
class Distributive f => Representable f where
type Rep f :: *
tabulate :: (Rep f -> a) -> f a
index :: f a -> Rep f -> a
We'll find Representable
to be the subject of future discussion, but I have not explored the relation deep enough to say anything of depth.
Kleisli arrows of outrageous fortune
Beyond that, there is the paper Kleisli arrows of outrageous fortune by Connor McBride (again) involving Atkey-style indexed / parameterized functor, which will be relevant if we wish to generalize from type-specific indices and paths, to a parametric / type argument index. However, in keeping with 'start simple', we'll get to that when we get to it.
recursion-schemes-ix
Finally there is already technically a library for indexed recursion schemes but it uses singletons and is a bit too abstracted for my taste - remember, we don't want to get lost in the madness, and I'm not even sure that it does everything I need. I'm looking for a solution that provides clarity and more compatibility with the way that existing indexed functor packages work, in order to aid in our exploration of this space.
Indexed functors and foldables
With examples of the existing ecosystem fresh in our minds, we are now going to define our own classes for indexing, based on what we just covered and also the opinions developed during use.
Our core function will be of the form
imap :: (i -> a -> b) -> f a -> f b
.In keeping with our existing use of type families for
Base
, we will opt to also use type families forIndex
instead offunctional dependencies
.We will be providing separate classes for
Indexed
,IndexedFunctor
,IndexedFoldable
, andIndexedTraversable
Indexed
and indexed
We start with the Index
type, and its accompanying class, Indexed
which provides the function indexed :: f a -> f (Index f, a)
, which tuples up content with its index.
type family Index (f :: * -> *)
class Indexed f where
indexed :: f a -> f (Index f, a)
default indexed :: (IndexedFunctor f) => f a -> f (Index f, a)
indexed = imap (,)
All that indexed
does is annotate an element with its index - this is rarely the function that we want to use, but it is good to have for the sake of a proper hierarchy.
Note that there is a default implementation for indexed
, which relies on our next class, IndexedFunctor
. We have opted however to keep the two classes separate, and use Indexed
as a superclass for IndexedFunctor
.
IndexedFunctor
and imap
Our next class, IndexedFunctor
, combines both Indexed
and Functor
to provide the function imap :: (Index f -> a -> b) -> f a -> f b
. The difference between imap
and fmap
is that we now get access to the index
of a value while we are mapping.
class (Indexed f, Functor f) => IndexedFunctor f where
imap :: (Index f -> a -> b) -> f a -> f b
imap f = fmap (uncurry f) . indexed
Note that imap
also has a default implementation - and the default implementations of indexed
and imap
rely on each other. This means that we must actually implement at least one of them, preferably imap
, for reasons that will be explained later.
IndexedFoldable
and ifoldMap
Now, you may be asking, why have we declared Indexed
and IndexedFunctor
separately? It is because the next class, IndexedFoldable
, does not rely on Functor
. Instead, it combines both Indexed
and Foldable
to provide the function ifoldMap :: (Monoid m) => (Index f -> a -> m) -> f a -> m
. Like imap
, with ifoldMap
we now get access to the index
of a value while we are folding.
class (Indexed f, Foldable f) => IndexedFoldable f where
ifoldMap :: (Monoid m) => (Index f -> a -> m) -> f a -> m
ifoldMap f = foldMap (uncurry f) . indexed
I do not think it is possible to define a default implementation of indexed
in terms of ifoldMap
(as we did with imap
), even if we wanted - we would need fromList
or some other way of constructing the specific type, which is not captured by Foldable
.
In the name of preserving our immediate sanity, we have opted to leave defining ifoldr
and other indexed foldable functions for later, as they are beyond the necessary scope of this already-overburdened article.
IndexedTraversable
and itraverse
Finally, to complete the trio, we of course have IndexedTraversable
. Just like Functor
and Foldable
, if something is both an IndexedFunctor
and an IndexedFoldable
, it should probably also be an IndexedTraversable
(and vice versa), even though it only actually requires Indexed
and Traversable
constraints.
class (Indexed t, Traversable t) => IndexedTraversable t where
itraverse :: Applicative f => (Index t -> a -> f b) -> t a -> f (t b)
itraverse f = traverse (uncurry f) . indexed
NOTE: The class constraints here should possibly be
(IndexedFunctor t, IndexedFoldable t, Traversable t)
instead - we're still figuring out exactly how weak / strong we want these connections to be.
Ditto on leaving imapM
and friends undefined for now.
Notes on default implementations
Note that we can choose to implement indexed
, and get imap
, ifoldMap
, and itraverse
for free; however, it is strongly preferred that indexed
use the default implemention from imap
, and that we implement imap
, ifoldMap
, and itraverse
, in order to only perform one map / traversal of the structure and fuse away the default indexed
's inner fmap
. We need this fusing not just for efficiency, but also out of concern for the need to avoid fetching the entire data structure to index-map it all at once - this will become more important later.
Time for a break
Go take a break, stretch your legs or something. You've earned it, having survived the theoretical and philosophical discussion. All of the stuff above, is more or less known, but we're about to cross into the unknown, so you'll want your energy.
Indexed recursives
With indexed functors out of the way, it seems now we should try and take the next logical step, and define... indexed recursives? Sure, lets take a stab at that and see what we get. For perspective, remember that the default implementation of imap
was fmap (uncurry f) . indexed
, so we can work off of that.
For our first attempt, we might be tempted to try and define indexed versions of hylo and friends:
icata :: (Recursive a, Indexed (Base a)) => (Base a (Index (Base a), b) -> b) -> a -> b
icata alg = cata (alg . indexed)
ihylo :: (Indexed f, Functor f) => (f (Index f, b) -> b) -> (a -> f a) -> a -> b
ihylo alg = hylo (alg . indexed)
However, this definition uses indexed
rather than imap
, which is preferred due to our pattern of extending hylo
by adding new capabilities to fmap
. We can fuse the indexed
into the hylo
by implementing it using imap
instead of fmap
, at the cost of the constraint now requiring IndexedFunctor
:
icata :: (Recursive a, IndexedFunctor (Base a)) => (Base a (Index (Base a), b) -> b) -> a -> b
icata alg = ihylo alg project
ihylo :: IndexedFunctor f => (f (Index f, b) -> b) -> (a -> f a) -> a -> b
ihylo alg coalg = h where h = alg . imap (curry (second h)) . coalg
So we have a few functions that seem sensible - but if we take a closer look, and one finds that this lacks our desired symmetry.
Note that the type definition requires a tuple (Index f, b)
rather than a curried function such as Index f -> f b -> b
- and that the index is entirely absent on the ana / coalg
side. Indeed, the notion of an index only matters on the cata / alg
side - we could pass an index to the child but it does not matter because it must immediately discard it. Indeed, it seems there is no sensible iana
that doesn't just drop them immediately, like this:
iana :: (Corecursive t, Indexed (Base t)) => (a -> Base t (Index (Base t), a)) -> a -> t
iana = hylo embed . fmap (fmap snd)
Something else, with a type definition like (Index (Base t) -> a -> Base t a)) -> Index (Base t) -> a -> t
or (Index (Base t) -> a -> Base t (Index (Base t), a)) -> Index (Base t) -> a -> t
hardly seems better. Why does it appear that having the index only matters when ascending back up, only after - but not while - we have descended recursively? If we look more closely at those horrid attempts to create a non-trivial iana
, the answer is already given - ... -> Index (Base t) -> a -> t
.
We would have to provide an index for the top-level - but there is none. An index is about a child's position within a parent, but a top level can have no index, being without a parent, and as we unfold, we must treat each child as a top-level - and thus the entire descent is without indexing. What we have here is not so much a golden thread telling us how to get back, but rather one telling us what the next steps forward could be - and it is only once we have traversed them that they collectively become a part of the way back.
That isn't to say that the icata
and ihylo
functions aren't useful, merely that they are in fact degenerate or trivial versions of the functions of what we seek, and that we have to keep working to derive what we want.
In particular, icata
and ihylo
can be used for simple catamorphic hashing of JSON based on its structure, and not just by hashing a JSON-encoded String, as you so often see in naive the crypto space. This can be used to de-duplicate, parallelize, diff, and patch a stream of on-demand JSON fragments, being the most simple variant of a series of functions which we will get into when this article series finishes with the foundations and begins focusing on the practical applications.
Nevertheless, it seems we have only half the story - we are missing something.
Indexed base functors
If we take the time to remember the steps that we took to promote functors to recursives, it gives us a clue as to what we are missing. Index
is clearly a functoral concept, but that means it only operates over a single level of a recursive at a time. hylo
differentiates itself from fmap
by operating over multiple levels, so its indices must also do the same. Our indexed recursives really need to have recursive indices, made out of the indices of the base functors leading back to the top level - a path of sorts.
Pathed recursives
This really should have been obvious from the get-go, what with all of the Index (Base t)
going on, but now we can get on with what we were really trying to define in the first place - pathed recursives.
In fact, lets define a few quick alias:
type Path (r :: *) = [Index (Base r)]
type PathedRecursive a = (Recursive a, IndexedFunctor (Base a))
type PathedCorecursive a = (Corecursive a, IndexedFunctor (Base a))
Note that while a functor has an Index
, a recursive has a Path
, made up of a sequence of base indices. An index and a path are separate but related concepts, because a path directs you along the internal structure of a recursive, while an index points all the way to a functor content slot. It helps to think of a path as potentially ending 'inside' of the data structure, while an index goes 'all the way through', ending 'at the edge or boundary' between the functor and its content. If the functor is itself recursive, then by necessity its indices will contain a path as part of its structure, in some sort of index-path - but that's getting ahead of ourselves.
We have also created the PathedRecursive
and PathedCorecursive
types by upgrading from a Functor (Base a)
constraint (implicit in the Recursive
constraint) to an IndexedFunctor (Base a)
constraint.
We're defining these as type aliases for now, rather than as type families / classes, but we can always go back and change that - and we might actually need to later when we get into sourced paths, once we solidify things. For the moment, this suffices, so let us proceed to trying to define pathed hylo and friends again, with what we have just newly learned.
Pathed morphisms
Whereas before, we have i-
functions to denote indexed-
, here we will have p-
functions, to denote pathed-
, and we will start with pcata
since we know it will be meaningful. But how do we promote icata
to pcata
if we don't even know what the type of pcata
is yet?
pcata
and pana
Well, what are the changes in our understanding? Well, for one, we want to keep track of the path of indices that we have encountered before and pass that in at every level, rather than the indices of children which we haven't encountered yet. And secondly, it should be compatible with an upgrade from fmap
to imap
- so vaguely it needs to be like h = alg . imap h . coalg
, except we need to pass in a path parameter somewhere.
In other words, for pcata
we're actually looking for an algorithm with the form path -> f b -> b
, instead of f (index, a) -> a
. Having the index inside of the functor (as in f (i, a) -> a
) is entirely unnecessary, as we can combine the index into the path immediately before passing that along with the child to the next recursion - meaning that pcata
itself also needs an initial path ->
argument after the alg argument. Additionally, by symmetry we now have a sensible co-algorithm form for pana
- path -> a -> f a
- and with this form, the path affects the unfolding in a meaningful way!
Another thing that must be true of our desired set of functions is that, if we ignore the index / path information entirely, it should be the same as plain hylo and friend - that is, hylo alg coalg = phylo (const alg) (const coalg) undefined
. This actually tells us enough to implement both pcata
and pana
in terms of phylo
, even though we haven't defined phylo
yet!
Thus, we can define pcata
and pana
as having the same type as cata
and ana
, but simply having added Path a ->
to a few places, and also using const
to let project
and embed
ignore this new parameter when appropriate.
-- cata :: Recursive t => ( Base t a -> a) -> t -> a
pcata :: PathedRecursive t => (Path t -> Base t a -> a) -> Path t -> t -> a
pcata alg = phylo alg (const project)
-- ana :: Corecursive t => ( a -> Base t a) -> a -> t
pana :: PathedCorecursive t => (Path t -> a -> Base t a) -> Path t -> a -> t
pana coalg = phylo (const embed) coalg
phylo
With that in place, we are free to define the type of phylo
. We know it must be something similar to pana
and pcata
, but since we are back to the level of indexed functors we have slightly different constraints - IndexedFunctor f
rather than PathedRecursive a
or PathedCorecursive a
- and so we need use [Index f]
rather than Path
(since Path
is defined as Path r = [Index (Base r)]
). Otherwise, we can clearly see our desired function now taking shape:
-- hylo :: Functor f => ( f b -> b) -> ( a -> f a) -> a -> b
phylo :: IndexedFunctor f => ([Index f] -> f b -> b) -> ([Index f] -> a -> f a) -> [Index f] -> a -> b
phylo = _
However, the question is now implementing phylo
. Compared to hylo
, phylo
and its alg
and coalg
each all require an additional parameter. Rather than h = alg . fmap h . coalg
, we need something like h p = alg p . imap h p . coalg p
, except that clearly doesn't work because imap
doesn't take a path argument, and we want to join it with the index anyway which this doesn't do.
We need to join the path p
with the index i
inside of the imap
's argument, like imap (h p)
instead of imap h p
. A little bit of playing around to join index onto a path, and we get something sensible.
phylo :: IndexedFunctor f => ([Index f] -> f b -> b) -> ([Index f] -> a -> f a) -> [Index f] -> a -> b
phylo alg coalg = h where h p = alg p . imap (\ i -> h (i:p)) . coalg p
-- hylo alg coalg = h where h = alg . fmap h . alg
This is actually what we've been after - we have successfully defined a pathed hylomorphism - something that, while we are traversing it, knows where we came from. This is pretty sweet! It is as powerful as hylo, but we also have a nice golden thread to help find our way back out of the labyrinth.
Doing something like using Cofree
to annotate a data structure with its path is now trivial!
annotatePaths :: Json -> Cofree JsonF JsonPath
annotatePaths = pcata (:<) []
NOTE: It is worth mentioning that paths are formed in reverse, with the most recent index on top and root at te end - eg,
/path/to/file.md
yields["file.md", "to", "path", "/"]
.
In general, these functions do everything that hylo
does, with the addition of its path being available to it - so these are morphisms that know where they are!. This makes it possible to untie a structure, breaking it into shards and fragments that we can collect in a list or dictionary.
type Frag r = Base r (Path r)
type Shard r = (Path r, Frag r) -- Or CofreeF (Base r) (Path r) (Path r)
shatterList :: (Recursive r, IndexedFunctor (Base r), Foldable (Base r)) => r -> [Shard r]
shatterList r = snd $ pcata shatter [] r where
-- shatter :: Path r -> Base r (Path r, [Shard r]) -> (Path r, [Shard r])
shatter p fs = (p, (p, fmap fst fs) : concatMap snd fs)
shatterMap :: (Recursive r, IndexedFunctor (Base r), Foldable (Base r)) => r -> Map (Path r) (Frag r)
shatterMap = Map.fromList . shatterList
Reconstituting the original object is trivial, and we can imagine this being useful for turning recursive objects into streams of fragments, allowing you to avoid sending megabytes of of JSON when a few kilobytes would do.
Note that pathed morphisms are also very useful when you want to differentiate identical values that differ only by their position within a data structure. Take for instance the following JSON:
{
"account" : {
"balance": 500
},
"address" : {
"streetno" : 500
}
}
Note that the values of account.balance
and address.streetno
are both 500
- but they are in a sense different 500's. You might use types to differentiate a bank balance from a street number, but then what if you have two bank accounts? Distinguishing by type is no longer enough. You see, the location of data can be just as important as the type of the data (in fact these notions can be related more strongly if we wish).
Lets say we were cryptographically hashing this data. With simple catamorphic hashing, we might accidentally reveal that the bank account and street number are equivalent, and this is of course very bad. This would happen because we would start at the bottom, and we would see 500
as a base case on two separate occasions, hash it individually twice, and get the same hash value for each both times. This is because when hashing from the bottom up, we of course don't include any of the 'up', and so as far as those two 'different' 500 values go, they are exactly the same. hash(500)
is the same everywhere.
Note that this property isn't bad by itself - it turns out that this property enables deduplication, at the cost of revealing similarities between data because thats what deduplication is. So this property can be valuable, especially for public data.
However, with pathed catamorphic hashing, we have a chance to include the path in the hashing. Instead, our base cases for hashing would be, not hash(500)
and hash(500)
, but rather hash(account.balance.500)
and hash(address.streetno.500)
. The inclusion of the path differentiates the two, and makes the resulting hash of each value unique to not just its structural value, but also its location within a larger structure. With this, revealing that account.balance = 500
does not reveal that address.streetno = 500
.
So, sometimes location matters, and when it does, 'morphisms that know where they are' are especially useful.
Monadic pathed morphisms
We may now go about defining monadic variants, if we like. Anything that a pathed morphism can do, we can now do monadically. They are of course especially useful if we need to perform IO while traversing our recursive structure. Otherwise, everything we just said for pathed morphisms, just slap a monad on it.
pcataM
pcataM
:: (PathedRecursive a, IndexedTraversable (Base a), Monad m)
=> (Path a -> Base a b -> m b)
-> Path a -> a -> m b
pcataM alg = phyloM alg (const projectM)
We could use pcataM
here to simply print every path of a JSON object, from the bottom up while folding:
printJsonPaths :: JsonPath -> Json -> IO ()
printJsonPaths = pcataM (const . print)
panaM
panaM
:: (PathedCorecursive b, IndexedTraversable (Base b), Monad m)
=> (Path b -> a -> m (Base b a))
-> Path b -> a -> m b
panaM coalg = phyloM (const embedM) coalg
We could use panaM
to recursively fetch fragments of a JSON object in a way that combines the path with a digest.
fetchJson :: (JsonPath -> Digest -> IO (JsonF Digest)) -> JsonPath -> Digest -> IO Json
fetchJson = panaM
phyloM
phyloM
:: (Monad m, IndexedTraversable t)
=> ([Index t] -> t b -> m b)
-> ([Index t] -> a -> m (t a))
-> [Index t] -> a -> m b
phyloM alg coalg = h where h p = alg p <=< itraverse (\ i -> h (i:p)) <=< coalg p
phyloM
is quite exciting because can use the path in the alg and coalg to perform IO while both unfolding (reading the next chunk of data from a file or network) and folding (writing updated chunks out). Given the amount of IO involving relative paths, this function can come in quite handy if you build yourself a FileTree / FileTreeF
data structure.
data FileTree = File ByteString | Directory (Map FileName FileTree)
data FileTreeF a = ...
cloneFilesToDisk
:: (FilePath -> FileTreeF () -> IO ()) -- Write to disk
-> (FilePath -> Digest -> IO (FileTreeF Digest)) -- Fetch file tree fragment
-> FilePath -> Digest -> IO ()
cloneFilesToDisk = phyloM
Special pathed morphisms
I suppose we should define p-
equivalents for iso
and trans
, and their monadic -M
variants too.
Unsurprisingly, both piso
and pisoM
are trivial, as ignoring the path argument just makes them equivalent to iso
and isoM
. However, ptrans
and ptransM
are not trivial - because the path argument makes it to the natural transformation argument, it is able to have an effect on the resulting structure.
piso
piso :: (Iso s t, IndexedFunctor (Base s)) => Path s -> s -> t
piso = phylo (const embed) (const project)
-- iso = hylo ( embed( ( project)
This function is trivial, and equivalent to const iso
.
ptrans
ptrans
:: (PathedRecursive s, Corecursive t)
=> (Path s -> forall a. Base s a -> Base t a)
-> Path s -> s -> t
-- trans f = hylo ( embed . f ) ( project)
ptrans f = phylo (\ p -> embed . f p) (const project)
This function is non-trivial, because the path may affect the transformation.
pisoM
pisoM :: (Monad m, Iso s t, IndexedTraversable (Base s)) => Path s -> s -> m t
pisoM = phyloM (const embedM) (const projectM)
This function is trivial, and equivalent to const isoM
.
ptransM
ptransM
:: (Monad m, PathedRecursive s, Corecursive t, IndexedTraversable (Base s))
=> (Path s -> forall a. Base s a -> m (Base t a))
-> Path s -> s -> m t
ptransM f = phyloM (\ p -> embedM <=< f p) (const projectM)
This function is non-trivial, because the path may affect the transformation.
Biindexed bifunctors and bifoldables
Okay, so that wasn't too bad. But this is birecursion-schemes, and so you knew it was coming - biindexed bifunctors. Luckily, this is going to be very easy, because we did all of the hard work to get pathed recursion schemes, we've just got to combine the methods of lifting from recursion to birecursion, with the method of lifting to indexed / pathed recursion. That means we've got two indices now. Oh no! Anyway...
First, we define our index type families.
type family FirstIndex (f :: * -> * -> *) :: *
type family SecondIndex (f :: * -> * -> *) :: *
type Biindex f = Either (FirstIndex f) (SecondIndex f)
Note that forall a . Index (f a) == SecondIndex f
since forall a . Functor (f a) => Bifunctor f
(assuming the bifunctor-superclass proposal, which we haven't quite yet applied)
What's this Biindex
type? Well, since birecursion means that we recurse twice, and so when we are recursing, the resulting path may contain a mix of FirstIndex
and SecondIndex
depending on whether the recursion went through the First
slot or the Second
slot at a given level.
Trust me, it'll make more sense as we go on. On with the show!
Biindexed
and biindexed
If indexed
is just imap (,)
, then biindexed
is just ibimap (,) (,)
.
class Biindexed f where
biindexed :: f a b -> f (FirstIndex f, a) (SecondIndex f, b)
default biindexed :: (BiindexedBifunctor f) => f a b -> f (FirstIndex f, a) (SecondIndex f, b)
biindexed = ibimap (,) (,)
Again, we have a default implementation in terms of ibimap
, but the same thoughts as before regarding fusing still apply.
BiindexedBifunctor
and ibimap
If imap f = fmap (uncurry f) . indexed
, then ibimap f g = bimap (uncurry f) (uncurry g) . biindexed
class (Biindexed f, Bifunctor f) => BiindexedBifunctor f where
ibimap :: (FirstIndex f -> a -> b) -> (SecondIndex f -> c -> d) -> f a c -> f b d
ibimap f g = bimap (uncurry f) (uncurry g) . biindexed
BiindexedBifoldable
and ibifoldMap
Now we really are starting to harvest the result of our efforts earlier. We upgrade foldMap (uncurry f) . indexed
to bifoldMap (uncurry f) (uncurry g) . biindexed
- because we took the time to define IndexedFoldable
, we can define BiindexedBifoldable
just as easily!
class (Biindexed f, Bifoldable f) => BiindexedBifoldable f where
ibifoldMap :: (Monoid m) => (FirstIndex f -> a -> m) -> (SecondIndex f -> b -> m) -> f a b -> m
ibifoldMap f g = bifoldMap (uncurry f) (uncurry g) . biindexed
Our types are beginning to grow unseasonably large, however.
BiindexedBitraversable
and ibitraverse
BiindexedBitraversable
is again just so much more purely mechanical lifting.
class (Biindexed t, Bitraversable t) => BiindexedBitraversable t where
ibitraverse
:: Applicative f
=> (FirstIndex t -> a -> f c)
-> (SecondIndex t -> b -> f d)
-> t a b -> f (t c d)
ibitraverse f g = bitraverse (uncurry f) (uncurry g) . biindexed
As far as I can tell, while indexed-traversable
as a package exists, biindexed-bitraversable
does not, and we are firmly in novel territory here. However, these are necesary for... are you ready for it? Bipathed birecursion!
Biindexed birecursives
This time, we're going to skip the i-
variants entirely, and get straight to the p-
pathed variants.
Biindexed bibase bifunctors
Since before we had indexed base functors, now we have biindexed bibase bifunctors. Oh dear, what terribly babbling terminology.
Bipathed birecursives
Just as before, we define our types, this time lifted from both recursive to birecursive and indexed to biindexed.
type Bipath (r :: *) = [Biindex (Bibase r)]
-- PathedRecursive t = ( Recursive t, IndexedFunctor (Base t))
type BipathedBirecursive t = (Birecursive t, BiindexedBifunctor (Bibase t))
type BipathedCobirecursive t = (Cobirecursive t, BiindexedBifunctor (Bibase t))
Our golden thread still shows us our way back, however - and it tells us which doors we passed through on the way - the first recursion or the second.
Again, these are just type aliases for now.
Bipathed bimorphisms
NOTE: I am unsure of whether our use of the term 'bimorphism' meets the definition a 'bimorphism' as "A morphism which is both a monomorphism and an epimorphism.". It may clash, or it may turn out to be equivalent, if we squint at it hard enough. I haven't investigated yet.
We're going to get through these bimorphisms real quick, because effectively all that they do is take imap / itraverse
and lift to ibimap / ibitraverse
, Path
to Bipath
, etc. Lots of things just get bi-
shoved in front. Anything that isnt straightforward will get called out.
pbicata
Just pcata
with bi-
plastered everywhere.
pbicata -- pcata
:: (BipathedBirecursive t) -- :: PathedRecursive t
=> (Bipath t -> Bibase t a a -> a) -- => (Path t -> Base t a -> a)
-> Bipath t -> t -> a -- -> Path t -> t -> a
pbicata alg = pbihylo alg (const biproject)
-- pcata alg = phylo alg (const project)
Note that Bibase t a a
takes an extra a
argument compared to Base t a
, because it has a base bifunctor.
pbiana
Just pana
with bi-
plastered everywhere.
pbiana -- pana
:: (BipathedCobirecursive t) -- :: PathedCorecursive t
=> (Bipath t -> a -> Bibase t a a) -- => (Path t -> a -> Base t a)
-> Bipath t -> a -> t -- -> Path t -> a -> t
pbiana coalg = pbihylo (const biembed) coalg
-- pana coalg = phylo (const embed) coalg
pbihylo
Remember that Biindex f = Either (FirstIndex f) (SecondIndex f)
? This is why!
pbihylo -- phylo
:: (BiindexedBifunctor f) -- :: IndexedFunctor f
=> ([Biindex f] -> f b b -> b) -- => ([Index f] -> f b -> b)
-> ([Biindex f] -> a -> f a a) -- -> ([Index f] -> a -> f a)
-> [Biindex f] -> a -> b -- -> [Index f] -> a -> b
pbihylo alg coalg = h where
h p = alg p . ibimap (\ i -> h (Left i : p)) (\ j -> h (Right j : p)) . coalg p
-- phylo
-- = alg p . imap (\ i -> h ( i : p)) . coalg p
With fmap
to bimap
, we just need to pop h
into both slots, but with imap
to ibimap
, we have to deal with that FirstIndex
is not necessarily the same as SecondIndex
. As a result, instead of just copying (\ i -> h (i : p))
into both recursion slots, we have to differentiate by using Left
and Right
, as Bipath
uses Biindex
which contains an Either
. Otherwise, we're just doing the same path-combining as with phylo
, using Left i : p
and Right j : p
instead of just i : p
.
Can you imagine this without the type-aliases? And yet, we can still clearly see the original hylo hidden inside.
Monadic bipathed bimorphisms
These just lift what just did, to also be monadic. So, ibimap
becomes ibitraverse
, .
becomes <=<
, and we add a few -M
s in places. Otherwise, noting of note, just rote.
pbicataM
pbicataM
:: (Monad m, Birecursive t, BiindexedBifunctor (Bibase t), BiindexedBitraversable (Bibase t))
=> (Bipath t -> Bibase t a a -> m a)
-> Bipath t -> t -> m a
pbicataM alg = pbihyloM alg (const biprojectM)
pbianaM
pbianaM
:: (Monad m, Cobirecursive t, BiindexedBifunctor (Bibase t), BiindexedBitraversable (Bibase t))
=> (Bipath t -> a -> m (Bibase t a a))
-> Bipath t -> a -> m t
pbianaM coalg = pbihyloM (const biembedM) coalg
pbihyloM
pbihyloM
:: (Monad m, BiindexedBifunctor f, BiindexedBitraversable f)
=> ([Biindex f] -> f b b -> m b)
-> ([Biindex f] -> a -> m (f a a))
-> [Biindex f] -> a -> m b
pbihyloM alg coalg = h where
h p = alg p <=< ibitraverse (\ i -> h (Left i : p)) (\ j -> h (Right j : p)) <=< coalg p
Special bipathed bimorphisms
For completeness, of course.
pbiiso
Trivially const biiso
.
pbiiso :: (BipathedBirecursive s, Biiso s t) => Bipath s -> s -> t
pbiiso = pbihylo (const biembed) (const biproject)
pbitrans
Not trivial.
pbitrans
:: (BipathedBirecursive s, Cobirecursive t)
=> (Bipath s -> forall a b. Bibase s a b -> Bibase t a b) -> Bipath s -> s -> t
pbitrans f = pbihylo (\ p -> biembed . f p) (const biproject)
pbiisoM
Trivially const biisoM
.
pbiisoM
:: (Monad m, BipathedBirecursive s, Biiso s t, BiindexedBitraversable (Bibase s))
=> Bipath s -> s -> m t
pbiisoM = pbihyloM (const biembedM) (const biprojectM)
pbitransM
Not trivial.
pbitransM
:: (Monad m, BipathedBirecursive s, Cobirecursive t, BiindexedBitraversable (Bibase s))
=> (Bipath s -> forall a b. Bibase s a b -> m (Bibase t a b))
-> Bipath s -> s -> m t
pbitransM f = pbihyloM (\ p -> biembedM <=< f p) (const biprojectM)
Take another break.
You've just finished getting through not just pathed recursion, but bipathed birecursion! The end of this is in sight, after this, we just have to define what indexing and pathing mean for recursive functors.
Diindexed difunctors and difoldables, diindexed dibase bifunctors
In comparison to Indexed
and Pathed
, I have not yet created a Diindexed
class, nor are there DiindexedDifunctor
or DiindexedDifoldable
or DiindexedDitraversable
classes. You see, a Difunctor
is just a Functor
and a Recursive
, and so it already has an Index
from Functor
and a Path
from Recursive
. Its base functor is a base bifunctor, and so Biindexed
, BiindexedBifunctor
, BiindexedBifoldable
, and BiindexedBitraversable
are already sufficient.
Dipathed direcursives
At last we arrive here - the summit of an exhausting journey, but also a fundamental building block, the first step in a much larger journey that I wish to take you on. Be proud of reaching this point with me, as it prepares you for things yet to come.
If we have indexed functors and biindexed bifunctors, and pathed recursives and bipathed birecursives, what do we get for difunctors aka recursive functors? Well, they have both an index, being a functor, as well as a path, being a recursive at the same time. In fact, remember that their dibase bifunctor is a base bifunctor that is also a functor for the fixed / recursive functor's content, instead of being a bibase bifunctor that recurses in both slots.
From this we know that a recursive functor's index must be made up of its base bifunctor's (bi)indices. Whereas for bipathed birecursion, our golden thread back could weave back and forth through both base bifunctor slots, now it threads through one base slot repeatedly - the path made up of one slot's indices, pointing repeatedly at more recursive structure - before finally weaving across through the other base slot once to finally point at a functor content element - the tip of the path, a house number on a street address, the last leg of a journey.
NOTE: Why would we bother with this concept? Well, it is in order to chain recursion - our goal is to stack / transform recursive functors, and that means we must do the same with their indices. If we have a compound type
f (g (h a))
, then we have a compound path(Index h, (Index g, (Index f, ())))
, and if they are recursive, the paths must link up too, giving us( (FirstIndex h, [SecondIndex h]), ((FirstIndex g, [SecondIndex g]), ((FirstIndex f, [SecondIndex f]), ())))
. It is easier and clearer however to define it asWay (SecondIndex h) (FirstIndex h, Way (SecondIndex g) ...)
and embed inner paths at the end of outer paths.
Let us define some helpful type aliases.
type Diindex (r :: * -> *) = (FirstIndex (Dibase r), [SecondIndex (Dibase r)]) -- NOTE: Should be ~ Index r
type Dipath (r :: * -> *) = [SecondIndex (Dibase r)] -- NOTE: Should be ~ Path (r a)
-- NOTE: These constraints often end up adding little, due to incomplete hierarchy
-- Eg, `Direcursive t, BiindexedBitraversable (Dibase t)` is sufficient as compared
-- to `PathedDirecursive t, Bitraversable (Dibase t)` which isn't.
type DipathedDirecursive t = (Direcursive t, BiindexedBifunctor (Dibase t))
type DipathedCodirecursive t = (Codirecursive t, BiindexedBifunctor (Dibase t))
Note that forall a . Path (t a) ~ Dipath t
since forall a . Index (t a) ~ SecondIndex t
, and that Diindex t ~ Index t
too. Also note that (FirstIndex f, [SecondIndex f])
is just Way (SecondIndex f) (FirstIndex f)
in reverse - it makes sense because Diindex
is just Index
, and Index
is Way
as we discovered early on. At the time we haven't completely solidified this hierarchy, but when we do, these equivalences should be more explicit, and should be enforced similarly to the bifunctor-superclass proposal - it is related after all.
So Dipath
is just Path
in bifunctor clothing, and Diindex
is just Index
because a recursive functor is already a functor - but what exactly is Diindex
? Whereas Index
is ignorant of what lay beneath, Diindex
is aware of the internal structure of a recursive functor - it knows that it must take a Path
made up of SecondIndex
, and cap it off at the end with a FirstIndex
, in order to make the Index
.
For many recursive functors, there exists only one logical content slot per base constructor - in that case the FirstIndex
is equivalent to ()
, and it can be ignored entirely. For these functors, Dipath f ~ Diindex f
, and Index f ~ Path (f a)
.
data Identity a = Identity a
data List a = Nil | Cons a (List a)
data Tree a = Tip | Tree a (List a) (List a)
data BinTree a = Leaf a | Node (BinTree a) (BinTree a)
However, for others with multiple content slots per case, FirstIndex
actually requires some information to be stored.
-- v FirstIndex v SecondIndex
data BilistIx = Bizero Bool | Bisucc () BilistIx -- Or: Way () Bool
data Bilist a = Binil | Bicons a a (Bilist a)
Bilist
requires that SecondIndex BilistF ~ Unit
because there is only one possible successor, and it requires that FirstIndex BilistF ~ Bool
because each successor skips two content slots, and so the Diindex
of a Bilist
effectively counts by units of twos until it reaches the end, then uses the Bool
to pick one of the two slots.
It is important to understand how sometimes FirstIndex
(often) or SecondIndex
(rarely) can be implicitly ignored sometimes - it will help you make sense of why sometimes a path is an index, and sometimes it is not.
Dipathed dimorphisms
NOTE: Here we have made up the term 'dimorphism' wholesale, as a cursory search of "category theory" with "dimorphism" yielded nothing of value.
Again, we're going to skip the i-
variants entirely, and get straight to it.
pdicata
This is better explained through pdihylo
, but we'll define it first.
pdicata
:: (DipathedDirecursive f)
=> (Diindex f -> a -> b)
-> (Dipath f -> Dibase f b d -> d)
-> Dipath f -> f a -> d
pdicata f alg = pdihylo f alg (const diproject)
-- pcata alg = phylo alg (const project)
Also, note that enforcing Functor, Recursive => Direcursive
would allow for enforcing (Index f ~ Diindex f, forall a . Path (f a) ~ Dipath f)
implicitly, allowing us to define pdicata
using Index
and Path
. See here with those constraints made explicit:
pdicata
:: (DipathedDirecursive f, Index f ~ Diindex f, Path (f a) ~ Dipath f)
=> (Index f -> a -> b)
-> (Path (f a) -> Dibase f b d -> d)
-> Path (f a) -> f a -> d
pdicata f alg = pdihylo f alg (const diproject)
-- pcata alg = phylo alg (const project)
This could be the preferable way of defining this, once we tighten the hierarchy.
pdiana
Ditto everything said for pdicata
.
pdiana
:: (DipathedCodirecursive f)
=> (Diindex f -> a -> b)
-> (Dipath f -> c -> Dibase f a c)
-> Dipath f -> c -> f b
pdiana f coalg = pdihylo f (const diembed) coalg
-- pana coalg = phylo (const embed) coalg
pdihylo
If dihylo
is hylo
plus fmap
, then pdihylo
is phylo
plus imap
. This function gives you the power to do one thing going down the branches, a second thing to map the leaves, and a third thing coming back up - all with an index or path to tell you where you are.
pdihylo
:: BiindexedBifunctor f
=> ((FirstIndex f, [SecondIndex f]) -> a -> b)
-> ([SecondIndex f] -> f b d -> d)
-> ([SecondIndex f] -> c -> f a c)
-> [SecondIndex f] -> c -> d
pdihylo f alg coalg = h where
h p = alg p . ibimap (\ i -> f ( i , p)) (\ j -> h ( j : p)) . coalg p
-- phylo
-- = alg p . imap (\ i -> h ( i : p)) . coalg p
-- pbihylo
-- = alg p . ibimap (\ i -> h (Left i : p)) (\ j -> h (Right j : p)) . coalg p
Note the correlations and differences with phylo
and pbihylo
. We have upgraded to a base bifunctor, but because this is direcursion
, we aren't upgrading to use Left
and Right
. Instead, we still copy (\ i -> f (i : p))
into both ibimap
slots, but rather than adding Left
and Right
, we just replace the first slot's :
with ,
to get (\ i -> f (i , p))
, while the second slot gets to stay \ i -> h (i : p)
(well, we turned the i
to a j
but thats purely nomenclative).
Placing the FirstIndex
to the left in (FirstIndex f, [SecondIndex f])
plays well with the understanding that the path is reversed, as by reading it from right to left, the FirstIndex
forms the tip. /path/to/file.json
becomes (file.json, ["to/", "path/", "/"])
(if file-names were considered as FirstIndex
to directory-names as SecondIndex
).
Note that for recursive functors such that FirstIndex ~ ()
, then Index f ~ forall a . Path (f a)
.
What can we do with pdihylo
? Well, earlier we defined data FileTree = File | Directory (Map Name FileTree)
as an example for phylo
, but we could actually free up the filetype and turn FileTree
into a recursive functor.
data FileTree a = File a | Directory (Map Name FileTree)
data FileTreeF a r = ...
One can imagine not just drilling down through the filetree, through the content of the file as well, in a single operation, such as FileTree Json
with a combined filetree-json-path being made available to every json fragment - or breaking up a filetree-of-json into (filetree-or-json)-fragments, a deepening of the shard shattering that we performed with phylo
because everything that we mentioned for phylo
carries over to pdihylo
.
It is possible to build a di-
variant of the earlier shatterList
and shatterMap
functions, one which continues with shattering the functor content - making it possible to recursively shatter nested recursive functors and gather up all of the pieces. This requires a bit more finesse, so this definition is left for another time. Feel free to try and derive it yourself, however.
Applicative dipathed dimorphisms
Unlike Recursive
and Birecursive
, with di-
we have access to non-trivial applicative variants. These of course are created by using ibitraverse
like -M
variants, but with .
instead of <=<
, and using fmap
at the front.
Note that as with dihylo
and friends, the applicative nature only occurs in the mapping transformation function, and not in the alg
or coalg
function - the Applicative
nature is ignorant of the recursive nature just as much as Functor
is. Compare (forall a . Recursive (f a), Functor f) => Difunctor f
which understand that it is recursive, and can be broken up into (di)base functors, and so a recursive applicative might be broken up into (di)base applicatives. Concepts like diapplicative
or even dimonad
(recursive applicatives and monads) are definitely worth looking further into, to explore how the applicative or monadic nature as a whole is constructed from the fragments.
pdicataA
pdicataA
:: (Applicative f, DipathedDirecursive t, BiindexedBitraversable (Dibase t))
=> (Diindex t -> a -> f b)
-> (Dipath t -> Dibase t b c -> c)
-> Dipath t -> t a -> f c
pdicataA f alg = pdihyloA f alg (const diproject)
pdianaA
pdianaA
:: (Applicative f, DipathedCodirecursive t, BiindexedBitraversable (Dibase t))
=> (Diindex t -> a -> f b)
-> (Dipath t -> c -> Dibase t a c)
-> Dipath t -> c -> f (t b)
pdianaA f coalg = pdihyloA f (const diembed) coalg
pdihyloA
pdihyloA
:: (Applicative f, BiindexedBitraversable t)
=> ((FirstIndex t, [SecondIndex t]) -> a -> f b)
-> ([SecondIndex t] -> t b d -> d)
-> ([SecondIndex t] -> c -> t a c)
-> [SecondIndex t] -> c -> f d
pdihyloA f alg coalg = h where
h p = fmap (alg p) . ibitraverse (\ i -> f (i,p)) (\ j -> h (j : p)) . coalg p
A minor correction about -A
variants
It is worth mentioning, I was in error earlier when I stated that there was no applicative variants for mono- and bi- recursion. This of course is not strictly true, and it turns out that an applicative form is possible if we apply the fmap
trick that we used in dihyloA
, which is to promote fmap
to traverse
, while also wrapping alg
with fmap
before composition.
hyloA :: (Traversable t, Applicative f) => (t b -> b) -> (a -> t a) -> a -> f b
hyloA alg coalg = h where h = fmap alg . traverse h . coalg
-- hylo for comparison: alg . fmap h . coalg
It is pretty trivial though - f
doesn't get used anywhere except the very end, unlike dihyloA
. I would expect cataA
* and anaA
to be just as trivial. The di-
variants are non-trivial even without indexing, because the applicative f
does get used in the mapping transformation, which hyloA
lacks.
However, the addition of indexing to the mix may make it such that phyloA
and similar become non-trivial given that there is now a path with a chance to affect things. It may be worth revisiting in the future.
NOTE: * Not the same as
cataA
inrecursion-schemes
, which is not of the form(Base t a -> f a) -> t -> f a)
but rather(Base t (f a) -> f a) -> t -> f a
.
Monadic dipathed dimorphisms
If you get dipathed dimorphisms, and monadic morphisms, you'll get these.
pdicataM
pdicataM
:: (Monad m, DipathedDirecursive f, BiindexedBitraversable (Dibase f))
=> (Diindex f -> a -> m b)
-> (Dipath f -> Dibase f b d -> m d)
-> Dipath f -> f a -> m d
pdicataM f alg = pdihyloM f alg (const diprojectM)
pdianaM
pdianaM
:: (Monad m, DipathedCodirecursive f, BiindexedBitraversable (Dibase f))
=> (Diindex f -> a -> m b)
-> (Dipath f -> c -> m (Dibase f a c))
-> Dipath f -> c -> m (f b)
pdianaM f coalg = pdihyloM f (const diembedM) coalg
pdihyloM
pdihyloM
:: (Monad m, BiindexedBitraversable f)
=> ((FirstIndex f, [SecondIndex f]) -> a -> m b)
-> ([SecondIndex f] -> f b d -> m d)
-> ([SecondIndex f] -> c -> m (f a c))
-> [SecondIndex f] -> c -> m d
pdihyloM f alg coalg = h where
h p = alg p <=< ibitraverse (\ i -> f (i,p)) (\ j -> h (j : p)) <=< coalg p
An elegant weapon for a more civilized age.
Special dipathed dimorphisms
With recursion and birecursion, we found that some of our 'special' morphisms were trivial - variants of iso
ended up not using the path at all.
This is not true for the special dipathed morphisms - none of them are trivial due to the use of the index / path in the f
mapping transformation argument, and so the path argument may always affect the result. This makes them all worth defining - and we will discover something very interesting.
pdiiso
If you recall from the initial birecursion article, diiso
lifts iso :: s -> t
to a functor and adds a mapping argument (a -> b) -> s a -> t b
. Extending it to be pathed is by now routine.
pdiiso
:: (Diiso s t, BiindexedBifunctor (Dibase s))
=> (Diindex s -> a -> b)
-> Dipath s -> s a -> t b
pdiiso f = pdihylo f (const diembed) (const diproject)
pditrans
pditrans
:: (DipathedDirecursive s, Codirecursive t)
=> (Diindex s -> a -> b)
-> (forall c . Dipath s -> Dibase s b c -> Dibase t b c)
-> Dipath s -> s a -> t b
pditrans f g = pdihylo f (\ p -> diembed . g p) (const diproject)
pdiisoA
pdiisoA
:: (Applicative f, Diiso s t, BiindexedBitraversable (Dibase s))
=> (Diindex s -> a -> f b)
-> Dipath s -> s a -> f (t b)
pdiisoA f = pdihyloA f (const diembed) (const diproject)
pditransA
pditransA
:: (Applicative f, DipathedDirecursive s, Codirecursive t, BiindexedBitraversable (Dibase s))
=> (Diindex s -> a -> f b)
-> (forall c . Dipath s -> Dibase s b c -> Dibase t b c)
-> Dipath s -> s a -> f (t b)
pditransA f g = pdihyloA f (\ p -> diembed . g p) (const diproject)
pdiisoM
pdiisoM
:: (Monad m, Diiso s t, BiindexedBitraversable (Dibase s))
=> (Diindex s -> a -> m b)
-> Dipath s -> s a -> m (t b)
pdiisoM f = pdihyloM f (const diembedM) (const diprojectM)
pditransM
pditransM
:: (Monad m, DipathedDirecursive s, Codirecursive t, BiindexedBitraversable (Dibase s))
=> (Diindex s -> a -> m b)
-> (forall c . Dipath s -> Dibase s b c -> m (Dibase t b c))
-> Dipath s -> s a -> m (t b)
pditransM f g = pdihyloM f (\ p -> diembedM <=< g p) (const diprojectM)
Conclusion
As hard as it is to write about, it is even more difficult to end the writing - but end we must. We have covered enough ground for the day, we went deep into the labyrinth, and now it is time to come back out having discovered some wild things (shatterList
and shatterMap
especially). As we leave the labyrinth, winding up our golden thread, we arrive back where we started - the origin of our journey. It is time to make camp, and rest by the fire.
We started with asking a few questions, and it is fitting that we will end with answering one accordingly.
The profunctoral perspective
Whatever happened to the profunctoral perspective? I did say that I would get around to it, but we will need to go back to a few functions and give them a closer examination first.
I would now like to remind you of something that we discovered about diiso
in the prior article - by restricting s ~ t
, we found out that fmap = diiso
. Therefore, we ask, what do we get if we restrict pdiiso
the same way (and crush down types for readability since Diindex ~ Index
and Dipath ~ Path
)?
pdiiso_fmap
:: (DipathedDirecursive f, DipathedCodirecursive f)
=> (Index f -> a -> b)
-> Path f -> f a -> f b
pdiiso_fmap f = pdihylo f (const diembed) (const diproject)
If you were to say imap
, you'd be wrong. Although , the resulting function has type (Index f -> a -> b) -> Path f -> f a -> f b
- it has that additional Path f ->
argument compared to imap
, and so imap f = pdiiso f []
.
On the other hand, if you were to say pmap
(as a sort of pathed map) you'd be wrong too, because the mapping transformation function f
takes an Index f
and not a Path f
- it is only if FirstIndex (Dibase f) ~ ()
(meaning the final constructor has a singular functor content slot) that Index f ~ ((), Path f)
such that we get Index f ~ Path f
by dropping the irrelevant ()
.
If it is neither imap
nor pmap
, it is instead both - it is an indexed, pathed map. We've got to remember that it isn't just fmap = diiso
, it is recursive fmap = diiso
, so it expresses both the functor nature and the recursive nature.
More importantly, it performed some type-changing operation on the path, which sounds like something we've been looking for. At the very end, it goes from Path
to Index
- but we did that. We did that when we implemented pdihylo
. Has our profunctoral indices been hiding right under our noses the entire time?
Lets take a look! phylo
concatenates indices Index f
into a path [Index f]
, changing the type. pbihylo
concatenates two types of indices FirstIndex f
and SecondIndex f
into a path of both [Either (SecondIndex f) (FirstIndex f)]
- again changing the type. Finally, pdihylo
concatenates one type of indices SecondIndex f
into a path [SecondIndex f]
(changing type once), before capping it with another index FirstIndex f
to turn it into an index path (FirstIndex f, [SecondIndex f])
(changing type a second time).
How did we do that?
phylo alg coalg = h where
h p = alg p . imap (\ i -> h (i : p)) . coalg p
pbihylo alg coalg = h where
h p = alg p . ibimap (\ i -> h (Left i : p)) (\ j -> h (Right j : p)) . coalg p
pdihylo f alg coalg = h where
h p = alg p . ibimap (\ i -> f (i , p)) (\ j -> h (j : p)) . coalg p
Can we abstract this? All we're doing is imap
/ ibimap
with almost the exact same logic - in fact, we can zoom in on the centers of these functions, and pluck out their arguments to see how they are almost identical.
(\ i -> h (i : p)) -- phylo
(\ i -> h (Left i : p)) -- bihylo first
(\ j -> h (Right j : p)) -- bihylo second
(\ i -> f (i , p)) -- dihylo first
(\ j -> h (j : p)) -- dihylo second
Really, these are all variations of the same form - we're performing a cons-like operation with the index and path, and we can extract the ,
and :
arguments to get the generalized function:
(\ cons i -> h (i `cons` p))
With this, we can make a slightly different version of imap
and ibimap
, which we will call imapWith
/ ibimapWith
:
imapWith
:: (IndexedFunctor f)
=> (Index f -> path -> indexpath) -- Cons
-> (indexpath -> a -> b)
-> path -- Nil
-> f a -> f b
imapWith cons f p = imap (\ i -> f (cons i p))
ibimapWith
:: (BiindexedBifunctor f)
=> (FirstIndex f -> path -> findexpath) -- Left Cons
-> (SecondIndex f -> path -> sindexpath) -- Right Cons
-> (findexpath -> a -> b)
-> (sindexpath -> c -> d)
-> path -- Nil
-> f a c -> f b d
ibimapWith icons jcons f g p = ibimap (\ i -> f (icons i p)) (\ j -> g (jcons j p))
Note that imapWith
/ ibimapWith
now take an additional cons
argument(s), but also a path
argument - like pdiiso_fmap
earlier - except now the types are free. In fact, we have gone from implicitly working with lists (through use of :
) to accepting arguments in the form of List
's constructors, Nil
and Cons
. By exposing the cons-like arguments, we allow for the user to determine the actual type of the indices and paths freely! The profunctoral perspective, everybody!
Now, we can redefine phylo
/ pbihylo
/ pdihylo
in terms of these new functions - and the definitions are much cleaner!
phylo alg coalg = h where
h p = alg p . imapWith (:) h p . coalg p
pbihylo alg coalg = h where
h p = alg p . ibimapWith ((:) . Left) ((:) . Right) h p . coalg p
pdihylo f alg coalg = h where
h p = alg p . ibimapWith (,) (:) h p . coalg p
It seems that imapWith
could be the more appropriate definition of an indexed functor, if we want to consider the profunctoral perspective. Even more, although we have defined imapWith
using imap
, it can go in the other direction trivially using imap f = imapWith const f undefined
. However, imap
makes for a concise base case, and we will leave it as-is.
The end of the beginning
Our campfire is now embers, and the stars are overhead. Today, we have accomplished a lot, even though the significance of much of it has yet to be felt. Now, it is time to sleep on what we have learned - to digest it and let it become part of our learned self.
For anyone who has made it this far, congratulations - this marks the end of the beginning. All of this has been preparation, a foundation for connecting the many topics that we will be tackling in the days to come.
I hope you'll join me next time we explore the recursiverse, when we discuss sources and sourced morphisms, and the implications of imapWith
.
Where to get the code
You can find version 0.0.2
of the birecursion-schemes package here, which corresponds to this article.