Haskell: Indexed Recursion Schemes

Finding your way back after a recursive descent into madness

Indexing schemes

Today, we are going on a deep dive to show how recursion schemes may be extended by the addition of indexing. Last time, we defined over 30 related functions, and this time we're making indexed variants of all of them, so at least 30 more, plus any supporting functions we might need. It could be a descent into a labyrinth of madness - but don't worry, we will be smart as Theseus, with our own version of Ariadne's golden thread. No matter how deep we go, indices will guide us on our way back.

In a previous article, we developed version 0.0.1 of a birecursion-schemes package, focused on 5 core recursive functions: cata, ana, hylo, iso, and trans, plus their -M monadic variants. For each, we then derived their birecursive bi- and direcursive di- variants. This library has been updated to 0.0.2 following along with today's article.

As we go exploring, we are going to do our best to provide examples to illustrate things, so you should get comfortable and have a snack handy - it's going to be a long one.

As always, we start by asking questions.

What's in an index?

That which we call a rose, by any other index would smell as sweet.

So what exactly is an index? As always, etymology is our first resort. The word 'index' (plural 'indices') has its origins in latin, and its meaning remains largely unchanged: literally 'forefinger' as in your index or pointer finger, but also abstractly as 'sign' or 'pointer' or 'indicator'. It is derived from the verb 'indico': prefix 'in-', meaning 'towards', followed by 'dico', meaning 'to point out, make known, set apart'. Note the relationship between 'indices' and 'indico' and 'indicate'

Taken all together, an 'index' is a thing that points you in the right direction to where something can be made known: it guides you in a traversal of knowledge (that's quite the clue as to what we're going to be getting into). This is largely the same as its modern usage, even for niche-specific variants, a bit of linguistic preservation. We have been using indices for millenia to look up more specific data, from catalogues in roman libraries, to the the modern dewey decimal system, and now we are going to use indices to find our way within data structures.

Now, we could just say that an index tells us how to access a element of that data structure, and be done with it - but that isn't the whole story. You see, indexing can be a muddy affair if we are not precise with our terminology - sometimes you have an index like with Vector, other times a key as in Map, still other times a path like Json.

These terms all have colloquial use, but we're going to have to precisely define them - taking into account things like that an index may point to a child of the same type in a recursive structure, or it may point to content of a different type in a container - and that recursive containers have both children and content? It all seems to depend on how and what you are using indexing for - a change of perspective results in a change of emphasis, after all. Given that we're going to be mixing indexing with recursion, it's going to to pay off to be precise, otherwise we're going to get lost in the labyrinth.

Another concern is that sometimes the index is obviously related to the shape of the data structure - a Vector is indexed by integers that tell you exactly where each element is - but other times, the index is not obviously related to the shape of the data structure at all, and instead it functions more as an arbitrary label - a Map k is seemingly indexed by whatever you want. We have, seemingly in contention, two different perspectives of what an index is, and our explanation has to handle both.

This last point is interesting enough that it marks our starting point, and by clarifying that last point, we may begin our descent. Our first step is to split the concept of an index into two different groups - structural indices, and profunctoral keys.

Structural indices

Sometimes an index tells you exactly where to go.

In this case, an index is in some sense 'a reduction of the rest of the data structure' - sufficient instructions as to navigate quickly and efficiently to the desired value. The book is the fifth from the right, on the top shelf of the third row. We will call this the 'structural' perspective of indices.

It is not an alien concept, with Connor McBride having written the rather relevant The Derivative of a Regular Type is its Type of One-Hole Contexts. This term 'one-hole context' is an elegant way to refer to 'the rest of the data structure', and so we will adopt the terminology for that reason.

A structural index could be considered the minimal / forgetful 'one-hole context' , with all extraneous data not leading to the hole removed / replaced by () - aka Free StructureF () where everything not on 'the path' is implicitly (). The resulting structure is a spine or path that goes all the way from the hole back up to the top of the original data structure, and contains just enough information to navigate back down to it. I haven't made such lovely graphs as the one-hole-context paper yet, so you'll have to imagine tracing a line from the hole back to the top of the data structure, and clipping off everything else.

The most clear example is the simple List indexed by Nat.

data List elem = Nil  | Cons elem (List elem)
data Nat       = Zero | Succ      Nat

-- The index of 'c' is SSZ = 2
abc = Cons 'a'  (Cons 'b'  (Cons 'c' Nil))
--    Cons ()   (Cons ()   Nil)
--    Succ      (Succ      Zero)

We can clearly see that a Nat is a List that has forgotten every element before the one that we are indexing, but also every child list after. This explains why Nat is isomorphic to [()], and why Succ (Succ Zero) == [Unit,Unit]. Note that we should be able to define such an index for just about any sensible sum of products type.

However, we caution the reader to observe carefully that the derivation is not always so simple - for instance, Zero does not correspond to Nil but rather to the singular elem position within Cons. Some data structures have multiple functor content points. Take, for instance, a Bilist which contains two elements per Bicons.

Note that rather than explicitly fill with (), we'll just ignore whatever's there with a blank _ since it's a little easier on the eyes.

data Bit = Zit | Bit -- NOTE: Bit = Biunit aka Bool

data Binat = Bizero Bit | Bisucc Binat

data Bilist a = Binil | Bicons a a (Bilist a)

-- The index of 'c' is SZ(0), and the index of 'd' is SZ(1).
abcdef = Bicons 'a' 'b' (Bicons 'c' 'd' (Bicons 'e' 'f' Binil))
--       Bicons _   _   (Bicons 'c' _   _)      -- 'c'
--       Bisucc         (Bizero Zit)            -- SZ(0)
--       Bicons _   _   (Bicons _ 'd'   _)      -- 'd'
--       Bisucc         (Bizero   Bit)          -- SZ(1)

Another pitfall is multiple points of recursion - this also affects how we derive our structural index:

data Dir = Left | Right -- NOTE: Dir == Bit

data Tree a = Tip | Node a (Tree a) (Tree a)

data Branch = Leaf | Limb Dir Branch -- NOTE: is just [Dir] or `([Dir], Unit) in disguise

-- The index of 'c' is `[Right]` or `Limb Right Leaf`
treeabc = Node 'a'   (Node 'b' Tip Tip) (Node 'c' Tip Tip)
--        Node _     _                  (Node 'c' _   _)
--        Limb Right                    Leaf

In fact we can see both interacting with a tree that contains both two content points and two recursion points.

data Bitree a = Bitip | Binode a a (Bitree a) (Bitree a)

data Bibranch = Bileaf Bit | Bilimb Dir Bibranch

-- The index of 'c' is `([Left], Zit)` or `Bilimb Left (Bileaf Zit)`
bitreeabcdef = Binode 'a' 'b' (Binode 'c' 'd' Bitip Bitip) (Binode 'e' 'f' Bitip Bitip)
--             Binode _   _   (Binode 'c' _   _     _    ) _
--             Bilimb Left    (Bileaf Zit)

Okay, something is funny - we started with the statement that an index is based on the structure of the original, and yet the index of a Bitree is list-like! In fact, they're all list-like!

It should be obvious now that indices are related to the form data Way path end = End end | Path path (Way path end) - and that the index of List should actually be Way Unit Unit because it only has one recursion point (hence the first Unit) and also only one element slot (hence the second). A Bilist has index Way Unit Bit, a Tree has index Way Dir Unit, and a Bitree has index Way Dir Bit. The Bitree case is especially illuminating because it shows that indexes always have zero or more "branches", followed by exactly one "leaf" - this exactly matches our examples. We can even use Way Void end to index data structures that explicitly aren't recursive!

What gives!? Well, we've traded off "replacing irrelevant data points with blanks" for "counting the number of positions filled with blanks before our data" It just so happens that with List, the two concepts coincide - because it has both one content slot, and one recursion point.

Way is still a reduction of the one-hole context - the rest of the data structure. path is a reduction of the recursive branches taken, and end is a reduction of the final branch containing the indexed value. This means that for our purposes, Way is just a different way of representing Free StructureF () (again, with the constraint that everything off-path is replaced by () or _) - one is efficient in the metric of 'shaped like the original data structure' and the other is efficient in the metric of 'has a compact representation'.

This 'filling blanks' vs 'counting blanks' doesn't matter, because we can always go back and forth. We can always count filled blanks, and having a count always tells us how many blanks to fill, and everything after the indexed element is either truncated or ignored.

NOTE: When it comes to an index for a single selected element, we can either fill blanks, or count blanks and it doesn't matter - but there are cases where it may start to matter for efficiency 'how' we reduce the rest of the data structure to an index. When it comes to distributed data and distributed indices, blanks are filled with proofs instead of (), and the difference between index and data becomes more like 'how much do you know'. There is a lot to think about.

For the majority of the article, we will focus on structural indices, as they form the foundation of today's work, and everything we just discussed will come in handy. But first, let us discuss for a moment the other perspective.

Profunctoral keys

Sometimes an index is a label that helps you search, because anything that you can turn into an label, is also label, and thus an index.

Rather than telling you exactly where something is, this type of index helps narrow it down, often in steps. The book is in the fiction section, between "Green" and "Grishom". Note that while structural indices are concrete, profunctoral keys are abstracted. This allows for indexing collections that involve factors like balancing - a profunctoral key doesn't tell you exactly where the book is, because that information is specific to a given library layout. If the library were to reorganize itself, a structural index would have to change - but a profunctoral key would not have to.

This makes profunctoral keys great for organized piles where you don't always know where things are, but you do know where things aren't - and when you know that 'all other values are somewhere else', you then know that 'if there is any value here, it is the one that I seek'.

We call it profunctoral in reference to:

class Profunctor (f :: * -> * -> *) where
    promap :: (a -> b) -> (c -> d) -> p b c -> p a d -- aka Profunctor.dimap
    premap :: (a -> b) -> p b c -> p a c             -- aka Profunctor.lmap
    postmap :: (b -> c) -> p a b -> p a c            -- aka Profunctor.rmap

Specifically, we can use premap / lmap to change the type of a key in data structures like Map k, eg if you have a f :: k2 -> k you can turn the m :: Map k into a m2 :: Map k2, because you can on-the-fly turn all k2s into ks immediately before using them using m2 = premap f m.

Profunctoral keys often work by offloading some of the organizational logic to a constraint, such Ord, which is why Map requires such a constraint - it is masking that, inside, it turns k into something else to navigate (like an Int). We can even go as far as to use just an Eq constraint - if Ord k => Map k a is an organized pile, then Eq k => AssocList k a is an unorganized pile, and yet you can still, albeit with a loss of efficiency. If a structural index gets its organization factor from the data structure, a profunctoral index gets its organization from the key.

An appropriate example of a profunctoral key might be IntMap, where the true index isn't Int but rather (Min Int, Max Int), because we start at the top with an index of (-Infinity, Infinity) and narrow it down via (min,max) as we descend recursively - our initial "key" gives us one of our bounds, and which branch we take tells us the other. We reduce (min,max) until we have min == max because that is where our value is located. It is only when we consider the entire recursive functor head structure of the IntMap that the index sums up the steps that we took to become an Int.

Profunctoral keys afford a great deal of flexibility and convenience, because they allow you to ignore the precise structure required to reach an element. This in turn mean that the keys can have other desirable properties such as being able to survive edits of a data structure due to the insertion or deletion of other elements, or balancing of the data structure - the same sort of indirection that makes pointers useful in C.

The abstraction of profunctoral keys is not free, however - it must necessarily build off of the structural indices, despite the two seemingly contradictory perspectives. Do not worry - at the end of this article, we will explain this contradiction and reveal how the profunctoral keys have been hidden inside the entire time.

Existing packages

With all of that in mind, we should take a quick look at the existing ecosystem - it will help us decide on how to go about implementing this sort of logic. At this point, we have enough information to more clearly search for existing libraries, modules, and classes - and we find ourselves a scattered but mostly-consistent ecosystem. This could all use a good deal of unification, but for now let us see what we have discovered.

In no particular order, we have:

containers & vector

Several commonly-used modules (such as Map and Vector) have functions specific to their data structures.

Vector.imap :: (Int -> a -> b) -> Vector a -> Vector b 
Map.mapWithKey :: (k -> a -> b) -> Map k a -> Map k b 

If we generalize, we get imap :: (i -> a -> b) -> f a -> f b, which is a function type that we can use in a class definition.

lens

The lens package provides a class for indexed functors, using FunctionalDependencies:

-- lens: Control.Lens.Combinators
class Functor f => FunctorWithIndex i (f :: Type -> Type) | f -> i where
    imap :: (i -> a -> b) -> f a -> f b 

indexed-traversable

The indexed-traversable package has a practically identical definition - unsurprising as it was extracted from lens in the first place.

-- indexed-traversable: Data.Functor.WithIndex
class Functor f => FunctorWithIndex i f | f -> i where
    imap :: (i -> a -> b) -> f a -> f b

Notably, it also contains FoldableWithIndex and TraversableWithIndex - we'll keep these in mind.

keys

The keys package is a package that uses TypeFamilies instead of FunctionalDependencies. However, it is very minimal, and also deprecated.

-- keys: Data.Key
type family Key (f :: * -> *)
class Functor f => Keyed f where
    mapWithKey :: (Key f -> a -> b) -> f a -> f b 

mono-traversable-keys

An honorable mention is mono-traversable-keys, which is keys for monomorphic functors. It is worth noting for inclusion, but for now we're not going to be exploring monomorphic functors, even though they do relate to recursives through a recursive structure being a monomorphic functor where its elements are the same type as itself.

-- mono-traversable-keys: Data.MonoTraversable.Keys
type family MonoKey key 
class MonoFunctor mono => MonoKeyed mono where
    omapWithKey :: (MonoKey mono -> Element mono -> Element mono) -> mono -> mono

base

Another honorable mention is Data.Ix in base. It isn't exactly what we want but we can sense a relation because the class is used for indexing contiguous arrays while still allowing the index type to be abstracted to anything orderable + rangeable.

-- base: Data.Ix
class Ord a => Ix a where
    range :: (a, a) -> [a]
    index :: (a, a) -> a -> Int
    inRange :: (a, a) -> a -> Bool

Remember IntMap secretly operating over (min,max) rather than Int? That notion is expressed here in class form.

adjunctions

Finally, we have a really interesting class found in adjunctions - we really won't get anywhere near this for a while, but nonetheless:

-- adjunctions: Data.Functor.Rep
class Distributive f => Representable f where
    type Rep f :: *
    tabulate :: (Rep f -> a) -> f a
    index :: f a -> Rep f -> a 

We'll find Representable to be the subject of future discussion, but I have not explored the relation deep enough to say anything of depth.

Kleisli arrows of outrageous fortune

Beyond that, there is the paper Kleisli arrows of outrageous fortune by Connor McBride (again) involving Atkey-style indexed / parameterized functor, which will be relevant if we wish to generalize from type-specific indices and paths, to a parametric / type argument index. However, in keeping with 'start simple', we'll get to that when we get to it.

recursion-schemes-ix

Finally there is already technically a library for indexed recursion schemes but it uses singletons and is a bit too abstracted for my taste - remember, we don't want to get lost in the madness, and I'm not even sure that it does everything I need. I'm looking for a solution that provides clarity and more compatibility with the way that existing indexed functor packages work, in order to aid in our exploration of this space.

Indexed functors and foldables

With examples of the existing ecosystem fresh in our minds, we are now going to define our own classes for indexing, based on what we just covered and also the opinions developed during use.

Indexed and indexed

We start with the Index type, and its accompanying class, Indexed which provides the function indexed :: f a -> f (Index f, a), which tuples up content with its index.

type family Index (f :: * -> *)

class Indexed f where
    indexed :: f a -> f (Index f, a)
    default indexed :: (IndexedFunctor f) => f a -> f (Index f, a)
    indexed = imap (,)

All that indexed does is annotate an element with its index - this is rarely the function that we want to use, but it is good to have for the sake of a proper hierarchy.

Note that there is a default implementation for indexed, which relies on our next class, IndexedFunctor. We have opted however to keep the two classes separate, and use Indexed as a superclass for IndexedFunctor.

IndexedFunctor and imap

Our next class, IndexedFunctor, combines both Indexed and Functor to provide the function imap :: (Index f -> a -> b) -> f a -> f b. The difference between imap and fmap is that we now get access to the index of a value while we are mapping.

class (Indexed f, Functor f) => IndexedFunctor f where
    imap :: (Index f -> a -> b) -> f a -> f b
    imap f = fmap (uncurry f) . indexed

Note that imap also has a default implementation - and the default implementations of indexed and imap rely on each other. This means that we must actually implement at least one of them, preferably imap, for reasons that will be explained later.

IndexedFoldable and ifoldMap

Now, you may be asking, why have we declared Indexed and IndexedFunctor separately? It is because the next class, IndexedFoldable, does not rely on Functor. Instead, it combines both Indexed and Foldable to provide the function ifoldMap :: (Monoid m) => (Index f -> a -> m) -> f a -> m . Like imap, with ifoldMap we now get access to the index of a value while we are folding.

class (Indexed f, Foldable f) => IndexedFoldable f where
    ifoldMap :: (Monoid m) => (Index f -> a -> m) -> f a -> m 
    ifoldMap f = foldMap (uncurry f) . indexed

I do not think it is possible to define a default implementation of indexed in terms of ifoldMap (as we did with imap), even if we wanted - we would need fromList or some other way of constructing the specific type, which is not captured by Foldable.

In the name of preserving our immediate sanity, we have opted to leave defining ifoldr and other indexed foldable functions for later, as they are beyond the necessary scope of this already-overburdened article.

IndexedTraversable and itraverse

Finally, to complete the trio, we of course have IndexedTraversable. Just like Functor and Foldable, if something is both an IndexedFunctor and an IndexedFoldable, it should probably also be an IndexedTraversable (and vice versa), even though it only actually requires Indexed and Traversable constraints.

class (Indexed t, Traversable t) => IndexedTraversable t where
    itraverse :: Applicative f => (Index t -> a -> f b) -> t a -> f (t b) 
    itraverse f = traverse (uncurry f) . indexed

NOTE: The class constraints here should possibly be (IndexedFunctor t, IndexedFoldable t, Traversable t) instead - we're still figuring out exactly how weak / strong we want these connections to be.

Ditto on leaving imapM and friends undefined for now.

Notes on default implementations

Note that we can choose to implement indexed, and get imap, ifoldMap, and itraverse for free; however, it is strongly preferred that indexed use the default implemention from imap, and that we implement imap, ifoldMap, and itraverse, in order to only perform one map / traversal of the structure and fuse away the default indexed's inner fmap. We need this fusing not just for efficiency, but also out of concern for the need to avoid fetching the entire data structure to index-map it all at once - this will become more important later.

Time for a break

Go take a break, stretch your legs or something. You've earned it, having survived the theoretical and philosophical discussion. All of the stuff above, is more or less known, but we're about to cross into the unknown, so you'll want your energy.

Indexed recursives

With indexed functors out of the way, it seems now we should try and take the next logical step, and define... indexed recursives? Sure, lets take a stab at that and see what we get. For perspective, remember that the default implementation of imap was fmap (uncurry f) . indexed, so we can work off of that.

For our first attempt, we might be tempted to try and define indexed versions of hylo and friends:

icata :: (Recursive a, Indexed (Base a)) => (Base a (Index (Base a), b) -> b) -> a -> b
icata alg = cata (alg . indexed)

ihylo :: (Indexed f, Functor f) => (f (Index f, b) -> b) -> (a -> f a) -> a -> b
ihylo alg = hylo (alg . indexed)

However, this definition uses indexed rather than imap, which is preferred due to our pattern of extending hylo by adding new capabilities to fmap. We can fuse the indexed into the hylo by implementing it using imap instead of fmap, at the cost of the constraint now requiring IndexedFunctor:

icata :: (Recursive a, IndexedFunctor (Base a)) => (Base a (Index (Base a), b) -> b) -> a -> b
icata alg = ihylo alg project

ihylo :: IndexedFunctor f => (f (Index f, b) -> b) -> (a -> f a) -> a -> b
ihylo alg coalg = h where h = alg . imap (curry (second h)) . coalg

So we have a few functions that seem sensible - but if we take a closer look, and one finds that this lacks our desired symmetry.

Note that the type definition requires a tuple (Index f, b) rather than a curried function such as Index f -> f b -> b - and that the index is entirely absent on the ana / coalg side. Indeed, the notion of an index only matters on the cata / alg side - we could pass an index to the child but it does not matter because it must immediately discard it. Indeed, it seems there is no sensible iana that doesn't just drop them immediately, like this:

iana :: (Corecursive t, Indexed (Base t)) => (a -> Base t (Index (Base t), a)) -> a -> t
iana = hylo embed . fmap (fmap snd)

Something else, with a type definition like (Index (Base t) -> a -> Base t a)) -> Index (Base t) -> a -> t or (Index (Base t) -> a -> Base t (Index (Base t), a)) -> Index (Base t) -> a -> t hardly seems better. Why does it appear that having the index only matters when ascending back up, only after - but not while - we have descended recursively? If we look more closely at those horrid attempts to create a non-trivial iana, the answer is already given - ... -> Index (Base t) -> a -> t.

We would have to provide an index for the top-level - but there is none. An index is about a child's position within a parent, but a top level can have no index, being without a parent, and as we unfold, we must treat each child as a top-level - and thus the entire descent is without indexing. What we have here is not so much a golden thread telling us how to get back, but rather one telling us what the next steps forward could be - and it is only once we have traversed them that they collectively become a part of the way back.

That isn't to say that the icata and ihylo functions aren't useful, merely that they are in fact degenerate or trivial versions of the functions of what we seek, and that we have to keep working to derive what we want.

In particular, icata and ihylo can be used for simple catamorphic hashing of JSON based on its structure, and not just by hashing a JSON-encoded String, as you so often see in naive the crypto space. This can be used to de-duplicate, parallelize, diff, and patch a stream of on-demand JSON fragments, being the most simple variant of a series of functions which we will get into when this article series finishes with the foundations and begins focusing on the practical applications.

Nevertheless, it seems we have only half the story - we are missing something.

Indexed base functors

If we take the time to remember the steps that we took to promote functors to recursives, it gives us a clue as to what we are missing. Index is clearly a functoral concept, but that means it only operates over a single level of a recursive at a time. hylo differentiates itself from fmap by operating over multiple levels, so its indices must also do the same. Our indexed recursives really need to have recursive indices, made out of the indices of the base functors leading back to the top level - a path of sorts.

Pathed recursives

This really should have been obvious from the get-go, what with all of the Index (Base t) going on, but now we can get on with what we were really trying to define in the first place - pathed recursives.

In fact, lets define a few quick alias:

type Path (r :: *) = [Index (Base r)]

type PathedRecursive a = (Recursive a, IndexedFunctor (Base a))
type PathedCorecursive a = (Corecursive a, IndexedFunctor (Base a))

Note that while a functor has an Index, a recursive has a Path, made up of a sequence of base indices. An index and a path are separate but related concepts, because a path directs you along the internal structure of a recursive, while an index points all the way to a functor content slot. It helps to think of a path as potentially ending 'inside' of the data structure, while an index goes 'all the way through', ending 'at the edge or boundary' between the functor and its content. If the functor is itself recursive, then by necessity its indices will contain a path as part of its structure, in some sort of index-path - but that's getting ahead of ourselves.

We have also created the PathedRecursive and PathedCorecursive types by upgrading from a Functor (Base a) constraint (implicit in the Recursive constraint) to an IndexedFunctor (Base a) constraint.

We're defining these as type aliases for now, rather than as type families / classes, but we can always go back and change that - and we might actually need to later when we get into sourced paths, once we solidify things. For the moment, this suffices, so let us proceed to trying to define pathed hylo and friends again, with what we have just newly learned.

Pathed morphisms

Whereas before, we have i- functions to denote indexed-, here we will have p- functions, to denote pathed-, and we will start with pcata since we know it will be meaningful. But how do we promote icata to pcata if we don't even know what the type of pcata is yet?

pcata and pana

Well, what are the changes in our understanding? Well, for one, we want to keep track of the path of indices that we have encountered before and pass that in at every level, rather than the indices of children which we haven't encountered yet. And secondly, it should be compatible with an upgrade from fmap to imap - so vaguely it needs to be like h = alg . imap h . coalg, except we need to pass in a path parameter somewhere.

In other words, for pcata we're actually looking for an algorithm with the form path -> f b -> b, instead of f (index, a) -> a. Having the index inside of the functor (as in f (i, a) -> a) is entirely unnecessary, as we can combine the index into the path immediately before passing that along with the child to the next recursion - meaning that pcata itself also needs an initial path -> argument after the alg argument. Additionally, by symmetry we now have a sensible co-algorithm form for pana - path -> a -> f a - and with this form, the path affects the unfolding in a meaningful way!

Another thing that must be true of our desired set of functions is that, if we ignore the index / path information entirely, it should be the same as plain hylo and friend - that is, hylo alg coalg = phylo (const alg) (const coalg) undefined. This actually tells us enough to implement both pcata and pana in terms of phylo, even though we haven't defined phylo yet!

Thus, we can define pcata and pana as having the same type as cata and ana, but simply having added Path a -> to a few places, and also using const to let project and embed ignore this new parameter when appropriate.

-- cata ::     Recursive t => (          Base t a -> a) ->           t -> a
pcata :: PathedRecursive t => (Path t -> Base t a -> a) -> Path t -> t -> a
pcata alg  = phylo alg (const project)

-- ana ::     Corecursive t => (          a -> Base t a) ->           a -> t
pana :: PathedCorecursive t => (Path t -> a -> Base t a) -> Path t -> a -> t
pana coalg  = phylo (const embed) coalg

phylo

With that in place, we are free to define the type of phylo. We know it must be something similar to pana and pcata, but since we are back to the level of indexed functors we have slightly different constraints - IndexedFunctor f rather than PathedRecursive a or PathedCorecursive a - and so we need use [Index f] rather than Path (since Path is defined as Path r = [Index (Base r)]). Otherwise, we can clearly see our desired function now taking shape:

-- hylo ::      Functor f => (             f b -> b) -> (             a -> f a)              -> a -> b
phylo :: IndexedFunctor f => ([Index f] -> f b -> b) -> ([Index f] -> a -> f a) -> [Index f] -> a -> b
phylo = _

However, the question is now implementing phylo. Compared to hylo, phylo and its alg and coalg each all require an additional parameter. Rather than h = alg . fmap h . coalg, we need something like h p = alg p . imap h p . coalg p, except that clearly doesn't work because imap doesn't take a path argument, and we want to join it with the index anyway which this doesn't do.

We need to join the path p with the index i inside of the imap's argument, like imap (h p) instead of imap h p. A little bit of playing around to join index onto a path, and we get something sensible.

phylo :: IndexedFunctor f => ([Index f] -> f b -> b) -> ([Index f] -> a -> f a) -> [Index f] -> a -> b
phylo   alg coalg = h where h p = alg p . imap (\ i -> h (i:p)) . coalg p
-- hylo alg coalg = h where h   = alg   . fmap         h        . alg

This is actually what we've been after - we have successfully defined a pathed hylomorphism - something that, while we are traversing it, knows where we came from. This is pretty sweet! It is as powerful as hylo, but we also have a nice golden thread to help find our way back out of the labyrinth.

Doing something like using Cofree to annotate a data structure with its path is now trivial!

annotatePaths :: Json -> Cofree JsonF JsonPath
annotatePaths = pcata (:<) []

NOTE: It is worth mentioning that paths are formed in reverse, with the most recent index on top and root at te end - eg, /path/to/file.md yields ["file.md", "to", "path", "/"].

In general, these functions do everything that hylo does, with the addition of its path being available to it - so these are morphisms that know where they are!. This makes it possible to untie a structure, breaking it into shards and fragments that we can collect in a list or dictionary.

type Frag r = Base r (Path r)
type Shard r = (Path r, Frag r) -- Or CofreeF (Base r) (Path r) (Path r)

shatterList :: (Recursive r, IndexedFunctor (Base r), Foldable (Base r)) => r -> [Shard r]
shatterList r = snd $ pcata shatter [] r where
    -- shatter :: Path r -> Base r (Path r, [Shard r]) -> (Path r, [Shard r])
    shatter p fs = (p, (p, fmap fst fs) : concatMap snd fs)

shatterMap :: (Recursive r, IndexedFunctor (Base r), Foldable (Base r)) => r -> Map (Path r) (Frag r)
shatterMap = Map.fromList . shatterList

Reconstituting the original object is trivial, and we can imagine this being useful for turning recursive objects into streams of fragments, allowing you to avoid sending megabytes of of JSON when a few kilobytes would do.

Note that pathed morphisms are also very useful when you want to differentiate identical values that differ only by their position within a data structure. Take for instance the following JSON:

{
    "account" : {
        "balance": 500
    },
    "address" : {
        "streetno" : 500
    }
}

Note that the values of account.balance and address.streetno are both 500 - but they are in a sense different 500's. You might use types to differentiate a bank balance from a street number, but then what if you have two bank accounts? Distinguishing by type is no longer enough. You see, the location of data can be just as important as the type of the data (in fact these notions can be related more strongly if we wish).

Lets say we were cryptographically hashing this data. With simple catamorphic hashing, we might accidentally reveal that the bank account and street number are equivalent, and this is of course very bad. This would happen because we would start at the bottom, and we would see 500 as a base case on two separate occasions, hash it individually twice, and get the same hash value for each both times. This is because when hashing from the bottom up, we of course don't include any of the 'up', and so as far as those two 'different' 500 values go, they are exactly the same. hash(500) is the same everywhere.

Note that this property isn't bad by itself - it turns out that this property enables deduplication, at the cost of revealing similarities between data because thats what deduplication is. So this property can be valuable, especially for public data.

However, with pathed catamorphic hashing, we have a chance to include the path in the hashing. Instead, our base cases for hashing would be, not hash(500) and hash(500), but rather hash(account.balance.500) and hash(address.streetno.500). The inclusion of the path differentiates the two, and makes the resulting hash of each value unique to not just its structural value, but also its location within a larger structure. With this, revealing that account.balance = 500 does not reveal that address.streetno = 500.

So, sometimes location matters, and when it does, 'morphisms that know where they are' are especially useful.

Monadic pathed morphisms

We may now go about defining monadic variants, if we like. Anything that a pathed morphism can do, we can now do monadically. They are of course especially useful if we need to perform IO while traversing our recursive structure. Otherwise, everything we just said for pathed morphisms, just slap a monad on it.

pcataM

pcataM
    :: (PathedRecursive a, IndexedTraversable (Base a), Monad m)
    => (Path a -> Base a b -> m b)
    -> Path a -> a -> m b
pcataM alg  = phyloM alg (const projectM)

We could use pcataM here to simply print every path of a JSON object, from the bottom up while folding:

printJsonPaths :: JsonPath -> Json -> IO ()
printJsonPaths = pcataM (const . print)

panaM

panaM
    :: (PathedCorecursive b, IndexedTraversable (Base b), Monad m)
    => (Path b -> a -> m (Base b a))
    -> Path b -> a -> m b
panaM coalg  = phyloM (const embedM) coalg

We could use panaM to recursively fetch fragments of a JSON object in a way that combines the path with a digest.

fetchJson :: (JsonPath -> Digest -> IO (JsonF Digest)) -> JsonPath -> Digest -> IO Json
fetchJson = panaM

phyloM

phyloM
    :: (Monad m, IndexedTraversable t)
    => ([Index t] -> t b -> m b)
    -> ([Index t] -> a -> m (t a))
    -> [Index t] -> a -> m b
phyloM alg coalg = h where h p = alg p <=< itraverse (\ i -> h (i:p)) <=< coalg p

phyloM is quite exciting because can use the path in the alg and coalg to perform IO while both unfolding (reading the next chunk of data from a file or network) and folding (writing updated chunks out). Given the amount of IO involving relative paths, this function can come in quite handy if you build yourself a FileTree / FileTreeF data structure.

data FileTree = File ByteString | Directory (Map FileName FileTree)
data FileTreeF a = ...

cloneFilesToDisk
    :: (FilePath -> FileTreeF () -> IO ())              -- Write to disk
    -> (FilePath -> Digest -> IO (FileTreeF Digest))    -- Fetch file tree fragment
    -> FilePath -> Digest -> IO ()
cloneFilesToDisk = phyloM

Special pathed morphisms

I suppose we should define p- equivalents for iso and trans, and their monadic -M variants too.

Unsurprisingly, both piso and pisoM are trivial, as ignoring the path argument just makes them equivalent to iso and isoM. However, ptrans and ptransM are not trivial - because the path argument makes it to the natural transformation argument, it is able to have an effect on the resulting structure.

piso

piso :: (Iso s t, IndexedFunctor (Base s)) => Path s -> s -> t
piso   = phylo (const embed) (const project)
-- iso = hylo  (      embed( (      project)

This function is trivial, and equivalent to const iso.

ptrans

ptrans
    :: (PathedRecursive s, Corecursive t)
    => (Path s -> forall a. Base s a -> Base t a)
    -> Path s -> s -> t
-- trans f = hylo  (       embed . f  ) (      project)
ptrans   f = phylo (\ p -> embed . f p) (const project)

This function is non-trivial, because the path may affect the transformation.

pisoM

pisoM :: (Monad m, Iso s t, IndexedTraversable (Base s)) => Path s -> s -> m t
pisoM = phyloM (const embedM) (const projectM)

This function is trivial, and equivalent to const isoM.

ptransM

ptransM
    :: (Monad m, PathedRecursive s, Corecursive t, IndexedTraversable (Base s))
    => (Path s -> forall a. Base s a -> m (Base t a))
    -> Path s -> s -> m t
ptransM f = phyloM (\ p -> embedM <=< f p) (const projectM)

This function is non-trivial, because the path may affect the transformation.

Biindexed bifunctors and bifoldables

Okay, so that wasn't too bad. But this is birecursion-schemes, and so you knew it was coming - biindexed bifunctors. Luckily, this is going to be very easy, because we did all of the hard work to get pathed recursion schemes, we've just got to combine the methods of lifting from recursion to birecursion, with the method of lifting to indexed / pathed recursion. That means we've got two indices now. Oh no! Anyway...

First, we define our index type families.

type family FirstIndex (f :: * -> * -> *) :: *
type family SecondIndex (f :: * -> * -> *) :: *
type Biindex f = Either (FirstIndex f) (SecondIndex f)

Note that forall a . Index (f a) == SecondIndex f since forall a . Functor (f a) => Bifunctor f (assuming the bifunctor-superclass proposal, which we haven't quite yet applied)

What's this Biindex type? Well, since birecursion means that we recurse twice, and so when we are recursing, the resulting path may contain a mix of FirstIndex and SecondIndex depending on whether the recursion went through the First slot or the Second slot at a given level.

Trust me, it'll make more sense as we go on. On with the show!

Biindexed and biindexed

If indexed is just imap (,), then biindexed is just ibimap (,) (,).

class Biindexed f where
    biindexed :: f a b -> f (FirstIndex f, a) (SecondIndex f, b)
    default biindexed :: (BiindexedBifunctor f) => f a b -> f (FirstIndex f, a) (SecondIndex f, b)
    biindexed = ibimap (,) (,)

Again, we have a default implementation in terms of ibimap, but the same thoughts as before regarding fusing still apply.

BiindexedBifunctor and ibimap

If imap f = fmap (uncurry f) . indexed, then ibimap f g = bimap (uncurry f) (uncurry g) . biindexed

class (Biindexed f, Bifunctor f) => BiindexedBifunctor f where
    ibimap :: (FirstIndex f -> a -> b) -> (SecondIndex f -> c -> d) -> f a c -> f b d
    ibimap f g = bimap (uncurry f) (uncurry g) . biindexed

BiindexedBifoldable and ibifoldMap

Now we really are starting to harvest the result of our efforts earlier. We upgrade foldMap (uncurry f) . indexed to bifoldMap (uncurry f) (uncurry g) . biindexed - because we took the time to define IndexedFoldable, we can define BiindexedBifoldable just as easily!

class (Biindexed f, Bifoldable f) => BiindexedBifoldable f where
    ibifoldMap :: (Monoid m) => (FirstIndex f -> a -> m) -> (SecondIndex f -> b -> m) -> f a b -> m
    ibifoldMap f g = bifoldMap (uncurry f) (uncurry g) . biindexed

Our types are beginning to grow unseasonably large, however.

BiindexedBitraversable and ibitraverse

BiindexedBitraversable is again just so much more purely mechanical lifting.

class (Biindexed t, Bitraversable t) => BiindexedBitraversable t where
    ibitraverse
        :: Applicative f
        => (FirstIndex t -> a -> f c)
        -> (SecondIndex t -> b -> f d)
        -> t a b -> f (t c d)
    ibitraverse f g = bitraverse (uncurry f) (uncurry g) . biindexed

As far as I can tell, while indexed-traversable as a package exists, biindexed-bitraversable does not, and we are firmly in novel territory here. However, these are necesary for... are you ready for it? Bipathed birecursion!

Biindexed birecursives

This time, we're going to skip the i- variants entirely, and get straight to the p- pathed variants.

Biindexed bibase bifunctors

Since before we had indexed base functors, now we have biindexed bibase bifunctors. Oh dear, what terribly babbling terminology.

Bipathed birecursives

Just as before, we define our types, this time lifted from both recursive to birecursive and indexed to biindexed.

type Bipath (r :: *) = [Biindex (Bibase r)]

--   PathedRecursive     t = (  Recursive t, IndexedFunctor     (Base t))
type BipathedBirecursive t = (Birecursive t, BiindexedBifunctor (Bibase t))
type BipathedCobirecursive t = (Cobirecursive t, BiindexedBifunctor (Bibase t))

Our golden thread still shows us our way back, however - and it tells us which doors we passed through on the way - the first recursion or the second.

Again, these are just type aliases for now.

Bipathed bimorphisms

NOTE: I am unsure of whether our use of the term 'bimorphism' meets the definition a 'bimorphism' as "A morphism which is both a monomorphism and an epimorphism.". It may clash, or it may turn out to be equivalent, if we squint at it hard enough. I haven't investigated yet.

We're going to get through these bimorphisms real quick, because effectively all that they do is take imap / itraverse and lift to ibimap / ibitraverse, Path to Bipath, etc. Lots of things just get bi- shoved in front. Anything that isnt straightforward will get called out.

pbicata

Just pcata with bi- plastered everywhere.

pbicata                                 --  pcata
    :: (BipathedBirecursive t)          --      :: PathedRecursive t
    => (Bipath t -> Bibase t a a -> a)  --      => (Path t -> Base t a -> a)
    -> Bipath t -> t -> a               --      -> Path t -> t -> a
pbicata alg = pbihylo alg (const biproject)
-- pcata alg = phylo  alg (const   project)

Note that Bibase t a a takes an extra a argument compared to Base t a, because it has a base bifunctor.

pbiana

Just pana with bi- plastered everywhere.

pbiana                                  --  pana
    :: (BipathedCobirecursive t)        --      :: PathedCorecursive t
    => (Bipath t -> a -> Bibase t a a)  --      => (Path t -> a -> Base t a)
    -> Bipath t -> a -> t               --      -> Path t -> a -> t
pbiana coalg = pbihylo (const biembed) coalg
-- pana coalg = phylo  (const   embed) coalg

pbihylo

Remember that Biindex f = Either (FirstIndex f) (SecondIndex f)? This is why!

pbihylo                             --  phylo
    :: (BiindexedBifunctor f)       --      :: IndexedFunctor f 
    => ([Biindex f] -> f b b -> b)  --      => ([Index f] -> f b -> b)
    -> ([Biindex f] -> a -> f a a)  --      -> ([Index f] -> a -> f a)
    -> [Biindex f] -> a -> b        --      -> [Index f] -> a -> b
pbihylo alg coalg = h where
    h p = alg p . ibimap (\ i -> h (Left i : p)) (\ j -> h (Right j : p)) . coalg p
-- phylo
--      = alg p . imap                           (\ i -> h (      i : p))       . coalg p

With fmap to bimap, we just need to pop h into both slots, but with imap to ibimap, we have to deal with that FirstIndex is not necessarily the same as SecondIndex. As a result, instead of just copying (\ i -> h (i : p)) into both recursion slots, we have to differentiate by using Left and Right, as Bipath uses Biindex which contains an Either. Otherwise, we're just doing the same path-combining as with phylo, using Left i : p and Right j : p instead of just i : p.

Can you imagine this without the type-aliases? And yet, we can still clearly see the original hylo hidden inside.

Monadic bipathed bimorphisms

These just lift what just did, to also be monadic. So, ibimap becomes ibitraverse, . becomes <=<, and we add a few -Ms in places. Otherwise, noting of note, just rote.

pbicataM

pbicataM
    :: (Monad m, Birecursive t, BiindexedBifunctor (Bibase t), BiindexedBitraversable (Bibase t))
    => (Bipath t -> Bibase t a a -> m a)
    -> Bipath t -> t -> m a
pbicataM alg = pbihyloM alg (const biprojectM)

pbianaM

pbianaM
    :: (Monad m, Cobirecursive t, BiindexedBifunctor (Bibase t), BiindexedBitraversable (Bibase t))
    => (Bipath t -> a -> m (Bibase t a a))
    -> Bipath t -> a -> m t
pbianaM coalg = pbihyloM (const biembedM) coalg

pbihyloM

pbihyloM
    :: (Monad m, BiindexedBifunctor f, BiindexedBitraversable f)
    => ([Biindex f] -> f b b -> m b)
    -> ([Biindex f] -> a -> m (f a a))
    -> [Biindex f] -> a -> m b
pbihyloM alg coalg = h where
    h p = alg p <=< ibitraverse (\ i -> h (Left i : p)) (\ j -> h (Right j : p)) <=< coalg p

Special bipathed bimorphisms

For completeness, of course.

pbiiso

Trivially const biiso.

pbiiso :: (BipathedBirecursive s, Biiso s t) => Bipath s -> s -> t
pbiiso = pbihylo (const biembed) (const biproject)

pbitrans

Not trivial.

pbitrans
    :: (BipathedBirecursive s, Cobirecursive t)
    => (Bipath s -> forall a b. Bibase s a b -> Bibase t a b) -> Bipath s -> s -> t
pbitrans f = pbihylo (\ p -> biembed . f p) (const biproject)

pbiisoM

Trivially const biisoM.

pbiisoM
    :: (Monad m, BipathedBirecursive s, Biiso s t, BiindexedBitraversable (Bibase s))
    => Bipath s -> s -> m t
pbiisoM = pbihyloM (const biembedM) (const biprojectM)

pbitransM

Not trivial.

pbitransM
    :: (Monad m, BipathedBirecursive s, Cobirecursive t, BiindexedBitraversable (Bibase s))
    => (Bipath s -> forall a b. Bibase s a b -> m (Bibase t a b))
    -> Bipath s -> s -> m t
pbitransM f = pbihyloM (\ p -> biembedM <=< f p) (const biprojectM)

Take another break.

You've just finished getting through not just pathed recursion, but bipathed birecursion! The end of this is in sight, after this, we just have to define what indexing and pathing mean for recursive functors.

Diindexed difunctors and difoldables, diindexed dibase bifunctors

In comparison to Indexed and Pathed, I have not yet created a Diindexed class, nor are there DiindexedDifunctor or DiindexedDifoldable or DiindexedDitraversable classes. You see, a Difunctor is just a Functor and a Recursive, and so it already has an Index from Functor and a Path from Recursive. Its base functor is a base bifunctor, and so Biindexed, BiindexedBifunctor, BiindexedBifoldable, and BiindexedBitraversable are already sufficient.

Dipathed direcursives

At last we arrive here - the summit of an exhausting journey, but also a fundamental building block, the first step in a much larger journey that I wish to take you on. Be proud of reaching this point with me, as it prepares you for things yet to come.

If we have indexed functors and biindexed bifunctors, and pathed recursives and bipathed birecursives, what do we get for difunctors aka recursive functors? Well, they have both an index, being a functor, as well as a path, being a recursive at the same time. In fact, remember that their dibase bifunctor is a base bifunctor that is also a functor for the fixed / recursive functor's content, instead of being a bibase bifunctor that recurses in both slots.

From this we know that a recursive functor's index must be made up of its base bifunctor's (bi)indices. Whereas for bipathed birecursion, our golden thread back could weave back and forth through both base bifunctor slots, now it threads through one base slot repeatedly - the path made up of one slot's indices, pointing repeatedly at more recursive structure - before finally weaving across through the other base slot once to finally point at a functor content element - the tip of the path, a house number on a street address, the last leg of a journey.

NOTE: Why would we bother with this concept? Well, it is in order to chain recursion - our goal is to stack / transform recursive functors, and that means we must do the same with their indices. If we have a compound type f (g (h a)), then we have a compound path (Index h, (Index g, (Index f, ()))), and if they are recursive, the paths must link up too, giving us ( (FirstIndex h, [SecondIndex h]), ((FirstIndex g, [SecondIndex g]), ((FirstIndex f, [SecondIndex f]), ()))). It is easier and clearer however to define it as Way (SecondIndex h) (FirstIndex h, Way (SecondIndex g) ...) and embed inner paths at the end of outer paths.

Let us define some helpful type aliases.

type Diindex (r :: * -> *) = (FirstIndex (Dibase r), [SecondIndex (Dibase r)]) -- NOTE: Should be ~ Index r
type Dipath (r :: * -> *) = [SecondIndex (Dibase r)]                           -- NOTE: Should be ~ Path (r a)

-- NOTE: These constraints often end up adding little, due to incomplete hierarchy
--  Eg, `Direcursive t, BiindexedBitraversable (Dibase t)` is sufficient as compared
--  to `PathedDirecursive t, Bitraversable (Dibase t)` which isn't.
type DipathedDirecursive t = (Direcursive t, BiindexedBifunctor (Dibase t))
type DipathedCodirecursive t = (Codirecursive t, BiindexedBifunctor (Dibase t))

Note that forall a . Path (t a) ~ Dipath t since forall a . Index (t a) ~ SecondIndex t, and that Diindex t ~ Index t too. Also note that (FirstIndex f, [SecondIndex f]) is just Way (SecondIndex f) (FirstIndex f) in reverse - it makes sense because Diindex is just Index, and Index is Way as we discovered early on. At the time we haven't completely solidified this hierarchy, but when we do, these equivalences should be more explicit, and should be enforced similarly to the bifunctor-superclass proposal - it is related after all.

So Dipath is just Path in bifunctor clothing, and Diindex is just Index because a recursive functor is already a functor - but what exactly is Diindex? Whereas Index is ignorant of what lay beneath, Diindex is aware of the internal structure of a recursive functor - it knows that it must take a Path made up of SecondIndex, and cap it off at the end with a FirstIndex, in order to make the Index.

For many recursive functors, there exists only one logical content slot per base constructor - in that case the FirstIndex is equivalent to (), and it can be ignored entirely. For these functors, Dipath f ~ Diindex f, and Index f ~ Path (f a).

data Identity a = Identity a
data List a = Nil | Cons a (List a)
data Tree a = Tip | Tree a (List a) (List a)
data BinTree  a = Leaf a | Node (BinTree a) (BinTree a)

However, for others with multiple content slots per case, FirstIndex actually requires some information to be stored.

--                     v FirstIndex  v SecondIndex
data BilistIx = Bizero Bool | Bisucc () BilistIx    -- Or: Way () Bool
data Bilist a = Binil | Bicons a a (Bilist a)

Bilist requires that SecondIndex BilistF ~ Unit because there is only one possible successor, and it requires that FirstIndex BilistF ~ Bool because each successor skips two content slots, and so the Diindex of a Bilist effectively counts by units of twos until it reaches the end, then uses the Bool to pick one of the two slots.

It is important to understand how sometimes FirstIndex (often) or SecondIndex (rarely) can be implicitly ignored sometimes - it will help you make sense of why sometimes a path is an index, and sometimes it is not.

Dipathed dimorphisms

NOTE: Here we have made up the term 'dimorphism' wholesale, as a cursory search of "category theory" with "dimorphism" yielded nothing of value.

Again, we're going to skip the i- variants entirely, and get straight to it.

pdicata

This is better explained through pdihylo, but we'll define it first.

pdicata
    :: (DipathedDirecursive f)
    => (Diindex f -> a -> b)
    -> (Dipath f -> Dibase f b d -> d)
    -> Dipath f -> f a -> d
pdicata f alg = pdihylo f alg (const diproject)
-- pcata  alg = phylo     alg (const   project)

Also, note that enforcing Functor, Recursive => Direcursive would allow for enforcing (Index f ~ Diindex f, forall a . Path (f a) ~ Dipath f) implicitly, allowing us to define pdicata using Index and Path. See here with those constraints made explicit:

pdicata
    :: (DipathedDirecursive f, Index f ~ Diindex f, Path (f a) ~ Dipath f)
    => (Index f -> a -> b)
    -> (Path (f a) -> Dibase f b d -> d)
    -> Path (f a) -> f a -> d
pdicata f alg = pdihylo f alg (const diproject)
-- pcata  alg = phylo     alg (const   project)

This could be the preferable way of defining this, once we tighten the hierarchy.

pdiana

Ditto everything said for pdicata.

pdiana
    :: (DipathedCodirecursive f)
    => (Diindex f -> a -> b)
    -> (Dipath f -> c -> Dibase f a c)
    -> Dipath f -> c -> f b
pdiana f coalg = pdihylo f (const diembed) coalg
-- pana  coalg = phylo     (const   embed) coalg

pdihylo

If dihylo is hylo plus fmap, then pdihylo is phylo plus imap. This function gives you the power to do one thing going down the branches, a second thing to map the leaves, and a third thing coming back up - all with an index or path to tell you where you are.

pdihylo 
    :: BiindexedBifunctor f
    => ((FirstIndex f, [SecondIndex f]) -> a -> b)
    -> ([SecondIndex f] -> f b d -> d)
    -> ([SecondIndex f] -> c -> f a c)
    -> [SecondIndex f] -> c -> d
pdihylo f alg coalg = h where
    h p = alg p . ibimap (\ i -> f (     i , p)) (\ j -> h (      j : p)) . coalg p
-- phylo
--      = alg p . imap                           (\ i -> h (      i : p)) . coalg p
-- pbihylo
--      = alg p . ibimap (\ i -> h (Left i : p)) (\ j -> h (Right j : p)) . coalg p

Note the correlations and differences with phylo and pbihylo. We have upgraded to a base bifunctor, but because this is direcursion, we aren't upgrading to use Left and Right. Instead, we still copy (\ i -> f (i : p)) into both ibimap slots, but rather than adding Left and Right, we just replace the first slot's : with , to get (\ i -> f (i , p)) , while the second slot gets to stay \ i -> h (i : p) (well, we turned the i to a j but thats purely nomenclative).

Placing the FirstIndex to the left in (FirstIndex f, [SecondIndex f]) plays well with the understanding that the path is reversed, as by reading it from right to left, the FirstIndex forms the tip. /path/to/file.json becomes (file.json, ["to/", "path/", "/"]) (if file-names were considered as FirstIndex to directory-names as SecondIndex).

Note that for recursive functors such that FirstIndex ~ (), then Index f ~ forall a . Path (f a).

What can we do with pdihylo? Well, earlier we defined data FileTree = File | Directory (Map Name FileTree) as an example for phylo, but we could actually free up the filetype and turn FileTree into a recursive functor.

data FileTree a = File a | Directory (Map Name FileTree)
data FileTreeF a r = ...

One can imagine not just drilling down through the filetree, through the content of the file as well, in a single operation, such as FileTree Json with a combined filetree-json-path being made available to every json fragment - or breaking up a filetree-of-json into (filetree-or-json)-fragments, a deepening of the shard shattering that we performed with phylo because everything that we mentioned for phylo carries over to pdihylo.

It is possible to build a di- variant of the earlier shatterList and shatterMap functions, one which continues with shattering the functor content - making it possible to recursively shatter nested recursive functors and gather up all of the pieces. This requires a bit more finesse, so this definition is left for another time. Feel free to try and derive it yourself, however.

Applicative dipathed dimorphisms

Unlike Recursive and Birecursive, with di- we have access to non-trivial applicative variants. These of course are created by using ibitraverse like -M variants, but with . instead of <=<, and using fmap at the front.

Note that as with dihylo and friends, the applicative nature only occurs in the mapping transformation function, and not in the alg or coalg function - the Applicative nature is ignorant of the recursive nature just as much as Functor is. Compare (forall a . Recursive (f a), Functor f) => Difunctor f which understand that it is recursive, and can be broken up into (di)base functors, and so a recursive applicative might be broken up into (di)base applicatives. Concepts like diapplicative or even dimonad (recursive applicatives and monads) are definitely worth looking further into, to explore how the applicative or monadic nature as a whole is constructed from the fragments.

pdicataA

pdicataA
    :: (Applicative f, DipathedDirecursive t, BiindexedBitraversable (Dibase t))
    => (Diindex t -> a -> f b)
    -> (Dipath t -> Dibase t b c -> c)
    -> Dipath t -> t a -> f c
pdicataA f alg = pdihyloA f alg (const diproject)

pdianaA

pdianaA
    :: (Applicative f, DipathedCodirecursive t, BiindexedBitraversable (Dibase t))
    => (Diindex t -> a -> f b)
    -> (Dipath t -> c -> Dibase t a c)
    -> Dipath t -> c -> f (t b)
pdianaA f coalg = pdihyloA f (const diembed) coalg

pdihyloA

pdihyloA
    :: (Applicative f, BiindexedBitraversable t)
    => ((FirstIndex t, [SecondIndex t]) -> a -> f b)
    -> ([SecondIndex t] -> t b d -> d)
    -> ([SecondIndex t] -> c -> t a c)
    -> [SecondIndex t] -> c -> f d
pdihyloA f alg coalg = h where
    h p = fmap (alg p) . ibitraverse (\ i -> f (i,p)) (\ j -> h (j : p)) . coalg p

A minor correction about -A variants

It is worth mentioning, I was in error earlier when I stated that there was no applicative variants for mono- and bi- recursion. This of course is not strictly true, and it turns out that an applicative form is possible if we apply the fmap trick that we used in dihyloA, which is to promote fmap to traverse, while also wrapping alg with fmap before composition.

hyloA :: (Traversable t, Applicative f) => (t b -> b) -> (a -> t a) -> a -> f b
hyloA alg coalg = h where h = fmap alg . traverse h . coalg
-- hylo for comparison:            alg . fmap     h . coalg

It is pretty trivial though - f doesn't get used anywhere except the very end, unlike dihyloA. I would expect cataA* and anaA to be just as trivial. The di- variants are non-trivial even without indexing, because the applicative f does get used in the mapping transformation, which hyloA lacks.

However, the addition of indexing to the mix may make it such that phyloA and similar become non-trivial given that there is now a path with a chance to affect things. It may be worth revisiting in the future.

NOTE: * Not the same as cataA in recursion-schemes, which is not of the form (Base t a -> f a) -> t -> f a) but rather (Base t (f a) -> f a) -> t -> f a .

Monadic dipathed dimorphisms

If you get dipathed dimorphisms, and monadic morphisms, you'll get these.

pdicataM

pdicataM
    :: (Monad m, DipathedDirecursive f, BiindexedBitraversable (Dibase f))
    => (Diindex f -> a -> m b)
    -> (Dipath f -> Dibase f b d -> m d)
    -> Dipath f -> f a -> m d
pdicataM f alg  = pdihyloM f alg (const diprojectM)

pdianaM

pdianaM 
    :: (Monad m, DipathedCodirecursive f, BiindexedBitraversable (Dibase f))
    => (Diindex f -> a -> m b)
    -> (Dipath f -> c -> m (Dibase f a c))
    -> Dipath f -> c -> m (f b)
pdianaM f coalg  = pdihyloM f (const diembedM) coalg

pdihyloM

pdihyloM
    :: (Monad m, BiindexedBitraversable f)
    => ((FirstIndex f, [SecondIndex f]) -> a -> m b)
    -> ([SecondIndex f] -> f b d -> m d)
    -> ([SecondIndex f] -> c -> m (f a c))
    -> [SecondIndex f] -> c -> m d
pdihyloM f alg coalg = h where
    h p = alg p <=< ibitraverse (\ i -> f (i,p)) (\ j -> h (j : p)) <=< coalg p

An elegant weapon for a more civilized age.

Special dipathed dimorphisms

With recursion and birecursion, we found that some of our 'special' morphisms were trivial - variants of iso ended up not using the path at all.

This is not true for the special dipathed morphisms - none of them are trivial due to the use of the index / path in the f mapping transformation argument, and so the path argument may always affect the result. This makes them all worth defining - and we will discover something very interesting.

pdiiso

If you recall from the initial birecursion article, diiso lifts iso :: s -> t to a functor and adds a mapping argument (a -> b) -> s a -> t b. Extending it to be pathed is by now routine.

pdiiso
    :: (Diiso s t, BiindexedBifunctor (Dibase s))
    => (Diindex s -> a -> b)
    -> Dipath s -> s a -> t b
pdiiso f = pdihylo f (const diembed) (const diproject)

pditrans

pditrans
    :: (DipathedDirecursive s, Codirecursive t)
    => (Diindex s -> a -> b)
    -> (forall c . Dipath s -> Dibase s b c -> Dibase t b c)
    -> Dipath s -> s a -> t b
pditrans f g = pdihylo f (\ p -> diembed . g p) (const diproject)

pdiisoA

pdiisoA
    :: (Applicative f, Diiso s t, BiindexedBitraversable (Dibase s))
    => (Diindex s -> a -> f b)
    -> Dipath s -> s a -> f (t b)
pdiisoA f = pdihyloA f (const diembed) (const diproject)

pditransA

pditransA
    :: (Applicative f, DipathedDirecursive s, Codirecursive t, BiindexedBitraversable (Dibase s))
    => (Diindex s -> a -> f b)
    -> (forall c . Dipath s -> Dibase s b c -> Dibase t b c)
    -> Dipath s -> s a -> f (t b)
pditransA f g = pdihyloA f (\ p -> diembed . g p) (const diproject)

pdiisoM

pdiisoM
    :: (Monad m, Diiso s t, BiindexedBitraversable (Dibase s))
    => (Diindex s -> a -> m b)
    -> Dipath s -> s a -> m (t b)
pdiisoM f = pdihyloM f (const diembedM) (const diprojectM)

pditransM

pditransM
    :: (Monad m, DipathedDirecursive s, Codirecursive t, BiindexedBitraversable (Dibase s))
    => (Diindex s -> a -> m b)
    -> (forall c . Dipath s -> Dibase s b c -> m (Dibase t b c))
    -> Dipath s -> s a -> m (t b)
pditransM f g = pdihyloM f (\ p -> diembedM <=< g p) (const diprojectM)

Conclusion

As hard as it is to write about, it is even more difficult to end the writing - but end we must. We have covered enough ground for the day, we went deep into the labyrinth, and now it is time to come back out having discovered some wild things (shatterList and shatterMap especially). As we leave the labyrinth, winding up our golden thread, we arrive back where we started - the origin of our journey. It is time to make camp, and rest by the fire.

We started with asking a few questions, and it is fitting that we will end with answering one accordingly.

The profunctoral perspective

Whatever happened to the profunctoral perspective? I did say that I would get around to it, but we will need to go back to a few functions and give them a closer examination first.

I would now like to remind you of something that we discovered about diiso in the prior article - by restricting s ~ t, we found out that fmap = diiso. Therefore, we ask, what do we get if we restrict pdiiso the same way (and crush down types for readability since Diindex ~ Index and Dipath ~ Path)?

pdiiso_fmap
    :: (DipathedDirecursive f, DipathedCodirecursive f)
    => (Index f -> a -> b)
    -> Path f -> f a -> f b
pdiiso_fmap f = pdihylo f (const diembed) (const diproject)

If you were to say imap, you'd be wrong. Although , the resulting function has type (Index f -> a -> b) -> Path f -> f a -> f b - it has that additional Path f -> argument compared to imap, and so imap f = pdiiso f [].

On the other hand, if you were to say pmap (as a sort of pathed map) you'd be wrong too, because the mapping transformation function f takes an Index f and not a Path f - it is only if FirstIndex (Dibase f) ~ () (meaning the final constructor has a singular functor content slot) that Index f ~ ((), Path f) such that we get Index f ~ Path f by dropping the irrelevant ().

If it is neither imap nor pmap, it is instead both - it is an indexed, pathed map. We've got to remember that it isn't just fmap = diiso, it is recursive fmap = diiso, so it expresses both the functor nature and the recursive nature.

More importantly, it performed some type-changing operation on the path, which sounds like something we've been looking for. At the very end, it goes from Path to Index - but we did that. We did that when we implemented pdihylo. Has our profunctoral indices been hiding right under our noses the entire time?

Lets take a look! phylo concatenates indices Index f into a path [Index f], changing the type. pbihylo concatenates two types of indices FirstIndex f and SecondIndex f into a path of both [Either (SecondIndex f) (FirstIndex f)] - again changing the type. Finally, pdihylo concatenates one type of indices SecondIndex f into a path [SecondIndex f] (changing type once), before capping it with another index FirstIndex f to turn it into an index path (FirstIndex f, [SecondIndex f]) (changing type a second time).

How did we do that?

phylo alg coalg = h where
    h p = alg p . imap (\ i -> h (i : p)) . coalg p
    
pbihylo alg coalg = h where
    h p = alg p . ibimap (\ i -> h (Left i : p)) (\ j -> h (Right j : p)) . coalg p
    
pdihylo f alg coalg = h where
    h p = alg p . ibimap (\ i -> f (i , p)) (\ j -> h (j : p)) . coalg p

Can we abstract this? All we're doing is imap / ibimap with almost the exact same logic - in fact, we can zoom in on the centers of these functions, and pluck out their arguments to see how they are almost identical.

(\ i -> h (i       : p))    -- phylo
(\ i -> h (Left i  : p))    -- bihylo first
(\ j -> h (Right j : p))    -- bihylo second
(\ i -> f (i       , p))    -- dihylo first 
(\ j -> h (j       : p))    -- dihylo second

Really, these are all variations of the same form - we're performing a cons-like operation with the index and path, and we can extract the , and : arguments to get the generalized function:

(\ cons i -> h (i `cons` p))

With this, we can make a slightly different version of imap and ibimap, which we will call imapWith / ibimapWith:

imapWith
    :: (IndexedFunctor f)
    => (Index f -> path -> indexpath)           -- Cons
    -> (indexpath -> a -> b)
    -> path                                     -- Nil
    -> f a -> f b
imapWith cons f p = imap (\ i -> f (cons i p))

ibimapWith
    :: (BiindexedBifunctor f)
    => (FirstIndex f -> path -> findexpath)     -- Left Cons
    -> (SecondIndex f -> path -> sindexpath)    -- Right Cons
    -> (findexpath -> a -> b)
    -> (sindexpath -> c -> d)
    -> path                                     -- Nil
    -> f a c -> f b d
ibimapWith icons jcons f g p = ibimap (\ i -> f (icons i p)) (\ j -> g (jcons j p))

Note that imapWith / ibimapWith now take an additional cons argument(s), but also a path argument - like pdiiso_fmap earlier - except now the types are free. In fact, we have gone from implicitly working with lists (through use of :) to accepting arguments in the form of List's constructors, Nil and Cons. By exposing the cons-like arguments, we allow for the user to determine the actual type of the indices and paths freely! The profunctoral perspective, everybody!

Now, we can redefine phylo / pbihylo / pdihylo in terms of these new functions - and the definitions are much cleaner!

phylo alg coalg = h where
    h p = alg p . imapWith (:) h p . coalg p
pbihylo alg coalg = h where
    h p = alg p . ibimapWith ((:) . Left) ((:) . Right) h p . coalg p
pdihylo f alg coalg = h where
    h p = alg p . ibimapWith (,) (:) h p . coalg p

It seems that imapWith could be the more appropriate definition of an indexed functor, if we want to consider the profunctoral perspective. Even more, although we have defined imapWith using imap, it can go in the other direction trivially using imap f = imapWith const f undefined. However, imap makes for a concise base case, and we will leave it as-is.

The end of the beginning

Our campfire is now embers, and the stars are overhead. Today, we have accomplished a lot, even though the significance of much of it has yet to be felt. Now, it is time to sleep on what we have learned - to digest it and let it become part of our learned self.

For anyone who has made it this far, congratulations - this marks the end of the beginning. All of this has been preparation, a foundation for connecting the many topics that we will be tackling in the days to come.

I hope you'll join me next time we explore the recursiverse, when we discuss sources and sourced morphisms, and the implications of imapWith.

Where to get the code

You can find version 0.0.2 of the birecursion-schemes package here, which corresponds to this article.