Haskell: Sourced Recursion Schemes

Up through the recursiverse

This is the third article in a series about recursion, focused on developing the birecursion-schemes library, although it is starting to grow beyond that scope.

NOTE: The birecursion-schemes library has been updated to 0.0.3 following along with today's article. This update constitutes a breaking change, but since this library is still experimental / not yet uploaded to hackage, we're just going bump the minor version. Once it is past the experimental phase, we will release it as 1.0.0 on hackage.

Welcome again, new and returning readers.

Morning dawns, the campfire smoulders, and a new day begins. The smoke from our campfire drifts, ascending lazily into the sky, embers sleeping as we in turn awaken after a long rest.

The previous article was long enough to be awkward as a snake swallowing a watermelon. All things take practice, and I am still developing my writing style - but I promise this one is shorter, and more to the point - a bit of mental digestion, to gather energy now, before we begin to take action later.

In order to expedite our journey, we will focus more closely on the core. This means that we will not be laboriously defining every function variant, but rather treat hylo as our spine. Unless there is something particular I wish to note, I will leave cata / ana / iso / trans and -A / -M variants for the reader to derive - if you've been following along, you should have no trouble doing so.

Recap

First, a quick recap over the last few articles in the series.

Birecursion Schemes

In the first article, we examined recursion, with the goal of 'recursing harder' - that is, finding the next way that things may recurse differently, instead of recursing in the same way.

At a glance, it is about these functions:

fmap   :: (a -> b)                                 -> f a -> f b
hylo   ::             (f   b -> b) -> (a -> f   a) ->   a ->   b
bihylo ::             (f b b -> b) -> (a -> f a a) ->   a ->   b
dihylo :: (a -> b) -> (f b d -> d) -> (c -> f a c) ->   c ->   d

We started with the standard implementation of recursion using functors. By repeatedly calling fmap from within itself, we can unfold and descend through a recursive structure, level by level, and then ascend and fold up the result.

class Functor f where
    fmap :: (a -> b) -> f a -> f b

hylo :: (Functor f) => (f b -> b) -> (a -> f a) -> a -> b
hylo alg coalg = h where h = alg . fmap h . coalg
--             = alg . fmap (alg . fmap (...) . coalg) . coalg

We then extended the concept, using hylo as a template, by lifting from functors to bifunctors. This resulted in two different new forms of recursion - bi- and di- recursion, because with bifunctors we can recurse in either only one slot, or both.

class (forall a . Functor (f a)) => Bifunctor f where
    bimap :: (a -> b) -> (c -> d) -> f a c -> f b d 

-- hylo ::                 (f   b -> b) -> (a -> f   a) -> a -> b
bihylo :: (Bifunctor f) => (f b b -> b) -> (a -> f a a) -> a -> b
bihylo alg coalg = h where h = alg . bimap h h . coalg
--                      hylo = alg . fmap    h . coalg

-- hylo ::                             (f   b -> b) -> (a -> f   a) -> a -> b
dihylo :: (Bifunctor f) => (a -> b) -> (f b d -> d) -> (c -> f a c) -> c -> d
dihylo f alg coalg = h where h = alg . bimap f h . coalg
--                        hylo = alg . fmap    h . coalg

The first, bi- recursion, recurses in both slots of the bifunctor, emphasizing recursion in a side by side manner. The second, di- recursion, only recurses in one slot, leaving the other slot free and making it a recursive functor, emphasizing nested recursive types.

In particular, we may illuminate the defining use case of the di- morphisms - recursive functors:

cata    :: (Recursive   r) =>             (Base   r   b -> b) -> r   -> b
dicata  :: (Direcursive f) => (a -> b) -> (Dibase f b c -> c) -> f a -> c

ana    :: (Corecursive   r) =>             (c -> Base   r   c) -> c -> r
diana  :: (Codirecursive f) => (a -> b) -> (c -> Dibase f a c) -> c -> f b

Note that the argument of dicata is a functor f a, and the result of diana is a functor f b. This highlights how that, in dihylo, the argument and result may both be a functor, but the type definition of dihylo does not show it because it is also possible that the neither the argument nor result be a functor.

We can however constrain dihylo slightly to show the specialized versions.

-- The original
dihylo :: (Bifunctor t)
    => (a -> b) -> (t b d     -> d)   -> (c   -> t a c)     -> c   -> d
    
-- Argument `c` becomes `f a`
dihylo :: (Functor f,            Bifunctor t)
    => (a -> b) -> (t b c     -> c)   -> (f a -> t a (f a)) -> f a -> c
  
-- Result `d` becomes `g b`  
dihylo :: (           Functor g, Bifunctor t)
    => (a -> b) -> (t b (g b) -> g b) -> (c   -> t a c)     -> c   -> g b
    
-- Combination of both
dihylo :: (Functor f, Functor g, Bifunctor t)
    => (a -> b) -> (t b (g b) -> g b) -> (f a -> t a (f a)) -> f a -> g b
    
-- NOTE: The `Functor f` and `Functor g` restrictions aren't /strictly/ necessary.

Although the original type definition sufficiently covers these them, these specialized versions emphasize what dicata and diana emphasize compared to cata and ana - that the argument or result could now a functor, and these di- functions are equipped to handle that.

Indexed and Pathed Recursion Schemes

In the second article, we again extended the concept of recursion, this time with indexing, in order to keep track of where we are and giving us a way to retrace our steps - morphisms that know where they are.

At a glance, it is about these functions:

imap    :: (Index f   -> a -> b)                                                                     -> f a -> f b
phylo   ::                          (Path f   -> f   b -> b) -> (Path f   -> a -> f   a) -> Path f   -> a ->   b
pbihylo ::                          (Bipath f -> f b b -> b) -> (Bipath f -> a -> f a a) -> Bipath f -> a ->   b
pdihylo :: (Diindex f -> a -> b) -> (Dipath f -> f b d -> d) -> (Dipath f -> c -> f a c) -> Dipath f -> c ->   d

We started with the addition of Index f -> to fmap, and since an indexed base functor begets a pathed recursive, this resulted in the addition of Path f née [Index f] -> to hylo.

type family Index (f :: * -> *) :: *
type family Path (f :: * -> *) :: [Index f]
-- NOTE: These type definitions may have changed since the last article.
--  The nomenclature is still being worked.

class Indexed f where
    indexed :: f a -> f (Index f, a)

class (Indexed f, Functor f) => IndexedFunctor f where
    imap :: (Index f -> a -> b) -> f a -> f b

phylo                            -- hylo
    :: IndexedFunctor f          --  :: Functor f
    => (Path f -> f b -> b)      --  => (f b -> b)
    -> (Path f -> a -> f a)      --  -> (a -> f a)
    -> Path f -> a -> b          --  -> a -> b
phylo alg coalg = h where h p = alg p . imap (\ i -> h (i : p)) . coalg p
---                      hylo = alg   . fmap         h          . coalg

We also applied the concept to bi- and di- recursion. This gave us slight variations on the type of path - with bi- recursion using Either to mix both indices into a Biindex and Bipath, while di- recursion uses , to cap a Dipath of multiple indices of one type using a single value of the other to create a Diindex.

type family FirstIndex (f :: * -> * -> *) :: *
type family SecondIndex (f :: * -> * -> *) :: *

class Biindexed f where
    biindexed :: f a b -> f (FirstIndex f, a) (SecondIndex f, b)

class (Biindexed f, Bifunctor f) => BiindexedBifunctor f where
    ibimap :: (FirstIndex f -> a -> b) -> (SecondIndex f -> c -> d) -> f a c -> f b d

type Biindex (f :: * -> * -> *) = Either (FirstIndex f) (SecondIndex f)
type Bipath (f :: * -> * -> *) = [Biindex f]
-- NOTE: These type definitions may have changed since the last article.
--  The nomenclature is still being worked.

pbihylo
    :: (BiindexedBifunctor f)
    => (Bipath f -> f b b -> b)
    -> (Bipath f -> a -> f a a)
    -> Bipath f -> a -> b
pbihylo alg coalg = h where
    h p = alg p . ibimap (\ i -> h (Left i : p)) (\ j -> h (Right j : p)) . coalg p

-- NOTE: These type definitions may have changed since the last article.
--  The nomenclature is still being worked.
type Diindex (f :: * -> * -> *) = (FirstIndex f, [SecondIndex f]) -- ~ (FirstIndex f, Dipath f) 
type Dipath (f :: * -> * -> *) = [SecondIndex f]

pdihylo 
    :: BiindexedBifunctor f
    => (Diindex f -> a -> b)
    -> (Dipath f -> f b d -> d)
    -> (Dipath f -> c -> f a c)
    -> Dipath f -> c -> d
pdihylo f alg coalg = h where
    h p = alg p . ibimap (\ i -> f (i , p)) (\ j -> h (j : p)) . coalg p

Note that pdi- functions emphasize the relationship between Index and Diindex, as well as Path and Dipath - however, the structural / concrete path types make these functions less flexible than is ideal, and so we need a definition that is more type-flexible if we are to continue - this being how we ended it last.

NOTE: There's quite a bit more to it than what I've outlined here, so it may be worth reading the article if you are just joining us, or haven't already.

Continuing onward with type-flexible profunctoral indices and polymorphic paths

We left off last article with the observation that locking ourselves into p- variants with specific types was rather awkward. In retrospect, having p- variants with strict types should be considered to be far too simplistic, a result of our focus being mostly about structural indices. Instead, we should find a way to define p- variants with polymorphic paths - and take the profunctoral perspective!

imapWith / ibimapWith

And so we took our first stab at it, by defining imapWith, a variant of imap.

imapWith
    :: (IndexedFunctor f)
    => (Index f -> path -> indexpath)           -- Cons
    -> (indexpath -> a -> b)
    -> path                                     -- Nil
    -> f a -> f b
imapWith cons f p = imap (\ i -> f (cons i p))

ibimapWith
    :: (BiindexedBifunctor f)
    => (FirstIndex f -> path -> findexpath)     -- Left Cons
    -> (SecondIndex f -> path -> sindexpath)    -- Right Cons
    -> (findexpath -> a -> b)
    -> (sindexpath -> c -> d)
    -> path -> f a c -> f b d
ibimapWith icons jcons f g p = ibimap (\ i -> f (icons i p)) (\ j -> g (jcons j p))

So how and why exactly did we define imapWith?

We defined imapWith by replacing the implicit use of the List data type with List-like constructor arguments - a bit like slapping on some Scott or Church encoding onto it as the solution to expressing profunctoral indices. A little pure lambda calculus never hurt anyone.

If we look closely, its definition imapWith cons f p = imap (\ i -> f (cons i p)) is almost exactly the same as the core of phylo: imap (\ i -> h (i:p)). In fact, imapWith generalizes this pattern to allow for any cons -like function, which we can use to define phylo (and pbihylo and pdihylo) more easily and clearly.

phylo alg coalg = h where
    h p = alg p . imapWith (:) h p . coalg p

pbihylo alg coalg = h where
    h p = alg p . ibimapWith ((:) . Left) ((:) . Right) h p . coalg p

pdihylo f alg coalg = h where
    h p = alg p . ibimapWith (,) (:) h p . coalg p

It was a pattern that we were using repeatedly, so why not make life easier?

pmapWith

However, this isn't the only reason that we defined this function. If we squint by constraining the types ever so slightly, the list-like nature of a pathed map is blatantly obvious:

-- A pointless specialization? But good for illustration!
pmapWith
    :: (IndexedFunctor f)
    => (Index f -> path -> path)    -- Cons
    -> (path -> a -> b)
    -> path                         -- Nil
    -> f a -> f b
pmapWith = imapWith

-- We'll throw in pbimapWith for free
pbimapWith
    :: (BiindexedBifunctor f)
    => (FirstIndex f -> path -> fpath)   -- Left Cons
    -> (SecondIndex f -> path -> spath)  -- Right Cons
    -> (fpath -> a -> b)
    -> (spath -> c -> d)
    -> path                             -- Nil
    -> f a c -> f b d
pbimapWith = ibimapWith

Note the List constructors. In imapWith, we allowed the result type of cons to vary, because it costs us nothing to relax from pmapWith and (Index f -> path -> path) / (path -> a -> b) to (Index f -> path -> indexpath) and (indexpath -> a -> b) in imapWith.

pmap

In fact, we can define pmap, which is the concretely-typed, single-step / non-recursive equivalent of phylo, by restricting the path to the List type like we did with phylo:

pmap
    :: (IndexedFunctor f)
    => ([Index f] -> a -> b)
    -> [Index f]
    -> f a -> f b
pmap = imapWith (:)

pbimap
    :: (BiindexedBifunctor f)
    => ([Biindex f] -> a -> b)
    -> ([Biindex f] -> c -> d)
    -> [Biindex f] -> f a c -> f b d
pbimap = ibimapWith ((:) . Left) ((:) . Right)

We even could have used it to define phylo, if we wished:

phylo alg coalg = h where h p = alg p . pmap h p . coalg p

phyloWith / pbihyloWith / pdihyloWith

Since we've just defined a p- variant of imap, it is a fitting form of symmetry that we can also derive a polymorphic-path -With variant of phylo for the whole suite of p- morphisms, which allow us to customize the path type for the morphism.

We can define a function phyloWith based off of imapWith, to keep the cons-like constructor as an argument. Then, phylo = phyloWith (:).

phyloWith
    :: IndexedFunctor f
    => (Index f -> path -> path)    -- Cons
    -> (path -> f b -> b)
    -> (path -> a -> f a)
    -> path                         -- Nil
    -> a -> b
phyloWith cons alg coalg = h where h p = alg p . imapWith cons h p . coalg p

Note that the cons argument can no longer vary its result type as imapWith did - this is because hylo is plural - it may use the singular fmap repeatedly, and so the type must stay the same. The repetition locks us into the path type for the span of the hylomorphism - and it is the same for pbihyloWith, as both of its cons-like arguments get locked in:

pbihyloWith
    :: (BiindexedBifunctor f)
    => (FirstIndex f -> path -> path)     -- Left cons
    -> (SecondIndex f -> path -> path)    -- Right cons
    -> (path -> f b b -> b)
    -> (path -> a -> f a a)
    -> path -> a -> b
pbihyloWith icons jcons alg coalg = h where h p = alg p . ibimapWith icons jcons h h p . coalg p

However, it is not the same with pdihyloWith - one of the cons-like arguments is freed, as it is capped with an index pointing to the functor content.

pdihyloWith
    :: BiindexedBifunctor f
    => (FirstIndex f -> path -> index)  -- Cap
    -> (SecondIndex f -> path -> path)  -- Cons
    -> (index -> a -> b)
    -> (path -> f b d -> d)
    -> (path -> c -> f a c)
    -> path                             -- Nil
    -> c -> d
pdihyloWith cap cons f alg coalg = h where h p = alg p . ibimapWith cap cons f h p . coalg p

These forms are more illustrative of their nature than the original p- functions that we derived last article phylo, pbihylo, and pdihylo. Whereas the originals have concrete path types - in part due to how we were focusing on the structural perspective of indices - these new functions allow us to properly express the profunctoral perspective, and allow our effective index to be whatever type we want by supplying the appropriate constructors.

This is the sought-after type-flexibility that allows for us to express both the structural and profunctoral perspectives of indexing - and in fact, because of their stronger expressiveness, we are going to rename them all and give the original names to these new functions.

pmapWith is renamed to pmap
pbimapWith is renamed to pbimap
phyloWith is renamed to phylo
pbihyloWith is renamed to pbihylo
pdihyloWith is renamed to pdihylo

Of course, since the original functions have had their names stolen, we should do something with them - we can either discard them as unnecessary, or keep them and rename them to reflect their more concretely-typed nature, it doesn't really matter which. We'll rename / redefine them as such:

pmapList = pmap (:)                                 -- Originally pmap
phyloList = phylo (:)                               -- Originally phylo
pbihyloList = pbihylo ((:) . Left) ((:) . Right)    -- Originally pbihylo
pdihyloList = pdihylo (,) (:)                       -- Originally pdihylo

So, from now on, remember that the new p- functions now may take a few additional arguments, compared to the original definitions last article - but we got rid of that awkward -With terminology - which brings us to our next question.

What about imapWith and ibimapWith?

They aren't p- functions, so they weren't involved in our little renaming party just now - and now their -With suffix sticks out like a sore thumb. Is there anything we can do about it?

Sources

We defined imapWith and ibimapWith initially based on the concept of continuing a path by adding a single index to it. However, this perspective is subtly wrong. We used a path, because that is what we had at the time - 'a path back up'. But as it turns out, we mistook the path for the need that it fulfilled - we are not looking for a 'path', we are looking for 'up'.

We have been talking from the path perspective - that is, building up a path from indices. However, we should not assume that we are always consing the index onto a path - we should generalize, to understand that we are consing the index onto whatever is 'up'. It might be the first link in the chain. It might be a path - a series of indices linked together. It might even be anchored to something - that is usually what chains are for.

We need a change in perspective, if we are to continue our ouroborosean journey.

smap / sbimap

If a source is a generalization of index and path, to be 'up', then the proper perspective for imapWith is not (Index f -> path -> indexpath), but rather (Index f -> source -> index) - where the source might be a path, might not.

We can reframe the types to be clear about this:

smap
    :: (IndexedFunctor f)
    => (Index f -> source -> index)     -- Cons
    -> (index -> a -> b)
    -> source                           -- Nil
    -> f a -> f b
smap cons f src = imap (\ i -> f (cons i src))
-- NOTE: imap = smap const

This is just imapWith - literally, we've only changed type variable names, and not any logic - but it has the proper perspective, as to better guide our intent. In fact, we will ditch the imapWith nomenclature entirely, in favor of renaming it to smap, since they are entirely the same function.

We can of course perform the same perspective reframing with ibimapWith, giving us sbimap:

sbimap
    :: (BiindexedBifunctor f)
    => (FirstIndex f -> source -> findex)     -- Left Cons
    -> (SecondIndex f -> source -> sindex)    -- Right Cons
    -> (findex -> a -> b)
    -> (sindex -> c -> d)
    -> source -> f a c -> f b d
sbimap icons jcons f g p = ibimap (\ i -> f (icons i p)) (\ j -> g (jcons j p))
-- NOTE: ibimap = sbimap const const

So, from now on, remember that smap = imapWith and sbimap = ibimapWith.

shylo / sbihylo / sdihylo

Now that we have added smap and sbimap to the renaming part, officially welcoming them as a variant of imap and fmap, our next step is obvious - we wish to try and derive shylo. However, it is not as simple as sticking smap inside of hylo, because we have already done that to get phylo and friends.

Remember - we just renamed imapWith to smap, and phyloWith to phylo, and phylo was already defined using imapWith in the first place. And the same for all of the p- functions! We want to do something with smap to make some function shylo, but we ended up with phylo! What gives?

If we look closely, the source type hasn't disappeared, it's been folded up into path! So what is going on?

You may recall earlier that we defined pmapWith = imapWith (now pmap = smap), a seemingly useless specialization. All it does is constrain smap's cons-like argument Index f -> source -> index to be an actual cons Index f -> path -> path - but also, it has forced the initial source argument to become path too.

When we use smap for a single level, the source type is unperturbed. However, when we use it recursively by plugging it into hylo, it forces source and path to be the same type, since they are both 'up' from the index. So, by using smap in hylo it is automatically dropped to be as strict as pmap!

In fact, all of the p- functions can be defined using pmap or pbimap. Well, almost all of them. Notably, pdihylo still requires sbimap rather than pbimap - and we must ask, why?

We defined pdihylo as:

pdihylo
    :: BiindexedBifunctor f
    => (FirstIndex f -> path -> index)  -- Cap
    -> (SecondIndex f -> path -> path)  -- Cons
    -> (index -> a -> b)
    -> (path -> f b d -> d)
    -> (path -> c -> f a c)
    -> path                             -- Nil
    -> c -> d
pdihylo cap cons f alg coalg = h where
    h p = alg p . sbimap cap cons f h p . coalg p

Note the cap argument has type FirstIndex f -> path -> index and f has type index -> a -> b.

We needed to do this to jump from talking about a recursive path, to talking about a (recursive) functor's index, as we transitioned from recursing through the functor, to mapping over the functor content.

But we could have defined pdihylo as:

pdihylo
    :: BiindexedBifunctor f
    => (FirstIndex f -> path -> path)  -- Cap
    -> (SecondIndex f -> path -> path)  -- Cons
    -> (path -> a -> b)
    -> (path -> f b d -> d)
    -> (path -> c -> f a c)
    -> path                             -- Nil
    -> c -> d
pdihylo cap cons f alg coalg = h where
    h p = alg p . pbimap cap cons f h p . coalg p

In contrast, this version has cap argument of type FirstIndex f -> path -> path. It forces the recursive functor to use the same type for both recursive path and functor index, allowing us to define it purely in terms of pbimap instead of sbimap - but also making it deficient, so we'll be keeping the other definition.

For pdihylo, we used smap to preserve the index type at one end - and with this in mind, we know that we need to do something to preserve the initial source type at the other end. We need to be explicit about using the source to start the path, just as we were explicit about capping / ending it with an index.

To get smap, we added a cons-like argument to imap so we could 'intercept' the index, but when the resulting smap gets used recursively / plurally by shylo, it composes smap's source type away into the path type. shylo needs something else because we can't just reuse smap's cons - it is already in use.

To restore our source-iness, we need to 'intercept' the original source and turn it into a path, explicitly - hence, we can solve this by adding a new (source -> path) argument to phylo to get shylo.

NOTE: There is no point in adding a (source -> path) argument to smap itself - it would only ever be immediately composed with the Index f -> path -> index argument, meaning that smap doesn't ever really see the path type, and we can just perform that composition outside giving us the smap that we actually defined.

shylo
    :: (IndexedFunctor f)
    => (source -> path)             -- End
    -> (Index f -> path -> path)    -- Way
    -> (path -> f b -> b)
    -> (path -> a -> f a)
    -> source                       -- End's argument
    -> a -> b
shylo end way alg coalg = h . end where
    h p = alg p . smap way h p . coalg p
-- NOTE: phylo = shylo id

We can do the same for pbihylo to get sbihylo:

sbihylo
    :: BiindexedBifunctor f
    => (source -> path)                 -- End
    -> (FirstIndex f -> path -> path)   -- Left cons
    -> (SecondIndex f -> path -> path)  -- Right cons
    -> (path -> f b b -> b)
    -> (path -> a -> f a a)
    -> source                           -- End's argument
    -> a -> b
sbihylo end icons jcons alg coalg = h . end where
    h p = alg p . sbimap icons jcons h h p . coalg p

And again to pdihylo to yield sdihylo.

sdihylo
    :: BiindexedBifunctor f
    => (source -> path)                 -- End
    -> (FirstIndex f -> path -> index)  -- Cap
    -> (SecondIndex f -> path -> path)  -- Cons
    -> (index -> a -> b)
    -> (path -> f b d -> d)
    -> (path -> c -> f a c)
    -> source                           -- End's argument
    -> c -> d
sdihylo end cap cons f alg coalg = h . end
    where h p = alg p . sbimap cap cons f h p . coalg p

With sdihylo, we can see the entire journey take place, from source above to path in between to index below.

Now, these aren't perhaps the most ergonomic functions, their arrangement of arguments having mostly been the result of sticking things on the front of hylo and friends without much accounting for natural flow. Perhaps we will revisit the issue of ergonomics in the future.

Snake paths

So, how do we go about using these s- variants? What do we do to fill these new arguments?

With the phylo, it was easy - we used (:) and so the path was just [index]. Things got a little trickier when we got to pdihylo - in addition to using (:) on one side to get [sindex] as the recursive path, we had to add (,) on the other to get (findex,[sindex]) as the functor index.

With sources, we need something on the other end. We could just use (,) again, giving us something like ([index],src) for phylo and (findex,[sindex],src) for pdihylo. A little bit of work later, and viola!

tuplehylo
    :: (IndexedFunctor f)
    => (([Index f], source) -> f b -> b)
    -> (([Index f], source) -> a -> f a)
    -> source
    -> a
    -> b
tuplehylo = shylo ([],) (first . (:))

tripledihylo
    :: (BiindexedBifunctor f)
    => ((FirstIndex f, [SecondIndex f], source) -> a -> b)
    -> (([SecondIndex f], source) -> f b d -> d)
    -> (([SecondIndex f], source) -> c -> f a c)
    -> source
    -> c
    -> d
tripledihylo = sdihylo ([],) (\ i (p,s) -> (i,p,s)) (first . (:))

These tuple forms aren't bad per se, they do give us fast access to the heterogenous first and last elements. However, I want to take a more recursive approach, because I'd like to give base fuctors to indices of recursive structures later (much later, as in not-appearing-in-this-article later).

First, we need some list-like data types.


-- Cap

-- NOTE: Cap is (,) + List
data Cap head body = Cap head [body]
-- NOTE: Cap is NonEmpty except head is a heterogenous element
type NonEmpty a = Cap a a

-- NOTE: We stole `:|` from NonEmpty for Snake
infixr 5 :<
pattern (:<) :: head -> [body] -> Cap head body
pattern head :< body = Cap head body

capExample :: Cap String Int
capExample = "head" :< 1 : 2 : 3 : []

-- Way

data Way body tail = Way body (Way body tail) | End tail
-- NOTE: Way is List except Nil contains a heterogenous element
type List a = Way a ()

infixr 5 :-
pattern (:-) :: body -> Way body tail -> Way body tail
pattern body :- way = Way body way

infixr 5 :>
pattern (:>) :: body -> tail -> Way body tail
pattern body :> tail = Way body (End tail)

wayExample :: Way Int Char
wayExample = 1 :- 2 :- 3 :> 'z'

-- Snake

data Snake head body tail = Snake head (Way body tail)
-- NOTE: Snake is (,) + Way instead of (,) + List like Cap
--  so technically Cap h b ~ Snake h b ()
-- We could redefine them (List, Cap, Way) in terms of Snake,
--  but that would require explicit invariants that are handled
--  implicitly by being 'different data structures' right now.

infixr 5 :|
pattern (:|) :: head -> Way body tail -> Snake head body tail
pattern head :| body = Snake head body

snakeExample :: Snake String Int ()
snakeExample = "head" :| 1 :- 2 :- 3 :> ()

We can use these as concrete path types with our various p- and s- morphisms, like we did specifically with phylo and List / (:).

capdihylo
    :: BiindexedBifunctor f
    => (Cap (FirstIndex f) (SecondIndex f) -> a -> b)
    -> ([SecondIndex f] -> f b d -> d)
    -> ([SecondIndex f] -> c -> f a c)
    -> [SecondIndex f] -> c -> d
capdihylo = pdihylo Cap (:)

wayhylo
    :: (IndexedFunctor f)
    => (Way (Index f) source -> f b -> b)
    -> (Way (Index f) source -> a -> f a)
    -> source
    -> a
    -> b
wayhylo = shylo End Way

snakedihylo
    :: BiindexedBifunctor f
    => (Snake (FirstIndex f) (SecondIndex f) source -> a -> b)
    -> (Way (SecondIndex f) source -> f b d -> d)
    -> (Way (SecondIndex f) source -> c -> f a c)
    -> source -> c -> d
snakedihylo = sdihylo End Snake Way

Note that source is still polymorphic for way- and snake- - so it can be nested. This allows me to concatenate heterogenous recursive structures, with one snake biting the tail of / being swallowed by another to become it's source a la this monstrosity:

doublesnakedihylo
    :: (BiindexedBifunctor f, BiindexedBifunctor g)
    => (Snake (FirstIndex g) (SecondIndex g) (Snake (FirstIndex f) (SecondIndex f) source) -> x -> y)
    -> (Way (SecondIndex g) (Snake (FirstIndex f) (SecondIndex f) source) -> g y b -> b)
    -> (Way (SecondIndex g) (Snake (FirstIndex f) (SecondIndex f) source) -> a -> g x a)
    -> (Way (SecondIndex f) source -> f b d -> d)
    -> (Way (SecondIndex f) source -> c -> f a c)
    -> source
    -> c
    -> d
doublesnakedihylo g galg gcoalg falg fcoalg = snakedihylo (snakedihylo g galg gcoalg) falg fcoalg

Take a good long look at this to understand what it is doing - and take all the time that you need. We've got the source, the outer dihylo with the first path type, the inner dihylo with the second path type, using the first path type as its own source, and finally, at the core, the inner dihylo's indexed mapping function, which stretches all the way back to the source.

The monster factor comes from it being able to be pedantic about all of those things, which makes the type definition rather large - but it could also be seen as a recursive generalization of the concept of a URL, minus any minutia about specific encoding. It helps to clean up the types a little bit with some handy type aliases:

type BodyPath f = Way (SecondIndex f)
type SnakePath f = Snake (FirstIndex f) (SecondIndex f)

doublesnakedihylo
    :: (BiindexedBifunctor f, BiindexedBifunctor g)
    => (SnakePath g (SnakePath f source) -> x -> y)
    -> (BodyPath g (SnakePath f source) -> g y b -> b)
    -> (BodyPath g (SnakePath f source) -> a -> g x a)
    -> (BodyPath f source -> f b d -> d)
    -> (BodyPath f source -> c -> f a c)
    -> source
    -> c
    -> d

You can do things like path down a Network (FileSystem (Json)), by using an IP address or domain name as the source, then traverse with directory entries to make the file path to a JSON file, which we burrow down into in a second traversal over the JSON contents, all to point at a specific fragment / sub-JSON object in a specific file on a specific computer in the network.

We can even nest it at the end, sort of like encoding one URL into the query fragment of another URL - except there's no character escaping necessary. If we could translate the following URL file://127.0.0.1/path/to/json/file?path.to.json.object into a very long Snake, we might get something like:


type Scheme   = String
type IPSubnet   = Int
type FilePath = String
type JsonPath = String

-- NOTE: We've left the head as () for now, but it could be made to contain
--  anything, such as the object retrieved by the path
urlSnake :: Snake () JsonPath (Way FilePath (Way IPSubnet Scheme))
urlSnake
    = ()
    :| "object" :- "json" :- "to" :- "path"
    :> "file" :- "json" :- "to" :- "path"
    :> 1 :- 0 :- 0 :- 127 
    :> "file"
    -- NOTE: Remember, it reads from right to left / bottom-up because it is a stack.

One thing to note is that the head of the snake is (), because Json is not a functor. It does however have leaf nodes containing single terminal values. We can imagine:

data Json' a
    = Object (Map String (Json' a))
    | Array (Vector (Json' a))
    | Prim a
type Json = Json' (String :+: Scientific :+: Bool :+: ())

Then the () as head makes sense* - often it is implicitly there, and is usually hidden away in singular functors like List because it provides no additional information - the only content child a of Cons a r is a. This would be different for plural functors like Join (,) a (which is really (a,a), and would have an index of Bool aka Bit instead of () aka Unit), but thankfully they are less common.

NOTE: * Really strictly speaking, it should actually be Proxy String :+: Proxy Scientific :+: Proxy Bool :+: Proxy () or something, but once we strip the types it all collapses down to () and can be poofed away. With the polymorphic indexing, we are free to abstract away or ignore irrelevant details. If we had kept the concretely-type indexing and pathing, we would not be able to do so nearly as easily.

Regardless, we could have defined urlSnake as Way JsonPath (Way FilePath (Way IPSubnet Scheme)) or Snake [JsonPath] FilePath (Way IPSubnet Scheme)), but it doesn't really matter, because we can pick and choose as we need. Now that we have embraced polymorphic indexing and pathing, we have lots of options! You can do whatever you want!

NOTE: If we look closely at what we have done, we can just about see that we have performed our own unfolding of the path, along side the main morphism. Ideally, if the path is itself recursive (and [index] is recursive), we should be able to take advantage of that to write a variant that works like shylo except also exposing the path's base functor by taking a coarg-like i -> p -> Base p p argument instead of a cons-like argument. However, to do that, we need to express IndexedRecursiveFunctor such that IndexF f ~ Base (Index f) - we've been getting away type and constraint aliases for quite some time, and it may finally be worth defining a new typeclass.

Conclusion

We will end this article with a few questions / observations.

Sources are a one-hole context / reduction of the rest of the universe

So, what is a source anyway? What is 'up'?

When we use paths, we are taking a piece of the above, and carrying it down into the below, so that we may use it to find our way back up to where we started. A source just takes that concept further, beyond the local scope of a path.

As you climb up a ladder, (a step) what was once above you, getting closer, and then all of a sudden it is below you, receding - and the same process occurs in reverse as you climb back down. Yet every step of the ladder is still just part of the ladder - above and below are defined by where you are on the ladder, not by the ladder itself.

In a way, sources are co / dual to indexing - they are going the other way, up, instead of down.

We know what 'below' is. hylo has a homogenous path; with dihylo has a homogenous path turning heterogenous at the end (in order to point at the functor content). With sources, we are contemplating that path changing from the other direction - a flip in perspective.

Above could be more of the same - a homogenous source, the perspective of being a middle link in a homogenous path continuing upwards. Or, above could be something different - a heterogenous source, the perspective of being the final link, anchored to a different, greater structure.

This leaves us with a critical question - if sources are dual to indices, and indices are related to the notion of a one-hole context, what does that mean for how sources related to one-hole-contexts? That is, if an index is a reduction of the rest of the data structure, what is a source a reduction of?

Earlier, we described indices and paths as a minimal one-hole context describing 'the rest of the data structure' well enough to navigate through it. If that holds, then a source is the minimal one-hole context for the entire data structure, describing 'what is above the data structure' - we have flipped our perspective, and instead of being outside the hole, we are inside, and the only thing that we know about the outside, is what we have been told / brought along with us.

A source is a reduction of the rest of the universe, the world stripped of all but the relevant elements - a sort of minimal one-hole-supercontext, if you will, reflecting the way that indices and paths do the same for the local data structure.

It is worth mentioning that the Haskell RealWorld trick aligns neatly with our concept of Sourced, making RealWorld the inaccessible origin of the snake, hidden behind functions guarding interaction with the real world. It should not be unexpected that some of the seemingly arbitrary choices of Haskell can be understood through this perspective.

Why is this?

NOTE: Construing RealWorld as a Source will yield significant benefits in the future when it comes to representing otherwise-pure segments of code interspersed with the occasional need to 'update' the real world. For instance, what if you wanted to 'replay' the real world? Or swap it for an 'alternate universe'? Try to imagine if, rather than a RealWorld with no constructors and only used at the type-level, we instead had a git-like Repository construct containing a representation of a subset of the real world, updating only when something has changed and a relevant 'real world' commit has been made. Then, for compatibility, the existing RealWorld could be treated as a Repository committing infinitesimally often in between each evaluation step, being continuous instead of discrete, and having no way of either branching nor rolling back. It is a shame that git is only designed to work on directories and text files, because I'd love to be able to apply that level of control over my application or program state. We'll come back to this in the future.

Sources are an abstraction of both indexing and the reader monad.

It should be observed that Sourced - however we define it when we properly do - is an abstraction over both Indexed and Reader, as well as taking function parameters in general.

Basically, Reader r aand r -> a are isomorphic, meaning these two are effectively the same:

readerHylo1 :: (f b -> Reader r b) -> (a -> Reader r (f a)) -> a -> Reader r b
readerHylo2 :: (f b -> r -> b) -> (a -> r -> f a) -> a -> r -> b

We can also define this, which is the exact same thing minus some flips.

readerHylo3 :: (IndexedFunctor f) => (src -> f b -> b) -> (src -> a -> f a) -> src -> a -> b
readerHylo3 = shylo id (const id) -- Or: phylo (const id) since phylo = shylo id

The implications of readerHylo3 are that sources abstract both Reader and Indexed / Pathed - you get one or the other, depending on what you leave out. If Sourced is Reader, with Indexed, then Reader is Sourced, without Indexed, and Indexed is Sourced without Reader (or at the very least, Sourced without local).

This means that, in the last article (and this one as well), we could have written our p- and s- functions using a combination of Indexed and Reader, using local to append each index to the path in turn. Were the last few articles just a huge waste of our time?

Certainly not! This shouldn't be seen as a bad thing - rather, this should be seen as a natural and necessary consequence of how Reader and Index combine to express "adding each step we take into 'what is up'" . Sourced is a convenient way of acknowledging using something like Reader and local to build polymorphic / openly typed paths, without invoking monads directly - meaning we can use it in a purely functional context.

Far from being pointless, this result - the relationship between Sourced, Indexed, and Reader - is going to be important in coming articles.

Unanswered questions

Finally, over the last few articles, there have been a few unanticipated questions - some of my own, as well as some from readers. I would like to address them and keep track of them, even if I cannot answer them right now.

We started this series branching off of recursion, with a specific journey in mind. There is a jungle of unexplored territory here, a wealth of things to be found - the curse of incompleteness being that the jungle of knowledge grows faster than we can explore it, and not wanting to go off-course, we must press on with our charted course.

Two roads diverged in a yellow wood, And sorry I could not travel both
The Road Not Taken, Robert Frost

From one traveler to another, having already chosen my path, I invite you to take your own. Open roads (questions) include:

Power-equivalence of mono-, bi-, and di- recursion
Expressing mutual recursion through bi- or di- recursion
A formal definition in reference to initial algebras
Relationship to zippers / jokers / clowns
A deeper look at the one-hole context paper
Atkey-style indexed functors
Extending the whole zoo of morphisms (para,apo, et al)
A closer look at the index / source duality with respect to the cata / ana duality
Expression of recursion in conjunction with the FunctorOf proposal

Up next: Cryptographic recursion schemes

In our next article, we're in for a real treat! We are going to plow head-first into the practical utility of sourced recursion schemes, in order to explore the concept of cryptographic recursion schemes. We're going to build a few cryptographic data structures, and apply them to JSON in order to create our own distributed JSON format.

You see, sources are an excellent way to represent things like signing and encryption keys, salts, or choice of algorithm. Cryptographic values blur the line between data and type*, because a signing or encryption key defines the set of valid cryptographic elements that it applies to - making it possible to use type safety to enforce ownership, such as making it impossible for me to even try to use my key to open your box.

NOTE: * This actually occurs due to a very special reason, which we will get to when we talk about sources at the type-level.

That's all!

I hope you join me next time, traveler, and for now, I wish you a safe journey!