Haskell: Sourced Recursion Schemes
Up through the recursiverse
This is the third article in a series about recursion, focused on developing the birecursion-schemes library, although it is starting to grow beyond that scope.
NOTE: The
birecursion-schemes
library has been updated to0.0.3
following along with today's article. This update constitutes a breaking change, but since this library is still experimental / not yet uploaded to hackage, we're just going bump the minor version. Once it is past the experimental phase, we will release it as1.0.0
on hackage.
Welcome again, new and returning readers.
Morning dawns, the campfire smoulders, and a new day begins. The smoke from our campfire drifts, ascending lazily into the sky, embers sleeping as we in turn awaken after a long rest.
The previous article was long enough to be awkward as a snake swallowing a watermelon. All things take practice, and I am still developing my writing style - but I promise this one is shorter, and more to the point - a bit of mental digestion, to gather energy now, before we begin to take action later.
In order to expedite our journey, we will focus more closely on the core. This means that we will not be laboriously defining every function variant, but rather treat hylo
as our spine. Unless there is something particular I wish to note, I will leave cata
/ ana
/ iso
/ trans
and -A
/ -M
variants for the reader to derive - if you've been following along, you should have no trouble doing so.
Recap
First, a quick recap over the last few articles in the series.
Birecursion Schemes
In the first article, we examined recursion, with the goal of 'recursing harder' - that is, finding the next way that things may recurse differently, instead of recursing in the same way.
At a glance, it is about these functions:
fmap :: (a -> b) -> f a -> f b
hylo :: (f b -> b) -> (a -> f a) -> a -> b
bihylo :: (f b b -> b) -> (a -> f a a) -> a -> b
dihylo :: (a -> b) -> (f b d -> d) -> (c -> f a c) -> c -> d
We started with the standard implementation of recursion using functors. By repeatedly calling fmap
from within itself, we can unfold and descend through a recursive structure, level by level, and then ascend and fold up the result.
class Functor f where
fmap :: (a -> b) -> f a -> f b
hylo :: (Functor f) => (f b -> b) -> (a -> f a) -> a -> b
hylo alg coalg = h where h = alg . fmap h . coalg
-- = alg . fmap (alg . fmap (...) . coalg) . coalg
We then extended the concept, using hylo
as a template, by lifting from functors to bifunctors. This resulted in two different new forms of recursion - bi-
and di-
recursion, because with bifunctors we can recurse in either only one slot, or both.
class (forall a . Functor (f a)) => Bifunctor f where
bimap :: (a -> b) -> (c -> d) -> f a c -> f b d
-- hylo :: (f b -> b) -> (a -> f a) -> a -> b
bihylo :: (Bifunctor f) => (f b b -> b) -> (a -> f a a) -> a -> b
bihylo alg coalg = h where h = alg . bimap h h . coalg
-- hylo = alg . fmap h . coalg
-- hylo :: (f b -> b) -> (a -> f a) -> a -> b
dihylo :: (Bifunctor f) => (a -> b) -> (f b d -> d) -> (c -> f a c) -> c -> d
dihylo f alg coalg = h where h = alg . bimap f h . coalg
-- hylo = alg . fmap h . coalg
The first, bi-
recursion, recurses in both slots of the bifunctor, emphasizing recursion in a side by side manner. The second, di-
recursion, only recurses in one slot, leaving the other slot free and making it a recursive functor, emphasizing nested recursive types.
In particular, we may illuminate the defining use case of the di-
morphisms - recursive functors:
cata :: (Recursive r) => (Base r b -> b) -> r -> b
dicata :: (Direcursive f) => (a -> b) -> (Dibase f b c -> c) -> f a -> c
ana :: (Corecursive r) => (c -> Base r c) -> c -> r
diana :: (Codirecursive f) => (a -> b) -> (c -> Dibase f a c) -> c -> f b
Note that the argument of dicata
is a functor f a
, and the result of diana
is a functor f b
. This highlights how that, in dihylo
, the argument and result may both be a functor, but the type definition of dihylo
does not show it because it is also possible that the neither the argument nor result be a functor.
We can however constrain dihylo
slightly to show the specialized versions.
-- The original
dihylo :: (Bifunctor t)
=> (a -> b) -> (t b d -> d) -> (c -> t a c) -> c -> d
-- Argument `c` becomes `f a`
dihylo :: (Functor f, Bifunctor t)
=> (a -> b) -> (t b c -> c) -> (f a -> t a (f a)) -> f a -> c
-- Result `d` becomes `g b`
dihylo :: ( Functor g, Bifunctor t)
=> (a -> b) -> (t b (g b) -> g b) -> (c -> t a c) -> c -> g b
-- Combination of both
dihylo :: (Functor f, Functor g, Bifunctor t)
=> (a -> b) -> (t b (g b) -> g b) -> (f a -> t a (f a)) -> f a -> g b
-- NOTE: The `Functor f` and `Functor g` restrictions aren't /strictly/ necessary.
Although the original type definition sufficiently covers these them, these specialized versions emphasize what dicata
and diana
emphasize compared to cata
and ana
- that the argument or result could now a functor, and these di-
functions are equipped to handle that.
Indexed and Pathed Recursion Schemes
In the second article, we again extended the concept of recursion, this time with indexing, in order to keep track of where we are and giving us a way to retrace our steps - morphisms that know where they are.
At a glance, it is about these functions:
imap :: (Index f -> a -> b) -> f a -> f b
phylo :: (Path f -> f b -> b) -> (Path f -> a -> f a) -> Path f -> a -> b
pbihylo :: (Bipath f -> f b b -> b) -> (Bipath f -> a -> f a a) -> Bipath f -> a -> b
pdihylo :: (Diindex f -> a -> b) -> (Dipath f -> f b d -> d) -> (Dipath f -> c -> f a c) -> Dipath f -> c -> d
We started with the addition of Index f ->
to fmap
, and since an indexed base functor begets a pathed recursive, this resulted in the addition of Path f
née [Index f] ->
to hylo
.
type family Index (f :: * -> *) :: *
type family Path (f :: * -> *) :: [Index f]
-- NOTE: These type definitions may have changed since the last article.
-- The nomenclature is still being worked.
class Indexed f where
indexed :: f a -> f (Index f, a)
class (Indexed f, Functor f) => IndexedFunctor f where
imap :: (Index f -> a -> b) -> f a -> f b
phylo -- hylo
:: IndexedFunctor f -- :: Functor f
=> (Path f -> f b -> b) -- => (f b -> b)
-> (Path f -> a -> f a) -- -> (a -> f a)
-> Path f -> a -> b -- -> a -> b
phylo alg coalg = h where h p = alg p . imap (\ i -> h (i : p)) . coalg p
--- hylo = alg . fmap h . coalg
We also applied the concept to bi-
and di-
recursion. This gave us slight variations on the type of path - with bi-
recursion using Either
to mix both indices into a Biindex
and Bipath
, while di-
recursion uses ,
to cap a Dipath
of multiple indices of one type using a single value of the other to create a Diindex
.
type family FirstIndex (f :: * -> * -> *) :: *
type family SecondIndex (f :: * -> * -> *) :: *
class Biindexed f where
biindexed :: f a b -> f (FirstIndex f, a) (SecondIndex f, b)
class (Biindexed f, Bifunctor f) => BiindexedBifunctor f where
ibimap :: (FirstIndex f -> a -> b) -> (SecondIndex f -> c -> d) -> f a c -> f b d
type Biindex (f :: * -> * -> *) = Either (FirstIndex f) (SecondIndex f)
type Bipath (f :: * -> * -> *) = [Biindex f]
-- NOTE: These type definitions may have changed since the last article.
-- The nomenclature is still being worked.
pbihylo
:: (BiindexedBifunctor f)
=> (Bipath f -> f b b -> b)
-> (Bipath f -> a -> f a a)
-> Bipath f -> a -> b
pbihylo alg coalg = h where
h p = alg p . ibimap (\ i -> h (Left i : p)) (\ j -> h (Right j : p)) . coalg p
-- NOTE: These type definitions may have changed since the last article.
-- The nomenclature is still being worked.
type Diindex (f :: * -> * -> *) = (FirstIndex f, [SecondIndex f]) -- ~ (FirstIndex f, Dipath f)
type Dipath (f :: * -> * -> *) = [SecondIndex f]
pdihylo
:: BiindexedBifunctor f
=> (Diindex f -> a -> b)
-> (Dipath f -> f b d -> d)
-> (Dipath f -> c -> f a c)
-> Dipath f -> c -> d
pdihylo f alg coalg = h where
h p = alg p . ibimap (\ i -> f (i , p)) (\ j -> h (j : p)) . coalg p
Note that pdi-
functions emphasize the relationship between Index
and Diindex
, as well as Path
and Dipath
- however, the structural / concrete path types make these functions less flexible than is ideal, and so we need a definition that is more type-flexible if we are to continue - this being how we ended it last.
NOTE: There's quite a bit more to it than what I've outlined here, so it may be worth reading the article if you are just joining us, or haven't already.
Continuing onward with type-flexible profunctoral indices and polymorphic paths
We left off last article with the observation that locking ourselves into p-
variants with specific types was rather awkward. In retrospect, having p-
variants with strict types should be considered to be far too simplistic, a result of our focus being mostly about structural indices. Instead, we should find a way to define p-
variants with polymorphic paths - and take the profunctoral perspective!
imapWith / ibimapWith
And so we took our first stab at it, by defining imapWith
, a variant of imap
.
imapWith
:: (IndexedFunctor f)
=> (Index f -> path -> indexpath) -- Cons
-> (indexpath -> a -> b)
-> path -- Nil
-> f a -> f b
imapWith cons f p = imap (\ i -> f (cons i p))
ibimapWith
:: (BiindexedBifunctor f)
=> (FirstIndex f -> path -> findexpath) -- Left Cons
-> (SecondIndex f -> path -> sindexpath) -- Right Cons
-> (findexpath -> a -> b)
-> (sindexpath -> c -> d)
-> path -> f a c -> f b d
ibimapWith icons jcons f g p = ibimap (\ i -> f (icons i p)) (\ j -> g (jcons j p))
So how and why exactly did we define imapWith
?
We defined imapWith
by replacing the implicit use of the List
data type with List
-like constructor arguments - a bit like slapping on some Scott or Church encoding onto it as the solution to expressing profunctoral indices. A little pure lambda calculus never hurt anyone.
If we look closely, its definition imapWith cons f p = imap (\ i -> f (cons i p))
is almost exactly the same as the core of phylo: imap (\ i -> h (i:p))
. In fact, imapWith
generalizes this pattern to allow for any cons
-like function, which we can use to define phylo
(and pbihylo
and pdihylo
) more easily and clearly.
phylo alg coalg = h where
h p = alg p . imapWith (:) h p . coalg p
pbihylo alg coalg = h where
h p = alg p . ibimapWith ((:) . Left) ((:) . Right) h p . coalg p
pdihylo f alg coalg = h where
h p = alg p . ibimapWith (,) (:) h p . coalg p
It was a pattern that we were using repeatedly, so why not make life easier?
pmapWith
However, this isn't the only reason that we defined this function. If we squint by constraining the types ever so slightly, the list-like nature of a pathed map is blatantly obvious:
-- A pointless specialization? But good for illustration!
pmapWith
:: (IndexedFunctor f)
=> (Index f -> path -> path) -- Cons
-> (path -> a -> b)
-> path -- Nil
-> f a -> f b
pmapWith = imapWith
-- We'll throw in pbimapWith for free
pbimapWith
:: (BiindexedBifunctor f)
=> (FirstIndex f -> path -> fpath) -- Left Cons
-> (SecondIndex f -> path -> spath) -- Right Cons
-> (fpath -> a -> b)
-> (spath -> c -> d)
-> path -- Nil
-> f a c -> f b d
pbimapWith = ibimapWith
Note the List
constructors. In imapWith
, we allowed the result type of cons
to vary, because it costs us nothing to relax from pmapWith
and (Index f -> path -> path)
/ (path -> a -> b)
to (Index f -> path -> indexpath)
and (indexpath -> a -> b)
in imapWith
.
pmap
In fact, we can define pmap
, which is the concretely-typed, single-step / non-recursive equivalent of phylo
, by restricting the path to the List
type like we did with phylo
:
pmap
:: (IndexedFunctor f)
=> ([Index f] -> a -> b)
-> [Index f]
-> f a -> f b
pmap = imapWith (:)
pbimap
:: (BiindexedBifunctor f)
=> ([Biindex f] -> a -> b)
-> ([Biindex f] -> c -> d)
-> [Biindex f] -> f a c -> f b d
pbimap = ibimapWith ((:) . Left) ((:) . Right)
We even could have used it to define phylo
, if we wished:
phylo alg coalg = h where h p = alg p . pmap h p . coalg p
phyloWith / pbihyloWith / pdihyloWith
Since we've just defined a p-
variant of imap
, it is a fitting form of symmetry that we can also derive a polymorphic-path -With
variant of phylo
for the whole suite of p-
morphisms, which allow us to customize the path type for the morphism.
We can define a function phyloWith
based off of imapWith
, to keep the cons
-like constructor as an argument.
Then, phylo = phyloWith (:)
.
phyloWith
:: IndexedFunctor f
=> (Index f -> path -> path) -- Cons
-> (path -> f b -> b)
-> (path -> a -> f a)
-> path -- Nil
-> a -> b
phyloWith cons alg coalg = h where h p = alg p . imapWith cons h p . coalg p
Note that the cons
argument can no longer vary its result type as imapWith
did - this is because hylo
is plural - it may use the singular fmap
repeatedly, and so the type must stay the same. The repetition locks us into the path
type for the span of the hylomorphism - and it is the same for pbihyloWith
, as both of its cons-like arguments get locked in:
pbihyloWith
:: (BiindexedBifunctor f)
=> (FirstIndex f -> path -> path) -- Left cons
-> (SecondIndex f -> path -> path) -- Right cons
-> (path -> f b b -> b)
-> (path -> a -> f a a)
-> path -> a -> b
pbihyloWith icons jcons alg coalg = h where h p = alg p . ibimapWith icons jcons h h p . coalg p
However, it is not the same with pdihyloWith
- one of the cons-like arguments is freed, as it is capped with an index pointing to the functor content.
pdihyloWith
:: BiindexedBifunctor f
=> (FirstIndex f -> path -> index) -- Cap
-> (SecondIndex f -> path -> path) -- Cons
-> (index -> a -> b)
-> (path -> f b d -> d)
-> (path -> c -> f a c)
-> path -- Nil
-> c -> d
pdihyloWith cap cons f alg coalg = h where h p = alg p . ibimapWith cap cons f h p . coalg p
These forms are more illustrative of their nature than the original p-
functions that we derived last article phylo
, pbihylo
, and pdihylo
. Whereas the originals have concrete path types - in part due to how we were focusing on the structural perspective of indices - these new functions allow us to properly express the profunctoral perspective, and allow our effective index to be whatever type we want by supplying the appropriate constructors.
This is the sought-after type-flexibility that allows for us to express both the structural and profunctoral perspectives of indexing - and in fact, because of their stronger expressiveness, we are going to rename them all and give the original names to these new functions.
pmapWith
is renamed topmap
pbimapWith
is renamed topbimap
phyloWith
is renamed tophylo
pbihyloWith
is renamed topbihylo
pdihyloWith
is renamed topdihylo
Of course, since the original functions have had their names stolen, we should do something with them - we can either discard them as unnecessary, or keep them and rename them to reflect their more concretely-typed nature, it doesn't really matter which. We'll rename / redefine them as such:
pmapList = pmap (:) -- Originally pmap
phyloList = phylo (:) -- Originally phylo
pbihyloList = pbihylo ((:) . Left) ((:) . Right) -- Originally pbihylo
pdihyloList = pdihylo (,) (:) -- Originally pdihylo
So, from now on, remember that the new p-
functions now may take a few additional arguments, compared to the original definitions last article - but we got rid of that awkward -With
terminology - which brings us to our next question.
What about imapWith
and ibimapWith
?
They aren't p-
functions, so they weren't involved in our little renaming party just now - and now their -With
suffix sticks out like a sore thumb. Is there anything we can do about it?
Sources
We defined imapWith
and ibimapWith
initially based on the concept of continuing a path by adding a single index to it. However, this perspective is subtly wrong. We used a path, because that is what we had at the time - 'a path back up'. But as it turns out, we mistook the path for the need that it fulfilled - we are not looking for a 'path', we are looking for 'up'.
We have been talking from the path perspective - that is, building up a path from indices. However, we should not assume that we are always consing the index onto a path - we should generalize, to understand that we are consing the index onto whatever is 'up'. It might be the first link in the chain. It might be a path - a series of indices linked together. It might even be anchored to something - that is usually what chains are for.
We need a change in perspective, if we are to continue our ouroborosean journey.
smap / sbimap
If a source is a generalization of index and path, to be 'up', then the proper perspective for imapWith
is not (Index f -> path -> indexpath)
, but rather (Index f -> source -> index)
- where the source might be a path, might not.
We can reframe the types to be clear about this:
smap
:: (IndexedFunctor f)
=> (Index f -> source -> index) -- Cons
-> (index -> a -> b)
-> source -- Nil
-> f a -> f b
smap cons f src = imap (\ i -> f (cons i src))
-- NOTE: imap = smap const
This is just imapWith
- literally, we've only changed type variable names, and not any logic - but it has the proper perspective, as to better guide our intent. In fact, we will ditch the imapWith
nomenclature entirely, in favor of renaming it to smap
, since they are entirely the same function.
We can of course perform the same perspective reframing with ibimapWith
, giving us sbimap
:
sbimap
:: (BiindexedBifunctor f)
=> (FirstIndex f -> source -> findex) -- Left Cons
-> (SecondIndex f -> source -> sindex) -- Right Cons
-> (findex -> a -> b)
-> (sindex -> c -> d)
-> source -> f a c -> f b d
sbimap icons jcons f g p = ibimap (\ i -> f (icons i p)) (\ j -> g (jcons j p))
-- NOTE: ibimap = sbimap const const
So, from now on, remember that smap = imapWith
and sbimap = ibimapWith
.
shylo / sbihylo / sdihylo
Now that we have added smap
and sbimap
to the renaming part, officially welcoming them as a variant of imap
and fmap
, our next step is obvious - we wish to try and derive shylo
. However, it is not as simple as sticking smap
inside of hylo
, because we have already done that to get phylo
and friends.
Remember - we just renamed imapWith
to smap
, and phyloWith
to phylo
, and phylo
was already defined using imapWith
in the first place. And the same for all of the p-
functions! We want to do something with smap
to make some function shylo
, but we ended up with phylo
! What gives?
If we look closely, the source
type hasn't disappeared, it's been folded up into path
! So what is going on?
You may recall earlier that we defined pmapWith = imapWith
(now pmap = smap
), a seemingly useless specialization. All it does is constrain smap
's cons-like argument Index f -> source -> index
to be an actual cons Index f -> path -> path
- but also, it has forced the initial source
argument to become path
too.
When we use smap
for a single level, the source
type is unperturbed. However, when we use it recursively by plugging it into hylo
, it forces source
and path
to be the same type, since they are both 'up' from the index. So, by using smap
in hylo
it is automatically dropped to be as strict as pmap
!
In fact, all of the p-
functions can be defined using pmap
or pbimap
. Well, almost all of them. Notably, pdihylo
still requires sbimap
rather than pbimap
- and we must ask, why?
We defined pdihylo
as:
pdihylo
:: BiindexedBifunctor f
=> (FirstIndex f -> path -> index) -- Cap
-> (SecondIndex f -> path -> path) -- Cons
-> (index -> a -> b)
-> (path -> f b d -> d)
-> (path -> c -> f a c)
-> path -- Nil
-> c -> d
pdihylo cap cons f alg coalg = h where
h p = alg p . sbimap cap cons f h p . coalg p
Note the cap
argument has type FirstIndex f -> path -> index
and f
has type index -> a -> b
.
We needed to do this to jump from talking about a recursive path, to talking about a (recursive) functor's index, as we transitioned from recursing through the functor, to mapping over the functor content.
But we could have defined pdihylo
as:
pdihylo
:: BiindexedBifunctor f
=> (FirstIndex f -> path -> path) -- Cap
-> (SecondIndex f -> path -> path) -- Cons
-> (path -> a -> b)
-> (path -> f b d -> d)
-> (path -> c -> f a c)
-> path -- Nil
-> c -> d
pdihylo cap cons f alg coalg = h where
h p = alg p . pbimap cap cons f h p . coalg p
In contrast, this version has cap
argument of type FirstIndex f -> path -> path
. It forces the recursive functor to use the same type for both recursive path and functor index, allowing us to define it purely in terms of pbimap
instead of sbimap
- but also making it deficient, so we'll be keeping the other definition.
For pdihylo
, we used smap
to preserve the index
type at one end - and with this in mind, we know that we need to do something to preserve the initial source
type at the other end. We need to be explicit about using the source
to start the path
, just as we were explicit about capping / ending it with an index
.
To get smap
, we added a cons-like argument to imap
so we could 'intercept' the index
, but when the resulting smap
gets used recursively / plurally by shylo
, it composes smap
's source
type away into the path
type. shylo
needs something else because we can't just reuse smap
's cons - it is already in use.
To restore our source-iness, we need to 'intercept' the original source and turn it into a path, explicitly - hence, we can solve this by adding a new (source -> path)
argument to phylo
to get shylo
.
NOTE: There is no point in adding a
(source -> path)
argument tosmap
itself - it would only ever be immediately composed with theIndex f -> path -> index
argument, meaning thatsmap
doesn't ever really see thepath
type, and we can just perform that composition outside giving us thesmap
that we actually defined.
shylo
:: (IndexedFunctor f)
=> (source -> path) -- End
-> (Index f -> path -> path) -- Way
-> (path -> f b -> b)
-> (path -> a -> f a)
-> source -- End's argument
-> a -> b
shylo end way alg coalg = h . end where
h p = alg p . smap way h p . coalg p
-- NOTE: phylo = shylo id
We can do the same for pbihylo
to get sbihylo
:
sbihylo
:: BiindexedBifunctor f
=> (source -> path) -- End
-> (FirstIndex f -> path -> path) -- Left cons
-> (SecondIndex f -> path -> path) -- Right cons
-> (path -> f b b -> b)
-> (path -> a -> f a a)
-> source -- End's argument
-> a -> b
sbihylo end icons jcons alg coalg = h . end where
h p = alg p . sbimap icons jcons h h p . coalg p
And again to pdihylo
to yield sdihylo
.
sdihylo
:: BiindexedBifunctor f
=> (source -> path) -- End
-> (FirstIndex f -> path -> index) -- Cap
-> (SecondIndex f -> path -> path) -- Cons
-> (index -> a -> b)
-> (path -> f b d -> d)
-> (path -> c -> f a c)
-> source -- End's argument
-> c -> d
sdihylo end cap cons f alg coalg = h . end
where h p = alg p . sbimap cap cons f h p . coalg p
With sdihylo
, we can see the entire journey take place, from source
above to path
in between to index
below.
Now, these aren't perhaps the most ergonomic functions, their arrangement of arguments having mostly been the result of sticking things on the front of hylo
and friends without much accounting for natural flow. Perhaps we will revisit the issue of ergonomics in the future.
Snake paths
So, how do we go about using these s-
variants? What do we do to fill these new arguments?
With the phylo
, it was easy - we used (:)
and so the path
was just [index]
. Things got a little trickier when we got to pdihylo
- in addition to using (:)
on one side to get [sindex]
as the recursive path
, we had to add (,)
on the other to get (findex,[sindex])
as the functor index
.
With sources, we need something on the other end. We could just use (,)
again, giving us something like ([index],src)
for phylo
and (findex,[sindex],src)
for pdihylo
. A little bit of work later, and viola!
tuplehylo
:: (IndexedFunctor f)
=> (([Index f], source) -> f b -> b)
-> (([Index f], source) -> a -> f a)
-> source
-> a
-> b
tuplehylo = shylo ([],) (first . (:))
tripledihylo
:: (BiindexedBifunctor f)
=> ((FirstIndex f, [SecondIndex f], source) -> a -> b)
-> (([SecondIndex f], source) -> f b d -> d)
-> (([SecondIndex f], source) -> c -> f a c)
-> source
-> c
-> d
tripledihylo = sdihylo ([],) (\ i (p,s) -> (i,p,s)) (first . (:))
These tuple forms aren't bad per se, they do give us fast access to the heterogenous first and last elements. However, I want to take a more recursive approach, because I'd like to give base fuctors to indices of recursive structures later (much later, as in not-appearing-in-this-article later).
First, we need some list-like data types.
-- Cap
-- NOTE: Cap is (,) + List
data Cap head body = Cap head [body]
-- NOTE: Cap is NonEmpty except head is a heterogenous element
type NonEmpty a = Cap a a
-- NOTE: We stole `:|` from NonEmpty for Snake
infixr 5 :<
pattern (:<) :: head -> [body] -> Cap head body
pattern head :< body = Cap head body
capExample :: Cap String Int
capExample = "head" :< 1 : 2 : 3 : []
-- Way
data Way body tail = Way body (Way body tail) | End tail
-- NOTE: Way is List except Nil contains a heterogenous element
type List a = Way a ()
infixr 5 :-
pattern (:-) :: body -> Way body tail -> Way body tail
pattern body :- way = Way body way
infixr 5 :>
pattern (:>) :: body -> tail -> Way body tail
pattern body :> tail = Way body (End tail)
wayExample :: Way Int Char
wayExample = 1 :- 2 :- 3 :> 'z'
-- Snake
data Snake head body tail = Snake head (Way body tail)
-- NOTE: Snake is (,) + Way instead of (,) + List like Cap
-- so technically Cap h b ~ Snake h b ()
-- We could redefine them (List, Cap, Way) in terms of Snake,
-- but that would require explicit invariants that are handled
-- implicitly by being 'different data structures' right now.
infixr 5 :|
pattern (:|) :: head -> Way body tail -> Snake head body tail
pattern head :| body = Snake head body
snakeExample :: Snake String Int ()
snakeExample = "head" :| 1 :- 2 :- 3 :> ()
We can use these as concrete path types with our various p-
and s-
morphisms, like we did specifically with phylo
and List / (:)
.
capdihylo
:: BiindexedBifunctor f
=> (Cap (FirstIndex f) (SecondIndex f) -> a -> b)
-> ([SecondIndex f] -> f b d -> d)
-> ([SecondIndex f] -> c -> f a c)
-> [SecondIndex f] -> c -> d
capdihylo = pdihylo Cap (:)
wayhylo
:: (IndexedFunctor f)
=> (Way (Index f) source -> f b -> b)
-> (Way (Index f) source -> a -> f a)
-> source
-> a
-> b
wayhylo = shylo End Way
snakedihylo
:: BiindexedBifunctor f
=> (Snake (FirstIndex f) (SecondIndex f) source -> a -> b)
-> (Way (SecondIndex f) source -> f b d -> d)
-> (Way (SecondIndex f) source -> c -> f a c)
-> source -> c -> d
snakedihylo = sdihylo End Snake Way
Note that source
is still polymorphic for way-
and snake-
- so it can be nested. This allows me to concatenate heterogenous recursive structures, with one snake biting the tail of / being swallowed by another to become it's source a la this monstrosity:
doublesnakedihylo
:: (BiindexedBifunctor f, BiindexedBifunctor g)
=> (Snake (FirstIndex g) (SecondIndex g) (Snake (FirstIndex f) (SecondIndex f) source) -> x -> y)
-> (Way (SecondIndex g) (Snake (FirstIndex f) (SecondIndex f) source) -> g y b -> b)
-> (Way (SecondIndex g) (Snake (FirstIndex f) (SecondIndex f) source) -> a -> g x a)
-> (Way (SecondIndex f) source -> f b d -> d)
-> (Way (SecondIndex f) source -> c -> f a c)
-> source
-> c
-> d
doublesnakedihylo g galg gcoalg falg fcoalg = snakedihylo (snakedihylo g galg gcoalg) falg fcoalg
Take a good long look at this to understand what it is doing - and take all the time that you need. We've got the source, the outer dihylo with the first path type, the inner dihylo with the second path type, using the first path type as its own source, and finally, at the core, the inner dihylo's indexed mapping function, which stretches all the way back to the source.
The monster factor comes from it being able to be pedantic about all of those things, which makes the type definition rather large - but it could also be seen as a recursive generalization of the concept of a URL, minus any minutia about specific encoding. It helps to clean up the types a little bit with some handy type aliases:
type BodyPath f = Way (SecondIndex f)
type SnakePath f = Snake (FirstIndex f) (SecondIndex f)
doublesnakedihylo
:: (BiindexedBifunctor f, BiindexedBifunctor g)
=> (SnakePath g (SnakePath f source) -> x -> y)
-> (BodyPath g (SnakePath f source) -> g y b -> b)
-> (BodyPath g (SnakePath f source) -> a -> g x a)
-> (BodyPath f source -> f b d -> d)
-> (BodyPath f source -> c -> f a c)
-> source
-> c
-> d
You can do things like path down a Network (FileSystem (Json))
, by using an IP address or domain name as the source, then traverse with directory entries to make the file path to a JSON file, which we burrow down into in a second traversal over the JSON contents, all to point at a specific fragment / sub-JSON object in a specific file on a specific computer in the network.
We can even nest it at the end, sort of like encoding one URL into the query fragment of another URL - except there's no character escaping necessary. If we could translate the following URL file://127.0.0.1/path/to/json/file?path.to.json.object
into a very long Snake
, we might get something like:
type Scheme = String
type IPSubnet = Int
type FilePath = String
type JsonPath = String
-- NOTE: We've left the head as () for now, but it could be made to contain
-- anything, such as the object retrieved by the path
urlSnake :: Snake () JsonPath (Way FilePath (Way IPSubnet Scheme))
urlSnake
= ()
:| "object" :- "json" :- "to" :- "path"
:> "file" :- "json" :- "to" :- "path"
:> 1 :- 0 :- 0 :- 127
:> "file"
-- NOTE: Remember, it reads from right to left / bottom-up because it is a stack.
One thing to note is that the head of the snake is ()
, because Json is not a functor. It does however have leaf nodes containing single terminal values. We can imagine:
data Json' a
= Object (Map String (Json' a))
| Array (Vector (Json' a))
| Prim a
type Json = Json' (String :+: Scientific :+: Bool :+: ())
Then the ()
as head makes sense* - often it is implicitly there, and is usually hidden away in singular functors like List
because it provides no additional information - the only content child a
of Cons a r
is a
. This would be different for plural functors like Join (,) a
(which is really (a,a)
, and would have an index of Bool
aka Bit
instead of ()
aka Unit
), but thankfully they are less common.
NOTE: * Really strictly speaking, it should actually be
Proxy String :+: Proxy Scientific :+: Proxy Bool :+: Proxy ()
or something, but once we strip the types it all collapses down to()
and can be poofed away. With the polymorphic indexing, we are free to abstract away or ignore irrelevant details. If we had kept the concretely-type indexing and pathing, we would not be able to do so nearly as easily.
Regardless, we could have defined urlSnake
as Way JsonPath (Way FilePath (Way IPSubnet Scheme))
or Snake [JsonPath] FilePath (Way IPSubnet Scheme))
, but it doesn't really matter, because we can pick and choose as we need. Now that we have embraced polymorphic indexing and pathing, we have lots of options! You can do whatever you want!
NOTE: If we look closely at what we have done, we can just about see that we have performed our own unfolding of the path, along side the main morphism. Ideally, if the
path
is itself recursive (and[index]
is recursive), we should be able to take advantage of that to write a variant that works likeshylo
except also exposing the path's base functor by taking a coarg-likei -> p -> Base p p
argument instead of a cons-like argument. However, to do that, we need to expressIndexedRecursiveFunctor
such thatIndexF f ~ Base (Index f)
- we've been getting away type and constraint aliases for quite some time, and it may finally be worth defining a new typeclass.
Conclusion
We will end this article with a few questions / observations.
Sources are a one-hole context / reduction of the rest of the universe
So, what is a source anyway? What is 'up'?
When we use paths, we are taking a piece of the above, and carrying it down into the below, so that we may use it to find our way back up to where we started. A source just takes that concept further, beyond the local scope of a path.
As you climb up a ladder, (a step) what was once above you, getting closer, and then all of a sudden it is below you, receding - and the same process occurs in reverse as you climb back down. Yet every step of the ladder is still just part of the ladder - above and below are defined by where you are on the ladder, not by the ladder itself.
In a way, sources are co / dual to indexing - they are going the other way, up, instead of down.
We know what 'below' is. hylo
has a homogenous path; with dihylo
has a homogenous path turning heterogenous at the end (in order to point at the functor content). With sources, we are contemplating that path changing from the other direction - a flip in perspective.
Above could be more of the same - a homogenous source, the perspective of being a middle link in a homogenous path continuing upwards. Or, above could be something different - a heterogenous source, the perspective of being the final link, anchored to a different, greater structure.
This leaves us with a critical question - if sources are dual to indices, and indices are related to the notion of a one-hole context, what does that mean for how sources related to one-hole-contexts? That is, if an index
is a reduction of the rest of the data structure, what is a source
a reduction of?
Earlier, we described indices and paths as a minimal one-hole context describing 'the rest of the data structure' well enough to navigate through it. If that holds, then a source is the minimal one-hole context for the entire data structure, describing 'what is above the data structure' - we have flipped our perspective, and instead of being outside the hole, we are inside, and the only thing that we know about the outside, is what we have been told / brought along with us.
A source is a reduction of the rest of the universe, the world stripped of all but the relevant elements - a sort of minimal one-hole-supercontext, if you will, reflecting the way that indices and paths do the same for the local data structure.
It is worth mentioning that the Haskell RealWorld
trick aligns neatly with our concept of Sourced
, making RealWorld
the inaccessible origin of the snake, hidden behind functions guarding interaction with the real world. It should not be unexpected that some of the seemingly arbitrary choices of Haskell can be understood through this perspective.
Why is this?
NOTE: Construing
RealWorld
as aSource
will yield significant benefits in the future when it comes to representing otherwise-pure segments of code interspersed with the occasional need to 'update' the real world. For instance, what if you wanted to 'replay' the real world? Or swap it for an 'alternate universe'? Try to imagine if, rather than aRealWorld
with no constructors and only used at the type-level, we instead had agit
-likeRepository
construct containing a representation of a subset of the real world, updating only when something has changed and a relevant 'real world' commit has been made. Then, for compatibility, the existingRealWorld
could be treated as aRepository
committing infinitesimally often in between each evaluation step, being continuous instead of discrete, and having no way of either branching nor rolling back. It is a shame thatgit
is only designed to work on directories and text files, because I'd love to be able to apply that level of control over my application or program state. We'll come back to this in the future.
Sources are an abstraction of both indexing and the reader monad.
It should be observed that Sourced
- however we define it when we properly do - is an abstraction over both Indexed
and Reader
, as well as taking function parameters in general.
Basically, Reader r a
and r -> a
are isomorphic, meaning these two are effectively the same:
readerHylo1 :: (f b -> Reader r b) -> (a -> Reader r (f a)) -> a -> Reader r b
readerHylo2 :: (f b -> r -> b) -> (a -> r -> f a) -> a -> r -> b
We can also define this, which is the exact same thing minus some flips.
readerHylo3 :: (IndexedFunctor f) => (src -> f b -> b) -> (src -> a -> f a) -> src -> a -> b
readerHylo3 = shylo id (const id) -- Or: phylo (const id) since phylo = shylo id
The implications of readerHylo3
are that sources abstract both Reader and Indexed / Pathed - you get one or the other, depending on what you leave out. If Sourced is Reader, with Indexed, then Reader is Sourced, without Indexed, and Indexed is Sourced without Reader (or at the very least, Sourced without local
).
This means that, in the last article (and this one as well), we could have written our p-
and s-
functions using a combination of Indexed
and Reader
, using local
to append each index to the path in turn. Were the last few articles just a huge waste of our time?
Certainly not! This shouldn't be seen as a bad thing - rather, this should be seen as a natural and necessary consequence of how Reader
and Index
combine to express "adding each step we take into 'what is up'" . Sourced
is a convenient way of acknowledging using something like Reader
and local
to build polymorphic / openly typed paths, without invoking monads directly - meaning we can use it in a purely functional context.
Far from being pointless, this result - the relationship between Sourced, Indexed, and Reader - is going to be important in coming articles.
Unanswered questions
Finally, over the last few articles, there have been a few unanticipated questions - some of my own, as well as some from readers. I would like to address them and keep track of them, even if I cannot answer them right now.
We started this series branching off of recursion, with a specific journey in mind. There is a jungle of unexplored territory here, a wealth of things to be found - the curse of incompleteness being that the jungle of knowledge grows faster than we can explore it, and not wanting to go off-course, we must press on with our charted course.
Two roads diverged in a yellow wood, And sorry I could not travel both
From one traveler to another, having already chosen my path, I invite you to take your own. Open roads (questions) include:
Power-equivalence of mono-, bi-, and di- recursion
Expressing mutual recursion through bi- or di- recursion
A formal definition in reference to initial algebras
Relationship to zippers / jokers / clowns
A deeper look at the one-hole context paper
Atkey-style indexed functors
Extending the whole zoo of morphisms (
para
,apo
, et al)A closer look at the index / source duality with respect to the cata / ana duality
Expression of recursion in conjunction with the FunctorOf proposal
Up next: Cryptographic recursion schemes
In our next article, we're in for a real treat! We are going to plow head-first into the practical utility of sourced recursion schemes, in order to explore the concept of cryptographic recursion schemes. We're going to build a few cryptographic data structures, and apply them to JSON in order to create our own distributed JSON format.
You see, sources are an excellent way to represent things like signing and encryption keys, salts, or choice of algorithm. Cryptographic values blur the line between data and type*, because a signing or encryption key defines the set of valid cryptographic elements that it applies to - making it possible to use type safety to enforce ownership, such as making it impossible for me to even try to use my key to open your box.
NOTE: * This actually occurs due to a very special reason, which we will get to when we talk about sources at the type-level.
That's all!
I hope you join me next time, traveler, and for now, I wish you a safe journey!