Something is wrong in computing.

An essay about the unstable state of computing, and how we might avoid the dangers lying therein.


The bad

Something is wrong in computing, and we need to do something about it.

Computing used to be math, but these days it doesn't feel like it. Now, computing is a scary, complicated thing, to the point that I've heard ill-informed colleagues claim that programming is not mathematical and that software is unprovable, so there's no point in even trying. This is a concerning outlook, because it is prevalent, wrong, and self-fulfilling, and yet viewpoints like that can be found at every level of the industry, voiced even by people charged with the safety of the data of millions of users. When the very builders of a system do not trust it, something is wrong.

What is wrong in computing?

Instead of being a reliable, trustworthy tool, it is filled with hacks, vulnerabilities, inefficiencies, and dependencies. Viruses, trojans, backdoors, worms, malware, hacks, social engineering, scraping, phishing, ransomware. If we look back, we can see that this has been happening for a while, and that the security problem has been steadily worsening over time.

Timeline of notable hacks

We'll start in the 2010s with the event that arguably made hacking real to the world, fifteen years after the release of the film 'Hackers' in 1995, which also left one hell of an imprint on the public consciousness.

This list is by no means complete - it goes on forever. These are just the ones that we know about, the ones that float to the top of even the most cursory of searches. I can assure you that there are many more hacks off this list than on it, and there are a great many that we do not know about, or that were never properly documented.

These are just the hacks. The list of vulnerabilities goes on and on as well.

We could also list social engineering techniques (Facebook and Google tricked out of $100 million by an extended phishing campaign), or petty developer spats (Coder deletes 17 lines of code, breaks the internet), or just plain incompetence (Knight Capital crashes stock market, suffers $440 million in losses).

Every mistake is a vulnerability.

It isn't just security. It's also scaling.

Even in the purest of environments, we still have trouble scaling. Turing completeness works on the assumption of infinite computing power, but we are stuck with the real world, and the universe at large. And so every useful computer is concerned not just with the task at hand, but also with the acquisition and allocation of resources with which to perform that task - a necessary side effect of being a finite computer with finite access to a finite set of resources.

We have solved the first problem - how to do any and all math with a single, infinite computer. We have not solved the second problem, which is how to do it with an appropriately small, finite network of finite computers. This task is much, much harder, which is rather interesting, because one of the side effects of Turing completeness is that it should, in principle, not matter at all whether you are running your computation over a network or on a single device. It is merely a matter of speed. Which raises two questions: how much slower, and why don't we have provable distributed computation even if it would be slower?

The real-world problem means we must think about the efficient allocation of resources in order to make a problem tractable. In the real, finite world, fetching from the network is oodles of magnitude slower than reading from disk, or accessing RAM, or cache, or registers. But reading from disk is much slower than reading from RAM, and so on. Put it on a logarithmic scale, and the network is just a bit further away than the disk. So it seems that speed is not the issue, as we have plenty of it. Latency doesn't cause us any logical trouble (specific latency-sensitive tasks aside, and that's logistical), so why isn't the internet just one giant network hard drive?
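
To make that logarithmic perspective concrete, here is a minimal Haskell sketch. The figures are rough, commonly cited order-of-magnitude values I'm assuming for illustration, not measurements - real numbers vary wildly by hardware and network.

    import Text.Printf (printf)

    -- Rough, assumed latency figures in nanoseconds (order of magnitude only).
    latencies :: [(String, Double)]
    latencies =
      [ ("L1 cache reference",                          1)
      , ("RAM reference",                             100)
      , ("SSD random read",                        100000)
      , ("Network round trip, same datacenter",    500000)
      , ("Spinning disk seek",                   10000000)
      , ("Network round trip, cross-continent", 150000000)
      ]

    -- Print each latency alongside its base-10 logarithm.
    main :: IO ()
    main = mapM_ report latencies
      where
        report (name, ns) =
          printf "%-40s %12.0f ns  (~10^%.1f)\n" name ns (logBase 10 ns)

On these rough numbers, a same-datacenter round trip even lands below a spinning disk seek; viewed logarithmically, nothing on the list is impossibly far from anything else.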

Latency, bandwidth, and storage space are not the only resources we have to deal with. A resource we sorely lack is an understanding of the reliability of computation itself. We must compute how the computer is to compute its allocation of resources for computing, and to do that we must understand all of the layers - and right now there are a great number of layers. Oh, weee're... *inhales* running an application on a site in a browser that's a program in the software runtime of an operating system riding a hardware BIOS! Each layer has its own software stack, resulting in a great heap of software stacks sitting in the way, crushing computing beneath its vast weight and stagnation. It's Turing machines, all the way down! (And that, my readers, is the problem).

So can we do anything about it?

The good

Something is wrong in computing, and we can do something about it.

This heap of layers forms the strata of our digital civilization, and it is stopping us from fixing everything because it is everything that is wrong. It is why we must endlessly reimplement the same sorting algorithms, over and over, everywhere, in every language, rather than simply stating that it will be sorted by whatever algorithm is appropriate. Instead of dealing with general computation, you must memorize specific stacks, with their specific quirks, which are often unreliable, unmaintained, and vulnerable.
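
By way of contrast, here is a minimal Haskell sketch of what "simply stating that it will be sorted" can look like: one polymorphic declaration, constrained only by orderability, with the library - not the caller - supplying the algorithm.

    import Data.List (sort)

    -- One polymorphic sort covers every orderable type; nobody here has
    -- reimplemented quicksort, mergesort, or anything else.
    sortedInts :: [Int]
    sortedInts = sort [3, 1, 2]

    sortedNames :: [String]
    sortedNames = sort ["carol", "alice", "bob"]

    main :: IO ()
    main = do
      print sortedInts   -- [1,2,3]
      print sortedNames  -- ["alice","bob","carol"]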

Once you have a vulnerability, an impurity, you have broken the chain of mathematical proof. If an algorithm takes input that must satisfy some constraints, and you give it data outside of those constraints, the behavior is undefined and unacceptable - and our digital civilization rests on an unstable foundation because, by and large, it does nothing to prevent this chain from being broken, being more concerned with 'does it do the thing' than with 'does it always do the thing correctly?'. Such a system relies on trust to operate, but simultaneously cannot be trusted, because it relies on trust instead of truth.
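
One way to keep that chain unbroken is to make the constraint part of the type, so that out-of-range data cannot even be constructed. A minimal Haskell sketch - the names are illustrative, not from any particular library:

    module Percentage (Percentage, mkPercentage, getPercentage) where

    -- The constructor is not exported, so the only way to obtain a
    -- Percentage is through mkPercentage, which enforces the constraint.
    newtype Percentage = Percentage { getPercentage :: Int }

    mkPercentage :: Int -> Maybe Percentage
    mkPercentage n
      | n >= 0 && n <= 100 = Just (Percentage n)
      | otherwise          = Nothing

Downstream code that receives a Percentage never has to re-check the range; the proof travels with the value.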

So our number one goal in computing is to ensure that the computation itself is valid. It is far easier to do this when we start with a valid foundation, and ensure that everything we build rests upon it. You do not need to trust math when you can do it yourself, and see the truth for yourself. That is what we need in order to solve both problems - security and distributability - because solving one is solving the other.

We are seeing the rise of functional programming and strongly-typed languages into the public consciousness as a response - they are attempting to reclaim the mathematical purity of proof that we lost to decades of inappropriate use of object-oriented programming. This can be seen in the intrusion of functional features into previously object-oriented languages, mostly stolen, slowly, from Haskell over the last decade.

This is of course a good thing! Functors are the hot new thing, what with the popularity of map / reduce techniques, which any good programmer knows are just foldable functors. I hope they steal monads next, because most scripting languages could be a monad.
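
For the curious, the correspondence is a short one. A minimal Haskell sketch: map is fmap over a Functor, and reduce is a fold over a Foldable.

    import Data.Foldable (foldl')

    -- "map": apply a function inside the structure.
    squares :: [Int]
    squares = fmap (^ 2) [1 .. 5]

    -- "reduce": collapse the structure with a combining function.
    sumOfSquares :: Int
    sumOfSquares = foldl' (+) 0 squares

    main :: IO ()
    main = print sumOfSquares  -- 55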

The ugly

Something is wrong in computing, and there's a lot to do about it.

In a total language, a well-typed program always halts, and we can take advantage of this to write software that has provable behavior, in the very rigorous sense that a well-typed program is a mathematical proof. This is what we must do with every program, for every computer. It is just another jump in computing, like the jump to multicore, or to 64-bit, or to the internet.
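
To make the "program as proof" claim concrete, here is a minimal Haskell sketch of the Curry-Howard reading: the type is the proposition, and the function body is its proof (modulo Haskell's non-termination, which is exactly why totality matters).

    -- "If A implies B, and B implies C, then A implies C."
    -- Function composition is the proof.
    hypotheticalSyllogism :: (a -> b) -> (b -> c) -> (a -> c)
    hypotheticalSyllogism f g = g . f

    -- "A and B implies A." Projection is the proof.
    firstOfPair :: (a, b) -> a
    firstOfPair (x, _) = x

    main :: IO ()
    main = putStrLn "The proofs are the definitions above; there is nothing to run."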

Unfortunately, it means we need to re-do just about everything. Most software is "badly written", even if it is quite useful. Fortunately, this time around we don't have to fumble about for decades figuring out what features users need; we just need to re-implement them. We know people want chat and forums and wikis and games, we just have to describe them a little better. With enough work, we can recompile everything, and someday, hopefully soon, we will look at all the bits and bytes and files and folders the same way we view punch cards and rotary phones. It's all just math anyways.

It is far easier to build a compiler for assembly in a higher-level language, than it is to build a compiler for a high-level language in assembly. A singular computation must be distributed manually across a network, but a distributed computation gracefully compresses down to running on a network of one.

This is why we're building a functional operating system. Better late than never, eh? And every little bit makes it better.

- Leo D., August, 2022