Friday 12 October 2012

The trouble with abstraction

Abstraction is good


Abstraction hides all the complicated details of a process so that the user focuses on the grand scheme instead of dealing with the nitty gritty.  For example, to drive an automatic car you don't have to understand how an engine works, you need to understand how to use the pedals, the steering wheel, and how to read the gauges.  If your car is manual you have other nonsense to worry about, like the clutch, and knowing how the transmission works.  That's because an automatic transmission is one level of abstraction higher than a manual transmission, and as a result if your transmission is automatic than the details of how it works are hidden from you.  Furthermore, both of these abstract away the inner-workings of the engine, so the driver can focus on the road and his destination rather than how to turn the wheels of the car.  Eventually driving will be abstracted even more and cars will drive themselves, so the only thing you'll have to think about is your destination.

Another example of abstraction is a restaurant.  A client in a restaurant is given a menu of options, tells somebody which ones he wants, and gets to eat them shortly after.  Restaurants abstract away cooking and creating recipes, so you only have to worry about how to get the food from your plate into your mouth.

Instructions in a higher level programming languages abstract away Assembly language, which in turn abstracts away machine code.  If you're writing for a VM then there's at least another step in there too.  This is a good thing because code looks like this:


instead of this: 


There are lots of benefits to using higher level languages, including readability, productivity, clarity, etc., that you can read about all over the place, but I'm writing an article about the problem with abstraction.

Where abstraction fails


As it turns out, programming is not the same as driving a car.  Cars are tools with one goal in mind, whereas most programming languages are general purpose tools that are used for building more tools.  The trouble is that it's easier to design a tool that does one thing really well, than it is to design a tool that can do anything, including build more tools.  As a result, high level languages will get the job done, but they may not do it the way you had hoped.  This isn't actually a problem for most applications, but can a pain when writing video games and other performance critical applications.  All of a sudden, the nitty gritty details that have been abstracted away become very important.  Another problem arises when the system breaks.  For example if your car's breaks don't work, you can't fix them unless you understand the underlying system.  Programmers who don't understand how a compiler works will be confused by unfamiliar compiler errors, so they seek somebody who understands the underlying system.

What to do about it?


There are multiple solutions.  You can build your own system of abstraction, so that you can control and understand every aspect of it.  This is equivalent to building your own car engine, or writing your own game engine in a language that gives you the necessary level of control, such as C or C++.  This is both an enriching and time-consuming experience, and when you finish you'll feel like a real champ.  However this is also a very difficult path, and without the right guidance you may never actually finish, in which case you'll feel like a trash can, question whether you just aren't as good as those who succeeded, go through an existential crisis and move into the woods to find yourself.  This is the bottom-up approach to understanding a system of abstraction.

You could also learn the system from the top-down.  In other words, start using an existing system of abstraction, and as you become more comfortable with it you learn ways to coax it into doing what you want.  This has been a common approach for a long time.  For example, when C++ first trickled it's way into game development, developers were concerned with making object-oriented design in C++ as performant as C, which is done by avoiding constructing and destroying too many objects.  The problem is that depending on your code, the compiler may create lots of temporary objects without you realizing.  That means that you have to understand the underlying system of abstraction in order to coax the compiler into not doing that, which I think is a great example of a top-down approach to learning a system of abstraction.  By the way, this problem still exists and in garbage collected languages this can have a huge effect on performance.  Similarly to the bottom-up approach, the more you understand the underlying system the more productive you'll be, but luckily the path to righteousness won't be quite as difficult.  Furthermore, the knowledge you gain from learning this system can be used later if you choose to build your own from the bottom-up.

You could do nothing.  That is, if you're lucky you'll find the right tools for the job and you won't have a problem, like a steering wheel which is perfectly suited to steer a car.

Why am I writing this?


I think it's important for creators to understand their tools, rather than view them as a mysterious black box.  Doing so will make you more effective when using them, and avoid helplessness and frustration when something goes wrong.  

In my experience no system is perfect, and this post represents my experience in dealing with a world of broken systems.