The comfortable delusion of infinitely increasing computer speed is shattering - time to learn some real programming!
We are often told that hand-written assembler gives the fastest code. Next up the food chain is very low-level C, then C++, followed (maybe) by Java, and tailed by something akin to Python. Does this all make sense? Is this the real situation, or is it a case of correlation not implying causation? In this post I would like to suggest it is the latter. I posit that knowledge of the underlying mechanisms employed by a program is a much bigger factor than the choice of language. If we were to read a file one line at a time from disk (a real spinning disk), the performance of the system would be dominated entirely by the buffering strategy and not by the code reading the lines. Which language is used (Java, C and such) would make no noticeable difference at all. In such a case, hand-writing assembler would be a very bad design choice: the chances of implementing a sub-optimal buffering strategy would be high, and maintaining that choice correctly over many software and hardware updates would be near impossible.
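To make that concrete, here is a minimal sketch (in C++, with 'lines.txt' as a placeholder file name) that runs the exact same line-reading loop twice: once with the stream's default buffer and once with buffering disabled. On a typical system the unbuffered run is dramatically slower, because every character becomes a trip to the operating system - yet the reading loop itself never changes.

```cpp
// Minimal sketch: the same line-counting loop, run once with the
// stream's default buffer and once with buffering disabled.
// "lines.txt" is a placeholder; timings will vary by platform.
#include <chrono>
#include <fstream>
#include <iostream>
#include <string>

static long count_lines(bool buffered) {
    std::ifstream in;
    if (!buffered)
        in.rdbuf()->pubsetbuf(nullptr, 0); // must precede open()
    in.open("lines.txt");
    std::string line;
    long n = 0;
    while (std::getline(in, line))
        ++n;
    return n;
}

int main() {
    for (bool buffered : {true, false}) {
        auto t0 = std::chrono::steady_clock::now();
        long n = count_lines(buffered);
        auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(
                      std::chrono::steady_clock::now() - t0).count();
        std::cout << (buffered ? "buffered:   " : "unbuffered: ")
                  << n << " lines in " << ms << " ms\n";
    }
}
```

The point of the sketch is that nothing in the loop is language-specific or clever; the entire performance difference lives in the buffering strategy underneath it.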
So what am I saying? I guess I am saying that understanding the way the computer's I/O system actually works is more important than the code we write. Understanding the underlying 'exactly what is happening' is more important than the details of the implementation language. The difficulty we face as developers is that software practice has moved towards abstraction. Mathematics gets in the way of computer programmers in so many ways. Some are foolish enough to see computer programming as a branch of mathematics; this makes about as much sense as saying architecture is a branch of physics. Similarly, educators, scared of frightening young people off programming, hide the electronic nature of a computing machine behind interpreted, easy-to-understand programming techniques. This may well be a valid approach, except that the step to developer maturity - learning what the computer is actually doing - never happens.
Sadly, the notion that simply writing code in a low-level language like C will make it run fast is also misguided. Indeed, writing 'next to the machine' in assembler or C is no guarantee of performance. Humans lack the ability to process the large amounts of related data required to perform some of the best optimisations at which modern compilers are becoming so proficient. We might be brilliant at writing 20 instructions tailored for absolute maximum performance and not notice that these instructions can be elided in the majority of code paths thanks to branch speculation optimisation.
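As a small illustration of how far ahead of us the compiler can be, consider the following sketch. It shows a related trick, final-value replacement: Clang at -O2 recognises the loop's closed form and folds the 'work' away entirely. Other compilers' output will differ, so it is worth checking on Compiler Explorer.

```cpp
#include <cstdint>
#include <cstdio>

// A hand-written loop that looks like O(n) work at runtime.
static std::uint64_t sum_to(std::uint64_t n) {
    std::uint64_t s = 0;
    for (std::uint64_t i = 0; i < n; ++i)
        s += i;
    return s;
}

int main() {
    // At -O2, Clang recognises the loop's closed form n*(n-1)/2 and
    // constant-folds the whole call: the emitted code simply loads
    // 499999500000. Verify with `clang++ -O2 -S` or Compiler Explorer.
    std::printf("%llu\n", (unsigned long long)sum_to(1000000));
}
```

Any hours we spent hand-scheduling those loop instructions would have been wasted: the compiler deleted the loop.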
Nevertheless, we cannot rely on compilers to always arrive at an optimal solution, and we definitely cannot expect them to work around poorly written code. What is required is cooperation between compiler and developer. To achieve that, we developers need to learn more about the dirty mechanics of code execution; we need to stop thinking in highly abstract ways and start disassembling generated code. We need to look inside library objects like shared_ptr and find out the mechanical details of what they do, not just skim over the high-level theory.
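For example, here is a sketch worth disassembling (with `g++ -O2 -S`, objdump or Compiler Explorer) to see what shared_ptr actually costs at the machine level:

```cpp
#include <cstdio>
#include <memory>

// Taking the pointer by value copies it: on typical implementations
// that is an atomic increment of the control block's use count
// (a `lock`-prefixed instruction on x86), plus a matching atomic
// decrement and a branch when the copy is destroyed.
static int by_value(std::shared_ptr<int> p) { return *p; }

// Taking it by const reference touches no reference count at all.
static int by_ref(const std::shared_ptr<int>& p) { return *p; }

int main() {
    auto p = std::make_shared<int>(42);
    std::printf("%d %d\n", by_value(p), by_ref(p));
}
```

This is exactly the kind of mechanical detail the high-level theory skims over: the atomic traffic is invisible in the source but plain to see in the disassembly, and on a contended cache line it is far from free.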
Similarly, there is nothing inherently wrong with using interpreters; as a high-level abstraction for pushing large computational tasks around, they can be very useful. The moment we find ourselves performing that computation in an interpreted (or, in truth, dynamic) language is the moment we have gone wrong.
Where does this leave abstraction? Do we have to jettison Haskell, Java and C++ in favour of C and Fortran? No, we can still work at a high level of abstraction where it helps - but we must not hide behind it.
1. We need to accept that hot code needs to be fully characterised on the target hardware. Mathematical abstraction will not help (see the sketch after this list).
2. The free lunch of ever cheaper and faster hardware is over - computer performance has stagnated and costs are rising.
3. Tooling must evolve beyond simple 'this bit of code is taking the most time' - we need to know WHY.
4. We as developers need to stop hiding from reality - computers are machines, computation is engineering not mathematics or science.
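As a starting point for points 1 and 3, here is a minimal measurement sketch. Wall-clock timing like this only tells us *what* is slow; on Linux, hardware counters (e.g. `perf stat -e cache-misses,branch-misses ./a.out`, with the available events depending on CPU and kernel) begin to tell us *why*.

```cpp
// A minimal sketch: time the actual hot loop on the actual target
// machine, then ask the hardware *why* via performance counters.
#include <chrono>
#include <cstdio>
#include <vector>

int main() {
    std::vector<int> data(1 << 24, 1);     // ~64 MiB working set
    auto t0 = std::chrono::steady_clock::now();
    long long sum = 0;
    for (int x : data) sum += x;           // the "hot" loop
    auto ns = std::chrono::duration_cast<std::chrono::nanoseconds>(
                  std::chrono::steady_clock::now() - t0).count();
    std::printf("sum=%lld in %.2f ms\n", sum, ns / 1e6);
}
```

The working set is deliberately bigger than any cache, so the counters, not the source code, will tell the real story of where the time goes.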
Points 1, 2 and 4 are starting to sink in, in the commercial world at least. I know of a few places where point 3 is in its infancy. The future will be a more hardware-centric, intellectually challenging place: in my view, all the better for it.