Daily Surf: Understanding Engineers: Feasibility!

In an internal blogpost inside Atlassian, I described a certain problem as being ‘very hard’, to explain why our efforts were better spent elsewhere. Later, as I was walking across the bridge into work I had a moment to look at that statement through the eyes of a non-engineer. Atlassian prides itself on hiring really smart people. What use are they if they can't solve hard problems?

To that end, here's a quick lexicon of what computer programmers generally mean when they're talking about how hard some problem is, starting with the most extreme:

Impossible

The man most commonly regarded as the 'father' of computer science is the English mathematician, Alan Turing. Turing did a lot of work in World War II helping the Allies break the German military ciphers. To reward him, the British Government convicted him of gross indecency after the war (he was homosexual), took away his security clearance and put him on hormone therapy. His death not long after is generally accepted as suicide.

Anyway. Turing's most famous contribution to computer science is the Church-Turing Thesis. This describes a theoretical device called a Universal Turing Machine that is both capable of solving any computational problem that could be represented as an algorithm, and of emulating any other device that solves computational problems.

Anything that can be (deterministically) mechanically computed can be computed by a Turing machine. Anything that performs deterministic mechanical computations is really just a Turing machine. Engineers speak of computer hardware or programming languages that have this full range of computation as being "Turing complete".

In this framework, the word 'impossible' has a definite meaning. A problem is impossible if its solution can not be computed by a Turing machine.

Admittedly, this isn't a very useful distinction. On one hand, Turing machines are a mathematical theory. They're infinitely fast and have unlimited storage, and you have an unbounded amount of time to write your program for it. As such, many things that are theoretically possible are practically impossible (more on that later).

On the other hand (and this trips up engineers all the time), it doesn't ascribe any value to mostly solving a problem. No Turing machine can tell you if you're going to like a book or not, but Amazon makes a lot of money out of coming out with a close-enough guess.

Trivial

At the other end of the scale, we have problems that are trivial. This definition can be expressed in far fewer words:

I know how to solve this problem.

To a programmer, a problem is trivial if there is a clear solution, and the only thing that needs to be done is to implement it.

The only caveat is that triviality refers to how hard the problem is to solve, not how hard it is to implement the solution. So there is no necessary relation between a task being trivial, and how long it takes. To the programmer, once the plans for the bridge have been drawn up, the materials chosen properly and the model tested for how it would survive wind, traffic and earthquakes, actually building the bridge is trivial.

Unfeasible

(Grammatical correctness would suggest ‘not feasible’ as the proper alternative, but I suspect ‘unfeasible’ has enough current usage amongst programmers to ignore correctness)

A problem is unfeasible if enough of the solution is known to determine that you don't have the resources to solve it. You might not have the free developer resources, or the available expertise, or it might just need more hardware than you can ever afford.

The field of cryptography offers us a perfect example of a problem that is both trivial and unfeasible. The algorithms that encrypt and decrypt data are well-known and published. Assuming the algorithm works ‘as advertised’, to decrypt some data you just need to write a program that implements that algorithm, and throw enough computing hardware at the problem to run that algorithm with every possible key.

Cryptography works by choosing keys in a way that ensures there's not enough computing hardware available, practically speaking, to run that trivial program to completion. Decrypting such data is unfeasible.

Non-Trivial

An ex-Google non-engineer described 'non-trivial' thus in the Xooglers blog:

"It means impossible. Since no engineer is going to admit something is impossible,they use this word instead. When an engineer says something is "non-trivial," it's the equivalent of an airline pilot calmly telling you that you might encounter "just a bit of turbulence" as he flies you into a cat 5 hurricane."

This quote shows both the difference between the engineer and non-engineer's conception of non-triviality, and their different definitions of impossible (Flying through a hurricane isn't necessarily impossible, you just really wouldn't want to do it).

In clear opposition to the meaning of trivial, non-trivial means:

I do not know how to solve this problem completely.

Non-trivial contains dangerous unknowns. Some part of it is not yet understood, or lies outside the range of things the programmer has done before, or can quickly imagine a workable solution to. The more experienced the programmer who tells you a problem is non-trivial, the more concerned you should be.

Given time for research and experimentation, a non-trivial problem can be made trivial. Or it could be determined to be hard.

Hard, and Very Hard.

Hard problems are a class of non-trivial problem1 where some of the unknowns, some of the problems the engineer would have to find a way to overcome, are known to be difficult to find solutions to. These are problems the engineer has tried to solve in the past, or has seen the resources that other developers have had to throw at similar problems to solve them.
'Hard' can also be used to describe non-trivial problems with a big potential for 'unknown unknowns' - things the developer not only doesn't know the solution to, but doesn't even know what kind of problems he's likely to face.

Scaling out a web application is hard. There are any number of thorny problems to do with performance, distribution, data access, caching and so on that take quite a bit of effort to solve, and are often different enough from application to application for there to be no good, general solution. There are also inevitable thorny problems that you haven't even anticipated, lurking in the wings.

Very Hard is the extreme of hard problems. You'll often see both words capitalised for emphasis, even in the middle of a sentence. Indexing the entire World Wide Web and providing relevant search results in millisecond response times is a Very Hard problem. Breaking commercial-grade encryption within practical hardware and time limitations is a Very Hard problem. Peace in the Middle East is a Very Hard problem.

'Very Hard' is usually reserved for the class of problem that if you solved it, you could change the world. Or at least build a successful business on top of your solution.
1 Gary Capell pointed out to me in email that, in classic engineering understatement, hard, or even Very Hard problems can sometimes be referred to as ‘distinctly non-trivial’.

Daily Surf

Tuesday, 17 July 2007

Understanding Engineers: Feasibility!

No comments:

Blog Archive