« May 2012 | Main | November 2012 »

October 11, 2012

What's the best answer?

In a recent dialog on a web forum, someone asked how to find the "best" way to construct an SQL query. They had posed a problem and received many solutions, each different in syntax, in approach, and in detail. Reasonably, he asked "how do I find the 'best' solution?"

That's a question we should always be asking when we craft a solution to any problem. In IT, and in SQL particularly, I use these criteria to evaluate an answer:

(1) it must return the correct answer on the target platform
(2) it should use standard syntax and construct
(3) it should be easy to understand by someone who didn't write it
(4) it should be as fast as necessary... it needn't be as fast as possible
(5) it should play well with others in a multi-user environment
(6) it must be ready-to-run on time and under budget

The "best" answer is the one which blends all of these. Notice that only two of these are "must", the others are "should".

Let's take these one by one:

(1) it must return the correct answer on the target platform

You would think this goes without saying, but an astonishing number of solutions don't solve the problem, or solve a different problem. Gratefully, that doesn't happen frequently, yet it does happen often enough that you are well-advised to verify the result before you worry about whether the solution is a 'quality' solution or not.

It's important also to make sure the solution works on the target platform. Testing in a lab or a test-bed or in a database 'just like' the real one is not assurance that the solution works on the target. Why not? Usually because of the scale of the problem -- a table of 10's of millions of rows partitioned over multiple servers can turn your simply elegant solution in an out-of-control monster. If scale is not the problem, data often is -- your test-bed data is carefully crafted and follows all of the rules, but the real-life platform may have outliers, offenders and other difficulties.

(2) it should use standard syntax and construct

This is where the first 'religious' argument begins. On one side are those who argue that you should take best advantage of the platform-specific features afforded by the vendor. By the same token, they might argue that the vendor's ANSI SQL syntax is just a wrapper over its native implementation, and that the wrapper itself is a performance penalty.

The counter-argument is, of course, that the adoption of standard syntax and construct enables the query to work with any tool, or any tool user -- including any SQL programmer. That provides a long-term value and benefit, as tools and staff are replaced over time.

Ultimately, this question resolves to whether you are taking a short-term or long-term view, whether the solution is a single-use item (to be replaced when needed) or it is the foundation for re-use (to be adopted and extended when needed).

Having been an eyewitness to decades of churn in both technology and staff, I am always an advocate for standards, reuse and long-term viability.

(3) it should be easy to understand by someone who didn't write it

By the same token -- taking the long-term view -- the syntax that is used, and the solution strategy that is adopted, need to be intelligible to anyone who might chance to come across it. In the immediate case, you will be obligated to demonstrate to someone -- the user, your team partner, a peer review -- that your solution works on paper, that it is a sensible approach, that it is safe and will cause no harmful side-effects. In the longer case, you may be called upon to debug it six months from now, or someone else may have to do so, when it suddenly stops working, when a user wants a small change to it, or when the vendor upgrades their platform. If the solution is not easily understood, then a minor maintenance task may turn into hours or days of experimentation.

Being easy to understand might mean adding appropriate documentation to the programmer's manual, or embedding inline comments at appropriate places in the code. My preference, however, is to use common and recognizable idioms, to avoid "magic tricks", to use verbose language and names, to organize the code visually, and to use standard syntax (see previous item).

(4) it should be as fast as necessary... it needn't be as fast as possible

This is religious argument, the sequel -- how fast is fast enough?

On the forum previously mentioned, the first answer predictably said: "In the world of databases, usually, the fastest solution is the best answer." Is it? My answer is 'No' (obviously -- or else there would only be one criterion here, not six).

Ask anyone who has tried to set a speed record -- on the Bonneville Salt Flats, for example -- and they will tell you that it takes some very heroic, and very expensive, effort to be " the fastest". Even there, the goal is never to be "the fastest" -- instead, these speed-seekers only try to be faster than the last guy -- to be as fast as necessary to take the first position.

Why not go for the fastest possible? Because it takes too long, costs too much, and carries too much risk. Instead, as fast as necessary is a reasonable -- though difficult -- approach.

In IT, there is an unhealthy, even wasteful, focus on speed. Granted, in some cases, speed is critical -- but these are the rare exception, not the rule. The rule is that your solution is fixing a real-life problem that has real goals and timelines and costs, and that probably has little to do with technology. It probably has a lot to do with the business' domain -- banking, health care, government, education, manufacturing, retail or some other concern.

When your attention is directed to getting the solution that is "the fastest", you are likely distracted from the business that you are there to serve. That business will tell you that a solution you think is "too slow" is "fast enough" to solve their problem. When they tell you that, stop working on the solution.

(5) it should play well with others in a multi-user environment

It's very common to hear of a solution to a poorly-behaved database query that begins with "Your database table is wrong. You need to create a new index/column/table. Then you can simply...".

And it is equally common for the poor programmer to reply: "I can't change the database" -- the DBA or the vendor or the legacy system has already decided how the database will be structured. Or, more likely, implementing a database change is a months-long effort involving planning by four or five organizations, refactoring many tools and programs, and planning a data migration. By contrast, the user needs an solution by Monday morning.

Playing well with others requires that you take notice that your little problem is a small part of a much larger, more complex system. Therefore, your solution must fit within its allotted space -- memory-wise, space-wise, processor-wise and network-wise -- within that system complex.

(6) it must be ready-to-run on time and under budget

For those of us who don't have the luxury of working with an unlimited budget, without a clock or calendar, who are not doing problem-solving just for the pure joy of it -- this last criterion might be the first, or only, criterion on the list.

Every problem that is presented to you comes with the question: "When can you get this fixed?" and "How much will it cost?". We are fond of saying that anything is possible, it just takes time and money. This is where time and money have their way with your beautiful, elegant, all-encompassing, everlasting solution.

Someone is waiting -- impatiently -- for your solution. And someone else is waiting for that person to finish what he was asked for so they can get on with their job. You, the solution provider, the expert, are at the end of a long chain of people waiting to do their jobs.

I have never seen a solution that couldn't use just a few more tweaks. That wouldn't be just a little better if we could make one more change.

But inevitably, the time and the money run out. Your solution must be ready and deliverable before that happens.

So -- how do you know which solution is the best solution? Make sure of these two things -- that the solution works and that you've spent no more time and money than you can afford. If you've met those criteria, then you can use the others to evaluate how much better one solution is than another. If you've got reasonable-to-high scores in all criteria, you can be comfortable knowing you've got as good an answer as you might expect.... until the next time.

[ Yahoo! ] options