Wednesday, January 16, 2013

On Conversion, Data Types, and Flawed Code

I came across the following expression recently:

DisplaySequence = Convert(int, IsNull(A.DisplaySequence, 10000))

I was immediately struck by an inconsistency: the IsNull function is being passed an int as its second parameter. That doesn't make sense. If the Convert is needed in the first place, then the DisplaySequence column is not int already, so either the 10000 is going to be converted to some non-integer data type, or the DisplaySequence column will be converted to an int. Neither of those makes any sense. If Convert is needed at all, it should be applied to the column inside the IsNull, so that both operands are already int and no strange implicit conversion between disparate data types can occur.
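
To see which of those two conversions would actually happen, here is a minimal sketch. It relies on the documented behavior that T-SQL's IsNull returns the data type of its first argument. The variable @s is a hypothetical stand-in for a non-int column (it is not from the original code), and SQL_VARIANT_PROPERTY is used only to report the resulting type.

```sql
-- Hypothetical stand-in for a column that is NOT int; NULL by default.
DECLARE @s varchar(10);

-- IsNull takes the data type of its first argument,
-- so the int literal 10000 is converted to varchar(10) here.
SELECT SQL_VARIANT_PROPERTY(IsNull(@s, 10000), 'BaseType') AS IsNullType;

-- Converting the first argument up front keeps both operands int.
SELECT SQL_VARIANT_PROPERTY(IsNull(Convert(int, @s), 10000), 'BaseType') AS ConvertFirstType;
```

Under these assumptions the first query should report varchar and the second int, which is why the Convert belongs inside the IsNull if it is needed at all.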

Also, if this is working in production code and hasn't been causing errors or improper ordering, then all the values in the DisplaySequence column, when not NULL, are already integers or an equivalent data type (as far as ordering is concerned)! And if any could be non-integers, then some way to extract the numbers from them or convert them to NULL would be needed. So all in all, this expression inherently makes no sense.

But let's address the question--could this column contain non-integers? A quick sp_help TableName reveals that the column is... drumroll please... int. So the Convert is definitely unnecessary, because the column is already int!
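
If the full sp_help output is more than you need, a narrower check is possible. This is only a sketch, assuming the standard INFORMATION_SCHEMA.COLUMNS view is available; TableName is the same placeholder as above, not a real table.

```sql
-- Report the declared data type of a single column.
-- 'TableName' is a placeholder; substitute the real table name.
SELECT COLUMN_NAME, DATA_TYPE
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'TableName'
  AND COLUMN_NAME = 'DisplaySequence';
```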

Why does it matter to make your expressions make sense? It was working, wasn't it, without any errors in production code? Who cares?

Well, it has been said that any programmer can write code that a computer can understand. But it takes a good programmer to write code that a human can understand. The very fact that I had to go and check the data type of the DisplaySequence column is proof that there is a cost to this kind of thing. No one reading an inherently flawed expression can tell at a glance whether it is immune to error, so each reader has to stop and research it on the spot. Inherently flawed expressions should be corrected immediately. My work to correct this today is paying down the technical debt that was taken out when it was first written, and now the next visitor to this code won't have to waste time.

Here is the correct expression:

DisplaySequence = IsNull(A.DisplaySequence, 10000)

Now there is no inherent contradiction and we can implicitly trust that DisplaySequence is of type int. Even if we are wrong on this point, we've removed the eyesore that would trip up the next visitor to the code.

Monday, January 7, 2013

The Excluded Possibility

Consider this code:

   Value =
      CASE
         WHEN Value IS NULL AND Text IS NOT NULL THEN Text
         WHEN Text IS NULL AND Value IS NOT NULL THEN Value
         ELSE NULL
      END

Does a problem leap out at you? It should. What if Value and Text are both not NULL? Then the whole expression is NULL. There's no WHEN Text IS NOT NULL AND Value IS NOT NULL THEN ... case.

Maybe the query can never return a non-NULL value for both of the columns at the same time. But in that case, why bother checking for the NULLity of the other column in each condition? It doesn't make sense.

This can be rewritten much more sensibly:

   Value = IsNull(Value, Text)

It's that simple. If the two can't both be non-NULL at the same time, it will return the same value as the whole prior expression. If they can both be non-NULL at the same time, well, we have a problem... but no bigger a problem than the previous one. We would simply show one of the columns instead of a NULL. Perhaps here is the real solution:

   Value = IsNull(Value, '') + IsNull(Text, '')

Something along these lines (perhaps with a separator in the case they both are non-NULL) is the only thing that makes sense in terms of explicitly checking the NULLity of both columns.
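
To compare the candidates side by side, here is a small sketch that runs each expression over all four NULL combinations. The table variable @T, its sample values, the varchar(10) column types, the result column names, and the ', ' separator are all assumptions made for illustration; the column names are bracketed only as a precaution.

```sql
-- Hypothetical sample rows covering all four NULL combinations.
DECLARE @T TABLE ([Value] varchar(10) NULL, [Text] varchar(10) NULL);
INSERT INTO @T ([Value], [Text]) VALUES
   (NULL, NULL),
   ('v',  NULL),
   (NULL, 't'),
   ('v',  't');

SELECT
   [Value],
   [Text],
   -- The original expression: NULL when both columns are non-NULL.
   CaseResult =
      CASE
         WHEN [Value] IS NULL AND [Text] IS NOT NULL THEN [Text]
         WHEN [Text] IS NULL AND [Value] IS NOT NULL THEN [Value]
         ELSE NULL
      END,
   -- The simple rewrite: prefers Value when both are non-NULL.
   IsNullResult = IsNull([Value], [Text]),
   -- Concatenation: note it yields '' rather than NULL when both are NULL.
   ConcatResult = IsNull([Value], '') + IsNull([Text], ''),
   -- Concatenation with a separator only when both are non-NULL.
   SeparatorResult =
      CASE
         WHEN [Value] IS NOT NULL AND [Text] IS NOT NULL THEN [Value] + ', ' + [Text]
         ELSE IsNull([Value], [Text])
      END
FROM @T;
```

The first three rows should agree between CaseResult and IsNullResult; only the last row, where both columns are non-NULL, separates them, and that is exactly the excluded possibility.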