Friday, April 4, 2008

Database Time Storage and Display Problems

After seeing a post about UTC time in a forum, I did some more searching and reading on the web, and found an article, UTC datetime values in SQL Server 2000.

Earlier today I had the same basic idea as the one presented in that blog, though a little more simply:

SELECT UTCDateColumn + Getdate() - Getutcdate()

In the comments on the blog entry is a function someone tried to build that would automatically adjust for daylight savings time as well. This immediately struck me as problematic, and this was borne out by further comments such as that the rules are different for Arizona. And to not be overly self-centric, what about the majority of the people in the world who do not observe DST at all? Follow the link in this paragraph to see a map of the world usage of DST.

There were additional problems. One commenter said, "Say it's currently summer and I'm in daylight savings time, my difference will include DST. Now I try to convert a date from 6 months ago, I'll be an hour out."

So here are my rambling thoughts on the subject for your amusement (at me or with me, your choice).

  1. So you want to know the time. The time where and in what time reference? Is it at the client? Is it at the server? Is it at the place the time event corresponds to like the cash machine where a withdrawal was made?

  2. If you have web clients around the world, using the database to calculate "The Time" is pointless.

  3. Even if your client location(s) only span one time zone (and can you guarantee this forever) are you sure that client will always be in the same time zone as the server?

  4. The various methods to automatically calculate the UTC differential on the server are clever but with DST are exposed for being completely inadequate, because all one is answering is, what is the current UTC differential, including DST, at the server's location? One cannot reasonably expect that to apply to all clients and for all date ranges in the database.
In my mind, the correct answer is to let the user choose his UTC differential in the client, or let the client autodiscover the differential (through a server getutcdate(), or connecting to a time server, or reading the offset out of the registry, for example).

Handling older dates that were stored during a different phase of DST is an interesting problem. Are we answering the question "what was the time then in terms of our time now" or "what was the time then in terms of the time then?" And the time now/then where?

Before you answer that seemingly easy question overquickly, think about the 1 hour period just after DST adjustment makes the time jump back and the same hour is repeated. Imagine you're reviewing security logs for a crime event. Your system reports:

  • 2:05 am - last person to exit the facility was researcher John Smith

  • 2:20 am - critical research samples verified to be secure

  • 2:35 am - later investigation pinpoints this as the time critical research samples were known to be missing
If the system is adjusting times for "what was the time then" then you can't know whether John Smith left the building before the samples were verified secure, or if he left after they were stolen, because there were in fact two segments of time that night covering 2 am - 2:59:59 am, and he could have cleverly left the building during the second hour in order to fool the computer systems.

Anyway, if you set a concern like this aside and still wish to adjust for the time things were when recorded, I can't see how this can be accomplished except on the client because only the client can know the actual location of the client and his desired DST rules. Unless you keep a comprehensive and carefully-updated catalog of the DST rules everywhere your clients can possibly be, this sounds like a challenge.

Maybe the answer is to store both dates in the database: the client time AND UTC time, or UTC time and client offset at that time. This data doesn't even have to be in the individual rows: it could be captured or updated in a UTC-offset history table for each client location every time a client connects. It has its own problems like: does the client know his location, UTC offset, and DST rules & current phase correctly? If so, can the client always be trusted to present this information accurately or could the user purposefully taint the data?

There is even more complication:

What if you are in some place that doesn't observe DST and you want to compare transactions from months ago that were recorded for two different places: your current location and a location that is in a different time zone and does observe DST.

What consistent method of adjustment do you plan to use to display all the times in the same list? Clearly they have to be adjusted the same for any sort of sensical ordering or comparison, and it makes sense to adjust them for the observer so he can view them in his own time. Yet now, adjusting the historical date of the time recorded for the observing-DST location has to be done in terms of the viewing location's DST rules, which can also lead to incorrect assumptions.

So yet again it seems that time data needs to either be stored in UTC and viewed in UTC with an adjustment for the CURRENT offset only and the user knowing this information and making allowances for it, or time data needs to be stored in UTC and ALSO the client's time and the client's location, in some format or other, so that true reconstruction of "the time" can be answered from every perspective needed: any location, observing any DST rules, during any DST phase. At historical review time, there is a huge list of possible dates to display for a single recorded UTC time:

  • The UTC time of the event (the one place where all the madness is overcome!)

  • The time of the event at the event location, using the DST offset of the event location then

  • The time of the event at the event location, using the DST offest of the event location now

  • The time of the event at the user's location, using the DST offset of the user location then

  • The time of the event at the user's location, using the DST offset of the user location now

  • The time of the event at the event location, using the DST offset of the user location then

  • The time of the event at the event location, using the DST offset of the user location now
Yikes!

In closing, the getdate() - getutcdate() idea might be useful in conjunction with the client's stated known offset to determine if the client is observing DST *now*, but by itself can never solve the problem of fully-sensible historical time display in any time zone observing any DST rules (that themselves change over time).

No comments: