All about locale-sensitive time processing in Java
This article was inspired by some issues I recently found in Java code that had previously appeared to be solid and reliable. After talking to some friends I realized that the basics of time processing are widely misunderstood (or at least misinterpreted), in computing in general and not only in Java. I have to admit that it took me some time to build a complete picture of what happens in this code. In this article I will try to briefly summarize my understanding of the problem.
Starting from the basics, there is always a source for the time information. In our case it is the clock of the computer. Here I would like to mention (although it has nothing to do with Java) that there are usually two sources of time. One is the RTC, the real-time clock. RTCs are powered by a battery, so they keep working even when the machine is off, and they are relatively accurate. When you set the RTC date/time you just provide a date and a time; you do not specify the time zone. The other source is the timer implemented by your operating system. Usually it is initialized from the RTC when the OS starts. It is usually not as accurate as the RTC, primarily because of lost interrupts (due to I/O, for example). This is why people frequently use a time synchronization service (NTP) to keep it accurate enough. It is important to mention that the OS has a concept of the time zone. Usually it is handled by the libraries, not by the OS kernel itself (which allows different time zones to be used at the same time if needed). It is important to understand that the OS timer itself does not have a time zone; it just counts the number of seconds since January 1, 1970 UTC. Depending on the time zone, this number of seconds can represent different "local times".
Before we go too far we need to understand what a time zone is. A time zone defines the offset of the local time from UTC (the international acronym for Coordinated Universal Time). The offset is not necessarily a whole number of hours; there are time zones with an offset of 3.5 hours, for example. In some countries the time zone carries additional rules in order to support Daylight Saving Time (DST).
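To make the idea of an offset concrete, here is a minimal sketch that asks java.util.TimeZone for the offset in force at a given instant. The class name ZoneOffsets, the helper offsetMinutesAt() and the sample zones and dates are my own choices for illustration, not something taken from the code that inspired this article.

```java
import java.util.TimeZone;

public class ZoneOffsets {
    /** Offset from UTC, in minutes, that the zone applies at the given epoch second. */
    static int offsetMinutesAt(String zoneId, long epochSeconds) {
        TimeZone zone = TimeZone.getTimeZone(zoneId);
        // getOffset() includes the DST correction in force at that instant.
        return zone.getOffset(epochSeconds * 1000L) / 60_000;
    }

    public static void main(String[] args) {
        long january = 1168862400L; // 2007-01-15 12:00:00 UTC
        long july = 1184500800L;    // 2007-07-15 12:00:00 UTC

        // A zone with DST rules: the offset changes between winter and summer.
        System.out.println(offsetMinutesAt("America/New_York", january)); // -300 (UTC-5)
        System.out.println(offsetMinutesAt("America/New_York", july));    // -240 (UTC-4)

        // A zone whose offset is not a whole number of hours.
        System.out.println(offsetMinutesAt("Asia/Tehran", january));      // 210 (UTC+3:30)
    }
}
```

Note that TimeZone.getTimeZone() silently falls back to GMT when it does not recognize the ID, so a typo in the zone name shows up as an offset of 0 rather than as an exception.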
The software timer in the OS always counts the number of seconds (actually, it has higher precision) since January 1, 1970 UTC. This timer always advances; it never goes back because of time zone changes or DST. To better understand the effect of the time zone, consider the following example for US Eastern time (on a machine set to that time zone you can validate it with a command like perl -e 'use POSIX qw(strftime); print strftime("%a %b %e %H:%M:%S %Y %Z(%z)", localtime($ARGV[0]))."\n"' -- <time>):
- timestamp=1194155999 -> Sun Nov 4 01:59:59 2007 EDT(-0400)
- timestamp=1194156000 -> Sun Nov 4 01:00:00 2007 EST(-0500)
As you can see, the software clock goes forward by one second but the local time goes back by one hour minus one second. And, if you omit the specification of the time zone, these results do not really make sense!
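The same experiment can be repeated in Java. The following is only a sketch: the class name FallBackDemo and the helper localTime() are hypothetical, and I assume that "America/New_York" is the zone ID corresponding to the EST/EDT results above.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

public class FallBackDemo {
    /** Formats an epoch timestamp (in seconds) as local time in the given zone. */
    static String localTime(long epochSeconds, String zoneId) {
        SimpleDateFormat f = new SimpleDateFormat(
                "EEE MMM d HH:mm:ss yyyy z'('Z')'", Locale.US);
        f.setTimeZone(TimeZone.getTimeZone(zoneId));
        return f.format(new Date(epochSeconds * 1000L));
    }

    public static void main(String[] args) {
        // Two consecutive seconds around the end of DST in 2007.
        System.out.println(localTime(1194155999L, "America/New_York"));
        // prints "Sun Nov 4 01:59:59 2007 EDT(-0400)"
        System.out.println(localTime(1194156000L, "America/New_York"));
        // prints "Sun Nov 4 01:00:00 2007 EST(-0500)"
    }
}
```

The timestamp grows by one second, while the formatted local time jumps back to 01:00:00; only the zone abbreviation and the offset tell the two "one o'clock" hours apart.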
Now we move from the operating system up to Java. In fact, it is not that different. First of all, the current time can be retrieved using the currentTimeMillis() method of the java.lang.System class. This method returns the "number of milliseconds between the current time and midnight, January 1, 1970 UTC". This time comes from the software clock of the operating system. It always increases and never goes back (unless the current time gets changed externally).
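As a small illustration, the sketch below reads the clock and converts the value to the whole seconds used by the OS-level examples earlier. The class name EpochClock and the helper toEpochSeconds() are hypothetical names of mine.

```java
public class EpochClock {
    /** Converts epoch milliseconds to whole epoch seconds (rounding down). */
    static long toEpochSeconds(long epochMillis) {
        return Math.floorDiv(epochMillis, 1000L);
    }

    public static void main(String[] args) {
        long nowMillis = System.currentTimeMillis();
        // The value does not depend on the time zone configured for the JVM.
        System.out.println("Milliseconds since the epoch: " + nowMillis);
        System.out.println("Seconds since the epoch:      " + toEpochSeconds(nowMillis));
    }
}
```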
The first important Java class that deals with time is java.util.Date. This class does not deal with any localization or internationalization issues by itself. Its primary purpose is to reflect Coordinated Universal Time (UTC), which for practical purposes is the same as Greenwich Mean Time (GMT): GMT is the "civil" name and UTC is the scientific one. The class documentation also discusses the leap second issue, but that is an issue of UTC itself, so we will not discuss it here (if you want to learn more about the leap second, look at the References section at the end of the article). Do not be confused by methods of the Date class such as toString(), toGMTString() and toLocaleString(). They only format the stored value (toString() and toLocaleString() use the default time zone for that). The Date object itself holds no time zone and is really just a container for the UTC time, nothing more, nothing less.
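The following sketch illustrates that a Date is only a wrapper around the millisecond count: the text produced by toString() follows the default time zone, while getTime() stays the same. The class name DateIsJustMillis and the helper toStringIn() are my own; the helper temporarily changes the JVM-wide default zone, which is acceptable in a demo but not something I would do in a multi-threaded application.

```java
import java.util.Date;
import java.util.TimeZone;

public class DateIsJustMillis {
    /** Calls date.toString() while the given zone is the JVM default, then restores it. */
    static String toStringIn(Date date, String zoneId) {
        TimeZone saved = TimeZone.getDefault();
        try {
            TimeZone.setDefault(TimeZone.getTimeZone(zoneId));
            return date.toString();
        } finally {
            TimeZone.setDefault(saved);
        }
    }

    public static void main(String[] args) {
        Date d = new Date(1193351707000L);

        System.out.println(toStringIn(d, "UTC"));
        // prints "Thu Oct 25 22:35:07 UTC 2007"
        System.out.println(toStringIn(d, "America/New_York"));
        // prints "Thu Oct 25 18:35:07 EDT 2007"

        // The stored value never changed.
        System.out.println(d.getTime()); // prints "1193351707000"
    }
}
```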
Now we move to the locale-specific time processing. First of all, the default time zone is set using the static setDefault() method of the java.util.TimeZone class. The application can call this method itself, or the time zone can be set through the user.timezone property of the JVM.
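As a rough sketch of the effect, the code below changes the JVM default with the static TimeZone.setDefault() method and reads the hour of the same instant through a Calendar that relies on that default. DefaultZoneDemo and defaultZoneHour() are hypothetical names used only for this illustration.

```java
import java.util.Calendar;
import java.util.TimeZone;

public class DefaultZoneDemo {
    /** Hour of day (0-23) of the given instant in the JVM's current default zone. */
    static int defaultZoneHour(long epochMillis) {
        Calendar cal = Calendar.getInstance(); // picks up TimeZone.getDefault()
        cal.setTimeInMillis(epochMillis);
        return cal.get(Calendar.HOUR_OF_DAY);
    }

    public static void main(String[] args) {
        long instant = 1193351707000L; // 2007-10-25 22:35:07 UTC
        TimeZone saved = TimeZone.getDefault();
        try {
            TimeZone.setDefault(TimeZone.getTimeZone("UTC"));
            System.out.println(defaultZoneHour(instant)); // prints "22"

            TimeZone.setDefault(TimeZone.getTimeZone("America/New_York"));
            System.out.println(defaultZoneHour(instant)); // prints "18"
        } finally {
            TimeZone.setDefault(saved); // the default is JVM-wide, so restore it
        }
    }
}
```

The same default can be chosen at startup without touching the code, for example with java -Duser.timezone=UTC DefaultZoneDemo.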
The first class that deals with locale-sensitive date presentation is java.util.Calendar. The time zone can either be set explicitly, or the default one will be used. You can set the current UTC time (in milliseconds) or the local time (each component separately, i.e. year, month, day etc.). Calendar allows you to manipulate dates in a locale-sensitive way; for example, you can add a number of weeks to the current date.
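Here is a minimal sketch of both uses of Calendar: building a timestamp from local date components in an explicit time zone, and adding weeks to it. The class CalendarDemo and its helpers toEpochMillis() and addWeeks() are illustrative names of my own; the dates are chosen so that the two weeks cross the end of DST in 2007.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

public class CalendarDemo {
    /** Epoch milliseconds for a wall-clock time in the given zone (month is 0-based). */
    static long toEpochMillis(String zoneId, int year, int month, int day,
                              int hour, int minute, int second) {
        Calendar cal = new GregorianCalendar(TimeZone.getTimeZone(zoneId));
        cal.clear(); // drop the current time, including the milliseconds
        cal.set(year, month, day, hour, minute, second);
        return cal.getTimeInMillis();
    }

    /** Adds whole weeks in the given zone, keeping the local time of day. */
    static long addWeeks(String zoneId, long epochMillis, int weeks) {
        Calendar cal = new GregorianCalendar(TimeZone.getTimeZone(zoneId));
        cal.setTimeInMillis(epochMillis);
        cal.add(Calendar.WEEK_OF_YEAR, weeks);
        return cal.getTimeInMillis();
    }

    public static void main(String[] args) {
        String zone = "America/New_York";
        long start = toEpochMillis(zone, 2007, Calendar.OCTOBER, 25, 18, 35, 7);
        System.out.println(start); // prints "1193351707000"

        // Two weeks later it is still 18:35:07 local time, but DST has ended,
        // so the elapsed time is 14 days plus one extra hour.
        long later = addWeeks(zone, start, 2);
        System.out.println((later - start) / 3_600_000L); // prints "337" (14 * 24 + 1)
    }
}
```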
In most applications the developer has to parse strings containing a local date and, eventually, convert dates back to local strings. java.text.DateFormat and its subclass SimpleDateFormat are used for converting dates to local strings and for parsing strings containing dates. Just like with the Calendar, you need to either set the target time zone explicitly or use the default one. The following code fragment shows how these classes can be used:
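    import java.text.SimpleDateFormat;
    import java.util.Date;
    import java.util.Locale;
    import java.util.TimeZone;

    // Parse a local date/time string; the "z" field carries the time zone (EDT).
    // Locale.ENGLISH makes the month name "Oct" parse regardless of the default locale.
    // Note that parse() throws the checked java.text.ParseException.
    SimpleDateFormat parser = new SimpleDateFormat(
            "MMM dd, yyyy 'at' HH:mm:ss z", Locale.ENGLISH);
    Date d = parser.parse("Oct 25, 2007 at 18:35:07 EDT");
    System.out.println("Timestamp: " + d.getTime());
    // prints "Timestamp: 1193351707000"

    // Format the same instant for the Montreal time zone.
    SimpleDateFormat formatter = new SimpleDateFormat(
            "yyyy/MM/dd HH:mm:ss z'('Z')'", Locale.ENGLISH);
    TimeZone montreal = TimeZone.getTimeZone("America/Montreal");
    formatter.setTimeZone(montreal);
    System.out.println("Reformatted date: " + formatter.format(d));
    // prints "Reformatted date: 2007/10/25 18:35:07 EDT(-0400)"

    // Format the same instant for UTC.
    formatter.setTimeZone(TimeZone.getTimeZone("UTC"));
    System.out.println("UTC date: " + formatter.format(d));
    // prints "UTC date: 2007/10/25 22:35:07 UTC(+0000)"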
As you can see, this code first parses a string containing a local date/time (expressed in the EDT time zone). Then it prints the number of milliseconds that passed between the epoch (January 1, 1970 00:00:00 UTC) and this moment. Then it reformats this time using the time zone for Montreal (the same EST/EDT time zone) and prints the same date as the original one. Finally, it reformats the same time for the UTC time zone.
It is always important to remember that only UTC time is absolute and should be considered the "normalized" time. Any time expressed as local time must be processed with its time zone offset taken into account, which is what the Calendar class does.
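To see why a local time without a time zone is ambiguous, consider this sketch, which parses one and the same wall-clock string under three different zones. The class name LocalTimeAmbiguity and the helper parseLocal() are hypothetical, and the zones were picked only as examples.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;
import java.util.TimeZone;

public class LocalTimeAmbiguity {
    /** Interprets a zone-less "yyyy-MM-dd HH:mm:ss" string as local time in the given zone. */
    static long parseLocal(String text, String zoneId) throws ParseException {
        SimpleDateFormat f = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.US);
        f.setTimeZone(TimeZone.getTimeZone(zoneId));
        return f.parse(text).getTime();
    }

    public static void main(String[] args) throws ParseException {
        String text = "2007-10-25 18:35:07";
        System.out.println(parseLocal(text, "Asia/Kolkata"));     // prints "1193317507000"
        System.out.println(parseLocal(text, "UTC"));              // prints "1193337307000"
        System.out.println(parseLocal(text, "America/New_York")); // prints "1193351707000"
    }
}
```

The three results are hours apart although the input text is identical, which is why I prefer to store and exchange the UTC timestamp and apply the time zone only when parsing input or formatting output.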