TIP #173: Internationalisation and Refactoring of the 'clock' Command
From: Kevin Kenny (kennykb_at_acm.org)
Date: 03/15/04
Date: Mon, 15 Mar 2004 17:44:02 +0000 (UTC)
TIP #173: INTERNATIONALISATION AND REFACTORING OF THE 'CLOCK' COMMAND
=======================================================================
Version: $Revision: 1.4 $
Author: Kevin Kenny <kennykb_at_acm.org>
State: Draft
Type: Project
Tcl-Version: 8.5
Vote: Pending
Created: Thursday, 11 March 2004
URL: http://purl.org/tcl/tip/173.html
WebEdit: http://purl.org/tcl/tip/edit/173
Discussions-To: news:comp.lang.tcl
Post-History:
-------------------------------------------------------------------------
ABSTRACT
==========
The [clock] command provides Tcl's fundamental facilities for computing
with dates and times. It has served Tcl faithfully since 7.6, but the
computing world has advanced significantly in the decade that it has
been in service. This TIP proposes a (nearly entirely compatible)
reimplementation of [clock] that will allow for fewer ambiguities on
input, improved localisation, more portability, and less exposure of
platform-dependent bugs. A significantly greater fraction of [clock]
shall be implemented in Tcl than it is today, and the code shall be
refactored to use the ensemble mechanism introduced for Tcl 8.5 (see
[TIP #112]).
RATIONALE
===========
There is an embarrassing number of open bugs and feature requests
against the [clock] command. As the maintainer of [clock], the author
of this TIP has also received a number of informal feature requests
that are not logged at SourceForge. Unfortunately, many of the
requested fixes and enhancements cannot be effectively addressed with
the current architecture of [clock].
1. Several users have requested additional input formats to [clock
scan], notably the full range of ISO8601 time formats (including
formats based on week number and day-of-week); year and
day-of-year; Apache "web log" dates and times; numeric dates
placing the month before the day; and localised names of months
and days of the week. Unfortunately, these formats simply cannot
be added in the current architecture of [clock scan]; in fact,
there are several outstanding bugs in [clock scan] (for example,
the parsing of numeric time zones east of Greenwich) that cannot
be fixed without breaking something else.
The fundamental issue is that [clock scan] is asked to process
input with too many ambiguities. An input token such as *2000*,
for example, may be interpreted as a year, a time of day, or a
number ("now + 2000 seconds"). *1000* may (perhaps) not be a
year, but could be a time of day, a number, or a time zone.
Localisation would only make this problem worse. Without
additional guidance, there is, even in theory, no way to
determine whether *03-11-2004* represents the third of November
or the eleventh of March.
To solve this problem, a radical redesign of [clock scan] is
required; the programmer *must* be allowed to specify an expected
input format (or set of expected formats).
A side effect of such a redesign would be improved ease of
maintenance. The current [clock scan] is a YACC-derived parser;
the build process, however, runs a script on the output of YACC
to modify its memory management and alter its external symbol
names to make it compatible with Tcl's conventions. This script
is fragile; at present, it is known to work only with the version
of YACC distributed with Solaris.
There are a number of other issues with [clock scan] that could
be addressed at the same time with such a redesign. For instance,
there is a known problem at present that an input string that
specifies time and time zone but not date can return a time that
is one day too early or late; this problem arises because the
existing parser presumes the current *local* date when parsing
such a string, rather than the current date in the given time
zone. The problem is difficult to address because of the
left-to-right nature of the LALR(1) parser.
2. A few enhancements have been requested to [clock format]; most
notably, proper localization on all platforms. In addition, the
documentation of [clock format] is at best approximate, because
it depends on the *strftime* function in the Standard C Library.
This function differs among platforms, because the C standard,
the Posix standard, and the Single Unix Specification have gone
through evolution over time, and few platforms support all the
features of the current generation of any of them.
In addition, the Year 2038 bug looms large on the horizon. On
most 32-bit platforms, *time_t* (used in the C library functions)
is a 32-bit count of seconds from 1 January 1970; dates beyond
2038 cannot be represented in this format.
The dependence on a complex library function such as *strftime*
introduces obscure platform-dependent bugs. Several open bugs in
[clock format], for instance, fail only on HP-UX, or only on
Windows.
Date formats have been requested (specifically, the Japanese
civil calendar) that are beyond the capabilities of the Standard
C Library functions.
[clock format] does not honor user preferences for date/time
format on Windows.
All of these concerns seem to indicate that our current
dependency upon vendor-supplied date and time manipulation
routines is ill advised. A single implementation that we control
will make the behavior consistent among platforms, allow the
localisation to follow Tcl's conventions, and let us lead rather
than follow the vendor in fixing bugs.
3. Server applications frequently require support of multiple
locales and multiple time zones within a single process, because
they need to parse input and format output according to the
client's environment. The current [clock] facilities either do
not support localization at all, or else support a change to
locale only by changing environment variables. This technique,
once again, exposes bugs in the vendor libraries. It also
introduces difficulties with thread safety; Tcl does not have a
single mechanism whereby the *TZ* and *LC_TIME* environment
variables are protected.
4. The only mechanism for performing calculations like "one month
after the current date" is [clock scan]. While this works well in
practice, using a parser to perform arithmetic seems somewhat
perverse.
SPECIFICATION
===============
The [clock] command shall be reimplemented as an ensemble [TIP #112],
with most of the subcommands implemented in Tcl. A minimal set of the
existing C code shall be refactored and placed inside a *::tcl::clock*
namespace. The existing subcommands *seconds* and *clicks* shall be
exposed. The existing *scan* shall be hidden inside the namespace.
[clock scan] and [clock format] shall be reimplemented in Tcl. In
addition, new [clock add] and [clock subtract] commands shall be added.
The syntax and semantics of the [clock clicks] and [clock seconds]
commands will remain unchanged.
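As an illustration of the ensemble mechanism referred to above, the following minimal sketch builds a small ensemble from procedures in a namespace. The namespace *::demo::stamp* and its subcommand names are hypothetical and are not part of this proposal; the sketch assumes a Tcl 8.5 or later interpreter.

```tcl
# Hypothetical namespace and subcommand names, for illustration only.
namespace eval ::demo::stamp {
    # Each exported procedure becomes a subcommand of the ensemble.
    proc seconds {} {
        return [clock seconds]
    }
    proc iso {time} {
        return [clock format $time -format "%Y-%m-%dT%H:%M:%SZ" -gmt 1]
    }
    # Not exported, so it stays hidden inside the namespace.
    proc helper {} {
        return "internal"
    }
    namespace export seconds iso
    namespace ensemble create
}

puts [::demo::stamp iso 0]   ;# 1970-01-01T00:00:00Z
```

Only the exported procedures are reachable as subcommands; a call such as *::demo::stamp helper* raises an error, which is the sense in which a subcommand can be "hidden inside the namespace".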
SPECIFICATION: CLOCK SCAN
===========================
The [clock scan] command shall have the syntax:
clock scan $string ?-base baseTime? \
?-format format? \
?-gmt boolean? \
?-locale name? \
?-timezone timeZone?
It accepts a character string representing a date and time and returns
the time that the string represents, expressed as a count of seconds
from the Posix epoch (1 January 1970, 0000 UTC).
If a *-format* option is not supplied, the scan is a *free format*
scan. The existing YACC parser for *clock scan* will be used to
interpret the input string. *This form of the command is explicitly
deprecated* because of the inherent ambiguities in interpreting the
input string. If the *-format* option is not supplied, the *-locale* and
*-timezone* options will be forbidden, since the legacy code does not
support multiple locales or time zones.
If the *-format* option is supplied, it is interpreted as a
specification for the expected input form. If the given string matches
the input form, it is converted to a count of seconds and returned;
otherwise, an error is thrown. See *FORMATS* below for a discussion of
the available format groups and their interpretation.
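The effect of an explicit format can be sketched as follows. The example assumes a Tcl 8.5 or later interpreter in which [clock scan] accepts the *-format* and *-gmt* options described here; the input string is the ambiguous date used in the rationale.

```tcl
set input "03-11-2004"

# Month first: the string is read as 11 March 2004.
set monthFirst [clock scan $input -format "%m-%d-%Y" -gmt 1]
# Day first: the same string is read as 3 November 2004.
set dayFirst [clock scan $input -format "%d-%m-%Y" -gmt 1]

puts [clock format $monthFirst -format "%Y-%m-%d" -gmt 1]   ;# 2004-03-11
puts [clock format $dayFirst -format "%Y-%m-%d" -gmt 1]     ;# 2004-11-03

# A string that does not match the expected form raises an error.
if {[catch {clock scan "11 March 2004" -format "%m-%d-%Y" -gmt 1} msg]} {
    puts "rejected: $msg"
}
```

The caller, not the parser, decides which reading is intended, and input in any other form is rejected rather than guessed at.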
Extraction of the date from the input string is guided by what fields
are present in the format. The order of preference, from highest to
lowest, is:
{seconds from epoch}, {StarDate}:
Date fields that specify both date and time take highest
precedence. If format groups for these fields appear multiple
times, the rightmost takes precedence.
{Julian Day Number}:
The Julian Day Number uniquely specifies a calendar date.
{century, year, month, day of month},
{century, year, day of year},
{century, year, week of year, day of week},
{locale era, locale year, month, day of month}:
Formats with complete year are preferred to formats with a
two-digit year. For a two digit year, the date range is
constrained to lie between 1938 and 2037.
{year, month, day of month},
{year, day of year},
{year, week of year, day of week},
{year of locale era, month, day of month}:
Formats that specify the year are preferred to those that do not.
{month, day of month}, {day of year}, {week of year, day of week}:
Formats that specify a day within the year are preferred to those
that specify merely the day of week or day of month. Formats that
do not specify the year are presumed to designate the base year.
{day of month}, {day of week}:
If none of the above rules apply, a day of the month or day of
the week standing alone is interpreted as belonging to the base
month or week.
None of the above:
If no combination of fields that specifies a date is found, the
base date is used.
In all of the foregoing discussion, the 'base date', 'base month',
'base week', and 'base year' refer to the day, month, week or year
designated by the *-base* parameter, which is a count of seconds from
the Posix epoch. If no *-base* parameter is supplied, the current date
is used as the base date. The year, month, week and day are obtained by
interpreting the base date in the time zone specified by the date/time
string. If the given format does not include a time zone, then the base
time is interpreted in the default time zone; see *TIME ZONES* below
for the way that the default time zone is determined.
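The two-digit year rule above can be sketched in plain Tcl. The procedure name is hypothetical, and placing the pivot between 37 and 38 is one reading of "constrained to lie between 1938 and 2037", not a statement of how [clock scan] must be implemented.

```tcl
# Hypothetical helper: expand a two-digit year into the 1938-2037 window.
proc expandTwoDigitYear {yy} {
    # Read the value as decimal so that "08" and "09" are not taken as octal.
    scan $yy %d yy
    if {$yy < 0 || $yy > 99} {
        error "expected a two-digit year, got \"$yy\""
    }
    # 38..99 map to 1938..1999; 00..37 map to 2000..2037.
    return [expr {$yy >= 38 ? 1900 + $yy : 2000 + $yy}]
}

puts [expandTwoDigitYear 38]   ;# 1938
puts [expandTwoDigitYear 04]   ;# 2004
puts [expandTwoDigitYear 37]   ;# 2037
```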
The time of day returned by [clock scan] is determined by the presence
of fields in the format string, in the following order of preference.
{seconds from epoch, StarDate}:
If either of these fields is present, it uniquely determines date
and time.
{am/pm indicator, hour am/pm, minute, second}, {hour, minute, second}:
Time with seconds is preferred to time without seconds.
{am/pm indicator, hour am/pm, minute}, {hour, minute}:
Time can be interpreted without the seconds.
{am/pm indicator, hour am/pm}, {hour}:
Time can be expressed as an hour alone, *e.g.*,
clock scan "6 pm" -format "%I %p"
None of the above:
If none of the above indicators is present, midnight in the given
time zone is used.
The time zone in which the time of day is interpreted is chosen in the
same way as for the date: the zone given in the date/time string is
used if the format includes one, and the default time zone otherwise.
See *TIME ZONES* below for the way that the default time zone is
determined, and the interpretation of the *-timezone* and *-gmt*
options.
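The interaction between the base time and the fields present in the format can be sketched as follows, assuming a Tcl 8.5 or later interpreter whose [clock scan] accepts the *-base*, *-format* and *-gmt* options described above. The base value is illustrative.

```tcl
# Base time: 15 March 2004, 17:44:02 UTC.
set base [clock scan "2004-03-15 17:44:02" -format "%Y-%m-%d %H:%M:%S" -gmt 1]

# Only the hour and am/pm indicator are given: the date comes from the
# base, and the minute and second default to zero.
set t1 [clock scan "6 pm" -format "%I %p" -base $base -gmt 1]
puts [clock format $t1 -format "%Y-%m-%d %H:%M:%S" -gmt 1]   ;# 2004-03-15 18:00:00

# Only date fields are given: the time of day is midnight.
set t2 [clock scan "2004-03-16" -format "%Y-%m-%d" -base $base -gmt 1]
puts [clock format $t2 -format "%Y-%m-%d %H:%M:%S" -gmt 1]   ;# 2004-03-16 00:00:00
```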
The locale is used to determine the spelling of native language words
such as the names of months, names of weekdays, am/pm indicators, and
locale eras. It is also used in the interpretation of the format
groups '%X', '%x', and '%c'. In addition, the locale determines the
date at which the calendar in use changes from the Julian calendar to
the Gregorian. If no *-locale* parameter is supplied, the default is to
use the root locale. See *LOCALISATION* below for more information.
SPECIFICATION: CLOCK FORMAT
=============================
The [clock format] command shall have the syntax:
clock format $time ?-format format? \
?-gmt boolean? \
?-locale name? \
?-timezone timeZone?
It accepts a time, expressed in seconds from the Posix epoch of 1
January 1970, 00:00 UTC, and formats it according to the given format
string. See *FORMATS* below for a discussion of the available format
codes. If no format string is supplied, a default format, {%a %b %d
%H:%M:%S %Z %Y} is used.
The *-timezone*, *-gmt*, and *-locale* options are interpreted as for
[clock scan]. See *TIME ZONES* and *LOCALISATION* below for how these
options work.
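A short sketch of [clock format] with an explicit format string follows. It assumes a Tcl 8.5 or later interpreter and uses *-gmt* so that the result does not depend on the local time zone; the time value is illustrative.

```tcl
set t 1079372642   ;# 15 March 2004, 17:44:02 UTC

puts [clock format $t -format "%Y-%m-%d %H:%M:%S" -gmt 1]   ;# 2004-03-15 17:44:02
puts [clock format $t -format "%A, %d %B %Y" -gmt 1]        ;# Monday, 15 March 2004

# With no -format option, the default format is used.
puts [clock format $t -gmt 1]
```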
SPECIFICATION: CLOCK ADD, CLOCK SUBTRACT
==========================================
These two commands perform arithmetic on dates and times. The syntax
is:
clock (add|subtract) time ?count unit?... \
?-gmt boolean? ?-timezone timeZone? ?-locale name? ?-
It accepts a time, expressed in seconds from the Posix epoch of 1
January 1970, 00:00 UTC, and adds or subtracts units of time from it
according to the alternating *count* and *unit* parameters. Each
*count* must be a wide integer; each *unit* is one of the following:
years year months month
weeks week days day
hours hour minutes minute seconds second
The command works by converting the given time to a calendar day and
time of day in the given locale and time zone. To that day and time of
day, it adds or subtracts the given offsets *in sequence*. It
reconverts the resulting time to a count of seconds, again using the
given locale and time zone, and returns that count of seconds.
There are subtle differences in many cases between adding seemingly
similar offsets. For instance, on the day before Daylight Saving Time
goes into effect, adding 24 hours will give "the time 24 hours from the
base time, irrespective of any clock change", while adding 1 day will
give "the time it will be at the same time of day on the following
day." Similarly, adding 1 month on 30 January will give either 28 or 29
February, depending on whether the year is a leap year. There are
equally strange effects when performing date/time arithmetic across the
change between the Julian and Gregorian calendars.
The *-timezone*, *-gmt*, and *-locale* options are used to control the
interpretation of the count of seconds as a calendar day and time.
Refer to *TIME ZONES* and *LOCALISATION* below for a fuller discussion.
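The month-end behaviour and the "in sequence" rule can be sketched as follows, assuming a Tcl 8.5 or later interpreter that provides [clock add] with the syntax given above. The helper procedure *ymd* is hypothetical.

```tcl
# Hypothetical helper: format a time as a calendar date in UTC.
proc ymd {t} {
    return [clock format $t -format "%Y-%m-%d" -gmt 1]
}

set jan30 [clock scan "2004-01-30" -format "%Y-%m-%d" -gmt 1]

# 2004 is a leap year, so one month after 30 January is 29 February.
puts [ymd [clock add $jan30 1 month -gmt 1]]         ;# 2004-02-29

# Offsets are applied in sequence, so their order matters.
puts [ymd [clock add $jan30 1 month 1 day -gmt 1]]   ;# 2004-03-01
puts [ymd [clock add $jan30 1 day 1 month -gmt 1]]   ;# 2004-02-29
```

In the second call the month is added first (giving 29 February) and then the day; in the third the day is added first (giving 31 January) and the month addition then falls back to the last day of February.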
SPECIFICATION: FORMATS
========================
The [clock scan] and [clock format] commands will be implemented in
Tcl, without depending on the local *strftime* and *strptime*
functions. For this reason, format groups will function identically on
all platforms. The format groups will be interpreted as follows.
%a: On output, receives the abbreviation for the day of the week in
the given locale. On input, matches the name of the day of the
week (in the given locale) in either abbreviated or full form,
and may be used to determine the calendar date.
%A: On output, receives the full name of the day of the week in the
given locale. On input, treated identically with %a.
%b: On output, receives the abbreviation for the name of the month in
the given locale. On input, matches the name of the month (in the
given locale) in either abbreviated or full form, and may be used
to determine the calendar date.
%B: On output, receives the full name of the month in the given
locale. On input, treated identically with %b.
%C: On output, receives the number of the century, in Indo-Arabic
numerals. On input, matches one or two digits, and accepts the
number of the century in Indo-Arabic numerals. May be used to
determine the calendar date.
%c: On output, produces a correct locale-dependent representation of
date and time of day. On input, matches whatever format *%c*
produces in the given locale, and may be used to determine
calendar date and time.
%d: On output, produces the number of the day of the month, in
Indo-Arabic numerals, with a leading zero. On input, matches one
or two digits, accepts the day of the month, and may be used to
determine calendar date.
%D: Synonymous with %m/%d/%Y. Should be used only in US locales.
%e: On output, produces the number of the day of the month, in
Indo-Arabic numerals, with no leading zero. On input, treated
identically with %d.
%Ec: On output, produces a locale-dependent representation of date and
time of day in the locale's alternative calendar. On input,
matches whatever %Ec produces, and may be used to determine
calendar date and time.
%EC: On output, produces the name of the current era in the locale's
alternative calendar. On input, accepts the name of the era in
the locale's alternative calendar, and may be used to determine
calendar date.
%Ex: On output, produces the calendar date in a locale-dependent
representation using the locale's alternative calendar and
alternative numerals. On input, accepts whatever %Ex produces and
may be used to determine calendar date.
%EX: On output, produces the time of day in the locale's alternative
representation. On input, accepts whatever %EX produces and may
be used to determine time of day.
%Ey: On output, produces the number of the current year relative to
the locale's current era *%EC*, expressed in the locale's
alternative numerals. On input, accepts the number of the year
relative to the current era in the locale's alternative numerics,
and may be used to determine calendar date.
%EY: On output, produces an unambiguous representation of the current
year in the locale's alternative calendar and alternative
numerals. This group is often synonymous with %EC%Ey. On input,
accepts whatever %EY produces and may be used to determine
calendar date.
%g: On output, produces the two-digit year number suitable for use
with the ISO8601 week number. On input, accepts a two-digit year
number, and may be used to determine calendar date if the %V
format group is also present.
%G: On output, produces the four-digit year number suitable for use
with the ISO8601 week number. On input, accepts a four-digit year
number, and may be used to determine calendar date if the %V
format group is also present.
%h: Synonymous with %b.
%H: On output, produces the two-digit hour of the day on a 24-hour
clock (00-23). On input, matches two digits, and may be used to
determine time of day.
%I: On output, produces the two-digit hour of the day on a 12-hour
clock (12-11). On input, matches two digits, and may be used to
determine time of day.
%j: On output, produces the three-digit number of the day of the
year. On input, matches three digits, and may be used to
determine the day of the year.
%J: On output, produces the Julian Day Number beginning at noon of
the given date. The Julian Day Number is a
representation popular with astronomers; it is a count of days in
which Day 0 is 1 January, 4713 B.C.E., on the proleptic Julian
calendar; in this system, 1 January 2000 is Julian Day 2451545.
On input, matches any string of digits and interprets it as a
Julian Day; may be used to determine calendar date.
%k: On output, produces the number of the hour on a 24-hour clock
(0-23) without a leading zero. On input, matches one or two
digits and may be used to determine time of day.
%l: On output, produces the number of the hour on a 12-hour clock
(12-11) without a leading zero. On input, matches one or two
digits and may be used to determine time of day.
%m: On output, produces the number of the month (01-12), with exactly
two digits (using a leading zero if necessary). On input, matches
exactly two digits and may be used to determine calendar date.
%M: On output, produces the number of the minute of the hour (00-59)
with exactly two digits (using a leading zero if necessary). On
input, matches exactly two digits and may be used to determine
time of day.
%N: On output, produces the number of the month, with no leading
zero. On input, matches one or two digits, and may be used to
determine calendar date.
%Od, %Oe, %OH, %OI, %Ok, %Ol, %Om, %OM, %OS, %Ou, %Ow, %Oy:
All of these format groups are synonymous with their counterparts
without the 'O', except that the string is produced and parsed in
the locale-dependent alternative numerals.
%p: On output, produces the indicator for 'a.m.', or 'p.m.'
appropriate for the given locale, converted to upper case. On
input, accepts whatever %p produces (in upper or lower case) and
may be used to determine time of day.
%P: On output, produces the indicator for 'a.m.', or 'p.m.'
appropriate for the given locale. On input, accepts whatever %p
produces (in upper or lower case) and may be used to determine
time of day.
%Q: On output, produces a StarDate. On input, accepts a StarDate and
may be used to determine calendar date and time of day.
%r: On output, produces a locale-dependent time of day representation
on a 12-hour clock. On input, accepts whatever %r produces and
may be used to determine time of day.
%R: On output, produces a locale-dependent time of day representation
on a 24-hour clock. On input, accepts whatever %R produces and
may be used to determine time of day.
%s: On output, produces a string of digits representing the count of
seconds since 1 January 1970, 00:00 UTC. On input, accepts a
string of digits and interprets it as such a count; may be used to
determine date and time of day.
%S: On output, produces a two-digit number of the second of the
minute (00-59). On input, accepts two digits. May be used to
determine time of day.
%t: On output, produces a TAB character. On input, matches a TAB
character.
%T: Synonymous with %H:%M:%S.
%u: On output, produces the number of the day of the week
(1-Monday,7-Sunday). On input, accepts a single digit. May be
used to determine calendar day.
%U: On output, produces the ordinal number of the week of the year
(00-53). The first Sunday of the year is the first day of week
01. On input accepts two digits *which are otherwise ignored.*
This format group is never used in determining an input date.
%V: On output, produces the number of the ISO8601 week as a two digit
number (01-53). Week 01 is the week containing January 4; or the
first week of the year containing at least 4 days; or the week
containing the first Thursday of the year (the three statements
are equivalent). Each week begins on a Monday. On input, accepts
the ISO8601 week number, and may be used to determine the
calendar day.
%W: On output, produces a week number (00-53) within the year; week
01 begins on the first Monday of the year. On input, accepts two
digits, *which are otherwise ignored.* This format group is never
used in determining an input date.
%x: On output, produces the date in a locale-dependent
representation. On input, accepts whatever %x produces and may be
used to determine calendar date.
%X: On output, produces the time of day in a locale-dependent
representation. On input, accepts whatever %X produces and may be
used to determine time of day.
%y: On output, produces the two-digit year of the century. On input,
accepts two digits, and may be used to determine calendar date.
Note that %y does not yield a year appropriate for use with the
ISO8601 week number %V; programs should use %g for that purpose.
%Y: On output, produces the four-digit calendar year. On input,
accepts four digits and may be used to determine calendar date.
Note that %Y does not yield a year appropriate for use with the
ISO8601 week number %V; programs should use %G for that purpose.
%z: On output, produces the current time zone, expressed in hours and
minutes east (+hhmm) or west (-hhmm) of Greenwich. On input,
accepts a time zone specifier (see *TIME ZONES* below) that will
be used to determine the time zone.
%Z: On output, produces the current time zone's name, possibly
translated to the given locale. On input, accepts a time zone
specifier (see *TIME ZONES* below) that will be used to determine
the time zone. *This option should, in general, be used on input
only when parsing RFC822 dates.* Other uses are fraught with
ambiguity; for instance, the string *BST* may represent *British
Summer Time* or *Brazilian Standard Time*. It is recommended that
date/time strings for use by computers use numeric time zones
instead.
%%: On output, produces a literal '%' character. On input, matches a
literal '%' character.
%+: Synonymous with "%a %b %e %H:%M:%S %Z %Y".
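A few of the format groups above can be exercised as follows. This is a sketch assuming a Tcl 8.5 or later interpreter whose [clock format] implements the %G, %V, %u, %j and %J groups as described; the helper procedure *fmt* is hypothetical.

```tcl
# Hypothetical helper: format a YYYY-MM-DD date (taken as UTC) with a format.
proc fmt {date format} {
    set t [clock scan $date -format "%Y-%m-%d" -gmt 1]
    return [clock format $t -format $format -gmt 1]
}

puts [fmt 2004-03-15 "%G-W%V-%u"]   ;# 2004-W12-1
puts [fmt 2004-03-15 "%j"]          ;# 075
puts [fmt 2000-01-01 "%J"]          ;# 2451545

# 29 December 2003 belongs to ISO8601 week 01 of 2004, so %Y and %G differ.
puts [fmt 2003-12-29 "%Y %G-W%V"]   ;# 2003 2004-W01
```

The last line shows why %G rather than %Y should accompany %V: near the turn of the year the ISO8601 week-based year and the calendar year can disagree.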
SPECIFICATION: TIME ZONES
===========================
There are several ways that a time zone may be specified for use with
[clock scan], [clock format], [clock add] and [clock subtract]. In
order of preference:
* The time zone may appear in the input string matched by a %z or
%Z format group in [clock scan]. These format groups match time
zones in the forms +hhmm, +hhmmss, -hhmm, -hhmmss, and
alphanumeric strings. The numeric representations are self
explanatory; an alphanumeric string must be one of:
gmt ut utc bst wet wat at
nft nst ndt ast adt est edt
cst cdt mst mdt pst pdt yst
ydt hst hdt cat ahst nt idlw
cet cest met mewt mest swt sst
eet eest bt it zp4 zp5 ist
zp6 wast wadt jt cct jst cast
cadt east eadt gst nzt nzst nzdt
idle
or a single letter other than J. Generally speaking, numeric time
zones should be preferred for communication among computers; the
alphanumeric time zones are provided primarily for the parsing of
legacy RFC822 time stamps.
* The time zone may appear in the *-timezone* argument to the
[clock] command.
* The time zone may appear in the environment variable, *TCL_TZ*.
* The time zone may appear in the environment variable, *TZ*.
* Failing all of these, on Windows systems, the time zone will be
obtained from the Registry.
* As a last resort, the time zone is set to ':localtime'.
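The first two entries in the order of preference can be sketched as follows, assuming a Tcl 8.5 or later interpreter whose [clock] command accepts numeric time zones both in the input string (via %z) and in the *-timezone* option. The time values are illustrative.

```tcl
set fmt "%Y-%m-%d %H:%M:%S %z"
set input "2004-03-15 17:44:02 +0100"

# The zone matched by %z takes precedence over the -timezone argument,
# so both scans yield the same count of seconds.
set a [clock scan $input -format $fmt -timezone -0500]
set b [clock scan $input -format $fmt -timezone +0900]
puts [expr {$a == $b}]   ;# 1

# On output there is no input string, so -timezone selects the zone.
puts [clock format $a -format $fmt -timezone +0100]   ;# 2004-03-15 17:44:02 +0100
puts [clock format $a -format $fmt -timezone +0000]   ;# 2004-03-15 16:44:02 +0000
```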
Once the time zone is obtained by one of these means, it is interpreted
as follows:
":localtime":
This specifier requests that the C library functions
*localtime()* and *mktime()* be used whenever converting times
between local and Greenwich. It is generally used as a last
resort if the time zone can be determined in no other way.
"+hhmm", "+hhmmss", "-hhmm", "-hhmmss":
These specifiers give the time zone explicitly in terms of hours,
minutes and seconds east (+) or west (-) of Greenwich.
":filename":
The given file name is interpreted as a path name relative to
[info library]/tzdata, and the specified file is loaded as a Tcl
script. The script is expected to set the *:filename* element in
the *tzdata* array to a list of transitions. Each transition is a
four-element list comprising:
* the time at which the transition takes place, expressed in
seconds from the Posix Epoch (1 January 1970, 00:00 UTC)
* the offset (in seconds east of Greenwich) to apply.
* an indicator (0=Standard Time, 1=Daylight Saving Time)
* the name to use when displaying the given time zone in the
root locale.
The first transition is expected to take place at time
-9223372036854775808, the smallest value of a wide integer.
Any string recognizable as a Posix time zone specifier:
A time zone may be specified in Posix syntax (see
[<URL:http://www.opengroup.org/onlinepubs/007904975/basedefs/xbd_chap08.html>]),
for example *EST5EDT* or
*EST+05:00EDT+04:00,M4.1.0/01:00,M10.5.0/02:00*.
Any other string is processed by prefixing a colon and attempting to
load the given file, as shown above.
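The transition list described for the ":filename" form lends itself to a simple lookup. The following is a sketch only: the zone name *:Demo/Eastern*, the procedure *transitionAt*, and the choice of three entries (modelled on United States Eastern time in 2004) are assumptions made for illustration and are not taken from any shipped data file.

```tcl
# Illustrative transition list: {time offset isDST name} per entry.
set tzdata(:Demo/Eastern) {
    {-9223372036854775808 -18000 0 EST}
    {1081062000 -14400 1 EDT}
    {1099202400 -18000 0 EST}
}

# Hypothetical helper: return the transition in effect at time t.
proc transitionAt {transitions t} {
    # Binary search for the last transition whose start time is <= t.
    # The first entry starts at the smallest wide integer, so one always exists.
    set lo 0
    set hi [expr {[llength $transitions] - 1}]
    while {$lo < $hi} {
        set mid [expr {($lo + $hi + 1) / 2}]
        if {[lindex $transitions $mid 0] <= $t} {
            set lo $mid
        } else {
            set hi [expr {$mid - 1}]
        }
    }
    return [lindex $transitions $lo]
}

# 15 March 2004, 17:44:02 UTC falls before the first listed change.
lassign [transitionAt $tzdata(:Demo/Eastern) 1079372642] start offset isDST name
puts "$name, offset $offset s, DST=$isDST"   ;# EST, offset -18000 s, DST=0
```

Because the first transition sits at the smallest wide integer, the search never needs a special case for times that precede every real transition.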
SPECIFICATION: LOCALISATION
=============================
The [clock] command is localised by a set of message catalogs located
in [file join [info library] clock msgs] and loaded into the namespace,
::tcl::clock. The possible strings to be translated include:
AM: The string that identifies *ante meridiem* times when expressing
a time of day in the given locale. This string has the value,
{am} in the root locale.
BCE: The string that identifies dates before the Common Era in the
given locale. This string has the value, {B.C.E.} in the root
locale. Those localising this string should be aware that,
depending on local culture, a name such as "B.C." (before Christ)
may be offensive.
CE: The string that identifies dates of the Common Era in the given
locale. This string has the value, {C.E.} in the root locale.
Those localising this string should be aware that, depending on
local culture, a name such as "A.D." (Latin, *anno Domini*, "in
the year of Our Lord") may be offensive.
DATE_FORMAT:
The format specifier for calendar dates in the given locale. In
the root locale, %m/%d/%Y is used for compatibility with earlier
versions of the [clock] command, even though %Y-%m-%d would
probably be preferable.
DATE_TIME_FORMAT:
The format specifier for combined date and time in the given
locale. In the root locale, {%a %b %e %H:%M:%S %Y} is used for
compatibility with earlier versions of the [clock] command, even
though %Y-%m-%dT%H:%M:%S would be preferable.
DAYS_OF_WEEK_ABBREV:
Abbreviations of the days of the week in the given locale. In the
root locale, this string has the value, {Sun Mon Tue Wed Thu Fri
Sat}
DAYS_OF_WEEK_FULL:
Full names of the days of the week in the given locale. In the
root locale, this string has the value, {Sunday Monday Tuesday
Wednesday Thursday Friday Saturday}
GREGORIAN_CHANGE_DATE:
The date on which the change from the Julian to the Gregorian
calendar takes place, expressed as a Julian Day Number. In the
root locale, this string has the value, {2299161}, corresponding
to 15 October 1582 New Style. In the 'en' locale, this value is
{2361222}, 14 September 1752 New Style.
LOCALE_DATE_FORMAT:
The format to use when formatting dates in the locale's
alternative calendar. In the root locale, LOCALE_DATE_FORMAT is
the same as DATE_FORMAT.
LOCALE_ERAS:
In a locale where a calendar with multiple eras is in use, gives
a list of triples. The first element of each triple is the time
(in seconds from the Posix epoch of 1 January 1970, 00:00 UTC) at
which the era begins; the second is the name of the era, and the
third is a constant offset to be subtracted from the Gregorian
year to give the year of the era.
LOCALE_NUMERALS:
In a locale where alternative numerals may be used, gives a list
containing the numerals that represent the numbers from zero to
ninety-nine. Note that these numerals are the ones typically used
on calendars, not the ones that represent currencies or
quantities. For instance, in a Han locale, the number twenty-one
is represented by \u5eff\u4e00, not by \u4e8c\u5341\u4e00.
LOCALE_TIME_FORMAT:
The time format to use when formatting a time of day using a
locale's alternative numerals. In the root locale, this string is
the same as TIME_FORMAT.
LOCALE_YEAR_FORMAT:
The time format to use when formatting a year in the locale's
alternative calendar. In the root locale, this string is %Y.
MONTHS_ABBREV:
Abbreviated names of the months in the given locale. In the root
locale, consists of three-letter abbreviations for the English
months: Jan-Dec.
MONTHS_FULL:
Full names of the months in the given locale. In the root locale,
consists of the names of the English months in order from
'January' to 'December'.
PM: The string that identifies *post meridiem* times when expressing
a time of day in the given locale. This string has the value,
{pm} in the root locale.
TIME_FORMAT:
String that specifies the default time format in the given
locale. In the root locale, this string is {%H:%M:%S}
TIME_FORMAT_12:
String that formats time on a 12-hour clock in the given locale.
In the root locale, this string is {%I:%M:%S %p}.
TIME_FORMAT_24:
String that formats time on a 24-hour clock in the given locale.
In the root locale, this string is {%H:%M}.
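The Julian Day Numbers quoted for GREGORIAN_CHANGE_DATE can be checked with a short calculation. The procedure below is a hypothetical helper using a standard integer formula for Gregorian dates in the Common Era; it is a sketch for verification and is not proposed as part of the [clock] implementation.

```tcl
# Hypothetical helper: Julian Day Number of a Gregorian calendar date.
proc gregorianToJDN {year month day} {
    # a is 1 for January and February, 0 otherwise, so that the
    # computation treats the year as starting in March.
    set a [expr {(14 - $month) / 12}]
    set y [expr {$year + 4800 - $a}]
    set m [expr {$month + 12 * $a - 3}]
    return [expr {$day + (153 * $m + 2) / 5 + 365 * $y
                  + $y / 4 - $y / 100 + $y / 400 - 32045}]
}

puts [gregorianToJDN 1582 10 15]   ;# 2299161
puts [gregorianToJDN 1752 9 14]    ;# 2361222
puts [gregorianToJDN 2000 1 1]     ;# 2451545
```

The first two results match the root-locale and 'en' values given above; the third matches the Julian Day quoted for 1 January 2000 in the description of the %J format group.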
*Example.* The following file is "ja.msg", which localises the [clock]
command to a Japanese locale.
namespace eval ::tcl::clock {
::msgcat::mcset ja DAYS_OF_WEEK_ABBREV [list \
"\u65e5"\
"\u6708"\
"\u706b"\
"\u6c34"\
"\u6728"\
"\u91d1"\
"\u571f"]
::msgcat::mcset ja DAYS_OF_WEEK_FULL [list \
"\u65e5\u66dc\u65e5"\
"\u6708\u66dc\u65e5"\
"\u706b\u66dc\u65e5"\
"\u6c34\u66dc\u65e5"\
"\u6728\u66dc\u65e5"\
"\u91d1\u66dc\u65e5"\
"\u571f\u66dc\u65e5"]
::msgcat::mcset ja MONTHS_ABBREV [list \
"1"\
"2"\
"3"\
"4"\
"5"\
"6"\
"7"\
"8"\
"9"\
"10"\
"11"\
"12"\
""]
::msgcat::mcset ja MONTHS_FULL [list \
"1\u6708"\
"2\u6708"\
"3\u6708"\
"4\u6708"\
"5\u6708"\
"6\u6708"\
"7\u6708"\
"8\u6708"\
"9\u6708"\
"10\u6708"\
"11\u6708"\
"12\u6708"\
""]
::msgcat::mcset ja BCE "\u7d00\u5143\u524d"
::msgcat::mcset ja CE "\u897f\u66a6"
::msgcat::mcset ja AM "\u5348\u524d"
::msgcat::mcset ja PM "\u5348\u5f8c"
::msgcat::mcset ja DATE_FORMAT "%Y/%m/%d"
::msgcat::mcset ja TIME_FORMAT "%k:%M:%S"
::msgcat::mcset ja DATE_TIME_FORMAT "%Y/%m/%d %k:%M:%S %z"
::msgcat::mcset ja LOCALE_NUMERALS "\u3007 \u4e00 \u4e8c \u4e09 \u56db
\u4e94 \u516d \u4e03 \u516b \u4e5d \u5341 \u5341\u4e00 \u5341\u4e8c
\u5341\u4e09 \u5341\u56db \u5341\u4e94 \u5341\u516d \u5341\u4e03
\u5341\u516b \u5341\u4e5d \u4e8c\u5341 \u5eff\u4e00 \u5eff\u4e8c
\u5eff\u4e09 \u5eff\u56db \u5eff\u4e94 \u5eff\u516d \u5eff\u4e03
\u5eff\u516b \u5eff\u4e5d \u4e09\u5341 \u5345\u4e00 \u5345\u4e8c
\u5345\u4e09 \u5345\u56db \u5345\u4e94 \u5345\u516d \u5345\u4e03
\u5345\u516b \u5345\u4e5d \u56db\u5341 \u56db\u5341\u4e00
\u56db\u5341\u4e8c \u56db\u5341\u4e09 \u56db\u5341\u56db
\u56db\u5341\u4e94 \u56db\u5341\u516d \u56db\u5341\u4e03
\u56db\u5341\u516b \u56db\u5341\u4e5d \u4e94\u5341
\u4e94\u5341\u4e00
\u4e94\u5341\u4e8c \u4e94\u5341\u4e09 \u4e94\u5341\u56db
\u4e94\u5341\u4e94 \u4e94\u5341\u516d \u4e94\u5341\u4e03
\u4e94\u5341\u516b \u4e94\u5341\u4e5d \u516d\u5341
\u516d\u5341\u4e00 \u516d\u5341\u4e8c \u516d\u5341\u4e09
\u516d\u5341\u56db \u516d\u5341\u4e94 \u516d\u5341\u516d
\u516d\u5341\u4e03 \u516d\u5341\u516b \u516d\u5341\u4e5d
\u4e03\u5341
\u4e03\u5341\u4e00 \u4e03\u5341\u4e8c \u4e03\u5341\u4e09
\u4e03\u5341\u56db \u4e03\u5341\u4e94 \u4e03\u5341\u516d
\u4e03\u5341\u4e03 \u4e03\u5341\u516b \u4e03\u5341\u4e5d
\u516b\u5341
\u516b\u5341\u4e00 \u516b\u5341\u4e8c \u516b\u5341\u4e09
\u516b\u5341\u56db \u516b\u5341\u4e94 \u516b\u5341\u516d
\u516b\u5341\u4e03 \u516b\u5341\u516b \u516b\u5341\u4e5d
\u4e5d\u5341
\u4e5d\u5341\u4e00 \u4e5d\u5341\u4e8c \u4e5d\u5341\u4e09
\u4e5d\u5341\u56db \u4e5d\u5341\u4e94 \u4e5d\u5341\u516d
\u4e5d\u5341\u4e03 \u4e5d\u5341\u516b \u4e5d\u5341\u4e5d"
::msgcat::mcset ja LOCALE_DATE_FORMAT "%EY\u5e74%B%Od\u65e5"
::msgcat::mcset ja LOCALE_TIME_FORMAT "%OH\u6642%OM\u5206%OS\u79d2"
::msgcat::mcset ja LOCALE_DATE_TIME_FORMAT \
"%A %EY\u5e74%B%Od\u65e5%OH\u6642%OM\u5206%OS\u79d2 %z"
::msgcat::mcset ja LOCALE_ERAS "
{-9223372036854775808 \u897f\u66a6 0}
{-3060979200 \u660e\u6cbb 1867}
{-1812153600 \u5927\u6b63 1911}
{-1357603200 \u662d\u548c 1925}
{600220800 \u5e73\u6210 1988}"
}
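The LOCALE_ERAS triples can be read as a lookup table. The sketch below
shows one way the year of an era might be derived from them; it is an
illustration under stated assumptions, not the reference implementation.
The procedure name 'eraOf' is hypothetical, the list is assumed to be
sorted by ascending start time, and only three of the eras from the
catalog above are included.

```tcl
# Hypothetical helper (not part of the proposed [clock] API): find the
# era in effect at a given time and the year within that era.
# Assumption: $eras is sorted by ascending start time.
proc eraOf {seconds eras} {
    set found {}
    foreach triple $eras {
        foreach {start name offset} $triple break
        if {$seconds >= $start} {
            # Keep the latest era that has already begun.
            set found [list $name $offset]
        }
    }
    if {[llength $found] == 0} {
        error "no era contains time $seconds"
    }
    foreach {name offset} $found break
    # Gregorian year in UTC; [scan] avoids octal surprises such as "0970".
    scan [clock format $seconds -format %Y -gmt true] %d year
    return [list $name [expr {$year - $offset}]]
}

# Three eras copied from the catalog above: {start name offset}.
set eras "
    {-3060979200 \u660e\u6cbb 1867}
    {-1812153600 \u5927\u6b63 1911}
    {-1357603200 \u662d\u548c 1925}"

# The Posix epoch falls in year 45 of the third era (1970 - 1925).
puts [eraOf 0 $eras]
```

The offset is subtracted from the Gregorian year, as the description of
LOCALE_ERAS states, so a start time of -1812153600 (in 1912) with offset
1911 yields year 1 of its era.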
In addition to the standard locales, two special locales may be given
as the value of the *-locale* parameter: *current*, which designates
the result of evaluating [mclocale], and *system*, which designates the
current "system" locale, determined by (in order of preference):
* the date/time format settings on the Windows control panel
* the environment variable LC_TIME
* the current locale from [mclocale].
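The order of preference above might be sketched as follows. This is an
illustration only: the procedure name 'systemLocale' is hypothetical,
the Windows control panel lookup is omitted, and the normalisation of
LC_TIME (dropping a codeset or modifier suffix and folding to lower
case) is an assumption rather than something this TIP specifies.

```tcl
# Hypothetical helper (not part of the proposed [clock] API): choose a
# "system" locale name. The Windows control panel step is left out.
#   envDict  - environment variables as a dictionary
#   fallback - locale to use when LC_TIME gives no answer
proc systemLocale {envDict fallback} {
    if {[dict exists $envDict LC_TIME]} {
        set value [dict get $envDict LC_TIME]
        # Assumption: strip suffixes such as ".UTF-8" or "@euro".
        regsub {[.@].*$} $value {} value
        if {$value ne "" && $value ne "C" && $value ne "POSIX"} {
            return [string tolower $value]
        }
    }
    return $fallback
}

puts [systemLocale {LC_TIME ja_JP.UTF-8} en_us]   ;# prints ja_jp
puts [systemLocale {} en_us]                      ;# prints en_us
```

In a running interpreter the arguments would typically be
[array get ::env] and the result of [::msgcat::mclocale].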
BUILD SYSTEM
==============
Several tools are provided for the use of maintainers:
loadICU.tcl:
Given a distribution of IBM's *icu4c*
[<URL:http://www-124.ibm.com/developerworks/oss/icu/project/index.html>],
this program analyzes the source code of the message catalogs and
extracts appropriate Tcl-based messages for the date and time
formats in the supported locales.
loadtzif.tcl:
Given a time zone information file used by the Olson version of
'tzset' (for a description, see the latest 'tzcode' file in
[<URL:ftp://elsie.nci.nih.gov/pub/>]), creates the corresponding
Tcl 'tzdata' file.
tclZIC.tcl:
Given the source code for the Olson time zone descriptions
(obtainable as the latest 'tzdata' file in
[<URL:ftp://elsie.nci.nih.gov/pub/>]), creates the full set of
Tcl 'tzdata' files.
Since these tools depend on third-party source, they will not be
included in the usual build steps; instead, maintainers will be
expected to run them whenever the files on which they depend change.
It is good practice to update the ICU and Olson files just before
cutting a release.
REFERENCE IMPLEMENTATION
==========================
The proper location of the reference implementation is open to
discussion. The author of this TIP has been managing it as a
TEA-compatible extension that can eventually be integrated into the
Core sources. One possible home for it would be a 'newclock' module
under the Tcl project at SourceForge. Another would be to open a new
project or to create a repository elsewhere. Suggestions are welcome.
BUGS
======
The reference implementation does not attempt to support any calendar
that is not based on the hybrid Julian/Gregorian calendar. It is
adequate for the Western countries and for the Japanese civil calendar,
but does not address the Hijri, Hebrew, Thai, Chinese or Korean
calendars. (To the best of the author's knowledge, no Tcl user has
requested these.)
The Gregorian change date is not supplied in most locales.
Localisation in most locales was done by an American who is probably
excessively ignorant in such matters.
COPYRIGHT
===========
Copyright 2004, by Kevin B. Kenny. Redistribution permitted under the
terms of the Open Publication License
[<URL:http://www.opencontent.org/openpub/>].
-------------------------------------------------------------------------
TIP AutoGenerator - written by Donal K. Fellows