Proposal: String::Format::General
- From: "harryfmudd [AT] comcast [DOT] net" <"harryfmudd [AT] comcast [DOT] net">
- Date: Sat, 02 Dec 2006 14:56:36 -0500
All,
There being no follow-up on "Formatting -- recommended module" in three weeks, I have proceeded with my own module. The POD is appended.
In short, it's kind of a meta-formatter. It provides format string parsing and output assembly; you provide the code that implements the individual conversion characters. Format syntax is kind of a cross between sprintf and strftime, but how close it is to each of these depends on the semantics implemented by the user.
Note that the following is pre-alpha documentation; the interface to the output conversion code has changed since yesterday, and may change again. I'm thinking about passing the format-derived parameters in a hash reference instead of as 8 arguments.
Tom Wyant
NAME
String::Format::General - Configurable sprintf-style formatter
SYNOPSIS
    use String::Format::General;
    $fmt = String::Format::General->new (
        format => '%L = %-rL',
        newline => "\n",
    );
    $fmt->define (conversion =>
        L => sub { # Literal
            my ($fmt, $dta, $inx, $flg, $xfg, $wid, $dp, $str, $trn, $cc) = @_;
            sprintf "%$flg*s", $wid, $dta;
        }, 'Text', 10);
    $fmt->define (transform =>
        r => sub { # Reverse
            my ($fmt, $dta, $inx, $flg, $xfg, $wid, $dp, $str, $trn, $cc) = @_;
            scalar reverse $dta;
        }, 'Reversed');
    print $fmt->head ();
    print $fmt->format ('Desperado');
which would produce the following output:
= Reversed
Text = Text
Desperado = odarepseD
The idea is that you can pass arbitrary data in, and have arbitrary
displays made out of them. Some limited heading capability is provided.
DESCRIPTION
This module provides a formatter class which allows its user to
implement arbitrary sprintf-like or strftime-like formats. The formatter
class provides format parsing, and marshaling of output for both data
and headings. The user of the class provides code to implement the
individual output conversion characters.
A format is basically a string into which data are to be inserted. The
insertion is controlled by conversion descriptors, which begin with a
'%' (or whatever the value of the 'leader' attribute is), end with the
conversion character, and contain a number of optional fields which may
affect the conversion. These fields are:
parameter index - a number followed by a literal dollar sign, which by
convention is used to select the datum to be converted when formatting a
list reference. Indices count from zero. Descriptors without an explicit
index default to 0, 1, 2, and so on, left to right in the format;
descriptors with an explicit index do not advance this count. That is to
say, if %d formats an integer in decimal and format string
'%2$03d %03d %03d' is used on [1, 2, 3], the result will be
'003 001 002'.
flags - the standard sprintf flags are allowed, and you can define your
own. User-defined and standard flags can be intermingled in the format
specification, but are passed separately to the conversion routine.
width - a number which by convention represents the width of the field.
The sprintf '*' is not allowed.
places - a dot ('.') followed by a number. By convention this represents
the number of decimal places if applicable. The sprintf '*' is not
allowed.
string - an arbitrary string, quoted by curly brackets or whatever the
contents of the 'quote' attribute are. This string may not contain
whatever character closes the quote, but may be used in any way the
conversion desires, or ignored.
transform - characters specifying a transformation of the data before
its output conversion. Transformations are user-defined.
For example, one could define conversion 'f' for floating-point output,
and invoke this with format '%5.2f'. Or one could define conversion 'T'
to output the time of day according to some default format overridden by
the contents of the string portion of the specification, and use format
'%{%I:%M:%S %p}T' to output time with a meridian indicator.
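To make the field layout concrete, here is a minimal stand-alone sketch of how a conversion descriptor might be picked apart. It is not the module's parser: the names parse_format, $transforms and $conversions are hypothetical, the sets of transform and conversion characters are assumed rather than taken from define() calls, the leader and quote are fixed at '%' and '{', and user-defined flags are not handled.

```perl
use strict;
use warnings;

# Assumed sets of defined characters; the real module would build
# these from whatever has been define()d.
my $transforms  = 'r';
my $conversions = 'dfNT';

my $descriptor = qr/
    %                        # leader
    (?: (\d+) \$ )?          # parameter index
    ( [-+\ 0\#]* )           # standard flags
    ( \d* )                  # width
    (?: \. (\d+) )?          # places
    (?: \{ ( [^}]* ) \} )?   # string
    ( [$transforms]* )       # transform characters
    ( [$conversions] )       # conversion character
/x;

# Return one hash per descriptor found in the format string.
sub parse_format {
    my ($format) = @_;
    my $next = 0;    # default index for descriptors without one
    my @found;
    while ($format =~ m/$descriptor/g) {
        my ($inx, $flg, $wid, $dp, $str, $trn, $cc) =
            ($1, $2, $3, $4, $5, $6, $7);
        $inx = $next++ unless defined $inx;
        push @found, {inx => $inx, flg => $flg, wid => $wid,
            dp => $dp, str => $str, trn => $trn, cc => $cc};
    }
    return @found;
}

my @data = (1, 2, 3);
my @spec = parse_format ('%2$03d %03d %03d');
print join (' ', map { sprintf "%$_->{flg}$_->{wid}d", $data[$_->{inx}] }
    @spec), "\n";    # prints "003 001 002"
```

The same sketch reads '%{%I:%M:%S %p}T' as conversion 'T' with string '%I:%M:%S %p', and '%-rN' as conversion 'N' with flag '-' and transform 'r'.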
In addition to formatting functionality, a number of output modes are
provided. Discussion of these is given under the mode attribute.
Methods
This class implements the following public methods:
$fmt = String::Format::General->new ();
This method instantiates a new String::Format::General object. Any
arguments are passed to $fmt->set ();
Note that this can be called as a normal method, i.e.
my $fmt2 = $fmt->new ();
This makes $fmt2 a clone of $fmt. Any arguments are passed to
$fmt2->set ().
$fmt->clear_cache ();
This method clears data cached by the formatter. This would be done
to force the data to be recomputed. Cached data include the regular
expression used to parse the format string, and the computed widths
for the columns being formatted. This is called by the define()
method, and by the mutator for the 'format', 'leader', and 'quote'
attributes. It should be called by the mutator of any user-added
attribute that affects format parsing or format() or head() output
in any way.
This method may not be called as a static method.
$fmt1 = $fmt->clone ();
This method duplicates the given object. It is more efficient to
clone an existing object than to manufacture one from scratch
(assuming you have a suitable existing object on hand), because
there is no need to validate define() calls.
This method may not be called as a static method.
$fmt->define ($what => $name, ...);
This method defines things. The $what argument says what is being
defined, and must be one of 'attribute', 'conversion', 'flag',
'mode', or 'transform'. The $name argument must conform to the
naming conventions for the thing being defined, and the subsequent
arguments depend on what is being defined.
If $what eq 'attribute', a new attribute with the given name is
created. The next argument is either false or a reference to the
mutator for the attribute. The mutator should expect the following
arguments:
- a reference to the object being mutated;
- the name of the attribute being mutated;
- the new value of the attribute.
The mutator must return the value to be placed in the attribute. If
the value is invalid it should raise an exception. In the case of a
false value being passed for the mutator, the mutator is sub
{$_[2]}.
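As an illustration of the mutator contract, here is a sketch of a validating mutator. The attribute name 'page_width' and the sub name are hypothetical; only the three-argument calling convention and the return-or-raise behavior come from the description above.

```perl
use strict;
use warnings;
use Carp;

# Hypothetical mutator for an assumed 'page_width' attribute.
# Receives the object, the attribute name, and the proposed value;
# returns the value to store, or raises an exception.
sub page_width_mutator {
    my ($fmt, $name, $value) = @_;
    defined $value && $value =~ m/\A[1-9][0-9]*\z/
        or croak "Attribute '$name' must be a positive integer";
    return $value + 0;
}

# Going by the define() signature described above, this would
# presumably be registered with something like:
#   $fmt->define (attribute => page_width => \&page_width_mutator);
```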
If $what eq 'conversion', a new conversion is created. The name must
be a single character and not be a digit, ' ', '-', '+', '#', '.',
'$', or the quote character. The subsequent arguments are:
- a reference to code to implement the conversion;
- a heading or code to compute the heading;
- a literal default width or code to compute the width.
If $what eq 'flag', a new flag is created. The name must conform to
the same restrictions as a 'conversion'. Any extra arguments are
ignored.
If $what eq 'mode', a new output mode is created. The subsequent
arguments are:
- a reference to code to implement format() output;
- a reference to code to implement head() output;
- an optional reference to code to see if the mode
may be used, and if so to load any modules needed.
The format() output code should expect as input the formatter object
and a list of formatted data to be inserted into the output line and
returned. The head() output code should expect as input the
formatter object and a list of headers, and return a list of header
lines. The check code should expect as input the formatter object,
and raise an exception if the mode may not be used.
If $what eq 'transform', a new transform is created. The name must
conform to the same restrictions as a 'conversion'. The subsequent
arguments are:
- a reference to code to implement the transform;
- a heading or code to implement the heading.
Once an attribute, conversion, flag, mode, or transform is defined
it may not be deleted. It may be redefined by the class (i.e.
namespace) that made the original definition, or by a subclass of
that class, using the define() method; but it may not be made into
something different. That is, if a class has defined attribute 'x',
a subclass may not make 'x' into a conversion, flag, or transform,
though it can change its mutator.
Except as noted above, when code is called for it should expect to
be called with the following arguments:
$fmt : The relevant String::Format::General object;
$dta : The data to be formatted (undef if none);
$inx : The parameter index to be converted (from zero);
$flg : The sprintf-style flags ('' if none);
$xfg : The extended flags ('' if none);
$wid : The field width ('' if none);
$dp : The decimal places (undef if none);
$str : The arbitrary string (undef if none);
$trn : The transform characters ('' if none);
$cc : The conversion character.
What the code returns depends on its purpose.
The conversion code is expected to return the formatted data.
Heading code is expected to return the relevant heading, as either a
string or a list of strings. The $dta argument will be whatever
argument was passed to the head() call.
Transform code is expected to return the transformed data.
Width code is expected to return the width, which will be passed to
heading, transform, and conversion code in the $wid argument. The
$dta argument will be undef, and the $wid argument may be empty.
This method may not be called as a static method.
The return value is the object itself.
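For example, conversion code for the floating-point 'f' conversion mentioned in the DESCRIPTION might look something like the following sketch. The sub name is hypothetical, returning an empty string for undefined data is an assumption, and the sub is called directly here rather than through the formatter.

```perl
use strict;
use warnings;

# Hypothetical conversion code for an 'f' conversion. It takes the
# ten arguments listed above and returns the formatted datum.
sub convert_float {
    my ($fmt, $dta, $inx, $flg, $xfg, $wid, $dp, $str, $trn, $cc) = @_;
    defined $dta or return '';    # assumption: no data, no output
    my $spec = '%' . $flg . $wid;
    $spec .= ".$dp" if defined $dp;
    return sprintf "${spec}f", $dta;
}

# What a format of '%5.2f' would presumably hand to the code:
print convert_float (undef, 3.14159, 0, '', '', 5, 2, undef, '', 'f'), "\n";
# prints " 3.14"
```

Because $wid is '' and $dp is undef when the format omits them, building the sprintf specification piecewise avoids passing an empty width to sprintf's '*'.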
$what = $fmt->defined_as ($name);
This method returns the thing the $name is defined as, as one of the
strings 'attribute', 'conversion', 'flag', 'mode', or 'transform'.
If the $name is not defined, undef is returned.
This method may be called as a static method.
$who = $fmt->defined_by ($name);
This method returns the name of the class (i.e. namespace) that
defined the given thing. If it is not defined, undef is returned.
This method may be called as a static method.
$text = $fmt->format ($data);
This method formats the given $data into text in the current output
mode, by applying the defined conversions, flags, and transforms to
the 'format' attribute.
This may not be called as a static method.
$value = $fmt->get ($name);
This method retrieves the value of the given attribute. It may not
be called as a static method.
$hash_reference = $fmt->get_cache ($name);
This method returns the contents of the named cache. The return will
be a hash reference, and this cache will be created if necessary.
The cache is a hash attached to the object. User-defined code may
use the cache, but any use of the cache should be prepared to
compute whatever is needed in case the cache has been cleared.
Cache names beginning with an underscore are reserved to this class.
It is recommended that the user create a single appropriately-named
cache, and store all user-defined data under this name. The
recommended idiom for accessing data in this cache is something like
    my $my_cache = $fmt->get_cache ('my_cache');
    my $cached_data = $my_cache->{cached_data} ||= do {
        # Code to compute the cached data. Because $my_cache
        # is in the cache, the computed {cached_data} will
        # be also.
    };
This method may not be called as a static method.
@lines = $fmt->head ($dta);
This method generates all headers in the current output mode.
The $dta argument is optional. If it is provided it will be passed
to the individual transforms' and conversions' header code as the
second argument; otherwise undef will be passed.
The heading for each conversion is formed by the headings of the
transforms (if any) and conversion character for each conversion,
joined with ' '.
If the reverse_headings attribute is true (in the Perl sense) the
individual headings will be reversed before being joined, in a crude
attempt to give a more natural word order in (e.g.) the Romance
languages.
If the make_text attribute is true (in the Perl sense) the
reverse_headings attribute is ignored. Instead, the field heading,
once built, is submitted to maketext, and the result is used as the
column heading. If the maketext call fails, the individual pieces of
the heading are submitted to maketext, and the result concatenated.
The individual translated headings are reversed before concatenation
if maketext ('_reverse_headings') is true.
This method may not be called as a static method.
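The heading assembly for the simple case (no make_text) can be sketched as follows. The sub name is hypothetical, and the sketch assumes each individual heading is a single string; the maketext path is not covered.

```perl
use strict;
use warnings;

# Hypothetical helper: build one field heading from the transform
# headings followed by the conversion heading.
sub field_heading {
    my ($reverse_headings, @parts) = @_;
    @parts = reverse @parts if $reverse_headings;
    return join ' ', @parts;
}

print field_heading (0, 'Reversed', 'Text'), "\n";    # prints "Reversed Text"
print field_heading (1, 'Reversed', 'Text'), "\n";    # prints "Text Reversed"
```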
$fmt = $fmt->set ($name, $value ...)
This method changes the values of the given attributes. More than
one attribute/value pair may be given, but the named attributes must
exist.
This method may not be called as a static method.
%used = $fmt->used ();
This method returns a hash enumerating the conversion characters
used by the current format. The hash keys are the conversion
characters, and the values are the number of times each character is
used. Conversion characters not used will not appear in the hash.
This method may not be called as a static method.
Attributes
The attributes are described below, listed by name, followed by the data
type in parentheses. The data types are:
boolean - Any value is valid, and it will be interpreted according to
Perl's notion of truth and falsehood (undef, 0, and '' are false,
everything else is true);
character - The attribute must be a single character, and certain
characters may be invalid;
handle - see the description of the attribute;
string - Any characters are valid.
In addition, the parentheses may contain the word 'cached' if the class
caches results computed from this attribute. Changing such attributes
will cause the cache to be flushed, and the computations will be redone
the next time they are needed.
This class implements the following public attributes:
format (string, cached)
This attribute contains the format string to be used by the
format(), head(), and used() methods.
There is no default.
leader (string, cached)
This attribute specifies the character that introduces a conversion.
The default is '%'.
make_text (string or handle)
This attribute specifies either the name of a subclass of
Locale::Maketext or a language handle generated by calling the
class's get_handle method. If a subclass name is specified,
get_handle will be called on that class, and the make_text attribute
set to the result of the get_handle method.
As a consequence of this, if you wish to change languages in
mid-stream, you need to reset the make_text attribute. Something
like
$fmt->set (make_text => ref $fmt->get ('make_text'))
should do the job, though depending on how the conversions are
implemented you may also need to do things like call
POSIX::setlocale.
The head() method makes use of this attribute to translate field
headings, but it is also available for other purposes.
If this attribute is set (i.e. true in the Perl sense) the
reverse_headings attribute is ignored.
The default is undef.
mode (string)
This attribute controls the output produced by the format() and
head() methods. Initially, valid values are:
csv - Causes the output to be comma-separated values. The head()
method returns only one line, with all headings for a column being
joined by single spaces. Contents of the 'format' attribute other
than conversion specifications are ignored, and leading and trailing
spaces will be stripped before output. The Text::CSV module is
required, and will be loaded if needed.
html_table - Causes the output to be html table rows. The head()
method encloses the headings in <th> tags, and the whole row in <tr>
tags. Only one table row is generated by the head() method, with all
headings for a column being joined by single spaces. The format()
method encloses the data in <td> tags, and the whole row in <tr>
tags. Contents of the 'format' attribute other than conversion
specifications are ignored, and leading and trailing spaces will be
stripped before output. The HTML::Entities module is required, and
will be loaded if needed.
text - Causes the output to be text formatted per the contents of
the 'format' attribute. The head() method may return multiple lines
of text if necessary. No additional modules are needed.
tsv - Similar to 'csv', but the output is tab-separated and strings
are not quoted. No additional modules are needed.
The user may of course define additional output modes using the
define() method.
The default is 'text'.
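As a rough illustration of what a non-text mode does with the formatted data, here is a sketch of 'tsv'-style line assembly. The sub name and its argument list are hypothetical (real mode code receives the formatter object and would presumably obtain the newline from it); only the stripping of leading and trailing spaces and the tab separation come from the description above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for 'tsv' format() output: strip leading
# and trailing white space from each formatted datum, join with tabs,
# and append the newline string.
sub tsv_line {
    my ($newline, @fields) = @_;
    for (@fields) {
        s/\A\s+//;
        s/\s+\z//;
    }
    return join ("\t", @fields) . $newline;
}

print tsv_line ("\n", ' Desperado', 'odarepseD ');
```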
newline (string)
This attribute specifies a string to place at the end of each line
of text generated by the format() and head() methods in 'text',
'html_table', and 'tsv' modes. In 'csv' mode the line ending is
provided by Text::CSV. The use of this in a user-defined mode is
determined by the definition of the mode.
The default is ''.
quote (character, cached)
This attribute specifies the character to delimit the 'string'
portion of the conversion. The same character will be used to
delimit the end unless the character is a bracket of some sort, in
which case the opposite bracket terminates the string. Yes, this
means that if you begin the string with a right-hand bracket you
must end it with the corresponding left-hand bracket.
Digits and the characters ' ', '+', '-', '#', '$', and '.' are
invalid, as are characters defined as having other functions (e.g.
conversions). An attempt to set the attribute to one of these will
croak.
The default is '{', which means '}' terminates the string.
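The choice of closing delimiter can be sketched like this. The sub name is hypothetical, and which characters count as "a bracket of some sort" is an assumption (the four ASCII pairs).

```perl
use strict;
use warnings;

# Assumed bracket pairs, mapped in both directions so that a
# right-hand bracket is closed by the corresponding left-hand one.
my %opposite = (
    '(' => ')', '[' => ']', '{' => '}', '<' => '>',
    ')' => '(', ']' => '[', '}' => '{', '>' => '<',
);

# Hypothetical helper: given the 'quote' character, return the
# character that terminates the string.
sub closing_quote {
    my ($quote) = @_;
    return exists $opposite{$quote} ? $opposite{$quote} : $quote;
}

print closing_quote ('{'), closing_quote ('}'), closing_quote ('|'), "\n";
# prints "}{|"
```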
reverse_headings (boolean)
This attribute specifies the order in which individual transform and
conversion headings are concatenated to make a field heading. If
true (in the Perl sense) they are reversed.
This attribute is ignored if the make_text attribute was specified.
The default is undef (i.e. false).
trim (boolean)
This attribute specifies the behavior of the format() method when a
datum does not fit in its field. If true, the datum will be
truncated on the right. If false, the datum will not be truncated.
Headings are always truncated.
The default is undef (i.e. false).
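The effect of 'trim' on a single datum can be sketched as follows. The sub name is hypothetical, and the sketch assumes the width arrives as '' when the format gives none, as described for the $wid argument.

```perl
use strict;
use warnings;

# Hypothetical helper: truncate on the right only when trimming is
# requested, a width was given, and the text overflows the field.
sub fit_field {
    my ($text, $wid, $trim) = @_;
    return $text
        unless $trim && $wid ne '' && length ($text) > $wid;
    return substr $text, 0, $wid;
}

print fit_field ('Desperado', 4, 1), "\n";    # prints "Desp"
print fit_field ('Desperado', 4, 0), "\n";    # prints "Desperado"
```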
SUBCLASSING
Unless you intend to use your formats in more than one script you do not
need to subclass. Instead, instantiate an object, define() what you need
on it, and then clone() it if you need more than one (say, to avoid the
overhead involved in changing some of the attributes).
Should you want to subclass, something like this is probably the
simplest approach:
    sub new {
        my $class = shift;
        my $self = $class->SUPER::new ();
        $self->define (...);
        # As many as desired.
        @_ and $self->set (@_);
        $self;
    }
Since cloning is more efficient than defining attributes (no checking
for conflicts) you may wish to do something like this:
    my $prototype = __PACKAGE__->SUPER::new ();
    _setup ($prototype);
    sub new {
        my $class = shift;
        my $self = ref $class ? $class->clone () :
            $class eq __PACKAGE__ ? $prototype->clone () :
            _setup ($class->SUPER::new ());
        @_ and $self->set (@_);
        $self; # return the object even when no attributes were passed
    }
    sub _setup {
        my $self = shift;
        $self->define (...);
        # As many as desired.
        $self;
    }
You may find it useful to override the set() method in a couple of cases:
you may wish to restrict the values of the base attributes, or you may
wish to alter some attributes based on the values of others. In both
cases this can be done with the following pseudo-Perl:
    sub set {
        my $self = shift;
        while (@_) {
            my $name = shift;
            my $value = shift;
            if ($name eq 'something') {
                # throw an exception if you do not like the value
            }
            $self->SUPER::set ($name, $value);
            if ($name eq 'something') {
                # do other processing now that the value has been set
            }
        }
        $self; # since set() is documented as returning this
    }
BUGS
Bugs may be reported to <https://rt.cpan.org/> or by mail to the author
(wyant at cpan dot org).
SEE ALSO
<http://search.cpan.org/> finds 267 modules with 'format' in their names
as of November 2 2006. Most of these seem to be special-purpose
formatters, parts of template systems, or concerned with blocks of text.
For implementing a general sprintf-like formatter, only the following
seemed more-or-less relevant to me.
<http://search.cpan.org/dist/String-Format> by Darren Chamberlain uses
the format characters to select like-keyed data from a hash.
<http://search.cpan.org/dist/String-FormatX> by Lance Cleveland is
object-oriented, and associates values with COBOL-like templates.
<http://search.cpan.org/dist/Data-Display> by Geo Tiger is concerned
with the marshaling of tabular data.
<http://search.cpan.org/dist/Data-MaskPrint> by Ilya Verlinsky uses
COBOL-like templates to format tabular data.
There are even more templating systems, the most comprehensive (and
largest) being <http://search.cpan.org/dist/Template-Toolkit> by Andy
Wardley.
COPYRIGHT
Copyright 2006 by Thomas R. Wyant, III (wyant at cpan dot org). All
rights reserved.
This module is free software; you can use it, redistribute it and/or
modify it under the same terms as Perl itself.