TIP #189: Tcl Modules
From: Andreas Kupries (akupries_at_shaw.ca)
Date: 04/27/04
Date: Tue, 27 Apr 2004 21:19:59 +0000 (UTC)
TIP #189: TCL MODULES
=======================
Version: $Revision: 1.2 $
Author: Andreas Kupries <akupries_at_shaw.ca>
Jean-Claude Wippler <jcw_at_equi4.com>
Jeff Hobbs <jeffh_at_activestate.com>
Don Porter <dgp_at_users.sourceforge.net>
State: Draft
Type: Project
Tcl-Version: 8.5
Vote: Pending
Created: Wednesday, 24 March 2004
URL: http://purl.org/tcl/tip/189.html
WebEdit: http://purl.org/tcl/tip/edit/189
Post-History:
-------------------------------------------------------------------------
ABSTRACT
==========
This document describes a new mechanism for the handling of packages by
the Tcl Core which differs from the existing system in important
details and makes different trade-offs with regard to flexibility of
package declarations and to access to the filesystem. This mechanism is
called "Tcl Modules".
BACKGROUND AND MOTIVATION
===========================
The current mechanism for locating and loading packages employed by the
Tcl core is very flexible, but suffers from a number of drawbacks as
well. These are at least partially the result of the flexibility, and
thus not easily solved without giving up something.
One problem with the current mechanism is that it extensively searches
the filesystem for packages, and that it has to actually read a file
(/pkgIndex.tcl/) to get the full information for a prospective package.
All of these operations take time. The fact that "index scripts"
are able to extend the list of paths searched tends to heighten this
cost, as it forces rescans of the filesystem. Installations where the
directories in the /auto_path/ are large or mounted from remote hosts
are hit especially hard by this (network delays). All of this together
causes a slow startup of tclsh and Tcl-based applications.
"*Tcl Modules*" on the other hand is designed with less flexibility in
mind and to allow implementations to glean as much information as
possible without having to perform lots of accesses to the filesystem.
Additional benefits of the proposed design are a simplified deployment
of packages, akin to the way starkits made application deployment
simple, and from that an easier implementation and management of
repositories.
It does not come without penalties however.
* The simplified design has no "index scripts". While this does
away with extending the list of paths to search it also does away
with the ability of packages to check preconditions, like the
version of the currently executing Tcl interpreter. Dependencies
of packages in module form on particular versions of Tcl have to
be managed differently and outside of them.
* "Tcl Modules" is defined to be an extension of the existing
package mechanism and does /not/ replace it. This means that any
failure to find a package as module /has to/ cause a fall back to
the regular package mechanism. It also sets a limit on how much
of our goals we can reach: searching for packages which are not
installed will stay relatively slow, and dominated by the
filesystem scan of the regular search. This implies that "Tcl
Modules" will be best suited in installations where the number of
regular packages is low, and contained in a small part of the
overall filesystem.
On the gripping hand the only regular packages required will be
packages supporting the virtual filesystems employed by modules
(more on that later), so a transformation of an installation based
on a set of regular packages to the form above is quite feasible.
SPECIFICATION
===============
INTRODUCTION
--------------
Modules are regular Tcl Packages, in a different guise. To ease
explanations first a summary of the existing mechanism:
* Packages are identified through "pkgIndex.tcl" files and the
"index script" they contain. These files are read and define the
"provide script", which tells Tcl how to actually load the
package. In other words, whether to use the "source" or "load"
command, which file to specify as an argument to that command,
etc. However as "pkgIndex.tcl" contains a regular tcl script it
can do more than that and actually influence the environment,
i.e. the package search itself, in several ways:
* It may choose to not register the package if conditions
for the package are not met, like being run by a too old
version of Tcl.
* It may extend the list of paths used to search for
packages. This implies that a package is able to modify
the behaviour of the package search (usually extend the
search) even before it is loaded, and even if it will not
be loaded at all.
The above is very flexible, but comes at a price. The filesystem is not
only searched, but files have to be read as well to build up the
in-memory index of packages. And this is iterated as well if index
files change/extend the list of paths to search.
Tcl Modules simplifies the above considerably, by cutting down on the
number of indirections involved. It only searches for module files, and
records their location, but does not read them. The search is only
performed when required, on a limited part of the filesystem. This
makes locating and importing packages in module form easier and faster.
The price is that packages in module form cannot prevent registration
in an interpreter not of their choice, nor can they influence the
package search itself before they are actually used.
The remainder of this document will cover the following topics:
* What constitutes a Tcl Module ?
* How are they found ?
* How are they indexed, i.e. entered into the package database ?
MODULE DEFINITION
-------------------
A Tcl Module is a Tcl Package contained in a /single/ file. This file
has to be *source*able. In other words, a Tcl Module is always imported
via:
source module_file
The "load" command is not directly used. This restriction is not an
actual limitation, as one might believe. Ever since 8.4 the Tcl *source*
command reads only until the first ^Z character. This allows us to
combine an arbitrary Tcl script with arbitrary binary data into one
file, where the script processes the attached data in any way it chooses
to fully import and activate the package. Please read [TIP #190]
"Implementation Choices for Tcl Modules" for more explanations of the
various choices which are possible.
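The ^Z behaviour described above can be tried out with a small experiment. The following is only a sketch: the file name demo-1.0.tm, the variable ::demoPayloadFile and the payload text are made up for illustration and are not part of the proposal.

```tcl
# Build a hypothetical module file: a Tcl script, a ^Z character, then data.
set mf [file join [pwd] demo-1.0.tm]
set out [open $mf w]
fconfigure $out -translation binary
puts -nonewline $out "set ::demoPayloadFile \[info script\]\n\u001aPAYLOAD-BYTES"
close $out

# 'source' evaluates only the part in front of the ^Z character.
source $mf

# The script part recorded its own location; read the attached data back.
set in [open $::demoPayloadFile r]
fconfigure $in -translation binary
set data [read $in]
close $in
set payload [string range $data [expr {[string first \u001a $data] + 1}] end]
puts $payload

file delete $mf
```

A real module would do something useful with the attached data, for example mount it as a virtual filesystem. Those choices are the subject of TIP #190.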
The name of a module file has to match the regular expression
([[:alpha:]][:[:alnum:]]*)-([[:digit:]].*)\.tm
The first capturing group provides the name of the package, the
second its version. In addition to matching the pattern the
extracted version number must not raise an error when used in the
command
package vcompare $version 0
This additional check has several benefits. The reg-exp pattern is a
bit simpler, and the full version check is based on the official
definition of version numbers used by the Tcl core itself.
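The two-step check can be sketched as follows. The helper name parseModuleFileName is hypothetical, and the ^ and $ anchors are an assumption added here so that the whole file name has to match the pattern.

```tcl
# Hypothetical helper, not part of the proposal: returns {name version}
# for a valid module file name, or an empty list otherwise.
proc parseModuleFileName {fname} {
    # Pattern from the text above, anchored (assumption) to the full name.
    set pattern {^([[:alpha:]][:[:alnum:]]*)-([[:digit:]].*)\.tm$}
    if {![regexp -- $pattern $fname -> name version]} {
        return {}
    }
    # Let the core decide whether the version number is well-formed.
    if {[catch {package vcompare $version 0}]} {
        return {}
    }
    return [list $name $version]
}

puts [parseModuleFileName base64-2.3.1.tm]   ;# base64 2.3.1
puts [parseModuleFileName base64-2.x.tm]     ;# empty: not a version number
puts [parseModuleFileName readme.txt]        ;# empty: pattern does not match
```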
FINDING MODULES
-----------------
Remember the check for a valid module in the last section, and notice that
any filename matching this name pattern is going to be treated by the
TM system as if it's a Tcl module, whether it really is or not. This
means it's a bad idea for any non-Tcl module files that might match
that pattern to end up in a directory where TM will be scanning. This
suggests that the directory tree for storing Tcl modules ought to be
something separate from other parts of the filesystem. This further
implies that a new search path over just these separate storage areas
would be better than Yet Another use of /$::auto_path/.
Therefore: Modules are searched for in all directories listed in the
result of the command "::tcl::tm::path list" (See also section 'API to
"Tcl Modules"'). This is called the "Module path". Neither
"/auto_path/" nor "/tcl_pkgPath/" are used.
All directories on the module path have to obey one restriction:
* For any two directories neither is an ancestor directory of the
other.
This is required to avoid ambiguities in package naming. If for example
the two directories
foo/
foo/cool
were on the path a package named 'cool::ice' could be found via the
names 'cool::ice' or 'ice', the latter potentially obscuring a package
named 'ice', unqualified.
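One possible way to test the restriction is to compare the path components of the two directories. This is a sketch only; the helper name isAncestor is hypothetical and the proposal does not prescribe how the check is done.

```tcl
# Hypothetical helper: returns 1 if directory $a is an ancestor of
# directory $b, and 0 otherwise (including when both are the same).
proc isAncestor {a b} {
    set a [file split [file normalize $a]]
    set b [file split [file normalize $b]]
    if {[llength $a] >= [llength $b]} {
        return 0
    }
    # $a is an ancestor if its components form a prefix of those of $b.
    foreach ca $a cb [lrange $b 0 [expr {[llength $a] - 1}]] {
        if {$ca ne $cb} {
            return 0
        }
    }
    return 1
}

puts [isAncestor foo foo/cool]   ;# 1
puts [isAncestor foo/cool foo]   ;# 0
```

Two directories may share a module path only if the helper returns 0 for both argument orders.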
Before the search is started the name of the requested package is
translated into a partial path, using the following algorithm:
* All occurrences of '::' in the package name are replaced by the
appropriate directory separator character for the platform we are
on. For Unix for example this is '/'.
Example:
* The requested package is /encoding::base64/. The generated
partial path is
encoding/base64
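In Tcl the translation is a single string substitution. The helper name partialPath is hypothetical and used only for this illustration.

```tcl
# Hypothetical helper: translate a package name into a partial path by
# replacing every '::' with the directory separator of the platform.
proc partialPath {pkgname} {
    return [string map [list :: [file separator]] $pkgname]
}

puts [partialPath encoding::base64]
```

On Unix this prints encoding/base64.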
After this translation the package is looked for in all module paths,
by combining them one-by-one, first to last with the partial path to
form a complete search pattern. The exact pattern and mechanism is left
unspecified, giving the implementation freedom of choice what glob
searches to perform, how many of them, and when.
Independent of that the implemented algorithm has to reject all files
where the filename does not match the regular expression given in the
previous section. For the remaining files "provide scripts" are
generated and added to the *package ifneeded* database.
The algorithm has to fall back to the previous unknown handler when
none of the found module files satisfied the request. If the request
was satisfied no fall-back is required.
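Since the exact search is left to the implementation, the following is only one possible sketch, using a single glob per module path. The name findModules and the returned list of {version file} pairs are assumptions made for this example.

```tcl
# Hypothetical sketch: list {version file} pairs for all module files that
# could provide package $pkgname under the directories in $paths.
proc findModules {paths pkgname} {
    set partial [string map {:: /} $pkgname]
    set base [file tail $partial]
    set found {}
    foreach root $paths {
        # Non-existent directories are skipped during the search.
        if {![file isdirectory $root]} {
            continue
        }
        foreach mf [glob -nocomplain -types f -directory $root -- ${partial}-*.tm] {
            # Strip "<base>-" and ".tm" to get the candidate version.
            set tail [file tail $mf]
            set version [string range $tail [string length $base-] end-3]
            # Reject file names which are not valid module names.
            if {![regexp {^[[:digit:]]} $version]} {
                continue
            }
            if {[catch {package vcompare $version 0}]} {
                continue
            }
            lappend found [list $version $mf]
        }
    }
    return $found
}

# /opt/tcl/modules is a made-up directory; the result is empty if it is missing.
puts [findModules [list /opt/tcl/modules] encoding::base64]
```

A handler built on top of this would register each result with *package ifneeded* and call the previous handler when the list is empty or no version satisfies the request.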
PROVIDE AND INDEX SCRIPTS
---------------------------
Packages in module form have no control over the "index" and "provide
script"s entered into the package database for them. For a module file
/MF/ the "index script" is
package ifneeded PNAME PVERSION [list source MF]
and the "provide script" embedded in the above is
source MF
Both package name *PNAME* and package version *PVERSION* are extracted
from the filename *MF* according to the definition below:
MF = /module_path/PNAME'-PVERSION.tm
Where *PNAME'* is the partial path of the module as defined in section
'Finding Modules' before, translated into *PNAME* by changing all
directory separators to '::', and *module_path* is the path, from the
list of paths to search, under which the module file was found.
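The derivation of *PNAME* and *PVERSION* from a file name can be sketched like this. The helper moduleIdentity and the directory /opt/tm are hypothetical; the sketch assumes that the module file really lies below the given module path.

```tcl
# Hypothetical helper: derive {PNAME PVERSION} for the module file $mf
# found under the module path $root.
proc moduleIdentity {root mf} {
    set rootParts [file split [file normalize $root]]
    set mfParts   [file split [file normalize $mf]]
    # Components below the module path, e.g. {encoding base64-2.3.1.tm}.
    set rel  [lrange $mfParts [llength $rootParts] end]
    set tail [lindex $rel end]
    set pattern {^([[:alpha:]][:[:alnum:]]*)-([[:digit:]].*)\.tm$}
    if {![regexp -- $pattern $tail -> base version]} {
        error "not a module file: $mf"
    }
    # Directory separators become '::' in the package name.
    set name [join [lreplace $rel end end $base] ::]
    return [list $name $version]
}

# Register the generated "index script" for a made-up module file.
set mf /opt/tm/encoding/base64-2.3.1.tm
lassign [moduleIdentity /opt/tm $mf] pname pversion
package ifneeded $pname $pversion [list source $mf]
puts "$pname $pversion -> [package ifneeded $pname $pversion]"
```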
/Note/ that we are here creating a connection between package names and
paths. Tcl is case-sensitive when it comes to comparing package names,
but there are filesystems which are not, like NTFS. Luckily these
filesystems do store the case of the name, despite not using the
information when comparing.
Given the above we allow the names for packages in Tcl modules to have
mixed-case, but also require that there are no collisions when
comparing names in a case-insensitive manner. In other words, if a
package 'Foo' is deployed in the form of a Tcl Module, packages like
'foo', 'fOo', etc. are not allowed anymore.
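A repository or installer could enforce this rule with a check along the following lines. This is an illustrative sketch; the helper name caseCollisions is hypothetical.

```tcl
# Hypothetical helper: report pairs of package names which differ only
# in the case of their letters.
proc caseCollisions {names} {
    array set seen {}
    set clashes {}
    foreach n $names {
        set key [string tolower $n]
        if {[info exists seen($key)] && $seen($key) ne $n} {
            lappend clashes [list $seen($key) $n]
        } else {
            set seen($key) $n
        }
    }
    return $clashes
}

puts [caseCollisions {Foo bar foo}]   ;# {Foo foo}
```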
Regular packages have no problem with the names of their files, as their
entry point has a standard name ("/pkgIndex.tcl/") and its contents
can be adjusted according to the filesystem they are stored in.
API TO "TCL MODULES"
----------------------
"Tcl Modules" is implemented in Tcl, as a new handler command for
*package unknown*. This command calls the previously installed handler
when its own search fails, thereby ensuring proper fall-back to the
regular package search.
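The chaining of handlers can be sketched as follows. The namespace ::demo and the placeholder variable satisfied are assumptions for this example; the actual module search is omitted.

```tcl
namespace eval ::demo {}

# Hypothetical handler. $previous is the handler that was installed before,
# $name the requested package, $args the version requirements.
proc ::demo::unknown {previous name args} {
    # The module search would go here. This sketch pretends it found nothing.
    set satisfied 0
    if {$satisfied} {
        return
    }
    # Fall back to the previously installed handler, if there was one.
    if {$previous ne ""} {
        uplevel 1 $previous [linsert $args 0 $name]
    }
    return
}

# Install the new handler and remember the one it replaces.
package unknown [list ::demo::unknown [package unknown]]
```

Because the previous handler is passed along as an argument, the regular package search keeps working for everything that is not found as a module.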
All code and data structures implementing "Tcl Modules" reside in the
namespace "/::tcl::tm/".
A namespace variable holds the list of paths to search for modules, but
is not officially exported. All access to this variable is done through
the following public commands:
* *::tcl::tm::path add* /PATH/
The path is added at the head of the list of module paths.
The command enforces the restriction that no path may be an
ancestor directory of any other path on the list. If the new path
violates this restriction an error will be raised.
If the path is already present as is no error will be raised and
no action will be taken.
Paths are searched in the order of their appearance in the list.
As they are added to the front of the list they are searched in
reverse order of addition. In other words, the paths added last
are looked at first.
* *::tcl::tm::path remove* /PATH/
Removes the path from the list of module paths. The command is
silently ignored if the path is not on the list.
* *::tcl::tm::path list*
Returns a list containing all registered module paths, in the
order that they are searched for modules.
We do /not/ provide APIs for rescanning directories, clearing internal
state and such. The official interface to this functionality is
"package forget" and special interfaces are neither required nor
desirable.
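A short session with these commands might look like the following. It assumes an interpreter which provides the API as specified here; the directory tm-demo-modules is made up and does not need to exist.

```tcl
# Hypothetical module directory below the current working directory.
set dir [file join [pwd] tm-demo-modules]

::tcl::tm::path add $dir
# Paths added last are searched first, so $dir is now at the head.
puts [lindex [::tcl::tm::path list] 0]

# A sub directory of a registered path violates the ancestor restriction.
set rejected [catch {::tcl::tm::path add [file join $dir sub]} msg]
puts "rejected: $rejected ($msg)"

::tcl::tm::path remove $dir
::tcl::tm::path remove $dir   ;# no longer on the list: silently ignored
```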
DISCUSSION
============
RESTRICTION TO "SOURCE"
-------------------------
This has already been discussed in the specification above.
For more discussion I again refer to [TIP #190] "Implementation Choices
for Tcl Modules" which explains the various implementation choices in
much more detail.
PRECONDITIONS
---------------
It has already been mentioned in section 'Background and Motivation'
that preconditions in "index scripts" are lost, one of the penalties of
the simplified scheme specified here.
Their existence was most important to installations with multiple
versions of Tcl coexisting with each other as they could share the
directory hierarchy containing packages between the various Tcl cores.
This is not possible anymore, at least not in a simple manner.
For the majority of installations however, i.e. those with only one
version of Tcl installed, or controlled environments like the inside of
starkits and starpacks, this loss is irrelevant and of no consequence.
For more discussion please see [TIP #191] "Managing Tcl Packages and
Modules in a Multi-Version Environment" which explains the various
choices a sysadmin has in much more detail.
PACKAGE METADATA
------------------
An area possibly made harder by Tcl Modules is the storage and query of
package metadata. [TIP #59] was one way of handling such information,
by storing it in the binary library of those packages which have one.
Another approach was to store them in the package index script, using a
hypothetical *package about* command.
The latter approach has the definite advantage that it was possible to
query the database of metadata for a particular package without having
to actually load said package, as a load may fail if the Tcl shell used
to query the database does not fulfil the preconditions for that
package.
Both approaches listed above assume that it makes sense to query the
database of metadata for all installed packages from a plain Tcl shell.
In other words, to use the standard Tcl shell also as the tool to
directly manage an installation.
It is possible to extend the proposal made in this document to handle
metadata as well. We already reserved the namespace *::tcl::tm* for use
by us, so it is no big problem to extend the public API with commands
to locate all installed packages, their metadata, and to perform
queries based on this. This will require an additional specification
how metadata is stored in/by Tcl Modules, and it will have to be
understood that these extended management operations can take
considerably more time than a *package require*, as they will have to
scan all defined search paths and all their sub directories for Tcl
Modules, and have to extract the metadata itself as well.
DEPLOYMENT
------------
The fact that a Tcl Module consists only of a single file makes its
deployment quite easy. We only have to ensure correct placement in one
of the searched directories when installing it locally, but nothing
more.
Regarding the usage of Tcl Modules in a wrapped application please see
[TIP #190] "Implementation Choices for Tcl Modules". This is highly
dependent on the implementation chosen for a specific Tcl Module and
thus not discussed here, but in the referred document.
PACKAGE REPOSITORIES
----------------------
At a very basic level, the physical storage, any directory tree
containing properly placed files for a number of modules can serve as a
package repository for the modules in it. In other words, from that
point of view an installation is virtually indistinguishable from a
repository, and their creation and maintenance is very easy.
Note however that the higher levels of a repository, like indexing
package metadata in general, or dependency tracking in particular,
licensing, documentation, etc. are not addressed by this document.
They require standards for the format and content of package metadata,
which this document does not deal with.
DEFAULTS
----------
The default list of paths on the module path is computed by a tclsh as
follows, where /X/ is the major version of the Tcl interpreter and /y/
is less than or equal to the minor version of the Tcl interpreter.
* System specific paths
* *file normalize* [*info library*]/../tcl/X///X/./y/
In other words, the interpreter will look into a directory
specified by its major version and whose minor versions
are less than or equal to the minor version of the
interpreter.
Example: For Tcl 8.4 the paths searched are
* [*info library*]/../tcl8/8.4
* [*info library*]/../tcl8/8.3
* [*info library*]/../tcl8/8.2
* [*info library*]/../tcl8/8.1
* [*info library*]/../tcl8/8.0
This definition assumes that a package defined for Tcl
/X.y/ can also be used by all interpreters which have the
same major number /X/ and a minor number greater than /y/.
* *file normalize* /EXEC//tcl/X///X/./y/
Where /EXEC/ is [*file normalize* [*info
nameofexecutable*]/../lib] or [*file normalize*
[*::tcl::pkgconfig get* libdir,runtime]]
This set of paths is handled equivalently to the set
coming before, except that it is anchored in
/EXEC_PREFIX/. For a build with /PREFIX/ = /EXEC_PREFIX/
the two sets are identical.
* Site specific paths.
* *file normalize* [*info library*]/../tcl/X//site-tcl
* User specific paths.
* *$::env*(TCL/X/./y/_TM_PATH)
A list of paths, separated by either *:* (Unix) or *;*
(Windows). This is user and site specific as this
environment variable can be set not only by the user's profile,
but by system configuration scripts as well.
These paths are seen and therefore shared by all Tcl
shells in the *$::env*(PATH) of the user.
Note that /X/ and /y/ follow the general rules set out
above. In other words, Tcl 8.4 for example will look at
these 5 environment variables
* *$::env*(TCL8.4_TM_PATH)
* *$::env*(TCL8.3_TM_PATH)
* *$::env*(TCL8.2_TM_PATH)
* *$::env*(TCL8.1_TM_PATH)
* *$::env*(TCL8.0_TM_PATH)
/All/ the default paths are added to the module path, even those paths
which do not exist. Non-existent paths are filtered out during actual
searches. This enables a user to create one of the paths searched when
needed and all running applications will automatically pick up any
modules placed in them.
The paths are added in the order in which they are listed above, and for
lists of paths defined by an environment variable in the order they are
found in the variable.
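The version-dependent parts of the defaults can be enumerated with a few lines of Tcl. The helper names versionDirs, tmEnvNames and envPaths are hypothetical, and /usr/lib merely stands in for the actual anchor directory. The lists are produced in the order used by the examples above.

```tcl
# Hypothetical helper: directories tclX/X.y below $base for y = minor ... 0.
proc versionDirs {base major minor} {
    set dirs {}
    for {set y $minor} {$y >= 0} {incr y -1} {
        lappend dirs [file join $base tcl$major $major.$y]
    }
    return $dirs
}

# Hypothetical helper: names of the environment variables consulted.
proc tmEnvNames {major minor} {
    set names {}
    for {set y $minor} {$y >= 0} {incr y -1} {
        lappend names TCL${major}.${y}_TM_PATH
    }
    return $names
}

# Hypothetical helper: split the value of one such variable into paths.
proc envPaths {varname} {
    if {![info exists ::env($varname)]} {
        return {}
    }
    set sep [expr {$::tcl_platform(platform) eq "windows" ? ";" : ":"}]
    return [split $::env($varname) $sep]
}

puts [join [versionDirs /usr/lib 8 4] \n]
puts [tmEnvNames 8 4]
```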
INSTALLATION
--------------
The installation of a Tcl module for a particular interpreter is
basically done like this:
#! /path/to/chosen/tclsh
# First argument is the name of the module.
# Second argument is the base filename
set mpaths [::tcl::tm::path list]
# ... remove all paths the user has no write permissions for.
# ... throw an error if there are no paths left.
# ... provide the user with some UI if more than one path is left
# ... so that she can select the path to use.
set selmpath [ui_select $mpaths]
# Directory the module goes into; it has to exist before the copy.
set target [file join $selmpath \
    [file dirname [string map {:: /} [lindex $argv 0]]]]
file mkdir $target
file copy [lindex $argv 1] $target
GLOSSARY
==========
The following terms and definitions are used throughout the document:
* /index script/
A script used to index a package, or not. Usually contained in a
file named "/pkgIndex.tcl/". Can check preconditions for a
package and contains package specific code for setting up the
package specific /provide script/.
* /provide script/
This is a package specific script and tells Tcl exactly how to
import it. In the existing package system it is generated and
registered by the /index script/. Tcl Modules on the other hand
generates it based on information gleaned from filenames.
REFERENCE IMPLEMENTATION
==========================
A reference implementation is available in Patch 942881
[<URL:http://sf.net/tracker/?func=detail&aid=942881&group_id=10894&atid=310894>]
QUESTIONS
===========
COMMENTS
==========
[ Add comments on the document here ]
COPYRIGHT
===========
This document has been placed in the public domain.
-------------------------------------------------------------------------
TIP AutoGenerator - written by Donal K. Fellows