Re: associative array, container, mapping, terminology



phil-news-nospam@xxxxxxxx writes:

On 15 Feb 2009 19:31:09 -0800 Tim Rentsch <txr@xxxxxxxxxxxxxxxxxxx> wrote:

| The specific words used vary from language to language, but the
| conceptual difference is language independent.

I'm hoping to find wording as close to language independence as possible.
But if that is not possible, I guess I will fall back to C terms.


|> I don't know what you mean by values (of the same type) and pointers.
|> If what I get when doing a lookup in a association/containter/dictionary
|> is a pointer, I see that as a kind of type, but also specific to what it
|> points to. E.g. a pointer to one kind of struct and a pointer to a
|> different kind of struct I consider to really be two different types.
|> In C, such pointers would be defined as such (struct a *x; struct b *y;).
|
| The struct used to hold the pointer to the "value" portion of the
| association has a fixed type (in the C sense of the word type).
| At run-time, a value pointer would point to different kinds of objects,
| but which kind isn't known at compile time, only at run-time.
| The "kind" of an object is referred to its /class/, or sometimes
| as its /run-time type/, but which term is used isn't so important
| as the property that, whatever we call this aspect, it's known
| only at run-time, not at compile time.

With homogenous keys and heterogenous data, my structs have the key appended
to the struct used as the node in the tree used to implement the mapping.
The data type is in there. Certain data types are in there and other data
types are pointers.

I'm looking at having a mapping that is homogenous keys and homogenous data,
with the both key and data in the node. This will depend on fixed size data.
When the data is always a fixed size struct, this is practical. I eliminates
some steps in working with the data.


| Using just slightly different terminology, your storage function will
| store different kinds of object values, and it remembers what kind of
| object value it stored. What kind of value is stored for a particular
| key (or entry?) is known only at run-time, not at compile-time. The
| differentation that occurs is a run-time differentiation, not a
| compile-time one.

Right. But I am including compile-time differentiation for new features.
In this, a programmer would provide definitions of structs to be used as
the key and data (one for key, another one for data). This would create
a custom API as named by the programmer, based on a common API template.


|> This is a run time distinction of type in the sense that any entry could have
|> any type decided at run time. There is a compile time aspect to it because
|> there are different functions to deal with different types. It's not a case
|> of all types being stored or fetched as data are using the same function across
|> all these possible types. The functions are thus declared with specific types
|> in the argument list, or as return values, or as pointers in the argument list
|> where the values are to be stored or fetched.
|
| So there are compile-time type-specific functions to store different
| kinds of objects (so the functions have static typing), but the
| data structure itself allows storing different kinds of objects,
| with what kind of object will be stored being a run-time parameter,
| not a compile-time constraint -- right?

Right. There are different insert functions for each of the different key
types that could be inserted. There are different search functions for each
of the key types that could be searched for. There are different data store
functions for each of the different data types to be stored. And so on for
data to be fetched. One must use the correct functions or it is an error.
What key type a mapping has, and what data type the data associated with a
key has, is decided at run time.

I'm wanting to make a tool to let programmers make these decisions at compile
time as an alternative.


|> But the point here is one instance of a mapping can have only the same type of
|> key, but can have different types of data. One ENTRY in a mapping can only
|> have one type of value.
|
| I take this to mean an individual entry can remember only one kind of object
| value at any one time. At some later point in time, that entry might
| remember an object value of a different kind, yes?

Yes.

However, it is possible with my design to have multiple entries that have the
same key value.

And, of course, the entry could also be deleted.


|> I'm studing expanding this library so that it will not only support a wider
|> range of types, focusing on programmer defined structs, but also have the
|> basic variations of allowing mappings to either have a homogenous key type
|> or have a heterogenous key type (allowing more than one key type in the same
|> mapping). I also want to add homogenous data types (the implementation can
|> be slightly optimized for this case). This means I can have 4 different basic
|> varations of mappings, and within thise that have homogenous key and/or data,
|> there would be sub-variations based on what specific key and/or data type is
|> chosen.
|
| Suppose we decided to represent keys thusly:
|
| struct abstract_key_s;
| typedef struct abstract_key_s *Key;
| struct abstract_key_s {
| unsigned long key_kind; /* for key object's internal use */
| unsigned long (*hash)( Key ); /* hash value function for this key */
| int (*compare( Key, Key )); /* determine ordering of two keys */
| };
|
| This representation allows the association data structure code
| to use a single type for keys, but accommodates keys of many
| different kinds ("heterogeneous" keys). However, at this level,
| these abstract keys accommodate all the flexibility we need,
| and the abstract keys have just one type (in the sense that
| the calling code for the association code never needs to do
| any type conversion, just call functions). What's important
| about this is that, although it is heterogeneous at a lower
| level, at the level of the association data structure itself,
| keys are homogeneous. If you're intending to talk about
| the data structure code, the higher level perspective (that
| is, homogeneous) is important; whereas, if you're intending
| to talk about what sorts of things are held, the lower level
| perspective (that is, heterogeneous) is important. And if
| it's important to talk about both levels, then keys will be
| homogeneous in one part of the discussion and heterogeneous
| in another. So an essential question is, at what level
| will the presentation occur, because different levels
| result in different descriptions.

This is a good point to consider. What I have now applies to the data value.
The data value types can vary (within a finite set that is implemented).
But the caller has to use the correct data access functions for the data
type involved. So in that sense the heterogeneous aspect of data is exposed
to the programmer.

And I would rather leave it that way. I want to have the application compile
time type checking to take place, or at least be available to the programmer.


|> What I am basically looking for HERE in this newsgroup is terminology.
|> Are there different terms I can use for a mapping or association or
|> container or dictionary that has:
|>
|> 1. Homogenous keys and homogenous data
|> 2. Homogenous keys and heterogenous data (what I have now)
|> 3. Heterogenous keys and homogenous data
|> 4. Heterogenous keys and heterogenous data
|>
|> ... without having to spell out these specific combinations.
|
| Yes, I understood that the question you were asking is a question
| about terminology. Before trying to give an answer I wanted to make
| sure it was understood what question is being asked. Values that
| are homogeneous at one level can be heterogeneous at another level
| (or the other way around); so in asking the question, it's important
| not only to describe the situation, but also to say at what level(s)
| you want to talk about the values (and/or keys) involved.

It is the case that my real (original) question is not one about what to do
or how to do it. It is about the terminology, so I can figure out how to
best give a name to the variations on what I am doing.

I already have a "type 2 library" (from the given list). I want to also make
one with type 1 and also one with type 4. I don't know if type 3 has much
value. I want to give these library variations similar but slightly different
names. The basic method names would be similar (insert, delete, find, store,
fetch). My existing library is at http://libh.slashusr.org/

Okay, that's all pretty much along the lines of what I expected.

The one strong recommendation I would make is that whatever
terminology you use, it should clarify the type/class distinction.
For example, if you expressed this in C++ terms, the instance variable
for values would probably be of fixed /type/, but would hold (in
the heterogeneous case) instances of different /classes/. Also that
would make clear that (or whether) there is only one kind of Dictionary
defined, but different instances of Dictionary have homogeneous keys
(all of the same class) or heterogeneous keys (potentially of different
classes), or values, or both. I think everyone will understand the
terms homogeneous and heterogeneous in such a description, but the
question is, exactly what is it that's the same or different?
Make sure the adjectives modify the right nouns, and these adjectives
will do just fine. Make sense?
.



Relevant Pages

  • [PATCH 13/24] CRED: Separate per-task-group keyrings from signal_struct [ver #7]
    ... from the cred struct rather than the signal_struct. ... #ifdef CONFIG_SECURITY ... * copy the keys in a thread group for fork without CLONE_THREAD ...
    (Linux-Kernel)
  • Re: associative array, container, mapping, terminology
    ... | struct key_and_value { ... I wrote a mapping library in C. ... of storing a value into an entry. ... Suppose we decided to represent keys thusly: ...
    (comp.programming)
  • VC6 -> VC2005 conversion problem
    ... struct SHomog ... static void CrossProduct(const SPoint& p, const SPoint& q, SHomog& r) ... returns number of keys ... enabled, disable normalization ...
    (microsoft.public.vc.mfc)
  • Re: USB keyboard problem
    ... The keyboard_t struct kbd is passed to the ukbd_interrupt ... ukbd driver but within the mechanism that fills in ks_ndata. ... Once this has been done, a majority of the keys (a-z, 1-0 ... have found that when you hit any of the modifier keys, ...
    (freebsd-hackers)
  • Re: associative array, container, mapping, terminology
    ... |> is a pointer, I see that as a kind of type, but also specific to what it ... |> different kind of struct I consider to really be two different types. ... | but which kind isn't known at compile time, ... I'm looking at having a mapping that is homogenous keys and homogenous data, ...
    (comp.programming)