Re: adjustable array vs hash-table



On Jan 30, 2:09 pm, Slobodan Blazeski <slobodan.blaze...@xxxxxxxxx>
wrote:
I could use a hash-table or an adjustable array. Considering the keys
of the hash-table are equal to indices in the array 0.1.2.3.4...
functionality is the same. So which one is more efficient at
a. Expanding when new elements are added when container can't hold
them
b. Accessing elements via index / key.

---------------------------------------------
Allow me to rant.

First of all, in high-level langs there should be no such thingy/data-
type as hash-table. Since, as you observed, functionally they are just
array/list/sequence/aggregate (or whichma*** computer “scientists”
wants to call them).

As you can see, in Mathematica, there is no hash table. There is just
list that is used by what other lang might call vector, sequence,
array, n-tuple, map, set, key'd list, hash, dictionary, cons.

Further, in another recent high-level lang, it has combined array and
hash into one entity, and as the only entity of the sequence kinda
thing. That lang is PHP. For example, in PHP, you create such a linear
sequence like this:

arrays("a","b","c")

where the lang actually created a hash, with keys being 0,1,2. The
above is syntactically equivalent to

arrays(0=>"a",1=>"b",2=>"c")

But what if you want a hash with different keys? just code it so:

arrays("some"=>"a","key2"=>"b","anotherkey"=>"c")

The merging of hash and array into one single language entity has some
peculiar effects. For example, if you have list, and you remove a item
in the mid, you actually end up with a hash with keys that jumps a
integer. So, you no longer go thru the list by incrementing a index
such as i++. So, you effective need to “reindex” it if you intent to
use a for loop for it. (in PHP, there are other lang things that
address this of course. But if you really just want to re-index it,
then just copy it to create another array.)

May this merging at the lang level be good or bad, the important thing
to remember is that all these vector, sequence, array, n-tuple, map,
set, key'd list, hash, dictionary... are mathematically just the same
in the context of programing lang. Their difference is merely how
programers want to use them, and their time-complexity properties on
how they are used. The latter, is due to hard-ware implementation/
pragmatics/speed issues.

In a lang that just have list, you can implement any of these “behind-
the-scenes” stuff on top. For example:

• vector ⇒ never add or delete a element.

• sequence ⇒ just create a list

• array ⇒ depending on what lang you using, this is the same as
sequence, list, or vector.

• n-tuple ⇒ like vector, just create your list with n element and
don't insert or delete elements.

• map ⇒ depending on what lang, this may be just a list, or hash
table.

• hashtable ⇒ a list with element being lists, each contains 2
elements.

• key'd list ⇒ hashtable.

• dictionary ⇒ hashtable.

• set ⇒ a list. Don't ever add to it a item that's already there.

• cons ⇒ a list of 2 elements, nest them into a binary tree, with the
last element of the last node being a special entity (i.e. nil).

-----------------------------

As time matches on, computing hardware resource grows exponentially,
all langs are getting rid of all these special behind-the-scenes
“types” that are necessacitated by hardware/pragmatics/speed issues.

For example, in C, you have none of these. You implement them
yourself. This is about 40 years ago. In Lisp, you have just
essentially just cons and hashtable, with few language facilities on
top that make these easier to use (yet one big stupidity lisp made is
to let the cons surface to the top, and with things like caard,
cddaar). In java, you have almost all of these built-in, together with
the concept of _Java_'s Interfaces, that helps programer see them as
one single entity. In perl, python, you essentially just have a list
and hash table (hashtable is called dictionary in Python). In PHP, you
just have list, but which functions both as sequence and keyd list (it
is called array). In Javascript, only essentially just have the list
(may be off a bit... too lazy to lookup).

The important thing to note here is that, in designing a high-level
comp lang, it is important to seprate, what is significant in the
context of a human-machine protocol for specifying computatiton that
are mathematically meaningful, and separate out issues that are
related to hardware/pragmatics/speed.

In Mathematica, the lang is so high-level, but the price is that the
programs are in general slow, and the lang, as of now, cannot be used
for tasks that needs to access hardware directly. (such as writing a
compiler, OS, device driver)

Similar thing can be said for any other high-level lang, esp PHP,
Javascript. (and to a lesser degree Perl, Python) (that's why these
high-level lang are sometimes called scripting langs)

In my opinion, a solution for a general lang that's high level, is to
have declarations entitites in the lang specifically for the hardware/
pragmatic/speed issues.

This idea isn't new. I think Haskell, Dylan, both have some such
philosophy. (though i don't know these lang so can't be sure exactly
what)

Recently i note in that in Paul Graham's new lang arc, pre-released
today , the cons business lives! What a fucking moronicity.

-------------------------
See also:

• Jargons And High Level Languages
http://xahlee.org/jargons_high_level_lang.html

• Lisp's List Problem
http://xahlee.org/emacs/lisp_list_problem.html

• What is Expressiveness in a Computer Language
http://xahlee.org/perl-python/what_is_expresiveness.html

• The Hack of bitmask used as Boolean Parameters
http://xahlee.org/UnixResource_dir/writ/bitmask.html

Xah
xah@xxxxxxxxxx
http://xahlee.org/



.


Quantcast