jshashtable 2.1
Contents
Introduction
jshashtable is a JavaScript implementation of a hash table. It associates objects ("keys") with other objects ("values"). Each key is associated with precisely one value. "Objects" here is used loosely to mean any JavaScript object or value.
Version 2.0 of jshashtable features significant performance
improvements over version 1.0 (typically 200% faster for put
and get
operations) and adds a few new methods: clone()
, putAll()
,
entries()
and each()
. Version 2.1 is minor upgrade that allows use of ActiveX
objects as keys in IE.
Set-up
-
Download the code
Download jshashtable. You can download a compressed or uncompressed version of
jshashtable.js
which are functionally identical or a zip containing this document and both JavaScript files. -
Include jshashtable in your page
Include
jshashtable.js
in a script tag in your page. This file creates one object in the global scope, which isHashtable
. -
Create your hash table
Create your hash table, as in the example below. Any non-null, non-undefined JavaScript value can be used as a key or a value.
<script type="text/javascript" src="jshashtable.js"></script> <script type="text/javascript"> var typesHash = new Hashtable(); typesHash.put("A string", "string"); typesHash.put(1, "number"); var o = {}; typesHash.put(o, "object"); alert( typesHash.get(o) ); // "object" </script>
Usage
The code is contained in one JavaScript file, which creates a single constructor function called
Hashtable
in the global scope.
Doesn't JavaScript already do this?
No. Although a JavaScript object can be used as a hash, there are several limitations that make using a JavaScript object unsuitable for a generic hash. One limitation is that only strings and numbers tend to make useful keys.
For example, the following is simple and works:
var key = "A key"; var o = {}; o[key] = 1; alert( o[key] ); // Alerts 1
However, it is often desirable to use other kinds of objects as keys. JavaScript doesn't complain if you try to use values other than strings and numbers inside square brackets; indeed, it superficially looks like this kind of association works for any object:
var key = {}; var o = {}; o[key] = "First"; alert( o[key] ); // Alerts "First"
It doesn't take much effort to discover the flaw in this approach:
var key1 = {}; var key2 = {}; var o = {}; o[key1] = "First"; o[key2] = "Second"; alert( o[key1] ); // Alerts "Second", not "First"
The reason for this is that in JavaScript, all object property names are
strings, and any non-string value placed inside a square bracket property accessor is
converted into a string. In the example above, key1
and key2
are both
converted to "[object Object]", hence the second assignment simply replaces the original value
of the "[object Object]" property.
With jshashtable, any JavaScript value apart from null
and undefined
can be used as a key:
var key1 = {}; var key2 = {}; var h = new Hashtable(); h.put(key1, "First"); h.put(key2, "Second"); alert( h.get(key1) ); // Alerts "First" alert( h.get(key2) ); // Alerts "Second"
The above example shows use of put()
to add a key/value pair to
the hash table and get()
to retrieve a value. If a value already
exists in the hash table for the key used, it is replaced with the new value.
Another limitation of using a JavaScript object as a hash is the awkwardness of enumerating its
members. Some libraries add properties to Object.prototype
and these properties
are then enumerated for every object (unless the environment is ECMAScript 5 compliant and the
property has been set to be non-enumerable). This means property enumeration looks like this:
var key = "A key"; var o = {}; o[key] = 1; for (var i in o) { if (o.hasOwnProperty(i)) { alert(i + "=>" + o[i]); } }
Equality
This is all very well but there is a drawback. Imagine you have a hash table of colour values at
various different positions on the screen. Each position on the screen is represented by a
Point
object, which has properties x
and y
, representing
the x and y coordinates of the point on the screen.
function Point(x, y) { this.x = x; this.y = y; } var coloursForPoints = new Hashtable(); function getColourAt(x, y) { var point = new Point(x, y); return coloursForPoints.get(point); } coloursForPoints.put( new Point(1, 2), "green" ); alert( getColourAt(1, 2) ); // Alerts null
Why do we get null
? Because the Point
object that gets created in
getColourAt
is not the self same object as the original Point
that got
used as key in the coloursForPoints
hash table. This is clearly not the ideal
behaviour - any two Point
objects with the same x and y values are to all intents
and purposes the same thing. What we need is a way of defining when two objects are equal. By
default, jshashtable uses the strict equality (===
)
operator in JavaScript. However, if one of the objects being compared has an
equals()
method, it will use that instead. We can implement an
equals()
method in the above example to get the behaviour we want:
function Point(x, y) { this.x = x; this.y = y; } Point.prototype.equals = function(obj) { return (obj instanceof Point) && (obj.x === this.x) && (obj.y === this.y); }; var coloursForPoints = new Hashtable(); function getColourAt(x, y) { var point = new Point(x, y); return coloursForPoints.get(point); } coloursForPoints.put( new Point(1, 2), "green" ); alert( getColourAt(1, 2) ); // Alerts "green"
Hash codes
This works but is still not quite ideal. Internally, jshashtable stores
key/values pairs in arrays, called buckets. When put()
is called, jshashtable converts the key into a hash code, and
stores the key/value pair in the bucket for that particular hash code. A hash code in
jshashtable is a string so that the buckets themselves can be
associated with hash codes using an object and JavaScript's built-in string property names. When
get()
is called, jshashtable finds
the correct bucket for the key it's looking for and then searches the contents of that bucket
for that key. Since the process of locating the correct bucket is massively faster than
searching through a bucket's contents, it is most efficient to have as many buckets as possible
containing the least possible number of items (ideally one). My tests have shown that for a hash
table with 1000 elements, it is around 70 times faster to replace or retrieve a
value if each key has a unique hash code than if each key had the same hash code (tested using
objects with a very simple hashCode()
method). For 10000 elements, it's closer to
300 times faster. So, generating meaningful hash codes for keys makes the hash table much more
efficient.
jshashtable generates a hash code for an object by checking to see if
it has a hashCode()
method and using that if it exists. Otherwise it calls
toString()
on the object, like JavaScript's built-in square bracket property
accessor does. In the above example, Point
has no hashCode()
method,
so every point placed in coloursForPoints
goes into the "[object Object]" bucket,
which is very inefficient, particularly with a large number of points. What would be better
would be to implement a hashCode()
method on Point
that returns a
different value for every distinct point but the same value for equal points (this is very
important: two objects that are considered equal according to their equals()
method
must return the same hash code). Something like the following would be good:
Point.prototype.hashCode = function(obj) { return "Point:" + this.x + "," + this.y; };
So for our example point, this returns "Point:1,2", for any Point
object with x
and y coordinates of 1 and 2 respectively. Every point will therefore go in its own bucket and
the hash table is efficient.
The "Point" at the start of the hash code is there to distinguish points from any other similar object that may have x and y coordinates and potentially have the same hash code. Note that even if a non-Point object did return a hash code of "Point:1,2", the hash table would still work fine, since the non-Point object would fail the equality test when searching through the "Point:1,2" bucket.
If you don't want to violate the keys of your hash table by giving them extra methods like
equals()
and hashCode()
, you can instead pass in functions into the
Hashtable
constructor to generate hash codes and test key equality. The hash code
generator function is passed an object and should return a hash code for that object, while the
equality function is passed two objects and should return a boolean value representing whether
the two objects are equal. This also has the advantage of improving performance, since the hash
table does not have to check for the existence of equals()
and
hashCode()
methods on every key object.
For a hash table where you knew in advance that all the keys would be Point
objects, the previous example could be rewritten to be more efficient as follows:
function Point(x, y) { this.x = x; this.y = y; } function hashPoint(p) { return "Point:" + p.x + "," + p.y; } function pointsEqual(p1, p2) { return p1.x === p2.x && p1.y === p2.y; } var coloursForPoints = new Hashtable(hashPoint, pointsEqual); function getColourAt(x, y) { var point = new Point(x, y); return coloursForPoints.get(point); } coloursForPoints.put( new Point(1, 2), "green" ); alert( getColourAt(1, 2) ); // Alerts green
Public API
Constructors
-
Hashtable()
Creates a new, empty hash table.
-
Hashtable(Function hashingFunction, Function equalityFunction)
Creates a new, empty hash table with the supplied hashing function and equality function.
Parameters:-
hashingFunction
A function that provides hash codes for keys placed in the hash table. It is passed the object to be hashed as its only parameter. If not provided, the hash table checks whether the object has ahashCode()
method, and if not, callstoString()
on the object. -
equalityFunction
A function that checks for equality between two keys with the same hash code. Two keys that are considered equal will map to the same value in the hash table. This function is passed the two objects to be compared as its parameters. If not provided, the hash table checks whether either object being compared has anequals()
method, and if not, compares the objects using the===
operator.
-
Methods
-
mixed put(mixed key, mixed value)
Updated in version 2.0: now returns previous value associated with the key
Sets the value associated with the key supplied. If the hash table already contains the key then the old value is overwritten and the old value is returned, otherwise
null
is returned. -
void putAll(Hashtable hashtable[, Function conflictCallback])
New in version 2.0
Adds all entries from the supplied hash table to this hash table. For any key in the supplied hash table for which an entry already exists in this hash table, the optional callback function
conflictCallback
is called to resolve the conflict. This function should accept three parameters:key
: the key for the conflicting entry;thisValue
: the current value for this key in the current hash table;value
: the value for this key in the hash table supplied.
The value returned by the callback function will be used as the value for the new entry in the current hash table. If no callback function is supplied, the existing value in the current hash table will be overwritten by the value in the hash table supplied.
-
mixed get(mixed key)
Returns the value associated with the key supplied, or
null
if no value is found for that key. -
Boolean containsKey(mixed key)
Returns whether the hash table contains the specified key.
-
Boolean containsValue(mixed value)
Returns whether the hash table contains the specified value.
-
void clear()
Removes all entries from the hash table.
-
Boolean isEmpty()
Returns true if the hash table contains no key/value pairs.
-
Array keys()
Returns an array containing all the keys contained in the hash table.
-
Array values()
Returns an array containing all the values contained in the hash table.
-
Array entries()
New in version 2.0
Returns an array containing all the entries contained in the hash table. Each entry is a two element array containing the key and value respectively for that entry.
-
mixed remove(mixed key)
Updated in version 2.0: now returns the value associated with the removed key
Removes the key and its corresponding value from the hash table.
-
Number size()
Returns the number of key/value pairs contained in the hash table.
-
Hashtable clone()
New in version 2.0
Creates and returns a shallow copy of the hash table. If hashing and equality functions were provided to the hash table when it was constructed, they are passed into the new hash table.
-
void each(Function callback)
New in version 2.0
Iterates over all the entries in the hash table, calling the
callback
function for each entry. This function is passed two parameters:key
: the key for the current entry;value
: the value for the current entry.