jshashtable 2.1

Introduction

jshashtable is a JavaScript implementation of a hash table. It associates objects ("keys") with other objects ("values"). Each key is associated with precisely one value. "Objects" here is used loosely to mean any JavaScript object or value.

Version 2.0 of jshashtable features significant performance improvements over version 1.0 (typically 200% faster for put and get operations) and adds a few new methods: clone(), putAll(), entries() and each(). Version 2.1 is a minor upgrade that allows the use of ActiveX objects as keys in IE.

Set-up

  1. Download the code

    Download jshashtable. You can download either a compressed or an uncompressed version of jshashtable.js (the two are functionally identical), or a zip file containing this document and both JavaScript files.

  2. Include jshashtable in your page

    Include jshashtable.js in a script tag in your page. This file creates one object in the global scope, which is Hashtable.

  3. Create your hash table

    Create your hash table, as in the example below. Any non-null, non-undefined JavaScript value can be used as a key or a value.

    <script type="text/javascript" src="jshashtable.js"></script>
    <script type="text/javascript">
        var typesHash = new Hashtable();
    
        typesHash.put("A string", "string");
        typesHash.put(1, "number");
    
        var o = {};
        typesHash.put(o, "object");
    
        alert( typesHash.get(o) ); // "object"
    </script>
    

Usage

The code is contained in one JavaScript file, which creates a single constructor function called Hashtable in the global scope.

Doesn't JavaScript already do this?

No. Although a JavaScript object can be used as a hash, there are several limitations that make using a JavaScript object unsuitable for a generic hash. One limitation is that only strings and numbers tend to make useful keys.

For example, the following is simple and works:

var key = "A key";
var o = {};
o[key] = 1;
alert( o[key] ); // Alerts 1

However, it is often desirable to use other kinds of objects as keys. JavaScript doesn't complain if you try to use values other than strings and numbers inside square brackets; indeed, it superficially looks like this kind of association works for any object:

var key = {};
var o = {};
o[key] = "First";
alert( o[key] ); // Alerts "First"

It doesn't take much effort to discover the flaw in this approach:

var key1 = {};
var key2 = {};
var o = {};
o[key1] = "First";
o[key2] = "Second";
alert( o[key1] ); // Alerts "Second", not "First"

The reason for this is that in JavaScript, all object property names are strings, and any non-string value placed inside a square bracket property accessor is converted into a string. In the example above, key1 and key2 are both converted to "[object Object]", hence the second assignment simply replaces the original value of the "[object Object]" property.
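The string conversion can be observed directly. The following is a minimal sketch using only standard JavaScript (Object.keys requires an ECMAScript 5 environment); the variable names names and asString are just for illustration:

```javascript
var key1 = {};
var key2 = {};
var o = {};
o[key1] = "First";
o[key2] = "Second";

// Both assignments wrote to the same string-named property,
// so the object has exactly one property of its own.
var names = Object.keys(o);    // ["[object Object]"]

// This is the string that each key was converted to.
var asString = String(key1);   // "[object Object]"
```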

With jshashtable, any JavaScript value apart from null and undefined can be used as a key:

var key1 = {};
var key2 = {};
var h = new Hashtable();
h.put(key1, "First");
h.put(key2, "Second");
alert( h.get(key1) ); // Alerts "First"
alert( h.get(key2) ); // Alerts "Second"

The above example shows use of put() to add a key/value pair to the hash table and get() to retrieve a value. If a value already exists in the hash table for the key used, it is replaced with the new value.

Another limitation of using a JavaScript object as a hash is the awkwardness of enumerating its members. Some libraries add properties to Object.prototype and these properties are then enumerated for every object (unless the environment is ECMAScript 5 compliant and the property has been set to be non-enumerable). This means property enumeration looks like this:

var key = "A key";
var o = {};
o[key] = 1;

for (var i in o) {
    if (o.hasOwnProperty(i)) {
        alert(i + "=>" + o[i]);
    }
}
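To see why the hasOwnProperty() check matters, the sketch below simulates a library that extends Object.prototype. The method name describe is hypothetical and is removed again at the end so that nothing else is affected:

```javascript
// Simulate a library adding an enumerable property to Object.prototype.
// "describe" is a made-up name used only for this illustration.
Object.prototype.describe = function () { return "described"; };

var o = {};
o["A key"] = 1;

var unfiltered = [];
var filtered = [];
for (var i in o) {
    unfiltered.push(i);          // sees inherited enumerable properties too
    if (o.hasOwnProperty(i)) {
        filtered.push(i);        // only the object's own properties
    }
}

// Clean up so that no other code sees the extra property.
delete Object.prototype.describe;

// unfiltered contains both "A key" and "describe"; filtered contains only "A key".
```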

Equality

This is all very well but there is a drawback. Imagine you have a hash table of colour values at various different positions on the screen. Each position on the screen is represented by a Point object, which has properties x and y, representing the x and y coordinates of the point on the screen.

function Point(x, y) {
    this.x = x;
    this.y = y;
}

var coloursForPoints = new Hashtable();

function getColourAt(x, y) {
    var point = new Point(x, y);
    return coloursForPoints.get(point);
}

coloursForPoints.put( new Point(1, 2), "green" );

alert( getColourAt(1, 2) ); // Alerts null

Why do we get null? Because the Point object created in getColourAt is not the same object as the original Point that was used as a key in the coloursForPoints hash table. This is clearly not the ideal behaviour: any two Point objects with the same x and y values are to all intents and purposes the same thing. What we need is a way of defining when two objects are equal. By default, jshashtable uses JavaScript's strict equality (===) operator. However, if one of the objects being compared has an equals() method, it will use that instead. We can implement an equals() method in the above example to get the behaviour we want:

function Point(x, y) {
    this.x = x;
    this.y = y;
}

Point.prototype.equals = function(obj) {
    return (obj instanceof Point) &&
        (obj.x === this.x) &&
        (obj.y === this.y);
};

var coloursForPoints = new Hashtable();

function getColourAt(x, y) {
    var point = new Point(x, y);
    return coloursForPoints.get(point);
}

coloursForPoints.put( new Point(1, 2), "green" );

alert( getColourAt(1, 2) ); // Alerts "green"

Hash codes

This works but is still not quite ideal. Internally, jshashtable stores key/value pairs in arrays, called buckets. When put() is called, jshashtable converts the key into a hash code and stores the key/value pair in the bucket for that particular hash code. A hash code in jshashtable is a string, so that the buckets themselves can be associated with hash codes using an object and JavaScript's built-in string property names. When get() is called, jshashtable finds the correct bucket for the key it's looking for and then searches the contents of that bucket for that key.

Since locating the correct bucket is massively faster than searching through a bucket's contents, it is most efficient to have as many buckets as possible, each containing as few items as possible (ideally one). My tests have shown that for a hash table with 1000 elements, it is around 70 times faster to replace or retrieve a value if each key has a unique hash code than if every key has the same hash code (tested using objects with a very simple hashCode() method). For 10000 elements, it's closer to 300 times faster. So, generating meaningful hash codes for keys makes the hash table much more efficient.
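The bucket mechanism described above can be sketched in a few lines. This is a simplified illustration and not jshashtable's actual source: SketchTable, findIn and the hashOf parameter are hypothetical names, keys are compared with === only, and Object.create assumes an ECMAScript 5 environment.

```javascript
// Simplified illustration of bucket storage; NOT jshashtable's source code.
// SketchTable, findIn and hashOf are hypothetical names used only here.
function SketchTable(hashOf) {
    this.hashOf = hashOf || String;      // default: the key's string form
    this.buckets = Object.create(null);  // hash code (string) -> array of [key, value]
}

// Linear search through one bucket; returns the index of the key or -1.
SketchTable.prototype.findIn = function (bucket, key) {
    for (var i = 0; i < bucket.length; i++) {
        if (bucket[i][0] === key) { return i; }
    }
    return -1;
};

SketchTable.prototype.put = function (key, value) {
    var code = String(this.hashOf(key));
    var bucket = this.buckets[code] || (this.buckets[code] = []);
    var index = this.findIn(bucket, key);
    if (index === -1) {
        bucket.push([key, value]);   // new key: append to its bucket
    } else {
        bucket[index][1] = value;    // existing key: replace the value
    }
};

SketchTable.prototype.get = function (key) {
    var bucket = this.buckets[String(this.hashOf(key))];
    var index = bucket ? this.findIn(bucket, key) : -1;
    return index === -1 ? null : bucket[index][1];
};

var key1 = {}, key2 = {};
var table = new SketchTable();
table.put(key1, "First");
table.put(key2, "Second");

var first = table.get(key1);  // "First"
// Both keys hash to the same string, so they share one bucket of two items.
var sharedBucketSize = table.buckets["[object Object]"].length;  // 2
```

With the default hashing, every plain object lands in the same bucket, so each lookup degrades to a linear search. Passing a hashOf function that returns distinct strings for distinct keys would spread the entries across buckets.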

jshashtable generates a hash code for an object by checking to see if it has a hashCode() method and using that if it exists. Otherwise it calls toString() on the object, like JavaScript's built-in square bracket property accessor does. In the above example, Point has no hashCode() method, so every point placed in coloursForPoints goes into the "[object Object]" bucket, which is very inefficient, particularly with a large number of points. What would be better would be to implement a hashCode() method on Point that returns a different value for every distinct point but the same value for equal points (this is very important: two objects that are considered equal according to their equals() method must return the same hash code). Something like the following would be good:

Point.prototype.hashCode = function() {
    return "Point:" + this.x + "," + this.y;
};

So for our example point, this returns "Point:1,2", for any Point object with x and y coordinates of 1 and 2 respectively. Every point will therefore go in its own bucket and the hash table is efficient.

The "Point" at the start of the hash code is there to distinguish points from any other similar object that may have x and y coordinates and potentially have the same hash code. Note that even if a non-Point object did return a hash code of "Point:1,2", the hash table would still work fine, since the non-Point object would fail the equality test when searching through the "Point:1,2" bucket.
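The relationship between hash codes and equality can be checked without the hash table itself. In the sketch below, impostor is a hypothetical non-Point object that happens to return the same hash code as a real point; the remaining variable names are also just for illustration:

```javascript
function Point(x, y) {
    this.x = x;
    this.y = y;
}

Point.prototype.equals = function (obj) {
    return (obj instanceof Point) &&
        (obj.x === this.x) &&
        (obj.y === this.y);
};

Point.prototype.hashCode = function () {
    return "Point:" + this.x + "," + this.y;
};

// Hypothetical non-Point object that produces the same hash code.
var impostor = {
    x: 1,
    y: 2,
    hashCode: function () { return "Point:1,2"; }
};

var a = new Point(1, 2);
var b = new Point(1, 2);

var equalPoints = a.equals(b);                           // true
var sameCodeForEqual = a.hashCode() === b.hashCode();    // true: required for equal keys
var sameBucket = a.hashCode() === impostor.hashCode();   // true: same bucket
var equalToImpostor = a.equals(impostor);                // false: fails the equality test
```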

If you don't want to violate the keys of your hash table by giving them extra methods like equals() and hashCode(), you can instead pass functions into the Hashtable constructor to generate hash codes and test key equality. The hash code generator function is passed an object and should return a hash code for that object, while the equality function is passed two objects and should return a boolean value representing whether the two objects are equal. This also has the advantage of improving performance, since the hash table does not have to check for the existence of equals() and hashCode() methods on every key object.

For a hash table where you knew in advance that all the keys would be Point objects, the previous example could be rewritten to be more efficient as follows:

function Point(x, y) {
    this.x = x;
    this.y = y;
}

function hashPoint(p) {
    return "Point:" + p.x + "," + p.y;
}

function pointsEqual(p1, p2) {
    return p1.x === p2.x && p1.y === p2.y;
}

var coloursForPoints = new Hashtable(hashPoint, pointsEqual);

function getColourAt(x, y) {
    var point = new Point(x, y);
    return coloursForPoints.get(point);
}

coloursForPoints.put( new Point(1, 2), "green" );

alert( getColourAt(1, 2) ); // Alerts "green"

Public API

Constructors

  • Hashtable()

    Creates a new, empty hash table.

  • Hashtable(Function hashingFunction, Function equalityFunction)

    Creates a new, empty hash table with the supplied hashing function and equality function.

    Parameters:
    • hashingFunction
      A function that provides hash codes for keys placed in the hash table. It is passed the object to be hashed as its only parameter. If not provided, the hash table checks whether the object has a hashCode() method, and if not, calls toString() on the object.
    • equalityFunction
      A function that checks for equality between two keys with the same hash code. Two keys that are considered equal will map to the same value in the hash table. This function is passed the two objects to be compared as its parameters. If not provided, the hash table checks whether either object being compared has an equals() method, and if not, compares the objects using the === operator.

Methods

  • mixed put(mixed key, mixed value)

    Updated in version 2.0: now returns previous value associated with the key

    Sets the value associated with the key supplied. If the hash table already contains the key then the old value is overwritten and the old value is returned, otherwise null is returned.

  • void putAll(Hashtable hashtable[, Function conflictCallback])

    New in version 2.0

    Adds all entries from the supplied hash table to this hash table. For any key in the supplied hash table for which an entry already exists in this hash table, the optional callback function conflictCallback is called to resolve the conflict. This function should accept three parameters:

    • key: the key for the conflicting entry;
    • thisValue: the current value for this key in the current hash table;
    • value: the value for this key in the hash table supplied.

    The value returned by the callback function will be used as the value for the new entry in the current hash table. If no callback function is supplied, the existing value in the current hash table will be overwritten by the value in the hash table supplied.
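    As a rough sketch of a conflict callback, the function below keeps whichever value is larger. The name keepLarger and the sample data are hypothetical, and the part that uses Hashtable only runs where jshashtable.js has been loaded:

```javascript
// Hypothetical conflict resolver: keep whichever of the two values is larger.
function keepLarger(key, thisValue, value) {
    return Math.max(thisValue, value);
}

// Guarded so that the snippet is harmless where jshashtable.js is not loaded.
if (typeof Hashtable === "function") {
    var totals = new Hashtable();
    totals.put("apples", 3);

    var incoming = new Hashtable();
    incoming.put("apples", 5);
    incoming.put("pears", 2);

    // "apples" exists in both tables, so keepLarger decides its new value;
    // "pears" has no conflict and is simply added.
    totals.putAll(incoming, keepLarger);
}
```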

  • mixed get(mixed key)

    Returns the value associated with the key supplied, or null if no value is found for that key.

  • Boolean containsKey(mixed key)

    Returns whether the hash table contains the specified key.

  • Boolean containsValue(mixed value)

    Returns whether the hash table contains the specified value.

  • void clear()

    Removes all entries from the hash table.

  • Boolean isEmpty()

    Returns true if the hash table contains no key/value pairs.

  • Array keys()

    Returns an array containing all the keys contained in the hash table.

  • Array values()

    Returns an array containing all the values contained in the hash table.

  • Array entries()

    New in version 2.0

    Returns an array containing all the entries contained in the hash table. Each entry is a two element array containing the key and value respectively for that entry.

  • mixed remove(mixed key)

    Updated in version 2.0: now returns the value associated with the removed key

    Removes the key and its corresponding value from the hash table and returns the value that was associated with the removed key.

  • Number size()

    Returns the number of key/value pairs contained in the hash table.

  • Hashtable clone()

    New in version 2.0

    Creates and returns a shallow copy of the hash table. If hashing and equality functions were provided to the hash table when it was constructed, they are passed into the new hash table.

  • void each(Function callback)

    New in version 2.0

    Iterates over all the entries in the hash table, calling the callback function for each entry. This function is passed two parameters:

    • key: the key for the current entry;
    • value: the value for the current entry.
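    A minimal sketch of an each() callback follows. The names collect and lines are hypothetical, and the part that uses Hashtable only runs where jshashtable.js has been loaded. No particular iteration order is assumed:

```javascript
var lines = [];

// Callback that records each entry as a "key=>value" string.
function collect(key, value) {
    lines.push(key + "=>" + value);
}

// Guarded so that the snippet is harmless where jshashtable.js is not loaded.
if (typeof Hashtable === "function") {
    var typesHash = new Hashtable();
    typesHash.put("A string", "string");
    typesHash.put(1, "number");

    // Calls collect once for every entry in the hash table.
    typesHash.each(collect);
}
```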
