REN

Ph.D. in Computer Science at Rutgers University

Hash Table

Hash table is a data structure uses a hash function to compute an index into an array of buckets or slots, from which the desired value can be found.

A hash table could be divided into the following parts: key/value, hash function, buckets. The key is an input, through hash function, yielding an value as the index of buckets. Now, look at the picture below:

   

The average time complexity for hash table operation is O(1). To deal with bucket index confliction yielded by hash function, there're several methods. In this article, I'll use the linking data structure to implement the buckets.


Data Structure and Operation of Hash Table

The data structure of hash table is below:

typedef int type;

typedef struct HashEntry {
	char *key;
	type value;
	struct HashEntry *next;
} HashEntry;

typedef struct HashTable {
	int size;
	HashEntry **entryList;
} HashTable;

To deal with bucket index confliction yielded by hash function, each hash table entry is not storing value. Instead it's a pointer pointing to a linked list. If two keys falls into the same hash table entry, no problem, just insert them together in the linked list.

   

Hash Function

Hash function is the bridge between the key/value pairs. Hash function should be finished in O(1) times. And a good hash function should guarantee the confliction that two keys have the same value as few as possible. Typically, the longer the length of hash table bucket is, the less probability of confliction will happen. Thus, the easiest way is to use module operation with a rather long hash table bucket. Here, I'll introduce a well-known hash function:

int hash(char *key) {
    const unsigned char *name = (const unsigned char *)key;  
    unsigned long h = 0, g;  
    int i;  
	int len = strlen(key);  

    for (i = 0; i < len; i++) {  
        h = (h << 4) + (unsigned long)(name[i]); 
        if ((g = (h & 0xF0000000UL))!=0)       
            h ^= (g >> 24);  
        h &= ~g;  
    }
    return (int)h;  	
}

Initialize

Initialize hash table, set length and preserve dynamic memory for bucket.

void initHashTable(HashTable *t, int size) {
	int i;
	t->size = size;
	t->entryList = (HashEntry**)malloc(size*sizeof(HashEntry*));
	for (i = 0; i < size; i++) {
		t->entryList[i] = NULL;
	}
}

Look up

HashEntry* lookup(HashTable t, char *key) {
	int index = hash(key);
	HashEntry *entry = t.entryList[index];
	while (NULL != entry) {
		if (!strcmp(t.entryList[index]->key, key))
			return entry;
		entry = entry->next;	
	}
	return NULL;
}

Insert

int insertValue(HashTable *t, char *key, type value) {
	int index = hash(key);
	HashEntry *entry = t->entryList[index];
	while (NULL != entry) {
		if (NULL == entry->next) {
			HashEntry* newEntry = (HashEntry*)malloc(sizeof(HashEntry));
			strcpy(newEntry->key, key);
			newEntry->value = value;
			newEntry->next = NULL;
			return 0;
		}
		entry = entry->next;	
	}
	return -1;
}

Delete

int deleteValue(HashTable *t, char *key, type value) {
	int index = hash(key);
	HashEntry *p = t->entryList[index];
	HashEntry *entry = NULL;
	if (NULL != p && !strcmp(p->key, key)) {
		t->entryList[index] = p->next;
		free(p);
		return 0;
	}		
	while (NULL != p) {
		entry = p->next;
		if (NULL != entry && !strcmp(entry->key, key)) {
			p->next = entry->next;
			free(entry);
			return 0;
		}
		p = p->next;	
	}
	return -1;
}

Destroy

Free the dynamic memory for the hash table.

void destoryHashTable(HashTable *t) {
	HashEntry *p, *q;
	for (int i = 0; i < t->size; i++) {
		p = t->entryList[i];
		while (NULL != p) {
			q = p;
			p = p->next;
			free(q);
		}
	}
}