Using A Dictionary In C
1. Introduction
If you are reading this article, I assume you use C. C has no built-in dictionary, map, or other key-value collection. This often forces us to use an array as a makeshift dictionary. But there are situations where an array is just not enough, or where a dictionary makes the logic clearer.
In this article we will discuss how to use uthash, one of the most widely used tools for key-value collections in C.
2. Install
According to its documentation, uthash is not a library. It is just a header file: uthash.h. All we need to do is copy the file into our project and include it:
#include "uthash.h"
3. Declare
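To declare a key-value pair collection, namely a hash table, we use a struct in C. The struct holds the key, the value, and a UT_hash_handle field that uthash uses for its bookkeeping.
#include "uthash.h"

typedef struct user_t {
    int id;             // key
    int cookie;         // value
    UT_hash_handle hh;  // makes this struct hashable
} user_t;

int main(void) {
    user_t* users = NULL;
    return 0;
}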
I use “typedef” here, which registers the struct name as a type. It is not required for the hash table; without it, the code looks like the version below. That is a separate topic, so I will stop here.
#include "uthash.h"

struct user_t {
    int id;             // key
    int cookie;         // value
    UT_hash_handle hh;  // makes this struct hashable
};

int main(void) {
    struct user_t* users = NULL;
    return 0;
}
Another important thing is to initialize the collection pointer to NULL.
4. Add Item
To add an item to the collection, we first create an individual key-value pair item, then call the HASH_ADD() macro.
For example, we will add a “user” with id 233 and cookie 244 to our “users” collection.
#include <stdlib.h>
#include "uthash.h"

typedef struct user_t {
    int id;
    int cookie;
    UT_hash_handle hh;
} user_t;

int main(void) {
    // the collection
    user_t* users = NULL;

    // the new item
    user_t* user = NULL;
    user = malloc(sizeof(user_t));
    user->id = 233;
    user->cookie = 244;

    // arguments: handle field, collection, key field name, key size, item
    HASH_ADD(hh, users, id, sizeof(int), user);
    return 0;
}
At first glance, HASH_ADD() seems to use an undeclared variable “id”. But uthash is built from C macros, and this argument is the name of the key field in our struct, not a variable. So it is all right, and it is simply the rule we need to follow.
There is also a macro named HASH_ADD_INT(), but it is designed specifically for integer keys. I think HASH_ADD() is more general because it works with different key types.
// the collection
user_t* users = NULL;

// the new item
user_t* user = NULL;
user = malloc(sizeof(user_t));
user->id = 233;
user->cookie = 244;

HASH_ADD_INT(users, id, user);
5. Find Item
To find an item, we declare a new pointer variable, finder, and then call HASH_FIND().
#include <stdio.h>
#include <stdlib.h>
#include "uthash.h"

typedef struct user_t {
    int id;
    int cookie;
    UT_hash_handle hh;
} user_t;

int main(void) {
    user_t* users = NULL;

    user_t* user = NULL;
    user = malloc(sizeof(user_t));
    user->id = 233;
    user->cookie = 244;
    HASH_ADD(hh, users, id, sizeof(int), user);

    user_t* finder = NULL;
    int finding = 233; // important: the key must live in a variable
    HASH_FIND(hh, users, &finding, sizeof(int), finder);
    if(finder != NULL) {
        printf("Found!\n");
        printf("id is %d, cookie is %d\n", finder->id, finder->cookie);
    }
    return 0;
}

// output
// Found!
// id is 233, cookie is 244
One important thing: HASH_FIND() does not accept the key value itself as a parameter. It only accepts a pointer to the key.
So we declare an extra variable, “finding”, assign it the key we are looking for, and pass its address.
6. Delete Item
With the find step above, deleting is easy. We just pass the item that “finder” points to into HASH_DEL().
#include <stdio.h>
#include <stdlib.h>
#include "uthash.h"

typedef struct user_t {
    int id;
    int cookie;
    UT_hash_handle hh;
} user_t;

int main(void) {
    user_t* users = NULL;

    user_t* user = NULL;
    user = malloc(sizeof(user_t));
    user->id = 233;
    user->cookie = 244;
    HASH_ADD(hh, users, id, sizeof(int), user);

    user_t* finder = NULL;
    int finding = 233;
    HASH_FIND(hh, users, &finding, sizeof(int), finder);
    if(finder != NULL) {
        printf("Found!\n");
        printf("id is %d, cookie is %d\n", finder->id, finder->cookie);
    }

    if(finder != NULL) {
        printf("Deleting id %d...\n", finder->id);
        HASH_DEL(users, finder);
        printf("Deleted!\n");
    }
    return 0;
}

// output:
// Found!
// id is 233, cookie is 244
// Deleting id 233...
// Deleted!
7. Print
To print the whole hash collection, we can walk through it much like a C++ iterator, using the hh.next pointer that uthash maintains.
#include <stdio.h>
#include <stdlib.h>
#include "uthash.h"

typedef struct user_t {
    int id;
    int cookie;
    UT_hash_handle hh;
} user_t;

int main(void) {
    // initialize the collection
    user_t* users = NULL;

    // create the elements
    for(int i = 0; i < 10; i++) {
        user_t* user = NULL;
        user = malloc(sizeof(user_t));
        user->id = i;
        user->cookie = i * i;
        HASH_ADD(hh, users, id, sizeof(int), user);
    }

    // print out the collection
    for(user_t* user = users; user != NULL; user = user->hh.next) {
        printf("id %d, cookie %d\n", user->id, user->cookie);
    }

    // find and delete id 7
    printf("find and delete id 7\n");
    user_t* finder = NULL;
    int finding = 7;
    HASH_FIND(hh, users, &finding, sizeof(int), finder);
    if(finder != NULL) {
        HASH_DEL(users, finder);
    }

    // print out the collection again
    for(user_t* user = users; user != NULL; user = user->hh.next) {
        printf("id %d, cookie %d\n", user->id, user->cookie);
    }
    return 0;
}

// output
// id 0, cookie 0
// id 1, cookie 1
// id 2, cookie 4
// id 3, cookie 9
// id 4, cookie 16
// id 5, cookie 25
// id 6, cookie 36
// id 7, cookie 49
// id 8, cookie 64
// id 9, cookie 81
// find and delete id 7
// id 0, cookie 0
// id 1, cookie 1
// id 2, cookie 4
// id 3, cookie 9
// id 4, cookie 16
// id 5, cookie 25
// id 6, cookie 36
// id 8, cookie 64
// id 9, cookie 81
If we don’t want to print the whole collection but just want to know how many items it holds, we can use HASH_COUNT().
unsigned int count;
count = HASH_COUNT(users);
8. Other Examples
Example 1: int – int pair
#include "uthash.h"
#include <stdlib.h>
#include <stdio.h>

typedef struct user_t {
    int id;
    int cookie;
    UT_hash_handle hh;
} user_t;

int main(void) {
    // initialize the collection
    user_t* users = NULL;

    // create the elements
    for(int i = 0; i < 10; i++) {
        user_t* user = NULL;
        user = malloc(sizeof(user_t));
        user->id = i;
        user->cookie = i * i;
        HASH_ADD_INT(users, id, user);
    }

    // print out the collection
    for(user_t* user = users; user != NULL; user = user->hh.next) {
        printf("id %d, cookie %d\n", user->id, user->cookie);
    }

    // look up an element
    user_t* finder = NULL;
    int find_id = 9;
    /* HASH_FIND_INT takes a pointer to the key,
       so a literal cannot be passed directly */
    HASH_FIND_INT(users, &find_id, finder);
    if(finder != NULL) {
        printf("Found\n");
        printf("id %d, cookie %d\n", finder->id, finder->cookie);
    }

    // delete an element
    if(finder != NULL) {
        printf("Deleting user with id %d\n", finder->id);
        HASH_DEL(users, finder);
        printf("Deleted\n");
    }

    // count the elements
    int user_num = 0;
    user_num = HASH_COUNT(users);
    printf("User number is now left %d\n", user_num);
    return 0;
}

// output
// id 0, cookie 0
// id 1, cookie 1
// id 2, cookie 4
// id 3, cookie 9
// id 4, cookie 16
// id 5, cookie 25
// id 6, cookie 36
// id 7, cookie 49
// id 8, cookie 64
// id 9, cookie 81
// Found
// id 9, cookie 81
// Deleting user with id 9
// Deleted
// User number is now left 9
Example 2: string – int pair
#include "uthash.h"
#include <stdlib.h>
#include <stdio.h>
#include <string.h>

typedef struct user_t {
    char name[10];
    int cookie;
    UT_hash_handle hh;
} user_t;

int main(void) {
    // initialize the collection
    user_t* users = NULL;

    // create the elements
    char* names[] = {"joe", "bob", "betty"};
    for(int i = 0; i < 3; i++) {
        user_t* user = NULL;
        user = malloc(sizeof(user_t));
        strcpy(user->name, names[i]);
        user->cookie = i * i;
        HASH_ADD_STR(users, name, user);
    }

    // print out the collection
    for(user_t* user = users; user != NULL; user = user->hh.next) {
        printf("name %s, cookie %d\n", user->name, user->cookie);
    }

    // look up the elements
    user_t* finder = NULL;
    char* find_name = "betty";
    // a C string can be passed directly to this macro
    HASH_FIND_STR(users, find_name, finder);
    if(finder != NULL) {
        printf("Found\n");
        printf("name %s, cookie %d\n", finder->name, finder->cookie);
    }

    // delete an element
    if(finder != NULL) {
        printf("Deleting user with name %s\n", finder->name);
        HASH_DEL(users, finder);
        printf("Deleted\n");
    }

    // count the elements
    int user_num = 0;
    user_num = HASH_COUNT(users);
    printf("User number is now left %d\n", user_num);
    return 0;
}

// output
// name joe, cookie 0
// name bob, cookie 1
// name betty, cookie 4
// Found
// name betty, cookie 4
// Deleting user with name betty
// Deleted
// User number is now left 2
Example 3: Surprisingly, uthash does not enforce unique keys. Adding a duplicate key is silently accepted, so it is up to our program to check whether a key already exists before adding it.
#include "uthash.h"
#include <stdlib.h>
#include <stdio.h>
#include <string.h>

typedef struct user_t {
    char name[10];
    int cookie;
    UT_hash_handle hh;
} user_t;

int main(void) {
    // initialize the collection
    user_t* users = NULL;

    // create the elements
    char* names[] = {"joe", "bob", "betty"};
    for(int i = 0; i < 3; i++) {
        user_t* user = NULL;
        user = malloc(sizeof(user_t));
        strcpy(user->name, names[i]);
        user->cookie = i * i;
        HASH_ADD_STR(users, name, user);
    }

    // however, if we add a second "betty", it is accepted too
    user_t* user = NULL;
    user = malloc(sizeof(user_t));
    strcpy(user->name, "betty");
    user->cookie = 999;
    HASH_ADD_STR(users, name, user);

    // print out the collection
    for(user_t* user = users; user != NULL; user = user->hh.next) {
        printf("name %s, cookie %d\n", user->name, user->cookie);
    }

    // look up the elements
    user_t* finder = NULL;
    char* find_name = "betty";
    // a C string can be passed directly to this macro
    HASH_FIND_STR(users, find_name, finder);
    if(finder != NULL) {
        printf("Found\n");
        printf("name %s, cookie %d\n", finder->name, finder->cookie);
    }

    // delete an element
    if(finder != NULL) {
        printf("Deleting user with name %s\n", finder->name);
        HASH_DEL(users, finder);
        printf("Deleted\n");
    }

    // count the elements
    int user_num = 0;
    user_num = HASH_COUNT(users);
    printf("User number is now left %d\n", user_num);

    // print out the collection again
    for(user_t* user = users; user != NULL; user = user->hh.next) {
        printf("name %s, cookie %d\n", user->name, user->cookie);
    }
    return 0;
}

// output
// name joe, cookie 0
// name bob, cookie 1
// name betty, cookie 4
// name betty, cookie 999
// Found
// name betty, cookie 999
// Deleting user with name betty
// Deleted
// User number is now left 3
// name joe, cookie 0
// name bob, cookie 1
// name betty, cookie 4