In this project you will code slightly different but practical implementations of hashing and heaps:
Cuckoo Hashing and sorting using Binary Heaps. Both implementations must be coded in the same project according to the specifications below. Read the handout thoroughly since details for each implementation are given. As on previous projects, it must be coded in C++.
Cuckoo Hashing is a variation of the regular hashing concept. It was presented by R. Pagh and F. Rodler on the Proceedings of European Symposium on Algorithms in 2001 (original paper here). We are interested in this hashing mechanism since it is straightforward to implement and it has worst case constant lookup time. The name is received from the European cuckoo who throws out the eggs from other bird's nests in order to make room for its own eggs!
This method uses two hash tables T1 and T2 each of length r, and two hash functions h1(x) and h2(x). Every key x is stored in cell h1(x) of T1 or h2(x) of T2 but never in both. The lookup function is defined as follows:
bool functionlookup(x)
return T1[h1(x)] == x or T2[h2(x)] == x
end
For insertion, as described in the original paper, it turns out that the "cuckoo approach" (kicking other keys away until every key has its own nest) works very well. Specifically, if x is to be inserted we first see if cell h1 (x) of T1 is occupied. If not occupied, we set T1[h1(x)] = x and we are done. Otherwise, we set y = T1 [h1 (x)] and then set T1 [h1 (x)] = x anyway, thus making y "nestless". The key y is then inserted in T2. If its nest in T2 is occupied, we proceed in a similar way and so forth iteratively. Check the example in Figure 1a. It may happen that this process loops (see Figure 1b). Therefore the number of iterations is bounded by a MaxLoop value. If this number of iterations is reached, everything is rehashed (which is described later) and we try to accommodate the nestless key. see image.
The insert procedure is defined as follows (notation x y means the values of variables x and y are swapped):
procedureinsert(x)
iflookup(x) then return
loopMaxLoop times
if T1[h1(x)] = empty then{T1[h1(x)] = x; return}
x <-> T1[h1(x)]
if T2[h2(x)] = empty then{T2[h2(x)] = x; return}
x <-> T2[h2(x)]
end loop
rehash(); insert(x)
end
Removing a key is of course simple to perform. If lookup(x) is true then remove x from T1 [h1(x)] or T2[h2(x)]. Remember an element in T1 cannot be in T2 and vice versa. For rehashing, we will perform two operations: 1) double the size of both tables, and 2) add the elements of T1 and T2 into the doubledsize new tables. Keep in mind that you need to add such values in the same way as inserting any x values (i.e., use the insert operation).
In http://www.lkozma.net/cuckoo_hashing_visualization, you can visualize how Cuckoo Hashing works by inserting the values of your choosing. Here is a suggestion: delete TEST1 and TEST2 and insert numbers in increasing order, one by one. Observe how the keys are swapped between tables. Keep doing this until you fall in a loop. You will implement the Cuckoo Hashing mechanism as described before. The hashing functions to be implemented are:
h1(x) = x mod r and h2(x) x mod (r 1)
The input for the program is:
program < inputfile.txt > outputfile.txt
The format of the inputfile.txthas the following structure:
1
r m
a b
a b
a b
a b
...
Where 1 indicates that the Cuckoo Hashing part will be executed, ris a positive integer that indicates the length of the tables and mis a positive integer that indicates the maximum number of loops. For each tuple (a b), ais the instruction to perform and bis the number for the required instruction. The possible values for a are "i" for insert, "r" for remove and "l" for lookup. The values of b are positive integer numbers.
The program has to write the following outputs:
The following is an example of the output given a set of tuples with instructions and values: see image.
In this section you need to implement a Binary Heap. Then, using this heap we are going to implement a sorting algorithm. First you are going to implement a binary heap. One should be able to use it as either a maxheap or a minheap it should be configurable during heap initialization. Once the heap is initialized, it continues to behave the same way until its deletion. Your implementation should use a binary tree based approach, not an array based approach. Refer to your lecture slides to understand how heaps can be implemented. The elements in the heap should follow this
structure:
struct element {
int key;
int value;
};
When you execute the insert() or extract() functions on the heap, the comparison should be on 'key', not 'value'. You can also use a class to define the above structure, or use your own names.
Your heap should implement the following functionalities:
1. Peek:basically findmax or findmin. If it is configured as a maxheap, find the element with maximum 'key'. [If it is configured as a minheap, find the element with minimum 'key'.]
2. Insert:add a new element to the heap.
3. Extract/delete:return the element with minimum 'key' if the heap is a minheap [ or maximum 'key' if the heap is a maxheap] after removing it from the heap.
4. Create:create an empty heap.
5. Heapify:create the heap out of a given array of elements. Assume here that an empty heap is already initialized (as maxheap or minheap).
6. Size:return the number of items in the heap.
For the sorting implementation, given a list of < key, value > pairs you are going to sort the list based on the keys. This should be straightforward. You just need to call heapify() and then extract one by one the elements on the heap. The sort order will be specified (ascending or descending key values), and depending on that you should use maxheap or minheap.
The input for the program is:
program < inputfile.txt > outputfile.txt
The first line of the inputfile.txt could be the numbers 2 or 3 declaring which part of the project is being tested (2 for the heap part and 3 for sorting using heaps).
For heap part the following lines of the input file are as follows:
For the sorting part the following lines of the input file are as follows:
Sample test cases: see image.