Requirement
For this request, you will compare the efficiency of searching a sorted linked list and a hash table.
You should use a variation of your code for Mini-request 3 for this.
Modifications from Mini-request 3
Instead of getting the words to load into the hash table from the user, you must load at least 2000 words from a file called words.txt as input (using file I/O). It is OK if you share the words with classmates for this. Make sure that your words are sorted in random order (more about this later). The words.txt file must be in the same directory as your source code.
It’s OK to use more than 2000 words but don’t have more than 10000.
The hash table’s buckets must point to sorted linked lists. When you load a word into the hash table, also load the words into one very long sorted linked list. You must call the same function for inserting into the sorted linked list as for inserting into the hash table bucket’s linked list. Do not duplicate code for linked list insertion.
Make the hash table 127 buckets in size.
You must create a function called searchLinkedList that searches a sorted linked list for a word and returns a pointer to the node containing the word (if found) or NULL.
This function must return NULL immediately after you pass the point in the linked list where the word would have been found if it was in the linked list.
The function takes three parameters:
char searchWord (or you can use a string object): word to search for (entered by the user)
struct linkedList linkedList: linked list to search (in your program, you can call the linked list node struct anything that makes sense)
int comparisonCount: pointer to int filled in by the function with the count of strcmp comparisons done in searching the linked list
Create a search function called searchForWordTwice. It returns nothing and will have the following parameters:
char searchWord (or you can use a string object): word to search for (entered by the user)
struct linkedList linkedList: linked list to search
struct linkedList hashTable[]: hash table to search
int comparisonCount[2]: array containing the count of strcmp comparisons done in searching the extremely-long sorted linked list (element 0) and in searching the hash table (element 1)
It must call your linked list search function and then displays one of the following messages once the search is done:
“word was found in the list in number comparisons”, where word is the word being searched for, list is either “linked list” or “hash table bucket” and number equals the number of times that strcmp was called
“word was NOT found in the list in number comparisons”
You will use this search function to search the hash table bucket and the sorted linked list.
Don’t worry about the grammatical inconsistency of possibly displaying “1 comparisons”.
Indent the message displayed by one TAB (‘\t’).
Once you are finished the loop, display the total number of searches, the total number of comparisons done by searching the sorted linked list, and the total number of comparisons done by searching the hash table.
Other Requirements
Design your linked list code so that you do not have to duplicate code unnecessarily. This is very important.
Clean up all allocated memory before exiting.
Use constants to avoid magic numbers.
You can assume that all words will be in lowercase.
Your program must compile without warnings. If you have to use a #pragma as stated in the C course notes in order to get rid of the Microsoft-specific warnings, you should do so.
Your project must be named dsA2.
The source file that contains your main() function must be called dsA2.cpp. Put all hash table-related code in hashing.cpp. Put all linked list-related code in linkedlist.cpp (and linkedlist.h if you are using a class). Create other .h files as needed to support compilation. Do not create any other source files.
Remember to put appropriate header comments at the top of ALL source files.
Create a JPG file called compare.jpg that contains a screenshot of your program running with a full screen of sample output (so that I can see the difference in comparison count between the methods for at least two successful and three unsuccessful searches (do not skimp on the number of searches)). Make sure that it includes the final summary output. Put this file in the top directory of your project (so that it is submitted with your submission). A sample of this file is provided for you. Please follow the output format and contents exactly. Not providing this file as required could produce an automatically-failing mark in this request.
Provide your words.txt file in the same directory as your source code.
Submit your submission to the appropriate dropbox as required by the SET Submission Standards document.
Hints
An easy way to sort the input words in random order is the following:
Get 2000+ words, one per line. Copy the words to the clipboard.
Start Excel.
Paste the words into Excel (so that you have one column of 2000+ rows).
In the next column, put random numbers using =rand().
Select all of the data.
Sort the data using the column with the random number as the sorting key.
Delete the random number column.
Save the data into a text file.
Here is a 500 word sample input file.
Unit testing your functions is a really good idea. You should not leave your unit testing code in your submitted source.
Don’t forget that malloc() does not support C++ string objects so you must use new instead if you have strings as part of a dynamically-allocated struct or class.
You can use the examples provided on the DS website and C website as long as you attribute credit in a comment. You must attribute credit for the djb2 hash function (credit the real author Dan Bernstein and state that you got it from lecture).