it2051229 BST Text Concordance

A text concordance BST is a binary search tree (BST) of all the distinct words in a piece of text. In this project we consider the problem of constructing a concordance for a document stored in a file. To construct such a concordance BST, we begin with an empty BST. As each word is read, we check whether it is already in the current BST. If the word is new, it is inserted into the BST in the appropriate place. If the word already appears in the BST, its frequency is increased by one. Because words arrive in arbitrary order, a new word may need to be inserted at any point in the BST. Each node consists of four fields: one for the word, one for its frequency, and two link (reference) fields for the left and right children.

Each time a word is read from the document, the BST must be searched, always beginning with the root node.
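The search-then-insert step described above can be sketched as follows. This is a minimal illustration, not the required solution: the names `Node` and `insert` and the sample word list are assumptions made for this example, and the assignment does not prescribe a language.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    # The four fields from the description: word, frequency, and two child links.
    word: str
    frequency: int = 1
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root: Optional[Node], word: str) -> Node:
    """Insert `word` into the BST, or increase its frequency if already present.

    Returns the root so that the empty-tree case can be handled by the caller.
    """
    if root is None:
        return Node(word)
    current = root  # every search begins at the root node
    while True:
        if word == current.word:
            current.frequency += 1  # word already in the BST
            return root
        if word < current.word:
            if current.left is None:
                current.left = Node(word)  # new word goes in the left subtree
                return root
            current = current.left
        else:
            if current.right is None:
                current.right = Node(word)  # new word goes in the right subtree
                return root
            current = current.right


# Hypothetical sample input, already upper case and free of punctuation.
root = None
for w in ["THE", "CAT", "AND", "THE", "DOG"]:
    root = insert(root, w)
```

The loop is iterative rather than recursive so that very long, nearly sorted inputs (which produce a deep, list-like tree) do not hit a recursion limit. A recursive version would work equally well for small documents.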

(The data type of the word field is a string. If a word starts with a non-alphabetic character (a digit or special symbol), do not insert it into the BST. Also, all words will consist of uppercase characters only, without any punctuation.)
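One possible reading of these rules is sketched below. It assumes that the program itself must normalize the input: tokens are split on whitespace, a token whose first raw character is not an ASCII letter is skipped, and the remaining tokens are stripped of punctuation and converted to upper case. The helper names `normalize` and `words` are hypothetical, and if the input file is already guaranteed to be upper case without punctuation, only the first-character check is needed.

```python
import string
from typing import Iterator, Optional


def normalize(token: str) -> Optional[str]:
    """Return the cleaned, upper-case word, or None if the token is rejected."""
    # Reject tokens that start with a digit or special symbol.
    if not token or token[0] not in string.ascii_letters:
        return None
    # Assumption: punctuation inside or after the word is simply dropped.
    cleaned = "".join(ch for ch in token if ch not in string.punctuation)
    return cleaned.upper()


def words(text: str) -> Iterator[str]:
    """Yield every acceptable word from `text`, in reading order."""
    for token in text.split():
        word = normalize(token)
        if word is not None:
            yield word


sample = "The cat, 3rd time: don't #stop"
accepted = list(words(sample))
```

With this reading, `3rd` and `#stop` are skipped because of their first character, while `cat,` and `don't` are kept after their punctuation is removed.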

ALGORITHM:

Input : A Text file (concordinput.txt)

Function : Construct a text concordance BST from a document stored in a file.

Output :

1. A list of the distinct words with their frequencies in ascending alphabetical order, produced by an inorder traversal.

2. A list of words in the same format as Figure 4.7 in the text (p. 105), produced by a preorder traversal.
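The two traversals named in the output requirements can be sketched as below. The `Node` class and the hand-built sample tree are assumptions made for this example. The layout of Figure 4.7 is not reproduced here, so the preorder function only yields the words in visiting order and leaves the final formatting open.

```python
from dataclasses import dataclass
from typing import Iterator, Optional, Tuple


@dataclass
class Node:
    word: str
    frequency: int = 1
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def inorder(node: Optional[Node]) -> Iterator[Tuple[str, int]]:
    """Left subtree, then the node, then right subtree: alphabetical order."""
    if node is not None:
        yield from inorder(node.left)
        yield (node.word, node.frequency)
        yield from inorder(node.right)


def preorder(node: Optional[Node]) -> Iterator[Tuple[str, int]]:
    """The node first, then its left subtree, then its right subtree."""
    if node is not None:
        yield (node.word, node.frequency)
        yield from preorder(node.left)
        yield from preorder(node.right)


# Hypothetical tree: THE is the root, CAT its left child, AND and DOG under CAT.
tree = Node("THE", 2, left=Node("CAT", 1, left=Node("AND"), right=Node("DOG")))

alphabetical = list(inorder(tree))
root_first = list(preorder(tree))
```

Because a BST keeps smaller words in the left subtree and larger words in the right, the inorder result is sorted without any extra sorting step. The preorder result starts at the root and therefore reflects the shape of the tree.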
