In this assignment you are asked to write a program to break the code of an encoded message. Code breaking can be very complicated. Today’s computers can do much of the work, but finding the clues still takes human skill.
Many code-breakers begin by checking how often certain letters and words appear in a message. In English, for example, E, T, A and O are the commonest letters. TH, HE, AN and RE are common combinations. THE, OF, AND and TO are common words. You cannot write long messages without repeating some of these. The code-breaker can begin to guess the meanings by studying these repetitions. Before computers were used, it could take weeks or months of intense study and mathematical calculation to crack a code.
In the encoded message you are asked to decipher, each letter of the alphabet was replaced by another letter, using a one-to-one mapping. To find the cipher you could make use of the fact that some letters of the alphabet occur in any piece of text more often than others.
You could start by writing a program to count the number of times each letter occurs in a text file, then use this program to find the percentage of occurrences of each letter in the training text (any sufficiently long piece of English text) relative to the total number of letters in the text. You could then do the same thing with the encoded message. You could then write a program to replace the most frequent letter in the encoded message with the most frequent letter in your training text, the second most frequent with the second most frequent, and so on.
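The counting and substitution steps described above could be sketched in Python roughly as follows. This is only one possible approach, not a required design: the function names (`letter_percentages`, `rank_letters`, `build_mapping`, `decode`) are made up for illustration, and the sketch assumes that the text is lowercased, that only the letters a to z are counted, and that ties in frequency are broken alphabetically.

```python
from collections import Counter
import string


def letter_percentages(text):
    """Return each letter's share of all letters in text, as a percentage."""
    letters = [ch for ch in text.lower() if ch in string.ascii_lowercase]
    total = len(letters)
    if total == 0:
        return {}
    counts = Counter(letters)
    return {letter: 100.0 * n / total for letter, n in counts.items()}


def rank_letters(percentages):
    """List letters from most to least frequent (ties broken alphabetically)."""
    return sorted(percentages, key=lambda letter: (-percentages[letter], letter))


def build_mapping(encoded_text, training_text):
    """Pair the k-th most frequent encoded letter with the k-th most
    frequent training letter (assumption: a plain rank-for-rank match)."""
    encoded_rank = rank_letters(letter_percentages(encoded_text))
    training_rank = rank_letters(letter_percentages(training_text))
    return dict(zip(encoded_rank, training_rank))


def decode(encoded_text, mapping):
    """Lowercase the text and apply the mapping; other characters are kept."""
    return "".join(mapping.get(ch, ch) for ch in encoded_text.lower())
```

With a short encoded message the ranking is unreliable, so a first guess built this way will usually need manual correction.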
You will probably not get all the letters right at the first attempt, because letters that normally occur with more or less the same frequency could rank differently in the training text and in the encoded message. You will have to deal with that eventuality by applying your own logic and correcting the substitutions interactively. For example, if the encoded message contains the word ‘mln’ and it is decoded as ‘tge’, which is not an English word, the correct decoded word is possibly ‘the’. In that case you should replace ‘l’ with ‘h’ rather than with ‘g’ throughout the encoded message.
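A manual correction of this kind could be sketched as below. The helper names `correct_mapping` and `apply_mapping` are hypothetical, and the sketch assumes the cipher is stored as a dictionary from encoded letter to decoded letter. Because the mapping is one-to-one, the encoded letter that previously decoded to ‘h’ is given the letter that ‘l’ used to have, so no two encoded letters end up decoding to the same letter.

```python
def correct_mapping(mapping, encoded_letter, correct_letter):
    """Return a new mapping in which encoded_letter decodes to correct_letter.

    Any other encoded letter that already decoded to correct_letter takes
    over encoded_letter's old decoded letter, keeping the mapping one-to-one.
    """
    updated = dict(mapping)
    old_letter = updated.get(encoded_letter)
    for other, plain in mapping.items():
        if plain == correct_letter and other != encoded_letter:
            if old_letter is None:
                # encoded_letter had no previous decoding to hand over
                del updated[other]
            else:
                updated[other] = old_letter
    updated[encoded_letter] = correct_letter
    return updated


def apply_mapping(text, mapping):
    """Decode text with the mapping; unmapped characters are left as they are."""
    return "".join(mapping.get(ch, ch) for ch in text)


# Illustrative first guess: 'l' was wrongly matched with 'g'.
# The entry for 'q' is invented here only to show the swap.
first_guess = {"m": "t", "l": "g", "n": "e", "q": "h"}
fixed = correct_mapping(first_guess, "l", "h")
print(apply_mapping("mln", first_guess))  # tge
print(apply_mapping("mln", fixed))        # the
```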
Develop two programs, one procedural and another object-oriented. Develop the procedural code first and the object-oriented code second. You can reuse suitable parts of the procedural program to create the object-oriented code. Make sure both programs generate the same results. Submit both programs. Both programs should operate as follows:
Implement a simple option menu to make your programs easier to use.
Test your programs. Provide sufficient evidence of testing.
Improve your programs. Consider special cases such as the user choosing an option that is not available in the menu, or the program asking the user interactively for a text file name and reading the data from that file.
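One way to handle the special cases mentioned above is sketched here. The menu entries and the function names (`parse_choice`, `read_text_file`, `ask_choice`) are invented for illustration; the actual options are for you to decide. The sketch assumes the text files are UTF-8 encoded and treats a missing or unreadable file as a recoverable error rather than letting the program crash.

```python
# Hypothetical menu; the real options depend on your own design.
MENU = {
    "1": "Count letter frequencies in a file",
    "2": "Decode the message by frequency",
    "3": "Correct one letter of the decoding",
    "4": "Quit",
}


def parse_choice(raw, menu=MENU):
    """Return the menu key the user typed, or None if it is not in the menu."""
    choice = raw.strip()
    return choice if choice in menu else None


def read_text_file(filename):
    """Return the file's contents, or None if it cannot be opened or decoded."""
    try:
        with open(filename, encoding="utf-8") as handle:
            return handle.read()
    except (OSError, UnicodeDecodeError):
        return None


def ask_choice(menu=MENU):
    """Show the menu and keep asking until a valid option is entered."""
    while True:
        for key, label in menu.items():
            print(f"{key}. {label}")
        choice = parse_choice(input("Choose an option: "), menu)
        if choice is not None:
            return choice
        print("That option is not in the menu; please try again.")
```

Keeping the validation (`parse_choice`) separate from the input loop (`ask_choice`) makes the checking logic easy to test without typing at the keyboard, which also helps when providing evidence of testing.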