This coursework assesses the fundamental Python programming skills required to build tools for forensic and security analysis. It is divided into three parts:
1. Python Knowledge and Understanding
Write a short essay (max. 1000 words) explaining the following.
2. Basic Python Programming
Write a Python 2.7 script that does the following.
(a) Generates a string with N opening brackets ("[") and N closing brackets ("]"), in some arbitrary order. Implement this as a function that picks a random N and then returns a random string of N opening and N closing brackets. Use the random module.
(b) Determines whether the generated string from (a) is balanced; that is, whether it consists entirely of correctly nested pairs of opening/closing brackets (in that order). Write a function that is passed a string as an argument and then performs the check.
(c) Allows the user to supply a string as an optional command-line parameter, which is then tested using the function from (b). (Note: The input string can contain non-bracket characters.)
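One possible way to approach step (a) is sketched below. It is only an illustration, not a model answer: the function name random_brackets and the upper bound max_pairs are assumptions introduced here, since the brief does not say how large N may be.

```python
import random


def random_brackets(max_pairs=10):
    """Return a random string of N '[' and N ']' in arbitrary order.

    max_pairs is a hypothetical upper bound on N; the brief does not fix one.
    """
    n = random.randint(1, max_pairs)   # random N, both bounds inclusive
    chars = list("[" * n + "]" * n)    # N opening and N closing brackets
    random.shuffle(chars)              # shuffle in place for an arbitrary order
    return "".join(chars)
```

Shuffling a list that already holds exactly N brackets of each kind guarantees the counts match, which drawing each character independently at random would not.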
Examples of balanced strings of brackets:
[]
[][]
[[]]
[[][]]
[[][[]]]
Examples of strings of brackets that are not balanced:
][
][][
]][[
[]][[]
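The check in (b) and the command-line handling in (c) could be sketched as follows. The names is_balanced and report are hypothetical, and two points are assumptions rather than requirements of the brief: non-bracket characters are simply ignored, and a hard-coded sample stands in for the generated string from (a) when no parameter is given.

```python
import sys


def is_balanced(text):
    """Return True if every ']' closes an earlier, still open '['."""
    depth = 0                      # number of currently open brackets
    for ch in text:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
            if depth < 0:          # closing bracket without a matching opener
                return False
        # Assumption: any other character is ignored.
    return depth == 0              # balanced only if nothing is left open


def report(argv):
    """Build a one-line verdict for argv[1], or for a stand-in sample."""
    # Stand-in default; a full script would use the string generated in (a).
    candidate = argv[1] if len(argv) > 1 else "[[][]]"
    verdict = "balanced" if is_balanced(candidate) else "not balanced"
    return "%s is %s" % (candidate, verdict)


if __name__ == "__main__":
    print(report(sys.argv))
```

A single counter is enough here because only one kind of bracket is involved; a stack would be needed if several bracket types had to be matched against each other.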
3. Advanced Python Programming
Write a Python program capable of splitting a text from a text file into sentences. Include at least the following rules:
Sentence boundaries occur at "." (period), "?" or "!", with the following exceptions:
(a) Periods followed by whitespace followed by a lower case letter are not sentence boundaries.
(b) Periods followed by a digit with no intervening whitespace are not sentence boundaries.
(c) Periods followed by whitespace and then an upper case letter, but preceded by any of a short list of titles, are not sentence boundaries. Sample titles include Mr., Mrs., Dr., and so on.
(d) Periods internal to a sequence of letters with no adjacent whitespace are not sentence boundaries (for example, www.aptex.com, or e.g.).
(e) Periods followed by certain kinds of punctuation (notably comma and more periods) are probably not sentence boundaries.
Your task is to write a program that, given the name of a text file, writes the file's content with each sentence on a separate line.
Test your program with the following short text:
Dr. Harrison bought bargain.co.uk for 2.5 million pounds, i.e. he
paid a lot for it. Did he mind? John Smith, Esq. thinks he didn't.
Nevertheless, this isn't true... Well, with a probability of .9 it
isn't.
The result should be:
Dr. Harrison bought bargain.co.uk for 2.5 million pounds, i.e. he paid a lot for it.
Did he mind?
John Smith, Esq. thinks he didn't.
Nevertheless, this isn't true...
Well, with a probability of .9 it isn't.
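Rules (a) to (e) can be sketched as a scan over every ".", "?" and "!" that rejects a candidate boundary when one of the exceptions applies. This is a minimal illustration under stated assumptions, not a reference solution: the names TITLES, split_sentences and sentences_from_file are hypothetical, the title list holds only the three titles named in the brief, the punctuation in rule (e) is limited to comma and period, whitespace (including line breaks) is collapsed to single spaces first, and the file is assumed to be UTF-8.

```python
import io
import re

TITLES = ("Mr.", "Mrs.", "Dr.")   # rule (c); extend as needed


def _is_boundary(text, i):
    """Decide whether the '.', '?' or '!' at index i ends a sentence."""
    if text[i] in "?!":
        return True
    prev = text[i - 1:i]           # empty string at the start of the text
    nxt = text[i + 1:i + 2]        # empty string at the end of the text
    if nxt.isdigit():                      # rule (b): 2.5, .9
        return False
    if prev.isalpha() and nxt.isalpha():   # rule (d): bargain.co.uk
        return False
    if nxt and nxt in ",.":                # rule (e): "...", ".,"
        return False
    if nxt == " ":
        following = text[i + 2:i + 3]
        if following.islower():            # rule (a): "i.e. he"
            return False
        last_token = text[:i + 1].rsplit(None, 1)[-1]
        if following.isupper() and last_token in TITLES:   # rule (c)
            return False
    return True


def split_sentences(text):
    """Return the sentences of text as a list of strings."""
    text = " ".join(text.split())  # collapse line breaks and repeated spaces
    sentences, start = [], 0
    for match in re.finditer(r"[.?!]", text):
        i = match.start()
        if _is_boundary(text, i):
            sentence = text[start:i + 1].strip()
            if sentence:
                sentences.append(sentence)
            start = i + 1
    tail = text[start:].strip()    # text after the last boundary, if any
    if tail:
        sentences.append(tail)
    return sentences


def sentences_from_file(path):
    """Read a text file (assumed UTF-8) and split it into sentences."""
    with io.open(path, encoding="utf-8") as handle:
        return split_sentences(handle.read())
```

Printing "\n".join(sentences_from_file(name)) then writes one sentence per line. Because "?" is always treated as a boundary in this sketch, "Did he mind?" is emitted as a line of its own when the sample text is processed.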