Consider that a journalist wants to create a simple database for covering the 2010 Tour de France cycling race. The race consists of a number of stages identified by a starting and destination city. Each stage is completed on a single date, and goes over a specific number of kilometers.
Riders (first name and last name, which together are unique) are employed by a team which is identified by a name and is managed by the team boss. Each team has a sponsor and each sponsor is only allowed to support one team. The team boss is not a rider and is identified by his/her first and last name together. Each team is led by a team leader who himself/herself is also a rider. A team needs to have at least one rider to enroll in the race. Each rider must belong to one and only one team in the race. No two teams are allowed to have the same boss. Riders participate in a number of stages (not necessarily all, as they could give up after a number of days) and it is possible that a rider is absent from all stages. It is also possible that there are not any riders participating in a stage.
For each stage that a rider participates in, his total time spent from start to finish as well as the final position achieved by the rider for the stage are recorded. Stages are either Mountain or Flat, but there may be other kinds of stages for other cycling races. For mountain stages the database keeps track of the grade (= difficulty level), the number of mountains in it, and the best climber (a rider). For a flat stage we record the number of intermediate sprints.
For the information of the Best Climber of Mountain Stage, please model it as an attribute instead of a relationship.
(a) Draw an Entity-Relationship diagram for this database using UML nota- tion. Be sure to include all the entities mentioned above, together with attributes (including primary key attributes), relationships and multiplic- ity constraints. Note that generalisation/specification will be involved in the ER diagram.
(b) Please classify the relationships existing in the produced ER diagram into different categories such as 1..1, 1..* , *..* and generalisation/specification relationships, if any. Please list the relationship(s) under each applicable category.
(c) Translate the ER diagram to relations, represented by their schema. Make sure that you apply the correct translation strategies based on the type of relationships in the ER diagram.
Please follow carefully the following guidelines when you finish this question:
Consider the following relation schema for table R:
R(ENo,DNo,PNo,EName,DName,PName,Edob,PCity,PCountry,hours,rate)
Attributes starting with "E" refer to Employees, those starting with D refer to Departments, and those with P to Projects. Employees, Departments, and Projects are identified by unique numbers. The number of hours and hourly rate of pay for an employee to carry out a project are determined by himself/herself and the project. There may be multiple projects that are conducted in a de- partment and multiple departments can be involved in a single project as well. Any employee can be affiliated with a few different departments and work in multiple projects at the same time. A project may involve multiple employees. Names for employees, departments and projects are not generally unique. A project will only be conducted in a single city. Multiple cities from the same country may appear in the table, however, cities are uniquely named within and across all countries.
Answer the following questions:
(a) Identify the Functional Dependencies in R. Be sure to only include func- tional dependencies that satisfy the following 4 rules: 1) Only include non-trivial FDs; 2) Minimize the determinant (LHS), that is, only include full FDs; 3) Maximize the RHS; and 4) Only include FDs that cannot be derived from other FDs using Armstrongs axioms. Please refer to Page 26 of the Module 10 lecture notes for the details of the above requirements.
(b) Identify the candidate key(s) of R based on the Functional Dependencies. You need to use the concept of attribute closure to identify the key(s). Intermediate steps in this process should be detailed.
(c) Assume that R is in 1NF. Now normalise the relation to 2NF, 3NF, and BCNF. Be sure to indicate the FDs you are removing at each step, and why. Just giving the decompositions in each of the three Normal Forms is not sufficient.
Please kindly note that for Question(c) the full mark will only be given to the fully correct normalization result. If you get some functional dependency wrong in Question(a), then some partial marks will be deducted for both the functional dependencies and the possibly incorrect normalization result. Given this, it is very important to get the functional dependencies right in the first instance.