Database 1
Database Design
- Prof. dr. Paul M.E. De Bra
- Dept. of Computer Science
- Eindhoven Univ. of Technology
Third Normal Form
- Third Normal Form
- Informal Presentation
- Example and Discussion
- Formal Definition
- 3NF Decomposition Algorithm
- Principle and Properties
- Lossless-join, dependency-preserving decomposition into 3NF
- Proof of Correctness
- Example of 3NF Decomposition
- Third Normal Form and Boyce-Codd Normal Form
Third Normal Form
Informal Presentation
- Motivation
- There are some situations where
- BCNF decomposition is not dependency preserving, and
- Efficient checking for FD violation on updates is important
- Solution
- Define a weaker normal form, called Third Normal Form
- FDs can be checked on individual relations without computing a join
- There is always a lossless-join, dependency-preserving decomposition into 3NF
Third Normal Form
Informal Presentation
- Motivation
- Sometimes a relational schema and its FDs are not in BCNF but one does not want to decompose further
- Example:
- Relation Bookings with attributes:
- title, the name of the performance
- theater, the name of the theater where the performance is being shown
- city, where the theater is located
- FDs are: theater city, title city theater
- Is there a BCNF violation?
Third Normal Form
Informal Presentation
- Motivation
- Decomposition to get to BCNF may not always be desirable
- BCNF decomposition is not dependency preserving, and
- Efficient checking for FD violation on updates is important
- 3NF relaxes BCNF to allow relations that cannot be decomposed
into BCNF relations without losing ability to check each FD
- Informal Definition of 3NF
- A relation R is in third normal form if:
Third Normal Form
Informal Presentation
- Informal Definition of 3NF
- A relation R is in third normal form if:
- The difference between BCNF and 3NF:
- “B is a member of some candidate key”
- Previous example schema is in 3NF
- Candidate keys here are: (title, city) and (theater, title)
- Theater is not a superkey but city is a member of a cand. key
- What is the problem with this schema?
Third Normal Form
Informal Presentation
- Informal Definition of 3NF
- Previous example schema is in 3NF
- What is the problem with this schema?
- The schema contains redundant information
Third Normal Form
Formal Definition
- Definition
- A relation schema R is in third normal form (3NF) if
- for all functional dependencies in F+ of the form ,
where R and R, at least one of the following holds:
- is a trivial functional dependency ( )
- contains a key for R
- every B is part of some candidate key of R
- BCNF and 3NF
- A BCNF relation is in 3NF
- A 3NF relation is not necessary in BCNF
Third Normal Form
Formal Presentation
- Example
- Consider the two relational schemas
- R1 = (cust-name, name, house-num, street, city, state)
- cust-num name, house-num, street, city, state
-
- R2 = (house-num, street, city, state, zip)
- house-num, street, city, state zip
- zip state
- Are these relations in 3NF?
Third Normal Form
Formal Presentation
- Example
- For R1
- The only nontrivial functional dependencies in F+ are those with cust-num as a member of the left-side of the FD
- As cust-num is a superkey of R1, these functional dependencies satisfy the second condition for 3NF
Third Normal Form
Formal Presentation
- Example
- For R2
- There are two kinds of nontrivial functional dependencies in F+:
- Those with (house-num, street, city, state) as a subset of the left hand side of the FD: As (house-num, street, city, state) is a superkey for R2, then functional dependencies satisfy the second condition for 3NF
- Those of the form {zip} {state} where
- For any such functional dependency:
- ( {state}) – ( {zip}) = {state} (or = )
- Because state is a candidate key of R2, such functional dependencies satisfy the third condition for 3NF
Third Normal Form
Decomposition into 3NF
- Principles
- Input/Output
- Input
- A set of functional dependencies F
- A relation schema R
- Output
- A lossless-join, dependency-preserving decomposition in 3NF
- Decomposition of R into relation schemas that are in 3NF
- Canonical Cover
- The set of dependencies Fc in the algorithm is a canonical cover of the functional dependencies
Third Normal Form
Decomposition into 3NF
- Principles
- The algorithm takes as set of dependencies and adds one schema at a time, instead of decomposing the initial schema repeatedly
- The result is not uniquely defined since
- A set of functional dependencies can have more than one canonical cover
- In some cases, the result of the algorithm depends on the order which it considers the dependencies in Fc
(minor bug in the algorithm, see later)
Third Normal Form
Decomposition into 3NF
- Decomposition
- Given: relation R, set F of functional dependencies
- Find: decomposition of R into a set of 3NF relation Ri
- Algorithm (sketch, real algorithm on next slide):
- Decomposition produces a lossless join and preserves dependencies
Third Normal Form
Decomposition into 3NF
- Algorithm (Silberschatz, 7.14)
- Note: forgotten step: remove Ri that are subset of another Rk
Third Normal Form
Decomposition into 3NF
- Example
- Semester database of a university
- Relational schema R=(L, I, T, R, S, G)
- Attributes
- L: Lecture R: Room
- I: Instructor S: Student
- G: Grade T: Time
- Functional Dependencies
- L I, TR L, TI R, LS G, TS R, TRI LR
Third Normal Form
Decomposition into 3NF
- Example
- R=(L, I, T, R, S, G)
- F: {L I, TR L, TI R, LS G, TS R, TRI LR}
- Are all FDs necessary? No !
- TR L, TI R then TRI LR
- Canonical cover of F
- Fc= {L I, TR L, TI R, TS R, LS G}
- Key: (ST)
- Key attributes: S, T
Third Normal Form
Decomposition into 3NF
- Example
- R = (L, I, T, R, S, G)
- Fc = {L I, TR L, TI R, TS R, LS G}
- Key attributes: S, T
- Decomposition in 3NF
- R1 = (L, I) R2 = (T, R, L)
- R3 = (T, I, R) R4 = (L, S, G)
- R5 = (S, T, R)
Third Normal Form
Decomposition into 3NF
- 3NF Decomposition Algorithm
- Proof of Correctness
- 3NF decomposition algorithm is lossless join, dependency preserving decomposition into 3NF
- Lossless join
- Dependency preserving
- 3NF
-
Third Normal Form
Decomposition into 3NF
- 3NF Decomposition Algorithm
- Decomposition is dependency preserving
- 3NF decomposition algorithm is dependency preserving since there is a relation for every FD in Fc.
Third Normal Form
Decomposition into 3NF
- 3NF Decomposition Algorithm
- Decomposition is lossless join
- Lossless join decomposition
- A decomposition {R1, R2} is a lossless-join decomposition
if R1 R2 R1 or R1 R2 R2
- Idea:
- A candidate key (K) is in one of the relations Ri in decomposition (last step of algorithm guarantees this)
- Closure of candidate key under Fc must contain all attributes in R (definition of candidate key)
- Follow the steps of attribute closure algorithm (Fig. 7.7)
to show that the sufficient lossless join condition is satisfied for K+. (details omitted)
Third Normal Form
Decomposition into 3NF
- 3NF Decomposition Algorithm
- Decomposition into 3NF
- Claim
- If a relation Ri is in the decomposition generated by the synthesis algorithm, then Ri is in 3NF
- Idea
- To test for 3NF, it is sufficient to consider the functional dependencies whose right-hand side is a single attribute
- Therefore to see that Ri is in 3NF, we must show that any functional dependency that holds in Ri, satisfies the definition of 3NF
Third Normal Form
Decomposition into 3NF
- 3NF Decomposition Algorithm
- Decomposition into 3NF
- Demonstration
- Assume is the dependency that generated Ri in the algorithm
- B must be in or , since B is in Ri and generated Ri
- Let us consider two possible cases
- B is in but not
- B is in but not
Third Normal Form
Decomposition into 3NF
- 3NF Decomposition Algorithm
- Decomposition into 3NF
- Demonstration
- B is in but not in :
- must be superkey (why?)
- The second condition of 3NF is satisfied
- B is in but not in
- is thus a candidate key
- The third alternative in the definition of 3NF is satisfied
- Note: we cannot show that is a superkey. This shows exactly why the third alternative is present in the definition of 3NF
Third Normal Form
Decomposition into 3NF
- B is in
- Assume is not a superkey
- must contain some attribute not in
- Since B is in F+ it must be derivable from Fc, by using attribute closure on
- Attribute closure cannot have used
- if it had been used, must be contained in the attribute closure of , which is not possible, since we assumed is not a superkey
- Now, using (- {B}) and B, we can derive B (since , and since B is non-trivial)
- Then, B is extraneous in the right-hand side of ; which is not possible since is in Fc
- Thus, if B is in then must be a superkey
Third Normal Form
Comparison of BCNF and 3NF
- BCNF or 3NF?
- Relations in BCNF and 3NF
- Relations in BCNF: no repetition of information
- Relations in 3NF: problem of repetition of information
- Decomposition in BCNF and in 3NF
- It is always possible to decompose a relation into relations in 3NF and
- the decomposition is lossless
- dependencies are preserved
- It is always possible to decompose a relation into relations in BCNF and
- the decomposition is lossless
- the information is not repeated
Third Normal Form
Comparison of BCNF and 3NF
- To summarize
- Design Goals
- Goal for a relational database design is:
- BCNF (no redundant information)
- Lossless join
- Dependency preservation
- If we cannot achieve this, we accept:
- 3NF (possible repetition of information)
- Lossless join
- Dependency preservation