Database Management System - Chapter 7: Normalization of Relational Tables

Modification anomalies Functional dependencies Major normal forms Relationship independence Practical concerns Modification Anomalies Unexpected side effect Insert, modify, and delete more data than desired Caused by excessive redundancies Strive for one fact in one place

ppt36 trang | Chia sẻ: candy98 | Lượt xem: 564 | Lượt tải: 0download
Bạn đang xem trước 20 trang tài liệu Database Management System - Chapter 7: Normalization of Relational Tables, để xem tài liệu hoàn chỉnh bạn click vào nút DOWNLOAD ở trên
Chapter 7Normalization of Relational TablesOutline Modification anomaliesFunctional dependenciesMajor normal formsRelationship independencePractical concernsModification AnomaliesUnexpected side effectInsert, modify, and delete more data than desiredCaused by excessive redundanciesStrive for one fact in one place Big University Database TableModification Anomaly ExamplesInsertionInsert more column data than desiredMust know student number and offering number to insert a new courseUpdateChange multiple rows to change one factMust change two rows to change student class of student S1DeletionDeleting a row causes other facts to disappearDeleting enrollment of student S2 in offering O3 causes loss of information about offering O3 and course C3Functional DependenciesConstraint on the possible rows in a tableValue neutral like FKs and PKsAssertedUnderstand business rulesFD DefinitionX  YX (functionally) determines YX: left-hand-side (LHS) or determinantFor each X value, there is at most one Y valueSimilar to candidate keys FD Diagrams and ListsStdSSN  StdCity, StdClassOfferNo  OffTerm, OffYear, CourseNo, CrsDescCourseNo  CrsDescStdSSN, OfferNo  EnrGradeFDs in DataProve non existence (but not existence) by looking at dataTwo rows that have the same X value but a different Y valueIdentifying FDsEasy identificationStatements about uniquenessPKs and CKs resulting from ERD conversion1-M relationship: FD from child to parentDifficult identificationLHS is not a PK or CK in a converted tableLHS is part of a combined primary or candidate keyEnsure minimality of LHSNormalizationProcess of removing unwanted redundanciesApply normal formsIdentify FDsDetermine whether FDs meet normal formSplit the table to meet the normal form if there is a violation Relationships of Normal Forms1NFStarting point for most relational DBMSsNo repeating groups: flat rowsCombined Definition of 2NF/3NFKey column: candidate key or part of candidate keyAnalogy to the traditional justice oathEvery non key column depends on all candidate keys, whole candidate keys, and nothing but candidate keysUsually taught as separate definitions2NFEvery nonkey column depends on all candidate keys, not a subset of any candidate keyViolationsPart of key  nonkeyViolations only for combined keys2NF ExampleMany violations for the big university database tableStdSSN  StdCity, StdClassOfferNo  OffTerm, OffYear, CourseNo, CrsDescSplitting the tableUnivTable1 (StdSSN, StdCity, StdClass) UnivTable2 (OfferNo, OffTerm, OffYear, CourseNo, CrsDesc) 3NFEvery nonkey column depends only on candidate keys, not on non key columnsViolations: Nonkey  NonkeyAlterative formulationNo transitive FDsA  B, B  C then A  COfferNo  CourseNo, CourseNo  CrsDesc then OfferNo  CrsDesc3NF ExampleOne violation in UnivTable2CourseNo  CrsDescSplitting the tableUnivTable2-1 (OfferNo, OffTerm, OffYear, CourseNo)UnivTable2-2 (CourseNo, CrsDesc)BCNFEvery determinant must be a candidate key.Simpler definitionApply with simple synthesis procedureSpecial cases not covered by 3NFPart of key  Part of keyNonkey  Part of keySpecial cases are not commonBCNF ExamplePrimary key: (OfferNo, StdSSN)Many violations for the big university database tableStdSSN  StdCity, StdClassOfferNo  OffTerm, OffYear, CourseNoCourseNo  CrsDescSplit into four tables Simple Synthesis ProcedureEliminate extraneous columns from the LHSsRemove derived FDs Arrange the FDs into groups with each group having the same determinant. For each FD group, make a table with the determinant as the primary key.Merge tables in which one table contains all columns of the other table. Simple Synthesis Example IBegin with FDs shown in Slide 8Step 1: no extraneous columnsStep 2: eliminate OfferNo  CrsDescStep 3: already arranged by LHSStep 4: four tables (Student, Enrollment, Course, Offering)Step 5: no redundant tablesSimple Synthesis Example IIAuthNo  AuthName, AuthEmail, AuthAddress AuthEmail  AuthNoPaperNo  Primary-AuthNo, Title, Abstract, StatusRevNo  RevName, RevEmail, RevAddressRevEmail  RevNoRevNo, PaperNo  Auth-Comm, Prog-Comm, Date, Rating1, Rating2, Rating3, Rating4, Rating5 Simple Synthesis Example II SolutionAuthor(AuthNo, AuthName, AuthEmail, AuthAddress) UNIQUE (AuthEmail) Paper(PaperNo, Primary-Auth, Title, Abstract, Status) FOREIGN KEY (Primary-Auth) REFERENCES Author Reviewer(RevNo, RevName, RevEmail, RevAddress) UNIQUE (RevEmail) Review(PaperNo, RevNo, Auth-Comm, Prog-Comm, Date, Rating1, Rating2, Rating3,Rating4, Rating5) FOREIGN KEY (PaperNo) REFERENCES Paper FOREIGN KEY (RevNo) REFERENCES Reviewer Multiple Candidate KeysMultiple candidate keys do not violate either 3NF or BCNFStep 5 of the Simple Synthesis Procedure creates tables with multiple candidate keys.You should not split a table just because it contains multiple candidate keys.Splitting a table unnecessarily can slow query performance.Relationship Independence and 4NFM-way relationship that can be derived from binary relationships Split into binary relationshipsSpecialized problem4NF does not involve FDs Relationship Independence ProblemRelationship Independence Solution Extension to the Relationship Independence SolutionMVDs and 4NFMVD: difficult to identifyA  B | C (multi-determines)A associated with a collection of B and C valuesB and C are independentNon trivial MVD: not also an FD4NF: no non trivial MVDs MVD RepresentationA  B | COfferNo  StdSSN | TextNoGiven the two rows above the line, the two rows below the line are in the table if the MVD is true.Higher Level Normal Forms5NF for M-way relationshipsDKNF: absolute normal formDKNF is an ideal, not a practical normal formRole of NormalizationRefinementUse after ERDApply to table design or ERDInitial designRecord attributes and FDsNo initial ERDMay reverse engineer an ERD after normalizationAdvantages of Refinement ApproachEasier to translate requirements into an ERD than list of FDsFewer FDs to specifyFewer tables to splitEasier to identify relationships especially M-N relationships without attributesNormalization ObjectiveUpdate biasedNot a concern for databases without updates (data warehouses)DenormalizationPurposeful violation of a normal formSome FDs may not cause anomaliesMay improve performanceSummaryBeware of unwanted redundanciesFDs are important constraintsStrive for BCNFUse a CASE tool for large problemsImportant tool of database developmentFocus on the normalization objective