Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

sql - Identifying transitive dependencies

I am working with a table in 1NF that has 10 attributes in total and a composite primary key made up of two of them.

  • In my situation a fully functional dependency involves the dependent relying on both attributes in my primary key.
  • A partial dependency relies on either one of the attributes from the primary key.
  • A transitive dependency involves two or more non-key attributes in a functional dependence where one of the non-key attributes is dependent on a key attribute from my primary key.

It seems that transitive dependencies are pulled out of the table during normalization, but my assignment requires us to identify all functional dependencies before we draw the dependency diagram (after which we normalize the tables). Parentheses identify the primary key attributes:

(Student ID), Student Name, Student Address, Student Major, (Course ID), Course Title, Instructor ID, Instructor Name, Instructor Office, Student_crse_grade
  • Only one class is taught for each course ID.
  • Students may take up to 4 courses.
  • Each course may have a maximum of 25 students.
  • Each course is taught by only one Instructor.
  • Each student may have only one major.

1 Reply

From your question it seems that you do not have a clear understanding of basics.

Application relationships & situations

First you have to take what you were told about your application (including business rules) and identify the application relationships (aka associations). Each gets a (base) table (aka relation) variable. Such an application relationship can be characterized by a row membership criterion (aka predicate) (aka meaning) that is a statement template. Eg suppose the criterion "student [student_id] takes course [course_title]" has table variable TAKES. The parameters of the criterion are the columns of its table. We can use a table name with columns (like an SQL declaration) as a shorthand for the criterion. Eg TAKES(student_id,course_title). A criterion plus a row makes a statement (aka proposition) about a situation. Eg row (17,'CS101') gives "student 17 takes course 'CS101'", ie TAKES(17,'CS101'). Rows that make a true statement go in the table and rows that make a false one stay out.

If we can rephrase a criterion as the AND/conjunction of two others then we only need the tables with those other criteria. This is because JOIN is defined so that the JOIN of two tables containing the rows making their criteria true returns the rows that make the AND/conjunction of their criteria true. So we can JOIN the two tables to get back the original. (This is what normalization is doing by decomposing tables into components.)

/* student with id [si] has name [sn] and address [sa] and major [sm]
    and takes course [ci] with title [ct]
    from instructor with id [ii] and name [in] and office [io]
    with grade [scg] */
T(si,sn,sa,sm,ci,ct,ii,in,io,scg)

/* student with id [si] has name [sn] and address [sa] and major [sm]
    and takes course [ci] with grade [scg] */
SG(si,sn,sa,sm,ci,scg)

/* course [ci] with title [ct]
    is taught by instructor with id [ii] and name [in] and office [io] */
CI(ci,ct,ii,in,io)

/* T(si,sn,sa,sm,ci,ct,ii,in,io,scg)
    IFF SG(si,sn,sa,sm,ci,scg) AND CI(ci,ct,ii,in,io) */
T = SG JOIN CI
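
To make the JOIN/AND correspondence concrete, here is a minimal Python sketch. It assumes relations are held as lists of dicts, with CI holding only course and instructor columns. The sample rows and the helper name natural_join are hypothetical, and the tables are trimmed to a few columns for brevity.

```python
def natural_join(r, s):
    """Natural join of two relations given as lists of dicts (column -> value)."""
    out = []
    for a in r:
        for b in s:
            common = a.keys() & b.keys()
            # Two rows combine only if they agree on every shared column.
            if all(a[c] == b[c] for c in common):
                row = {**a, **b}
                if row not in out:  # a relation holds no duplicate rows
                    out.append(row)
    return out

# Hypothetical rows making the SG criterion true (student ... takes course ... with grade ...).
SG = [
    {"si": 17, "sn": "Ann", "ci": "CS101", "scg": "A"},
    {"si": 18, "sn": "Bob", "ci": "CS101", "scg": "B"},
    {"si": 17, "sn": "Ann", "ci": "MA201", "scg": "C"},
]
# Hypothetical rows making the CI criterion true (course ... is taught by instructor ...).
CI = [
    {"ci": "CS101", "ct": "Intro to CS", "ii": 5},
    {"ci": "MA201", "ct": "Calculus", "ii": 9},
]

# Each row of T makes "SG criterion AND CI criterion" true.
T = natural_join(SG, CI)
for row in T:
    print(row)
```

Each course title and instructor is stored once in CI but shows up once per enrolment in T, which is the redundancy that the decomposition removes.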

Together the application relationships and the situations that can arise determine both the rules and the constraints! Those are just things that are true of every application situation or every database state (ie values of one or more base tables), which are a function of the criteria and the possible application situations.

Then we normalize to reduce redundancy. Normalization replaces a table variable by others whose predicates AND/conjoin together to the original's when this is beneficial.

The only time a rule can tell you something that you don't already know from the (putative) criteria and (putative) situations is when you don't really understand the criteria or what situations can turn up, and the a priori rules are clarifying something about that. A person giving you rules is already using application relationships that they assume you understand and they can only have determined that a rule holds by using them and all the application situations that can arise (albeit informally)!

(Sadly many presentations of information modeling don't even mention application relationships. Eg: If someone says "there is an X:Y relationship" then they must already have in mind a particular binary application relationship between entities; knowing it and what application situations can arise, they are reporting that it has a certain cardinality in a certain direction. This will correspond to some application relationship and table using column sets that identify entities. Plus some presentations/methods call FKs "relationships", confusing them with application relationships.)

Check out the "fact-based" information modeling methods Object-Role Modeling and (its predecessor) NIAM.

FDs & CKs

Given the criterion for putting rows into or leaving them out of a table and all possible situations that can arise, only some values (sets of rows) can ever be in a table variable.

For every subset of columns you need to decide which other columns can only have one value for a given subrow value for those columns. When it can only have one we say that the subset of columns functionally determines that column. We say that there is a FD (functional dependency) columns->column. This is when we can express the table's predicate as "... AND column=F(columns)" for some function F. But every superset of that subset will also functionally determine it, so that cuts down on cases. Conversely, if a given set does not determine a column then no subset of the set does. Applying Armstrong's axioms gives all the FDs that hold when given FDs hold. Also, you may think in terms of column sets being unique; then all other columns are functionally dependent on that set. Such a set is called a superkey.
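
A rough Python sketch of both ideas follows. The function violates checks whether some sample rows rule out an FD, and closure derives what a column set determines from given FDs (the usual attribute-closure shortcut for applying Armstrong's axioms). The FDs listed are only my reading of the business rules in the question, using the column abbreviations from above. Treat them as assumptions, not as the assignment's answer.

```python
def violates(rows, lhs, rhs):
    """True if two rows agree on lhs but differ on rhs, so lhs -> rhs cannot hold."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        val = tuple(row[c] for c in rhs)
        if seen.setdefault(key, val) != val:
            return True
    return False

def closure(attrs, fds):
    """All columns functionally determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

# Assumed FDs (determinant, determined), read off the question's business rules.
FDS = [
    ({"si"}, {"sn", "sa", "sm"}),   # a student has one name, address, major
    ({"ci"}, {"ct", "ii"}),         # a course has one title and one instructor
    ({"ii"}, {"in", "io"}),         # an instructor has one name and office
    ({"si", "ci"}, {"scg"}),        # one grade per student per course
]

# Hypothetical sample rows.
rows = [
    {"si": 17, "sm": "Math", "ci": "CS101"},
    {"si": 17, "sm": "Math", "ci": "MA201"},
]
print(violates(rows, ["si"], ["sm"]))  # False: nothing here contradicts si -> sm
print(violates(rows, ["si"], ["ci"]))  # True: si -> ci cannot hold
print(sorted(closure({"ci"}, FDS)))    # ['ci', 'ct', 'ii', 'in', 'io']
```

Sample rows can only refute an FD. Whether an FD holds is a statement about every situation that can arise, so it has to come from the criterion and the business rules.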

Only after you have determined the FDs can you determine the CKs (candidate keys)! A CK is a superkey that contains no smaller superkey. (That a CK and/or superkey is present is also a constraint.) We can pick a CK as PK (primary key). PKs have no other role in relational theory.
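
As an illustration of getting CKs from FDs, here is a brute-force Python sketch. A column set is a superkey when its closure is every column, and a CK is a superkey containing no smaller one. The FDs are again my assumed reading of the question's rules, and candidate_keys is a hypothetical helper name.

```python
from itertools import combinations

def closure(attrs, fds):
    """All columns functionally determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(columns, fds):
    """Minimal superkeys, found by trying column subsets from smallest to largest."""
    keys = []
    for size in range(1, len(columns) + 1):
        for combo in combinations(sorted(columns), size):
            candidate = set(combo)
            if any(k <= candidate for k in keys):
                continue  # contains a smaller superkey, so not minimal
            if closure(candidate, fds) == columns:
                keys.append(candidate)
    return keys

COLUMNS = {"si", "sn", "sa", "sm", "ci", "ct", "ii", "in", "io", "scg"}
# Assumed FDs, read off the question's business rules.
FDS = [
    ({"si"}, {"sn", "sa", "sm"}),
    ({"ci"}, {"ct", "ii"}),
    ({"ii"}, {"in", "io"}),
    ({"si", "ci"}, {"scg"}),
]

print(candidate_keys(COLUMNS, FDS))  # a single CK made of si and ci
```

Under these assumed FDs the only CK is {si, ci}, which matches the PK given in the question. The exhaustive search is fine for 10 columns but grows exponentially with the column count.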

You wrote: "A partial dependency relies on either one of the attributes from the Primary key."

Don't use "involve" or "relies on" to give a definition. Say, "when" or "iff" ("if and only if").

Read a definition. A FD that holds is partial when/iff using a proper subset of the determinant gives a FD that holds with the same determined column; otherwise it is full. Note that this does not involve CKs. A relation is in 2NF when all non-prime attributes are fully functionally dependent on every CK.
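
The partial/full definition translates fairly directly into code. Here is a Python sketch, under the same assumed FDs as before (my reading of the question's rules), that tests each non-prime column against the CK {si, ci}. The name is_partial is hypothetical.

```python
from itertools import combinations

def closure(attrs, fds):
    """All columns functionally determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_partial(lhs, attr, fds):
    """lhs -> attr is partial iff some proper subset of lhs already determines attr."""
    return any(
        attr in closure(set(sub), fds)
        for size in range(len(lhs))  # sizes 0 .. len(lhs)-1: proper subsets only
        for sub in combinations(sorted(lhs), size)
    )

COLUMNS = {"si", "sn", "sa", "sm", "ci", "ct", "ii", "in", "io", "scg"}
# Assumed FDs, read off the question's business rules.
FDS = [
    ({"si"}, {"sn", "sa", "sm"}),
    ({"ci"}, {"ct", "ii"}),
    ({"ii"}, {"in", "io"}),
    ({"si", "ci"}, {"scg"}),
]
CK = {"si", "ci"}

non_prime = COLUMNS - CK
partial = sorted(a for a in non_prime if is_partial(CK, a, FDS))
full = sorted(a for a in non_prime if not is_partial(CK, a, FDS))
print(partial)  # ['ct', 'ii', 'in', 'io', 'sa', 'sm', 'sn']
print(full)     # ['scg']
```

With these FDs only the grade depends fully on the CK, so the original table is not in 2NF.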

You wrote: "A transitive dependency involves two or more non-key attributes in a functional dependence where one of the non-key attributes is dependent on a key attribute (from my PK)."

Read a definition. S -> T is transitive when/iff there is an X where S -> X and X -> T and not (X -> S) and T is not in X (so X -> T is not trivial). Note that this does not involve CKs. A relation is in 3NF when all non-prime attributes are non-transitively dependent on every CK.
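
A Python sketch of the transitivity test, searching for a witness X by brute force. It reads the last condition as "T is not a column of X", so that X -> T is not a trivial FD. That reading is my assumption, as are the FDs, which again come from my reading of the question's rules. The name transitive_via is hypothetical.

```python
from itertools import combinations

def closure(attrs, fds):
    """All columns functionally determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def transitive_via(s, t, columns, fds):
    """Return one column set X showing that s -> t is transitive, else None."""
    s_closure = closure(s, fds)
    for size in range(1, len(columns) + 1):
        for combo in combinations(sorted(columns), size):
            x = set(combo)
            x_closure = closure(x, fds)
            if (x <= s_closure              # S -> X
                    and t not in x          # X -> T is not trivial (assumed reading)
                    and t in x_closure      # X -> T
                    and not s <= x_closure):  # not (X -> S)
                return x
    return None

COLUMNS = {"si", "sn", "sa", "sm", "ci", "ct", "ii", "in", "io", "scg"}
# Assumed FDs, read off the question's business rules.
FDS = [
    ({"si"}, {"sn", "sa", "sm"}),
    ({"ci"}, {"ct", "ii"}),
    ({"ii"}, {"in", "io"}),
    ({"si", "ci"}, {"scg"}),
]

print(transitive_via({"ci"}, "in", COLUMNS, FDS))         # {'ii'}
print(transitive_via({"ci"}, "ct", COLUMNS, FDS))         # None
print(transitive_via({"si", "ci"}, "scg", COLUMNS, FDS))  # None
```

So under these FDs the instructor name depends on the course id transitively, via the instructor id, while the course title and the grade do not depend transitively on their determinants.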

"1NF" has no single meaning.

