Check BCNF at movie database?

464 views Asked by At

I am a bit confused how to check BCNF in a database diagram.I saw some youtube videos about what is database normalization (1NF,2NF...) but when the time comes to apply these rules to my project , i don't know what to do .

My ER diagram for a movie database

ER diagram to postgre sql code:

CREATE TABLE People ( 
    birth_date  DATE    NOT NULL,
    last_name   CHAR(30)    NOT NULL,
    name    CHAR(30)    NOT NULL,
    person_id   INTEGER NOT NULL,
PRIMARY KEY (person_id) );

CREATE TABLE Roles ( 
    role_id INTEGER NOT NULL,
    role_name   CHAR(30)    NOT NULL,
PRIMARY KEY (role_id) );

CREATE TABLE genre ( 
    genre_id    INTEGER NOT NULL,
    genre_name  INTEGER NOT NULL,
PRIMARY KEY (genre_id) );

CREATE TABLE Movies ( 
    movie_id    INTEGER NOT NULL,
    title   CHAR(30)    NOT NULL,
    rating  REAL    NOT NULL,
    release_date    DATE    NOT NULL,
PRIMARY KEY (movie_id) );

CREATE TABLE film_people ( 
    role_id INTEGER NOT NULL,
    person_id   INTEGER NOT NULL,
    movie_id    INTEGER NOT NULL,
    FK1_movie_id    INTEGER NOT NULL,
    FK2_person_id   INTEGER NOT NULL,
    FK3_role_id INTEGER NOT NULL,
PRIMARY KEY (FK1_movie_id, FK2_person_id, FK3_role_id),
UNIQUE (role_id),
UNIQUE (person_id),
UNIQUE (movie_id) );

CREATE TABLE film_genre ( 
    movie_id    INTEGER NOT NULL,
    genre_id    INTEGER NOT NULL,
    FK1_movie_id    INTEGER NOT NULL,
    FK2_genre_id    INTEGER NOT NULL,
PRIMARY KEY (FK1_movie_id, FK2_genre_id),
UNIQUE (movie_id),
UNIQUE (genre_id) );

ALTER TABLE film_people ADD FOREIGN KEY (FK1_movie_id) REFERENCES Movies (movie_id) ON DELETE CASCADE ON UPDATE CASCADE;

ALTER TABLE film_people ADD FOREIGN KEY (FK2_person_id) REFERENCES People (person_id) ON DELETE CASCADE ON UPDATE CASCADE;

ALTER TABLE film_people ADD FOREIGN KEY (FK3_role_id) REFERENCES Roles (role_id) ON DELETE CASCADE ON UPDATE CASCADE;

ALTER TABLE film_genre ADD FOREIGN KEY (FK1_movie_id) REFERENCES Movies (movie_id) ON DELETE CASCADE ON UPDATE CASCADE;

ALTER TABLE film_genre ADD FOREIGN KEY (FK2_genre_id) REFERENCES genre (genre_id) ON DELETE CASCADE ON UPDATE CASCADE;

Main source for movie database design: How to design a movie database?

So, at given ER diagram , my purpose is to find functional dependencies and apply BCNF normalization.

Any help is appreciated!

1

There are 1 answers

1
reaanb On BEST ANSWER

The Entity-Relationship model is more of a conceptual model than a logical model. It adds semantics which makes it more familiar to people untrained in formal logic, while restricting one to a subset of the relational model and producing reasonably normalized relations. We don't normally deal with normalization at the ER level - entities and relationships are designed with only a single primary key and everything else is a dependency of those keys, there's no notation to indicate alternate candidate keys or partial or transitive functional dependencies. Also, the extra semantics of ER is a hindrance when normalizing - there's no logical value in trying to maintain the difference between entity sets and value sets, or between entity relations and relationship relations.

Normalization is better checked and applied after converting your ER model to a relational model. By this, I mean converting each entity and each relationship to a separate relation and listing their functional dependencies and candidate keys. These can be mostly derived directly from the ER model, but watch out for hidden natural keys (especially when you used a surrogate key), look for dependencies among the non-key attributes (especially if you used EER composite attributes), and watch out for 4NF violations from multivalued dependencies if you used EER multivalued attributes.

You can draw relational diagrams which consist of a box for each column and arrows between the boxes to indicate dependencies. I prefer to just type it out in text, e.g.

genre: genre_id -> genre_name
movies: movie_id -> title, release_date, rating
film_genre: movie_id, genre_id -> ()

The empty parentheses in the last line is to indicate that there are no non-key attributes in that relation. It's important to try and identify ALL candidate keys and dependencies that hold in order to normalize correctly, don't just assume the ER model is correct or complete. Once you've got all your candidate keys and functional dependencies listed, you can then work through the normal forms to check them.

BTW, your relationship relations aren't implemented correctly in your physical model/SQL. They should be:

CREATE TABLE film_people ( 
    role_id INTEGER NOT NULL,
    person_id INTEGER NOT NULL,
    movie_id INTEGER NOT NULL,
    PRIMARY KEY (movie_id, person_id, role_id),
    FOREIGN KEY (movie_id) REFERENCES Movies (movie_id) ON DELETE CASCADE ON UPDATE CASCADE,
    FOREIGN KEY (person_id) REFERENCES People (person_id) ON DELETE CASCADE ON UPDATE CASCADE,
    FOREIGN KEY (role_id) REFERENCES Roles (role_id) ON DELETE CASCADE ON UPDATE CASCADE
);

CREATE TABLE film_genre ( 
    movie_id INTEGER NOT NULL,
    genre_id INTEGER NOT NULL,
    PRIMARY KEY (movie_id, genre_id),
    FOREIGN KEY (movie_id) REFERENCES Movies (movie_id) ON DELETE CASCADE ON UPDATE CASCADE,
    FOREIGN KEY (genre_id) REFERENCES genre (genre_id) ON DELETE CASCADE ON UPDATE CASCADE
);

I removed the duplicated columns, e.g. movie_id and FK1_movie_id. Perhaps these were created due to the inclusion of key attributes on your relationships in your ER diagram? Normally we understand that a relationship's key is combined from the keys of the entities that participate in the relationship, so we don't indicate it on the diagram.

I also removed the unique constraints on each of the columns. Think about it - does each movie have only one role? Can each person play only one part in his/her life? Can each role be performed only once? Does each movie belong to only one genre? Does each genre contain only one movie? Those constraints didn't make sense.

Also, on your diagram, the (0,N) cardinality indicators don't make sense. 0 usually indicates optional components of a relationship. Can a film_genre instance be recorded without a movie_id and/or genre_id? No, each relationship instance requires both entities. ER associations are generally assumed to be unconstrained unless indicated, so the only cardinality I ever indicate is 1 when an entity in a relationship is dependent and not part of the key. For optional associations I use dotted lines.

Back to your question, take some time to think about the film_people relation, there may be a BCNF violation here depending on how you interpret the situation. Are there hidden overlapping candidate keys? For example, are both (movie_id, person_id) and (movie_id, role_id) unique? In other words, can a person play only one role in a movie, and can only one person play each role in a movie? Think about (movie_id, role_id) and (role_id, person_id) in the same way.

Finally, also think about your ON DELETE CASCADE clauses. If you delete a person, it'll also delete associated roles from movies. If you delete a role, it'll delete a person's association with a movie. Is that correct?