Do you always design databases in at least Third Normal Form (3NF)? Why?
Databases and 3rd Normal Form
507 views Asked by peter AtThere are 4 answers
Generally I aim to design a database to at least Boyce Codd / 5th Normal Form except where I find good reasons not to. The main reason for using Normal Form is to avoid redundancy that could otherwise cause incorrect results. Another advantage is that Normal Form helps avoid "bias" in the database design (the tendency to suit some patterns of usage more than others), which makes it easier to support schema changes as requirements evolve.
No, not always.
You need to understand the Normal Forms well, to avoid doing stupid mistakes. But sometimes it's useful, for convenience or performance reasons. (Caching)
I find it mostly important to avoid data duplication that can get out of sync. (Changed names, relocated resources.) Still, sometimes that's exactly what you want. (Invoice summary at a specific point in time.)
"dportas" has an important point about bias. Breaking the Normal Form rules tends to make your code and data less maintainable and flexible.
I would at least design that way (unless I was creating a data warehouse which has differnt rules). Anytime I have seen people try to denormalize in the initial design phase they have created far more problems than they solved. I curse those people on a daily basis as I am the person who gets to fix this junk.
Back in the day, I used ER modeling for data analysis, then converted to a relational model, then converted the relational model to into SQL tables (with their indexes). It sounds complicated, but it's not. Each step is simple and manageable.
If you do the ER modeling correctly, you associate each attribute with the correct entity or relationship. Every entity gets a key, whether natural or synthetic. Discovering all the useful relationships is the deepest part of data analysis.
The ER model can be used as a guide for forming relations out of the attributes. Every entity gets a relation, and every many-to-many relationship gets a relation, while many-to-one relationships get a foreign key in one of the entity relations.
Such a design is automatically in 3NF. Departures from BCNF, 4NF, and 5NF are rare.
Making SQL tables based on relations is straightforward. Index design requires you to anticipate volume, load, and resources, and is somewhat DBMS dependent.