Saturday, March 26, 2016

Normalization in Database

In This Post we are discussing about Normalization in Database and different types of Normalization.This article helps who have been working with databases and might have heard Database Normalization.

What is Normalization?
Database Normalization is the process of organizing the data in a database.The main problem existing in "raw" database is redundancy. Redundancy means storing the same data items in more than one place. Redundancy creates several problems like fallowing:

1. Extra storage space because storing the same data in many places takes large amount of disk space.
2. Ensuring data dependencies make sense(only storing related data in a table)
3.Entering same data more than once during insertion
4.deleting same data more than once during deletion
5.Modifying the data more than once 


To solve these problems the "raw"database needs to be normalized. This is step by step process of  removing  different kinds of redundancy at each step.

Normalization Rules:
Normalization rules are divided into the following normal form

First Normal Form(1NF)
Second Normal Form(2NF)
Third Normal Form(3NF)
BCNF

First Normal Form(1NF):
The First Rule is set the basic rule for an organized database. Each set of column must have a unique value,such that multiple columns can not be used to fetch the same row. Each table should be organized into rows,and each row should have a primary key that represents it as unique.



Using the first normal form data redundancy increases,there will be many columns with same data in multiple  rows but each row contain unique.

Second Normal Form(2NF):
A relations is said to be 2NF if it is already 1NF and each and every attribute fully depends on the primary key of the relation.If a table has some attributes which is not dependent on the primary key of that table,then it is not in 2NF.Let us explain, Empid is the primary key of the above relation.Emp_name ,sales,Bank_id all depend upon Empid. But Bank_name depend upon Bank_id which is not the primary key of the table.So the table is 1NF,but not in 2NF. If this position can be removed into another related relation,it would come 2NF.


After removing the portion into another relation we store lesser amount of data in two relations without any loss information. There is also a significant reduction in redundancy.

Third Normal Form(3NF):
A relation is said to be in 3NF, if it is already in 2NF and there exists no transitive dependency in that relation. Speaking inversely, if a table contains transitive dependency, then it is not in 3NF, and the table must be split to bring it into 3NF.

Boyce-Code Normal Form (BCNF)

A relationship is said to be in BCNF if it is already in 3NF and the left hand side of every dependency is a candidate key. A relation which is in 3NF is almost always in BCNF. These could be same situation when a 3NF relation may not be in BCNF the following conditions are found true.

1. The candidate keys are composite.
2. There are more than one candidate keys in the relation.

3. There are some common attributes in the relation.




Consider, as an example, the above relation. It is assumed that:

1. A professor can work in more than one department
2. The percentage of the time he spends in each department is given.
3. Each department has only one Head of Department.

The normalization of the relation is done by creating a new relation for Dept. and Head of Dept. and deleting Head of Dept. form the given relation. The normalized relations are shown in the following.





See the dependency diagrams for these new relations.



Fourth Normal Form (4NF)


When attributes in a relation have multi-valued dependency, further Normalization to 4NF and 5NF are required. Let us first find out what multi-valued dependency is.
A multi-valued dependency is a typical kind of dependency in which each and every attribute within a relation depends upon the other, yet none of them is a unique primary key.
We will illustrate this with an example. Consider a vendor supplying many items to many projects in an organization. The following are the assumptions:

A vendor is capable of supplying many items.
A project uses many items.
A vendor supplies to many projects.
An item may be supplied by many vendors.

A multi valued dependency exists here because all the attributes depend upon the other and yet none of them is a primary key having unique value.


No comments:

Post a Comment

High Paying Jobs after Learning Python

Everyone knows Python is one of the most demand Programming Language. It is a computer programming language to build web applications and sc...