Second Normal Form (2NF)
In Second
Normal Form (2NF), the normalization process highly depends on the following;
- Keys
- Functional Dependency, and
- Decomposition.
For a table
to be in 2NF, it has to satisfy the following set of properties;
Property
1: The table should satisfy all the properties of previous normal form,
i.e., 1NF. In other words, the table should be in 1NF..
Property
2: There should not be any partial key dependencies. This property is not
applicable for relations (tables) which have single simple attribute as Primary
Key. Because, single attribute primary key means that all the other attributes
can be determined by the Prime attribute. Hence, property 2 is applicable for
relations those have more than one attribute combination as Primary Key (i.e.,
Composite Key). Because, here is the possibility of some of the attributes of
the table depend on any one or all of the attributes of the composite primary
key.
Let us consider the table 1 STUDENT given
below for our discussions.
[Assumptions
on the following table STUDENT: student can register many courses, and every
course is taught by exactly one instructor]
RegNo
|
SName
|
Gen
|
PR
|
Phone
|
Courses
|
CInst
|
R1
|
Sundar
|
M
|
BTech
|
9898786756
|
Database
|
Kumar
|
R2
|
Ram
|
M
|
MS
|
9897786776
|
Database
|
Kumar
|
R3
|
Karthik
|
M
|
MCA
|
8798987867
|
Data Structures
|
Steve
|
R4
|
John
|
M
|
BSc
|
7898886756
|
Multimedia
|
Badrinath
|
R1
|
Sundar
|
M
|
BTech
|
9898786756
|
Data Structures
|
Steve
|
R3
|
Karthik
|
M
|
MCA
|
8798987867
|
Multimedia
|
Badrinath
|
Table 1 - STUDENT
For
this table, we can write all the possible Functional Dependencies as follows;
RegNo →
SName
RegNo →
Gen
RegNo →
PR
RegNo →
Phone
Courses →
CInst
CInst →
Courses
RegNo Courses → CInst
RegNo Cinst → Courses
Or, the
first four FDs can also be written as,
RegNo →
SName Gen PR Phone
It
means, wherever you give a register number, (for example, ‘R1’), it will always
show one
SName value (for ‘R1’,
SName is ‘Sundar’), one
Gen value (for ‘R1’, Gen is ‘M’), one
PR value (‘R1’ is doing ‘BTech’), and
one
Phone ( 9898786756 is the phone
number of ‘R1’).
But,
attributes Courses and CInst (Course instructors) cannot be
included in this list. Because, RegNo
cannot uniquely identify either Courses
or CInst. The reason is some of the
students registered more than one course (for example ‘R1’ registered
‘Database’, and ‘Data Structures’ courses).
As a whole, all the FDs can be collectively written with the key for this relation as,
RegNo Courses →
SName Gen PR Phone CInst
For the table STUDENT, according to the above said FD, the key is (
RegNo, Courses) or (
RegNo, CInst). But, both the keys are showing
Partial Dependencies. That is, to determine the attributes
SName,
Gen,
PR, and
Phone, the whole key is not required. Likewise, to determine the attributes
Courses or
CInst, the whole key is not required. Hence, the design clearly shows
Partial Key Dependency.
How
do we convert an un-normalized table into Second Normal Form (2NF)?
The
following steps to be followed;
- Find the possible Primary Key for the table in
question
- If Primary key is composite, then find the
individual Functional Dependencies (FDs) of every attribute of composite key.
- If all or some of the composite key individually
identify other attributes, then Decompose (break into different tables) the
table in question into two or more tables.
Steps
explained with example;
- As we have already discussed, for the table
STUDENT given in table 1, the Primary key is the Composite Key. The key is
(RegNo, Courses) which is a composite key.
- As Primary key is the composite for this
relation, we need to find individual FDs for all the composite attributes,
i.e., FDs of RegNo and Courses. The attributes Courses and RegNo have following FDs;
-
Courses → CInst
RegNo →
SName
RegNo →
Gen
RegNo →
PR
RegNo →
Phone
- Thus, attribute Course determines the attribute CInst uniquely. Attribute RegNo determines the attributes SName, Gen, PR, and Phone uniquely. Both
RegNo and Courses show partial key dependencies and thus violates Property 2.
Hence, we need decomposition.
How
do we decompose the un-normalized relation?
We
need to identify the set of attributes which can be determined by individual or
composite attributes of the Composite Primary Key. In its simplest form, we
break the table in question into multiple tables based on the determination of
other attributes by the attributes in the primary key. That is, create a new table for every prime attribute of the composite key (partial key of whole key) and its dependent attributes.
In
our example, we can break STUDENT into two tables. Because, RegNo determines
one set of attributes, and Courses determines other set. Hence, we would get
the following conceptual schemas.
STUDENT(RegNo,
SName, Gen, PR, Phone) and
COURSES(Courses,
CInst)
Let
us visualize these tables.
RegNo
|
SName
|
Gen
|
PR
|
Phone
|
R1
|
Sundar
|
M
|
BTech
|
9898786756
|
R2
|
Ram
|
M
|
MS
|
9897786776
|
R3
|
Karthik
|
M
|
MCA
|
8798987867
|
R4
|
John
|
M
|
BSc
|
7898886756
|
Table 2: STUDENT
Courses
|
CInst
|
Database
|
Kumar
|
Data Structures
|
Steve
|
Multimedia
|
Badrinath
|
TABLE 3: COURSES
Look
at those tables carefully. Table 2 STUDENT has no repeating groups. Hence it is
in 1NF. The only primary key is RegNo
which is single simple attribute. It satisfies the condition of 2NF. Hence it
is in 2NF. Table 2 COURSES has no repeating groups as well, has one FD where
singe attribute Courses is the
Primary Key. Hence Courses is also in 2NF.
Though
the tables are in 2NF, they show a problem. The problem is both table are not
related, which leads to some of the following questions can be answered and
some cannot be.
Q1 - we can answer student information if
we have RegNo,
Q2 - we can have course instructor if we
know the course name.
Q3 - we cannot answer the courses
registered by any student which is very vital part of the design.
The
solution is to include a Foreign key
in any one of the tables to establish a link between these tables or create a
new table for establishing links.
On
deciding which of the table’s attribute can be taken as Foreign Key in other
table, we may use mapping cardinality ratios. In our example, the relationship
between STUDENT and COURSES is Many-To-Many. That is, one student can register
many courses and a course can have one or more students. So, we need to create
a third table to establish a link as follows (where the duplication accepted);
RegNo
|
Courses
|
R1
|
Database
|
R2
|
Database
|
R3
|
Data Structures
|
R4
|
Multimedia
|
R1
|
Data Structures
|
R3
|
Multimedia
|
Table 4: STU_COURSES
At
the end, we have three tables (STUDENT, COURSES, and STU_COURSES) as part of
the normalization process of table 1 STUDENT with the following schemas (Primary
keys underlined).
STUDENT(RegNo,
SName, Gen, PR, Phone) and
COURSES(Courses,
CInst)
STU_COURSES(RegNo,
Courses) – only this table shows minimal redundancy
Some
of the advantages and disadvantages of the above normalization:
Advantages:
1.
As in table 1 STUDENT, there is no multiple repeated
records of students in table 2 STUDENT and no multiple repeated records for
courses in table 3 COURSES.
2. Space
is reduced.
3.
No redundancy, so no inconsistency
Disadvantages:
1.
A new table need to be created
2.
To answer Q3 mentioned above, we need to join
all the three tables. If Q3 is one of the frequent queries used in the
application, then it makes trouble. That is, it needs more system resources.