Syrian Social Nationalist Party and Database normalization: Difference between pages

From Wikipedia, the free encyclopedia
(Difference between pages)
Adonhk (talk | contribs)
 
AstroWiki (talk | contribs)
 
'''Database normalization''', sometimes referred to as ''canonical synthesis'', is a technique for designing [[relational database]] [[Table (database)|tables]] to minimize duplication of information and, in so doing, to safeguard the database against certain types of logical or structural problems, namely data anomalies. For example, when multiple instances of a given piece of information occur in a table, the possibility exists that these instances will not be kept consistent when the data within the table is updated, leading to a loss of [[data integrity]]. A table that is sufficiently normalized is less vulnerable to problems of this kind, because its structure reflects the basic assumptions for when multiple instances of the same information should be represented by a single instance only.
{{Citations missing|date=August 2007}}
[[Image:Flag of the Syrian Social Nationalist Party.svg|right|thumb|SSNP flag]]
The '''Syrian Social Nationalist Party''' (or SSNP) ([[Arabic language|Arabic]]: الحزب السوري القومي الاجتماعي ''al-Hizb as-Sūrī al-Qawmī al-Ijtimā`ī''), often referred to in [[French language|French]] as ''Parti Populaire Syrien'',
is a [[nationalist]] [[political party]] in [[Syria]] and [[Lebanon]]. It advocates the establishment of a [[Greater Syria]]n national state, including present Syria, Lebanon, the [[Hatay Province]] of [[Turkey]], [[Israel]], the [[Palestinian territories]], the [[Sinai Peninsula]] of [[Egypt]], [[Cyprus]], [[Jordan]], [[Iraq]], and [[Kuwait]]. <ref>Irwin, p. 24; [http://www.ssnp.com/new/about.htm ssnp.com] "Our Syria has distinct natural boundaries…" (accessed 30 June 2006).</ref>


<!--(POV: cite this or remove it) More highly normalized tables simplify development and maintenance of the database and contribute to its [[extensibility]].-->Higher degrees of normalization typically involve more tables and create the need for a larger number of [[join (SQL)|joins]], which can reduce [[computer performance|performance]]. Accordingly, more highly normalized tables are typically used in database applications involving many isolated [[database transaction|transactions]] (e.g. an [[automated teller machine]]), while less normalized tables tend to be used in database applications that need to map complex relationships between data entities and data attributes (e.g. a reporting application, or a full-text search application).
Founded in Beirut in 1932, the party has played a significant role in Lebanese politics at various points, notably being involved in attempted coups in 1949 and 1961. It was active in resistance against the Israeli occupation of Lebanon from 1982 on. It is now part of the pro-Syrian bloc, along with [[Amal Movement|Amal]] and [[Hezbollah]] (see [[Politics of Lebanon]]). In Syria, the SSNP became a major political force in the early 1950s, but was thoroughly repressed in 1955. It remained organised, and in 2005 was legalised and joined the [[Baath Party]]-led [[National Progressive Front]]. It is thought to be the largest legal party in Syria apart from the Baath, with perhaps 90,000 members.


Database theory describes a table's degree of normalization in terms of [[Database_normalization#Normal_forms|normal forms]] of successively higher degrees of strictness. A table in third normal form ('''3NF'''), for example, is consequently in second normal form ('''2NF''') as well; but the reverse is not necessarily the case.
==Foundation and early years==
[[Image:Antun11.jpg|thumb|left|Antun Sa'adeh]]The SSNP was founded by [[Antun Saadeh]], a journalist/philosopher from a [[Eastern Orthodox Church|Greek Orthodox]] family in the town of [[Dhour el Shweir]]. Saadeh had emigrated to [[South America]] in 1919 (via the USA where he stayed for about a year before continuing on to [[Brazil]]), at the age of fifteen, and in the years he lived there engaged in both Arabic-language journalism and [[Syrian nationalism|Syrian nationalist]] political activity. On his return to Lebanon some ten years later he continued working as a journalist and also taught German in the [[American University of Beirut]]. In November 1932 he established the first nucleus of the Syrian Social Nationalist Party. The party operated underground for the first three years of its existence. After it began overt activity, it was the object of harsh repression by the French [[League of Nations mandate|mandatory]] authorities. Saadeh himself was arrested several times, and in 1938 was forced to remain in South America after a visit he made there before the outbreak of [[World War II]].<ref>Charif, pp. 243-244n</ref>


Although the normal forms are often defined informally in terms of the characteristics of tables, rigorous definitions of the normal forms are concerned with the characteristics of mathematical constructs known as [[Relation (mathematics)|relations]]. Whenever information is represented relationally, it is meaningful to consider the extent to which the representation is normalized.
The party he founded was organised with a hierarchical structure and a powerful leader. Its ideology was an entirely secular form of nationalism; indeed, it posited the complete separation of religion and politics as one of the two fundamental conditions for real national unity. The other condition was determined economic and social reform.<ref>Hourani, p. 326</ref>


== Problems addressed by normalization ==
Saadeh's concept of the nation was that it was shaped by geography, not by ethnic origins, language or religion, and this led him to conclude that the Arabs could not form one nation but many nations could be called Arab. Arab nationalist thinker [[Sati' al-Husri]] considered that Saadeh "misrepresented" Arab nationalism, incorrectly associating it with a [[Bedouin]] image of the Arab and with Muslim sectarianism. Palestinian historian [[Maher Charif]] sees Saadeh's theory as a response to the religious diversity of Syria, and points to his later extension of his vision of the Syrian nation to include [[Iraq]], a country also noted for its religious diversity, as further evidence for this.<ref>Charif, p. 216</ref> The party also accepted that due to "religious and political considerations", the separate existence of Lebanon was necessary for the time being.<ref>Hourani, p. 326</ref>
[[Image:Update anomaly.png|280px|thumb|right|An '''update anomaly'''. Employee 519 is shown as having different addresses on different records.]]
[[Image:Insertion anomaly.svg|280px|thumb|right|An '''insertion anomaly'''. Until the new faculty member is assigned to teach at least one course, his details cannot be recorded.]]
[[Image:Deletion anomaly.svg|280px|thumb|right|A '''deletion anomaly'''. All information about Dr. Giddens is lost when he temporarily ceases to be assigned to any courses.]]
A table that is not sufficiently normalized can suffer from logical inconsistencies of various types, and from anomalies involving [[CRUD (acronym)|data operations]]. In such a table:


* The same information can be expressed on multiple records; therefore updates to the table may result in logical inconsistencies. For example, each record in an "Employees' Skills" table might contain an Employee ID, Employee Address, and Skill; thus a change of address for a particular employee will potentially need to be applied to multiple records (one for each of his skills). If the update is not carried through successfully—if, that is, the employee's address is updated on some records but not others—then the table is left in an inconsistent state. Specifically, the table provides conflicting answers to the question of what this particular employee's address is. This phenomenon is known as an '''update anomaly'''.
Lebanese historian Kamal Salibi gives a somewhat contrasting interpretation, pointing to the position of the Greek Orthodox community as a large minority in both Syria and Lebanon for whom "the concept of pan-Syrianism was more meaningful than the concept of Arabism" while at the same time they resented [[Maronite]] dominance in Lebanon. Saadeh, according to Salibi,
* There are circumstances in which certain facts cannot be recorded at all. For example, each record in a "Faculty and Their Courses" table might contain a Faculty ID, Faculty Name, Faculty Hire Date, and Course Code—thus we can record the details of any faculty member who teaches at least one course, but we cannot record the details of a newly-hired faculty member who has not yet been assigned to teach any courses. This phenomenon is known as an '''insertion anomaly'''.
<blockquote>found a ready following among his co-religionists. His idea of secular pan-Syrianism also proved attractive to many Druzes and Shiites; to Christians other than the Greek Orthodox, including some Maronites who were disaffected by both Lebanism and Arabism; and also to many Sunnite Muslims who set a high value on secularism, and who felt that they had far more in common with their fellow Syrians of whatever religion or denomination than with fellow Sunnite or Muslim Arabs elsewhere. Here again, an idea of nationalism had emerged which had sufficient credit to make it valid. In the Lebanese context, however, it became ready cover for something more archaic, which was essentially Greek Orthodox particularism.<ref>Salibi, pp. 54-55</ref></blockquote>
* There are circumstances in which the deletion of data representing certain facts necessitates the deletion of data representing completely different facts. The "Faculty and Their Courses" table described in the previous example suffers from this type of anomaly, for if a faculty member temporarily ceases to be assigned to any courses, we must delete the last of the records on which that faculty member appears. This phenomenon is known as a '''deletion anomaly'''.


Ideally, a relational database table should be designed in such a way as to exclude the possibility of update, insertion, and deletion anomalies. The normal forms of relational database theory provide guidelines for deciding whether a particular design will be vulnerable to such anomalies. It is possible to correct an unnormalized design so as to make it adhere to the demands of the normal forms: this is called normalization. Removal of redundancies of the tables will lead to several tables, with [[referential integrity]] restrictions between them.
From 1945 on, the party adopted a more nuanced stance regarding Arab nationalism, seeing Syrian unity as a potential first step towards an Arab union led by Syria.<ref>Hourani, p. 326</ref>


Normalization typically involves decomposing an unnormalized table into two or more tables that, were they to be combined (joined), would convey exactly the same information as the original table.
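
The decomposition described above can be illustrated with a minimal sketch. The rows, addresses, and variable names below are hypothetical, loosely modelled on the "Employees' Skills" example; the sketch only shows that projecting the table onto two smaller tables and joining them back on Employee ID reproduces the original rows.

```python
# Hypothetical "Employees' Skills" rows: (employee_id, employee_address, skill).
employees_skills = {
    (426, "87 Sycamore Grove", "Typing"),
    (426, "87 Sycamore Grove", "Shorthand"),
    (519, "94 Chestnut Street", "Public Speaking"),
}

# Decompose by projection: each address is now stored once per employee.
employees = {(emp_id, address) for emp_id, address, _ in employees_skills}
skills = {(emp_id, skill) for emp_id, _, skill in employees_skills}

# Recombine with a natural join on employee_id.
rejoined = {
    (emp_id, address, skill)
    for emp_id, address in employees
    for skill_emp_id, skill in skills
    if emp_id == skill_emp_id
}

# The join conveys exactly the same information as the original table.
print(rejoined == employees_skills)
```

Because the address of employee 426 appears only once in the decomposed <code>employees</code> set, a change of address touches a single row, which is how the decomposition avoids the update anomaly.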
==The SSNP in Lebanon, 1947-1975==
{{Politics of Lebanon}}
[[Image:Ssnpmap.png|thumb|left|Greater Syria, as claimed by SSNP]]Saadeh returned to Lebanon in 1947. Upon his arrival, Saadeh made a famous speech declaring his opposition to the government. The government retaliated by arresting members of the party, and tensions between the two sides persisted until they culminated in a failed coup d'etat attempt.


== Background to normalization: definitions ==
In 1949, members of the pro-government Phalanges party attacked and burned the SSNP's newspaper office in the Gemayze area, seconds after Saadeh had left the building. The government responded to the attack by withdrawing both parties' licenses and declaring both parties illegal, but only SSNP members (including top officials) were arrested before a memo was issued reinstating the Phalanges' license. Police broke into Saadeh's house to arrest him, but he had already left. In the midst of these events Saadeh declared a coup d'etat against the government and started organizing the party's members. He received a message from the Syrian military dictator [[Husni al-Za'im]] offering him weapons to support his coup d'etat and asking to meet him in his palace. Saadeh accepted the invitation and traveled to Syria to meet the president. When he arrived at the palace, he was immediately apprehended and handed over to the Lebanese authorities, who tried and executed him within 8 hours.<ref>[[Lebanese Broadcasting Corporation]] Pierre Gemayel Documentary</ref>
*'''Functional dependency''': Attribute B has a [[functional dependency]] on attribute A, written '''A → B''', if, for each value of attribute A, there is exactly one value of attribute B. If a value of A is repeated across [[tuple]]s, the corresponding value of B is repeated as well. In our example, Employee Address has a functional dependency on Employee ID, because a particular Employee ID value corresponds to one and only one Employee Address value. (Note that the reverse need not be true: several employees could live at the same address and therefore one Employee Address value could correspond to more than one Employee ID. Employee ID is therefore '''not''' functionally dependent on Employee Address.) An attribute may be functionally dependent either on a single attribute or on a combination of attributes. It is not possible to determine the extent to which a design is normalized without understanding what functional dependencies apply to the attributes within its tables; understanding this, in turn, requires knowledge of the problem domain. For example, an employer may require certain employees to split their time between two locations, such as New York City and London, and therefore want to allow employees to have more than one Employee Address. In this case, Employee Address would no longer be functionally dependent on Employee ID.


Another way to look at the above is by reviewing basic mathematical functions:


Let '''F(x)''' be a mathematical function of one independent variable. The independent variable is analogous to the attribute A. The dependent variable (the dependent attribute, in the terminology above) is the value F(A); hence the term functional dependency. A mathematical function yields exactly one output for each input. In mathematics this relationship is commonly written '''F(A) = B'''; in functional-dependency notation it is written '''A → B'''.
=== 1950 - 1960 ===


There are also functions of more than one independent variable, commonly referred to as multivariable functions. These correspond to an attribute being functionally dependent on a combination of attributes. Hence, '''F(x,y,z)''' has three independent variables (independent attributes) and one dependent attribute, namely the value '''F(x,y,z)'''. A multivariable function likewise yields exactly one output, that is, one dependent variable or attribute.
The party was seen in these years as a right-wing, anti-Communist organization.<ref>Seale, p. 50</ref> The party opposed Nasserite influences and objected to the declaration of the United Arab Republic; this opposition was based on ideological beliefs.


*'''Trivial functional dependency''': A trivial functional dependency is a functional dependency of an attribute on a superset of itself. {Employee ID, Employee Address} → {Employee Address} is trivial, as is {Employee Address} → {Employee Address}.
During the [[Lebanon crisis of 1958|Lebanese civil war]] of 1958 party members participated on the Government side, fighting against the Arab nationalist rebels in northern Lebanon and in Mount Lebanon.<ref>[http://home.iprimus.com.au/fidamelhem/ssnp/The%20lebanese_crisis_of_1958_and_the%20the%20SSNp.htm Article on pro-SSNP website on the party's role in the 1958 civil war] accessed 19 January 2006.</ref> The party was subsequently legalized.
*'''Full functional dependency''': An attribute is fully functionally dependent on a set of attributes X if it is
** functionally dependent on X, and
** not functionally dependent on any proper subset of X. {Employee Address} has a functional dependency on {Employee ID, Skill}, but not a ''full'' functional dependency, because it is also dependent on {Employee ID}.
*'''Transitive dependency''': A transitive dependency is an indirect functional dependency, one in which ''X''→''Z'' only by virtue of ''X''→''Y'' and ''Y''→''Z''.
*'''Multivalued dependency''': A multivalued dependency is a constraint according to which the presence of certain rows in a table implies the presence of certain other rows: see the [[Multivalued Dependency]] article for a rigorous definition.
*'''Join dependency''': A table ''T'' is subject to a [[join dependency]] if ''T'' can always be recreated by joining multiple tables each having a subset of the attributes of ''T''.
*'''Superkey''': A [[superkey]] is an attribute or set of attributes that uniquely identifies rows within a table; in other words, two distinct rows are always guaranteed to have distinct superkeys. {Employee ID, Employee Address, Skill} would be a superkey for the "Employees' Skills" table; {Employee ID, Skill} would also be a superkey.
*'''Candidate key''': A [[candidate key]] is a minimal superkey, that is, a superkey for which we can say that no proper subset of it is also a superkey. {Employee ID, Skill} would be a candidate key for the "Employees' Skills" table.
*'''Non-prime attribute''': A non-prime attribute is an attribute that does not occur in any candidate key. Employee Address would be a non-prime attribute in the "Employees' Skills" table.
*'''Primary key''': Most [[database management system|DBMSs]] require a table to be defined as having a single unique key, rather than a number of possible unique keys. A [[primary key]] is a key which the database designer has designated for this purpose.
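
The definitions of functional dependency, superkey, and candidate key above can be explored with a small sketch. The sample rows and the helper names <code>fd_holds</code>, <code>is_superkey</code>, and <code>candidate_keys</code> are hypothetical. Note also an important limitation: a dependency or key is a rule about every state the table may ever hold, so sample data can refute a proposed dependency but can never prove one; that still requires knowledge of the problem domain.

```python
from itertools import combinations

def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs while differing on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True

def is_superkey(rows, attrs):
    """Return True if the values of attrs are distinct for every row."""
    keys = [tuple(row[a] for a in attrs) for row in rows]
    return len(keys) == len(set(keys))

def candidate_keys(rows, attributes):
    """Return the minimal superkeys suggested by the sample rows."""
    found = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            minimal = not any(set(key) <= set(attrs) for key in found)
            if minimal and is_superkey(rows, attrs):
                found.append(attrs)
    return found

# Hypothetical sample of the "Employees' Skills" table.
rows = [
    {"employee_id": 426, "employee_address": "87 Sycamore Grove", "skill": "Typing"},
    {"employee_id": 426, "employee_address": "87 Sycamore Grove", "skill": "Shorthand"},
    {"employee_id": 519, "employee_address": "94 Chestnut Street", "skill": "Typing"},
    {"employee_id": 600, "employee_address": "94 Chestnut Street", "skill": "Typing"},
]
attributes = ["employee_id", "employee_address", "skill"]

print(fd_holds(rows, ["employee_id"], ["employee_address"]))  # True
print(fd_holds(rows, ["employee_address"], ["employee_id"]))  # False
print(candidate_keys(rows, attributes))  # [('employee_id', 'skill')]
```

In this sample two employees share an address, so Employee ID is not functionally dependent on Employee Address, and the only minimal superkey found is {Employee ID, Skill}.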


==History==
{{Expand-section|date=June 2008}}


[[Edgar F. Codd]] first proposed the process of normalization and what came to be known as the '''1st normal form''':
=== 1961 - 1975 ===


{{quote|There is, in fact, a very simple elimination<ref>His term ''eliminate'' is misleading, as nothing is "lost" in normalization. He probably described ''eliminate'' in a mathematical sense to mean elimination of complexity.</ref> procedure which we shall call normalization. Through decomposition non-simple domains are replaced by "''domains whose elements are atomic (non-decomposable) values.''"|Edgar F. Codd|A Relational Model of Data for Large Shared Data Banks<ref>{{cite journal|first=E.F.|last=Codd|authorlink=E.F. Codd|title=A Relational Model of Data for Large Shared Data Banks|journal=[[Communications of the ACM]]|volume=13|issue=6|month=June|year=1970|pages=377–387|url=http://www.acm.org/classics/nov95/toc.html | doi = 10.1145/362384.362685 <!--Retrieved from Yahoo! by DOI bot-->}}</ref>}}
In 1961 the party launched an abortive coup attempt in Lebanon, resulting in renewed proscription and the imprisonment of many of its leaders.{{Fact|date=March 2007}} In prison the SSNP militants read and discussed politics and reconsidered their ideology, coming under the influence of [[Marxism]] and other left-wing ideas.{{Fact|date=March 2007}} By the beginning of the 1970s, the party had undergone a considerable ideological transformation, and was seen as decidedly left-wing and no longer deeply inimical to Arab nationalism. These ideological turns, however, resulted in splits, and there are now two rival groups laying claim to Saadeh's mantle.{{Fact|date=March 2007}}


In his paper, Edgar F. Codd used the term "non-simple" domains to describe a heterogeneous data structure, but later researchers would refer to such a structure as an [[abstract data type]].


== Normal forms ==
=== Civil war and Resistance ===
The '''normal forms''' (abbrev. '''NF''') of relational database theory provide criteria for determining a table's degree of vulnerability to logical inconsistencies and anomalies. The higher the normal form applicable to a table, the less vulnerable it is to inconsistencies and anomalies. Each table has a "'''highest normal form'''" ('''HNF'''): by definition, a table always meets the requirements of its HNF and of all normal forms lower than its HNF; also by definition, a table fails to meet the requirements of any normal form higher than its HNF.


The normal forms are applicable to individual tables; to say that an entire database is in normal form ''n'' is to say that all of its tables are in normal form ''n''.
Proof of this new orientation came with the outbreak of the Lebanese Civil War of 1975. SSNP militias fought alongside the nationalist and leftist forces, against the Phalangists and their right-wing allies. An important development followed with the renewal of contact between the party and its former bitter enemy, the Syrian Baath Party.[10]


Newcomers to database design sometimes suppose that normalization proceeds in an iterative fashion, i.e. a 1NF design is first normalized to 2NF, then to 3NF, and so on. This is not an accurate description of how normalization typically works. A sensibly designed table is likely to be in 3NF on the first attempt; furthermore, if it is 3NF, it is overwhelmingly likely to have an HNF of 5NF. Achieving the "higher" normal forms (above 3NF) does not usually require an extra expenditure of effort on the part of the designer, because 3NF tables usually need no modification to meet the requirements of these higher normal forms.
After the Israeli invasion of Lebanon in 1982 and subsequent rout of the leftist forces, a number of the leftist organizations regrouped to engage in resistance to the Israeli occupation. Along with the Lebanese Communist Party, the Communist Action Organization, and some smaller leftist groups, the SSNP played a prominent role in this. One of the best-known early actions of the resistance was the killing of two Israeli soldiers in the Wimpy Cafe on west Beirut's central Rue Hamra by party member Khalid Alwan. The party continues to commemorate this date.


[[Edgar F. Codd]] originally defined the first three normal forms (1NF, 2NF, and 3NF). These normal forms have been summarized as requiring that all non-key attributes be dependent on "the key, the whole key and nothing but the key". The fourth and fifth normal forms (4NF and 5NF) deal specifically with the representation of many-to-many and one-to-many relationships among attributes. Sixth normal form (6NF) incorporates considerations relevant to [[temporal database]]s.
The Israelis hit the SSNP hard, since the party was highly active in the Lebanese National Resistance. They bombed one of its main headquarters in the Bekaa Valley after the kamikaze operation of Malek Wehbe. Israeli intelligence also played a major role in dividing the party in 1987 (the official division of the SSNP) and in assassinating most of its prominent leaders, such as Habib Keyrouz, who was a popular leader among SSNP students, SSNP comrades and SSNP board members, also known as "oumana". Habib Keyrouz was also a politician very close to President Hafez Assad.


===First normal form===
The SSNP took a pro-Syrian position in debate about Syria's role in Lebanon. Its popular support in Lebanon is now rather limited.
{{Main|First normal form}}


A table is in '''first normal form (1NF)''' [[if and only if]] it represents a relation.<ref name="DateReln">"[T]he overriding requirement, to the effect that the table must directly and faithfully represent a relation, follows from the fact that 1NF was originally defined as a property of relations, not tables." Date, C.J. [http://www.dbdebunk.com/page/page/629796.htm "What First Normal Form Really Means"] in ''Date on Database: Writings 2000-2006'' (Springer-Verlag, 2006), p. 128.</ref> Given that database tables embody a relation-like form, the defining characteristic of one in first normal form is that it does not allow duplicate rows or nulls. Simply put, a table with a unique key (which, by definition, prevents duplicate rows) and without any nullable columns is in 1NF.
==The SSNP in Syria==


Note that the restriction on nullable columns as a 1NF requirement, as espoused by Chris Date et al., is controversial. This particular requirement for 1NF directly contradicts Dr. Codd's vision of the relational database, in which he stated that "null values" must be supported in a fully relational DBMS in order to represent "missing information and inapplicable information in a systematic way, independent of data type."<ref name="Codd-rel">Codd, E.F. "Is Your DBMS Really Relational?" Computerworld, October 14, 1985.</ref> If 1NF is redefined to exclude nullable columns, no level of normalization can be achieved unless all nullable columns are completely eliminated from the entire database. This is in line with Date's and Darwen's vision of the perfect relational database, but can introduce additional complexities in SQL databases to the point of impracticality.<ref name="coles">Coles, M. [http://www.sqlservercentral.com/articles/Advanced/2921/ '''Sic Semper Null''']. 2007. SQL Server Central. Redgate Software.</ref>
In [[Syria]] the SSNP grew to a position of considerable influence in the years following the country's independence in 1946, and was a major political force immediately after the restoration of democracy in 1954. It was a fierce rival of the [[Syrian Communist Party]] and of the radical pan-Arab [[Baath Party]], the other main ideological parties of the period. In April 1955 Colonel [[Adnan al-Malki]], a Baathist officer who was a very popular figure in the Syrian army, was assassinated by a party member. This provided the Communists and Baathists with the opportunity to eliminate their main ideological rival, and under pressure from them and their allies in the security forces the SSNP was practically wiped out as a political force in Syria.


One requirement of a relation is that every row contains exactly one value for each attribute. This is sometimes expressed as "no repeating groups".<ref name="Kent">"First normal form excludes variable repeating fields and groups" Kent, William. [http://www.bkent.net/Doc/simple5.htm "A Simple Guide to Five Normal Forms in Relational Database Theory"], ''Communications of the ACM'' '''26''' (2), Feb. 1983, pp. 120-125.</ref> While that statement itself is axiomatic, experts disagree about what qualifies as a "repeating group", in particular whether a value may be a relation value; thus the precise definition of 1NF is the subject of some controversy. Notwithstanding, this theoretical uncertainty applies to relations, not tables. Table manifestations are intrinsically free of variable repeating groups because they are structurally constrained to the same number of columns in all rows.
The SSNP's stance during the Lebanese civil war was consistent with that of Syria, and that facilitated a rapprochement between the party and the Syrian government. During Hafez al-Assad's [[President of Syria|presidency]], the party was increasingly tolerated. After the succession of his son [[Bashar al-Asad|Bashar]] in 2000, this process continued. In 2001, although still officially banned, the party was permitted to attend meetings of the Baath-led [[National Progressive Front]] coalition of legal parties as an observer. In Spring [[2005]] the party was legalised in Syria, as the first non-[[socialist]] and non-Arabist party. It is considered to be one of the largest political parties in the country, after the ruling Baath Party, with perhaps 90,000 members.<ref>[http://www.atimes.com/atimes/Middle_East/GD26Ak04.html ''Asia Times'' article by Syrian political analyst Sami Moubayed]. Accessed 19 January 2006</ref>


Put at its simplest, when 1NF is applied to a database, every record must be the same length. This means that each record has the same number of fields, and none of them contains a null value.
In the 22 April 2007 [[People's Council of Syria]] [[Syrian parliamentary election, 2007|election]] the party was awarded 2 out of 250 seats in the parliament.


==Outside Lebanon and Syria==
===Second normal form===
{{main|Second normal form}}
{{Original research|date=September 2007}}
Apart from in Lebanon and Syria, the party also has a following among the large [[diaspora]]s of these countries. It has overseas branches in a variety of countries, including [[Australia]], the [[United States]], Brazil, Argentina and several [[Western Europe]]an countries. It is less popular in the rest of the [[Middle East]], with a very small number of supporters in Jordan and the Palestinian Authority areas, but practically no following in more peripheral parts of what it refers to as Greater Syria.
The criteria for '''second normal form''' ('''2NF''') are:
*The table must be in 1NF.
*None of the non-prime attributes of the table are functionally dependent on a part (proper subset) of a candidate key; in other words, all functional dependencies of non-prime attributes on candidate keys are full functional dependencies.<ref name="Codd23">Codd, E.F. "Further Normalization of the Data Base Relational Model." (Presented at Courant Computer Science Symposia Series 6, "Data Base Systems," New York City, May 24th-25th, 1971.) IBM Research Report RJ909 (August 31st, 1971). Republished in Randall J. Rustin (ed.), ''Data Base Systems: Courant Computer Science Symposia Series 6''. Prentice-Hall, 1972.</ref> For example, consider an "Employees' Skills" table whose attributes are Employee ID, Employee Name, and Skill; and suppose that the combination of Employee ID and Skill uniquely identifies records within the table. Given that Employee Name depends on only one of those attributes &ndash; namely, Employee ID &ndash; the table is not in 2NF.
*In simple terms, a table is in 2NF if it is in 1NF and all fields are dependent on the whole of the primary key; more precisely, a relation is in 2NF if it is in 1NF and every non-key attribute is fully dependent on each candidate key of the relation.
*Note that if none of a 1NF table's candidate keys are composite &ndash; i.e. every candidate key consists of just '''one''' attribute &ndash; then we can say immediately that the table is in 2NF.
*Every column must state a fact about the entire key, not about a subset of the key.
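
The 2NF criteria above can be checked mechanically once the functional dependencies are written down. The following is a minimal sketch, not a definitive implementation: the helper names <code>closure</code> and <code>partial_dependencies</code> are hypothetical, and the dependency Employee ID → Employee Name is an assumption taken from the "Employees' Skills" example.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def partial_dependencies(keys, non_prime, fds):
    """List (key part, attribute) pairs that violate 2NF.

    A violation is a non-prime attribute determined by a proper,
    non-empty subset of some candidate key.
    """
    violations = []
    for key in keys:
        for size in range(1, len(key)):
            for part in combinations(key, size):
                for attr in sorted(closure(part, fds) & set(non_prime)):
                    violations.append((part, attr))
    return violations

# Assumed dependency: Employee ID -> Employee Name.
fds = [(("employee_id",), ("employee_name",))]
keys = [("employee_id", "skill")]

violations = partial_dependencies(keys, ["employee_name"], fds)
print(violations)  # [(('employee_id',), 'employee_name')]
```

When every candidate key consists of a single attribute, the inner loops never run, which matches the observation above that such a 1NF table is immediately in 2NF.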


===Third normal form===
==Footnotes==
{{main|Third normal form}}
{{reflist}}


The criteria for '''third normal form''' ('''3NF''') are:
==References==
*The table must be in 2NF.
*Charif, Maher, ''Rihanat al-nahda fi'l-fikr al-'arabi'', Damascus, Dar al-Mada, 2000
*Transitive dependencies must be eliminated: every non-key attribute must depend directly on the key, not on another non-key attribute. So, if a database has a table with columns Student ID, Student, Company, and Company Phone Number, it is not in 3NF, because the phone number depends on the company rather than directly on the key. For the design to be in 3NF, there must be a second table with Company and Company Phone Number columns, and the Company Phone Number column must be removed from the first table.
*[[Albert Hourani|Hourani, Albert]], ''La Pensée Arabe et l'Occident'' (French translation of ''Arab Thought in the Liberal Age'')
*Irwin, Robert, "An Arab Surrealist". ''[[The Nation (U.S. periodical)|The Nation]]'', January 3, 2005, 23&ndash;24, 37&ndash;38. There is [http://past.thenation.com/doc/20050103/irwin an online version], but only the first two paragraphs are shown to non-subscribers.

*Salibi, Kamal, ''A House of Many Mansions: The History of Lebanon Reconsidered'', London, [[I.B. Tauris]], 1998 ISBN 1-86064-912-2
===Boyce-Codd normal form===
*Seale, Patrick, ''Asad: the Struggle for the Middle East'', Berkeley, University of California Press, 1988 ISBN 0-520-06976-5
{{main|Boyce-Codd normal form}}
*[http://www.cedarland.org/teams.html#syrian Information on Lebanese parties, from Lebanese nationalist-leaning website www.cedarland.org]

A table is in '''Boyce-Codd normal form''' ('''BCNF''') if and only if, for every one of its non-trivial functional dependencies ''X → Y'', ''X'' is a superkey—that is, ''X'' is either a candidate key or a superset thereof.<ref name="CoddBCNF">Codd, E. F. "Recent Investigations into Relational Data Base Systems." IBM Research Report RJ1385 (April 23rd, 1974). Republished in ''Proc. 1974 Congress'' (Stockholm, Sweden, 1974). New York, N.Y.: North-Holland (1974).</ref>
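
The BCNF condition above can be tested by computing attribute closures. The sketch below is illustrative only: the helper names <code>closure</code> and <code>bcnf_violations</code>, the column names, and the functional dependencies are all assumptions made for the example, not part of any particular DBMS.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(attributes, fds):
    """List the non-trivial dependencies whose left side is not a superkey."""
    violations = []
    for lhs, rhs in fds:
        if set(rhs) <= set(lhs):
            continue  # trivial dependency, always allowed
        if closure(lhs, fds) != set(attributes):
            violations.append((lhs, rhs))
    return violations

# Hypothetical single table with assumed dependencies.
attributes = ["student_id", "student", "company", "company_phone"]
fds = [
    (("student_id",), ("student", "company")),
    (("company",), ("company_phone",)),
]
print(bcnf_violations(attributes, fds))  # [(('company',), ('company_phone',))]

# After decomposition, each table is checked against its own dependencies.
students = bcnf_violations(["student_id", "student", "company"], fds[:1])
companies = bcnf_violations(["company", "company_phone"], fds[1:])
print(students, companies)  # [] []
```

The check only inspects the dependencies that are listed explicitly; a complete test would also have to consider dependencies implied by them, which this sketch leaves out for brevity.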

===Fourth normal form===
{{main|Fourth normal form}}

A table is in '''fourth normal form''' ('''4NF''') if and only if, for every one of its non-trivial [[multivalued dependency|multivalued dependencies]] ''X [[Image:twoheadrightarrow.gif]] Y'', ''X'' is a superkey—that is, ''X'' is either a candidate key or a superset thereof.<ref name="Fagin">"A relation schema R* is in fourth normal form (4NF) if, whenever a nontrivial multivalued dependency X →→ Y holds for R*, then so does the functional dependency X → A for every column name A of R*. Intuitively all dependencies are the
result of keys." {{cite journal|first=Ronald|last=Fagin|title=Multivalued Dependencies and a New Normal Form for Relational Databases|journal=ACM Transactions on Database Systems|volume=2|issue=1|month=September|year=1977|pages=267|url=http://www.almaden.ibm.com/cs/people/fagin/tods77.pdf|doi=10.1145/320557.320571}}</ref>
* For example, if a person can have several phone numbers and, independently of those, several email addresses, the phone numbers and email addresses should not be stored in the same table.

===Fifth normal form===
{{main|Fifth normal form}}

The criteria for '''fifth normal form''' ('''5NF''' and also '''PJ/NF''') are:
*The table must be in 4NF.
*There must be no non-trivial join dependencies that do not follow from the key constraints. A 4NF table is in 5NF [[if and only if]] every join dependency in it is implied by the candidate keys.

===Domain/key normal form===
{{main|Domain/key normal form}}

'''Domain/key normal form''' (or '''DKNF''') requires that a table not be subject to any constraints other than domain constraints and key constraints.

===Sixth normal form===
According to the definition by [[Christopher J. Date]] and others, who extended database theory to take account of temporal and other interval data, a table is in '''sixth normal form''' ('''6NF''') if and only if it satisfies no non-trivial (in the formal sense) join dependencies at all,<ref>{{cite book | last = Date | first = Chris J. | authorlink = Christopher J. Date| coauthors = Hugh Darwen, Nikos A. Lorentzos | title = Temporal Data and the Relational Model: A Detailed Investigation into the Application of Interval and Relation Theory to the Problem of Temporal Database Management | origyear = 2003 | origmonth = January | publisher = Elsevier LTD | location = Oxford | isbn = 1558608559 | pages = p176 | chapter = Chapter 10 Database Design, Section 10.4: Sixth Normal Form|quote=A relvar R is in '''sixth normal form''' (abbreviated 6NF) if and only if it satisfies no nontrivial join dependencies at all—where, as before, a join dependency is trivial if and only if at least one of the projections (possibly U_projections) involved is taken over the set of all attributes of the relvar concerned.}}</ref> meaning that the fifth normal form is also satisfied. In this context, "join" refers to the generalized relational operators of Date et al., which take account of interval data (e.g. from-date to-date) by conceptually breaking intervals down ("unpacking" them) into atomic units (e.g. individual days), with defined rules for joining interval data.<ref>{{cite book | last = Date | first = Chris J. | authorlink = Christopher J. Date| coauthors = Hugh Darwen, Nikos A. Lorentzos | title = Temporal Data and the Relational Model: A Detailed Investigation into the Application of Interval and Relation Theory to the Problem of Temporal Database Management | origyear = 2003 | origmonth = January | publisher = Elsevier LTD | location = Oxford | isbn = 1558608559 | pages = p149 | chapter = Chapter 9 Generalizing the relational operators, Section 9.4 join}}</ref>

Sixth normal form is intended to decompose relation variables to irreducible components. Though this may be relatively unimportant for non-temporal relation variables, it can be important when dealing with temporal variables or other interval data. For instance, if a relation comprises a supplier's name, status, and city, we may also want to add temporal data, such as the time during which these values are, or were, valid (e.g. for historical data) but the three values may vary independently of each other and at different rates. We may, for instance, wish to trace the history of changes to Status.

For further discussion of temporal aggregation in SQL, see Zimányi.<ref name="zimyani">{{cite web | url = http://www.sigmod.org/sigmod/record/issues/0606/p16-article-zimanyi.pdf | title = Temporal Aggregates and Temporal Universal Quantification in Standard SQL | work = ACM SIGMOD Record, volume 35, number 2 | publisher = [[Association for Computing Machinery|ACM]] | author = Zimányi, E.|date=June 2006}}</ref> For a non-relational approach, see [[TSQL2]].

In a different sense, '''sixth normal form''' is also used by some authors to refer to [[Domain/key normal form]] (DKNF).

== Denormalization ==
{{main|Denormalization}}
Databases intended for [[Online transaction processing|Online Transaction Processing (OLTP)]] are typically more normalized than databases intended for [[Online Analytical Processing|Online Analytical Processing (OLAP)]]. OLTP applications are characterized by a high volume of small transactions, such as updating a sales record at a supermarket checkout counter. The expectation is that each transaction will leave the database in a consistent state. By contrast, databases intended for OLAP operations are primarily "read mostly" databases. OLAP applications tend to extract historical data that has accumulated over a long period of time. For such databases, redundant or "denormalized" data may facilitate [[business intelligence]] applications. Specifically, [[Dimension table|dimensional tables]] in a [[star schema]] often contain denormalized data. The denormalized or redundant data must be carefully controlled during [[Extract, transform, load|ETL]] processing, and users should not be permitted to see the data until it is in a consistent state. The normalized alternative to the star schema is the [[snowflake schema]].

It has not been established whether any performance gain comes from the denormalization itself or from the accompanying removal of data constraints. In many cases, the need for denormalization has waned as computers and RDBMS software have become more powerful, but since data volumes have generally increased along with hardware and software performance, OLAP databases often still use denormalized schemas.

Denormalization is also used to improve performance on smaller computers as in computerized cash-registers and mobile devices, since these may use the data for look-up only (e.g. price lookups). Denormalization may also be used when no RDBMS exists for a platform (such as Palm), or no changes are to be made to the data and a swift response is crucial.

===Non-first normal form (NF² or N1NF)===
In recognition that denormalization can be deliberate and useful, non-first normal form describes database designs that do not conform to first normal form because they allow "sets and sets of sets to be attribute domains" (Schek 1982). This extension is a (non-optimal) way of implementing hierarchies in relations. Some academics have dubbed this practitioner-developed method "First Ab-normal Form". Codd defined a relational database as using relations, so any table not in 1NF could not be considered to be relational.

Consider the following table:

{| class="wikitable"
|+ Non-First Normal Form
|-
! Person !! Favorite Colors
|-
| Bob || blue, red
|-
|Jane || green, yellow, red
|}

Assume a person has several favorite colors. The favorite colors then form a set, which the table above stores as a single attribute value.

To transform this NF² table into 1NF, an "unnest" operator is required, which extends the relational algebra of the higher normal forms. The reverse operator is called "nest". "Nest" is not always the mathematical inverse of "unnest", although "unnest" is the mathematical inverse of "nest". A further constraint is that the operators be [[bijection|bijective]], which is covered by the [[Partitioned Normal Form]] (PNF).

==Further reading==
* [http://www.troubleshooters.com/littstip/ltnorm.html Litt's Tips: Normalization]
* Date, C. J. (1999), ''[http://www.aw-bc.com/catalog/academic/product/0,1144,0321197844,00.html An Introduction to Database Systems]'' (8th ed.). Addison-Wesley Longman. ISBN 0-321-19784-4.
* Kent, W. (1983) ''[http://www.bkent.net/Doc/simple5.htm A Simple Guide to Five Normal Forms in Relational Database Theory]'', Communications of the ACM, vol. 26, pp. 120-125
* Date, C.J., & Darwen, H., & Pascal, F. ''[http://www.dbdebunk.com Database Debunkings]''
* Schek, H.-J., & Pistor, P. (1982) ''Data Structures for an Integrated Data Base Management and Information Retrieval System''

==Notes and References==
{{reflist|2}}
{{refbegin}}
* Jaeschke, G., & Schek, H.-J. "Non First Normal Form Relations". IBM Heidelberg Scientific Center. The paper studies the "nest" and "unnest" operators described briefly in the non-first normal form section above, and formalizes 1NF and NF² relations through set theory.
{{refend}}

==See also==
*[[Optimization]]
*[[Aspect (computer science)]]
*[[Cross-cutting concern]]
*[[Refactoring]]
*[[Business rules]]


==External links==
* [http://databases.about.com/od/specificproducts/a/normalization.htm Database Normalization Basics] by Mike Chapple (About.com)
* [http://www.databasejournal.com/sqletc/article.php/1428511 Database Normalization Intro], [http://www.databasejournal.com/sqletc/article.php/26861_1474411_1 Part 2]
* [http://dev.mysql.com/tech-resources/articles/intro-to-normalization.html An Introduction to Database Normalization] by Mike Hillyer.
* [http://www.utexas.edu/its/windows/database/datamodeling/rm/rm7.html Normalization] by ITS, University of Texas.
* [http://www.datamodel.org/NormalizationRules.html Rules of Data Normalization] by Data Model.org
* [http://phlonx.com/resources/nf3/ A tutorial on the first 3 normal forms] by Fred Coulson
* [http://www.dbnormalization.com/ DB Normalization Examples]
* [http://support.microsoft.com/kb/283878 Description of the database normalization basics] by Microsoft
* [http://www.barrywise.com/2008/01/database-normalization-and-design-techniques/ Database Normalization and Design Techniques] by Barry Wise, recommended reading for the Harvard MIS.

{{Databases}}


{{Database normalization}}


[[Category:Databases]]
[[Category:Data modeling]]
[[Category:Database constraints]]
[[Category:Database normalization| ]]
[[Category:Relational algebra]]


[[cs:Normalizace databáze]]
[[de:Normalisierung (Datenbank)]]
[[es:Normalización de bases de datos]]
[[fr:Forme normale (bases de données relationnelles)]]
[[ko:데이터베이스 정규화]]
[[it:Normalizzazione del database]]
[[he:נירמול בסיס נתונים]]
[[nl:Databasenormalisatie]]
[[ja:リレーションの正規化]]
[[no:Normalisering]]
[[pl:Normalizacja bazy danych]]
[[pt:Normalização de dados]]
[[ru:Нормальная форма]]
[[simple:Database normalisation]]
[[sk:Normalizácia (databázy)]]
[[fi:Tietokannan normalisointi]]
[[sv:Normalform (databaser)]]
[[uk:Нормалізація баз даних]]
[[zh:数据库正规化]]

Revision as of 08:18, 13 October 2008

Database normalization, sometimes referred to as canonical synthesis, is a technique for designing relational database tables to minimize duplication of information and, in so doing, to safeguard the database against certain types of logical or structural problems, namely data anomalies. For example, when multiple instances of a given piece of information occur in a table, the possibility exists that these instances will not be kept consistent when the data within the table is updated, leading to a loss of data integrity. A table that is sufficiently normalized is less vulnerable to problems of this kind, because its structure reflects the basic assumptions for when multiple instances of the same information should be represented by a single instance only.

Higher degrees of normalization typically involve more tables and create the need for a larger number of joins, which can reduce performance. Accordingly, more highly normalized tables are typically used in database applications involving many isolated transactions (e.g. an automated teller machine), while less normalized tables tend to be used in database applications that need to map complex relationships between data entities and data attributes (e.g. a reporting application, or a full-text search application).

Database theory describes a table's degree of normalization in terms of normal forms of successively higher degrees of strictness. A table in third normal form (3NF), for example, is consequently in second normal form (2NF) as well; but the reverse is not necessarily the case.

Although the normal forms are often defined informally in terms of the characteristics of tables, rigorous definitions of the normal forms are concerned with the characteristics of mathematical constructs known as relations. Whenever information is represented relationally, it is meaningful to consider the extent to which the representation is normalized.

Problems addressed by normalization

An update anomaly. Employee 519 is shown as having different addresses on different records.
An insertion anomaly. Until the new faculty member is assigned to teach at least one course, his details cannot be recorded.
A deletion anomaly. All information about Dr. Giddens is lost when he temporarily ceases to be assigned to any courses.

A table that is not sufficiently normalized can suffer from logical inconsistencies of various types, and from anomalies involving data operations. In such a table:

  • The same information can be expressed on multiple records; therefore updates to the table may result in logical inconsistencies. For example, each record in an "Employees' Skills" table might contain an Employee ID, Employee Address, and Skill; thus a change of address for a particular employee will potentially need to be applied to multiple records (one for each of his skills). If the update is not carried through successfully—if, that is, the employee's address is updated on some records but not others—then the table is left in an inconsistent state. Specifically, the table provides conflicting answers to the question of what this particular employee's address is. This phenomenon is known as an update anomaly.
  • There are circumstances in which certain facts cannot be recorded at all. For example, each record in a "Faculty and Their Courses" table might contain a Faculty ID, Faculty Name, Faculty Hire Date, and Course Code—thus we can record the details of any faculty member who teaches at least one course, but we cannot record the details of a newly-hired faculty member who has not yet been assigned to teach any courses. This phenomenon is known as an insertion anomaly.
  • There are circumstances in which the deletion of data representing certain facts necessitates the deletion of data representing completely different facts. The "Faculty and Their Courses" table described in the previous example suffers from this type of anomaly, for if a faculty member temporarily ceases to be assigned to any courses, we must delete the last of the records on which that faculty member appears. This phenomenon is known as a deletion anomaly.

Ideally, a relational database table should be designed in such a way as to exclude the possibility of update, insertion, and deletion anomalies. The normal forms of relational database theory provide guidelines for deciding whether a particular design will be vulnerable to such anomalies. It is possible to correct an unnormalized design so as to make it adhere to the demands of the normal forms: this is called normalization. Removing redundancy in this way typically yields several tables, with referential integrity constraints between them.

Normalization typically involves decomposing an unnormalized table into two or more tables that, were they to be combined (joined), would convey exactly the same information as the original table.
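The decomposition described above can be sketched in a few lines of Python. This is a minimal illustration, not a general algorithm: the sample rows and the helper names decompose and natural_join are assumptions made for the example, based on the "Employees' Skills" table mentioned earlier.

```python
# Hypothetical sample rows for an "Employees' Skills" table:
# (employee_id, employee_address, skill)
employees_skills = [
    (426, "87 Sycamore Grove", "Typing"),
    (426, "87 Sycamore Grove", "Shorthand"),
    (519, "94 Chestnut Street", "Public Speaking"),
]


def decompose(rows):
    """Split the rows into an address table and a skill table."""
    addresses = {(emp, addr) for emp, addr, _ in rows}
    skills = {(emp, skill) for emp, _, skill in rows}
    return addresses, skills


def natural_join(addresses, skills):
    """Recombine the two tables on the shared employee_id column."""
    return {
        (emp, addr, skill)
        for emp, addr in addresses
        for other_emp, skill in skills
        if emp == other_emp
    }


addresses, skills = decompose(employees_skills)
rejoined = natural_join(addresses, skills)

# The join of the two smaller tables conveys the same information
# as the original table.
lossless = rejoined == set(employees_skills)
```

After the split, each employee's address is stored once in addresses, so a change of address touches a single row.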

Background to normalization: definitions

  • Functional dependency: Attribute B has a functional dependency on attribute A (written A → B) if, for each value of attribute A, there is exactly one value of attribute B. Whenever a value of A is repeated across tuples, the corresponding value of B is repeated as well. In our example, Employee Address has a functional dependency on Employee ID, because a particular Employee ID value corresponds to one and only one Employee Address value. (Note that the reverse need not be true: several employees could live at the same address and therefore one Employee Address value could correspond to more than one Employee ID. Employee ID is therefore not functionally dependent on Employee Address.) An attribute may be functionally dependent either on a single attribute or on a combination of attributes. It is not possible to determine the extent to which a design is normalized without understanding what functional dependencies apply to the attributes within its tables; understanding this, in turn, requires knowledge of the problem domain. For example, an employer may require certain employees to split their time between two locations, such as New York City and London, and therefore want to allow employees to have more than one Employee Address. In this case, Employee Address would no longer be functionally dependent on Employee ID.

Another way to look at the above is by reviewing basic mathematical functions:

Let F(x) be a mathematical function of one independent variable. The independent variable is analogous to attribute A, the independent attribute. The dependent variable, the value F(A), is analogous to the dependent attribute B; hence the term functional dependency. A mathematical function yields exactly one output for each input. In mathematical notation this relationship is written F(A) = B, which corresponds to the functional dependency A → B.

There are also functions of more than one independent variable, commonly referred to as multivariable functions. These correspond to an attribute being functionally dependent on a combination of attributes. Hence, F(x,y,z) has three independent variables, or independent attributes, and one dependent attribute, namely the value F(x,y,z). A multivariable function likewise has only one output, that is, one dependent variable or attribute.

  • Trivial functional dependency: A trivial functional dependency is a functional dependency of an attribute on a superset of itself. {Employee ID, Employee Address} → {Employee Address} is trivial, as is {Employee Address} → {Employee Address}.
  • Full functional dependency: An attribute is fully functionally dependent on a set of attributes X if it is
    • functionally dependent on X, and
    • not functionally dependent on any proper subset of X. For example, {Employee Address} has a functional dependency on {Employee ID, Skill}, but not a full functional dependency, because it is also dependent on {Employee ID}.
  • Transitive dependency: A transitive dependency is an indirect functional dependency, one in which X → Z only by virtue of X → Y and Y → Z.
  • Multivalued dependency: A multivalued dependency is a constraint according to which the presence of certain rows in a table implies the presence of certain other rows: see the Multivalued Dependency article for a rigorous definition.
  • Join dependency: A table T is subject to a join dependency if T can always be recreated by joining multiple tables each having a subset of the attributes of T.
  • Superkey: A superkey is an attribute or set of attributes that uniquely identifies rows within a table; in other words, two distinct rows are always guaranteed to have distinct superkeys. {Employee ID, Employee Address, Skill} would be a superkey for the "Employees' Skills" table; {Employee ID, Skill} would also be a superkey.
  • Candidate key: A candidate key is a minimal superkey, that is, a superkey for which we can say that no proper subset of it is also a superkey. {Employee ID, Skill} would be a candidate key for the "Employees' Skills" table.
  • Non-prime attribute: A non-prime attribute is an attribute that does not occur in any candidate key. Employee Address would be a non-prime attribute in the "Employees' Skills" table.
  • Primary key: Most DBMSs require a table to be defined as having a single unique key, rather than a number of possible unique keys. A primary key is a key which the database designer has designated for this purpose.
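Several of the definitions above can be tested against sample data. The following is a minimal sketch in Python; the function names and the sample rows are assumptions made for illustration. Note that a sample can only refute a dependency or a key: whether a dependency really holds is a statement about the problem domain, not about the rows that happen to be stored.

```python
from itertools import combinations


def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows.

    rows is a list of dicts; lhs and rhs are tuples of attribute names.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The first value seen for a key must match every later one.
        if seen.setdefault(key, value) != value:
            return False
    return True


def is_superkey(rows, attrs):
    """attrs is a superkey for this sample if no two rows agree on it."""
    keys = [tuple(row[a] for a in attrs) for row in rows]
    return len(set(keys)) == len(keys)


def candidate_keys(rows, attributes):
    """Minimal superkeys, found by trying small attribute sets first."""
    found = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            minimal = not any(set(k) <= set(attrs) for k in found)
            if minimal and is_superkey(rows, attrs):
                found.append(attrs)
    return found


# Hypothetical sample of the "Employees' Skills" table.
rows = [
    {"Employee ID": 426, "Employee Address": "87 Sycamore Grove", "Skill": "Typing"},
    {"Employee ID": 426, "Employee Address": "87 Sycamore Grove", "Skill": "Shorthand"},
    {"Employee ID": 519, "Employee Address": "94 Chestnut Street", "Skill": "Typing"},
    {"Employee ID": 600, "Employee Address": "94 Chestnut Street", "Skill": "Typing"},
]
keys = candidate_keys(rows, ("Employee ID", "Employee Address", "Skill"))
```

With these rows, Employee ID determines Employee Address but not the reverse, and {Employee ID, Skill} is the only candidate key found.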

History

Edgar F. Codd first proposed the process of normalization and what came to be known as the 1st normal form:

There is, in fact, a very simple elimination[1] procedure which we shall call normalization. Through decomposition non-simple domains are replaced by "domains whose elements are atomic (non-decomposable) values."

— Edgar F. Codd, A Relational Model of Data for Large Shared Data Banks[2]

In his paper, Edgar F. Codd used the term "non-simple" domains to describe a heterogeneous data structure, but later researchers would refer to such a structure as an abstract data type.

Normal forms

The normal forms (abbrev. NF) of relational database theory provide criteria for determining a table's degree of vulnerability to logical inconsistencies and anomalies. The higher the normal form applicable to a table, the less vulnerable it is to inconsistencies and anomalies. Each table has a "highest normal form" (HNF): by definition, a table always meets the requirements of its HNF and of all normal forms lower than its HNF; also by definition, a table fails to meet the requirements of any normal form higher than its HNF.

The normal forms are applicable to individual tables; to say that an entire database is in normal form n is to say that all of its tables are in normal form n.

Newcomers to database design sometimes suppose that normalization proceeds in an iterative fashion, i.e. a 1NF design is first normalized to 2NF, then to 3NF, and so on. This is not an accurate description of how normalization typically works. A sensibly designed table is likely to be in 3NF on the first attempt; furthermore, if it is 3NF, it is overwhelmingly likely to have an HNF of 5NF. Achieving the "higher" normal forms (above 3NF) does not usually require an extra expenditure of effort on the part of the designer, because 3NF tables usually need no modification to meet the requirements of these higher normal forms.

Edgar F. Codd originally defined the first three normal forms (1NF, 2NF, and 3NF). These normal forms have been summarized as requiring that all non-key attributes be dependent on "the key, the whole key and nothing but the key". The fourth and fifth normal forms (4NF and 5NF) deal specifically with the representation of many-to-many and one-to-many relationships among attributes. Sixth normal form (6NF) incorporates considerations relevant to temporal databases.

First normal form

A table is in first normal form (1NF) if and only if it represents a relation.[3] Given that database tables embody a relation-like form, the defining characteristic of one in first normal form is that it does not allow duplicate rows or nulls. Simply put, a table with a unique key (which, by definition, prevents duplicate rows) and without any nullable columns is in 1NF.

Note that the restriction on nullable columns as a 1NF requirement, as espoused by Chris Date et al., is controversial. This requirement directly contradicts Dr. Codd's vision of the relational database, in which he stated that "null values" must be supported in a fully relational DBMS in order to represent "missing information and inapplicable information in a systematic way, independent of data type."[4] If 1NF is redefined to exclude nullable columns, no level of normalization can be achieved unless all nullable columns are eliminated from the entire database. This is in line with Date's and Darwen's vision of the perfect relational database, but can introduce additional complexities in SQL databases to the point of impracticality.[5]

One requirement of a relation is that every row contains exactly one value for each attribute. This is sometimes expressed as "no repeating groups".[6] While that statement itself is axiomatic, experts disagree about what qualifies as a "repeating group", in particular whether a value may be a relation value; thus the precise definition of 1NF is the subject of some controversy. Nevertheless, this theoretical uncertainty applies to relations, not tables. Tables are intrinsically free of variable-length repeating groups because they are structurally constrained to the same number of columns in all rows.

Put at its simplest, when 1NF is applied to a database, every record must be the same length: each record has the same number of fields, and none of them contains a null value.

Second normal form

The criteria for second normal form (2NF) are:

  • The table must be in 1NF.
  • None of the non-prime attributes of the table are functionally dependent on a part (proper subset) of a candidate key; in other words, all functional dependencies of non-prime attributes on candidate keys are full functional dependencies.[7] For example, consider an "Employees' Skills" table whose attributes are Employee ID, Employee Name, and Skill; and suppose that the combination of Employee ID and Skill uniquely identifies records within the table. Given that Employee Name depends on only one of those attributes – namely, Employee ID – the table is not in 2NF.
  • In simple terms, a table is 2NF if it is in 1NF and all fields are dependent on the whole of the primary key, or a relation is in 2NF if it is in 1NF and every non-key attribute is fully dependent on each candidate key of the relation.
  • Note that if none of a 1NF table's candidate keys are composite – i.e. every candidate key consists of just one attribute – then we can say immediately that the table is in 2NF.
  • Every column must state a fact about the entire key, not about a subset of the key.
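The 2NF criteria above can be checked mechanically once the functional dependencies are written down. The sketch below is a simplified illustration: the function name partial_dependencies is an assumption, and it inspects only the dependencies that are listed explicitly, not those implied by them.

```python
def partial_dependencies(candidate_key, non_prime, fds):
    """List (determinant, attribute) pairs that violate 2NF.

    fds is a list of (lhs, rhs) pairs of frozensets. A violation is a
    non-prime attribute that depends on a proper subset of the key.
    """
    return [
        (lhs, attr)
        for lhs, rhs in fds
        if lhs < candidate_key  # proper subset of the candidate key
        for attr in sorted(rhs & non_prime)
    ]


# The "Employees' Skills" example: Employee Name depends on Employee ID alone.
candidate_key = frozenset({"Employee ID", "Skill"})
non_prime = frozenset({"Employee Name"})
fds = [(frozenset({"Employee ID"}), frozenset({"Employee Name"}))]

violations = partial_dependencies(candidate_key, non_prime, fds)
```

Here violations contains one entry, because Employee Name depends on only part of the key. Moving Employee ID and Employee Name into their own table would remove it.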

Third normal form

The criteria for third normal form (3NF) are:

  • The table must be in 2NF.
  • There must be no transitive dependencies of non-prime attributes on the key: every non-prime attribute must depend directly on the key, not on another non-prime attribute. For example, a table with columns Student ID, Student, Company, and Company Phone Number is not in 3NF, because Company Phone Number depends on Company, which in turn depends on Student ID. To bring the design into 3NF, Company and Company Phone Number are moved into a second table, and the Company Phone Number column is removed from the first table.
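The student and company example can be sketched with Python's built-in sqlite3 module. The table names, column names and sample values below are assumptions made for illustration. The point is only that, once Company Phone Number lives in its own table, a changed phone number is corrected in exactly one row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Company facts are stored once, keyed by company.
    CREATE TABLE company (
        company       TEXT PRIMARY KEY,
        company_phone TEXT NOT NULL
    );
    -- Student rows refer to the company instead of repeating its phone.
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        student    TEXT NOT NULL,
        company    TEXT REFERENCES company(company)
    );
    INSERT INTO company VALUES ('Acme', '555-0100');
    INSERT INTO student VALUES (1, 'Ann', 'Acme'), (2, 'Raj', 'Acme');
    """
)

# A change of phone number touches a single row, however many students
# are placed with the company.
updated = conn.execute(
    "UPDATE company SET company_phone = '555-0199' WHERE company = 'Acme'"
).rowcount

# Joining the two tables reproduces the original four-column view.
rows = conn.execute(
    """
    SELECT s.student_id, s.student, s.company, c.company_phone
    FROM student AS s
    JOIN company AS c ON c.company = s.company
    ORDER BY s.student_id
    """
).fetchall()
```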

Boyce-Codd normal form

A table is in Boyce-Codd normal form (BCNF) if and only if, for every one of its non-trivial functional dependencies X → Y, X is a superkey—that is, X is either a candidate key or a superset thereof.[8]
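The BCNF condition can be tested with the standard attribute-closure computation: a determinant X is a superkey exactly when the closure of X contains every attribute of the table. The following is a minimal sketch; the function names and the Student/Course/Teacher example are assumptions chosen for illustration.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under fds.

    fds is a list of (lhs, rhs) pairs of frozensets.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """Non-trivial dependencies whose determinant is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and closure(lhs, fds) != set(attributes)
    ]


# Hypothetical table: each teacher teaches one course, and a student
# has one teacher per course.
attributes = {"Student", "Course", "Teacher"}
fds = [
    (frozenset({"Student", "Course"}), frozenset({"Teacher"})),
    (frozenset({"Teacher"}), frozenset({"Course"})),
]
violations = bcnf_violations(attributes, fds)
```

In this example {Student, Course} is a superkey, but Teacher → Course is reported as a violation because Teacher alone does not determine Student.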

Fourth normal form

A table is in fourth normal form (4NF) if and only if, for every one of its non-trivial multivalued dependencies X →→ Y, X is a superkey, that is, X is either a candidate key or a superset thereof.[9]

  • For example, if a person can have several phone numbers and, independently of those, several email addresses, the phone numbers and email addresses should not be stored in the same table.
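The phone number and email example can be made concrete with a short Python sketch. All names and values below are hypothetical. When two independent multi-valued facts share one table, every phone number has to be paired with every email address, so the row count grows as a product; two separate tables store each fact once and can still be joined back together.

```python
from itertools import product

# Hypothetical independent facts about one person.
phones = {"alice": ["555-0100", "555-0101"]}
emails = {"alice": ["alice@example.com", "a.smith@example.org"]}

# Single-table design: each phone number paired with each email address.
combined = {
    (person, phone, email)
    for person in phones
    for phone, email in product(phones[person], emails[person])
}

# 4NF design: one table per independent multi-valued fact.
person_phone = {(person, phone) for person, phone, _ in combined}
person_email = {(person, email) for person, _, email in combined}

# Joining the two tables on person recovers the single-table rows.
rejoined = {
    (person, phone, email)
    for person, phone in person_phone
    for other, email in person_email
    if person == other
}
```

The combined table needs four rows to record two phone numbers and two email addresses, while the two smaller tables need two rows each.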

Fifth normal form

The criteria for fifth normal form (5NF and also PJ/NF) are:

  • The table must be in 4NF.
  • There must be no non-trivial join dependencies that do not follow from the key constraints. A 4NF table is in 5NF if and only if every join dependency in it is implied by the candidate keys.

Domain/key normal form

Domain/key normal form (or DKNF) requires that a table not be subject to any constraints other than domain constraints and key constraints.

Sixth normal form

According to the definition by Christopher J. Date and others, who extended database theory to take account of temporal and other interval data, a table is in sixth normal form (6NF) if and only if it satisfies no non-trivial (in the formal sense) join dependencies at all,[10] meaning that the fifth normal form is also satisfied. In this context, "join" refers to the generalized relational operators of Date et al., which take account of interval data (e.g. from-date to-date) by conceptually breaking intervals down ("unpacking" them) into atomic units (e.g. individual days), with defined rules for joining interval data.[11]

Sixth normal form is intended to decompose relation variables to irreducible components. Though this may be relatively unimportant for non-temporal relation variables, it can be important when dealing with temporal variables or other interval data. For instance, if a relation comprises a supplier's name, status, and city, we may also want to add temporal data, such as the time during which these values are, or were, valid (e.g. for historical data) but the three values may vary independently of each other and at different rates. We may, for instance, wish to trace the history of changes to Status.
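The supplier example can be sketched in plain Python as follows. This is only an illustration of the idea of keeping one history per attribute: the supplier identifier, dates and values are hypothetical, the function names are assumptions, and the sketch does not implement the generalized interval operators of Date et al.

```python
# One history table per attribute: (supplier, value, valid_from, valid_to).
# Dates are ISO strings, so plain string comparison orders them correctly;
# valid_to is exclusive.
supplier_status = [
    ("S1", 10, "2003-01-01", "2004-06-01"),
    ("S1", 20, "2004-06-01", "9999-12-31"),
]
supplier_city = [
    ("S1", "London", "2003-01-01", "2005-03-01"),
    ("S1", "Paris", "2005-03-01", "9999-12-31"),
]


def value_at(history, supplier, day):
    """Return the value that was valid for supplier on the given day."""
    for name, value, valid_from, valid_to in history:
        if name == supplier and valid_from <= day < valid_to:
            return value
    return None


def snapshot(supplier, day):
    """Combine the independent histories into one view for a given day."""
    return {
        "status": value_at(supplier_status, supplier, day),
        "city": value_at(supplier_city, supplier, day),
    }
```

Because Status and City are recorded separately, a change of Status adds a row to supplier_status only, and the history of Status can be read without reference to City.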

For further discussion of temporal aggregation in SQL, see Zimányi.[12] For a non-relational approach, see TSQL2.

In a different sense, sixth normal form is also used by some authors to refer to Domain/key normal form (DKNF).

Denormalization

Databases intended for Online Transaction Processing (OLTP) are typically more normalized than databases intended for Online Analytical Processing (OLAP). OLTP applications are characterized by a high volume of small transactions, such as updating a sales record at a supermarket checkout counter. The expectation is that each transaction will leave the database in a consistent state. By contrast, databases intended for OLAP operations are primarily "read mostly" databases. OLAP applications tend to extract historical data that has accumulated over a long period of time. For such databases, redundant or "denormalized" data may facilitate business intelligence applications. Specifically, dimensional tables in a star schema often contain denormalized data. The denormalized or redundant data must be carefully controlled during ETL processing, and users should not be permitted to see the data until it is in a consistent state. The normalized alternative to the star schema is the snowflake schema.

It has not been established whether any performance gain comes from the denormalization itself or from the accompanying removal of data constraints. In many cases, the need for denormalization has waned as computers and RDBMS software have become more powerful, but since data volumes have generally increased along with hardware and software performance, OLAP databases often still use denormalized schemas.

Denormalization is also used to improve performance on smaller computers as in computerized cash-registers and mobile devices, since these may use the data for look-up only (e.g. price lookups). Denormalization may also be used when no RDBMS exists for a platform (such as Palm), or no changes are to be made to the data and a swift response is crucial.

Non-first normal form (NF² or N1NF)

In recognition that denormalization can be deliberate and useful, non-first normal form describes database designs that do not conform to first normal form because they allow "sets and sets of sets to be attribute domains" (Schek 1982). This extension is a (non-optimal) way of implementing hierarchies in relations. Some academics have dubbed this practitioner-developed method "First Ab-normal Form"; Codd defined a relational database as using relations, so any table not in 1NF could not be considered relational.

Consider the following table:

Non-First Normal Form
Person | Favorite Colors
Bob    | blue, red
Jane   | green, yellow, red

Assume that a person can have several favorite colors. Each person's favorite colors then form a set of colors, which the table above models as a single set-valued attribute.

To transform this NF² table into 1NF, an "unnest" operator is required, which extends the relational algebra of the higher normal forms. The reverse operator is called "nest". Applying "unnest" after "nest" always restores the original relation, so "unnest" is the mathematical inverse of "nest"; applying "nest" after "unnest" does not always do so, so "nest" is not always the mathematical inverse of "unnest". A further constraint is required for the operators to be bijective, which is covered by the Partitioned Normal Form (PNF).
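The behaviour of the two operators can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the formal definition from the literature: a relation is modelled as a Python set of tuples, a set-valued attribute as a frozenset, and the function names unnest and nest as well as the variables favorites and split are hypothetical. The relation split is an invented variant of the table above in which Bob's colors are spread over two rows.

```python
from collections import defaultdict


def unnest(nested):
    """Turn (person, set of colors) rows into 1NF (person, color) rows."""
    return {(person, color) for person, colors in nested for color in colors}


def nest(flat):
    """Group 1NF (person, color) rows into one set-valued row per person."""
    groups = defaultdict(set)
    for person, color in flat:
        groups[person].add(color)
    return {(person, frozenset(colors)) for person, colors in groups.items()}


# The NF² table from the text: one row per person, colors as a set.
favorites = {
    ("Bob", frozenset({"blue", "red"})),
    ("Jane", frozenset({"green", "yellow", "red"})),
}

flat = unnest(favorites)  # five (person, color) rows in 1NF

# Unnest undoes nest: starting from a flat relation, the round trip is lossless.
roundtrip_flat = unnest(nest(flat))

# Nest does not always undo unnest: here Bob's colors are split over two rows,
# and the round trip merges them into a single row.
split = {
    ("Bob", frozenset({"blue"})),
    ("Bob", frozenset({"red"})),
    ("Jane", frozenset({"green", "yellow", "red"})),
}
roundtrip_split = nest(unnest(split))
```

In this sketch roundtrip_flat equals flat, whereas roundtrip_split equals favorites rather than split. The relation favorites has exactly one row per person, and for it nesting after unnesting does return the original.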

Further reading

Notes and References

  1. ^ His term "eliminate" is misleading, as nothing is "lost" in normalization. He probably used "eliminate" in a mathematical sense, to mean elimination of complexity.
  2. ^ Codd, E.F. (1970). "A Relational Model of Data for Large Shared Data Banks". Communications of the ACM. 13 (6): 377–387. doi:10.1145/362384.362685.
  3. ^ "[T]he overriding requirement, to the effect that the table must directly and faithfully represent a relation, follows from the fact that 1NF was originally defined as a property of relations, not tables." Date, C.J. "What First Normal Form Really Means" in Date on Database: Writings 2000-2006 (Springer-Verlag, 2006), p. 128.
  4. ^ Codd, E.F. "Is Your DBMS Really Relational?" Computerworld, October 14, 1985.
  5. ^ Coles, M. Sic Semper Null. 2007. SQL Server Central. Redgate Software.
  6. ^ "First normal form excludes variable repeating fields and groups" Kent, William. "A Simple Guide to Five Normal Forms in Relational Database Theory", Communications of the ACM 26 (2), Feb. 1983, pp. 120-125.
  7. ^ Codd, E.F. "Further Normalization of the Data Base Relational Model." (Presented at Courant Computer Science Symposia Series 6, "Data Base Systems," New York City, May 24th-25th, 1971.) IBM Research Report RJ909 (August 31st, 1971). Republished in Randall J. Rustin (ed.), Data Base Systems: Courant Computer Science Symposia Series 6. Prentice-Hall, 1972.
  8. ^ Codd, E. F. "Recent Investigations into Relational Data Base Systems." IBM Research Report RJ1385 (April 23rd, 1974). Republished in Proc. 1974 Congress (Stockholm, Sweden, 1974). New York, N.Y.: North-Holland (1974).
  9. ^ "A relation schema R* is in fourth normal form (4NF) if, whenever a nontrivial multivalued dependency X →→ Y holds for R*, then so does the functional dependency X → A for every column name A of R*. Intuitively all dependencies are the result of keys." Fagin, Ronald (1977). "Multivalued Dependencies and a New Normal Form for Relational Databases" (PDF). ACM Transactions on Database Systems. 2 (1): 267. doi:10.1145/320557.320571.
  10. ^ Date, Chris J. "Chapter 10 Database Design, Section 10.4: Sixth Normal Form". Temporal Data and the Relational Model: A Detailed Investigation into the Application of Interval and Relation Theory to the Problem of Temporal Database Management. Oxford: Elsevier LTD. p. 176. ISBN 1558608559. "A relvar R is in sixth normal form (abbreviated 6NF) if and only if it satisfies no nontrivial join dependencies at all—where, as before, a join dependency is trivial if and only if at least one of the projections (possibly U_projections) involved is taken over the set of all attributes of the relvar concerned."
  11. ^ Date, Chris J. "Chapter 9 Generalizing the relational operators, Section 9.4 join". Temporal Data and the Relational Model: A Detailed Investigation into the Application of Interval and Relation Theory to the Problem of Temporal Database Management. Oxford: Elsevier LTD. p. 149. ISBN 1558608559.
  12. ^ Zimányi, E. (June 2006). "Temporal Aggregates and Temporal Universal Quantification in Standard SQL" (PDF). ACM SIGMOD Record, volume 35, number 2. ACM.
  • Paper: "Non First Normal Form Relations" by G. Jaeschke and H.-J. Schek, IBM Heidelberg Scientific Center. The paper studies the normalization and denormalization operators nest and unnest, which are briefly described at the end of this article, and formalizes 1NF and NF² relations through set theory.

See also

External links