Unique key

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by L337 kybldmstr (talk | contribs) at 06:32, 11 October 2007 (Added content previously blanked due to vandalism). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

In relational database design, a unique key or primary key is a candidate key to uniquely identify each row in a table. A unique key or primary key comprises a single column or set of columns. No two distinct rows in a table can have the same value (or combination of values) in those columns. Depending on its design, a table may have arbitrarily many unique keys but at most one primary key.

A unique key must uniquely identify all possible rows that exist in a table and not only the currently existing rows. Examples of unique keys are Social Security numbers (associated with a specific person) or ISBNs (associated with a specific book). Telephone books and dictionaries cannot use names or words or Dewey Decimal system numbers as candidate keys because they do not uniquely identify telephone numbers or words.

A primary key is a special case of unique keys. The major difference is that for unique keys the implicit NOT NULL constraint is not automatically enforced, while for primary keys it is. Thus, the values in a unique key columns may or may not be NULL. Another difference is that primary keys must be defined using another syntax.

The relational model, as expressed through relational calculus and relational algebra, does not distinguish between primary keys and other kinds of keys. Primary keys were added to the SQL standard mainly as a convenience to the application programmer.

Unique keys as well as primary keys can be referenced by foreign keys.

Defining Primary Keys

Primary keys are defined in the ANSI SQL Standard, through the PRIMARY KEY constraint. The syntax to add such a constraint to an existing table is defined in SQL:2003 like this:

  ALTER TABLE <table identifier> 
      ADD [ CONSTRAINT <constraint identifier> ] 
      PRIMARY KEY ( <column expression> {, <column expression>}... )

The primary key can also be specified directly during table creation. In the SQL Standard, primary keys may consist of one or multiple columns. Each column participating in the primary key is implicitly defined as NOT NULL. Note that some DBMS require that primary key columns are explicitly marked as being NOT NULL.

  CREATE TABLE table_name (
     id_col  INT,
     col2    CHARACTER VARYING(20),
     ...
     CONSTRAINT tab_pk PRIMARY KEY(id_col),
     ...
  )

If the primary key consists only of a single column, the column can be marked as such using the following syntax:

  CREATE TABLE table_name (
     id_col  INT  PRIMARY KEY,
     col2    CHARACTER VARYING(20),
     ...
  )

Defining Unique Keys

The definition of unique keys is syntactically very similar to primary keys.

  ALTER TABLE <table identifier> 
      ADD [ CONSTRAINT <constraint identifier> ] 
      UNIQUE ( <column expression> {, <column expression>}... )

Likewise, unique keys can be defined as part of the CREATE TABLE SQL statement.

  CREATE TABLE table_name (
     id_col   INT,
     col2     CHARACTER VARYING(20),
     key_col  SMALLINT,
     ...
     CONSTRAINT key_uniqe UNIQUE(key_col),
     ...
  )
  CREATE TABLE table_name (
     id_col  INT  PRIMARY KEY,
     col2    CHARACTER VARYING(20),
     ...
     key_col  SMALLINT UNIQUE,
     ...
  )

Surrogate Keys

In some design situations the natural key that uniquely identifies a tuple in a relation is difficult to use for software development. For example, it may involve multiple columns or large text fields. A surrogate key can be used as the primary key. In other situations there may be more than one candidate key for a relation, and no candidate key is obviously preferred. A surrogate key may be used as the primary key to avoid giving one candidate key artificial primacy over the others.

Since primary keys exist primarily as a convenience to the programmer, surrogate primary keys are often used - in many cases exclusively - in database application design.

Due to the popularity of surrogate primary keys, many developers and in some cases even theoreticians have come to regard surrogate primary keys as an inalienable part of the relational data model. This is largely due to a migration of principles from the Object-Oriented Programming model to the relational model, creating the hybrid object-relational model. In the ORM, these additional restrictions are placed on primary keys:

  • Primary keys should be immutable, that is, not change until the record is destroyed.
  • Primary keys should be anonymous integer or numeric identifiers.

However, neither of these restrictions are part of the relational model or any SQL standard. Due diligence should be applied when deciding on the immutability of primary key values during database and application design. Some database systems even imply that values in primary key columns cannot be changed using the UPDATE SQL statement.