When designing tables, I’ve developed a habit of having one column that is unique and that I make the primary key. This is achieved in three ways depending on requirements:
- Identity integer column that auto increments.
- Unique identifier (GUID)
- A short character(x) or integer (or other relatively small numeric type) column that can serve as a row identifier column
Number 3 would be used for fairly small lookup, mostly read tables that might have a unique static length string code, or a numeric value such as a year or other number.
For the most part, all other tables will either have an auto-incrementing integer or unique identifier primary key.
The Question 🙂
I have recently started working with databases that have no consistent row identifier and primary keys are currently clustered across various columns. Some examples:
- datetime/character
- datetime/integer
- datetime/varchar
- char/nvarchar/nvarchar
Is there a valid case for this? I would have always defined an identity or unique identifier column for these cases.
In addition there are many tables without primary keys at all. What are the valid reasons, if any, for this?
I’m trying to understand why tables were designed as they were, and it appears to be a big mess to me, but maybe there were good reasons for it.
A third question to sort of help me decipher the answers: In cases where multiple columns are used to comprise the compound primary key, is there a specific advantage to this method vs. a surrogate/artificial key? I’m thinking mostly in regards to performance, maintenance, administration, etc.?