Uniqueidentifier Column as Primary Key, a worst choice : Connect SQL

Wednesday, July 22, 2009

Uniqueidentifier Column as Primary Key, a worst choice

GUID or INT as Primary Key?

A primary key column does not have to be the clustered index. By default, however, SQL Server creates a clustered index on the column or group of columns declared as the table's primary key, and most DBAs prefer not to go against this default behavior.

The problem arises when a column with the uniqueidentifier data type is added as a surrogate key for uniqueness and then made the table's primary key.

The GUID is a wide column (16 bytes, to be specific); its text form is 32 hexadecimal digits, or 36 characters with the hyphens. Because this column is the primary key, it is stored in the clustered index and, since every nonclustered index carries the clustered key, in each nonclustered index as well.
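As a quick check of the sizes involved, the sketch below uses Python's standard `uuid` module rather than SQL Server itself. It assumes a random version 4 UUID is a fair stand-in for a value returned by NEWID(), and it compares the binary size with the text form and with a 4-byte integer.

```python
import uuid

# A random (version 4) UUID, used here as a stand-in for a NEWID() value.
guid = uuid.uuid4()

raw_bytes = len(guid.bytes)    # binary size of the value
hex_digits = len(guid.hex)     # hexadecimal digits, no hyphens
text_chars = len(str(guid))    # canonical text form, including 4 hyphens

# The largest 32-bit signed integer still fits in 4 bytes.
int_bytes = len((2**31 - 1).to_bytes(4, "big", signed=True))

print(raw_bytes, hex_digits, text_chars, int_bytes)  # 16 32 36 4
```

The key is four times wider than an INT in its binary form; the 36-character figure only applies when the value is displayed or stored as text.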

Also, if a GUID is used instead of an integer identity column, every comparison on that column in the WHERE clause works on a 16-byte value rather than a 4-byte one for each row that is returned.

If a high volume of inserts is done on these tables, the large size of the GUIDs contributes to page splits. So does the fact that NEWID() generates random values, which can place a new record on any of the data pages. Both cause performance problems.
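The effect of random versus increasing keys can be illustrated with a small simulation. This is a deliberately simplified model, not SQL Server's storage engine: pages are sorted Python lists with a hypothetical capacity of 100 rows, a full page splits in half when a key lands in the middle of it, and a new page is simply appended when the key is the new maximum. The function name `simulate_inserts` and all numbers are assumptions made for the sketch.

```python
import bisect
import random

PAGE_CAPACITY = 100  # hypothetical rows per page


def simulate_inserts(keys, capacity=PAGE_CAPACITY):
    """Insert keys into sorted fixed-size pages.

    Returns (mid-page splits, average page fill as a fraction).
    """
    pages = [[]]
    lows = [float("-inf")]  # lowest key routed to each page
    splits = 0
    for key in keys:
        idx = bisect.bisect_right(lows, key) - 1
        page = pages[idx]
        bisect.insort(page, key)
        if len(page) <= capacity:
            continue
        if idx == len(pages) - 1 and page[-1] == key:
            # Key is the new maximum: start a fresh page at the end.
            page.pop()
            pages.append([key])
            lows.append(key)
        else:
            # Key landed inside a full page: move the upper half out.
            mid = len(page) // 2
            right = page[mid:]
            del page[mid:]
            pages.insert(idx + 1, right)
            lows.insert(idx + 1, right[0])
            splits += 1
    fill = sum(len(p) for p in pages) / (len(pages) * capacity)
    return splits, fill


rows = 10_000
sequential_keys = range(rows)                            # identity-like keys
random_keys = random.Random(42).sample(range(10**9), rows)  # NEWID()-like keys

print("sequential:", simulate_inserts(sequential_keys))
print("random:    ", simulate_inserts(random_keys))
```

In this model the sequential run never splits a page in the middle and leaves every page full, while the random run splits repeatedly and leaves pages partly empty. Real fragmentation figures depend on fill factor, row size, and maintenance, none of which are modeled here.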

Recommendations

INT must be used as the primary key instead of GUID because:
INT takes only 4 bytes, saving physical and memory storage.
INT as an identity primary key produces incremental values, resulting in less than 1% index fragmentation during heavy inserts.
T-SQL comparison operators such as >, = and < are available for INT.
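The storage point can be put into rough numbers. The sketch below counts only the bytes spent on key values, assuming the clustered key is stored once in the clustered index and once more in every nonclustered index. The row count and index count are hypothetical, and row overhead, page overhead, fill factor, and compression are all ignored.

```python
INT_BYTES = 4
GUID_BYTES = 16


def key_storage_bytes(rows, key_bytes, nonclustered_indexes):
    """Bytes used by key values alone: one copy in the clustered index
    plus one copy in each nonclustered index."""
    return rows * key_bytes * (1 + nonclustered_indexes)


rows = 10_000_000   # hypothetical table size
indexes = 3         # hypothetical number of nonclustered indexes

int_total = key_storage_bytes(rows, INT_BYTES, indexes)
guid_total = key_storage_bytes(rows, GUID_BYTES, indexes)

print(f"INT keys:  {int_total / 1_000_000:.0f} MB")                  # 160 MB
print(f"GUID keys: {guid_total / 1_000_000:.0f} MB")                 # 640 MB
print(f"extra:     {(guid_total - int_total) / 1_000_000:.0f} MB")   # 480 MB
```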

8 comments:

Anonymous, July 30, 2009 at 4:07 PM
2. You can use a pretty good UUID with only 8 bytes (using Base64). It's more than 4, but OK to me.

3 and 4. If you use a UUID based on time and offset, they are incremental, so there is no fragmentation and you can use >= and <.

moonBrain, July 30, 2009 at 4:24 PM
The controversy continues lol.
Try high volume replication and tell me if you think using an Integer as a primary key is a good idea.

UrielKa, July 30, 2009 at 5:15 PM
GUIDs aren't matched character by character; they can be matched in 4 integer comparison instructions, or even in 2 on 64-bit machines.
The main problem with GUIDs is their randomness and large size.

Dave, July 30, 2009 at 6:30 PM
Why would using <=> on a primary key ever be a good idea? And yeah--replication using a sequence is teh badz.

aasim abdullah, July 31, 2009 at 4:58 AM
MySQL provides a UUID() function, but Microsoft SQL Server only supports GUID, which never generates sequential values.
For replication: you need a GUID column for merge replication. Yes, you can have a separate column (other than the PK column) for this purpose, but the most important thing is to avoid clustered index fragmentation.

Anonymous, August 6, 2009 at 4:29 AM
Sql Server doesn't have UUID() but does have NEWSEQUENTIALID() (SQL 2008??).

Anonymous, September 1, 2009 at 7:19 AM
SELECT NEWID() --- to get new guid in sql server

Anonymous, June 29, 2010 at 1:33 PM
So... use a guid and make the PK nonclustered. Simples!