cancel
Showing results for 
Search instead for 
Did you mean: 

Question on create clustered index on non-primary key columns

Former Member
0 Kudos

My case is ase 12.5.

By default for each table, when create a primary key, there is a clustered index also created on the primary key column by default.

Also, only one clustered index allowed on on table.

Suppose I have a table mytab with single column primary key, say mytab(id,  col1, col2, col3...)

then I want to best performance for insert like

insert into mytab(id,  col1, col2, col3...)

values (.....)

For this case, if clustered index impact performance? for example, if maybe change primary key index as non-clustered can improve performance?

Accepted Solutions (0)

Answers (2)

Answers (2)

former_member185199
Contributor
0 Kudos

zhou!

the custered index always impacts performance when you have a lot of insert wich not result at the end of the table (as mike mentioned already), because in this case there is a need to shift  pages to make new space for the new records. if you cant garantie this is not the case, you should not use clustered index.

regards

dj

Former Member
0 Kudos

> Suppose I have a table mytab with single column primary key, say mytab(id,  col1, col2, col3...)

> then I want to best performance for insert like

> insert into mytab(id,  col1, col2, col3...)

> values (.....)

Best Performance depends on a lot of factors. How many rows are you inserting at a time ?

How many simultaneous inserts ?

Is you id sequentially increasing ?

What performance do you want for getting at your data ?

Are you using datapages, datarows, allpages locking ?

Here are the basics.

1) clustered indexes will generally be slower for inserts/deletes than non-clustered indexes as you can end up with page splits (on inserts) and page merges (on deletes). However, if the index is wide then the non-clustered index will contain a lot of data and be slower.

2) Never create a non-unique clustered index - you'll get overflow pages which are hideously slow (well they were in version 11)

3) clustered indexes can reduce contention if you're inserting simultaneously.

4) How are you getting data out of the database.

The best thing to do is test. Be careful if you're using a SAN - performance can vary according to other applications on the SAN.

Former Member
0 Kudos

thanks, Mike. for my case, rows in table is more than  5,000,000 . There is a transaction need to insert one row in a few tables like insert into... value....   the sequence ID is generated by customized code.

User request no delay for this transaction but it has kind of delay sometime.

So If I change all primary key as non-cluster, should have improvement for this transaction, right?

As Primary key is unique, for query, it always like where id = id value,=always for one row. So for this case, cluster or non-cluster should have no big difference for this query, right?

Former Member
0 Kudos

> "User request no delay for this transaction"


As you've said there is always a delay somewhere and if you've got SAN you've got little control over it.


> So If I change all primary key as non-cluster, should have improvement for this transaction, right?


Not necessarily. "The best thing to do is test"


If you're inserting one row at a time in sequence - I see no reason not to use a clustered index (saves space and in the world of SANs smaller space = better performance). If the row is quite small then you'd only get a page split every once in a while. But, if you're inserting one row at a time then you're obviously not a high-performance system anyway so it doesn't really matter if the index is a clustered one or a non-clustered one.


I'd focus on what you do with the data afterwards. Most systems are write-once read many so the emphasis is on fast selects where clustering really helps. However, since you're using sequential id's as your key rather than a logical PK then I doubt clustering will help you. Personally I don't use sequential id's as a primary key.



former_member188958
Active Contributor
0 Kudos

Some other factors to think about:

Is there only one such process inserting rows, or multiple inserts working in parallel?

If you have a single-partition table with no clustered index, or clustered index on an key that keeps increasing (like datetime), then all the inserts are going to the same last page of the table, which can result in contention on that page.   Round-robin partitioning can help out then as inserts are randomly distributed among the partitions and each partition has its own last page.

Having a clustered index on a key with a fairly random distribution results in the insertion point being randomly distributed in the table space.  You get occasional page splits, but very infrequent contention for an exclusive lock on the same page.

What is the lock scheme of the table? 

Former Member
0 Kudos

Thanks, guys.

the transaction include something like:

Insert into(...) tb1 values(...)

Insert into(...) tb2 values(...)

Insert into(...) tb3 values(...)

Insert into(...) tb4 values(...)

all table lock schema is datarows.