索引鍵的唯一性（4/4）：非唯一聚集索引上的唯一和非唯一非聚集索引

2024-07-21 02:46:42

字體：大中小

來源：轉載

供稿：網友

在上一篇文章里，我談了唯一聚集索引上的唯一和非唯一非聚集索引的區別。在這篇文章里，我想談下非唯一聚集索引上的唯一和非唯一聚集索引的區別。我們都知道，SQL Server內部把非唯一聚集索引當作唯一聚集索引處理的。如果你定義了一個非唯一聚集索引，SQL Server會增加叫做uniquifier到你的索引記錄，它導致你聚集索引的導航結構（B樹的非葉子層）里，每條索引行都要用到4 bytes的開銷。

下列代碼再次創建我們的Customers表，這次在它上面定義非唯一聚集索引，最后定義2個非聚集索引，1個是唯一的，另1個是非唯一的。

 1 -- Create a table with 393 length + 7 bytes overhead = 400 bytes  2 -- Therefore 20 records can be stored on one page (8.096 / 400) = 20,24  3 CREATE TABLE Customers  4 (  5     CustomerID INT NOT NULL,  6     CustomerName CHAR(100) NOT NULL,  7     CustomerAddress CHAR(100) NOT NULL,  8     Comments CHAR(189) NOT NULL  9 ) 10 GO 11  12 -- Create a non unique clustered index on the PRevious created table 13 CREATE CLUSTERED INDEX idx_Customers ON Customers(CustomerID) 14 GO15 16 -- Insert 80.000 records 17 DECLARE @i INT = 1 18 WHILE (@i <= 20000) 19 BEGIN 20     DECLARE    @j INT = 1     21     INSERT INTO Customers VALUES 22     ( 23         @i, 24         'CustomerName' + CAST(@i AS CHAR) + CAST(@j AS CHAR), 25         'CustomerAddress' + CAST(@i AS CHAR), 26         'Comments' + CAST(@i AS CHAR) 27     ) 28      29     SET @j += 1; 30      31     INSERT INTO Customers VALUES 32     ( 33         @i, 34         'CustomerName' + CAST(@i AS CHAR) + CAST(@j AS CHAR), 35         'CustomerAddress' + CAST(@i AS CHAR), 36         'Comments' + CAST(@i AS CHAR) 37     ) 38      39     SET @j += 1; 40      41     INSERT INTO Customers VALUES 42     ( 43         @i, 44         'CustomerName' + CAST(@i AS CHAR) + CAST(@j AS CHAR), 45         'CustomerAddress' + CAST(@i AS CHAR), 46         'Comments' + CAST(@i AS CHAR) 47     ) 48      49     SET @j += 1; 50      51     INSERT INTO Customers VALUES 52     ( 53         @i, 54         'CustomerName' + CAST(@i AS CHAR) + CAST(@j AS CHAR), 55         'CustomerAddress' + CAST(@i AS CHAR), 56         'Comments' + CAST(@i AS CHAR) 57     ) 58      59     SET @i += 1 60 END 61 GO62 63 -- Create a unique non clustered index on the clustered table 64 CREATE UNIQUE NONCLUSTERED INDEX idx_UniqueNCI_CustomerID 65 ON Customers(CustomerName) 66 GO 67  68 -- Create a non-unique non clustered index on the clustered table 69 CREATE NONCLUSTERED INDEX idx_NonUniqueNCI_CustomerID 70 ON Customers(CustomerName) 71 GO

我們通過DMV sys.dm_db_index_physical_stats查看索引的相關信息。

 1 -- Retrieve physical information about the unique non-clustered index  2 SELECT * FROM sys.dm_db_index_physical_stats  3 (  4     DB_ID('ALLOCATIONDB'),  5     OBJECT_ID('Customers'),  6     2,  7     NULL,  8     'DETAILED' 9 ) 10 GO11 12 -- Retrieve physical information about the non-unique non-clustered index 13 SELECT * FROM sys.dm_db_index_physical_stats 14 ( 15     DB_ID('ALLOCATIONDB'), 16     OBJECT_ID('Customers'), 17     3, 18     NULL, 19     'DETAILED'20 ) 21 GO

從圖中我們可以看到，在導航層的唯一非聚集索引每個索引行占用107 bytes，而非唯一非聚集索引每個索引行平均占用117 bytes（最小111 bytes，最大119 bytes）。我們用DBCC PAGE分析下各自的根頁。

1 TRUNCATE TABLE dbo.sp_table_pages2 INSERT INTO dbo.sp_table_pages3 EXEC('DBCC IND(ALLOCATIONDB, Customers, -1)') 4 5 SELECT * FROM dbo.sp_table_pages ORDER BY IndexLevel DESC

首先是唯一非聚集索引，它的根頁是21058。

1 DBCC PAGE(ALLOCATIONDB, 1, 21058, 3) 2 GO3 4 DBCC PAGE(ALLOCATIONDB, 1,21057,3)5 GO

從圖中我們可以看出，唯一非聚集索引在索引根層（還有中間層）只保存了唯一非聚集鍵，因為非聚集鍵本身就已經唯一了。

我們換參數1再來看看根頁信息：

1 DBCC TRACEON(3604)2 DBCC PAGE(ALLOCATIONDB, 1, 21058, 1) 3 GO

這107 bytes包含下列信息：

1 byte：狀態位
n bytes：非唯一聚集索引鍵——這里是CustomerName列，100 bytes
4 bytes：頁ID（PageID）
2 bytes：文件ID（FileID）

我們再來看看唯一非聚集索引的葉子頁。

1 DBCC PAGE(ALLOCATIONDB, 1,21322,3)2 GO

我們可以看出SQL Server通過非唯一聚集鍵和uniquifier來指向聚集表的對應記錄。

這里我們可以得出結論：在非唯一聚集索引上的唯一非聚集索引只在葉子層使用4 bytes的uniquifier，因為這里SQL Server使用聚集鍵和uniquifier直接指向對應的記錄。這個4 bytes的uniquifier在唯一非聚集索引的非葉子層（根層和中間層）不存在。

我們再來看看非唯一非聚集索引的根頁，它的根頁是22986。

1 DBCC PAGE(ALLOCATIONDB, 1, 22986, 3) 2 GO

這里的根頁輸出信息非常有意思！索引記錄的鍵必須設計為唯一。SQL Server如何讓非唯一非聚集索引鍵唯一呢？非常簡單——加下聚集鍵（CustomerID (key)）（4 bytes）。但是聚集鍵這里默認還是不唯一的，因此SQL Server又加了uniquifier（4 bytes），因次當你uniquifier不為0的時候，每個索引行都有8 bytes的開銷。當uniquifier為0時，你只要4 bytes的開銷，因為這個情況下uniquifier并不物理保存在索引記錄里，0是SQL Server自動假定的值。

我們再看看參數為1的信息：

1 DBCC TRACEON(3604)2 DBCC PAGE(ALLOCATIONDB, 1, 22986, 1) 3 GO

這111 bytes 包含下列信息：

1 byte：狀態位
n bytes：非唯一非聚集索引鍵——這里是CustomerName列，100 bytes
n bytes：非唯一聚集索引鍵——這里是CustomerID列，4 bytes
4 bytes：頁ID（PageID）
2 bytes：文件ID（FileID）
4 bytes：uniquifier用到的字節
4 bytes：uniquifier本身值，如果uniquifier非0的話

剛才我們通過sys.dm_db_index_physical_stats知道非唯一非聚集索引的索引記錄是111 bytes，最長是117 bytes。

我們來看看非唯一非聚集索引的葉子頁：

1 DBCC PAGE(ALLOCATIONDB, 1,23308,3)2 GO

這和非唯一聚集索引上定義的唯一非聚集索引是一樣的。葉子層通過聚集鍵（CustomerID）和uniquifier指向聚集表的對應記錄。從這個例子我們可以看出，當你在非唯一聚集索引上定義非唯一非聚集索引時，會有巨大的開銷（每個索引行 8 bytes），因為SQL Server內部要保證非聚集鍵唯一，這就需要大量的存儲開銷。