Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
1.1k views
in Technique[技术] by (71.8m points)

sql server - Query Index for Dictionary Based Queries

What would be the most efficient way to query and index for the following:

SELECT * Persons.LastName A-D
SELECT * Persons.LastName E-L
SELECT * Persons.LastName M-R
SELECT * Persons.LastName S-Z

I'm using the following which is terribly inefficient and difficult to index:

WHERE LastName LIKE '[a-d]%'

Any ideas on a better way to go about this ? I think this might be a good scenario for Filtered Index, however the where clause would need to be more sargable.

Thanks

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

As Sam says, LIKE '[a-d]%' is SARGable (well almost). Almost because of a not optimized Predicate (see below for more info).

Example #1: if you run this query in AdventureWorks2008R2 database

SET STATISTICS IO ON;
SET NOCOUNT ON;

PRINT 'Example #1:';
SELECT  p.BusinessEntityID, p.LastName
FROM    Person.Person p
WHERE   p.LastName LIKE '[a-a]%'

then, you will get an execution plan based on Index Seek operator (optimized predicate: green rectangle, non-optimized predicate:red rectangle): enter image description here The output for SET STATISTICS IO is

Example #1:
Table 'Person'. Scan count 1, logical reads 7

This means the server have to read 7 pages from buffer pool. Also, in this case, the index IX_Person_LastName_FirstName_MiddleName includes all columns required by SELECT, FROMand WHERE clauses: LastName and BusinessEntityID. If table has a clustered index then all non clustered indices will include the columns from clustered index key (BusinessEntityID is the key for PK_Person_BusinessEntityID clustered index).

But:

1) Your query have to show all columns because of SELECT * (it's a bad practice): BusinessEntityID, LastName, FirstName, MiddleName, PersonType, ..., ModifiedDate.

2) The index (IX_Person_LastName_FirstName_MiddleName in previous example) doesn't includes all required columns. This is the reason why, for this query, this index is a non-covering index.

Now, if you execute the next queries then you will get diff. [actual] execution plans (SSMS, Ctrl + M):

SET STATISTICS IO ON;
SET NOCOUNT ON;

PRINT 'Example #2:';
SELECT  p.*
FROM    Person.Person p
WHERE   p.LastName LIKE '[a-a]%';
PRINT @@ROWCOUNT;

PRINT 'Example #3:';
SELECT  p.*
FROM    Person.Person p
WHERE   p.LastName LIKE '[a-z]%';
PRINT @@ROWCOUNT;

PRINT 'Example #4:';
SELECT  p.*
FROM    Person.Person p WITH(FORCESEEK)
WHERE   p.LastName LIKE '[a-z]%';
PRINT @@ROWCOUNT;

Results:

Example #2:
Table 'Person'. Scan count 1, logical reads 2805, lob logical reads 0
911

Example #3:
Table 'Person'. Scan count 1, logical reads 3817, lob logical reads 0 
19972

Example #4:
Table 'Person'. Scan count 1, logical reads 61278, lob logical reads 0
19972

Execution plans: enter image description here

Plus: the query will give you the number of pages for every index created on 'Person.Person':

SELECT i.name, i.type_desc,f.alloc_unit_type_desc, f.page_count, f.index_level FROM sys.dm_db_index_physical_stats(
    DB_ID(), OBJECT_ID('Person.Person'), 
    DEFAULT, DEFAULT, 'DETAILED' ) f 
INNER JOIN sys.indexes i ON f.object_id = i.object_id AND f.index_id = i.index_id
ORDER BY i.type


name                                    type_desc    alloc_unit_type_desc page_count index_level
--------------------------------------- ------------ -------------------- ---------- -----------
PK_Person_BusinessEntityID              CLUSTERED    IN_ROW_DATA          3808       0
PK_Person_BusinessEntityID              CLUSTERED    IN_ROW_DATA          7          1
PK_Person_BusinessEntityID              CLUSTERED    IN_ROW_DATA          1          2
PK_Person_BusinessEntityID              CLUSTERED    ROW_OVERFLOW_DATA    1          0
PK_Person_BusinessEntityID              CLUSTERED    LOB_DATA             1          0
IX_Person_LastName_FirstName_MiddleName NONCLUSTERED IN_ROW_DATA          103        0
IX_Person_LastName_FirstName_MiddleName NONCLUSTERED IN_ROW_DATA          1          1

...

Now, if you compare Example #1 and Example #2 (both returns 911 rows)

`SELECT p.BusinessEntityID, p.LastName ... p.LastName LIKE '[a-a]%'`
vs.
`SELECT * ... p.LastName LIKE '[a-a]%'`

then you will see two diff.:

a) 7 logical reads vs. 2805 logical reads and

b) Index Seek (#1) vs. Index Seek + Key Lookup (#2).

You can see that the performance for SELECT * (#2) query is far worst (7 pages vs. 2805 pages).

Now, if you compare Example #3 and Example #4 (both returns 19972 rows)

`SELECT * ... LIKE '[a-z]%`
vs.
`SELECT * ... WITH(FORCESEEK) LIKE '[a-z]%`

then you will see two diff.:

a) 3817 logical reads (#3) vs. 61278 logical reads (#4) and

b) Clustered Index Scan (PK_Person_BusinessEntityID has 3808 + 7 + 1 + 1 + 1 = 3818 pages) vs. Index Seek + Key Lookup.

You can see that the performance for Index Seek + Key Lookup (#4) query is far worst (3817 pages vs. 61278 pages). In this case, you can see that and Index Seek on IX_Person_LastName_FirstName_MiddleName plus a Key Lookup on PK_Person_BusinessEntityID (clustered index) will give you a lower performance than a 'Clustered Index Scan'.

And all these bad execution plans are possible because of SELECT *.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

1.4m articles

1.4m replys

5 comments

57.0k users

...