Query Design performance issues #02 – Column-side conversions and Function usage in WHERE

This post has been republished via RSS; it originally appeared at: Azure Database Support Blog articles.

When designing queries and filters (WHERE) sometimes we need to convert some data before filtering, and depending on what you do you may get a bad plan.

 

Find below some scenarios

  • 1 - Date filter
    • 1.1 Solution - Using range filter
    • 1.2 Solution - Using DATE data type
  • 2 - Function on Column / TRIM function
    • 2.1 Solution - Cleanup source data
    • 2.2 Solution - Using like
    • 2.3 Solution - Using computed column + index
  • 3 - Implicit conversions
    • 3.1 Solution - Using parameters with correct data type

Creating environment

To create a test environment you can use adventure works + this scrip to create a table with data types needed

 

 

 

 

 

DROP TABLE IF EXISTS dbo.[Customer_TEST] GO SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO CREATE TABLE [dbo].[Customer_TEST]( [CustomerID] INT NOT NULL PRIMARY KEY IDENTITY(1,1), [NameStyle] [dbo].[NameStyle] NULL, [Title] [nvarchar](8) NULL, [FirstName] [dbo].[Name] NULL, [MiddleName] [dbo].[Name] NULL, [LastName] [varchar](50) NULL, [Suffix] [nvarchar](10) NULL, [CompanyName] [nvarchar](128) NULL, [SalesPerson] [nvarchar](256) NULL, [EmailAddress] [nvarchar](50) NULL, [Phone] [dbo].[Phone] NULL, [PasswordHash] [varchar](128) NULL, [PasswordSalt] [varchar](10) NULL, [rowguid] [uniqueidentifier] NULL, [ModifiedDate] [datetime] NULL ) ON [PRIMARY] GO INSERT INTO [dbo].[Customer_TEST] ([NameStyle],[Title],[FirstName],[MiddleName],[LastName] ,[Suffix],[CompanyName],[SalesPerson],[EmailAddress] ,[Phone],[PasswordHash],[PasswordSalt],[rowguid],[ModifiedDate]) SELECT NameStyle ,Title ,FirstName ,MiddleName ,LastName = CONVERT(VARCHAR(50),LastName) ,Suffix ,CompanyName ,SalesPerson ,EmailAddress ,Phone ,PasswordHash ,PasswordSalt ,rowguid ,ModifiedDate FROM [SalesLT].[Customer] GO 100 -- LOAD 100x CREATE NONCLUSTERED INDEX IX_Customer_TEST_LastName ON dbo.[Customer_TEST](LastName) CREATE NONCLUSTERED INDEX IX_Customer_TEST_ModifiedDate ON dbo.[Customer_TEST](ModifiedDate) GO

 

 

 

 

1 - Date filter

One classic scenario we see this is when customer is looking to filter data range or when you want to ignore the hour part of the datetime data type.

 

DECLARE @ModifiedDate DATETIME2(7) = '2009-05-16 16:33:33.077'
SELECT
CustomerID
,ModifiedDate
,CONVERT(VARCHAR(10),ModifiedDate,120)
,CONVERT(VARCHAR(10),@ModifiedDate,120)
FROM dbo.[Customer_TEST]
WHERE CONVERT(VARCHAR(10),ModifiedDate,120) = CONVERT(VARCHAR(10),@ModifiedDate,120)

 

This conversion "CONVERT(VARCHAR(10),ModifiedDate,120)" converts full date + time to "YYYY-MM-DD" format. The problem here is that SQL will have to convert all values in the table to this format before filtering and will not use an index seek.

 

2020-08-05 14_41_00-perf.sql - fonsecanet.database.windows.net.Test123 (FonsecaSergio (100))_ - Micr.png

The plan will show this conversion as a warning that you can quickly find this

2020-08-05 14_41_38-.png

 

1.1 Solution - Using range filter

Use some date math to create two parameters and do like sample below

DateColumn >= DAY 00:00:00 and DateColumn < DAY+1 00:00:00

 

DECLARE @ModifiedDate DATETIME2(7) = '2009-05-16 16:33:33.077'
SELECT
CustomerID
,ModifiedDate
,STARTDATE = CONVERT(VARCHAR(10),@ModifiedDate,120)
,ENDDATE = CONVERT(VARCHAR(10),DATEADD(day, 1, @ModifiedDate),120)
FROM dbo.[Customer_TEST]
WHERE
ModifiedDate >= CONVERT(VARCHAR(10),@ModifiedDate,120)
AND
ModifiedDate < CONVERT(VARCHAR(10),DATEADD(day, 1, @ModifiedDate),120)
GO

2020-08-06 12_21_42-perf.sql - fonsecanet.database.windows.net.Test123 (FonsecaSergio (99)) - Micros.png

1.2 Solution - Using DATE data type

If time part of the date is not needed for the business you can use DATE data type. It will also consume less space in disk as a bonus.

 

 

2 - Function on Column / TRIM function

Sometimes you are searching data that was not cleanup and you may have spaces before and after data and you need to TRIM data before you filter

  • "XXX"
  • " XXXX"
  • "XXX "

 

DECLARE @LastName VARCHAR(50)= 'Gates'
SELECT *
FROM dbo.[Customer_TEST]
WHERE TRIM(LastName) = @LastName
GO
DECLARE @LastName VARCHAR(50)= 'Gates'
SELECT LastName, CustomerID
FROM dbo.[Customer_TEST]
WHERE TRIM(LastName) = @LastName
GO

You will get a Cluster Index Scan or Noncluster Index Scan ( If its not needed all columns (*) ) but on both ways you will never be able to get a SEEK

 

2020-08-05 14_20_08-perf.sql - fonsecanet.database.windows.net.Test123 (FonsecaSergio (100))_ - Micr.png

2.1 Solution - Cleanup source data

You can cleanup empty spaces in the data before using it, like a big update in the source data and do not use function in the column side

DECLARE @LastName VARCHAR(50)= 'Gates'
SELECT LastName, CustomerID
FROM dbo.[Customer_TEST]
WHERE LastName = @LastName

 

2.2 Solution - Using like

If only right trim is needed you can use like with % wildcard at end.

*If you use % at start it will also avoid index usage and is not recommended

DECLARE @LastName VARCHAR(50)= 'Gates'
SELECT *
FROM dbo.[Customer_TEST]
WHERE LastName LIKE @LastName + '%'

 

2020-08-05 14_24_38-perf.sql - fonsecanet.database.windows.net.Test123 (FonsecaSergio (100))_ - Micr.png

2.3 Solution - Using computed column + index

You can create a computed column and create noncluster index over the computed column.

*I also included LastName in index to avoid lookup when looking for original LastName without the trim

ALTER TABLE dbo.[Customer_TEST]
ADD LastName_Test AS TRIM(LastName)
GO
CREATE NONCLUSTERED INDEX IX_Customer_TEST_LastName_Test
ON dbo.[Customer_TEST](LastName_Test) INCLUDE (LastName)
GO

You can use the new column or the function used to create computed column and it will be using the index automatically.

This is good because you can fix the performance issue without changing the code. Be aware that as you are adding new column and may affect queries using * to return all columns. That is also not a best practice

 

2020-08-05 14_31_43-perf.sql - fonsecanet.database.windows.net.Test123 (FonsecaSergio (100))_ - Micr.png

 

3 - Implicit conversions

You should always match the parameter in procedures / application with the table data type. Depending on the data type you have SQL will have to convert the source data and will cause performance issues and avoid index usage

DECLARE @LastName VARCHAR(50) = 'Gates'
SELECT LastName, CustomerID
FROM dbo.[Customer_TEST]
WHERE LastName = @LastName
GO
DECLARE @LastName NVARCHAR(50) = 'Gates'
SELECT LastName, CustomerID
FROM dbo.[Customer_TEST]
WHERE LastName = @LastName

 

2020-08-05 15_03_04-perf.sql - fonsecanet.database.windows.net.Test123 (FonsecaSergio (100))_ - Micr.png

This implicit conversion is shown as a warning

2020-08-05 15_03_48-.png

SQL will use Data Type precedence to check if data need to be converted and what the destination.

 

In this sample you should convert VARCHAR to NVARCHAR, because you can lose information if you convert NVARCHAR to VARCHAR

 

https://docs.microsoft.com/en-us/sql/t-sql/data-types/data-type-precedence-transact-sql

2020-08-05 15_33_54-Data Type Precedence (Transact-SQL) - SQL Server _ Microsoft Docs and 2 more pag.png

 

3.1 Solution - Using parameters with correct data type

As said before match the parameter in procedures / application with the table data type

 

REMEMBER: these articles are REPUBLISHED. Your best bet to get a reply is to follow the link at the top of the post to the ORIGINAL post! BUT you're more than welcome to start discussions here:

This site uses Akismet to reduce spam. Learn how your comment data is processed.