Techniques to enhance Linked Server performance

This post has been republished via RSS; it originally appeared at: Azure Database Support Blog articles.

In this article, I want to highlight some information and examples on how to enhance linked server queries,  by using  Inner Join of type Remote and OpenQuery:

 

Remote Join:

when you have an inner join between local and remote table (a linked server) , you can force the join operation to be on the remote server.

 

If you have two tables , one local small table (10 rows in my example testdest table) and one huge linked server table (sourctbl table with about 76 million rows)

 

on the Managed instance, I executed the select query with inner join , the left table is the small local table:

 

 

 

select testdest.rowid , sourcetbl.createdate from testdest inner join [mylinkedserver].mydatabase.[dbo].[sourcetbl] sourcetbl /*76915552 rows*/ on testdest.rowid = sourcetbl.rowid

 

 

and here is the execution plan:

 

 

as you see, the estimated execution plan here does not show too much about what is happening on the remote server , it shows only that remote query cost is 76%.

the execution time is more than 3.5 minutes:

SQL Server parse and compile time:

CPU time = 31 ms, elapsed time = 597 ms.

(802264 rows affected)

Table 'testdest'. Scan count 76915552, logical reads 153831104, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

SQL Server Execution Times:

CPU time = 203750 ms,  elapsed time = 217270 ms.

 

I executed the same query but with Remote hint:

 

select testdest.rowid , sourcetbl.createdate from testdest inner remote join [mylinkedserver].mydatabase.[dbo].[sourcetbl] sourcetbl /*76915552 rows*/ on testdest.rowid = sourcetbl.rowid

 

 

the Remote Join execution plan:

execution plan 2.jpg

 

the estimated execution plan shows that the query optimizer is using index scan and  the remote query cost here is 97%, but the actual execution time is much lower, about 2~3 seconds:

Warning: The join order has been enforced because a local join hint is used.

SQL Server parse and compile time:

CPU time = 16 ms, elapsed time = 593 ms.

(802264 rows affected)

Table 'testdest'. Scan count 1, logical reads 2, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

SQL Server Execution Times:

CPU time = 937 ms,  elapsed time = 2875 ms.

 

the idea here is that the join is performed on the remote server , and the number of network packets that are transferred between the instances back and forth are much lower.

if you compare between the inner join and remote inner join

 

Inner Join:

Picture2.jpg

Picture3.jpg

 

Inner Remote Join:

Picture4.jpg

 

 

you will notice that this execution plan is using index scan and the first one was using seek, but here the index scan is for only 10 rows and was only once , also the rows retrieved from the remote server is only 802264 (the result  set of the query).

 

*  it is a important to create a covering index on remote server as well, rowID in our example here.

*  Remote hint is working only for Inner Join not outer joins.

 

 

 

 

OpenQuery:

Here also I will give you examples of how to work with remote queries using the OPENQUERY statement.

 

select top  1000 example:

 

select top 1000 * from [mylinkedserver].mydatabase.[dbo].[sourcetbl]

 

SQL Server parse and compile time:

CPU time = 0 ms, elapsed time = 629 ms.

(1000 rows affected)

SQL Server Execution Times:

CPU time = 31 ms,  elapsed time = 1812 ms.

 

the execution plan:

Picture6.jpg

 

the openquery statement:

 

select * from openquery([mylinkedserver], 'select top 1000 * mydatabase.[dbo].[sourcetbl]’) sourcetbl

 

 

SQL Server parse and compile time:

CPU time = 0 ms, elapsed time = 146 ms.

(1000 rows affected)

SQL Server Execution Times:

CPU time = 31 ms,  elapsed time = 888 ms.

 

openquery execution plan:

Picture7.jpg

 

"where" example:

 

select * from [mylinkedserver].mydatabase.[dbo].[sourcetbl] where rowid =50

 

SQL Server parse and compile time:

CPU time = 0 ms, elapsed time = 657 ms.

(50155 rows affected)

SQL Server Execution Times:

CPU time = 1078 ms,  elapsed time = 10950 ms.

 

2nd try

SQL Server parse and compile time:

CPU time = 0 ms, elapsed time = 0 ms.

(50155 rows affected)

SQL Server Execution Times:

CPU time = 984 ms,  elapsed time = 9002 ms.

 

"Where" with OpenQuery example:

 

select * from openquery([mylinkedserver], 'select * from mydatabase.[dbo].[sourcetbl] where rowid =50') sourcetbl

 

SQL Server parse and compile time:

CPU time = 0 ms, elapsed time = 149 ms.

(50155 rows affected)

SQL Server Execution Times:

CPU time = 1094 ms,  elapsed time = 2009 ms.

2nd try

SQL Server parse and compile time:

CPU time = 0 ms, elapsed time = 0 ms.

(50155 rows affected)

SQL Server Execution Times:

CPU time = 1094 ms,  elapsed time = 2111 ms.

 

with Openquery, the evaluation of the query and part of the query processing “of where clause or  top rows for example” will be on the remote server, put in consideration also the load on both servers and which one of them is the production and which one of them has more valuable resources.

 

 

for more information about Remote Joins and Openquery:

 

Remote Inner Join:

https://docs.microsoft.com/en-us/sql/t-sql/queries/hints-transact-sql-join?view=sql-server-ver15#arguments

 

 OpenQuery:

https://docs.microsoft.com/en-us/sql/t-sql/functions/openquery-transact-sql?view=sql-server-ver15

 

 

REMEMBER: these articles are REPUBLISHED. Your best bet to get a reply is to follow the link at the top of the post to the ORIGINAL post! BUT you're more than welcome to start discussions here:

This site uses Akismet to reduce spam. Learn how your comment data is processed.