SQL Query Performance Tuning is a fun place if you know precisely what you are doing otherwise you will go down the rabbit hole. When it comes to fine tuning you need to understand the context and analyze all the possible scenarios with questions for example
Did you notice any data volume change? If yes at what percentage?
If you have any data logs it helps but in the absence of the logs, we can still quickly profile the data to answer this question. Change in data volume causes poor performance due to the fact that most of the time SQLs are created and tested with small volume or sample data sets. A left outer join can cause imbalance if that table (used for left join) grows exponentially overnight. Understanding data volume really helps and it influences your fine-tuning exercise.
Corruption of SQL Execution Plans
In some cases due to an applied patch overnight on a SQL Server box cause, execution plans go out of sync. Just recompiling your Stored Proc may work sometimes because by recompiling you are forcing a fresh execution plan to be generated.
Imagine that some changes were pushed/elevated to Production that caused wrong indexing or may have dropped an existing index. It is always good to double check indexes and stats on any table/s involved in your SQLs.
In this article, I want to focus on a simple approach to understanding what exactly SQL Server is trying to do with your query. For this exercise, we will use Microsoft SQL Server Management Studio (version 17.3) connecting to SQL Server 2019 (CTP).
As part of the experiment, we will try to use this new function called APPROX_COUNT_DISTINCT Read more about it if you are into big data and trying to get the approx distinct counts on a table with billion history rows.
Primarily my focus is to see the function use from an execution plan perspective. APPROX_COUNT_DISTINCT is very useful when you are dealing with large data tables and as per Microsoft, this function has 2% error rate within a 97% probability. Again you can learn more about it.
Here is the SQL for our exercise. I’m using AdentureWorksDW database.
USE [AdventureWorksDW] select Wagetype,count(distinct(FactCallCenterID)) from [dbo].[FactCallCenter] group by Wagetype select Wagetype,APPROX_COUNT_DISTINCT(FactCallCenterID) from [dbo].[FactCallCenter] group by Wagetype
First, let’s run the first SQL statement (refer above) and save the execution plan. If you are not familiar with Execution Plan all you have to do is, in the SQL window right click to get the context menu and choose “Display Estimated Execution Plan” (I’m using SQL Server Management Studio)
USE [AdventureWorksDW] select Wagetype,count(distinct(FactCallCenterID)) from [dbo].[FactCallCenter] group by Wagetype
Second, let’s run the second SQL statement (refer above) and save the execution plan.
USE [AdventureWorksDW] select Wagetype,APPROX_COUNT_DISTINCT(FactCallCenterID) from [dbo].[FactCallCenter] group by Wagetype
Once we have both execution plans are saved, we can take advantage of the Compare Execution Plans feature right from the Microsoft SQL Server Studio. I have attached all the screenshots. In Figure 7 you can notice few differences and you can start comparing side by side for example differences in CPU timing etc.
Hope this in helps you in your SQL fine tuning adventures. By the way, there are powerful tools with heavy graphics rich visualizations to help you in this journey. Knowing the basics will definitely help in my opinion and based on my experience.