Table Spool in Equality Correlated Subquery
Table Spool in Equality Correlated Subquery
By : Kasim Wirama, MCDBA
Recent posting, I examine spooling characteristic in nonequality correlated subquery. In nonequality correlated subquery, optimizer generate index spool which created index on the fly spool operation and cache execution result from immediately previous execution. They are called eager index spool and lazy index spool. Let’s take a look spooling behavior in equijoin correlated subquery.
Given the example of equijoin correlated subquery below :
select
o1.OrderID, o1.Freight
from dbo.Orders as o1
where o1.Freight >
(
select AVG(o2.freight)
from dbo.Orders as o2
where o2.CustomerID = o1.CustomerID
);
The query displays order information for each customers whose freight cost is above average for themselves. Notice that predicate in inner query uses equality (=) instead of nonequality in my recent posting. Execution plan for this correlated subquery is :
|--Nested Loops(Inner Join)
|--Table Spool
| |--Segment
| |--Sort(ORDER BY:([o1].[CustomerID] ASC))
| |--Clustered Index Scan(OBJECT:([Northwind].[dbo].[Orders].[PK_Orders] AS [o1]), WHERE:([Northwind].[dbo].[Orders].[CustomerID] as [o1].[CustomerID] IS NOT NULL))
|--Nested Loops(Inner Join, WHERE:([Northwind].[dbo].[Orders].[Freight] as [o1].[Freight]>[Expr1004]))
|--Compute Scalar(DEFINE:([Expr1004]=CASE WHEN [Expr1012]=(0) THEN NULL ELSE [Expr1013]/CONVERT_IMPLICIT(money,[Expr1012],0) END))
| |--Stream Aggregate(DEFINE:([Expr1012]=COUNT_BIG([Northwind].[dbo].[Orders].[Freight] as [o1].[Freight]), [Expr1013]=SUM([Northwind].[dbo].[Orders].[Freight] as [o1].[Freight])))
| |--Table Spool |--Table Spool
First optimizer does scanning all records in Orders table, then sorts the rows based on CustomerID. Next operator is Segment operator. Segment operator will break all records into groups partitioned by CustomerID column. For each groups will be insert into Table Spool operator. Output of table spool is one row which will be processed by topmost Nested Loops operator. Now optimizer will count average freight from second Table Spool operator. Average function is representated by Compute Scalar operator and Stream Aggregate operator, so the function return 1 average freight for current customerid and it compares each rows of customerid from third Table Spool. After the current group of customerid has been processed, table spool is truncated and next group of customerid will be inserted into table spool operator. Do second and third table spool operators refer to first table spool operator? The answer is yes, at graphical execution plan you can check Primary Node ID property for second and third one is same as Node ID property for first Table Spool operator.
If you compare in what extents rewinds between table spool and index spool. Rewinds rate of index spool is higher than that of table spool. So for data that contains many duplicates, index spool performs better than table spool.