Execution Plan for Noncorrelated Query
By : Kasim Wirama, MCDBA
This posting, I would like to show various possible execution plan for non correlated query and possible action to rewrite query for better execution plan. Let’s run the query at Northwind database that displays list of orderid whose freight cost is under average freight cost of all orders
SELECT orderid FROM dbo.orders
WHERE freight < (SELECT AVG(freight) FROM dbo.orders)
Execution plan for the non correlated and scalar query is :
|--Nested Loops(Inner Join, WHERE:([Northwind].[dbo].[Orders].[Freight]<[Expr1006]))
|--Compute Scalar(DEFINE:([Expr1006]=CASE WHEN [Expr1013]=(0)
| |--Stream Aggregate(DEFINE:([Expr1013]=COUNT_BIG([|
|--Clustered Index Scan (OBJECT:([Northwind].[dbo].
|--Clustered Index Scan(OBJECT:([Northwind].[dbo].[Orders].[PK_Orders]))
The above execution plan calculate average value of freight and then verify all records in orders database which has freight value below average value of freight. It’s straightforward description from the execution plan. Now let’s look into another noncorrelated subquery
SELECT
orderid
FROM orders
WHERE customerid =
(
SELECT customerid
FROM customers
WHERE contactname = 'Antonio Moreno'
);
Execution plan for the query is :
|--Nested Loops(Inner Join, OUTER REFERENCES:([Expr1008]))
|--Assert(WHERE:(CASE WHEN [Expr1007]>(1) THEN (0)
| |--Stream Aggregate(DEFINE:([Expr1007]=Count(*), [Expr1008]=ANY([Northwind].[dbo].[Customers].[CustomerID])))
| |--Clustered Index Scan(OBJECT:([Northwind].[dbo].[Customer WHERE:([Northwind].[dbo].[Customers].[ContactName]=N'Antonio Moreno'))
|--Index Seek(OBJECT:([Northwind].[dbo].[Orders].[CustomerID]), SEEK:([Northwind].[dbo].[Orders].[CustomerID]=[Expr1008]) ORDERED FORWARD)
The query above could probably returns error because inner query might return non scalar value (more than 1 row) and contactname is not unique, so probably 1 contactname serves several customerid. The way optimizer checking whether inner query returns more than 1 rows is by adding ASSERT operator. If it is true, then the query raises error.
Another noticeable operator is ANY operator. It is internal operator that counts the number of record for each customerID. But because Stream Aggregate expects customerid to be aggregated or to be put in GROUP BY clause, so it requires ANY as internal aggregate operator, so the query :
SELECT COUNT(*), customerid
FROM customers
WHERE contactname = 'antonio moreno'
is not parsed by optimizer except you define GROUP BY for customerid.
As long as, you are aware that contactname could have more than one customerid, the following query is valid and it is considered valid to previous non correlated query :
SELECT
orderid
FROM orders JOIN customers
ON orders.customerid = customers.customerid
WHERE customers.contactname = 'Antonio Moreno'
Execution plan is :
|--Nested Loops(Inner Join, OUTER REFERENCES:([Northwind].[dbo].[Customers].[CustomerID]))
|--Clustered Index Scan(OBJECT:([Northwind].[dbo].[Customers].[PK_Customers]), WHERE:([Northwind].[dbo].[Customers].[ContactName]=N'Antonio Moreno'))
|--Index Seek(OBJECT:([Northwind].[dbo].[Orders].[CustomerID]), SEEK:([Northwind].[dbo].[Orders].[CustomerID]=[Northwind].[dbo].[Customers].[CustomerID]) ORDERED FORWARD)
This execution plan is simpler than previous one.
Now create unique index on contactname column at customers table and execute previous non correlated subquery, the execution plan is :
|--Nested Loops(Inner Join, OUTER REFERENCES:([Northwind].[dbo].[Customers].[CustomerID]))
|--Index Seek(OBJECT:([Northwind].[dbo].[Customers].[ix_1]), SEEK:([Northwind].[dbo].[Customers].[ContactName]=N'Antonio Moreno') ORDERED FORWARD)
|--Index Seek(OBJECT:([Northwind].[dbo].[Orders].[CustomerID]), SEEK:([Northwind].[dbo].[Orders].[CustomerID]=[Northwind].[dbo].[Customers].[CustomerID]) ORDERED FORWARD)
For this case, proper index selection will remove some operator and execution plan will be simpler and more efficient because I tell optimizer that there is only one contactname for each customerid, so optimizer decides to choose index seek to get customer record with contactName = ‘Antonio Moreno’ and optimizer stops searching to other customerid with same contactName.