SQL Server Indonesia User Groups Community Execution Plan for Noncorrelated Query - kasim.wirama
Welcome to SQL Server User Group Indonesia
Sign in JoinHelp | Sign In Live ID

kasim.wirama

Execution Plan for Noncorrelated Query

Execution Plan for Noncorrelated Query

By : Kasim Wirama, MCDBA

 

This posting, I would like to show various possible execution plan for non correlated query and possible action to rewrite query for better execution plan. Let’s run the query at Northwind database that displays list of orderid whose freight cost is under average freight cost of all orders

 

SELECT orderid FROM dbo.orders

WHERE freight < (SELECT AVG(freight) FROM dbo.orders)

 

Execution plan for the non correlated and scalar query is :

  |--Nested Loops(Inner Join, WHERE:([Northwind].[dbo].[Orders].[Freight]<[Expr1006]))

       |--Compute Scalar(DEFINE:([Expr1006]=CASE WHEN [Expr1013]=(0)

       |    |--Stream Aggregate(DEFINE:([Expr1013]=COUNT_BIG([|        

|--Clustered Index Scan (OBJECT:([Northwind].[dbo].

       |--Clustered Index Scan(OBJECT:([Northwind].[dbo].[Orders].[PK_Orders]))

 

The above execution plan calculate average value of freight and then verify all records in orders database which has freight value below average value of freight. It’s straightforward description from the execution plan. Now let’s look into another noncorrelated subquery

 

SELECT

            orderid

FROM orders

WHERE customerid =

(

            SELECT customerid

            FROM customers

            WHERE contactname = 'Antonio Moreno'

);

 

Execution plan for the query is :

 

  |--Nested Loops(Inner Join, OUTER REFERENCES:([Expr1008]))

       |--Assert(WHERE:(CASE WHEN [Expr1007]>(1) THEN (0)

       |    |--Stream Aggregate(DEFINE:([Expr1007]=Count(*), [Expr1008]=ANY([Northwind].[dbo].[Customers].[CustomerID])))

       |         |--Clustered Index Scan(OBJECT:([Northwind].[dbo].[Customer WHERE:([Northwind].[dbo].[Customers].[ContactName]=N'Antonio Moreno'))

       |--Index Seek(OBJECT:([Northwind].[dbo].[Orders].[CustomerID]),   SEEK:([Northwind].[dbo].[Orders].[CustomerID]=[Expr1008]) ORDERED FORWARD)

 

The query above could probably returns error because inner query might return non scalar value (more than 1 row) and contactname is not unique, so probably 1 contactname serves several customerid. The way optimizer checking whether inner query returns more than 1 rows is by adding ASSERT operator. If it is true, then the query raises error.

 

Another noticeable operator is ANY operator. It is internal operator that counts the number of record for each customerID. But because Stream Aggregate expects customerid to be aggregated or to be put in GROUP BY clause, so it requires ANY as internal aggregate operator, so the query :

 

SELECT COUNT(*), customerid

FROM customers

WHERE contactname = 'antonio moreno'

 

is not parsed by optimizer except you define GROUP BY for customerid.

 

As long as, you are aware that contactname could have more than one customerid, the following query is valid and it is considered valid to previous non correlated query :

 

SELECT

            orderid

FROM orders JOIN customers

ON orders.customerid = customers.customerid

WHERE customers.contactname = 'Antonio Moreno'

 

Execution plan is :

  |--Nested Loops(Inner Join, OUTER REFERENCES:([Northwind].[dbo].[Customers].[CustomerID]))

       |--Clustered Index Scan(OBJECT:([Northwind].[dbo].[Customers].[PK_Customers]), WHERE:([Northwind].[dbo].[Customers].[ContactName]=N'Antonio Moreno'))

       |--Index Seek(OBJECT:([Northwind].[dbo].[Orders].[CustomerID]), SEEK:([Northwind].[dbo].[Orders].[CustomerID]=[Northwind].[dbo].[Customers].[CustomerID]) ORDERED FORWARD)

 

This execution plan is simpler than previous one.

 

Now create unique index on contactname column at customers table and execute previous non correlated subquery, the execution plan is :

 

  |--Nested Loops(Inner Join, OUTER REFERENCES:([Northwind].[dbo].[Customers].[CustomerID]))

       |--Index Seek(OBJECT:([Northwind].[dbo].[Customers].[ix_1]), SEEK:([Northwind].[dbo].[Customers].[ContactName]=N'Antonio Moreno') ORDERED FORWARD)

       |--Index Seek(OBJECT:([Northwind].[dbo].[Orders].[CustomerID]), SEEK:([Northwind].[dbo].[Orders].[CustomerID]=[Northwind].[dbo].[Customers].[CustomerID]) ORDERED FORWARD)

 

For this case, proper index selection will remove some operator and execution plan will be simpler and more efficient because I tell optimizer that there is only one contactname for each customerid, so optimizer decides to choose index seek to get customer record with contactName = ‘Antonio Moreno’ and optimizer stops searching to other customerid with same contactName.

Posted: Saturday, January 17, 2009 5:01 AM by kasim.wirama

Filed under:

Comments

No Comments