Tuesday, March 27, 2012

Convert *= / =* to outer joins (ANSI Join)

Hi,

I'm working on converting *= and =* to 'left outer join' and 'right outer join'.

I noticed the difference in behavior between the =* and the phrase "right outer join" and between *= and 'left outer join'. The result set is different. Here is an example:

select a.au_id, b.title, c.qty

from titleauthor a, titles b, sales c

where (a.title_id =* b.title_id)

and (a.title_id =* c.title_id)

I try to conver the above to:

select a.au_id, b.title, c.qty

from titleauthor a

right outer join titles b

on (a.title_id = b.title_id )

right outer join sales c

on (a.title_id = c.title_id )

The first results into 391 rows in pubs database of sql-server 2000 and the second produces 34 rows. It seems not that straight forward to convert.

The question is: If I want to get 391 row, what should I change/add in the second sql statement?

Thank you. Appreciate your help.

Joan

I think this:

select a.au_id, b.title, c.qty

from titleauthor a, titles b, sales c

where (a.title_id =* b.title_id)

and (a.title_id =* c.title_id)

Is actually not a right join between these tables directly, but actually more this query:

--changed the a b c aliases to just be the table for clarity
select titleauthor.au_id, titles.title, sales.qty
from sales
cross join titles
left outer join titleauthor
on titles.title_id = titleauthor.title_id
and sales.title_id = titleauthor.title_id

The commas can be translated directly to CROSS JOIN, with the where being the same. So in your query, since there is no link between sales and titles, it does a cross join. You can see this by looking at the plan:

Code Snippet

SET SHOWPLAN_ALL ON
GO

select a.au_id, b.title, c.qty
from titleauthor a, titles b, sales c
where (a.title_id =* b.title_id)
and (a.title_id =* c.title_id)

SET SHOWPLAN_ALL OFF
GO

|--Hash Match(Right Outer Join, HASH:([a].[title_id])=([b].[title_id]),
RESIDUAL:([pubs].[dbo].[titleauthor].[title_id] as
[a].[title_id]=[pubs].[dbo].[titles].[title_id] as [b].[title_id]
AND [pubs].[dbo].[titleauthor].[title_id] as
[a].[title_id]=[pubs].[dbo].[sales].[title_id] as [c].[title_id]))
|--Index Scan(OBJECT:([pubs].[dbo].[titleauthor].[titleidind] AS [a]))
|--Nested Loops(Inner Join)
|--Index Scan(OBJECT:([pubs].[dbo].[titles].[titleind] AS [b]))

|--Clustered Index Scan(OBJECT:([pubs].[dbo].[sales].[UPKCL_sales] AS [c]))


Note, no join criteria. Hopefully this helps, it is one of the reasons why the syntax was changed Smile
|||

Hi,

Is it because the question is not clear?

Joan

|||You mean the query? It is clear enough, it is just a matter of what you are asking. The ANSI Style is far more clear to express queries as they are going to be/can be expressed.

The old style was ambiguous like this and was based on the concept that you (logically) first cross join each of the tables in the FROM, then for each row check the criteria. Works like a champ for INNER joins, but OUTER joins not so much.|||

Hi Louis,

Thank you so much!

I'll try to fix my query. I might have question later.

Thanks again.

Joan

|||

Hi Louis,

Here I get another example. The old code is as follows:

Select CU.Name, CI.Amount, CI.PolicyItemId, CI.timestamp, CU.CreditSurchargeId, CU.ValueType_ES ,CU.IsDefault, CU.IsModifiable, CU.IsCredit, CP.ObjectCategoryId, CU.Type_ES, CP.DisplayOrder ,CP.Amount, CP.Effectivedateid
from CreditSurchargePolicyLine CP, CreditSurchargeUnit CU, CreditSurchargePolicyLineItem CI
where CP.PolicyLineId = 4
and CP.ObjectSubjectId = 4
and CP.Status_ES = 'A'
and CP.CreditSurchargeId = CU.CreditSurchargeId
and CP.ObjectSubjectId = CU.ObjectSubjectId
and CU.CreditSurchargeId *= CI.CreditSurchargeId
and CI.PolicyItemId = 30153677
and CI.PolicyLineItemId = 1
and CP.ObjectCategoryId in (0, 1)
and CP.CreditSurchargeId <> 79
and CP.CreditSurchargeId <> 78
and CP.CreditSurchargeId <> 83
and CP.effectivedateID = 1

I converted as follows:

Select CU.Name, CI.Amount, CI.PolicyItemId, CI.timestamp, CU.CreditSurchargeId, CU.ValueType_ES, CU.IsDefault, CU.IsModifiable, CU.IsCredit, CP.ObjectCategoryId, CU.Type_ES, CP.DisplayOrder ,CP.Amount, CP.Effectivedateid
from CreditSurchargePolicyLine CP JOIN CreditSurchargeUnit CU ON CP.CreditSurchargeId = CU.CreditSurchargeId
AND CP.ObjectSubjectId = CU.ObjectSubjectId
LEFT OUTER JOIN CreditSurchargePolicyLineItem CI ON (CU.CreditSurchargeId = CI.CreditSurchargeId)
where CP.PolicyLineId = 4
and CP.ObjectSubjectId = 4
and CP.Status_ES = 'A'
and CI.PolicyItemId = 30153677
and CI.PolicyLineItemId = 1
and CP.EffectiveDateId = 1
and CP.ObjectCategoryId in (0, 1)
and CP.CreditSurchargeId <> 79
and CP.CreditSurchargeId <> 78
and CP.CreditSurchargeId <> 83

The old code returns 23 rows, which it should be. But the new code returns 11 rows, which is not correct. But I can't tell what is wrong with the new code.

Thank you!

Joan

|||

This query may fix your problem..

Code Snippet

Select
CU.Name
, CI.Amount
, CI.PolicyItemId
, CI.timestamp
, CU.CreditSurchargeId
, CU.ValueType_ES
, CU.IsDefault
, CU.IsModifiable
, CU.IsCredit
, CP.ObjectCategoryId
, CU.Type_ES
, CP.DisplayOrder
, CP.Amount
, CP.Effectivedateid
from
CreditSurchargePolicyLine CP

JOIN CreditSurchargeUnit CU ON
CP.CreditSurchargeId = CU.CreditSurchargeId
AND CP.ObjectSubjectId = CU.ObjectSubjectId

LEFT OUTER JOIN CreditSurchargePolicyLineItem CI ON
CU.CreditSurchargeId = CI.CreditSurchargeId
And CI.PolicyItemId = 30153677
And CI.PolicyLineItemId = 1

where
CP.PolicyLineId = 4
and CP.ObjectSubjectId = 4
and CP.Status_ES = 'A'
and CP.EffectiveDateId = 1
and CP.ObjectCategoryId in (0, 1)
and CP.CreditSurchargeId <> 79
and CP.CreditSurchargeId <> 78
and CP.CreditSurchargeId <> 83

|||

Thank you ManiD! Thank you all!

|||

You are welcome.

Remember when you use left/right outer join, if you want to apply any filter attach that filter on JOIN condition itself - rather than

at where clause.

-Mani.D

|||

>>Remember when you use left/right outer join, if you want to apply any filter attach that filter on JOIN condition itself - rather than at where clause.<<

In spirit this is usually right, but this isn't quite true. You have to be careful and cognizant about where to put FILTER criteria, but it can go either place. You just have to realize that:

In the JOIN clause, a condition is applied to the joining of the two sets of data. And from the left side of a LEFT join will be returned no matter what (or the right side of a RIGHT join or both sides of a FULL join for that matter Smile

In the WHERE clause, the condition is applied the the set of rows produced from the FROM clause. So if a row was returned in the FROM clause as the result of an LEFT OUTER JOIN, if you then try to filter out the data by comparing data from the right table, all of the values will be NULL. So they will be filtered out unless you realize this.
sqlsql

No comments:

Post a Comment