Monday, October 6, 2008

SQL NOT IN Constraint and NULL values

Few days back i was working on a query which involved "IN". To verify the result i am getting i used "NOT IN" and guess what... i get 0 result.After spending couple of hours on the issue i found that if in the subquery there is any NULL value then query will return a count of o records.
According to SQL Online Book:
"Any null values returned by subquery or expression that are compared to test_expression using IN or NOT IN return UNKNOWN. Using null values in together with IN or NOT IN can produce unexpected results."
Click http://msdn.microsoft.com/en-us/library/ms177682(SQL.90).aspx to read more
On another website i got better example to explain:
To state it simply, why does query A return a result but B doesn't?
A: select 'true' where 3 in (1, 2, 3, null)B: select 'true' where 3 not in (1, 2, null)
This was on SQL Server 2005. I also found that calling set ansi_nulls off causes B to return a result.
The most accepted reason for this behaviour is:
Query A is the same as:select 'true' where 3 = 1 or 3 = 2 or 3 = 3 or 3 = nullSince 3 = 3 is true, you get a result.
Query B is the same as:select 'true' where 3 <> 1 and 3 <> 2 and 3 <> nullWhen ansi_nulls is on, 3 <> null is UNKNOWN, so the predicate evaluates to UNKNOWN, and you don't get any rows.
When ansi_nulls is off, 3 <> null is true, so the predicate evaluates to true, and you get a row
And the another reason i found which explains it well is as follows:
In A, 3 is tested for equality against each member of the set, yielding (FALSE, FALSE, TRUE, UNKNOWN). Since one of the elements is TRUE, the condition is TRUE. (It's also possible that some short-circuiting takes place here, so it actually stops as soon as it hits the first TRUE and never evaluates 3=NULL.)
In B, I think it is evaluating the condition as NOT (3 in (1,2,null)). Testing 3 for equality against the set yields (FALSE, FALSE, UNKNOWN), which is aggregated to UNKNOWN. NOT ( UNKNOWN ) yields UNKNOWN. So overall the truth of the condition is unknown, which at the end is essentially treated as FALSE

No comments: