Discussion:
Puzzling ORDER BY
(too old to reply)
Gene Wirchenko
2012-06-21 16:37:54 UTC
Permalink
Dear SQLers:

http://www.mssqltips.com/sqlservertip/1455/using-the-case-expression-instead-of-dynamic-sql-in-sql-server/
has an example with some puzzling code. Example 3's code is:

***** Start of Code *****
CREATE PROCEDURE dbo.getCustomerData @sortby VARCHAR(9),
@sortdirection CHAR(4)
AS
SET nocount ON

SELECT customerid, firstname, lastname, statecode, statedescription,
totalsales
FROM dbo.Customer
ORDER BY
CASE @sortdirection
WHEN 'asc' THEN
CASE @sortby
WHEN 'firstname' THEN firstname
WHEN 'lastname' THEN lastname
END
END
ASC,
CASE @sortdirection
WHEN 'desc' THEN
CASE @sortby
WHEN 'firstname' THEN firstname
WHEN 'lastname' THEN lastname
END
END
DESC
GO

EXEC dbo.getCustomerData 'lastname', 'desc'
***** End of Code *****

This looks like it specifies an order of two columns one
ascending and one descending but with one of them null. How does this
work? The syntax is weird (to me).

Sincerely,

Gene Wirchenko
Jeroen Mostert
2012-06-21 19:08:33 UTC
Permalink
Post by Gene Wirchenko
http://www.mssqltips.com/sqlservertip/1455/using-the-case-expression-instead-of-dynamic-sql-in-sql-server/
***** Start of Code *****
@sortdirection CHAR(4)
AS
SET nocount ON
SELECT customerid, firstname, lastname, statecode, statedescription,
totalsales
FROM dbo.Customer
ORDER BY
WHEN 'asc' THEN
WHEN 'firstname' THEN firstname
WHEN 'lastname' THEN lastname
END
END
ASC,
WHEN 'desc' THEN
WHEN 'firstname' THEN firstname
WHEN 'lastname' THEN lastname
END
END
DESC
GO
EXEC dbo.getCustomerData 'lastname', 'desc'
***** End of Code *****
This looks like it specifies an order of two columns one
ascending and one descending but with one of them null.
Yes. ORDER BY NULL effectively doesn't sort, because the same sort key is
applied to all rows. So for @sortby = 'lastname', @sortdirection = 'desc',
the ORDER BY becomes

ORDER BY NULL ASC, lastname DESC

With the first part having no effect on the end result.
Post by Gene Wirchenko
How does this work?
Not particularly well, since the optimizer doesn't know what order you
actually want until the query is executed. You may or may not get acceptable
performance out of it. I know the author explicitly wrote the article as a
way of illustrating how to get *away* from dynamic SQL, and certainly, if
you can get these queries to work well for you they're easier than cobbling
together ORDER BY clauses as text, but unfortunately dynamic SQL often
remains the best choice here where performance is concerned. The author
demonstrates the query on 4 rows, but neglects to demonstrate it on 4
million rows.
--
J.
Gene Wirchenko
2012-06-21 20:40:16 UTC
Permalink
On Thu, 21 Jun 2012 21:08:33 +0200, Jeroen Mostert
Post by Jeroen Mostert
Post by Gene Wirchenko
http://www.mssqltips.com/sqlservertip/1455/using-the-case-expression-instead-of-dynamic-sql-in-sql-server/
***** Start of Code *****
[snip]
Post by Jeroen Mostert
Post by Gene Wirchenko
***** End of Code *****
This looks like it specifies an order of two columns one
ascending and one descending but with one of them null.
Yes. ORDER BY NULL effectively doesn't sort, because the same sort key is
the ORDER BY becomes
ORDER BY NULL ASC, lastname DESC
With the first part having no effect on the end result.
What confused me is that explicitly using null (as you wrote) is
an error. This left me wondering about the syntax.
Post by Jeroen Mostert
Post by Gene Wirchenko
How does this work?
Not particularly well, since the optimizer doesn't know what order you
actually want until the query is executed. You may or may not get acceptable
performance out of it. I know the author explicitly wrote the article as a
way of illustrating how to get *away* from dynamic SQL, and certainly, if
you can get these queries to work well for you they're easier than cobbling
together ORDER BY clauses as text, but unfortunately dynamic SQL often
remains the best choice here where performance is concerned. The author
demonstrates the query on 4 rows, but neglects to demonstrate it on 4
million rows.
It seems to me that each case would be executed, and in the case
of, say, 4 million rows, this could be quite a hit. Is that why you
are concerned about the performance?

While trying to figure it out, I found other examples with up to
six different case orders. It struck me as rather kludgy.

Sincerely,

Gene Wirchenko
Jeroen Mostert
2012-06-22 06:24:30 UTC
Permalink
Post by Gene Wirchenko
On Thu, 21 Jun 2012 21:08:33 +0200, Jeroen Mostert
Post by Jeroen Mostert
Post by Gene Wirchenko
http://www.mssqltips.com/sqlservertip/1455/using-the-case-expression-instead-of-dynamic-sql-in-sql-server/
***** Start of Code *****
[snip]
Post by Jeroen Mostert
Post by Gene Wirchenko
***** End of Code *****
This looks like it specifies an order of two columns one
ascending and one descending but with one of them null.
Yes. ORDER BY NULL effectively doesn't sort, because the same sort key is
the ORDER BY becomes
ORDER BY NULL ASC, lastname DESC
With the first part having no effect on the end result.
What confused me is that explicitly using null (as you wrote) is
an error. This left me wondering about the syntax.
Yes, you're right -- if you supply a constant expression explicitly, SQL
Server will catch this and flag it as nonsensical. A bit overzealous, I
think. It's mostly harmless, compared to other things that aren't detected.

If you use "NULL + [column]", or any other expression that always yields
NULL but isn't constant, the error goes away. The CASE, of course, is not a
constant expression.
Post by Gene Wirchenko
Post by Jeroen Mostert
Post by Gene Wirchenko
How does this work?
Not particularly well, since the optimizer doesn't know what order you
actually want until the query is executed. You may or may not get acceptable
performance out of it. I know the author explicitly wrote the article as a
way of illustrating how to get *away* from dynamic SQL, and certainly, if
you can get these queries to work well for you they're easier than cobbling
together ORDER BY clauses as text, but unfortunately dynamic SQL often
remains the best choice here where performance is concerned. The author
demonstrates the query on 4 rows, but neglects to demonstrate it on 4
million rows.
It seems to me that each case would be executed, and in the case
of, say, 4 million rows, this could be quite a hit. Is that why you
are concerned about the performance?
The problem is not the CASE itself (the execution overhead is trivial, and
constant per query) but the fact that the optimizer doesn't know what order
you actually want at compile time. If you write "ORDER BY lastname DESC" and
there is an index on lastname, the optimizer can simply use that. But if you
have multiple clauses covered by CASE, the optimizer cannot know what order
you want, so it won't be able to use any indexes.

As Erland pointed out, in the worst cases where you order by lots of stuff
it's unlikely an index can be proficiently used anyway, so this doesn't
always matter. However, his remark about OPTION (RECOMPILE) is not entirely
correct, I think -- I'll respond to that separately.
Post by Gene Wirchenko
While trying to figure it out, I found other examples with up to
six different case orders. It struck me as rather kludgy.
Dynamic search or sort conditions can't really avoid looking kludgy -- the
only other major option is dynamic SQL, which looks even kludgier. Building
it client side can avoid some of the kludginess, but has its own drawbacks
in security and maintainability.
--
J.
Erland Sommarskog
2012-06-22 09:31:08 UTC
Permalink
Post by Jeroen Mostert
Yes, you're right -- if you supply a constant expression explicitly, SQL
Server will catch this and flag it as nonsensical. A bit overzealous, I
think. It's mostly harmless, compared to other things that aren't detected.
If you use "NULL + [column]", or any other expression that always yields
NULL but isn't constant, the error goes away. The CASE, of course, is
not a constant expression.
Or you can say ORDER BY (SELECT NULL). Which is not very useful in the
main ORDER BY for a query, but this is useful with row_number() when you
don't care about the order. (ORDER BY is mandatory for row_number.)
--
Erland Sommarskog, SQL Server MVP, ***@sommarskog.se

Links for SQL Server Books Online:
SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx
Erland Sommarskog
2012-06-21 21:31:17 UTC
Permalink
Post by Jeroen Mostert
Not particularly well, since the optimizer doesn't know what order you
actually want until the query is executed. You may or may not get
acceptable performance out of it. I know the author explicitly wrote the
article as a way of illustrating how to get *away* from dynamic SQL, and
certainly, if you can get these queries to work well for you they're
easier than cobbling together ORDER BY clauses as text, but
unfortunately dynamic SQL often remains the best choice here where
performance is concerned. The author demonstrates the query on 4 rows,
but neglects to demonstrate it on 4 million rows.
Well, if you have umpteen sort conditions, the odds are very good that the
very most of them will call for sorting anyway. But, true, if a sort
condition is used 80% of the time, and there is a index matching this
condition dynamic SQL wins.

Although, these days you can avoid the syntax overhead of dynamic SQL
by using OPTION (RECOMPILE). (Requires SQL 2008 SP2 or SQL 2008 R2 SP1.)
Then SQL Server will compile the statement every time, and consider
the parameters as constants.
--
Erland Sommarskog, SQL Server MVP, ***@sommarskog.se

Links for SQL Server Books Online:
SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx
Continue reading on narkive:
Loading...