Discussion: CPU time and logical reads
simon
2012-02-29 14:16:00 UTC
I have a join of two tables (the first table, with a filtered index, has 6,827
rows; the second has 1 million rows).
The join is on an identity column, which is the clustered index in the small
table and a nonclustered index in the large table.
There is no ORDER BY, so I guess a merge join is not optimal here, but the
query optimizer chooses it.
CPU time is about 350 ms (elapsed time is close to that), and IO is about
1,900 logical reads.

Because one table is small and the other is large, I would use a nested
loop join (for each value in the small table, an index seek on the large
table).
Now the CPU is much faster, only 80 ms, but there are more than 10 times
as many logical reads (20,000).
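(Numbers like these can be captured with SET STATISTICS TIME and SET
STATISTICS IO; a minimal sketch, where dbo.SmallTable and dbo.LargeTable
are placeholder names standing in for the real tables:)

SET STATISTICS TIME ON;
SET STATISTICS IO ON;

SELECT s.id, l.id
FROM dbo.SmallTable AS s
JOIN dbo.LargeTable AS l ON l.id = s.id;

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;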

Since logical reads come from memory, they don't affect performance much.
If I cleared the cache, the loop join would be slower than the merge
join.
But even if I execute DBCC DROPCLEANBUFFERS, DBCC FREESYSTEMCACHE,
DBCC FREESESSIONCACHE, and DBCC FREEPROCCACHE, I still get logical reads.
How can I remove pages from the cache?
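(A cold-cache test on a test server is usually done by flushing dirty
pages first and then dropping the clean ones; this empties the buffer
pool, although the next run's reads are still counted as logical reads:)

CHECKPOINT;             -- write dirty pages to disk so they become clean
DBCC DROPCLEANBUFFERS;  -- evict clean pages from the buffer pool (test servers only)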

Anyway, when the pages are in cache (which is probably most of the time),
the loop join is 5 times faster than the merge join. Should I use a join
hint here? (The first table will always have a small number of rows;
only the second will grow.)
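(A join hint would be a query-level OPTION clause; a sketch with the same
placeholder table names:)

SELECT s.id, l.id
FROM dbo.SmallTable AS s
JOIN dbo.LargeTable AS l ON l.id = s.id
OPTION (LOOP JOIN);  -- forces nested loops for every join in the query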

So, when optimizing a query, should I focus more on CPU time (and ignore
logical reads, since they are almost never physical), or should I focus
more on logical reads?

If I choose HASH JOIN in this case, I get similar results as with the
merge join (CPU time is a little higher, reads are the same).

br, Simon
simon
2012-02-29 14:40:39 UTC
Something more.
If I use a HASH join and remove MAXDOP 1, the hash join has the best
performance, since it is the only one that executes in parallel.
So, should I use a hash join here?
(In general, a merge join has problems with parallel execution if the
CPU is burdened with other tasks.)
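(The hash variant, hinting only the join type and leaving MAXDOP alone so
a parallel plan stays possible; placeholder table names again:)

SELECT s.id, l.id
FROM dbo.SmallTable AS s
JOIN dbo.LargeTable AS l ON l.id = s.id
OPTION (HASH JOIN);  -- no MAXDOP 1, so the optimizer may choose a parallel plan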
Erland Sommarskog
2012-02-29 16:23:53 UTC
Post by simon
But even if I execute DBCC DROPCLEANBUFFERS, DBCC FREESYSTEMCACHE,
DBCC FREESESSIONCACHE, and DBCC FREEPROCCACHE, I still get logical reads.
How can I remove pages from the cache?
The query processor always reads from the cache. It asks the storage engine
for the page, and the storage engine reads the page from disk if it is not
in cache. So you will always get logical reads.
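If you want to see which pages actually sit in the buffer pool, you can
look at sys.dm_os_buffer_descriptors:

SELECT DB_NAME(database_id) AS database_name,
       COUNT(*) AS cached_pages
FROM sys.dm_os_buffer_descriptors
GROUP BY database_id
ORDER BY cached_pages DESC;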
Post by simon
Anyway, when the pages are in cache (which is probably most of the time),
the loop join is 5 times faster than the merge join. Should I use a join
hint here? (The first table will always have a small number of rows;
only the second will grow.)
Only use hints if you have a real performance problem and nothing else
helps.

When measuring time, I think that wall-clock time is the best measurement.
That is, run SELECT SYSDATETIME() before and after. (And make sure that you
don't return a lot of rows to the client.) CPU time may exceed wall-clock
time if you have parallelism.
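A minimal sketch of such a measurement (placeholder table names), assigning
the result to variables so that no rows are sent to the client:

DECLARE @start DATETIME2 = SYSDATETIME();
DECLARE @id INT;

SELECT @id = l.id
FROM dbo.SmallTable AS s
JOIN dbo.LargeTable AS l ON l.id = s.id;

SELECT DATEDIFF(ms, @start, SYSDATETIME()) AS elapsed_ms;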
--
Erland Sommarskog, SQL Server MVP, ***@sommarskog.se

Links for SQL Server Books Online:
SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx
Gert-Jan Strik
2012-02-29 18:57:45 UTC
If you are considering overruling the optimizer's query plan, then you
will have to consider all relevant factors yourself.

You already mentioned the buffer cache hit rate. If a hot cache is the
typical situation, then selecting a small number of rows out of a large
set is usually faster with loop joins than merge joins. If you don't mind
occasionally waiting longer (when most rows are not in cache), then you
could consider such a join hint.

Most of the time, a parallel query plan that uses a hash join is faster
than a merge join. But it also consumes many more resources, both in
terms of memory and of CPU cycles. If you will typically be running your
query in low-activity periods, then it can be a very good choice. If
your server is pushed to its limit, then this approach can backfire in a
big way.

In other words, the best approach depends on your system setup
(hardware) and other use of the same system (other queries, average
load, etc.)

One of the advantages of simply keeping the automatically generated
query plan is that the optimizer re-establishes the best query plan
after each statistics update. It will alter its strategy when
appropriate, for example when the tables grow or when the system is
under memory pressure or very busy. So unless your query is really worth
it, and you know what you are doing, it is best to just go with the
automatically created query plan if its performance is satisfactory.
--
Gert-Jan
simon
2012-03-01 13:19:27 UTC
"And make sure that you don't return a lot of rows to the client"
Is it possible to set that result set won't be returned to client?

At measuring, I usually returns values to variables to avoid time for
returning values to the client.
Or simetimes, when I don't feel to create variables for each column, I
just use temp table:

DECLARE @time DATETIME2
SET @time = SYSDATETIME()

-- dbo.MyTable is a placeholder for the query under test
SELECT * INTO #temp FROM dbo.MyTable

SELECT DATEDIFF(ms, @time, SYSDATETIME()) AS elapsed_ms
DROP TABLE #temp


Well, the hash join has exactly the same logical reads as the merge join;
only the CPU time is twice as high.
Without MAXDOP 1, when the plan goes parallel, its elapsed time is 5 times
lower than the elapsed time of the merge join.

I'm the only one working on this computer, and the optimizer still
decides that the best plan is a merge join on one core.
So using a hint here is appropriate, I think.

In the future, the first table will always have a similar number of rows;
only the second table will grow.
And the larger the second table gets, the better the hash join does
compared to the merge join.

On the other hand, LOOP is also a good choice, since the first table will
always be small, and the time for an index seek on the second table will
not grow much as the table grows.

So LOOP is always on the safe side. A good choice too.

Since I have many cores available, and they are free most of the time,
using a hash join is OK.
Why are the other two joins not executed in parallel?

Even if I set "cost threshold for parallelism" to 0, only the hash join is
executed in parallel.

It would be nice to have a "FORCE PARALLEL EXECUTION" hint for a single
query.
It would also be nice to have a "max degree of parallelism" setting at
the database level and not only at the server level.

I think it is a good idea to set max degree of parallelism to 1 for an
OLTP system with many users and many short queries.
With this setting, the query optimizer will not bother calculating, for
each query, whether parallel execution would be better.
(On the other hand, when taking the plan from the cache, it won't bother
either.)
And because there are many users, all cores will still be working, just
one per user.

And for some long-running, heavy queries, I would set the hint MAXDOP 0,
and since the cost threshold for parallelism is 0, they would tend to
execute in parallel.
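(A sketch of that setup; the server-wide defaults go through sp_configure,
and the per-query override is an OPTION clause on a placeholder query:)

EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max degree of parallelism', 1;      -- OLTP default: one core per query
RECONFIGURE;
EXEC sp_configure 'cost threshold for parallelism', 0;
RECONFIGURE;

SELECT s.id, l.id
FROM dbo.SmallTable AS s
JOIN dbo.LargeTable AS l ON l.id = s.id
OPTION (MAXDOP 0);  -- 0 = let this query use all available schedulers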

Otherwise, I agree that keeping the automatically generated query plan is
the best option in 99% of cases.
If I put the HASH JOIN hint on my join, it executes 5 times faster.
But if my join is part of a larger query, this hint could cause all the
other parts of the query to be executed in parallel as well, which in the
end takes longer than executing on one core.

br, Simon
