Hey,
Sorry for the long delay in posts, the usual, life getting in the way! I recently hit what seemed to me an obscure problem (although in hindsight, duh), which turned out to be caused by out-of-date statistics (big thanks to Brent Ozar for the assist: twitter | blog).
The Problem
To cut a long story short, I had two very similar queries; the only difference was that one had an ORDER BY, which sends the data to TempDB for sorting. The query with the ORDER BY took 1 minute 42 seconds to complete; the query without it, 6 seconds. I, in my infinite wisdom, believed it was a TempDB problem. I was, of course, wrong.
Speaking to Brent on Twitter, he checked out the execution plans and found that the statistics were wildly off (I run an update nightly, but why that didn't work will be addressed in the second part of this post). I updated the statistics overnight, ran the queries again, and hey presto, the query completed in seconds. WTF, right?
It turned out that SQL Server was generating an execution plan based on how many rows it thought the tables had, and because the statistics were so out of date, it was doing a table spool, thinking that was the quickest way forward. When we compared what it thought it had (41,000 rows) to what it actually had (657 million…), it was obvious (to Brent!) that the statistics were out of date.
Sure enough, I updated the statistics, the execution plan changed to use a hash match instead of a table spool, and the query completed in under 2 seconds. Damn. Great, but damn.
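If you want to see how stale a statistic is for yourself, something along these lines will show the row count the optimizer was working from versus reality. The table and statistic names here are placeholders, not the ones from my system:

```sql
-- Header of the statistics object: the "Rows" column is what the
-- optimizer believed as of the last statistics update
DBCC SHOW_STATISTICS ('dbo.MyBigTable', 'PK_MyBigTable') WITH STAT_HEADER;

-- Compare against the actual current row count
SELECT COUNT_BIG(*) AS actual_rows FROM dbo.MyBigTable;
```

If those two numbers are orders of magnitude apart (41,000 vs. 657 million, say), you've found your problem.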
What I was doing wrong
So why weren't my statistics updating? As it's a large database, index rebuilds on a regular basis are a no-go, so I had an agent job set to run sp_updatestats on the database every night. What I hadn't taken into account is that sp_updatestats will only update statistics on tables that have changed by a certain percentage since the last update. In my case (and, I believe, by default), this is 20%. So in a table of 100 rows, unless 21 rows have been updated or altered, the statistics will be skipped over. In my table of 100 million rows, the 300K inserts that occur each day don't even register, rendering my nightly job practically useless.
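You can spot this situation yourself with a query like the one below: statistics with plenty of churn in absolute terms, but only a tiny fraction of the table changed, so the nightly sweep keeps skipping them. This assumes sys.dm_db_stats_properties is available (SQL Server 2008 R2 SP2 / 2012 SP1 onwards):

```sql
-- Statistics with modifications since their last update, and what
-- percentage of the table those modifications represent
SELECT  OBJECT_NAME(s.object_id) AS table_name,
        s.name                   AS stats_name,
        sp.last_updated,
        sp.rows,
        sp.modification_counter,
        100.0 * sp.modification_counter / NULLIF(sp.rows, 0) AS pct_changed
FROM    sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE   sp.modification_counter > 0
ORDER BY sp.modification_counter DESC;
```

A row showing 300,000 modifications but a pct_changed of 0.3% is exactly the kind of statistic that quietly rots under a percentage-based threshold.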
So what to do?
I was pointed to this excellent script at http://ola.hallengren.com. It will do the job perfectly, and for a lot of my systems I am going to implement it.
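For anyone curious what that looks like in practice, a call along these lines runs Ola's IndexOptimize purely as a statistics-maintenance job, skipping index rebuilds entirely and only touching statistics that have actually been modified. Parameter names are from the documentation on his site; check them against the version you install:

```sql
-- Update statistics only, across all user databases, and only where
-- something has changed since the last update (no index maintenance)
EXECUTE dbo.IndexOptimize
    @Databases = 'USER_DATABASES',
    @FragmentationLow = NULL,
    @FragmentationMedium = NULL,
    @FragmentationHigh = NULL,
    @UpdateStatistics = 'ALL',
    @OnlyModifiedStatistics = 'Y';
```

Crucially, "only modified" here means one modification or more, not a percentage threshold, which is exactly what a big, slowly-churning table needs.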
Additionally, though, I have some systems on which I'm not allowed to deploy third-party scripts (which kills me, because I live and die by Adam Machanic's sp_whoisactive), so I had to use another option.
Harking back to my maintenance plan wizard posts, setting the Update Statistics task to run will do the trick. It's a bit (lot) more blunt force than Ola's procedure, but for most smaller databases it will get the job done nicely.
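If even a maintenance plan is off the table, the same blunt-force result is achievable with nothing but built-ins. The table name below is a placeholder, and note that sp_MSforeachtable is undocumented (though it has shipped with SQL Server forever):

```sql
-- Update every statistic on one specific large table, full scan
UPDATE STATISTICS dbo.MyBigTable WITH FULLSCAN;

-- Or, for a smaller database, sweep every table in one go
EXEC sp_MSforeachtable 'UPDATE STATISTICS ? WITH FULLSCAN;';
```

FULLSCAN on a 657-million-row table is not a quick operation, so on the big systems you'd schedule this per-table, off-hours, rather than sweeping everything nightly.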
Just something I wanted to throw out there while it’s fresh, as it had been killing me, and it’s such an easy fix in hindsight that it makes me want to hit myself in the hand with a hammer for not spotting it. But I have pretty hands, so I’ll forgive myself.
Hope everyone had a great Christmas, and have a happy and healthy New Year!
Stephen