The story begins with a Sql Server upgrade from 2012 to 2016. But the upgrade itself also forced dealing with existing issues we didn’t know we had. That plus the opportunity to take advantage of new features are what account for this dramatic change.
Learning #1: Vmware CPU hot swap option is bad. Upon first upgrading performance was worse than 2012 overall and some queries seemed to hang without returning rows. Excessive cxpacket waits and blocking threads on the same session indicated a parallelism issue. I found the vmware best practices for sql server said not to use this option and that all our sql servers were setup using this option. Turning it off and bouncing the severs gave us from 30 to 80% speed increase for parallel operations and the freezing issue became much less frequent.
Learning #2: SSIS basic logging trashes performance. SSIS in 2016 has the option of custom logging. Knowing we don’t need or use logging unless there’s an error (we instead rely on the much more useful and efficient BiXpress audit framework), I created a custom log plan (only takes a few minutes) to only log error events and replaced all our basic log assignments in our job steps. This knocked off another 30% or so from all our job steps.
Learning #3: New cardinality estimator can loose its mind with nasty queries. I’ve had a problematic query for sometime that has implicit datatype conversions. Since it’s generated with a dynamic where clause for incremental loads, parameter sniffing has been an issue for performance. I had previously solved this by using the recompile option. But in 2016 the query would unpredictably go serial. And instead of running 3 minutes it would run for an hour and up to 6 hours! The solution? Use of the new enable parallel plan preference. This took the query down to under 1 minute! But it would still occasionally hang or run for 45 to 120 minutes with little change in volume. The final solution was to use the new hint to prefer the legacy cardinality estimator (which was a trace flag in 2014).
Lesson #4: Updatable clustered columstores! Previously I dropped and recreated non-clustered columnstores…I knew being able to update rather than recreate the indexes would save time but would also cost on the update side. But the primary driver was to eliminate the potential table lock issues. Just stopping the drops saved a few minutes as expected. But by converting to clustered columnstore also helped performance for all the processes that use the related fact tables. And no noticeable hit on the update side! (Testing proved the existing merge update superior to alternatives).
There were many detours during the upgrade especially around getting all the add-on components working with 2016 but I will save those for another day.
The end of the story finds us with a screaming system with faster reporting and super fast updates. And our freezing queries a distant memory.
If you are doing data warehousing with Sql Server and looking to upgrade, feel free to ping me with any questions.