Why the DBA is Necessary to the Cloud- Part I
There are a lot of people and companies starting to push the same old myth regarding the death of the database administrator role in companies. On the Oracle side, it started with release Oracle 7 and now is proposed with the introduction of cloud. Hopefully my post will help ease the mind of those out there with concerns. There are a number of OBVIOUS reasons this is simply not true, but I’m going to write a few posts over the next year on some of the less obvious ones that will ensure DBAs stay employed for the long haul.
The first and to some- less obvious reason that DBAs are going to continue to be a necessary role in Information Technology with the Cloud is that almost all databases use a Cost Based Optimizer, (CBO).
I’m not going to go into when it was introduced in the different platforms, but over 90% of database platforms used in the market today have a CBO. This grants the database the ability to make performance decisions based on cost vs. strict rules, granting, (in theory and in most instances) better performance.
Articles, Bugs and Challenges, (ABCs)
There was an interesting thread on Oracle-l on hit of IO for an EBS environment due to extended statistics. There were links in the conversation to Jonathan Lewis’ blog that bring you to some incredibly interesting investigations on adaptive plans and other posts on configuration recommendations/bugs involved with extended statistics.
With the introduction of the CBO, the DBA was supposed to have less to worry about in the way of performance. The database was supposed to have automated statistics gathering that would then be used, along with type of process, kernel settings and parameters to make intelligent decisions without human intervention. The capability allowed the engine to take advantage of advanced features outside of simple rules, (if index for where clause columns exist, then use, etc.)
Some CBOs perform with more consistency than others, but many times the challenge of why a database chose a plan is lost on the DBA due to the complexity required to make these decisions. The one thing the DBA thought they could count on was the database engine using up to date statistics on objects, calls and parameters to make the decision. DBAs began to tear apart the algorithms behind every table/index scan and the cost of each process and limits for each memory and IO feature. As their knowledge increased, IT shops became more dependent upon their skills to take the CBO to the level required to ensure customers received the data they needed when they needed it. We learned to know when to ignore the cost on a query or transaction and how to force the database to choose the improved plan.
Dynamic Sampling or Is It Dynamic Pain?
I am a database administrator that HATED Oracle dynamic sampling and still find the cost way out weighing the benefit. There were few cases where it served a DBA like me, who possessed strong CBO and statistics knowledge, that for Oracle to make choices for me, (especially with SQL that had controlled hints included in the statements) caused me to find new ways to disable it anyway I could. I had dreams of the feature maturing into something that would serve my needs instead of waking me from those dreams to address another challenge where none should have been present.
If you managed as many multi-TB databases as I did, extensive dynamic sampling, especially on large objects could come back to haunt you. I performed a number of traces on processes where an Exadata was being accused of a configuration problem when in truth, it was 8 minutes of dynamic sampling out of a 9 minute db time. In each instance, I proved dynamic sampling was to blame via trace file evidence and in each instance, developers and application folks involved would ask why dynamic sampling was even considered a feature. I did see the feature usage and benefits, but it was rarely for the very large databases I managed.
Adaptive Plans
The next logical step in Oracle’s mind for enhancing features like dynamic sampling was to add Adaptive Plans. This is another feature that Oracle has introduced to benefit query and transactional process performance in databases. Allow the plan to adapt to allow the plan to adapt to the run in question, but if you’ve read the thread and the links included in the first part of this post, you’ll know that if often performs less than optimally.
Silver Bullets- NOT
In the end, OnPrem databases required extensive knowledge of the internal database workings, metrics and an strong research skills were required to guarantee the most consistent performance for any enterprise database engine.
All DBAs have experienced the quick fix solutionist, (not even a word, but I’m making it up here!) that would make recommendations like:
“Oh, it’s eating up CPU? Let’s get more/faster CPU!”
“I/O waits? Just get faster disk!”
“We need more compute? Just throw more at it!”
As a DBA, we knew that this was the quick and honestly, a temporary fix. To quote Cary Millsap, “You can’t hardware your way out of a software problem.” It’s one of my favorites, as I found myself in the situation of explaining why adding hardware was only a short-term solution. To answer why it’s short-term, we have to ask ourselves, “What is the natural life of a database?”
Growth.
Either in design, processes, users or code, (especially with poorly written code.) If you didn’t correct the poor foundation that was causing the heavy usage on the system by ensuring it ran more efficiently, you would only find yourself in the same place in six months or if lucky, two years, explaining why the “database sucks” again. This required research, testing and traditional optimization techniques, not enabling by granting it more resources to eat up in the future.
The Cloud is the Key
Considering that in a very high level view, any cloud is really just running all of these same product features and database engines on somebody else’s computer. How does this allow for complex features that required expertise to manage bypassed?
Unlike initial project startups or quick development spin ups, do we think companies are just going to continue to pay for more and more compute and IO?
I would be willing to bet it’s more cost effective to have people who know how to do more with less. At what point does that graph of price vs. demand hit the point that having people who know what they’re doing with a database make a difference? I think it’s a lot lower than the threshold many companies assume with statements of “You won’t need a Database Administrator anymore- Just standard administrator and developers!”
Tell me what you think!
I couple of thoughts…
if you ever have the ‘opportunity’ to manage People Soft on Oracle, you will learn to love dynamic sampling. It has saved my goose on a number of occasions.
The bigger Oracle gets, the more bugs they introduce.
You can’t outsource good customer service. It has been my experience that developers and business analysts want a consistent regular human that they can either speak to directly or walk into their cube. Human interaction is still important. They want a consistent voice on the end of the phone who they know and trust.
Thank you for sharing those thoughts.
Kellyn, why don’t you stop using animated GIF in your posts? It really distracts from reading.
SEO builds higher reach of content with images, gift and videos. Gifs are a great “in between” that relates an idea without consuming too many resources. Penetration of a post with a gif is over 10% higher than without.