Data gravity and the friction it causes within the development cycle are, in my eyes, an incredibly obvious problem.
Data gravity is subject to the von Neumann bottleneck, a basic limitation on how fast computers can be. It's pretty simple: the speed at which data can travel between where it resides and where it's processed is the limiting factor in computing speed.
OLAP, DSS and VLDB DBAs battle this challenge constantly: how much data is being consumed in a process, how much must be brought in from disk, and whether the processing required to create the results will end up "spilling" to disk vs. completing in memory.
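As a quick illustration, one way to see how often sorts and hash joins are spilling is to check the work area statistics (a sketch only, assuming you have access to v$sysstat):

```sql
-- "optimal" executions completed in memory;
-- "onepass" and "multipass" executions spilled to temp.
SELECT name, value
FROM   v$sysstat
WHERE  name LIKE 'workarea executions%';
```

A growing onepass or multipass count is a good hint that data gravity is winning and memory settings or SQL need attention.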
Microsoft researcher Jim Gray spent much of his career looking at the economics of data, which is one of the most accurate terms for this area of technical study. He started working at Microsoft in 1995 and, although passionate about many areas of technology, his research on large databases and transaction processing speeds commands great respect in my world.
Now some may say this has little to do with being a database administrator, but how many of us spend significant time on the cost-based optimizer? Moving or retrieving data has a cost, so economics of data it is.
And this is the fundamental principle of data gravity and why DBAs get the big bucks.
If you’re interested in learning more about data gravity, DevOps and the future of DBAs, register for the upcoming webinar.
I'm off to Columbus, Ohio tomorrow for a full day of sessions on Friday for the Ohio Oracle User Group. The wonderful Mary E. Brown and her group have set up a great venue and a fantastic schedule. Next week, I'm off to SQL Saturday Vancouver to present on DevOps for the DBA to a lovely group of SQL Server attendees. It's my first time in Vancouver, British Columbia, and as it's one of the cities on our list of potential future places to live, I'm very excited to visit.
Speaking of SQL Server- Delphix's own SQL Server COE, (Center of Excellence) meets twice a month to discuss various topics surrounding our much-loved Microsoft offering. This week, one of the topics was a change made to the Backup Operator role's permissions between SQL Server 2008R2 and SQL Server 2012. The feature involved, referred to as "File Share Scoping", was unique to 2008R2 clusters and no longer exists.
Now many may say, “but this is such an old version. We’ve got SQL Server 2017, right?” The challenge is, there are folks out there with 2008 instances and it’s good to know about these little changes that can make big impacts to your dependent products. This change impacted products with shared backups file systems and as we know, having access to a backup can offload a lot of potential load on a system.
Now, for my product, Delphix, we are dependent on read access to backup files for the initial creation of our "golden copy" that we source everything from. The change from the previous File Share Scoping in 2008R2 applied only to Microsoft Failover Clusters: in SQL Server 2012, share access is offered only to those with Administrator rights, where previously anyone with the Backup Operator role could attain access, too.
Our documentation clearly states that during configuration of a Delphix engine for validated sync, (creation of the golden copy) the customer must grant read access on the backup shares to the Delphix OS user; it doesn't state to grant Backup Operator. As with everything, routine can spell failure: the Backup Operator role previously offered this access with 2008R2, and it was easy to assume the configuration was complete upon granting the database-level roles.
Using PowerShell from the command line, note that the scoped share grants access only to Administrators in the newer release; the Backup Operators role appears only on the default (*) scope:
PS C:\Users\user> Get-SmbShareAccess -name "E$" | ft -AutoSize

Name ScopeName   AccountName              AccessControlType AccessRight
---- ---------   -----------              ----------------- -----------
E$   USER1-SHARE BUILTIN\Administrators   Allow             Full
E$   *           BUILTIN\Administrators   Allow             Full
E$   *           BUILTIN\Backup Operators Allow             Full
E$   *           NT AUTHORITY\INTERACTIVE Allow             Full
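To address this on the share itself, read access can be granted explicitly to the OS user the dependent product runs as. A sketch only; the DOMAIN\delphix_os account name here is hypothetical:

```powershell
# Grant read access on the scoped share to the OS user (account name is hypothetical)
Grant-SmbShareAccess -Name "E$" -ScopeName "USER1-SHARE" -AccountName "DOMAIN\delphix_os" -AccessRight Read -Force

# Verify the new access entry appears on the scoped share
Get-SmbShareAccess -Name "E$" -ScopeName "USER1-SHARE" | ft -AutoSize
```

This mirrors what the documentation asks for: an explicit grant to the OS user, rather than relying on the Backup Operator role as 2008R2 allowed.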
If you’d like to read more details on backup and recovery changes from SQL Server 2008R2 to 2012, check out the documentation from Microsoft here.
This is an extensive series of blog posts, (four so far) to be followed by an ebook, a podcast and two webinars. One, to be announced soon, is from Oracle, called "The DBA Diaries", and the other will be from Delphix, titled "The Revolution: From Databases and DevOps to DataOps".
The goal of all of this is to ease the transition for the database community as the brutal shift to the cloud, now underway, changes our day-to-day lives. Development continues to move at an ever-accelerating pace, and yet the DBA is standing still, waiting for the data to catch up with it all. This is a consequence of the concept many refer to as "data gravity".
The term was coined just a few years ago by Dave McCrory, a Senior VP of Platform Engineering, in an open discussion aimed at understanding how data impacts the way technology behaves when connected with network, software and compute.
He discusses the von Neumann bottleneck, the basic limitation that "the speed with which information can get from memory (where data is stored) to computing (where data is acted upon) is the limiting factor in computing speed."
These are essential concepts that I believe all DBAs and developers should understand, as data gravity impacts all of us. It's the reason for many enhancements to database, network and compute power. It's the reason optimization specialists are in such demand. Other duties such as backup, monitoring and error handling can be automated, but no matter how much logic we drive into programs, nothing eliminates data gravity issues like true skill in optimization. Less data, less weight; it's as simple as that.
We all know the cloud discussions are coming, and with them, even bigger challenges from data gravity. Until then, let's just take a step back and recognize that we need some new goals and some new skills. If you'd like to learn more about data gravity, but don't have time to take it all in at once, consider following the topic on Twitter, where it's curated by Dave McCrory.
I’m off to Jacksonville, Fl. tomorrow to speak at SQL Saturday #649!
I just returned from a week in Paris and it was fantastic!
No amount of terrorist attack could hinder my enthusiasm for the week of vacation. Tim and I were unaware that a man with a hammer at Notre Dame would garner so much attention from the news outlets, but I realized quickly enough that I was all over online, video and print coverage as the major American tourist who witnessed… well, a person lying wounded on the pavement with police over him, guns drawn. I also quickly became aware of a massive crowd coming toward me, seeking shelter in the cathedral from any further gunshots, (where I was first in line!) We were ushered inside immediately and I have to say, I can't think of a more beautiful place to be held while a terror attack investigation goes on outside.
I’d tweeted just before entering the courtyard and then the line to enter the cathedral that we were going to visit Notre Dame and after being ushered in by security, some of my followers on Twitter alerted me that there was something going on, but they weren’t sure what. I quickly responded that it was alright and that we were being protected inside the cathedral. This ended up alerting the major media outlets and they started to contact me. We had about 2 hours inside, so I did grant BBC and AP the time to tell them that the situation was very well contained and that we were safe. I was hoping to keep it from being blown out of proportion, to be honest.
Needless to say, after the initial investigation was completed and we were released, Tim and I went to get a glass of red wine, (I don't commonly drink wine) to calm the nerves. As I've stated in my interviews, the French police had everything incredibly under control and contained the situation in an efficient and orderly manner. I applaud how effectively they've handled the escalated terrorism threats in France and I wish other countries were as prepared as they are.
The rest of the week we enjoyed Paris and even spent a day down on the Southern coast of France. The last evening, just days after the terrorist event, you could see that no one was getting the French down, definitely not some silly terrorist!
So, back to work for me! I have two webinars this week!
Tomorrow is an Oracle one from Pro Huddle, focused on DevOps for the DBA!
Wednesday is for the SQL Server VirtualPass folks and is on Virtualization, Opportunities When Migrating to the Cloud!
I love questions- They give me something to write about that I don’t have to come up with from my experience or challenges…:)
So in my last post, Paul asked:
I am not sure what happens to the other changes which happened while the release was happening? Presumably they are also lost? Presumably the database has to go down while the data is reverted?
In our scenario to answer this question, I’m going to perform the following on the VEmp_826 virtualized database:
As I’m about to make these changes to my database, I take a snapshot which is then displayed in the Delphix Admin console using the “Camera” icon in the Configuration pane.
Note the SCN Range listed on each of them. Those are the SCNs available in that snapshot, and the timestamp is listed as well.
Now I log into my target host that the VDB resides on. Even though this is hosted on AWS, it really is no different for me than logging into any remote box. I set my environment and log in as the schema owner to perform the tasks we’ve listed above.
So we’ll create a table, index and some support objects for our test:
CREATE TABLE REW_TST
 ( C1    NUMBER NOT NULL,
   C2    VARCHAR2(255),
   CDATE TIMESTAMP );

CREATE INDEX PK_INX_RT ON REW_TST (C1);

CREATE SEQUENCE RT_SEQ START WITH 1;

-- populate C1 from the RT_SEQ sequence created above
CREATE OR REPLACE TRIGGER RW_BIR
BEFORE INSERT ON REW_TST
FOR EACH ROW
BEGIN
  SELECT RT_SEQ.NEXTVAL INTO :new.C1 FROM DUAL;
END;
/
Now that it’s all created, we’ll take ANOTHER snapshot.
This snapshot takes just a couple of seconds, lands about 10 minutes after our first one, and contains the changes made to the database since we took the initial snapshot.
Now I’ll add a couple rows from another transaction into the KINDER_TBL from yesterday:
INSERT INTO KINDER_TBL VALUES (1, dbms_random.string('A', 200), SYSDATE);
INSERT INTO KINDER_TBL VALUES (2, dbms_random.string('B', 200), SYSDATE);
INSERT INTO KINDER_TBL VALUES (3, dbms_random.string('C', 200), SYSDATE);
INSERT INTO KINDER_TBL VALUES (4, dbms_random.string('D', 200), SYSDATE);
INSERT INTO KINDER_TBL VALUES (5, dbms_random.string('E', 200), SYSDATE);
INSERT INTO KINDER_TBL VALUES (6, dbms_random.string('F', 200), SYSDATE);
INSERT INTO KINDER_TBL VALUES (7, dbms_random.string('G', 200), SYSDATE);
COMMIT;
We’ll take another snapshot:
Now let’s add a ton of rows to the new table we’ve created:
SQL> Begin
       For IDS in 1..10000 Loop
         INSERT INTO REW_TST (C2)
         VALUES (dbms_random.string('X', 200));
         Commit;
       End loop;
     End;
/
And take another snapshot.
Now that I have all of my snapshots for the critical times in the release change, there is a secondary option that is available.
As I pointed out earlier, there is a range of SCNs in each snapshot. Notice that I can now provision by time or SCN from the Admin Console:
So I could easily go back to the beginning or ending SCN of any snapshot, or I could click on the up/down arrows or type the exact SCN I want to pinpoint for the recovery. Once I've decided on the correct SCN to recover to, I click on Refresh VDB and it will go to that SCN, just like doing a recovery via RMAN, but instead of having to type out the commands and the time constraints, this is an incredibly quick recovery.
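For comparison, the manual equivalent against a regular database would look something like this from SQL*Plus (a sketch only, reusing the example SCN from this post; an RMAN run block with SET UNTIL SCN is the other common route):

```sql
-- incomplete (point-in-time) recovery to a chosen SCN
shutdown immediate
startup mount
recover database until change 2258846;
alter database open resetlogs;
```

The Delphix console is performing the same style of incomplete recovery behind the scenes, just without the typing or the outage window for everyone else's environments.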
Notice that I can go back to any of my snapshots, too. For the purpose of this test, we’re going to go back to just before I added the data to the new table by clicking on the Selected SCN and clicking Rewind VDB.
Note that now this is the final snapshot shown in the system, no longer displaying the one that was taken after we inserted the 10K rows into REW_TST.
If we look at our SQL*Plus connection to the database, we’ll see that it’s no longer connected from our 10K row check on our new table:
And if I connect back in, what do I have for my data in the tables I worked on?
Pssst- there are fourteen rows instead of seven because I inserted another 7 yesterday when I created the table… 🙂
I think that answers a lot of the questions posed by Paul, but I’m going to jump in a little deeper on one-
Yes, the database did experience an outage as the VDB was put back to the point in time or SCN requested via the Delphix user interface or command line interface. You can see this by querying the database:
SQL> select to_char(startup_time,'DD-MM-YYYY HH24:MI:SS') startup_time
     from v$instance;

STARTUP_TIME
-------------------
10-05-2017 12:11:49
The entire database is restored back to this time. The DSource, (the database the VDB is sourced from, which keeps track of everything going on in all VDBs) has pushed the database back, yet the snapshots from before this time still exist, (tracked by the Delphix Engine.)
If you view the alert log for the VDB, you’ll also see the tail of the recovery steps, including our requested SCN:
alter database recover datafile list clear
Completed: alter database recover datafile list clear
alter database recover datafile list
 1 , 2 , 3 , 4
Completed: alter database recover datafile list
alter database recover if needed start until change 2258846
Media Recovery Start
 started logmerger process
Parallel Media Recovery started with 2 slaves
Wed May 10 12:11:05 2017
Recovery of Online Redo Log: Thread 1 Group 3 Seq 81 Reading mem 0
  Mem# 0: /mnt/provision/VEmp_826/datafile/VEMP_826/onlinelog/o1_mf_3_dk3r6nho_.log
Incomplete recovery applied all redo ever generated.
Recovery completed through change 2258846 time 05/10/2017 11:55:30
Media Recovery Complete (VEmp826)
Any other changes around the change you're tracking are impacted by a rewind, so if there are two developers working on the same database, they could impact each other. But with the small footprint of a VDB, why wouldn't you just give them each their own VDB and merge the changes at the end of the development cycle? The glorious reason for adopting virtualization technology is the ability to work in two-week sprints and be more agile than our older, waterfall methods, which are laden with problems.
Let me know if you have any more questions- I live for questions that offer me some incentive to go look at what’s going on under the covers!
Working for a heterogeneous company, I get a lot of questions about which database platforms are important, why, and whether we should invest in them. When looking at some of the lists and some of the marketing, it can be a bit overwhelming.
Now I’m not going to promise to take all of the confusion out of the myriad of database platforms out there, but I’m hoping to help with what’s relevant.
So let's take a look at the list of top database engines in the industry per db-engines.com. The reason I use this site as my main "go-to" for information on database popularity has a lot to do with its method of collecting information. There are other ways of calculating popularity, but I've found that ranking by dollar amount or by number of installations or instances introduces a bias; db-engines is simply more reliable and more telling of trends.
The first thing you’ll notice is that [currently], RDBMS, (relational databases) rule the world. Features, scalability and maturity have given relational a strong base and it continues today. There are a number of databases that fall into the “big data” arena, including Cassandra, Hive, Solr and others, that bridge different platforms. I still agree with the way that db-engines has classified them and it should be noted that they’ve looked at the architecture and not the use of the database.
Oracle has a large lead in the field, but as you can see, it's decreasing on a consistent basis. Most of us in the Oracle realm are hoping that the cloud shift will start to show the success that they are betting on. Just as they made the switch from the (i)nternet to the (c)loud in their versioning, we have faith in this giant. In contrast, MySQL is consistently strengthening its #2 spot. It's still viewed as "open source" by many, and with the power behind its advanced features, including clustering with version 5.7, folks are finding it a strong alternative to Oracle while still staying with the power of the Oracle company.
SQL Server is #3 and with their Azure cloud offering, they are currently the second strongest cloud behind Amazon. I’ve worked in SQL Server as long as Oracle, having started with MSSQL 6.0-7.0 back in 2000. I was very passionate about database technology and didn’t feel a strong kinship to any one platform, so was happy to work in Oracle, SQL Server, Sybase or others that came along in my early years of database administration. Microsoft has invested heavily in business intelligence and data visualization with 2016 and I’m enthralled with some of the newest features.
MongoDB is the only document store in the top 20. It's a love/hate relationship for those that use it, and those that came from the RDBMS world often have a negative view of the platform. A number of web developers are loyal to the platform and have propagated it out to the technical world.
A document store is quite the "anti-RDBMS", considering it violates so many of the rules of an RDBMS, with different columns per row and no schemas. Amazon attempts to have its own version of every database, and for MongoDB, that's DynamoDB. Other competitors include:
Cassandra and HBase, both via Apache, are the top choices. These have similarities to DynamoDB and other NoSQL databases. They are highly scalable, with the ability to be spanned across bare-metal hardware, and they require significant technical knowledge of hardware, software and development to architect an environment that functions seamlessly and efficiently. Having a strong architect is essential, and a strong, collaborative team is important to this type of technology.
Rounding out the top ten in the list is Redis, a key-value store. Redis is an in-memory data store that supports a wide range of development languages and has a strong open-source community. One of the interesting features in this database engine is its ability to house geospatial indexes and use radius queries with them. I worked with spatial data for four years when I first became a DBA and was always fascinated with this type of data. Redis has a command called GEORADIUS that makes working with this data much easier.
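To give a feel for it, here's what that looks like from redis-cli (the key name and coordinates here are made up for the example):

```
127.0.0.1:6379> GEOADD offices -122.419 37.774 "sf" -104.990 39.739 "denver"
(integer) 2
127.0.0.1:6379> GEORADIUS offices -122.4 37.78 50 km
1) "sf"
```

Longitude/latitude pairs go in with GEOADD, and GEORADIUS returns only the members within the given radius of the query point; no spatial schema or index DDL required.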
When I was at PASS Summit, Elasticsearch was mentioned often. I also noticed that it was one of the platforms mentioned in Brent Ozar's salary data. It's pushed on AWS, but I'm seeing it more commonly on Azure, and it has numerous followers due to its ability to do deep log analysis and then build custom dashboards with Kibana.
I was also surprised, as I think a few will be, that Splunk is listed. It is a search engine that offers insight into log data. As logs can accumulate considerable storage in a short time, they’ve branched wisely into a cloud offering. Just as tracing and logs are the key to database administrators, the value in log data can benefit anyone in Information Technology.
Now there are a few others that fall outside of what’s in the top categories and database platforms that I’d like to mention.
Snowflake, (#169) and Redshift, (#69) are starting to gain momentum. Why? With migration to the cloud, less is more. The ability to virtualize your data warehouse is a great way to ease the shift, but also to lessen your cloud footprint, which means less storage usage.
This is a key part of what Delphix is educating customers on as well: we virtualize non-production databases for many of the platforms seen in the db-engines list, making it easier to migrate to the cloud and lessening the storage footprint. This is an area technology should be emphasizing more. As with most monthly subscriptions, the smaller bill looks great at first, but when you add it up over the long haul, you realize how much you're paying for the cloud, and it's essential to recognize the opportunities to save on storage, compute and resources.
The second area I want to focus on is control: control over our data, control over what we can do with our data, and control over support when something goes wrong with our data. We choose database platforms to have some control over the features and product support available to us. The dynamic has shifted as many companies now look to open source to gain control over what they produce, creating features and product more on their own. They feel limited by large technology companies and want more control over what they can do and where they're headed.
No matter what you decide, no matter what your favorite database platform is or what you use in your company, databases make the world go round. The world continues to spin, and databases continue to be the mechanism that delivers the data we use every day to answer questions, build trends and analysis, and make the decisions the world relies on. We database administrators and developers are lucky enough to be the ones working in this exciting arena of technology.
The Oak Table network is a closed membership of about 100 technical scientists who have a passion for technology and the world around them. These men and women, representing some of the brightest minds from each of their companies, are individuals who have built companies and major technical innovations.
Initially, much of their research and work was presented at Oracle Open World. Over time, Oracle Open World welcomed less and less of their participation. They decided to create their own network and present their information at their own conference, held annually at the same time as OOW. Over a decade later, the event now has company sponsors and a dedicated following of Oracle specialists who attend to receive the deepest and most interesting details of technical data in the industry.
There is a "cool factor" to going to Oak Table World. Most of the members are also part of Oracle's famed ACE Program, (most have ACE Director associations). Some of the members work for Oracle; most work for companies that are Oracle partners. The sessions have a tendency to leave attendees with their minds blown.
The conference this year was sponsored by Delphix, Pythian and Amazon Web Services and was held at the Children's Creativity Museum. The two-day schedule only allows for thirteen one-hour sessions and eight 10-minute "TED"-style talks. This year the conference attracted over 400 attendees in two days. A majority of the talks were the deeper technical content that wasn't deemed suitable for the more marketing-driven Oracle Open World event.
Because Delphix is one of the major sponsors, by default, it receives quite a bit of exposure at the event. In addition to sponsoring, Delphix does the coordination for the event. These opportunities provide us a way to create awareness with the primary technical thought leaders in the Oracle community.
In the next couple of weeks, you'll see a post on Oak Table with links to the presenters' incredible slides and to the recorded sessions. One session that wasn't captured was Toon Koppelaars' one-hour session on the first day. We know this was a session many said was a must-see, so I'll be working with Laura Ramsey from OTN to get a webinar set up so many can take advantage of Toon's amazing contribution.
If you have questions, please feel free to reach out to me and I’ll happily share information on what we shared today and how we will continue to support Oak Table Network and thank you to each of the sponsors who, like Delphix, know the value of these incredible technologists.
I've been going through some SERIOUS training in just over a week. This training has successfully navigated the "Three I's", as in it's been Interesting, Interactive and Informative. The offerings are very complete and the knowledge gained is limitless.
I'd also like to send a shout out to Steve Karam, Leighton Nelson and everyone else at Delphix who's had a hand in designing the training, both for new employees and for those working with our hands-on labs. I've had a chance to work with both and they're far above anything I've seen anywhere else.
Most DBAs know- If you attempt to take a shortcut in patching or upgrading, either by not testing or hoping that your environments are the same without verifying, shortcuts can go very wrong, very quickly.
Patching is also one of the most TEDIOUS tasks required of DBAs. The demands on the IT infrastructure for downtime to apply quarterly PSU patches, (not including emergency security patches) to all the development, test, QA and production databases is a task I've never looked forward to. Even when utilizing Enterprise Manager 12c Patch Plans with the DBLM management pack, you still had to hope that you had checked compliance for all environments and pray that your failure threshold wasn't tripped, which meant a large amount of your time had to be allocated to patching beyond just testing and building out patch plans.
I bet most of you already knew that you could virtualize your development and test environments from a single Delphix compressed copy, (referred to as a DSource) and create as many virtual copies, (referred to as VDBs) as your organization needs for development, testing, QA, backup and disaster recovery. (If you weren't aware of this, you can thank me later… :))
What you may not know, (and what I learned this week) is that you can also do the following:
Considering how much time and resources are saved by just eliminating such a large portion of time required for patching and upgrading, this is worth investing in Delphix just for this alone!
Want to learn more? Check out the following links:
Want to Demo Delphix? <– Click here!
I’m a Leaf on the Wind, Watch How I Soar.
This is one of my favorite lines from the movie Serenity.
Without dwelling on Alan Tudyk's character dying at the end, (sorry if I ruined the movie for anyone…) this is how a DBA should be in the database environment: skilled, reliable and maybe a little off our rockers. Our job is to protect the data and the database, all of it.
With that said, I’m going to list my Ten Rules of Database Administration.
Happy Friday Folks!
There was a question posted on the Oracle-l forum today that deserves a blog post for easy lookup. It concerns your Enterprise Manager repository database, (aka the OMR.) This database has a restricted-use license, which means you can use it for the Enterprise Manager repository, but you can't add partitioning to it, or RAC or Data Guard features, without licensing those features. You also can't use the diagnostic and tuning pack features available in Enterprise Manager on the repository database, outside of the EMDiagnostics tool, without licensing them. You can view information about the license that is part of the OMR here.
No one wants to be open to an audit or have a surprise when inspecting what management packs they’re using.
To view what management packs you’re using for any given EMCC page, you can use the console and access it from the Setup menu from EM12c or EM13c:
With that said, Hans Forbrich made a very valuable addition to the thread, adding how to disable EM management pack access in your OMR database-
Run the following to disable it via SQL*Plus as SYSDBA:
ALTER SYSTEM SET CONTROL_MANAGEMENT_PACK_ACCESS='NONE' scope=BOTH;
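To confirm the setting took effect afterward, a quick sanity check (the query itself doesn't require any pack license):

```sql
-- should return NONE once the change is in place
SELECT value
FROM   v$parameter
WHERE  name = 'control_management_pack_access';
```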
Other packs are disabled using the EM Cloud Control with the appropriate privileges in the console using the SETUP menu in 22.214.171.124 with a patch or higher:
The view can be changed from licensed databases to all databases and then you can go through and adjust management packs as licensed and then apply.
Don’t make yourself open to an audit when Enterprise Manager can make it really easy to manage the management packs you are accessing.
After my AWR Warehouse session at DOAG in Nuremberg, Germany last week, an attendee asked me if he really needed to use trace files anymore with the value provided by AWR, (Automatic Workload Repository) and ASH, (Active Session History.) I responded that trace files were incredibly valuable and should never be discounted.
To use an analogy, when I was 22 years old, and entered a doctor’s office missing a small percentage of my visual field after an apparent migraine, doctors used a number of different “tools” at their disposal in an attempt to diagnose what had happened. They first started with an Ophthalmoscope to determine the health of my retinas. They assumed that I was simply experiencing visual disturbances due to a migraine and sent me home with a prescription strength case of Tylenol after verifying that my eyes were healthy.
After another two days passed with no change, the doctors proceeded with a Computed Tomography scan, aka CAT/CT scan. This is a standard first-line inspection of the brain and, again, it yielded no information as to what was causing the visual disturbance. If it hadn't been for a neurologist who was on duty at the hospital, I may very well have been sent home again. He asked me about the severe migraine and the actual specifics of the experience, and suddenly realized he was hearing someone describe the symptoms of a stroke. No one had considered asking what had happened and, since I was in my early twenties, no one had considered a stroke. At this point, the doctor asked for me to have an MRI, (Magnetic Resonance Imaging) with contrast. An injection of gadolinium contrast rendered certain tissues and abnormalities more clearly visible, and for me, it showed that I had experienced a break in the blood vessels in the back right of my brain, producing a small aneurysm with only a little blood loss. The main damage was to the tissue in the back area of my brain which "translates" the information sent from my optic nerve, and this was the cause of my left-side visual field loss. With this new information, he was able to start the research that, in the coming years, saved my life.
This doctor, like a database administrator or developer, knew to use the right tool for the job once he knew what he was up against, and this is how we succeed when others fail in the technical industry. To be told that you no longer need a tool such as tracing because valuable data is provided by AWR or ASH limits your ability to see the real culprits that those secondary tools weren't really designed to capture.
To know when to use one tool over the other is often the biggest challenge. A number of years back, a customer was experiencing terrible, overall performance issues in an ETL process that no one had been able to pinpoint. I always love a good challenge and began to look at it via AWR reports. I noticed that there were waits that were unaccounted for in the SQL processing.
--------------------------------------------------------------------------------------------------------------
| Id | Operation                         | Name                  | Rows | Bytes | Cost | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------
|  0 | DELETE STATEMENT                  |                       |      |       |  688 |       |       |
|  1 |  DELETE                           | _ITEM_MONTH_ROLLUP    |      |       |      |       |       |
|  2 |   NESTED LOOPS                    |                       | 1490 | 86420 |  688 |       |       |<--This one doesn't add up to time/value consumed.
|  3 |    VIEW                           | VW_NSO_1              |    1 |    26 |   20 |       |       |
|  4 |     SORT UNIQUE                   |                       |    1 |    40 |   20 |       |       |
|  5 |      TABLE ACCESS BY INDEX ROWID  | PROCESS_LOG_MASTER    |    1 |    40 |    4 |       |       |
|  6 |       INDEX RANGE SCAN            | PROCESS_LOG_MASTER_I7 |  132 |       |    1 |       |       |
|  7 |    PARTITION RANGE ITERATOR       |                       | 1490 | 47680 |  386 |  KEY  |  KEY  |
|  8 |     INDEX RANGE SCAN              | ITEM_MONTH_ROLLUP_I2  | 1490 | 47680 |  386 |  KEY  |  KEY  |
--------------------------------------------------------------------------------------------------------------
– dynamic sampling used for this statement (level=8)
– SQL profile “SYS_SQLPROF_013dad8465770005” used for this statement
CPU costing is off and there is a profile on this statement. Stats are up to date, so why am I seeing dynamic sampling at level 8?
I proceeded to capture a trace and that’s when I was able to demonstrate that due to auto-DOP in 11.2, a choice was made to perform dynamic sampling that was adding over 80% to the execution time, impacting performance significantly.
SELECT /* DS_SVC */ /*+ dynamic_sampling(0) no_sql_tune no_monitoring blah, blah, blah, blah... :)
call     count       cpu    elapsed       disk      query    current       rows
------- ------  -------- ---------- ---------- ---------- ----------  ---------
Parse        1      0.00       0.00          0          0          0          0
Execute      1     28.60      48.52     322895      43776     489123        189
Fetch        0      0.00       0.00          0          0          0          0
------- ------  -------- ---------- ---------- ---------- ----------  ---------
total        2     28.60      48.52     322895      43776     489123        189
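For anyone wanting to capture something similar, the trace here was an extended SQL trace, (event 10046 at level 12, which adds waits and binds). A minimal sketch follows; the tracefile identifier is just an example and the tkprof sort options are one of several reasonable choices:

```sql
-- Tag the trace file so it's easy to find in the trace directory:
ALTER SESSION SET tracefile_identifier = 'etl_ds_svc';
-- Turn on extended SQL trace with waits and bind values:
ALTER SESSION SET events '10046 trace name context forever, level 12';

-- ...run the ETL statement under investigation here...

-- Turn tracing back off:
ALTER SESSION SET events '10046 trace name context off';
-- Then format the raw trace from the OS prompt, e.g.:
--   tkprof <tracefile>.trc etl_ds_svc.txt sys=no sort=exeela
```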
Limiting your tools, whether you limit yourself to just tracing or to AWR/ASH, SLOB, Snapper or any other tool, is a poor choice. Use the right tool for the situation, the one that provides answers to the questions the business relies on you for. You wouldn't trust a doctor who used a CAT scan to check a broken bone vs. an X-ray, so why would you trust an optimization specialist who limits themselves to one set of tools?
So I’ll be heading out for Oracle Open World 2015 in just over another week. This is the first year in a while that I won’t be speaking, (my own fault, as dummy here hasn’t figured out how to submit sessions internally yet, duh!) but I’m still going to support the demo grounds, meet with customers and speak at Oak Table World!
Per Steve Karam, if you need to know who’s speaking and why you should attend the sessions at the Children’s Museum, well here’s a few:
Jonathan Lewis – Yes. That Jonathan Lewis. Also an alien of extraordinary ability.
Tanel Poder – Doer of things you didn't know SQL could do and the snappiest snappermaker
Alex Gorbachev – Jack of all trades, master of…well, all of them really.
Kellyn Pot'Vin-Gorman – Cloud Control control, techno wiz, and data lacquerer
Tim Gorman – Kellyn's hype man. And world renowned Oracle expert.
Kevin Closson – Refusing to party like it's 1999 since 1999
Gwen Shapira – A fifty petabyte machine learning datastore in human form
Cary Millsap – His advice has probably saved your database and career
Oak Table World – The best Oracle speakers outside of OpenWorld. Literally, not figuratively.
Now if that doesn't sum it up, well, let's be honest, there are over 60 members and more speaking than are listed here, so if you have time, it's the best "unconference", EVER.
With that said, we’ll get back to why everyone is there in the first place and it’s because of my awesome employer, Oracle!
I'm busy preparing my new sessions for after Oracle Open World, as I have three conferences in November, (San Antonio, Chicago and Germany) and ensuring the customers I'm in charge of for OOW are ready to go when the stage lights go up! It's a fun kind of crazy chaos, the kind I commonly thrive under, so we're all good!
With that said, there are a few events I’m looking forward to-
Sunday, Oct. 25th, 8:30am- Bridge Run with Crazy Oracle Peeps
Yeah, me and my arthritis will be walking that bridge, but it’s still a great event and I’ll be going. I’m not doing the bay swim or the bike event. Enough is enough, you crazy, exercise nuts! 🙂
I have wonderful peers at Oracle and Akshai Duggal is one of them! She’s going to be explaining how our SCP teams build out the REALLY massive Enterprise Manager environments, ensuring uptime, clear and concise monitoring, along with understanding pesky things like metrics and templates. Check out her session with real use cases, how can you beat that?
Monday, Oct. 26th, 5pm- in the RAC Attack Area- Enterprise Manager SIG from IOUG and Oracle.
Come meet up with all the top EM specialists in the world as we discuss the best topics surrounding Enterprise Manager and enjoy the hospitality of the OTN lounge area!
Monday, Oct. 26th, 6pm, the ODTUG Reception, (another invite only event…)
Another event I look forward to every year and even though I’m over booked this year, I’ll make sure to stop in for this, too!
Tuesday, Oct. 27th, 10-10:50am- Oak Table World- Stewart Bryson and I are going to talk about OBIEE and AWR Warehouse. Children’s Museum
We just wanted to play around a bit and see how much trending and analysis we could do with the product and had a great time building this out. Come check it out and then head on over for….
The big boss is talking, no not Larry, but my boss, Prakash Ramamurthy! He’s going to be discussing the coolest of the cool in new stuff that I’m totally obsessed with right now and if you miss out on this session, you’re going to be missing out on the best!
Tuesday, Oct. 27th, 12:15pm, Oak Table Ted Talk! Secret Topic! Children’s Museum
I can’t tell you what I’m going to talk about, but I’m going to tell you what I can’t tell you now. Yeah, I really just said that… 🙂
Courtney Llamas and a slew of great folks will be on stage talking how to become an EM Champion!
Wednesday, Oct. 28th, 4:15pm- Hybrid Cloud—Pivot to the Cloud with Oracle Enterprise Manager Cloud Control [CON9712] Moscone South 300
I really enjoy working with Steve Steltling in our SCP team and he’ll be presenting with Phillip Brown on Hybrid Cloud!
Wednesday, Oct. 28th- Bloggers Meetup from Pythian…..I’m waiting….still waiting….no invite yet….Where is it?
This is an event that I always enjoy and wait for the invite from Pythian each year, but nothing yet and will be pestering Alex Gorbachev until it shows up… 🙂
Wednesday Oct. 28th, evening- NOT GOING TO TREASURE ISLAND
Yeah, you heard that right. Every year, even when I did get a wrist band, I gave it to someone deserving and then Tim and I enjoy the quiet city for a dinner out in ANY restaurant we like… 🙂 We just aren’t much into concerts and love the opportunity to allow someone else to go that may not have by giving them Tim’s band, (now that I’m with Oracle, I no longer get one…)
Werner DeGruyter, our EM Yoda and Shailesh Dwivedi are going to talk about a great topic. I love mining Oracle EM data, so if that’s something you are impressed with, don’t miss out on this session!
If you want to see all the best of the Enterprise Manager sessions, here’s the Oracle Open World Schedule specific to EM!
Friday is the flight home, pick up my dog, DaVinci from the pet hotel and find out if my youngest son, Josh, still knows who I am!
There may be a reason that one needs to re-add a database to the AWR Warehouse. This is a new opportunity for me to learn and offer assistance, but I’m working off the grid to figure out a solution. Luckily, it’s just a matter of creating a new process from pre-existing code and processes.
The process requires both the AWR Warehouse UI and a SQLPlus connection.
1. Disable the snapshots
2. Extract the AWR for the Database
3. Drop the database from the AWR Warehouse
4. Add the database back into the AWR Warehouse with the retention time or any other configuration change.
5. Update the configuration table in the source database to show the correct snapshots.
Log into the AWR Warehouse console and locate the database that needs to be re-added:
From the AWR Warehouse UI, click on the database and then choose to disable the snapshot uploads for the database in question.
Extract AWR Data for the Database
From the AWR Warehouse host, log into the AWR Warehouse repository database and extract the AWR data. This uses the formal AWR data extract script, found in the $ORACLE_HOME/rdbms/admin directory, and yes, it works with the AWR Warehouse.
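In case it helps, invoking the extract script from SQL*Plus while connected to the repository looks like this, (the ? shorthand expands to $ORACLE_HOME):

```sql
-- Run the standard AWR extract script shipped with the database:
SQL> @?/rdbms/admin/awrextr.sql
```

The script then walks you through the prompts shown below.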
~~~~~~~~~~~~~
AWR EXTRACT
~~~~~~~~~~~~~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~  This script will extract the AWR data for a range of snapshots  ~
~  into a dump file. The script will prompt users for the          ~
~  following information:                                          ~
~     (1) database id                                              ~
~     (2) snapshot range to extract                                ~
~     (3) name of directory object                                 ~
~     (4) name of dump file                                        ~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A list of databases will be displayed that you can choose from. In the AWR Warehouse, all the source databases will be shown:
Databases in this Workload Repository schema
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   DB Id       DB Name      Host
------------ ------------ ------------
* 1027807260   AWRWHSE      AWRWHSE_HOST
     9352681   APPSB1       HOST1
  4212609618   AASDB        HOST2
  4212608695   AASDB        HOST3
... so on and so forth ...
Enter in the DBID for the database from the list. We’ll use the first AASDB from the list for our example:
Enter value for dbid: 4212607231
Using 4212607231 for Database ID
Specify the number of days of snapshots to choose from
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Entering the number of days (n) will result in the most recent
(n) days of snapshots being listed. Pressing <return> without
specifying a number lists all completed snapshots.

Enter value for num_days:
Listing all Completed Snapshots
As I didn't enter a number for the number of days, you can see I want ALL of the AWR data for this database. Now the extract script will ask for snapshot information. Again, we want all of it and the report will show us all of the snapshots. If there are so many that you can't see the first one, use the AWR Warehouse UI to find the first and last snapshot for this step.
Specify the Begin and End Snapshot Ids
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Enter value for begin_snap: 1777
Begin Snapshot Id specified: 1777

Enter value for end_snap: 2303
End Snapshot Id specified: 2303
Specify the Directory Name
~~~~~~~~~~~~~~~~~~~~~~~~~~
Directory Name                 Directory Path
------------------------------ -----------------------------------
DATA_PUMP_DIR                  /u01/admin/AWRWHSE/dpdump/
OPATCH_INST_DIR                /u01/home/oracle/126.96.36.199/OPatch
OPATCH_LOG_DIR                 /u01/home/oracle/188.8.131.52/QOpatch
OPATCH_SCRIPT_DIR              /u01/home/oracle/184.108.40.206/QOpatch

Choose a Directory Name from the above list (case-sensitive).

Enter value for directory_name: DATA_PUMP_DIR
Using the dump directory: DATA_PUMP_DIR
Specify the Name of the Extract Dump File
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The prefix for the default dump file name is awrdat_1777_2303.
To use this name, press <return> to continue, otherwise enter an
alternative.

Enter value for file_name: awrw_AASDB
Using the dump file prefix: awrw_AASDB
End of AWR Extract

SQL> !ls -la /u01/admin/AWRWHSE/dpdump/awrw_AASDB.dmp
-rw-r----- 1 oracle oinstall 268779520 May  4 16:49 /u01/admin/AWRWHSE/dpdump/awrw_AASDB.dmp
Now that the extract is completed and the data is safe, it’s time to return to the AWR Warehouse console.
Click on the database in the AWR Warehouse and choose to remove it from the repository.
Once the database has been removed, go through the proper steps to add it again to the AWR Warehouse. Once added, disable the upload of snapshots. This prevents concurrency and other issues until we're finished.
Reload the AWR Data
Log back into the AWR Warehouse repository database; it's time to load the data back into the AWR Warehouse.
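Just as with the extract, the standard load script from $ORACLE_HOME/rdbms/admin does the work; a sketch of the invocation:

```sql
-- Run the standard AWR load script from SQL*Plus in the repository database:
SQL> @?/rdbms/admin/awrload.sql
```

It then prompts for the dump file and staging details shown below.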
Specify the Name of the Dump File to Load
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Please specify the prefix of the dump file (.dmp) to load:
/u01/admin/AWRWHSE/dpdump/awrw_AASDB.dmp
Pressing <return> will result in the recommended default tablespace
(identified by *) being used.

Enter value for default_tablespace: MGMT_TABLESPACE
As the space will be used at some point anyway, (the MGMT_TABLESPACE is expected to grow) there is no concern here, and the temporary user utilized for the process will be dropped at the end.
Choose to use the standard TEMP tablespace and proceed forward past this step. As the AWR objects are already built into the MGMT_TABLESPACE, there is no need to remap from the SYSAUX as the process in a standard database would require.
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
. . imported "AWR_STAGE"."WRH$_SQL_PLAN"            89.485 MB   329981 rows
. . imported "AWR_STAGE"."WRH$_SQLTEXT"             192.34 MB   398276 rows
. . imported "AWR_STAGE"."WRH$_SYSMETRIC_SUMMARY"   232.12 KB  1129472 rows
.......
........
Append Data for "AWR_STAGE".WRH$_SGASTAT.
INSERT /*+ APPEND */ INTO SYS.WRH$_SGASTAT
  (SNAP_ID, DBID, INSTANCE_NUMBER, NAME, POOL, BYTES)
SELECT SNAP_ID, 3228342000, INSTANCE_NUMBER, NAME, POOL, BYTES
  FROM "AWR_STAGE".WRH$_SGASTAT
 WHERE DBID = :c_dbid
Now that we’ve loaded the data back in, we need to update the metadata table to show that more than the initial data exists.
select * from dbsnmp.caw_load_metadata where dbid=<dbid for database in question>;
aasdb3 oracle_database 4212607231 1777 1778 27-APR-15 27-APR-15 3 1
Update the table to reflect the new data; there should be at most one upload that's been sent over. If there is more than one, change the update statement to reflect only the row with the max(end_snap_id), as the AWR Warehouse loads from the lowest snap_id to the most recent.
select * from dbsnmp.caw_load_metadata where dbid=4212607231;
Update dbsnmp.caw_load_metadata set end_snap_id=2303 where dbid=4212607231 and dump_id=<most recent dump_id>;
If there isn’t any data at all, then force one upload and then edit it. This is easily done from the Console:
Proceed back and update the row in the metadata table, then check the UI to verify that the beginning and end snapshots look correct. You should now have all the AWR data loaded back into the AWR Warehouse! And that's how to reload AWR data for a database that must be removed and re-added to the AWR Warehouse.
This is a new process for me, so please, let me know if you note anything amiss and I’ll be glad to research it!
Additionally, there are some great opportunities in professional development at the IOUG Strategic Leadership Program. I’ll be speaking with James Lui on how to improve your value with personal brands and social media.
I’m the WIT luncheon speaker on Monday, so for those of you that signed up for this great, yearly event for women in tech before it sold out, I’m looking forward to meeting as many of the powerful women in our industry, (and hopefully a few men, too!) before and after the event.
There are also a few SIG’s that I’ll be attending that aren’t on the schedule:
Tue, 12:30 p.m. – 1:00 p.m.: Cloud Computing and Virtualization SIG Meeting (ID: 943)
Wed, 12:30 p.m. – 1:00 p.m.: Oracle Enterprise Manager SIG Meeting (ID: 949)
|Id Number||Date and Time||Session Title||Product Line(s)||Session Track(s)||Session Room|
|976||Sun. Apr. 12, 9:00 am – 1:00 pm||Everything I Needed to Know About Enterprise Manager I Learned at COLLABORATE – Hands-on Lab||Oracle Enterprise Manager||Manageability||Palm D|
|0||Mon. Apr. 13, 9:15 – 10:15 am||Zero to Manageability in One Hour: Build a Solid Foundation for Oracle Enterprise Manager 12c||Oracle Cloud – DBaaS/PaaS, Oracle Database, Oracle Enterprise Manager||Manageability||Banyan B|
|112||Mon. Apr. 13, 3:15 – 4:15 pm||The Power of the AWR Warehouse||Oracle Enterprise Manager||Manageability||Banyan B|
|967||Mon. Apr. 13, 4:30 – 5:30 pm||IOUG Strategic Leadership Program: Staying on Top of Key Trends and Growing Your Personal Brand with Social Media||Applicable to All||Professional Development||Banyan E|
|1003||Wed. Apr. 15, 2:00 – 3:00 pm||OakTable: TED-Style Lightning Talks||Applicable to All||Database||Mandalay K|
|986||Mon. Apr. 13, 12:45 – 1:45 pm||Twitter 101 for Oracle Professionals||Applicable to All||Professional Development||Banyan D|
Thanks to everyone at IOUG for the opportunity to be so involved with this great conference and see everyone soon!
I’ve just returned from my honeymoon in Venice with the wonderful Tim Gorman, but before we enjoyed our week in this picturesque city, we spent a week doing joint keynotes and two sessions each on our own tech topics at these two incredible conferences in neighboring countries.
SIOUG, (Slovenia Oracle User Group) in Ljubljana, Slovenia held their 2014 conference Oct. 13th-14th at the Plaza Hotel. Tim and I did a joint keynote, "Assumptions, the Death of First Dates and Tech Projects Around the World", which was great fun, as it's loaded with a number of Pulp Fiction stills that transcend any language barrier and get everyone finding humor in the obvious, while also understanding how important it is to never assume anything in the tech world.
As soon as we arrived via car, driving in from Venice, we were treated to a wonderful, traditional Slovenian dinner by Joze, @joc1954, and Lily Senegacnik, and started hanging out with Milena Gerova, @milenagerova, and Nikolay Kovachev, @KovachevPro (both from the Bulgarian Oracle User Group). A great time was had by all and the conference was wonderful! Throughout both conferences, Milena and Nikolay were in the middle of a number of technical and social events, always having a great time!
Once finished with the two day conference in Slovenia, we then took a wonderful drive down to Croatia to present the same sessions and keynote at the HROUG, (Croatian Oracle User Group) fall conference in Rovinj, Croatia. This is a beautiful small town on the coast and we were not disappointed.
I’d met the HROUG president at the IOUC summit back at the end of September and seeing him in his “element” was well worth the time! Davor is bigger than life and treated us as if we were all family. He sang with the wonderful band that played at many of the events and ensured we had both a tour of the wonderful town of Rovinj, along with an incredible meal!
The actual 2014 conference was held at the Hotel Istra on the island near Rovinj. It was an absolutely beautiful location to do a keynote and tech sessions. We were humbled by the incredibly positive feedback Tim and I received from both conferences' keynote attendees and thrilled that it was received so well. We also appreciated the support of Zoran Pavlovic,
@ChallengeZoran and Maja Vaselica, @orapassion, (Serbian Oracle User Group) as we pointed them out as a fellow and rare Oracle couple. It was wonderful to finally meet them and get to know these two wonderfully warm and technically bright folks! I was also impressed that as many folks as there were attended the advanced EM12c concepts and of course, the AWR Warehouse had a great turnout, as expected.
Both conferences are well worth the time and effort. My only recommendation: if you decide to drive there with a rental car, just make sure that you purchase a Vignette BEFORE coming into the country, (or Switzerland, Austria or Bulgaria, too!) I'll leave you folks to ask Tim what happens when you don't! 🙂
**Note- This is an article I wrote about 4 years ago that, as far as I know, can only be found in hard copy and although I’ve changed my stance on statistics, etc. with 11g/12c, the main point of CTAS benefits still stands the test of time.
As a kid, did you ever watch Hanna-Barbera's cartoon, "The Jetsons", and see Elroy drop out of George Jetson's flying saucer, going to school via his personal hyper-pod, and think, "forget the bus, I want to go to school in the hyper-pod!"?
For any developer or DBA working with large objects in an OLAP environment, updating and deleting large tables, even when partitioned, can feel like you are on that school bus again. So how do you get off the school bus and ride on Elroy’s hyper-pod to get things done, “Jetson’s Style”?
In my time working with large objects, I’ve seen and discovered more ways to use the CTAS, (“create table as” for those who are new to the acronym!) than I ever thought possible. It should be a pact of any DBA or developer in a warehouse or mart environment to have all the ways this feature can be used added to their arsenal.
For our current day vs. Jetson’s futuristic comparison, we’re going to say our environment consists of data partitioned by company_id, residing mostly in five primary tables. This partitioning method has been quite successful for how the business loads and reports on data, but your first challenge is to prune out all purchase orders older than five years from the system. Once complete, the business has some reporting requirements for you to invest time into, so keep that mind open and ready to work through some challenges!
The tables involved have the following info, examples of columns and sizes:
COMPANY_ACCT, 253 columns, 296GB, 1000 partitions total.
C_ID (partitioning key)
… so on and so forth…
COMPANY_TRANS,120 columns, 312GB, 1000 partitions total.
… so on and so forth…
ID_COMP_DIM, 412 columns, 212GB, 1000 partitions total.
… so on and so forth…
GEO_COMP_DIM, 82 columns, 180GB, 1000 partitions total.
… so on and so forth…
The traditional OLTP processing method to tackle the removal would be a straight delete, performed either as a single statement or multi-threaded per company, as shown below.
How could you perform the delete faster? How could you accomplish this task without actually utilizing a delete statement at all?
Elroy Jetson would say "CTAS it!" Anything that can be done, can be done faster, more efficiently and more suavely when it's done with a CTAS, ("create table as" for those of you who are newer to the acronym). Pair the CTAS with the exchange partition feature and you have a match made in heaven, with opportunities that anyone can appreciate.
Now to return to our challenge at hand- let’s take our first table, the COMPANY_ACCT table and remove all the accounts that haven’t been active for over five years.
The traditional way, version 1:
DELETE FROM COMPANY_ACCT
WHERE LAST_ACTIVE_DT < TO_DATE('01-JUN-2005','DD-MON-YYYY');
Or version 2, process via a procedure and pass in the company id and process multi-thread or in parallel:
DELETE FROM COMPANY_ACCT
WHERE C_ID = :c_id  -- company id passed in by the procedure
AND LAST_ACTIVE_DT < TO_DATE('01-JUN-2005','DD-MON-YYYY');
After each of these steps, you will have stale statistics and considerable empty blocks: blocks that will be scanned on every hash join and table scan, even though they're empty, until you "coalesce" this partition. This is something that you must keep in mind when you process data the "old-fashioned" way.
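Reclaiming that space the old-fashioned way means reorganizing the partition after the delete; a sketch, where the index name is hypothetical:

```sql
-- Release the empty blocks left behind by the delete:
ALTER TABLE COMPANY_ACCT MOVE PARTITION P_229;
-- The move marks each local index partition UNUSABLE, so rebuild it:
ALTER INDEX COMPANY_ACCT_IDX1 REBUILD PARTITION P_229;  -- hypothetical index name
```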
How more efficient would this be to process this via a CTAS with an exchange partition?
Note** No delete in the following statement!!
CREATE TABLE COMPANY_ACCT_229 AS
SELECT * FROM COMPANY_ACCT
WHERE C_ID = '229'  -- the company id belonging to partition P_229
AND LAST_ACTIVE_DT > TO_DATE('01-JUN-2005','DD-MON-YYYY');
CREATE INDEX CA_229_idx1 on COMPANY_ACCT_229(LAST_ACTIVE_DT) LOCAL;
Update statistics, and note that I use very, very small sample sizes for my estimate_percent. I have very consistent data across my data sets, but very large amounts of it, which gives me the luxury of collecting stats without the time required for larger sample sizes. Due to the number of partitions, I am careful to gather only what I need so as not to impact the performance of my data dictionary. I do not like seeing long ops on tables such as col$, tab$, ind$, etc. If you see this, look into the amount of statistics being gathered: are you gathering what you need, or just flooding the data dictionary with data that cannot give you the consistent and solid execution plans that are your goal?
Exec dbms_stats.gather_table_stats(ownname=>'OWNER', tabname=>'COMPANY_ACCT_229', estimate_percent=>.00001);
We are now ready to exchange our partition-
ALTER TABLE COMPANY_ACCT EXCHANGE PARTITION P_229 WITH TABLE COMPANY_ACCT_229 UPDATE INDEXES;
Your delete of the data for this partition is complete. There is no fragmentation, you have updated the stale statistics, and you are as clean as possible for all processing going forward.
What is the cost savings?
Original delete process per partition, no update of statistics, on average: 28 min.
CTAS and exchange partition, including index creation, minus statistics update on average: 7 min.
What if you’re challenged to reproduce the whole table- all the partitions without the older than five years data. How can you complete this task with the CTAS method?
CREATE TABLE DW_PROD.COMPANY_ACCTS2
(
C_ID NUMBER not null,
C_NBR VARCHAR2(30) not null,
ACCT_TYPE_CD VARCHAR2(1) not null,
...   <-- columns to be included in the new table
)
partition by list (C_ID)
(
partition P_1001 values ('1001'),
partition P_1022 values ('1022'),
partition P_1023 values ('1023'),
partition P_1024 values ('1024'),
partition P_1025 values ('1025'),
partition P_1026 values ('1026'),
partition P_1029 values ('1029'),
partition P_1031 values ('1031'),
...   <-- partition list
)
<storage parameters, including parallel> PARALLEL (DEGREE 8)
AS select /*+ PARALLEL(ca 8) */ ...
FROM COMPANY_ACCT ca
WHERE ca.LAST_ACTIVE_DT > TO_DATE('01-JUN-2005','DD-MON-YYYY');
Create the indexes that exist on the original, update the statistics on the secondary and then perform a rename- renaming the original to “old” and the COMPANY_ACCT2 to the original name, COMPANY_ACCT.
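The final rename swap might look like this sketch, (the _OLD suffix is just a convention):

```sql
-- Swap the rebuilt copy in for the original; drop the old table only
-- after the new one has been verified:
ALTER TABLE COMPANY_ACCT RENAME TO COMPANY_ACCT_OLD;
ALTER TABLE COMPANY_ACCTS2 RENAME TO COMPANY_ACCT;
```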
What are the time savings, performing this way vs. original:
Delete Statement, in parallel, against all the partitions, WITHOUT update of statistics: 23 hrs, 12 min.
CTAS and rename, INCLUDING update of statistics: 7 hrs, 4 min.
This is a 60% time savings- 60% less maintenance window of the DBA’s personal time or outage to the business. How can you say no to this?
Dimensional data the CTAS way…
As a special request for the business, a developer needs to create a table with multiple joins to three of the largest tables. The developer makes multiple attempts, but must stop due to performance hits on the database.
SELECT /*+ parallel(g 8) parallel(i 8) use_hash(g i) */
NVL2(g.geo_tract, 'B', 'Z')
FROM geo_comp_dim g, id_comp_dim i
WHERE g.c_id = i.c_id;
The explain plan shows the impressive cost of this query, along with outrageous amount of temporary tablespace required to perform the hash and sort.
|SELECT STATEMENT, GOAL = ALL_ROWS||
|PX SEND QC (ORDER)||SYS||:TQ10002||
|SORT ORDER BY||
|PX SEND RANGE||SYS||:TQ10001||
|PX SEND BROADCAST||SYS||:TQ10000||
|PX BLOCK ITERATOR||
|TABLE ACCESS FULL||DM_OWNER||GEO_COMP_DIM||
|PX BLOCK ITERATOR||
|TABLE ACCESS FULL||DM_OWNER||ID_COMP_DIM||
What is the heaviest weight in this process to the cost of the query?
The hash join between the two tables, but what does it really need from the two tables? The original query needs almost everything from GEO_COMP_DIM, but that's a relatively small cost vs. ID_COMP_DIM, from which the developer only needs three columns of the total 412!
Those of us who have studied hash joins know to expect the hash join to require around 1.6 times the size of the table we lead in with for the hash. What if we created a smaller lead table for the hash, then?
CTAS to the rescue again!
CREATE TABLE ID_COMP_CID_DATA AS
SELECT C_ID, C_MOD_KEY, C_COMP_ID FROM ID_COMP_DIM;
Replace the original ID_COMP_DIM table in the query with ID_COMP_CID_DATA. What is the performance improvement?
|SELECT STATEMENT, GOAL = ALL_ROWS||
|PX SEND QC (ORDER)||SYS||:TQ10002||
|SORT ORDER BY||
|PX SEND RANGE||SYS||:TQ10001||
|PX SEND BROADCAST||SYS||:TQ10000||
|PX BLOCK ITERATOR||
|TABLE ACCESS FULL||DM_OWNER||GEO_COMP_DIM||
|PX BLOCK ITERATOR||
|TABLE ACCESS FULL||DM_OWNER||ID_COMP_CID_DATA||
Same explain plan, but a different, narrow table to work with. The time savings are considerable, too:
Original: 7 hrs, 16 min.
Same Query off of CTAS “dimensional” table:
Creation of ID_COMP_CID_DATA: 10 min
Query: 28 min.
The business has now been asked to change how they report data. Originally, the data was represented as yes or no, but it now needs to be presented as the values requested by the client, who is no longer satisfied with a simple yes or no.
A simple CTAS of the table with a decode of the columns to the new values will create the new table in the format you need without having to perform a standard update.
create table COMP_BUS
STORAGE (FREELISTS 48) nologging as
select /*+ parallel(I,8) */
  i.c_id AS c_id  -- key column, assumed in the select list since it's indexed below
, DECODE(i.sold_flag, 'Y', '1', '0') AS sold
, i.pass_id AS p_id
, TO_NUMBER(DECODE(i.sl_dt, null, null, i.sl_dt - TO_DATE('01-JAN-1999', 'DD-MON-YYYY'))) AS sl_dt
from DM_OWNER_A.comp_bus_dim i ;
create index bus_cid_idx on DM_OWNER.comp_bus (c_id ) tablespace BUS_INDX1 parallel 4;
ALTER TABLE COMP_BUS noparallel;
ALTER INDEX bus_cid_idx noparallel;
No concern over empty blocks, and none of the performance issues that would be experienced with an update will be experienced with the CTAS. Create the table, create any indexes, update statistics and rename the table. The steps may seem more involved, but the time savings and the ability for the CTAS to pay forward in performance benefits are significant.
Here’s where you have to ask yourself- Do you want to be on the yellow school bus or the hyper-pod?
Features that can be utilized in a CTAS that should be investigated for performance benefits include parallel, (in both the DDL and the SELECT), NOLOGGING, storage parameters, and partitioning clauses paired with exchange partition.
In the world of data warehouses and marts, theory would say anyone should be able to easily update or delete rows without issue, but in reality, there are more efficient ways to process large amounts of data. We, as DBAs and developers, must be more pro-active, more futuristic, more "Jetson"-like.
Congratulations to my coworker, Andy Colvin who just received his Oracle ACE!! Just one more immensely technically gifted individual who is very deserving of this award at Enkitec and makes me proud to be part of such a great group of folks!
Posted in Database
The REI Monorail Sling bag has the volume of a daypack and the features to meet all the challenges of the daily urban grind. Outdoor-inspired details make hauling the load comfortable and convenient.
Light, Sturdy, Roomy and Comfortable!!
Pros: Lightweight, Easy To Access Items, High Quality, Comfortable, Good Capacity, Easy To Carry, Stylish
Cons: No Zipper pocket in main
Best Uses: Weekend Trips
Describe Yourself: Modern, Stylish, Practical, Comfort-Oriented, Career
Travel Frequency: 6-10 Annual Trips
Primary use: Business
Was this a gift?: No
Going on both regional and international trips this year, I needed a bag to carry my ultrabook, tablet and accessories. This bag sits close to the body and is extremely comfortable. I have arthritis in my neck and shoulders, so for anyone needing extra comfort for long trips through airports, cities and towns- this is a great bag!!