Category: Oracle

August 10th, 2017 by dbakevlar

In my latest blog post on the Delphix site, I continue my discussion of why DevOps is the next step for DBAs and how DBAs can embrace this evolution.

This is an extensive series of blog posts, (four so far) to be followed by an ebook, a podcast and two webinars.  One webinar, from Oracle, will be announced soon and is called “The DBA Diaries”; the other, from Delphix, is titled “The Revolution:  From Databases and DevOps to DataOps”.

The goal of all of this is to ease the transition for the database community as the brutal shift to the cloud, now underway, changes our day-to-day lives.  Development continues to move at an ever-accelerating pace, and yet the DBA is standing still, waiting for the data to catch up with it all.  This is a concept many refer to as “data gravity”.

The concept was coined just a few years ago by Dave McCrory, a Senior VP of Platform Engineering.  It began as an open discussion aimed at understanding how data impacts the way technology changes when connected with network, software and compute.

He discusses the von Neumann bottleneck, the basic understanding that “the speed with which information can get from memory (where data is stored) to computing (where data is acted upon) is the limiting factor in computing speed.”

These are essential concepts that I believe all DBAs and developers should understand, as data gravity impacts all of us.  It's the reason for many enhancements to database, network and compute power.  It's the reason optimization specialists are in such demand.  Other roles such as backup, monitoring and error handling can be automated, but no matter how much logic we drive into programs, nothing is as good as true skill in optimization when it comes to eliminating data gravity issues.  Less data, less weight-  it's as simple as that.

We all know the cloud discussions are coming, and with them, even bigger challenges from data gravity.  Until then, let's just take a step back and recognize that we need some new goals and some new skills.  If you'd like to learn more about data gravity, but don't have time to take it all in at once, consider following the topic on Twitter, where it's curated by Dave McCrory.

I'm off to Jacksonville, FL tomorrow to speak at SQL Saturday #649!

Posted in Database, DBA Life, Delphix, Oracle, SQLServer

August 7th, 2017 by dbakevlar

It's finally time to upgrade our Linux Target!  OK, so when we're working with virtualization, we're not going to upgrade the way a DBA would normally upgrade a database server.

So far, we’ve completed:

  1. Updated our instances so that we'll have a GUI interface if we need one.
  2. Installed Oracle on the Linux Source and upgraded our dSource database to 12c.


Now we’re done with our Linux Source and onto our Linux Target.

Install and Configure VNC and Oracle

We'll run through the VNC Viewer installation and configuration just like we did in Part I and Part II.  We'll also install Oracle, but this time we've performed a software-only installation.

We'll install the Enterprise Edition and make sure to install it in the same path as we did on our Linux Source, (/u01/app/oracle/product/12.1/db_1).  We're not installing the multitenant option, as we didn't configure it on our source, either.

Once that is complete, it's time to get our VDBs upgraded.

The first thing to remember is that the VDBs are simply virtual images of our dSource, which is already UPGRADED.

Add the New Oracle Home to the Linux Target

Log into the Delphix Admin Console and click on Environments.

Click on the Linux Target, then click on the refresh button:

Click on the Databases tab and you'll see that the DB12c Oracle home is now present in the list:

Prep VDBs for switch to new home

Copy your environment profile from 11g.env to 12c.env, update the Oracle home to point to the new 12c home and save the file.
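From the command line, that copy is as simple as this sketch, (assuming the profiles live in the delphix user's home directory):

$ cp ~/11g.env ~/12c.env
$ vi ~/12c.env
# inside 12c.env, point ORACLE_HOME at the new 12c location:
export ORACLE_HOME=/u01/app/oracle/product/12.1/db_1
export PATH=$ORACLE_HOME/bin:$PATH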

Now I have three VDBs on this target:

[delphix@linuxtarget ~]$ ps -ef | grep pmon

delphix   7501     1  0 Jul12 ?        00:01:17 ora_pmon_devdb
delphix   8301     1  0 Jul06 ?        00:01:49 ora_pmon_VEmp6
delphix  16875     1  0 Jul05 ?        00:01:57 ora_pmon_qadb

Log into the Linux Target, and from the command line, set the environment, log into each database via SQL Plus and shut it down.

. 11g.env

export ORACLE_SID=VEmp6f
sqlplus / as sysdba
shutdown immediate;
exit;

and so on and so forth…. 🙂
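If you'd rather not step through each one by hand, a small loop does the same thing, (just a sketch-  adjust the SID list to match your own):

$ . 11g.env
$ for SID in devdb VEmp6f qadb; do
    export ORACLE_SID=$SID
    # shut each VDB down cleanly before the home switch
    echo "shutdown immediate;" | sqlplus -S / as sysdba
  done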

Copy all the init files for the VDBs from the 11g Oracle home's dbs directory over to the 12c Oracle home's dbs directory.
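Something like this sketch does it, (the 11g home path here is an assumption-  substitute your own):

$ cp /u01/app/oracle/product/11.2.0/db_1/dbs/init*.ora \
     /u01/app/oracle/product/12.1/db_1/dbs/     # copies each VDB's init.ora to the new home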

And this is where it all went wrong for two of the VDBs…

Back on the Delphix Admin Console, click on Manage –> Datasets

Click on each of the VDBs involved.  Click on Configuration –> Upgrade, (up arrow icon) and say yes to upgrading.  Update the Oracle home from 11g in the dropdown to the new 12c Oracle home and click the gold check mark to confirm.

OK, so this is what I WOULD have done for all three VDBs if I'd been working with JUST VDBs, but this is where it gets a bit interesting and I had to go off the beaten path for a solution.  I'm unsure if this is documented anywhere inside Delphix, (Delphix Support is awesome, so I'm sure they already know this) but for my own sanity, here's the situation and the answer.  The environment I am working on is built off of AWS AMIs that consist of Delphix containers.  Containers are very powerful, allowing you to “package” up a database, an application and other tiers of virtualized environments, offering the ability to manage it all as one piece.  This was the situation for the qadb and devdb instances.

Due to this, I couldn't run the upgrade inside the Delphix Admin console, since these two VDBs were part of Delphix “data pods.”  The following steps address this change.

Remove the Containers, (Subsequently the VDBs as Well!)

  1. Log into Delphix's Jet Stream.
  2. In the upper right hand corner, click on Usage Overview.
  3. Scroll down and click on Employee Application, (it's the template for the VDBs in question.)
  4. At the bottom of this page, you'll see the two containers that possess the VDBs.  To the right of each is a trash can icon for delete.  (The reason deleting is an option: I have a template built for this container, so it will be very simple to recreate a VDB and vFile to repopulate it-  a matter of minutes, max.)
  5. Delete the two containers that are controlling the administration of these two VDBs still pointing to the 11g home.

Create the New VDBs and Virtualized Application, (Vfiles)

Now, log into the Delphix Admin console.

  1. Provision two VDBs from the orcl source db, one as devdb and another as qadb, just as it was before, both on the Linux target.
  2. Provision two vfiles of the Web Application, one as QA_Web and the other as DEV_Web, port numbers 1080 and 2080, keeping all other defaults.
  3. Once completed, (a couple minutes, max) let's return to Jet Stream and create the containers that will house the system.

Create the New Containers From the Template

In Jet Stream

  1. Click on Data Management in the upper right hand corner.
  2. You will be brought to the Templates tab; click on the Employee Application template.
  3. Click on Add Container.
  4. Name:  “Dev 12c Container”, Owner: Dev; choose the devdb and the DEV_Web for the sources, then complete the container creation.
  5. Click again on Add Container.
  6. Name: “QA 12c Container”, Owner: QA; choose the qadb and the QA_Web for the sources, then complete the container creation.

This will take just a moment to finish creating and that’s all there is to it.  You now have two DB12c environments that are completely containerized and upgraded from their previous 11g state.

Our source shows we’re using our DB12c upgraded database:

And we can also see everything is upgraded and happy in our Delphix Administration Console.  

The complexity of the containers may have made my upgrade take a little longer, but the power of data pods is exceptional when we're managing our data, the database and the application as one piece anyway.  Shouldn't we treat them as one?

Posted in AWS, Delphix, Oracle

August 3rd, 2017 by dbakevlar

I've been asked what it takes to be a successful evangelist, and I've realized that pinning down what makes one successful at it is often like holding sand in your hands-  no matter how tightly you close your fists, it's difficult to contain the grains.

The term evangelist receives either very positive or very negative responses.  I'm not a fan of the term myself, but whether you use it or call them advocates, representatives or influencers, it doesn't matter-  they are essential to the business, product or technology that they become the voice for.

Who do I view as successful evangelists in the communities I'm part of?  There are a number of folks I admire as I interact with them and observe their contributions, (and I'm sure I've missed some) but these are a few that come to mind when I think of fellow evangelists.

What makes an evangelist successful?  It may not be what you think.

1. It’s Not Just About the Company

Most companies think they hire an evangelist to promote and market the company, and yet, when all you do is push out company info and company marketing, people STOP listening to you.  What you say, do and are interested in should drive people to want to know more about you, including the company you work for and what that company does.

All of these folks talk about interests outside of work.  They post about their lives and their interests, and contribute to their communities.  This is what it means to be really authentic and to set an example.  People want to be more like them because they see the value they add to the world beyond just talking points.

2.  They’re Authentic

Authenticity is something most find very elusive.  If you're just copying what another does, there's nothing authentic about that.  There's nothing wrong with finding a tip or tidbit someone else is using and adopting it, but it has to WORK for you.  I was part of a conversation just yesterday, where Jeff and I discussed that he doesn't use Buffer, (a social media scheduling tool) while I live by it.  It doesn't work for Jeff and there's nothing wrong with that.  We are individuals, and what makes us powerful evangelists is that we figured out what works for each of us.

3.  In the Know

As a technical evangelist, you can't just read the docs and think you're going to be received well.  Theory is not practice, and I've had a couple of disagreements with managers explaining why I needed to work with the product.  I've had to battle for hardware to build out what I've been expected to talk on; only once did I not fight for it, and I paid for it drastically.  I won't write on a topic unless I can test it out on my own.  Being in the trenches provides a point of view no document can provide.

Documentation is secondary to experience.

4.  Your View is Outward

This is a difficult one for most companies when they're trying to create evangelists from internal employees.  Those that may be deeply involved at the company level may interact well with others, but won't redirect to an external view.  I've had people ask me why my husband isn't doing as much as I am in the community.  Due to his position, he must be more internally focused and customer facing.  My job is very separate from my fellow employees'.  I must always be focused outward, interacting at least 95% of my time with the community.  You'll notice all of the folks listed are continually interacting with people outside of their company and are considered very “approachable.”

We volunteer our time in the community-  user groups, boards of directors, events and partnering with companies.  We socialize, as we know our network is essential to the companies we represent.

5.  We Promote

I wish I did more public promotion like I see from some of these other folks.  I'm like my parents-  I stand up for others and support them on initiatives and goals.  I do a lot of mentoring, but less of it when I'm blogging.  My mother was never about empty compliments and I take after her on this.  I'm just not very good at remembering to compliment people on social media and feel I lack in this area, but I continually watch others do this for folks in the community, and it's so important.

We make sure to work with those who may need introductions in our network or support in the community, and we reach out to offer our help.  In the public view, this is quite transparent, so when others pay this forward or return the favor, it can appear that people just bend over backwards for us, but we have often been there for the folks in question in the past, with no expectations, and people remembered this.

We do promote our company, but for the right reasons-  the company has done something good for the community or has something special going on.  Rarely do we push out pure marketing, as it just doesn't come across very well from us.  It's not authentic.

Additional Recommendations

  • Refrain from internet arguments, social media confrontations

I’m not saying to be a pushover.  I literally have friends muted and even blocked.  There’s nothing wrong with NOT being connected to individuals that have very different beliefs or social media behavior.  You shouldn’t take it personally– this is professional and you should treat it as such.

You may find, (especially for women and people of color) that certain individuals will challenge you on ridiculous topics and battle you on little details.  This is just the standard over-scrutinizing that we go through and if it’s not too bad, I tell people to just ignore it and not respond.  If it escalates, don’t hesitate to mute or block the person.  You’re not there to entertain them and by removing your contributions from their feed- “out of sight, out of mind”, offering peace to both of you… 🙂

  • Use automation tools and send out content that INTERESTS YOU.

Contribute what you want, limit what your company wants to a certain percentage of it, and be authentic.  Find your own niche and space and don't send out “noise.”

There are a ton of tools out there.  Test out Buffer, Hootsuite, Klout or SumAll to make social media contributions easier.  If you don't have a blog, create one and show what you're working on, and don't worry about the topic.  You'll be surprised-  if you just write about the challenges you're facing, how you've solved a problem you've come across, or a topic you couldn't find a solution to online, people will find value in your contributions.

  • Interact and be receptive of others

Have fun with social media and have real conversations.  People do appreciate honesty with respect.  Answer comments and questions on your blog.  Respond to questions on forums for your product and promote other people’s events and contributions.

When people approach you at an event or send you a direct message, try to engage with them and thank them for having the guts to come up and speak with you.  It’s not easy for most people to approach someone they don’t know.

  • Volunteer and Contribute

We used to be part of our community, and as our world has changed with technology, the term community has changed.  These communities wouldn't exist without the contributions of people.  Volunteer to help with user groups, events and forums.  Don't just volunteer to be on a board of directors and not do anything-  it's not something to just put on your CV and think you're contributing.  There is incredible power in the simple act of doing, so DO.  Provide value and ask how you can help.  Kent has been a board member, a volunteer and even a president of user groups.  Jeff has run content selections and events even though he's limited in what he's allowed to do as an Oracle employee, and Rie promotes information about every woman speaker at SQL Saturday events along with all she does to run the Atlanta SQL Saturday, (the largest one in the US!)  I won't even try to name all the different contributions that Grant is part of, including the new attendees event at PASS Summit, (Microsoft's version of Oracle Open World for my Oracle peeps!)

For those companies thinking, “I hired an evangelist, so I want them to be all about me and all invested in the company”-  if that's the requirement, you'll never have the successful evangelist who will be embraced by the community, able to promote your product/company in a powerful, grassroots way.  If their eyes are always looking inside, they will miss everything going on outside, and as we all know, technology moves fast.  Look away and you'll miss it.

Posted in DBA Rants, Delphix, Oracle, SQLServer

July 31st, 2017 by dbakevlar

This is Part III in a four-part series on how to:

  1.  Enable VNC Viewer access on Amazon EC2 hosts.
  2.  Install DB12c and upgrade a Dsource for Delphix from 11g to 12c, (12.1)
  3.  Update the Delphix Configuration to point to the newly upgraded 12c database and the new Oracle 12c home.
  4.  Install DB12c and upgrade target VDBs for Delphix residing on AWS to 12.1 from the newly upgraded source.

In Part II, we finished upgrading the Dsource database, but now we need to get it configured on the Delphix side.

Log into the Delphix Admin console as the Delphix_Admin user to make the changes required to recognize that the dSource is now DB12c and has a new Oracle home.  Go to Manage –> Environments.

Click on the Refresh button and let the system recognize the new Oracle Home for DB12c:

Once complete, you should see the 12.1 installation we performed on the Linux Source now listed in the Environments list.

Click on Manage –> Datasets and find the Dsource 11g database and click on it.

Click on the Configuration tab and click on the Upgrade icon, (a small up arrow in the upper right.)

Update to the new Oracle Home that will now be listed in the dropdown and scroll down to save.

Now click on the camera icon to take a snap sync to ensure everything is functioning properly.  This should only take a minute to complete.

The dSource is now updated in the Delphix Admin console and we can turn our attention to the Linux target and our VDBs that source from this host.  In Part IV we'll dig into the other half of the source/target configuration and how I upgraded Delphix environments with a few surprises!

Posted in AWS, Delphix, Oracle

July 26th, 2017 by dbakevlar

I'm finally getting back to upgrading the Linux Source for a POC I'm doing with some folks, picking up from where we left off in Part I.

Address Display Issue

Now that we have our VNC Viewer working on our Amazon host, the first thing we'll try is to run the Oracle installer, (unzipped location –> database –> runInstaller) but it's going to fail because we're missing the xdpyinfo file.  To verify this, you'll need to open up a terminal from Applications –> System Tools –> Terminal:

$ ls -l /usr/bin/xdpyinfo
 ls: /usr/bin/xdpyinfo: No such file or directory

We’ll need to install this with yum:

$ sudo yum -y install xorg-x11-utils

Once we’ve completed this, let’s verify our display:

$ echo $DISPLAY

:1.0 <– (no hostname means local; the first number is the display, just as in the ipaddress:display format of your VNC Viewer connection.)
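If the echo comes back empty, you can set it yourself before launching anything graphical, (a minimal sketch-  display 1 matches the VNC session we configured in Part I):

$ export DISPLAY=:1.0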

If it’s correct, you can test it by executing xclock:

$ xclock

The clock should appear on the screen if the display is set correctly.

Install Oracle 12c

Run the installer:

$ ./runInstaller

The installer will come up for Oracle 12c and you can choose to enter your information, but I chose to stay uninformed… 🙂  I chose to install AND upgrade the database to DB12c from 11g.

I also chose to ignore the warnings for swap and the few libraries by clicking Ignore All, and proceeded with the installation.

Root.sh and the Trace Analyzer

Once the installation of the new Oracle Home is complete, choose to run the root.sh script when prompted:

$ sudo /u01/app/oracle/product/12.1/db_1/root.sh

Overwrite all files when prompted by the script run.  It's up to you, but I chose to install the Oracle Trace File Analyzer so I can check it out at a later date.  You'll then be prompted to choose the database to upgrade.  We're going to upgrade our source database, ORCL in this example.

Upgrade Our Oracle dSource, (Database)

Choose to proceed forward with the upgrade on the database, but know that you'll require more space for the archive logs that are generated during the upgrade.  The check will tell you how much to add, but I'd add another 1GB on top of that to ensure you're prepared for the other steps we'll run as we go through the preparation.

Log into SQL Plus as SYSDBA to perform this step:

ALTER SYSTEM SET DB_RECOVERY_FILE_DEST_SIZE=8100M;
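If you'd like to verify where the limit sits and how much of it is in use, (before or after the change) a quick query against v$recovery_file_dest shows it:

SELECT name,
       space_limit/1024/1024 AS limit_mb,   -- current allocation in MB
       space_used/1024/1024  AS used_mb     -- space currently consumed
FROM   v$recovery_file_dest;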

Go through any warnings, but steps like stats collection and grants on triggers will have to be performed post-upgrade.

Drop the OLAP catalog:

$ sqlplus / as sysdba

@$ORACLE_HOME/olap/admin/catnoamd.sql

exit

Remove the OEM catalog for Enterprise Manager, first shutting down the console from the terminal:

$ emctl stop dbconsole

Copy the emremove.sql script from the 12c Oracle home's rdbms/admin directory and place it in the same location in the 11g home.
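A sketch of that copy, (the 12c path is from our earlier install-  adjust if yours differs, and note that $ORACLE_HOME still points to the 11g home at this stage):

$ cp /u01/app/oracle/product/12.1/db_1/rdbms/admin/emremove.sql \
     $ORACLE_HOME/rdbms/admin/

Then log into SQL Plus as SYSDBA: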

SET ECHO ON;

SET SERVEROUTPUT ON;

@$ORACLE_HOME/rdbms/admin/emremove.sql

Empty the recyclebin post these steps:

purge recyclebin;
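If you're connected as SYSDBA, the DBA-level variant clears every schema's recycle bin in one pass, which may be preferable before an upgrade, (just a suggested alternative):

PURGE DBA_RECYCLEBIN;   -- clears dropped objects for all users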

The assumption is that you have a backup prepared, or that you can use Flashback with the resources allocated, before proceeding forward with the upgrade.

Choose to upgrade the 11g listener, and choose to install EM Express if you'd like to have that for monitoring.  Make sure to keep the default checkboxes in the following window to update everything we need and to collect stats before the upgrade runs, ensuring it proceeds efficiently through all required objects.

Choose to proceed with the upgrade, and if you've followed these instructions, you should find a successful installation of DB12c and upgrade of the database.  Keep in mind, we're not going multi-tenant in this upgrade example, so if you were looking for those steps, the POC I'm building isn't going to take that on in this set of blog posts.

Post Upgrade Steps:

Update your environment variables, including copying the 11g.env to a new profile called 12c.env and updating the Oracle home.  Now set your environment and log into SQL Plus as SYSDBA on the upgraded database.

Update all the necessary dictionary and fixed stats:

EXECUTE DBMS_STATS.GATHER_DICTIONARY_STATS;

EXECUTE DBMS_STATS.GATHER_FIXED_OBJECTS_STATS;

Now, for our next post, we'll need to perform the same Amazon host and VNC Viewer configuration we did for the Source, and then install Oracle DB12c on our target server as we did in this post.  Then we'll discuss how to get all our Delphix VDBs, (virtual databases) upgraded to match our source in no time!

Posted in AWS, Delphix, Oracle

July 19th, 2017 by dbakevlar

There is a plethora of mishaps from the early space program that prove the need for DevOps, but one in particular, from fifty-five years ago this month, is often used as an example for them all.  This simple human error almost ended the whole American space program, and it serves as a strong example of why DevOps is essential as agile speeds up the development cycle.

The Mariner I space probe was a pivotal point in the space race between the United States and the Soviet Union.  It was the start of a grand expedition into a series of large, sophisticated, interplanetary missions, all to carry the Mariner moniker.  For this venture to launch, (pun intended) it was dependent on a huge new development project for a powerful booster rocket called the Atlas-Centaur.  The development program ran into so many testing failures that NASA ended up dropping the initial project and going with a less sophisticated booster to meet the release date, (i.e. features dropped from the project.)  The new probe designs were based off the previously used Ranger moon probes, so less testing was thought necessary, and the Atlas-Agena B booster was born, bringing the Mariner project down to a meager cost of $80 million.

The goal of the Mariner I was to perform an unmanned mission to Mars, Venus and Mercury.  It was equipped with solar cells on its wings to assist on the voyage, which was all new technology, but the booster, required to escape Earth's gravity, was an essential part of the project.  As the booster was based on older technology than many of the newer features, it wasn't given the same attention while testing was being performed.

On July 22nd, 1962, the Mariner I lifted off, but approximately four minutes in, it veered off course.  NASA made the fateful decision to terminate the spacecraft, destroying millions of dollars of equipment to ensure it didn't end up crashing on its own into populated areas.

As has already been well documented, the guidance system that was supposed to correct the Mariner I flight had a single typo in the entire coded program: a missing hyphen in the instructions required to adjust flight patterns.  Where it should have read “R-dot-bar sub-n”, it instead read “R-dot-bar sub n”.  This minor difference caused the program to over-correct for small velocity changes and create erratic steering commands to the spacecraft.

This missing hyphen caused a loss of millions of dollars in the space program and is considered the most expensive hyphen in history.

How does this feed into the DevOps scenario?  

Missing release dates for software can cost companies millions of dollars, but so can the smallest typos.  Code reuse and programming automation, along with proper policies, process and collaboration throughout the development cycle, ensure that code isn't just well written, but, even in these shortened development cycles, fully reviewed and tested before it's released.  When releases are done in smaller increments, a feedback loop ensures that errors are caught early and guaranteed not to go into production.

Posted in devops, Oracle, SQLServer

July 13th, 2017 by dbakevlar

For a POC that I'm working on with the DBVisit guys, I needed a quick 12c environment to work on and have at our disposal as required.  I knew I could build out an 11g one in about 10 minutes with our trusty free trial, but I would then need to upgrade it to 12c.

Disable snapshots to Delphix Engine

This is a simple prerequisite before you upgrade an Oracle source database; it takes pressure off the system and avoids confusion while the database and its Oracle home are upgraded.

Simply log into the Delphix Admin console and click on the source group that the source database belongs to.  Under Configuration, on the right hand side, you'll see a slider that needs to be moved to the “disabled” position to stop taking interval snapshots.

Configure GUI for Simplified Oracle Installation

EC2 doesn't come with a GUI interface by default, so we just need to install one on the host to make life a little easier for the upgrade:

  •  Check for updates:
[delphix@linuxsource database]$ sudo yum update -y

….

  xfsprogs.x86_64 0:3.1.1-20.el6                                                
  yum.noarch 0:3.2.29-81.el6.centos                                             
  yum-plugin-fastestmirror.noarch 0:1.1.30-40.el6                               
Replaced:
  python2-boto.noarch 0:2.41.0-1.el6                                            
Complete!
  • Install the desktop:
[delphix@linuxsource database]$ sudo yum groupinstall -y "Desktop"

  xorg-x11-xkb-utils.x86_64 0:7.7-12.el6                                        
  xulrunner.x86_64 0:17.0.10-1.el6.centos                                       
  zenity.x86_64 0:2.28.0-1.el6                                                  
Complete!
  • Install dependencies, like needed fonts, along with the VNC server itself:
[delphix@linuxsource database]$ sudo yum install -y pixman pixman-devel libXfont
[delphix@linuxsource database]$ sudo yum -y install tigervnc-server

Each of the above should complete successfully.

  • Set the password for the VNC:
[delphix@linuxsource database]$ vncpasswd
Password:
Verify:
  • Restart the SSHD Service:

[delphix@linuxsource database]$ sudo service sshd restart
Stopping sshd:                                             [  OK  ]
Starting sshd:                                             [  OK  ]
  • Configure VNC Server properties with SUDO privs:
[delphix@linuxsource database]$ sudo vi /etc/sysconfig/vncservers
VNCSERVERS="1:delphix"
VNCSERVERARGS[1]="-geometry 1280x1024"

Save and exit the vncservers configuration file.

  • Start VNC Server:
[delphix@linuxsource database]$ sudo service vncserver start

….

Log file is /home/delphix/.vnc/linuxsource.delphix.local:1.log
                                                           [  OK  ]
  • I’m the only one who will be accessing this host to perform these types of tasks, so I’ll use port 5901 and add a firewall rule:
[delphix@linuxsource database]$ sudo iptables -A INPUT -m state --state NEW -m tcp -p tcp --dport 5901 -j ACCEPT
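If you want that rule to survive a reboot, (assuming the sysvinit-style iptables service present on this host) save the running rules as well:

[delphix@linuxsource database]$ sudo service iptables save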

You can now use the VNC Viewer to access the GUI for the Linux Source and install/upgrade Oracle.  I'm assuming you already have VNC Viewer, but if you don't, download it and do a quick install.  Keep in mind, to install Oracle via the GUI on the Linux Target, I'll need to perform these steps on that target, too.

Let's check and verify that we can get to the Linux Source desktop.  Configure a new connection in the VNC Viewer, remembering to use the public IP with “:1” for the display of the user you wish to log in as.  Save and log into the Linux Source.

In the next post, I’ll update the Linux Source Oracle database and we’ll proceed with upgrading Delphix source and target databases on Amazon.

Posted in Oracle

July 3rd, 2017 by dbakevlar

Database Administrators, (DBAs) through their own self-promotion, will tell you they're the smartest people in the room, and being such, will avoid buzzwords that create cataclysmic shifts in technology, as DevOps has.  One of our main roles is to maintain consistent availability, which is always threatened by change, and DevOps opposes this with its focus on methodologies like agile, continuous delivery and lean development.

Residing a step or more behind the bleeding edge has never fazed the DBA.  We were the cool kids by being retro, those refusing to fall for the latest trend or the coolest new feature, knowing that with the bleeding edge comes risk, and that a DBA who takes risks is a DBA out of work.  So we put up the barricades and refused the radical claims and cultural shift of DevOps.

As I travel to multiple events focused on the numerous platforms the database is crucial to, I'm faced with peers frustrated with DevOps and considerable conversation dedicated to how it's the end of the database administrator.  It may be my imagination, but I've been hearing this same story with the blame assigned elsewhere-  either it's agile, DevOps, the cloud or even the latest release of the actual database platform.  The story's the same-  the end of the Database Administrator.

The most alarming, and obvious, point is that in each of these scenarios, the Database Administrator ended up more of a focal point than they were when it all began.  When it comes to DevOps, the specific challenges of the goal needed the DBA more than any of these storylines allow.  As development hurtled at top speed to deliver what the business required, the DBA, and operations as a whole, delivered the security, the stability and the methodologies to build automation at a level the other groups simply never needed previously.

Powerful DBAs, with skills not just in scripting but in efficiency and logic, were able to take complicated, multi-tier environments and break them down into strategies that could be easily adopted.  Having overcome the challenges of the database being central to, and blamed for, everything in the IT environment, they were able to dissect and build out complex management and monitoring of end-to-end DevOps.  As essential as system, network and server administration was to the operations group, the Database Administrator possessed the advanced skills required-  a hybrid of the developer and the operations personnel that makes them a natural fit for DevOps.

The truth is, the DBA is not ruined by DevOps, but the role is revolutionized.

Thanks to Alex Tatiyants for this awesome post from 2012, which resonates so well with the DBAs I speak to every day, even in 2017.

Posted in DBA Life, devops, Oracle

June 28th, 2017 by dbakevlar

I've been at KSCOPE 2017 all week and it's been a busy schedule, even with only two sessions.  It's just one of those conferences that has so much going on all the time that the days just speed by at 140MPH.

As with most major conferences, KSCOPE abstract submission was about 9 months ago.  That was back when I was just coming to grips with how much Delphix could solve in the IT environment, and it may have been one of the first abstracts I submitted as a Delphix employee.  I wasn't too thrilled with my choice, (and I thank KSCOPE for still accepting me as a speaker) so the end product I presented was very different from what I submitted.

I was in the DBA track, but after arriving, I started to push out on my network that I was building a developer's perspective into my presentation with our development/tester interface, Jet Stream.  One of the challenges I experienced as a developer, as a release manager, and across my many years as a database administrator, was releasing to multiple tiers.  This complexity has become an increasing pain point with the introduction of agile methodologies.

The demonstration example was an agile development team made up of a web developer, a tester and a database administrator.  They were to bulk load new employee data, saving the business from having to manually enter it, and ensure that the employee IDs were numbered sequentially.

I proceeded to build out such an environment in the cloud this week.  It consisted of our common employee web application, an Oracle database to store the employee information, some structured flat files of employee information and some scripts to load this information from the files.  These were all then used to create development and test environments, with containers created to track changes via Jet Stream as each new step of the development and testing occurred.

In newer agile development shops, the DBA may also be an integral part of the development team, and this scenario demonstrated how the way we solve problems in the database can conflict with how an application was designed to function, causing downtime.  With a container of a Virtual Database, (VDB) and Virtual File Directories, (vFiles)-  virtualized development environments that are complete read/write copies to use for development and test-  we were able to use Jet Stream to version control not just the code changes, but the data changes, too.

In my demo, I showed how the bulk load process designed by the DBA created a sequence and trigger on the employees table to populate it with data from the structured flat file, then proceeded to load it.  The data loaded without issue and the employee ID was now sequential-  requirement solved and job complete.  It was then simple to create a bookmark in the timeflow, noting what had been done in that iteration.

The problem was that, post bulk load, the change actually broke the application, and no new employees could be added through the interface.  We proved this by attempting to add an employee in the application and then querying the database to verify that it wasn't just the application failing to display the new employee addition.

I was able to demonstrate in Jet Stream that, instead of using rollback scripts and backing out files to previous versions, I could quickly “rewind” to the “Before 2.7 release” bookmark, and all tiers of the container were reset to before the data load, saving considerable time and resources for the agile team.

If you’d like to learn more about Jet Stream, Templates, Containers or how this can provide incredible value in our hectic agile DBA and development lives, check out the following links:

Delphix Jet Stream PDF

Valuable Jet Stream Concepts

Jet Stream Container Overview

Posted in Delphix, Oracle

June 19th, 2017 by dbakevlar

So you're going to see a lot of posts from me in the coming months on topics shared by Oracle and SQL Server.  These posts offer me the opportunity to re-engage with my Oracle roots and will focus on enhancing my SQL Server knowledge of the 2014 and 2016, (and soon enough, 2017) features, which I'm behind on.

I'm going to jump right in with both feet on the topic of hints.  The official, (and generic) definition of a SQL hint is:

“A hint is an addition to a SQL statement that instructs the database engine on how to execute the statement.”

Hints are most often discussed in the context of queries, but they can influence the performance of inserts, updates and deletes, too.  What you'll find is that the actual terminology for hints is pretty much the same in Oracle and SQL Server, but the syntax is different.
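As a quick illustration on the DML side, here's a hinted UPDATE in Oracle syntax-  just a sketch against the HR-style employees table used later in this post, with the emp_emp_id_pk index name assumed from that example:

UPDATE /*+ INDEX(e emp_emp_id_pk) */ employees e   -- directs the optimizer to the PK index
SET    e.salary = e.salary * 1.05
WHERE  e.employee_id = 105;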

The Optimizer and Oracle

Oracle hints were quite common during the infancy of the Oracle Cost Based Optimizer, (CBO).  It could be frustrating for a database administrator accustomed to the Rule Based Optimizer, (rules, people!  If there's an index, use it!) to give up control of performance to a feature that simply wasn't taking the shortest route to the results.  As time passed from Oracle 9i to 10g, we harnessed hints less, trusting the CBO, and by Oracle 11g, hinting started to be frowned upon unless you had a very strong use case.  I was in the latter scenario, as my first Oracle 11g database environment required not just new data, but a new database weekly, and a requirement for me to guarantee performance.  I knew pretty much every optimal plan for every SQL statement in the systems, and it was my responsibility to make sure each new database chose the most optimal plan.  I had incorporated complex hints, (and then profiles as we upgraded…)

With the introduction of Oracle 12c, using hints effectively became a sought-after skill again, as many new optimizer features, (often with the words “dynamic” or “automated” in them) started to impact performance beyond what was acceptable.

SQL Server’s Query Optimizer

SQL Server's optimizer took a big jump in features and functionality in SQL Server 2014.  With the introduction of SQL Server 2016, we started to see a new era of SQL Server performance experts who moved even further into optimization expertise, not only collecting performance data via dynamic management views/functions, (DMVs/DMFs) but also influencing the SQL Server Query Optimizer to make intelligent decisions with advanced statistics features and elevated hinting.

Hints have a more convoluted history in the SQL Server world than in the Oracle one.  I have to send some love and attention to Kendra Little after I found this cartoon she drew in regards to her frustration with the use of ROWLOCK hints:

After reading this, my plan is still to go deeper into a number of areas of performance, including the optimizers, but today, we'll just stick to a high-level look at the differences in hinting in queries.

Hints

In our examples, we'll focus on forcing the use of a HASH join instead of a nested loop, using an index on a specific table, and forcing a MERGE join.  Let's say we want to use a hash join on the Employees table and a merge join on the Job_history table.  We also want to make sure the primary key is used for one of the employee ID joins, as a less optimal index would otherwise be chosen for its lower cost, even though its performance isn't as good due to concurrency.

The query would look like the following in Oracle:

SELECT   /*+ LEADING(e2 e1) USE_HASH(e1) INDEX(e1 emp_emp_id_pk) 
           USE_MERGE(j) FULL(j) */
         e1.first_name, e1.last_name, j.job_id, sum(e2.salary) total_sal
FROM     employees e1, employees e2, job_history j
WHERE    e1.employee_id = e2.manager_id
AND      e1.employee_id = j.employee_id
AND      e1.hire_date = j.start_date
GROUP BY e1.first_name, e1.last_name, j.job_id
ORDER BY total_sal;

If there were a subquery as part of this statement, we could add a second set of hints for it, as each query block supports its own hints immediately after the word SELECT.
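As a quick sketch of that, (same tables as above, with a hypothetical business question) note the second hint block immediately after the subquery's SELECT:

SELECT /*+ FULL(e1) */ e1.first_name, e1.last_name
FROM   employees e1
WHERE  e1.salary > (SELECT /*+ FULL(e2) */ AVG(e2.salary)   -- subquery carries its own hint
                    FROM   employees e2);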

If we were to take the same original statement into SQL Server, the hints would look a bit different.  Yeah, the following is about as close as I could get to “apples to apples” in hints and in TSQL, so please forgive me if it ain't as pretty as I would have preferred it to be:

SELECT   e1.Name, j.JobID, SUM(e2.Salary) Total_Salary
FROM     Employees AS e1
         INNER MERGE JOIN Job_History AS j
           ON  e1.EmployeeID = j.EmployeeID
           AND e1.HireDate = j.StartDate
         LEFT OUTER HASH JOIN Employees AS e2
           WITH (FORCESEEK (emp_emp_id_pk (EmployeeID)))
           ON  e1.EmployeeID = e2.ManagerID
GROUP BY e1.Name, j.JobID
ORDER BY Total_Salary;

In a TSQL statement, each hint is placed at the object it references.  The hints are written-out commands, (vs. the special hint syntax required in Oracle) such as the join hints on each join and the FORCESEEK on the primary key for Employees.

As you can see, Oracle signals a hint when it's placed between /*+ and */.  Each approach requires some syntax and advanced performance knowledge, but all in all, the goal is the same-  influence the optimizer to perform in a specific way and [hopefully] choose the optimal plan.

Please let me know if you'd like to see more in this series, either by sending me an email to dbakevlar at Gmail or by commenting on this post.  Now I'm off to start preparing for KSCOPE 2017-  someone explain to me how it's already the end of June!! 🙂

Posted in Oracle, SQLServer

May 26th, 2017 by dbakevlar

I’m in sunny Phoenix this week at the Data Platforms 2017 Conference and looking forward to a break in the heat when I return to Colorado this evening.

As this event is for big data, I expected to present on how big data could benefit from virtualization, but was surprised to find that I didn't have a lot of luck finding customers utilizing us for this reason, (yet).  As I've discussed in previous presentations, I was aware of what a “Swiss Army knife” virtualization is, resolving numerous issues across a myriad of environments, yet often going unrecognized.

The Use Case

To find my use case, I went out to the web and found a great article, “The Case for Flat Files in Big Data Projects”, by Time.com interactive graphics editor Chris Wilson.  The discussion surrounds the use of data created as part of the ACA and used for another article, “How Much Money Does Your Doctor Get From Medical Companies”.  The data in the article's interactive graphs is publicly available from cms.gov, and Chris discusses the challenges it created and how they overcame them.

Upon reading it, there was a clear explanation of why Chris' team did what they did to consume the data in a way that complemented their skill set.  It will resonate with anyone who works in an IT shop and knows how we end up with technical choices that we're left to justify later on.  While observing conversations at the conference this week, I lost count of how often I noticed that there wasn't a “Hadoop” shop or a “Hive” shop-  everyone had a group of varied solutions that made up their environment, and if you don't have one of them yet, don't count it out:  Pig, Python, Kafka or others could show up tomorrow.

This results in a more open and accepting technical landscape, in which I, a “legacy data store” technologist, was welcome.  When I explained my upcoming talk and my experience, no one turned up their nose at any technology I admitted to having an interest in.

Along with the use case found online came the data itself.  As part of the policies in the ACA, you can access all of this valuable data from the cms.gov site, (the Centers for Medicare and Medicaid Services) and it can offer incredible insight into these valuable programs.  The Time article only focuses on the payments to doctors from medical companies, but the data that is collected, separated out by area and then zipped up by year, is quite extensive.  As noted by a third article, as anticipated as the data was, it was cumbersome and difficult to use.

The Requirements

I proceeded to take this use case and imagine it as part of an agile environment, with this data as part of a pipeline producing revenue by providing information to the consumer.  What would be required, and how could virtualization not only enhance what Chris Wilson's team had built, but benefit the entire pipeline through Delphix's “Swiss Army knife” approach?

  1.  I can’t assume this is the main data store.  These flat files are a supplement to legacy data stores.
  2. There would be a standard development environment-  development and testing would need their own environments, not just a production copy of these files, applications, etc.
  3. If it’s providing data to a consumer and data is in perpetual motion in the age of the internet, an agile development method would need to be in place, which means a short development cycle with many, small, “scrum like” development groups from different departments working on tasks.
  4. Automation and seamless deployment would assist in less human intervention and resource demands, along with more successful deployments.

Solution

There were four areas I focused on to solve and eliminate bottlenecks that I either experienced or foresaw an organization experiencing with this data as part of their environment.

  1. Eliminate the need for multiple copies of the files and the slow, manual process of propagating them to targets for development, test, etc.  With Delphix's vFile option, this includes any applications or other non-relational tiers in the scenario.
  2. Eliminate any legacy data store copies and refreshes that the big data environment depended on, and create VDBs for all development, test and reporting.
  3. Protect all non-production environments by masking non-production databases and flat files.
  4. Containerize environments for easy deployment, delivery, testing and cloud migrations.

vFiles

Each of the files was just over 500M compressed, and 15-18G uncompressed.  Each took over 4 minutes to transfer to a host and could add up to considerable space.

I used Delphix vFiles to virtualize the files.  This means that there is a single “gold copy” of the files at the Delphix engine, and an NFS mount that “projects” the file access to each target, which can serve unique copies to as many development, test and reporting environments as needed.

Fig. 1- Creating vFiles from dSource that flat files are sourced on.

If a refresh is required, the parent is refreshed, and an automated refresh of all the “children” can be performed.  Changes can be made at the child level, and if catastrophic, Delphix allows for easy recovery, providing data version control, not just code version control, throughout the development and testing process.

Fig. 2-  Demonstration of a target vFile, showing the NFS mount and the files available, (created in less than 10 seconds) and how easily it can be disabled, proving the files are a “projection.”

It's a pretty cool feature and one that is very valuable in the big data arena.  I heard countless stories of how often, due to lack of storage, data scientists and developers were testing against subsets of data, only to find once in production that their code wouldn't complete, or would fail, when presented with the full data.  Having the FULL files without taking up more space for multiple environments would be incredibly beneficial and shorten development cycles.

Virtualize

Most big data shops are still dependent on legacy data stores.  These are legendary roadblocks due to their size, complexity and refresh demands.  I proposed that those be virtualized, so each developer could have a copy and an instant refresh without storage demands-  again easing development deadline pressures and allowing full access to the data for development success.

Protect

Most people know we mask relational databases, but did you know we have Agile Data Masking for flat files?  If these files are going to be pushed to non-production systems, especially with as much as we're now hearing in the US about the EU's GDPR, (General Data Protection Regulation) shouldn't we mask outside of the database, too?

What kind of files can be masked?

  • Multi-record
  • CSV
  • XML
  • Word
  • Excel
  • PowerPoint
  • Unstructured
  • EDI

That's a pretty considerable and cool list.  The ability to go in and mask data in flat files is a HUGE benefit to big data folks.  Many of them were looking at file security only at the permissions and encryption level, so the ability to render the data itself benign is a fantastic improvement.

Containerize

The last step is simple recognition that big data is complex and consists of a ton of moving parts.  We should expect much of it to be home built or open source, consisting of legacy data stores, flat files, applications and other dependent tiers.

Fig. 3- A Container, created on-prem, then moved to the cloud and to as many environments as required for the development cycle to meet the business needs.

Delphix has the ability to create templates of our sources, (aka dSources) which is nothing more than creating a container.  In my use case enhancement, I took all of these legacy data stores, applications, (including any Ajax code) and flat files, and created a template from them for simple refreshes and deployments via Jenkins, Chef jobs or other DevOps automation.  The ability to then take these templates and deploy them to the cloud would make a migration from on-prem to the cloud, or from one cloud vendor to another, a simpler process.

Fig. 4- A look at the full scenario-  Delphix engines masking files, databases, creating containers and deploying it all on-prem and to the cloud.

The end story is that this use case could be any big data shop or start up in the world today.  So many of these companies are hindered by data and Delphix virtualization could easily let their data move at the speed of business.

I want to thank Data Platforms 2017 and all the people who were so receptive to my talk.  If you'd like access to the slide deck, it's been uploaded to Slideshare.  I had a great time in Phoenix and hope I can come back soon!

Posted in big data, Cloud, Data Masking, Oracle

May 10th, 2017 by dbakevlar

I love questions-  They give me something to write about that I don’t have to come up with from my experience or challenges…:)

So in my last post, Paul asked:

I am not sure what happens to the other changes which happened while the release was happening? Presumably they are also lost? Presumably the database has to go down while the data is reverted?

The Setup

In our scenario to answer this question, I’m going to perform the following on the VEmp_826 virtualized database:

  1. Add a table
  2. Add an index
  3. Include transactions, both inserts and deletes
  4. Rewind the database using the Admin Console

As I’m about to make these changes to my database, I take a snapshot which is then displayed in the Delphix Admin console using the “Camera” icon in the Configuration pane.

Note the SCN Range listed on each of them.  Those are the SCNs available in that snapshot, and the timestamp is listed as well.

Now I log into my target host that the VDB resides on.  Even though this is hosted on AWS, it really is no different for me than logging into any remote box.  I set my environment and log in as the schema owner to perform the tasks we’ve listed above.

Create Table

So we’ll create a table, index and some support objects for our test:

CREATE TABLE REW_TST
(
C1 NUMBER NOT NULL
,C2 VARCHAR2(255)
,CDATE TIMESTAMP
);

CREATE INDEX PK_INX_RT ON REW_TST (C1);
CREATE SEQUENCE RT_SEQ START WITH 1;

CREATE OR REPLACE TRIGGER RW_BIR
BEFORE INSERT ON REW_TST
FOR EACH ROW
BEGIN
  SELECT RT_SEQ.NEXTVAL
  INTO   :new.C1
  FROM   DUAL;
END;
/

Now that it’s all created, we’ll take ANOTHER snapshot.

Add Data to Kinder Table

This snapshot takes just a couple seconds, lands about 10 minutes after our first one, and contains the changes made to the database since we took the initial snapshot.

Now I'll add a few rows in another transaction to the KINDER_TBL from yesterday:

INSERT INTO KINDER_TBL
VALUES (1,dbms_random.string('A', 200), SYSDATE);
INSERT INTO KINDER_TBL
VALUES (2,dbms_random.string('B', 200), SYSDATE);
INSERT INTO KINDER_TBL
VALUES (3,dbms_random.string('C', 200), SYSDATE);
INSERT INTO KINDER_TBL
VALUES (4,dbms_random.string('D', 200), SYSDATE);
INSERT INTO KINDER_TBL
VALUES (5,dbms_random.string('E', 200), SYSDATE);
INSERT INTO KINDER_TBL
VALUES (6, dbms_random.string('F', 200), SYSDATE);
INSERT INTO  KINDER_TBL
VALUES (7,dbms_random.string('G', 200), SYSDATE);
COMMIT;

We’ll take another snapshot:

Add Data to the New Table

Now let’s add a ton of rows to the new table we’ve created:

BEGIN
  FOR ids IN 1..10000
  LOOP
    INSERT INTO REW_TST (C2)
    VALUES (dbms_random.string('X', 200));
    COMMIT;
  END LOOP;
END;
/

And take another snapshot.

Now that I have all of my snapshots for the critical times in the release change, there is a secondary option available.

Snapshots at the DBA Level

As I pointed out earlier, there is a range of SCNs in each snapshot.  Notice that I can now provision by time or SCN from the Admin Console:

So I could easily go back to the beginning or ending SCN of any snapshot, OR I could click on the up/down arrows, or type the exact SCN I want to pinpoint for the recovery.  Once I've decided on the correct SCN to recover to, I click on Refresh VDB and it will go to that SCN-  just like doing a recovery via RMAN, but without having to type out the commands and the time constraints, making this an incredibly quick recovery.
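For comparison, the manual RMAN equivalent of that point-in-SCN recovery would look something like this sketch, (using the SCN we'll see in the VDB's alert log later in this post):

RMAN> run {
        set until scn 2258846;    # recover only up to this SCN
        restore database;
        recover database;
      }
RMAN> alter database open resetlogs;

That's several commands, a restore from backup and an OPEN RESETLOGS; the rewind does the equivalent with a couple of clicks.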

Notice that I can go back to any of my snapshots, too.  For the purpose of this test, we’re going to go back to just before I added the data to the new table by clicking on the Selected SCN and clicking Rewind VDB.

Note that this is now the final snapshot shown in the system; the one taken after we inserted the 10K rows into REW_TST is no longer displayed.

If we look at our SQL*Plus connection to the database, we’ll see that it’s no longer connected from our 10K row check on our new table:

And if I connect back in, what do I have for my data in the tables I worked on?

Pssst-  there are fourteen rows instead of seven because I inserted another 7 yesterday when I created the table… 🙂

I think that answers a lot of the questions posed by Paul, but I’m going to jump in a little deeper on one-

Database Outage During a Rewind

Yes, the database did experience an outage as the VDB was put back to the point in time or SCN requested via the Delphix user interface or command line interface.  You can see this by querying the database:

SQL> select to_char(startup_time,'DD-MM-YYYY HH24:MI:SS') startup_time
 from v$instance;
STARTUP_TIME
-------------------
10-05-2017 12:11:49

The entire database is restored back to this time.  The dSource, (the database the VDB is sourced from) along with the Delphix Engine that keeps track of everything going on in all VDBs, has pushed the database back, yet the snapshots from before this time still exist, (tracked by the Delphix Engine.)

If you view the alert log for the VDB, you’ll also see the tail of the recovery steps, including our requested SCN:

alter database recover datafile list clear
Completed: alter database recover datafile list clear
alter database recover datafile list
 1 , 2 , 3 , 4
Completed: alter database recover datafile list
alter database recover if needed
 start until change 2258846
Media Recovery Start
 started logmerger process
Parallel Media Recovery started with 2 slaves
Wed May 10 12:11:05 2017
Recovery of Online Redo Log: Thread 1 Group 3 Seq 81 Reading mem 0
  Mem# 0: /mnt/provision/VEmp_826/datafile/VEMP_826/onlinelog/o1_mf_3_dk3r6nho_.log
Incomplete recovery applied all redo ever generated.
Recovery completed through change 2258846 time 05/10/2017 11:55:30
Media Recovery Complete (VEmp826)

Any other changes around the change you’re tracking are impacted by a rewind, so if there are two developers working on the same database, they could impact each other.  But with the small footprint of a VDB, why wouldn’t you just give them each their own VDB and merge the changes at the end of the development cycle?  The glorious reason for adopting virtualization technology is the ability to work in two-week sprints and be more agile than our older, waterfall methods, which are laden with problems.

Let me know if you have any more questions-  I live for questions that offer me some incentive to go look at what’s going on under the covers!

Posted in Database, Delphix, Oracle Tagged with:

April 28th, 2017 by dbakevlar

I did a couple great sessions yesterday for the awesome Dallas Oracle User Group, (DOUG.)  It was the first time I did my thought leadership piece on Making Sense of the Cloud and it was a great talk, with some incredible questions from the DOUG attendees!

This points me to a great [older] post on things IT can do to help guarantee tech projects are more successful.  DevOps is a standard in most modern IT shops, and DBAs are expected to find ways to be part of this valuable solution.  If you inspect the graph displaying the ROI of different project types vs. how often those projects run over budget and time, the results may be surprising.

Where non-software projects are concerned, the project rarely runs over schedule, but in the way of benefits, often comes up short.  When we’re dealing with software, 33% of projects run over time, but the ROI is exceedingly high and worthy of the investment.  You have to wonder how much of that over-allocation in time feeds into the percentage increase in cost.  If this could be deterred, think about how much more valuable these projects would become.

The natural life of a database is growth.  Very few databases stay a consistent size-  as companies prosper, critical data requires a secure storage location, and a logical structure for reporting on that data is necessary for the company’s future.  This is where relational databases come in, and they can become both the blessing and the burden of any venture.  Database administrators are both respected and despised for their necessity in managing the database environment, as the health of the database is an important part of the IT infrastructure and, with the move to the cloud, a crucial part of any viable cloud migration project.

How much of the time, money and delay shown in those projects is due to the sheer size and complexity of the database tier?  Our source data shows how often companies just aren’t able to hold it together due to lacking skills, inaccurate time estimates and other unknowns that come back to bite us.

I can’t stress enough why virtualization is key to removing a ton of the overhead, time and money that ends up going into software projects that include a database.

Virtualizing non-production databases results in:

  1. Ability to deliver full copies of production for developers without extensive demands on storage.
  2. Ability to deliver those databases in a matter of minutes vs. days or weeks.
  3. Ability to refresh databases as needed for any project.
  4. Self-service user-interface so developers and testers can recover from a catastrophic issue in a database without having to grovel to a DBA to restore a virtual database.
  5. Ability to branch the VDB and do versioning, which is awesome for both developers and testers, (I know, we DBAs care very little about this feature… :))
  6. In migrations/cloud migrations, the ability to migrate databases in short periods of time and to limit the storage footprint, saving the company the money the cloud promised to deliver-  savings that most are finding out, in the long run, don’t occur with traditional database scenarios.

It’s definitely something to think about and if you don’t believe me, test it yourself with a free trial!  Not enough people are embracing virtualization and it takes so much of the headache out of RDBMS management.

Posted in AWS, Azure, Cloud, Oracle, SQLServer Tagged with: , ,

April 18th, 2017 by dbakevlar

For over a year I’ve been researching cloud migration best practices.  Consistently, there was one red flag that tripped me up as I reviewed recommended migration paths.  No matter what you read, just about all of them include the following high level steps:

As we can see from above, the scope of the project is identified, requirements laid out and a project team is allocated.

The next step in the project is to choose one or more clouds, then choose the first environments to test out in the cloud, along with addressing security concerns and application limitations.  DBAs are tested repeatedly as they try to keep up with the demand of refreshing the cloud environments and ensuring they stay in sync with on-prem, and the cycle continues until a cutover date is issued.  The migration go or no-go occurs, and then either the non-production environments or the entire environment is migrated to the cloud.

As someone who works for Delphix, I focus on the point of failure where DBAs can’t keep up with full clones and data refreshes in cloud migrations, or where development and testing aren’t able to complete the steps they could if the company were using virtualization.  From a security standpoint, I’m concerned with how few companies are investing in masking, given the sheer quantity of breaches in the news, but as a DBA, there’s a whole different scenario that really makes me question the steps many companies are using to migrate to the cloud.

Now here’s where they lose me every time-  the last step in most cloud migration plans is to optimize.

I’m troubled by optimization being viewed as the step you take AFTER you migrate to the cloud.  Yes, I believe there will undoubtedly be unknowns that no one can take into consideration before the physical migration to a cloud environment, but to take databases “as is”, when an abundance of performance data is already known about the database that could and will impact performance, seems to be inviting unwarranted risk and business impact.

So here’s my question to those investing in a cloud migration or have already migrated to the cloud-  Did you streamline and optimize your database/applications BEFORE migrating to the cloud or AFTER?


Posted in AWS, Azure, Oracle, SQLServer Tagged with: , ,

April 13th, 2017 by dbakevlar

I was in a COE, (Center of Excellence) meeting yesterday and someone asked me, “Kellyn, is your blog correct?  Are you really speaking at a Blockchain event??”  Yeah, I’m all over the technical map these days and you know what?

I love the variety of technology, the diversity of attendance and the differences in how the conferences are managed.  Now that last one might seem odd, and you might think they’d all be similar, but it’s surprising how different they really are.

Getting to Know You

Today I’m going to talk about an aspect of conferences that’s very near to my heart, which is networking via events.  For women in technology, there are some unique challenges when it comes to networking.  Men have concerns about approaching women to network, such as fear of accusations of inappropriate interaction, and women face the challenge that a lot of networking opportunities occur outside of the workplace, in social situations we may not be comfortable in.  No matter who you are, no matter what your intentions, there’s a lot of wariness, and in the end, women often just lose out when it comes to building their network.  I’ve been able to breach this pretty successfully, but I have seen where it’s backfired and have found myself on more than one occasion defending both genders who’ve ended up on the losing side of the situation.

With that said, conferences and other professional events can assist with helping us geeks build our networks and it’s not all about networking events.  I noticed a while back that the SQL Server community appeared to be more networked among their members.  I believe part of this is due to the long history of their event software and some of its features.

Using the SQL Pass website, specifically the local user group event management software, notice that it’s all centralized.  Unlike the significantly independent Oracle user groups, SQL Server user groups are able to use a centralized repository for their event management, speaker portal, scheduling, etc.  That’s not to say that there aren’t events outside of Pass Summit and SQL Saturdays-  there are actually a ton-  but this is the portal for the regional user groups, creating the spoke that bridges out to the larger community.

Centralized System

Outside of being able to submit my abstract proposals to as many SQL Saturdays worldwide as I like from one portal, I can also maintain one speaker biography, information about my blog, Twitter, Linkedin and other social media in this one location.

The second benefit of this simplicity is that these biographies and profiles “feed” the conference schedules and event sites.  You have a central location for management, but hundreds of event sites where different members can connect.  After abstracts have been approved and the schedule built, I can easily go into an event’s schedule, click on each speaker biography and choose to connect with anyone listed who has entered their social media information in their global profile.

Using my profile as an example, you’ll notice the social media icons under my title are available with a simple click of the mouse:

This gives me not only an easy way to network with my fellow speakers, but also an excuse to network with them!  I can click on each one of the social media buttons and choose to follow each of the speakers on Twitter and connect with them on Linkedin.  I send a note with the Linkedin connection telling the speaker that we’re both speaking at the event and, due to this, I’d like to add them to my network.

As you can join as many regional and virtual user groups as you like, (and your Pass membership is free) I joined the three in Colorado, (Denver, Boulder and Colorado Springs.)  Each one of those offers the ability to also connect with the board members using a similar method, (I’ll use Todd and David from the Denver SQL Server user group as my examples.)

The Oracle user groups have embraced adding Twitter links to most speaker bios and some board profiles, but I know for RMOUG, many still hesitate or aren’t using social media to the extent they could.  I can’t stress enough how impressed I am when I see events incorporate Linkedin and Twitter into their speaker and management profiles, knowing the value they bring to technical careers, networks and the community.

New Kids on the Block

Although the SQL Server community is a good example, they aren’t the only ones.  I’m also speaking at new events on emergent technologies, like Data Platforms 2017.  I’ll be polite and expose my own profile page, but I’m told I’m easy to find in the sea of male speakers… 🙂 Along with my picture, bio and session information, there are links to my social media connections, allowing people to connect with me:

Yes, the Bizzabo software, (same software package that RMOUG will be using for our 2018 conference, along with a few other Oracle events this coming year) is aesthetically appealing, but more importantly, it incorporates important networking features that in the past just weren’t as essential as they are in today’s business world.

I first learned the networking tactic of connecting with the people I’m speaking with from Jeff Smith, and I think it’s a great skill that everyone should take advantage of, no matter if you’re speaking or just attending.  For women, I think it’s essential to your career to take advantage of opportunities to network outside of the traditional ways we’ve been taught in the past, and this is just one more way to work around that glass ceiling.


Posted in DBA Life, Oracle, SQLServer Tagged with: , , ,

April 11th, 2017 by dbakevlar

I’m itching to dig more into the SQL Server 2016 optimizer enhancements, but I’m going to complete my comparison of indices between the two platforms before I get myself into further trouble with my favorite area of database technology.


Index Organized Tables

Index Organized Tables, (IOT) are just another variation of a primary b-tree index, but unlike a standard table with an index simply enforcing uniqueness, the index IS the table.  The data is stored in primary key order, which improves performance for key-ordered access.

This is the closest to a SQL Server clustered index that Oracle will ever get, so it makes sense that a comparison of performance and fragmentation is the next step, now that I’ve already compared standard and primary key indexes between the two platforms.

Let’s create a new copy of our Oracle objects, but this time, update to an Index Organized Table:

CREATE TABLE ora_tst_iot(
        c1 NUMBER, 
        c2 varchar2(255),
        CREATEDATE timestamp DEFAULT CURRENT_TIMESTAMP,
        CONSTRAINT pk_ora_iot PRIMARY KEY (c1))
    ORGANIZATION INDEX 
    TABLESPACE users
    PCTTHRESHOLD 20
    OVERFLOW TABLESPACE users;

CREATE SEQUENCE C1_iot_SEQ START WITH 1;

CREATE OR REPLACE TRIGGER C1_iot_BIR 
BEFORE INSERT ON ORA_TST_IOT
FOR EACH ROW
BEGIN
  SELECT C1_iot_SEQ.NEXTVAL
  INTO   :new.C1
  FROM   DUAL;
END;
/

The PCTTHRESHOLD can be set anywhere between 1 and 50, but I chose 20 for this example.  I didn’t add any compression, as C1 is a simple sequence that won’t benefit from compression, and I also added the additional support objects of a sequence and a trigger, just as I did for the previous test on the Oracle table.

First Data Load

Now we’ll insert the rows from ORA_INDEX_TST into ORA_TST_IOT

SQL> insert into ora_tst_iot(c2) select c2 from ora_index_tst;

995830 rows created.

Elapsed: 00:00:04.01

There won’t be any fragmentation in the current table-  it was directly loaded-  no deletes, no updates.  Although it won’t be shown in the examples, I will collect stats at regular intervals and flush the cache to ensure I’m not impacted in any of my tests.
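
For transparency, the housekeeping between tests is nothing fancy-  a sketch, and note that flushing caches like this is a test-system trick only, never something to run in production:

-- refresh optimizer statistics for the IOT, (cascade covers the PK index)
EXEC DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'ORA_TST_IOT', cascade => TRUE);

-- clear cached blocks and parsed statements so each test starts cold
ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SYSTEM FLUSH SHARED_POOL;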


SQL> select index_name from dba_indexes
  2  where table_name='ORA_TST_IOT';

INDEX_NAME
------------------------------
PK_ORA_IOT
SQL> analyze index pk_ora_iot validate structure;

Index analyzed.

SQL> analyze index pk_ora_iot compute statistics;

Index analyzed.

SQL> SELECT LF_BLKS, LF_BLK_LEN, DEL_LF_ROWS, USED_SPACE, PCT_USED FROM INDEX_STATS where NAME='PK_ORA_IOT';

   LF_BLKS LF_BLK_LEN DEL_LF_ROWS USED_SPACE   PCT_USED
---------- ---------- ----------- ---------- ----------
     32115       7996           0  224394262         88

Well, the table IS THE INDEX.  We collect stats on the table and now let’s remove some data, rebuild and then see what we can do to this IOT-

SQL> select * from ora_tst_iot
  2  where c1=994830;

    994830

SBTF02LYEQDFGG2522Q3N3EA2N8IV7SML1MU1IMEG2KLZA6SICGLAVGVY2XWADLZSZAHZOJI5BONDL2L
0O4638IK3JQBW7D92V2ZYQBON49NHJHZR12DM3JWJ1SVWXS76RMBBE9OTDUKRZJVLTPIBX5LWVUUO3VU
VWZTXROKFWYD33R4UID7VXT2NG5ZH5IP9TDOQ8G0

10-APR-17 03.09.52.115290 PM

SQL> delete from ora_tst_iot
  2  where c1 >=994820           
  3  and c1 <=994830;

11 rows deleted.

SQL> commit;

Now we see what the data originally looked like-  C2 is a large column whose data consumes significant space.

What if we now disable our trigger for our sequence and reinsert the rows with smaller values for c2, rebuild and then update with larger values again?

ALTER TRIGGER C1_IOT_BIR DISABLE;

INSERT INTO ORA_TST_IOT(C1,C2) VALUES (994820, 'A');
INSERT INTO ORA_TST_IOT(C1,C2) VALUES (994821, 'B');
INSERT INTO ORA_TST_IOT(C1,C2) VALUES (994822, 'C');
INSERT INTO ORA_TST_IOT(C1,C2) VALUES (994823, 'D');
INSERT INTO ORA_TST_IOT(C1,C2) VALUES (994824, 'E');
INSERT INTO ORA_TST_IOT(C1,C2) VALUES (994825, 'F');
INSERT INTO ORA_TST_IOT(C1,C2) VALUES (994826, 'G');
INSERT INTO ORA_TST_IOT(C1,C2) VALUES (994827, 'H');

so on and so forth till we reach 994830…

COMMIT and then let’s rebuild our table…

Rebuild or Move

ALTER TABLE ORA_TST_IOT MOVE;

What happens to the table, (IOT) when we issue this command?  It moves all the rows back to fill up each block, up to the PCTFREE setting.  For an IOT, we can’t simply rebuild the index, as the index IS THE TABLE-  moving the table IS the index rebuild.
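
Since we created this IOT with an overflow segment, the same statement can relocate and reorganize the overflow at the same time-  a sketch:

-- reorganize the IOT and its overflow segment in one pass
ALTER TABLE ORA_TST_IOT MOVE TABLESPACE users OVERFLOW TABLESPACE users;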

Now we’ve re-organized our IOT so the blocks only take up the space they would have when first inserted into.  So let’s see what happens when we issue an UPDATE to those rows-

SQL> update ora_tst_iot set c2=dbms_random.string('B',200)
  2  where c1 >=994820
  3  and c1 <=994830;

11 rows updated.

So how vulnerable are IOTs to different storage issues?

Chained Rows

How many chained rows do we get after reorganizing the data and then updating it to larger values than before, with only 10% free on each block?

Just 11 updated rows show the pressure:

SQL> SELECT 'Chained or Migrated Rows = '||value
 FROM v$sysstat
 WHERE name = 'table fetch continued row';  

Chained or Migrated Rows = 73730
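
One caveat-  v$sysstat is cumulative for the entire instance, so the count above includes every chained row fetch since startup, not just this table.  To attribute chaining to the IOT itself, the chained rows listing is the way to go, (a sketch-  for an IOT, the chained-rows table needs logical rowids, which is why DBMS_IOT builds it rather than the heap-table utlchain.sql version; verify the package on your release):

-- build an IOT-compatible chained rows table, then populate and query it
EXEC DBMS_IOT.BUILD_CHAIN_ROWS_TABLE(USER, 'ORA_TST_IOT', 'IOT_CHAINED_ROWS');
ANALYZE TABLE ORA_TST_IOT LIST CHAINED ROWS INTO IOT_CHAINED_ROWS;
SELECT COUNT(*) FROM IOT_CHAINED_ROWS WHERE table_name = 'ORA_TST_IOT';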

Data Unload

Let’s delete and update more rows using DML like the following:

SQL> delete from ora_tst_iot
  2  where c2 like '%300%';

4193 rows deleted.

Insert rows containing ‘300’ with varying lengths, delete more, rinse and repeat, update and delete… along the lines of the sketch below.
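
A sketch of that churn, (the batch sizes and string lengths here are arbitrary stand-ins, not what I ran verbatim):

-- re-insert rows containing '300' at varying lengths, (trigger supplies C1)
BEGIN
  FOR i IN 1..5000 LOOP
    INSERT INTO ORA_TST_IOT(c2)
    VALUES ('300' || dbms_random.string('X', MOD(i, 150) + 10));
  END LOOP;
  COMMIT;
END;
/

-- delete a slice of them, then update the survivors to larger values
DELETE FROM ORA_TST_IOT WHERE c2 LIKE '%300%' AND MOD(c1, 3) = 0;
COMMIT;
UPDATE ORA_TST_IOT SET c2 = dbms_random.string('X', 200) WHERE c2 LIKE '%300%';
COMMIT;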

So what has this done to our table as we insert, update, delete and then insert again?

SQL> SELECT table_name, iot_type, iot_name FROM USER_TABLES
     WHERE iot_type IS NOT NULL;  

TABLE_NAME               IOT_TYPE           IOT_NAME
------------------------------ ------------ ------------------------------
SYS_IOT_OVER_88595       IOT_OVERFLOW       ORA_TST_IOT
ORA_TST_IOT              IOT

This is where a clustered index and an IOT are very different.  There is a secondary management object involved when there is overflow.  If you look back at my creation statement, yes, I chose to create an overflow.  Even if I drop the IOT properly, the overflow table will go into the recycle bin, (unless I’ve configured the database without it.)
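
If you’d rather not have the overflow segment land in the recycle bin at cleanup time, a PURGE on the drop takes care of both segments-  a quick sketch:

-- drops the IOT and its overflow segment, bypassing the recycle bin
DROP TABLE ORA_TST_IOT PURGE;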

SQL> select index_name from dba_indexes
  2  where table_name='ORA_TST_IOT';

INDEX_NAME
------------------------------
PK_ORA_IOT

SQL> analyze index pk_ora_iot validate structure;
Index analyzed.

SQL> select blocks, height, br_blks, lf_blks from index_stats;

    BLOCKS     HEIGHT    BR_BLKS    LF_BLKS
---------- ---------- ---------- ----------
     32768          3         45      32115

We can see from the block counts that there aren’t many rows per leaf block-  even though this is a new table without much DML, the current configuration spreads the index across a large number of leaf blocks.

When we select from the IOT, the index is in full use and we can see that with the proper pct free/pct used, the index is still in pretty good shape:

SQL> select * from table(dbms_xplan.display_awr('fbhfmn88tq99z'));


select c1, c2, createdate from ora_tst_iot
Plan hash value: 3208808379

------------------------------------------------------------------------------------
| Id  | Operation            | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |            |       |       |  8709 (100)|          |
|   1 |  INDEX FAST FULL SCAN| PK_ORA_IOT |   995K|   206M|  8709   (1)| 00:01:45 |
------------------------------------------------------------------------------------

13 rows selected.

SQL> analyze index pk_ora_iot validate structure;
Index analyzed.

SQL> SELECT blocks, height, br_blks, lf_blks FROM index_stats;

    BLOCKS     HEIGHT    BR_BLKS    LF_BLKS
---------- ---------- ---------- ----------
     32768          3         45      32058

SQL> select pct_used from index_stats;

  PCT_USED
----------
        88

I Did a Bad

So now what happens, if like our original test, we shrink down the percentage of what can be used and reorganize, (and please don’t do this in production…or test….or dev….or ever! 🙂)?

SQL> alter table ora_tst_iot move pctfree 90;
Table altered.

SQL> analyze index pk_ora_iot validate structure;
Index analyzed.

SQL> SELECT blocks, height, br_blks, lf_blks FROM index_stats;

    BLOCKS     HEIGHT    BR_BLKS    LF_BLKS
---------- ---------- ---------- ----------
    172928          3        228     165630

Well, that’s a few more leaf blocks, eh?  After re-enabling the trigger, we insert again-

SQL> BEGIN
   FOR i in 1..1000000 LOOP
    INSERT INTO ORA_TST_IOT(c2) VALUES(i);
   END LOOP;
 COMMIT;
END;
 /

Now we have our elapsed times for the SQL Server, (Azure) inserts of 1 million records at 100% and 10% fill factor.  Let’s compare them to our IOT.  The IOT move command to fill the blocks to 100% was quite fast.  Of course, the reverse, only allowing for 10%, (90% free) took F.O.R.E.V.E.R…, (OK, it sure felt like it…why didn’t I just truncate it?  Oh, yeah, I wanted it to be a real test, not simply an easy test..)

Note: For this test, we’ll rebuild after updating the pctfree each time.
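
In other words, each timed run was preceded by a MOVE with the new PCTFREE-  a sketch of the two extremes:

ALTER TABLE ORA_TST_IOT MOVE PCTFREE 0;   -- pack each block completely full
ALTER TABLE ORA_TST_IOT MOVE PCTFREE 90;  -- leave 90% of each block free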


10% Fill Factor in SQL Server and 1 million insert: Elapsed time: 23 minutes, 18 seconds

90% PCTFree in Oracle and 1 million insert:  7 min, 12 seconds

100% Fill Factor in SQL Server and 1 million insert: Elapsed Time: 4 minutes, 43 seconds

0% PCTFree in Oracle and 1 million insert: 1 min, 8 seconds

REBUILD of the Oracle IOT to make it 90% free in each block?  Elapsed Time:  8 hrs, 21 minutes, 12 seconds 

…along with the four archive log backups it generated that filled up the archive dest… 🙂  Now, the AWS Trial is meant for testing out the Delphix product, not index performance in a high insert/delete/update scenario, so I’ve been asking a lot of it with these challenges, but it was still a great way to build this out quickly and then compare.


Results

In this test, these were the overall results:

  1. Less overhead during transactional processing when it comes to inserts, updates and deletes to the IOT.
  2. Improved performance on any selects that require the data to be sorted in sequential order.
  3. Similar performance to SQL Server clustered indexes on complex queries and less sort temp usage.
  4. More limited use cases than SQL Server clustered indexes-  clustered indexes are quite common, while IOTs are less used in the Oracle space.
  5. More maintenance upkeep, as we’ll need to reorganize the IOT if it’s used with processing that includes a lot of inserts, updates and deletes.
  6. A DBCC rebuild of a clustered index uses fewer resources and doesn’t hit the transaction log the way an IOT rebuild hits Oracle’s rollback and archive logs.  It was easier to build the table with a high pct free storage configuration, insert the data and drop the old table than to do an “alter move” command.

Now there are more comparisons to do, so I’m going to dig in more on the SQL Server side, but here’s to Oracle Index Organized Tables, (IOTs)!

Posted in Oracle, SQLServer Tagged with: , , ,

April 5th, 2017 by dbakevlar

I just uploaded my slides from Collaborate 2017 to Slideshare, but also, the findings missing from my slide deck I used for Oak Table World on War of the Indices- Oracle and SQL Server.  Feel free to download them here.

As I tested Oracle vs. SQL Server index performance, I ran across this great script from Franck Pachot to check for fragmentation.  You’ll need to update the script to declare the table and index name, or simply edit the script as is, adding those two values correctly before running it.
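
As an example of the kind of edit involved, (an assumption on my part-  check the header of Franck’s script for the exact substitution variable names it expects):

-- hypothetical SQL*Plus substitution variables; match them to the script
DEFINE table_name = ORA_INDEX_TST
DEFINE index_name = IDX_INDEXPS
@index_frag.sql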

The outcome when run against my ugly index in the testing of the good, the bad and the ugly, was fun, to say the least:

SQL> @index_frag.sql;

        C1 ->         C1 rows/block bytes/block %free space     blocks free
---------- -- ---------- ---------- ----------- ----------- ---------- -----
     30041 ->     130221         35         560          93       2851 oooo
    130256 ->     230433         35         560          93       2851 oooo
    230468 ->     330654         35         560          93       2851 oooo
    330689 ->     430874         35         560          93       2851 oooo
    430909 ->     531073         35         560          93       2851 oooo
    531108 ->     631264         35         560          93       2850 oooo
    631299 ->     731429         35         560          93       2850 oooo
    731464 ->     831547         35         560          93       2850 oooo
    831583 ->     931711         35         560          93       2850 oooo
    931746 ->    1030047         34         560          93       2850 oooo

If you have just one or a couple indexes that you want to perform some deep analysis to view fragmentation, Franck’s script does an awesome job.

Just my share for the day…. 🙂

Posted in Oracle Tagged with: ,

April 5th, 2017 by dbakevlar

This post has a lot of the support code and data for my Oak Table Talk that I’ll be giving at IOUG Collaborate 2017 in Las Vegas on April 5th, 2017.  

One of the Iceland 2017 SQL Saturday sessions got me thinking about indexing and how similar and different it all is in Oracle vs. SQL Server.  There were some really fun, (well, at least what I call fun…) test cases built out and referenced by Paul Randal.  After looking through some of it, I decided it might be interesting to try to replicate it in Oracle, (as close as possible) and compare how the two database platforms deal with index storage-  specifically, SQL Server’s Fill Factor vs. Oracle’s PCTFREE.

B-tree indexing is the cornerstone of physically optimizing searches on data.  No consensus exists on what the “B” stands for, (some think it’s for Bayer, one of the main researchers behind it, and many more believe it’s for Boeing, whose research center hosted the work.)

The choices in how the data is organized-  leaves and such-  are pretty standard, but database platforms have created some unique indexing that enhances queries on an RDBMS vs. just having heap tables.

Using Oracle and SQL Server as our choice for a comparison today, there are a few translations I need for readers of this blog:

Oracle                               | SQL Server                      | Description
------------------------------------ | ------------------------------- | -----------------------------------------------------------
Index Organized Table, (IOT)         | Clustered Index                 | Physical index storing the data in key order. In SQL Server, there can be only one clustered index per table.
Pctfree of block                     | FillFactor of page              | Percent of storage that is allowed to be filled. Each platform applies this at different times.
Sequence                             | TOP                             | Ability to populate data with a sequential number
dbms_random.string                   | Replicate                       | Ability to populate data with string values
Block                                | Page                            | Unit of storage
Automatic Workload Repository, (AWR) | Dynamic Management Views, (DMV) | Performance data collection

Now that we have that out of the way, you can use this trusty, little graph for common terms that require a “translation” from one database platform to the other.

The next thing to remember is that PCTFREE and FillFactor aren’t adhered to at all times.  Appending a row to an index is different than updating a row in an index, and each platform has its own set of criteria to decide whether it follows the rule of percentage of a block or page to fill or not.
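
For reference, here’s where each platform lets you state that rule, using the index names we’ll create below.  Note the inversion-  PCTFREE says how much of the block to leave empty, while FILLFACTOR says how much of the page to fill, so PCTFREE 90 corresponds to FILLFACTOR 10:

-- Oracle: PCTFREE is honored when the index is built or rebuilt
ALTER INDEX PK_INDEXPS REBUILD PCTFREE 90;

-- SQL Server: FILLFACTOR is honored when the index is built or rebuilt
ALTER INDEX PK_CLINDX_TST ON SQL_INDEX_TST REBUILD WITH (FILLFACTOR = 10);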

The Test

The steps of this test:

  1. Create a table with three columns and two indexes-  SQL Server having its trusted clustered index.
  2. Populate data of different sizes to each of the tables.
  3. Check the storage of our index “health”
  4. Remove data
  5. Repeat step 2 and also remove data
  6. Check the storage again to see how it has changed-  page splits in SQL Server, leaf block splits in Oracle

Goal is:

  1. Inspect the differences and similarities of indexing in both platforms
  2. The pros and cons of how index data is stored and used in both platforms

Oracle Code for Creation of Objects and Support

  • Table and PK with constraint
  • Alter index statement to decrease pctfree
  • Sequence to populate c1 column
  • Trigger to do this in simple way
  • Rinse, repeat, more “shampoo”, and do it all again… 🙂

Yes, I could have just used RowNum, but I was trying to kill a second bird, (testing task) with this one stone, so a trigger with a sequence it is… 🙂

CREATE TABLE ORA_INDEX_TST
(
C1 NUMBER NOT NULL
,C2 VARCHAR2(255)
,CREATEDATE TIMESTAMP
);

CREATE INDEX PK_INDEXPS ON ORA_INDEX_TST (C1);
ALTER TABLE ORA_INDEX_TST ADD CONSTRAINT OIT_PK PRIMARY KEY(C1) USING INDEX PK_INDEXPS;
CREATE UNIQUE INDEX IDX_INDEXPS ON ORA_INDEX_TST(C2);
ALTER INDEX PK_INDEXPS REBUILD PCTFREE 90 INITRANS 5;
ALTER INDEX IDX_INDEXPS REBUILD PCTFREE 90 INITRANS 5;
CREATE SEQUENCE C1_SEQ START WITH 1;

CREATE OR REPLACE TRIGGER C1_BIR 
BEFORE INSERT ON ORA_INDEX_TST 
FOR EACH ROW
BEGIN
  SELECT C1_SEQ.NEXTVAL
  INTO   :new.C1
  FROM   DUAL;
END;
/
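
As an aside, on 12c an identity column could replace the sequence-plus-trigger pair entirely-  a sketch, (hypothetical table name, and I stuck with the trigger for my secondary testing task):

-- 12c alternative: identity column instead of sequence + trigger
CREATE TABLE ORA_INDEX_TST2
(
C1 NUMBER GENERATED BY DEFAULT ON NULL AS IDENTITY NOT NULL
,C2 VARCHAR2(255)
,CREATEDATE TIMESTAMP
);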

We’ll need to manually insert just enough data to fill up one block, which is 8KB in this database, (and default.)

INSERT INTO ORA_INDEX_TST (C2, CREATEDATE) 
VALUES (dbms_random.string('A', 200), SYSDATE);
INSERT INTO ORA_INDEX_TST (C2, CREATEDATE) 
VALUES (dbms_random.string('B', 200), SYSDATE);
INSERT INTO ORA_INDEX_TST (C2, CREATEDATE) 
VALUES (dbms_random.string('C', 200), SYSDATE);
INSERT INTO ORA_INDEX_TST (C2, CREATEDATE) 
VALUES (dbms_random.string('D', 200), SYSDATE);
INSERT INTO ORA_INDEX_TST (C2, CREATEDATE) 
VALUES (dbms_random.string('E', 200), SYSDATE);
INSERT INTO ORA_INDEX_TST (C2, CREATEDATE) 
VALUES (dbms_random.string('F', 200), SYSDATE);
INSERT INTO ORA_INDEX_TST (C2, CREATEDATE) 
VALUES (dbms_random.string('G', 200), SYSDATE);
COMMIT;

We’ll now verify that our data is inserted into one block:

SQL> ANALYZE INDEX PK_INDEXPS VALIDATE STRUCTURE;

SQL> SELECT LF_BLKS, LF_BLK_LEN, DEL_LF_ROWS,USED_SPACE, PCT_USED FROM INDEX_STATS where NAME='PK_INDEXPS';
   LF_BLKS LF_BLK_LEN DEL_LF_ROWS USED_SPACE   PCT_USED
---------- ---------- ----------- ---------- ----------
         1       7924           0       1491         19

Code for SQL Server Objects and Support.  Since I didn’t have the same secondary project request, this one will appear simpler:

  • Table with PK with constraint
  • alter index and change fill factor
  • Second Table to use for data population

CREATE TABLE SQL_INDEX_TST (c1 INT NOT NULL, c2 CHAR (255), createdate datetime);
CREATE INDEX CL2_INDEX_TST ON SQL_INDEX_TST(C2);
GO

ALTER TABLE SQL_INDEX_TST 
ADD CONSTRAINT PK_CLINDX_TST PRIMARY KEY CLUSTERED (c1);
GO

First, in SQL Server, a page will hold around 8KB of data, so let’s test out our index storage:

INSERT INTO SQL_INDEX_TST(c1,c2) VALUES (1, 'a');
INSERT INTO SQL_INDEX_TST(c1,c2) VALUES (2, 'a'); 
INSERT INTO SQL_INDEX_TST(c1,c2) VALUES (3, 'a'); 
INSERT INTO SQL_INDEX_TST(c1,c2) VALUES (4, 'a');
INSERT INTO SQL_INDEX_TST(c1,c2) VALUES (5, 'a'); 
INSERT INTO SQL_INDEX_TST(c1,c2) VALUES (6, 'a'); 
INSERT INTO SQL_INDEX_TST(c1,c2) VALUES (7, 'a'); 
GO

We now have officially “filled” the first page as much as possible and we should see this if we query the information schema:

SELECT
OBJECT_SCHEMA_NAME(ios.object_id) + '.' + OBJECT_NAME(ios.object_id) as table_name, i.name as index_name, leaf_allocation_count, nonleaf_allocation_count, fill_factor, type_desc
FROM sys.dm_db_index_operational_stats(DB_ID(), OBJECT_ID('dbo.SQL_INDEX_TST'),NULL, NULL) ios
INNER JOIN sys.indexes i ON i.object_id = ios.object_id AND i.index_id = ios.index_id;
 

Data Loads

Oracle:

SQL> Begin
For IDS in 1..1000000
Loop
INSERT INTO ORA_INDEX_TST (C2) 
VALUES (dbms_random.string('X', 200));
Commit;
End loop;
End; 
  /

10% PCT Free- Time Elapsed 2 minutes, 12 seconds

90% PCT Free- Time Elapsed 7 minutes, 3 seconds

I’ll have both the initial test data and the 1 million rows I’ve added:

SQL> select count(*) from ora_index_tst;

  COUNT(*)
----------
     1000008

Let’s delete some of this data load to create deleted leaf blocks:

SQL> delete from ora_index_tst
2 where c2 like '%200%';

4179 rows deleted.

SQL> commit;
Commit complete.

Now let’s analyze and take a look at the stats again:

SELECT LF_BLKS, LF_BLK_LEN, DEL_LF_ROWS, USED_SPACE, PCT_USED FROM INDEX_STATS where NAME='PK_INDEXPS';

   LF_BLKS LF_BLK_LEN DEL_LF_ROWS USED_SPACE   PCT_USED
---------- ---------- ----------- ---------- ----------
     41227       7924         121  212596009         19

There’s a substantial difference in the number of leaf blocks vs. when the blocks are allowed to fill up:

   LF_BLKS LF_BLK_LEN DEL_LF_ROWS USED_SPACE   PCT_USED
---------- ---------- ----------- ---------- ----------
      2004       7996         531   15985741        100

Oracle wasn’t impacted by PCTFREE that much, but there was some impact.  Rebuilds were needed to clean up some waste, but they weren’t a true “requirement”-  just a preference when there are consistent deletes, updates that size data differently than the original, and poor storage choices.  The differences in performance weren’t that significant.

Now that we know we have deleted rows, let’s do the same on the SQL Server side:

SQL Server

declare @id int

select @id = 8 --rows 1 through 7 were inserted above
while @id >= 0 and @id <= 1000000
begin
   insert into sql_index_tst (c1,c2) values(@id, 'DKUELKJ' + convert(varchar(7), @id))
   select @id = @id + 1
end

Default Fill Factor- Elapsed Time: 4 minutes, 43 seconds

10% Fill Factor- Elapsed time: 23 minutes, 18 seconds

Delete some rows to test similar to Oracle:

DELETE FROM SQL_INDEX_TST WHERE c2 LIKE '%200%';

Now there are a few ways we can look at how the indexes were impacted.  We’ll first check for page splits, which as we’ve discussed, cause extra work to the transaction log and fragmentation in the index:

SELECT
OBJECT_SCHEMA_NAME(ios.object_id) + '.' + OBJECT_NAME(ios.object_id) as table_name
,i.name as index_name
,leaf_allocation_count
,nonleaf_allocation_count
FROM sys.dm_db_index_operational_stats(DB_ID(), OBJECT_ID('dbo.SQL_Index_tst'),NULL, NULL) ios
INNER JOIN sys.indexes i ON i.object_id = ios.object_id AND i.index_id = ios.index_id;

Next, we’ll look at the physical fragmentation of the index:

SELECT
OBJECT_SCHEMA_NAME(ips.object_id) + '.' + OBJECT_NAME(ips.object_id) as table_name
,ips.avg_fragmentation_in_percent
,ips.fragment_count
,ips.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.SQL_Index_tst'), NULL, NULL, 'DETAILED') ips;
GO

There’s significant fragmentation and it also impacted performance as we viewed above.

Index Maintenance

USE AS_test;
GO

DBCC DBREINDEX ('SQL_INDEX_TST', CL2_INDEX_TST, 100);
DBCC DBREINDEX ('SQL_INDEX_TST', PK_CLINDX_TST, 100);
GO

We’ve now rebuilt our indexes and moved the fillfactor to 100%.  Queries using each index column in the where clause improved by over 20%.  Inserts and updates increased to perform similarly to Oracle, unless….

Sorts on the C1 column with a clustered index in SQL Server improved dramatically and out-performed Oracle’s PK.  Only IOT tables could compete, and the use case where they were beneficial was very small.
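
For anyone wanting to reproduce the comparison, the SQL Server side of my timings came from runs along these lines, (a sketch-  the literal values are hypothetical stand-ins):

SET STATISTICS TIME ON;
SET STATISTICS IO ON;

-- singleton seek through the clustered PK on c1
SELECT c1, c2 FROM SQL_INDEX_TST WHERE c1 = 500000;

-- sort test on the clustering key
SELECT TOP 1000 c1, c2 FROM SQL_INDEX_TST ORDER BY c1;
GO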

And The Winner Is…

So who won out in my comparison at Oak Table World?  As we always hear from the DBA,

It Depends…

Some of the benefits of clustered indexes in SQL Server are superior to Oracle:

  • Data is physically sorted in the clustered index
  • Optimizer usage specific- clustered index seek
  • Work best with sequential data, identity columns and order dates
  • The option to randomize the writes on the index can deter hot spots.

There are negatives that leave this debate still open for me:

  • Vulnerable to fragmentation
  • Inserts are added at the end and introduce the “Last Page Insert Latch Contention” wait event.
  • Subject to hot spots, (see above…)
  • Page Splits- hit performance hard, especially to the transaction log, (requires much more work than standard insert.)
  • Fillfactor can be a hit or miss configuration for some systems.

There were a lot of other tests and queries used beyond what’s presented here, but this is the main focus of the test.  I need to thank those who’ve contributed the deep index knowledge that inspired me to do my own research.  Shout out to Richard Foote, Mr. Oracle Index, and to Paul Randal and Jason Strate for the SQL Server expertise!

Posted in Oracle, SQLServer Tagged with: ,
