In this previous post I talked about how some customers do start their virtualization journey by virtualizing mission-critical databases. They are the exception, not the rule, but that post raised a lot of interest, as many customers are looking at virtualizing mission-critical applications and databases as the next step in their virtualization journey.
So I went back to one of these customers and did an in-depth interview with Douglas Babb, Chief IT Systems Architect, System Implementers, Inc, to get more information about his experience. It is a bit long, but there is a lot of good stuff in here. If you go back to the key drivers of the virtualization journey, Doug is a good example of an individual who embodies two of the critical virtualization adoption elements, Confidence and Sponsorship: he is both very technical and experienced (confidence) and very senior and respected (sponsorship).
Q: Doug, a lot of customers are concerned about virtualizing production databases. What is your experience? Why virtualize databases?
Doug: We look at application candidates holistically from the app logic all the way to the databases, from a performance perspective as well as quality of service requirements.
Also, we make sure we deploy our databases in the virtual environment across the development lifecycle so that we have the same architecture and quality of service across all phases: development, test and production.
Q: When did you start virtualizing databases in your journey?
Doug: The first step was to ensure we had the framework for virtualization. We did not have any enterprise storage network, we had no fabric, and we consolidated 6 data centers into two. We focused on building the framework, which was network, data fabric, storage and power distribution.
If we had started virtualization without having a stable data center, stable enterprise storage, and fabrics, we would have failed.
Q: So, when did you start with the actual virtualization effort?
Doug: In the first phase of the project we moved our databases that were running on proprietary systems such as Unix, OpenVMS and mainframes to the x86 architecture.
Q: What was the technical reason why databases were a good candidate for virtualization?
Doug: Prior to starting this effort, we had been down 8 times in 3 months, and the cost of downtime for just one of the applications was 1 million per hour.
We were shocked by the cost of downtime. Just with the cost of salary alone, we paid for the virtualization effort in the first year.
I found that ERP techniques work well in large virtualization projects. We took the Oracle Application Implementation Methodology (AIM), with its short cycles and step-wise refinements, in order to perfect the process, so we used those types of ERP methodologies for our consolidation efforts.
Q: If you had to help a customer virtualize databases based on your experience, what would the recipe be?
Doug: The first thing is to understand the performance of the application before virtualizing. Our concept is to create a jail, that is, a confined environment where we get quantifiable data on how the application performs.
The next thing is to determine if the application/DB is a fit.
Q: Do you have examples of configurations that you stayed away from?
Doug: We have not stayed away from anything so far, other than one application that does 3.5 trillion I/Os in 2.5 hours. We know that with VMware we can currently do 100,000 I/Os per second per VM, so that one was not a good fit.
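As a quick sanity check on that call, the figures Doug quotes can be turned into a sustained I/O rate and compared with the per-VM ceiling. This is only a back-of-the-envelope sketch: it assumes the load is spread evenly over the 2.5 hours, and the function names are mine, not anything from Doug's tooling.

```python
# Figures quoted in the interview.
TOTAL_IOS = 3.5e12            # 3.5 trillion I/Os
WINDOW_HOURS = 2.5            # completed within 2.5 hours
PER_VM_IOPS_LIMIT = 100_000   # per-VM ceiling Doug cites for VMware at the time


def required_iops(total_ios: float, window_hours: float) -> float:
    """Average I/Os per second needed to finish within the window.

    Assumes an even load; real workloads are burstier, so peaks would be higher.
    """
    return total_ios / (window_hours * 3600)


def fits_in_one_vm(total_ios: float, window_hours: float, per_vm_limit: float) -> bool:
    """True if the average rate stays at or below the per-VM ceiling."""
    return required_iops(total_ios, window_hours) <= per_vm_limit


needed = required_iops(TOTAL_IOS, WINDOW_HOURS)
print(f"Sustained rate needed: {needed:,.0f} I/Os per second")
print(f"Multiple of the per-VM ceiling: {needed / PER_VM_IOPS_LIMIT:,.0f}x")
print("Fits in one VM:", fits_in_one_vm(TOTAL_IOS, WINDOW_HOURS, PER_VM_IOPS_LIMIT))
```

On those numbers the application needs roughly 389 million I/Os per second on average, thousands of times the quoted per-VM ceiling, which is consistent with Doug ruling it out.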
Then we profile the application after virtualization. The thing that we are most concerned about is the user experience, so we have used multiple tools to measure it, including an open source one called Grinder (http://grinder.sourceforge.net/) or Mercury test professional. That is the metric that really matters.
We also capture the standard CPU, network, I/O and memory metrics.
When we go into production, we also keep the original application running and if the users say “it did not use to work that way”, we can point them to a working instance prior to virtualization and show them the actual data. Quantifiable data is hard to argue with.
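To make the "measure before and after" idea concrete, here is a minimal sketch of comparing response-time samples from the original physical instance with samples from the virtualized one. It is not Doug's tooling: the sample values, the 10% tolerance and the function names are hypothetical, and in practice the samples would come from a load-testing tool such as the ones he mentions.

```python
from statistics import mean, quantiles


def summarize(samples_ms):
    """Return the mean and 95th-percentile response time in milliseconds."""
    # n=20 yields 19 cut points; the last one is the 95th percentile.
    p95 = quantiles(samples_ms, n=20, method="inclusive")[-1]
    return {"mean": mean(samples_ms), "p95": p95}


def compare(before_ms, after_ms, tolerance=0.10):
    """Flag a regression when p95 after virtualization exceeds the baseline
    by more than `tolerance` (a hypothetical threshold, here 10%)."""
    before, after = summarize(before_ms), summarize(after_ms)
    change = (after["p95"] - before["p95"]) / before["p95"]
    return {
        "before": before,
        "after": after,
        "p95_change": change,
        "regressed": change > tolerance,
    }


# Hypothetical response times (ms) for the same transaction mix.
physical_ms = [120, 130, 125, 140, 135, 128, 132, 138, 127, 131]
virtual_ms = [118, 126, 122, 137, 130, 125, 129, 133, 124, 128]

report = compare(physical_ms, virtual_ms)
print(f"p95 before: {report['before']['p95']:.1f} ms")
print(f"p95 after:  {report['after']['p95']:.1f} ms")
print(f"p95 change: {report['p95_change']:+.1%}")
print("Regression:", report["regressed"])
```

Comparing a percentile rather than only the mean keeps the focus on what users actually feel, which is the metric Doug says matters most.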
Q: So, you are saying that applications and databases run better in the virtual environment from a technical perspective, tell me from a business perspective, what is the business justification for
Doug: In most cases we are 2% to 8% of the previous cost of hosting.
And even if you look at our virtualization journey, we have an application that we just refreshed and it is going to be 40% less even after saving 90%.
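The two savings compound, which is easy to misread. A small sketch of the arithmetic follows, assuming that the "40% less" is measured against the already reduced cost (my reading of Doug's statement, not something he spells out).

```python
def remaining_fraction(*savings):
    """Fraction of the original cost left after applying each saving in turn.

    Each saving is a fraction of the cost remaining at that point,
    e.g. 0.90 for "90% saved".
    """
    remaining = 1.0
    for saving in savings:
        remaining *= 1.0 - saving
    return remaining


# 90% saved by the first virtualization pass, then 40% less after the refresh.
after_refresh = remaining_fraction(0.90, 0.40)
print(f"Cost after refresh: {after_refresh:.0%} of the original hosting cost")
```

Under that assumption the refreshed application lands at about 6% of the original hosting cost, which sits inside the 2% to 8% range Doug quotes.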
Q: Where does the cost saving come from?
Doug: All of the above: fewer physical servers; most applications did not use de-duplication of backups or de-duplication of storage; and we also use information lifecycle management, where we only use high performance storage when required.
Q: So, when you move to the virtual environment you also take the chance to upgrade all the related components.
Doug: And we also remove things where possible. We consolidate, standardize, automate and repeat. And we don’t try to get it right the first time; we try to get it close, and then we iterate. Sometimes we iterate 10-12 times before we go into production.
Q: Oh yes, I am a big proponent of Agile and Scrum methodologies.
Doug: We really like design thinking and Agile processes. A lot of people think that they are only applicable to development, but they work just as well for operations.
We are always trying to use lean methodology (http://www.lean.org/WhatsLean/), which came from Toyota: plan, do, check, act.
Q: Did you have to deal with concerns from DBAs?
Q: How did you address them?
Doug: Two nights ago a virtualized application started to double its response time. So the DBA sends out an email saying that something is going on with the virtual environment.
By correlating the performance events from the physical server, the storage, the operating system, Virtual Center and Oracle itself, we were able to determine that they had just crossed a boundary, and by changing two minor parameters in Oracle the system went back to normal. It had nothing to do with virtualization. It had to do with the fact that the cost-based optimizer crossed the boundary whereby a query just did not fit into memory anymore.
We constantly fight this attitude that virtualization is evil because of all the misinformation that is out there.
A lot of it is also because of over-subscription: people who are reducing cost sometimes over-subscribe their environment to the point where performance suffers.
So one of the things that we do is we never over-subscribe RAM. RAM is so cheap that we chose not to over-subscribe it. We always buy servers with maximum RAM.
We are bringing our servers in with ½ a terabyte of RAM.
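The "never over-subscribe RAM" rule is simple to check mechanically: the memory assigned to all VMs on a host, plus whatever is held back for the hypervisor, must not exceed the physical RAM. The sketch below illustrates that rule only; the VM names, sizes and the 16 GB reserve are hypothetical, and ½ a terabyte is taken as 512 GB.

```python
from dataclasses import dataclass


@dataclass
class VM:
    name: str
    ram_gb: int


def ram_headroom_gb(host_ram_gb: int, vms: list, reserve_gb: int = 0) -> int:
    """Physical RAM left after the hypervisor reserve and every VM's allocation."""
    allocated = sum(vm.ram_gb for vm in vms)
    return host_ram_gb - reserve_gb - allocated


def is_oversubscribed(host_ram_gb: int, vms: list, reserve_gb: int = 0) -> bool:
    """True when the VMs are promised more RAM than the host can back."""
    return ram_headroom_gb(host_ram_gb, vms, reserve_gb) < 0


HOST_RAM_GB = 512   # "1/2 a terabyte", taken as 512 GB
RESERVE_GB = 16     # hypothetical allowance for the hypervisor itself

# Hypothetical placement of database VMs on one host.
vms = [VM("erp-db", 128), VM("dw-db", 192), VM("crm-db", 64), VM("test-db", 96)]

print("Headroom (GB):", ram_headroom_gb(HOST_RAM_GB, vms, RESERVE_GB))
print("Over-subscribed:", is_oversubscribed(HOST_RAM_GB, vms, RESERVE_GB))

# Adding one more 32 GB VM would break the rule.
vms_plus_one = vms + [VM("new-db", 32)]
print("Over-subscribed with new-db:", is_oversubscribed(HOST_RAM_GB, vms_plus_one, RESERVE_GB))
```

A check like this could run before every new placement, so the policy is enforced by arithmetic rather than by memory of who asked for what.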
There is so much value brought by the VMware product ecosystem, with being able to standardize your images and use Site Recovery Manager and other flexible features… why over-subscribe something that is so inexpensive?
Q: Do you see DBAs embracing virtualization?
Doug: No. They understand why we do it, but they feel like they are losing something. There is always the fear of loss whenever something is taken away from you… If you can build a kingdom, if you can be in an isolated area, you are more survivable, and with the economy and the way everything is working, I think people feel better having ownership of their entire environment.
Q: hug their servers…
Doug: Yeah. They want to hug it for sure. It is something that is especially true for the DBAs, especially because of the FUD that has been spread in the industry about virtualizing databases.
Q: Do you virtualize both Oracle and SQL Server?
Doug: We do.
Q: How did you get over the concern about Oracle support?
Doug: That’s a huge issue. For us, we have built into our ELA with both Microsoft and Oracle that if they want to continue working with us they have to support VMware. That was the first thing we did. We reached out to our ELA team to ensure that there was verbiage in there, including RAC, that specified that in our environment we want to be able to run on VMware.
Since then, there has been a huge change with Oracle. With the exception of Oracle RAC everything is supported on VMware now. I think that a lot of this changed when they acquired BEA.
Q: What about other ISVs?
Doug: We had an instance with Cognos where they were going to deploy a new version (version 8) and they said they would not support us on VMware. I excused myself from the room, I called their technical account manager who called the team at VMware and within 5 minutes there was an email to the Cognos sales person with the support statement.
With Oracle, with the exception of RAC, we have no issues; with Microsoft we have no issues as far as supporting us on VMware.
Microsoft Exchange is the only Microsoft application that we have not virtualized, but we don’t have control over that environment currently.
Q: So, to recap, we said:
1 – Measure everything before and after virtualization. Constantly measure.
2 – Keep the existing physical environment around for a while.
3 – Use HA extensively. We are also very interested in FT when it supports multiple CPUs.
Thank you Doug!!