Upgrades are dying, don’t die with them

We live in a world that has changed the way it consumes applications. The last few years have seen a rapid rise in the adoption of Software-as-a-Service (SaaS) and Platform-as-a-Service (PaaS). Much of this can be attributed to the broad success of Amazon Web Services (AWS), which is said to have grown revenue from $3.1B to $5B last year (Forbes). More and more people, enterprise customers included, are consuming applications and resources that require little to no maintenance, and any maintenance that does happen now goes unnoticed by users. This leaves traditional software vendors scrambling to adapt their distribution models and make their software easier to consume. Lengthy, painful upgrades are no longer acceptable to users, forcing vendors to find a solution to this problem.

Let’s face it: the impact of this on traditional software companies is starting to be felt. Their services and methods of doing business are now being compared to a newer, more efficient model, one that is not bogged down by the inefficiencies of the traditional approach. SaaS providers have the advantage that the software runs in their own datacenters, where they have easy access to it and control the hardware, the architecture, the configurations, and so on.

Open source initiatives that target the enterprise market, like OpenStack, have to look at what others are doing in order to appeal to their intended audience. The grueling release cycle of the OpenStack community (major releases every six months) can put undue pressure on enterprise IT teams to update, deploy, and maintain environments, often leaving them unable to keep up from one release to the next. Inevitably, they start falling behind. In some cases, their update efforts are slower than the software release cycle, so they fall further behind with each release. This is a major hindrance to successful OpenStack adoption.

Solving only one side of the problem

Looking at today’s best practices for upgrading, we can see that the technology hasn’t quite matured yet. And although DevOps allows companies to deliver code to customers faster, it doesn’t solve the problem of installing the new underlying infrastructure; faster is not enough. This situation is even more critical when considering data security practices. The ability to patch quickly and efficiently is key for companies to deploy security updates when critical security issues are spotted.

Compounding this is the question of how businesses can shorten the feedback loop on development releases. Releasing an alpha or beta, then waiting for people to test it and send relevant feedback, is a long process that causes delays for both the customer and the provider. Yet another friction point.

Efforts are currently being made with the community projects Tempest and Rally to provide better visibility into a cloud’s stability and functionality. These two projects are necessary steps in the right direction; however, they currently lack holistic integration and still only offer visibility into a single cloud’s performance. Additionally, they do not yet allow an OpenStack distribution provider to check whether a new version of its distribution works with specific configurations or hardware. Whatever the solution is, it has to compete with what is currently being offered in the “*aaS” space, or it will be seen as outdated and risk losing users.

Automation: A way out

Continuous integration and continuous delivery (CI/CD) is all the rage these days, and it might offer part of the solution. Automation has to play a key role if companies are to keep up. We need to look into ways of making the process repeatable, reliable, incrementally improving, and customizable. Just as developers can no longer claim that something worked on their laptop, companies cannot limit themselves to saying it worked (or didn’t work) on their own infrastructure. Software providers have to get closer to their customers and share in the pain.

Every OpenStack deployment is a custom job these days. Not everyone is running the same hardware, the same configurations, and so on. This means we have to adapt to those customizations and provide a framework that allows people to test their specific use cases. Once unit testing, integration testing, and functional testing have happened inside the walls of the software provider, the software has to go out into the wild and survive real customer use cases. And just as important, feedback has to be received quickly so that the next iterations can be smaller, which eases the burden of identifying problems and fixing them as needed.

One of the concepts Red Hat is investigating is chaining different CI environments and managing the logs and log analysis from a “central CI.” We’ve been working with customers to validate this concept, testing it first on the equipment of customers and partners who have been able to set some aside for us. We want to deploy a new version and verify an update live on premises, and include this step in our gating process before merging code. We are not satisfied unless it can be deployed and proven to work in a real environment. This means that CI/CD isn’t just about us anymore: it has to work on-site, or a patch is not merged.

Currently in our testing, we receive status reports from different architectures, which allows us to identify whether an issue is specific to a certain configuration, hardware, or environment. It also allows us to identify more widespread issues that need to be fixed in the release. Ideally, we envision a point where, once a new version reaches a certain “acceptance threshold,” it is marked as ready for release. It is then automatically pushed out to update a customer’s pre-production environment.
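As a rough illustration of the “acceptance threshold” idea, the gating decision could be sketched as below. This is a hypothetical sketch, not an actual Red Hat implementation: the function name, environment names, and the 95% threshold are all assumptions made for the example.

```python
# Hypothetical sketch: decide whether a release candidate is ready
# based on CI results reported back from many customer/partner environments.

def ready_for_release(results, threshold=0.95):
    """results maps an environment name to True (passed) or False (failed).

    The candidate is marked ready only when the fraction of passing
    environments meets the acceptance threshold.
    """
    if not results:
        return False  # no reports yet: never promote blindly
    passed = sum(1 for ok in results.values() if ok)
    return passed / len(results) >= threshold

# Example: 3 of 4 environments passed (75%), below a 95% threshold,
# so the version is held back and the failing site is investigated.
status = ready_for_release(
    {"site-a": True, "site-b": True, "site-c": False, "site-d": True}
)
```

A real gate would likely weight environments differently and distinguish hard failures from flaky tests, but the core decision is this simple ratio check.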

A workflow might look something like this:

[Workflow diagram: Continuous Delivery process]

Source (modified): https://en.wikipedia.org/wiki/Continuous_delivery#/media/File:Continuous_Delivery_process_diagram.png
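In code, the stages of such a pipeline could be expressed as an ordered sequence in which a failure at any stage stops the promotion. The stage names and the runner below are a hypothetical sketch, not a real pipeline definition.

```python
# Hypothetical sketch of a staged delivery pipeline: each stage must
# succeed before the candidate is promoted to the next one.

PIPELINE = [
    "unit-tests",
    "integration-tests",
    "functional-tests",
    "customer-site-validation",   # on-premises verification before merge
    "pre-production-update",
]

def run_pipeline(run_stage):
    """run_stage(name) -> bool; returns the first failing stage, or None."""
    for stage in PIPELINE:
        if not run_stage(stage):
            return stage          # stop: never promote past a failure
    return None                   # all stages passed; candidate is releasable

# Example: every stage passes except the functional tests.
failed = run_pipeline(lambda stage: stage != "functional-tests")
```

The key property mirrored from the diagram is that customer-site validation sits inside the gate, before any merge or pre-production push, rather than after release.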

This type of workflow could integrate well with existing tools like Red Hat Satellite. Updates would still be provided as usual, but additional options to test upgrades, leveraging the capabilities of the cloud, would be made available. This would give system administrators an added level of certainty before deploying packages to existing servers, including logs to help troubleshoot, should anything go wrong, before pushing to production environments.

Red Hat is committed to delivering a better and smoother upgrade experience for our customers and partners. While there are many questions that remain to be answered, notably around security or proprietary code, there is no doubt in my mind that this is the way forward for software. Automation has to take over the busy work of testing and upgrading to free up critical IT staff members to spend more time delivering features to their customers or users.