The IT management practice of bundling large batches of IT processes together to leverage economies of scale has revolutionized the ability to get substantive amounts of work completed. It is one of those rare practices that actually enables IT professionals to do more with less, like a school bus for patch releases. But what if the 40 students go to 20 different schools? And what if they all need to be at school at exactly the same time? That’s not unlike an Oracle upgrade when the same database is accessed by applications on Unix, Windows, and Linux servers, and all the upgrades need to be done during the Sunday morning maintenance window.

Large batch processing introduces its own problems: higher transaction costs between processes, longer re-run times when a process fails, and, most importantly, difficulty with triage and problem resolution when an incident occurs. The modern CIO is expected to maintain the appearance of a homogeneous and stable environment, but in reality may be managing applications that straddle multiple platforms, share databases, and require different levels of security. Additionally, the enterprise most likely includes both new and legacy systems using multiple versions of the same software. The individual offerings in the operations service catalog can no longer be self-sufficient and isolated; recent projects have proven the importance of addressing IT process improvements through the use of lean and effective methods.

Seeing Beyond the Target

The growing reach and complexity of an agency’s IT infrastructure create a difficult management challenge for CIO offices across the federal government. The urgency of the agency’s mission may create an unfulfillable requirement to attain rapid response times without sacrificing reliability, all on a reduced budget. This results in heroics and reactive solutions rather than long-term fixes.

In a homogeneous system, processing upgrades and patches in large bundles makes sense: the organization can get more done in less time. However, in a complex system (as described above) we need to closely examine what is efficient, and what is “fit for purpose.” For example, Microsoft releases a patch bundle for its servers every month. It might be efficient to install all of the patches at one time. And it might not cause any problems. But in many complex systems, doing this will cause something to break, even if the bundle is installed in record time. It is as if all the students arrived on time and safely, but at the wrong school.

In sniper training, Navy SEALs learn a rule that is surprisingly similar to a critical practice in patch management. The rule (also a fundamental of firearm safety) is to always be aware of your target, and of what’s behind your target. Similarly, IT managers and patch management team members must look beyond what is being patched (the target) to see if, in hitting the target, they may accidentally hurt something else. Otherwise, the time and expense of analysis and rework will strip away the benefits of a large patch bundle.
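
One way to picture "what's behind the target" is a walk over a dependency map. The sketch below is illustrative only: the component names and the DEPENDENTS map are hypothetical, and a real environment would draw this information from its own configuration records.

```python
from collections import deque

# Hypothetical dependency map: each key is a component, and its value lists
# the components that depend on it directly. All names are illustrative.
DEPENDENTS = {
    "oracle-db": ["billing-app", "hr-app"],
    "billing-app": ["reporting-portal"],
    "hr-app": [],
    "reporting-portal": [],
    "mail-server": [],
}


def affected_by(target, dependents):
    """Return everything downstream of `target`, in breadth-first order."""
    seen = {target}
    queue = deque([target])
    downstream = []
    while queue:
        current = queue.popleft()
        for child in dependents.get(current, []):
            if child not in seen:
                seen.add(child)
                downstream.append(child)
                queue.append(child)
    return downstream


# Everything that sits "behind" a patch to the shared database.
print(affected_by("oracle-db", DEPENDENTS))
```

In this made-up map, a patch aimed at the database also reaches both applications and the reporting portal, while a patch to the mail server reaches nothing else.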

For most IT managers, some of the most critical risks tracked are related to keeping their mission-critical systems available, which makes effective incident management and timely problem resolution processes even more important. The growth of an organization and its responsibilities, as well as the need for new technology, creates a constant need for change in the IT services enclave. Many practical IT managers will admit incidents are inevitable because change to the IT infrastructure that supports an ever-changing organization is inevitable. Studies show that the vast majority of incidents in an IT environment are caused by poorly implemented IT change.

Size Matters

Given the inevitability of change, organizations can strive to reduce their batch sizes to an optimal point, since smaller batches offer a more agile environment that is flexible to change and easier to monitor from all levels of the organization. Additionally, as batch size is reduced, there is less transaction cost associated with re-processing jobs and troubleshooting broken or failed processes.
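
The idea of an "optimal point" can be illustrated with a toy cost model. Everything in the sketch below is an assumption made for illustration: a fixed overhead per maintenance window that the batch shares, and a troubleshooting cost that grows with the number of patch pairs that could interact. The numbers are invented, not measurements.

```python
def cost_per_patch(batch_size, window_overhead, triage_per_pair):
    """Average cost per patch under a deliberately simple, assumed model.

    window_overhead: fixed cost of one maintenance window, shared by the batch.
    triage_per_pair: assumed troubleshooting cost for each pair of patches
                     that could interact within the batch.
    """
    shared_overhead = window_overhead / batch_size
    # n patches form n*(n-1)/2 pairs, so each patch carries (n-1)/2 of them.
    triage = triage_per_pair * (batch_size - 1) / 2
    return shared_overhead + triage


def best_batch_size(max_size, window_overhead, triage_per_pair):
    """Return the batch size from 1..max_size with the lowest cost per patch."""
    return min(
        range(1, max_size + 1),
        key=lambda n: cost_per_patch(n, window_overhead, triage_per_pair),
    )


# Illustrative numbers only (hours of effort); not measurements.
for n in (1, 4, 8, 16, 40):
    print(n, round(cost_per_patch(n, window_overhead=8, triage_per_pair=0.25), 2))

print("lowest cost at batch size", best_batch_size(40, 8, 0.25))
```

With these made-up inputs the cost per patch falls as the window overhead is shared, then rises as triage effort dominates, so the lowest point sits at a batch of 8 rather than at either extreme. A real enterprise would need its own cost figures and very likely a different model.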

But what is the right size? Well, that depends on a number of factors and no two enterprises are exactly the same. In a well-organized patch management process, the bundle size will be determined by the patch management team. Some of the factors they will consider are:

  • What platforms are affected?
  • What is the timeframe?
  • Can we accept system downtime?
  • What is the mix of old and new technology?
  • Do we have known compatibility issues?
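
Two of these factors, platform and known compatibility issues, lend themselves to a simple bundling rule. The sketch below is one possible approach, not a prescribed method: the Patch record, the patch names, and the size cap are all hypothetical, and the remaining factors (timeframe, downtime, technology mix) are left out for brevity.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Patch:
    name: str
    platform: str
    known_conflict: bool = False  # flagged by the patch management team


def plan_bundles(patches, max_bundle_size):
    """Group patches by platform, cap bundle size, isolate risky patches."""
    by_platform = defaultdict(list)
    bundles = []
    for patch in patches:
        if patch.known_conflict:
            # A patch with a known compatibility issue ships alone so that
            # any incident it causes is easy to trace.
            bundles.append([patch])
        else:
            by_platform[patch.platform].append(patch)

    for platform_patches in by_platform.values():
        for start in range(0, len(platform_patches), max_bundle_size):
            bundles.append(platform_patches[start:start + max_bundle_size])
    return bundles


patches = [
    Patch("win-001", "windows"),
    Patch("win-002", "windows"),
    Patch("win-003", "windows", known_conflict=True),
    Patch("lin-001", "linux"),
    Patch("win-004", "windows"),
]

for bundle in plan_bundles(patches, max_bundle_size=2):
    print([p.name for p in bundle])
```

Here the flagged patch is installed on its own, and the rest are split into per-platform bundles of at most two, so a failure points to a short list of suspects.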

Obviously there are many other factors to consider, because the patch management process is related to other system management processes. Along with optimizing batch sizes, organizations should also strive to integrate and optimize these core IT service management processes:

  • Asset management
  • Incident management
  • Change management
  • Problem management
  • Configuration management
  • Release management

When these processes are integrated, the current configuration information for an IT asset is readily available to the incident management process to expedite problem resolution. To maintain an IT infrastructure that is able to handle the increasing demand for change and response, IT managers must take advantage of industry frameworks (such as ITIL), and build core processes that integrate well with each other and are able to provide timely and accurate information about their IT resources.
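
As a rough illustration of configuration information feeding incident management, the sketch below looks up the changes made to an asset shortly before an incident. The asset name, the record layout, and the seven-day window are assumptions for the example; an actual configuration management database would have its own schema and interface.

```python
from datetime import datetime, timedelta

# Hypothetical configuration records keyed by asset name. In practice these
# would come from a configuration management database.
CONFIG_RECORDS = {
    "db-server-01": {
        "os": "Linux",
        "changes": [
            ("2024-03-03 06:00", "Applied database patch bundle"),
            ("2024-01-14 06:00", "Increased disk allocation"),
        ],
    },
}


def recent_changes(asset, incident_time, records, window_days=7):
    """Return changes made to `asset` in the days before an incident."""
    record = records.get(asset)
    if record is None:
        return []
    cutoff = incident_time - timedelta(days=window_days)
    found = []
    for stamp, description in record["changes"]:
        changed_at = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
        if cutoff <= changed_at <= incident_time:
            found.append(description)
    return found


incident_time = datetime(2024, 3, 4, 9, 30)
print(recent_changes("db-server-01", incident_time, CONFIG_RECORDS))
```

For the sample incident, only the patch bundle applied the day before is returned, which gives the incident team an immediate first suspect.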

Some of the benefits this approach will provide include:

  • Superior scalability across the operations department;
  • Increased efficiencies through knowledge management;
  • Improved communication between all parties;
  • Clearly defined roles for all partners;
  • Less reliance on individuals and more on the problem management process;
  • Improved operations from other key aspects of ITIL, such as test management; and
  • Help desk access to knowledge management resources, reducing reliance on top-level support.

Taking the time to find the right batch size may seem like extra preparation and a waste of time, but if you can optimize the size of your batch bundles you will identify problems earlier, solve them faster, and not be left waiting for the bus.