Thanksgiving is as good a time as any to acknowledge the fact that some IT departments, when it comes to being able to restore their organizations’ data in the event of an outage, are real turkeys.
OK, maybe that’s too harsh. Let’s just say that, in my experience and that of other consultants at MHA, many IT departments have a lot of room for improvement when it comes to their business-continuity capability.
Obviously, it’s not very helpful to make negative generalizations without giving specific insights and tips on how to improve, so in this blog post on the “5 Biggest IT Management Mistakes” we’ll point out the main problems we see and also give you tips on how to address them.
Before we begin, it might be worth reminding you of the stakes. These are high. IT management mistakes create processing inefficiencies and delays in the completion of deliverables. They eat up organizational resources that could be used to execute new projects, and they have negative impacts on business continuity and recoverability. In short, they cost the organization money, restrict its growth, and limit its ability to capitalize on new opportunities.
Furthermore, these management mistakes crop up both at the operational level and the project level—though as we will see, one of these levels is often slighted in favor of the other, to the detriment of the organization’s resilience and recoverability. (Bonus points to any reader who can guess which level, operational or project, tends to get the short end of the resource stick.)
One last thing before we dive in. I mention it because it’s something I’ve noticed over the years, have always found interesting, and believe has a definite bearing on how well an IT department is managed and how resilient it is.
In many if not most IT environments, the most technically capable people tend to move into management. However, the roles of technical person and manager call on two very different skill sets. The skills required to manage a team and a program are administrative and interpersonal, rather than strictly technical. We see a lot of departments that are led by technical wizards who struggle with the managerial aspects of the job. If you think this might describe you, you might consider bringing in someone with strong administrative skills to support you in this area.
This is mostly a people issue. What we see over and over is that decisions about who does what are based on criteria that have little or nothing to do with who would be best at accomplishing a given task. Technical wizards with weak managerial skills are put in the role of manager, or the technically strongest people are put on the project teams, leaving the operations team begging for talent.
Think carefully about the strengths and weaknesses of your team members. Use rational criteria in assigning people to different roles.
A lot of operational outages happen when changes are implemented without proper testing, validation, and review. At the project level, testing is rarely comprehensive enough. The process used for identifying use cases and test scenarios does not drive exception-based testing, which checks how the system behaves when inputs are bad or something fails. It’s typically all positive testing, which only confirms that things work when everything goes right.
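To make the difference between positive and exception-based testing concrete, here is a minimal sketch in Python. The function `parse_quantity` and the helper `rejects` are hypothetical, invented purely for illustration; the point is only that the bad-input cases get tested alongside the happy path.

```python
def parse_quantity(raw: str) -> int:
    """Hypothetical function under test: parse an order quantity from text."""
    value = int(raw.strip())  # int() raises ValueError on non-numeric input
    if value <= 0:
        raise ValueError(f"quantity must be positive, got {value}")
    return value


def rejects(raw: str) -> bool:
    """Return True if parse_quantity refuses the input with a ValueError."""
    try:
        parse_quantity(raw)
    except ValueError:
        return True
    return False


# Positive test: valid input produces the expected result.
assert parse_quantity(" 3 ") == 3

# Exception-based tests: bad input must be rejected, not silently accepted.
for bad in ["", "abc", "0", "-2", "1.5"]:
    assert rejects(bad), f"expected {bad!r} to be rejected"
```

A test plan that stops after the first assertion is the "all positive testing" described above. The loop is where the outage-causing cases tend to show up.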
Make sure you allow enough time for testing. Do small tests along the way to make sure the process is effective. Validate the results of the process. In the case of a backup, for example, show that the backup is actually being monitored and that it actually works. Show that you could actually do a restore. Pick a small data set and restore it. On projects, do a post-project review. Assess how many changes you had to make to the project because you didn’t have the requirements defined up front.
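The "pick a small data set and restore it" check can be automated. The sketch below is one possible approach, not a prescription: it compares SHA-256 checksums of the original files against the restored copies. The function names are hypothetical, and the zip archive created with Python's `shutil` is only a stand-in for whatever backup and restore tooling you actually use.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def checksum_tree(root: Path) -> dict:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            name = path.relative_to(root).as_posix()
            digests[name] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def verify_restore(source: Path, restored: Path) -> list:
    """Return the relative paths that are missing or differ after a restore."""
    expected = checksum_tree(source)
    actual = checksum_tree(restored)
    return sorted(name for name, digest in expected.items()
                  if actual.get(name) != digest)


def demo_restore_test() -> list:
    """Back up a tiny sample data set, restore it, and report any problems."""
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp)
        source = base / "source"
        source.mkdir()
        (source / "orders.csv").write_text("id,total\n1,9.99\n")
        (source / "notes.txt").write_text("sample data set\n")

        # Stand-in for the real backup job: archive the sample data.
        archive = shutil.make_archive(str(base / "backup"), "zip", root_dir=source)

        # Stand-in for the real restore: unpack into a scratch location.
        restored = base / "restored"
        restored.mkdir()
        shutil.unpack_archive(archive, restored)

        return verify_restore(source, restored)


if __name__ == "__main__":
    problems = demo_restore_test()
    print("restore verified" if not problems else f"restore FAILED for: {problems}")
```

An empty list from `verify_restore` means every file in the sample came back intact. Run on a schedule against a real backup, a check along these lines turns "we think the backups work" into something you can show.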
In today’s business world, the management of the environment (the overall processing) is one of the areas where we don’t do a very good job. We tend to focus on projects because they are fun and new, and the day-to-day operational management of the environment gets pushed to the side. Often the most technically capable people are assigned to work on projects, and the operational side is left with an underpowered staff. But from the tech perspective, it’s the operational side that is the lifeblood of the business. When operations go down, they suddenly gain a high and unwanted level of visibility, because business functions stop.
Be proactive in thinking about the risks to your current processing environment, whether in capacity, staffing, or a lack of solid processes. Find problems before they find you, and fix them.
A lot of times IT departments live in the moment, and not in a good way. They let things go and let things go and only take steps when something breaks.
Practice more management and less crisis management. It was true in Ben Franklin’s day and it is true now: an ounce of prevention is worth a pound of cure. This is one of those things that’s simple but not easy, because it involves behavioral change. Consider putting in five minutes every day or 30 minutes every week to conduct an informal risk assessment for your project or area of responsibility. Block out time on your calendar, or put a reminder on your phone so that you think about it as you drive home. Get something in your routine that causes you to think proactively.
This mistake is about communication. Specifically, it is about bad communication. What is bad about the communication in most IT environments comes down to two things, neither of which is the frequency of communication. Rather, what we see is that communication is not in understandable terms and, especially, that it is not honest. Typically what happens is, people are asked how things are going and they say, “It’s going great, it’s going to work, no problem.” They sugarcoat the situation, concealing problems and setting the stage for a very tough reckoning down the road.
True and honest communication takes and demonstrates courage. Bad news does not get better with age. Giving bad news is never easy, but it is always respected. And when you give a heads-up about a problem or potential problem, have a proposed solution to offer. That will help everyone move that much faster to the question of what can be done to get back on track.
So there you have it, the five biggest IT management mistakes, as judged by me and the other consultants at MHA, based on our collective decades of experience in the fields of IT and business continuity. We encourage you to use this brief guide to address these common mistakes and help your organization identify cost savings and capitalize on new opportunities.