Thursday, August 30, 2018

Business Continuity vs. Disaster Recovery; What’s the Difference?


Picture courtesy: https://msinfokom.com/

A lot of people use the terms disaster recovery (DR) and business continuity (BC) plans interchangeably, but technically there is a difference. A disaster recovery plan is more reactive while a business continuity plan is more proactive.

Business Continuity Planning (BCP) and Disaster Recovery (DR) are used together so often that people begin to forget that there is a difference between the two. The idea of this post is to try to define these terms from a practical point of view.

In day-to-day lingo, BCP refers to how a business should plan to continue operating in case of a disaster. DR refers to how IT (information technology) should recover in case of a disaster. It may sound a bit weird, but after the first 500 times, you sort of start getting used to it and start preaching it as well. I think these definitions must have started off as a practical joke played by mischievous IT staff on their organization, or maybe it was some consultant trying to wiggle out of a situation where he forgot to make a BCP for IT.

If you think practically, a BCP is a plan that allows a business to plan in advance what it needs to do to ensure that its key products and services continue to be delivered (technicality: at a predefined level) in case of a disaster, while a DR allows a business to plan what needs to be done immediately after a disaster to recover from the event. So, a BCP tells your business the steps to be taken to continue its key products and services, while a DR tells your business the steps to be taken to recover after an incident.

Your impact analysis, your business continuity strategy and business continuity plans are a part of BCP. Your incident response, emergency response, damage assessment, evacuation plans, etc. are all a part of DR. It makes sense to divide your planning into two parts:
  1. Planning to continue your business operations, and
  2. Planning to recover from disaster situations

If you use these definitions of BCP and DR, you would probably end up having a practical and effective BCMS for your organisation.

Let us assume a practical scenario in an organization, where the IT team is the one that is really worried about data backup and recovery. No one else gives two hoots. Roughly translated, it means the others will run to IT in case of disaster and say: “What do you mean you cannot recover it? I assumed it was being backed up.”

So, the IT team takes it upon itself to build a business continuity plan. The problem, however, is that they will receive no data from the business about esoteric terms like ‘criticality of activities supporting key products and services’. They have to do the next best thing: ask the business “What will happen if this application crashes and is not recoverable?” This becomes the application’s RTO (recovery time objective). They ask “What will happen if we lose data for 1 min, 1 hour, 1 day, 1 week, 1 year, etc.?” progressively (until they see the business users’ pupils dilate). This becomes their RPO (recovery point objective). Armed with this data, they create a plan to recover these key applications and services (provided and supported by IT) within the RTO and RPO. Calling this a business continuity plan is not the right thing to do, hence IT calls it a DR plan. It is then free to make statements like “We have a DR, but not a BCP” to the half-read consultant and get away with showing this document as a BCP to the untrained auditor (believe me, there are plenty of those around). The auditor sees some calculation of RTO and RPO and accepts the plan as a BCP, thus obfuscating the definitions further.
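To make the RTO/RPO exercise concrete, here is a rough sketch of how IT might compare the answers it got from the business with what it can actually deliver. It is only an illustration: the `AppRecoveryTarget` class, its field names and the sample figures are all made up for this example, not taken from any standard or tool.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class AppRecoveryTarget:
    """Recovery targets for one application (all names here are hypothetical)."""
    name: str
    rto: timedelta              # longest outage the business says it can tolerate
    rpo: timedelta              # largest window of data loss the business can tolerate
    backup_interval: timedelta  # how often IT actually backs the data up
    restore_time: timedelta     # how long a restore takes, e.g. measured in a DR test


def dr_gaps(app: AppRecoveryTarget) -> list[str]:
    """Return human-readable gaps between the targets and what IT can deliver."""
    gaps = []
    # Worst case, everything written since the last backup is lost.
    if app.backup_interval > app.rpo:
        gaps.append(f"{app.name}: backup every {app.backup_interval} exceeds RPO {app.rpo}")
    # The restore itself must fit inside the tolerated outage.
    if app.restore_time > app.rto:
        gaps.append(f"{app.name}: restore takes {app.restore_time}, RTO is {app.rto}")
    return gaps


# Assumed sample: nightly backups, but the business tolerates only 1 hour of data loss.
payroll = AppRecoveryTarget(
    name="payroll",
    rto=timedelta(hours=4),
    rpo=timedelta(hours=1),
    backup_interval=timedelta(hours=24),
    restore_time=timedelta(hours=2),
)
for gap in dr_gaps(payroll):
    print(gap)
```

In this sample the restore fits within the RTO, but the nightly backup cannot meet a one-hour RPO, which is exactly the kind of gap a minimalist DR plan should surface.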

The Bottom Line
Business Continuity is the first defense against a disaster threatening the proper function of business. However, Disaster Recovery is a must for any organization that cannot function without its vital business data. Although Disaster Recovery is just a smaller part of the larger Business Continuity umbrella, enterprise organizations would be wise to employ both strategies for full protection. Disaster Recovery techniques are more reactive in nature than continuity tools, which are typically used to maintain smooth business operations.

Practical advice
Plan in two parts as above: a plan to continue business processes (including support processes like IT) and a plan to respond to and recover from incidents (emergency management), and you should be good. If you are in IT and the business is not responding to requests for a BCMS, use the tactic shown above to build a minimalist, practical plan.

Tuesday, May 29, 2018

CMDB Data Inaccuracies – Why is it so bad?





Picture courtesy: https://www.shutterstock.com/image-vector/word-cloud-configuration-management-related-items-697025140

In this blog, I am going to examine four specific areas where poor data quality in the CMDB can challenge the success and ROI of your ITSM endeavours.

What is CMDB?
A Configuration Management Database (CMDB) is the central repository of information describing all of the IT infrastructure components used to deliver services to a business. In other words, a CMDB is a database that contains all relevant information about the hardware and software components used in an organization's IT services and the relationships between those components. A CMDB provides an organized view of configuration data and a means of examining that data from any desired perspective.

What is a CI?
A Configuration Item (CI) means any component that is managed to deliver an IT service. CIs include IT Services, hardware, software, buildings, people, and links to documentation such as SLAs. CIs also include databases; applications; operating systems; servers, laptops and other types of computers; routers, firewalls, switches and other types of network components; phones and other communication and handheld devices; network attached storage and other types of storage; point of sale devices, printers and other electronic devices; data centre environmentals (APC) or any components or network-based elements that are monitored by the Software whether physical, virtual (including virtual guests) or in a cloud environment. Network-based elements will be counted as one CI for each IP address; provided, however, that any such elements or components that are included as an attribute, event definition or visual control for a CI will be counted as a separate CI.

DIRTY DATA

While a CMDB is most often associated with IT Service Management solutions and ITIL requirements, the key to creating a solid and valuable CMDB is having it populated with current, complete and accurate information about the IT assets and users across the network. We’ve all heard the saying before: “Garbage in, garbage out.” Poor quality input will always result in poor quality output. This adage is certainly true when it comes to the cleanliness of your CMDB. It’s nearly impossible for an IT organization to make informed decisions, deliver services efficiently, and provide a superior user experience without high quality data flowing through its systems.
It’s no surprise then that IT data accuracy and the risk of suspect data remains a key concern among IT leaders—especially when the stakes are sky-high. From an ITAM/ITSM perspective, poor quality data can have major impacts on costs, efficiency, governance, and security.
Below are the four specific areas where poor data quality can challenge the success and ROI of your ITSM endeavours:



Incident/Problem Management
The existence of poor quality (or outright inaccurate) data in your CMDB will lead to incorrect classification. I am not referring to how everyone classifies a category for their incidents and problems in their ITSM tool. I am referring to the second layer of classification, which outlines what is affected by a particular incident or problem. When your CMDB is filled with outdated data and misinformation, you have limited visibility of the impact, and it’s extremely difficult to link issues together. Conversely, when you have a CMDB with accurate data and links to additional information, it’s easy to understand what the impact of an incident or problem is on the various systems in your organization.
Potential errors include misleading relationships or poor escalation due to Configuration Items (CIs) not being linked to the correct SLA. This presents itself in the form of an inability to prioritize work effort based on actual business impact.
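As a rough illustration of that second layer of classification, the sketch below walks CI relationships to find everything affected by a failed component. The `DEPENDS_ON` map and the CI names are invented for this example; a real CMDB would supply these relationships, and the answer is only as good as the data behind them.

```python
from collections import deque

# Hypothetical "depends on" relationships: each key relies on the CIs listed.
DEPENDS_ON = {
    "payroll-service": ["payroll-app"],
    "payroll-app": ["db-server-01", "app-server-02"],
    "intranet": ["app-server-02"],
    "db-server-01": [],
    "app-server-02": [],
}


def impacted_by(failed_ci: str, depends_on: dict[str, list[str]]) -> set[str]:
    """Return every CI that directly or indirectly depends on failed_ci."""
    # Invert the map so we can walk from a component up to whatever relies on it.
    dependents: dict[str, list[str]] = {}
    for ci, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(ci)

    impacted: set[str] = set()
    queue = deque([failed_ci])
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, []):
            if parent not in impacted:
                impacted.add(parent)
                queue.append(parent)
    return impacted


print(sorted(impacted_by("db-server-01", DEPENDS_ON)))
# → ['payroll-app', 'payroll-service']
```

If the relationship between `payroll-app` and `db-server-01` were missing or stale, the same query would report no impact at all, which is how dirty data leads to wrong prioritization.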

Change Management
Like Incident and Problem Management, your teams will have limited visibility into the impact of your changes without high quality data in your CMDB. For example, perhaps you want to make a change and there’s a possibility of conflict with someone else’s change. With an accurate CMDB, you’ll be able to see the impacts of your change and whether or not you can make both changes together. It will also make the tracking of changes on equipment easier. You may want to know how many changes you’ve made to a specific piece of equipment—for example, if it’s faulty and you want to see how many repairs were made over its lifecycle. It’s crucial in this case to have this information tracked by your CMDB.
Failed Requests for Change (RFCs) – These can be caused by the CMDB containing out of date or missing version information. For example, you may arrange a maintenance outage to update a server’s software version, only to find that it is retired. This wastes time and disrupts service availability.
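The two checks described above (conflicting changes, and changes planned against retired equipment) could look something like the sketch below. The `Change` structure, the status values and the sample records are assumptions made for this illustration, not the data model of any particular ITSM tool.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Change:
    """A planned change against one CI (hypothetical structure)."""
    change_id: str
    ci: str
    start: datetime
    end: datetime


def overlapping_changes(changes: list[Change]) -> list[tuple[str, str]]:
    """Return pairs of change IDs that touch the same CI at overlapping times."""
    conflicts = []
    for i, a in enumerate(changes):
        for b in changes[i + 1:]:
            # Two windows overlap when each one starts before the other ends.
            if a.ci == b.ci and a.start < b.end and b.start < a.end:
                conflicts.append((a.change_id, b.change_id))
    return conflicts


def changes_on_retired_cis(changes: list[Change], ci_status: dict[str, str]) -> list[str]:
    """Return IDs of changes whose target CI is recorded as retired."""
    return [c.change_id for c in changes if ci_status.get(c.ci) == "retired"]


# Assumed CMDB status data and planned changes.
ci_status = {"app-server-02": "live", "db-server-01": "retired"}
planned = [
    Change("CHG100", "app-server-02", datetime(2018, 6, 2, 22), datetime(2018, 6, 3, 2)),
    Change("CHG101", "app-server-02", datetime(2018, 6, 3, 1), datetime(2018, 6, 3, 4)),
    Change("CHG102", "db-server-01", datetime(2018, 6, 9, 22), datetime(2018, 6, 10, 0)),
]
print(overlapping_changes(planned))                # → [('CHG100', 'CHG101')]
print(changes_on_retired_cis(planned, ci_status))  # → ['CHG102']
```

Both checks depend entirely on the CMDB being right: if `db-server-01` were still marked live, the wasted maintenance outage would go unnoticed until the night of the change.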

Asset management
Poor quality data is a huge issue when it comes to IT Asset Management for a number of reasons but especially when it comes to financial impacts, asset refreshes, usage misunderstandings, and equipment losses. A CMDB with incorrect data can’t provide you with the accurate financial impact from your license usage. Additionally, not knowing what assets you have in the field or how they are impacting each other will become a huge issue when it’s time to replace equipment. Misunderstanding how your equipment is used or who’s using what and when has a big impact when it comes to refreshes. Lastly, it’s shockingly easy to lose track of expensive equipment when you’re lacking quality data and a proper discovery tool.

Enterprise
There are significant organizational impacts related to poor data quality as well. Unplanned downtime can occur when you are unintentionally making conflicting changes. And that downtime often results in loss of revenue whether directly or indirectly. In the manufacturing world, for example, one bad change could turn into a nightmare in terms of production. Consequently, unplanned downtime that impacts the business and revenue naturally fosters a bad perception of IT.


A MATTER OF TRUST

The concept of the CMDB is that it should be a single source of truth that is accessed by multiple systems and functions to power effective processes and decision-making across IT and business functions.
However, when CMDB data is bad or dirty, trust in the accuracy and value of the CMDB is quickly eroded, often leading to the failures described above. Unfortunately, once SAM or ITSM provides dirty data to another department or senior management, it is hard for the recipient to trust future data. Even if the data quality is improved, it will still be received with skepticism and an element of distrust.
The value of the data, and of the ITSM and SAM functions, to the business decreases. Inconsistency in CMDB data will result in users being unwilling to use the data, or it will require manual effort and manipulation that will also lead to poor data.
Having unreliable data can end up costing the organization a lot more time and money than necessary. Organizations can end up throwing money and resources at trying to improve the quality of the data, instead of trying to find the root cause.

Monday, May 28, 2018

Common mistakes while implementing Knowledge Management (and Fixes)

Picture courtesy: https://www.natfas.com/knowledge-management-3/
In this blog, I will introduce knowledge management, highlight the mistakes that are frequently made, and explain how to fix them.

Current ITSM trends place a lot of focus on Knowledge Management, Self-Service, and Self-Help. With this comes the uncertainty and complexity of implementing an effective and well-designed knowledge management process. It is challenging, but that doesn’t mean you can shy away from it.

According to findings from “The State of Knowledge Management: 2016-17 KMWorld Survey”, knowledge management is gaining momentum and encouragement. More than one-third of those surveyed, 38%, said they don’t have any knowledge management structure in place or are sitting in the “exploration stage.”

If you’re in the early stages of planning for your knowledge management system, or maybe you’ve already tried and failed at your knowledge management attempt, this blog is for you.

What is Knowledge Management?
Knowledge management is the systematic management of an organization's knowledge assets for the purpose of creating value and meeting tactical & strategic requirements; it consists of the initiatives, processes, strategies, and systems that sustain and enhance the storage, assessment, sharing, refinement, and creation of knowledge.
Knowledge management (KM) therefore implies a strong tie to organizational goals and strategy, and it involves the management of knowledge that is useful for some purpose and which creates value for the organization.

What are the different Knowledge Types?
According to Knowledge Management Tools, knowledge falls into three categories:
  • Explicit Knowledge: Formal and systematic, and usually in the form of written documents, it’s easy to communicate and store, such as information found in documentation, books, instruction manuals and on the web. It can also be in an audio or visual form such as instructional diagrams or videos.
  • Tacit Knowledge: Typically, the information held in people’s heads. The trick is to either enable them to share this information via tools and processes or to connect these people with those needing the information.
  • Embedded Knowledge: Information stored within policies, procedures, legal documentation and other unstructured data (such as social media). This can require observation, insight and analytics tools to identify this knowledge.


The Benefits of Knowledge Management: 
Developing and executing a knowledge management strategy can bring many benefits to the organisation as a whole:
  • Ability to create a “trusted source” or a “single source of truth” for key information that needs to be shared either internally or with your customer base.
  • Improvement in KPIs such as Average ticket Handle Time, First Contact Resolution and Customer Satisfaction.
  • Lowering the Time to Competency for new employees.
  • Lowering overall training costs.
  • More accurate call logging, and reduced after-call work (ACW).
  • Enabling wider sharing of information across the whole organisation.
  • Operational efficiencies from communicating key information more quickly.
  • Enabling emerging technologies such as AI, chatbots, robotic process automation.


Mistakes while implementing Knowledge Management:
Knowledge management can be hard. There are many technology companies out there which claim to have all the answers, but – trust me – there is so much more to the discipline than deploying a platform.
Often, organisations leap into knowledge initiatives without proper planning or strategy. So, what are the common mistakes?
  • Doing things the traditional way: If you want to succeed at knowledge management, you must cater to your users and the style of experience they want—more knowledge at their fingertips, that’s readily available and accessible with just a few keystrokes. Many companies keep using their same knowledge strategy of gathering as much information as they can, but in the end, it’s rarely updated and barely used. This method is not the best if you want to succeed in the long run.
  • Having a thought process of "build it and they will come": Many organizations think knowledge management is simply building a massive repository of knowledge articles, and that this act alone will encourage people to use it. However, nothing could be further from the truth. Creating a knowledge base is not the end. Once developed, it must be regularly updated, easily and readily accessible, and its usage needs to be reinforced company-wide.
  • Ignoring your data: Successful knowledge management implementations use as much data as possible to determine what knowledge is required. Data from the ITSM tool is an excellent source for understanding what kinds of enquiries are being handled. In fact, ticket data is a great way of initially setting up a structure for your knowledge plan. Other sources that are vital to keeping knowledge fresh are social media feeds, company-owned and third-party forums, staff ideas, customer input and general web search. Each of these sources will have a different level of trust that you may wish to apply – you don’t want people randomly pasting information into your knowledge systems without some form of diligence in place.
  • Not measuring and monitoring for success: After implementation, the real work comes in tracking and measuring how the knowledge management environment is being used. Some key performance indicators should include data on:

      • Article usage
      • Article satisfaction
      • Navigation times
      • Navigation flows
      • Ease of use

Capture as much data as you can on how the knowledge is used and consumed. A good self-help tool should assist you in learning which processes and navigation function best, what articles users find valuable, how deep users dig to get the information they need, and who is actually using the knowledge base. Furthermore, use this data to justify increases in resources, funding, and additional tools to maintain the success of your company’s knowledge investment.
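As a small illustration of two of the indicators listed above, article usage and article satisfaction, the sketch below derives both from a simple event log. The log format, the article IDs and the `article_kpis` function are assumptions made for this example; your self-help tool will have its own export format.

```python
from collections import Counter, defaultdict

# Hypothetical event log: (article_id, was_it_helpful).
# None means the reader viewed the article but left no rating.
views = [
    ("KB001", True),
    ("KB001", False),
    ("KB001", None),
    ("KB002", True),
    ("KB003", None),
]


def article_kpis(events):
    """Return view counts and the share of 'helpful' ratings per article."""
    usage = Counter(article_id for article_id, _ in events)

    ratings = defaultdict(list)
    for article_id, helpful in events:
        if helpful is not None:
            ratings[article_id].append(helpful)

    kpis = {}
    for article_id, count in usage.items():
        rated = ratings.get(article_id, [])
        # Satisfaction is undefined (None) until at least one reader has rated.
        satisfaction = sum(rated) / len(rated) if rated else None
        kpis[article_id] = {"views": count, "satisfaction": satisfaction}
    return kpis


for article_id, kpi in article_kpis(views).items():
    print(article_id, kpi)
```

Keeping "no rating yet" separate from "rated unhelpful" matters: an unrated article is a gap in your data, not evidence that the article is bad.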

What Can Be Done to Fix These Issues?
As mentioned above, knowledge management can be tricky, so here are a few ideas about what to consider on the journey.

Engendering a Knowledge Culture:
  • Identify key influencers and enthusiasts and enable them to become evangelists.
  • Celebrate key milestones and other successes (e.g. most used article for the month).
  • Make knowledge contribution part of employee KPIs.
  • Report on knowledge usage and value and make it a key business metric.
  • Ensure that all departments access the benefits of a well-managed knowledge environment.
  • Have knowledge as an agenda item on management and board conversations.
  • Train employees on the process and the system – and make this training mandatory.

Design Thinking
  • Start with a vision for what knowledge represents in your organisation.
  • Define and communicate a clear strategy to enable this vision.
  • Develop personas to describe the types of people contributing to and consuming your knowledge.
  • Plan the taxonomy and required metadata with these personas in mind.
  • Define the types of reporting you will require to understand the value and effectiveness of knowledge.
  • Continually test, optimise and develop your processes, systems, reporting, training, and communications based on what you learn from usage data.

Creating Insights from Data
  • Monitor system usage to create lots of data about what knowledge is popular.
  • Combine this with disposition data from the Delivery teams, web self-help, chat transcripts and social media to allow you to develop an understanding of what knowledge is effective, and what isn’t.
  • Create a measurement system that allows an understanding of the value of knowledge. This should consider the effort to create and manage the content as well as the usage data.
  • Measure knowledge over time to give an indication as to when knowledge assets start to devalue and need to either be changed or archived.

Curation as a Key Part of the Process
  • Do not just keep pouring more and more information into the system without considering whether the content is popular, effective and valuable.
  • Understand when knowledge has reached the end of its usable life and archive it.
  • Realise that data is the key asset required for curation, and structure it well.
  • Understand that knowledge can be sourced from anywhere, but that doesn’t mean it should be. Identify trusted sources and ensure that you understand the efficacy of the information.
  • Regularly review and modify the taxonomy and metadata based on your market, your customers and the ultimate purpose of your knowledge management.

Friday, May 25, 2018

Incident Management vs Problem Management




Incident Management vs Problem Management: Is there a Difference?

ITIL has been around since the late 1980s. We are currently on version three (v3). There are a lot of books and courses about ITIL, but there’s still real confusion about where Incident management stops and Problem management begins, and the difference between the two. If it were just a terminology issue I wouldn’t be so worried about it, but the reality is that confusion about incident and problem management hurts us all.

Now confusion between two terms and definitions wouldn’t normally be such a big deal, but not being familiar with the differences between these two processes can end up having a huge negative impact on not only your infrastructure, but your business as a whole.
Here are the meanings of each word, according to the definitions used by ITIL, and how these meanings translate into the timeliness of the fix needed:

What is an Incident?
An incident is an event that leads to an unplanned disruption of service. The important part to remember is ‘disruption of service,’ because if an issue does not disrupt service, even if it was unplanned and unexpected, it is not an incident. For example, if a piece of hardware fails after hours when nobody is using the system, it is not an incident, because it did not disrupt service. However, if the same equipment failed during the regular workday, it would be defined as an incident because service was, in fact, disrupted. The IT help desk is often the first to be made aware of an incident, as it is usually the first point of contact for users experiencing issues with the system.

What is Incident Management?

Picture Courtesy: http://www.seriosoft.com/

The main goal of incident management is to resolve the disruption as soon as possible in order to restore service operations. The objective of the Incident Management Life cycle is to restore the service as quickly as possible to meet Service Level Agreements. The process is primarily aimed at the user level.
Because even minor disruptions in service can have a huge impact on the organization, it is necessary to fix incidents immediately. The process of incident management usually includes recording the details of the incident and resolving it.

What is a Problem?
Also according to ITIL, “a problem is a cause of one or more incidents”. The cause is usually unknown at first and is identified from a number of related incidents that share common issues. While problems are not classified as incidents, incidents can raise problems, especially if they may or do happen repeatedly. To refer to our example above, the server that is only used during the day crashing after office hours is a problem because, although it isn’t currently causing a disruption in service, it could happen again and become an incident.

What is Problem Management?
Picture Courtesy: http://www.seriosoft.com/

The goal of problem management is to identify the root cause of the incidents and try to prevent them from happening again. It might take multiple incidents before problem management can have enough data to analyse what is going wrong, but if undertaken correctly, it will help the problem become a “known error” and steps can be put in place to correct it.
Sometimes problem management is referred to as a reactive process that begins only after incidents have occurred. In actuality, problem management should be thought of as a proactive process because its end goal is to identify the problem, fix it, and prevent it from ever happening again. So, you could say the main goal of problem management is to identify the problem, troubleshoot it, document the issue as well as its causes, and then ultimately resolve it. Problem Management deals with solving the underlying cause of one or more incidents. Its emphasis is on resolving the root cause of errors and finding permanent solutions. This process operates at the enterprise level.
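One simple way to spot candidates for problem management is to look for incidents that keep hitting the same component. The sketch below is only an illustration: the incident records, the field names and the threshold of three are assumptions, and real problem identification would also consider categories, symptoms and time windows.

```python
from collections import defaultdict

# Hypothetical incident records, e.g. exported from an ITSM tool.
incidents = [
    {"id": "INC001", "ci": "mail-server-01", "summary": "Mail queue stuck"},
    {"id": "INC002", "ci": "print-server-03", "summary": "Printer offline"},
    {"id": "INC003", "ci": "mail-server-01", "summary": "Mail delayed"},
    {"id": "INC004", "ci": "mail-server-01", "summary": "Cannot send mail"},
]


def problem_candidates(incidents, threshold=3):
    """Group incidents by affected CI and keep CIs with repeated incidents."""
    by_ci = defaultdict(list)
    for incident in incidents:
        by_ci[incident["ci"]].append(incident["id"])
    # A CI with several incidents is worth a root-cause investigation.
    return {ci: ids for ci, ids in by_ci.items() if len(ids) >= threshold}


print(problem_candidates(incidents))
# → {'mail-server-01': ['INC001', 'INC003', 'INC004']}
```

Each of those three incidents may have been resolved quickly on its own, but together they point to one underlying problem that incident management alone would never fix.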

Now, let’s look at an analogy comparing Incident management and Problem management:

Incident management is like a fire-fighter at a house fire: it comes in, immediately fixes the problem, and saves the day. Fire-fighters come to the scene and notice the issue, and work fast to put out the fire as quickly as possible without stopping to question how it started. This is a similar situation for incident management. While it is necessary for incident management to provide fast results and repair issues within the infrastructure, it doesn’t help us find out what ultimately went wrong and why there was an issue in the first place. That’s where problem management comes in.

Problem management is like the detective that comes into the picture after the fact. They weren’t there to put out the flames themselves, but they can still investigate what went wrong, figure out how the fire started, and help educate people to take preventive steps so something similar doesn’t happen again. Problem management is a vital piece of the puzzle as it addresses the root cause of the incidents and proactively prevents them from repeating and potentially causing major issues in the future. If no one takes the time to review incidents and solve the underlying problems, incidents will just continue to happen and potentially increase in seriousness.

Conclusion
Understanding the difference between Incident management and Problem management, and having dedicated managers for each separate scenario, ensures that you are not just putting out fires all day. While immediately fixing problems in the infrastructure with incident management provides temporary relief, it will soon exhaust your resources and employees without finding the root of the problem. Bringing in problem management helps to investigate the cause of the incidents and puts steps in place so they don’t continue to occur. By having a specific manager or team for this process, you will be one step closer to decreasing the rate of incidents in your organization and preventing major outages and service disruptions.

Wednesday, May 23, 2018

Service integration and management (SIAM) – Suddenly why so serious?



Joker Picture courtesy: https://tyrite.deviantart.com/art/Why-so-serious-92162678

There is one thing that links the sudden limelight which SIAM has received in the past years and “The Dark Knight” movie. It’s the Joker’s line: “Why so serious?”
I made this statement for two simple reasons:
  1. Of course, I am a fan of the Joker
  2. Because SIAM is not something new. SIAM is a management approach that has evolved over the last decade and is now rapidly growing in popularity, and of late this has created a lot of curiosity among ITSM professionals. Almost every ITSM professional is now discussing SIAM.

So, in a quest to find out details about SIAM, I started my own research, and after about 3 months I was finally able to write this short article to address some common questions on SIAM.

A brief history of SIAM:
So, it’s like one of those stories of the Kings – Long long ago, not so long ago, nobody knows how long ago – but wait, I think people know how long ago for SIAM.
The term service integration and management, or SIAM, and the concept of SIAM as a management methodology originated around 2005 from within the UK public sector. In 2010, the UK Government published a new information and communications technology (ICT) strategy, which included moving away from large prime supplier contracts to a more flexible approach using multiple service providers and cloud-based solutions.
SIAM interest became global when in 2015 AXELOS published several white papers on SIAM, and in 2016 the SIAM Foundation Architect Group was formed by Scopism – This has been one good source for me. The objective was to bring the experts of the SIAM world together and create a consolidated view of their knowledge and experience.

So what is SIAM?
If you had asked this question around two years ago, you would have received ten different answers from ten different people, and that’s precisely the most interesting thing about SIAM – it has evolved in response to business problems, so each organization has its own take on what SIAM is and the best way to apply it.
In layman’s terms, the answer to the question “What is SIAM?” is in the name: service integration and management, and in particular service integration across multiple providers.
But of late, a number of organizations with SIAM experience have collaborated to develop the SIAM Foundation Body of Knowledge, so we now have the definition below:

“Service integration and management (SIAM) is a management methodology that can be applied in an environment that includes services sourced from a number of service providers.
SIAM has a different level of focus to traditional multi-sourced ecosystems with one customer and multiple suppliers. It provides governance, management, integration, assurance, and coordination to ensure that the customer organization gets maximum value from its service providers.”

OK, so now we have the definition and explanation in simple language, but what is SIAM?
SIAM addresses the need to provide a standardised methodology for integrating and managing multiple service providers and their services. It can enhance the delivery of the end-to-end supply chain, and it provides governance, management, integration, assurance, and coordination to maximise the value received from multiple service providers.
It is not just a methodology for the management of services by a single organisation or governing body. It supports cross-functional, cross-process and cross-provider integration in a complex sourcing environment or ecosystem in which all parties understand their role and responsibilities, are empowered to deliver, and are held accountable for their outcomes. As such, it’s an organisational change that builds collaboration and an end-to-end focus into the core of every stakeholder involved.

That’s great news, but what are the benefits?
In plain English, SIAM helps companies who are struggling to manage their suppliers. Introducing the concept of a “service integrator” gives the company, and the customer, a single point of contact, as shown below.


Source: SIAM Foundation Body of Knowledge, copyright Scopism 2017

The SIAM model provides a single logical entity with accountability for the end to end service delivery, known as the service integrator. The customer organisation has a single management relationship with the service integrator, and the service integrator manages the relationships with the multiple service providers supporting the organisation.
As more and more organizations source services from different service providers, SIAM gives them a structure that allows them to add and remove service providers quickly and efficiently, with contracts, agreements, and a culture that drive the right behaviors from all parties.
If you’re a service management professional (I have had this experience personally), you’ve probably been in the situation where your network supplier is blaming your database supplier who is blaming the applications team for an incident. In a SIAM model, the service integrator coordinates the response and drives a culture of “fix first, argue later.” An incident is just a small example of course; imagine a group of service providers working in an integrated way to support strategic goals.

That sounds awesome!!! That’s what SIAM is all about. Now, how about what SIAM is not?
Like all practices, there are stories around SIAM too. SIAM is a coherent framework, and people need to understand it and how it can help. To understand it better, it’s also worth knowing what SIAM is not.

SIAM is Not –
  1. A replacement for ITIL – There are similarities and overlaps between the basic principles of ITIL and SIAM, but note that SIAM is not meant to take over where ITIL left off. SIAM is unique in offering a structure, culture, principles, and practices for managing a multi-service provider environment, which then allow the use of your framework-of-choice.
  2. A solution to all your issues – issues which people have been waiting for ITIL to resolve, then thought DevOps would take care of, and for which they are now looking for the next best thing.
  3. A new course or methodology, or something which consultants came up with to sell some books, training or certifications – SIAM has actually been around for more than 10 years and there is already a treasure of tried-and-tested SIAM practices. The need to build an agile ecosystem of multiple service providers and utilise best-of-breed, collaboration, and coordination is not going anywhere.


Final Thoughts – Why should an ITSM professional care about SIAM
SIAM has been building momentum for years, and of late it has become a defined set of management practices that you can read, learn from, use, comment on, and help evolve.
If you’re an IT management professional of any flavour, SIAM is definitely an area you need to be aware of, even if it’s just reading a blog or two. SIAM will complement and build on many other management practices like IT service management (ITSM), and show you how to adapt and augment processes in a multi-supplier environment.