Let’s implement ITOA this year!
The New Year has begun and IT managers are back at their desks, fresh from a well-deserved vacation. Top of mind is preparing a to-do list for their departments and organisations. High on many of those lists is implementing ITOA this year!
What is ITOA?
IT Operations Analytics (ITOA) is the discipline of mathematically analysing the IT metrics recorded while an enterprise runs, with the aim of uncovering insights that guide further action.
The IT metrics recorded may come from a variety of sources within the enterprise. The insights and actionable decisions are reached using a range of mathematical techniques, from simple ones like averages and maximums (peaks) to more complex ones involving models of correlation and prediction. The action or decision taken on, say, incident management or capacity management is thus based on solid data and reasoning.
Background
Financial institutions have grown into large enterprises deploying an ever-increasing IT infrastructure. These varied infrastructures generate large volumes of IT metrics daily, and what those metrics show directly impacts business objectives.
Historically, analysis based on these IT metrics has been limited to basic capacity management using spreadsheets, where an individual IT metric's average or peak is measured against its stated capacity. If the usage is above a certain threshold, say 80%, it is flagged for a capacity upgrade. In other cases, the IT metrics have been used after the fact to manually analyse an outage or a degradation in performance.
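The spreadsheet-style check described above can be sketched in a few lines of Python. This is only an illustration: the server names and utilisation samples are made up, the samples are assumed to be already expressed as a percentage of stated capacity, and the 80% threshold is the example value from the text.

```python
from statistics import mean

# Hypothetical samples: utilisation (% of stated capacity) per component
# over one reporting period. Names and values are illustrative only.
usage_samples = {
    "app-server-01": [55.0, 62.5, 71.0, 90.5, 68.0],
    "db-server-01": [35.0, 41.0, 38.5, 44.0, 40.0],
}

THRESHOLD_PCT = 80.0  # the "say 80%" threshold from the text


def capacity_report(samples, threshold=THRESHOLD_PCT):
    """Return the average, the peak and an upgrade flag for each component."""
    report = {}
    for name, values in samples.items():
        avg = mean(values)
        peak = max(values)
        report[name] = {
            "average": avg,
            "peak": peak,
            # Flag the component when either figure crosses the threshold.
            "flag_for_upgrade": avg > threshold or peak > threshold,
        }
    return report


report = capacity_report(usage_samples)
for name, row in report.items():
    flag = "yes" if row["flag_for_upgrade"] else "no"
    print(f"{name}: avg={row['average']:.1f}% peak={row['peak']:.1f}% upgrade={flag}")
```

A check like this looks at each metric in isolation, which is exactly the limitation the ITOA tools described next aim to overcome.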
Analytical tools have now evolved to provide us with insights into this enterprise IT data, determine correlations between multiple metrics and provide us with summaries and predictions based on historic data. These tools are called ITOA tools. There are also tools which perform ITOA in real time and feed the insights back into a monitoring system to raise instant alerts.
Need for an ITOA tool: top 2 use cases
The main challenges faced by IT operations teams today are:
1. Comprehensive capacity management
The capacity of a service is based on the capacity of individual components, with the added complexity of interplay of these components with varying levels of correlation.
As these components continuously report IT metrics, the questions are:
- How does one establish correlation between two components?
- Using component metrics, how does one correctly measure and predict capacity for the service?
- Can capacity reports be generated at the click of a button?
- How can visualisations like sparklines, bubble charts, line graphs, column charts etc. help with capacity management and with gaining insight?
- Can real-time capacity alerts be integrated into existing real-time monitoring instead of analysing the capacity reports at month end?
Capacity management reporting for the enterprise and to the regulator remains the top use case for ITOA; organisations typically spend 40 man-hours per month on preparing these reports.
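The first two questions above, correlating two components and predicting capacity, can be illustrated with a minimal Python sketch. It is not how any particular ITOA tool works: the metric names and values are hypothetical, correlation is measured with the Pearson coefficient, and prediction is a plain least-squares trend line projected towards an assumed 80% limit.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def linear_trend(ys):
    """Least-squares slope and intercept of ys against periods 0, 1, 2, ..."""
    n = len(ys)
    xs = range(n)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx


def periods_until(ys, limit):
    """Periods after the last sample until the trend line reaches `limit`.

    Returns None when the trend is flat or falling.
    """
    slope, intercept = linear_trend(ys)
    if slope <= 0:
        return None
    return (limit - intercept) / slope - (len(ys) - 1)


# Hypothetical monthly utilisation (%) for two components of one service.
web_cpu = [40.0, 45.0, 50.0, 55.0, 60.0, 65.0]
db_io = [30.0, 34.0, 37.0, 42.0, 45.0, 49.0]

r = pearson(web_cpu, db_io)
months_left = periods_until(web_cpu, 80.0)
print(f"correlation web_cpu vs db_io: {r:.3f}")
print(f"months until web_cpu trend reaches 80%: {months_left:.1f}")
```

A straight-line trend is the simplest possible prediction model; real workloads are often seasonal, so a production tool would be expected to use something more robust.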
2. Log Analytics for faster incident management
The underlying IT software today is a mix of solutions sourced from multiple vendors, written in different languages, deployed on varied operating systems and, more importantly, following no common application logging standard. During a major incident, investigation involves “subject matter experts” manually parsing these application logs and finding problem trails left by the application.
So, some key questions:
- How does one correlate these varied application logs using the known methods of organising by timestamps or keywords like Error code, UserID or TransactionID?
- How does one correlate disparate events and establish causality?
- How does one visualise trends on frequency of event occurrence?
- How is one alerted if previously unseen (“unknown unknown”) messages are written to application logs today?
- How does one report security compliance using log analysis?
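Two of the questions above, organising logs by timestamp and TransactionID and spotting previously unseen messages, can be sketched in Python as follows. Everything here is an assumption for illustration: the log lines, the single shared line layout, the regular expression and the set of known message templates are all hypothetical, and real multi-vendor logs would need one parser per format.

```python
import re
from collections import defaultdict
from datetime import datetime

# Hypothetical log lines from two applications, deliberately out of order.
LOG_LINES = [
    "2024-01-08 09:00:02 gateway INFO TransactionID=T1001 request received",
    "2024-01-08 09:00:01 gateway INFO TransactionID=T1000 request received",
    "2024-01-08 09:00:03 ledger ERROR TransactionID=T1001 Error code 504 upstream timeout",
    "2024-01-08 09:00:04 ledger WARN TransactionID=T1002 cache shard 7 unreachable",
]

# Assumed layout: timestamp, application, level, TransactionID, free text.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (?P<app>\S+) (?P<level>\S+) "
    r"TransactionID=(?P<txn>\S+) (?P<msg>.*)$"
)


def parse(line):
    """Parse one line into a record dict, or return None if it does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    rec = m.groupdict()
    rec["ts"] = datetime.strptime(rec["ts"], "%Y-%m-%d %H:%M:%S")
    return rec


def trails_by_transaction(lines):
    """Group records by TransactionID and order each group by timestamp."""
    trails = defaultdict(list)
    for line in lines:
        rec = parse(line)
        if rec:
            trails[rec["txn"]].append(rec)
    for recs in trails.values():
        recs.sort(key=lambda r: r["ts"])
    return dict(trails)


def template(msg):
    """Mask digit runs so messages differing only in numbers share a template."""
    return re.sub(r"\d+", "<n>", msg)


def unseen_messages(lines, known_templates):
    """Return records whose message template is not in the known set."""
    out = []
    for line in lines:
        rec = parse(line)
        if rec and template(rec["msg"]) not in known_templates:
            out.append(rec)
    return out


# Templates assumed to have been learned from earlier, healthy days.
KNOWN = {"request received", "Error code <n> upstream timeout"}

trails = trails_by_transaction(LOG_LINES)
novel = unseen_messages(LOG_LINES, KNOWN)

for rec in trails["T1001"]:
    print(rec["ts"], rec["app"], rec["level"], rec["msg"])
for rec in novel:
    print("unseen message:", rec["app"], rec["msg"])
```

Grouping by a shared key gives the "problem trail" for one transaction across applications; it shows ordering, not causality, which needs further analysis.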
Prerequisites for ITOA implementation
The prerequisites required to begin an ITOA implementation are:
- An enterprise monitoring solution: this is a basic requirement so that IT metrics from all functions and departments (hardware, network, database and application) are measured, monitored and stored in one tool. Without an enterprise monitoring tool, data must first be pooled from the different tools’ databases and stored in one database for the ITOA tool to act on, which adds extra work.
- A smart ITOA tool: the stages of ITOA include ingestion of data from multiple sources, analysis of the data and presentation of insights in the form of visualisations or alerts. While there are multiple tools available to handle each of these stages individually, it is advisable to select a single tool which can do all of them.
The challenges in setting up ITOA
- Lack of a single enterprise monitoring solution: the legacy of multiple tools for monitoring multiple stacks, which do not talk to each other or lack the capability to forward their data to a big data store
- Availability and granularity of operational data from all stacks of monitoring
- Upgrading analytical skills in the IT operations community: enterprises are not going to hire data scientists in IT operations. They will expect ITOA tools to provide the power of analytics to regular IT operations support, which is exactly what ITRS Insights does.
ITRS Insights
ITRS Insights is an all-in-one big data solution, combining real-time analytics with big data storage, all within a simple-to-use platform. Its ability to ingest and analyse large amounts of data from a wide variety of sources means you can combine IT and business data from across your business to support your IT Operations Analytics (ITOA) initiatives. In-built anomaly detection and machine learning algorithms mean you can pinpoint and fix problems quicker, reduce downtime and optimise IT spending across your organisation.