GAVS – Global IT Consulting


Anomaly Detection in AIOps

Jun 02, 2021
  • ai devops automation service tools
  • aiops artificial intelligence for it operations
  • aiops digital transformation solutions
  • best ai auto discovery tools
  • Best AIOps Platforms Software

In this blog post

  • What are anomalies?
  • How are they flagged?
  • Why is it important?
  • So, what are thresholds, how are they significant?
  • Are there no disadvantages in the threshold way of identifying alerts?
  • What do we do now? Should anomalies be resolved?
  • Isolation Forest
  • References

Before we get into anomalies, let us understand what AIOps is and what role it plays in IT operations. Artificial Intelligence for IT Operations (AIOps) means monitoring and analyzing the large volumes of data generated by IT platforms using Artificial Intelligence and Machine Learning. These techniques help enterprises with event correlation and root cause analysis, enabling faster resolution. Anomalies or issues are probably inevitable, and taking them to closure requires experience and talent.

Let us simplify the significance of anomalies and how they can be identified, flagged, and resolved.

What are anomalies?

Anomalies are instances when performance metrics deviate from normal, expected behavior. There are several ways in which this can occur. However, we’ll be focusing on identifying such anomalies using thresholds.

How are they flagged?

With current monitoring systems, anomalies are flagged based on static thresholds. These are constant values that define the upper limit of normal behavior. For example, with a threshold set at 85%, CPU usage is considered anomalous whenever it rises above 85%. When anomalies are detected, alerts are sent to the operations team for inspection.
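The static check described above can be sketched in a few lines of Python. This is only an illustration: the function name `flag_static` and the sample readings are made up, and the 85% limit is the example value from the text, not a recommended setting.

```python
from typing import List, Tuple

CPU_THRESHOLD = 85.0  # percent; the example value used in the text


def flag_static(samples: List[float], threshold: float = CPU_THRESHOLD) -> List[Tuple[int, float]]:
    """Return (index, value) pairs for samples strictly above the threshold."""
    return [(i, v) for i, v in enumerate(samples) if v > threshold]


# Made-up CPU readings (percent), one per monitoring interval.
cpu_usage = [42.0, 55.5, 91.2, 63.0, 85.0, 97.8]

alerts = flag_static(cpu_usage)
for index, value in alerts:
    print(f"ALERT: sample {index} at {value:.1f}% exceeds {CPU_THRESHOLD:.0f}%")
```

A reading exactly at the limit is not flagged here because the comparison is strict; whether the boundary value should alert is a choice each monitoring setup has to make.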

Why is it important?

Monitoring the health of servers is necessary to ensure the efficient allocation of resources. Unexpected spikes or drops in a metric such as CPU usage might be the sign of a resource constraint. The operations team needs to address these problems promptly; failing to do so may cause the applications running on those servers to fail.

So, what are thresholds, how are they significant?

Thresholds are the limits of acceptable performance. Any value that breaches a threshold is reported as an alert and should be investigated and resolved at the earliest. Note that thresholds are set at the tool level, so the monitoring tool generates an alert whenever one is breached. Manually configured thresholds can be adjusted as demand changes.

There are two types of thresholds:

  1. Static monitoring thresholds: These thresholds are fixed values indicating the limits of acceptable performance.
  2. Dynamic monitoring thresholds: These thresholds change over time. This is what an intelligent IT monitoring tool uses. It learns the normal range, with both a high and a low threshold, for each point in a day, week, month, and so on. For instance, a dynamic system will know that high CPU utilization is normal during a backup, and that the same utilization is abnormal at other times.
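One simple way to picture a dynamic threshold is a separate normal band for each hour of the day, learned from past readings. The sketch below is an assumption-laden illustration rather than how any particular monitoring tool works: the band is the hourly mean plus or minus `k` standard deviations, and the names `learn_bands` and `is_anomalous` as well as the history data are hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev
from typing import Dict, List, Tuple

Band = Tuple[float, float]


def learn_bands(history: List[Tuple[int, float]], k: float = 3.0) -> Dict[int, Band]:
    """Learn a (low, high) band per hour of day as mean +/- k standard deviations."""
    by_hour: Dict[int, List[float]] = defaultdict(list)
    for hour, value in history:
        by_hour[hour].append(value)
    bands: Dict[int, Band] = {}
    for hour, values in by_hour.items():
        if len(values) < 2:
            continue  # stdev needs at least two samples
        mu, sigma = mean(values), stdev(values)
        bands[hour] = (mu - k * sigma, mu + k * sigma)
    return bands


def is_anomalous(hour: int, value: float, bands: Dict[int, Band]) -> bool:
    """True if the value falls outside the learned band for that hour."""
    if hour not in bands:
        return False  # no baseline learned for this hour, so nothing to compare against
    low, high = bands[hour]
    return value < low or value > high


# Made-up history of (hour_of_day, cpu_percent): a nightly backup at 02:00
# keeps CPU near 90%, while 14:00 normally sits near 30%.
history = [(2, 88.0), (2, 90.0), (2, 92.0), (2, 91.0), (2, 89.0),
           (14, 28.0), (14, 30.0), (14, 32.0), (14, 31.0), (14, 29.0)]
bands = learn_bands(history)

print(is_anomalous(2, 90.0, bands))   # 90% during the backup window
print(is_anomalous(14, 90.0, bands))  # 90% in the afternoon
```

With this data, 90% CPU falls inside the learned band at 02:00 but far outside it at 14:00, which is the behavior the backup example describes. Because the band has a low side too, an unusually idle server is flagged as well.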

Are there no disadvantages in the threshold way of identifying alerts?

There certainly are. Like most things in life, the threshold approach has its fair share of problems. Returning from philosophy to our article, the static threshold approach has clear disadvantages, while those of the dynamic approach are minimal. We should also understand that with the appropriate domain knowledge, there are many ways to overcome these.

Consider this scenario. Imagine a CPU threshold set at 85%. We know that any value breaching it is flagged as an anomaly and raises an alert. Now consider a Virtual Machine (VM) for which that same level of CPU usage is normal behavior. The monitoring tool will generate alerts continuously until usage drops below the threshold. If this is left unattended, it becomes a mess: the flood of false alerts may cause the team to miss the actual issue. The result is a chain of false positives. This can disrupt the entire IT platform and create unnecessary workload for the team. Once an IT platform is down, it leads to downtime and loss for our clients.

As mentioned, there are ways to overcome this with domain knowledge. Every organization has its own trade secrets to prevent it from happening. With the right knowledge, this behavior can be modified and swiftly resolved.

What do we do now? Should anomalies be resolved?

Of course, anomalies should be resolved at the earliest to prevent the platform from being jeopardized. There are many methods and machine learning techniques that help with this. Machine learning techniques fall into two major groups: Supervised Learning and Unsupervised Learning. There are many articles on the internet that introduce these techniques, and a variety of methods can be categorized under each. In this article, we’ll discuss one unsupervised learning technique, Isolation Forest.

Isolation Forest

The algorithm isolates observations by randomly selecting a feature and then randomly selecting a split value between the maximum and minimum values of the selected feature.

The algorithm constructs the separation by first creating isolation trees, or random decision trees. The score is then calculated from the path length needed to isolate the observation: anomalies are typically isolated in fewer splits, so a shorter path indicates a more anomalous observation. The following example shows how easy it is to separate an anomalous observation:

[Image: plot of normal points and anomalous points]

In the above image, the blue points denote the anomalous points whereas the brown ones denote the normal points. Anomaly detection allows you to detect abnormal patterns and take appropriate actions. One can use anomaly-detection tools to monitor any data source and identify unusual behaviors quickly. It is good practice to research the available methods to determine the best organizational fit. One way of doing this is to check with clients, understand their requirements, and tune the algorithms accordingly, which also helps in developing a lasting relationship between organizations and clients.
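The isolation idea from this section can be tried out with the `IsolationForest` class from scikit-learn, which is listed in the references. The following is a minimal sketch on made-up two-dimensional data; the data, the `contamination` value, and the random seed are assumptions chosen for illustration, not settings taken from this article or from any product.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)

# Made-up data: 200 "normal" points clustered near (50, 50),
# plus 5 hand-placed points far away from the cluster.
normal = rng.normal(loc=50.0, scale=5.0, size=(200, 2))
outliers = np.array([[5.0, 95.0], [95.0, 5.0], [99.0, 99.0], [1.0, 1.0], [50.0, 120.0]])
X = np.vstack([normal, outliers])

# contamination is the assumed share of anomalies in the data (an illustrative guess).
model = IsolationForest(n_estimators=100, contamination=0.03, random_state=42)
labels = model.fit_predict(X)    # 1 for inliers, -1 for anomalies
scores = model.score_samples(X)  # lower score means more anomalous

print("points flagged as anomalous:", int((labels == -1).sum()))
print("labels of the hand-placed points:", labels[-5:])
```

The `contamination` parameter sets how many points end up labeled as anomalies, so in practice it is one of the values to tune against real alert data rather than to copy from a sketch like this.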

Zero Incident Framework™, as the name suggests, focuses on moving organizations towards zero incidents. With the knowledge we’ve accumulated over the years, Anomaly Detection is made as robust as possible, resulting in exponential outcomes.

References

  • https://www.crunchmetrics.ai/blog/top-5-aiops-use-cases-to-enhance-it-operations/#:~:text=Real%2Dtime%20anomaly%20detection,are%20otherwise%20hard%20to%20identify
  • https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.IsolationForest.html

Author

Vimalraj Subash

Vimalraj is a seasoned Data Scientist working with vast data sets to break down information, gather relevant points, and solve advanced business problems. He has over 8 years of experience in the Analytics domain, and is currently a lead consultant at GAVS.




Copyright © 2022, GAVS Technologies.
