“Observability” has become a key trend in Site Reliability Engineering practice. One of the recommendations from Gartner’s latest Market Guide for IT Infrastructure Monitoring Tools released in January 2020 says, “Contextualize data that ITIM tools collect from highly modular IT architectures by using AIOps to manage other sources, such as observability metrics from cloud-native monitoring tools.”
Like so many other terms in software engineering, ‘observability’ is a term borrowed from an older physical discipline: in this case, control systems engineering. Let me use the definition of observability from control theory in Wikipedia: “observability is a measure of how well internal states of a system can be inferred from knowledge of its external outputs.”
Observability is gaining attention in the software world because of its effectiveness at enabling engineers to deliver excellent customer experiences with software despite the complexity of the modern digital enterprise.
When we blew up the monolith into many services, we lost the ability to step through our code with a debugger: it now hops the network. Monitoring tools are still coming to grips with this seismic shift.
How is observability different from monitoring?
Monitoring requires you to know what you care about before you know you care about it. Observability allows you to understand your entire system and how it fits together, and then use that information to discover what specifically you should care about when it’s most important.
Monitoring requires you to already know what normal is. Observability allows discovery of different types of ‘normal’ by looking at how the system behaves, over time, in different circumstances.
Monitoring asks the same questions over and over again. Is the CPU usage under 80%? Is memory usage under 75%? Is the latency under 500ms? This is valuable information, but monitoring is only useful for known problems.
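Those fixed, predefined checks can be sketched in a few lines of Java. The class, method names, and thresholds below are illustrative, not taken from any particular monitoring tool:

```java
// A minimal sketch of monitoring's fixed questions: static thresholds,
// evaluated against current readings, over and over again.
public class ThresholdMonitor {
    public static boolean cpuOk(double cpuUsagePercent)    { return cpuUsagePercent < 80.0; }
    public static boolean memoryOk(double memUsagePercent) { return memUsagePercent < 75.0; }
    public static boolean latencyOk(long latencyMillis)    { return latencyMillis < 500; }

    public static void main(String[] args) {
        // The same questions, asked again and again against the latest readings.
        System.out.println("CPU OK?     " + cpuOk(72.5));
        System.out.println("Memory OK?  " + memoryOk(81.0));
        System.out.println("Latency OK? " + latencyOk(230));
    }
}
```

The point of the sketch is what it *cannot* do: it can only answer the questions written into it ahead of time, which is exactly the limitation observability addresses.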
Observability, on the other hand, is about asking different questions almost all the time. You discover new things.
Metrics do not equal observability.
What Questions Can Observability Answer?
Below are sample questions that can be addressed by an effective observability solution:
Why is x broken?
What services does my service depend on — and what services are dependent on my service?
Why has performance degraded over the past quarter?
What changed? Why?
What logs should we look at right now?
What is system performance like for our most important customers?
What SLO should we set?
Are we out of SLO?
What did my service look like at time point x?
What was the relationship between my service and x at time point y?
What was the relationship of attributes across the system before we deployed? What’s it like now?
What is most likely contributing to latency right now? What is most likely not?
Are these performance optimizations on the critical path?
About the Author –
Sri is a Serial Entrepreneur with over 30 years’ experience delivering creative, client-centric, value-driven solutions for bootstrapped and venture-backed startups.
This article explores the various Java caching technologies that can play critical roles in improving application performance.
What is Cache Management?
A cache is a high-speed, temporary memory buffer that stores the most frequently used data, such as live transactions and logical datasets. It greatly improves the performance of an application, as reads/writes happen in the memory buffer, reducing retrieval time and the load on the primary source. Implementing and maintaining a cache in any Java enterprise application is important.
The client-side cache is used to temporarily store static data transmitted over the network from the server, to avoid unnecessary calls to the server.
The server-side cache could be a query cache, CDN cache, or proxy cache, where the data is stored on the respective servers instead of being temporarily stored in the browser.
Adopting the right caching technique and tools allows the programmer to focus on implementing business logic, leaving backend complexities like cache expiration, mutual exclusion, spooling, and cache consistency to the frameworks and tools.
Caching should be designed specifically for the environment, considering single/multiple JVMs and clusters. Given below are multiple scenarios where caching can be used to improve performance.
1. In-process Cache – The in-process/local cache is the simplest cache, where the cache store is effectively an object accessed inside the application process. It is much faster than any other cache accessed over a network and is strictly available only to the process that hosts it.
If the application is deployed only in one node, then in-process caching is the right candidate to store frequently accessed data with fast data access.
If the in-process cache is to be deployed in multiple instances of the application, then keeping data in-sync across all instances could be a challenge and cause data inconsistency.
An in-process cache can bring down the performance of any application where the server memory is limited and shared. In such cases, the garbage collector is invoked often to clean up objects, which may lead to performance overhead.
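As a concrete illustration, here is a minimal in-process cache sketch using only the JDK. The class name and the loader function are hypothetical, and a production cache would also need expiration and size limits:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// A minimal in-process (local) cache: the cache store is just an object
// inside the application process, so lookups never cross the network.
public class InProcessCache<K, V> {
    private final Map<K, V> store = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // fetches from the primary source on a miss

    public InProcessCache(Function<K, V> loader) {
        this.loader = loader;
    }

    public V get(K key) {
        // Load-on-miss: hits are served from memory; a miss goes to the source once.
        return store.computeIfAbsent(key, loader);
    }
}
```

Usage might look like `new InProcessCache<Integer, String>(id -> loadFromDb(id))`, where `loadFromDb` is a placeholder for whatever call reaches the primary data source.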
In-Memory Distributed Cache
Distributed caches are built externally to an application; they support reads/writes to/from data repositories, keep frequently accessed data in RAM, and avoid continuously fetching data from the data source. Such caches can be deployed on a cluster of multiple nodes, forming a single logical view.
An in-memory distributed cache is suitable for applications running on multiple clusters where performance is key. Data inconsistency and shared memory aren’t matters of concern, as a distributed cache is deployed in the cluster as a single logical view.
As inter-process communication is required to access caches over a network, latency, failures, and object serialization are some of the overheads that could degrade performance.
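One common way a distributed cache presents many nodes as a single logical view is consistent hashing: every client maps a key to the same node on a hash ring. The sketch below is a simplified illustration (the node names are made up, and real products add virtual nodes and replication on top of this idea):

```java
import java.util.SortedMap;
import java.util.TreeMap;

// A simplified consistent-hashing ring: nodes are placed on a ring of hash
// values, and a key belongs to the first node at or after its own hash.
// Assumes at least one node has been added before nodeFor() is called.
public class HashRing {
    private final SortedMap<Integer, String> ring = new TreeMap<>();

    public void addNode(String node) {
        ring.put(node.hashCode() & Integer.MAX_VALUE, node); // non-negative position
    }

    public String nodeFor(String key) {
        int h = key.hashCode() & Integer.MAX_VALUE;
        SortedMap<Integer, String> tail = ring.tailMap(h); // nodes at or after h
        // Wrap around to the first node on the ring when none follow h.
        return tail.isEmpty() ? ring.get(ring.firstKey()) : tail.get(tail.firstKey());
    }
}
```

Because every client computes the same mapping, no central lookup is needed, and adding or removing a node only remaps the keys adjacent to it on the ring.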
2. In-memory database
An in-memory database (IMDB) stores data in main memory instead of on disk to produce quicker response times. Queries are executed directly on the dataset stored in memory, avoiding frequent disk reads/writes, which provides better throughput and faster response times. It provides a configurable data persistence mechanism to avoid data loss.
Redis is an open-source, in-memory data structure store used as a database, cache, and message broker. It offers data replication, different levels of persistence, high availability, and automatic partitioning.
Replacing the RDBMS with an in-memory database will improve the performance of an application without changing the application code.
3. In-Memory Data Grid
An in-memory data grid (IMDG) is a data structure that resides entirely in RAM and is distributed among multiple servers. Its key capabilities include:
Parallel computation of the data in memory
Search, aggregation, and sorting of the data in memory
Transaction management in memory
Cache Use Cases
There are use cases where a specific caching approach should be adopted to improve the performance of the application.
1. Application Cache
The application cache stores web content that can be accessed offline. Application owners/developers have the flexibility to configure what to cache and make it available to offline users.
It has the following advantages:
Quicker retrieval of data
Reduced load on servers
2. Level 1 (L1) Cache
This is the default transactional cache per session. It can be managed by any Java Persistence API (JPA) implementation or object-relational mapping (ORM) tool.
The L1 cache stores entities that fall under a specific session and are cleared once a session is closed. If there are multiple transactions inside one session, all entities will be stored from all these transactions.
3. Level 2 (L2) Cache
The L2 cache can be configured to provide custom caches that hold data for all entities to be cached. It’s configured at the session-factory level and exists as long as the session factory is available. It can be shared across:
Sessions in an application.
Applications on the same servers with the same database.
Application clusters running on multiple nodes but pointing to the same database.
4. Proxy / Load balancer cache
Enabling this reduces the load on application servers. When similar content is queried/requested frequently, the proxy serves the content from its cache rather than routing the request to the application servers.
When a dataset is requested for the first time, the proxy saves the response from the application server to a disk cache and uses it to respond to subsequent client requests without routing them back to the application server. Apache, NGINX, and F5 support proxy caching.
5. Hybrid Cache
A hybrid cache is a combination of JPA/ORM frameworks and open source services. It is used in applications where response time is a key factor.
Caching Design Considerations
1. Data Loading/Updating
Loading data into a cache is an important design decision for maintaining consistency across all cached content. The following approaches can be considered to load data:
Using default function/configuration provided by JPA and ORM frameworks to load/update data.
Implementing key-value maps using open-source cache APIs.
Programmatically loading entities through automatic or explicit insertion.
Loading data from an external application through synchronous or asynchronous communication.
2. Performance/Memory Size
Resource configuration is an important factor in achieving performance SLAs. Available memory and CPU architecture play a vital role in application performance. Available memory has a direct impact on garbage collection performance; more GC cycles can bring down performance.
3. Eviction Policy
An eviction policy enables a cache to ensure that the size of the cache doesn’t exceed the maximum limit. The eviction algorithm decides what elements can be removed from the cache depending on the configured eviction policy thereby creating space for the new datasets.
There are various popular eviction algorithms used in cache solutions:
Least Recently Used (LRU)
Least Frequently Used (LFU)
First In, First Out (FIFO)
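As an illustration of eviction, the JDK’s `LinkedHashMap` can implement a small LRU cache by evicting the eldest entry in access order. This is a minimal sketch, not a production cache:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// A minimal Least Recently Used (LRU) eviction sketch: LinkedHashMap in
// access order evicts its eldest (least recently used) entry whenever the
// cache grows past maxEntries, creating space for new datasets.
public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public LruCache(int maxEntries) {
        super(16, 0.75f, true); // true = iterate in access order, not insertion order
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries; // evict once over the configured limit
    }
}
```

With a capacity of 2, putting `a` and `b`, touching `a`, then putting `c` evicts `b`, since `b` is the least recently used entry at that point.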
4. Concurrency
Concurrency is a common issue in enterprise applications. It creates conflicts and leaves the system in an inconsistent state. It can occur when multiple clients try to update the same data object at the same time during a cache refresh. A common solution is to use a lock, but this may affect performance. Hence, optimization techniques should be considered.
5. Cache Statistics
Cache statistics are used to identify the health of a cache and provide insights into its behavior and performance. The following attributes can be used:
Hit Count: Indicates the number of times a cache lookup has returned a cached value.
Miss Count: Indicates the number of times a cache lookup has returned a null, newly loaded, or uncached value.
Load Success Count: Indicates the number of times a cache lookup has successfully loaded a new value.
Total Load Time: Indicates the time spent (in nanoseconds) loading new values.
Load Exception Count: Indicates the number of exceptions thrown while loading an entry.
Eviction Count: Indicates the number of entries evicted from the cache.
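A minimal sketch of how the hit and miss counters might be tracked is below; the names are illustrative (caching libraries such as Guava expose similar statistics through their own APIs):

```java
import java.util.concurrent.atomic.LongAdder;

// A minimal cache-statistics sketch: thread-safe hit/miss counters that a
// cache can bump on every lookup, exposing its health as a hit rate.
public class CacheStats {
    private final LongAdder hits = new LongAdder();
    private final LongAdder misses = new LongAdder();

    public void recordHit()  { hits.increment(); }
    public void recordMiss() { misses.increment(); }

    public long hitCount()  { return hits.sum(); }
    public long missCount() { return misses.sum(); }

    // Fraction of lookups served from the cache; 1.0 when nothing has been looked up yet.
    public double hitRate() {
        long total = hitCount() + missCount();
        return total == 0 ? 1.0 : (double) hitCount() / total;
    }
}
```

`LongAdder` is used instead of a plain `long` so that many threads can record hits and misses without contending on a single counter.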
Various Caching Solutions
There are various Java caching solutions available — the right choice depends on the use case.
At GAVS, we focus on building a strong foundation of coding practices. We encourage and implement the “Design First, Code Later” principle and “Design Oriented Coding Practices” to bring in design thinking and engineering mindset to build stronger solutions.
We have been training and mentoring our talent on cutting-edge JAVA technologies, building reusable frameworks, templates, and solutions on the major areas like Security, DevOps, Migration, Performance, etc. Our objective is to “Partner with customers to realize business benefits through effective adoption of cutting-edge JAVA technologies thereby enabling customer success”.
About the Author –
Sivaprakash is a solutions architect with strong solutions and design skills. He is a seasoned expert in JAVA, Big Data, DevOps, Cloud, Containers, and Micro Services. He has successfully designed and implemented a stable monitoring platform for ZIF. He has also designed and driven Cloud assessment/migration, enterprise BRMS, and IoT-based solutions for many of our customers. At present, his focus is on building ‘ZIF Business’ a new-generation AIOps platform aligned to business outcomes.
From lightbulbs to cities, IoT is adding a level of digital intelligence to the things around us. The Internet of Things (IoT) is the network of physical devices connected to the internet, all collecting and sharing data that can then be used for various purposes. The arrival of super-cheap computers and the ubiquity of wireless networks are behind the widespread adoption of IoT. It is possible to turn any object, from a pill to an airplane, into an IoT-enabled device, making devices smarter by letting them ‘sense’ and communicate without any human involvement.
Let us look at the developments that enabled the commercialization of IoT.
The idea of integrating sensors and intelligence into basic objects dates back to the 1980s and 1990s. But progress was slow because the technology was not ready: chips were too big and bulky, and there was no way for objects to communicate effectively.
Processors had to become cheap and power-frugal enough to be near-disposable before it finally became cost-effective to connect billions of devices. The adoption of RFID tags and IPv6 was a necessary step for IoT to scale.
Kevin Ashton coined the phrase ‘Internet of Things’ in 1999, although it took a decade for the technology to catch up with his vision. According to Ashton, “The IoT integrates the interconnectedness of human culture (our things) with our digital information system (the internet). That’s the IoT.”
Early suggestions for IoT include ‘blogjects’ (objects that blog and record data about themselves to the internet), ubiquitous computing (or ‘ubicomp’), invisible computing, and pervasive computing.
How big is IoT?
IDC predicts that there will be 41.6 billion connected IoT devices by 2025. It also suggests industrial and automotive equipment represent the largest opportunity for connected ‘things’.
Gartner predicts that the enterprise and automotive sectors will account for 5.8 billion devices this year.
However, the COVID-19 pandemic has further increased the need for IoT-enabled devices to help nations tackle the crisis.
IoT for the Government
Information about the movement of citizens is urgently required by governments to track the spread of the virus and potentially monitor their quarantine measures. Some IoT operators have solutions that could serve these purposes.
Telia’s Division X has developed Crowd Insights, which provides aggregated smartphone data to city and transport authorities in the Nordic countries. The tool is being used to track the movement of citizens during quarantine.
Vodafone provides insights on traffic congestion.
Telefónica developed Smart Steps, which aggregates data on footfall and movement for the transport, tourism, and retail sectors.
With changes to privacy regulations, personal data can also help in tracking clusters of infection. For example, in Taiwan, high-risk quarantined patients were monitored through their mobile phones to ensure compliance with quarantine rules. In South Korea, officials track infected citizens and alert others if they come into contact with them. The government of Israel went as far as passing an emergency law to monitor the movement of infected citizens via their phones.
China is already using mass temperature scanning devices in public areas like airports. A team of researchers at UMass Amherst is testing a device that can analyze coughing sounds to identify the presence of flu-like symptoms among crowds.
IoT in Healthcare
COVID-19 could be the trigger to explore new solutions and be prepared for future pandemics, just as the SARS epidemic of 2003 spurred the governments of South Korea and Taiwan to prepare for today’s problems.
Remote patient monitoring (RPM) and telemedicine could be helpful in managing a future pandemic. For example, patients with chronic diseases who are required to self-isolate to reduce their exposure to COVID-19 but need continuous care would benefit from RPM. Operators like Orange, Telefónica, and Vodafone already have some experience in RPM.
Connected thermometers are being used in hospitals to collect data while maintaining social distance. Smart wearables are also helpful in preventing the spread of the virus and responding to those who might be at risk by monitoring their vital signs.
Telemedicine is being widely adopted in the US, and the authorities there are relaxing reimbursement rules and regulations to encourage the extension of specific services. These include the following:
Medicare, the US healthcare program for senior citizens, has temporarily expanded its telehealth service to enable remote consultations.
The FCC has made changes to the Rural Health Care (RHC) and E-Rate programs to support telemedicine and remote learning. Network operators will be able to provide incentives or free network upgrades that were previously not permitted, for example, for hospitals that are looking to expand their telemedicine programs.
IoT for Consumers
The IoT promises to make our environment smarter, more measurable, and more interactive. COVID-19 is highly contagious and can be transmitted from one person to another even by touching objects used by an affected person. The WHO has instructed us to disinfect and sanitize high-touch objects. IoT presents an ingenious solution: avoid touching these surfaces altogether. Hands-free, sensor-enabled devices and solutions like smart lightbulbs, door openers, and smart sinks help prevent the spread of the virus.
Security aspects of IoT
Security is one of the biggest issues with the IoT. Its sensors collect extremely sensitive data, like what we say and do in our own homes and where we travel. Many IoT devices lack security patches, which means they are permanently at risk. Hackers are now actively targeting IoT devices such as routers and webcams because their inherent lack of security makes them easy to compromise and turn into giant botnets.
IoT bridges the gap between the digital and physical worlds, which means hacking into devices can have dangerous real-world consequences. Hacking into the sensors controlling the temperature in a power station could lead to catastrophic decisions, and taking control of a driverless car could also end in disaster.
Overall IoT makes the world around us smarter and more responsive by merging the digital and physical universe. IoT companies should look at ways their solutions can be repurposed to help respond to the crisis.
About the Author –
Naveen is a software developer at GAVS. He teaches underprivileged children and is interested in giving back to society in as many ways as he can. He is also interested in dancing, painting, playing keyboard, and is a district-level handball player.
These are unprecedented times. The world hasn’t witnessed such disruption in recent history. It is times like these that test the strength and resilience of our community. While we’ve been advised to maintain social distancing to flatten the curve, we must keep the wheels of the economy rolling.
In my previous article, I covered the ‘people-centric’ tech trends of the year: Hyperautomation, Multiexperience, Democratization, Human Augmentation, and Transparency and Traceability. All of those hold more importance now in light of current events. Per Gartner, smart spaces enable people to interact with people-centric technologies. Hence, the next tech trends on the list are about creating ‘smart spaces’ around us.
Smart spaces, in simple words, are interactive physical environments decked out with technology that act as a bridge between humans and the digital world. The most common example of a smart space is a smart home, also called a connected home. Other environments that could be smart spaces are offices and communal workspaces; hotels, malls, and hospitals; public places such as libraries and schools; and transportation portals such as airports and train stations. Listed below are the five smart-spaces technology trends which, per Gartner, have great potential for disruption.
Edge computing is a distributed computing topology in which information processing and data storage are located closer to the sources, repositories, and consumers of this information. Empowered Edge is about moving towards a smarter, faster, and more flexible edge by using more adaptive processes, fog/mesh architectures, dynamic network topology, and distributed cloud. This trend will be introduced across a spectrum of endpoint devices, which includes simple embedded devices (e.g., appliances, industrial devices), input/output devices (e.g., speakers, screens), computing devices (e.g., smartphones, PCs), and complex embedded devices (e.g., automobiles, power generators). Per Gartner predictions, by 2022 more than 50% of enterprise-generated data will be created and processed outside the data center or cloud. This trend also includes the next-generation cellular standard after 4G Long Term Evolution (LTE), i.e., 5G. The concept of edge also percolates to digital-twin models.
Gartner defines a distributed cloud as the “distribution of public cloud services to different locations outside the cloud providers’ data centers, while the originating public cloud provider assumes responsibility for the operation, governance, maintenance and updates.” Cloud computing has always been viewed as a centralized service, although private and hybrid cloud options complement this model. Implementing a private cloud is not an easy task, and hybrid cloud breaks many important cloud computing principles, such as shifting responsibility to cloud providers, exploiting the economics of cloud elasticity, and using the top-class services of large cloud service providers. A distributed cloud provides services in a location that meets the organization’s requirements without compromising on the features of a public cloud. This trend is still in the early stages of development and is expected to build in three phases:
Phase 1: Services will be provided from a micro-cloud which will have a subset of services from its centralized cloud.
Phase 2: An extension of phase 1, where the service provider will team up with a third party to deliver a subset of services from the centralized cloud.
Phase 3: Distributed cloud substations will be set up which could be shared by different organizations. This will improve the economics associated, as the installation cost can be split among the companies.
Autonomous can be defined as being able to control oneself. Similarly, autonomous things are devices that can operate by themselves, without human intervention, using AI to automate all their functions. The most common among these devices are robots, drones, and aircraft. These devices can operate across different environments and will interact more naturally with their surroundings and people. While exploring use cases of this technology, it is very important to understand the different spaces the device will interact with, such as people, terrain obstacles, or other autonomous things. Another aspect to consider is the level of autonomy that can be applied. The different levels are: no automation, human-assisted automation, partial automation, conditional automation, high automation, and full automation. With the proliferation of this trend, a shift is expected from stand-alone intelligent things to collaborative intelligent things, in which multiple devices work together to deliver the final output. The U.S. Defense Advanced Research Projects Agency (DARPA) is studying the use of drone swarms to defend or attack military targets.
Most of us have heard about blockchain technology. It is a tamper-proof, decentralized, distributed database that stores blocks of records linked together using cryptography. It holds the power to take industries to another level by enabling trust, providing transparency, reducing transaction settlement times, and improving cash flow. Blockchain also makes it easy to trace assets back to their origin, reducing the chances of substituting them with counterfeit products. Smart contracts are used as part of the blockchain and can trigger actions on encountering a change in the blockchain, such as releasing payment when goods are received. New developments are being introduced in public blockchains, but over time these will be integrated with permissioned blockchains, which support membership, governance, and operating model requirements. Some of the use cases of this trend that Gartner has identified are: asset tracking, identity management/know your client (KYC), internal record keeping, shared record keeping, smart cities/the IoT, trading, blockchain-based voting, and cryptocurrency payments and remittance services. Per the 2019 Gartner CIO Survey, 60% of CIOs expect some form of blockchain deployment in the next three years.
Per Gartner, over the next five years AI-based decision-making will be applied across a wide set of use cases, which will result in a tremendous increase in potential attack surfaces. Gartner provides three key perspectives on how AI impacts security: protecting AI-powered systems, leveraging AI to enhance security defense, and anticipating negative use of AI by attackers. ML pipelines have different phases, and each of these phases has various kinds of associated risks. AI-based security tools can be a very powerful extension to toolkits, with use cases such as security monitoring, malware detection, etc. On the other hand, there are many AI-related attack techniques, including training data poisoning, adversarial inputs, and model theft; per Gartner predictions, through 2022, 30% of all AI cyberattacks will leverage these techniques. Every innovation in AI can be exploited by attackers to find new vulnerabilities. Among the AI attacks that security professionals must explore are phishing and identity theft.
One of the most important things to note here is that the trends listed above cannot exist in isolation. IT leaders must analyse what combination of these trends will drive the most innovation and strategy, fitting it into their business models. Soon we will have smart spaces around us in the form of factories, offices, and cities, with increasingly insightful digital services everywhere for an ambient experience.
“Uber, the world’s largest taxi company, owns no vehicles. Facebook, the world’s most popular media owner, creates no content. Alibaba, the most valuable retailer, has no inventory. Netflix, the world’s largest movie house, owns no cinemas. And Airbnb, the world’s largest accommodation provider, owns no real estate. Something interesting is happening.”
– Tom Goodwin, an executive at the French media group Havas.
This new breed of companies is the fastest growing in history because they own the customer interface layer: the platform where all the value and profit is. “Platform business” is a more wholesome term for this model, for which data is the fuel; Big Data and AI/ML technologies are the harbingers of new waves of productivity growth and innovation.
With Big Data and AI/ML making a big difference in the area of public health, let’s see how they are helping us tackle the global emergency of the coronavirus, formally known as COVID-19.
Chinese technology giant Alibaba has developed an AI system for detecting COVID-19 in CT scans of patients’ chests with 96% accuracy against viral pneumonia cases. It takes the AI only 20 seconds to decide, whereas humans generally take about 15 minutes to diagnose the illness, as there can be upwards of 300 images to evaluate. The system was trained on images and data from 5,000 confirmed coronavirus cases and has been tested in hospitals throughout China. Per a report, at least 100 healthcare facilities are currently employing Alibaba’s AI to detect COVID-19.
Ping An Insurance (Group) Company of China, Ltd (Ping An) aims to address the shortage of radiologists by introducing the COVID-19 smart image-reading system, which can read huge volumes of CT scans in epidemic areas.
Ping An Smart Healthcare uses clinical data to train the AI model of the COVID-19 smart image-reading system. The AI analysis engine conducts a comparative analysis of multiple CT scan images of the same patient and measures the changes in lesions. It helps in tracking the development of the disease, evaluating the treatment, and in the prognosis of patients. Ultimately, it assists doctors in diagnosing, triaging, and evaluating COVID-19 patients swiftly and effectively.
Ping An Smart Healthcare’s COVID-19 smart image-reading system also supports remote AI image-reading by medical professionals outside the epidemic areas. Since its launch, the smart image-reading system has provided services to more than 1,500 medical institutions, and more than 5,000 patients have received smart image-reading services for free.
The more solutions the better, at least when it comes to helping overwhelmed doctors provide better diagnoses and, thus, better outcomes.
AI based Temperature monitoring & scanning
In Beijing, China, subway passengers are being screened for symptoms of coronavirus, but not by health authorities. Instead, artificial intelligence is in charge.
Two Chinese AI giants, Megvii and Baidu, have introduced temperature-scanning systems. They have deployed scanners to detect body temperature and send alerts to company workers if a person’s body temperature is high enough to constitute a fever.
Megvii’s AI system detects body temperatures for up to 15 people per second and from up to 16 feet away. It monitors as many as 16 checkpoints in a single station. The system integrates body detection, face detection, and dual sensing via infrared cameras and visible light, and it can accurately detect and flag high body temperature even when people are wearing masks, hats, or covering their faces with other items. Megvii’s system also sends alerts to an on-site staff member.
Baidu, one of the largest search-engine companies in China, screens subway passengers at the Qinghe station with infrared scanners. It also uses a facial-recognition system, taking photographs of passengers’ faces. If the Baidu system detects a body temperature of at least 99 degrees Fahrenheit, it sends an alert to a staff member for a secondary screening. The technology can scan the temperatures of more than 200 people per minute.
AI based Social Media Monitoring
An international team is using machine learning to scour social media posts, news reports, data from official public health channels, and information supplied by doctors for warning signs of the virus across geographies. The program looks for social media posts that mention specific symptoms, like respiratory problems and fever, from a geographic area where doctors have reported potential cases. Natural language processing is used to parse the text posted on social media, for example, to distinguish between someone discussing the news and someone complaining about how they feel.
The approach has proven capable of spotting a coronavirus needle in a haystack of big data. This technique could help experts learn how the virus behaves. It may be possible to determine the age, gender, and location of those most at risk more quickly than by using official medical channels.
Data from hospitals, airports, and other public locations are being used to predict disease spread and risk. Hospitals can also use the data to plan for the impact of an outbreak on their resources.
The Kalman filter was pioneered by Rudolf Emil Kalman in 1960, originally designed and developed to solve the navigation problem in the Apollo Project. Since then, it has been applied to numerous cases such as guidance, navigation, and control of vehicles, object tracking in computer vision, trajectory optimization, time-series analysis in signal processing, econometrics, and more.
The Kalman filter is a recursive algorithm which uses a series of measurements observed over time, containing statistical noise, and produces estimates of unknown variables.
For one-day prediction, the Kalman filter can be used, while for the long-term forecast a linear model is used whose main features are Kalman predictors, the infection rate relative to the population, time-dependent features, and weather history and forecasting.
The one-day Kalman prediction is very accurate and powerful, while a longer-period prediction is more challenging but provides a future trend. Long-term prediction does not guarantee full accuracy but provides a fair estimation following the recent trend. The model should be re-run daily to gain better results.
The Center for Systems Science and Engineering at Johns Hopkins University has developed an interactive, web-based dashboard that tracks the status of COVID-19 around the world. The resource provides a visualization of the location and number of confirmed COVID-19 cases, deaths, and recoveries for all affected countries. The data source for the tool is DXY, a Chinese platform that aggregates local media and government reports to provide COVID-19 cumulative case totals in near real-time at the province level in China and the country level elsewhere. Additional data comes from Twitter feeds, online news services, and direct communication sent through the dashboard. Johns Hopkins then confirms the case numbers with regional and local health departments. This kind of data analytics platform plays a pivotal role in addressing the coronavirus outbreak.
All data from the dashboard is also freely available in a public GitHub repository.
One of AI’s core strengths when working on identifying and limiting the effects of virus outbreaks is its tireless nature: AI systems never tire, can sift through enormous amounts of data, and can identify possible correlations and causations that humans can’t.
However, there are limits to AI’s ability to both identify virus outbreaks and predict how they will spread. Perhaps the best-known example comes from the neighboring field of big data analytics. At its launch, Google Flu Trends was heralded as a great leap forward in identifying and estimating the spread of the flu, until it underestimated the 2013 flu season by a whopping 140 percent and was quietly put to rest. Poor data quality was identified as one of the main reasons Google Flu Trends failed. Unreliable or faulty data can wreak havoc on the prediction power of AI.
Bargunan is a Big Data Engineer and a programming enthusiast. His passion is to share his knowledge by writing his experiences about them. He believes “Gaining knowledge is the first step to wisdom and sharing it is the first step to humanity.”