@DevOpsSummit: Blog Post

Observer Effect on Microservices Architecture | @DevOpsSummit

The Impact of the Observer Effect on Microservices Architecture

Application availability is not just the measure of "being up". Many apps can claim that status. Technically they are running and responding to requests, but at a rate which users would certainly interpret as being down. That's because excessive load times can be (and will be) interpreted as "not available." That's why ensuring application availability requires attention to all its composite parts: scalability, performance, and security.

The paradox begins when we consider how we ensure scale, performance, and security: monitoring and measuring. That is, we observe certain characteristics about the network, compute, and application resources to gain an understanding of the status of the application. That necessarily means we have to interact with those components that need monitoring and measuring and thus we enter the world of physics.

The Observer Effect simply states that observing something necessarily changes the thing being observed. When it's a sentient being, this often takes the form of the Hawthorne Effect, which claims that sentient beings will change their behavior when they know they're being observed. Go ahead, try it out on your kids. If they know they're being watched they're angels. But turn your back on them for a minute and wham! They've destroyed their play room and littered popcorn all over the floor.

Within the realm of IT, the effect is no less active:

In information technology, the observer effect is the potential impact of the act of observing a process output while the process is running. For example: if a process uses a log file to record its progress, the process could slow. Furthermore, the act of viewing the file while the process is running could cause an I/O error in the process, which could, in turn, cause it to stop.

Another example would be observing the performance of a CPU by running both the observed and observing programs on the same CPU, which will lead to inaccurate results because the observer program itself affects the CPU performance (modern, heavily cached and pipelined CPUs are particularly affected by this kind of observation).

-- https://en.wikipedia.org/wiki/Observer_effect_(information_technology)

The act of measuring capacity and performance of a system* - say an app or an individual microservice - alters its state by consuming resources that in turn increase total load, which, as operational axiom #2 says, ultimately degrades both capacity and performance. This is one of the reasons agent-based monitoring has always been a less favorable choice for APM: the presence of the agent on the system necessarily reduces capacity and performance.

The Observer Effect is going to be particularly impactful on applications composed of microservices because of, well, math. If the act of measuring and monitoring one monolithic application degrades performance by X, then the act of measuring and monitoring a microservices-based application is going to degrade performance by some multiple of X. It could be argued that the impact on a microservices-based application is actually not X per service, but some fraction of X, given that the point is to distribute services in such a way that not all services are being taxed at the same rate as in a monolithic application. That would be true if the microservices were being used as part of a single application, but one of the benefits - and target uses - of microservices is reuse. That implies that multiple apps or APIs are going to make use of each service, thus increasing the need to measure and monitor the capacity and performance of each service.
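The arithmetic can be made concrete with a small sketch. The model below is an assumption for illustration, not a measurement: it treats the monitoring cost of a monolith as X, charges each microservice a hypothetical fraction of X, and sums across services. The function name and every number are invented for the example.

```python
def total_monitoring_cost(monolith_cost: float, fraction_per_service: float,
                          service_count: int) -> float:
    """Sum the monitoring cost across all services.

    Hypothetical model: each service pays `fraction_per_service` of the
    cost X (`monolith_cost`) that monitoring imposes on one monolith.
    """
    if service_count < 0 or fraction_per_service < 0:
        raise ValueError("counts and fractions must be non-negative")
    return monolith_cost * fraction_per_service * service_count


# Illustrative numbers only: X = 4.0 units, each service pays a quarter of X.
X = 4.0
microservices_cost = total_monitoring_cost(X, fraction_per_service=0.25,
                                           service_count=12)
print(f"monolith: {X}, microservices: {microservices_cost}")
```

Under these assumed numbers, twelve services at a quarter of X each cost three times what the monolith did. In this model the total only drops below X when the fraction multiplied by the number of services is less than one.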

active versus passive monitoring

This is where architecture and technique matter, and where the design and implementation of the measuring and monitoring for performance and load of microservices becomes an important piece of ensuring availability. While each and every point of control - an API gateway or service discovery system or load balancer or proxy - can measure each microservice for which it performs its assigned tasks, doing so is likely to unnecessarily increase the impact of the Observer Effect on the microservice. That's because most points of control take an active approach to monitoring and measuring load and performance. That is, they purposefully poll a system to enquire about its status and responsiveness. They use ICMP pings, they use TCP half-opens, and they use HTTP content requests to gather the data they need.

Each of these methods interacts with the system in question and thus fulfills the Observer Effect's prediction. The more systems gathering this data, the more interaction occurs and the greater the impact of the Observer Effect.
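A toy simulation shows how quickly active polling adds up. This is a sketch under stated assumptions: `Service` is a stand-in that only counts requests, there is no real network traffic, and the number of observers, the polling rate, and the user traffic are all made-up values.

```python
from dataclasses import dataclass


@dataclass
class Service:
    """Stand-in for a microservice that counts every request it handles."""
    user_requests: int = 0
    probe_requests: int = 0

    def handle(self, is_probe: bool = False) -> int:
        # A probe occupies the service just as real traffic does.
        if is_probe:
            self.probe_requests += 1
        else:
            self.user_requests += 1
        return 200


def run_active_polling(service: Service, observers: int,
                       polls_per_observer: int) -> None:
    """Each observer independently polls the service for its status."""
    for _ in range(observers):
        for _ in range(polls_per_observer):
            service.handle(is_probe=True)


svc = Service()
for _ in range(600):  # hypothetical user traffic in one time window
    svc.handle()
run_active_polling(svc, observers=4, polls_per_observer=60)

total = svc.user_requests + svc.probe_requests
print(f"probes: {svc.probe_requests} of {total} requests")
```

With four hypothetical observers each polling sixty times in the window, 240 of the 840 requests the service handled were monitoring traffic rather than user traffic. Every additional observer adds another sixty.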

That means there must be greater attention paid to the way in which microservices are monitored and measured - including the techniques used to accomplish it.

Passive approaches to measuring and monitoring provide one means of avoiding the Observer Effect. That's because they - as the term implies - passively observe status and measure performance without actively probing systems for this data. This is typically achieved by leveraging intermediate systems like load balancers and proxies through which requests and responses necessarily flow to capture status information as it is passing through.
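As a rough illustration of the passive approach, the sketch below wraps a backend in a hypothetical `PassiveProxy` that records status and latency only for requests that were already on their way to the service. The class names, paths, and status codes are assumptions made for the example, not any particular product's API.

```python
import time
from typing import Callable


class PassiveProxy:
    """Hypothetical intermediary that measures requests already passing through."""

    def __init__(self, backend: Callable[[str], int]) -> None:
        self._backend = backend
        self.requests = 0
        self.errors = 0
        self.total_seconds = 0.0

    def forward(self, path: str) -> int:
        start = time.perf_counter()
        status = self._backend(path)  # the only call made to the service
        self.total_seconds += time.perf_counter() - start
        self.requests += 1
        if status >= 500:
            self.errors += 1
        return status

    def average_latency(self) -> float:
        return self.total_seconds / self.requests if self.requests else 0.0


class CountingBackend:
    """Stand-in service: counts calls and fails on one hypothetical path."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, path: str) -> int:
        self.calls += 1
        return 503 if path == "/broken" else 200


backend = CountingBackend()
proxy = PassiveProxy(backend)
for path in ["/a", "/b", "/broken"]:
    proxy.forward(path)

print(proxy.requests, proxy.errors, backend.calls)
```

The backend sees exactly three calls, the same three the users made. The proxy learns the error count and latency without sending a single probe of its own.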

The measurements are then used by the intermediary, of course, to manage distribution of load but are also exposed via APIs for collection by other systems. Those statistics gathered from an intermediary are likely* to have no impact on performance because they are managed by a system separate from the real-time execution of the intermediary.

It is important to consider the availability of the statistics via APIs to external systems when architecting a solution based on passive monitoring and measurement techniques. If the system performing the monitoring and measuring makes available the data it has collected, it relieves other systems of needing to directly measure each service's status and performance and further reduces the impact of the Observer Effect on the overall system.
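One way to picture this is a registry of statistics that the intermediary fills as responses pass through and that other systems read as JSON. This is a minimal sketch: `StatsRegistry`, the field names, the service name "orders", and the three consumers are all hypothetical, and a real intermediary would expose the snapshot over its own management API.

```python
import json
import threading
from typing import Dict


class StatsRegistry:
    """Hypothetical per-service statistics held by the intermediary."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._stats: Dict[str, Dict[str, float]] = {}

    def record(self, service: str, latency_ms: float, ok: bool) -> None:
        """Called by the intermediary as each real response passes through."""
        with self._lock:
            entry = self._stats.setdefault(
                service, {"requests": 0, "errors": 0, "latency_ms_total": 0.0}
            )
            entry["requests"] += 1
            entry["errors"] += 0 if ok else 1
            entry["latency_ms_total"] += latency_ms

    def snapshot_json(self) -> str:
        """What an API endpoint could return; reading it never touches a service."""
        with self._lock:
            return json.dumps(self._stats, sort_keys=True)


registry = StatsRegistry()
registry.record("orders", latency_ms=12.0, ok=True)
registry.record("orders", latency_ms=30.0, ok=False)

# Any number of external systems can read the same snapshot.
for consumer in ("dashboard", "autoscaler", "alerting"):
    payload = json.loads(registry.snapshot_json())
    print(consumer, payload["orders"]["requests"])
```

All three consumers get the same numbers from the intermediary, and none of them has to contact the service to do so.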

This is one of the ways in which the collaborative aspects of DevOps can provide significant value. As ops and net ops work together to establish a more efficient means of measuring and monitoring the availability of systems like microservices, they can provide those statistics to dev as valuable input, either directly through APIs or through integration with other established systems.

At an operational level this effort also establishes a more centralized location from which performance-related data can be retrieved (in real-time) and used to trigger other actions such as auto-scaling (up and down) - a critical capability when moving to microservices architectures, in which the number and variability of usage and services require a more automated approach to operations than their monolithic predecessors did.
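A scaling trigger fed by those centrally collected statistics might look something like the following. It is a sketch of one simple proportional rule, not a recommendation: the function name, the latency target, and the instance limits are assumptions chosen for the example.

```python
import math


def desired_instances(current: int, observed_latency_ms: float,
                      target_latency_ms: float,
                      minimum: int = 1, maximum: int = 20) -> int:
    """Scale the instance count in proportion to observed vs. target latency."""
    if current < 1 or target_latency_ms <= 0:
        raise ValueError("current must be >= 1 and target must be positive")
    proposed = math.ceil(current * observed_latency_ms / target_latency_ms)
    # Clamp so a bad reading cannot scale to zero or without bound.
    return max(minimum, min(maximum, proposed))


# Hypothetical readings taken from the intermediary's statistics.
scale_up = desired_instances(current=4, observed_latency_ms=300.0,
                             target_latency_ms=200.0)
scale_down = desired_instances(current=4, observed_latency_ms=100.0,
                               target_latency_ms=200.0)
print(scale_up, scale_down)
```

Because the latency figure comes from the intermediary rather than from a fresh probe, the scaling decision itself adds no load to the services it is scaling.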

* This is less applicable to virtual network appliances because they are purposefully designed to separate operational and actionable systems to ensure that management of the system (measuring, monitoring, modifying) does not impact the performance of the actionable system. This is carried over from their roots in hardware, where "lights out" management is a requirement.

More Stories By Lori MacVittie

Lori MacVittie is responsible for education and evangelism of application services available across F5’s entire product suite. Her role includes authorship of technical materials and participation in a number of community-based forums and industry standards organizations, among other efforts. MacVittie has extensive programming experience as an application architect, as well as network and systems development and administration expertise. Prior to joining F5, MacVittie was an award-winning Senior Technology Editor at Network Computing Magazine, where she conducted product research and evaluation focused on integration with application and network architectures, and authored articles on a variety of topics aimed at IT professionals. Her most recent area of focus included SOA-related products and architectures. She holds a B.S. in Information and Computing Science from the University of Wisconsin at Green Bay, and an M.S. in Computer Science from Nova Southeastern University.
