Welcome!

Open Source Cloud Authors: Elizabeth White, Liz McMillan, Pat Romanski, Greg Schulz, Carmen Gonzalez

Related Topics: Open Source Cloud, Java IoT, Industrial IoT, Microsoft Cloud, Machine Learning , Python

Open Source Cloud: Article

Python and gevent

Let’s talk about event loops

The easiest way to make your code run faster is to do less. At some point, though, you don’t want to do less. Maybe you want to do more, without it being any slower. Maybe you want to make what you have fast, without cutting out any of the work. What then? In this enlightened age, the answer is easy — parallelize it! Threads are always a good choice, but without careful consideration, it’s easy to create all manner of strange race conditions or out-of-order data access. So today, let’s talk about a different route: event loops.

Event whats?
If you’re not familiar with evented code, it’s a way to parallelize execution, similar to threading or multiprocessing. Unlike threads, though, evented code is typically cooperative — each execution path must voluntarily give up control. Each of these execution units actually runs in serial, and when finished, returns control to the main loop. The parallelization gain comes from cleverly dividing the work so that when they make a blocking call (i.e. a DB call, HTTP request, or disk access), they give up control, letting the main event loop run other functions and wait for the call to return. This is perfect for cases where you do a lot of I/O and relatively little work in the evented thread itself.

In my case, I’m interested in doing a number of heterogeneous, but related, data lookups in response to a web request. We’re running all of this behind our data access layer, which is an evented Thrift server. I have a number of functions (with a common API) I’m interested in running, and a naive implementation would look like this:

Python-Gevent-1

What does that do?

If we run this on a machine with TraceView installed, we’ll see the following request structure:

Pretty predictable. We called into each function serially, which is exactly what we said we’d do. We can also look at the raw events TraceView collected, and they tell a similar story:

All together now!

This seems like a good baseline, but let’s see what happens when we parallelize it. Let’s use Python’s gevent, which has two major selling points. First, it implements an event loop based on libevent, which means we won’t have to worry about actually implementing the event loop. We can just spawn separate greenlet (i.e., non-native) tasks, and gevent will handle all the scheduling. The other big advantage is that gevent knows about and can monkeypatch existing synchronous libraries to cede control when they block. This means that outside of the actual event spawning, we can leave our existing code untouched, and our external calls will just do the right thing.

That just leaves the questions of how to break up the work into parallel coroutines. It seems natural to give each of these functions their own, and then have our “main thread” wait for them all the finish. We do this by firing off each function in a separate task and collecting them in a list. We then wait for all of those tasks to finish and collect the results. Easy! Here’s what the same function as above looks like, but evented:

Python-Gevent-2

This calling change is all* that’s necessary to parallelize these functions! The next question is, did that help? Let’s look at the same graphs we had before, but now for the evented case:

Definitely different! Instead of running everything sequentially, we can see all seven functions running at the same time. As we’d hoped, this has a major impact on our total response time, as well. It’s 500ms faster — a speedup of 2x!

*Caveats
OK, so it’s not as perfectly simple as this example. There are a few “gotchas” that are worth bearing in mind when you start to use this in a real example.

This first one is that gevent mimics separate threads for each coroutine. This means that if you’re storing global state that’s thread-aware, gevent may discard it. Notably, Pylons/Pyramid uses thread-safe object proxies to store global request state, which means that new coroutines will hide that information from you. In our production version of this code, we explicitly pass that state from caller to callee, then set it in the global “pylons.request” object before running the function. It lets us seemlessly mix evented and non-evented functions, while only thinking about the details of gevent in one place.

The second big gotcha is error handling. Since these aren’t normal function calls, exceptions don’t propagate to the caller. They must be explicitly checked for on the task and re-thrown, if appropriate. This sort of error-case-checking is familiar to any C programmer, but it’s different than the normal Python idiom, so it’s worth thinking about.

Another caveat is that spawning multiple events doesn’t actually get you code-level parallelization. It runs blocking calls in parallel, but you still only get one interpreter thread to run your Python (no magic GIL sidestep here!). If you’re looking to speed up heavy computations or other CPU-intensive work, check out the multiprocessing module. Eventing really shines when the majority of your work is database calls, file access, or other blocking, out-of-process work.

Finally, if you’re looking to trace these kinds of calls with TraceView (like we did here), it’s pretty straightforward. The only thing to remember is to wrap your evented function calls using “oboe.log_method”, and pass “entry_kvs={‘Async’: True}”. The ensures that we calculate the timing information properly for all your parallel work.

But that’s it! You can use this technique to speed up existing projects, or build something an entirely new with gevent at the core. What are you planning on doing with it?

And, of course, if you want to instrument your evented project now, sign up for a 14-day trial of TraceView today!

Related Articles

More Stories By TR Jordan

A veteran of MIT’s Lincoln Labs, TR is a reformed physicist and full-stack hacker – for some limited definition of full stack. After a few years as Software Development Lead with Thermopylae Science and Techology, he left to join Tracelytics as its first engineer. Following Tracelytics merger with AppNeta, TR was tapped to run all of its developer and market evangelism efforts. TR still harbors a not-so-secret love for Matlab-esque graphs and half-baked statistics, as well as elegant and highly-performant code. Read more of his articles at www.appneta.com/blog or visit www.appneta.com.

@ThingsExpo Stories
Amazon started as an online bookseller 20 years ago. Since then, it has evolved into a technology juggernaut that has disrupted multiple markets and industries and touches many aspects of our lives. It is a relentless technology and business model innovator driving disruption throughout numerous ecosystems. Amazon’s AWS revenues alone are approaching $16B a year making it one of the largest IT companies in the world. With dominant offerings in Cloud, IoT, eCommerce, Big Data, AI, Digital Assis...
SYS-CON Events announced today that Progress, a global leader in application development, has been named “Bronze Sponsor” of SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Enterprises today are rapidly adopting the cloud, while continuing to retain business-critical/sensitive data inside the firewall. This is creating two separate data silos – one inside the firewall and the other outside the firewall. Cloud ISVs oft...
The 21st International Cloud Expo has announced that its Call for Papers is open. Cloud Expo, to be held October 31 - November 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA, brings together Cloud Computing, Big Data, Internet of Things, DevOps, Digital Transformation, Machine Learning and WebRTC to one location. With cloud computing driving a higher percentage of enterprise IT budgets every year, it becomes increasingly important to plant your flag in this fast-expanding busin...
Internet of @ThingsExpo, taking place October 31 - November 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA, is co-located with the 21st International Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry players in the world. @ThingsExpo Silicon Valley Call for Papers is now open.
As cloud adoption continues to transform business, today's global enterprises are challenged with managing a growing amount of information living outside of the data center. The rapid adoption of IoT and increasingly mobile workforce are exacerbating the problem. Ensuring secure data sharing and efficient backup poses capacity and bandwidth considerations as well as policy and regulatory compliance issues.
SYS-CON Events announced today that Interoute has been named “Bronze Sponsor” of SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Interoute is the owner operator of Europe's largest network and a global cloud services platform, which encompasses over 70,000 km of lit fiber, 15 data centers, 17 virtual data centers and 33 colocation centers, with connections to 195 additional partner data centers. Our full-service Unifie...
In order to meet the rapidly changing demands of today’s customers, companies are continually forced to redefine their business strategies in order to meet these needs, stay relevant and continue to see profitable growth. IoT deployment and development is integral in this transformation, and today businesses are increasingly seeing the value of investing their resources into IoT deployments. These technologies are able increase ROI through projects such as connecting supply chains or enabling sm...
SYS-CON Events announced today that Progress, a global leader in application development, has been named “Bronze Sponsor” of SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Enterprises today are rapidly adopting the cloud, while continuing to retain business-critical/sensitive data inside the firewall. This is creating two separate data silos – one inside the firewall and the other outside the firewall. Cloud ISVs ofte...
DevOps is often described as a combination of technology and culture. Without both, DevOps isn't complete. However, applying the culture to outdated technology is a recipe for disaster; as response times grow and connections between teams are delayed by technology, the culture will die. A Nutanix Enterprise Cloud has many benefits that provide the needed base for a true DevOps paradigm.
SYS-CON Events announced today that DivvyCloud will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. DivvyCloud software enables organizations to achieve their cloud computing goals by simplifying and automating security, compliance and cost optimization of public and private cloud infrastructure. Using DivvyCloud, customers can leverage programmatic Bots to identify and remediate common cloud problems in rea...
SYS-CON Events announced today that Outscale, a global pure play Infrastructure as a Service provider and strategic partner of Dassault Systèmes, will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Founded in 2010, Outscale simplifies infrastructure complexities and boosts the business agility of its customers. Outscale delivers a secure, reliable and industrial strength solution for its customers, which in...
SYS-CON Events announced today that Cloudistics, an on-premises cloud computing company, has been named “Bronze Sponsor” of SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Cloudistics delivers a complete public cloud experience with composable on-premises infrastructures to medium and large enterprises. Its software-defined technology natively converges network, storage, compute, virtualization, and management into a ...
New competitors, disruptive technologies, and growing expectations are pushing every business to both adopt and deliver new digital services. This ‘Digital Transformation’ demands rapid delivery and continuous iteration of new competitive services via multiple channels, which in turn demands new service delivery techniques – including DevOps. In this power panel at @DevOpsSummit 20th Cloud Expo, moderated by DevOps Conference Co-Chair Andi Mann, panelists will examine how DevOps helps to meet th...
SYS-CON Events announced today that A&I Solutions has been named “Bronze Sponsor” of SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Founded in 1999, A&I Solutions is a leading information technology (IT) software and services provider focusing on best-in-class enterprise solutions. By partnering with industry leaders in technology, A&I assures customers high performance levels across all IT environments including: mai...
Every successful software product evolves from an idea to an enterprise system. Notably, the same way is passed by the product owner's company. In his session at 20th Cloud Expo, Oleg Lola, CEO of MobiDev, will provide a generalized overview of the evolution of a software product, the product owner, the needs that arise at various stages of this process, and the value brought by a software development partner to the product owner as a response to these needs.
SYS-CON Events announced today that Tappest will exhibit MooseFS at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. MooseFS is a breakthrough concept in the storage industry. It allows you to secure stored data with either duplication or erasure coding using any server. The newest – 4.0 version of the software enables users to maintain the redundancy level with even 50% less hard drive space required. The software func...
Most technology leaders, contemporary and from the hardware era, are reshaping their businesses to do software in the hope of capturing value in IoT. Although IoT is relatively new in the market, it has already gone through many promotional terms such as IoE, IoX, SDX, Edge/Fog, Mist Compute, etc. Ultimately, irrespective of the name, it is about deriving value from independent software assets participating in an ecosystem as one comprehensive solution.
SYS-CON Events announced today that EARP will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. "We are a software house, so we perfectly understand challenges that other software houses face in their projects. We can augment a team, that will work with the same standards and processes as our partners' internal teams. Our teams will deliver the same quality within the required time and budget just as our partn...
SYS-CON Events announced today that delaPlex will exhibit at SYS-CON's @ThingsExpo, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. delaPlex pioneered Software Development as a Service (SDaaS), which provides scalable resources to build, test, and deploy software. It’s a fast and more reliable way to develop a new product or expand your in-house team.
In his keynote at @ThingsExpo, Chris Matthieu, Director of IoT Engineering at Citrix and co-founder and CTO of Octoblu, focused on building an IoT platform and company. He provided a behind-the-scenes look at Octoblu’s platform, business, and pivots along the way (including the Citrix acquisition of Octoblu).