
Python Sweetness: Operon: Extreme Performance For Ansible


I'm very excited to unveil Operon, a high performance replacement for Ansible® Engine, tailored for large installations and offered by subscription. Operon runs your existing playbooks, modules, plug-ins and third party tools without modification using an upgraded engine, dramatically increasing the practical number of nodes addressable in a single run, and potentially saving hours on every invocation.

Operon can be installed independently or side-by-side with Ansible Engine, enabling it to be gradually introduced to your existing projects or employed on a per-run basis.

Here is the runtime for 416 tasks of common.yml from DebOps 0.7.2 deployed via SSH:


Operon reduces runtime by around 60% compared to Ansible for a single node, but things really heat up for large runs. See how runtime scales using a 24 GiB, 8 core Xeon E5530 deploying to Google Cloud VMs over an 18 ms SSH connection:


Each run executed 416 tasks per node, including loop items. In the 1,024 node run, 490,496 tasks executed in 54 minutes, giving an average throughput of 151 tasks per second. Linear scaling is apparent, with just under 4x time increase moving from 256 to 1,024 nodes.

The 256 node Ansible run was cancelled following a lengthy period with no output, after many re-runs to iteratively reduce forks from 40 to 10 so that Ansible would not exceed RAM. A 13 fork run may have succeeded, but further attempts were abandoned having consumed two days' worth of compute time.

In the final run, Ansible completed 89% of tasks in 6h 13m prior to cancellation:

[Chart: 256 nodes, DebOps common.yml]

Operon deployed to all nodes in parallel for every run presented. Operon has imperceptible overhead executing 1,024 forks given 8 cores, and cleanly scales to at least 6,144 forks given 24 cores. Had these results been recorded using 16 cores rather than 8, we expect the 1,024 node run would have completed in 27 minutes rather than 54.

Memory usage is highly predictable and significantly decoupled from forks. With 256 forks, Operon uses 4x less RAM than Ansible uses for 10 forks, while consuming at least 15x less controller CPU time to achieve the same outcome.


This graph is skewed because the 64 node Ansible run executed with 40 forks, while the 256 node run executed with 10 forks. Ansible required 1.6 GiB per fork for the 256 node run, placing a severe constraint on achievable parallelism regardless of available RAM.

Operon is the progression of a design approach that debuted in Mitogen for Ansible. It inherits massive low-level efficiency improvements from that work, which is already depended on by thousands of users.



Beyond software

Performance is a secondary effect of a culture shift towards stronger user friendliness, compatibility and cost internalization. There is a lot to reveal here, but to offer a taste of what's planned, I'm pleased to announce a forwards-compatible playbook syntax guarantee, in addition to restoration of specific Ansible Engine constructs marked deprecated.

include:

    - include: "i-will-always-work.yml"

"with" loops:

    - debug: msg={{item}}
      with_items: ["i", "will", "always", "work"]

"squash actions":

    - apt:
        name: "{{item}}"
      with_items: ["i", "will",
                   "always", "work"]

hyphens in group names:

    $ cat hosts
    [i-will-always-work.us.mycorp.com]
    host1

hash merging:

    # I will always work
    [defaults]
    hash_behaviour = merge

The Ansible 2.9-compatible syntax Operon ships will always be supported, and future syntax deprecations in Ansible Engine do not apply in Operon. Changes like these harm working configurations without improving capability, and are a major source of error-prone labour during upgrades.

Over time this guarantee will progressively extend to engine semantics and outwards.

How can I get this?

Operon is initially distributed with support from Network Genomics, backed by experience and dedication to service unavailable elsewhere. If your team are gridlocked by deployments or fatigued by years of breaking upgrades, consider requesting an evaluation, and don't hesitate to drop me an e-mail with any questions and concerns.

Software is always better in the open, so a public release will happen when some level of free support can be provided. Subscribe to the operon-announce mailing list to learn about future releases.

Will Operon help Windows performance?

Yes. If you're struggling with performance deploying to Windows, please get in touch.

Will Operon help network device performance?

Yes. Operon features an architectural redesign that extends far beyond the transport layer and applies to all connection types equally.

Is Operon a fork of Ansible?

No. Operon is an incremental rewrite of the engine, a small component of around 60k code lines, of which around a quarter are replaced. Every Ansible installation includes around 715k lines, of which the vast majority is independently maintained by the wider Ansible community, just as Operon is.

Will Operon help improve Ansible Engine?

Yes. Operon is already promoting improvement within Ansible Engine, and since it remains an upstream, an incentive exists to contribute code upstream where practical.

Is Operon free software?

Yes. Operon is covered by the same GPL license that covers Ansible, and you are free to make use of the code to the full extent of that license.

Does Operon break compatibility?

No. Operon does not break compatibility with the standard module collection, plug-in interfaces, or the surrounding Ansible ecosystem, and never plans to. Compatibility is a primary deliverable: forwards compatibility, keeping pace with future improvements, and backwards compatibility, such as the improved playbook syntax stability described above.

I target only one node, what can Operon do for me?

Operon will help ensure the continued marketability of skills you have heavily invested in. It offers a powerful new flexibility that previously could not exist: your freedom to choose an engine. Whether you use it directly or not, you already benefit from Operon.


David


Samuel Sutch: Python Parallel Programming Cookbook: Over 70 recipes to solve challenges in multithreading and distributed system with Python 3, 2nd Edition



Price: $39.99
(as of Oct 28, 2019 13:14:47 UTC)



Giancarlo Zaccone has over fifteen years’ experience of managing research projects in the scientific and industrial domains. He is a software and systems engineer at the European Space Agency (ESTEC), where he mainly deals with the cybersecurity of satellite navigation systems.
Giancarlo holds a master’s degree in physics and an advanced master’s degree in scientific computing.
Giancarlo has already authored the following titles, available from Packt: Python Parallel Programming Cookbook (First Edition), Getting Started with TensorFlow, Deep Learning with TensorFlow (First Edition), and Deep Learning with TensorFlow (Second Edition).



ListenData: Loan Amortisation Schedule using R and Python

In this post, we will explain how you can calculate your monthly loan instalments the way a bank calculates them, using R and Python. In the financial world, analysts generally use MS Excel for calculating the principal and interest portions of an instalment, using the PPMT and IPMT functions. As data science is growing and trending these days, it is important to know how you can do the same using popular data science programming languages such as R and Python.

When you take a loan from a bank at x% annual interest for N years, the bank calculates monthly (or quarterly) instalments based on the following factors:

  • Loan Amount
  • Annual Interest Rate
  • Number of payments per year
  • Number of years for loan to be repaid in instalments

Loan Amortisation Schedule

It refers to a table of periodic loan payments showing the breakup of principal and interest in each instalment/EMI until the loan is repaid at the end of its stipulated term. Monthly instalments are generally the same every month throughout the term, provided the interest rate and term do not change. Sometimes a bank restructures a loan portfolio and reduces the interest rate but increases the term (i.e. the number of years over which you pay monthly instalments), so the monthly instalment changes.

How much principal and interest is in each instalment?

We generally pay a higher interest portion initially, and it goes down in successive months because it depends on the outstanding loan balance. Once you pay the first monthly instalment, your loan balance goes down from the original loan amount (i.e. the origination amount) to (original loan amount - principal paid in the first instalment).

The principal part of the instalment therefore goes up every month. Since the instalment is the sum of the principal and interest amounts, as the principal portion goes up, the interest portion goes down to balance it out.

Example: You took a personal loan of 50,000 over a period of 6 years at 8.5% per annum, paid monthly (12 payments per year).
The table below shows the amortisation schedule for the first year. The remaining five years follow the same pattern, since the term is six years.
[Image: amortisation schedule table in R and Python]
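
As a quick illustration of the arithmetic behind PPMT and IPMT, here is a minimal Python sketch (an illustrative example, not the article's own code, which sits behind the READ MORE link) that computes the fixed instalment and the interest/principal split for each period:

def amortisation_schedule(principal, annual_rate, years, payments_per_year=12):
    """Yield (period, payment, interest, principal_paid, balance) rows."""
    r = annual_rate / payments_per_year           # periodic interest rate
    n = years * payments_per_year                 # total number of payments
    # Fixed instalment (EMI): P * r * (1+r)^n / ((1+r)^n - 1)
    payment = principal * r * (1 + r) ** n / ((1 + r) ** n - 1)
    balance = principal
    for period in range(1, n + 1):
        interest = balance * r                    # IPMT: interest portion
        principal_paid = payment - interest       # PPMT: principal portion
        balance -= principal_paid
        yield period, payment, interest, principal_paid, balance

# The post's example: 50,000 over 6 years at 8.5% per annum, paid monthly
for row in list(amortisation_schedule(50000, 0.085, 6))[:3]:
    print("Period %d: payment %.2f, interest %.2f, principal %.2f, balance %.2f" % row)
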
READ MORE »

Real Python: Python Community Interview With Al Sweigart


This week, I’m joined by Al Sweigart, a familiar name in the Python community. Al is an accomplished developer, conference speaker, teacher, and origamist. (Yes, you read that correctly!) But some may know him best as the author of many Python programming books, including the bestselling book Automate the Boring Stuff with Python and our top pick, Invent Your Own Computer Games with Python. So, without any further ado, let’s get into it!

Ricky: Welcome to Real Python, Al. We’re so glad you could join us for this interview. Let’s start the same way we do with all our guests. How’d you get into programming, and when did you start using Python?

Al Sweigart

Al: Thank you! Heh, I absolutely hate telling people how I got into programming because I was one of those kids who started learning BASIC around the third grade or so.

I don’t like telling people that because I feel like it contributes to the idea that in order to become a programmer you had to have started when you were really young. Like, if you weren’t debugging subroutines as a fetus, there’s no chance you’ll make it a career. So I tell people I started when I was a kid.

I also tell them that most of my programs were pretty mediocre for several years. I didn’t have Wikipedia or Google or Stack Overflow, so I kept making variations of a “guess the number” game, or began projects that I didn’t have the technical knowledge to complete. My head start didn’t amount to much. Everything I learned about programming and computers across several years as a kid and teen could be learned today in about a couple dozen weekends.

I think the main benefit of starting young was that I didn’t know programming was supposed to be hard. Today, everyone thinks of AI and machine learning and video games with amazing 3D graphics. I was just messing around in my spare time in a completely unfocused way, but I was totally fine with that.

I didn’t learn that much knowledge-wise, but I did pick up the idea that programming was just a thing you could learn to do like anything else. It didn’t require super smarts or Olympic-level training to do.

My first programming language was BASIC and soon after Qbasic, but I also picked up a little C, Visual Basic, Perl, Java, PHP, and JavaScript. It seems like a lot, but I never really mastered any of them. I just learned enough to complete whatever project I was working on in those languages at the time.

I got into Python around 2005, and sort of stopped learning new languages after that. I keep feeling the urge to explore new ones (Kotlin, Rust, and Dart have been in my sights for a while), but Python is just so easy to use for so many areas that I haven’t had a strong enough pull away from it yet.

Ricky: You know, my first Python programming book was your book Automate the Boring Stuff with Python. It’s something I still reference, even today. I’d love to know what inspired you to write a book that isn’t necessarily aimed at people who want to be professional programmers in the traditional sense.

Al: Haha, I’m always amazed at how popular that book has become! I still reference the book myself: I’ll be working on something and can’t quite remember a function name and I’ll realize, “Wait, I wrote this down before.” I really hate job interviews where they forbid you from consulting books or documentation for the coding interview. When I’m programming, I consult books that I wrote.

I got started writing books around 2008 or 2009. My girlfriend at the time was a nanny for a 10-year-old who wanted to learn to code, but I couldn’t find any tutorials for him online that I liked. Everything was either for software engineers or the same old “let’s calculate Fibonacci numbers” stuff.

I thought back to how I got into coding through those magazines and books that listed the source code for small games. I had only half-understood the explanatory text, but copying the source code and making small modifications really showed me how to put programs together.

So I wrote a tutorial like that which grew in length and eventually became Invent Your Own Computer Games with Python. I put it online, then later sold it as a self-published title. I still had my day job as a software developer, and I never really considered book writing to be a career.

I released the book under a Creative Commons copyright license which let people download and share it for free. That turned out to be critically important because it let people share the book and generated word-of-mouth. Without a Creative Commons license, it would have just been yet another self-published listing on Amazon and I wouldn’t have my current career. Making the book freely available resulted in far more sales.

So I wrote another book and another one, and for my fourth book (which would become Automate the Boring Stuff with Python), I signed a contract with No Starch Press. I’ve really enjoyed working with them: their editors are great, their books are all high-quality, and they were fine with me continuing to release titles under a Creative Commons license.

Around 2012, “everyone should learn to code” was making the rounds in the news again, as it seems to have done every five or ten years for the last few decades. And I thought, “Sure, but why?” Not everyone needs to become a software engineer, and nobody needs to calculate Fibonacci numbers.

I thought about what non-programmers could use coding for. Twenty years ago, if you were chatting with your friends online every day, you were probably a huge nerd. But today, you’re just the average Facebook user. Lots of people use computers for their office jobs or at home.

What kinds of things would they like to automate that won’t require a computer science degree? As it turns out, there’s a lot! I have a friend who joked that if you want to become a millionaire startup founder, just find an industry that still uses fax machines and Excel spreadsheets for everything and write the web app that gets adopted by that entire field.

So I had a list of stuff like “update spreadsheets” and “send notification emails,” and all of that became Automate the Boring Stuff with Python. The software developer job I had at the time had gotten a bit stale, so I left it thinking I’d spend a year finishing Automate and then get another software developer job. I’ve been a full-time writer and online course developer for six years now.

It’s been a lot of luck. I’m lucky that hundreds of people have contributed to a free, open-source language like Python that makes programming so easy. I’m lucky I had savings so I could take a chance with book writing. I’m lucky that others have created the free software culture that led to things like the Creative Commons license.

So when people ask me for book-writing advice, it’s kind of like asking a lottery winner what numbers to pick. I still think there’s a lot of needless barriers in our society that need to be torn down to let people reach their full potential, so that aim has been my guiding star.

Ricky: Anyone who has followed you for any amount of time on Twitter will know you have a passion for teaching Python to people learning to code. You particularly seem to like using video games as the entry point. Have you found that focusing more on the “fun” topics has helped you reach more people, and helped them get into Python? And was it intentional, or were you scratching your own itch?

Al: Just the other day I found a slide from Mahmoud Hashemi’s 2019 PyBay talk that was titled “There are two reasons to start wanting to code” and it had a Venn diagram with “I want to make a video game” and “I want to be free of Excel.” Never have I seen my entire teaching and writing career summed up like that before. Video games are an excellent gateway into programming, especially with a tool like MIT’s Scratch. But it’s kind of hard to use video games sometimes.

I’ll be standing in front of a classroom of 9-year-olds and say, “We’re going to make a game,” and they’re thinking of Minecraft or Breath of the Wild or some other title that had a professional development team with a hundred-million-dollar budget, and instead I say, “How about we make a maze or Simon Says game?” It’s all about managing expectations. There’s still the basic fun of creating something, of telling the computer what to do and seeing it do that.

I’ve been working on creating a collection of small, easy to understand, text-based games in Python, which I’ve put online. The idea is that once you know about programming concepts like loops and variables, you can see these used in actual programs with these games.

I shudder when people tell new programmers to get better by reading the code of open source projects because those projects are often massive and poorly documented. Very few of them have onboarding guides to get new volunteers up to speed with the code base, so it’s hard to wrap your head around that code.

New coders just come away from it feeling intimidated. So my simple games are like training wheels on a bike, and the constraints I follow to enforce simplicity are actually pretty good at making me come up with creative but simple programs.

I’m not a game designer by any means. All the games I’ve made are, at most, a few hundred lines of code, and they’ve always been things that are mechanic-heavy like Tetris or that growing worm game everyone had on their Nokia cell phone. These games don’t require a lot in the way of graphics or level design.

I’m in awe of folks like Toby Fox or Eric Barone who spent years on their own creating Undertale and Stardew Valley, respectively. But even those games stand on the shoulders of giants. Undertale and Stardew Valley were inspired by Earthbound and Harvest Moon, but have the benefit of modern game design insights (and overwhelming talent on the part of Fox and Barone).

Likewise, all of my text-based games are similar to the ones you’d find in David Ahl’s BASIC Computer Games book or BYTE Magazine from the 70s and 80s, but with the benefit of a few decades of game design theory. All of this was intentional on my part (I wouldn’t be creating these small games if not for getting a programming tutorial out of them), but I don’t think I could do it any other way; nothing is as good of a hook to get kids and adults into programming as these fun little video games.

Also, anyone who has followed me for any amount of time on Twitter will also know that I have a passion for political rants. I try to stay on the constructive side for those, at least.

Ricky: You are one of the few programmers (and definitely one of only a handful of Python programmers that I know of) that live codes on Twitch. How has that experience been, and what have you learned through the process? Do we need to be encouraging more Python Twitch streamers?

Al: I’ve been streaming off and on over the last couple of years, sometimes with weeks or months in between streams. It’s an unfortunate truth that you essentially have to do it as a full-time job for about a year before you can grow an audience to a size that can sustain a career, and I don’t have that kind of time. But it’s nice as a hobby.

I can stream myself as I work on my small games or as I record myself creating online courses. It lets me practice being on camera while coding and narrating, which is useful for my online video course work. And it puts me in touch with beginners, so I can understand what parts they have trouble with or what questions they have.

Twitch deletes my streams about a week after I stream them and some folks have asked that I archive them somewhere permanent, but I really don’t think of it as much more than something to have on in the background. I’d much prefer creating polished, 10-minute videos than a four-hour stream of me rambling and trying to look up documentation.

Ricky: Now for our last few questions. What else do you get up to in your spare time? What other hobbies and interests do you have, aside from Python and coding?

Al: Spare time, eh? I’m afraid I don’t understand the question…

A nice thing about my current career is that it certainly keeps me busy. Writing programming tutorials pretty much was my side hobby, and now I get to do it full time, with all the benefits and disadvantages that brings. I have enough projects to last me well into 2021, but at the same time, I want them all done now.

The biggest motivation for finishing my current books and videos is that I’ll be able to start on my next book and video. I jokingly asked a friend if starting a cocaine habit would improve my productivity and she said, “Sure, at first.”

But aside from writing programs and writing books about programs, I like doing origami. It’s something I was interested in as a kid, but I’d always get halfway through the folding diagrams in books before reaching a step I couldn’t figure out, and I’d have to abandon it.

This is another thing from my childhood that the internet has improved tremendously: these days, there’s a million origami folding videos online, so I’m able to put together models that are far more impressive than anything I could have done as a kid.

Another friend pointed out that writing software, writing books, folding origami, and essentially all my hobbies have one thing in common: they’re all cheap! Give me an old laptop and a stack of paper, and I’ll be fairly content.


If you want to see what Al is up to, you can reach out to him on Twitter and say hi. His blog and his plethora of books can be found on his website, Invent With Python. Thank you, Al, for talking to me!

If you have someone you’d like me to reach out to for an interview, leave a comment below and let me know.



Samuel Sutch: Python for Kids: A Playful Introduction to Programming



Price: $19.79
(as of Oct 28, 2019 15:15:29 UTC)


From the Author: Top 5 Tips & Tricks for Beginning Programmers

Python for Kids

1. Never try to understand a long piece of code (or a long program) in one go. Focus on a few statements at a time. If possible, take a smaller chunk of the code and run it yourself to see what it does. Experimenting is always good; even if it doesn’t work and you get weird error messages, you’ve learned something!



Reuven Lerner: Early-bird pricing for Weekly Python Exercise ends tomorrow


My native language is English: I grew up speaking it at home and at school, and it’s my preference when reading, writing, and speaking. I studied in US schools through 12th grade, and then got both a bachelor’s degree and a PhD at American universities. I’ve been writing for years, including 20 years as a columnist at Linux Journal.

Am I fluent in English? Yes, I’d say so. And yet, I’m always reading tidbits about the history of English, how to speak more clearly, and how to sharpen the language I use when writing.

Why? Because fluency isn’t a video game, in which you get a flashing sign saying, “Achievement unlocked: You’re fluent!”  No matter how fluent you currently are, there are ideas, techniques, and practices that you can still learn. You can always become better. 

How do you improve your fluency? The best way, of course, is practice. No matter how fluent you already are, more practice is always good.  As the old saying goes, the best way to become a good writer is to write. And the best way to become a better speaker is to speak.  And so forth.

What’s true for English, and other languages, is true for programming languages, as well.  If you want to be a better Python programmer, then you should be writing Python code, making mistakes, and learning from those mistakes.  Better yet, you should be discussing your mistakes (and techniques) with others, so that you can compare ideas and techniques, and learn from your peers.

This is the thinking that has driven my work with Weekly Python Exercise. Each of the six WPE courses is designed to help you become a more fluent programmer.

A new advanced-level cohort is starting on November 5th. But tomorrow (Tuesday, October 29th) is the last day you can sign up for the early-bird price of $80.

Here’s what some people have said about previous cohorts of WPE, when asked what they thought:

  • I was a total python noob when I started.  I just wanted to learn the syntax, how to look at problems and find the solution. You provided both… your teaching is instrumental in drilling some concepts into our brains.
  • I learned a lot of features of the language and had a fun time doing it. I also got to apply what I learned when programming for work.
  • I expected to see Python in real world examples. I am not disappointed, because during WPE there were many these examples with wide varieties of programming blueprints.
  • The exercises are perfect for me because they are right in my “wheelhouse”. I have enough background knowledge that the context of the problems is relevant in my experience, yet I can’t just rattle off the solutions instantly.

If you use Python on a regular basis, but still feel that you can learn more about advanced techniques: Iterators, generators, decorators, comprehensions, inner functions, threading, and useful PyPI packages, then Weekly Python Exercise is for you.

Early-bird pricing ends on Tuesday evening. After that, you can still sign up, but you’ll pay the full price (i.e., $100).  Why delay?

Click here to learn more about Weekly Python Exercise, and become a better Python developer.

The post Early-bird pricing for Weekly Python Exercise ends tomorrow appeared first on Reuven Lerner.

Samuel Sutch: Hands-On GPU Programming with Python and CUDA: Explore high-performance parallel computing with CUDA



Price: $44.99
(as of Oct 28, 2019 19:22:03 UTC)



Dr. Brian Tuomanen has been working with CUDA and general-purpose GPU programming since 2014. He received his Bachelor of Science in Electrical Engineering from the University of Washington in Seattle, and briefly worked as a software engineer before switching to mathematics for graduate school. He completed his Ph.D. in Mathematics at the University of Missouri in Columbia, where he first encountered GPU programming as a means for studying scientific problems. Dr. Tuomanen has spoken at the US Army Research Lab about general-purpose GPU programming, and has recently led GPU integration and development at a Maryland-based start-up company. He currently lives and works in the Seattle area.



tryexceptpass: Practical Log Viewers with Sanic and Elasticsearch - Designing CI/CD Systems


One of the critical pieces in a build system is the ability to view build and test output. Not only does it track progress as the build transitions through the various phases, but it’s also an instrument for debugging.

This chapter in the continuous builds series covers how to build a simple log viewer. You’ll find details on retrieving log entries from Docker containers, serving them through Python, linking from a GitHub pull request, and highlighting the data for easy reading.
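
As a rough sketch of the pattern described here (the app name, route, and handler below are my own assumptions, not the chapter's actual code), a Sanic endpoint can pull a container's recent log output through the Docker SDK and serve it as plain text:

import docker
from sanic import Sanic
from sanic.response import text

app = Sanic("log_viewer")                  # hypothetical app name
docker_client = docker.from_env()          # talk to the local Docker daemon

@app.route("/logs/<container_id>")
async def build_logs(request, container_id):
    # Fetch the last 200 log lines from the build container
    container = docker_client.containers.get(container_id)
    log_bytes = container.logs(tail=200, timestamps=True)
    return text(log_bytes.decode("utf-8", errors="replace"))

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8000)

A link to such an endpoint could then be posted back to the GitHub pull request, as the chapter describes.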


Stack Abuse: Asynchronous Tasks Using Flask, Redis and Celery


Introduction

As web applications evolve and their usage increases, the use cases also diversify. We are now building and using websites for more complex tasks than ever before. Some of these tasks can be processed and feedback relayed to the users instantly, while others require further processing and relaying of results later. The increased adoption of internet access and internet-capable devices has led to increased end-user traffic.

In a bid to handle increased traffic or increased complexity of functionality, sometimes we may choose to defer work and have the results relayed at a later time. This way, we do not keep the user waiting an unknown amount of time on our web application, and instead send the results later. We can achieve this by utilizing background tasks to process work when traffic is low, or to process work in batches.

One of the solutions we can use to achieve this is Celery. It helps us break down complex pieces of work and have them performed by different machines to ease the load on one machine or reduce the time taken to completion.

In this post, we will explore the usage of Celery to schedule background tasks in a Flask application to offload resource-intensive tasks and prioritize responding to end-users.

What is a Task Queue?

A task queue is a mechanism to distribute small units of work or tasks that can be executed without interfering with the request-response cycle of most web-based applications.

Task queues are helpful for delegating work that would otherwise slow down applications while waiting for responses. They can also be used to handle resource-intensive tasks while the main machine or process interacts with the user.

This way, the interaction with the user is consistent, timely and unaffected by the workload.

What is Celery?

Celery is an asynchronous task queue based on distributed message passing that distributes workload across machines or threads. A Celery system consists of a client, a broker, and several workers.

These workers are responsible for the execution of the tasks or pieces of work that are placed in the queue and relaying the results. With Celery, you can have both local and remote workers meaning that work can be delegated to different and more capable machines over the internet and results relayed back to the client.

This way, the load on the main machine is alleviated and more resources are available to handle user requests as they come in.

The client in a Celery setup is responsible for issuing jobs to the workers and communicating with them using a message broker. The broker facilitates communication between the client and the workers: a message is added to the queue and the broker delivers it to a worker.

Examples of such message brokers include Redis and RabbitMQ.

Why use Celery?

There are various reasons why we should use Celery for our background tasks. First, it is quite scalable, allowing more workers to be added on demand to cater to increased load or traffic. Celery is also in active development, meaning it is a well-supported project, with concise documentation and an active community of users.

Another advantage is that Celery is easy to integrate into multiple web frameworks with most having libraries to facilitate integration.

It also provides the functionality to interact with other web applications through webhooks where there is no library to support the interaction.

Celery can also use a variety of message brokers, which offers us flexibility. RabbitMQ is recommended, but Redis and Beanstalk are also supported.

Demo Application

We'll build a Flask application that allows users to set reminders that will be delivered to their emails at a set time.

We will also provide the functionality to customize the amount of time before the message or reminder is invoked and the message is sent out to the user.

Setup

Like any other project, our work will take place in a virtual environment which we will create and manage using the Pipenv tool:

$ pipenv install --three
$ pipenv shell

For this project, we will need to install the Flask and Celery packages to start:

$ pipenv install flask celery

This is what our Flask application file structure will look like:

.
├── Pipfile                    # manage our environment
├── Pipfile.lock
├── README.md
├── __init__.py
├── app.py                     # main Flask application implementation
├── config.py                  # to host the configuration
├── requirements.txt           # store our requirements
└── templates
    └── index.html             # the landing page

1 directory, 8 files

For our Celery-based project, we will use Redis as the message broker and we can find the instructions to set it up on their homepage.

Implementation

Let's start by creating the Flask application that will render a form that allows users to enter the details of the message to be sent at a future time.

We will add the following to our app.py file:

from flask import Flask, flash, render_template, request, redirect, url_for

app = Flask(__name__)
app.config.from_object("config")
app.secret_key = app.config['SECRET_KEY']

@app.route('/', methods=['GET', 'POST'])
def index():
    if request.method == 'GET':
        return render_template('index.html')

    elif request.method == 'POST':
        email = request.form['email']
        first_name = request.form['first_name']
        last_name = request.form['last_name']
        message = request.form['message']
        duration = request.form['duration']
        duration_unit = request.form['duration_unit']

        flash("Message scheduled")
        return redirect(url_for('index'))


if __name__ == '__main__':
    app.run(debug=True)

This is a really simple app with just a single route to handle a GET and POST request for the form. Once the details are submitted, we can hand over the data to a function that will schedule the job.

In order to declutter our main application file, we will put the configuration variables in a separate config.py file and load the config from the file:

app.config.from_object("config")

Our config.py file will be in the same folder as the app.py file and contains some basic configurations:

SECRET_KEY = 'very_very_secure_and_secret'
# more config

For now, let us implement the landing page as index.html:

{% for message in get_flashed_messages() %}
  <p style="color: red;">{{ message }}</p>
{% endfor %}

<form method="POST">
    First Name: <input id="first_name" name="first_name" type="text">
    Last Name: <input id="last_name" name="last_name" type="text">
    Email: <input id="email" name="email" type="email">
    Message: <textarea id="textarea" name="message"></textarea>
    Duration: <input id="duration" name="duration" placeholder="Enter duration as a number. for example: 3" type="text">

   <select name="duration_unit">
      <option value="" disabled selected>Choose the duration</option>
      <option value="1">Minutes</option>
      <option value="2">Hours</option>
      <option value="3">Days</option>
   </select>       

   <button type="submit" name="action">Submit </button>  
</form>

Styling and formatting have been truncated for brevity; feel free to format/style your HTML as you'd like.

We can now start our application:
[Screenshot: the landing page form]

Sending Emails Using Flask-Mail

In order to send emails from our Flask application, we will use the Flask-Mail library which we add to our project as follows:

$ pipenv install flask-mail

With our Flask application and the form in place, we can now integrate Flask-Mail in our app.py:

from flask_mail import Mail, Message

app = Flask(__name__)
app.config.from_object("config")
app.secret_key = app.config['SECRET_KEY']

# set up Flask-Mail Integration
mail = Mail(app)

def send_mail(data):
    """ Function to send emails.
    """
    with app.app_context():
        msg = Message("Ping!",
                    sender="admin.ping",
                    recipients=[data['email']])
        msg.body = data['message']        
        mail.send(msg)

The send_mail(data) function will receive the message to be sent and the recipient's email address; it will be invoked after the specified time has passed to send the email to the user.

We will also need to add the following variables to our config.py in order for Flask-Mail to work:

# Flask-Mail
MAIL_SERVER = 'smtp.googlemail.com'
MAIL_PORT = 587
MAIL_USE_TLS = True
MAIL_USERNAME = 'mail-username'
MAIL_PASSWORD = 'mail-password'

Celery Integration

With our Flask application ready and equipped with email sending functionality, we can now integrate Celery in order to schedule the emails to be sent out at a later date.

Our app.py will be modified again:

# existing imports are maintained
from celery import Celery

# flask app and flask-mail configuration truncated

# set up celery client
client = Celery(app.name, broker=app.config['CELERY_BROKER_URL'])
client.conf.update(app.config)

# add this decorator to our send_mail function
@client.task
def send_mail(data):
	# function remains the same


@app.route('/', methods=['GET', 'POST'])
def index():
    if request.method == 'GET':
        return render_template('index.html')

    elif request.method == 'POST':
        data = {}
        data['email'] = request.form['email']
        data['first_name'] = request.form['first_name']
        data['last_name'] = request.form['last_name']
        data['message'] = request.form['message']
        duration = int(request.form['duration'])
        duration_unit = request.form['duration_unit']

        if duration_unit == 'minutes':
            duration *= 60
        elif duration_unit == 'hours':
            duration *= 3600
        elif duration_unit == 'days':
            duration *= 86400

        send_mail.apply_async(args=[data], countdown=duration)
        flash(f"Email will be sent to {data['email']} in {request.form['duration']} {duration_unit}")
        
        return redirect(url_for('index'))

We import Celery and use it to initialize the Celery client in our Flask application by attaching the URL of the message broker. In our case, we will be using Redis as the broker, thus we add the following to our config.py:

CELERY_BROKER_URL = 'redis://localhost:6379/0'
CELERY_RESULT_BACKEND = 'redis://localhost:6379/0'

In order to have our send_mail() function executed as a background task, we will add the @client.task decorator so that our Celery client will be aware of it.

After setting up the Celery client, the main function which also handles form input is modified.

First, we pack the input data for the send_mail() function in a dictionary. Then, we invoke our mailing function through the Celery Task Calling API using the apply_async function, which takes in the arguments required by our function.

An optional countdown parameter is set, defining a delay between running the code and performing the task.

This duration is in seconds, which is the reason why we convert the duration passed by the user into seconds depending on the unit of time they choose.

After the user has submitted the form, we will acknowledge the reception and notify them through a banner message when the message will be sent out.

Bringing Everything Together

In order to run our project, we will need two terminals, one to start our Flask application and the other to start the Celery worker that will send messages in the background.

Start the Flask app in the first terminal:

$ python app.py

On the second terminal, start the virtual environment then start the Celery worker:

# start the virtualenv
$ pipenv shell

$ celery worker -A app.client --loglevel=info

If everything goes well, we will get the following feedback in the terminal running the celery client:

[Screenshot: Celery worker startup output]

Now let us navigate to http://localhost:5000 and fill in the details scheduling the email to arrive after 2 minutes of submission.

Above the form, a message will appear indicating the address that will receive the email and the duration after which it will be sent.

In our Celery terminal, we will also be able to see a log entry that signifies that our email has been scheduled:

[2019-10-23 16:27:25,399: INFO/MainProcess] Received task: app.send_mail[d65025c8-a291-40d0-aea2-e816cb40cd78]  ETA:[2019-10-23 13:29:25.170622+00:00]

The ETA section of the entry shows when our send_mail() function will be called and the email will be sent.

So far, so good. Our emails are being scheduled and sent out at the specified time. However, one thing is missing: we have no visibility of the tasks before or after they are executed, and we have no way of telling whether the email was actually sent or not.

For this reason, let's implement a monitoring solution for our background tasks so that we can view tasks and also be aware in case something goes wrong and the tasks are not executed as planned.

Monitoring our Celery Cluster Using Flower

Flower is a web-based tool that provides visibility into our Celery setup, with functionality to view task progress, history, details, and statistics, including success and failure rates. We can also monitor all the workers in our cluster and the tasks they are currently handling.

Installing Flower is as easy as:

$ pipenv install flower

Earlier on, we specified the details of our Celery client in our app.py file. We'll need to pass that client to Flower in order to monitor it.

To achieve this we need to open up a third terminal window, jump into our virtual environment, and start our monitoring tool:

$ pipenv shell
$ flower -A app.client --port=5555

When starting Flower, we specify the Celery client by passing it through the application (-A) argument, and also specifying the port to be used through the --port argument.

With our monitoring in place, let us schedule another email to be sent on the dashboard, and then navigate to http://localhost:5555 where we are welcomed by the following:

[Screenshot: the Flower dashboard landing page]

On this page, we can see the list of workers in our Celery cluster, which is currently just made up of our machine.

To view the email we have just scheduled, click on the Tasks button on the top left side of the dashboard and this will take us to the page where we can see the tasks that have been scheduled:

[Screenshot: the Flower tasks page showing the scheduled tasks]

In this section, we can see that we had scheduled two emails and one has been successfully sent out at the scheduled time. The emails were scheduled to be sent out after 1 minute and 5 minutes respectively for testing purposes.

We can also see from this section the time the task was received and when it was executed.

In the monitor section, there are graphs displaying the success and failure rates of the background tasks.

We can schedule messages for as long as we wish, but that also means that our worker has to be online and functional at the time the task is supposed to be executed.

Conclusion

We have successfully set up a Celery cluster and integrated it into our Flask application that allows users to schedule emails to be sent out after a certain time in the future.

The email sending functionality has been delegated to a background task and placed on a queue where it will be picked and executed by a worker in our local Celery cluster.

The source code for this project is, as always, available on Github.

Samuel Sutch: Python Programming: A Smart Approach For Absolute Beginners (A Step-by-Step Guide With 8 Days Crash Course)



Price: $14.89
(as of Oct 29, 2019 01:28:01 UTC)




Podcast.__init__: Building Quantum Computing Algorithms In Python


Summary

Quantum computers are the biggest jump forward in processing power that the industry has seen in decades. As part of this revolution it is necessary to change our approach to algorithm design. D-Wave is one of the companies who are pushing the boundaries in quantum processing and they have created a Python SDK for experimenting with quantum algorithms. In this episode Alexander Condello explains what is involved in designing and implementing these algorithms, how the Ocean SDK helps you in that endeavor, and what types of problems are well suited to this approach.
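
For a flavour of what these problems look like in code, here is a tiny sketch using dimod from the Ocean SDK (my own minimal example, not taken from the episode): a two-variable binary quadratic model solved with the brute-force reference sampler rather than actual QPU hardware.

import dimod

# A toy QUBO: reward picking either variable, penalise picking both.
bqm = dimod.BinaryQuadraticModel(
    {"a": -1.0, "b": -1.0},      # linear biases
    {("a", "b"): 2.0},           # quadratic bias (coupling)
    0.0,                         # constant offset
    dimod.BINARY,
)

# ExactSolver enumerates all assignments - fine for tiny models,
# and a stand-in here for submitting the model to a D-Wave sampler.
sampleset = dimod.ExactSolver().sample(bqm)
print(sampleset.first.sample, sampleset.first.energy)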

Announcements

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
  • You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, Corinium Global Intelligence, Alluxio, and Data Council. Upcoming events include the combined events of the Data Architecture Summit and Graphorum, the Data Orchestration Summit, and Data Council in NYC. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today.
  • Your host as usual is Tobias Macey and today I’m interviewing Alex Condello about the Ocean SDK from D-Wave for building quantum algorithms in Python

Interview

  • Introductions
  • How did you get introduced to Python?
  • Can you start by giving a high-level overview of quantum computing?
  • What is the Ocean SDK and how does it fit into the business model for D-Wave?
  • What are some of the problem types that a quantum processor is uniquely well suited for?
    • How does the overall system design for a quantum computer compare to that of the Von Neumann architecture that is common for the machines that we are all familiar with?
  • What are some of the differences in algorithm design when programming for a quantum processor?
    • Is there any specialized background knowledge that is necessary for making effective use of the QPU’s capabilities?
    • What are some of the common difficulties that you have seen users struggle with?
    • How does the Ocean SDK assist the developer in implementing and understanding the patterns necessary for Quantum algorithms?
  • What was the motivation for choosing Python as the target language for an SDK to attract developers to experiment with quantum algorithms?
  • Can you describe how the SDK is implemented and some of the integrations that are necessary for being able to operate on a quantum processor?
    • What have you found to be some of the most interesting, challenging, or unexpected aspects of your work on the Ocean software stack?
    • How do you handle the abstraction of the execution context to allow for replicating the program behavior on CPU/GPU vs QPU
  • Is there any potential for quantum computing to impact research in previously intractable computer science research, such as the P vs NP problem?
  • What are your current scaling limits in terms of providing compute to customers for their problems?
  • What are some of the most interesting, innovative, or unexpected ways that you have seen developers use the Ocean SDK and quantum processors?
  • What are you most excited for as you look to the future capabilities of quantum systems?
    • What are some of the upcoming challenges that you anticipate for the quantum computing industry?

Keep In Touch

Picks

Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Python Insider: Python 3.5.8 is now available

Samuel Sutch: Elements of Programming Interviews in Python: The Insiders’ Guide



Price: $35.96
(as of Oct 29, 2019 09:33:53 UTC)



"A practical, fun approach to computer science fundamentals, as seen through the lens of common programming interview questions."
Jeff Atwood / Co-founder, Stack Overflow and Discourse

"This book prepares the reader for contemporary software interviews, and also provides a window into how algorithmic techniques translate into the workplace. It emphasizes problems that stem from real-world applications and can be coded up in a reasonable time, and is a wonderful complement to a traditional computer science algorithms and data structures course."
Ashish Goel / Professor, Stanford University

"A wonderful resource for anyone preparing for a modern software engineering interview: work through the entire book, and you&aposll find the actual interview a breeze. More generally, for algorithms enthusiasts, EPI offers endless hours of entertainment while simultaneously learning neat coding tricks."
Vineet Gupta / Principal Engineer, Google



Robin Wilson: Five new-ish Python things – Part 1


I keep gathering links of interesting Python things I’ve seen around the internet: new packages, good tutorials, and so on – and so I thought I’d start a series where I share them every so often.

Not all of these are new new – some have been around for a while but are new to me – and so they might be new to you too!

Also, there is a distinct ‘PyData’ flavour to these things – they’re all things I’ve come across in my work in data science and geographic processing with Python.

So, on with the list:

removestar

I try really hard to follow the PEP8 style guide for my Python code – but I wasn’t so disciplined in the past, and so I’ve got a lot of old code sitting around which isn’t styled particularly well.

One of the things PEP8 recommends against is using from blah import *. In my code I used to do a lot of from matplotlib.pyplot import * and from Py6S import * – but it’s a pain to go through old code and work out what functions are actually used, and replace the import with something like from matplotlib.pyplot import plot, xlabel, title.

removestar is a tool that will do that for you! Just install it with pip install removestar and then it provides a command-line tool to fix your imports for you.

For example, using removestar on the Py6S case study code by running:

removestar ncaveo.py

Gives the following diff as output:

--- original/ncaveo.py
+++ fixed/ncaveo.py
@@ -1,7 +1,7 @@
 # Import Py6S
-from Py6S import *
+from Py6S import Geometry, GroundReflectance, SixS, SixSHelpers
 # Import the Matplotlib plotting environment
-from matplotlib.pyplot import *
+from matplotlib.pyplot import clf, legend, plot, savefig, xlabel, ylabel
 # Import the functions for copying objects
 import copy

To run it on all of the Python files in a module, and do the edits inplace rather than just showing the diffs, you can run it as follows:

removestar -i module_folder/

ipynb-quicklook

If you use OS X then you’ll know about the very handy ‘quicklook’ feature that shows you a preview of the selected file in Finder when pressing the spacebar. You can add support for new filetypes to quicklook using quicklook plugins – and I’d already set up a number of useful plugins which will show syntax-highlighted code, preview JSON, CSV and Markdown files nicely, and so on.

I only discovered ipynb-quicklook last week, and it does what you’d expect: it provides previews of Jupyter Notebook files from the Finder. Simply follow the instructions to place the ipynb-quicklook.qlgenerator file in your ~/Library/QuickLook folder, and it ‘Just Works’ – and it’s really quick to render the files too!

Nicolas Rougier’s Matplotlib Cheatsheet


This is a great cheatsheet for the matplotlib plotting library from Nicolas Rougier. It’s a great quick reference for all the various matplotlib settings and functions, and reminded me of a number of things matplotlib can do that I’d forgotten about.

Find the high-resolution cheatsheet image here and the repository with all the code used to create it here. Nicolas is also writing a book called Scientific Visualization – Python & Matplotlib which looks great – and it’ll be released open-access once it’s finished (you can donate to see it ‘in progress’).

PyGEOS

If you’re not interested in geographic data processing using Python then this probably won’t interest you…but for those who are interested this looks great. PyGEOS provides native Python bindings to the GEOS library which is used for geometry manipulation by many geospatial tools (such as calculating distances, or finding out whether one geometry contains another). However, by using the underlying C library PyGEOS bypasses the Python interpreter for a lot of the calculations, allowing them to be vectorised efficiently and making it very fast to apply these geometry functions: their preliminary performance tests show speedups ranging from 4x to 136x. The interface is very simple too – for example:

import pygeos
import numpy as np

points = [
    pygeos.Geometry("POINT (1 9)"),
    pygeos.Geometry("POINT (3 5)"),
    pygeos.Geometry("POINT (7 6)")
]
box = pygeos.box(2, 2, 7, 7)
pygeos.contains(box, points)

This project is still in the early days – but definitely one to watch as I think it will have a big impact on the efficiency of Python-based spatial analysis.

napari

napari is a fast multi-dimensional image viewer for Python. I found out about it through an extremely comprehensive blog post written by Juan Nunez-Iglesias where he explains the background to the project and what problems it is designed to solve.

One of the key features of napari is that it has a full Python API, allowing you to easily visualise images from within Python – as easily as using imshow() from matplotlib, but with far more features. For example, to view three of the scikit-image sample images just run:

from skimage import data
import napari

with napari.gui_qt():
    viewer = napari.Viewer()
    viewer.add_image(data.astronaut(), name='astronaut')
    viewer.add_image(data.moon(), name='moon')
    viewer.add_image(data.camera(), name='camera')

You can then add some vector points over the image – for example, to use as starting points for a segmentation:

import numpy as np

points = np.array([[100, 100], [200, 200], [300, 100]])
viewer.add_points(points, size=30)

That is very useful for me already, and it’s just a tiny taste of what napari has to offer. I’ve only played with it for a short time, but I can already see it being really useful for me next time I’m doing a computer vision project, and I’m already planning to discuss some potential new features to help with satellite imagery work. Definitely something to check out if you’re involved in image processing in any way.


If you liked this, then get me to work for you! I do freelance work in data science, Python development and geospatial analysis – please contact me for more details, or look at my freelance website

Stack Abuse: Coroutines in Python


Introduction

Every programmer is acquainted with functions - sequences of instructions grouped together as a single unit in order to perform predetermined tasks. They admit a single entry point, are capable of accepting arguments, may or may not have a return value, and can be called at any moment during a program's execution - including by other functions and themselves.

When a program calls a function, its current execution context is saved before control passes to the function and execution resumes there. The function then creates a new context - from there on out, newly created data exists exclusively during the function's runtime.

As soon as the task is complete, control is transferred back to the caller - the new context is effectively deleted and replaced by the previous one.

Coroutines

Coroutines are a special type of function that deliberately yield control back to the caller, but do not end their context in the process; instead, they keep it in an idle state.

They benefit from the ability to keep their data throughout their lifetime and, unlike functions, can have several entry points for suspending and resuming execution.

Coroutines in Python work in a very similar way to generators. Both operate over data, so let's keep the main differences simple:

Generators produce data

Coroutines consume data

The distinct handling of the keyword yield determines whether we are manipulating one or the other.
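To make the contrast concrete, here is a minimal sketch (my own illustration, not from the article) of a generator that produces data; compare it with the coroutines defined below, which consume data through (yield):

def countdown(n):
    # a generator: each yield hands a value back to the caller
    while n > 0:
        yield n
        n -= 1

for value in countdown(3):
    print(value)  # prints 3, 2, 1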

Defining a Coroutine

With all the essentials out of the way, let us jump right in and code our first coroutine:

def bare_bones():
    while True:
        value = (yield)

The resemblance to a regular Python function is clear. The while True: block guarantees the continuous execution of the coroutine for as long as it receives values.

The value is collected through the yield statement. We'll come back to this in a few moments...

On its own this code is not terribly useful, so we'll round it off with a few print statements:

def bare_bones():
	print("My first Coroutine!")
	while True:
		value = (yield)
		print(value)

Now, what happens when we try to call it like so:

coroutine = bare_bones()

If this were a normal Python function, one would expect it to produce some sort of output by this point. But if you run the code in its current state you will notice that not a single print() gets called.

That is because coroutines need to be primed first with a call to the built-in next() function:

def bare_bones():
    print("My first Coroutine!")
    while True:
        value = (yield)
        print(value)

coroutine = bare_bones()
next(coroutine)

This starts the execution of the coroutine until it reaches its first breakpoint - value = (yield). Then it stops, returning control to the caller, and idles while awaiting new input:

My first Coroutine!

New input can be sent with send():

coroutine.send("First Value")

Our variable value then receives the string First Value, the coroutine prints it, and a new iteration of the while True: loop forces it to wait once again for new values to be delivered. You can do this as many times as you like.

Finally, once you are done with the coroutine and no longer wish to make use of it, you can free its resources by calling close(). This raises a GeneratorExit exception inside the coroutine, which needs to be dealt with:

def bare_bones():
	print("My first Coroutine!")
	try:
		while True:
			value = (yield)
			print(value)
	except GeneratorExit:
		print("Exiting coroutine...")

coroutine = bare_bones()
next(coroutine)
coroutine.send("First Value")
coroutine.send("Second Value")
coroutine.close()

Output:

My first Coroutine!
First Value
Second Value
Exiting coroutine...

Passing Arguments

Much like functions, coroutines are also capable of receiving arguments:

def filter_line(num):
	while True:
		line = (yield)
		if num in line:
			print(line)

cor = filter_line("33")
next(cor)
cor.send("Jessica, age:24")
cor.send("Marco, age:33")
cor.send("Filipe, age:55")

Output:

Marco, age:33

Applying Several Breakpoints

Multiple yield statements can be sequenced together in the same coroutine:

def joint_print():
	while True:
		part_1 = (yield)
		part_2 = (yield)
		print("{} {}".format(part_1, part_2))

cor = joint_print()
next(cor)
cor.send("So Far")
cor.send("So Good")

Output:

So Far So Good

The StopIteration Exception

After a coroutine is closed, calling send() again will raise a StopIteration exception:

def test():
	while True:
		value = (yield)
		print(value)
try:
	cor = test()
	next(cor)
	cor.close()
	cor.send("So Good")
except StopIteration:
	print("Done with the basics")

Output:

Done with the basics

Coroutines with Decorators

This is all well and good, but when working on larger projects initializing every single coroutine manually can be a huge drag!

Worry not: it's just a matter of exploiting the power of decorators so we no longer need to call next() ourselves:

def coroutine(func):
	def start(*args,**kwargs):
		cr = func(*args,**kwargs)
		next(cr)
		return cr
	return start

@coroutine
def bare_bones():
	while True:
		value = (yield)
		print(value)

cor = bare_bones()
cor.send("Using a decorator!")

Running this piece of code will yield:

Using a decorator!

Building Pipelines

A pipeline is a sequence of processing elements organized so that the output of each element is the input of the next.

Data gets pushed through the pipe until it is eventually consumed. Every pipeline requires at least one source and one sink.

The remaining stages of the pipe can perform several different operations, from filtering to modifying, routing, and reducing data:

[Figure: a data pipeline - values flow from a source through intermediate stages to a sink]

Coroutines are natural candidates for performing these operations: they can pass data to one another with send() calls and can also serve as the end-point consumer. Let's look at the following example:

def producer(cor):
	n = 1
	while n < 100:
		cor.send(n)
		n = n * 2

@coroutine
def my_filter(num, cor):
	while True:
		n = (yield)
		if n < num:
			cor.send(n)

@coroutine
def printer():
	while True:
		n = (yield)
		print(n)

prnt = printer()
filt = my_filter(50, prnt)
producer(filt)

Output:

1
2
4
8
16
32

So, what we have here is the producer() acting as the source, creating some values that are then filtered before being printed by the sink, in this case, the printer() coroutine.

my_filter(50, prnt) acts as the single intermediary step in the pipeline and receives the next coroutine in the chain as an argument.

This chaining perfectly illustrates the strength of coroutines: they are scalable for bigger projects (all that is required is to add more stages to the pipeline) and easily maintainable (changes to one don't force an entire rewrite of the source code).
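For instance, adding one more stage only requires defining it and slotting it into the chain. A minimal sketch, reusing the coroutine decorator, producer(), my_filter() and printer() defined above:

@coroutine
def doubler(cor):
	# an extra stage: double each value before passing it downstream
	while True:
		n = (yield)
		cor.send(n * 2)

prnt = printer()
filt = my_filter(50, doubler(prnt))  # slot the new stage between the filter and the printer
producer(filt)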

Similarities to Objects

A sharp-eyed programmer might notice that coroutines bear a certain conceptual similarity to Python objects: both require a prior definition, and both are then instantiated and managed. The obvious question is why one would use coroutines over the tried and true paradigm of object-oriented programming.

Well, aside from the obvious fact that a coroutine requires but a single function definition, it also benefits from being significantly faster. Let's examine the following code:

class obj:
	def __init__(self, value):
		self.i = value
	def send(self, num):
		print(self.i + num)

inst = obj(1)
inst.send(5)
def coroutine(value):
	i = value
	while True:
		num = (yield)
		print(i + num)

cor = coroutine(1)
next(cor)
cor.send(5)

Here's how these two hold up against each other when run through the timeit module, ten thousand times:

Object       Coroutine
0.791811     0.6343617
0.7997058    0.6383156
0.8579286    0.6365501
0.838439     0.648442
0.9604255    0.7242559

Both perform the same menial task, but the second example is quicker. The speed gain comes from the absence of the object's self attribute lookups.

For more system-taxing tasks, this makes a compelling case for using coroutines instead of conventional handler objects.
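As a rough illustration of how such numbers can be gathered (a sketch of my own, not the author's exact benchmark; it drops the print() calls so only the dispatch cost is measured, and absolute figures will differ between machines):

import timeit

setup = """
class Obj:
    def __init__(self, value):
        self.i = value
    def send(self, num):
        self.i + num            # same arithmetic, without printing

def cor_func(value):
    i = value
    while True:
        num = (yield)
        i + num

inst = Obj(1)
cor = cor_func(1)
next(cor)                       # prime the coroutine
"""

print(timeit.timeit("inst.send(5)", setup=setup, number=10000))  # object version
print(timeit.timeit("cor.send(5)", setup=setup, number=10000))   # coroutine version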

Caution When Using Coroutines

The send() Method Is Not Thread-Safe

import threading
from time import sleep

def print_number(cor):
	# a second thread hammers the coroutine with send(1) in a tight loop
	while True:
		cor.send(1)

def coroutine():
	i = 1
	while True:
		num = (yield)
		print(i)
		sleep(3)
		i += num

cor = coroutine()
next(cor)

t = threading.Thread(target=print_number, args=(cor,))
t.start()

# meanwhile the main thread sends too - two threads end up inside the same generator
while True:
	cor.send(5)

Because send() is not synchronized and has no inherent protection against such thread-related misuse, the following error was raised: ValueError: generator already executing.

Mixing coroutines with concurrency should be done with extreme caution.
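If you do need to drive a coroutine from more than one thread, a common workaround (a sketch, not something the article prescribes) is to serialize every send() behind a lock so that only one thread can be inside the generator at a time:

import threading

send_lock = threading.Lock()

def safe_send(cor, value):
    # serializing send() calls avoids "ValueError: generator already executing"
    with send_lock:
        cor.send(value)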

It's Not Possible to Loop Coroutines

def coroutine_1(value):
	while True:
		next_cor = (yield)
		print(value)
		value = value - 1
		if next_cor != None:
			next_cor.send(value)

def coroutine_2(next_cor):
	while True:
		value = (yield)
		print(value)
		value = value - 2
		if next_cor != None:
			next_cor.send(value)

cor1 = coroutine_1(20)
next(cor1)
cor2 = coroutine_2(cor1)
next(cor2)
cor1.send(cor2)

The same ValueError shows its face. From these simple examples we can infer that the send() method builds a sort of call-stack that doesn't return until the target reaches its yield statement.

So, using coroutines is not all sunshine and rainbows; careful thought is required before applying them.

Conclusion

Coroutines provide a powerful alternative to the usual data processing mechanisms. Units of code can be easily combined, modified and rewritten, all the while profiting from variable persistence across their life cycle.

In the hands of a crafty programmer, coroutines become meaningful new tools: they allow simpler design and implementation while providing significant performance gains.

Stripping ideas down into straightforward processes saves the programmer effort and time, and avoids stuffing code with superfluous objects that do nothing more than elementary tasks.


Real Python: Python Type Checking


In this course, you’ll learn about Python type checking. Traditionally, types have been handled by the Python interpreter in a flexible but implicit way. Recent versions of Python allow you to specify explicit type hints that can be used by different tools to help you develop your code more efficiently.

In this course, you’ll learn about:

  • Type annotations and type hints
  • Adding static types to code, both your code and the code of others
  • Running a static type checker
  • Enforcing types at runtime

You’ll go on a tour of how type hints work in Python and find out if type checking is something you want to use in your code. If you want to learn more, you can check out the resources that will be linked to throughout this course.
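As a small taste of the syntax involved, a function with type hints might look like this (my own minimal example, not taken from the course):

from typing import List

def mean(values: List[float]) -> float:
    # annotations are ignored at runtime, but a static checker
    # such as mypy can verify calls against them
    return sum(values) / len(values)

average: float = mean([1.0, 2.5, 4.0])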


[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

PyCoder’s Weekly: Issue #392 (Oct. 29, 2019)


#392 – OCTOBER 29, 2019
View in Browser »



Black 19.10b0 Released

I’ve been using Black to automatically format most of my Python code since it came out last year, and it’s been an incredibly helpful tool. Stable release coming soon. More about how to use Black here.
GITHUB.COM/PSF

Python and PyQt: Building a GUI Desktop Calculator

In this step-by-step tutorial, you’ll learn how to create Graphical User Interface (GUI) applications with Python and PyQt. Once you’ve covered the basics, you’ll build a fully-functional desktop calculator that can respond to user events with concrete actions.
REAL PYTHON

Manage and Optimize Your Python Applications With a Free 14-day Trial of Datadog APM


Visualize every layer of your Python stack in minutes. Trace requests as they travel across distributed services, and inspect detailed flame graphs to identify bottlenecks and other performance issues. Navigate seamlessly between Python traces to related logs and metrics in seconds →
DATADOGsponsor

Performance of System V Style Shared Memory Support in Python 3.8

“To evaluate the performance gains from shared memory, I ran the following simple test—create a list of integers and double each integer in the list in parallel by chunking the list and processing each chunk in parallel.” Surprising results!
VENKATESH-PRASAD RANGANATH

When to Switch to Python 3.8

A quick rundown of the problems you might encounter when switching major Python versions.
PYTHONSPEED.COM

Python Jobs

Full Stack Developer (Toronto, ON, Canada)

Beanfield Metroconnect

More Python Jobs >>>

Articles & Tutorials

Python Type Checking 101

Write better code with this step-by-step intro to Python type checking. Traditionally, types have been handled by the Python interpreter in a flexible but implicit way. Recent versions of Python allow you to specify explicit type hints that can be used by different tools to help you develop your code more efficiently.
REAL PYTHONvideo

3 Ways to Create a Keras Model With TensorFlow 2.0

Keras and TensorFlow 2.0 provide you with three methods to implement your own neural network architectures: the Sequential API, the Functional API, and Model subclassing. Inside this tutorial you’ll learn how to utilize each of these methods, including how to choose the right API for the job.
ADRIAN ROSEBROCK

Automated Python Code Reviews, Directly From Your Git Workflow


Codacy lets developers spend more time shipping code and less time fixing it. Set custom standards and automatically track quality measures like coverage, duplication, complexity and errors. Integrates with GitHub, GitLab and Bitbucket, and works with 28 different languages. Get started today for free →
CODACYsponsor

Python Community Interview With Al Sweigart

Al Sweigart is an accomplished developer, conference speaker, teacher, and origamist. But some may know him best for his numerous Python programming books, such as Automate the Boring Stuff with Python.
REAL PYTHON

Debugging TensorFlow coverage

“It started with a coverage.py issue: Coverage not working for TensorFlow Model call function. A line in the code is executing, but coverage.py marks it as unexecuted. How could that be?”
NED BATCHELDER

Cleaning Up Currency Data With Pandas

Tips on how to clean up messy currency data in Pandas so that you may convert the data to numeric formats for further analysis.
CHRIS MOFFITT

Python & OpenGL for Scientific Visualization

An open-source book about Python and OpenGL for Scientific Visualization.
NICOLAS P. ROUGIER

Measure and Improve Python Code Performance With Blackfire.io

Profile in development, test/staging, and production, with no overhead for end users! Blackfire supports any Python version from 2.7.x and 3.x. Find bottlenecks in wall-time, I/O, CPU, memory, HTTP requests, and SQL queries.
BLACKFIREsponsor

Projects & Code

Events

PyCon Sweden 2019

October 31 to November 2, 2019
PYCON.SE

PyCon France 2019

October 31 to November 4, 2019
PYCON.FR

PiterPy 2019

November 1 to November 2, 2019
PITERPY.COM

Django Girls Groningen

November 2 to November 3, 2019
DJANGOGIRLS.ORG

PyCon Canada 2019

November 18 to November 19, 2019 in Toronto
2019.PYCON.CA


Happy Pythoning!
This was PyCoder’s Weekly Issue #392.
View in Browser »


[ Subscribe to 🐍 PyCoder’s Weekly 💌 – Get the best Python news, articles, and tutorials delivered to your inbox once a week >> Click here to learn more ]

Zero-with-Dot (Oleg Żero): Colaboratory + Drive + Github -> the workflow made simpler


Introduction

This post is a continuation of our earlier attempt to make the best of the two worlds, namely Google Colab and Github. In short, we tried to map the usage of these tools in a typical data science workflow. Although we got it to work, the process had its drawbacks:

  • It relied on relative imports, which made our code unnecessarily cumbersome.
  • We didn’t quite get the Github part to work. The workspace had to be saved offline.

In this post, we will show you a simpler way to organize the workspace without these flaws. All you need to proceed is a Gmail and a Github account. Let’s get to work.

What goes where?

Figure 1. Three parts of our simple "ecosystem".

Typically, we have four basic categories of files in our workspace:

  • notebooks (.ipynb) - for interactive development work,
  • libraries (.py) - for code that we use and reuse,
  • models - things we try to build,
  • data - ingredients we build it from.

Since the Colab backend is not persistent, we need a permanent storage solution. In addition to that, we also need a version control system so we can keep track of changes. Finally, we would prefer not to have to think about this machinery any more than necessary.

Colab integrates easily with Google Drive, which makes it a natural choice for storage space. We will use it for storing our data and models. At the same time, Github is better suited for code, so we will use it for notebooks and libraries. Now the question arises: how can we interface the two from within our notebook to make the workflow as painless as possible?

Github

We assume that you already have a Github account and have created a repository for your project. Unless your repository is public, you will need to generate a personal access token to interact with it from the command line. Here is a short guide on how to create one.

Google Drive

The next thing is to organize our non-volatile storage space for both models and data. If you have a Gmail account, you are halfway there. All you need to do is create an empty directory in Drive, and that’s it.

Colaboratory - operational notebook

To keep things organized, we define one separate notebook that will be our operational tool. We will use its cells exclusively for manipulating our workspace, letting the other notebooks take care of more interesting things such as exploratory data analysis, feature engineering or training. All notebooks, including this one, will be version-controlled, with the commands stored in the operational notebook.

The workflow

The workflow is a simple three-step process:

  1. First, after connecting to the Colab runtime, we need to mount Google Drive and update our space using Github.
  2. We work with the notebooks and the rest of the files (our modules, libraries, etc.). In this context, we simply call it editing.
  3. We save our work, by synchronizing our Drive with Github using the operational notebook.

Connecting, mounting and updating

from google.colab import drive
from os.path import join

ROOT = '/content/drive'      # default mount point for the Drive
PROJ = 'My Drive/...'        # path to your project on Drive

GIT_USERNAME = "OlegZero13"  # replace with yours
GIT_TOKEN = "XXX"            # definitely replace with yours
GIT_REPOSITORY = "yyy"       # ...nah

drive.mount(ROOT)            # we mount the drive at /content/drive

PROJECT_PATH = join(ROOT, PROJ)
!mkdir "{PROJECT_PATH}"      # in case we haven't created it already

GIT_PATH = f"https://{GIT_TOKEN}@github.com/{GIT_USERNAME}/{GIT_REPOSITORY}.git"
!mkdir ./temp
!git clone "{GIT_PATH}" ./temp
!mv ./temp/* "{PROJECT_PATH}"
!rm -rf ./temp
!rsync -aP --exclude=data/ "{PROJECT_PATH}"/* ./

The above snippet mounts the Google Drive at /content/drive and creates our project’s directory. It then pulls all the files from Github and copies them over to that directory. Finally, it collects everything that belongs to the Drive directory and copies it over to our local runtime.

A nice thing about this solution is that it won’t crash if executed multiple times. Whenever executed, it will only update what is new and that’s it. Also, with rsync we have the option to exclude some of the content, which may take too long to copy (…data?).
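For example, to also leave out a hypothetical models/ directory that lives on Drive, the call only needs one more exclude flag:

!rsync -aP --exclude=data/ --exclude=models/ "{PROJECT_PATH}"/* ./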

Editing, editing, and editing

Development, especially in data science, means trying multiple times before we finally get things right. At this stage, editing the external files/libraries can be done by:

  1. substituting or changing files on Drive and then transferring them to the local runtime of each notebook using rsync, or
  2. using the so-called IPython magic commands.

Suppose you want to quickly change somefile.py, which is one of your library files. You can write the code for that file in a cell and tell Colab to save it using the %%writefile command. Since the file resides locally, you can simply use the import statement to load its new content again. The only thing to remember is to execute the %reload_ext somefile command first, to ensure that Colab knows about the update.

Here is an example:

Figure 2. Importing, editing and importing again. All done through the cells.
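As a rough sketch of what those two cells can look like (somefile.py and greet() are hypothetical; this uses importlib.reload rather than the %reload_ext magic mentioned above):

First cell - overwrite the library file from the notebook:

%%writefile somefile.py
def greet(name):
    return f"Hello, {name}!"

Second cell - (re)load the module and use the new content:

import importlib
import somefile

importlib.reload(somefile)      # make sure the fresh version is picked up
print(somefile.greet("Colab"))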

Saving, calling it a day

Once you wish to make a backup of all of your work, all you need to do is copy all the files to the Drive storage and push them to Github.

Copying can be done using !cp -r ./* "{PROJECT_PATH}" executed in a notebook cell, which will update the Drive storage. Then, pushing to Github requires creating a temporary working directory and configuring a local git repo just for the time being. Here are the commands to execute:

!mkdir ./temp
!git clone "https://{GIT_TOKEN}@github.com/{GIT_USERNAME}/{GIT_REPOSITORY}.git" ./temp
!rsync -aP --exclude=data/ "{PROJECT_PATH}"/* ./temp

%cd ./temp
!git config --global user.email "{GIT_EMAIL}"
!git config --global user.name "{GIT_NAME}"
!git add .
!git commit -m '"{GIT_COMMIT_MESSAGE}"'
!git push origin "{GIT_BRANCH_NAME}"
%cd /content
!rm -rf ./temp

Obviously, you need to define the strings in "{...}" yourself.

Figure 3. Successful upload of the content to Github. Calling it a day.

Conclusion

In this post, we have shown how to efficiently use Google Drive and Github together when working with Google Colab. The improved workflow is much simpler than the one presented earlier.

If you would like to share any useful tricks or propose some improvements, please do so in the comments. Your feedback is really helpful.

The No Title® Tech Blog: New project: Nice Telescope Planner


And now, for something different: I have just dived into Java. I am sharing with you the first (pre-)release of Nice Telescope Planner, a simple cross-platform desktop utility for amateur astronomy hobbyists, written in Java. The aim is to provide an easy-to-use tool to help plan sky observation sessions, suggesting some of the interesting objects you may be able to watch with the naked eye, or using amateur equipment (binoculars or small to medium size telescopes), at a given date/time and place.

Python Bytes: #154 Code, frozen in carbon, on display for all
