
PSF GSoC students blogs: Ladon, web services-last weeks


Hello everyone!

"""

Over the last two weeks I looked into some testing topics and made an improvement to my initial Travis file. Then I evaluated tools for testing web services: I looked at soapUI and at handling WSDL with wsdl.exe in Visual Studio.
I also looked at Ladon, and honestly it seems like an interesting tool, simple to use and very descriptive.

Getting it to work was a struggle: on Linux I tried everything after the installation and it still didn't work, but when I came back to it the next day it ran perfectly. Of course, I just needed to reboot to apply the changes. (On Windows I still couldn't get it working.)

Only 2 weeks left to finish and I am already feeling a little nostalgic.

I also came across some very interesting topics, some of them related to Apache.

I noticed that at some point an .xbel file started appearing in the .local/share folder on my Linux Mint system.
What does it contain? A description of each new file generated on the machine, in XML format (an XML schema), with the address pointing to freedesktop.org.
Let me see if I understand: something is generating this file now? Why does GTK need this information? It's strange.
For now this is not relevant, but I wanted to share it. I will look into how to delete it after GSoC with the PSF ends.

Note: I was able to run the Apache 2.4.9 server on Linux, but it responded with a 403 error.

I also found how to fix the Unicode error I was getting when decoding: use .decode('unicode_escape') instead of .decode('utf-8'). It is not yet implemented.
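For illustration, here is a minimal sketch of the difference between the two decodings (the byte string is hypothetical):

data = b'Hello \\u00e9'  # escaped text, as it might arrive from a service
print(data.decode('utf-8'))           # 'Hello \u00e9' (backslash sequence kept literally)
print(data.decode('unicode_escape'))  # 'Hello é' (escape sequence interpreted)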

I will advance as much as I can with the tests, and with the web service testing with Ladon.

"""

Regards and Good Week!


PSF GSoC students blogs: Blogpost: 10th week of GSoC (Jul 29 - Aug 04)


During this last week I focused on an implementation of the classical bootstrap, as well as the bootstrap-t technique (see the previous post for a detailed description of the latter), to provide a robust estimate of significance for the results of the group-level linear regression analysis framework for neural time-series that we've been working on over the last few weeks.

In particular, this week I was able to put together a set of functions in a tutorial that shows how the second-level (i.e., group-level) regression analysis can be extended to estimate the moderating effects of a continuous covariate on subject-level predictors. In other words, how variability in the strength of the effect of a primary predictor can be attributed to inter-subject variability in another, putatively secondary variable (the subject's age, for instance).

In a first step, the linear model is fitted to each subject's data (i.e., the first-level analysis) and the regression coefficients are extracted for the predictor in question. The approach then consists in sampling, with replacement, n second-level design matrices, with n being the number of subjects in the original sample. Here, the link between subjects and covariate values is maintained, so for simplicity the subject indices (or IDs) are sampled. The linear model is then fitted on the previously estimated subject-level regression coefficients of a given predictor variable, this time, however, with the covariate values on the predicting side of the equation.

Next, the second-level coefficients are sorted in ascending order and the 95% confidence interval is computed. In the added tutorial (see here), we use 2000 bootstraps, although "as little as" 599 bootstraps have previously been shown to be enough to control for false positives in the inference process (see for instance here).

One challenge, however, is that no p-values can be computed with this technique. Instead, a decision on the statistical significance of the effect can be reached via the confidence interval of the regression coefficients: a regression coefficient is significant if its confidence interval does not contain zero.
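As a rough sketch of this resampling loop (with illustrative variable names and synthetic data, not the project's actual code):

import numpy as np

rng = np.random.RandomState(42)

# Illustrative stand-ins for the real data: one first-level regression
# coefficient per subject, plus a continuous covariate (e.g., age).
n_subjects = 40
covariate = rng.uniform(20, 60, n_subjects)
betas = 0.05 * covariate + rng.normal(0, 1, n_subjects)

n_boot = 2000
boot_coefs = np.empty(n_boot)
for b in range(n_boot):
    # Resample subject indices with replacement, so the
    # subject-to-covariate pairing stays intact.
    idx = rng.randint(0, n_subjects, n_subjects)
    # Second-level fit: subject betas regressed on the covariate.
    boot_coefs[b] = np.polyfit(covariate[idx], betas[idx], 1)[0]

# 95% confidence interval from the bootstrap distribution.
ci_low, ci_high = np.percentile(boot_coefs, [2.5, 97.5])
print('95% CI: [{:.3f}, {:.3f}]'.format(ci_low, ci_high))
print('significant:', not (ci_low <= 0 <= ci_high))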

Podcast.__init__: Build Your Own Knowledge Graph With Zincbase


Summary

Computers are excellent at following detailed instructions, but they have no capacity for understanding the information that they work with. Knowledge graphs are a way to approximate that capability by building connections between elements of data that allow us to discover new connections among disparate information sources that were previously unknown. In our day-to-day work we encounter many instances of knowledge graphs, but building them has long been a difficult endeavor. In order to make this technology more accessible, Tom Grek built Zincbase. In this episode he explains his motivations for starting the project, how he uses it in his daily work, and how you can use it to create your own knowledge engine and begin discovering new insights of your own.

Announcements

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
  • And to keep track of how your team is progressing on building new features and squashing bugs, you need a project management system designed by software engineers, for software engineers. Clubhouse lets you craft a workflow that fits your style, including per-team tasks, cross-project epics, a large suite of pre-built integrations, and a simple API for crafting your own. With such an intuitive tool it’s easy to make sure that everyone in the business is on the same page. Podcast.init listeners get 2 months free on any plan by going to pythonpodcast.com/clubhouse today and signing up for a trial.
  • You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, and the Open Data Science Conference. Coming up this fall is the combined events of Graphorum and the Data Architecture Summit. The agendas have been announced and super early bird registration for up to $300 off is available until July 26th, with early bird pricing for up to $200 off through August 30th. Use the code BNLLC to get an additional 10% off any pass when you register. Go to pythonpodcast.com/conferences to learn more and take advantage of our partner discounts when you register.
  • Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email hosts@podcastinit.com
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
  • Your host as usual is Tobias Macey and today I’m interviewing Tom Grek about knowledge graphs, when they’re useful, and his project Zincbase that makes them easier to build

Interview

  • Introductions
  • How did you get introduced to Python?
  • Can you start by explaining what a knowledge graph is and some of the ways that they are used?
    • How did you first get involved in the space of knowledge graphs?
  • You have built the Zincbase project for building and querying knowledge graphs. What was your motivation for creating this project and what are some of the other tools that are available to perform similar tasks?
  • Can you describe how Zincbase is implemented and some of the ways that it has evolved since you first began working on it?
    • What are some of the assumptions that you had at the outset of the project which have been challenged or updated in the process of working on and with it?
  • What are some of the common challenges when building or using knowledge graphs?
  • How has the domain of knowledge graphs changed in recent years as new approaches to entity resolution and data processing have been introduced?
  • Can you talk through a use case and workflow for using Zincbase to design and populate a knowledge graph?
  • What are some of the ways that you are using Zincbase in your own projects?
  • What have you found to be the most challenging/interesting/unexpected lessons that you have learned in the process of building and maintaining Zincbase?
  • What do you have planned for the future of the project?

Keep In Touch

Picks

Links

The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA

Roberto Alsina: Old Guy @ The Terminal Ep 3: Puede Fallar!


Episode 3!

Just as almost nobody publishes studies with negative results, nobody makes YouTube videos about how they failed to do something. Well, I do.

This episode is about one of the things that interests me most in software development, especially for someone who is learning (that is, everyone), and even more so for a beginner: failure.

Watch me fail for about 20 minutes while I fruitlessly try to do something I had been wanting to do!

And nothing bad happens! It's impossible to have impostor syndrome if you don't pretend to know.

Mike Driscoll: Using Twitter with Python and Tweepy


Twitter is a popular social network that people use to communicate with each other. Python has several packages that you can use to interact with Twitter. These packages can be useful for creating Twitter bots or for downloading lots of data for offline analysis. One of the more popular Python Twitter packages is called Tweepy. You will learn how to use Tweepy with Twitter in this article.

Tweepy gives you access to Twitter’s API, which exposes the following (plus lots more!):

  • Tweets
  • Retweets
  • Likes
  • Direct messages
  • Followers

This allows you to do a lot with Tweepy. Let’s find out how to get started!


Getting Started

The first thing that you need to do is create a Twitter account and get the credentials you will need to access Twitter. To do that, you will need to apply for a developer account here.

Once that is created, you can get or generate the following:

  • Consumer API Key
  • Consumer API Secret
  • Access token
  • Access secret

If you ever lose these items, you can go back to your developer account and regenerate new ones. You can also revoke the old ones.

Note: By default, the access token you receive is read-only. If you would like to send tweets with Tweepy, then you will need to make sure you set the Application Type to “Read and Write”.


Installing Tweepy

Next you will need to install the Tweepy package. This is accomplished by using pip:

pip install tweepy

Now that you have Tweepy installed, you can start using it!


Using Tweepy

You can use Tweepy to do pretty much anything on Twitter programmatically. For example, you can use Tweepy to get and send tweets. You can use it to access information about a user. You can retweet, follow/unfollow, post, and much more.

Let’s look at an example of getting your user’s home timeline:

import tweepy
 
consumer_key = 'CONSUMER_KEY'
consumer_secret = 'CONSUMER_SECRET'
access_token = 'ACCESS_TOKEN'
access_secret = 'ACCESS_SECRET' 
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_secret) 
api = tweepy.API(auth) 
tweets = api.home_timeline()

for tweet in tweets:
    print('{real_name} (@{name}) said {tweet}\n\n'.format(
        real_name=tweet.author.name, name=tweet.author.screen_name,
        tweet=tweet.text))

The first few lines of code here are where you would put the credentials that you found on your Twitter developer profile page. It is not actually recommended to hard-code these values in your code, but I am doing that here for simplicity. If you were writing code to be shared, you would want the users of your code to export their credentials to their environment and read them with Python's os.getenv(), or pass them in via command-line arguments using argparse.
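For example, reading the credentials from the environment might look like this (the environment variable names are just a suggestion):

import os

consumer_key = os.getenv('TWITTER_CONSUMER_KEY')
consumer_secret = os.getenv('TWITTER_CONSUMER_SECRET')
access_token = os.getenv('TWITTER_ACCESS_TOKEN')
access_secret = os.getenv('TWITTER_ACCESS_SECRET')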

Next you log in to Twitter by creating an OAuthHandler() object and setting the access token with the aptly named set_access_token() function. Then you can create an API() instance that will allow you to access Twitter.

In this case, you call home_timeline(), which returns the first twenty tweets in your home timeline. These are tweets from your friends or followers, or could be random tweets that Twitter has decided to promote in your timeline. Here you print out the author's name, Twitter handle, and the text of their tweet.

Let’s find out how you get information about yourself using the api object you created earlier:

>>> me = api.me()
>>> me.screen_name
'driscollis'
>>> me.name
'Mike Driscoll'
>>> me.description
('Author of books, blogger @mousevspython and Python enthusiast. Also part of '
 'the tutorial team @realpython')

You can use the api to get information about yourself. The code above demonstrates getting your screen name, actual name, and the description you have set on Twitter. You can get much more than this. For example, you can get your followers, timeline, etc.


Getting Tweets

Getting tweets is also quite easy to do using Tweepy. You can get your own tweets or someone else’s if you know their username.

Let’s start by getting your tweets:

import tweepy
 
consumer_key = 'CONSUMER_KEY'
consumer_secret = 'CONSUMER_SECRET'
access_token = 'ACCESS_TOKEN'
access_secret = 'ACCESS_SECRET' 
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_secret) 
api = tweepy.API(auth)
my_tweets = api.user_timeline()

for t in my_tweets:
    print(t.text)

Here you connect to Twitter as you did in the previous section. Then you call user_timeline() to get a list of objects that you then iterate over. In this case, you end up printing out only the text of each tweet.

Let’s try getting a specific user. In this case, we will use a fairly popular programmer, Kelly Vaughn:

>>> user = api.get_user('kvlly')
>>> user.screen_name
'kvlly'
>>> user.name
'Kelly Vaughn 🐞'
>>> tweets = user.timeline()
>>> for t in tweets:
        print(t.text)

She tweets a LOT, so I won't be reproducing her tweets here. However, as you can see, it's quite easy to get a user. All you need to do is pass get_user() a valid Twitter user name and then you'll have access to anything that is publicly available about that user.


Sending Tweets

Reading tweets is fun, but what about sending them? Tweepy can do this task for you as well.

Let’s find out how:

import tweepy
 
consumer_key = 'CONSUMER_KEY'
consumer_secret = 'CONSUMER_SECRET'
access_token = 'ACCESS_TOKEN'
access_secret = 'ACCESS_SECRET' 
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_secret) 
api = tweepy.API(auth) 
api.update_status("This is your tweet message")

The main bit of code you should focus on here is the very last line: update_status(). Here you pass in a string, which is the tweet message itself. As long as there are no errors, you should see the tweet in your Twitter timeline.

Now let’s learn how to send a tweet with a photo attached to it:

import tweepy
 
consumer_key = 'CONSUMER_KEY'
consumer_secret = 'CONSUMER_SECRET'
access_token = 'ACCESS_TOKEN'
access_secret = 'ACCESS_SECRET' 
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_secret) 
api = tweepy.API(auth) 
api.update_with_media('/path/to/an/image.jpg',
                      "This is your tweet message")

In this case, you need to use the update_with_media() method, which takes a path to the image you want to upload and a string, which is the tweet message.


Listing Followers

The last topic that I am going to cover is how to list your followers on Twitter.

Let’s take a look:

import tweepy
 
consumer_key = 'CONSUMER_KEY'
consumer_secret = 'CONSUMER_SECRET'
access_token = 'ACCESS_TOKEN'
access_secret = 'ACCESS_SECRET' 
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_secret) 
api = tweepy.API(auth) 
followers = api.followers()

for follower in followers:
    print(follower.screen_name)

This code will list the latest 20 followers for your account. All you need to do is call followers() to get the list. If you need to get more than just the latest 20, you can use the count parameter to specify how many results you want.
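For example (a small sketch; the count argument here mirrors the Twitter followers/list parameter, which allows up to 200 results per call):

# Fetch up to 100 followers in one call instead of the default 20
followers = api.followers(count=100)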


Wrapping Up

There is much more that you can do with Tweepy. For example, you can get likes, send and read direct messages, and upload media, among other things. It's quite a nice and easy-to-use package. You can also use Tweepy to read and write to Twitter in real time, which allows you to create Twitter bots. If you haven't given Tweepy a try yet, you should definitely give it a go. It's a lot of fun!


Related Reading

The post Using Twitter with Python and Tweepy appeared first on The Mouse Vs. The Python.

Reuven Lerner: Weekly Python Exercise is a PyCon Africa 2019 bronze sponsor


I’ve attended two Python conferences so far this year: PyCon (in May, in Cleveland, Ohio) and EuroPython (in July, in Basel, Switzerland). Both were fantastic; I was happy to be a sponsor at PyCon in the US, and to give my “practical decorators” talk at both conferences.

While in Basel, I heard about PyCon Africa, a conference for people from all over Africa to come and share their Python knowledge with one another. And while I couldn’t make it (since I’m giving my “Python for non-programmers” course to a company in the US), I was delighted to become a bronze sponsor of the conference, under the Weekly Python Exercise name.

I hope that this year’s PyCon Africa, which starts today, is so over-the-top successful that it’ll happen again next year — and that I’ll be able to join it in person.

Meanwhile, don’t forget that if you want to improve your Python fluency, then Weekly Python Exercise offers a family of 15-week courses, at both beginner and advanced levels, to help you out. And if you live outside of the world’s 30 richest countries, then you’re entitled to a very steep discount on the enrollment fee. A new beginner-level cohort starts in September; find out more at https://WeeklyPythonExercise.com/!

The post Weekly Python Exercise is a PyCon Africa 2019 bronze sponsor appeared first on Reuven Lerner.

nl-project: Announcing syntreenet 1.0

Syntreenet provides a scalable and performant logic engine for any formal language described by a parsing expression grammar (PEG). You can check it out on PyPI.

ListenData: Python list comprehension with Examples

This tutorial covers how list comprehension works in Python. It includes many examples that will help you familiarize yourself with the concept, and by the end of this lesson you should be able to implement it in your live project.

What is list comprehension?

Python is an object oriented programming language; almost everything in it is treated consistently as an object. Python also features functional programming, which is very similar to the mathematical way of approaching a problem: you pass inputs to a function and you get the same output for the same input value. Given a function f(x) = x², f(x) will always return the same result for the same x value. Such a function has no "side effect", which means an operation has no effect on a variable or object outside its intended usage. "Side effect" refers to leaks in your code which can modify a mutable data structure or variable.

Functional programming is also good for parallel computing as there is no shared data or access to the same variable.
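As a quick illustration of the difference between a pure function and one with a side effect (a minimal sketch):

def pure_square(x):
    return x ** 2       # depends only on its input; touches no outside state

total = 0
def impure_square(x):
    global total
    total += x ** 2     # side effect: modifies a variable outside the function
    return x ** 2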

List comprehension is a part of functional programming which provides a crisp way to create lists without writing a for loop.
list comprehension python
In the image above, the for clause iterates through each item of the list. The if clause filters the list and returns only those items for which the filter condition is met. The if clause is optional, so you can omit it if you don't have a conditional statement.

[i**3 for i in [1,2,3,4] if i>2] means: take items one by one from the list [1,2,3,4] and check whether each is greater than 2. If yes, take its cube; otherwise ignore the value. The result is a list of the cubes of 3 and 4. Output: [27, 64]

List Comprehension vs. For Loop vs. Lambda + map()

All three are different styles of iterating through each element of a list, but they serve the same purpose and return the same output. There are some differences between them, as shown below.
1. List comprehension is more readable than For Loop and Lambda function.
List Comprehension

[i**2 for i in range(2,10)]
For Loop

sqr = []
for i in range(2,10):
    sqr.append(i**2)
sqr
Lambda + Map

list(map(lambda i: i**2, range(2, 10)))

Output
[4, 9, 16, 25, 36, 49, 64, 81]
List comprehension performs a loop operation and combines the items into a list in just a single line of code. It is more understandable and clearer than the for loop and lambda versions.

range(2,10) returns 2 through 9 (excluding 10).

**2 refers to squaring (a number raised to the power of 2). sqr = [] creates an empty list. The append() function stores the output of each repetition of the sequence (i.e., the square value) in the for loop.

map() applies the lambda function to each item of the iterable (list). Wrap it in list() to generate a list as output.


Stack Abuse: Image Classification with Transfer Learning and PyTorch


Introduction

Transfer learning is a powerful technique for training deep neural networks that allows one to take knowledge learned about one deep learning problem and apply it to a different, yet similar learning problem.

Using transfer learning can dramatically speed up the rate of deployment for an app you are designing, making both the training and implementation of your deep neural network simpler and easier.

In this article we'll go over the theory behind transfer learning and see how to carry out an example of transfer learning on Convolutional Neural Networks (CNNs) in PyTorch.

What is PyTorch?

PyTorch is a library developed for Python, specializing in deep learning and natural language processing. PyTorch takes advantage of the power of Graphical Processing Units (GPUs) to make training a deep neural network faster than it would be on a CPU.

PyTorch has seen increasing popularity with deep learning researchers thanks to its speed and flexibility. PyTorch sells itself on three different features:

  • A simple, easy-to-use interface
  • Complete integration with the Python data science stack
  • Flexible / dynamic computational graphs that can be changed during run time (which makes training a neural network significantly easier when you have no idea how much memory will be required for your problem).

PyTorch is compatible with NumPy and it allows NumPy arrays to be transformed into tensors and vice versa.

Defining Necessary Terms

Before we go any further, let's take a moment to define some terms related to transfer learning. Getting clear on our definitions will make the theory behind transfer learning easier to understand and an instance of it easier to implement and replicate.

What is Deep Learning?

alt

Deep learning is a subsection of machine learning, and machine learning can be described as simply the act of enabling computers to carry out tasks without being explicitly programmed to do so.

Deep Learning systems utilize neural networks, which are computational frameworks modeled after the human brain.

Neural networks have three different components: An input layer, a hidden layer or middle layer, and an output layer.

The input layer is simply where the data being sent into the neural network is processed, while the middle layers/hidden layers are composed of structures referred to as nodes or neurons.

These nodes are mathematical functions which alter the input information in some way and pass on the altered data to the final layer, or the output layer. Simple neural networks can distinguish simple patterns in the input data by adjusting the assumptions, or weights, about how the data points are related to one another.
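As a rough illustration (a minimal sketch, not from this article's code), a single node can be thought of as a weighted sum of its inputs plus a bias, passed through an activation function:

import numpy as np

def node(inputs, weights, bias):
    # weighted sum of inputs plus bias, passed through a ReLU activation
    return max(0.0, np.dot(weights, inputs) + bias)

print(node(np.array([0.5, -1.0]), np.array([0.8, 0.2]), 0.1))  # ~0.3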

A deep neural network gets its name from the fact that it is made out of many regular neural networks joined together. The more neural networks are linked together, the more complex patterns the deep neural network can distinguish and the more uses it has. There are different kinds of neural networks, with each type having its own specialty.

For example, Long Short Term Memory deep neural networks are networks that work very well when handling time sensitive tasks, where the chronological order of data is important, like text or speech data.

What is a Convolutional Neural Network?

This article will be concerned with Convolutional Neural Networks, a type of neural network that excels at manipulating image data.

Convolutional Neural Networks (CNNs) are special types of neural networks, adept at creating representations of visual data. The data in a CNN is represented as a grid which contains values that represent how bright, and what color, every pixel in the image is.

A CNN is broken down into three different components: the convolutional layers, the pooling layers, and the fully connected layers.

alt

The responsibility of the convolutional layer is to create a representation of the image by taking the dot product of two matrices.

The first matrix is a set of learnable parameters, referred to as a kernel. The other matrix is a portion of the image being analyzed, which will have a height, a width, and color channels. The convolutional layers are where the most computation happens in a CNN. The kernel is moved across the entire width and height of the image, eventually producing a representation of the entire image that is two-dimensional, a representation known as an activation map.
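To make the dot-product idea concrete, here is a naive single-channel convolution sketch (no padding or stride, purely illustrative; real frameworks use much faster implementations):

import numpy as np

def convolve2d(image, kernel):
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    activation_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # dot product of the kernel with one image patch
            activation_map[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return activation_map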

Due to the sheer amount of information contained in the CNN's convolutional layers, it can take an extremely long time to train the network. The function of the pooling layers is to reduce the amount of information contained in the CNNs convolutional layers, taking the output from one convolutional layer and scaling it down to make the representation simpler.

The pooling layer accomplishes this by looking at different spots in the network's outputs and "pooling" the nearby values, coming up with a single value that represents all the nearby values. In other words, it takes a summary statistic of the values in a chosen region.

Summarizing the values in a region means that the network can greatly reduce the size and complexity of its representation while still keeping the relevant information that will enable the network to recognize that information and draw meaningful patterns from the image.

There are various functions that can be used to summarize a region's values, such as taking the average of a neighborhood - or Average Pooling. A weighted average of the neighborhood can also be taken, as can the L2 norm of the region. The most common pooling technique is Max Pooling, where the maximum value of the region is taken and used to represent the neighborhood.
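A minimal max pooling sketch, assuming the input dimensions divide evenly by the pooling size:

import numpy as np

def max_pool(activation_map, size=2):
    h, w = activation_map.shape
    # group the map into size-by-size blocks and take the max of each
    return activation_map.reshape(h // size, size, w // size, size).max(axis=(1, 3))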

The fully connected layer is where all the neurons are linked together, with connections between every preceding and succeeding layer in the network. This is where the information that has been extracted by the convolutional layers and pooled by the pooling layers is analyzed, and where patterns in the data are learned. The computations here are carried out through matrix multiplication combined with a bias effect.

There are also several nonlinearities present in the CNN. When considering that images themselves are non-linear things, the network has to have nonlinear components to be able to interpret the image data. The nonlinear layers are usually inserted into the network directly after the convolutional layers, as this gives the activation map non-linearity.

There are a variety of different nonlinear activation functions that can be used for the purpose of enabling the network to properly interpret the image data. The most popular nonlinear activation function is ReLu, or the Rectified Linear Unit. The ReLu function clamps real values to the non-negative range: any value above zero is returned as is, while any value below zero is returned as zero.

The ReLu function is popular because of its reliability and speed, performing around six times faster than other activation functions. The downside to ReLu is that it can easily get stuck when handling large gradients, never updating the neurons. This problem can be tackled by choosing a sensible learning rate for the network.
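In PyTorch, this behavior is a one-liner (a quick illustration):

import torch
import torch.nn as nn

relu = nn.ReLU()
x = torch.tensor([-2.0, -0.5, 0.0, 1.5])
print(relu(x))  # tensor([0.0000, 0.0000, 0.0000, 1.5000])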

Two other popular nonlinear functions are the sigmoid function and the Tanh function.

The sigmoid function works by taking real values and squishing them to a range between 0 and 1, although it has problems handling activations that are near the extremes of the gradient, as the values become almost zero.

Meanwhile, the Tanh function operates similarly to the Sigmoid, except that its output is centered near zero and it squishes the values to between -1 and 1.

Training and Testing

There are two different phases to creating and implementing a deep neural network: training and testing.

The training phase is where the network is fed the data and begins to learn the patterns that the data contains, adjusting the weights of the network, which are assumptions about how the data points are related to each other. To put that another way, the training phase is where the network "learns" about the data it has been fed.

The testing phase is where what the network has learned is evaluated. The network is given a new set of data, one it hasn't seen before, and then the network is asked to apply its guesses about the patterns it has learned to the new data. The accuracy of the model is evaluated and typically the model is tweaked and retrained, then retested, until the architect is satisfied with the model's performance.

In the case of transfer learning, the network that is used has been pretrained. The network's weights have already been adjusted and saved, so there's no reason to train the entire network again from scratch. This means that the network can immediately be used for testing, or just certain layers of the network can be tweaked and then retrained. This greatly speeds up the deployment of the deep neural network.

What is Transfer Learning?

alt

The idea behind transfer learning is taking a model trained on one task and applying to a second, similar task. The fact that a model has already had some or all of the weights for the second task trained means that the model can be implemented much quicker. This allows rapid performance assessment and model tuning, enabling quicker deployment overall. Transfer learning is becoming increasingly popular in the field of deep learning, thanks to the vast amount of computational resources and time needed to train deep learning models, in addition to large, complex datasets.

The primary constraint of transfer learning is that the model features learned during the first task are general, and not specific to the first task. In practice, this means that models trained to recognize certain types of images can be reused to recognize other images, as long as the general features of the images are similar.

Transfer Learning Theory

The utilization of transfer learning involves several important concepts. In order to understand the implementation of transfer learning, we need to go over what a pre-trained model looks like, and how that model can be fine-tuned for your needs.

There are two ways to choose a model for transfer learning. It is possible to create a model from scratch for your own needs, save the model's parameters and structure, and then reuse the model later.

The second way to implement transfer learning is to simply take an already existing model and reuse it, tuning its parameters and hyperparameters as you do so. In this instance, we will be using a pretrained model and modifying it. After you've decided what approach you want to use, choose a model (if you are using a pretrained model).

There is a large variety of pretrained models that can be used in PyTorch. Some of the pretrained CNNs include:

  • AlexNet
  • CaffeResNet
  • Inception
  • The ResNet series
  • The VGG series

These pretrained models are accessible through PyTorch's API and when instructed, PyTorch will download their specifications to your machine. The specific model we are going to be using is ResNet34, part of the Resnet series.

The Resnet model was developed and trained on an ImageNet dataset as well as the CIFAR-10 dataset. As such it is optimized for visual recognition tasks, and showed a marked improvement over the VGG series, which is why we will be using it.

However, other pretrained models exist, and you may want to experiment with them to see how they compare.

As PyTorch's documentation on transfer learning explains, there are two major ways that transfer learning is used: fine-tuning a CNN or by using the CNN as a fixed feature extractor.

When fine-tuning a CNN, you use the weights the pretrained network has instead of randomly initializing them, and then you train like normal. In contrast, a feature extractor approach means that you'll maintain all the weights of the CNN except for those in the final few layers, which will be initialized randomly and trained as normal.

Fine-tuning a model is important because although the model has been pretrained, it has been trained on a different (though hopefully similar) task. The densely connected weights that the pretrained model comes with will probably be somewhat insufficient for your needs, so you will likely want to retrain the final few layers of the network.

In contrast, because the first few layers of the network are just feature extraction layers, and they will perform similarly on similar images, they can be left as they are. Therefore, if the dataset is small and similar, the only training that needs to be done is the training of the final few layers. The larger and more complex the dataset gets, the more the model will need to be retrained. Remember that transfer learning works best when the dataset you are using is smaller than the original pre-trained model, and similar to the images fed to the pretrained model.

Working with transfer learning models in Pytorch means choosing which layers to freeze and which to unfreeze. Freezing a model means telling PyTorch to preserve the parameters (weights) in the layers you've specified. Unfreezing a model means telling PyTorch you want the layers you've specified to be available for training, to have their weights trainable.

After you've concluded training your chosen layers of the pretrained model, you'll probably want to save the newly trained weights for future use. Even though using a pretrained model is faster than training a model from scratch, it still takes time to train, so you'll want to copy the best model weights.
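Saving and restoring those weights might look like this (the filename is just an example):

import torch

# assuming `model` is the fine-tuned torch.nn.Module
torch.save(model.state_dict(), 'best_model_weights.pth')

# ...later, load the weights back into a model with the same architecture
model.load_state_dict(torch.load('best_model_weights.pth'))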

Image Classification with Transfer Learning in PyTorch

We're ready to start implementing transfer learning on a dataset. We'll cover both finetuning the ConvNet and using the net as a fixed feature extractor.

Data Preprocessing

First off, we'll need to decide on a dataset to use. Let's choose something that has a lot of really clear images to train on. The Stanford Cats and Dogs dataset is a very commonly used dataset, chosen for how simple yet illustrative the set is. You can download this right here.

Be sure to divide the dataset into two equally sized sets: "train" and "val".

You can do this any way that you would like, by manually moving the files or by writing a function to handle it. You may also want to limit the dataset to a smaller size, as it comes with almost 12,000 images in each category, and this would take a long time to train. You may want to cut that number down to around 5000 in each category, with 1000 set aside for validation. However, the number of images you want to use for training is up to you.

Here's one way to prepare the data for use:

import os
import shutil
import re

base_dir = "PetImages/"

# Create training folder
files = os.listdir(base_dir)

# Moves all training cat images to cats folder, training dog images to dogs folder
def train_maker(name):
    train_dir = f"{base_dir}train/{name}"
    os.makedirs(f"{base_dir}train", exist_ok=True)  # make sure the train folder exists
    for f in files:
        search_object = re.search(name, f)
        if search_object:
            shutil.move(f'{base_dir}/{name}', train_dir)

train_maker("Cat")
train_maker("Dog")

# Make the validation directories
try:
    os.makedirs(base_dir + "val/Cat")
    os.makedirs(base_dir + "val/Dog")
except OSError:
    print("Creation of the validation directories failed")
else:
    print("Successfully created the validation directories")

# Create validation folder

cat_train = base_dir + "train/Cat/"
cat_val = base_dir + "val/Cat/"
dog_train = base_dir + "train/Dog/"
dog_val = base_dir + "val/Dog/"

cat_files = os.listdir(cat_train)
dog_files = os.listdir(dog_train)

# This will put 1000 images from the two training folders
# into their respective validation folders

for f in cat_files:
    validationCatsSearchObj = re.search("5\d\d\d", f)
    if validationCatsSearchObj:
        shutil.move(f'{cat_train}/{f}', cat_val)

for f in dog_files:
    validationDogsSearchObj = re.search("5\d\d\d", f)
    if validationDogsSearchObj:
        shutil.move(f'{dog_train}/{f}', dog_val)

Loading the Data

After we have selected and prepared the data, we can start off by importing all the necessary libraries. We'll need many of the Torch packages, like the nn neural network module, the optimizers, and the DataLoaders. We'll also want matplotlib to visualize some of our training examples.

We need numpy to handle the creation of data arrays, as well as a few other miscellaneous modules:

from __future__ import print_function, division

import torch
import torch.nn as nn
import torch.optim as optim
from torch.optim import lr_scheduler
import torchvision
from torchvision import datasets, models, transforms
import matplotlib.pyplot as plt
import numpy as np
import time
import os
import copy

To start off with, we need to load in our training data and prepare it for use by our neural network. We're going to be making use of Pytorch's transforms for that purpose. We'll need to make sure the images in the training set and validation set are the same size, so we'll be using transforms.Resize.

We'll also be doing a little data augmentation, trying to improve the performance of our model by forcing it to learn about images at different angles and crops, so we'll randomly crop and rotate the images.

Next, we'll make tensors out of the images, as PyTorch works with tensors. Finally, we'll normalize the images, which helps the network work with values that may have a wide range.

We then compose all our chosen transforms. Note that the validation transforms don't have any of the flipping or rotating, as validation images aren't part of our training set, so the network isn't learning about them:

# Make transforms and use data loaders

# We'll use these a lot, so make them variables
mean_nums = [0.485, 0.456, 0.406]
std_nums = [0.229, 0.224, 0.225]

chosen_transforms = {'train': transforms.Compose([
        transforms.RandomResizedCrop(size=256),
        transforms.RandomRotation(degrees=15),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize(mean_nums, std_nums)
]), 'val': transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean_nums, std_nums)
]),
}

Now we will set the directory for our data and use PyTorch's ImageFolder function to create datasets:

# Set the directory for the data
data_dir = '/data/'

# Use the image folder function to create datasets
chosen_datasets = {x: datasets.ImageFolder(os.path.join(data_dir, x),
  chosen_transforms[x])
                  for x in ['train', 'val']}

Now that we have chosen the image folders we want, we need to use the DataLoaders to create iterable objects for us to work with. We tell it which datasets we want to use, give it a batch size, and shuffle the data.

# Make iterables with the dataloaders
dataloaders = {x: torch.utils.data.DataLoader(chosen_datasets[x], batch_size=4,
  shuffle=True, num_workers=4)
              for x in ['train', 'val']}

We're going to need to preserve some information about our dataset, specifically the size of the dataset and the names of the classes in our dataset. We also need to specify what kind of device we are working with, a CPU or GPU. The following setup will use GPU if available, otherwise CPU will be used:

dataset_sizes = {x: len(chosen_datasets[x]) for x in ['train', 'val']}
class_names = chosen_datasets['train'].classes

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

Now let's try visualizing some of our images with a function. We'll take an input, create a Numpy array from it, and transpose it. Then we'll normalize the input using mean and standard deviation. Finally, we'll clip values to between 0 and 1 so there isn't a massive range in the possible values of the array, and then show the image:

alt

# Visualize some images
def imshow(inp, title=None):
    inp = inp.numpy().transpose((1, 2, 0))
    mean = np.array([mean_nums])
    std = np.array([std_nums])
    inp = std * inp + mean
    inp = np.clip(inp, 0, 1)
    plt.imshow(inp)
    if title is not None:
        plt.title(title)
    plt.pause(0.001)  # Pause a bit so that plots are updated

Now let's use that function and actually visualize some of the data. We're going to get the inputs and the name of the classes from the DataLoader and store them for later use. Then we'll make a grid to display the inputs on and display them:

# Grab some of the training data to visualize
inputs, classes = next(iter(dataloaders['train']))

# Now we construct a grid from batch
out = torchvision.utils.make_grid(inputs)

imshow(out, title=[class_names[x] for x in classes])

Setting up a Pretrained Model

Now we have to set up the pretrained model we want to use for transfer learning. In this case, we're going to use the model as is and just reset the final fully connected layer, providing it with our number of features and classes.

When using pretrained models, PyTorch sets the model to be unfrozen (will have its weights adjusted) by default. So we'll be training the whole model:

# Setting up the model
# load in pretrained and reset final fully connected

res_mod = models.resnet34(pretrained=True)

num_ftrs = res_mod.fc.in_features
res_mod.fc = nn.Linear(num_ftrs, 2)

If this still seems somewhat unclear, visualizing the composition of the model may help.

for name, child in res_mod.named_children():
    print(name)

Here's what that returns:

conv1
bn1
relu
maxpool
layer1
layer2
layer3
layer4
avgpool
fc

Notice the final portion is fc, or "Fully-Connected". This is the only layer we are modifying the shape of, giving it our two classes to output.

Essentially, we're going to be changing the outputs of the final fully connected portion to just two classes, and adjusting the weights for all the other layers.

Now we need to send our model to our training device. We also need to choose the loss criterion and optimizer we want to use with the model. CrossEntropyLoss and the SGD optimizer are good choices, though there are many others.

We'll also be choosing a learning rate scheduler, which decreases the learning rate of the optimizer over time and helps prevent nonconvergence due to large learning rates. You can learn more about learning rate schedulers here if you are curious:

res_mod = res_mod.to(device)
criterion = nn.CrossEntropyLoss()

# Observe that all parameters are being optimized
optimizer_ft = optim.SGD(res_mod.parameters(), lr=0.001, momentum=0.9)

# Decay LR by a factor of 0.1 every 7 epochs
exp_lr_scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=7, gamma=0.1)

Now we just need to define the functions that will train the model and visualize the predictions.

Let's start off with the training function. It will take in our chosen model as well as the optimizer, criterion, and scheduler we chose. We'll also specify a default number of training epochs.

Every epoch will have a training and validation phase. To begin with, we set the model's initial best weights to those of the pretrained model, by using state_dict.

Now, for every epoch in the chosen number of epochs, if we are in the training phase, we will:

  1. Decrement the learning rate
  2. Zero the gradients
  3. Carry out the forward training pass
  4. Calculate the loss
  5. Do backward propagation and update the weights with the optimizer

We'll also be keeping track of the model's accuracy during the training phase, and if we move to the validation phase and the accuracy has improved, we'll save the current weights as the best model weights:

def train_model(model, criterion, optimizer, scheduler, num_epochs=10):
    since = time.time()

    best_model_wts = copy.deepcopy(model.state_dict())
    best_acc = 0.0

    for epoch in range(num_epochs):
        print('Epoch {}/{}'.format(epoch, num_epochs - 1))
        print('-' * 10)

        # Each epoch has a training and validation phase
        for phase in ['train', 'val']:
            if phase == 'train':
                scheduler.step()
                model.train()  # Set model to training mode
            else:
                model.eval()   # Set model to evaluate mode

            current_loss = 0.0
            current_corrects = 0

            # Here's where the training happens
            print('Iterating through data...')

            for inputs, labels in dataloaders[phase]:
                inputs = inputs.to(device)
                labels = labels.to(device)

                # We need to zero the gradients, don't forget it
                optimizer.zero_grad()

                # Time to carry out the forward training pass
                # We only need to log the loss stats if we are in training phase
                with torch.set_grad_enabled(phase == 'train'):
                    outputs = model(inputs)
                    _, preds = torch.max(outputs, 1)
                    loss = criterion(outputs, labels)

                    # backward + optimize only if in training phase
                    if phase == 'train':
                        loss.backward()
                        optimizer.step()

                # We want variables to hold the loss statistics
                current_loss += loss.item() * inputs.size(0)
                current_corrects += torch.sum(preds == labels.data)

            epoch_loss = current_loss / dataset_sizes[phase]
            epoch_acc = current_corrects.double() / dataset_sizes[phase]

            print('{} Loss: {:.4f} Acc: {:.4f}'.format(
                phase, epoch_loss, epoch_acc))

            # Make a copy of the model if the accuracy on the validation set has improved
            if phase == 'val' and epoch_acc > best_acc:
                best_acc = epoch_acc
                best_model_wts = copy.deepcopy(model.state_dict())

        print()

    time_since = time.time() - since
    print('Training complete in {:.0f}m {:.0f}s'.format(
        time_since // 60, time_since % 60))
    print('Best val Acc: {:4f}'.format(best_acc))

    # Now we'll load in the best model weights and return it
    model.load_state_dict(best_model_wts)
    return model

Our training printouts should look something like this:

Epoch 0/25
----------
Iterating through data...
train Loss: 0.5654 Acc: 0.7090
Iterating through data...
val Loss: 0.2726 Acc: 0.8889

Epoch 1/25
----------
Iterating through data...
train Loss: 0.5975 Acc: 0.7090
Iterating through data...
val Loss: 0.2793 Acc: 0.8889

Epoch 2/25
----------
Iterating through data...
train Loss: 0.5919 Acc: 0.7664
Iterating through data...
val Loss: 0.3992 Acc: 0.8627

Visualization

Now we'll create a function that will let us see the predictions our model has made.

def visualize_model(model, num_images=6):
    was_training = model.training
    model.eval()
    images_handled = 0
    fig = plt.figure()

    with torch.no_grad():
        for i, (inputs, labels) in enumerate(dataloaders['val']):
            inputs = inputs.to(device)
            labels = labels.to(device)

            outputs = model(inputs)
            _, preds = torch.max(outputs, 1)

            for j in range(inputs.size()[0]):
                images_handled += 1
                ax = plt.subplot(num_images//2, 2, images_handled)
                ax.axis('off')
                ax.set_title('predicted: {}'.format(class_names[preds[j]]))
                imshow(inputs.cpu().data[j])

                if images_handled == num_images:
                    model.train(mode=was_training)
                    return
        model.train(mode=was_training)

Now we can tie everything together. We'll train the model on our images and show the predictions:

alt

base_model = train_model(res_mod, criterion, optimizer_ft, exp_lr_scheduler, num_epochs=3)
visualize_model(base_model)
plt.show()

That training will probably take you a long while if you are using a CPU and not a GPU. It will still take some time even if using a GPU.

Fixed Feature Extractor

It is due to the long training time that many people choose to simply use the pretrained model as a fixed feature extractor, and only train the last layer or so. This significantly speeds up training time. In order to do that, you'll need to replace the model we've built. There will be a link to a GitHub repo for both versions of the ResNet implementation.

Replace the section where the pretrained model is defined with a version that freezes the weights and doesn't carry out gradient calculations or backprop.

It looks quite similar to before, except that we specify that the gradients don't need computation:

# Setting up the model
# Note that the parameters of imported models are set to requires_grad=True by default

res_mod = models.resnet34(pretrained=True)
for param in res_mod.parameters():
    param.requires_grad = False

num_ftrs = res_mod.fc.in_features
res_mod.fc = nn.Linear(num_ftrs, 2)

res_mod = res_mod.to(device)
criterion = nn.CrossEntropyLoss()

# Here's another change: instead of all parameters being optimized,
# only the params of the final layers are being optimized

optimizer_ft = optim.SGD(res_mod.fc.parameters(), lr=0.001, momentum=0.9)

exp_lr_scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=7, gamma=0.1)

What if we wanted to selectively unfreeze layers and have the gradients computed for just a few chosen layers? Is that possible? Yes, it is.

Let's print out the children of the model again to remember what layers/components it has:

for name, child in res_mod.named_children():
    print(name)

Here are the layers:

conv1
bn1
relu
maxpool
layer1
layer2
layer3
layer4
avgpool
fc

Now that we know what the layers are, we can unfreeze ones we want, like just layers 3 and 4:

for name, child in res_mod.named_children():
    if name in ['layer3', 'layer4']:
        print(name + ' has been unfrozen.')
        for param in child.parameters():
            param.requires_grad = True
    else:
        for param in child.parameters():
            param.requires_grad = False

Of course, we'll also need to update the optimizer to reflect the fact that we only want to optimize certain layers.

optimizer_conv = torch.optim.SGD(filter(lambda x: x.requires_grad, res_mod.parameters()), lr=0.001, momentum=0.9)

So now you know that you can tune the entire network, just the last layer, or something in between.

Conclusion

Congratulations, you've now implemented transfer learning in PyTorch. It would be a good idea to compare the implementation of a tuned network with the use of a fixed feature extractor to see how the performance differs. Experimenting with freezing and unfreezing certain layers is also encouraged, as it lets you get a better sense of how you can customize the model to fit your needs.

Here are some other things you can try:

  • Using different pretrained models to see which ones perform better under different circumstances
  • Changing some of the arguments of the model, like adjusting learning rate and momentum
  • Try classification on a dataset with more than two classes

If you're curious to learn more about different transfer learning applications and the theory behind it, there's an excellent breakdown of some of the math behind it as well as use cases
here.

The code for this article can be found in this GitHub repo.

Catalin George Festila: Python 3.7.3 : Using the flask - part 010.

If you read my last tutorial about Flask, then you understand how to use the structure of a Flask project with views.py and models.py. If you run it and open the browser at http://127.0.0.1:5000/texts/, the result will be this: {"texts":[{"title":"first title","txt_content":"this is first content"},{"title":null,"txt_content":null}]} Let's create a .env file in the base folder named my_flask and add

Real Python: 11 Beginner Tips for Learning Python


We are so excited that you have decided to embark on the journey of learning Python! One of the most common questions we receive from our readers is “What’s the best way to learn Python?”

The first step in learning any programming language is making sure that you understand how to learn. Learning how to learn is arguably the most critical skill involved in computer programming.

Why is knowing how to learn so important? Languages evolve, libraries are created, and tools are upgraded. Knowing how to learn will be essential to keeping up with these changes and becoming a successful programmer.

In this course, you’ll see several learning strategies that will help you jumpstart your journey towards becoming a rockstar Python programmer!


[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

TechBeamers Python: Python to Find Difference Between Two Lists


In this tutorial, we'll discover two Pythonic ways to find the difference between two lists. One of the methods uses Python sets: it first converts the lists into sets and then gets the unique part out of that. The other, non-set methods compare the two lists element by element and collect the unique ones. We can implement these using nested for loops or a list comprehension. By the way, if you are not aware of sets in Python, then follow the tutorial below. It quickly introduces how Python implements the mathematical notion of a set.
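As a rough preview, both approaches might look like this (a minimal sketch; the sample lists are illustrative):

a = [1, 2, 3, 4, 5]
b = [4, 5, 6]

# Set-based approach: items in a but not in b
diff_set = list(set(a) - set(b))          # [1, 2, 3] (order not guaranteed)

# List comprehension: element by element, keeps the original order
diff_comp = [x for x in a if x not in b]  # [1, 2, 3]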

The post Python to Find Difference Between Two Lists appeared first on Learn Programming and Software Testing.

PyCoder’s Weekly: Issue #380 (Aug. 6, 2019)


#380 – AUGUST 6, 2019
View in Browser »



What You Need to Know to Manage Users in Django Admin

Learn what you need to know to manage users in Django admin. Out of the box, Django admin doesn’t enforce special restrictions on the user admin. This can lead to dangerous scenarios that might compromise your system.
REAL PYTHON

Why Your Mock Doesn’t Work

“Mocking is a powerful technique for isolating tests from undesired interactions among components. But often people find their mock isn’t taking effect, and it’s not clear why. Hopefully this explanation will clear things up.”
NED BATCHELDER

Try the Python IDE for Professional Developers


PyCharm provides smart code completion, code inspections, on-the-fly error highlighting and quick-fixes, along with automated code refactorings and rich navigation capabilities. Try it now →
JETBRAINS sponsor

Django Security Releases Issued: 2.2.4, 2.1.11 and 1.11.23

Addresses DoS possibilities in django.utils.text.Truncator, strip_tags(), django.utils.encoding.uri_to_iri(), and a potential SQL injection issue in JSONField/HStoreField.
DJANGOPROJECT.COM

First Steps With PySpark and Big Data Processing

Take your first steps with Spark, PySpark, and Big Data processing concepts using intermediate Python concepts.
REAL PYTHON

Improve Your Django Tests With Fakes and Factories

This blog post is an introduction to unit testing in Django. It explains what fakers and factories are, with code examples illustrating them.
MARTIN ANGELOV

Memory Management in Python

This article describes memory management in CPython 3.6 and how it uses a pool allocator called PyMalloc to speed up memory operations and reduce fragmentation.
ARTEM GOLUBIN


Python Jobs

Senior Python Developer (Austin, TX)

InQuest

Backend and Data Science Engineers (London – Relocation & Visa Possible)

Citymapper Ltd

Software Engineering Lead (Houston, TX)

SimpleLegal

Software Engineer (Multiple US Locations)

Invitae

Python Software Engineer (Munich, Germany)

Stylight GmbH

Senior Back-End Web Developer (Vancouver, Canada)

7Geese

Lead Data Scientist (Buffalo, NY)

Utilant LLC

Python Developer (Remote)

418 Media

Sr. Python Engineer (Arlington, VA)

Public Broadcasting Service

Senior Backend Software Engineer (Remote)

Close

More Python Jobs >>>

Articles & Tutorials

Exploring Mathematics With Matplotlib and Python

“Data Visualization can be a great tool for mathematical exploration and experimentation. In this article, I show you an example using Matplotlib and Python.”
ANTONIO CANGIANO

How to Make a Scatter Plot in Python Using Seaborn

Learn how to make scatter plots, adding trend lines, text, rotating the labels, changing color, and markers, among other things.
ERIK MARSJA

All-in-One Visual Testing and Review Platform


Visually test your web app, component library, or static site across browsers and responsive widths to catch UI bugs and ship with complete confidence. Get started for free →
PERCY sponsor

A Python Prompt Into a Running Process: Debugging With Manhole

“Sometimes your Python process will behave strangely, run slowly, or give you the wrong answers. And while hopefully you have logging, the logging isn’t always enough. So how do you debug this process?”
ITAMAR TURNER-TRAURING

11 Beginner Tips for Learning Python

In this course, you’ll see several learning strategies and tips that will help you jumpstart your journey towards becoming a rockstar Python programmer!
REAL PYTHON video

Image Classification With Transfer Learning and PyTorch

In this article you’ll go over the theory behind transfer learning and see how to carry out an example of transfer learning on Convolutional Neural Networks (CNNs) in PyTorch.
DAN NELSON

Implementing a Photo Stylizer in Python Using a QuadTree Algorithm

Learn how to write a Python script to create a quadtree based filter for stylizing photos.
RICHARD BARELLA

A Simple Explanation of the Softmax Function

What Softmax is, how it’s used, and how to implement it in Python.
VICTOR ZHOU

Python Basics: A Practical Introduction to Python 3

Make the leap from Beginner to Intermediate in Python with this complete curriculum freshly updated for Python 3.7. Includes exercises, interactive quizzes, and sample projects so you’ll always know what to focus on next. Get the book today and save 27% →
REAL PYTHON book sponsor


Events

Python Miami

August 10 to August 11, 2019
PYTHONDEVELOPERSMIAMI.COM

PyBay

August 15 to August 19, 2019
PYBAY.COM

PyCon Korea 2019

August 15 to August 19, 2019
PYCON.KR


Happy Pythoning!
This was PyCoder’s Weekly Issue #380.
View in Browser »


[ Subscribe to 🐍 PyCoder’s Weekly 💌 – Get the best Python news, articles, and tutorials delivered to your inbox once a week >> Click here to learn more ]

Python Bytes: #142 There's a bandit in the Python space

Python Insider: Python 3.8.0b3 is now available for testing

It's time for a new Python preview:
https://www.python.org/downloads/release/python-380b3/ 

This release is the third of four planned beta release previews. Beta release previews are intended to give the wider community the opportunity to test new features and bug fixes and to prepare their projects to support the new feature release. The next pre-release of Python 3.8 will be 3.8.0b4, the last beta release, currently scheduled for 2019-08-26.
 

Call to action

We strongly encourage maintainers of third-party Python projects to test with 3.8 during the beta phase and report issues found to the Python bug tracker as soon as possible. While the release is planned to be feature complete entering the beta phase, it is possible that features may be modified or, in rare cases, deleted up until the start of the release candidate phase (2019-09-30). Our goal is to have no ABI changes after beta 3 and no code changes after 3.8.0rc1, the release candidate. To achieve that, it will be extremely important to get as much exposure for 3.8 as possible during the beta phase.
Please keep in mind that this is a preview release and its use is not recommended for production environments. 

Last beta coming

Beta 4 can only be released if all “Release blocker” and “Deferred blocker” issues on bugs.python.org for 3.8.0 are resolved. The core team will prioritize fixing those for the next four weeks.
 

Acknowledgements

Thanks to our binary builders, Ned and Steve, who were very quick today to get the macOS and Windows installers ready. The Windows story in particular got pretty magical; it's now fully automatic end-to-end.

Thanks to Victor for vastly improving the reliability of multiprocessing tests since Beta 2.

Thanks to Pablo for keeping the buildbots green.

Reuven Lerner: Enjoyed the movie? Now you can also enjoy the (Jupyter note)book!


About a month ago, I started my “Python standard library video explainer series” on YouTube. My goal is to walk through the Python standard library, one little bit at a time — explaining it to Python developers, and also discovering (for myself) the many gems that exist in there, but which I’ve never had a chance to discover or work with.

The series now has more than 25 videos, with more than 2.5 hours of content. I’m currently still on the “builtins” area of the standard library, but will soon be making my way into non-builtin modules. I have already learned a lot in preparing this series, and expect to learn much more as I march through the standard library, one little bit at a time.

As is always the case when I teach, I use the Jupyter notebook and live-code as I explain things. One viewer/subscriber suggested that I should share these Jupyter notebooks with the public.

And so, as of earlier today, you can get copies of the Jupyter notebooks I used in making my videos from GitHub: https://github.com/reuven/video-explainer-notebooks. I hope that the combination of Jupyter + videos will help people to understand Python better.

Subscribe to my YouTube channel, and you’ll get updates whenever I add to my explainer series!

The post Enjoyed the movie? Now you can also enjoy the (Jupyter note)book! appeared first on Reuven Lerner.

Python Engineering at Microsoft: Python in Visual Studio Code – August 2019 Release


We are pleased to announce that the August 2019 release of the Python Extension for Visual Studio Code is now available. You can download the Python extension from the Marketplace, or install it directly from the extension gallery in Visual Studio Code. If you already have the Python extension installed, you can also get the latest update by restarting Visual Studio Code. You can learn more about Python support in Visual Studio Code in the documentation.

In this release we made improvements that are listed in our changelog, closing a total of 76 issues. Highlights include Jupyter Notebook cell debugging, a new Insiders program, and improvements to auto-indentation and to the Python Language Server.

Jupyter Notebook cell debugging  

A few weeks ago we showed a preview of debugging Jupyter notebook cells at EuroPython 2019. We're happy to announce that we're officially shipping this functionality in this release.

Now you'll be able to set breakpoints and click the "Debug Cell" option that is displayed at the cell definition. This will initiate a debugging session, and you'll be able to step into, step out of, and step over your code, inspect variables, and set up watches, just like you normally would when debugging Python files or applications.

Insiders program  

This release includes support for an easy opt-in to our Insiders program. You can try out new features and fixes before the release date by getting automatic installs for the latest Insiders builds of the Python extension, in a weekly or daily cadence.   

To opt in to this program, open the command palette (View > Command Palette…) and select "Python: Switch to Insiders Weekly Channel". You can also open the settings page (File > Preferences > Settings), look for "Python: Insiders Channel", and set the channel to "daily" or "weekly", as you prefer.

Improvements to auto-indentation

This release also includes automatic one-level dedent and indentation on enter for a series of statements such as else, elif, except, finally, break, continue, pass, and raise. This was another highly requested feature from our users.

Improvements to the Python Language Server

We've added new functionality to "go to definition" with the Python Language Server, which now takes you to the place in code where a variable (as an example) is actually defined. To match the previous behavior of "go to definition", we added "go to declaration".

We’ve also made fixes to our package watcher. Before, whenever you added an import statement for a package you didn’t have installed in your environment, installing the package via pip didn’t fix ‘unresolved imports’ errors and a user would be forced to reload their entire VS Code window. Now, you no longer need to do this – the errors will automagically disappear once a new package is installed and analyzed. 

Other Changes and Enhancements 

We have also added small enhancements and fixed issues requested by users that should improve your experience working with Python in Visual Studio Code. Some notable changes include: 

  • Add new ‘goto cell’ code lens on every cell that is run from a file. (#6359) 
  • Fixed a bug in pytest test discovery. (thanks Rainer Dreyer) (#6463) 
  • Improved accessibility of the ‘Python Interactive’ window. (#5884) 
  • We now log processes executed behind the scenes in the extension output panel. (#1131) 
  • Fixed indentation after string literals containing escaped characters. (#4241) 

We also started A/B testing new features. If you see something different that was not announced by the team, you may be part of the experiment! To see if you are part of an experiment, you can check the first lines in the Python extension output channel. If you wish to opt-out from A/B testing, disable telemetry in Visual Studio Code. 

Be sure to download the Python extension for Visual Studio Code now to try out the above improvements. If you run into any problems, please file an issue on the Python VS Code GitHub page. 

The post Python in Visual Studio Code – August 2019 Release appeared first on Python.

Real Python: Inheritance and Composition: A Python Guide


In this article, you’ll explore inheritance and composition in Python. Inheritance and composition are two important concepts in object oriented programming that model the relationship between two classes. They are the building blocks of object oriented design, and they help programmers to write reusable code.

By the end of this article, you’ll know how to:

  • Use inheritance in Python
  • Model class hierarchies using inheritance
  • Use multiple inheritance in Python and understand its drawbacks
  • Use composition to create complex objects
  • Reuse existing code by applying composition
  • Change application behavior at run-time through composition

Free Bonus: Click here to get access to a free Python OOP Cheat Sheet that points you to the best tutorials, videos, and books to learn more about Object-Oriented Programming with Python.

What Are Inheritance and Composition?

Inheritance and composition are two major concepts in object oriented programming that model the relationship between two classes. They drive the design of an application and determine how the application should evolve as new features are added or requirements change.

Both of them enable code reuse, but they do it in different ways.

What’s Inheritance?

Inheritance models what is called an is a relationship. This means that when you have a Derived class that inherits from a Base class, you create a relationship where Derived is a specialized version of Base.

Inheritance is represented using the Unified Modeling Language or UML in the following way:

Basic inheritance between Base and Derived classes

Classes are represented as boxes with the class name on top. The inheritance relationship is represented by an arrow from the derived class pointing to the base class. The word extends is usually added to the arrow.

Note: In an inheritance relationship:

  • Classes that inherit from another are called derived classes, subclasses, or subtypes.
  • Classes from which other classes are derived are called base classes or super classes.
  • A derived class is said to derive, inherit, or extend a base class.

Let’s say you have a base class Animal and you derive from it to create a Horse class. The inheritance relationship states that a Horse is an Animal. This means that Horse inherits the interface and implementation of Animal, and Horse objects can be used to replace Animal objects in the application.

This is known as the Liskov substitution principle. The principle states that “in a computer program, if S is a subtype of T, then objects of type T may be replaced with objects of type S without altering any of the desired properties of the program”.

You’ll see in this article why you should always follow the Liskov substitution principle when creating your class hierarchies, and the problems you’ll run into if you don’t.
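
As a minimal illustration of the principle, here is a sketch reusing the Animal and Horse example above (greet() is a hypothetical helper, not code from the HR system you'll build below):

class Animal:
    def speak(self):
        return '...'

class Horse(Animal):
    def speak(self):
        return 'Neigh!'

def greet(animal):
    # Written against Animal, but works with any subtype of Animal
    print(animal.speak())

greet(Animal())  # ...
greet(Horse())   # Neigh! (a Horse substitutes for an Animal)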

What’s Composition?

Composition is a concept that models a has a relationship. It enables creating complex types by combining objects of other types. This means that a class Composite can contain an object of another class Component. This relationship means that a Composite has a Component.

UML represents composition as follows:

Basic composition between Composite and Component classes

Composition is represented through a line with a diamond at the composite class pointing to the component class. The composite side can express the cardinality of the relationship. The cardinality indicates the number or valid range of Component instances the Composite class will contain.

In the diagram above, the 1 represents that the Composite class contains one object of type Component. Cardinality can be expressed in the following ways:

  • A number indicates the number of Component instances that are contained in the Composite.
  • The * symbol indicates that the Composite class can contain a variable number of Component instances.
  • A range 1..4 indicates that the Composite class can contain a range of Component instances. The range is indicated with the minimum and maximum number of instances, or minimum and many instances like in 1..*.

Note: Classes that contain objects of other classes are usually referred to as composites, where classes that are used to create more complex types are referred to as components.

For example, your Horse class can be composed with another object of type Tail. Composition allows you to express that relationship by saying a Horse has a Tail.

Composition enables you to reuse code by adding objects to other objects, as opposed to inheriting the interface and implementation of other classes. Both Horse and Dog classes can leverage the functionality of Tail through composition without deriving one class from the other.
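
A minimal sketch of that idea might look like this (illustrative only, with a hypothetical wag() method, not part of the HR example that follows):

class Tail:
    def wag(self):
        return 'wagging'

class Horse:
    def __init__(self):
        self.tail = Tail()  # a Horse has a Tail

class Dog:
    def __init__(self):
        self.tail = Tail()  # a Dog has a Tail, with no shared base class

print(Horse().tail.wag())  # wagging
print(Dog().tail.wag())    # wagging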

An Overview of Inheritance in Python

Everything in Python is an object. Modules are objects, class definitions and functions are objects, and of course, objects created from classes are objects too.

Inheritance is a required feature of every object oriented programming language. This means that Python supports inheritance, and as you’ll see later, it’s one of the few languages that supports multiple inheritance.

When you write Python code using classes, you are using inheritance even if you don’t know you’re using it. Let’s take a look at what that means.

The Object Super Class

The easiest way to see inheritance in Python is to jump into the Python interactive shell and write a little bit of code. You’ll start by writing the simplest class possible:

>>>
>>> class MyClass:
...     pass
...

You declared a class MyClass that doesn’t do much, but it will illustrate the most basic inheritance concepts. Now that you have the class declared, you can use the dir() function to list its members:

>>>
>>> c = MyClass()
>>> dir(c)
['__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__',
'__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__',
'__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__',
'__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__',
'__str__', '__subclasshook__', '__weakref__']

dir() returns a list of all the members in the specified object. You have not declared any members in MyClass, so where is the list coming from? You can find out using the interactive interpreter:

>>>
>>> o = object()
>>> dir(o)
['__class__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__',
'__ge__', '__getattribute__', '__gt__', '__hash__', '__init__',
'__init_subclass__', '__le__', '__lt__', '__ne__', '__new__', '__reduce__',
'__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__',
'__subclasshook__']

As you can see, the two lists are nearly identical. There are some additional members in MyClass like __dict__ and __weakref__, but every single member of the object class is also present in MyClass.

This is because every class you create in Python implicitly derives from object. You could be more explicit and write class MyClass(object):, but it’s redundant and unnecessary.

Note: In Python 2, you have to explicitly derive from object for reasons beyond the scope of this article, but you can read about it in the New-style and classic classes section of the Python 2 documentation.

Exceptions Are an Exception

Every class that you create in Python will implicitly derive from object. The exception to this rule is classes used to indicate errors by raising an exception.

You can see the problem using the Python interactive interpreter:

>>>
>>> class MyError:
...     pass
...
>>> raise MyError()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: exceptions must derive from BaseException

You created a new class to indicate a type of error. Then you tried to use it to raise an exception. An exception is raised, but the output states that the exception is of type TypeError, not MyError, and that all exceptions must derive from BaseException.

BaseException is a base class provided for all error types. To create a new error type, you must derive your class from BaseException or one of its derived classes. The convention in Python is to derive your custom error types from Exception, which in turn derives from BaseException.

The correct way to define your error type is the following:

>>>
>>> class MyError(Exception):
...     pass
...
>>> raise MyError()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
__main__.MyError

As you can see, when you raise MyError, the output correctly states the type of error raised.

Creating Class Hierarchies

Inheritance is the mechanism you’ll use to create hierarchies of related classes. These related classes will share a common interface that will be defined in the base classes. Derived classes can specialize the interface by providing a particular implementation where it applies.

In this section, you’ll start modeling an HR system. The example will demonstrate the use of inheritance and how derived classes can provide a concrete implementation of the base class interface.

The HR system needs to process payroll for the company’s employees, but there are different types of employees depending on how their payroll is calculated.

You start by implementing a PayrollSystem class that processes payroll:

# In hr.py

class PayrollSystem:
    def calculate_payroll(self, employees):
        print('Calculating Payroll')
        print('===================')
        for employee in employees:
            print(f'Payroll for: {employee.id} - {employee.name}')
            print(f'- Check amount: {employee.calculate_payroll()}')
            print('')

The PayrollSystem implements a .calculate_payroll() method that takes a collection of employees and prints their id, name, and check amount using the .calculate_payroll() method exposed on each employee object.

Now, you implement a base class Employee that handles the common interface for every employee type:

# In hr.py

class Employee:
    def __init__(self, id, name):
        self.id = id
        self.name = name

Employee is the base class for all employee types. It is constructed with an id and a name. What you are saying is that every Employee must have an id assigned as well as a name.

The HR system requires that every Employee processed must provide a .calculate_payroll() interface that returns the weekly salary for the employee. The implementation of that interface differs depending on the type of Employee.

For example, administrative workers have a fixed salary, so every week they get paid the same amount:

# In hr.py

class SalaryEmployee(Employee):
    def __init__(self, id, name, weekly_salary):
        super().__init__(id, name)
        self.weekly_salary = weekly_salary

    def calculate_payroll(self):
        return self.weekly_salary

You create a derived class SalaryEmployee that inherits Employee. The class is initialized with the id and name required by the base class, and you use super() to initialize the members of the base class. You can read all about super() in Supercharge Your Classes With Python super().

SalaryEmployee also requires a weekly_salary initialization parameter that represents the amount the employee makes per week.

The class provides the required .calculate_payroll() method used by the HR system. The implementation just returns the amount stored in weekly_salary.

The company also employs manufacturing workers that are paid by the hour, so you add an HourlyEmployee to the HR system:

# In hr.py

class HourlyEmployee(Employee):
    def __init__(self, id, name, hours_worked, hour_rate):
        super().__init__(id, name)
        self.hours_worked = hours_worked
        self.hour_rate = hour_rate

    def calculate_payroll(self):
        return self.hours_worked * self.hour_rate

The HourlyEmployee class is initialized with id and name, like the base class, plus the hours_worked and the hour_rate required to calculate the payroll. The .calculate_payroll() method is implemented by returning the hours worked times the hour rate.

Finally, the company employs sales associates that are paid through a fixed salary plus a commission based on their sales, so you create a CommissionEmployee class:

# In hr.py

class CommissionEmployee(SalaryEmployee):
    def __init__(self, id, name, weekly_salary, commission):
        super().__init__(id, name, weekly_salary)
        self.commission = commission

    def calculate_payroll(self):
        fixed = super().calculate_payroll()
        return fixed + self.commission

You derive CommissionEmployee from SalaryEmployee because both classes have a weekly_salary to consider. At the same time, CommissionEmployee is initialized with a commission value that is based on the sales for the employee.

.calculate_payroll() leverages the implementation of the base class to retrieve the fixed salary and adds the commission value.

Since CommissionEmployee derives from SalaryEmployee, you have access to the weekly_salary property directly, and you could’ve implemented .calculate_payroll() using the value of that property.

The problem with accessing the property directly is that if the implementation of SalaryEmployee.calculate_payroll() changes, then you’ll have to also change the implementation of CommissionEmployee.calculate_payroll(). It’s better to rely on the already implemented method in the base class and extend the functionality as needed.
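
For contrast, here is a hypothetical version of CommissionEmployee that reads the attribute directly, reusing the classes defined above (a sketch for illustration, not the recommended implementation):

# Hypothetical alternative: bypasses SalaryEmployee.calculate_payroll().
# If the base class ever changes how the fixed portion is computed,
# this version silently falls out of sync.
class CommissionEmployee(SalaryEmployee):
    def __init__(self, id, name, weekly_salary, commission):
        super().__init__(id, name, weekly_salary)
        self.commission = commission

    def calculate_payroll(self):
        return self.weekly_salary + self.commission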

You created your first class hierarchy for the system. The UML diagram of the classes looks like this:

Inheritance example with multiple Employee derived classes

The diagram shows the inheritance hierarchy of the classes. The derived classes implement the IPayrollCalculator interface, which is required by the PayrollSystem. The PayrollSystem.calculate_payroll() implementation requires that the employee objects passed contain an id, name, and calculate_payroll() implementation.

Interfaces are represented similarly to classes with the word interface above the interface name. Interface names are usually prefixed with a capital I.

The application creates its employees and passes them to the payroll system to process payroll:

# In program.py

import hr

salary_employee = hr.SalaryEmployee(1, 'John Smith', 1500)
hourly_employee = hr.HourlyEmployee(2, 'Jane Doe', 40, 15)
commission_employee = hr.CommissionEmployee(3, 'Kevin Bacon', 1000, 250)
payroll_system = hr.PayrollSystem()
payroll_system.calculate_payroll([
    salary_employee,
    hourly_employee,
    commission_employee
])

You can run the program in the command line and see the results:

$ python program.py

Calculating Payroll
===================
Payroll for: 1 - John Smith
- Check amount: 1500

Payroll for: 2 - Jane Doe
- Check amount: 600

Payroll for: 3 - Kevin Bacon
- Check amount: 1250

The program creates three employee objects, one for each of the derived classes. Then, it creates the payroll system and passes a list of the employees to its .calculate_payroll() method, which calculates the payroll for each employee and prints the results.

Notice how the Employee base class doesn’t define a .calculate_payroll() method. This means that if you were to create a plain Employee object and pass it to the PayrollSystem, then you’d get an error. You can try it in the Python interactive interpreter:

>>>
>>> import hr
>>> employee = hr.Employee(1, 'Invalid')
>>> payroll_system = hr.PayrollSystem()
>>> payroll_system.calculate_payroll([employee])
Payroll for: 1 - Invalid
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/hr.py", line 39, in calculate_payroll
    print(f'- Check amount: {employee.calculate_payroll()}')
AttributeError: 'Employee' object has no attribute 'calculate_payroll'

While you can instantiate an Employee object, the object can’t be used by the PayrollSystem. Why? Because it can’t .calculate_payroll() for an Employee. To meet the requirements of PayrollSystem, you’ll want to convert the Employee class, which is currently a concrete class, to an abstract class. That way, no employee is ever just an Employee, but one that implements .calculate_payroll().

Abstract Base Classes in Python

The Employee class in the example above is what is called an abstract base class. Abstract base classes exist to be inherited, but never instantiated. Python provides the abc module to define abstract base classes.

You can use leading underscores in your class name to communicate that objects of that class should not be created. Underscores provide a friendly way to prevent misuse of your code, but they don’t prevent eager users from creating instances of that class.

The abc module in the Python standard library provides functionality to prevent creating objects from abstract base classes.

You can modify the implementation of the Employee class to ensure that it can’t be instantiated:

# In hr.py

from abc import ABC, abstractmethod

class Employee(ABC):
    def __init__(self, id, name):
        self.id = id
        self.name = name

    @abstractmethod
    def calculate_payroll(self):
        pass

You derive Employee from ABC, making it an abstract base class. Then, you decorate the .calculate_payroll() method with the @abstractmethod decorator.

This change has two nice side-effects:

  1. You’re telling users of the module that objects of type Employee can’t be created.
  2. You’re telling other developers working on the hr module that if they derive from Employee, then they must override the .calculate_payroll() abstract method.

You can see that objects of type Employee can’t be created using the interactive interpreter:

>>>
>>> import hr
>>> employee = hr.Employee(1, 'abstract')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Can't instantiate abstract class Employee with abstract methods calculate_payroll

The output shows that the class cannot be instantiated because it contains an abstract method calculate_payroll(). Derived classes must override the method to allow creating objects of their type.

Implementation Inheritance vs Interface Inheritance

When you derive one class from another, the derived class inherits both:

  1. The base class interface: The derived class inherits all the methods, properties, and attributes of the base class.

  2. The base class implementation: The derived class inherits the code that implements the class interface.

Most of the time, you’ll want to inherit the implementation of a class, but you will want to implement multiple interfaces, so your objects can be used in different situations.

Modern programming languages are designed with this basic concept in mind. They allow you to inherit from a single class, but you can implement multiple interfaces.

In Python, you don’t have to explicitly declare an interface. Any object that implements the desired interface can be used in place of another object. This is known as duck typing. Duck typing is usually explained as “if it behaves like a duck, then it’s a duck.”

To illustrate this, you will now add a DisgruntledEmployee class to the example above which doesn’t derive from Employee:

# In disgruntled.py

class DisgruntledEmployee:
    def __init__(self, id, name):
        self.id = id
        self.name = name

    def calculate_payroll(self):
        return 1000000

The DisgruntledEmployee class doesn’t derive from Employee, but it exposes the same interface required by the PayrollSystem. The PayrollSystem.calculate_payroll() requires a list of objects that implement the following interface:

  • An id property or attribute that returns the employee’s id
  • A name property or attribute that represents the employee’s name
  • A .calculate_payroll() method that doesn’t take any parameters and returns the payroll amount to process

All these requirements are met by the DisgruntledEmployee class, so the PayrollSystem can still calculate its payroll.

You can modify the program to use the DisgruntledEmployee class:

# In program.py

import hr
import disgruntled

salary_employee = hr.SalaryEmployee(1, 'John Smith', 1500)
hourly_employee = hr.HourlyEmployee(2, 'Jane Doe', 40, 15)
commission_employee = hr.CommissionEmployee(3, 'Kevin Bacon', 1000, 250)
disgruntled_employee = disgruntled.DisgruntledEmployee(20000, 'Anonymous')
payroll_system = hr.PayrollSystem()
payroll_system.calculate_payroll([
    salary_employee,
    hourly_employee,
    commission_employee,
    disgruntled_employee
])

The program creates a DisgruntledEmployee object and adds it to the list processed by the PayrollSystem. You can now run the program and see its output:

$ python program.py

Calculating Payroll
===================
Payroll for: 1 - John Smith
- Check amount: 1500

Payroll for: 2 - Jane Doe
- Check amount: 600

Payroll for: 3 - Kevin Bacon
- Check amount: 1250

Payroll for: 20000 - Anonymous
- Check amount: 1000000

As you can see, the PayrollSystem can still process the new object because it meets the desired interface.

Since you don’t have to derive from a specific class for your objects to be reusable by the program, you may be asking why you should use inheritance instead of just implementing the desired interface. The following rules may help you:

  • Use inheritance to reuse an implementation: Your derived classes should leverage most of their base class implementation. They must also model an is a relationship. A Customer class might also have an id and a name, but a Customer is not an Employee, so you should not use inheritance.

  • Implement an interface to be reused: When you want your class to be reused by a specific part of your application, you implement the required interface in your class, but you don’t need to provide a base class, or inherit from another class.

You can now clean up the example above to move onto the next topic. You can delete the disgruntled.py file and then modify the hr module to its original state:

# In hr.py

class PayrollSystem:
    def calculate_payroll(self, employees):
        print('Calculating Payroll')
        print('===================')
        for employee in employees:
            print(f'Payroll for: {employee.id} - {employee.name}')
            print(f'- Check amount: {employee.calculate_payroll()}')
            print('')

class Employee:
    def __init__(self, id, name):
        self.id = id
        self.name = name

class SalaryEmployee(Employee):
    def __init__(self, id, name, weekly_salary):
        super().__init__(id, name)
        self.weekly_salary = weekly_salary

    def calculate_payroll(self):
        return self.weekly_salary

class HourlyEmployee(Employee):
    def __init__(self, id, name, hours_worked, hour_rate):
        super().__init__(id, name)
        self.hours_worked = hours_worked
        self.hour_rate = hour_rate

    def calculate_payroll(self):
        return self.hours_worked * self.hour_rate

class CommissionEmployee(SalaryEmployee):
    def __init__(self, id, name, weekly_salary, commission):
        super().__init__(id, name, weekly_salary)
        self.commission = commission

    def calculate_payroll(self):
        fixed = super().calculate_payroll()
        return fixed + self.commission

You removed the import of the abc module since the Employee class doesn’t need to be abstract. You also removed the abstract calculate_payroll() method from it since it doesn’t provide any implementation.

Basically, you are inheriting the implementation of the id and name attributes of the Employee class in your derived classes. Since .calculate_payroll() is just an interface to the PayrollSystem.calculate_payroll() method, you don’t need to implement it in the Employee base class.

Notice how the CommissionEmployee class derives from SalaryEmployee. This means that CommissionEmployee inherits the implementation and interface of SalaryEmployee. You can see how the CommissionEmployee.calculate_payroll() method leverages the base class implementation because it relies on the result from super().calculate_payroll() to implement its own version.

The Class Explosion Problem

If you are not careful, inheritance can lead you to a huge hierarchical structure of classes that is hard to understand and maintain. This is known as the class explosion problem.

You started building a class hierarchy of Employee types used by the PayrollSystem to calculate payroll. Now, you need to add some functionality to those classes, so they can be used with the new ProductivitySystem.

The ProductivitySystem tracks productivity based on employee roles. There are different employee roles:

  • Managers: They walk around yelling at people, telling them what to do. They are salaried employees and make more money.
  • Secretaries: They do all the paperwork for managers and ensure that everything gets billed and paid on time. They are also salaried employees but make less money.
  • Sales employees: They make a lot of phone calls to sell products. They have a salary, but they also get commissions for sales.
  • Factory workers: They manufacture the products for the company. They are paid by the hour.

With those requirements, you start to see that Employee and its derived classes might belong somewhere other than the hr module because now they’re also used by the ProductivitySystem.

You create an employees module and move the classes there:

# In employees.py

class Employee:
    def __init__(self, id, name):
        self.id = id
        self.name = name

class SalaryEmployee(Employee):
    def __init__(self, id, name, weekly_salary):
        super().__init__(id, name)
        self.weekly_salary = weekly_salary

    def calculate_payroll(self):
        return self.weekly_salary

class HourlyEmployee(Employee):
    def __init__(self, id, name, hours_worked, hour_rate):
        super().__init__(id, name)
        self.hours_worked = hours_worked
        self.hour_rate = hour_rate

    def calculate_payroll(self):
        return self.hours_worked * self.hour_rate

class CommissionEmployee(SalaryEmployee):
    def __init__(self, id, name, weekly_salary, commission):
        super().__init__(id, name, weekly_salary)
        self.commission = commission

    def calculate_payroll(self):
        fixed = super().calculate_payroll()
        return fixed + self.commission

The implementation remains the same, but you move the classes to the employees module. Now, you change your program to support the change:

# In program.py

import hr
import employees

salary_employee = employees.SalaryEmployee(1, 'John Smith', 1500)
hourly_employee = employees.HourlyEmployee(2, 'Jane Doe', 40, 15)
commission_employee = employees.CommissionEmployee(3, 'Kevin Bacon', 1000, 250)
payroll_system = hr.PayrollSystem()
payroll_system.calculate_payroll([
    salary_employee,
    hourly_employee,
    commission_employee
])

You run the program and verify that it still works:

$ python program.py

Calculating Payroll
===================
Payroll for: 1 - John Smith
- Check amount: 1500

Payroll for: 2 - Jane Doe
- Check amount: 600

Payroll for: 3 - Kevin Bacon
- Check amount: 1250

With everything in place, you start adding the new classes:

# In employees.py

class Manager(SalaryEmployee):
    def work(self, hours):
        print(f'{self.name} screams and yells for {hours} hours.')

class Secretary(SalaryEmployee):
    def work(self, hours):
        print(f'{self.name} expends {hours} hours doing office paperwork.')

class SalesPerson(CommissionEmployee):
    def work(self, hours):
        print(f'{self.name} expends {hours} hours on the phone.')

class FactoryWorker(HourlyEmployee):
    def work(self, hours):
        print(f'{self.name} manufactures gadgets for {hours} hours.')

First, you add a Manager class that derives from SalaryEmployee. The class exposes a method work() that will be used by the productivity system. The method takes the hours the employee worked.

Then you add Secretary, SalesPerson, and FactoryWorker and then implement the work() interface, so they can be used by the productivity system.

Now, you can add the ProductivitySystem class:

# In productivity.py

class ProductivitySystem:
    def track(self, employees, hours):
        print('Tracking Employee Productivity')
        print('==============================')
        for employee in employees:
            employee.work(hours)
        print('')

The class tracks employees in the track() method that takes a list of employees and the number of hours to track. You can now add the productivity system to your program:

# In program.py

import hr
import employees
import productivity

manager = employees.Manager(1, 'Mary Poppins', 3000)
secretary = employees.Secretary(2, 'John Smith', 1500)
sales_guy = employees.SalesPerson(3, 'Kevin Bacon', 1000, 250)
factory_worker = employees.FactoryWorker(4, 'Jane Doe', 40, 15)
employees = [
    manager,
    secretary,
    sales_guy,
    factory_worker,
]
productivity_system = productivity.ProductivitySystem()
productivity_system.track(employees, 40)
payroll_system = hr.PayrollSystem()
payroll_system.calculate_payroll(employees)

The program creates a list of employees of different types. The employee list is sent to the productivity system to track their work for 40 hours. Then the same list of employees is sent to the payroll system to calculate their payroll.

You can run the program to see the output:

$ python program.py

Tracking Employee Productivity
==============================
Mary Poppins screams and yells for 40 hours.
John Smith expends 40 hours doing office paperwork.
Kevin Bacon expends 40 hours on the phone.
Jane Doe manufactures gadgets for 40 hours.

Calculating Payroll
===================
Payroll for: 1 - Mary Poppins
- Check amount: 3000

Payroll for: 2 - John Smith
- Check amount: 1500

Payroll for: 3 - Kevin Bacon
- Check amount: 1250

Payroll for: 4 - Jane Doe
- Check amount: 600

The program shows the employees working for 40 hours through the productivity system. Then it calculates and displays the payroll for each of the employees.

The program works as expected, but you had to add four new classes to support the changes. As new requirements come, your class hierarchy will inevitably grow, leading to the class explosion problem where your hierarchies will become so big that they’ll be hard to understand and maintain.

The following diagram shows the new class hierarchy:

Class design explosion by inheritance

The diagram shows how the class hierarchy is growing. Additional requirements might have an exponential effect on the number of classes with this design.

Inheriting Multiple Classes

Python is one of the few modern programming languages that supports multiple inheritance. Multiple inheritance is the ability to derive a class from multiple base classes at the same time.

Multiple inheritance has a bad reputation to the extent that most modern programming languages don’t support it. Instead, modern programming languages support the concept of interfaces. In those languages, you inherit from a single base class and then implement multiple interfaces, so your class can be re-used in different situations.

This approach puts some constraints on your designs. You can only inherit the implementation of one class by directly deriving from it. You can implement multiple interfaces, but you can’t inherit the implementation of multiple classes.

This constraint is good for software design because it forces you to design your classes with fewer dependencies on each other. You will see later in this article that you can leverage multiple implementations through composition, which makes software more flexible. This section, however, is about multiple inheritance, so let’s take a look at how it works.

It turns out that sometimes temporary secretaries are hired when there is too much paperwork to do. The TemporarySecretary class performs the role of a Secretary in the context of the ProductivitySystem, but for payroll purposes, it is an HourlyEmployee.

You look at your class design. It has grown a little bit, but you can still understand how it works. It seems you have two options:

  1. Derive from Secretary: You can derive from Secretary to inherit the .work() method for the role, and then override the .calculate_payroll() method to implement it as an HourlyEmployee.

  2. Derive from HourlyEmployee: You can derive from HourlyEmployee to inherit the .calculate_payroll() method, and then override the .work() method to implement it as a Secretary.

Then, you remember that Python supports multiple inheritance, so you decide to derive from both Secretary and HourlyEmployee:

# In employees.py

class TemporarySecretary(Secretary, HourlyEmployee):
    pass

Python allows you to inherit from two different classes by specifying them between parentheses in the class declaration.

Now, you modify your program to add the new temporary secretary employee:

import hr
import employees
import productivity

manager = employees.Manager(1, 'Mary Poppins', 3000)
secretary = employees.Secretary(2, 'John Smith', 1500)
sales_guy = employees.SalesPerson(3, 'Kevin Bacon', 1000, 250)
factory_worker = employees.FactoryWorker(4, 'Jane Doe', 40, 15)
temporary_secretary = employees.TemporarySecretary(5, 'Robin Williams', 40, 9)
company_employees = [
    manager,
    secretary,
    sales_guy,
    factory_worker,
    temporary_secretary,
]
productivity_system = productivity.ProductivitySystem()
productivity_system.track(company_employees, 40)
payroll_system = hr.PayrollSystem()
payroll_system.calculate_payroll(company_employees)

You run the program to test it:

$ python program.py

Traceback (most recent call last):
  File ".\program.py", line 9, in <module>
    temporary_secretary = employees.TemporarySecretary(5, 'Robin Williams', 40, 9)
TypeError: __init__() takes 4 positional arguments but 5 were given

You get a TypeError exception saying that 4 positional arguments were expected, but 5 were given.

This is because you derived TemporarySecretary first from Secretary and then from HourlyEmployee, so the interpreter is trying to use Secretary.__init__() to initialize the object.

Okay, let’s reverse it:

class TemporarySecretary(HourlyEmployee, Secretary):
    pass

Now, run the program again and see what happens:

$ python program.py

Traceback (most recent call last):
  File ".\program.py", line 9, in <module>
    temporary_secretary = employees.TemporarySecretary(5, 'Robin Williams', 40, 9)
  File "employees.py", line 16, in __init__
    super().__init__(id, name)
TypeError: __init__() missing 1 required positional argument: 'weekly_salary'

Now it seems you are missing a weekly_salary parameter, which is necessary to initialize Secretary, but that parameter doesn’t make sense in the context of a TemporarySecretary because it’s an HourlyEmployee.

Maybe implementing TemporarySecretary.__init__() will help:

# In employees.py

class TemporarySecretary(HourlyEmployee, Secretary):
    def __init__(self, id, name, hours_worked, hour_rate):
        super().__init__(id, name, hours_worked, hour_rate)

Try it:

$ python program.py

Traceback (most recent call last):
  File ".\program.py", line 9, in <module>
    temporary_secretary = employees.TemporarySecretary(5, 'Robin Williams', 40, 9)
  File "employees.py", line 54, in __init__
    super().__init__(id, name, hours_worked, hour_rate)
  File "employees.py", line 16, in __init__
    super().__init__(id, name)
TypeError: __init__() missing 1 required positional argument: 'weekly_salary'

That didn’t work either. Okay, it’s time for you to dive into Python’s method resolution order (MRO) to see what’s going on.

When a method or attribute of a class is accessed, Python uses the class MRO to find it. The MRO is also used by super() to determine which method or attribute to invoke. You can learn more about super() in Supercharge Your Classes With Python super().

You can evaluate the TemporarySecretary class MRO using the interactive interpreter:

>>>
>>> from employees import TemporarySecretary
>>> TemporarySecretary.__mro__
(<class 'employees.TemporarySecretary'>,
 <class 'employees.HourlyEmployee'>,
 <class 'employees.Secretary'>,
 <class 'employees.SalaryEmployee'>,
 <class 'employees.Employee'>,
 <class 'object'>)

The MRO shows the order in which Python is going to look for a matching attribute or method. In the example, this is what happens when we create the TemporarySecretary object:

  1. The TemporarySecretary.__init__(self, id, name, hours_worked, hour_rate) method is called.

  2. The super().__init__(id, name, hours_worked, hour_rate) call matches HourlyEmployee.__init__(self, id, name, hours_worked, hour_rate).

  3. HourlyEmployee calls super().__init__(id, name), which the MRO is going to match to Secretary.__init__(), which is inherited from SalaryEmployee.__init__(self, id, name, weekly_salary).

Because the parameters don’t match, a TypeError exception is raised.

You can bypass the MRO by reversing the inheritance order and directly calling HourlyEmployee.__init__() as follows:

class TemporarySecretary(Secretary, HourlyEmployee):
    def __init__(self, id, name, hours_worked, hour_rate):
        HourlyEmployee.__init__(self, id, name, hours_worked, hour_rate)

That solves the problem of creating the object, but you will run into a similar problem when trying to calculate payroll. You can run the program to see the problem:

$ python program.py

Tracking Employee Productivity
==============================
Mary Poppins screams and yells for 40 hours.
John Smith expends 40 hours doing office paperwork.
Kevin Bacon expends 40 hours on the phone.
Jane Doe manufactures gadgets for 40 hours.
Robin Williams expends 40 hours doing office paperwork.

Calculating Payroll
===================
Payroll for: 1 - Mary Poppins
- Check amount: 3000

Payroll for: 2 - John Smith
- Check amount: 1500

Payroll for: 3 - Kevin Bacon
- Check amount: 1250

Payroll for: 4 - Jane Doe
- Check amount: 600

Payroll for: 5 - Robin Williams
Traceback (most recent call last):
  File ".\program.py", line 20, in <module>
    payroll_system.calculate_payroll(employees)
  File "hr.py", line 7, in calculate_payroll
    print(f'- Check amount: {employee.calculate_payroll()}')
  File "employees.py", line 12, in calculate_payroll
    return self.weekly_salary
AttributeError: 'TemporarySecretary' object has no attribute 'weekly_salary'

The problem now is that because you reversed the inheritance order, the MRO is finding the .calculate_payroll() method of SalaryEmployee before the one in HourlyEmployee. You need to override .calculate_payroll() in TemporarySecretary and invoke the right implementation from it:

class TemporarySecretary(Secretary, HourlyEmployee):
    def __init__(self, id, name, hours_worked, hour_rate):
        HourlyEmployee.__init__(self, id, name, hours_worked, hour_rate)

    def calculate_payroll(self):
        return HourlyEmployee.calculate_payroll(self)

The calculate_payroll() method directly invokes HourlyEmployee.calculate_payroll() to ensure that you get the correct result. You can run the program again to see it working:

$ python program.py

Tracking Employee Productivity
==============================
Mary Poppins screams and yells for 40 hours.
John Smith expends 40 hours doing office paperwork.
Kevin Bacon expends 40 hours on the phone.
Jane Doe manufactures gadgets for 40 hours.
Robin Williams expends 40 hours doing office paperwork.

Calculating Payroll
===================
Payroll for: 1 - Mary Poppins
- Check amount: 3000

Payroll for: 2 - John Smith
- Check amount: 1500

Payroll for: 3 - Kevin Bacon
- Check amount: 1250

Payroll for: 4 - Jane Doe
- Check amount: 600

Payroll for: 5 - Robin Williams
- Check amount: 360

The program now works as expected because you’re forcing the method resolution order by explicitly telling the interpreter which method you want to use.

As you can see, multiple inheritance can be confusing, especially when you run into the diamond problem.

The following diagram shows the diamond problem in your class hierarchy:

Diamond problem caused by multiple inheritance

The diagram shows the diamond problem with the current class design. TemporarySecretary uses multiple inheritance to derive from two classes that ultimately also derive from Employee. This causes two paths to reach the Employee base class, which is something you want to avoid in your designs.

The diamond problem appears when you’re using multiple inheritance and deriving from two classes that have a common base class. This can cause the wrong version of a method to be called.

As you’ve seen, Python provides a way to force the right method to be invoked, and analyzing the MRO can help you understand the problem.
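
The classic minimal form of the diamond, as a quick standalone sketch (classes named A through D for illustration):

class A:
    def ping(self):
        return 'A'

class B(A):
    def ping(self):
        return 'B'

class C(A):
    def ping(self):
        return 'C'

class D(B, C):
    pass

# Python linearizes the diamond with the C3 algorithm,
# so there is exactly one path through the hierarchy:
print(D.__mro__)   # D -> B -> C -> A -> object
print(D().ping())  # 'B': the leftmost base class wins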

Still, when you run into the diamond problem, it’s better to re-think the design. You will now make some changes to leverage multiple inheritance, avoiding the diamond problem.

The Employee derived classes are used by two different systems:

  1. The productivity system that tracks employee productivity.

  2. The payroll system that calculates the employee payroll.

This means that everything related to productivity should be together in one module and everything related to payroll should be together in another. You can start making changes to the productivity module:

# In productivity.py

class ProductivitySystem:
    def track(self, employees, hours):
        print('Tracking Employee Productivity')
        print('==============================')
        for employee in employees:
            result = employee.work(hours)
            print(f'{employee.name}: {result}')
        print('')

class ManagerRole:
    def work(self, hours):
        return f'screams and yells for {hours} hours.'

class SecretaryRole:
    def work(self, hours):
        return f'expends {hours} hours doing office paperwork.'

class SalesRole:
    def work(self, hours):
        return f'expends {hours} hours on the phone.'

class FactoryRole:
    def work(self, hours):
        return f'manufactures gadgets for {hours} hours.'

The productivity module implements the ProductivitySystem class, as well as the related roles it supports. The classes implement the work() interface required by the system, but they don’t derive from Employee.

You can do the same with the hr module:

# In hr.py

class PayrollSystem:
    def calculate_payroll(self, employees):
        print('Calculating Payroll')
        print('===================')
        for employee in employees:
            print(f'Payroll for: {employee.id} - {employee.name}')
            print(f'- Check amount: {employee.calculate_payroll()}')
            print('')

class SalaryPolicy:
    def __init__(self, weekly_salary):
        self.weekly_salary = weekly_salary

    def calculate_payroll(self):
        return self.weekly_salary

class HourlyPolicy:
    def __init__(self, hours_worked, hour_rate):
        self.hours_worked = hours_worked
        self.hour_rate = hour_rate

    def calculate_payroll(self):
        return self.hours_worked * self.hour_rate

class CommissionPolicy(SalaryPolicy):
    def __init__(self, weekly_salary, commission):
        super().__init__(weekly_salary)
        self.commission = commission

    def calculate_payroll(self):
        fixed = super().calculate_payroll()
        return fixed + self.commission

The hr module implements the PayrollSystem, which calculates payroll for the employees. It also implements the policy classes for payroll. As you can see, the policy classes don’t derive from Employee anymore.

You can now add the necessary classes to the employee module:

# In employees.py

from hr import (
    SalaryPolicy,
    CommissionPolicy,
    HourlyPolicy
)
from productivity import (
    ManagerRole,
    SecretaryRole,
    SalesRole,
    FactoryRole
)

class Employee:
    def __init__(self, id, name):
        self.id = id
        self.name = name

class Manager(Employee, ManagerRole, SalaryPolicy):
    def __init__(self, id, name, weekly_salary):
        SalaryPolicy.__init__(self, weekly_salary)
        super().__init__(id, name)

class Secretary(Employee, SecretaryRole, SalaryPolicy):
    def __init__(self, id, name, weekly_salary):
        SalaryPolicy.__init__(self, weekly_salary)
        super().__init__(id, name)

class SalesPerson(Employee, SalesRole, CommissionPolicy):
    def __init__(self, id, name, weekly_salary, commission):
        CommissionPolicy.__init__(self, weekly_salary, commission)
        super().__init__(id, name)

class FactoryWorker(Employee, FactoryRole, HourlyPolicy):
    def __init__(self, id, name, hours_worked, hour_rate):
        HourlyPolicy.__init__(self, hours_worked, hour_rate)
        super().__init__(id, name)

class TemporarySecretary(Employee, SecretaryRole, HourlyPolicy):
    def __init__(self, id, name, hours_worked, hour_rate):
        HourlyPolicy.__init__(self, hours_worked, hour_rate)
        super().__init__(id, name)

The employees module imports policies and roles from the other modules and implements the different Employee types. You are still using multiple inheritance to inherit the implementation of the salary policy classes and the productivity roles, but the implementation of each class only needs to deal with initialization.

Notice that you still need to explicitly initialize the salary policies in the constructors. You probably saw that the initializations of Manager and Secretary are identical. Also, the initializations of FactoryWorker and TemporarySecretary are the same.

You will not want to have this kind of code duplication in more complex designs, so you have to be careful when designing class hierarchies.

Here’s the UML diagram for the new design:

Policy based design using multiple inheritance

The diagram shows the relationships to define the Secretary and TemporarySecretary using multiple inheritance, but avoiding the diamond problem.

You can run the program and see how it works:

$ python program.py

Tracking Employee Productivity
==============================
Mary Poppins: screams and yells for 40 hours.
John Smith: expends 40 hours doing office paperwork.
Kevin Bacon: expends 40 hours on the phone.
Jane Doe: manufactures gadgets for 40 hours.
Robin Williams: expends 40 hours doing office paperwork.

Calculating Payroll
===================
Payroll for: 1 - Mary Poppins
- Check amount: 3000

Payroll for: 2 - John Smith
- Check amount: 1500

Payroll for: 3 - Kevin Bacon
- Check amount: 1250

Payroll for: 4 - Jane Doe
- Check amount: 600

Payroll for: 5 - Robin Williams
- Check amount: 360

You’ve seen how inheritance and multiple inheritance work in Python. You can now explore the topic of composition.

Composition in Python

Composition is an object oriented design concept that models a has a relationship. In composition, a class known as composite contains an object of another class known as component. In other words, a composite class has a component of another class.

Composition allows composite classes to reuse the implementation of the components it contains. The composite class doesn’t inherit the component class interface, but it can leverage its implementation.

The composition relation between two classes is considered loosely coupled. That means that changes to the component class rarely affect the composite class, and changes to the composite class never affect the component class.

This provides better adaptability to change and allows applications to introduce new requirements without affecting existing code.

When looking at two competing software designs, one based on inheritance and another based on composition, the composition solution is usually the more flexible of the two. You can now look at how composition works.

You’ve already used composition in our examples. If you look at the Employee class, you’ll see that it contains two attributes:

  1. id to identify an employee.
  2. name to contain the name of the employee.

These two attributes are objects that the Employee class has. Therefore, you can say that an Employee has an id and has a name.

Another attribute for an Employee might be an Address:

# In contacts.py

class Address:
    def __init__(self, street, city, state, zipcode, street2=''):
        self.street = street
        self.street2 = street2
        self.city = city
        self.state = state
        self.zipcode = zipcode

    def __str__(self):
        lines = [self.street]
        if self.street2:
            lines.append(self.street2)
        lines.append(f'{self.city}, {self.state} {self.zipcode}')
        return '\n'.join(lines)

You implemented a basic address class that contains the usual components for an address. You made the street2 attribute optional because not all addresses will have that component.

You implemented __str__() to provide a pretty representation of an Address. You can see this implementation in the interactive interpreter:

>>> from contacts import Address
>>> address = Address('55 Main St.', 'Concord', 'NH', '03301')
>>> print(address)
55 Main St.
Concord, NH 03301

When you print() the address variable, the special method __str__() is invoked. Since you overloaded the method to return a string formatted as an address, you get a nice, readable representation. Operator and Function Overloading in Custom Python Classes gives a good overview of the special methods available in classes that can be implemented to customize the behavior of your objects.

You can now add the Address to the Employee class through composition:

# In employees.py

class Employee:
    def __init__(self, id, name):
        self.id = id
        self.name = name
        self.address = None

You initialize the address attribute to None for now to make it optional. Doing so also lets you assign an Address to an Employee later. Notice that there is no reference in the employees module to the contacts module.

Composition is a loosely coupled relationship that often doesn’t require the composite class to have knowledge of the component.

The UML diagram representing the relationship between Employee and Address looks like this:

Composition example with Employee containing Address

The diagram shows the basic composition relationship between Employee and Address.

You can now modify the PayrollSystem class to leverage the address attribute in Employee:

# In hr.py

class PayrollSystem:
    def calculate_payroll(self, employees):
        print('Calculating Payroll')
        print('===================')
        for employee in employees:
            print(f'Payroll for: {employee.id} - {employee.name}')
            print(f'- Check amount: {employee.calculate_payroll()}')
            if employee.address:
                print('- Sent to:')
                print(employee.address)
            print('')

You check to see if the employee object has an address, and if it does, you print it. You can now modify the program to assign some addresses to the employees:

# In program.py

import hr
import employees
import productivity
import contacts

manager = employees.Manager(1, 'Mary Poppins', 3000)
manager.address = contacts.Address(
    '121 Admin Rd',
    'Concord',
    'NH',
    '03301'
)
secretary = employees.Secretary(2, 'John Smith', 1500)
secretary.address = contacts.Address(
    '67 Paperwork Ave.',
    'Manchester',
    'NH',
    '03101'
)
sales_guy = employees.SalesPerson(3, 'Kevin Bacon', 1000, 250)
factory_worker = employees.FactoryWorker(4, 'Jane Doe', 40, 15)
temporary_secretary = employees.TemporarySecretary(5, 'Robin Williams', 40, 9)
employees = [
    manager,
    secretary,
    sales_guy,
    factory_worker,
    temporary_secretary,
]
productivity_system = productivity.ProductivitySystem()
productivity_system.track(employees, 40)
payroll_system = hr.PayrollSystem()
payroll_system.calculate_payroll(employees)

You added a couple of addresses to the manager and secretary objects. When you run the program, you will see the addresses printed:

$ python program.py

Tracking Employee Productivity
==============================
Mary Poppins: screams and yells for 40 hours.
John Smith: expends 40 hours doing office paperwork.
Kevin Bacon: expends 40 hours on the phone.
Jane Doe: manufactures gadgets for 40 hours.
Robin Williams: expends 40 hours doing office paperwork.

Calculating Payroll
===================
Payroll for: 1 - Mary Poppins
- Check amount: 3000
- Sent to:
121 Admin Rd
Concord, NH 03301

Payroll for: 2 - John Smith
- Check amount: 1500
- Sent to:
67 Paperwork Ave.
Manchester, NH 03101

Payroll for: 3 - Kevin Bacon
- Check amount: 1250

Payroll for: 4 - Jane Doe
- Check amount: 600

Payroll for: 5 - Robin Williams
- Check amount: 360

Notice how the payroll output for the manager and secretary objects show the addresses where the checks were sent.

The Employee class leverages the implementation of the Address class without any knowledge of what an Address object is or how it’s represented. This type of design is so flexible that you can change the Address class without any impact to the Employee class.

Flexible Designs With Composition

Composition is more flexible than inheritance because it models a loosely coupled relationship. Changes to a component class have minimal or no effects on the composite class. Designs based on composition are more suitable to change.

You change behavior by providing new components that implement those behaviors instead of adding new classes to your hierarchy.

Take a look at the multiple inheritance example above. Imagine how new payroll policies will affect the design. Try to picture what the class hierarchy will look like if new roles are needed. As you saw before, relying too heavily on inheritance can lead to class explosion.

The biggest problem is not so much the number of classes in your design, but how tightly coupled the relationships between those classes are. Tightly coupled classes affect each other when changes are introduced.

In this section, you are going to use composition to implement a better design that still fits the requirements of the PayrollSystem and the ProductivitySystem.

You can start by implementing the functionality of the ProductivitySystem:

# In productivity.py

class ProductivitySystem:
    def __init__(self):
        self._roles = {
            'manager': ManagerRole,
            'secretary': SecretaryRole,
            'sales': SalesRole,
            'factory': FactoryRole,
        }

    def get_role(self, role_id):
        role_type = self._roles.get(role_id)
        if not role_type:
            raise ValueError('role_id')
        return role_type()

    def track(self, employees, hours):
        print('Tracking Employee Productivity')
        print('==============================')
        for employee in employees:
            employee.work(hours)
        print('')

The ProductivitySystem class defines some roles using a string identifier mapped to a role class that implements the role. It exposes a .get_role() method that, given a role identifier, returns the role type object. If the role is not found, then a ValueError exception is raised.

It also exposes the previous functionality in the .track() method: given a list of employees, it tracks the productivity of those employees.

You can now implement the different role classes:

# In productivity.py

class ManagerRole:
    def perform_duties(self, hours):
        return f'screams and yells for {hours} hours.'

class SecretaryRole:
    def perform_duties(self, hours):
        return f'does paperwork for {hours} hours.'

class SalesRole:
    def perform_duties(self, hours):
        return f'expends {hours} hours on the phone.'

class FactoryRole:
    def perform_duties(self, hours):
        return f'manufactures gadgets for {hours} hours.'

Each of the roles you implemented exposes a .perform_duties() method that takes the number of hours worked and returns a string representing the duties.

The role classes are independent of each other, but they expose the same interface, so they are interchangeable. You’ll see later how they are used in the application.
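For instance, here is a minimal sketch of that interchangeability, assuming the role classes above live in productivity.py: the same loop can drive any role through the shared interface.

from productivity import ManagerRole, SecretaryRole, SalesRole, FactoryRole

# Each object is a different class, but the caller relies only on the
# shared .perform_duties() interface, so any role can be swapped in.
for role in (ManagerRole(), SecretaryRole(), SalesRole(), FactoryRole()):
    print(role.perform_duties(40))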

Now, you can implement the PayrollSystem for the application:

# In hr.py

class PayrollSystem:
    def __init__(self):
        self._employee_policies = {
            1: SalaryPolicy(3000),
            2: SalaryPolicy(1500),
            3: CommissionPolicy(1000, 100),
            4: HourlyPolicy(15),
            5: HourlyPolicy(9)
        }

    def get_policy(self, employee_id):
        policy = self._employee_policies.get(employee_id)
        if not policy:
            raise ValueError(employee_id)
        return policy

    def calculate_payroll(self, employees):
        print('Calculating Payroll')
        print('===================')
        for employee in employees:
            print(f'Payroll for: {employee.id} - {employee.name}')
            print(f'- Check amount: {employee.calculate_payroll()}')
            if employee.address:
                print('- Sent to:')
                print(employee.address)
            print('')

The PayrollSystem keeps an internal database of payroll policies for each employee. It exposes a .get_policy() method that, given an employee id, returns the corresponding payroll policy. If the specified id doesn’t exist in the system, then the method raises a ValueError exception.

The implementation of .calculate_payroll() works the same as before. It takes a list of employees, calculates the payroll, and prints the results.

You can now implement the payroll policy classes:

# In hr.py

class PayrollPolicy:
    def __init__(self):
        self.hours_worked = 0

    def track_work(self, hours):
        self.hours_worked += hours

class SalaryPolicy(PayrollPolicy):
    def __init__(self, weekly_salary):
        super().__init__()
        self.weekly_salary = weekly_salary

    def calculate_payroll(self):
        return self.weekly_salary

class HourlyPolicy(PayrollPolicy):
    def __init__(self, hour_rate):
        super().__init__()
        self.hour_rate = hour_rate

    def calculate_payroll(self):
        return self.hours_worked * self.hour_rate

class CommissionPolicy(SalaryPolicy):
    def __init__(self, weekly_salary, commission_per_sale):
        super().__init__(weekly_salary)
        self.commission_per_sale = commission_per_sale

    @property
    def commission(self):
        sales = self.hours_worked / 5
        return sales * self.commission_per_sale

    def calculate_payroll(self):
        fixed = super().calculate_payroll()
        return fixed + self.commission

You first implement a PayrollPolicy class that serves as a base class for all the payroll policies. This class tracks the hours_worked, which is common to all payroll policies.

The other policy classes derive from PayrollPolicy. We use inheritance here because we want to leverage the implementation of PayrollPolicy. Also, SalaryPolicy, HourlyPolicy, and CommissionPolicy are a PayrollPolicy.

SalaryPolicy is initialized with a weekly_salary value that is then used in .calculate_payroll(). HourlyPolicy is initialized with the hour_rate, and implements .calculate_payroll() by leveraging the base class hours_worked.

The CommissionPolicy class derives from SalaryPolicy because it wants to inherit its implementation. It is initialized with the weekly_salary parameters, but it also requires a commission_per_sale parameter.

The commission_per_sale is used to calculate the .commission, which is implemented as a property so it gets calculated when requested. In the example, we are assuming that a sale happens every 5 hours worked, and the .commission is the number of sales times the commission_per_sale value.

CommissionPolicy implements the .calculate_payroll() method by first leveraging the implementation in SalaryPolicy and then adding the calculated commission.
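As a quick sanity check, here is the commission arithmetic sketched with the classes defined above; the numbers match the check amount you’ll see for the sales employee later:

from hr import CommissionPolicy

policy = CommissionPolicy(1000, 100)  # weekly_salary=1000, commission_per_sale=100
policy.track_work(40)                 # 40 hours worked -> 40 / 5 = 8 sales
assert policy.commission == 800       # 8 sales * 100 per sale
assert policy.calculate_payroll() == 1800  # 1000 fixed + 800 commission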

You can now add an AddressBook class to manage employee addresses:

# In contacts.py

class AddressBook:
    def __init__(self):
        self._employee_addresses = {
            1: Address('121 Admin Rd.', 'Concord', 'NH', '03301'),
            2: Address('67 Paperwork Ave', 'Manchester', 'NH', '03101'),
            3: Address('15 Rose St', 'Concord', 'NH', '03301', 'Apt. B-1'),
            4: Address('39 Sole St.', 'Concord', 'NH', '03301'),
            5: Address('99 Mountain Rd.', 'Concord', 'NH', '03301'),
        }

    def get_employee_address(self, employee_id):
        address = self._employee_addresses.get(employee_id)
        if not address:
            raise ValueError(employee_id)
        return address

The AddressBook class keeps an internal database of Address objects for each employee. It exposes a get_employee_address() method that returns the address of the specified employee id. If the employee id doesn’t exist, then it raises a ValueError.

The Address class implementation remains the same as before:

# In contacts.py

class Address:
    def __init__(self, street, city, state, zipcode, street2=''):
        self.street = street
        self.street2 = street2
        self.city = city
        self.state = state
        self.zipcode = zipcode

    def __str__(self):
        lines = [self.street]
        if self.street2:
            lines.append(self.street2)
        lines.append(f'{self.city}, {self.state} {self.zipcode}')
        return '\n'.join(lines)

The class manages the address components and provides a pretty representation of an address.

So far, the new classes have been extended to support more functionality, but there are no significant changes to the previous design. This is going to change with the design of the employees module and its classes.

You can start by implementing an EmployeeDatabase class:

# In employees.py

from productivity import ProductivitySystem
from hr import PayrollSystem
from contacts import AddressBook

class EmployeeDatabase:
    def __init__(self):
        self._employees = [
            {'id': 1, 'name': 'Mary Poppins', 'role': 'manager'},
            {'id': 2, 'name': 'John Smith', 'role': 'secretary'},
            {'id': 3, 'name': 'Kevin Bacon', 'role': 'sales'},
            {'id': 4, 'name': 'Jane Doe', 'role': 'factory'},
            {'id': 5, 'name': 'Robin Williams', 'role': 'secretary'},
        ]
        self.productivity = ProductivitySystem()
        self.payroll = PayrollSystem()
        self.employee_addresses = AddressBook()

    @property
    def employees(self):
        return [self._create_employee(**data) for data in self._employees]

    def _create_employee(self, id, name, role):
        address = self.employee_addresses.get_employee_address(id)
        employee_role = self.productivity.get_role(role)
        payroll_policy = self.payroll.get_policy(id)
        return Employee(id, name, address, employee_role, payroll_policy)

The EmployeeDatabase keeps track of all the employees in the company. For each employee, it tracks the id, name, and role. It has an instance of the ProductivitySystem, the PayrollSystem, and the AddressBook. These instances are used to create employees.

It exposes an .employees property that returns the list of employees. The Employee objects are created in an internal method ._create_employee(). Notice that you don’t have different types of Employee classes. You just need to implement a single Employee class:

# In employees.py

class Employee:
    def __init__(self, id, name, address, role, payroll):
        self.id = id
        self.name = name
        self.address = address
        self.role = role
        self.payroll = payroll

    def work(self, hours):
        duties = self.role.perform_duties(hours)
        print(f'Employee {self.id} - {self.name}:')
        print(f'- {duties}')
        print('')
        self.payroll.track_work(hours)

    def calculate_payroll(self):
        return self.payroll.calculate_payroll()

The Employee class is initialized with the id, name, and address attributes. It also requires the productivity role for the employee and the payroll policy.

The class exposes a .work() method that takes the hours worked. This method first retrieves the duties from the role. In other words, it delegates to the role object to perform its duties.

In the same way, it delegates to the payroll object to track the work hours. The payroll, as you saw, uses those hours to calculate the payroll if needed.

The following diagram shows the composition design used:

Policy based design using composition

The diagram shows the design of composition based policies. There is a single Employee that is composed of other data objects like Address and depends on the IRole and IPayrollCalculator interfaces to delegate the work. There are multiple implementations of these interfaces.

You can now use this design in your program:

# In program.py

from hr import PayrollSystem
from productivity import ProductivitySystem
from employees import EmployeeDatabase

productivity_system = ProductivitySystem()
payroll_system = PayrollSystem()
employee_database = EmployeeDatabase()
employees = employee_database.employees
productivity_system.track(employees, 40)
payroll_system.calculate_payroll(employees)

You can run the program to see its output:

$ python program.py

Tracking Employee Productivity
==============================
Employee 1 - Mary Poppins:
- screams and yells for 40 hours.

Employee 2 - John Smith:
- does paperwork for 40 hours.

Employee 3 - Kevin Bacon:
- expends 40 hours on the phone.

Employee 4 - Jane Doe:
- manufactures gadgets for 40 hours.

Employee 5 - Robin Williams:
- does paperwork for 40 hours.

Calculating Payroll
===================
Payroll for: 1 - Mary Poppins
- Check amount: 3000
- Sent to:
121 Admin Rd.
Concord, NH 03301

Payroll for: 2 - John Smith
- Check amount: 1500
- Sent to:
67 Paperwork Ave
Manchester, NH 03101

Payroll for: 3 - Kevin Bacon
- Check amount: 1800.0
- Sent to:
15 Rose St
Apt. B-1
Concord, NH 03301

Payroll for: 4 - Jane Doe
- Check amount: 600
- Sent to:
39 Sole St.
Concord, NH 03301

Payroll for: 5 - Robin Williams
- Check amount: 360
- Sent to:
99 Mountain Rd.
Concord, NH 03301

This design is what is called policy-based design, where classes are composed of policies, and they delegate to those policies to do the work.

Policy-based design was introduced in the book Modern C++ Design, and it uses template metaprogramming in C++ to achieve the results.

Python does not support templates, but you can achieve similar results using composition, as you saw in the example above.

This type of design gives you all the flexibility you’ll need as requirements change. Imagine you need to change the way payroll is calculated for an object at run-time.

Customizing Behavior With Composition

If your design relies on inheritance, you need to find a way to change the type of an object to change its behavior. With composition, you just need to change the policy the object uses.

Imagine that our manager all of a sudden becomes a temporary employee that gets paid by the hour. You can modify the object during the execution of the program in the following way:

# In program.py

from hr import PayrollSystem, HourlyPolicy
from productivity import ProductivitySystem
from employees import EmployeeDatabase

productivity_system = ProductivitySystem()
payroll_system = PayrollSystem()
employee_database = EmployeeDatabase()
employees = employee_database.employees

manager = employees[0]
manager.payroll = HourlyPolicy(55)

productivity_system.track(employees, 40)
payroll_system.calculate_payroll(employees)

The program gets the employee list from the EmployeeDatabase and retrieves the first employee, which is the manager we want. Then it creates a new HourlyPolicy initialized at $55 per hour and assigns it to the manager object.

The new policy is now used by the PayrollSystem, modifying the existing behavior. You can run the program again to see the result:

$ python program.py

Tracking Employee Productivity
==============================
Employee 1 - Mary Poppins:
- screams and yells for 40 hours.

Employee 2 - John Smith:
- does paperwork for 40 hours.

Employee 3 - Kevin Bacon:
- expends 40 hours on the phone.

Employee 4 - Jane Doe:
- manufactures gadgets for 40 hours.

Employee 5 - Robin Williams:
- does paperwork for 40 hours.

Calculating Payroll
===================
Payroll for: 1 - Mary Poppins
- Check amount: 2200
- Sent to:
121 Admin Rd.
Concord, NH 03301

Payroll for: 2 - John Smith
- Check amount: 1500
- Sent to:
67 Paperwork Ave
Manchester, NH 03101

Payroll for: 3 - Kevin Bacon
- Check amount: 1800.0
- Sent to:
15 Rose St
Apt. B-1
Concord, NH 03301

Payroll for: 4 - Jane Doe
- Check amount: 600
- Sent to:
39 Sole St.
Concord, NH 03301

Payroll for: 5 - Robin Williams
- Check amount: 360
- Sent to:
99 Mountain Rd.
Concord, NH 03301

The check for Mary Poppins, our manager, is now for $2200 instead of the fixed salary of $3000 that she had per week.

Notice how we added that business rule to the program without changing any of the existing classes. Consider what type of changes would’ve been required with an inheritance design.

You would’ve had to create a new class and change the type of the manager employee. There is no chance you could’ve changed the policy at run-time.

Choosing Between Inheritance and Composition in Python

So far, you’ve seen how inheritance and composition work in Python. You’ve seen that derived classes inherit the interface and implementation of their base classes. You’ve also seen that composition allows you to reuse the implementation of another class.

You’ve implemented two solutions to the same problem. The first solution used multiple inheritance, and the second one used composition.

You’ve also seen that Python’s duck typing allows you to reuse objects with existing parts of a program by implementing the desired interface. In Python, it isn’t necessary to derive from a base class for your classes to be reused.
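As a minimal sketch of this, consider a hypothetical Contractor class that is not part of the example modules: it derives from nothing in particular, yet the PayrollSystem can process it, because it exposes the attributes and method the system relies on.

# Hypothetical class for illustration only: it doesn't derive from
# Employee, but it quacks like one for the PayrollSystem's purposes.
class Contractor:
    def __init__(self, id, name, hours_worked, hour_rate):
        self.id = id
        self.name = name
        self.address = None  # the PayrollSystem checks this attribute
        self._hours_worked = hours_worked
        self._hour_rate = hour_rate

    def calculate_payroll(self):
        return self._hours_worked * self._hour_rate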

At this point, you might be asking when to use inheritance vs composition in Python. They both enable code reuse. Inheritance and composition can tackle similar problems in your Python programs.

The general advice is to use the relationship that creates fewer dependencies between two classes. This relation is composition. Still, there will be times when inheritance makes more sense.

The following sections provide some guidelines to help you make the right choice between inheritance and composition in Python.

Inheritance to Model “Is A” Relationship

Inheritance should only be used to model an is a relationship. Liskov’s substitution principle says that an object of type Derived, which inherits from Base, can replace an object of type Base without altering the desirable properties of a program.

Liskov’s substitution principle is the most important guideline to determine if inheritance is the appropriate design solution. Still, the answer might not be straightforward in all situations. Fortunately, there is a simple test you can use to determine if your design follows Liskov’s substitution principle.

Let’s say you have a class A that provides an implementation and interface you want to reuse in another class B. Your initial thought is that you can derive B from A and inherit both the interface and implementation. To be sure this is the right design, you follow these steps:

  1. Evaluate B is an A: Think about this relationship and justify it. Does it make sense?

  2. Evaluate A is a B: Reverse the relationship and justify it. Does it also make sense?

If you can justify both relationships, then you should never inherit those classes from one another. Let’s look at a more concrete example.

You have a class Rectangle which exposes an .area property. You need a class Square, which also has an .area. It seems that a Square is a special type of Rectangle, so maybe you can derive from it and leverage both the interface and implementation.

Before you jump into the implementation, you use Liskov’s substitution principle to evaluate the relationship.

A Square is a Rectangle because its area is calculated from the product of its height times its length. The constraint is that Square.height and Square.length must be equal.

It makes sense. You can justify the relationship and explain why a Square is a Rectangle. Let’s reverse the relationship to see if it makes sense.

A Rectangle is a Square because its area is calculated from the product of its height times its length. The difference is that Rectangle.height and Rectangle.length can change independently.

It also makes sense. You can justify the relationship and describe the special constraints for each class. This is a good sign that these two classes should never derive from each other.

You might have seen other examples that derive Square from Rectangle to explain inheritance. You might be skeptical about the little test you just did. Fair enough. Let’s write a program that illustrates the problem with deriving Square from Rectangle.

First, you implement Rectangle. You’re even going to encapsulate the attributes to ensure that all the constraints are met:

# In rectangle_square_demo.py

class Rectangle:
    def __init__(self, length, height):
        self._length = length
        self._height = height

    @property
    def area(self):
        return self._length * self._height

The Rectangle class is initialized with a length and a height, and it provides an .area property that returns the area. The length and height are encapsulated to avoid changing them directly.

Now, you derive Square from Rectangle and override the necessary interface to meet the constraints of a Square:

# In rectangle_square_demo.py

class Square(Rectangle):
    def __init__(self, side_size):
        super().__init__(side_size, side_size)

The Square class is initialized with a side_size, which is used to initialize both components of the base class. Now, you write a small program to test the behavior:

# In rectangle_square_demo.py

rectangle = Rectangle(2, 4)
assert rectangle.area == 8

square = Square(2)
assert square.area == 4

print('OK!')

The program creates a Rectangle and a Square and asserts that their .area is calculated correctly. You can run the program and see that everything is OK so far:

$ python rectangle_square_demo.py

OK!

The program executes correctly, so it seems that Square is just a special case of a Rectangle.

Later on, you need to support resizing Rectangle objects, so you make the appropriate changes to the class:

# In rectangle_square_demo.py

class Rectangle:
    def __init__(self, length, height):
        self._length = length
        self._height = height

    @property
    def area(self):
        return self._length * self._height

    def resize(self, new_length, new_height):
        self._length = new_length
        self._height = new_height

.resize() takes the new_length and new_height for the object. You can add the following code to the program to verify that it works correctly:

# In rectangle_square_demo.py

rectangle.resize(3, 5)
assert rectangle.area == 15

print('OK!')

You resize the rectangle object and assert that the new area is correct. You can run the program to verify the behavior:

$ python rectangle_square_demo.py

OK!

The assertion passes, and you see that the program runs correctly.

So, what happens if you resize a square? Modify the program, and try to modify the square object:

# In rectangle_square_demo.py

square.resize(3, 5)
print(f'Square area: {square.area}')

You pass the same parameters to square.resize() that you used with rectangle, and print the area. When you run the program you see:

$ python rectangle_square_demo.py

Square area: 15
OK!

The program shows that the new area is 15, just like the rectangle object’s. The problem now is that the square object no longer meets the Square class constraint that its length and height must be equal.

How can you fix that problem? You can try several approaches, but all of them will be awkward. You can override .resize() in Square and ignore the height parameter, but that will be confusing for anyone looking at other parts of the program where rectangles are being resized, and some of them are not getting the expected areas because they are really squares.
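For illustration, here is a sketch of that first work-around; it preserves the square invariant but silently breaks the contract callers expect from Rectangle.resize():

# In rectangle_square_demo.py (a possible, awkward work-around)

class Square(Rectangle):
    def __init__(self, side_size):
        super().__init__(side_size, side_size)

    def resize(self, new_length, new_height):
        # new_height is silently ignored to keep both sides equal,
        # violating the expectations set by Rectangle.resize()
        super().resize(new_length, new_length)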

In a small program like this one, it might be easy to spot the causes of the weird behavior, but in a more complex program, the problem will be harder to find.

The reality is that if you’re able to justify an inheritance relationship between two classes both ways, you should not derive one class from another.

In the example, it doesn’t make sense that Square inherits the interface and implementation of .resize() from Rectangle. That doesn’t mean that Square objects can’t be resized. It means that the interface is different because it only needs a side_size parameter.

This difference in interface justifies not deriving Square from Rectangle like the test above advised.

Mixing Features With Mixin Classes

One of the uses of multiple inheritance in Python is to extend a class’s features through mixins. A mixin is a class that provides methods to other classes but is not considered a base class itself.

A mixin allows other classes to reuse its interface and implementation without becoming a super class. Mixins implement a unique behavior that can be aggregated with other unrelated classes. They are similar to composition, but they create a stronger relationship.

Let’s say you want to convert objects of certain types in your application to a dictionary representation of the object. You could provide a .to_dict() method in every class that you want to support this feature, but the implementation of .to_dict() seems to be very similar.

This could be a good candidate for a mixin. You start by slightly modifying the Employee class from the composition example:

# In employees.py

class Employee:
    def __init__(self, id, name, address, role, payroll):
        self.id = id
        self.name = name
        self.address = address
        self._role = role
        self._payroll = payroll

    def work(self, hours):
        duties = self._role.perform_duties(hours)
        print(f'Employee {self.id} - {self.name}:')
        print(f'- {duties}')
        print('')
        self._payroll.track_work(hours)

    def calculate_payroll(self):
        return self._payroll.calculate_payroll()

The change is very small. You just changed the role and payroll attributes to be internal by adding a leading underscore to their names. You will soon see why you are making that change.

Now, you add the AsDictionaryMixin class:

# In representations.py

class AsDictionaryMixin:
    def to_dict(self):
        return {
            prop: self._represent(value)
            for prop, value in self.__dict__.items()
            if not self._is_internal(prop)
        }

    def _represent(self, value):
        if isinstance(value, object):
            if hasattr(value, 'to_dict'):
                return value.to_dict()
            else:
                return str(value)
        else:
            return value

    def _is_internal(self, prop):
        return prop.startswith('_')

The AsDictionaryMixin class exposes a .to_dict() method that returns the representation of itself as a dictionary. The method is implemented as a dict comprehension that says, “Create a dictionary mapping prop to value for each item in self.__dict__.items() if the prop is not internal.”

Note: This is why we made the role and payroll attributes internal in the Employee class, because we don’t want to represent them in the dictionary.

As you saw at the beginning, every class you create inherits some members from object, and one of those members is __dict__, which is basically a mapping of all the attributes in an object to their values.

You iterate through all the items in __dict__ and filter out the ones that have a name that starts with an underscore using ._is_internal().
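You can see that mapping directly in the interactive interpreter with a throwaway class (Demo here is just for illustration):

>>> class Demo:
...     def __init__(self):
...         self.visible = 1
...         self._hidden = 2
...
>>> Demo().__dict__
{'visible': 1, '_hidden': 2}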

._represent() checks the specified value. If the value is an object, then it looks to see if it also has a .to_dict() member and uses it to represent the object. Otherwise, it returns a string representation. If the value is not an object, then it simply returns the value.

You can modify the Employee class to support this mixin:

# In employees.py

from representations import AsDictionaryMixin

class Employee(AsDictionaryMixin):
    def __init__(self, id, name, address, role, payroll):
        self.id = id
        self.name = name
        self.address = address
        self._role = role
        self._payroll = payroll

    def work(self, hours):
        duties = self._role.perform_duties(hours)
        print(f'Employee {self.id} - {self.name}:')
        print(f'- {duties}')
        print('')
        self._payroll.track_work(hours)

    def calculate_payroll(self):
        return self._payroll.calculate_payroll()

All you have to do is inherit the AsDictionaryMixin to support the functionality. It would be nice to support the same functionality in the Address class, so the Employee.address attribute is represented in the same way:

# In contacts.py

from representations import AsDictionaryMixin

class Address(AsDictionaryMixin):
    def __init__(self, street, city, state, zipcode, street2=''):
        self.street = street
        self.street2 = street2
        self.city = city
        self.state = state
        self.zipcode = zipcode

    def __str__(self):
        lines = [self.street]
        if self.street2:
            lines.append(self.street2)
        lines.append(f'{self.city}, {self.state} {self.zipcode}')
        return '\n'.join(lines)

You apply the mixin to the Address class to support the feature. Now, you can write a small program to test it:

# In program.py

import json
from employees import EmployeeDatabase

def print_dict(d):
    print(json.dumps(d, indent=2))

for employee in EmployeeDatabase().employees:
    print_dict(employee.to_dict())

The program implements a print_dict() that converts the dictionary to a JSON string using indentation so the output looks better.

Then, it iterates through all the employees, printing the dictionary representation provided by .to_dict(). You can run the program to see its output:

$ python program.py

{
  "id": "1",
  "name": "Mary Poppins",
  "address": {
    "street": "121 Admin Rd.",
    "street2": "",
    "city": "Concord",
    "state": "NH",
    "zipcode": "03301"
  }
}
{
  "id": "2",
  "name": "John Smith",
  "address": {
    "street": "67 Paperwork Ave",
    "street2": "",
    "city": "Manchester",
    "state": "NH",
    "zipcode": "03101"
  }
}
{
  "id": "3",
  "name": "Kevin Bacon",
  "address": {
    "street": "15 Rose St",
    "street2": "Apt. B-1",
    "city": "Concord",
    "state": "NH",
    "zipcode": "03301"
  }
}
{
  "id": "4",
  "name": "Jane Doe",
  "address": {
    "street": "39 Sole St.",
    "street2": "",
    "city": "Concord",
    "state": "NH",
    "zipcode": "03301"
  }
}
{
  "id": "5",
  "name": "Robin Williams",
  "address": {
    "street": "99 Mountain Rd.",
    "street2": "",
    "city": "Concord",
    "state": "NH",
    "zipcode": "03301"
  }
}

You leveraged the implementation of AsDictionaryMixin in both the Employee and Address classes even though they are not related. Because AsDictionaryMixin only provides behavior, it is easy to reuse with other classes without causing problems.

Composition to Model “Has A” Relationship

Composition models a has a relationship. With composition, a class Composite has an instance of class Component and can leverage its implementation. The Component class can be reused in other classes completely unrelated to the Composite.

In the composition example above, the Employee class has an Address object. Address implements all the functionality to handle addresses, and it can be reused by other classes.

Other classes like Customer or Vendor can reuse Address without being related to Employee. They can leverage the same implementation ensuring that addresses are handled consistently across the application.
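As a sketch, a hypothetical Customer class (not part of the example application) could compose the very same component:

# In a hypothetical customers.py

from contacts import Address

class Customer:
    def __init__(self, id, name, address):
        self.id = id
        self.name = name
        self.address = address  # a Customer "has an" Address

billing = Customer(1, 'Acme Corp', Address('1 Elm St.', 'Concord', 'NH', '03301'))
print(billing.address)  # reuses the Address.__str__() formatting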

A problem you may run into when using composition is that some of your classes may start growing by using multiple components. Your classes may require multiple parameters in the constructor just to pass in the components they are made of. This can make your classes hard to use.

A way to avoid the problem is by using the Factory Method to construct your objects. You did that with the composition example.

If you look at the implementation of the EmployeeDatabase class, you’ll notice that it uses ._create_employee() to construct an Employee object with the right parameters.

This design will work, but ideally, you should be able to construct an Employee object just by specifying an id, for example employee = Employee(1).

The following changes might improve your design. You can start with the productivity module:

# In productivity.py

class _ProductivitySystem:
    def __init__(self):
        self._roles = {
            'manager': ManagerRole,
            'secretary': SecretaryRole,
            'sales': SalesRole,
            'factory': FactoryRole,
        }

    def get_role(self, role_id):
        role_type = self._roles.get(role_id)
        if not role_type:
            raise ValueError('role_id')
        return role_type()

    def track(self, employees, hours):
        print('Tracking Employee Productivity')
        print('==============================')
        for employee in employees:
            employee.work(hours)
        print('')

# Role classes implementation omitted

_productivity_system = _ProductivitySystem()

def get_role(role_id):
    return _productivity_system.get_role(role_id)

def track(employees, hours):
    _productivity_system.track(employees, hours)

First, you make the _ProductivitySystem class internal, and then provide a _productivity_system internal variable to the module. You are communicating to other developers that they should not create or use the _ProductivitySystem directly. Instead, you provide two functions, get_role() and track(), as the public interface to the module. This is what other modules should use.

What you are saying is that the _ProductivitySystem is a Singleton, and there should only be one object created from it.
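In practice, client code never instantiates _ProductivitySystem; it imports the public functions, which are all backed by that single instance. A minimal usage sketch:

from productivity import get_role, track

# Both calls below are served by the single module-level
# _productivity_system instance.
manager_role = get_role('manager')
print(manager_role.perform_duties(40))  # screams and yells for 40 hours.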

Now, you can do the same with the hr module:

# In hr.py

class _PayrollSystem:
    def __init__(self):
        self._employee_policies = {
            1: SalaryPolicy(3000),
            2: SalaryPolicy(1500),
            3: CommissionPolicy(1000, 100),
            4: HourlyPolicy(15),
            5: HourlyPolicy(9)
        }

    def get_policy(self, employee_id):
        policy = self._employee_policies.get(employee_id)
        if not policy:
            raise ValueError(employee_id)
        return policy

    def calculate_payroll(self, employees):
        print('Calculating Payroll')
        print('===================')
        for employee in employees:
            print(f'Payroll for: {employee.id} - {employee.name}')
            print(f'- Check amount: {employee.calculate_payroll()}')
            if employee.address:
                print('- Sent to:')
                print(employee.address)
            print('')

# Policy classes implementation omitted

_payroll_system = _PayrollSystem()

def get_policy(employee_id):
    return _payroll_system.get_policy(employee_id)

def calculate_payroll(employees):
    _payroll_system.calculate_payroll(employees)

Again, you make the _PayrollSystem internal and provide a public interface to it. The application will use the public interface to get policies and calculate payroll.

You will now do the same with the contacts module:

# In contacts.py

class _AddressBook:
    def __init__(self):
        self._employee_addresses = {
            1: Address('121 Admin Rd.', 'Concord', 'NH', '03301'),
            2: Address('67 Paperwork Ave', 'Manchester', 'NH', '03101'),
            3: Address('15 Rose St', 'Concord', 'NH', '03301', 'Apt. B-1'),
            4: Address('39 Sole St.', 'Concord', 'NH', '03301'),
            5: Address('99 Mountain Rd.', 'Concord', 'NH', '03301'),
        }

    def get_employee_address(self, employee_id):
        address = self._employee_addresses.get(employee_id)
        if not address:
            raise ValueError(employee_id)
        return address

# Implementation of Address class omitted

_address_book = _AddressBook()

def get_employee_address(employee_id):
    return _address_book.get_employee_address(employee_id)

You are basically saying that there should only be one _AddressBook, one _PayrollSystem, and one _ProductivitySystem. Again, this design pattern is called the Singleton design pattern, which comes in handy for classes that should have only a single instance.

Now, you can work on the employees module. You will also make a Singleton out of the _EmployeeDatabase, but you will make some additional changes:

# In employees.py

from productivity import get_role
from hr import get_policy
from contacts import get_employee_address
from representations import AsDictionaryMixin

class _EmployeeDatabase:
    def __init__(self):
        self._employees = {
            1: {'name': 'Mary Poppins', 'role': 'manager'},
            2: {'name': 'John Smith', 'role': 'secretary'},
            3: {'name': 'Kevin Bacon', 'role': 'sales'},
            4: {'name': 'Jane Doe', 'role': 'factory'},
            5: {'name': 'Robin Williams', 'role': 'secretary'}
        }

    @property
    def employees(self):
        return [Employee(id_) for id_ in sorted(self._employees)]

    def get_employee_info(self, employee_id):
        info = self._employees.get(employee_id)
        if not info:
            raise ValueError(employee_id)
        return info

class Employee(AsDictionaryMixin):
    def __init__(self, id):
        self.id = id
        info = employee_database.get_employee_info(self.id)
        self.name = info.get('name')
        self.address = get_employee_address(self.id)
        self._role = get_role(info.get('role'))
        self._payroll = get_policy(self.id)

    def work(self, hours):
        duties = self._role.perform_duties(hours)
        print(f'Employee {self.id} - {self.name}:')
        print(f'- {duties}')
        print('')
        self._payroll.track_work(hours)

    def calculate_payroll(self):
        return self._payroll.calculate_payroll()

employee_database = _EmployeeDatabase()

You first import the relevant functions and classes from other modules. The _EmployeeDatabase is made internal, and at the bottom, you create a single instance. This instance is public and part of the interface because you will want to use it in the application.

You changed the _EmployeeDatabase._employees attribute to be a dictionary where the key is the employee id and the value is the employee information. You also exposed a .get_employee_info() method that returns the information for the specified employee_id.

The _EmployeeDatabase.employees property now sorts the keys to return the employees sorted by their id. You replaced the method that constructed the Employee objects with calls to the Employee initializer directly.

The Employee class now is initialized with the id and uses the public functions exposed in the other modules to initialize its attributes.

You can now change the program to test the changes:

# In program.py

import json

from hr import calculate_payroll
from productivity import track
from employees import employee_database, Employee

def print_dict(d):
    print(json.dumps(d, indent=2))

employees = employee_database.employees

track(employees, 40)
calculate_payroll(employees)

temp_secretary = Employee(5)
print('Temporary Secretary:')
print_dict(temp_secretary.to_dict())

You import the relevant functions from the hr and productivity modules, as well as the employee_database and Employee class. The program is cleaner because you exposed the required interface and encapsulated how objects are accessed.

Notice that you can now create an Employee object directly just using its id. You can run the program to see its output:

$ python program.py

Tracking Employee Productivity
==============================
Employee 1 - Mary Poppins:
- screams and yells for 40 hours.

Employee 2 - John Smith:
- does paperwork for 40 hours.

Employee 3 - Kevin Bacon:
- expends 40 hours on the phone.

Employee 4 - Jane Doe:
- manufactures gadgets for 40 hours.

Employee 5 - Robin Williams:
- does paperwork for 40 hours.

Calculating Payroll
===================
Payroll for: 1 - Mary Poppins
- Check amount: 3000
- Sent to:
121 Admin Rd.
Concord, NH 03301

Payroll for: 2 - John Smith
- Check amount: 1500
- Sent to:
67 Paperwork Ave
Manchester, NH 03101

Payroll for: 3 - Kevin Bacon
- Check amount: 1800.0
- Sent to:
15 Rose St
Apt. B-1
Concord, NH 03301

Payroll for: 4 - Jane Doe
- Check amount: 600
- Sent to:
39 Sole St.
Concord, NH 03301

Payroll for: 5 - Robin Williams
- Check amount: 360
- Sent to:
99 Mountain Rd.
Concord, NH 03301

Temporary Secretary:
{
  "id": "5",
  "name": "Robin Williams",
  "address": {
    "street": "99 Mountain Rd.",
    "street2": "",
    "city": "Concord",
    "state": "NH",
    "zipcode": "03301"
  }
}

The program works the same as before, but now you can see that a single Employee object can be created from its id and display its dictionary representation.

Take a closer look at the Employee class:

# In employees.py

class Employee(AsDictionaryMixin):
    def __init__(self, id):
        self.id = id
        info = employee_database.get_employee_info(self.id)
        self.name = info.get('name')
        self.address = get_employee_address(self.id)
        self._role = get_role(info.get('role'))
        self._payroll = get_policy(self.id)

    def work(self, hours):
        duties = self._role.perform_duties(hours)
        print(f'Employee {self.id} - {self.name}:')
        print(f'- {duties}')
        print('')
        self._payroll.track_work(hours)

    def calculate_payroll(self):
        return self._payroll.calculate_payroll()

The Employee class is a composite that contains multiple objects providing different functionality. It contains an Address that implements all the functionality related to where the employee lives.

Employee also contains a productivity role provided by the productivity module, and a payroll policy provided by the hr module. These two objects provide implementations that are leveraged by the Employee class to track work in the .work() method and to calculate the payroll in the .calculate_payroll() method.

You are using composition in two different ways. The Address class provides additional data to Employee, whereas the role and payroll objects provide additional behavior.

Still, the relationship between Employee and those objects is loosely coupled, which provides some interesting capabilities that you’ll see in the next section.

Composition to Change Run-Time Behavior

Inheritance, as opposed to composition, is a tightly coupled relationship. With inheritance, there is only one way to change and customize behavior: method overriding. This creates rigid designs that are difficult to change.

Composition, on the other hand, provides a loosely coupled relationship that enables flexible designs and can be used to change behavior at run-time.

Imagine you need to support a long-term disability (LTD) policy when calculating payroll. The policy states that an employee on LTD should be paid 60% of their weekly salary assuming 40 hours of work.

With an inheritance design, this can be a very difficult requirement to support. Adding it to the composition example is a lot easier. Let’s start by adding the policy class:

# In hr.py

class LTDPolicy:
    def __init__(self):
        self._base_policy = None

    def track_work(self, hours):
        self._check_base_policy()
        return self._base_policy.track_work(hours)

    def calculate_payroll(self):
        self._check_base_policy()
        base_salary = self._base_policy.calculate_payroll()
        return base_salary * 0.6

    def apply_to_policy(self, base_policy):
        self._base_policy = base_policy

    def _check_base_policy(self):
        if not self._base_policy:
            raise RuntimeError('Base policy missing')

Notice that LTDPolicy doesn’t inherit from PayrollPolicy, but implements the same interface. This is because the implementation is completely different, so we don’t want to inherit any of the PayrollPolicy implementation.

The LTDPolicy initializes _base_policy to None, and provides an internal ._check_base_policy() method that raises an exception if the ._base_policy has not been applied. Then, it provides a .apply_to_policy() method to assign the _base_policy.

The public interface first checks that the _base_policy has been applied, and then implements the functionality in terms of that base policy. The .track_work() method just delegates to the base policy, and .calculate_payroll() uses it to calculate the base_salary and then returns 60% of it.
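As a quick check of the math, here is a sketch assuming the sales employee’s CommissionPolicy from the payroll database as the base policy:

from hr import LTDPolicy, get_policy

ltd = LTDPolicy()
ltd.apply_to_policy(get_policy(3))  # wrap Kevin Bacon's CommissionPolicy
ltd.track_work(40)                  # delegated to the base policy
# The base policy pays 1800.0 for 40 hours, so LTD pays 60% of that:
assert ltd.calculate_payroll() == 1080.0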

You can now make a small change to the Employee class:

# In employees.py

class Employee(AsDictionaryMixin):
    def __init__(self, id):
        self.id = id
        info = employee_database.get_employee_info(self.id)
        self.name = info.get('name')
        self.address = get_employee_address(self.id)
        self._role = get_role(info.get('role'))
        self._payroll = get_policy(self.id)

    def work(self, hours):
        duties = self._role.perform_duties(hours)
        print(f'Employee {self.id} - {self.name}:')
        print(f'- {duties}')
        print('')
        self._payroll.track_work(hours)

    def calculate_payroll(self):
        return self._payroll.calculate_payroll()

    def apply_payroll_policy(self, new_policy):
        new_policy.apply_to_policy(self._payroll)
        self._payroll = new_policy

You added an .apply_payroll_policy() method that applies the existing payroll policy to the new policy and then substitutes it. You can now modify the program to apply the policy to an Employee object:

# In program.py

from hr import calculate_payroll, LTDPolicy
from productivity import track
from employees import employee_database

employees = employee_database.employees

sales_employee = employees[2]
ltd_policy = LTDPolicy()
sales_employee.apply_payroll_policy(ltd_policy)

track(employees, 40)
calculate_payroll(employees)

The program accesses sales_employee, which is located at index 2, creates the LTDPolicy object, and applies the policy to the employee. When .calculate_payroll() is called, the change is reflected. You can run the program to evaluate the output:

$ python program.py

Tracking Employee Productivity
==============================
Employee 1 - Mary Poppins:
- screams and yells for 40 hours.

Employee 2 - John Smith:
- does paperwork for 40 hours.

Employee 3 - Kevin Bacon:
- expends 40 hours on the phone.

Employee 4 - Jane Doe:
- manufactures gadgets for 40 hours.

Employee 5 - Robin Williams:
- does paperwork for 40 hours.

Calculating Payroll
===================
Payroll for: 1 - Mary Poppins
- Check amount: 3000
- Sent to:
121 Admin Rd.
Concord, NH 03301

Payroll for: 2 - John Smith
- Check amount: 1500
- Sent to:
67 Paperwork Ave
Manchester, NH 03101

Payroll for: 3 - Kevin Bacon
- Check amount: 1080.0
- Sent to:
15 Rose St
Apt. B-1
Concord, NH 03301

Payroll for: 4 - Jane Doe
- Check amount: 600
- Sent to:
39 Sole St.
Concord, NH 03301

Payroll for: 5 - Robin Williams
- Check amount: 360
- Sent to:
99 Mountain Rd.
Concord, NH 03301

The check amount for employee Kevin Bacon, who is the sales employee, is now for $1080 instead of $1800. That’s because the LTDPolicy has been applied to the salary.

As you can see, you were able to support the changes just by adding a new policy and modifying a couple of interfaces. This is the kind of flexibility that policy design based on composition gives you.

Choosing Between Inheritance and Composition in Python

Python, as an object oriented programming language, supports both inheritance and composition. You saw that inheritance is best used to model an is a relationship, whereas composition models a has a relationship.

Sometimes, it’s hard to see what the relationship between two classes should be, but you can follow these guidelines:

  • Use inheritance over composition in Python to model a clear is a relationship. First, justify the relationship between the derived class and its base. Then, reverse the relationship and try to justify it. If you can justify the relationship in both directions, then you should not use inheritance between them.

  • Use inheritance over composition in Python to leverage both the interface and implementation of the base class.

  • Use inheritance over composition in Python to provide mixin features to several unrelated classes when there is only one implementation of that feature.

  • Use composition over inheritance in Python to model a has a relationship that leverages the implementation of the component class.

  • Use composition over inheritance in Python to create components that can be reused by multiple classes in your Python applications.

  • Use composition over inheritance in Python to implement groups of behaviors and policies that can be applied interchangeably to other classes to customize their behavior.

  • Use composition over inheritance in Python to enable run-time behavior changes without affecting existing classes.

Conclusion

You explored inheritance and composition in Python. You learned about the type of relationships that inheritance and composition create. You also went through a series of exercises to understand how inheritance and composition are implemented in Python.

In this article, you learned how to:

  • Use inheritance to express an is a relationship between two classes
  • Evaluate if inheritance is the right relationship
  • Use multiple inheritance in Python and evaluate Python’s MRO to troubleshoot multiple inheritance problems
  • Extend classes with mixins and reuse their implementation
  • Use composition to express a has a relationship between two classes
  • Provide flexible designs using composition
  • Reuse existing code through policy design based on composition

Here are some books and articles that further explore object oriented design and can be useful to help you understand the correct use of inheritance and composition in Python or other languages:



Stéphane Wirtel: PyCon Ireland 2019: Call for proposals

PyCon Ireland 2019

Python Ireland is the Irish organisation representing the various chapters of Python users. We organise meetups and events for software developers, students, academics and anyone who wants to learn the language. One of our aims is to help grow and diversify the Python community in Ireland. PyCon Ireland is the largest annual gathering of the Irish Python community. It takes place over a period of two days.

PyCharm: PyCharm 2019.2.1 Preview


PyCharm 2019.2.1 Preview is now available!

Fixed in this Version

  • The PyCharm debugger got some fixes:
    • The issue causing the debugger to hang when a syntax error was encountered was resolved.
    • When classes were given certain names, the debugger would be unable to inspect their variables, this is no longer a problem.
    • Exceptions thrown when debugging a project with multiprocessing modules will not stop the debugger anymore.
  • PyCharm now recognizes ctypes arrays, so you won’t get incorrect inspection messages when defining or using such arrays.
  • A bug that caused TensorFlow references to be unresolved was also fixed.
  • An error that prevented Jupyter Notebook cells from running on managed and configured Jupyter servers is now resolved.
  • We improved Jupyter’s server response error messages to be more user-friendly.
  • And many more fixes, see the release notes for more information.

Getting the New Version

Download the Preview from Confluence.
