
Yasoob Khalid: Review: The History of Unix Video


Hi everyone! If you have been following my blog for a while, you will know how much I love computers. I recently watched this video about the history of Unix and decided to write a short post about it. It’s a one-hour talk presented by Rob Pike, who joined Bell Labs and worked on Unix in the ’70s. I got to know about the video via Hacker News (you should definitely follow it if you don’t already!).

 

The video is basically a personal history of Rob Pike and how he came from Canada to the United States and joined Bell Labs, where he worked with Ken Thompson, Dennis Ritchie, Brian Kernighan and the rest of the group on Unix. It is a really great refresher on how the operating system developed over time and how the revolution came about from punch-card programming to actual programming on screens.

Some of the facts that stood out for me, whether or not I already knew them before watching this presentation:

  1. Some of the people who worked on Unix at Bell Labs later joined Google and worked on the Go language.
  2. The history of the ed text editor.
  3. Good educational material (books and documentation) is just as important as good code in making a piece of software gain popularity.
  4. You don’t necessarily need to be a good software engineer to make a big contribution in the world of tech. Good writing skills and the ability to break down and explain the workings of a complex program to laymen are super important as well.
  5. The story of how the various derivatives of Unix (Linux, Solaris, OpenBSD, etc.) came about, and their licensing.

If you love learning about computers and operating systems, you should definitely watch this video. If you know of similar videos, please share them with the rest of us in the comments below.

If you have read this far, let me give you another suggestion. If you haven’t already, you should watch the AlphaGo documentary, which talks about DeepMind, Go, and neural networks.

Have a good day!


Andre Roberge: User feedback is essential for improvements

A while ago, in Reeborg's World, I implemented a way to require Reeborg to follow a specific path in order to complete a task. For a world creator, this meant including the path as a list of successive grid locations, something like

path = [[1, 1], [2, 1], [2, 2], ...]

With the appropriate information given, the task shown to the user includes a visual representation of the path to follow. This is what it looked like:


This works well. However, if Reeborg has to retrace some steps to accomplish the task, the two arrowheads visually combine and appear to form an X, which could be interpreted to mean that a given path segment should not be included.



(In addition to the arrowheads combining to look like an X, the dashes do not overlap and instead combine to form a solid line.) Most users of Reeborg's World are students learning in a formal setting. I surmise that their teachers quickly figured out what the correct information was and never reported the problem. Since I created this visual representation, I knew its meaning and was simply blind to the other interpretation.

A while ago, a user learning on their own asked me why their program was not working. After a few email exchanges, I finally understood the source of the confusion and took note of it. I had a quick stab at finding a better way, but it didn't work.

A couple of days ago, a second user learning on their own contacted me with the same problem. Clearly, something had to be done...

This is what it now looks like:




In addition to clearing up the confusion (or so I hope), I actually think it looks much nicer. This improvement would not have been possible if I hadn't gotten some user feedback. This is why I am always thankful when someone contacts me to suggest improvements -- even though I may not always be in a position to implement the required changes quickly.

Ian Ozsvald: On receiving the Community Leadership Award at the NumFOCUS Summit 2018


At the end of September I was honoured to receive the Community Leadership Award from NumFOCUS for my work building out the PyData community here in London and at associated events. The award was presented at the NumFOCUS 2018 Summit; I couldn’t attend the New York event, so James Powell gave my speech on my behalf (thanks James!).

I’m humbled to be singled out for the award – things only worked out so well because of the work of all of my colleagues (and alumni) at PyDataLondon and all the other wonderful folk at events like PyDataBerlin, PyDataAmsterdam, EuroPython (which has had a set of PyData sub-tracks) and PyConUK (with similar sub-tracks).

NumFOCUS posted a blog entry on the awards; in addition, Kelle Cruz received the Project Sustainability Award and Shahrokh Mortazavi received the Corporate Stewardship Award.

Cecilia Liao, Emlyn Clay, and I started the first PyDataLondon conference in 2014 with lots of help, guidance, and nudging from NumFOCUS (notably Leah – thanks!), James, and, via Continuum (now Anaconda Inc), Travis and Peter. Many thanks to you all for your help – we’re now at 8,000+ members and our monthly events have 200+ attendees thanks to AHL’s hosting.

If you don’t know NumFOCUS – they’re the group who do a lot of the background support for a number of our PyData ecosystem packages (including NumPy, Jupyter, and Pandas, and beyond to R and Julia), back the PyData conference series, and help lots of associated events and groups. They’re a non-profit, and an awful lot of work goes on that you never see – if you’d like to provide financial support, you can set up a monthly sponsorship here. If you currently don’t contribute anything back into our open-source ecosystem, setting up a regular monthly payment is the easiest possible thing you could do to help NumFOCUS raise more money, which helps more development occur in our ecosystem.


Ian applies Data Science as an AI/Data Scientist for companies in ModelInsight and in his Mor Consulting; sign up for Data Science tutorials in London. He also founded the image and text annotation API Annotate.io, lives in London and is a consumer of fine coffees.

The post On receiving the Community Leadership Award at the NumFOCUS Summit 2018 appeared first on Entrepreneurial Geekiness.

Mike Driscoll: PyDev of the Week: Frank Vieira


This week we welcome Frank Vieira as our PyDev of the Week. Frank is the creator of the Vibora package, a “fast, asynchronous and elegant Python web framework.” You can see what else Frank is up to over on his GitHub profile. Let’s take a few moments to get to know Frank better!

Can you tell us a little about yourself (hobbies, education, etc):

My name is Frank Vieira, I’m 25, a really skilled Dota player (lol), and a passionate software developer. In my free time, I like to play some games and work on hobby projects like small games using Unity or open-source projects like Vibora.

Why did you start using Python?

I got a job in a security company that used Python for everything and I almost immediately fell in love with it.

What other programming languages do you know and which is your favorite?

JavaScript / Go / C# … My favorite is Python, although I’m not a big fan of dynamic typing, and Python deployments are a real pain (Docker is awesome for this).

What projects are you working on now?

I’m working on a complete refactoring of Vibora and also on a mobile game. Hopefully, I’ll be able to finish both 🙂

What is the origin story of the Vibora package?

I was working on some Redis-backed APIs (using Flask/Gunicorn) at my job, and after some benchmarks I saw that Redis was almost sleeping while several machines were at 100% CPU load… After some research, I found Sanic/Japronto, which could bring some raw performance to the server, but they were still young projects and I was not happy with their direction… So here we are: Vibora is still at an early stage, missing a lot of stuff and far from production-ready, but I hope I’ll catch up soon 🙂

Why should people use it over Flask or Django?

First of all, Vibora is a work-in-progress. Don’t replace your Flask/Django app with it.
The project exploded on Reddit before I got the chance to make it stable… I’m working on it to get a stable release as soon as possible, stay tuned 🙂
But to answer your question: Flask/Django are synchronous frameworks, which is not bad but far from optimal when dealing with IO challenges (in my humble opinion). Vibora also has a focus on performance, which is not a priority for those frameworks.

Which Python libraries are your favorite (core or 3rd party)?

Requests, for sure. Although I do have some criticisms of it, I think it influenced the vast majority of HTTP libraries out there (in many different programming languages) in a good way.

Thanks for doing the interview, Frank!

Erik Marsja: How to use Pandas Sample to Select Rows and Columns


In this tutorial we will learn how to use Pandas sample to randomly select rows and columns from a Pandas dataframe. There are several reasons to randomly sample our data; for instance, we may have a very large dataset and want to build our models on a smaller sample of it. Other examples are when carrying out bootstrapping or cross-validation. Here we will learn how to select rows at random, set a random seed, sample by group, sample using weights and conditions, and other useful things.

How to Take a Random Sample of Rows

In this section we are going to learn how to take a random sample of a Pandas dataframe. We are going to use an Excel file that can be downloaded here. First, we start by importing Pandas and we use read_excel to load the Excel file into a dataframe:

import pandas as pd

df = pd.read_excel('MLBPlayerSalaries.xlsx')
df.head()

We use the method shape to see how many rows and columns we have in our dataframe. This method is very similar to the dim function in the R statistical programming language (see here).

df.shape

Now we know how many rows and columns there are (19,543 rows and 5 columns) and we will continue by using Pandas sample. In the example below we are not going to use any parameters. The default behavior, when not using any parameters, is sampling one row:

df.sample()

one randomly selected row using Pandas sample

In most cases we want to take a random sample of more than one row. Thus, in the next Pandas sample example we are going to take a random sample of size 200. We are going to use the parameter n to accomplish this:

df.sample(n=200).head(10)

As can be seen in the above image, we also used the head method to print only the first 10 rows of the randomly sampled rows. In most cases, we may want to save the randomly sampled rows. To accomplish this, we will create a new dataframe:

df200 = df.sample(n=200)
df200.shape
# Output: (200, 5)

In the code above we created a new dataframe, called df200, with 200 randomly selected rows. Again, we used the method shape to see how many rows (and columns) we now have.

Random Sampling Rows using NumPy Choice

It’s of course very easy and convenient to use the Pandas sample method to take a random sample of rows. Note, however, that it’s possible to use NumPy and random.choice instead. In the example below we will get a similar result by using np.random.choice. Note that np.random.choice samples with replacement by default, so we pass replace=False to match the default behavior of Pandas sample.

As usual when working with Python modules, we start by importing NumPy. After this is done we will then continue to create an array of indices (rows) and then use the Pandas loc method to select the rows based on the random indices:

import numpy as np

rows = np.random.choice(df.index.values, 200, replace=False)
df200 = df.loc[rows]
df200.head()

Using Pandas sample to randomly select 200 rows

How to Sample Pandas Dataframe using frac

Now that we have used NumPy we will continue this Pandas dataframe sample tutorial by using sample’s frac parameter. This parameter specifies the fraction (percentage) of rows to return in the random sample. This means that setting frac to 1 (frac=1) will return all rows, in random order. That is, if we just want to shuffle the dataframe it can be done using sample and the parameter frac.

df.sample(frac=1).head()

Pandas Sample using the frac parameter

As can be seen in the output table above, the order of the rows is now random. We can use shape, again, to see that we have the same number of rows:

df.sample(frac=1).shape
# Output: (19543, 5)

As expected there are as many rows and columns as in the original dataframe.

How to Shuffle Pandas Dataframe using Numpy

Here we will use another method to shuffle the dataframe. In the example code below we will use the Python module NumPy again. We have to use reindex (Pandas) and random.permutation (NumPy). More specifically, we will permute the dataframe using the indices:

df_shuffled = df.reindex(np.random.permutation(df.index))

We can also use frac to get 200 randomly selected rows. Before doing this we will, of course, need to calculate what percentage 200 is of our total number of rows. In this case it’s approximately 1% of the data, and using the code below will also give us 200 random rows from the dataframe.

df200 = df.sample(frac=.01023)

Note, the frac parameter cannot be used together with n. We will get a ValueError that states that we cannot enter a value for both frac and n.
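For instance, trying to combine the two raises the error immediately (the exact wording of the message may vary between Pandas versions):

df.sample(n=200, frac=.5)
# ValueError: Please enter a value for `frac` OR `n`, not both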

Pandas Sample with Replacement

We can also, of course, sample with replacement. By default Pandas sample will sample without replacement. In some cases we have to sample with replacement (e.g., with really large datasets). If we want to sample with replacement we should use the replace parameter:

df5 = df.sample(n=5, replace=True)

Sample Dataframe with Seed

If we want to be able to reproduce our random sample of rows we can use the random_state parameter. This is the seed for the random number generator and we need to input an integer:

df200 = df.sample(n=200, random_state=1111)

We can, of course, use the parameters frac and random_state, or n and random_state, together. In the example below we randomly select 50% of the rows using random_state. It is also possible to use the replace=True parameter together with frac and random_state to get a reproducible percentage of rows sampled with replacement.

df200 = df.sample(frac=.5, replace=True, random_state=1111)


Pandas Sample with Weights

The sample method also has a weights parameter, which can be used if we want to increase the probability of certain rows being sampled. We start off the next Pandas sample example by importing NumPy.

import numpy as np

df['Weights'] = np.where(df['Year'] <= 2000, .75, .25)
df['Weights'].unique()

# Output: array([0.75, 0.25])

In the code above we used NumPy’s where to create a new column ‘Weights’. Rows from years up to and including 2000 get a weight of .75, and later rows get .25. This increases the probability that Pandas sample selects rows from those earlier years:

df2 = df.sample(frac=.5, random_state=1111, weights='Weights')
df2.shape

# Output: (9772, 6)

Pandas Sample by Group

It’s also possible to sample each group after we have used Pandas groupby method. In the example below we are going to group the dataframe by player and then take 2 samples of data from each player:

grouped = df.groupby('Player')
grouped.apply(lambda x: x.sample(n=2, replace=True)).head()

Pandas sample by group (player)

The code above may need some clarification. In the second line, we used the Pandas apply method and the anonymous Python function lambda. What it will do is run sample on each subset (i.e., for each Player) and take 2 random rows. Note that we have to use replace=True here, because otherwise sampling would fail for players with fewer than two rows of data.

Pandas Random Sample with Condition

Say that we want to take a random sample of players with a salary under 421,000 (or rather, of rows where the salary is under this number; for some players this may only be true in certain years). This is quite easy: in the example below we sample 10% of the dataframe based on this condition.

df[df['Salary'] < 421000].sample(frac=.1).head()

Pandas sample random selecting columns

It’s also possible to have more than one condition. We just have to add some code to the above example. Now we are going to sample rows with salaries under 421,000 and years prior to 2000:

df[(df['Salary'] < 421000) & (df['Year'] < 2000)].sample(frac=.1).head()

Sample Pandas dataframe by conditions

Using Pandas Sample and Remove

We may want to take a random sample from our dataframe and remove those rows. Maybe we want to create two different dataframes: one with 80% of the rows and one with the remaining 20%. Both of these things can, of course, be done using sample and the drop method. In the code example below we create these two new dataframes:

df1 = df.sample(frac=0.8, random_state=138)
df2 = df.drop(df1.index)

If we merely want to remove random rows we can use drop and the inplace parameter:

df.drop(df1.index, inplace=True)
df.shape

# Same as: df.drop(df.sample(frac=0.8, random_state=138).index, inplace=True)
# Output: (3909, 5)


Saving the Pandas Sample

Finally, we may also want to save the sample to work on later. In the example code below we are going to save a Pandas sample to CSV. To accomplish this we use the to_csv method. The first parameter is the filename and, because we don’t want the dataframe index written to the file, we use index=False.

import pandas as pd

df = pd.read_excel('MLBPlayerSalaries.xlsx')

df.sample(200, random_state=1111).to_csv('MBPlayerSalaries200Sample.csv', 
                                         index=False)

Summary

In this brief Pandas tutorial we have learned how to use the sample method. More specifically, we have learned how to:

  1. take a random sample of data using the n (number of rows) and frac (fraction of rows) parameters,
  2. get reproducible results using a seed (random_state),
  3. sample by group, sample using weights, and sample with conditions,
  4. create two samples and delete random rows, and
  5. save the Pandas sample.

That was it! Now we should know how to use Pandas sample.

The post How to use Pandas Sample to Select Rows and Columns appeared first on Erik Marsja.

Eli Bendersky: Unification


In logic and computer science, unification is a process of automatically solving equations between symbolic terms. Unification has several interesting applications, notably in logic programming and type inference. In this post I want to present the basic unification algorithm with a complete implementation.

Let's start with some terminology. We'll be using terms built from constants, variables and function applications:

  • A lowercase letter represents a constant (could be any kind of constant, like an integer or a string)
  • An uppercase letter represents a variable
  • f(...) is an application of function f to some parameters, which are terms themselves

This representation is borrowed from first-order logic and is also used in the Prolog programming language. Some examples:

  • V: a single variable term
  • foo(V, k): function foo applied to variable V and constant k
  • foo(bar(k), baz(V)): a nested function application

Pattern matching

Unification can be seen as a generalization of pattern matching, so let's start with that first.

We're given a constant term and a pattern term. The pattern term has variables. Pattern matching is the problem of finding a variable assignment that will make the two terms match. For example:

  • Constant term: f(a, b, bar(t))
  • Pattern term: f(a, V, X)

Trivially, the assignment V=b and X=bar(t) works here. Such an assignment is also called a substitution, which maps variables to their assigned values. In a less trivial case, variables can appear multiple times in a pattern:

  • Constant term: f(top(a), a, g(top(a)), t)
  • Pattern term: f(V, a, g(V), t)

Here the right substitution is V=top(a).

Sometimes, no valid substitution exists. If we change the constant term in the last example to f(top(b), a, g(top(a)), t), then there is no valid substitution because V would have to match top(b) and top(a) simultaneously, which is not possible.

Unification

Unification is just like pattern matching, except that both terms can contain variables. So we can no longer say one is the pattern term and the other the constant term. For example:

  • First term: f(a, V, bar(D))
  • Second term: f(D, k, bar(a))

Given two such terms, finding a variable substitution that will make them equivalent is called unification. In this case the substitution is {D=a, V=k}.

Note that some solvable unification problems have an infinite number of possible unifiers. For example, given:

  • First term: f(X, Y)
  • Second term: f(Z, g(X))

We have the substitution {X=Z, Y=g(X)} but also something like {X=K, Z=K, Y=g(K)} and {X=j(K), Z=j(K), Y=g(j(K))} and so on. The first substitution is the simplest one, and also the most general. It's called the most general unifier or mgu. Intuitively, the mgu can be turned into any other unifier by performing another substitution. For example {X=Z, Y=g(X)} can be turned into {X=j(K), Z=j(K), Y=g(j(K))} by applying the substitution {Z=j(K)} to it. Note that the reverse doesn't work, as we can't turn the second into the first by using a substitution. So we say that {X=Z, Y=g(X)} is the most general unifier for the two given terms, and it's the mgu we want to find.

An algorithm for unification

Solving unification problems may seem simple, but there are a number of subtle corner cases to be aware of. In his 1991 paper Correcting a Widespread Error in Unification Algorithms, Peter Norvig noted a common error that exists in many books presenting the algorithm, including SICP.

The correct algorithm is based on J.A. Robinson's 1965 paper "A machine-oriented logic based on the resolution principle". More efficient algorithms have been developed over time since it was first published, but our focus here will be on correctness and simplicity rather than performance.

The following implementation is based on Norvig's, and the full code (with tests) is available on Github. This implementation uses Python 3, while Norvig's original is in Common Lisp. There's a slight difference in representations too, as Norvig uses the Lisp-y (f X Y) syntax to denote an application of function f. The two representations are isomorphic, and I'm picking the more classical one which is used in most papers on the subject. In any case, if you're interested in the more Lisp-y version, I have some Clojure code online that ports Norvig's implementation more directly.

We'll start by defining the data structure for terms:

class Term:
    pass

class App(Term):
    def __init__(self, fname, args=()):
        self.fname = fname
        self.args = args

    # Not shown here: __str__ and __eq__, see full code for the details...

class Var(Term):
    def __init__(self, name):
        self.name = name

class Const(Term):
    def __init__(self, value):
        self.value = value

An App represents the application of function fname to a sequence of arguments.
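As a small illustration (built by hand here; the full code also includes a parse_term helper that constructs terms from strings), the nested term foo(bar(k), baz(V)) from earlier would be represented as:

# foo(bar(k), baz(V)) as a nested App/Const/Var structure
term = App('foo', args=(
    App('bar', args=(Const('k'),)),
    App('baz', args=(Var('V'),)),
))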

def unify(x, y, subst):
    """Unifies term x and y with initial subst.

    Returns a subst (map of name->term) that unifies x and y, or None if
    they can't be unified. Pass subst={} if no subst are initially
    known. Note that {} means valid (but empty) subst.
    """
    if subst is None:
        return None
    elif x == y:
        return subst
    elif isinstance(x, Var):
        return unify_variable(x, y, subst)
    elif isinstance(y, Var):
        return unify_variable(y, x, subst)
    elif isinstance(x, App) and isinstance(y, App):
        if x.fname != y.fname or len(x.args) != len(y.args):
            return None
        else:
            for i in range(len(x.args)):
                subst = unify(x.args[i], y.args[i], subst)
            return subst
    else:
        return None

unify is the main function driving the algorithm. It looks for a substitution, which is a Python dict mapping variable names to terms. When either side is a variable, it calls unify_variable which is shown next. Otherwise, if both sides are function applications, it ensures they apply the same function (otherwise there's no match) and then unifies their arguments one by one, carefully carrying the updated substitution throughout the process.

def unify_variable(v, x, subst):
    """Unifies variable v with term x, using subst.

    Returns updated subst or None on failure.
    """
    assert isinstance(v, Var)
    if v.name in subst:
        return unify(subst[v.name], x, subst)
    elif isinstance(x, Var) and x.name in subst:
        return unify(v, subst[x.name], subst)
    elif occurs_check(v, x, subst):
        return None
    else:
        # v is not yet in subst and can't simplify x. Extend subst.
        return {**subst, v.name: x}

The key idea here is recursive unification. If v is bound in the substitution, we try to unify its definition with x to guarantee consistency throughout the unification process (and vice versa when x is a variable). There's another function being used here - occurs_check; I'm retaining its classical name from early presentations of unification. Its goal is to guarantee that we don't have self-referential variable bindings like X=f(X) that would lead to potentially infinite unifiers.

def occurs_check(v, term, subst):
    """Does the variable v occur anywhere inside term?

    Variables in term are looked up in subst and the check is applied
    recursively.
    """
    assert isinstance(v, Var)
    if v == term:
        return True
    elif isinstance(term, Var) and term.name in subst:
        return occurs_check(v, subst[term.name], subst)
    elif isinstance(term, App):
        return any(occurs_check(v, arg, subst) for arg in term.args)
    else:
        return False
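It is this occurs check that rejects self-referential bindings. For instance, trying to unify X with f(X) fails, because X occurs inside f(X) (using the same parse_term helper as in the examples below):

>>> unify(parse_term('X'), parse_term('f(X)'), {})
None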

Let's see how this code handles some of the unification examples discussed earlier in the post. Starting with the pattern matching example, where variables are just one one side:

>>> unify(parse_term('f(a, b, bar(t))'), parse_term('f(a, V, X)'), {})
{'V': b, 'X': bar(t)}

Now the examples from the Unification section:

>>> unify(parse_term('f(a, V, bar(D))'), parse_term('f(D, k, bar(a))'), {})
{'D': a, 'V': k}
>>> unify(parse_term('f(X, Y)'), parse_term('f(Z, g(X))'), {})
{'X': Z, 'Y': g(X)}

Finally, let's try one where unification will fail due to two conflicting definitions of variable X.

>>> unify(parse_term('f(X, Y, X)'), parse_term('f(r, g(X), p)'), {})
None

Lastly, it's instructive to trace through the execution of the algorithm for a non-trivial unification to see how it works. Let's unify the terms f(X,h(X),Y,g(Y)) and f(g(Z),W,Z,X):

  • unify is called, sees the root is an App of function f and loops over the arguments.
    • unify(X, g(Z)) invokes unify_variable because X is a variable, and the result is augmenting subst with X=g(Z)
    • unify(h(X), W) invokes unify_variable because W is a variable, so the subst grows to {X=g(Z), W=h(X)}
    • unify(Y, Z) invokes unify_variable; since neither Y nor Z are in subst yet, the subst grows to {X=g(Z), W=h(X), Y=Z} (note that the binding between two variables is arbitrary; Z=Y would be equivalent)
    • unify(g(Y), X) invokes unify_variable; here things get more interesting, because X is already in the subst, so now we call unify on g(Y) and g(Z) (what X is bound to)
      • The functions match for both terms (g), so there's another loop over arguments, this time only for unifying Y and Z
      • unify_variable for Y and Z leads to lookup of Y in the subst and then unify(Z, Z), which returns the unmodified subst; the result is that nothing new is added to the subst, but the unification of g(Y) and g(Z) succeeds, because it agrees with the existing bindings in subst
  • The final result is {X=g(Z), W=h(X), Y=Z}
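Once unify returns a substitution, a natural follow-up is to apply it to a term so that all bound variables are resolved. The helper below is a small sketch built on the same Term classes; it is not part of the code shown above, though the full code on GitHub contains a similar routine:

def apply_unifier(term, subst):
    """Resolve term by recursively replacing bound variables from subst.

    A sketch: assumes subst came from a successful unify() call, so the
    occurs check has already ruled out self-referential bindings.
    """
    if subst is None:
        return None
    elif isinstance(term, Const):
        return term
    elif isinstance(term, Var):
        # Chase the variable through the substitution until it bottoms out.
        if term.name in subst:
            return apply_unifier(subst[term.name], subst)
        return term
    elif isinstance(term, App):
        return App(term.fname, tuple(apply_unifier(a, subst) for a in term.args))
    else:
        return None

For example, applying the substitution {X=g(Z), W=h(X), Y=Z} from the trace above to the term h(X) yields h(g(Z)).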

Efficiency

The algorithm presented here is not particularly efficient, and when dealing with large unification problems it's wise to consider more advanced options. It does too much copying around of subst, and too much work is repeated because we don't try to cache terms that have already been unified.

For a good overview of the efficiency of unification algorithms, I recommend checking out two papers:

  • “An Efficient Unification Algorithm” by Martelli and Montanari
  • “Unification: A Multidisciplinary Survey” by Kevin Knight

Real Python: How to Publish an Open-Source Python Package to PyPI


Python is famous for coming with batteries included. Sophisticated capabilities are available in the standard library. You can find modules for working with sockets, parsing CSV, JSON, and XML files, and working with files and file paths.

However great the packages included with Python are, there are many fantastic projects available outside the standard library. These are most often hosted at the Python Packaging Index (PyPI), historically known as the Cheese Shop. At PyPI, you can find everything from Hello World to advanced deep learning libraries.

In this tutorial, you’ll cover how to upload your own package to PyPI. While getting your project published is easier than it used to be, there are still a few steps involved.

You’ll learn how to:

  • Prepare your Python package for publication
  • Think about versioning
  • Upload your package to PyPI

Throughout this tutorial, we’ll use a simple example project: a reader package that can be used to read Real Python tutorials. The first section introduces this project.


A Small Python Package

This section will describe a small Python package that we’ll use as an example that can be published to PyPI. If you already have a package that you want to publish, feel free to skim this section and join up again at the beginning of the next section.

The package that we’ll use is called reader and is an application that can be used to download and read Real Python articles. If you want to follow along, you can get the full source code from our GitHub repository.

Note: The source code as shown and explained below is a simplified, but fully functional, version of the Real Python feed reader. Compared to the package published on PyPI and GitHub, this version lacks some error handling and extra options.

First, have a look at the directory structure of reader. The package lives completely inside a directory that is also named reader:

reader/
│
├── reader/
│   ├── config.txt
│   ├── feed.py
│   ├── __init__.py
│   ├── __main__.py
│   └── viewer.py
│
├── tests/
│   ├── test_feed.py
│   └── test_viewer.py
│
├── MANIFEST.in
├── README.md
└── setup.py

The source code of the package is in a reader subdirectory together with a configuration file. There are a few tests in a separate subdirectory. The tests will not be covered here, but you can find them in the GitHub repository. To learn more about testing, see Anthony Shaw’s great tutorial on Getting Started With Testing in Python.

If you’re working with your own package, you may use a different structure or have other files in your package directory. Our Python Application Layouts reference discusses several different options. The instructions in this guide will work independently of the layout you use.

In the rest of this section, you’ll see how the reader package works. In the next section, you’ll get a closer look at the special files, including setup.py, README.md, and MANIFEST.in, that are needed to publish your package.

Using the Real Python Reader

reader is a very basic web feed reader that can download the latest Real Python articles from the Real Python feed.

Here is an example of using the reader to get the list of the latest articles:

$ python -m reader
The latest tutorials from Real Python (https://realpython.com/)
 0 How to Publish an Open-Source Python Package to PyPI
 1 Python "while" Loops (Indefinite Iteration)
 2 Writing Comments in Python (Guide)
 3 Setting Up Python for Machine Learning on Windows
 4 Python Community Interview With Michael Kennedy
 5 Practical Text Classification With Python and Keras
 6 Getting Started With Testing in Python
 7 Python, Boto3, and AWS S3: Demystified
 8 Python's range() Function (Guide)
 9 Python Community Interview With Mike Grouchy
10 How to Round Numbers in Python
11 Building and Documenting Python REST APIs With Flask and Connexion – Part 2
12 Splitting, Concatenating, and Joining Strings in Python
13 Image Segmentation Using Color Spaces in OpenCV + Python
14 Python Community Interview With Mahdi Yusuf
15 Absolute vs Relative Imports in Python
16 Top 10 Must-Watch PyCon Talks
17 Logging in Python
18 The Best Python Books
19 Conditional Statements in Python

Notice that each article is numbered. To read one particular article, you use the same command but include the number of the article as well. For instance, to read How to Publish an Open-Source Python Package to PyPI, you add 0 to the command:

$ python -m reader 0
# How to Publish an Open-Source Python Package to PyPI

Python is famous for coming with batteries included. Sophisticated
capabilities are available in the standard library. You can find modules
for working with sockets, parsing CSV, JSON, and XML files, and
working with files and file paths.

However great the packages included with Python are, there are many
fantastic projects available outside the standard library. These are
most often hosted at the Python Packaging Index (PyPI), historically
known as the Cheese Shop. At PyPI, you can find everything from Hello
World to advanced deep learning libraries.

[... The full text of the article ...]

This prints the full article to the console using the Markdown text format.

Note: python -m is used to run a library module or package instead of a script. If you run a package, the contents of the file __main__.py will be executed. See Different Ways of Calling a Package for more info.

By changing the article number, you can read any of the available articles.

A Quick Look at the Code

The details of how reader works are not important for the purpose of this tutorial. However, if you are interested in seeing the implementation, you can expand the sections below. The package consists of five files:

config.txt is a configuration file used to specify the URL of the feed of Real Python tutorials. It’s a text file that can be read by the configparser standard library:

# config.txt
[feed]
url = https://realpython.com/atom.xml

In general, such a config file contains key-value pairs separated into sections. This particular file contains only one section (feed) and one key (url).

Note: A configuration file is probably overkill for this simple package. We include it here for demonstration purposes.

The first source code file we’ll look at is __main__.py. The double underscores indicate that this file has a special meaning in Python. Indeed, when running a package as a script with -m as above, Python executes the contents of the __main__.py file.

In other words, __main__.py acts as the entry point of our program and takes care of the main flow, calling other parts as needed:

# __main__.py

from configparser import ConfigParser
from importlib import resources  # Python 3.7+
import sys

from reader import feed
from reader import viewer

def main():
    """Read the Real Python article feed"""
    # Read URL of the Real Python feed from config file
    cfg = ConfigParser()
    cfg.read_string(resources.read_text("reader", "config.txt"))
    url = cfg.get("feed", "url")

    # If an article ID is given, show the article
    if len(sys.argv) > 1:
        article = feed.get_article(url, sys.argv[1])
        viewer.show(article)

    # If no ID is given, show a list of all articles
    else:
        site = feed.get_site(url)
        titles = feed.get_titles(url)
        viewer.show_list(site, titles)

if __name__ == "__main__":
    main()

Notice that main() is called on the last line. If we did not call main(), then our program would not do anything. As you saw earlier, the program can either list all articles or print one specific article. This is handled by the if-else inside main().

To read the URL to the feed from the configuration file, we use configparser and importlib.resources. The latter is used to import non-code (or resource) files from a package without having to worry about the full file path. It is especially helpful when publishing packages to PyPI where resource files might end up inside binary archives.

importlib.resources became a part of the standard library in Python 3.7. If you are using an older version of Python, you can use importlib_resources instead. This is a backport compatible with Python 2.7 as well as 3.4 and above. importlib_resources can be installed from PyPI:

$ pip install importlib_resources

See Barry Warsaw’s presentation at PyCon 2018 for more information.
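If your code needs to run on both older and newer Pythons, one common approach is a conditional import along these lines (a sketch; the backport is designed to mirror the standard library API):

# Prefer the standard library module (Python 3.7+) and fall back
# to the PyPI backport on older versions.
try:
    from importlib import resources
except ImportError:
    import importlib_resources as resources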

The next file is __init__.py. Again, the double underscores in the filename tell us that this is a special file. __init__.py represents the root of your package. It should usually be kept quite simple, but it’s a good place to put package constants, documentation, and so on:

# __init__.py

# Version of the realpython-reader package
__version__ = "1.0.0"

The special variable __version__ is a convention in Python for adding version numbers to your package. It was introduced in PEP 396. We’ll talk more about versioning later.

Variables defined in __init__.py become available as variables in the package namespace:

>>> import reader
>>> reader.__version__
'1.0.0'

You should define the __version__ variable in your own packages as well.

Looking at __main__.py, you’ll see that two modules, feed and viewer, are imported and used to read from the feed and show the results. These modules do most of the actual work.

First consider feed.py. This file contains functions for reading from a web feed and parsing the result. Luckily there are already great libraries available to do this. feed.py depends on two modules that are already available on PyPI: feedparser and html2text.

feed.py contains several functions. We’ll discuss them one at a time.

To avoid reading from the web feed more than necessary, we first create a function that remembers the feed the first time it’s read:

 1 # feed.py
 2
 3 import feedparser
 4 import html2text
 5
 6 _CACHED_FEEDS = dict()
 7
 8 def _feed(url):
 9     """Only read a feed once, by caching its contents"""
10     if url not in _CACHED_FEEDS:
11         _CACHED_FEEDS[url] = feedparser.parse(url)
12     return _CACHED_FEEDS[url]

feedparser.parse() reads a feed from the web and returns it in a structure that looks like a dictionary. To avoid downloading the feed more than once, it’s stored in _CACHED_FEEDS and reused for later calls to _feed(). Both _CACHED_FEEDS and _feed() are prefixed by an underscore to indicate that they are support objects not meant to be used directly.

We can get some basic information about the feed by looking in the .feed metadata. The following function picks out the title and link to the web site containing the feed:

14 def get_site(url):
15     """Get name and link to web site of the feed"""
16     info = _feed(url).feed
17     return f"{info.title} ({info.link})"

In addition to .title and .link, attributes like .subtitle, .updated, and .id are also available.

The articles available in the feed can be found inside the .entries list. Article titles can be found with a list comprehension:

19 def get_titles(url):
20     """List titles in feed"""
21     articles = _feed(url).entries
22     return [a.title for a in articles]

.entries lists the articles in the feed sorted chronologically, so that the newest article is .entries[0].

In order to get the contents of one article, we use its index in the .entries list as an article ID:

24 def get_article(url, article_id):
25     """Get article from feed with the given ID"""
26     articles = _feed(url).entries
27     article = articles[int(article_id)]
28     html = article.content[0].value
29     text = html2text.html2text(html)
30     return f"# {article.title}\n\n{text}"

After picking the correct article out of the .entries list, we find the text of the article as HTML on line 28. Next, html2text does a decent job of translating the HTML into much more readable text. As the HTML doesn’t contain the title of the article, the title is added before returning.

The final module is viewer.py. At the moment, it consists of two very simple functions. In practice, we could have used print() directly in __main__.py instead of calling viewer functions. However, having the functionality split off makes it easier to replace it later with something more advanced. Maybe we could add a GUI interface in a later version?

viewer.py contains two functions:

# viewer.py

def show(article):
    """Show one article"""
    print(article)

def show_list(site, titles):
    """Show list of articles"""
    print(f"The latest tutorials from {site}")
    for article_id, title in enumerate(titles):
        print(f"{article_id:>3} {title}")

show() simply prints one article to the console, while show_list() prints a list of titles. The latter also creates article IDs that are used when choosing to read one particular article.

Different Ways of Calling a Package

One challenge when your projects grow in complexity is communicating to the user how to use your project. Since the package consists of four different source code files, how does the user know which file to call to run reader?

The python interpreter program has an -m option that allows you to specify a module name instead of a file name. For instance, if you have a script called hello.py, the following two commands are equivalent:

$ python hello.py
Hi there!
$ python -m hello
Hi there!

One advantage of the latter is that it allows you to call modules that are built into Python as well. One example is calling antigravity:

$ python -m antigravity
Created new window in existing browser session.

Another advantage of using -m is that it works for packages as well as modules. As you saw earlier, you can call the reader package with -m:

$ python -m reader
[...]

Since reader is a package, the name only refers to a directory. How does Python decide which code inside that directory to run? It looks for a file named __main__.py. If such a file exists, it is executed. If __main__.py does not exist, then an error message is printed:

$ python -m math
python: No code object available for math

In this example, you see that the math standard library has not defined a __main__.py file.

If you are creating a package that is supposed to be executed, you should include a __main__.py file. Later, you’ll see how you can also create entry points to your package that will behave like regular programs.

Preparing Your Package for Publication

Now you’ve got a package you want to publish, or maybe you copied our package. Which steps are necessary before you can upload the package to PyPI?

Naming Your Package

The first—and possibly the hardest—step is to come up with a good name for your package. All packages on PyPI need to have unique names. With more than 150,000 packages already on PyPI, chances are that your favorite name is already taken.

You might need to brainstorm and do some research to find the perfect name. Use the PyPI search to check if a name is already taken. The name that you come up with will be visible on PyPI.

To make the reader package easier to find on PyPI, we give it a more descriptive name and call it realpython-reader. The same name will be used to install the package using pip:

$ pip install realpython-reader

Even though we use realpython-reader as the PyPI name, the package is still called reader when it’s imported:

>>> import reader
>>> help(reader)
>>> from reader import feed
>>> feed.get_titles()
['How to Publish an Open-Source Python Package to PyPI', ...]

As you see, you can use different names for your package on PyPI and when importing. However, if you use the same name or very similar names, then it will be easier for your users.

Configuring Your Package

In order for your package to be uploaded to PyPI, you need to provide some basic information about it. This information is typically provided in the form of a setup.py file. There are initiatives that try to simplify this collection of information. At the moment though, setup.py is the only fully supported way of providing information about your package.

The setup.py file should be placed in the top folder of your package. A fairly minimal setup.py for reader looks like this:

import pathlib
from setuptools import setup

# The directory containing this file
HERE = pathlib.Path(__file__).parent

# The text of the README file
README = (HERE / "README.md").read_text()

# This call to setup() does all the work
setup(
    name="realpython-reader",
    version="1.0.0",
    description="Read the latest Real Python tutorials",
    long_description=README,
    long_description_content_type="text/markdown",
    url="https://github.com/realpython/reader",
    author="Real Python",
    author_email="office@realpython.com",
    license="MIT",
    classifiers=[
        "License :: OSI Approved :: MIT License",
        "Programming Language :: Python :: 3",
        "Programming Language :: Python :: 3.7",
    ],
    packages=["reader"],
    include_package_data=True,
    install_requires=["feedparser", "html2text"],
    entry_points={
        "console_scripts": [
            "realpython=reader.__main__:main",
        ]
    },
)

We will only cover some of the options available in setuptools here. The documentation does a good job of going into all the detail.

The parameters that are 100% necessary in the call to setup() are the following:

  • name: the name of your package as it will appear on PyPI
  • version: the current version of your package
  • packages: the packages and subpackages containing your source code

We will talk more about versions later. The packages parameter takes a list of packages. In our example, there is only one package: reader.

You also need to specify any subpackages. In more complicated projects, there might be many packages to list. To simplify this job, setuptools includes find_packages(), which does a good job of discovering all your subpackages. You could have used find_packages() in the reader project as follows:

from setuptools import find_packages, setup

setup(
    ...
    packages=find_packages(exclude=("tests",)),
    ...
)

While only name, version, and packages are required, your package becomes much easier to find on PyPI if you add some more information. Have a look at the realpython-reader page on PyPI and compare the information with setup.py above. All the information comes from setup.py and README.md.

Information about the realpython-reader package at PyPI

The last two parameters to setup() deserve special mention:

  • install_requires is used to list any dependencies your package has on third-party libraries. The reader depends on feedparser and html2text, so they should be listed here.

  • entry_points is used to create scripts that call a function within your package. In our example, we create a new script realpython that calls main() within the reader/__main__.py file.

For another example of a typical setup file, see Kenneth Reitz’s setup.py repository on GitHub.

Documenting Your Package

Before releasing your package to the world, you should add some documentation. Depending on your package, the documentation can be as small as a simple README file, or as big as a full web page with tutorials, example galleries, and an API reference.

At a minimum, you should include a README file with your project. A good README should quickly describe your project, as well as tell your users how to install and use your package. Typically, you want to include your README as the long_description argument to setup(). This will display your README on PyPI.

Traditionally, PyPI has used reStructuredText for package documentation. However, since March 2018 Markdown has also been supported.

Outside of PyPI, Markdown is more widely supported than reStructuredText. If you don’t need any of the special features of reStructuredText, you’ll be better off keeping your README in Markdown. Note that you should use the setup() parameter long_description_content_type to tell PyPI which format you are using. Valid values are text/markdown, text/x-rst, and text/plain.

For bigger projects, you might want to offer more documentation than can reasonably fit in a single file. In that case, you can use sites like GitHub or Read the Docs, and link to the documentation using the url parameter. In the setup.py example above, url is used to link to the reader GitHub repository.

Versioning Your Package

Your package needs to have a version, and PyPI will only let you do one upload of a particular version for a package. In other words, if you want to update your package on PyPI, you need to increase the version number first. This is a good thing, as it guarantees reproducibility: two systems with the same version of a package should behave the same.

There are many different schemes that can be used for your version number. For Python projects, PEP 440 gives some recommendations. However, in order to be flexible, that PEP is complicated. For a simple project, stick with a simple versioning scheme.

Semantic versioning is a good default scheme to use. The version number is given as three numerical components, for instance 0.1.2. The components are called MAJOR, MINOR, and PATCH, and there are simple rules about when to increment each component:

  • Increment the MAJOR version when you make incompatible API changes.
  • Increment the MINOR version when you add functionality in a backwards-compatible manner.
  • Increment the PATCH version when you make backwards-compatible bug fixes. (Source)

You may need to specify the version in different files inside your project. In the reader project, we specified the version both in setup.py and in reader/__init__.py. To make sure the version numbers are kept consistent, you can use a tool called Bumpversion.

You can install Bumpversion from PyPI:

$ pip install bumpversion

To increment the MINOR version of reader, you would do something like this:

$ bumpversion --current-version 1.0.0 minor setup.py reader/__init__.py

This would change the version number from 1.0.0 to 1.1.0 in both setup.py and reader/__init__.py. To simplify the command, you can also give most of the information in a configuration file. See the Bumpversion documentation for details.
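As a sketch of such a configuration (Bumpversion reads a file named .bumpversion.cfg; adapt the file sections to your own project), you could use:

[bumpversion]
current_version = 1.0.0

[bumpversion:file:setup.py]

[bumpversion:file:reader/__init__.py]

With this file in place, the command above shortens to bumpversion minor.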

Adding Files to Your Package

Sometimes, you’ll have files inside your package that are not source code files. Examples include data files, binaries, documentation, and—as we have in this project—configuration files.

To tell setup() to include such files, you use a manifest file. For many projects, you don’t need to worry about the manifest, as setup() creates one that includes all code files as well as README files.

If you need to change the manifest, you create a manifest template which must be named MANIFEST.in. This file specifies rules for what to include and exclude:

include reader/*.txt

This example will include all .txt files in the reader directory, which in effect is the configuration file. See the documentation for a list of available rules.

In addition to creating MANIFEST.in, you also need to tell setup() to copy these non-code files. This is done by setting the include_package_data argument to True:

setup(
    ...
    include_package_data=True,
    ...
)

The include_package_data argument controls whether non-code files are copied when your package is installed.

Publishing to PyPI

Your package is finally ready to meet the world outside your computer! In this section, you’ll see how to actually upload your package to PyPI.

If you don’t already have an account on PyPI, now is the time to create one: register your account on PyPI. While you’re at it, you should also register an account on TestPyPI. TestPyPI is very useful, as you can try all the steps of publishing a package without any consequences if you mess up.

To upload your package to PyPI, you’ll use a tool called Twine. You can install Twine using Pip as usual:

$ pip install twine

Using Twine is quite simple, and you will soon see how to use it to check and publish your package.

Building Your Package

Packages on PyPI are not distributed as plain source code. Instead, they are wrapped into distribution packages. The most common formats for distribution packages are source archives and Python wheels.

A source archive consists of your source code and any supporting files wrapped into one tar file. Similarly, a wheel is essentially a zip archive containing your code. In contrast to the source archive, the wheel includes any extensions ready to use.

To create a source archive and a wheel for your package, you can run the following command:

$ python setup.py sdist bdist_wheel

This will create two files in a newly created dist directory, a source archive and a wheel:

reader/
│
└── dist/
    ├── realpython_reader-1.0.0-py3-none-any.whl
    └── realpython-reader-1.0.0.tar.gz

Note: On Windows, the source archive will be a .zip file by default. You can choose the format of the source archive with the --formats command line option, as shown below.
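For instance, to force a gzipped tar archive on any platform, you could run something like this (using the standard sdist option):

$ python setup.py sdist --formats=gztar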

You might wonder how setup.py knows what to do with the sdist and bdist_wheel arguments. If you look back to how setup.py was implemented, there is no mention of sdist, bdist_wheel, or any other command line arguments.

All the command line arguments are instead implemented in the upstream distutils standard library. You can list all available arguments by adding the --help-commands option:

$ python setup.py --help-commands
Standard commands:
  build             build everything needed to install
  build_py          "build" pure Python modules (copy to build directory)
  build_ext         build C/C++ and Cython extensions (compile/link to build directory)
  <... many more commands ...>

For information about one particular command, you can do something like python setup.py sdist --help.

Testing Your Package

First, you should check that the newly built distribution packages contain the files you expect. On Linux and macOS, you should be able to list the contents of the tar source archive as follows:

$ tar tzf realpython-reader-1.0.0.tar.gz
realpython-reader-1.0.0/
realpython-reader-1.0.0/setup.cfg
realpython-reader-1.0.0/README.md
realpython-reader-1.0.0/reader/
realpython-reader-1.0.0/reader/feed.py
realpython-reader-1.0.0/reader/__init__.py
realpython-reader-1.0.0/reader/viewer.py
realpython-reader-1.0.0/reader/__main__.py
realpython-reader-1.0.0/reader/config.txt
realpython-reader-1.0.0/PKG-INFO
realpython-reader-1.0.0/setup.py
realpython-reader-1.0.0/MANIFEST.in
realpython-reader-1.0.0/realpython_reader.egg-info/
realpython-reader-1.0.0/realpython_reader.egg-info/SOURCES.txt
realpython-reader-1.0.0/realpython_reader.egg-info/requires.txt
realpython-reader-1.0.0/realpython_reader.egg-info/dependency_links.txt
realpython-reader-1.0.0/realpython_reader.egg-info/PKG-INFO
realpython-reader-1.0.0/realpython_reader.egg-info/entry_points.txt
realpython-reader-1.0.0/realpython_reader.egg-info/top_level.txt

On Windows, you can use a utility like 7-zip to look inside the corresponding zip file.

You should see all your source code listed, as well as a few new files that have been created containing information you provided in setup.py. In particular, make sure that all subpackages and supporting files are included.

You can also have a look inside the wheel by unzipping it as if it were a zip file. However, if your source archive contains the files you expect, the wheel should be fine as well.

Newer versions of Twine (1.12.0 and above) can also check that your package description will render properly on PyPI. You can run twine check on the files created in dist:

$ twine check dist/*
Checking distribution dist/realpython_reader-1.0.0-py3-none-any.whl: Passed
Checking distribution dist/realpython-reader-1.0.0.tar.gz: Passed

While it won’t catch all problems you might run into, it will for instance let you know if you are using the wrong content type.

Uploading Your Package

Now you’re ready to actually upload your package to PyPI. For this, you’ll again use the Twine tool, telling it to upload the distribution packages you have built. First, you should upload to TestPyPI to make sure everything works as expected:

$ twine upload --repository-url https://test.pypi.org/legacy/ dist/*

Twine will ask you for your username and password.

Note: If you’ve followed the tutorial using the reader package as an example, the previous command will probably fail with a message saying you are not allowed to upload to the realpython-reader project.

You can change the name in setup.py to something unique, for example test-your-username. Then build the project again and upload the newly built files to TestPyPI.

If the upload succeeds, you can quickly head over to TestPyPI, scroll down, and look at your project being proudly displayed among the new releases! Click on your package and make sure everything looks okay.

If you have been following along using the reader package, the tutorial ends here! While you can play with TestPyPI as much as you want, you shouldn’t upload dummy packages to PyPI just for testing.

However, if you have your own package to publish, then the moment has finally arrived! With all the preparations taken care of, this final step is short:

$ twine upload dist/*

Provide your username and password when requested. That’s it!

Head over to PyPI and look up your package. You can find it either by searching, by looking at the Your projects page, or by going directly to the URL of your project: pypi.org/project/your-package-name/.

Congratulations! Your package is published on PyPI!

pip install Your Package

Take a moment to bask in the blue glow of the PyPI web page and (of course) brag to your friends.

Then open up a terminal again. There is one more great payoff!

With your package uploaded to PyPI, you can install it with pip as well:

$ pip install your-package-name

Replace your-package-name with the name you chose for your package. For instance, to install the reader package, you would do the following:

$ pip install realpython-reader

Seeing your own code installed by pip is a wonderful feeling!

Other Useful Tools

Before wrapping up, there are a few other tools that are useful to know about when creating and publishing Python packages.

Virtual Environments

In this guide, we haven’t talked about virtual environments. Virtual environments are very useful when working with different projects, each with their own differing requirements and dependencies.

See the following guides for more information:

In particular, it’s useful to test your package inside a minimal virtual environment to make sure you’re including all necessary dependencies in your setup.py file.

Cookiecutter

One great way to get started with your project is to use Cookiecutter. It sets up your project by asking you a few questions based on a template. Many different templates are available.

First, make sure you have Cookiecutter installed on your system. You can install it from PyPI:

$ pip install cookiecutter

As an example, we’ll use the pypackage-minimal template. To use a template, give Cookiecutter a link to the template:

$ cookiecutter https://github.com/kragniz/cookiecutter-pypackage-minimal
author_name [Louis Taylor]: Real Python
author_email [louis@kragniz.eu]: office@realpython.com
package_name [cookiecutter_pypackage_minimal]: realpython-reader
package_version [0.1.0]:
package_description [...]: Read Real Python tutorials
package_url [...]: https://github.com/realpython/reader
readme_pypi_badge [True]:
readme_travis_badge [True]: False
readme_travis_url [...]:

After you have answered a series of questions, Cookiecutter sets up your project. In this example, the template created the following files and directories:

realpython-reader/
│
├── realpython-reader/
│   └── __init__.py
│
├── tests/
│   ├── __init__.py
│   └── test_sample.py
│
├── README.rst
├── setup.py
└── tox.ini

Cookiecutter’s documentation is extensive and includes a long list of available cookiecutters, as well as tutorials on how to create your own template.

Flit

The history of packaging in Python is quite messy. One common criticism is that using an executable file like setup.py for configuration information is not ideal.

PEP 518 defines an alternative: using a file called pyproject.toml instead. The TOML format is a simple configuration file format:

[…] it is human-usable (unlike JSON), it is flexible enough (unlike configparser), stems from a standard (also unlike configparser), and it is not overly complex (unlike YAML). (Source)

While PEP 518 is already a few years old, the pyproject.toml configuration file is not yet fully supported in the standard tools.

However, there are a few new tools that can publish to PyPI based on pyproject.toml. One such tool is Flit, a great little project that makes it easy to publish simple Python packages. Flit doesn’t support advanced packages like those creating C extensions.

You can pip install flit, and then start using it as follows:

$ flit init
Module name [reader]:
Author []: Real Python
Author email []: office@realpython.com
Home page []: https://github.com/realpython/reader
Choose a license (see http://choosealicense.com/ for more info)
1. MIT - simple and permissive
2. Apache - explicitly grants patent rights
3. GPL - ensures that code based on this is shared with the same terms
4. Skip - choose a license later
Enter 1-4 [1]:
Written pyproject.toml; edit that file to add optional extra info.

The flit init command will create a pyproject.toml file based on the answers you give to a few questions. You might need to edit this file slightly before using it. For the reader project, the pyproject.toml file for Flit ends up looking as follows:

[build-system]
requires = ["flit"]
build-backend = "flit.buildapi"

[tool.flit.metadata]
module = "reader"
dist-name = "realpython-reader"
description-file = "README.md"
author = "Real Python"
author-email = "office@realpython.com"
home-page = "https://github.com/realpython/reader"
classifiers = [
    "License :: OSI Approved :: MIT License",
    "Programming Language :: Python :: 3",
    "Programming Language :: Python :: 3.7",
]
requires-python = ">=3.7"
requires = ["feedparser", "html2text"]

[tool.flit.scripts]
realpython = "reader.__main__:main"

You should recognize most of the items from our original setup.py. One thing to note though is that version and description are missing. This is not a mistake. Flit actually figures these out itself by using __version__ and the docstring defined in the __init__.py file. Flit’s documentation explains everything about the pyproject.toml file.
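
As a minimal sketch of what that looks like, the reader/__init__.py that Flit reads would contain something along these lines (the docstring becomes the description, __version__ the version):

"""Read the latest Real Python tutorials."""

__version__ = "1.0.0"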

Flit can build your package and even publish it to PyPI. To build your package, simply do the following:

$ flit build

This creates a source archive and a wheel, exactly like python setup.py sdist bdist_wheel did earlier. To upload your package to PyPI, you can use Twine as earlier. However, you can also use Flit directly:

$ flit publish

The publish command will build your package if necessary, and then upload the files to PyPI, prompting you for your username and password if necessary.

To see Flit in action, have a look at the 2 minute lightning talk from EuroSciPy 2017. The Flit documentation is a great resource for more information. Brett Cannon’s tutorial on packaging up your Python code for PyPI includes a section about Flit.

Poetry

Poetry is another tool that can be used to build and upload your package. It’s quite similar to Flit, especially for the things we’re looking at here.

Before you use Poetry, you need to install it. It’s possible to pip install poetry as well. However, the author recommends that you use a custom installation script to avoid potential dependency conflicts. See the documentation for installation instructions.

With Poetry installed, you start using it with an init command:

$ poetry init

This command will guide you through creating your pyproject.toml config.

Package name [code]: realpython-reader
Version [0.1.0]: 1.0.0
Description []: Read the latest Real Python tutorials
...

This will create a pyproject.toml file based on your answers to questions about your package. Unfortunately, the actual specifications inside the pyproject.toml differ between Flit and Poetry. For Poetry, the pyproject.toml file ends up looking like the following:

[tool.poetry]
name = "realpython-reader"
version = "1.0.0"
description = "Read the latest Real Python tutorials"
readme = "README.md"
homepage = "https://github.com/realpython/reader"
authors = ["Real Python <office@realpython.com>"]
license = "MIT"
packages = [{include = "reader"}]
include = ["reader/*.txt"]

[tool.poetry.dependencies]
python = ">=3.7"
feedparser = ">=5.2"
html2text = ">=2018.1"

[tool.poetry.scripts]
realpython = "reader.__main__:main"

[build-system]
requires = ["poetry>=0.12"]
build-backend = "poetry.masonry.api"

Again, you should recognize all these items from the earlier discussion of setup.py. One thing to note is that Poetry will automatically add classifiers based on the license and the version of Python you specify. Poetry also requires you to be explicit about versions of your dependencies. In fact, dependency management is one of the strong points of Poetry.

Just like Flit, Poetry can build and upload packages to PyPI. The build command creates a source archive and a wheel:

$ poetry build

This will create the two usual files in the dist subdirectory, which you can upload using Twine as earlier. You can also use Poetry to publish to PyPI:

$ poetry publish

This will upload your package to PyPI. In addition to building and publishing, Poetry can help you earlier in the process. Similar to Cookiecutter, Poetry can help you start a new project with the new command. It also supports working with virtual environments. See Poetry’s documentation for all the details.

Apart from the slightly different configuration files, Flit and Poetry work very similarly. Poetry is broader in scope as it also aims to help with dependency management, while Flit has been around a little longer. Andrew Pinkham’s article Python’s New Package Landscape covers both Flit and Poetry. Poetry was one of the topics at the special 100th episode of the Python Bytes podcast.

Conclusion

You now know how to prepare your project and upload it to PyPI, so that it can be installed and used by other people. While there are a few steps you need to go through, seeing your own package on PyPI is a great payoff. Having others find your project useful is even better!

In this tutorial, you’ve seen the steps necessary to publish your own package:

  • Find a good name for your package
  • Configure your package using setup.py
  • Build your package
  • Upload your package to PyPI

In addition, you’ve also seen a few new tools for publishing packages that use the new pyproject.toml configuration file to simplify the process.

If you still have questions, feel free to reach out in the comments section below. Also, the Python Packaging Authority has a lot of information with more detail than we covered here.



Catalin George Festila: Python Qt5 - QCalendarWidget example.

This tutorial is about the QCalendarWidget.
Start from a default application and add this widget.
You can set and get the date from the calendar widget like any other widget.
Running the example produces a window with a calendar and a label showing the selected date.

This is the source code for the QCalendarWidget example:
import sys
from PyQt5.QtWidgets import QApplication, QCalendarWidget, QWidget, QLabel
from PyQt5.QtCore import QDate

class Example(QWidget):
    def __init__(self):
        super(Example, self).__init__()
        self.initUI()

    def initUI(self):
        # create the calendar widget and show its grid lines
        my_calendar = QCalendarWidget(self)
        my_calendar.setGridVisible(True)
        my_calendar.move(10, 20)
        # update the label whenever a date is clicked
        my_calendar.clicked[QDate].connect(self.show_date)

        # label showing the currently selected date
        self.my_label = QLabel(self)
        date = my_calendar.selectedDate()
        self.my_label.setText(date.toString())
        self.my_label.move(10, 220)

        self.setGeometry(100, 100, 320, 270)
        self.setWindowTitle('Calendar')
        self.show()

    def show_date(self, date):
        self.my_label.setText(date.toString())

def main():
    app = QApplication(sys.argv)
    ex = Example()
    sys.exit(app.exec_())

if __name__ == '__main__':
    main()

PyCon: PyCon 2019 Registration is Open!

It is that time of year! Registration for PyCon 2019 has launched and once again we are selling the first 800 tickets at a discounted rate.

How to register

Once you have created an account on us.pycon.org, you can register via the registration tab on the conference website.

Registration costs

The early bird pricing is $550 for corporate, $350 for individuals, and $100 for students. Once we sell the first 800 tickets, regular prices will go into effect. Regular pricing will be $700 for corporate, $400 for individuals, and $125 for students.

PyCon will take place May 1-9, 2019 in Cleveland, Ohio. The core of the conference, May 3-5, 2019, packs in three days' worth of our community’s 95 best talks, amazing keynote speakers, and our famed lightning talks to close out each day, but it is much more than that.

It’s having over 3,000 people in one place to learn from and share with. It’s joining a conversation in the hallway with the creators of open source projects. It’s taking yourself from beginner to intermediate, or intermediate to advanced. For some, it’s getting started with Python for the first time. 

It’s a whole host of events that brings together Python users from around the world. Following the conference days there are 4 days of sprints you are invited to attend. Free and open to the public, PyCon Sprints offer an opportunity for anyone to collaborate and contribute to a project even if it is their first time.

Tutorials

Tutorials will be presented Wednesday May 1, 2019 and Thursday May 2, 2019. We are accepting proposals for tutorials through November 26, 2018. Find more information and submit a proposal here. Once our program committee has scheduled the selected tutorials, you will be able to add them to your conference registration. Watch for tutorial launch in February 2019. Follow us on Twitter for the announcement.

Education Summit

The Education Summit is held on Thursday May 2, 2019. The Education Summit requires you to be registered due to capacity limits. Please only register if you plan to attend as this is a popular event. If you register and are unable to attend please let us know by emailing pycon-reg@python.org. We want to be sure the room is full and those that are able to attend have the chance.

Evening Dinners

There are two evening events that require additional registration and capacity is limited. The Rock & Roll Hall of Fame Dinner on Friday May 3, 2019 and the Great Lakes Science Center Dinner is on Sunday May 5, 2019. If you register for the dinners, please be sure you are able to attend. These events do sell out and we want all those that want to attend to have the opportunity. If you register to attend and your plans change, please let us know by emailing pycon-reg@python.org.

Cancellation Fees

Registration cancellations must be submitted in writing and received by April 19, 2019 in order to receive a refund minus the $50 cancellation fee ($25 for students). No refunds will be granted for cancellations received after April 19, 2019.

In lieu of cancellation you are able to transfer your registration to another person. For details about transferring your registration, visit the registration page.

Attendees traveling to PyCon internationally are encouraged to review our International Travel Refund Policy. This is especially important for Financial Aid recipients attending from abroad. PyCon strives to support the Python community in attending, no matter where they are traveling from.

Hotel

PyCon has contracted special rates with nearby hotels. When you complete your registration for PyCon 2019, you will be able to book a hotel reservation through our official housing bureau. This is the only way to get the conference rates. More information can be found on the Venue and Hotels page.

Note: Beware of Housing Pirates! Neither PyCon nor our official housing bureau, Conference Technology Enhancements (CTE), will be calling delegates to sell rooms. If you are contacted by an agency other than CTE offering to make your hotel reservations, we urge you to not use their services. We cannot protect you against them if you do book a reservation.

Looking for a roommate? Check out PyCon’s Room Sharing page.

Childcare

PyCon is proud to announce that we will be once again offering Childcare during the main conference days, May 3-5, 2019. Space is limited, so be sure to sign up soon.

Financial Aid

Check out the Financial Aid page to learn more about the support we provide for travel, hotel, registration, and childcare to ensure that everyone has an opportunity to attend PyCon.

More Information

Head to https://us.pycon.org/2019/registration for more details!

Stack Abuse: Python GUI Development with Tkinter: Part 3


This is the third installment of our multi-part series on developing GUIs in Python using Tkinter. Check out the links below for the other parts to this series:

Introduction

Tkinter is the de facto standard package for building GUIs in Python. In StackAbuse's first and second part of the Tkinter tutorial, we learned how to use the basic GUI building blocks to create simple interfaces.

In the last part of our tutorial, we'll take a look at a couple of shortcuts that Tkinter offers to let us effortlessly add complex and very useful features. We'll also learn about Python Mega Widgets - a toolkit, based on Tkinter, that speeds up the building of complicated interfaces.

File Dialog

Letting a user select a file on their machine is obviously a very common feature of graphical interfaces. File dialogs are usually pretty complex - they combine at least multiple buttons (like Open, Cancel, or Create New Folder) and a frame that displays the structure of our environment's directories. Based on our previous tutorials, you might assume that creating such a complicated feature with Tkinter is very difficult. Actually, it isn't. Take a look at the following example:

import tkinter  
import tkinter.filedialog

root = tkinter.Tk()

def print_path():  
    f = tkinter.filedialog.askopenfilename(
        parent=root, initialdir='C:/Tutorial',
        title='Choose file',
        filetypes=[('png images', '.png'),
                   ('gif images', '.gif')]
        )

    print(f)

b1 = tkinter.Button(root, text='Print path', command=print_path)  
b1.pack(fill='x')

root.mainloop()  

Output:

The code above is all you need to display a nice, useful File Dialog. In line 2 we import the filedialog module. Then, after creating our root window in line 4, we define a new function in line 6 (which is supposed to be executed by the button created in line 17 and packed in line 18).

Let's take a look at the print_path() function definition. In line 7, we execute the askopenfilename function, which takes a couple of arguments. The first argument, of course, is the dialog's parent widget (which in this case is our root window). Then, in the initialdir argument, we provide a location that'll be displayed in our file dialog right after it's opened. title controls the content of the dialog's title bar.

And then we have the filetypes argument, thanks to which we can specify what kind of files will be visible for the user in the file dialog. Narrowing down the file types can make the search for the desired file much faster, as well as letting the user know which types of files are accepted.

The argument to filetypes is a list of 2-element tuples. In each tuple, the first element is a string which is any description we want to set for each of the file types. The second element is where we state or list the file extensions associated with each file type (if there's only one extension, it's a string - otherwise, it's a tuple). As you can see on the output screenshot above, the user can select the displayed file type from the dropdown list in the bottom right corner of the dialog.

The askopenfilename() method returns a string which is the path of the file selected by the user. If the user decides to hit Cancel, an empty string is returned. In line 7 we assign the path to the variable f, and then, in line 15 (which is only executed after the File Dialog is closed), the path is printed in the console.
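
Since a cancelled dialog yields an empty string, it's worth guarding against it before using the path. A minimal sketch of such a guard, written as a drop-in variant of the print_path() function above:

def print_path():
    f = tkinter.filedialog.askopenfilename(parent=root, title='Choose file')
    if not f:  # the user hit Cancel, so an empty string came back
        print('No file selected')
        return
    print(f)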

Displaying Images Using Tkinter

One more interesting thing that many people might find useful to apply to their GUIs is displaying images. Let's modify the previous example a little bit.

import tkinter  
import tkinter.filedialog

root = tkinter.Tk()

def display_image():  
    f = tkinter.filedialog.askopenfilename(
        parent=root, initialdir='C:/Tutorial',
        title='Choose file',
        filetypes=[('png images', '.png'),
                   ('gif images', '.gif')]
        )

    new_window = tkinter.Toplevel(root)

    image = tkinter.PhotoImage(file=f)
    l1 = tkinter.Label(new_window, image=image)
    l1.image = image
    l1.pack()

b1 = tkinter.Button(root, text='Display image', command=display_image)  
b1.pack(fill='x')

root.mainloop()  

Output:

Let's see what changed inside the function executed by our button, now renamed to display_image. We display the File Dialog, we use the same criteria for file selection as before, and again we store the returned path in the variable f. However, after getting the file path, we don't print it in the console. Instead, we create a top-level window in line 14. Then, in line 16, we instantiate an object of the PhotoImage class by making it read the .png file that was selected by the user. The object is then stored in the image variable, which we can pass as an argument for the construction of the Label widget in line 17. In line 18, we make sure to keep a reference to the image object in order to keep it from getting cleared by Python's garbage collector. Then, in line 19, we pack our label (this time displaying an image, not text) inside the new_window.

Color Chooser

Another common feature, especially in software focused on graphics, is allowing the user to select a color from a palette. In this case, Tkinter also offers a nice, ready-to-use solution that should satisfy most of our needs concerning the color choosing feature.

import tkinter  
import tkinter.colorchooser

root = tkinter.Tk()

def color_button():  
    color = tkinter.colorchooser.askcolor(parent=root)
    print(color)
    b1.configure(bg=color[1])

b1 = tkinter.Button(root, text='Select Color', command=color_button)  
b1.pack(fill='x')

root.mainloop()  

Output:

In line 2 of the example shown above, we import the colorchooser module. We use its askcolor() method in line 7. This method, similarly to askopenfilename(), is responsible for opening a nice, complex dialog (a color chooser in this case) and returns data dependent on the user's choice. In this case, after the user picks a color from the palette and accepts their choice, the object returned to the variable color is a tuple containing two elements. The first element is a tuple that stores the values for the selected color's red, green and blue channels. The second element of the tuple is the same color specified in hexadecimal format. We can see the contents of the tuples in our console, thanks to the print() in line 8.

After we store the tuple returned by askcolor in the variable color, we use that variable in line 9 to configure the b1 button. As you already know, the bg argument is responsible for controlling the button's background color. We pass the second element of the color tuple to it (the color representation in hexadecimal format). As a result, after pressing the b1 button, the user can change its background color using a nice color chooser.
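
One hedged caveat: if the user dismisses the dialog with Cancel, askcolor returns (None, None), so a drop-in variant of color_button() could guard against that like this:

def color_button():
    color = tkinter.colorchooser.askcolor(parent=root)
    if color[1] is None:  # the dialog was cancelled, nothing to apply
        return
    b1.configure(bg=color[1])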

Message Boxes

Before moving on from Tkinter to Python Mega Widgets, it's good to mention one more feature of the Tkinter module that makes programming GUIs a bit quicker. Tkinter offers so-called Message Boxes, which are a set of simple, but widely used standard dialogs. These message boxes can be used to display a quick message, a warning, or when we need our user to make a simple yes/no decision. The following example demonstrates all message boxes offered by Tkinter:

import tkinter  
import tkinter.messagebox

root = tkinter.Tk()

def display_and_print():  
    tkinter.messagebox.showinfo("Info","Just so you know")
    tkinter.messagebox.showwarning("Warning","Better be careful")
    tkinter.messagebox.showerror("Error","Something went wrong")

    okcancel = tkinter.messagebox.askokcancel("What do you think?","Should we go ahead?")
    print(okcancel)

    yesno = tkinter.messagebox.askyesno("What do you think?","Please decide")
    print(yesno)

    retrycancel = tkinter.messagebox.askretrycancel("What do you think?","Should we try again?")
    print(retrycancel)

    answer = tkinter.messagebox.askquestion("What do you think?","What's your answer?")
    print(answer)

b1 = tkinter.Button(root, text='Display dialogs', command=display_and_print)  
b1.pack(fill='x')

root.mainloop()  

Output:

This time, our b1 button executes the function display_and_print(). The function lets 7 message boxes pop up in sequence - each is displayed after the user interacts with the previous one. The dialogs defined in lines 11 - 21 require the user to choose one of two available options - therefore, they return values based on the decisions, which are stored in their respective variables. In each case, we can pass two arguments while defining the dialogs - the first is always the dialog's title, and the second one contains its main message.

So, to start from the top. In line 7, we define a simple showinfo dialog, which is only meant to display a neutral icon, a message, and an OK button that closes it. In lines 8 and 9, we have similar, simple types of message boxes, but their icons indicate that caution from the user is required (showwarning) or that an error has occurred (showerror). Note that in each of the three cases, a different sound is played upon the dialog's appearance.

As I stated before, lines 11 - 21 contain code responsible for displaying dialogs to get the user's decision. askokcancel (line 11) returns True if the user clicks OK and False if they click Cancel. askyesno (line 14) returns True if the user clicks Yes and False if the user clicks No. askretrycancel (line 17) returns True if the user clicks Retry and False if the user clicks Cancel. askquestion is very similar to askyesno, but returns 'yes' if the user clicks Yes and 'no' if the user clicks No.

Keep in mind, that the exact appearance of the File Dialog, Color Chooser, and all Message Boxes depends on the operating system that the code is executed on, as well as on the system language.

Progress Bar

Another useful element of advanced GUIs is a Progress Bar. The following example shows a simple implementation of this feature using Tkinter:

import tkinter  
import time  
from tkinter import ttk

root = tkinter.Tk()

def start():  
    for k in range(1, 11):
        progress_var.set(k)
        print("STEP", k)
        k += 1
        time.sleep(1)
        root.update_idletasks()

b1 = tkinter.Button(root, text="START", command=start)  
b1.pack(side="left")

progress_var = tkinter.IntVar()

pb = ttk.Progressbar(root, orient="horizontal",  
                     length=200, maximum=10,
                     mode="determinate",
                     variable=progress_var)
pb.pack(side="left")

pb["value"] = 0

root.mainloop()  

Output:

The example above shows the implementation of Progressbar. It is a part of the tkinter.ttk module, which provides access to the Tk themed widget set, introduced in Tk 8.5. This is why we need to additionally import the ttk module in line 3.

Our progress bar's state will be controlled by time - the bar will progress in ten steps, executed in one-second intervals. For that purpose, we import the time module in line 2.

We define our Progressbar in line 20. We set its parent widget (root) and give it a "horizontal" orientation and a length of 200 pixels. Then we set maximum to 10 - when the variable assigned to the progress bar through the variable argument (in our case, the progress_var variable) reaches this value, the bar is completely filled. We set the mode to "determinate", which means that our code will move the indicator to precisely defined points based on progress_var's value.

The progress_var integer variable that will control the bar's progress is defined in line 18. In line 26, using a dictionary-like assignment, we set the initial value of the progress bar to 0.

In line 15, we create a Button that's supposed to start the clock controlling our bar's progress by executing the start() function, defined between lines 7 and 13. There, we have a simple for loop, that will iterate through values between 1 and 10. With each iteration, the progress_var value is updated and increased by 1. In order to be able to observe the progress clearly, we wait for one second during each iteration (line 12). We then use the root window's update_idletasks() method in line 13, in order to let the program update the appearance of the progress bar even though we're still executing the for loop (so, we're technically still in a single mainloop() iteration).

Python Mega Widgets

If you use Tkinter extensively in your projects, I think it's a good idea to consider incorporating Python Mega Widgets in your code. Python Mega Widgets is a toolkit based on Tkinter that offers a set of megawidgets: complex, functional and relatively aesthetically pleasing widgets made out of simpler Tkinter widgets. What's great about this package, which you can download here, is that the general philosophy of defining and orienting widgets is the same as in the case of Tkinter, and you can mix both libraries in your code. Let's finish our tutorial by scratching the surface of this powerful toolkit.

EntryField Widget

One of the most useful widgets of the Pmw package is EntryField. Let's analyze the following example to see what it's capable of:

import tkinter  
import Pmw

root = tkinter.Tk()

def color_entry_label():  
    color = entry_color.get()
    entry_number.configure(label_bg=color)

entry_color = Pmw.EntryField(root, labelpos="w",  
                                 label_text="First name:",
                                 entry_bg="white",
                                 entry_width=15,
                                 validate="alphabetic")

entry_number = Pmw.EntryField(root, labelpos="w",  
                                 label_text="Integer:",
                                 entry_bg="white",
                                 entry_width=15,
                                 validate="integer")

ok_button = tkinter.Button(root, text="OK", command=color_entry_label)

entry_color.pack(anchor="e")  
entry_number.pack(anchor="e")  
ok_button.pack(fill="x")

root.mainloop()  

Output:

This time, we have to not only import tkinter, but also our freshly installed Pmw package (line 2). As always, we use the Tk class to initiate our root window.

In lines 10-14 and 16-20 we define two Pmw.EntryField widgets. An EntryField is a functional mix of Tkinter's Label and Entry, with some addition of useful functionalities. The first argument for the widget's initialization is, of course, the parent widget. The label_text, entry_bg and entry_width control some self-explanatory aspects of the widget's appearance. The most interesting argument in our example is probably the validate argument. Here, we can decide what kind of data the user can put inside the field.

In the entry_color field, we expect a string of letters, so we set validate to "alphabetic". In the entry_number widget, we expect an integer, and that's what we set the validate argument value to. This way, if we try to put a number inside the former and a letter inside the latter, the symbols simply won't appear in the widgets and a system sound will be played, informing us that we're trying to do something wrong. Also, if the widget expects a certain type of data and its contents are in conflict with this condition the moment it's initialized, the EntryField will be highlighted red.

As you can see in our example, right after we display our window, the first entry field is white, and the second one is red. This is because an empty string (default content of the entries) falls into the category of "alphabetic" entities, but it's definitely not an integer.

The button defined in line 26 executes the color_entry_label() command defined between lines 6 and 8. The function's goal is to paint the entry_number widget label's background according to the contents of the entry_color widget. In line 7, the get() method is used to extract the contents of the entry_color EntryField. Then, naturally, the configure() method is used in order to change the appearance of the entry_number widget. Note that in order to change characteristics of Pmw widgets that are composed of several simpler widgets, we have to specify which sub-widget we want to configure (in our case, it's the label - this is why we configure the label_bg and not, let's say, the entryfield_bg).

The EntryField widget might not be visually very impressive, but even this simple example illustrates the potential of mega-widgets - building this kind of self-validating interface component would require much more code if we tried to achieve the same effect using plain Tkinter. I encourage you to explore the other powerful mega-widgets described in the toolkit's documentation.

Conclusion

Tkinter is one of many available GUI libraries for Python, but its great advantage is that it's considered a Python standard and still distributed, by default, with all Python distributions. I hope you enjoyed this little tutorial and now have a good grasp on building interfaces for users that might get scared off by command-line operated software.

Codementor: Introducing a simple and intuitive Python API for UCI machine learning repository

Introducing a simple and intuitive API for the UCI machine learning portal, where users can easily look up a data set description, search for a particular data set they are interested in, and even download datasets categorized by size or machine learning task.

Tryton News: Security Release for issue7766


@ced wrote:

Synopsis

A vulnerability in trytond, the core package of Tryton, has been found by Cédric Krier.
Issue7766 shows that it is possible for an authenticated user to guess the value of a field for which they have no access rights, whether the restriction is at the model or the field level. The procedure is to run dichotomous search queries on the model, using a domain clause of the form field equals value, until the search returns the record id.

Impact

CVSS v3.0 Base Score: 6.5

  • Attack Vector: Network
  • Attack Complexity: Low
  • Privileges Required: Low
  • User Interaction: None
  • Scope: Unchanged
  • Confidentiality: High
  • Integrity: None
  • Availability: None

Workaround

There are no known workarounds.

Resolution

All affected users should upgrade trytond to the latest version.
Affected versions per series:

  • 5.0: <=5.0.0
  • 4.8: <=4.8.4
  • 4.6: <=4.6.8
  • 4.4: <=4.4.13
  • 4.2: <=4.2.15
  • 4.0: <=4.0.19

Non affected versions per series:

  • 5.0: >=5.0.1
  • 4.8: >=4.8.5
  • 4.6: >=4.6.9
  • 4.4: >=4.4.14
  • 4.2: >=4.2.16
  • 4.0: >=4.0.20

Reference

Concern?

Any security concerns should be reported on the bug-tracker at https://bugs.tryton.org/ with the type security.

Remarks

The module stock_supply received a fix that breaks the policy of making no XML data changes within a series. The fix gives read access to the “Stock Administrator” group for purchase requests, so the supply wizard can run without error.


Mike Driscoll: Jupyter Notebook 101 Released!


My latest book, Jupyter Notebook 101 is now officially released.

You can purchase it at the following retailers:

You can also download a sample of the book from Leanpub. Get it for $9.99 on Leanpub for a limited time only!

Jupyter Notebook 101 will teach you all you need to know to create and use Notebooks effectively. You can use Jupyter Notebook to help you learn to code, create presentations, and make beautiful documentation.

The Jupyter Notebook is used by the scientific community to demonstrate research in an easy-to-replicate manner.

You will learn the following in Jupyter Notebook 101:

  • How to create and edit Notebooks
  • How to add styling, images, graphs, etc.
  • How to configure Notebooks
  • How to export your Notebooks to other formats
  • Notebook extensions
  • Using Notebooks for presentations
  • Notebook Widgets
  • and more!

eGenix.com: PyDDF Python Herbst Sprint 2018


The following announcement is for a Python sprint in Düsseldorf, Germany; it was originally published in German.

Announcement

PyDDF Python Herbst Sprint 2018 in Düsseldorf

Saturday, 17.11.2018, 10:00-18:00
Sunday, 18.11.2018, 10:00-18:00

trivago N.V., Kesselstrasse 5-7, 40221 Düsseldorf

Information

The Python Meeting Düsseldorf (PyDDF) is organizing a Python sprint weekend with the kind support of trivago N.V.

The sprint takes place on the weekend of 17./18.11.2018 at the trivago offices in the Medienhafen in Düsseldorf (note: no longer at Karl-Arnold-Platz). The following topics have already been suggested as starting points:
    • SMS Forwarder - forward SMS to an email address or chat
    • Python on a Raspberry Pi cluster
    • Django for Runners
Of course, every participant can propose and work on other topics as well.

Registration and further information

Everything else, including registration, can be found on the sprint page:

IMPORTANT: Without registration we cannot arrange a badge for building access, so registering spontaneously on the day of the sprint will probably not work. Please be sure to register with your full name by Friday, 16.11. at the latest.

Participants should also sign up to the PyDDF mailing list, since that is where we coordinate:

About the Python Meeting Düsseldorf

The Python Meeting Düsseldorf is a regular event in Düsseldorf aimed at Python enthusiasts from the region.

Our PyDDF YouTube channel gives a good overview of the talks; we publish videos of the talks there after each meeting.

The meeting is organized by eGenix.com GmbH, Langenfeld, in cooperation with Clark Consulting & Research, Düsseldorf.

Have fun!

Marc-Andre Lemburg, eGenix.com

        PyBites: Automating PyBites Review Post Using Github API and collections.defaultdict


        In this post I share a quick script I produced last week to automate a portion of our review post. I used the Github API and the collections.defaultdict.

        The goal of this script and post is to show you how to convert open PRs of our challenges repo into markdown for our weekly review post.

        Setting the stage

        First I am importing the libraries to use and some constants:

from collections import defaultdict
import re

import requests

GH_API_PULLS_ENDPOINT = 'https://api.github.com/repos/pybites/challenges/pulls'
PR_LINK = "https://github.com/pybites/challenges/pull/{id}"
CHALLENGE_LINK = "http://codechalleng.es/challenges/{id}"
EXTRACT_TEMPLATE = re.compile(r'.*learn\?\):\s+\[(.*?)\]Other.*')

        We will use the EXTRACT_TEMPLATE regex in a bit. I had to escape the ?, ), [ and ], because they have special meaning in regex land. Here I want to match the literal ones which are part of the PR template.

        Parsing the review template

        Each PR has a fixed template we use to have developers document their learning and provide us feedback. Here is my last submission for example:

        Difficulty level (1-10): [3]
        Estimated time spent (hours): [1]
        Completed (yes/no): [No]
        I stretched my coding skills (if yes what did you learn?): [Nice one to get back into Pandas, blabla ...]
        Other feedback (what can we improve?): []
        

        I defined a helper to parse the learning part ("what did you learn") from this template. As it might span multiple lines, I cannot just index a list, hence I used the EXTRACT_TEMPLATE regex to parse the full string.

        The nice thing about re.compile is that you can define your regex once (here in a constant) and call regex methods like sub on it. The \1 is the user's learning part I am interested in, which I captured using parenthesis in the regular expression.

        Before anything else I make sure we're dealing with a single-line string by taking the \r\ns out (you can probably also use re.M = multi-line matching, but that does not always work for me):

        def get_learning(template):
            """Helper to extract learning from PR template"""
            learning = ''.join(template.split('\r\n'))
            return EXTRACT_TEMPLATE.sub(r'\1', learning).strip()
        

By the way, I am not sure why I got a Windows-like \r, but it does give me the opportunity to highlight two things here:

        1. The first iteration of this script I did in a Jupyter notebook which is a great tool to play around with Python and document your progress!

        2. Another great way to inspect a data structure when you are writing a script like this, is to pop a quick import pdb;pdb.set_trace() into your code (since Python 3.7 we can actually use breakpoint()).
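
To see the capture-and-substitute trick from get_learning() in isolation, here is a tiny hedged demo (the sample string is made up but follows the PR template above):

import re

# Same idea as EXTRACT_TEMPLATE: capture the learning part and replace
# the whole string with just that captured group (\1)
TEMPLATE = re.compile(r'.*learn\?\):\s+\[(.*?)\]Other.*')
line = ('I stretched my coding skills (if yes what did you learn?): '
        '[Learned regex!]Other feedback (what can we improve?): []')
print(TEMPLATE.sub(r'\1', line))  # Learned regex!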

        Github API and collections.defaultdict

To pull the open PRs from GitHub I don't need an API key. Also notice the nice way you can chain operations in Python, and the fact that requests has a convenient json method. This is as expressive as it gets, no?

        open_pulls = requests.get(GH_API_PULLS_ENDPOINT).json()
        

        This is part of the get_open_prs function in which I loop through the pull requests and add each (PR number, learning) tuple into a defaultdict which I return. The nice thing about defaultdict is that it prevents having to write code to look for a key before inserting a value into the dictionary:

        def get_open_prs():
            """Parse GH API pulls JSON into a dict of keys = code challenge ids
            and values = lists of (pr_number, learning) tuples"""
            open_pulls = requests.get(GH_API_PULLS_ENDPOINT).json()
            prs = defaultdict(list)
        
            for pull in open_pulls:
                pr_number = pull['number']
        
                pcc = pull['head']['ref'].upper()
                learning = get_learning(pull['body'])
                if learning:
                    prs[pcc].append((pr_number, learning))
        
            return prs
        

        I used a dictionary here to sort the code challenge ids (or "PCCs") as we'll see next.
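
As an aside, here is a tiny standalone demo of why defaultdict(list) is convenient here: appending to a missing key just works, with no setdefault or membership check needed (the sample values are made up):

from collections import defaultdict

prs = defaultdict(list)
# No need to check whether the key exists before appending
prs['PCC01'].append((428, 'learned dictionary comprehensions'))
prs['PCC01'].append((427, 'testing'))
print(prs['PCC01'])
# [(428, 'learned dictionary comprehensions'), (427, 'testing')]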

        Print markdown compatible with our review post

        Lastly I print the resulting prs dictionary sorting on key to show all PRs per challenge in ascending order (I needed the <!-- --> to visually separate blockquotes well):

        def print_review_markdown(prs):
            """Return markdown for review post, e.g.
            https://pybit.es/codechallenge57_review.html ->
            Read Code for Fun and Profit"""
            for pcc, prs in sorted(prs.items()):
                challenge_link = CHALLENGE_LINK.format(id=pcc.strip('PCC'))
                print(f'\n#### [{pcc}]({challenge_link})')
        
                for i, (pr_number, learning) in enumerate(prs):
                    if i > 0:
                        print('\n<!-- -->')
                    pr_link = PR_LINK.format(id=pr_number)
                    print(f'\n> {learning} - [PR]({pr_link})')
        

        And I have my main block to call the two functions:

        if __name__ == '__main__':
            prs = get_open_prs()
            print_review_markdown(prs)
        

        Running the script

        You can check out the complete script in our blog code repo. Here is when I run it (output changes depending on the current open challenge PRs):

        $  python prs.py
        
#### [PCC01](http://codechalleng.es/challenges/01)

> Before this exercise I never came across dictionary comprehensions. A bit confusing at first! - [PR](https://github.com/pybites/challenges/pull/428)

<!-- -->

> testing - [PR](https://github.com/pybites/challenges/pull/427)

#### [PCC03](http://codechalleng.es/challenges/03)

> - Learned about SequenceMatcher. Great thing.
> - Started to think about how tests actually work, since I did get the results from the website but could not manage to pass the tests 8()
> - Heard about nltk (looks interesting). - [PR](https://github.com/pybites/challenges/pull/423)

#### [PCC16](http://codechalleng.es/challenges/16)

> I learn how to make request to remote database (in this project used RIPE DB) and how to parse JSON output from DB - [PR](https://github.com/pybites/challenges/pull/426)

We love automated scripts because the time saved each week easily compounds. It's also a nice way to hone your Python skills, so I encourage you to always find opportunities to write these kinds of utilities.

        Feel free to share use cases in the comments below or on our Slack which you can join via our platform.


        Keep Calm and Code in Python!

        -- Bob


        pythonwise: direnv

        I use the command line a lot. Some projects require different settings, say Python virtual environment, GOPATH for installing go packages and more.

I'm using direnv to help with settings per project in the terminal. For every project I have a .envrc file which specifies the required settings; this file is automatically loaded once I change directory into the project directory or any of its subdirectories.

        You'll need the following in your .zshrc

        if whence direnv > /dev/null; then
            eval "$(direnv hook zsh)"
        fi

Every time you create or change your .envrc, you'll need to run direnv allow to validate it and make sure it's loaded. (If you've made changes and want to test them, run "cd .".)

        Here are some .envrc examples for various scenarios:

        Python + pipenv

        source $(pipenv --venv)/bin/activate

        Go

        GOPATH=$(pwd | sed s"#/src/.*##")
        PATH=${GOPATH}/bin:${PATH}

This assumes your project's path looks like /path/to/project/src/github.com/project.

        If you're using the new go modules (in 1.11+), you probably don't need this.

        Python + virtualenv

        source venv/bin/activate

        Python + conda

        source activate env-name

        Replace env-name with the name of your conda environment.

        gamingdirectional: Pygame loads image and background graphic on game scene


After a few days of rest, today I have continued my pygame project, and from today on I will keep working on my new pygame projects without stopping. Today I created two game sprite classes to render the player and the background on the game scene. The player sprite class is almost the same as the background sprite class; the only reason I have created two game sprite...

        Source

        Stack Abuse: Time Series Analysis with LSTM using Python's Keras Library


        Introduction

        Time series analysis refers to the analysis of change in the trend of the data over a period of time. Time series analysis has a variety of applications. One such application is the prediction of the future value of an item based on its past values. Future stock price prediction is probably the best example of such an application. In this article, we will see how we can perform time series analysis with the help of a recurrent neural network. We will be predicting the future stock prices of the Apple Company (AAPL), based on its stock prices of the past 5 years.

        Dataset

        The data that we are going to use for this article can be downloaded from Yahoo Finance. For training our algorithm, we will be using the Apple stock prices from 1st January 2013 to 31 December 2017. For the sake of prediction, we will use the Apple stock prices for the month of January 2018. So in order to evaluate the performance of the algorithm, download the actual stock prices for the month of January 2018 as well.

        Let's now see how our data looks. Open the Apple stock price training file that contains data for five years. You will see that it contains seven columns: Date, Open, High, Low, Close, Adj Close and Volume. We will be predicting the opening stock price, therefore we are not interested in the rest of the columns.

        If you plot the opening stock prices against the date, you will see the following plot:

[Plot: Apple opening stock prices, January 2013 - December 2017]

        You can see that the trend is highly non-linear and it is very difficult to capture the trend using this information. This is where the power of LSTM can be utilized. LSTM (Long Short-Term Memory network) is a type of recurrent neural network capable of remembering the past information and while predicting the future values, it takes this past information into account.

        Enough of the preliminaries, let's see how LSTM can be used for time series analysis.

        Predicting Future Stock Prices

Stock price prediction is similar to any other machine learning problem where we are given a set of features and we have to predict a corresponding value. We will perform the same steps as we do to solve any other machine learning problem. Follow these steps:

        Import Libraries

        The first step, as always is to import the required libraries. Execute the following script to do so:

        import numpy as np  
        import matplotlib.pyplot as plt  
        import pandas as pd  
        

        Import Dataset

        Execute the following script to import the data set. For the sake of this article, the data has been stored in the Datasets folder, inside the "E" drive. You can change the path accordingly.

        apple_training_complete = pd.read_csv(r'E:\Datasets\apple_training.csv')  
        

        As we said earlier, we are only interested in the opening price of the stock. Therefore, we will filter all the data from our training set and will retain only the values for the Open column. Execute the following script:

        apple_training_processed = apple_training_complete.iloc[:, 1:2].values  
        

        Data Normalization

As a rule of thumb, whenever you use a neural network, you should normalize or scale your data. We will use the MinMaxScaler class from the sklearn.preprocessing library to scale our data between 0 and 1. The feature_range parameter is used to specify the range of the scaled data. Execute the following script:

        from sklearn.preprocessing import MinMaxScaler  
        scaler = MinMaxScaler(feature_range = (0, 1))
        
        apple_training_scaled = scaler.fit_transform(apple_training_processed)  
        

        Convert Training Data to Right Shape

As I said earlier, in time series problems we have to predict a value at time T based on the data from days T-N, where N can be any number of steps. In this article, we are going to predict the opening stock price based on the opening stock prices for the past 60 days. I have tried and tested different numbers and found that the best results are obtained when the past 60 time steps are used. You can try different numbers and see how your algorithm performs.

        Our feature set should contain the opening stock price values for the past 60 days while the label or dependent variable should be the stock price at the 61st day. Execute the following script to create feature and label set.

        features_set = []  
        labels = []  
        for i in range(60, 1260):  
            features_set.append(apple_training_scaled[i-60:i, 0])
            labels.append(apple_training_scaled[i, 0])
        

In the script above we create two lists: features_set and labels. There are 1260 records in the training data. We execute a loop that starts from the 61st record; in each iteration it appends the previous 60 records to the features_set list and the current record to the labels list.

        We need to convert both the feature_set and the labels list to the numpy array before we can use it for training. Execute the following script:

        features_set, labels = np.array(features_set), np.array(labels)  
        

In order to train the LSTM on our data, we need to convert the data into the shape accepted by the LSTM: a three-dimensional format. The first dimension is the number of samples - 1200 in our case, since the loop above produced 1260 - 60 = 1200 windows. The second dimension is the number of time steps, which is 60, while the last dimension is the number of indicators. Since we are only using one feature, i.e. Open, the number of indicators will be one. Execute the following script:

        features_set = np.reshape(features_set, (features_set.shape[0], features_set.shape[1], 1))  
        
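As a quick sanity check (a hedged addition, not part of the original tutorial), you can print the shapes the LSTM will see; the three dimensions are (samples, time steps, features):

print(features_set.shape)  # (1200, 60, 1) -> samples, time steps, features
print(labels.shape)        # (1200,)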

        Training The LSTM

We have preprocessed our data and have converted it into the desired format. Now it is time to create our LSTM. The LSTM model that we are going to create will be a sequential model with multiple layers. We will add four LSTM layers to our model followed by a dense layer that predicts the future stock price.

        Let's first import the libraries that we are going to need in order to create our model:

        from keras.models import Sequential  
        from keras.layers import Dense  
        from keras.layers import LSTM  
        from keras.layers import Dropout  
        

        In the script above we imported the Sequential class from keras.models library and Dense, LSTM, and Dropout classes from keras.layers library.

As a first step, we need to instantiate the Sequential class. This will be our model class and we will add LSTM, Dropout and Dense layers to this model. Execute the following script:

        model = Sequential()  
        
        Creating LSTM and Dropout Layers

        Let's add LSTM layer to the model that we just created. Execute the following script to do so:

        model.add(LSTM(units=50, return_sequences=True, input_shape=(features_set.shape[1], 1)))  
        

        To add a layer to the sequential model, the add method is used. Inside the add method, we passed our LSTM layer. The first parameter to the LSTM layer is the number of neurons or nodes that we want in the layer. The second parameter is return_sequences, which is set to true since we will add more layers to the model. The first parameter to the input_shape is the number of time steps while the last parameter is the number of indicators.

        Let's now add a dropout layer to our model. Dropout layer is added to avoid over-fitting, which is a phenomenon where a machine learning model performs better on the training data compared to the test data. Execute the following script to add dropout layer.

        model.add(Dropout(0.2))  
        

        Let's add three more LSTM and dropout layers to our model. Run the following script.

        model.add(LSTM(units=50, return_sequences=True))  
        model.add(Dropout(0.2))
        
        model.add(LSTM(units=50, return_sequences=True))  
        model.add(Dropout(0.2))
        
        model.add(LSTM(units=50))  
        model.add(Dropout(0.2))  
        
        Creating Dense Layer

        To make our model more robust, we add a dense layer at the end of the model. The number of neurons in the dense layer will be set to 1 since we want to predict a single value in the output.

        model.add(Dense(units = 1))  
        
        Model Compilation

Finally, we need to compile our LSTM before we can train it on the training data. The following script compiles our model.

        model.compile(optimizer = 'adam', loss = 'mean_squared_error')  
        

        We call the compile method on the Sequential model object which is "model" in our case. We use the mean squared error as loss function and to reduce the loss or to optimize the algorithm, we use the adam optimizer.

        Algorithm Training

        Now is the time to train the model that we defined in the previous few steps. To do so, we call the fit method on the model and pass it our training features and labels as shown below:

        model.fit(features_set, labels, epochs = 100, batch_size = 32)  
        

        Depending upon your hardware, model training can take some time.
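
Since training can take a while, it may be worth persisting the trained model afterwards; this is a hedged addition rather than part of the original tutorial, using Keras' standard save/load API:

# Save the trained model so it doesn't have to be retrained every run
model.save('apple_lstm.h5')

# Later, reload it instead of calling fit() again:
# from keras.models import load_model
# model = load_model('apple_lstm.h5')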

        Testing our LSTM

We have successfully trained our LSTM; now is the time to test the performance of our algorithm on the test set by predicting the opening stock prices for the month of January 2018. However, as we did with the training data, we need to convert our test data into the right format.

        Let's first import our test data. Execute the following script:

        apple_testing_complete = pd.read_csv(r'E:\Datasets\apple_testing.csv')  
        apple_testing_processed = apple_testing_complete.iloc[:, 1:2].values  
        

        In the above script, we import our test data and as we did with the training data, we removed all the columns from the test data except the column that contains opening stock prices.

        If the opening stock prices for the month of January 2018 are plotted against the dates, you should see the following graph.

[Plot: Apple opening stock prices, January 2018]

You can see that the trend is highly non-linear. Overall, the stock prices see a small rise at the start of the month followed by a downward trend at the end of the month, with slight increases and decreases in the stock prices in between. It is extremely difficult to forecast such a trend. Let's see if the LSTM we trained is actually able to predict such a trend.

        Converting Test Data to Right Format

For each day of January 2018, we want our feature set to contain the opening stock prices for the previous 60 days. For the 1st of January, those previous 60 days fall within the training data, so we need to concatenate our training data and test data before preprocessing. Execute the following script to do so:

        apple_total = pd.concat((apple_training_complete['Open'], apple_testing_complete['Open']), axis=0)  
        

Now let's prepare our test inputs. The input for each day should contain the opening stock prices for the previous 60 days. That means we need the opening stock prices for the 20 trading days of January 2018, plus the last 60 stock prices from the training set. Execute the following script to fetch those 80 values.

        test_inputs = apple_total[len(apple_total) - len(apple_testing_complete) - 60:].values  
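
As a quick check (my addition, not in the original), you can verify that exactly 80 values were fetched:

# Expect 20 January trading days + the preceding 60 days = 80 values.
print(test_inputs.shape)  # (80,)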
        

        As we did for the training set, we need to scale our test data. Execute the following script:

        test_inputs = test_inputs.reshape(-1,1)  
        test_inputs = scaler.transform(test_inputs)  
        

We have scaled our data; now let's prepare our final test input set, in which each sample contains the previous 60 stock prices. Execute the following script:

        test_features = []  
        for i in range(60, 80):  
            test_features.append(test_inputs[i-60:i, 0])
        

        Finally, we need to convert our data into the three-dimensional format which can be used as input to the LSTM. Execute the following script:

        test_features = np.array(test_features)  
        test_features = np.reshape(test_features, (test_features.shape[0], test_features.shape[1], 1))  
        
        Making Predictions

        Now is the time to see the magic. We preprocessed our test data and now we can use it to make predictions. To do so, we simply need to call the predict method on the model that we trained. Execute the following script:

        predictions = model.predict(test_features)  
        

Since we scaled our data, the predictions made by the LSTM are also scaled. We need to reverse the scaled predictions back to their actual values. To do so, we can use the inverse_transform method of the scaler object we created during training. Take a look at the following script:

        predictions = scaler.inverse_transform(predictions)  
        

Finally, let's see how well our algorithm predicted the future stock prices. Execute the following script:

        plt.figure(figsize=(10,6))  
        plt.plot(apple_testing_processed, color='blue', label='Actual Apple Stock Price')  
        plt.plot(predictions , color='red', label='Predicted Apple Stock Price')  
        plt.title('Apple Stock Price Prediction')  
        plt.xlabel('Date')  
        plt.ylabel('Apple Stock Price')  
        plt.legend()  
        plt.show()  
        

        The output looks like this:

[Plot: actual vs. predicted Apple stock prices for January 2018]

In the output, the blue line represents the actual stock prices for the month of January 2018, while the red line represents the predicted stock prices. You can clearly see that our algorithm has been able to capture the overall trend: the predicted prices also show a bullish trend at the beginning, followed by a bearish (downward) trend at the end. Amazing, isn't it?
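
To put a number on that visual impression (my addition, not part of the original article), you can compute the root mean squared error between the actual and predicted prices:

# A minimal sketch: quantify the average prediction error in dollars.
import numpy as np
rmse = np.sqrt(np.mean((apple_testing_processed - predictions) ** 2))
print('RMSE:', rmse)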

        Conclusion

A long short-term memory network (LSTM) is one of the most commonly used neural networks for time series analysis. The ability of an LSTM to remember previous information makes it well suited to such tasks. In this article, we saw how we can use an LSTM for Apple stock price prediction. I would suggest that you download stock data for some other company, such as Google or Microsoft, from Yahoo Finance and see if your algorithm is able to capture the trends.
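
If you would rather fetch such data programmatically than download CSV files by hand, one option is the third-party yfinance package (my suggestion; the article itself works from manually downloaded CSVs):

# A minimal sketch (assumed): download Microsoft opening prices with yfinance.
# pip install yfinance
import yfinance as yf
msft = yf.download('MSFT', start='2013-01-01', end='2018-01-01')
msft_open = msft[['Open']].values  # same shape as the article's training array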

        Vasudev Ram: Quick-and-dirty IPC with Python, JSON and pyperclip

        By Vasudev Ram



[Image: an IBM Blue Gene supercomputer - attribution link]

        Hi, readers,

        Some time ago I had written this post.

        pyperclip, a cool Python clipboard module


        The pyperclip module allows you to programmatically copy/paste text to/from the system clipboard.
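
For instance, a round trip through the clipboard is a two-liner (a minimal sketch of mine, not from the original post):

# Copy a string to the system clipboard, then read it back.
import pyperclip
pyperclip.copy('hello from Python')
print(pyperclip.paste())  # hello from Python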

        Recently, I realized that pyperclip's copy and paste functionality could be used to create a sort of rudimentary IPC (Inter Process Communication) between two Python programs running on the same machine.

        So I whipped up a couple of small programs, a sender and a receiver, as a proof of concept of this idea.

        Here is the sender, pyperclip_ipc_sender.py:
'''
pyperclip_ipc_sender.py
Purpose: To send JSON data to the clipboard from
a Python object.
Author: Vasudev Ram
Copyright 2018 Vasudev Ram
Web site: https://vasudevram.github.io
Blog: https://jugad2.blogspot.com
Training: https://jugad2.blogspot.com/p/training.html
Product store: https://gumroad.com/vasudevram
'''

from __future__ import print_function
import pyperclip as ppc
import json
import pprint

def generate_data():
    # Build a small sample dict to send across the clipboard.
    d = {"North": 1000, "South": 2000, "East": 3000, "West": 4000}
    return d

def send_data(d):
    # Serialize the dict to JSON and copy it to the system clipboard.
    ppc.copy(json.dumps(d))

def main():
    print("In pyperclip_ipc_sender.py")
    print("Generating data")
    d = generate_data()
    print("data is:")
    pprint.pprint(d)
    print("Copying data to clipboard as JSON")
    send_data(d)

main()
        And here is the receiver, pyperclip_ipc_receiver.py:
'''
pyperclip_ipc_receiver.py
Purpose: To receive JSON data from the clipboard into
a Python object and print it.
Author: Vasudev Ram
Copyright 2018 Vasudev Ram
Web site: https://vasudevram.github.io
Blog: https://jugad2.blogspot.com
Training: https://jugad2.blogspot.com/p/training.html
Product store: https://gumroad.com/vasudevram
'''

from __future__ import print_function
import pyperclip as ppc
import json
import pprint

def receive_data():
    # Paste the clipboard text and parse it as JSON into a Python object.
    d = json.loads(ppc.paste())
    return d

def main():
    print("In pyperclip_ipc_receiver.py")
    print("Pasting data from clipboard to Python object")
    data = receive_data()
    print("data is:")
    pprint.pprint(data)

main()
        First I ran the sender in one command window:
$ python pyperclip_ipc_sender.py
In pyperclip_ipc_sender.py
Generating data
data is:
{'East': 3000, 'North': 1000, 'South': 2000, 'West': 4000}
Copying data to clipboard as JSON
        Then I ran the receiver in another command window:
        $ python pyperclip_ipc_receiver.py
In pyperclip_ipc_receiver.py
        Pasting data from clipboard to Python object
        data is:
        {u'East': 3000, u'North': 1000, u'South': 2000, u'West': 4000}
        You can see that the receiver has received the same data that was sent by the sender - via the clipboard.

        A few points about this technique:

        - If you run the receiver without running the sender, or even before running the sender, the receiver will pick up whatever data was last put into the clipboard, either by some other program, or manually by you. For example, if you selected some text in an editor and then pressed Ctrl-C (to copy the selected text to the clipboard), the receiver would get that text (if it was JSON text - see two points below). However, that is not a bug, but a feature :)

        - Obviously, this is not meant for production use, due to potential security issues. It's just a toy application as a proof of concept of this idea.

- Since the sender converts the Python object to JSON before copying it to the clipboard with pyperclip, the receiver also expects the data it pastes from the clipboard to be JSON. So if you instead copy some non-JSON data to the clipboard and then run the receiver, you will get an error. I tried this, and got:

        ValueError: No JSON object could be decoded

        To handle this gracefully, you can trap the ValueError (and maybe other kinds of exceptions that Python's json library may raise), with a try-except block around the code that pastes the data from the clipboard. You can then either tell the user to try again, or print/log the error and exit, depending on whether the receiver was an interactive or a non-interactive program.
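
A minimal sketch of such a defensive receiver (my illustration, reusing the ppc and json imports from the receiver above):

def receive_data():
    # Guard against non-JSON clipboard contents.
    try:
        return json.loads(ppc.paste())
    except ValueError as e:
        print('Clipboard does not contain valid JSON:', e)
        return None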

        The image at the top of the post is of an IBM Blue Gene supercomputer.

        From the Wikipedia article about it:

        [ The project created three generations of supercomputers, Blue Gene/L, Blue Gene/P, and Blue Gene/Q. Blue Gene systems have often led the TOP500[1] and Green500[2] rankings of the most powerful and most power efficient supercomputers, respectively. Blue Gene systems have also consistently scored top positions in the Graph500 list.[3] The project was awarded the 2009 National Medal of Technology and Innovation.[4] ]

        - Enjoy.



        - Vasudev Ram - Online Python training and consulting

I conduct online courses on Python programming, Unix / Linux commands and shell scripting, and SQL programming and database design, with course material and personal coaching sessions.

        The course details and testimonials are here.

        Contact me for details of course content, terms and schedule.


        Codementor: Quicksort tutorial: Python implementation with line by line explanation

A reference Quicksort implementation with an intuitive explanation as well as a line-by-line breakdown. This tutorial will help you get unstuck on the concept of Quicksort and let you implement your own version.
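
For reference, a typical Python Quicksort along the lines the tutorial describes (a minimal sketch of mine, not the Codementor article's code):

def quicksort(items):
    # Base case: lists of length 0 or 1 are already sorted.
    if len(items) <= 1:
        return items
    pivot = items[len(items) // 2]
    # Partition around the pivot, then sort the outer partitions recursively.
    left = [x for x in items if x < pivot]
    middle = [x for x in items if x == pivot]
    right = [x for x in items if x > pivot]
    return quicksort(left) + middle + quicksort(right)

print(quicksort([3, 6, 8, 10, 1, 2, 1]))  # [1, 1, 2, 3, 6, 8, 10]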