Channel: Planet Python

Hynek Schlawack: typing.Protocol Across Python Versions


How to seamlessly support typing.Protocol on Python versions older and newer than 3.8. At the same time.
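The canonical pattern (a sketch — the article's own recommendation may differ in the details) is to fall back to the typing_extensions backport on interpreters older than 3.8:

```python
import sys

# On 3.8+ Protocol lives in typing; older interpreters need the
# typing_extensions backport (an extra dependency).
if sys.version_info >= (3, 8):
    from typing import Protocol, runtime_checkable
else:
    from typing_extensions import Protocol, runtime_checkable

@runtime_checkable
class Closable(Protocol):
    def close(self) -> None: ...

class File:
    def close(self) -> None:
        pass

# Any object with a matching close() method satisfies the protocol.
print(isinstance(File(), Closable))  # True
```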


ItsMyCode: [Solved] ImportError: No module named matplotlib.pyplot


The ImportError: No module named matplotlib.pyplot occurs if you have not installed the Matplotlib library and are trying to run a script that contains matplotlib-related code. Another possibility is that you are not importing matplotlib.pyplot correctly in your Python code.

In this tutorial, let’s look at installing the matplotlib module correctly on different operating systems and solve the No module named matplotlib.pyplot error.

ImportError: No module named matplotlib.pyplot

Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python.

Matplotlib is not a built-in module (it doesn’t come with the default Python installation), so you need to install it explicitly using the pip installer before you can use it.

If you are looking for how to install pip, or if you are getting an error while installing it, check out pip: command not found to resolve the issue.

Matplotlib releases are available as wheel packages for macOS, Windows and Linux on PyPI. Install it using pip:

Install Matplotlib on macOS/Linux

The recommended way to install the matplotlib module is using pip (or pip3 for Python 3), assuming you already have pip installed.

Using Python 2

$ sudo pip install matplotlib

Using Python 3

$ sudo pip3 install matplotlib

Alternatively, if you have easy_install on your system, you can install matplotlib using the command below.

Using easy install

$ sudo easy_install -U matplotlib

For CentOS

$ yum install python-matplotlib

For Ubuntu

To install matplotlib module on Debian/Ubuntu :

$ sudo apt-get install python3-matplotlib

Install Matplotlib in Windows

On Windows, you can use pip or pip3, depending on your Python version, to install the matplotlib module.

$ pip3 install matplotlib

If you have not added pip to your PATH environment variable, you can run the command below instead, which uses the Python launcher to invoke pip and install the matplotlib module.

$ py -m pip install matplotlib

Install Matplotlib in Anaconda

Matplotlib is available via the Anaconda main channel and can be installed using the following command.

$ conda install matplotlib

You can also install it via the conda-forge community channel by running the below command.

$ conda install -c conda-forge matplotlib
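Whichever install method you use, you can quickly check whether the installation is visible to the interpreter you are running. Here is a minimal sketch:

```python
import importlib.util

# Check whether matplotlib can be found on the current interpreter's path.
spec = importlib.util.find_spec("matplotlib")
if spec is None:
    print("matplotlib is not installed for this interpreter")
else:
    import matplotlib
    print("matplotlib", matplotlib.__version__, "found at", spec.origin)
```

If this reports that matplotlib is not installed even after a successful pip install, you have most likely installed it into a different interpreter than the one running your script.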

If you have installed it properly but it still throws an error, you need to check the import statement in your code.

In order to plot the charts properly, you need to import matplotlib as shown below.

# import the plotting libraries (seaborn is used here only for its color palette)
import matplotlib.pyplot as plt
import seaborn as sns

# car sales data
total_sales = [3000, 2245, 1235, 5330, 4200]

location = ['Bangalore', 'Delhi', 'Chennai', 'Mumbai', 'Kolkatta']

# Seaborn color palette to plot pie chart
colors = sns.color_palette('pastel')

# create pie chart using matplotlib
plt.pie(total_sales, labels=location, colors=colors)
plt.show()

Malthe Borch: PowerShell Remoting on Windows using Airflow


Apache Airflow is an open-source platform that allows you to programmatically author, schedule and monitor workflows. It comes with out-of-the-box integration to lots of systems, but the adage that the devil's in the details holds true with integration in general and remote execution is no exception – in particular PowerShell Remoting which comes with Windows as part of WinRM (Windows Remote Management).

In this post, I'll share some insights from a recent project on how to use Airflow to orchestrate the execution of Windows jobs without giving up on security.

Traditionally, job scheduling was done using agent software. An agent running locally as a system service would wake up and execute jobs at the scheduled time, reporting results back to a central system.

The configuration of the job schedule is either done by logging into the system itself or using a control channel. For example, the agent might connect to a central system to pull down work orders.

Meanwhile, Airflow has no such agents! Conveniently, WinRM works in push mode. It's a service running on Windows that you connect to using HTTP (or HTTPS). It's basically like connecting to a database and running a stored procedure.

From a security perspective, push mode is fundamentally different because traffic is initiated externally. While we might want to implement a thin agent to overcome this difference, such code is a liability on its own. Luckily, PowerShell Remoting comes with a framework that allows us to substantially limit the attack surface.

The aptly named Just-Enough-Administration (JEA) framework is basically sudo on steroids. It allows us to use PowerShell as an API, constraining the remote management interface to a configurable set of commands and executing as a specific user.

We can avoid running arbitrary code entirely by encapsulating the implementation details in predefined commands. In addition, we also separate the remote user that connects to the WinRM service from the user context that executes commands.

You can use PowerShell Remoting without JEA and/or constrained endpoints. But the intersection of Airflow and Windows is typically a bigger company or organization where security concerns mean that you want both of these.

As an aside, I mentioned stored procedures earlier on. Using JEA to change context to a different user is equivalent of Definer's Rights vs Invoker's Rights. Arguably, in a system-to-system integration, using Definer's Rights is helpful in reducing the attack surface because you can define and encapsulate the required functionality.

The steps required to register a JEA configuration are relatively straightforward, and I won't describe them in detail here. In summary, registering a JEA configuration can be as simple as defining a single role capabilities file and running a command to register the configuration.

Now, enter Airflow!

To get started, you'll need to add the PowerShell Remoting Protocol Provider to your Airflow installation.

Add a connection by providing the hostname of your Windows machine, username and password. If you're using HTTP (rather than HTTPS) then you should set up the connection to require Kerberos authentication such that credentials are not sent in clear text (in addition, WinRM will encrypt the protocol traffic using the Kerberos session key).

To require Kerberos authentication, provide {"auth": "kerberos"} in the connection extras. Most of the extra configuration options from the underlying Python library pypsrp are available as connection extras. For example, a JEA configuration (if you are using one) can be specified using the "configuration_name" key.
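Put together, the extras field for such a connection might look like this (a sketch; "JEA-Maintenance" is a made-up endpoint name, not something from the post):

```python
import json

# Hypothetical connection extras: require Kerberos authentication and
# target a JEA session configuration on the remote host.
extras = {
    "auth": "kerberos",
    "configuration_name": "JEA-Maintenance",  # made-up JEA endpoint name
}
print(json.dumps(extras))
```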

You will need to install additional Python packages to use Kerberos. Here's a requirements file with the necessary dependencies:

apache-airflow-providers-microsoft-psrp
gssapi
krb5
pypsrp[kerberos]

Finally, a note on transport security. When WinRM is used with an HTTP listener, Kerberos authentication (acting as trusted 3rd party) supplants the use of SSL/TLS through the transparent encryption scheme employed by the protocol. You can configure WinRM to support only Kerberos (by default, "Negotiate" is also enabled) to ensure that all connections are secured in this way. Note that your IT department might still insist on using HTTPS.

Historically, Windows machines feel worse over time for no particular reason. It's common to restart them once in a while. We can use Airflow to do that!

from airflow import DAG
from airflow.providers.microsoft.psrp.operators.psrp import PSRPOperator

default_args = {
    "psrp_conn_id": <connection id>
}

with DAG(..., default_args=default_args) as dag:
    # "task_id" defaults to the value of "cmdlet" so can omit it here.
    restart_computer = PSRPOperator(cmdlet="Restart-Computer", parameters={"Force": None})

This will restart the computer forcefully (which is not a good idea, but it illustrates the use of parameters). In the example, "Force" is a switch, so we pass a value of None – but values can be numbers, strings, lists and even dictionaries.

In the first example, we saw how task_id defaults to the value of cmdlet – that is sometimes useful, but it's not the only way we can cut verbosity.

PowerShell cmdlets (and functions, which for our purposes are the same thing) follow the verb-noun naming convention. When we define our own commands, we can for example use the verb "Invoke", e.g. "Invoke-Job1". But invoking stuff is something we do all the time in Airflow, and we don't want our task ids to carry this meaningless prefix all over the place.

Here's an example of fixing that, making good use of Airflow's templating syntax:

from airflow import DAG
from airflow.providers.microsoft.psrp.operators.psrp import PSRPOperator

default_args = {
    "psrp_conn_id": <connection id>,
    "cmdlet": "Invoke-{{ task.task_id }}",
}

with DAG(..., default_args=default_args) as dag:
    # "cmdlet" here will be provided automatically as "Invoke-Job1".
    job1 = PSRPOperator(task_id="Job1")

Windows can have its verb-noun naming convention and we get to have short task ids.

By default, Airflow serializes operator output using XComs – a simple means of passing state between tasks.

Since XComs must be JSON-serializable, the PSRPOperator automatically converts PowerShell output values to JSON using ConvertTo-Json and then deserializes them in Python; Airflow then reserializes the result when saving it to the database – there's room for optimization there! The point is that most of the time, you don't have to worry about it.

You can for example list a directory using Get-ChildItem and the resulting table will be returned as a list of dicts. Note that PowerShell has some flattening magic which generally does the right thing in terms of return values:

That is, functions don't really return a single value. Instead, there is a stream of output values stemming from each command being executed.
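To make that concrete, the round-trip might look like this on the Python side (a sketch; the field names are illustrative, not the full Get-ChildItem output):

```python
import json

# What a directory listing might look like after ConvertTo-Json on the
# remote side: a JSON array, one object per output value in the stream.
raw = '[{"Name": "report.csv", "Length": 2048}, {"Name": "logs", "Length": null}]'

# The operator deserializes before pushing XComs, yielding a list of dicts.
items = json.loads(raw)
print(items[0]["Name"], items[1]["Length"])  # report.csv None
```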

With do_xcom_push set to false, no XComs are saved and the conversion to JSON also does not happen.

PowerShell has a number of other streams besides the output stream. These are logged to Airflow's task log by default. Unlike the default logging setup, debug output is also included unless it is explicitly turned off using logging_level – one justification for this is given in the next section.

In traditional automation, command echoing has been a simple way to figure out what a script is doing. PowerShell is a different beast altogether, but it is possible to expose the commands being executed using Set-PSDebug.

from pypsrp.powershell import Command, CommandParameter

PS_DEBUG = Command(
    cmd="Set-PSDebug",
    args=(CommandParameter(name="Trace", value=1), ),
    is_script=False,
)

default_args = {
    "psrp_conn_id": <connection id>,
    "psrp_session_init": PS_DEBUG,
}

This requires that Set-PSDebug is listed under "VisibleCmdlets" in the role capabilities (like ConvertTo-Json if using XComs).

A tracing line will be emitted at the debug logging level for each line passed over during execution, and as mentioned above, this will nonetheless get included in the task log by default. Don't enable this if you have a loop that iterates hundreds of times – you will quickly fill up the task log with useless messages.

Happy remoting!

IslandT: Move chess piece on the chessboard with python


Hello, it is me again, and this is the second article about the chess game project which I created earlier with Python. In this article, I have updated the previous Python program so that it can now relocate a chess piece to a new square on the chessboard after I click on that square. This is the first step towards moving the chess piece. After this, I will make the piece slide slowly along the path to its destination, and also make sure the piece is allowed to move to that square; for example, a pawn can only move straight ahead, and can only move diagonally when it captures another piece. All of that will take another level of planning, but for now, let us just concentrate on relocating the piece.

The pawn will originally be situated on one of the squares, just like in the previous article, but after I click on a new square it will relocate to that square, regardless of whether it can legally move there or not!

The plan to achieve this is to get the dictionary key of the clicked square and then plug that key into the chess_dict dictionary to get the coordinates needed to draw the sprite at its new position.

# print the square name which you have clicked on
            for key, value in chess_dict.items():
                if (x * width, y * width) == (value[0],value[1]):

                    print(key)
                    previous_square_list.append(key) #insert the next square
                    if len(previous_square_list) > 1:
                        previous_square_list.remove(previous_square_list[0])

The above snippet saves the new key whenever the user clicks on one of the squares on the chessboard.

This will draw the chess piece on the new location…

#draw the position of the pawn sprite
    if len(previous_square_list) == 0:
        screen.blit(pawn0, (0, 64))  # just testing...
    else:
        screen.blit(pawn0,(chess_dict[previous_square_list[0]])) # this will draw the pawn on new position

At first, when there has been no click, the piece will appear in its original location; after someone clicks a square, the chess piece will be drawn at the new location.

Here is the entire code…

import sys, pygame
import math

pygame.init()

size = width, height = 512, 512
white = 255, 178, 102
black = 126, 126, 126
hightlight = 192, 192, 192
title = "IslandT Chess"

width = 64 # width of the square
original_color = ''

#empty chess dictionary

chess_dict = {}

#chess square list
chess_square_list = [
                    "a8", "b8", "c8", "d8", "e8", "f8", "g8", "h8",
                    "a7", "b7", "c7", "d7", "e7", "f7", "g7", "h7",
                    "a6", "b6", "c6", "d6", "e6", "f6", "g6", "h6",
                    "a5", "b5", "c5", "d5", "e5", "f5", "g5", "h5",
                    "a4", "b4", "c4", "d4", "e4", "f4", "g4", "h4",
                    "a3", "b3", "c3", "d3", "e3", "f3", "g3", "h3",
                    "a2", "b2", "c2", "d2", "e2", "f2", "g2", "h2",
                    "a1", "b1", "c1", "d1", "e1", "f1", "g1", "h1"
                    ]

# chess square position
chess_square_position = []

#pawn image
pawn0 = pygame.image.load("pawn.png")

# create a list to map name of column and row
for i in range(0, 8) : # control row
    for j in range(0, 8): # control column
        chess_square_position.append((j * width, i * width))

# create a dictionary to map name of column and row

for n in range(0, len(chess_square_position)):
    chess_dict[chess_square_list[n]] = chess_square_position[n]

screen = pygame.display.set_mode(size)
pygame.display.set_caption(title)

rect_list = list() # this is the list of brown rectangle

#the previously touched square
previous_square_list = []

# used this loop to create a list of brown rectangles
for i in range(0, 8): # control the row
    for j in range(0, 8): # control the column
        if i % 2 == 0: # which means it is an even row
            if j % 2 != 0: # which means it is an odd column
                rect_list.append(pygame.Rect(j * width, i * width, width, width))
        else:
            if j % 2 == 0: # which means it is an even column
                rect_list.append(pygame.Rect(j * width, i * width, width, width))


# create main surface and fill the base color with light brown color
chess_board_surface = pygame.Surface(size)
chess_board_surface.fill(white)

# next draws the dark brown rectangles on the chess board surface
for chess_rect in rect_list:
    pygame.draw.rect(chess_board_surface, black, chess_rect)

while True:
    # display the chess surface
    screen.blit(chess_board_surface, (0, 0))

    #draw the position of the pawn sprite
    if len(previous_square_list) == 0:
        screen.blit(pawn0, (0, 64))  # just testing...
    else:
        screen.blit(pawn0,(chess_dict[previous_square_list[0]])) # this will draw the pawn on new position

    for event in pygame.event.get():

        if event.type == pygame.QUIT: sys.exit()
        elif event.type == pygame.MOUSEBUTTONDOWN:

            pos = event.pos
            x = math.floor(pos[0] / width)
            y = math.floor(pos[1] / width)

            # print the square name which you have clicked on
            for key, value in chess_dict.items():
                if (x * width, y * width) == (value[0],value[1]):

                    print(key)
                    previous_square_list.append(key) #insert the next square
                    if len(previous_square_list) > 1:
                        previous_square_list.remove(previous_square_list[0])

            original_color = chess_board_surface.get_at((x * width, y * width ))
            pygame.draw.rect(chess_board_surface, hightlight, pygame.Rect((x) * width, (y) * width, 64, 64))

        elif event.type == pygame.MOUSEBUTTONUP:
            pos = event.pos
            x = math.floor(pos[0] / width)
            y = math.floor(pos[1] / width)
            pygame.draw.rect(chess_board_surface, original_color, pygame.Rect((x) * width, (y) * width, 64, 64))

    pygame.display.update()

Here is the outcome…

I hope you like it, the next step is to make the piece slide along the board as well as to allow it to move in the direction it is supposed to move to!
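As a teaser for that sliding step, one simple approach is to interpolate between the old and new pixel positions over several frames. Here is a minimal sketch, independent of the game code above (slide_path is a hypothetical helper, not part of the program):

```python
def slide_path(start, end, steps):
    """Yield intermediate pixel positions from start to end (inclusive)."""
    (x0, y0), (x1, y1) = start, end
    for n in range(1, steps + 1):
        t = n / steps  # fraction of the way along the path
        yield (round(x0 + (x1 - x0) * t), round(y0 + (y1 - y0) * t))

# Each yielded position would be passed to screen.blit() once per frame.
print(list(slide_path((0, 64), (128, 64), 4)))
```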

PyPy: Natural Language Processing for Icelandic with PyPy: A Case Study


Natural Language Processing for Icelandic with PyPy: A Case Study

Icelandic is one of the smallest languages of the world, with about 370,000 speakers. It is a language in the Germanic family, most similar to Norwegian, Danish and Swedish, but closer to the original Old Norse spoken throughout Scandinavia until about the 14th century CE.

As with other small languages, there are worries that the language may not survive in a digital world, where all kinds of fancy applications are developed first - and perhaps only - for the major languages. Voice assistants, chatbots, spelling and grammar checking utilities, machine translation, etc., are increasingly becoming staples of our personal and professional lives, but if they don’t exist for Icelandic, Icelanders will gravitate towards English or other languages where such tools are readily available.

Iceland is a technology-savvy country, with world-leading adoption rates of the Internet, PCs and smart devices, and a thriving software industry. So the government figured that it would be worthwhile to fund a 5-year plan to build natural language processing (NLP) resources and other infrastructure for the Icelandic language. The project focuses on collecting data and developing open source software for a range of core applications, such as tokenization, vocabulary lookup, n-gram statistics, part-of-speech tagging, named entity recognition, spelling and grammar checking, neural language models and speech processing.


My name is Vilhjálmur Þorsteinsson, and I’m the founder and CEO of a software startup Miðeind in Reykjavík, Iceland, that employs 10 software engineers and linguists and focuses on NLP and AI for the Icelandic language. The company participates in the government’s language technology program, and has contributed significantly to the program’s core tools (e.g., a tokenizer and a parser), spelling and grammar checking modules, and a neural machine translation stack.

When it came to a choice of programming languages and development tools for the government program, the requirements were for a major, well supported, vendor-and-OS-agnostic FOSS platform with a large and diverse community, including in the NLP space. The decision to select Python as a foundational language for the project was a relatively easy one. That said, there was a bit of trepidation around the well known fact that CPython can be slow for inner-core tasks, such as tokenization and parsing, that can see heavy workloads in production.

I first became aware of PyPy in early 2016 when I was developing a crossword game, Netskrafl, in Python 2.7 for Google App Engine. I had a utility program that compressed a dictionary into a Directed Acyclic Word Graph and took 160 seconds to run on CPython 2.7, so I tried PyPy and to my amazement saw a 4x speedup (down to 38 seconds), with literally no effort besides downloading the PyPy runtime.

This led me to select PyPy as the default Python interpreter for my company’s Python development efforts as well as for our production websites and API servers, a role in which it remains to this day. We have followed PyPy’s upgrades along the way, being just about to migrate our minimally required language version from 3.6 to 3.7.

In NLP, speed and memory requirements can be quite important for software usability. On the other hand, NLP logic and algorithms are often complex and challenging to program, so programmer productivity and code clarity are also critical success factors. A pragmatic approach balances these factors, avoids premature optimization and seeks a careful compromise between maximal run-time efficiency and minimal programming and maintenance effort.

Turning to our use cases, our Icelandic text tokenizer "Tokenizer" is fairly light, runs tight loops and performs a large number of small, repetitive operations. It runs very well on PyPy’s JIT and has not required further optimization.

Our Icelandic parser Greynir (known on PyPI as reynir) is, if I may say so myself, a piece of work. It parses natural language text according to a hand-written context-free grammar, using an Earley-type algorithm as enhanced by Scott and Johnstone. The CFG contains almost 7,000 nonterminals and 6,000 terminals, and the parser handles ambiguity as well as left, right and middle recursion. It returns a packed parse forest for each input sentence, which is then pruned by a scoring heuristic down to a single best result tree.

This parser was originally coded in pure Python and turned out to be unusably slow when run on CPython - but usable on PyPy, where it was 3-4x faster. However, when we started applying it to heavier production workloads, it  became apparent that it needed to be faster still. We then proceeded to convert the innermost Earley parsing loop from Python to tight C++ and to call it from PyPy via CFFI, with callbacks for token-terminal matching functions (“business logic”) that remained on the Python side. This made the parser much faster (on the order of 100x faster than the original on CPython) and quick enough for our production use cases. Even after moving much of the heavy processing to C++ and using CFFI, PyPy still gives a significant speed boost over CPython.

Connecting C++ code with PyPy proved to be quite painless using CFFI, although we had to figure out a few magic incantations in our build module to make it compile smoothly during setup from source on Windows and MacOS in addition to Linux. Of course, we build binary PyPy and CPython wheels for the most common targets so most users don’t have to worry about setup requirements.

With the positive experience from the parser project, we proceeded to take a similar approach for two other core NLP packages: our compressed vocabulary package BinPackage (known on PyPI as islenska) and our trigrams database package Icegrams. These packages both take large text input (3.1 million word forms with inflection data in the vocabulary case; 100 million tokens in the trigrams case) and compress it into packed binary structures. These structures are then memory-mapped at run-time using mmap and queried via Python functions with a lookup time in the microseconds range. The low-level data structure navigation is done in C++, called from Python via CFFI. The ex-ante preparation, packing, bit-fiddling and data structure generation is fast enough with PyPy, so we haven’t seen a need to optimize that part further.
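The general mmap-and-query pattern can be sketched in pure Python. This is only an illustration of the technique; the real packages use far more elaborate binary formats and do the low-level navigation in C++:

```python
import mmap
import os
import struct
import tempfile

# Pack a sorted array of 32-bit integers into a binary file, then memory-map
# it and answer membership queries with a binary search over the raw bytes,
# without ever materializing the data as Python objects.
values = [3, 17, 42, 99, 1234]
path = os.path.join(tempfile.mkdtemp(), "packed.bin")
with open(path, "wb") as f:
    f.write(struct.pack("<%dI" % len(values), *values))

f = open(path, "rb")
mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)

def contains(mm, needle):
    """Binary search directly over the memory-mapped little-endian uint32s."""
    lo, hi = 0, len(mm) // 4
    while lo < hi:
        mid = (lo + hi) // 2
        (v,) = struct.unpack_from("<I", mm, mid * 4)
        if v < needle:
            lo = mid + 1
        else:
            hi = mid
    return lo < len(mm) // 4 and struct.unpack_from("<I", mm, lo * 4)[0] == needle

print(contains(mm, 42), contains(mm, 7))  # True False
```

Because the operating system pages the file in lazily and shares the pages between processes, startup cost and memory use stay low even for very large packed structures.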

To showcase our tools, we host public (and open source) websites such as greynir.is for our parsing, named entity recognition and query stack and yfirlestur.is for our spell and grammar checking stack. The server code on these sites is all Python running on PyPy using Flask, wrapped in gunicorn and hosted on nginx. The underlying database is PostgreSQL, accessed via SQLAlchemy and psycopg2cffi. This setup has served us well for 6 years and counting, being fast and reliable and having helpful and supportive communities.

As can be inferred from the above, we are avid fans of PyPy and commensurately thankful for the great work by the PyPy team over the years. PyPy has enabled us to use Python for a larger part of our toolset than CPython alone would have supported, and its smooth integration with C/C++ through CFFI has helped us attain a better tradeoff between performance and programmer productivity in our projects. We wish for PyPy a great and bright future and also look forward to exciting related developments on the horizon, such as HPy.

Podcast.__init__: Achieve Repeatable Builds Of Your Software On Any Machine With Earthly


Summary

It doesn’t matter how amazing your application is if you are unable to deliver it to your users. Frustrated with the rampant complexity involved in building and deploying software Vlad A. Ionescu created the Earthly tool to reduce the toil involved in creating repeatable software builds. In this episode he explains the complexities that are inherent to building software projects and how he designed the syntax and structure of Earthly to make it easy to adopt for developers across all language environments. By adopting Earthly you can use the same techniques for building on your laptop and in your CI/CD pipelines.

Announcements

  • Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
  • Your host as usual is Tobias Macey and today I’m interviewing Vlad A. Ionescu about Earthly, a syntax and runtime for software builds to reduce friction between development and delivery

Interview

  • Introductions
  • How did you get introduced to Python?
  • Can you describe what Earthly is and the story behind it?
  • What are the core principles that engineers should consider when designing their build and delivery process?
  • What are some of the common problems that engineers run into when they are designing their build process?
    • What are some of the challenges that are unique to the Python ecosystem?
  • What is the role of Earthly in the overall software lifecycle?
    • What are the other tools/systems that a team is likely to use alongside Earthly?
    • What are the components that Earthly might replace?
  • How is Earthly implemented?
    • What were the core design requirements when you first began working on it?
    • How have the design and goals of Earthly changed or evolved as you have explored the problem further?
  • What is the workflow for a Python developer to get started with Earthly?
    • How can Earthly help with the challenge of managing Javascript and CSS assets for web application projects?
  • What are some of the challenges (technical, conceptual, or organizational) that an engineer or team might encounter when adopting Earthly?
  • What are some of the features or capabilities of Earthly that are overlooked or misunderstood that you think are worth exploring?
  • What are the most interesting, innovative, or unexpected ways that you have seen Earthly used?
  • What are the most interesting, unexpected, or challenging lessons that you have learned while working on Earthly?
  • When is Earthly the wrong choice?
  • What do you have planned for the future of Earthly?

Keep In Touch

Picks

Closing Announcements

  • Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers

Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Python GUIs: Packaging PyQt5 applications into a macOS app with PyInstaller (updated for 2022)


There is not much fun in creating your own desktop applications if you can't share them with other people — whether that means publishing them commercially, sharing them online or just giving them to someone you know. Sharing your apps allows other people to benefit from your hard work!

The good news is there are tools available to help you do just that with your Python applications which work well with apps built using PyQt5. In this tutorial we'll look at the most popular tool for packaging Python applications: PyInstaller.

This tutorial is broken down into a series of steps, using PyInstaller to build first simple, and then more complex PyQt5 applications into distributable macOS app bundles. You can choose to follow it through completely, or skip to the parts that are most relevant to your own project.

We finish off by building a macOS Disk Image, the usual method for distributing applications on macOS.

You always need to compile your app on your target system. So, if you want to create a Mac .app you need to do this on a Mac, for an EXE you need to use Windows.

Example Disk Image Installer for macOS

If you're impatient, you can download the Example Disk Image for macOS first.

Requirements

PyInstaller works out of the box with PyQt5 and as of writing, current versions of PyInstaller are compatible with Python 3.6+. Whatever project you're working on, you should be able to package your apps.

You can install PyInstaller using pip.

bash
pip3 install PyInstaller

If you experience problems packaging your apps, your first step should always be to update PyInstaller and the hooks package to the latest versions using

bash
pip3 install --upgrade PyInstaller pyinstaller-hooks-contrib

The hooks module contains package-specific packaging instructions for PyInstaller and is updated regularly.

Install in virtual environment (optional)

You can also opt to install PyQt5 and PyInstaller in a virtual environment (or your application's virtual environment) to keep your environment clean.

bash
python3 -m venv packenv

Once created, activate the virtual environment by running from the command line —

bash
source packenv/bin/activate

Finally, install the required libraries. For PyQt5 you would use —

bash
pip3 install PyQt5 PyInstaller

Getting Started

It's a good idea to start packaging your application from the very beginning so you can confirm that packaging is still working as you develop it. This is particularly important if you add additional dependencies. If you only think about packaging at the end, it can be difficult to debug exactly where the problems are.

For this example we're going to start with a simple skeleton app, which doesn't do anything interesting. Once we've got the basic packaging process working, we'll extend the application to include icons and data files. We'll confirm the build as we go along.

To start with, create a new folder for your application and then add the following skeleton app in a file named app.py. You can also download the source code and associated files

python
from PyQt5 import QtWidgets

import sys

class MainWindow(QtWidgets.QMainWindow):

    def __init__(self):
        super().__init__()

        self.setWindowTitle("Hello World")
        l = QtWidgets.QLabel("My simple app.")
        l.setMargin(10)
        self.setCentralWidget(l)
        self.show()

if __name__ == '__main__':
    app = QtWidgets.QApplication(sys.argv)
    w = MainWindow()
    app.exec()

This is a basic bare-bones application which creates a custom QMainWindow and adds a simple widget QLabel to it. You can run this app as follows.

bash
python app.py

This should produce the following window (on macOS).

Simple skeleton app in PyQt5Simple skeleton app in PyQt5

Building a basic app

Now we have our simple application skeleton in place, we can run our first build test to make sure everything is working.

Open your terminal (command prompt) and navigate to the folder containing your project. You can now run the following command to run the PyInstaller build.

bash
pyinstaller --windowed app.py

The --windowed flag is necessary to tell PyInstaller to build a macOS .app bundle.

You'll see a number of messages output, giving debug information about what PyInstaller is doing. These are useful for debugging issues in your build, but can otherwise be ignored. The output that I get for running the command on my system is shown below.

bash
martin@MacBook-Pro pyqt5 % pyinstaller --windowed app.py
74 INFO: PyInstaller: 4.8
74 INFO: Python: 3.9.9
83 INFO: Platform: macOS-10.15.7-x86_64-i386-64bit
84 INFO: wrote /Users/martin/app/pyqt5/app.spec
87 INFO: UPX is not available.
88 INFO: Extending PYTHONPATH with paths
['/Users/martin/app/pyqt5']
447 INFO: checking Analysis
451 INFO: Building because inputs changed
452 INFO: Initializing module dependency graph...
455 INFO: Caching module graph hooks...
463 INFO: Analyzing base_library.zip ...
3914 INFO: Processing pre-find module path hook distutils from '/usr/local/lib/python3.9/site-packages/PyInstaller/hooks/pre_find_module_path/hook-distutils.py'.
3917 INFO: distutils: retargeting to non-venv dir '/usr/local/Cellar/python@3.9/3.9.9/Frameworks/Python.framework/Versions/3.9/lib/python3.9'
6928 INFO: Caching module dependency graph...
7083 INFO: running Analysis Analysis-00.toc
7091 INFO: Analyzing /Users/martin/app/pyqt5/app.py
7138 INFO: Processing module hooks...
7139 INFO: Loading module hook 'hook-PyQt6.QtWidgets.py' from '/usr/local/lib/python3.9/site-packages/PyInstaller/hooks'...
7336 INFO: Loading module hook 'hook-xml.etree.cElementTree.py' from '/usr/local/lib/python3.9/site-packages/PyInstaller/hooks'...
7337 INFO: Loading module hook 'hook-lib2to3.py' from '/usr/local/lib/python3.9/site-packages/PyInstaller/hooks'...
7360 INFO: Loading module hook 'hook-PyQt6.QtGui.py' from '/usr/local/lib/python3.9/site-packages/PyInstaller/hooks'...
7397 INFO: Loading module hook 'hook-PyQt6.QtCore.py' from '/usr/local/lib/python3.9/site-packages/PyInstaller/hooks'...
7422 INFO: Loading module hook 'hook-encodings.py' from '/usr/local/lib/python3.9/site-packages/PyInstaller/hooks'...
7510 INFO: Loading module hook 'hook-distutils.util.py' from '/usr/local/lib/python3.9/site-packages/PyInstaller/hooks'...
7513 INFO: Loading module hook 'hook-pickle.py' from '/usr/local/lib/python3.9/site-packages/PyInstaller/hooks'...
7515 INFO: Loading module hook 'hook-heapq.py' from '/usr/local/lib/python3.9/site-packages/PyInstaller/hooks'...
7517 INFO: Loading module hook 'hook-difflib.py' from '/usr/local/lib/python3.9/site-packages/PyInstaller/hooks'...
7519 INFO: Loading module hook 'hook-PyQt6.py' from '/usr/local/lib/python3.9/site-packages/PyInstaller/hooks'...
7564 INFO: Loading module hook 'hook-multiprocessing.util.py' from '/usr/local/lib/python3.9/site-packages/PyInstaller/hooks'...
7565 INFO: Loading module hook 'hook-sysconfig.py' from '/usr/local/lib/python3.9/site-packages/PyInstaller/hooks'...
7574 INFO: Loading module hook 'hook-xml.py' from '/usr/local/lib/python3.9/site-packages/PyInstaller/hooks'...
7677 INFO: Loading module hook 'hook-distutils.py' from '/usr/local/lib/python3.9/site-packages/PyInstaller/hooks'...
7694 INFO: Looking for ctypes DLLs
7712 INFO: Analyzing run-time hooks ...
7715 INFO: Including run-time hook '/usr/local/lib/python3.9/site-packages/PyInstaller/hooks/rthooks/pyi_rth_subprocess.py'
7719 INFO: Including run-time hook '/usr/local/lib/python3.9/site-packages/PyInstaller/hooks/rthooks/pyi_rth_pkgutil.py'
7722 INFO: Including run-time hook '/usr/local/lib/python3.9/site-packages/PyInstaller/hooks/rthooks/pyi_rth_multiprocessing.py'
7726 INFO: Including run-time hook '/usr/local/lib/python3.9/site-packages/PyInstaller/hooks/rthooks/pyi_rth_inspect.py'
7727 INFO: Including run-time hook '/usr/local/lib/python3.9/site-packages/PyInstaller/hooks/rthooks/pyi_rth_pyqt6.py'
7736 INFO: Looking for dynamic libraries
7977 INFO: Looking for eggs
7977 INFO: Using Python library /usr/local/Cellar/python@3.9/3.9.9/Frameworks/Python.framework/Versions/3.9/Python
7987 INFO: Warnings written to /Users/martin/app/pyqt5/build/app/warn-app.txt
8019 INFO: Graph cross-reference written to /Users/martin/app/pyqt5/build/app/xref-app.html
8032 INFO: checking PYZ
8035 INFO: Building because toc changed
8035 INFO: Building PYZ (ZlibArchive) /Users/martin/app/pyqt5/build/app/PYZ-00.pyz
8390 INFO: Building PYZ (ZlibArchive) /Users/martin/app/pyqt5/build/app/PYZ-00.pyz completed successfully.
8397 INFO: EXE target arch: x86_64
8397 INFO: Code signing identity: None
8398 INFO: checking PKG
8398 INFO: Building because /Users/martin/app/pyqt5/build/app/PYZ-00.pyz changed
8398 INFO: Building PKG (CArchive) app.pkg
8415 INFO: Building PKG (CArchive) app.pkg completed successfully.
8417 INFO: Bootloader /usr/local/lib/python3.9/site-packages/PyInstaller/bootloader/Darwin-64bit/runw
8417 INFO: checking EXE
8418 INFO: Building because console changed
8418 INFO: Building EXE from EXE-00.toc
8418 INFO: Copying bootloader EXE to /Users/martin/app/pyqt5/build/app/app
8421 INFO: Converting EXE to target arch (x86_64)
8449 INFO: Removing signature(s) from EXE
8484 INFO: Appending PKG archive to EXE
8486 INFO: Fixing EXE headers for code signing
8496 INFO: Rewriting the executable's macOS SDK version (11.1.0) to match the SDK version of the Python library (10.15.6) in order to avoid inconsistent behavior and potential UI issues in the frozen application.
8499 INFO: Re-signing the EXE
8547 INFO: Building EXE from EXE-00.toc completed successfully.
8549 INFO: checking COLLECT
WARNING: The output directory "/Users/martin/app/pyqt5/dist/app" and ALL ITS CONTENTS will be REMOVED! Continue? (y/N)y
On your own risk, you can use the option `--noconfirm` to get rid of this question.
10820 INFO: Removing dir /Users/martin/app/pyqt5/dist/app
10847 INFO: Building COLLECT COLLECT-00.toc
12460 INFO: Building COLLECT COLLECT-00.toc completed successfully.
12469 INFO: checking BUNDLE
12469 INFO: Building BUNDLE because BUNDLE-00.toc is non existent
12469 INFO: Building BUNDLE BUNDLE-00.toc
13848 INFO: Moving BUNDLE data files to Resource directory
13901 INFO: Signing the BUNDLE...
16049 INFO: Building BUNDLE BUNDLE-00.toc completed successfully.

If you look in your folder you'll notice you now have two new folders dist and build.

build & dist folders created by PyInstallerbuild & dist folders created by PyInstaller

Below is a truncated listing of the folder content, showing the build and dist folders.

bash
.
├── app.py
├── app.spec
├── build
│   └── app
│       ├── Analysis-00.toc
│       ├── COLLECT-00.toc
│       ├── EXE-00.toc
│       ├── PKG-00.pkg
│       ├── PKG-00.toc
│       ├── PYZ-00.pyz
│       ├── PYZ-00.toc
│       ├── app
│       ├── app.pkg
│       ├── base_library.zip
│       ├── warn-app.txt
│       └── xref-app.html
└── dist
    ├── app
    │   ├── libcrypto.1.1.dylib
    │   ├── PyQt5
    │   ...
    │   ├── app
    │   └── Qt5Core
    └── app.app

The build folder is used by PyInstaller to collect and prepare the files for bundling, it contains the results of analysis and some additional logs. For the most part, you can ignore the contents of this folder, unless you're trying to debug issues.

The dist (for "distribution") folder contains the files to be distributed. This includes your application, bundled as an executable file, together with any associated libraries (for example PyQt5) and binary .so files.

Since we provided the --windowed flag above, PyInstaller has actually created two builds for us. The folder app is a simple folder containing everything you need to be able to run your app. PyInstaller also creates an app bundle app.app which is what you will usually distribute to users.

The app folder is a useful debugging tool, since you can easily see the libraries and other packaged data files.

You can try running your app yourself now, either by double-clicking on the app bundle, or by running the executable file, named app, inside the dist folder. In either case, after a short delay you'll see the familiar window of your application pop up as shown below.

Simple app, running after being packagedSimple app, running after being packaged

In the same folder as your Python file, alongside the build and dist folders PyInstaller will have also created a .spec file. In the next section we'll take a look at this file, what it is and what it does.

The Spec file

The .spec file contains the build configuration and instructions that PyInstaller uses to package up your application. Every PyInstaller project has a .spec file, which is generated based on the command line options you pass when running pyinstaller.

When we ran pyinstaller with our script, we didn't pass in anything other than the name of our Python application file and the --windowed flag. This means our spec file currently contains only the default configuration. If you open it, you'll see something similar to what we have below.

python
# -*- mode: python ; coding: utf-8 -*-


block_cipher = None


a = Analysis(['app.py'],
             pathex=[],
             binaries=[],
             datas=[],
             hiddenimports=[],
             hookspath=[],
             hooksconfig={},
             runtime_hooks=[],
             excludes=[],
             win_no_prefer_redirects=False,
             win_private_assemblies=False,
             cipher=block_cipher,
             noarchive=False)
pyz = PYZ(a.pure, a.zipped_data,
             cipher=block_cipher)

exe = EXE(pyz,
          a.scripts,
          [],
          exclude_binaries=True,
          name='app',
          debug=False,
          bootloader_ignore_signals=False,
          strip=False,
          upx=True,
          console=False,
          disable_windowed_traceback=False,
          target_arch=None,
          codesign_identity=None,
          entitlements_file=None )
coll = COLLECT(exe,
               a.binaries,
               a.zipfiles,
               a.datas,
               strip=False,
               upx=True,
               upx_exclude=[],
               name='app')
app = BUNDLE(coll,
             name='app.app',
             icon=None,
             bundle_identifier=None)

The first thing to notice is that this is a Python file, meaning you can edit it and use Python code to calculate values for the settings. This is mostly useful for complex builds, for example when you are targeting different platforms and want to conditionally define additional libraries or dependencies to bundle.
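For instance, because the spec file is executed as ordinary Python, you can compute settings before they are handed to Analysis. The sketch below shows one hedged way to build a platform-dependent datas list — the extra file names are hypothetical, purely for illustration:

```python
# Sketch only: the .spec file is plain Python, so values like `datas`
# can be computed conditionally. The extra files below are hypothetical.
import sys

datas = [("icons", "icons")]  # bundled on every platform

if sys.platform == "darwin":
    datas.append(("macos/Info.extra.plist", "."))  # hypothetical macOS-only file
elif sys.platform == "win32":
    datas.append(("windows/app.manifest", "."))    # hypothetical Windows-only file

# `datas` would then be passed as Analysis(..., datas=datas, ...)
print(datas)
```

The same pattern works for binaries, hiddenimports or any other Analysis argument.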

Because we used the --windowed command line flag, the EXE(console=) attribute is set to False. If this is True a console window will be shown when your app is launched -- not what you usually want for a GUI application.

Once a .spec file has been generated, you can pass this to pyinstaller instead of your script to repeat the previous build process. Run this now to rebuild your executable.

bash
pyinstaller app.spec

The resulting build will be identical to the build used to generate the .spec file (assuming you have made no changes). For many PyInstaller configuration changes you have the option of passing command-line arguments, or modifying your existing .spec file. Which you choose is up to you.

Tweaking the build

So far we've created a simple first build of a very basic application. Now we'll look at a few of the most useful options that PyInstaller provides to tweak our build. Then we'll go on to look at building more complex applications.

Naming your app

One of the simplest changes you can make is to provide a proper "name" for your application. By default the app takes the name of your source file (minus the extension), for example main or app. This isn't usually what you want.

You can provide a nicer name for PyInstaller to use for the app (and dist folder) by editing the .spec file to add a name= under the EXE, COLLECT and BUNDLE blocks.

python
exe = EXE(pyz,
          a.scripts,
          [],
          exclude_binaries=True,
          name='Hello World',
          debug=False,
          bootloader_ignore_signals=False,
          strip=False,
          upx=True,
          console=False
         )
coll = COLLECT(exe,
               a.binaries,
               a.zipfiles,
               a.datas,
               strip=False,
               upx=True,
               upx_exclude=[],
               name='Hello World')
app = BUNDLE(coll,
             name='Hello World.app',
             icon=None,
             bundle_identifier=None)

The name under EXE is the name of the executable file, the name under BUNDLE is the name of the app bundle.

Alternatively, you can re-run the pyinstaller command and pass the -n or --name configuration flag along with your app.py script.

bash
pyinstaller -n "Hello World" --windowed app.py
# or
pyinstaller --name "Hello World" --windowed app.py

The resulting app file will be given the name Hello World.app and the unpacked build placed in the folder dist/Hello World/.

Application with custom name Application with custom name "Hello World"

The name of the .spec file is taken from the name passed in on the command line, so this will also create a new spec file for you, called Hello World.spec in your root folder.

Make sure you delete the old app.spec file to avoid getting confused editing the wrong one.

Application icon

By default PyInstaller app bundles come with the following icon in place.

Default PyInstaller application icon, on app bundleDefault PyInstaller application icon, on app bundle

You will probably want to customize this to make your application more recognisable. This can be done easily by passing the --icon command line argument, or editing the icon= parameter of the BUNDLE section of your .spec file. For macOS app bundles you need to provide an .icns file.

python
app = BUNDLE(coll,
             name='Hello World.app',
             icon='Hello World.icns',
             bundle_identifier=None)

To create macOS icons from images you can use the image2icon tool.

If you now re-run the build (by using the command line arguments, or running with your modified .spec file) you'll see the specified icon file is now set on your application bundle.

Custom application icon (a hand) on the app bundleCustom application icon on the app bundle

On macOS application icons are taken from the application bundle. If you repackage your app and run the bundle you will see your app icon on the dock!

Custom application icon in the dockCustom application icon on the dock

Data files and Resources

So far our application consists of just a single Python file, with no dependencies. Most real-world applications are a bit more complex, and typically ship with associated data files such as icons or UI design files. In this section we'll look at how we can accomplish this with PyInstaller, starting with a single file and then bundling complete folders of resources.

First let's update our app with some more buttons and add icons to each.

python
from PyQt5.QtWidgets import QMainWindow, QApplication, QLabel, QVBoxLayout, QPushButton, QWidget
from PyQt5.QtGui import QIcon

import sys

class MainWindow(QMainWindow):

    def __init__(self):
        super().__init__()

        self.setWindowTitle("Hello World")
        layout = QVBoxLayout()
        label = QLabel("My simple app.")
        label.setMargin(10)
        layout.addWidget(label)

        button1 = QPushButton("Hide")
        button1.setIcon(QIcon("icons/hand.png"))
        button1.pressed.connect(self.lower)
        layout.addWidget(button1)

        button2 = QPushButton("Close")
        button2.setIcon(QIcon("icons/lightning.png"))
        button2.pressed.connect(self.close)
        layout.addWidget(button2)

        container = QWidget()
        container.setLayout(layout)

        self.setCentralWidget(container)

        self.show()

if __name__ == '__main__':
    app = QApplication(sys.argv)
    w = MainWindow()
    app.exec_()

In the folder with this script, add a folder icons which contains two icons in PNG format, hand.png and lightning.png. You can create these yourself, or get them from the source code download for this tutorial.

Run the script now and you will see a window showing two buttons with icons.

Window with two iconsWindow with two buttons with icons.

Even if you don't see the icons, keep reading!

Dealing with relative paths

There is a gotcha here, which might not be immediately apparent. To demonstrate it, open up a shell and change to the folder where our script is located. Run it with

bash
python3 app.py

If the icons are in the correct location, you should see them. Now change to the parent folder, and try and run your script again (change <folder> to the name of the folder your script is in).

bash
cd ..
python3 <folder>/app.py

Window with two icons missingWindow with two buttons with icons missing.

The icons don't appear. What's happening?

We're using relative paths to refer to our data files. These paths are relative to the current working directory -- not the folder your script is in. So if you run the script from elsewhere it won't be able to find the files.

One common reason for icons not to show up, is running examples in an IDE which uses the project root as the current working directory.

This is a minor issue before the app is packaged, but once it's installed it will be started with its current working directory as the root / folder -- your app won't be able to find anything. We need to fix this before we go any further, which we can do by making our paths relative to our application folder.
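The gotcha itself can be demonstrated in a few standalone lines — the file name is just the one used in our example:

```python
# Standalone sketch of the problem: a bare relative path is resolved
# against the process's current working directory, not the script's folder.
import os

relative = os.path.join("icons", "hand.png")  # how the app refers to the file
resolved = os.path.abspath(relative)          # where it is actually looked up

# `resolved` starts with whatever os.getcwd() happens to be at launch,
# so running the script from another folder changes which file is opened.
print(resolved)
```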

In the updated code below, we define a new variable basedir, using os.path.dirname to get the containing folder of __file__ which holds the full path of the current Python file. We then use this to build the relative paths for icons using os.path.join().

Since our app.py file is in the root of our folder, all other paths are relative to that.

python
from PyQt5.QtWidgets import QMainWindow, QApplication, QLabel, QVBoxLayout, QPushButton, QWidget
from PyQt5.QtGui import QIcon

import sys, os

basedir = os.path.dirname(__file__)

class MainWindow(QMainWindow):

    def __init__(self):
        super().__init__()

        self.setWindowTitle("Hello World")
        layout = QVBoxLayout()
        label = QLabel("My simple app.")
        label.setMargin(10)
        layout.addWidget(label)

        button1 = QPushButton("Hide")
        button1.setIcon(QIcon(os.path.join(basedir, "icons", "hand.png")))
        button1.pressed.connect(self.lower)
        layout.addWidget(button1)

        button2 = QPushButton("Close")
        button2.setIcon(QIcon(os.path.join(basedir, "icons", "lightning.png")))
        button2.pressed.connect(self.close)
        layout.addWidget(button2)

        container = QWidget()
        container.setLayout(layout)

        self.setCentralWidget(container)

        self.show()

if __name__ == '__main__':
    app = QApplication(sys.argv)
    w = MainWindow()
    app.exec_()

Try and run your app again from the parent folder -- you'll find that the icons now appear as expected on the buttons, no matter where you launch the app from.

Packaging the icons

So now we have our application showing icons, and they work wherever the application is launched from. Package the application again with pyinstaller "Hello World.spec" and then try and run it again from the dist folder as before. You'll notice the icons are missing again.

Window with two icons missingWindow with two buttons with icons missing.

The problem now is that the icons haven't been copied to the dist/Hello World folder -- take a look in it. Our script expects the icons to be in a specific location relative to it, and if they are not, then nothing will be shown.

This same principle applies to any other data files you package with your application, including Qt Designer UI files, settings files or source data. You need to ensure that relative path structures are replicated after packaging.

Bundling data files with PyInstaller

For the application to continue working after packaging, the files it depends on need to be in the same relative locations.

To get data files into the dist folder we can instruct PyInstaller to copy them over. PyInstaller accepts a list of individual paths to copy, together with a folder path relative to the dist/<app name> folder where it should copy them. As with other options, this can be specified by command line arguments or in the .spec file.

Files specified on the command line are added using --add-data, passing the source file and destination folder separated by a colon :.

The path separator is platform-specific: Linux or Mac use :, on Windows use ;

bash
pyinstaller --windowed --name="Hello World" --icon="Hello World.icns" --add-data="icons/hand.png:icons" --add-data="icons/lightning.png:icons" app.py

Here we've specified the destination location as icons. The path is relative to the root of our application's folder in dist -- so dist/Hello World with our current app. The path icons means a folder named icons under this location, so dist/Hello World/icons. Putting our icons right where our application expects to find them!
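As a purely illustrative sketch (the helper below is not part of PyInstaller's API), the source-to-destination mapping works out like this:

```python
# Illustrative only: how an --add-data "source:dest" pair maps onto the
# packaged output folder. This mirrors PyInstaller's behaviour; the
# function itself is a hypothetical helper, not a PyInstaller API.
import os

def added_data_path(dist_root, source, dest_folder):
    # PyInstaller copies `source` into <dist_root>/<dest_folder>/
    return os.path.join(dist_root, dest_folder, os.path.basename(source))

# On macOS/Linux this prints: dist/Hello World/icons/hand.png
print(added_data_path("dist/Hello World", "icons/hand.png", "icons"))
```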

You can also specify data files via the datas list in the Analysis section of the spec file, shown below.

python
a = Analysis(['app.py'],
             pathex=[],
             binaries=[],
             datas=[('icons/hand.png', 'icons'), ('icons/lightning.png', 'icons')],
             hiddenimports=[],
             hookspath=[],
             runtime_hooks=[],
             excludes=[],
             win_no_prefer_redirects=False,
             win_private_assemblies=False,
             cipher=block_cipher,
             noarchive=False)

Then rebuild from the .spec file with

bash
pyinstaller "Hello World.spec"

In both cases we are telling PyInstaller to copy the specified files to the location ./icons/ in the output folder, meaning dist/Hello World/icons. If you run the build, you should see your .png files are now in the dist output folder, under a folder named icons.

The icon file copied to the dist folderThe icon file copied to the dist folder

If you run your app from dist you should now see the icons in your window as expected!

Window with two iconsWindow with two buttons with icons, finally!

Bundling data folders

Usually you will have more than one data file you want to include with your packaged file. The latest PyInstaller versions let you bundle folders just like you would files, keeping the sub-folder structure.

Let's update our configuration to bundle our icons folder in one go, so it will continue to work even if we add more icons in future.

To copy the icons folder across to our build application, we just need to add the folder to our .spec file Analysis block. As for the single file, we add it as a tuple with the source path (from our project folder) and the destination folder under the resulting folder in dist.

python
# ...
a = Analysis(['app.py'],
             pathex=[],
             binaries=[],
             datas=[('icons', 'icons')],   # tuple is (source_folder, destination_folder)
             hiddenimports=[],
             hookspath=[],
             hooksconfig={},
             runtime_hooks=[],
             excludes=[],
             win_no_prefer_redirects=False,
             win_private_assemblies=False,
             cipher=block_cipher,
             noarchive=False)
# ...

If you run the build using this spec file you'll see the icons folder copied across to the dist/Hello World folder. If you run the application from the folder, the icons will display as expected -- the relative paths remain correct in the new location.

Alternatively, you can bundle your data files using Qt's QResource architecture. See our tutorial for more information.

Building the App bundle into a Disk Image

So far we've used PyInstaller to bundle the application into macOS app, along with the associated data files. The output of this bundling process is a folder and an macOS app bundle, named Hello World.app.

If you try and distribute this app bundle, you'll notice a problem: the app bundle is actually just a special folder. While macOS displays it as an application, if you try and share it, you'll actually be sharing hundreds of individual files. To distribute the app properly, we need some way to package it into a single file.

The easiest way to do this is to use a .zip file. You can zip the folder and give this to someone else to unzip on their own computer, giving them a complete app bundle they can copy to their Applications folder.
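If you do go the .zip route, here is a minimal sketch of doing it from Python (it builds a throwaway dummy bundle so the snippet is self-contained). Be aware that a plain zip archive does not preserve the symlinks and execute bits inside a real .app bundle — on macOS the ditto tool (`ditto -c -k --keepParent`) is safer for that:

```python
# Hedged sketch: zip a bundle folder with the standard library.
# A dummy folder stands in for dist/Hello World.app so this runs anywhere.
import os
import shutil
import tempfile

workdir = tempfile.mkdtemp()                       # stand-in for dist/
bundle = os.path.join(workdir, "Hello World.app")  # dummy bundle for the demo
os.makedirs(os.path.join(bundle, "Contents", "MacOS"))

archive = shutil.make_archive(
    os.path.join(workdir, "Hello World"),  # archive name, without .zip
    "zip",
    root_dir=workdir,              # paths inside the zip are relative to this
    base_dir="Hello World.app",    # only the bundle goes into the archive
)
print(archive)  # path to the created "Hello World.zip"
```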

However, if you've installed macOS applications before you'll know this isn't the usual way to do it. Usually you get a Disk Image .dmg file, which when opened shows the application bundle, and a link to your Applications folder. To install the app, you just drag it across to the target.

To make our app look as professional as possible, we should copy this expected behaviour. Next we'll look at how to take our app bundle and package it into a macOS Disk Image.

Making sure the build is ready

If you've followed the tutorial so far, you'll already have your app ready in the /dist folder. If not, or yours isn't working, you can also download the source code files for this tutorial, which include a sample .spec file. As above, you can run the same build using the provided Hello World.spec file.

bash
pyinstaller "Hello World.spec"

This packages everything up as an app bundle in the dist/ folder, with a custom icon. Run the app bundle to ensure everything is bundled correctly, and you should see the same window as before with the icons visible.

Two iconsWindow with two icons, and a button.

Creating a Disk Image

Now we've successfully bundled our application, we'll next look at how we can take our app bundle and use it to create a macOS Disk Image for distribution.

To create our Disk Image we'll be using the create-dmg tool. This is a command-line tool which provides a simple way to build disk images automatically. If you are using Homebrew, you can install create-dmg with the following command.

bash
brew install create-dmg

...otherwise, see the Github repository for instructions.

The create-dmg tool takes a lot of options, but below are the most useful.

bash
create-dmg --help
create-dmg 1.0.9

Creates a fancy DMG file.

Usage:  create-dmg [options] <output_name.dmg> <source_folder>

All contents of <source_folder> will be copied into the disk image.

Options:
  --volname <name>
      set volume name (displayed in the Finder sidebar and window title)
  --volicon <icon.icns>
      set volume icon
  --background <pic.png>
      set folder background image (provide png, gif, or jpg)
  --window-pos <x> <y>
      set position the folder window
  --window-size <width> <height>
      set size of the folder window
  --text-size <text_size>
      set window text size (10-16)
  --icon-size <icon_size>
      set window icons size (up to 128)
  --icon file_name <x> <y>
      set position of the file's icon
  --hide-extension <file_name>
      hide the extension of file
  --app-drop-link <x> <y>
      make a drop link to Applications, at location x,y
  --no-internet-enable
      disable automatic mount & copy
  --add-file <target_name> <file>|<folder> <x> <y>
      add additional file or folder (can be used multiple times)
  -h, --help
        display this help screen

The most important thing to notice is that the command requires a <source folder> and all contents of that folder will be copied to the Disk Image. So to build the image, we first need to put our app bundle in a folder by itself.

Rather than do this manually each time you want to build a Disk Image I recommend creating a shell script. This ensures the build is reproducible, and makes it easier to configure.

Below is a working script to create a Disk Image from our app. It creates a temporary folder dist/dmg where we'll put the things we want to go in the Disk Image -- in our case, this is just the app bundle, but you can add other files if you like. Then we make sure the folder is empty (in case it still contains files from a previous run). We copy our app bundle into the folder, and finally check to see if there is already a .dmg file in dist and if so, remove it too. Then we're ready to run the create-dmg tool.

bash
#!/bin/sh
# Create a folder (named dmg) to prepare our DMG in (if it doesn't already exist).
mkdir -p dist/dmg
# Empty the dmg folder.
rm -r dist/dmg/*
# Copy the app bundle to the dmg folder.
cp -r "dist/Hello World.app" dist/dmg
# If the DMG already exists, delete it.
test -f "dist/Hello World.dmg" && rm "dist/Hello World.dmg"
create-dmg \
  --volname "Hello World" \
  --volicon "Hello World.icns" \
  --window-pos 200 120 \
  --window-size 600 300 \
  --icon-size 100 \
  --icon "Hello World.app" 175 120 \
  --hide-extension "Hello World.app" \
  --app-drop-link 425 120 \
  "dist/Hello World.dmg" \
  "dist/dmg/"

The options we pass to create-dmg set the dimensions of the Disk Image window when it is opened, and positions of the icons in it.

Save this shell script in the root of your project, named e.g. builddmg.sh. To make it executable, you need to set the execute bit.

bash
chmod +x builddmg.sh

With that, you can now build a Disk Image for your Hello World app with the command.

bash
./builddmg.sh

This will take a few seconds to run, producing quite a bit of output.

bash
 No such file or directory
Creating disk image...
...............................................................
created: /Users/martin/app/dist/rw.Hello World.dmg
Mounting disk image...
Mount directory: /Volumes/Hello World
Device name:     /dev/disk2
Making link to Applications dir...
/Volumes/Hello World
Copying volume icon file 'Hello World.icns'...
Running AppleScript to make Finder stuff pretty: /usr/bin/osascript "/var/folders/yf/1qvxtg4d0vz6h2y4czd69tf40000gn/T/createdmg.tmp.XXXXXXXXXX.RvPoqdr0" "Hello World"
waited 1 seconds for .DS_STORE to be created.
Done running the AppleScript...
Fixing permissions...
Done fixing permissions
Blessing started
Blessing finished
Deleting .fseventsd
Unmounting disk image...
hdiutil: couldn't unmount "disk2" - Resource busy
Wait a moment...
Unmounting disk image...
"disk2" ejected.
Compressing disk image...
Preparing imaging engine…
Reading Protective Master Boot Record (MBR : 0)…
   (CRC32 $38FC6E30: Protective Master Boot Record (MBR : 0))
Reading GPT Header (Primary GPT Header : 1)…
   (CRC32 $59C36109: GPT Header (Primary GPT Header : 1))
Reading GPT Partition Data (Primary GPT Table : 2)…
   (CRC32 $528491DC: GPT Partition Data (Primary GPT Table : 2))
Reading  (Apple_Free : 3)…
   (CRC32 $00000000:  (Apple_Free : 3))
Reading disk image (Apple_HFS : 4)…
...............................................................................
   (CRC32 $FCDC1017: disk image (Apple_HFS : 4))
Reading  (Apple_Free : 5)…
...............................................................................
   (CRC32 $00000000:  (Apple_Free : 5))
Reading GPT Partition Data (Backup GPT Table : 6)…
...............................................................................
   (CRC32 $528491DC: GPT Partition Data (Backup GPT Table : 6))
Reading GPT Header (Backup GPT Header : 7)…
...............................................................................
   (CRC32 $56306308: GPT Header (Backup GPT Header : 7))
Adding resources…
...............................................................................
Elapsed Time:  3.443s
File size: 23178950 bytes, Checksum: CRC32 $141F3DDC
Sectors processed: 184400, 131460 compressed
Speed: 18.6Mbytes/sec
Savings: 75.4%
created: /Users/martin/app/dist/Hello World.dmg
hdiutil does not support internet-enable. Note it was removed in macOS 10.15.
Disk image done

While it's building, the Disk Image will pop up. Don't get too excited yet, it's still building. Wait for the script to complete, and you will find the finished .dmg file in the dist/ folder.

The Disk Image created in the dist folder

Running the installer

Double-click the Disk Image to open it, and you'll see the usual macOS install view. Click and drag your app across to the Applications folder to install it.

The Disk Image contains the app bundle and a shortcut to the Applications folder

If you open Launchpad (press F4) you will see your app installed. If you have a lot of apps, you can search for it by typing "Hello".

The app installed on macOS

Repeating the build

Now you have everything set up, you can create a new app bundle & Disk Image of your application any time, by running the two commands from the command line.

bash
pyinstaller "Hello World.spec"
./builddmg.sh

It's that simple!

Wrapping up

In this tutorial we've covered how to build your PyQt5 applications into a macOS app bundle using PyInstaller, including adding data files along with your code. Then we walked through the process of creating a Disk Image to distribute your app to others. Following these steps you should be able to package up your own applications and make them available to other people.

For a complete view of all PyInstaller bundling options take a look at the PyInstaller usage documentation.

For more, see the complete PyQt5 tutorial.

Mike Driscoll: PyDev of the Week: Batuhan Taskaya


This week we welcome Batuhan Taskaya (@isidentical) as our PyDev of the Week! Batuhan is a core developer of the Python language. Batuhan is also a maintainer of multiple Python packages including parso and Black.

You can see what else Batuhan is up to by checking out his website or GitHub profile.

Let's take a few moments to get to know Batuhan better!

Can you tell us a little about yourself (hobbies, education, etc):

Hey there! My name is Batuhan, and I'm a software engineer who loves to work on developer tools to improve the overall productivity of the Python ecosystem.

I pretty much fill all my free time with open source maintenance and other programming related activities. If I am not programming at that time, I am probably reading a paper about PLT or watching some sci-fi show. I am a huge fan of the Stargate franchise.

Why did you start using Python?

I was always intrigued by computers but didn't do anything related to programming until I started using GNU/Linux on my personal computer (namely Ubuntu 12.04). Back then, I was searching for something to pass the time and found Python.

Initially, I was mind-blown by the responsiveness of the REPL. I typed `2 + 2`, it replied `4` back to me. Such a joy! For someone with literally zero programming experience, it was a very friendly environment. Later, I started following some tutorials, writing more code and repeating that process until I got a good grasp of the Python language and programming in general.

What other programming languages do you know and which is your favourite?

After being exposed to the level of elegance and simplicity in Python, I set the bar too high for adopting a new language. C is a great example where the language (in its own terms) is very straightforward, and currently, it is the only language I actively use apart from Python. I also think it goes really well when paired with Python, which might not be surprising considering that CPython itself and the extension modules are written in C.

If we let the mainstream languages go, I love building one-off compilers for weird/esoteric languages.

What projects are you working on now?

Most of my work revolves around CPython, which is the reference implementation of the Python language. In terms of the core, I specialize in the parser and the compiler. But outside of it, I maintain the ast module, and a few others.

One of the recent changes I've collaborated on (with Pablo Galindo Salgado and Ammar Askar) in CPython was the new fancy tracebacks, which I hope will really increase the productivity of Python developers:

Traceback (most recent call last):
  File "query.py", line 37, in <module>
    magic_arithmetic('foo')
    ^^^^^^^^^^^^^^^^^^^^^^^
  File "query.py", line 18, in magic_arithmetic
    return add_counts(x) / 25
           ^^^^^^^^^^^^^
  File "query.py", line 24, in add_counts
    return 25 + query_user(user1) + query_user(user2)
                ^^^^^^^^^^^^^^^^^
  File "query.py", line 32, in query_user
    return 1 + query_count(db, response['a']['b']['c']['user'], retry=True)
                               ~~~~~~~~~~~~~~~~~~^^^^^
TypeError: 'NoneType' object is not subscriptable

 

Alongside that, I help maintain several other projects, and I am a core member of fsspec.

Which Python libraries are your favorite (core or 3rd party)?

It might be a bit obvious, but I love the ast module. Apart from that, I enjoy using dataclasses and pathlib.

I generally avoid using dependencies since nearly 99% of the time, I can simply use the stdlib. But there is one exception, rich. For the last three months, nearly every script I've written uses it. It is such a beauty (both in terms of the UI and the API). I also really love pytest and pre-commit.

Not a library as such, but one of my favorite projects from the Python ecosystem is PyPy. It brings an entirely new Python runtime, which depending on your workload can be 1000X faster (or just 4X in general).

Is there anything else you’d like to say?

I've recently started a GitHub Sponsors Page, and if any of my work directly touches you (or your company) please consider sponsoring me!

Thanks for the interview Mike, and I hope people reading the article enjoyed it as much as I enjoyed answering these questions!

Thanks for doing the interview, Batuhan!

The post PyDev of the Week: Batuhan Taskaya appeared first on Mouse Vs Python.


Matt Layman: Episode 16 - Setting Your Sites

On this episode, we look at how to manage settings on your Django site. What are the common techniques to make this easier to handle? Let’s find out! Listen at djangoriffs.com or with the player below. Last Episode On the last episode, we dug into sessions and how Django uses that data storage technique for visitors to your site. How Is Django Configured? To run properly, Django needs to be configured.

Real Python: Python News: What's New From January 2022?


In January 2022, the code formatter Black saw its first non-beta release and published a new stability policy. IPython, the powerful interactive Python shell, marked the release of version 8.0, its first major version release in three years. Additionally, PEP 665, aimed at making reproducible installs easier by specifying a format for lock files, was rejected. Last but not least, a fifteen-year-old memory leak bug in Python was fixed.

Let’s dive into the biggest Python news stories from the past month!


Black No Longer Beta

The developers of Black, an opinionated code formatter, are now confident enough to call the latest release stable. This announcement brings Black out of beta for the first time:

Screenshot of tweet announcing stable release of Black (image source)

Code formatting can be the source of a surprising amount of conflict among developers. This is why code formatters and linters help enforce style conventions to maintain consistency across a whole codebase. Linters suggest changes, while code formatters rewrite your code:

Demo of Black formatter executing

This makes your codebase more consistent, helps catch errors early, and makes code easier to scan.

YAPF is an example of a formatter. It comes with the PEP 8 style guide as a default, but it’s not strongly opinionated, giving you a lot of control over its configuration.

Black goes further: it comes with a PEP 8 compliant style, but on the whole, it’s not configurable. The idea behind disallowing configuration is that you free up your brain to focus on the actual code by relinquishing control over style. Many believe this restriction gives them much more freedom to be creative coders. But of course, not everyone likes to give up this control!

One crucial feature of opinionated formatters like Black is that they make your diffs much more informative. If you’ve ever committed a cleanup or formatting commit to your version control system, you may have inadvertently polluted your diff.

Read the full article at https://realpython.com/python-news-january-2022/ »



death and gravity: Dealing with YAML with arbitrary tags in Python


... in which we use PyYAML to safely read and write YAML with any tags, in a way that's as straightforward as interacting with built-in types.

If you're in a hurry, you can find the code at the end.

Contents

Why is this useful?#

People mostly use YAML as a friendlier alternative to JSON1, but it can do way more.

Among others, it can represent user-defined and native data structures.

Say you need to read (or write) an AWS Cloud Formation template:

EC2Instance:
  Type: AWS::EC2::Instance
  Properties:
    ImageId: !FindInMap [
        AWSRegionArch2AMI,
        !Ref 'AWS::Region',
        !FindInMap [AWSInstanceType2Arch, !Ref InstanceType, Arch],
      ]
    InstanceType: !Ref InstanceType
>>> yaml.safe_load(text)
Traceback (most recent call last):
...
yaml.constructor.ConstructorError: could not determine a constructor for the tag '!FindInMap'
  in "<unicode string>", line 4, column 14:
        ImageId: !FindInMap [
                 ^

... or, you need to safely read untrusted YAML that represents Python objects:

!!python/object/new:module.Class { attribute: value }
>>> yaml.safe_load(text)
Traceback (most recent call last):
...
yaml.constructor.ConstructorError: could not determine a constructor for the tag 'tag:yaml.org,2002:python/object/new:module.Class'
  in "<unicode string>", line 1, column 1:
    !!python/object/new:module.Class ...
    ^

Warning

Historically, yaml.load(thing) was unsafe for untrusted data, because it allowed running arbitrary code. Consider using safe_load() instead.

Details.

For example, you could do this:

>>> yaml.load("!!python/object/new:os.system [echo WOOSH. YOU HAVE been compromised]")WOOSH. YOU HAVE been compromised0

There were a bunch of CVEs about it.

To address the issue, load() requires an explicit Loader since PyYAML 6. Also, version 5 added two new functions and corresponding loaders:

  • full_load() resolves all tags except those known to be unsafe (note that this was broken before 5.4, and thus vulnerable)
  • unsafe_load() resolves all tags, even those known to be unsafe (the old load() behavior)

safe_load() resolves only basic tags, remaining the safest.


Can't I just get the data, without it being turned into objects?

You can! The YAML spec says:

In a given processing environment, there need not be an available native type corresponding to a given tag. If a node’s tag is unavailable, a YAML processor will not be able to construct a native data structure for it. In this case, a complete representation may still be composed and an application may wish to use this representation directly.

And PyYAML obliges:

>>> text="""\... one: !myscalar string... two: !mysequence [1, 2]... """>>> yaml.compose(text)MappingNode(    tag='tag:yaml.org,2002:map',    value=[        (            ScalarNode(tag='tag:yaml.org,2002:str', value='one'),            ScalarNode(tag='!myscalar', value='string'),        ),        (            ScalarNode(tag='tag:yaml.org,2002:str', value='two'),            SequenceNode(                tag='!mysequence',                value=[                    ScalarNode(tag='tag:yaml.org,2002:int', value='1'),                    ScalarNode(tag='tag:yaml.org,2002:int', value='2'),                ],            ),        ),    ],)>>> print(yaml.serialize(_))one: !myscalar 'string'two: !mysequence [1, 2]

... the spec didn't say the representation has to be concise. ¯\_(ツ)_/¯

Here's how YAML processing works, to give you an idea what we're looking at:

YAML Processing Overview (diagram)

The output of compose() above is the representation (node graph).

From that, safe_load() does its best to construct objects, but it can't do anything for tags it doesn't know about.


There must be a better way!

Thankfully, the spec also says:

That said, tag resolution is specific to the application. YAML processors should therefore provide a mechanism allowing the application to override and expand these default tag resolution rules.

We'll use this mechanism to convert tagged nodes to almost-native types, while preserving the tags.

A note on PyYAML extensibility#

PyYAML is a bit unusual.

For each processing direction, you have a corresponding Loader/Dumper class.

For each processing step, you can add callbacks, stored in class-level registries.

The callbacks are method-like – they receive the Loader/Dumper as the first argument:

Dice = namedtuple('Dice', 'a b')

def dice_representer(dumper, data):
    return dumper.represent_scalar(u'!dice', u'%sd%s' % data)

yaml.Dumper.add_representer(Dice, dice_representer)

You may notice the add_...() methods modify the class in-place, for everyone, which isn't necessarily great; imagine getting a Dice from safe_load(), when you were expecting only built-in types.

We can avoid this by subclassing, since the registry is copied from the parent. Note that because of how copying is implemented, registries from two direct parents are not merged – you only get the registry of the first parent in the MRO.


So, we'll start by subclassing SafeLoader/Dumper:

class Loader(yaml.SafeLoader):
    pass

class Dumper(yaml.SafeDumper):
    pass

Preserving tags#

Constructing unknown objects#

For now, we can use named tuples for objects with unknown tags, since they are naturally tag/value pairs:

class Tagged(typing.NamedTuple):
    tag: str
    value: object

Tag or no tag, all YAML nodes are either a scalar, a sequence, or a mapping. For unknown tags, we delegate construction to the loader's default constructors, and wrap the resulting value:

def construct_undefined(self, node):
    if isinstance(node, yaml.nodes.ScalarNode):
        value = self.construct_scalar(node)
    elif isinstance(node, yaml.nodes.SequenceNode):
        value = self.construct_sequence(node)
    elif isinstance(node, yaml.nodes.MappingNode):
        value = self.construct_mapping(node)
    else:
        assert False, f"unexpected node: {node!r}"
    return Tagged(node.tag, value)

Loader.add_constructor(None, construct_undefined)

Constructors are registered by tag, with None meaning "unknown".

Things look much better already:

>>> yaml.load(text, Loader=Loader)
{
    'one': Tagged(tag='!myscalar', value='string'),
    'two': Tagged(tag='!mysequence', value=[1, 2]),
}

A better wrapper#

That's nice, but every time we use any value, we have to check if it's tagged, and then go through value if it is:

>>> one = _['one']
>>> one.tag
'!myscalar'
>>> one.value.upper()
'STRING'

We could subclass the Python types corresponding to core YAML tags (str, list, and so on), and add a tag attribute to each. We could subclass most of them, anyway – neither bool nor NoneType can be subclassed.
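That limitation is easy to check in plain Python (the TaggedStr/TaggedBool names below are illustrative, not from the article): str accepts a subclass carrying an extra attribute, while bool refuses outright:

```python
# str can be subclassed, so a tagged string type is possible:
class TaggedStr(str):
    tag = None

s = TaggedStr("string")
s.tag = "!myscalar"
assert s.upper() == "STRING"  # still behaves like a str

# bool cannot be subclassed -- the attempt raises TypeError:
try:
    class TaggedBool(bool):
        tag = None
except TypeError:
    print("bool refuses to be subclassed")
```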

Or, we could wrap tagged objects in a class with the same interface, that delegates method calls and attribute access to the wrapee, with a tag attribute on top.

Tip

This is known as the decorator design pattern (not to be confused with Python decorators).

Doing this naively entails writing one wrapper per type, with one wrapper method per method and one property per attribute. That's even worse than subclassing!

There must be a better way!


Of course, this is Python, so there is.

We can use an object proxy instead (also known as "dynamic wrapper"). While they're not perfect in general, the one wrapt provides is damn near perfect enough2:

class Tagged(wrapt.ObjectProxy):
    # tell wrapt to set the attribute on the proxy, not the wrapped object
    tag = None

    def __init__(self, tag, wrapped):
        super().__init__(wrapped)
        self.tag = tag

    def __repr__(self):
        return f"{type(self).__name__}({self.tag!r}, {self.__wrapped__!r})"

>>> yaml.load(text, Loader=Loader)
{
    'one': Tagged('!myscalar', 'string'),
    'two': Tagged('!mysequence', [1, 2]),
}

The proxy behaves identically to the proxied object:

>>> one = _['one']
>>> one.tag
'!myscalar'
>>> one.upper()
'STRING'
>>> one[:3]
'str'

...up to and including fancy things like isinstance():

>>> isinstance(one, str)
True
>>> isinstance(one, Tagged)
True

And now you don't have to care about tags if you don't want to.

Representing tagged objects#

The trip back is exactly the same, but much shorter:

def represent_tagged(self, data):
    assert isinstance(data, Tagged), data
    node = self.represent_data(data.__wrapped__)
    node.tag = data.tag
    return node

Dumper.add_representer(Tagged, represent_tagged)

Representers are registered by type.

>>> print(yaml.dump(Tagged('!hello', 'world'), Dumper=Dumper))
!hello 'world'

Let's mark the occasion with some tests.

Since we still have stuff to do, we parametrize the tests from the start.

BASIC_TEXT = """\
one: !myscalar string
two: !mymapping
  three: !mysequence [1, 2]
"""
BASIC_DATA = {
    'one': Tagged('!myscalar', 'string'),
    'two': Tagged('!mymapping', {'three': Tagged('!mysequence', [1, 2])}),
}
DATA = [
    (BASIC_TEXT, BASIC_DATA),
]

Loading works:

@pytest.mark.parametrize('text, data', DATA)
def test_load(text, data):
    assert yaml.load(text, Loader=Loader) == data

And dumping works:

@pytest.mark.parametrize('text', [t[0] for t in DATA])
def test_roundtrip(text):
    data = yaml.load(text, Loader=Loader)
    assert data == yaml.load(yaml.dump(data, Dumper=Dumper), Loader=Loader)

... but only for known types:

def test_dump_error():
    with pytest.raises(yaml.representer.RepresenterError):
        yaml.dump(object(), Dumper=Dumper)

Unhashable keys#

Let's try an example from the PyYAML documentation:

>>> text="""\... ? !!python/tuple [0,0]... : The Hero... ? !!python/tuple [1,0]... : Treasure... ? !!python/tuple [1,1]... : The Dragon... """

This is supposed to result in something like:

>>> yaml.unsafe_load(text)
{(0, 0): 'The Hero', (1, 0): 'Treasure', (1, 1): 'The Dragon'}

Instead, we get:

>>> yaml.load(text, Loader=Loader)
Traceback (most recent call last):
...
TypeError: unhashable type: 'list'

That's because the keys are tagged lists, and neither type is hashable:

>>> yaml.load("!!python/tuple [0,0]",Loader=Loader)Tagged('tag:yaml.org,2002:python/tuple', [0, 0])

This limitation comes from how Python dicts are implemented,3 not from YAML; quoting from the spec again:

The content of a mapping node is an unordered set of key/value node pairs, with the restriction that each of the keys is unique. YAML places no further restrictions on the nodes. In particular, keys may be arbitrary nodes, the same node may be used as the value of several key/value pairs and a mapping could even contain itself as a key or a value.
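The dict-side limitation is easy to see in plain Python, independent of YAML: a hashable tuple works as a key where an equal-looking (but mutable) list does not:

```python
# Tuples are immutable and hashable, so they work as dict keys:
board = {(0, 0): 'The Hero', (1, 0): 'Treasure'}
assert board[(0, 0)] == 'The Hero'

# Lists are mutable, hence unhashable, hence rejected as keys:
try:
    board[[1, 1]] = 'The Dragon'
except TypeError as e:
    print(e)  # unhashable type: 'list'
```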

Constructing pairs#

What now?

Same strategy as before: wrap the things we can't handle.

Specifically, whenever we have a mapping with unhashable keys, we return a list of pairs instead. To tell it apart from plain lists, we use a subclass:

class Pairs(list):
    def __repr__(self):
        return f"{type(self).__name__}({super().__repr__()})"

Again, we let the loader do most of the work:

def construct_mapping(self, node):
    value = self.construct_pairs(node)
    try:
        return dict(value)
    except TypeError:
        return Pairs(value)

Loader.construct_mapping = construct_mapping
Loader.add_constructor('tag:yaml.org,2002:map', Loader.construct_mapping)

We set construct_mapping so that any other Loader constructor wanting to make a mapping gets to use it (like our own construct_undefined() above). Don't be fooled by the assignment; it's a method like any other.4 But since we're changing the class from outside anyway, it's best to stay consistent.

Note that overriding construct_mapping() is not enough: we have to register the constructor explicitly, otherwise SafeLoader's construct_mapping() will be used (since that's what was in the registry before).
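The "assignment makes a method" behavior itself is ordinary Python, nothing PyYAML-specific; a function assigned to a class attribute from outside binds like a normal method (the names below are illustrative):

```python
class C:
    pass

def double(self, x):
    # once assigned to the class, this binds like a regular method
    return x * 2

C.double = double
assert C().double(3) == 6
```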

Note

In case you're wondering, this feature is orthogonal from handling unknown tags; we could have used different classes for them. However, as mentioned before, the constructor registry breaks multiple inheritance, so we couldn't use the two features together.

Anyway, it works:

>>> yaml.load(text, Loader=Loader)
Pairs(
    [
        (Tagged('tag:yaml.org,2002:python/tuple', [0, 0]), 'The Hero'),
        (Tagged('tag:yaml.org,2002:python/tuple', [1, 0]), 'Treasure'),
        (Tagged('tag:yaml.org,2002:python/tuple', [1, 1]), 'The Dragon'),
    ]
)

Representing pairs#

Like before, the trip back is short and uneventful:

def represent_pairs(self, data):
    assert isinstance(data, Pairs), data
    node = self.represent_dict(data)
    return node

Dumper.add_representer(Pairs, represent_pairs)

>>> print(yaml.dump(Pairs([([], 'one')]), Dumper=Dumper))
[]: one

Let's test this more thoroughly.

Because the tests are parametrized, we just need to add more data:

UNHASHABLE_TEXT = """\
[0,0]: one
!key {0: 1}: {[]: !value three}
"""
UNHASHABLE_DATA = Pairs([
    ([0, 0], 'one'),
    (Tagged('!key', {0: 1}), Pairs([([], Tagged('!value', 'three'))])),
])
DATA = [
    (BASIC_TEXT, BASIC_DATA),
    (UNHASHABLE_TEXT, UNHASHABLE_DATA),
]

Conclusion#

YAML is extensible by design. I hope that besides what it says on the tin, this article shed some light on how to customize PyYAML for your own purposes, and that you've learned at least one new Python thing.

You can get the code here, and the tests here.

Learned something new today? Share this with others, it really helps!

Want more? Get updates via email or Atom feed.

Bonus: hashable wrapper#

You may be asking, why not make the wrapper hashable?

Most unhashable (data) objects are that for a reason: because they're mutable.

We have two options:

  • Make the wrapper hash change with the content. This will break dictionaries in strange and unexpected ways (and other things too) – the language requires mutable objects to be unhashable.

  • Make the wrapper hash not change with the content, and wrappers equal only to themselves – that's what user-defined classes do by default anyway.

    This works, but it's not very useful, because equal values don't compare equal anymore (data != load(dump(data))). Also, it means you can only get things from a dict if you already have the object used as key:

    >>> data = {Hashable([1]): 'one'}
    >>> data[Hashable([1])]
    Traceback (most recent call last):
    ...
    KeyError: Hashable([1])
    >>> key = list(data)[0]
    >>> data[key]
    'one'

    I'd file this under "strange and unexpected" too.

    (You can find the code for the example above here.)
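The second option is just the default behavior of user-defined classes: they hash by identity, so two equal-content instances are different dictionary keys. A minimal sketch (Box is an illustrative class, not from the article):

```python
class Box:
    # default user-defined class: identity-based hash and equality
    def __init__(self, items):
        self.items = items

a = Box([1])
b = Box([1])
d = {a: 'one'}
assert a in d      # the exact same object is found
assert b not in d  # an equal-content object is a different key
```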

Bonus: broken YAML#

We can venture even farther, into arguably broken YAML. Let's look at some examples.

First, there are undefined tag prefixes:

>>> yaml.load("!m!xyz x",Loader=Loader)Traceback (most recent call last):...yaml.parser.ParserError: while parsing a nodefound undefined tag handle '!m!'  in "<unicode string>", line 1, column 1:    !m!xyz x    ^

A valid version:

>>> yaml.load("""\... %TAG !m! !my-... ---... !m!xyz x... """,Loader=Loader)Tagged('!my-xyz', 'x')

Second, there are undefined aliases:

>>> yaml.load("two: *anchor",Loader=Loader)Traceback (most recent call last):...yaml.composer.ComposerError: found undefined alias 'anchor'  in "<unicode string>", line 1, column 6:    two: *anchor         ^

A valid version:

>>> yaml.load("""\... one: &anchor [1]... two: *anchor... """,Loader=Loader){'one': [1], 'two': [1]}

It's likely possible to handle these in a way similar to how we handled undefined tags, but we'd have to go deeper – the exceptions hint to which processing step to look at.

Since I haven't actually encountered them in real life, we'll "save them for later" :)

  1. Of which YAML is actually a superset. [return]

  2. Timothy 20:9. [return]

  3. Using a hash table. For nice explanation of how it all works, complete with a pure-Python implementation, check out Raymond Hettinger's talk Modern Python Dictionaries: A confluence of a dozen great ideas (code). [return]

  4. Almost. The zero argument form of super() won't work for methods defined outside of a class definition, but we're not using it here. [return]

Python Morsels: Making the len function work on your Python objects


In Python, you can make the built-in len function work on your objects.

The len function only works on objects that have a length

The built-in len function works on some objects, but not on others. Only things that have a length work with the len function.

Lists, sets, dictionaries, strings, and most data structures in Python have a length:

>>> numbers = [2, 1, 3, 4, 7, 11, 18]
>>> len(numbers)
7

But numbers don't:

>>> n = 10
>>> len(n)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: object of type 'int' has no len()

When making a class in Python, you can control whether instances of that class have a length.

Python's built-in len function calls the __len__ method (pronounced "dunder len") on the object you give it.

So if that object has a __len__ method, it has a length:

>>> numbers = [2, 1, 3, 4, 7, 11, 18]
>>> numbers.__len__()
7

If it doesn't have a __len__ method, the len function raises a TypeError instead:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: object of type 'int' has no len()

How to make instances of your class have a length?

Python's random module has a function (choice) which can randomly select an item from a given sequence.

>>> import random
>>> colors = ['red', 'blue', 'green', 'purple']
>>> random.choice(colors)
'purple'

This choice function only works on objects that can be indexed and have a length.

Here we have a class named ForgivingIndexer:

class ForgivingIndexer:
    def __init__(self, sequence):
        self.sequence = sequence

    def __getitem__(self, index):
        return self.sequence[int(index)]

This class has a __init__ method and a __getitem__ method. That __getitem__ method allows instances of this class to be indexed using square brackets ([]).

But this isn't quite enough to allow our ForgivingIndexer objects to work with the random.choice function. If we pass a ForgivingIndexer object to the random.choice function, we'll get an error:

>>> import random
>>> fruits = ForgivingIndexer(['apple', 'lime', 'pear', 'watermelon'])
>>> random.choice(fruits)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.10/random.py", line 378, in choice
    return seq[self._randbelow(len(seq))]
TypeError: object of type 'ForgivingIndexer' has no len()

Python gives us an error because ForgivingIndexer objects don't have a length, which the random.choice function requires. These objects don't work with the built-in len function:

>>> fruits = ForgivingIndexer(['apple', 'lime', 'pear', 'watermelon'])
>>> len(fruits)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: object of type 'ForgivingIndexer' has no len()

In order to support the built-in len function, we can add a __len__ method to this class:

def __len__(self):
    return len(self.sequence)

Now instances of this class have a length:

>>> import random
>>> fruits = ForgivingIndexer(['apple', 'lime', 'pear', 'watermelon'])
>>> len(fruits)
4

And they also work with random.choice:

>>> random.choice(fruits)
'apple'

Summary

You can make your objects work with the built-in len function by adding a __len__ method to them. You'll pretty much only ever add a __len__ method if you're making a custom data structure, like a sequence or a mapping.

ItsMyCode: TypeError: method() takes 1 positional argument but 2 were given


If you define a method inside a class, you should add self as the first argument. If you forget the self argument, then Python will raise TypeError: method() takes 1 positional argument but 2 were given

In this tutorial, we will look at what method() takes 1 positional argument but 2 were given error means and how to resolve this error with examples.

TypeError: method() takes 1 positional argument but 2 were given

In Python, we need to pass “self” as the first argument for all the methods that are defined in a class. It is similar to this in JavaScript.

We know that class is a blueprint for the objects, and we can use the blueprints to create multiple instances of objects.

The self is used to represent the instance(object) of the class. Using this keyword, we can access the attributes and methods of the class in Python.

Let us take a simple example to reproduce this error.

If you look at the below example, we have an Employee class, and we have a simple method that takes the name as a parameter and prints the Employee ID as output.

# Employee Class
class Employee:
    # Get Employee method without self parameter
    def GetEmployeeID(name):
        print(f"The Employee ID of {name} ", 1234)

# instance of the employee
empObj = Employee()
empObj.GetEmployeeID("Chandler Bing")

Output

Traceback (most recent call last):
  File "c:\Personal\IJS\Code\main.py", line 10, in <module>
    empObj.GetEmployeeID("Chandler Bing")
TypeError: Employee.GetEmployeeID() takes 1 positional argument but 2 were given

When we run the code, we get a TypeError: method() takes 1 positional argument but 2 were given

How to fix TypeError: method() takes 1 positional argument but 2 were given

In our above code, we have not passed the self argument to the method defined in the Employee class, which leads to TypeError.

As shown below, we can fix the issue by passing the “self” as a parameter explicitly to the GetEmployeeID() method.

# Employee Class
class Employee:
    # Get Employee method with self parameter
    def GetEmployeeID(self, name):
        print(f"The Employee ID of {name} ", 1234)

# instance of the employee
empObj = Employee()
empObj.GetEmployeeID("Chandler Bing")

Output

The Employee ID of Chandler Bing  1234

In Python, when we call a method with some arguments, the corresponding class function is called with the method's object placed before the first argument.

For example, obj.method(args) becomes Class.method(obj, args).

The calling process is automatic, but it should be defined explicitly on the receiving side.

This is one of the main reasons the first parameter of a function in a class must be the object itself.

It is not mandatory to name this argument “self”; any valid parameter name will work.

The “self” is neither a built-in keyword nor has special meaning in Python. It is just a naming convention that developers follow because it improves the readability of the code.
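The conversion described above - obj.method(args) becoming Class.method(obj, args) - can be checked directly with the article's Employee class:

```python
class Employee:
    def GetEmployeeID(self, name):
        return f"The Employee ID of {name} is 1234"

empObj = Employee()

# Calling through the instance and calling through the class are equivalent:
# Python inserts the instance as the first ("self") argument automatically.
a = empObj.GetEmployeeID("Chandler Bing")
b = Employee.GetEmployeeID(empObj, "Chandler Bing")
assert a == b
print(a)  # The Employee ID of Chandler Bing is 1234
```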

Conclusion

The TypeError: method() takes 1 positional argument but 2 were given occurs if we do not pass the “self” as an argument to all the methods defined inside the class.

The self is used to represent the instance(object) of the class. Using this keyword, we can access the attributes and methods of the class in Python.

The issue is resolved by passing the “self” as a parameter to all the methods defined in a class.

Stack Abuse: Convert Numpy Array to Tensor and Tensor to Numpy Array with PyTorch


Tensors are multi-dimensional objects, and the essential data representation block of Deep Learning frameworks such as Tensorflow and PyTorch.

A scalar has zero dimensions, a vector has one, a matrix has two, and tensors have three or more dimensions. In practice, we oftentimes refer to scalars, vectors and matrices as tensors as well for convenience.

Note: A tensor can also be any n-dimensional array, just like a Numpy array can. Many frameworks have support for working with Numpy arrays, and many of them are built on top of Numpy so the integration is both natural and efficient.

However, a torch.Tensor has more built-in capabilities than Numpy arrays do, and these capabilities are geared towards Deep Learning applications (such as GPU acceleration), so it makes sense to prefer torch.Tensor instances over regular Numpy arrays when working with PyTorch. Additionally, torch.Tensors have a very Numpy-like API, making it intuitive for most with prior experience!

In this guide, learn how to convert between a Numpy Array and PyTorch Tensors.

Convert Numpy Array to PyTorch Tensor

To convert a Numpy array to a PyTorch tensor - we have two distinct approaches we could take: using the from_numpy() function, or by simply supplying the Numpy array to the torch.Tensor() constructor or by using the tensor() function:

import torch
import numpy as np

np_array = np.array([5, 7, 1, 2, 4, 4])

# Convert Numpy array to torch.Tensor
tensor_a = torch.from_numpy(np_array)
tensor_b = torch.Tensor(np_array)
tensor_c = torch.tensor(np_array)

So, what's the difference? The from_numpy() and tensor() functions are dtype-aware! Since we've created a Numpy array of integers, the dtype of the underlying elements will be Numpy's default integer type - int32 on the Windows machine this example was run on (on Linux and macOS it would be int64):

print(np_array.dtype)
# dtype('int32')

If we were to print out our two tensors:

print(f'tensor_a: {tensor_a}\ntensor_b: {tensor_b}\ntensor_c: {tensor_c}')

tensor_a and tensor_c retain the data type used within the np_array, cast into PyTorch's variant (torch.int32), while tensor_b automatically assigns the values to floats:

tensor_a: tensor([5, 7, 1, 2, 4, 4], dtype=torch.int32)
tensor_b: tensor([5., 7., 1., 2., 4., 4.])
tensor_c: tensor([5, 7, 1, 2, 4, 4], dtype=torch.int32)

This can also be observed through checking their dtype fields:

print(tensor_a.dtype) # torch.int32
print(tensor_b.dtype) # torch.float32
print(tensor_c.dtype) # torch.int32

Numpy Array to PyTorch Tensor with dtype

These approaches also differ in whether you can explicitly set the desired dtype when creating the tensor. from_numpy() and Tensor() don't accept a dtype argument, while tensor() does:

# Retains Numpy dtype
tensor_a = torch.from_numpy(np_array)
# Creates tensor with float32 dtype
tensor_b = torch.Tensor(np_array)
# Retains Numpy dtype OR creates tensor with specified dtype
tensor_c = torch.tensor(np_array, dtype=torch.int32)

print(tensor_a.dtype) # torch.int32
print(tensor_b.dtype) # torch.float32
print(tensor_c.dtype) # torch.int32

Naturally, you can cast any of them very easily, using the exact same syntax, allowing you to set the dtype after creation as well, so the acceptance of a dtype argument isn't a limitation, but more of a convenience:

tensor_a = tensor_a.float()
tensor_b = tensor_b.float()
tensor_c = tensor_c.float()

print(tensor_a.dtype) # torch.float32
print(tensor_b.dtype) # torch.float32
print(tensor_c.dtype) # torch.float32

Convert PyTorch Tensor to Numpy Array

Converting a PyTorch Tensor to a Numpy array is straightforward - on the CPU, tensors and Numpy arrays can share the same underlying memory, so all we have to do is "expose" the underlying data structure.

Since PyTorch can optimize the calculations performed on data based on your hardware, there are a couple of caveats though:

tensor = torch.tensor([1, 2, 3, 4, 5])

np_a = tensor.numpy()
np_b = tensor.detach().numpy()
np_c = tensor.detach().cpu().numpy()

So, why use detach() and cpu() before exposing the underlying data structure with numpy(), and when should you detach and transfer to a CPU?

CPU PyTorch Tensor -> CPU Numpy Array

If your tensor is on the CPU, where the new Numpy array will also be - it's fine to just expose the data structure:

np_a = tensor.numpy()
# array([1, 2, 3, 4, 5], dtype=int64)

This works very well, and you've got yourself a clean Numpy array.

CPU PyTorch Tensor with Gradients -> CPU Numpy Array

However, if your tensor requires you to calculate gradients for it as well (i.e. the requires_grad argument is set to True), this approach won't work anymore. You'll have to detach the underlying array from the tensor, and through detaching, you'll be pruning away the gradients:

tensor = torch.tensor([1, 2, 3, 4, 5], dtype=torch.float32, requires_grad=True)

np_a = tensor.numpy()
# RuntimeError: Can't call numpy() on Tensor that requires grad. Use tensor.detach().numpy() instead.
np_b = tensor.detach().numpy()
# array([1., 2., 3., 4., 5.], dtype=float32)

GPU PyTorch Tensor -> CPU Numpy Array

Finally - if you've created your tensor on the GPU, it's worth remembering that regular Numpy arrays don't support GPU acceleration. They reside on the CPU! You'll have to transfer the tensor to a CPU, and then detach/expose the data structure.

Note: This can either be done via the to('cpu') or cpu() functions - they're functionally equivalent.

This has to be done explicitly, because if it were done automatically - the conversion between CPU and CUDA tensors to arrays would be different under the hood, which could lead to unexpected bugs down the line.

PyTorch is fairly explicit, so this sort of automatic conversion was purposefully avoided:

# Create tensor on the GPU
tensor = torch.tensor([1, 2, 3, 4, 5], dtype=torch.float32, requires_grad=True).cuda()

np_b = tensor.detach().numpy()
# TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
np_c = tensor.detach().cpu().numpy()
# array([1., 2., 3., 4., 5.], dtype=float32)

Note: It's highly advised to call detach() before cpu(), to prune away the gradients before transferring to the CPU. The gradients won't matter anyway after the detach() call - so copying them at any point is totally redundant and inefficient. It's better to "cut the dead weight" as soon as possible.

Generally speaking - this approach is the safest, since it won't fail no matter which sort of tensor you're working with. If you've got a CPU tensor and you try sending it to the CPU - nothing happens. If you've got a tensor without gradients and try detaching it - nothing happens. Skip a needed detach() or cpu() call, though, and an exception is thrown.

Conclusion

In this guide - we've taken a look at what PyTorch tensors are, before diving into how to convert a Numpy array into a PyTorch tensor. Finally, we've explored how PyTorch tensors can expose the underlying Numpy array, and in which cases you'd have to perform additional transfers and pruning.

Glyph Lefkowitz: A Better Pygame Mainloop


I’ve written about this before, but in that context I was writing mainly about frame-rate independence, and only gave a brief mention of vertical sync; the title also mentioned Twisted, and upon re-reading it I realized that many folks who might get a lot of use out of its technique would not have bothered to read it, just because I made it sound like an aside in the context of an animation technique in a game that already wanted to use Twisted for some reason, rather than a comprehensive best practice. Now that Pygame 2.0 is out, though, and the vsync=1 flag is more reliably available to everyone, I thought it would be worth revisiting.


Per the many tutorials out there, including the official one, most Pygame mainloops look like this:

pygame.display.set_mode((320, 240))
while 1:
    for event in pygame.event.get():
        handleEvent(event)
    for drawable in myDrawables:
        drawable.draw()
    pygame.display.flip()

Obviously that works okay, or folks wouldn’t do it, but it can give an impression of a certain lack of polish for most beginner Pygame games.

The thing that’s always bothered me personally about this idiom is: where does the networking go? After spending many years trying to popularize event loops in Python, I’m sad to see people implementing loops over and over again that have no way to get networking, or threads, or timers scheduled in a standard way so that libraries could be written without the application needing to manually call them every frame.

But, who cares how I feel about it? Lots of games don’t have networking1. There are more general problems with it. Specifically, it is likely to:

  1. waste power, and
  2. look bad.

Wasting Power

Why should anyone care about power when they’re making a video game? Aren’t games supposed to just gobble up CPUs and GPUs for breakfast, burning up as much power as they need for the most gamer experience possible?

Chances are, if you’re making a game that you expect anyone that you don’t personally know to play, they’re going to be playing it on a laptop2. Pygame might have a reputation for being “slow”, but for a simple 2D game with only a few sprites, Python can easily render several thousand frames per second. Even the fastest display in the world can only refresh at 360Hz3. That’s less than one thousand frames per second. The average laptop display is going to be more like 60Hz, or — if you’re lucky — maybe 120. By rendering thousands of frames that the user never even sees, you warm up their CPU uncomfortably4, and you waste 10x (or more) of their battery doing useless work.

At some point your game might have enough stuff going on that it will run the CPU at full tilt, and if it does, that’s probably fine; at least then you’ll be using up that heat and battery life in order to make their computer do something useful. But even if it is, it’s probably not doing that all of the time, and battery is definitely a use-over-time sort of problem.

Looking Bad

If you’re rendering directly to the screen without regard for vsync, your players are going to experience Screen Tearing, where the screen is in the middle of updating while you’re in the middle of drawing to it. This looks especially bad if your game is panning over a background, which is a very likely scenario for the usual genre of 2D Pygame game.

How to fix it?

Pygame lets you turn on VSync, and in Pygame 2, you can do this simply by passing the pygame.SCALED flag and the vsync=1 argument to set_mode().

Now your game will have silky smooth animations and scrolling5! Solved!

But... if the fix is so simple, why doesn’t everybody — including, notably, the official documentation — recommend doing this?

The solution creates another problem: pygame.display.flip may now block until the next display refresh, which may be many milliseconds.

Even worse: note the word “may”. Unfortunately, behavior of vsync is quite inconsistent between platforms and drivers, so for a properly cross-platform game it may be necessary to allow the user to select a frame rate and wait on an asyncio.sleep rather than running flip in a thread. Using the techniques from the answers to this Stack Overflow question you can establish a reasonable heuristic for the refresh rate of the relevant display, but if adding those libraries and writing that code is too complex, “60” is probably a good enough value to start with, even if the user’s monitor can go a little faster. This might save a little power even in the case where you can rely on flip to tell you when the monitor is actually ready again; if your game can only reliably render 60FPS anyway because there’s too much Python game logic going on to consistently go faster, it’s better to achieve a consistent but lower framerate than to be faster but inconsistent.

The potential for blocking needs to be dealt with though, and it has several knock-on effects.

For one thing, it makes my “where do you put the networking” problem even worse: most networking frameworks expect to be able to send more than one packet every 16 milliseconds.

More pressingly for most Pygame users, however, it creates a minor performance headache. You now spend a bunch of time blocked in the now-blocking flip call, wasting precious milliseconds that you could be using to do stuff unrelated to drawing, like handling user input, updating animations, running AI, and so on.

The problem is that your Pygame mainloop has 3 jobs:

  1. drawing
  2. game logic (AI and so on)
  3. input handling

What you want to do to ensure the smoothest possible frame rate is to draw everything as fast as you possibly can at the beginning of the frame and then call flip immediately to be sure that the graphics have been delivered to the screen and they don’t have to wait until the next screen-refresh. However, this is at odds with the need to get as much done as possible before you call flip and possibly block for 1/60th of a second.

So either you put off calling flip, potentially risking a dropped frame if your AI is a little slow, or you call flip too eagerly and waste a bunch of time waiting around for the display to refresh. This is especially true of things like animations, which you can’t update before drawing, because you have to draw this frame before you worry about the next one, but waiting until after flip wastes valuable time; by the time you are starting your next frame draw, you possibly have other code which now needs to run, and you’re racing to get it done before that next flip call.

Now, if your Python game logic is actually saturating your CPU — which is not hard to do — you’ll drop frames no matter what. But there are a lot of marginal cases where you’ve mostly got enough CPU to do what you need to without dropping frames, and it can be a lot of overhead to constantly check the clock to see if you have enough frame budget left to do one more work item before the frame deadline - or, for that matter, to maintain a workable heuristic for exactly when that frame deadline will be.

The technique to avoid these problems is deceptively simple, and in fact it was covered with the deferToThread trick presented in my earlier post. But again, we’re not here to talk about Twisted. So let’s do this the no-additional-dependencies, stdlib-only way, with asyncio:

import asyncio
import time
from math import inf

from pygame.display import set_mode, flip
from pygame.constants import SCALED
from pygame.event import get

event_handler = ...
drawables = [...]


async def pygame_loop(framerate_limit=inf):
    loop = asyncio.get_event_loop()
    screen_surface = set_mode(size=(480, 255), flags=SCALED, vsync=1)
    next_frame_target = 0.0
    limit_frame_duration = (1.0 / framerate_limit)

    while True:

        if limit_frame_duration:  # framerate limiter
            this_frame = time.time()
            delay = next_frame_target - this_frame
            if delay > 0:
                await asyncio.sleep(delay)
            next_frame_target = this_frame + limit_frame_duration

        for drawable in drawables:
            drawable.draw(screen_surface)

        events_to_handle = list(get())
        events_handled = loop.create_task(handle_events(events_to_handle))
        await loop.run_in_executor(None, flip)
        # don’t want to accidentally start drawing again until events are done
        await events_handled


async def handle_events(events_to_handle):
    # note that this must be an async def even if it doesn’t await
    for event in events_to_handle:
        event_handler.handle_event(event)


asyncio.run(pygame_loop(120))

Go Forth and Loop Better

At some point I will probably release my own wrapper library6 which does something similar to this, but I really wanted to present this as a technique rather than as some packaged-up code to use, since do-it-yourself mainloops, and keeping dependencies to a minimum, are such staples of Pygame community culture.

As you can see, this technique is only a few lines longer than the standard recipe for a Pygame main loop, but you now have access to a ton of additional functionality:

  • You can manage your framerate independence in both animations and game logic by just setting some timers and letting the frames update at the appropriate times; stop worrying about doing math on the clock by yourself!
  • Do you want to add networked multiplayer? No problem! Networking all happens inside the event loop, make whatever network requests you want, and never worry about blocking the game’s drawing on a network request!
  • Now your players’ laptops run cool while playing, and the graphics don’t have ugly tearing artifacts any more!
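To make the networking point concrete, here is a stripped-down, Pygame-free sketch of the same structure: the drawing and flip calls are replaced by sleeps, and network_poller is a hypothetical stand-in for real network traffic (none of these names come from the post):

```python
import asyncio

messages = []

async def network_poller(polls=3, interval=0.01):
    # Stand-in for real networking; it runs concurrently with the frame loop.
    for n in range(polls):
        await asyncio.sleep(interval)
        messages.append(f"poll {n}")

async def frame_loop(frames=10, frame_duration=0.01):
    # Stand-in for pygame_loop above: each await yields control back to the
    # event loop, so other tasks (networking, timers) run between frames.
    for _ in range(frames):
        await asyncio.sleep(frame_duration)

async def main():
    poller = asyncio.create_task(network_poller())
    await frame_loop()
    await poller

asyncio.run(main())
print(messages)  # all three polls completed while "frames" were being drawn
```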

I really hope that this sees broader adoption so that the description “indie game made in Python” will no longer imply “runs hot and tears a lot when the screen is panning”. I’m also definitely curious to hear from readers, so please let me know if you end up using this technique to good effect!7


  1. And, honestly, a few fewer could stand to have it, given how much unnecessary always-online stuff there is in single-player experiences these days. But I digress. That’s why I’m in a footnote, this is a good place for digressing. 

  2. “Worldwide sales of laptops have eclipsed desktops for more than a decade. In 2019, desktop sales totaled 88.4 million units compared to 166 million laptops. That gap is expected to grow to 79 million versus 171 million by 2023.” 

  3. At least, Nvidia says that “the world’s fastest esports displays” are both 360Hz and also support G-Sync, and who am I to disagree? 

  4. They’re playing on a laptop, remember? So they’re literally uncomfortable. 

  5. Assuming you’ve made everything frame-rate independent, as mentioned in the aforementioned post

  6. because of course I will 

  7. And also, like, if there are horrible bugs in this code, so I can update it. It is super brief and abstract to show how general it is, but that also means it’s not really possible to test it as-is; my full-working-code examples are much longer and it’s definitely possible something got lost in translation. 


ItsMyCode: AttributeError: Can only use .str accessor with string values


The AttributeError: Can only use .str accessor with string values, which use np.object_ dtype in pandas occurs when you use the .str accessor (for example, str.replace()) on a column you assume holds strings but which, in reality, has a different dtype.

In this tutorial, we will look at what is AttributeError: Can only use .str accessor with string values and how to fix this error with examples.

AttributeError: Can only use .str accessor with string values, which use np.object_ dtype in pandas

Let us take a simple example to reproduce this error. In the below example, we have Pandas DataFrame, which indicates the standing of each cricket team.

# import pandas library
import pandas as pd

# create pandas DataFrame
df = pd.DataFrame({'team': ['India', 'South Africa', 'New Zealand', 'England'],
                   'points': [12.0, 8.0, 3.0, 5],
                   'runrate': [0.5, 1.4, 2, -0.6],
                   'wins': [5, 4, 2, 2]})
print(df['points'])
df['points'] = df['points'].str.replace('.', '')
print(df['points'])

Output

0    12.0
1     8.0
2     3.0
3     5.0
Name: points, dtype: float64    
raise AttributeError("Can only use .str accessor with string values!")
AttributeError: Can only use .str accessor with string values!. Did you mean: 'std'?

When we run the above code, we get AttributeError Can only use .str accessor with string values!.

The points column has the float datatype, and str.replace() can be applied only to string columns.

How to fix Can only use .str accessor with string values error?

We can fix the error by casting the DataFrame column “points” from float to string before replacing the values in the column.

Let us fix our code and run it once again.

# import pandas library
import pandas as pd

# create pandas DataFrame
df = pd.DataFrame({'team': ['India', 'South Africa', 'New Zealand', 'England'],
                   'points': [12.0, 8.0, 3.0, 5],
                   'runrate': [0.5, 1.4, 2, -0.6],
                   'wins': [5, 4, 2, 2]})
print(df['points'])
df['points'] = df['points'].astype(str).str.replace('.', '')
print(df['points'])

Output

0    12.0
1     8.0
2     3.0
3     5.0
Name: points, dtype: float64

0    120
1     80
2     30
3     50
Name: points, dtype: object

Notice that the error is gone, and the points column is converted from float to object, and also, the decimal has been replaced with an empty string.
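One caveat worth noting: depending on your pandas version, Series.str.replace() may treat the pattern as a regular expression by default, and "." then matches every character, wiping out the whole string. Passing regex=False makes the literal replacement explicit and version-proof (the standalone Series here is my own minimal example, not the DataFrame above):

```python
import pandas as pd

# Same float-to-string cast as in the article, on a small standalone Series.
s = pd.Series([12.0, 8.0, 3.0, 5.0]).astype(str)

# regex=False treats "." as a literal dot rather than a regex wildcard.
print(s.str.replace(".", "", regex=False).tolist())  # ['120', '80', '30', '50']
```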

Conclusion

The AttributeError: Can only use .str accessor with string values, which use np.object_ dtype in pandas occurs when you use the .str accessor on a column that, in reality, has a non-string dtype.

We can fix the issue by casting the column to a string before replacing the values in the column.

TestDriven.io: Working with Static and Media Files in Django

This article looks at how to work with static and media files in a Django project, locally and in production.

Real Python: Defining Python Functions With Optional Arguments


Defining your own functions is an essential skill for writing clean and effective code. In this tutorial, you’ll explore the techniques you have available for defining Python functions that take optional arguments. When you master Python optional arguments, you’ll be able to define functions that are more powerful and more flexible.

In this course, you’ll learn how to:

  • Distinguish between parameters and arguments
  • Define functions with optional arguments and default parameter values
  • Define functions using *args and **kwargs
  • Deal with error messages about optional arguments
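As a quick taste of those topics, a function combining a default parameter value with *args and **kwargs might look like this (the names here are illustrative, not from the course):

```python
def describe(name, greeting="Hello", *args, **kwargs):
    # "greeting" is optional thanks to its default value; *args and **kwargs
    # absorb any extra positional and keyword arguments.
    extras = list(args) + sorted(kwargs.items())
    return f"{greeting}, {name}! extras={extras}"

print(describe("Ada"))                        # Hello, Ada! extras=[]
print(describe("Ada", "Hi", 1, 2, lang="en")) # Hi, Ada! extras=[1, 2, ('lang', 'en')]
```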

[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

Lucas Cimon: Useful short Python decorator to convert generators into lists


Python generators are awesome. Why ?

  • their syntax is simple and concise
  • they lazily generate values and hence are very memory efficient
  • bonus point: since Python 3 you can chain them with yield from

Their drawback? They can be iterated only once, and they hide the iterable length.

I took an …
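The excerpt cuts off above, but a decorator matching the title could look like this (my sketch, not necessarily the author's version):

```python
from functools import wraps

def as_list(gen_func):
    # Wrap a generator function so that calls return a fully-realized list.
    @wraps(gen_func)
    def wrapper(*args, **kwargs):
        return list(gen_func(*args, **kwargs))
    return wrapper

@as_list
def squares(n):
    for i in range(n):
        yield i * i

print(squares(4))       # [0, 1, 4, 9]
print(len(squares(4)))  # lists expose a length; generators do not
```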


Permalink

Python for Beginners: Count Digits Of An Integer in Python


In Python, the integer data type is used to represent positive and negative integers. In this article, we will discuss a program to count the digits of an integer in Python.

How to Count Digits of an Integer in Python?

To count the digits of a number, we will use an approach that divides the number by 10. When we divide an integer by 10, the resultant number gets reduced by one digit. 

For instance, if we divide 1234 by 10, the result will be 123. Here, 1234 has 4 digits whereas 123 has only three digits. Similarly, when we divide 123 by 10, it will get reduced to a number with only 2 digits and so on. Finally the number will become 0. 

You can observe that we can divide 1234 by 10 only 4 times before it becomes 0. In other words, if there are n digits in an integer, we can divide the integer by 10 only n times till it becomes 0.

Program to Count Digits of an Integer in Python

As discussed above, we will use the following approach to count digits of a number in python.

  • First, we will declare a variable count and initialize it to 0.
  • Then, we will use a while loop to divide the given number by 10 repeatedly.
  • Inside the while loop, we will increment count by one each time we divide the number by 10. 
  • Once the number becomes 0, we will exit from the while loop.
  • After executing the while loop, we will have the count of the digits of the integer in the count variable.

We can implement the above approach to count the number of digits of a number in python as follows.

number = 12345
print("The given number is:", number)
count = 0
while number > 0:
    number = number // 10
    count = count + 1
print("The number of digits is:", count)

Output:

The given number is: 12345
The number of digits is: 5
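The loop above can be cross-checked against a string-based one-liner; a small sketch (the helper name count_digits is mine):

```python
def count_digits(number):
    # Repeated floor-division by 10, exactly as in the loop above.
    # Note: like the article's loop, this returns 0 for the input 0.
    count = 0
    while number > 0:
        number //= 10
        count += 1
    return count

number = 12345
# For positive integers, len(str(number)) gives the same answer.
assert count_digits(number) == len(str(number)) == 5
print(count_digits(number))  # 5
```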

Conclusion

In this article, we have discussed an approach to count the digits of an integer in Python. To know more about numbers in Python, you can read this article on decimal numbers in Python. You might also like this article on complex numbers in Python.

The post Count Digits Of An Integer in Python appeared first on PythonForBeginners.com.
