Channel: Planet Python

Python for Beginners: Use the Pandas fillna Method to Fill NaN Values


Handling NaN values while analyzing data is an important task. The pandas module in Python provides the fillna() method to fill NaN values. In this article, we will discuss how to use the pandas fillna method to fill NaN values in Python.

The fillna() Method

You can fill NaN values in a pandas dataframe using the fillna() method. It has the following syntax.

DataFrame.fillna(value=None, *, method=None, axis=None, inplace=False, limit=None, downcast=None)

Here,

  • The value parameter takes the value that replaces the NaN values. You can also pass a Python dictionary or a pandas series to the value parameter. The dictionary should contain the column names of the dataframe as its keys and the replacement values for those columns as the associated values. Similarly, the pandas series should contain the column names of the dataframe as its index and the replacement values as the associated value for each index.
  • The method parameter is used to fill NaN values in the dataframe when no input is given to the value parameter. You cannot specify both; if you pass a value, leave the method parameter as None. Otherwise, you can assign the literal “ffill”, “bfill”, “backfill”, or “pad” to specify which existing values are used in place of the NaN values.
  • The axis parameter specifies the axis along which to fill missing values. To fill along each row (across the columns), set the axis parameter to 1 or “columns”. To fill along each column (down the rows), set the axis parameter to 0 or “index”.
  • By default, the pandas fillna method doesn’t modify the original dataframe; it returns a new dataframe after execution. To modify the original dataframe on which the fillna() method is invoked, set the inplace parameter to True.
  • If the method parameter is specified, the limit parameter specifies the maximum number of consecutive NaN values to forward/backward fill. In other words, if there is a gap with more than limit number of consecutive NaNs, it will only be partially filled. If the method parameter is not specified, the limit parameter takes the maximum number of entries along the entire axis where NaNs will be filled. It must be greater than 0 if not None.
  • The downcast parameter takes a dictionary that maps items to the data types they should be downcast to, if there is a need to change the data types of the values.
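The series form of the value parameter can be sketched as follows. This is a minimal example that assumes a small hand-made dataframe (the column "Bonus" and all values are hypothetical, not taken from grade2.csv). Columns that are missing from the series index are left untouched.

```python
import numpy as np
import pandas as pd

# Small stand-in dataframe (hypothetical data, not the grade2.csv file).
df = pd.DataFrame({
    "Roll": [27.0, np.nan, 34.0],
    "Marks": [55.0, 78.0, np.nan],
    "Bonus": [np.nan, 1.0, 2.0],
})

# The index of the series holds column names; the values are the replacements.
replacements = pd.Series({"Roll": 100.0, "Marks": 0.0})

# "Bonus" is not in the series, so its NaN value is not replaced.
filled = df.fillna(value=replacements)
print(filled)
```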

Use Pandas Fillna to Fill NaN Values in the Entire Dataframe

To fill NaN values in a pandas dataframe using the fillna method, pass the replacement value to the fillna() method as shown in the following example.

import pandas as pd
import numpy as np
x=pd.read_csv("grade2.csv")
print("The original dataframe is:")
print(x)
x=x.fillna(0)
print("The modified dataframe is:")
print(x)

Output:

The original dataframe is:
    Class  Roll        Name  Marks Grade
0     2.0  27.0       Harsh   55.0     C
1     2.0  23.0       Clara   78.0     B
2     3.0  33.0         NaN    NaN   NaN
3     3.0  34.0         Amy   88.0     A
4     3.0  15.0         NaN   78.0     B
5     3.0  27.0      Aditya   55.0     C
6     NaN   NaN         NaN    NaN   NaN
7     3.0  23.0  Radheshyam   78.0     B
8     3.0  11.0       Bobby   50.0   NaN
9     NaN   NaN         NaN    NaN   NaN
10    3.0  15.0      Lokesh   88.0     A
The modified dataframe is:
    Class  Roll        Name  Marks Grade
0     2.0  27.0       Harsh   55.0     C
1     2.0  23.0       Clara   78.0     B
2     3.0  33.0           0    0.0     0
3     3.0  34.0         Amy   88.0     A
4     3.0  15.0           0   78.0     B
5     3.0  27.0      Aditya   55.0     C
6     0.0   0.0           0    0.0     0
7     3.0  23.0  Radheshyam   78.0     B
8     3.0  11.0       Bobby   50.0     0
9     0.0   0.0           0    0.0     0
10    3.0  15.0      Lokesh   88.0     A

In the above example, we have passed the value 0 to the fillna() method. Hence, all the NaN values in the input data frame are replaced by 0.

This approach isn’t very practical as different columns have different data types. So, we can choose to fill different values in different columns to replace the null values.

Fill Different Values in Each Column in Pandas

Instead of filling all the NaN values with the same value, you can also replace the NaN value in each column with a specific value. For this, we need to pass a dictionary containing column names as its keys and the values to be filled in the columns as the associated values to the fillna() method. You can observe this in the following example.

import pandas as pd
import numpy as np
x=pd.read_csv("grade2.csv")
print("The original dataframe is:")
print(x)
x=x.fillna({"Class":1,"Roll":100,"Name":"PFB","Marks":0,"Grade":"F"})
print("The modified dataframe is:")
print(x)

Output:

The original dataframe is:
    Class  Roll        Name  Marks Grade
0     2.0  27.0       Harsh   55.0     C
1     2.0  23.0       Clara   78.0     B
2     3.0  33.0         NaN    NaN   NaN
3     3.0  34.0         Amy   88.0     A
4     3.0  15.0         NaN   78.0     B
5     3.0  27.0      Aditya   55.0     C
6     NaN   NaN         NaN    NaN   NaN
7     3.0  23.0  Radheshyam   78.0     B
8     3.0  11.0       Bobby   50.0   NaN
9     NaN   NaN         NaN    NaN   NaN
10    3.0  15.0      Lokesh   88.0     A
The modified dataframe is:
    Class   Roll        Name  Marks Grade
0     2.0   27.0       Harsh   55.0     C
1     2.0   23.0       Clara   78.0     B
2     3.0   33.0         PFB    0.0     F
3     3.0   34.0         Amy   88.0     A
4     3.0   15.0         PFB   78.0     B
5     3.0   27.0      Aditya   55.0     C
6     1.0  100.0         PFB    0.0     F
7     3.0   23.0  Radheshyam   78.0     B
8     3.0   11.0       Bobby   50.0     F
9     1.0  100.0         PFB    0.0     F
10    3.0   15.0      Lokesh   88.0     A

In the above example, we have passed the dictionary {"Class": 1, "Roll": 100, "Name": "PFB", "Marks": 0, "Grade": "F"} to the fillna() method as input. Due to this, the NaN values in the "Class" column are replaced by 1, the NaN values in the "Roll" column are replaced by 100, the NaN values in the "Name" column are replaced by "PFB", and so on. Thus, when we pass the column names of the dataframe as keys and Python literals as the associated values, the NaN values are replaced in each column of the dataframe according to the input dictionary.

Instead of giving all the column names as keys in the input dictionary, you can also choose to ignore some. In this case, the NaN values in the columns that are not present in the input dictionary are not considered for replacement. You can observe this in the following example.

import pandas as pd
import numpy as np
x=pd.read_csv("grade2.csv")
print("The original dataframe is:")
print(x)
x=x.fillna({"Class":1,"Roll":100,"Name":"PFB","Marks":0})
print("The modified dataframe is:")
print(x)

Output:

The original dataframe is:
    Class  Roll        Name  Marks Grade
0     2.0  27.0       Harsh   55.0     C
1     2.0  23.0       Clara   78.0     B
2     3.0  33.0         NaN    NaN   NaN
3     3.0  34.0         Amy   88.0     A
4     3.0  15.0         NaN   78.0     B
5     3.0  27.0      Aditya   55.0     C
6     NaN   NaN         NaN    NaN   NaN
7     3.0  23.0  Radheshyam   78.0     B
8     3.0  11.0       Bobby   50.0   NaN
9     NaN   NaN         NaN    NaN   NaN
10    3.0  15.0      Lokesh   88.0     A
The modified dataframe is:
    Class   Roll        Name  Marks Grade
0     2.0   27.0       Harsh   55.0     C
1     2.0   23.0       Clara   78.0     B
2     3.0   33.0         PFB    0.0   NaN
3     3.0   34.0         Amy   88.0     A
4     3.0   15.0         PFB   78.0     B
5     3.0   27.0      Aditya   55.0     C
6     1.0  100.0         PFB    0.0   NaN
7     3.0   23.0  Radheshyam   78.0     B
8     3.0   11.0       Bobby   50.0   NaN
9     1.0  100.0         PFB    0.0   NaN
10    3.0   15.0      Lokesh   88.0     A

In this example, we haven’t passed the "Grade" column in the input dictionary to the fillna() method. Hence, the NaN values in the "Grade" column are not replaced by any other value.

Only Fill the First N Null Values in Each Column

Instead of filling all NaN values in each column, you can also limit the number of NaN values to be filled in each column. For this, you can pass the maximum number of values to be filled as an input argument to the limit parameter in the fillna() method as shown below.

import pandas as pd
import numpy as np
x=pd.read_csv("grade2.csv")
print("The original dataframe is:")
print(x)
x=x.fillna(0, limit=3)
print("The modified dataframe is:")
print(x)

Output:

The original dataframe is:
    Class  Roll        Name  Marks Grade
0     2.0  27.0       Harsh   55.0     C
1     2.0  23.0       Clara   78.0     B
2     3.0  33.0         NaN    NaN   NaN
3     3.0  34.0         Amy   88.0     A
4     3.0  15.0         NaN   78.0     B
5     3.0  27.0      Aditya   55.0     C
6     NaN   NaN         NaN    NaN   NaN
7     3.0  23.0  Radheshyam   78.0     B
8     3.0  11.0       Bobby   50.0   NaN
9     NaN   NaN         NaN    NaN   NaN
10    3.0  15.0      Lokesh   88.0     A
The modified dataframe is:
    Class  Roll        Name  Marks Grade
0     2.0  27.0       Harsh   55.0     C
1     2.0  23.0       Clara   78.0     B
2     3.0  33.0           0    0.0     0
3     3.0  34.0         Amy   88.0     A
4     3.0  15.0           0   78.0     B
5     3.0  27.0      Aditya   55.0     C
6     0.0   0.0           0    0.0     0
7     3.0  23.0  Radheshyam   78.0     B
8     3.0  11.0       Bobby   50.0     0
9     0.0   0.0         NaN    0.0   NaN
10    3.0  15.0      Lokesh   88.0     A

In the above example, we have set the limit parameter to 3. Due to this, only the first three NaN values from each column are replaced by 0.

Only Fill the First N Null Values in Each Row

To fill only the first N null values in each row of the dataframe, you can pass the maximum number of values to be filled as an input argument to the limit parameter in the fillna() method. Additionally, you need to specify that you want to fill the rows by setting the axis parameter to 1. You can observe this in the following example.

import pandas as pd
import numpy as np
x=pd.read_csv("grade2.csv")
print("The original dataframe is:")
print(x)
x=x.fillna(0, limit=2,axis=1)
print("The modified dataframe is:")
print(x)

Output:

The original dataframe is:
    Class  Roll        Name  Marks Grade
0     2.0  27.0       Harsh   55.0     C
1     2.0  23.0       Clara   78.0     B
2     3.0  33.0         NaN    NaN   NaN
3     3.0  34.0         Amy   88.0     A
4     3.0  15.0         NaN   78.0     B
5     3.0  27.0      Aditya   55.0     C
6     NaN   NaN         NaN    NaN   NaN
7     3.0  23.0  Radheshyam   78.0     B
8     3.0  11.0       Bobby   50.0   NaN
9     NaN   NaN         NaN    NaN   NaN
10    3.0  15.0      Lokesh   88.0     A
The modified dataframe is:
   Class  Roll        Name Marks Grade
0    2.0  27.0       Harsh  55.0     C
1    2.0  23.0       Clara  78.0     B
2    3.0  33.0         0.0   0.0   NaN
3    3.0  34.0         Amy  88.0     A
4    3.0  15.0           0  78.0     B
5    3.0  27.0      Aditya  55.0     C
6    0.0   0.0         NaN   NaN   NaN
7    3.0  23.0  Radheshyam  78.0     B
8    3.0  11.0       Bobby  50.0     0
9    0.0   0.0         NaN   NaN   NaN
10   3.0  15.0      Lokesh  88.0     A

In the above example, we have set the limit parameter to 2 and the axis parameter to 1. Hence, only two NaN values from each row are replaced by 0 when the fillna() method is executed.

Pandas Fillna With the Last Valid Observation

Instead of specifying a new value, you can also fill NaN values using the existing values. For instance, you can fill the NaN values using the last valid observation by setting the method parameter to “ffill” as shown below.

import pandas as pd
import numpy as np
x=pd.read_csv("grade2.csv")
print("The original dataframe is:")
print(x)
x=x.fillna(method="ffill")
print("The modified dataframe is:")
print(x)

Output:

The original dataframe is:
    Class  Roll        Name  Marks Grade
0     2.0  27.0       Harsh   55.0     C
1     2.0  23.0       Clara   78.0     B
2     3.0  33.0         NaN    NaN   NaN
3     3.0  34.0         Amy   88.0     A
4     3.0  15.0         NaN   78.0     B
5     3.0  27.0      Aditya   55.0     C
6     NaN   NaN         NaN    NaN   NaN
7     3.0  23.0  Radheshyam   78.0     B
8     3.0  11.0       Bobby   50.0   NaN
9     NaN   NaN         NaN    NaN   NaN
10    3.0  15.0      Lokesh   88.0     A
The modified dataframe is:
    Class  Roll        Name  Marks Grade
0     2.0  27.0       Harsh   55.0     C
1     2.0  23.0       Clara   78.0     B
2     3.0  33.0       Clara   78.0     B
3     3.0  34.0         Amy   88.0     A
4     3.0  15.0         Amy   78.0     B
5     3.0  27.0      Aditya   55.0     C
6     3.0  27.0      Aditya   55.0     C
7     3.0  23.0  Radheshyam   78.0     B
8     3.0  11.0       Bobby   50.0     B
9     3.0  11.0       Bobby   50.0     B
10    3.0  15.0      Lokesh   88.0     A

In this example, we have set the method parameter to "ffill". Hence, whenever a NaN value is encountered, the fillna() method fills the particular cell with the non-null value in the preceding cell in the same column.

Pandas Fillna With the Next Valid Observation

You can fill the NaN values using the next valid observation by setting the method parameter to “bfill” as shown below.

import pandas as pd
import numpy as np
x=pd.read_csv("grade2.csv")
print("The original dataframe is:")
print(x)
x=x.fillna(method="bfill")
print("The modified dataframe is:")
print(x)

Output:

The original dataframe is:
    Class  Roll        Name  Marks Grade
0     2.0  27.0       Harsh   55.0     C
1     2.0  23.0       Clara   78.0     B
2     3.0  33.0         NaN    NaN   NaN
3     3.0  34.0         Amy   88.0     A
4     3.0  15.0         NaN   78.0     B
5     3.0  27.0      Aditya   55.0     C
6     NaN   NaN         NaN    NaN   NaN
7     3.0  23.0  Radheshyam   78.0     B
8     3.0  11.0       Bobby   50.0   NaN
9     NaN   NaN         NaN    NaN   NaN
10    3.0  15.0      Lokesh   88.0     A
The modified dataframe is:
    Class  Roll        Name  Marks Grade
0     2.0  27.0       Harsh   55.0     C
1     2.0  23.0       Clara   78.0     B
2     3.0  33.0         Amy   88.0     A
3     3.0  34.0         Amy   88.0     A
4     3.0  15.0      Aditya   78.0     B
5     3.0  27.0      Aditya   55.0     C
6     3.0  23.0  Radheshyam   78.0     B
7     3.0  23.0  Radheshyam   78.0     B
8     3.0  11.0       Bobby   50.0     A
9     3.0  15.0      Lokesh   88.0     A
10    3.0  15.0      Lokesh   88.0     A

In this example, we have set the method parameter to "bfill". Hence, whenever a NaN value is encountered, the fillna() method fills the particular cell with the non-null value in the next cell in the same column.
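Newer pandas releases (2.1 and later) deprecate the method parameter of fillna() in favor of the dedicated ffill() and bfill() methods. The following is a minimal sketch of the same forward and backward fills using those methods, on a small hypothetical column rather than grade2.csv. It also shows how the limit parameter leaves a long gap only partially filled.

```python
import numpy as np
import pandas as pd

# Hypothetical column with a gap of three consecutive NaN values.
marks = pd.DataFrame({"Marks": [55.0, np.nan, np.nan, np.nan, 88.0]})

forward = marks.ffill()          # carry the last valid value down
backward = marks.bfill()         # pull the next valid value up
partial = marks.ffill(limit=2)   # fill at most 2 consecutive NaNs in a gap

print(forward["Marks"].tolist())
print(backward["Marks"].tolist())
print(partial["Marks"].tolist())
```

With limit=2, the third NaN in the gap stays unfilled, which matches the description of the limit parameter when a fill method is used.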

Pandas Fillna Inplace

By default, the fillna() method returns a new dataframe after execution. To modify the existing dataframe instead of creating a new one, you can set the inplace parameter to True in the fillna() method as shown below.

import pandas as pd
import numpy as np
x=pd.read_csv("grade2.csv")
print("The original dataframe is:")
print(x)
x.fillna(method="bfill",inplace=True)
print("The modified dataframe is:")
print(x)

Output:

The original dataframe is:
    Class  Roll        Name  Marks Grade
0     2.0  27.0       Harsh   55.0     C
1     2.0  23.0       Clara   78.0     B
2     3.0  33.0         NaN    NaN   NaN
3     3.0  34.0         Amy   88.0     A
4     3.0  15.0         NaN   78.0     B
5     3.0  27.0      Aditya   55.0     C
6     NaN   NaN         NaN    NaN   NaN
7     3.0  23.0  Radheshyam   78.0     B
8     3.0  11.0       Bobby   50.0   NaN
9     NaN   NaN         NaN    NaN   NaN
10    3.0  15.0      Lokesh   88.0     A
The modified dataframe is:
    Class  Roll        Name  Marks Grade
0     2.0  27.0       Harsh   55.0     C
1     2.0  23.0       Clara   78.0     B
2     3.0  33.0         Amy   88.0     A
3     3.0  34.0         Amy   88.0     A
4     3.0  15.0      Aditya   78.0     B
5     3.0  27.0      Aditya   55.0     C
6     3.0  23.0  Radheshyam   78.0     B
7     3.0  23.0  Radheshyam   78.0     B
8     3.0  11.0       Bobby   50.0     A
9     3.0  15.0      Lokesh   88.0     A
10    3.0  15.0      Lokesh   88.0     A

In this example, we have set the inplace parameter to True in the fillna() method. Hence, the input dataframe is modified.
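The difference between the default behavior and inplace=True can be checked directly. This is a small sketch on a hypothetical one-column dataframe: the first call leaves the original untouched, and only the second call changes it.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Marks": [55.0, np.nan, 88.0]})

# Default behavior: a new dataframe is returned and df keeps its NaN value.
filled = df.fillna(0)
nan_in_original = int(df["Marks"].isna().sum())
nan_in_copy = int(filled["Marks"].isna().sum())

# With inplace=True, df itself is modified.
df.fillna(0, inplace=True)
nan_after_inplace = int(df["Marks"].isna().sum())

print(nan_in_original, nan_in_copy, nan_after_inplace)
```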

Conclusion

In this article, we have discussed how to use the pandas fillna method to fill NaN values in Python.

To learn more about Python programming, you can read this article on how to sort a pandas dataframe. You might also like this article on how to drop columns from a pandas dataframe.

I hope you enjoyed reading this article. Stay tuned for more informative articles.

Happy Learning!

The post Use the Pandas fillna Method to Fill NaN Values appeared first on PythonForBeginners.com.


CodersLegacy: Setup Virtual Environment for Pyinstaller with Venv


In this Python tutorial, we will discuss how to optimize your Pyinstaller EXEs using virtual environments. We will be using the “venv” module to create the virtual environment for Pyinstaller, which is already included in every Python installation by default (because it’s part of the standard library).

We will walk you through the entire process, starting from “what” virtual environments are, “why” we need them and “how” to create one.


Understanding Pyinstaller

Let me start by telling you how Pyinstaller works. We all know that Pyinstaller creates a standalone EXE which bundles all the dependencies, allowing it to run on any system.

What a lot of people do not know, however, is “HOW” Pyinstaller does this.

Let me explain.

What Pyinstaller does is “freeze” your Python environment into what we call a “frozen” application. In non-technical terms, this means bundling everything in your Python environment, like your Python installation, libraries that you have installed, and other dependencies (e.g. DLLs or data files) you may be using, into a single application.

Kind of like taking a “snapshot” of your program in its running state (with all the dependencies active) and saving it.
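A script can tell whether it is running from such a frozen bundle. The following is a minimal sketch based on the sys.frozen and sys._MEIPASS attributes that Pyinstaller's bootloader sets in bundled applications; the helper name running_as_frozen_app is made up for this illustration.

```python
import sys


def running_as_frozen_app() -> bool:
    """Return True when the script runs from a Pyinstaller bundle."""
    # Pyinstaller's bootloader sets sys.frozen and sys._MEIPASS in bundled apps;
    # a normal interpreter has neither attribute.
    return bool(getattr(sys, "frozen", False)) and hasattr(sys, "_MEIPASS")


if running_as_frozen_app():
    print("Running from a frozen bundle, files unpacked at:", sys._MEIPASS)
else:
    print("Running in a normal Python interpreter:", sys.executable)
```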

With this understanding, we can now safely explain the benefits of Virtual Environments.


What are Virtual Environments in Python?

Imagine for a moment that you have 100 libraries installed for Python on your device. You might think you do not have many, but the truth is that when you download a big library (e.g. Matplotlib) it downloads several other libraries along with it (as dependencies).

To check the currently installed libraries on our systems (excluding the ones included by default), run the following command.

pip list

It will give you something like the following output.

altgraph                  0.17.3
astroid                   2.12.9
async-generator           1.10
attrs                     22.2.0
auto-py-to-exe            2.24.1
Automat                   22.10.0
autopep8                  1.7.0
Babel                     2.10.3
beautifulsoup4            4.11.1
bottle                    0.12.23
bottle-websocket          0.2.9
bs4                       0.0.1
cad-to-shapely            0.3.1
cairocffi                 1.3.0
CairoSVG                  2.5.2
certifi                   2022.12.7
cffi                      1.15.1
chardet                   4.0.0
colorama                  0.4.5
constantly                15.1.0
cryptography              38.0.4
cssselect                 1.2.0
cssselect2                0.6.0
cx-Freeze                 6.13.1
cycler                    0.11.0
Cython                    0.29.32
defusedxml                0.7.1
dill                      0.3.5.1
Eel                       0.14.0
et-xmlfile                1.1.0
exceptiongroup            1.1.0
ezdxf                     0.18
filelock                  3.8.0
fonttools                 4.34.4
future                    0.18.2
geomdl                    5.3.1
gevent                    22.10.2
gevent-websocket          0.10.1
greenlet                  2.0.1
h11                       0.14.0
hyperlink                 21.0.0
idna                      2.10
incremental               22.10.0
isort                     5.10.1
itemadapter               0.7.0
itemloaders               1.0.6
Jinja2                    3.0.1
jmespath                  1.0.1
kiwisolver                1.4.4
lazy-object-proxy         1.7.1
lief                      0.12.3
lxml                      4.9.1
MarkupSafe                2.1.1
matplotlib                3.5.3
mccabe                    0.7.0
more-itertools            8.14.0
MouseInfo                 0.1.3
mpmath                    1.2.1
Nuitka                    1.2.4
numexpr                   2.8.3
numpy                     1.23.1
openpyxl                  3.0.10
ordered-set               4.1.0
outcome                   1.2.0
packaging                 21.3
pandas                    1.4.3
pandastable               0.13.0
parsel                    1.7.0
pefile                    2022.5.30
Pillow                    8.4.0
pip                       22.2.1
platformdirs              2.5.2
Protego                   0.2.1
pyasn1                    0.4.8
pyasn1-modules            0.2.8
PyAutoGUI                 0.9.53
pycodestyle               2.9.1
pycparser                 2.21
PyDispatcher              2.0.6
pygal                     3.0.0
pygame                    2.1.2
pygame-menu               4.2.8
PyGetWindow               0.0.9
pyinstaller               5.6.2
pyinstaller-hooks-contrib 2022.13
pylint                    2.15.2
PyMsgBox                  1.0.9
PyMuPDF                   1.20.2
pyOpenSSL                 22.1.0
pyparsing                 3.0.9
pyperclip                 1.8.2
PyQt5                     5.15.7
PyQt5-Qt5                 5.15.2
PyQt5-sip                 12.11.0
PyQt6                     6.4.0
PyQt6-Qt6                 6.4.1
PyQt6-sip                 13.4.0
PyRect                    0.2.0
PyScreeze                 0.1.28
PySocks                   1.7.1
pytest-check              1.0.6
python-dateutil           2.8.2
pytweening                1.0.4
pytz                      2022.1
pywin32-ctypes            0.2.0
queuelib                  1.6.2
requests                  2.25.1
requests-file             1.5.1
rhino-shapley-interop     0.0.4
rhino3dm                  7.15.0
scipy                     1.9.0
Scrapy                    2.7.1
sectionproperties         2.0.3
selenium                  4.7.2
service-identity          21.1.0
setuptools                63.2.0
Shapely                   1.8.2
six                       1.16.0
sniffio                   1.3.0
sortedcontainers          2.4.0
soupsieve                 2.3.2.post1
sympy                     1.10.1
tabulate                  0.8.10
tinycss2                  1.1.1
tkcalendar                1.6.1
tkdesigner                1.0.6
tkinter-tooltip           2.1.0
tldextract                3.4.0
toml                      0.10.2
tomli                     2.0.1
tomlkit                   0.11.4
triangle                  20220202
trio                      0.22.0
trio-websocket            0.9.2
Twisted                   22.10.0
twisted-iocpsupport       1.0.2
typing_extensions         4.3.0
urllib3                   1.26.13
w3lib                     2.1.1
webencodings              0.5.1
whichcraft                0.6.1
wrapt                     1.14.1
wsproto                   1.2.0
xlrd                      2.0.1
zope.event                4.5.0
zope.interface            5.5.2

That was quite a long list, right? I don’t even recognize half of those libraries (they were installed as dependencies). And this is on a relatively new Python installation (3-4 months old). Running this on my old device might have given me double the above amount.
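If you would rather count the packages than scroll through them, the same information is available from inside Python. This is a rough sketch using the standard library's importlib.metadata module; it lists whatever is visible to the interpreter that runs it, so the numbers will differ from the list above.

```python
from importlib.metadata import distributions

# Collect (name, version) pairs for every distribution visible to this interpreter.
installed = sorted(
    {
        (dist.metadata["Name"], dist.version)
        for dist in distributions()
        if dist.metadata["Name"]  # skip entries with broken metadata
    },
    key=lambda pair: pair[0].lower(),
)

print(f"{len(installed)} packages installed")
for name, version in installed[:10]:  # show only the first ten
    print(f"{name:<25} {version}")
```

Running this once with your main Python installation and once inside a virtual environment gives a quick comparison of how many packages each one carries.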

Now you might have already put two-and-two together and realized the problem here.

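When we normally use Pyinstaller to bundle our applications, it can end up including many of the libraries that we have installed, regardless of whether our program actually needs them. Pyinstaller follows imports, so an optional import inside one installed library is enough to pull other installed libraries into the bundle.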

Now obviously this is a big problem, especially if you have several large libraries lying around which are not actually being used.


The Solution?

Virtual Environments!

Now, what we could do is set up a new Python installation on your PC and only install the required packages (which you know are being used). But this is an extra hassle, and can cause issues with your current Python installation if you are not careful.

Instead, we use Virtual environments which basically create a “fresh copy” of your current Python version, without any of the installed libraries. You can create as many virtual environments as you want!

We can then compile our Pyinstaller EXEs inside these virtual environments (just like how we normally do). This time the EXE will only include the bare minimum number of libraries.

It is actually recommended to have a virtual environment for each major project you have. This is to ensure that there is no library conflict, and to ensure version control (the version of the libraries you are using).


Version Control in Virtual Environments with Venv

For example, a common issue that can happen is when you install a new library “A” (unrelated to your application) and it requires a dependency “B”, which is also required by library “C”.

Library “A” requires the dependency “B” to be at least version 1.3 (a random version number I picked), whereas library “C” only works with the dependency “B” up to version 1.2 (1.3 onwards is not supported).

Hence, we now have a conflict issue. There are many other scenarios like this under which problems can occur. This is just one of them.

Virtual environments help isolate projects and dependencies into separate environments, minimizing the risk of conflict.
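The conflict described above can be sketched in a few lines. Everything here is hypothetical (the version list and both helper functions are made up to mirror the “A”, “B” and “C” example), and versions are modeled as simple (major, minor) tuples.

```python
# Hypothetical release history of dependency "B" as (major, minor) tuples.
available_b_versions = [(1, 0), (1, 1), (1, 2), (1, 3), (1, 4)]


def library_a_accepts(version):
    """Library "A" needs "B" at version 1.3 or newer."""
    return version >= (1, 3)


def library_c_accepts(version):
    """Library "C" only works with "B" up to version 1.2."""
    return version <= (1, 2)


# One shared environment: a single version of "B" must satisfy both libraries.
usable_for_both = [v for v in available_b_versions
                   if library_a_accepts(v) and library_c_accepts(v)]

# Separate environments: each library only has to satisfy its own constraint.
usable_for_a = [v for v in available_b_versions if library_a_accepts(v)]
usable_for_c = [v for v in available_b_versions if library_c_accepts(v)]

print(usable_for_both)  # → []
print(usable_for_a)     # → [(1, 3), (1, 4)]
print(usable_for_c)     # → [(1, 0), (1, 1), (1, 2)]
```

No single version works in a shared environment, while each isolated environment has valid choices.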


Creating a Pyinstaller Virtual Environment with Venv

Now for the actual implementation part of the tutorial. The first thing we will do is set up our virtual environment.

For users with Python added to PATH, run the following command.

python -m venv tutorial

In the above command, “tutorial” is the name of the virtual environment. What you name it is completely your choice. Also pay attention to which folder you are running this command in, because the virtual environment will be created there.

For users who do not have Python added to PATH, you need to find the path to your Python installation. You can typically find it in a location like this:

C:\Users\CodersLegacy\AppData\Local\Programs\Python\Python310

To create a virtual environment, you need to run the following command (swapping out “python” for the location of your python.exe file).

C:\Users\CodersLegacy\AppData\Local\Programs\Python\Python310\python.exe -m venv tutorial
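The same step can also be done from Python itself through the standard library's venv module. This is a minimal sketch, not a replacement for the command above: it builds the environment in a temporary folder so nothing is left behind, and it passes with_pip=False to stay fast, whereas the command-line form installs pip by default.

```python
import tempfile
import venv
from pathlib import Path

# A temporary folder is used here so the sketch leaves nothing behind;
# in practice you would pass your own project path instead.
with tempfile.TemporaryDirectory() as parent:
    env_dir = Path(parent) / "tutorial"
    # with_pip=False skips installing pip into the new environment.
    venv.create(env_dir, with_pip=False)
    created = sorted(p.name for p in env_dir.iterdir())
    has_config = (env_dir / "pyvenv.cfg").exists()

print(created)
print("pyvenv.cfg present:", has_config)
```

The pyvenv.cfg file is what marks a folder as a virtual environment; the remaining entries differ between Windows and Linux.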

Activating the Virtual Environment

We aren’t done yet though. The virtual environment needs to be activated first! If you completed the previous step, you should have a folder structure something like this.

-- virtual_envs_folder
    -- tutorial

virtual_envs_folder is simply a parent folder where we ran the previous commands for creating the virtual environment.

We will now add a new Python file to the virtual_envs_folder (not the tutorial folder). So now our file structure is something like this.

-- virtual_envs_folder
    -- tutorial
    -- file.py

file.py is where all the code will go, which we want to convert to a Pyinstaller EXE. Any supporting files, folders or libraries you have created can also be added here.

Now we need to run another command which will activate the virtual environment. The command can vary slightly depending on what terminal/console/OS you are using.

Command Prompt: (Windows)

C:\Users\CodersLegacy\virtual_envs_folder> tutorial\Scripts\activate.bat

Windows PowerShell:

C:\Users\CodersLegacy\virtual_envs_folder> tutorial\Scripts\Activate.ps1

Linux:

~/virtual_envs_folder$ source tutorial/bin/activate

Congratulations, your virtual environment is now activated and ready to run! Our command line will now be pointing to the Python installation inside our virtual environment instead of the main Python installation. (This effect will end once you close the command line/terminal.)

Your virtual environment folder (tutorial) should look something like this:

-- tutorial
   -- Include
   -- Lib
   -- Scripts
-- file.py

The two important folders here are “Lib” and “Scripts”. “Lib” is where all of our installed libraries will go. “Scripts” is where our python.exe file is.

Your command prompt/terminal should also look something like this:

(tutorial) C:\Users\CodersLegacy\virtual_envs_folder>

Notice the “(tutorial)” which is now included right in the start. If this has appeared, your Venv Virtual Environment is ready to use with Python and Pyinstaller.
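If you want to double-check from inside Python, the interpreter itself can tell you whether it belongs to a virtual environment. A small sketch (the helper name in_virtual_environment is made up for this example):

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment folder while
    # sys.base_prefix still points at the original Python installation.
    return sys.prefix != sys.base_prefix


print("Interpreter:", sys.executable)
print("Virtual environment active:", in_virtual_environment())
```

Run it with the python command in the activated prompt; the interpreter path should point inside the “tutorial” folder.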



Setting up our Application in the Virtual Environment

Now we will begin installing the required libraries we need. Here is some sample code we will be using in our file.py file.

import tkinter as tk
from tkinter.filedialog import askopenfilename, asksaveasfile
import numpy as np
from matplotlib.figure import Figure
from matplotlib.backends.backend_tkagg import FigureCanvasTkAgg
from matplotlib.path import Path
from matplotlib.patches import PathPatch
from matplotlib.collections import PatchCollection
from pandastable import Table
import pandas as pd

   
class Window():
    def __init__(self, master):
        self.main = tk.Frame(master, background="white")
        
        self.rightframe = tk.Frame(self.main, background="white")
        self.rightframe.pack(side=tk.LEFT)
        self.leftframe = tk.Frame(self.main, background="white")
        self.leftframe.pack(side=tk.LEFT)

        self.rightframeheader = tk.Frame(self.rightframe, background="white")
        self.button1 = tk.Button(self.rightframeheader, text='Import CSV',  command=self.import_csv, width=10)
        self.button1.pack(pady = (0, 5), padx = (10, 0), side = tk.LEFT)  

        self.button2 = tk.Button(self.rightframeheader, text='Clear',  command=self.clear, width=10)
        self.button2.pack(padx = (10, 0), pady = (0, 5), side = tk.LEFT)  

        self.button3 = tk.Button(self.rightframeheader, text='Generate Plot',  command=self.generatePlot, width=10)
        self.button3.pack(pady = (0, 5), padx = (10, 0), side = tk.LEFT)  
        self.rightframeheader.pack()

        self.tableframe = tk.Frame(self.rightframe, highlightbackground="blue", highlightthickness=5)
        self.table = Table(self.tableframe, dataframe=pd.DataFrame(), width=300, height=400)
        self.table.show()
        self.tableframe.pack()

        self.canvas = tk.Frame(self.leftframe)
        self.fig = Figure()
        self.ax = self.fig.add_subplot(111)
        self.graph = FigureCanvasTkAgg(self.fig, self.canvas)
        self.graph.draw()
        self.graph.get_tk_widget().pack()
        self.canvas.pack(padx=(20, 0))

        self.main.pack()


    def import_csv(self):
        types = [("CSV files","*.csv"),("Excel files","*.xlsx"),("Text files","*.txt"),("All files","*.*") ]
        csv_file_path = askopenfilename(initialdir = ".", title = "Open File", filetypes=types)
        tempdf = pd.DataFrame()

        try:
            tempdf = pd.read_csv(csv_file_path)
        except Exception:
            # Not parseable as CSV, so fall back to reading it as an Excel file
            tempdf = pd.read_excel(csv_file_path)
            
        self.table.model.df = tempdf
        self.table.model.df.columns = self.table.model.df.columns.str.lower()
        self.table.redraw()  
        self.generatePlot()

    def clear(self):
        self.table.model.df = pd.DataFrame()
        self.table.redraw()
        self.ax.clear()
        # Redraw the canvas so the cleared axes are actually shown
        self.graph.draw_idle()

    def generatePlot(self):
        self.ax.clear()
        if not(self.table.model.df.empty): 
            df = self.table.model.df.copy()
            self.ax.plot(pd.to_numeric(df["x"]), pd.to_numeric(df["y"]), color ='tab:blue', picker=True, pickradius=5)   
        self.graph.draw_idle()     

root = tk.Tk()
window = Window(root)
root.mainloop()

Now you have two options. You can either install each library individually using “pip”, or you can create a requirements.txt file. We will go with the latter option, because it’s a recommended approach that helps you maintain the correct versions.

First, create a new requirements.txt file in the parent folder of your virtual environment. The file structure should look like this:

-- virtuals_envs_folder
    -- tutorial
    -- file.py
    -- requirements.txt

Now we will open a new command prompt (unrelated to our virtual environment), and use it to check the version of each of our required libraries.

We can check the version of an installed library using pip show <library-name>. Running pip show matplotlib gives us the following output.

Name: matplotlib
Version: 3.5.3
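
If you would rather collect these versions from Python than read them off pip show one at a time, the standard library's importlib.metadata module can look them up. The following is a minimal sketch; the helper name pinned_requirement is hypothetical, and the package names in the loop are only examples taken from the list used in this tutorial.

```python
from importlib.metadata import PackageNotFoundError, version


def pinned_requirement(package_name):
    """Return 'name==version' for an installed package, or None if it is missing."""
    try:
        return f"{package_name}=={version(package_name)}"
    except PackageNotFoundError:
        return None


# Print one requirements.txt-style line per package that is installed
for name in ["matplotlib", "pandas", "numpy"]:
    line = pinned_requirement(name)
    print(line if line else f"# {name} is not installed")
```

Run this with the interpreter that has your libraries installed (not the fresh virtual environment), and copy the printed lines into requirements.txt.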

We will now add this information to our requirements.txt file. For the sample code we provided above, our file will look like this:

matplotlib==3.5.3
sympy==1.10.1
pandastable==0.13.0
pandas==1.4.3
numpy==1.23.1
scipy==1.9.0
pyinstaller==5.6.2
auto-py-to-exe==2.24.1

Now we will run the following command to have them all installed in one go.

pip install -r requirements.txt

Using Pyinstaller in our Venv Virtual Environment

And now, we are FINALLY done with all the setup!

But don’t forget why we are doing all of this! As a test, try running pip list, and see how many libraries get printed out this time. It should be a lot less than what we had before.

All we need to do now is run the pyinstaller command like we would do normally.

pyinstaller --noconsole --onefile file.py

This will generate our Pyinstaller EXE in our Venv Virtual environment (or in the parent folder)! Now observe the size difference and let us know down in the comments section how much of an improvement you got!

There should also be a slight speed bonus, due to the smaller size and lower number of modules to load.


If you are looking to further reduce the size of your Python module, the UPX packer is a great way to easily bring down the size of your EXE substantially. Check out our tutorial on it!


This marks the end of the “Setup Virtual Environment for Pyinstaller with Venv” Tutorial. Any suggestions or contributions for CodersLegacy are more than welcome. Questions regarding the tutorial content can be asked in the comments section below.

The post Setup Virtual Environment for Pyinstaller with Venv appeared first on CodersLegacy.

Wyatt Baldwin: PDM vs Poetry

A few years back, I started using poetry to manage dependencies for all my projects. I ran into some minor issues early on but haven’t had any problems recently and prefer it to any of the other dependency management / packaging solutions I’ve used. Recently, I’ve started hearing about pdm and how it’s the bee’s knees. Earlier today, I did a search for “pdm vs poetry” and didn’t find much, so I thought I’d write something myself.

Programiz: Python List

In this tutorial, we will learn about Python lists (creating lists, changing list items, removing items and other list operations) with the help of examples.

Programiz: Python Program to Count the Number of Digits Present In a Number

In this example, you will learn to count the number of digits present in a number.

Programiz: Python Program to Check If Two Strings are Anagram

In this example, you will learn to check if two strings are anagrams.

Programiz: Python Program to Compute all the Permutation of the String

In this example, you will learn to compute all the permutations of a string.

Programiz: Python Program to Count the Number of Occurrence of a Character in String

In this example, you will learn to count the number of occurrences of a character in a string.

Programiz: Python Program to Convert Bytes to a String

In this example, you will learn to convert bytes to a string.

Programiz: Python Program to Compute the Power of a Number

In this example, you will learn to compute the power of a number.

Programiz: Python Program to Capitalize the First Character of a String

In this example, you will learn to capitalize the first character of a string.

Programiz: Python Program to Create a Countdown Timer

In this example, you will learn to create a countdown timer.

Programiz: Python Program to Remove Duplicate Element From a List

In this example, you will learn to remove duplicate elements from a list.

Mike Driscoll: PyDev of the Week: Joe Kaufeld


This week we welcome Joe Kaufeld as our PyDev of the Week! Joe has been a convention organizer for more than a decade and cofounded TranscribersOfReddit. Joe is active in the Python community and has been working with the language for many years.

Let's spend some time getting to know Joe better!

Can you tell us a little about yourself (hobbies, education, etc)?

General intro time! I’m a self-taught developer based in Indianapolis, Indiana. When I’m not thinking about code, you can usually find me covered in sawdust or under a pile of cats.

I have a BS in Information Systems and Operations Management from Ball State University, which prepared me for an illustrious career as a consultant. After graduation, I became a consultant and quickly realized that I hated it! I transitioned to an IT helpdesk role and worked my way up to my current job: senior developer. Along the way, I launched a few side projects including an international nonprofit and a very active reference library for 3D printing plastics. (More on both of those later!)

I live in a relatively sleepy corner of the city with my wonderful partner and three cats.

Why did you start using Python?

During my undergrad, I stumbled across a neglected room in the College of Business that hadn’t been touched since 2003. It had been dedicated to a student project to create a Beowulf cluster utilizing discarded university equipment. After I tracked down the professor who was in charge of the room, I asked if I could use it and he basically threw the door key at me and said “have fun!”. I replaced all of the equipment with new-ish (circa 2012) hardware which was being removed from other departments on campus, threw out all the old hardware, and built a 115 node cluster on CentOS 5.

Given that I had complete freedom over how to actually make software run on the monstrosity, I spent a lot of time evaluating different options and languages. Eventually, I landed on Python 2.7.0 and Parallel Python, a HTTP-based multiprocessing library. I used that to develop a system called ClusterBuster, a surprisingly fast Windows NTLM password cracker.

The absolute ease of development was a gamechanger to me; my previous experience had been QBASIC and FORTRAN, and compared to these, Python was a breath of fresh air that I didn’t know I needed.

What other programming languages do you know and which is your favorite?

My first was QBASIC: when I was 11, my dad sat me down in front of a Windows 95 machine, opened the familiar blue screen of QBASIC, hit F1 to bring up the help menu, and left me to my own devices with an Epson dot-matrix printer. I used this knowledge to convince (bribe) my math teacher into letting me trade my QBASIC program which did the day’s math homework for the worksheets. Since I could explain to the computer how to solve it, then I clearly understood what the homework was supposed to cover. That excuse probably wouldn’t fly today!

I picked up some FORTRAN in high school because I was bored, but honestly never enjoyed it. The Myspace era introduced me to HTML and CSS, and a proper loathing for JS followed shortly after. I still do a fair amount of JS, but my bitter complaining during and after hasn’t changed. Once I was introduced to Python, everything really clicked into place - I found a language that just made sense to me and how my brain works.

Since then, I’ve toyed with other languages – C++, Rust, Ruby, Crystal, and a brief fling with Java come to mind – but my programming love remains Python.

What projects are you working on now?

My day job is as a Python developer building APIs with Django, but around that, I run FilamentColors.xyz, a color matching tool for 3D printer filaments. That’s something that’s more or less under constant development, as I’m always wanting to show off the collection in the best light and make it as usable as I can. We have about 880 filaments cataloged and published and another roughly 800 currently in progress, as I had to pause to work on a refresh of the UI.

Besides FilamentColors, I also serve as the president and cofounder of the Grafeas Group, a 501c3 that focuses on increasing accessibility on the internet. (You may have seen us featured on WIRED!) With ~5800 volunteers across the globe, we convert images, video, and audio into the one format everyone can access: text. The nonprofit oversees a community on Reddit called TranscribersOfReddit, a “job board” of sorts that anyone can join. We provide templates and guidance designed in close collaboration with r/Blind, the primary community on Reddit for blind and visually-impaired folks. You may have seen our work sprinkled across Reddit in the form of a comment that starts with “Image Transcription:” and ends with “I’m a human volunteer content transcriber and you could be too!” with a complete description of the linked content written out in between. To date, we’ve done a little over 267,500 transcriptions all across Reddit and plan on doing many more!

Which Python libraries are your favorite (core or 3rd party)?

I have literally made “knowing a lot about Django” into a career, so I am immensely thankful to the core Django team and all the contributors who have helped make it what it is today. Besides that, I really enjoy working with (in absolutely no particular order):

There are so many amazing packages and libraries out there; out of everything I work with regularly, Django is definitely the clear favorite (though it’s technically a framework and not a library), so in the spirit of answering the question as it’s worded, I really have to call out Poetry and Black; when starting a new project, the first commands I run are:

poetry init
poetry add black --group dev

 

How did TranscribersOfReddit come about?

Back in 2014, I had a really crappy phone. It was tiny, had approximately zero reasonable features, and (most importantly) couldn’t zoom in on images very well. I was (and continue to be) an avid reader of r/DnDGreentext, a community on Reddit that is all about roleplaying groups doing silly things, often in game-breaking ways. Occasionally there are images posted that are massive screenshots stitched together from game stories from different places around the web, and whenever I found one of those, I’d save it on mobile and type it up on my computer on my lunch break. I used it as typing practice at the time and I figured, “hey, maybe there’s someone else like me who can’t read these for some reason or another. I’ll just put the work out there so if someone needs it, it’s there.”

Time passed and I eventually started getting pings from other readers essentially asking when I would have a transcription up for reading, since it turns out that even manipulating these giant screenshots on desktop doesn’t even work that well sometimes.

More time passed and then something strange happened: someone started racing me to transcribe these screenshots whenever they popped up. I thought, “well that’s weird, I don’t know why someone else would want to do this too,” and so I just kind of ignored them for a while. We eventually chatted and found out that he was a bored college student while I was a bored help desk tech. After talking about some of the things we’d learned, we decided to set up a subreddit and some basic scripting (Python and PRAW to the rescue!) to create a kind of job board so that we knew what the other person was working on and didn’t duplicate work.

As I built the initial bot, I kept thinking, “I wonder if there are other people who would do this too?”. We decided that we would announce the opening of our little system, and we were so excited that we didn’t check the date: we opened officially on April 1st, 2017.

That didn’t stop people from joining; at the end of the first day, we had 30 people signed up and one very serious message from a lead moderator of r/Blind that essentially boiled down to “do you understand what a lifechanging tool this could be?” We immediately started working closely together to ensure we could mesh these two visions and the modern TranscribersOfReddit was born!

What are the top three things you've learned creating TranscribersOfReddit?

I’ve made a lot of mistakes over my career (as we all have) and I am lucky enough that I’ve made some of my more serious gaffes on TranscribersOfReddit (ToR) and not at my day job! These are the situations and things that immediately come to mind:

  • Design projects to match the requirements AND the team
    • When we launched ToR, we didn’t really plan on anyone joining us. As such, I took a lot of shortcuts while writing the bots that power the subreddit. As it quickly became evident that we needed to be able to handle a lot more traffic, I rebuilt things using a more microservice-y design because I simply didn’t know how much traffic we would need to handle and that’s what I was familiar with using at the Fortune 500 company where I was a developer. Turns out that microservices are basically a death sentence for an all-volunteer team, and development progress essentially stagnated until we “de-microserviced” our systems into a core Django monolith with the bots acting as services around it. We went from maybe one deployment a month (because it required several people, careful timing, and shoddy documentation) to a fully automated process that can be deployed multiple times per day without thinking twice by anyone on the team.
  • You cannot grow a community in a bubble
    • If it was just still the two of us, I know for a fact that ToR would not have grown the way that it has. With our wonderful industry contacts and the amazing team that help keep everything running – depending on the time of the year, between 17 and 23 people – not an idea comes up that doesn’t immediately get hit with “Well, what about X or Y?” This is pretty much guaranteed to be something the rest of us haven’t considered, and being able to have these discussions means that the feature or idea will be stronger and more useful. If we had never gotten that message from r/Blind, we would absolutely not be where we are today.
  • It takes a village
    • I am guilty (as many of us are) of trying to shoulder every aspect of a project myself, and the one thing that has been beaten over my head time after time with ToR is that I simply can’t do it alone. There’s just too much. The Grafeas board and our wonderful team of ToR mods are indispensable. Without the team, there is no ToR. Without the volunteers, there is no ToR. Without the readers, there is no ToR. It takes so many people and visions to make this all work and I am so immensely thankful to every single one of them.

Is there anything else you’d like to say?

I have so much to thank the wider Python community for, because the things I build are built on the shoulders of giants. I love the ecosystem of Python and how varied it is – and how open everything is – and it’s just an absolute joy. To Mike: thanks so much for offering me the opportunity to be a part of this series!

Thanks for doing the interview, Joe!

The post PyDev of the Week: Joe Kaufeld appeared first on Mouse Vs Python.

Python for Beginners: Create a Dictionary From a String in Python


Strings and dictionaries are the two most used data structures in Python. We use strings for text analysis in Python. On the other hand, a dictionary is used to store key-value pairs. In this article, we will discuss how to create a dictionary from a string in Python.

Create a Dictionary From a String Using for Loop

To create a dictionary from a string in Python, we can use a python for loop. In this approach, we will create a dictionary that contains the count of all the characters in the original string. To create a dictionary from the given string, we will use the following steps.

  • First, we will create an empty python dictionary using the dict() function.
  • Next, we will use a for loop to iterate through the characters of the string.
  • While iterating, we will first check whether the character is already present in the dictionary. If yes, we will increment the value associated with the character in the dictionary. Otherwise, we will add the character to the dictionary as a key with 1 as its associated value.

After executing the for loop, we will get the dictionary containing the characters of the string as keys and their frequency as values. You can observe this in the following example.

myStr="Python For Beginners"
print("The input string is:",myStr)
myDict=dict()
for character in myStr:
    if character in myDict:
        myDict[character]+=1
    else:
        myDict[character]=1
print("The dictionary created from characters of the string is:")
print(myDict)

Output:

The input string is: Python For Beginners
The dictionary created from characters of the string is:
{'P': 1, 'y': 1, 't': 1, 'h': 1, 'o': 2, 'n': 3, ' ': 2, 'F': 1, 'r': 2, 'B': 1, 'e': 2, 'g': 1, 'i': 1, 's': 1}
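
The membership check in the loop above can also be folded into a single call to dict.get(), which returns a default value when the key is missing. The following is a minimal sketch of that variant; the function name count_characters is only illustrative and is not part of the original example.

```python
def count_characters(text):
    """Count how often each character occurs in text."""
    counts = {}
    for character in text:
        # get() returns 0 for a character we have not seen yet
        counts[character] = counts.get(character, 0) + 1
    return counts


myDict = count_characters("Python For Beginners")
print(myDict["n"])  # 3
print(myDict["o"])  # 2
```

Both versions build the same dictionary; this one only saves the explicit if/else branch.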

Dictionary From a String Using the Counter Method

To create a dictionary from a string in python, we can also use the Counter class defined in the collections module. Calling Counter() with a string as the input argument returns a Counter object. The Counter object contains all the characters of the string as keys and their frequencies as the associated values. Passing it to dict() converts it into a regular dictionary. You can observe this in the following example.

from collections import Counter
myStr="Python For Beginners"
print("The input string is:",myStr)
myDict=dict(Counter(myStr))
print("The dictionary created from characters of the string is:")
print(myDict)

Output:

The input string is: Python For Beginners
The dictionary created from characters of the string is:
{'P': 1, 'y': 1, 't': 1, 'h': 1, 'o': 2, 'n': 3, ' ': 2, 'F': 1, 'r': 2, 'B': 1, 'e': 2, 'g': 1, 'i': 1, 's': 1}
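
Converting to a plain dictionary discards the extra methods a Counter object provides, so it can be worth querying the Counter before calling dict() on it. A small sketch, reusing the same input string:

```python
from collections import Counter

myStr = "Python For Beginners"
counts = Counter(myStr)

# most_common(1) returns a list holding the single most frequent (character, count) pair
print(counts.most_common(1))  # [('n', 3)]

# A Counter returns 0 for characters that never occur instead of raising KeyError
print(counts["z"])  # 0
```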

Dictionary From a String Using Dict.fromkeys() Method

We can also create a dictionary from a string in python using the fromkeys() method. The fromkeys() method takes a string as its first input argument and a default value as its second argument. After execution, it returns a dictionary containing all the characters of the string as keys and the input value as the associated values for each key. If you want to provide a default associated value to all the keys of the dictionary, you can use the fromkeys() method to create a dictionary from a string as shown below.

myStr="Python For Beginners"
print("The input string is:",myStr)
myDict=dict.fromkeys(myStr,0)
print("The dictionary created from characters of the string is:")
print(myDict)

Output:

The input string is: Python For Beginners
The dictionary created from characters of the string is:
{'P': 0, 'y': 0, 't': 0, 'h': 0, 'o': 0, 'n': 0, ' ': 0, 'F': 0, 'r': 0, 'B': 0, 'e': 0, 'g': 0, 'i': 0, 's': 0}

If you don’t want to pass any default value to the fromkeys() method, you can choose to omit it. In this case, all the keys of the dictionary created from the string will have None as their associated values. You can observe this in the following example.

myStr="Python For Beginners"
print("The input string is:",myStr)
myDict=dict.fromkeys(myStr)
print("The dictionary created from characters of the string is:")
print(myDict)

Output:

The input string is: Python For Beginners
The dictionary created from characters of the string is:
{'P': None, 'y': None, 't': None, 'h': None, 'o': None, 'n': None, ' ': None, 'F': None, 'r': None, 'B': None, 'e': None, 'g': None, 'i': None, 's': None}
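
One thing to watch for with fromkeys() is a mutable default value such as a list. The same object is assigned to every key, so a change made through one key shows up under all of them. The sketch below uses a short made-up string to keep the output readable.

```python
myStr = "abc"

# Every key refers to the SAME list object, so appending through one key affects all keys
shared = dict.fromkeys(myStr, [])
shared["a"].append(1)
print(shared)  # {'a': [1], 'b': [1], 'c': [1]}

# A dictionary comprehension creates a separate list for each key
separate = {character: [] for character in myStr}
separate["a"].append(1)
print(separate)  # {'a': [1], 'b': [], 'c': []}
```

With immutable defaults such as 0 or None, as in the examples above, this sharing is harmless.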

Using OrderedDict.fromkeys() Method

Instead of a dictionary, you can also obtain an ordered dictionary from a string in Python. If you want to provide a default associated value to all the keys of the dictionary, you can use the fromkeys() method to create an ordered dictionary from a string. The fromkeys() method takes a string as its first input argument and a default value as its second argument. After execution, it returns an ordered dictionary containing all the characters of the string as keys. You can observe this in the following example.

from collections import OrderedDict
myStr="Python For Beginners"
print("The input string is:",myStr)
myDict=OrderedDict.fromkeys(myStr,0)
print("The dictionary created from characters of the string is:")
print(myDict)

Output:

The input string is: Python For Beginners
The dictionary created from characters of the string is:
OrderedDict([('P', 0), ('y', 0), ('t', 0), ('h', 0), ('o', 0), ('n', 0), (' ', 0), ('F', 0), ('r', 0), ('B', 0), ('e', 0), ('g', 0), ('i', 0), ('s', 0)])

If you don’t want to pass any default value to the fromkeys() method, you can choose to omit it. In this case, all the keys of the ordered dictionary created from the string will have None as their associated values. You can observe this in the following example.

from collections import OrderedDict
myStr="Python For Beginners"
print("The input string is:",myStr)
myDict=OrderedDict.fromkeys(myStr)
print("The dictionary created from characters of the string is:")
print(myDict)

Output:

The input string is: Python For Beginners
The dictionary created from characters of the string is:
OrderedDict([('P', None), ('y', None), ('t', None), ('h', None), ('o', None), ('n', None), (' ', None), ('F', None), ('r', None), ('B', None), ('e', None), ('g', None), ('i', None), ('s', None)])
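
Since Python 3.7, a regular dictionary also keeps its keys in insertion order, so the practical differences are smaller than the name suggests. One difference that remains is how equality is compared. A short sketch with made-up strings:

```python
from collections import OrderedDict

# A plain dict keeps keys in insertion order on Python 3.7 and later
print(list(dict.fromkeys("banana")))  # ['b', 'a', 'n']

# dict comparisons ignore key order, OrderedDict comparisons do not
print(dict.fromkeys("ab") == dict.fromkeys("ba"))  # True
print(OrderedDict.fromkeys("ab") == OrderedDict.fromkeys("ba"))  # False
```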

Conclusion

In this article, we have discussed different ways to create a dictionary from a string in Python. To learn more about python programming, you can read this article on string manipulation. You might also like this article on python simplehttpserver.

I hope you enjoyed reading this article. Stay tuned for more informative articles.

Happy Learning!

The post Create a Dictionary From a String in Python appeared first on PythonForBeginners.com.


PyCoder’s Weekly: Issue #557 (Dec. 27, 2022)


#557 – DECEMBER 27, 2022

It is that time of year again: everybody is making lists. Hopefully you weren’t on the “naughty” one. 2022 has seen a lot of change in tech. From the release of Python 3.11 to the sudden surge in Mastodon use, the world of Python has been busy.

This week’s newsletter is a collection of the most popular articles and projects linked in 2022. Maybe you missed one, maybe you’ve got more time to read something in depth.

Here’s to you, dear reader. Thanks for continuing to be with us at PyCoder’s Weekly. I’m sure 2023 will be just as wild. Speaking of 2023, if you come across something cool next year, an article or a project you think deserves some notice, send it to us and it might end up in a future issue.

Happy Pythoning!

— The PyCoder’s Weekly Team
    Christopher Trudeau, Curator
    Dan Bader, Editor


Python 3.11: Cool New Features for You to Try

Python 3.11 is out! In this article, you’ll explore what Python 3.11 brings to the table. You’ll learn how Python 3.11 is the fastest and most user-friendly version of CPython yet, and learn about improvements to the typing system and to the asynchronous features of Python.
REAL PYTHON

Python f-Strings Are More Powerful Than You Might Think

Learn about the lesser-known features of Python’s f-strings, including date formatting, variable debugging, nested f-strings, and conditional formatting.
MARTIN HEINZ • Shared by Martin Heinz

TelemetryHub by Scout APM, A One-Step Solution for Open-Telemetry

Imagine a world where you could see all of your logs, metrics, and tracing data in one place. We’ve built TelemetryHub to be the simplest observability tool on the market. We supply peace of mind by providing an intuitive user experience that allows you to easily monitor your application →
SCOUT APM sponsor

Say Goodbye to These Obsolete Python Libraries

It’s time to say goodbye to os.path, random, pytz, namedtuple and many more obsolete Python libraries. Start using the latest and greatest ones instead.
MARTIN HEINZ • Shared by Martin Heinz

Python list vs tuple Comparison

Learn how list and tuple are similar and how they are different, including storage and speed differences and how to choose between them.
CHETAN AMBI

How to Write User-Friendly CLIs in Python

How to write user-friendly Command Line Interface applications and an overview of several of the popular CLI libraries: argparse, Click, Typer, Docopt, and Fire.
XIAOXU GAO

Where Exactly Does Python 3.11 Get Its Speedup?

This deep dive into Python 3.11’s speed-up walks you through nine different optimizations that contribute to the 25% performance improvement in CPython.
BESHR KAYALI

Taipy | The First Open-Source Python Application Builder

Taipy is a new low-code Python package that allows you to create complete Data Science applications. It covers both graphical visualization, using intuitive low-code programming (Taipy GUI), and the management of data and execution flow through pipelines and “what-if” scenarios (Taipy Core) →
TAIPY sponsor

Python’s “Functions” Are Sometimes Classes

Ever use list() or enumerate()? Think of them as functions? They’re not, they’re classes. Sometimes we call classes functions in Python. Why? And what’s a “callable”?
TREY HUNNER

Processing Large JSON Files Without Running Out of Memory

Loading complete JSON files into Python can use too much memory, leading to slowness or crashes. The solution: process JSON data one chunk at a time.
ITAMAR TURNER-TRAURING

Dunder Methods in Python: The Ugliest Awesome Sauce

Double-underscore methods, also known as “dunder methods” or “magic methods” are an ugly way of bringing beauty to your code. Learn about constructors, __repr__, __str__, operator overloading, and getting your classes working with Python functions like len().
JOHN LOCKWOOD

A First Look at PyScript: Python in the Web Browser

In this tutorial, you’ll learn about PyScript, a new framework that allows for running Python in the web browser with few or no code modifications and excellent performance. You’ll leverage browser APIs and JavaScript libraries to build rich, highly interactive web applications with Python.
REAL PYTHON

Build an Alexa Equivalent in Python

It’s not as difficult as you think to build an AI program that listens to speech and answers questions. You can make the magic happen in an afternoon by leveraging a few Python packages and APIs.
ANDREW HERSHY

Connect, Integrate & Automate Your Data - From Python or Any Other Application

At CData, we simplify connectivity between the application and data sources that power business, making it easier to unlock the value of data.
CDATA SOFTWARE sponsor

10 Patterns for Writing Cleaner Python

Cleaner code is more focused, easier to read, easier to debug, and generally easier to maintain. This guide covers ten different patterns Python programmers should apply in their code.
ALEX OMEYER

Just Enough Cython to Be Useful

Cython is a superset of Python designed to give C-like performance. Ever wanted to learn the basics? This article shows you how to get started.
PETER BAUMGARTNER

Projects & Code

Events


Happy Pythoning!
This was PyCoder’s Weekly Issue #557.

[ Subscribe to 🐍 PyCoder’s Weekly 💌 – Get the best Python news, articles, and tutorials delivered to your inbox once a week >> Click here to learn more ]

The Python Coding Blog: Using type hints when defining a Python function [Intermediate Python Functions Series #6]


You’ve already covered a lot of ground in this Intermediate Python Functions series. In this article, you’ll read about a relatively new addition in Python called type hinting or type annotation. Unlike all the other topics you learnt about in the previous articles, this one will not change the behaviour of the function you define. So why bother? Let’s find out.

Overview Of The Intermediate Python Functions Series

Here’s an overview of the seven articles in this series:

  1. Introduction to the series: Do you know all your functions terminology well?
  2. Choosing whether to use positional or keyword arguments when calling a function
  3. Using optional arguments by including default values when defining a function
  4. Using any number of optional positional and keyword arguments: args and kwargs
  5. Using positional-only arguments and keyword-only arguments: the “rogue” forward slash / or asterisk * in function signatures
  6. [This article] Type hinting in functions
  7. Best practices when defining and using functions

Type Hints in Python Functions

Let’s see what type hints are with the following example:

def greet_person(person: str, number: int):
    for greeting in range(number):
        print(f"Hello {person}! How are you doing today?")

# 1.
greet_person("Sam", 4)

# 2.
greet_person(2, 4)

You define the function greet_person() which has two parameters:

  • person
  • number

In the function definition, the parameters also have type hints. The type hint follows immediately after the parameter name and a colon. The function’s signature shows str as the data type annotation for person and int as the annotation for number.

However, these are just hints or annotations. They do not force the parameters to take only those data types as inputs. You can confirm this by running the code above. Both function calls run without errors even though the second call has an int as its first argument when its type hint indicates that it’s meant to be a str:

Hello Sam! How are you doing today?
Hello Sam! How are you doing today?
Hello Sam! How are you doing today?
Hello Sam! How are you doing today?
Hello 2! How are you doing today?
Hello 2! How are you doing today?
Hello 2! How are you doing today?
Hello 2! How are you doing today?
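
One way to convince yourself that the hints are only metadata is to look at where Python stores them. The sketch below redefines the same function and prints its __annotations__ attribute, which is a plain dictionary that the interpreter never consults when the function is called.

```python
def greet_person(person: str, number: int):
    for greeting in range(number):
        print(f"Hello {person}! How are you doing today?")


# The hints are stored on the function object as a dictionary
print(greet_person.__annotations__)
# {'person': <class 'str'>, 'number': <class 'int'>}
```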

So, if the code still works, what do type hints do?

Tools Which Make Use Of Type Hints in Python Functions

Let’s look at the code above as seen in the IDE I’m using. I’m using PyCharm, but you’ll also get similar behaviour in other IDEs.

You can see that one of the arguments is highlighted in yellow in the second function call. The first argument, the integer 2, has a warning. When you hover over the argument, a warning pops up: “Expected type ‘str’, got ‘int’ instead”.

Even though the code still runs and doesn’t give an error message, the IDE warns you before you run your code to inform you that the argument you used doesn’t match the expected data type. The expected data type is the one used in the type hint.

There are other tools which check type hints and provide warnings, too. Therefore, even though type hints do not change the function’s behaviour, they can minimise errors and bugs. The user is less likely to misuse the function if they get warnings when using the wrong data types.
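
To make the idea of a tool that reads type hints more concrete, here is a minimal sketch of a runtime checker written as a decorator. The name enforce_types is hypothetical and this is not how PyCharm or any particular checker works. It only handles plain classes such as str and int, and is meant purely as an illustration that hints are available for other code to inspect.

```python
import functools
import inspect
from typing import get_type_hints


def enforce_types(func):
    """Raise TypeError when an argument does not match a simple class hint."""
    hints = get_type_hints(func)
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Map the positional and keyword arguments onto parameter names
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = hints.get(name)
            # Only simple hints are checked in this sketch
            if expected in (str, int, float, bool) and not isinstance(value, expected):
                raise TypeError(
                    f"{name} should be {expected.__name__}, "
                    f"got {type(value).__name__}"
                )
        return func(*args, **kwargs)

    return wrapper


@enforce_types
def greet_person(person: str, number: int):
    return [f"Hello {person}! How are you doing today?" for _ in range(number)]


print(len(greet_person("Sam", 4)))  # 4

try:
    greet_person(2, 4)
except TypeError as error:
    print(error)  # person should be str, got int
```

Static tools report this kind of mismatch without running the code at all, which is usually preferable to a runtime check like this one.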

Type Hints For Return Values in Python Functions

Let’s look at another variation of the function:

def greet_people(people: list) -> list:
    return [f"Hello {person}! How are you doing today?" for person in people]

result = greet_people(["James", "Matthew", "Claire"])

for item in result:
    print(item.upper())

The parameter people has a type annotation showing it should be passed a list. There’s also the -> symbol followed by list before the colon at the end of the function signature. You’ll see what this is soon.

Let’s first look at the output from this code:

HELLO JAMES! HOW ARE YOU DOING TODAY?
HELLO MATTHEW! HOW ARE YOU DOING TODAY?
HELLO CLAIRE! HOW ARE YOU DOING TODAY?

The annotation -> list at the end of the signature shows that the function’s return value should be a list. This type hint lets anyone reading the function definition know that this function returns a list.

More Complex Type Hints

Let’s go a bit further to see the benefit of this type of annotation. Here’s another version. There’s an error in the for loop:

def greet_people(people: list) -> list:
    return [f"Hello {person}! How are you doing today?" for person in people]

result = greet_people(["James", "Matthew", "Claire"])

for item in result:
    print(item.append(5))

This code raises the following error:

Traceback (most recent call last):
  File "...", line 7, in <module>
    print(item.append(5))
          ^^^^^^^^^^^
AttributeError: 'str' object has no attribute 'append'

The variable result is a list which contains strings. Therefore, the variable item in the for loop will contain a string. You cannot use append() on these strings since append() is not a str method. The type annotation you have now doesn’t help in this situation. It indicates that the function should return a list, which it does.

But is it possible to get a warning of this issue before you run the code using type hints? Can we find out that this is not the right kind of list?

Let’s improve the return value’s type annotation:

def greet_people(people: list) -> list[str]:
    return [f"Hello {person}! How are you doing today?" for person in people]

result = greet_people(["James", "Matthew", "Claire"])

for item in result:
    print(item.append(5))

Note that the return value’s type annotation is now list[str]. This indicates that the function returns a list of strings, not just any list.

Let’s see what this code looks like in PyCharm:

The IDE highlights the append() method on the last line. Type hints indicate that the data the function returns is a list of strings. Therefore the IDE “knows” that item should be a str in the final for loop since result is a list of strings. The IDE warns you that append() is not a string method.
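
The richer annotation is also visible at runtime. As a rough sketch, assuming Python 3.9 or later (where list[str] is valid), the standard typing module can show the kind of information a tool has available when it inspects the function:

```python
from typing import get_args, get_origin, get_type_hints


def greet_people(people: list) -> list[str]:
    return [f"Hello {person}! How are you doing today?" for person in people]


hints = get_type_hints(greet_people)
return_hint = hints["return"]

print(get_origin(return_hint))  # the container type
print(get_args(return_hint))    # the type of the items inside it
```

This only illustrates where the information lives. It is not a description of how PyCharm or any other tool is implemented.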

Should You Start Using Type Hints When Defining Python Functions?

Opinions are split in the Python community on how and when you should use type hinting. Python is a dynamically typed language. This means that you don’t have to declare the data type of variables, as types are determined when the program runs. Type hinting does not make Python a statically typed language.

You may hear some say that you should always use type hints. In some programming environments, such as teams writing production code, type hints have nearly become standard because they make working in large teams easier and minimise bugs.

However, there are situations when you don’t need them and the code you write is simpler without them. There are still many programming applications in which code without type hints is perfectly fine.

So don’t feel pressured to use them all the time!


The post Using type hints when defining a Python function [Intermediate Python Functions Series #6] appeared first on The Python Coding Book.

Kay Hayen: Nuitka Release 1.3


This is to inform you about the new stable release of Nuitka. It is the extremely compatible Python compiler, “download now”.

This release contains a large amount of performance work, that should specifically be useful on Windows, but also generally. A bit of scalability work has been applied, and as usual many bug fixes and small improvements, many of which have been in hotfixes.

Bug Fixes

  • macOS: Framework builds of PySide6 were not properly supporting the use of WebEngine. This requires including frameworks and resources in new ways, and actually some duplication of files, making the bundle big, but this seems to be unavoidable to keep the signature intact.

  • Standalone: Added workaround for dotenv. Do not insist on compiled package directories that may not be there in case of no data files. Fixed in 1.2.1 already.

  • Python3.8+: Fix, the ctypes.CDLL node attributes the winmode argument to Python2, which is wrong, it was actually added with 3.8. Fixed in 1.2.1 already.

  • Windows: Attempt to detect corrupt object file in MSVC linking. These might be produced by cl.exe crashes or clcache bugs. When these are reported by the linker, it now suggests to use the --clean-cache=ccache which will remove it, otherwise there would be no way to cure it. Added in 1.2.1 already.

  • Standalone: Added data files for folium package. Added in 1.2.1 already.

  • Standalone: Added data files for branca package. Added in 1.2.1 already.

  • Fix, some forms of try that had exiting finally branches were incorrectly tracing values only assigned in the try block. Fixed in 1.2.2 already.

  • Alpine: Fix, also include libstdc++ for Alpine to not use the system one which is required by its other binaries, much like we already do for Anaconda. Fixed in 1.2.2 already.

  • Standalone: Added support for latest pytorch. One of our workarounds no longer applies. Fixed in 1.2.2 already.

  • Standalone: Added support for webcam on Windows with opencv-python. Fixed in 1.2.3 already.

  • Standalone: Added support for pytorch_lightning, it was not finding metadata for rich package. Fixed in 1.2.4 already.

    For the release we found that pytorch_lightning may not find rich installed. Need to guard version() checks in our package configuration.

  • Standalone: Added data files for dash package. Fixed in 1.2.4 already.

  • Windows: Retry replace clcache entry after a delay, this works around Virus scanners giving access denied while they are checking the file. Naturally you ought to disable those for your build space, but new users often don’t have this. Fixed in 1.2.4 already.

  • Standalone: Added support for scipy 1.9.2 changes. Fixed in 1.2.4 already.

  • Catch corrupt object file outputs from gcc as well and suggest to clean cache as well. This has been observed to happen at least on Windows and should help resolve the ccache situation there.

  • Windows: In case clcache fails to acquire the global lock, simply ignore that. This happens sporadically and barely is a real locking issue, since that would require two compilations at the same time and for that it largely works.

  • Compatibility: Classes should have the f_locals set to the actual mapping used in their frame. This makes Nuitka usable with the multidispatch package which tries to find methods there while the class is building.

  • Anaconda: Fix, newer Anaconda versions have TCL and Tk in new places, breaking the tk-inter automatic detection. This was fixed in 1.2.6 already.

  • Windows 7: Fix, onefile was not working anymore, a new API usage was not done in a compatible fashion. Fixed in 1.2.6 already.

  • Standalone: Added data files for lark package. Fixed in 1.2.6 already.

  • Fix, pkgutil.iter_modules without arguments was given wrong compiled package names. Fixed in 1.2.6 already.

  • Standalone: Added support for newer clr DLLs changes. Fixed in 1.2.7 already.

  • Standalone: Added workarounds for tensorflow.compat namespace not being available. Fixed in 1.2.7 already.

  • Standalone: Added support for tkextrafont. Fixed in 1.2.7 already.

  • Python3: Fix, locals dict test code testing if a variable was present in a mapping could leak references. Fixed in 1.2.7 already.

  • Standalone: Added support for timm package. Fixed in 1.2.7 already.

  • Plugins: Add tls to list of sensible plugins. This enables at least pyqt6 plugin to do networking with SSL encryption.

  • Standalone: Added implicit dependencies of sklearn.cluster.

  • FreeBSD: Fix, fcopyfile is no longer available on newest OS version, and include files for sendfile have changed.

  • MSYS2: Add back support for MSYS Posix variant. Now onefile works there too.

  • Fix, when picking up data files from command line and plugins, different exclusions were applied. This has been unified to get better coverage for avoiding to include DLLs and the like as data files. DLLs are not data files and must be dealt with differently after all.
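
To illustrate the f_locals compatibility item in the list above, here is a small sketch of the standard CPython behaviour it refers to. The class and attribute names are made up for this example, and sys._getframe() is a CPython implementation detail:

```python
import sys


class Demo:
    x = 1
    # While the class body runs, the frame's f_locals is the class namespace,
    # so names assigned so far are already visible in it.
    saw_x = "x" in sys._getframe().f_locals


print(Demo.saw_x)
```

Packages that look for methods while a class is still being built rely on this kind of lookup.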

New Features

  • UI: Added new option for cache disabling --disable-cache that accepts all and cache names like ccache, bytecode and on Windows, dll-dependencies with selective values.

    Note

    The clcache is implied in ccache for simplicity.

  • UI: With the same values as --disable-cache Nuitka may now also be called with --clean-cache in a compilation or without a filename argument, and then it will erase the current data of those caches before making a compilation.

  • macOS: Added --macos-app-mode option for application bundles that should run in the background (background) or are only a UI element (ui-element).

  • Plugins: In the Nuitka package configuration files, the when allows now to check if a plugin is active. This allowed us to limit console warnings to only packages whose plugin was activated.

  • Plugins: Can now mark a plugin as a GUI toolkit responsible with the consequence that other toolkit detector plugins are all disabled, so when using tk-inter no longer will you be asked about PySide6 plugin, as that is not what you are using apparently.

  • Plugins: Generalized the GUI toolkit detection to include tk-inter as well, so it will now point out that wx and the Qt bindings should be removed for best results, if they are included in the compilation.

  • Plugins: Added ability to provide data files for macOS Resources folder of application bundles.

  • macOS: Fix, Qt WebEngine was not working for framework using Python builds, like the ones from PyPI. This adds support for both PySide2 and PySide6 to distribute those as well.

  • MSYS2: When asking a CPython installation to compress from the POSIX Python, it crashed on the main filename being not the same.

  • Scons: Fix, need to preserve environment attached modes when switching to winlibs gcc on Windows. This was observed with MSYS2, but might have effects in other cases too.

Optimization

  • Python3.10+: When creating dictionaries, lists, and tuples, we use the newly exposed dictionary free list. This can speed up code that repeatedly allocates and releases dictionaries by a lot.

  • Python3.6+: Added fast path to dictionary copy. Compact dictionaries have their keys and values copied directly. This is inspired by a Python3.10 change, but it is applicable to older Python as well, so we applied it there too.

  • Python3.9+: Faster compiled object creation, esp. on Python platforms that use a DLL for libpython, which is a given on Windows. This makes up for core changes that went unnoticed so far and should regain relative speedups to standard Python.

  • Python3.10+: Faster float operations, we use the newly exposed float free list. This can speed up all kinds of float operations that are not doable in-place by a lot.

  • Python3.8+: On Windows, faster object tracking is now available, this previously had to go through a DLL call, that is now removed in this way as it was for non-Windows only so far.

  • Python3.7+: On non-Windows, faster object tracking is now used, this was regressed when adding support for this version, becoming equally bad as all of Windows at the time. However, we now managed to restore it.

  • Optimization: Faster deep copy of mutable tuples and list constants, these were already faster, but e.g. went up from 137% gain factor to 201% on Python3.10 as a result. We now use a guided deep copy, which has the information about what types it is going to copy, removing the need to check through a dictionary.

  • Optimization: Also have own allocator function for fixed size objects. This accelerates allocation of compiled cells, dictionaries, some iterators, and list objects.

  • More efficient code for object initialization, avoiding one DLL call to set up our compiled objects.

  • Have our own PyObject_Size variant, that will be slightly faster and avoids DLL usage for len and size hints, e.g. in container creations.

  • Avoid using non-optimal malloc related macros and functions of Python, and instead use the fastest form generally. This avoids Python DLL calls that on Windows can be particularly slow.

  • Scalability: Generated child mixins are now used for the generated package metadata hard import nodes calls, and for all instances of single child tuple containers. These are more efficient for creation and traversal of the tree, directly improving the Python compile time.

  • Scalability: Slightly more efficient compile time constant property detections. For frozenset there was no need to check for hashable values, and some branches could be replaced with e.g. defining our own EllipsisType for use in short paths.

  • Windows: When using MSVC and LTO, the linking stage was done with only one thread, we now use the proper options to use all cores. This is controlled by --jobs much like C compilation already is. For large programs this will give big savings in overall execution time. Added in 1.2.7 already.

  • Anti-Bloat: Remove the use of pytest for dash package compilation.

  • Anti-Bloat: Remove the use of IPython for dotenv, pyvista, python_utils, and trimesh package compilation.

  • Anti-Bloat: Remove IPython usage in rdkit improving compile time for standalone by a lot. Fixed in 1.2.7 already.

  • Anti-Bloat: Avoid keras testing framework when using that package.

Organisational

  • Plugins: The numpy plugin functionality was moved to Nuitka package configuration, and as a result, the plugin is now deprecated and devoid of functionality. On non-Windows, this removes unused duplications of the numpy.core DLLs.

  • User Manual: Added information about macOS entitlements and Windows console. These features are supported very well by Nuitka, but needed documentation.

  • UI: Remove alternative options from --help output. These are there often only for historic reasons, e.g. when an option was renamed. They should not bother users reading them.

  • Plugins: Expose the mnemonics option to plugin warnings function, and use it for pyside2 and pyqt5 plugins.

  • Quality: Detect trailing/leading spaces in Nuitka package configuration description values during their automatic check.

  • UI: Detect the CPython official flavor on Windows by comparing to registry paths and consider real prefixes, when being used in virtualenv more often, e.g. when checking for CPython on Apple.

  • UI: Enhanced --version output to include the C compiler selection. It is doing that respecting your other options, e.g. --clang, etc. so it will be helpful in debugging setup issues.

  • UI: Some error messages were using %r rather than '%s' to output file paths, but that escaped backslashes on Windows, making them look worse, so we changed away from this.

  • UI: Document more clearly what --output-dir actually controls.

  • macOS: Added options hint that the Foundation module requires bundle mode to be usable.

  • UI: Allow using both --follow-imports and --nofollow-imports on command line rather than erroring out. Simply use what was given last, this allows overriding what was given in project options tests should the need arise.

  • Reports: Include plugin reasons for pre and post load modules provided. This solves a TODO and makes it easier to debug plugins.

  • UI: Handle --include-package-data before compilation, removing the ability to use pattern. This makes it easier to recognize mistakes without a long compilation and plugins can know them this way too.

  • GitHub: Migrated workflows to newer actions for Python and checkout. Also use newer Ubuntu LTS for Linux test runner.

  • UI: Catch user error of running Nuitka with the pythonw binary on Windows.

  • UI: Make it clear that MSYS2 defaults to --mingw64 mode. It had been like this, but the --help output didn’t say so.

  • GitHub: Updated contribution guidelines for better readability.

  • GitHub: Use organisation URLs everywhere, some were still pointing to the personal rather than the organisation one. While these are redirected, it is not good to have them like this.

  • Mastodon: Added link to https://fosstodon.org/@kayhayen to the PyPI package and User Manual.
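
The %r versus '%s' item in the list above can be illustrated with a short sketch. The path and message text are hypothetical and are not taken from Nuitka's source:

```python
path = r"C:\Users\me\project\main.py"  # hypothetical Windows path

with_repr = "Cannot find %r" % path
with_quotes = "Cannot find '%s'" % path

print(with_repr)    # backslashes are doubled by repr()
print(with_quotes)  # backslashes are shown as they are
```

Both forms put quotes around the path, but only the second one shows it the way a user would type it.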

Cleanups

  • Nodes for hard import calls of package meta data now have their base classes fully automatically created, replacing what was previously manual code. This aims at making them more consistent and easier to add.

  • When adding the new Scons file for C compiler version output, more values that are needed for both onefile and backend compilation were moved to centralized code, simplifying these somewhat again.

  • Remove unused main_module tag. It cannot happen that a module name matches, and still thinks of itself as __main__ during compilation, so that idea was unnecessary.

  • Generate the dictionary copy variants from template code rather than having manual duplications. For dict.copy(), for deep copy (needed e.g. when there are escaping mutable keyword argument constant values in say a function call), and for **kw value preparation in the called function (checking argument types), we have had diverged copies, that are now unified in a single Jinja2 template for optimization.

  • Plugins: Also allow providing generators for providing extra DLLs much like we already do for data files.

  • Naming of basic tests now makes sure to use a Test suffix, so in Visual Code selector they are more distinct from Nuitka code modules.

  • Rather than populating empty dictionaries, helper code now uses factory functions to create them, passing keys and values, and allowing values to be optional, removing noisy if branches at call side.

  • Removed remaining PyDev annotations, we have not needed those for a long time already.

  • Cleanup, avoid list objects for ctypes defined functions and their arglist, actually tuples are sufficient and naturally better to use.

  • Spelling cleanups were resumed, as an ongoing action.

Tests

  • Added construct test that demonstrates the mutable constant argument passing for lists to see the performance gains in this area too.

  • Made construct runner --diff output usable for interactive usage.

  • Repaired Nuitka Speedcenter, but it’s not yet too usable for general consumption. More work will be needed there, esp. to make comparisons more accessible for the general public.

Summary

The major achievement of this release was the removal of the long lived numpy plug-in, replacing it with package based configuration, that is even more optimal and works perfectly across all platforms on both important package installation flavors.

This release has a lot of consolidation efforts, but also as a result of 3.11 investigations, addresses a lot of issues that have crept in over time with Python3 releases since 3.7, where each time something had not been noticed. There will be more need for investigation of relative performance losses, but this should address the most crucial ones, and also takes advantage of optimization that had become available with 3.10 already.

There are also some initial results from cleanups with the composite node tree structure, and how it is managed. Generated “child(ren) having” mixins allow for faster traversal of the node tree.

Some technical things also improved in Scons. Using multiple cores in LTO with MSVC will help this a lot, although for big compilations --lto=no probably has to be recommended still.

More anti-bloat work on more packages rounds up the work.

For macOS specifically, the WebEngine support is crucial to some users, and the new --macos-app-mode with more GUI friendly default resolves long standing problems in this area.

And for MSYS2 and FreeBSD, support has been re-activated, so now 4 OSes work extremely well (others too likely), and on those, most Python flavors work well.

The performance and scalability improvements are going to be crucial. It’s a pity that 3.11 is not yet supported, but we will be getting there.

"Paolo Amoroso's Journal": Troubleshooting a Suite8080 assembler bug


The iLoad feature of the Z80-MBC2 homebrew Z80 computer allows uploading binary code that runs on the bare metal.

I thought it would be fun to try iLoad with some Intel 8080 code generated by the asm80 assembler of Suite8080, the suite of 8080 Assembly cross-development tools I'm writing in Python. But the code crashed the Z80-MBC2, uncovering a major asm80 bug.

It all started when, to practice the toolchain and process, I started with a Z80 Assembly demo that comes with the Z80-MBC2, which prints a message to the console and blinks one of the device's LEDs. I assembled the program with the zasm Z80 and 8080 assembler, uploaded it via iLoad, and successfully ran it.

Next, I ported the blinking LED demo from Z80 to 8080 code and assembled it with asm80. But when I ran the demo on the Z80-MBC2, it crashed the device.

The baffling crash left me stuck for three months, as I had no tools for debugging on the bare metal and there were only a few vague clues.

I carefully studied the less than a hundred lines of code and they looked fine. To isolate the issue I cut the code in half, leaving the part that prints a message to the console, and transforming the blinking demo into this bare.asm hello world for the bare metal:

OPCODE_PORT     equ     01h
EXEC_WPORT      equ     00h
TX_OPCODE       equ     01h
EOS             equ     00h
CR              equ     0dh
LF              equ     0ah


                org     0h

                jmp     start

                ds      16
stack:


start:          lxi     sp, start
                lxi     h, message
                call    puts

                hlt


message:        db      CR, LF, 'Greetings from the bare metal', CR, LF, EOS


puts:           push    psw
                push    h
puts_loop:      mov     a, m
                cpi     EOS
                jz      puts_end
                call    putc
                inx     h
                jmp     puts_loop
puts_end:       pop     h
                pop     psw
                ret


putc:           push    psw
                mvi     a, TX_OPCODE
                out     OPCODE_PORT
                pop     psw
                out     EXEC_WPORT
                ret

                end

The constants at the beginning define the addresses of the output ports, the opcode for sending a character over the serial line, and a couple of control characters. Next, the program sets up the stack and iterates over the output string to print every character.

The simplified demo program still crashed the Z80-MBC2, forcing me back to the drawing board.

Then I had an epiphany. What if the binary code asm80 generates is different from zasm's?

I fired up the dis80 disassembler of Suite8080 and compared the output of the assemblers. Sure enough, the difference jumped out at me: the destination addresses of all the branches after the label message are off by 5 bytes.

The instructions branch to addresses 5 bytes lower, so the call to puts executes random string data that crashes the device. The last correct address asm80 outputs is that of the label message. The address of the next one, puts, is wrong and leads to the crash.
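
To make the discrepancy concrete, here is a minimal Python sketch that computes where each label of bare.asm should land. It is only a hand-written size table for the instructions used in this program, with the standard 8080 instruction lengths. It is not how asm80 or dis80 work internally, and all names in it are made up for this illustration:

```python
# Size in bytes of each 8080 instruction used in bare.asm.
SIZES = {
    "jmp": 3, "call": 3, "jz": 3, "lxi": 3,
    "cpi": 2, "mvi": 2, "out": 2,
    "hlt": 1, "push": 1, "pop": 1, "mov": 1, "inx": 1, "ret": 1,
}

# Bytes emitted by the db directive: CR, LF, the text, CR, LF, EOS.
MESSAGE = b"\r\n" + b"Greetings from the bare metal" + b"\r\n\x00"

# The program as (label, item) pairs; an int item is a raw byte count.
PROGRAM = [
    (None, "jmp"), (None, 16),  # jmp start, ds 16
    ("start", "lxi"), (None, "lxi"), (None, "call"), (None, "hlt"),
    ("message", len(MESSAGE)),
    ("puts", "push"), (None, "push"),
    ("puts_loop", "mov"), (None, "cpi"), (None, "jz"),
    (None, "call"), (None, "inx"), (None, "jmp"),
    ("puts_end", "pop"), (None, "pop"), (None, "ret"),
    ("putc", "push"), (None, "mvi"), (None, "out"),
    (None, "pop"), (None, "out"), (None, "ret"),
]


def label_addresses(program, origin=0):
    """Return the address each label should get, starting at origin."""
    addresses = {}
    pc = origin
    for label, item in program:
        if label is not None:
            addresses[label] = pc
        pc += item if isinstance(item, int) else SIZES[item]
    return addresses


ADDRESSES = label_addresses(PROGRAM)
for name, address in ADDRESSES.items():
    print(f"{name:<10} {address:04x}h")
```

By this count puts belongs at 003fh. An address 5 bytes lower falls inside the bytes of the message string, which matches the crash described above.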

Indeed, the same demo code assembled with zasm ran fine on the Z80-MBC2 and printed the expected message. This confirmed my hunch.

What now?

The next step is to find the bug in the Python source of asm80, which I'm developing with Replit. Although Replit provides a debugger, I won't use it. The tool is not well documented and I'm not sure how it works. In addition the Replit debugger is best suited to code started from the run button. This is inconvenient for command line programs like the Python scripts of Suite8080.

Therefore, I'll take the opportunity to use Python's native debugger pdb, which I always wanted to try in a real project. I played with pdb a bit and it looks easy to use, with all the commands and options handy.

Let's see if pdb can help me pinpoint the bug in the Python code.


Python for Beginners: Position of a Character in a String in Python


Searching for characters in given strings is one of the most common tasks during text analysis in Python. This article discusses how to find the position of a character in a string in Python.

Find the Position of a Character in a String Using for Loop

To find the position of a character in a string, we will first find the length of the string using the len() function. The len() function takes a string as its input argument and returns the length of the string. We will store the length of the string in a variable strLen.

After getting the length of the string, we will create a range object containing values from 0 to strLen-1 using the range() function. The range() function takes the variable strLen as its input argument and returns a range object. 

Next, we will use a Python for loop and the range object to iterate through the string characters. During iteration, we will check if the current character is the character whose position we are looking for. For this, we will use the indexing operator. If the current character is the character we are looking for, we will print the position of the character and exit the loop using the break statement. Otherwise, we will move to the next character.

After execution of the for loop, if the character is present in the input string, we will get the position of the character as shown below.

myStr="Python For Beginners"
print("The input string is:",myStr)
strLen=len(myStr)
range_obj=range(strLen)
character="F"
for index in range_obj:
    if myStr[index]==character:
        print("The character {} is at index {}".format(character,index))
        break

Output:

The input string is: Python For Beginners
The character F is at index 7
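
As a variation, the same search can be written with enumerate(), which yields each index together with its character, so len(), range(), and the indexing operator are not needed. This is a minimal sketch; the variable found_index is introduced only for this example.

```python
myStr = "Python For Beginners"
character = "F"
found_index = -1  # stays -1 if the character is not in the string
for index, current in enumerate(myStr):
    if current == character:
        found_index = index
        break
print("The character {} is at index {}".format(character, found_index))
```

Storing the result in a variable instead of printing inside the loop also makes it possible to tell afterwards whether the character was found at all.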

Find All the Occurrences of a Character in a String Using for Loop

To find all the occurrences of the character in the string, we will remove the break statement from the for loop. Due to this, the for loop will check all the characters and print their positions if the character is the one we are looking for. You can observe this in the following example.

myStr="Python For Beginners"
print("The input string is:",myStr)
strLen=len(myStr)
range_obj=range(strLen)
character="n"
for index in range_obj:
    if myStr[index]==character:
        print("The character {} is at index {}".format(character,index))

Output:

The input string is: Python For Beginners
The character n is at index 5
The character n is at index 15
The character n is at index 16
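
If you would rather collect the positions than print them one by one, a list comprehension with enumerate() is one possible way to do it. This is a sketch, and the name positions is chosen only for this example.

```python
myStr = "Python For Beginners"
character = "n"
# Keep every index whose character matches the one we are looking for.
positions = [index for index, current in enumerate(myStr) if current == character]
print("The character {} is at indices {}".format(character, positions))
```

An empty list then means that the character does not occur in the string.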

Find the Rightmost Index of the Character in a String Using for Loop

To find the rightmost position of the character, we need to modify our approach. For this, we will create a variable named position to store the position of the character in the string instead of printing the position. While iterating through the characters of the string, we will keep updating the position variable whenever we find the desired character.

After execution of the for loop, we will get the index of the rightmost occurrence of the character in the string. You can observe this in the following example.

myStr="Python For Beginners"
print("The input string is:",myStr)
strLen=len(myStr)
range_obj=range(strLen)
character="n"
position=-1
for index in range_obj:
    if myStr[index]==character:
        position=index
print("The character {} is at rightmost index {}".format(character,position))

Output:

The input string is: Python For Beginners
The character n is at rightmost index 16

The above approach gives the index of the rightmost occurrence counted from the left side of the string. If you want the position of the rightmost occurrence counted from the right end of the string instead, you can use the following approach.

myStr="Python For Beginners"
print("The input string is:",myStr)
strLen=len(myStr)
range_obj=range(strLen)
character="n"
for index in range_obj:
    if myStr[strLen-index-1]==character:
        print("The character {} is at position {} from right.".format(character,index+1))
        break

Output:

The input string is: Python For Beginners
The character n is at position 4 from right.

In this example, we have used the string indexing operator to access elements from the right side of the string. Hence, we can get the rightmost position of any given character with minimum executions of the for loop.

Find the Position of a Character in a String Using While Loop

Instead of the for loop, you can use a while loop to find the position of a character in the string. For this, we will use the following steps.

  • First, we will define a variable named position and initialize it to -1. 
  • Then, we will find the length of the string using the len() function. 
  • Now, we will use a while loop to iterate through the characters of the string. The loop keeps running as long as the position variable is less than strLen-1, so that position+1 always remains a valid index.
  • Inside the while loop, we will check whether the character at the next position, position+1, is the character we are looking for. If yes, we will print the position of the character and exit the while loop using the break statement. Otherwise, we will increment the position variable and move to the next character.

After execution of the while loop, we will get the index of the first occurrence of the character in the string. You can observe this in the following code.

myStr="Python For Beginners"
print("The input string is:",myStr)
strLen=len(myStr)
character="n"
position=-1
while position<strLen-1:
    if myStr[position+1]==character:
        print("The character {} is at position {}.".format(character,position+1))
        break
    position+=1

Output:

The input string is: Python For Beginners
The character n is at position 5.

Find All the Occurrences of a Character in a String Using While Loop

If you want to find all the occurrences of the character in the string, you can remove the break statement from the while loop. After this, the program will print all the positions of the character in the given string as shown below.

myStr="Python For Beginners"
print("The input string is:",myStr)
strLen=len(myStr)
character="n"
position=-1
while position<strLen-1:
    if myStr[position+1]==character:
        print("The character {} is at position {}.".format(character,position+1))
    position+=1

Output:

The input string is: Python For Beginners
The character n is at position 5.
The character n is at position 15.
The character n is at position 16.

Find the Rightmost Index of the Character in a String Using While Loop

To find the rightmost position of the character using the while loop in Python, we will create a variable named rposition to store the position of the character in the string instead of printing the position. While iterating through the characters of the string using the while loop, we will keep updating the rposition variable whenever we find the desired character.

After execution of the while loop, we will get the position of the rightmost occurrence of the character in the string. You can observe this in the following example.

myStr="Python For Beginners"
print("The input string is:",myStr)
strLen=len(myStr)
character="n"
position=-1
rposition=-1  # -1 means the character has not been found
while position<strLen-1:
    if myStr[position+1]==character:
        rposition=position+1
    position+=1
print("The character {} is at rightmost position {}.".format(character,rposition))

Output:

The input string is: Python For Beginners
The character n is at rightmost position 16.

The above approach gives the index of the rightmost occurrence counted from the left side of the string. If you want the position of the rightmost occurrence counted from the right end of the string instead, you can use the following approach.

myStr="Python For Beginners"
print("The input string is:",myStr)
strLen=len(myStr)
character="n"
position=0
while position<strLen:
    if myStr[strLen-position-1]==character:
        print("The character {} is at position {} from right.".format(character,position+1))
        break
    position+=1

Output:

The input string is: Python For Beginners
The character n is at position 4 from right.
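
A position counted from the right can be converted back to an ordinary index by subtracting it from the length of the string. The short sketch below works through the arithmetic for this example; the variable names position_from_right and index_from_left are introduced here for illustration.

```python
myStr = "Python For Beginners"
position_from_right = 4  # the value printed by the program above

# The string has 20 characters, so 20 - 4 gives the index from the left.
index_from_left = len(myStr) - position_from_right
print(index_from_left)              # 16
print(myStr[index_from_left])       # n

# A negative index counts from the right end and reaches the same character.
print(myStr[-position_from_right])  # n
```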

Find the Position of a Character in a String Using the find() Method

To find the position of a character in a string, you can also use the find() method. The find() method, when invoked on a string, takes a character as its input argument. After execution, it returns the position of the first occurrence of the character in the string. If the character is not present in the string, it returns -1. You can observe this in the following example.

myStr="Python For Beginners"
print("The input string is:",myStr)
strLen=len(myStr)
character="n"
position=myStr.find(character)
print("The character {} is at position {}.".format(character,position))

Output:

The input string is: Python For Beginners
The character n is at position 5.
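
The find() method also accepts an optional start index as its second argument. The following sketch shows the -1 result for a missing character and uses the start argument to collect every occurrence; the variable names are chosen here for illustration.

```python
myStr = "Python For Beginners"

missing = myStr.find("z")     # "z" does not occur, so find() returns -1
second = myStr.find("n", 6)   # start searching at index 6
print(missing)                # -1
print(second)                 # 15

# Call find() repeatedly, each time starting just after the previous match.
positions = []
position = myStr.find("n")
while position != -1:
    positions.append(position)
    position = myStr.find("n", position + 1)
print(positions)              # [5, 15, 16]
```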

Suggested Reading: Create a chat application in Python

Find the Index of a Character in a String Using the index() Method

The index() method is used to find the index of a character in a string. The index() method, when invoked on a string, takes a character as its input argument. After execution, it returns the position of the first occurrence of the character in the string. Unlike the find() method, it raises a ValueError exception if the character is not present in the string. You can observe the basic usage in the following example.

myStr="Python For Beginners"
print("The input string is:",myStr)
strLen=len(myStr)
character="n"
position=myStr.index(character)
print("The character {} is at index {}.".format(character,position))

Output:

The input string is: Python For Beginners
The character n is at index 5.
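
When the character is not present in the string, the index() method raises a ValueError exception instead of returning -1. A minimal sketch of handling that case with a try-except block follows; the wording of the messages is an assumption made for illustration.

```python
myStr = "Python For Beginners"
character = "z"

try:
    position = myStr.index(character)
    message = "The character {} is at index {}.".format(character, position)
except ValueError:
    # index() raises ValueError when the character is absent.
    message = "The character {} is not present in the string.".format(character)

print(message)
```

Use find() when a missing character is an expected outcome, and index() when its absence should be treated as an error.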

Find the Rightmost Index of a Character in a String Using the rfind() Method

To find the rightmost position of a character in a Python string, you can also use the rfind() method. The rfind() method works in a similar manner to the find() method except that it returns the rightmost position of the input character in the string. Like find(), it returns -1 if the character is not present. You can observe this in the following example.

myStr="Python For Beginners"
print("The input string is:",myStr)
strLen=len(myStr)
character="n"
position=myStr.rfind(character)
print("The character {} is at rightmost position {}.".format(character,position))

Output:

The input string is: Python For Beginners
The character n is at rightmost position 16.
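
The rfind() method also accepts optional start and end indices, and strings provide an rindex() method that behaves like rfind() but raises a ValueError exception when the character is missing. The sketch below illustrates both; the variable names are chosen here for illustration.

```python
myStr = "Python For Beginners"

last = myStr.rfind("n")                # rightmost occurrence in the whole string
missing = myStr.rfind("z")             # -1 because "z" does not occur
before_last = myStr.rfind("n", 0, 16)  # search only indices 0 to 15
print(last)         # 16
print(missing)      # -1
print(before_last)  # 15

# rindex() returns the same index as rfind() when the character exists.
print(myStr.rindex("n"))  # 16
```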

Conclusion

In this article, we have discussed different ways to find the position of a character in a string in Python. To read more about this topic, you can read this article on how to find all the occurrences of a substring in a string.

You might also like this article on Python SimpleHTTPServer.

I hope you enjoyed reading this article. Stay tuned for more informative articles.

Happy Learning!

The post Position of a Character in a String in Python appeared first on PythonForBeginners.com.
