Data School: How to use Python's f-strings with pandas

Python introduced f-strings back in version 3.6 (six years ago!), but I've only recently realized how useful they can be.

In this post, I'll start by showing you some simple examples of how f-strings are used, and then I'll walk you through a more complex example using pandas.

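Here's what I'll cover:

  • Substituting objects
  • Calling methods and functions
  • Evaluating expressions
  • Formatting numbers
  • Real-world example using pandas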

Substituting objects:

name = 'Kevin'
age = 42
print(f'My name is {name}. I am {age} years old.')
My name is Kevin. I am 42 years old.

To make an f-string, you simply put an f in front of a string. By putting the name and age objects inside of curly braces, those objects are automatically substituted into the string.
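
For comparison, here's a small sketch (added for illustration) that builds the same sentence three ways. The variable names with_fstring, with_format, and with_concat are made up for this example; the point is only that the f-string produces the same text as str.format() or manual concatenation, with less typing.

```python
name = 'Kevin'
age = 42

# Three ways to build the same string
with_fstring = f'My name is {name}. I am {age} years old.'
with_format = 'My name is {}. I am {} years old.'.format(name, age)
with_concat = 'My name is ' + name + '. I am ' + str(age) + ' years old.'

# All three produce identical text
print(with_fstring == with_format == with_concat)
```

Note that the f-string converts age to text for you, whereas concatenation needs an explicit str(age).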

Calling methods and functions:

role = 'Daddy'
print(f'Sometimes my 6-year-old yells: {role.upper()}!!!')
Sometimes my 6-year-old yells: DADDY!!!

Strings have an upper() method, and so I was able to call that method on the role string from within the f-string.
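
The example above calls a method. Built-in functions should work the same way inside the curly braces. Here's a minimal sketch, assuming a made-up list called scores that is not part of the original example:

```python
role = 'Daddy'
scores = [3, 1, 2]  # hypothetical data, only for illustration

# A method call inside the braces
shout = f'{role.upper()}!!!'

# Built-in function calls inside the braces
summary = f'{len(scores)} scores, highest is {max(scores)}'

print(shout)
print(summary)
```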

Evaluating expressions:

days_completed = 37
print(f'This portion of the year remains: {(365 - days_completed) / 365}')
This portion of the year remains: 0.8986301369863013

You can evaluate an expression (a math expression, in this case) within an f-string.
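
Expressions aren't limited to arithmetic. As a rough sketch, a comparison or a conditional expression can also go inside the braces. The names days_in_year, remaining, and status are introduced here only for illustration:

```python
days_completed = 37
days_in_year = 365  # hypothetical name for the constant used above

# Arithmetic inside the braces
remaining = f'{days_in_year - days_completed} days remain'

# A conditional expression inside the braces; the outer string uses
# double quotes so the inner strings can use single quotes
status = f"The year is {'more' if days_completed > days_in_year / 2 else 'less'} than half over"

print(remaining)
print(status)
```

Switching quote styles between the outer string and the inner strings keeps this working on Python versions before 3.12, which don't allow reusing the same quote character inside an f-string.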

Formatting numbers:

print(f'This percentage of the year remains: {(365 - days_completed) / 365:.1%}')
This percentage of the year remains: 89.9%

This looks much nicer, right? The : begins the format specification, and the .1% means "format as a percentage with 1 digit after the decimal point."
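
The same : syntax accepts other format specifications from Python's format mini-language. Here's a short sketch of a few common ones; the variable names and the sample numbers other than days_completed are made up for this illustration:

```python
days_completed = 37
fraction = (365 - days_completed) / 365

as_percent = f'{fraction:.1%}'        # percentage, 1 decimal place: '89.9%'
as_fixed = f'{fraction:.3f}'          # fixed-point, 3 decimal places: '0.899'
with_commas = f'{1234567.891:,.2f}'   # thousands separator: '1,234,567.89'
padded = f'{42:>6}'                   # right-aligned in 6 characters: '    42'

print(as_percent, as_fixed, with_commas, padded, sep='\n')
```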

Real-world example using pandas:

Recently, I was analyzing the survey data submitted by 500+ Data School community members. I asked each person about their level of experience with 11 different data science topics, plus their level of interest in improving those skills this year.

Thus I had 22 columns of data, with names like:

  • python_experience
  • python_interest
  • pandas_experience
  • pandas_interest
  • ...

Each “experience” column was coded from 0 (None) to 3 (Advanced), and each “interest” column was coded from 0 (Not interested) to 2 (Definitely interested).

Among other things, I wanted to know the mean level of interest in each topic, as well as the mean level of interest in each topic by experience level.

Here's what I did to answer those questions:

# df holds the survey responses described above
cats = ['python', 'pandas']  # this actually had 11 categories
for cat in cats:
    mean_interest = df[f'{cat}_interest'].mean()
    print(f'Mean interest for {cat.upper()} is {mean_interest:.2f}')
    print(df.groupby(f'{cat}_experience')[f'{cat}_interest'].mean(), '\n')
Mean interest for PYTHON is 1.77
python_experience
0    1.590909
1    1.857143
2    1.781759
3    1.630769
Name: python_interest, dtype: float64 

Mean interest for PANDAS is 1.67
pandas_experience
0.0    1.500000
1.0    1.825806
2.0    1.709924
3.0    1.262295
Name: pandas_interest, dtype: float64 

Notice how I used f-strings:

  • Because of the naming convention, I could access the DataFrame columns using df[f'{cat}_interest'] and df[f'{cat}_experience'].
  • I capitalized the category using f'{cat.upper()}' to help it stand out.
  • I formatted the mean interest to 2 decimal places using f'{mean_interest:.2f}'.
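
If you'd like to try the loop yourself, here's a self-contained sketch. The tiny DataFrame below is invented for illustration and is not the real survey data, so the numbers it prints won't match the output above. The results dictionary is also an addition, there only to keep the computed values around.

```python
import pandas as pd

# Hypothetical stand-in for the survey data (NOT the real responses)
df = pd.DataFrame({
    'python_experience': [0, 1, 1, 2, 3],
    'python_interest':   [1, 2, 2, 2, 1],
    'pandas_experience': [0, 0, 1, 2, 2],
    'pandas_interest':   [2, 1, 2, 1, 2],
})

cats = ['python', 'pandas']
results = {}  # category -> (overall mean, mean by experience level)

for cat in cats:
    # The f-strings build the column names from the category
    mean_interest = df[f'{cat}_interest'].mean()
    by_experience = df.groupby(f'{cat}_experience')[f'{cat}_interest'].mean()
    results[cat] = (mean_interest, by_experience)

    print(f'Mean interest for {cat.upper()} is {mean_interest:.2f}')
    print(by_experience, '\n')
```

Adding a twelfth topic would only require appending its name to cats, as long as its columns follow the same naming convention.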

Further reading:

P.S. This blog post originated as one of my weekly data science tips.

