Channel: Planet Python

Stories in My Pocket: Pathlib: my new favorite module


Though `pathlib` was introduced in Python 3.4 to some praise, I didn't "get" it. Like many things in Python, I needed some time to come around and tinker with it before I realized the power within. I can't remember when `pathlib` started "clicking" for me, but I'm sure it was an accidental rediscovery of it via the Dash documentation application. [dsh]{More on Dash in an upcoming post}

I've found it incredibly helpful in a number of ways, and I hope to pass along some helpful tips if you haven't started using it yet.

You don't have to think as hard

Pretty much every time I needed to configure media or static folders for Django or read some config files, I would have a mental block, frequently asking myself, "What are the `os.path` commands for getting the absolute path of a folder?"

I had to look it up every time, and it just wouldn't stick, but this is usually how I would do it:

local_project_dir = os.path.abspath(os.path.dirname(__file__))

Now, `pathlib` lets me find a folder much as I would on the command line:

local_project_path = Path(__file__, '..').resolve()

I also appreciate that `pathlib` lets you do the same thing in a couple of ways, so you are free to pick whichever works best for you.

All these return a path to the same place:

Path(__file__, '..').resolve()
(Path(__file__) / '..').resolve()
Path(__file__).parent.absolute()

I find I don't have to think as hard to process this syntax, as compared to the `os.path` syntax above.
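
If you want to check this for yourself, here is a minimal sketch that compares the `os.path` version with the three `pathlib` versions. Since `__file__` only exists inside a script, it uses a hypothetical `settings.py` created in a temporary folder as a stand-in; the names `project_dir` and `fake_file` are made up for the example.

```python
import os.path
import tempfile
from pathlib import Path

# Hypothetical stand-in for __file__: a real file inside a temporary folder.
# The folder is resolved up front so symlinked temp dirs don't skew the result.
project_dir = Path(tempfile.mkdtemp()).resolve()
fake_file = project_dir / 'settings.py'
fake_file.touch()

# The os.path way of finding the folder that contains the file.
old_style = os.path.abspath(os.path.dirname(fake_file))

# The three pathlib spellings from above.
candidates = [
    Path(fake_file, '..').resolve(),
    (Path(fake_file) / '..').resolve(),
    Path(fake_file).parent.absolute(),
]

for candidate in candidates:
    # Each line prints the folder and whether it matches the os.path result.
    print(candidate, candidate == Path(old_style))
```

One difference worth knowing: `resolve()` follows symlinks and collapses `..`, while `absolute()` only anchors a relative path to the current working directory. If symlinks are involved, the third spelling can point at the same folder under a different name.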

Built-in conveniences

I also love how `pathlib` bundles actions into a `Path`. In particular, one doesn't need to create an indented block of code to read a file's contents. I'm thankful to not have to think about whether I want to iterate over a file's lines, or just read it in.

with open('readfile.txt') as f:
    data = f.read()

Now, you can read all the data in one line [read_text]{documentation}:

data = Path('readfile.txt').read_text()

Don't get me wrong, I think the `open` block is still very much needed, especially when you want to have fine control over when a file is open, or if you need to iterate over a large file. But when you just need to get the decoded contents,[bytes]{Or if you need the bytes, you can do that too.} it sure is nice to have a quick method to do so.
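
Here is a small sketch of those one-line helpers side by side. It writes a scratch file first so it can run anywhere; the temporary folder and the sample text are assumptions for the example, not part of my project.

```python
import tempfile
from pathlib import Path

# Hypothetical scratch file so the example does not touch real data.
notes = Path(tempfile.mkdtemp()) / 'readfile.txt'

# write_text opens the file, writes the string, and closes it again.
notes.write_text('first line\nsecond line\n', encoding='utf-8')

text = notes.read_text(encoding='utf-8')  # decoded str
raw = notes.read_bytes()                  # undecoded bytes

print(text.splitlines())
print(type(raw).__name__)
```

Passing `encoding` explicitly is optional, but it keeps the result from depending on the platform's default encoding.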

A real-world example

Today, I found out that some of the images in one of my projects did not match the content they belonged to. I realized I could create a contact sheet of images to help us find the problem images. However, the images were named after the id of the content they related to, and there's no easy way for a human to connect those ids to the product.

But we have an API that takes an id and returns the details of the product, including its name. So I was confident I could quickly whip up a little Python script to rename each photo to its product name.

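from pathlib import Path

import requests


# Folder that holds the images, currently named <product id>.jpg
IMG_DIR = Path('C:/Users/chmay/Desktop/pdfcovers')


def id_to_name(product_id: str) -> str:
    """Ask the product API for the name that belongs to product_id."""
    url = f'https://stage.projectloc.net/api/v1/products/{product_id}'
    headers = {
        'accept': 'application/json'
    }
    r = requests.get(url, headers=headers)
    r.raise_for_status()  # stop on HTTP errors instead of renaming blindly
    return r.json()['name']


def main(path: Path):
    for item in path.glob('*.jpg'):
        product_id = item.stem  # file name without the '.jpg' suffix
        name = id_to_name(product_id)
        # Rename inside the folder being scanned, not the global default
        item.rename(item.with_name(f'{name}.jpg'))


if __name__ == '__main__':
    main(IMG_DIR)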

For those learning Python, I'll explain what's going on.

The top imports the dependencies, in this case the `Path` object from `pathlib`, and the `requests` module that I installed into my project's virtual environment.[venv]{Reach out to me if this is hard to understand. Virtual environments are hard to understand when starting out... and many times after that.}

Next, I'm setting a variable (`IMG_DIR`) that points to the folder the images are in. This script uses this variable as a default setting that I could override if I wanted to use it again somewhere else.
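
One way the "default that can be overridden" idea could look is sketched below. My script doesn't do this; the `pick_folder` helper and the idea of passing a folder on the command line are assumptions for illustration only.

```python
import sys
from pathlib import Path

# Same default folder as in the script above.
DEFAULT_DIR = Path('C:/Users/chmay/Desktop/pdfcovers')


def pick_folder(argv: list) -> Path:
    """Hypothetical helper: use argv[1] as the folder if given, else the default."""
    return Path(argv[1]) if len(argv) > 1 else DEFAULT_DIR


# With no extra argument the default wins; with one, the argument wins.
print(pick_folder(['rename_images.py']) == DEFAULT_DIR)
print(pick_folder(['rename_images.py', 'covers']) == Path('covers'))

# In a real script you would pass sys.argv instead of a hand-made list.
chosen = pick_folder(sys.argv[:1])
```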

Following that is the function called `id_to_name`. It takes in the id of the product, uses `requests` to get the information, and returns the product's name.

Next up is `main`, where `pathlib` shines.

  • I set up a loop to iterate over all the `jpg` files in the folder[glob]{glob docs}.
  • Then get the id from the file name[stem]{stem docs}.
  • Pass that id into the `id_to_name` function.
  • And then rename the photo[rename]{rename docs}.
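
If you'd like to try those steps without an API, here is a rough sketch that runs against a temporary folder. The `FAKE_PRODUCTS` mapping, the product names, and the `rename_images` function are all made up for the example; a dictionary lookup stands in for the real `id_to_name` call.

```python
import tempfile
from pathlib import Path

# Hypothetical stand-in for the API call: a fixed id -> name mapping.
FAKE_PRODUCTS = {'101': 'Blue Widget', '102': 'Red Gadget'}


def rename_images(folder: Path, lookup) -> list:
    """Rename every <id>.jpg in folder to <name>.jpg and return the new names."""
    renamed = []
    # sorted(...) collects the matches up front and gives a stable order.
    for item in sorted(folder.glob('*.jpg')):
        product_id = item.stem  # '101.jpg' -> '101'
        new_path = item.with_name(f'{lookup(product_id)}.jpg')
        item.rename(new_path)
        renamed.append(new_path.name)
    return renamed


# Build a throwaway folder with two "images" and one unrelated file.
folder = Path(tempfile.mkdtemp())
for product_id in FAKE_PRODUCTS:
    (folder / f'{product_id}.jpg').touch()
(folder / 'notes.txt').touch()  # not matched by the '*.jpg' glob

new_names = rename_images(folder, FAKE_PRODUCTS.__getitem__)
print(new_names)
```

Using `with_name` keeps each renamed file next to the original, so the function works on whatever folder it is handed.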

Finally, the last two lines of code kick off the process, but only when I run this file from the command line like this:

python rename_images.py

It's your turn!

I have shown this module to a number of people at work, and like me, none of them realized the greatness that lies within. So I encourage you to open up the documentation,[again]{Or if you're on a Mac purchase Dash.} a REPL or your favorite IDE, and mess around with it.

Thanks!

Thanks go to Trey Hunner, who sent out an email in praise of `pathlib` the other day, which reminded me that I hadn’t finished this entry.

