Channel: Planet Python

Real Python: How to Get a List of All Files in a Directory With Python


Getting a list of all the files and folders in a directory is a natural first step for many file-related operations in Python. When looking into it, though, you may be surprised to find various ways to go about it.

When you’re faced with many ways of doing something, it can be a good indication that there’s no one-size-fits-all solution to your problems. Most likely, every solution will have its own advantages and trade-offs. This is the case when it comes to getting a list of the contents of a directory in Python.

In this tutorial, you’ll be focusing on the most general-purpose techniques in the pathlib module to list items in a directory, but you’ll also learn a bit about some alternative tools.

Source Code: Click here to download the free source code, directories, and bonus materials that showcase different ways to list files and folders in a directory with Python.

Before pathlib came out in Python 3.4, if you wanted to work with file paths, then you’d use the os module. While this was very efficient in terms of performance, you had to handle all the paths as strings.

Handling paths as strings may seem okay at first, but once you start bringing multiple operating systems into the mix, things get trickier. You also end up with a lot of string-manipulation code that obscures the fact that you're working with file paths. Things can get cryptic pretty quickly.

Note: Check out the downloadable materials for some tests that you can run on your machine. The tests compare the time it takes to return a list of all the items in a directory using methods from the pathlib module, the os module, and the Python 3.12 version of pathlib. That version adds a Path.walk() method, modeled on the well-known os.walk() function, which you won't cover in this tutorial.

That’s not to say that working with paths as strings isn’t feasible—after all, developers managed fine without pathlib for many years! The pathlib module just takes care of a lot of the tricky stuff and lets you focus on the main logic of your code.
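To make the contrast concrete, here is a minimal sketch that builds the same path and extracts the file name without its extension, first with string-based os.path functions and then with pathlib. The file name hash-tables.md is a hypothetical example, not a file from the tutorial materials:

```python
import os.path
import pathlib

# String-based approach: every step is a function call on a plain str.
# "hash-tables.md" is a made-up file name used only for illustration.
notes_str = os.path.join("Desktop", "Notes", "hash-tables.md")
stem_str = os.path.splitext(os.path.basename(notes_str))[0]

# pathlib approach: the same steps read as operations on a path object.
notes_path = pathlib.Path("Desktop") / "Notes" / "hash-tables.md"
stem_path = notes_path.stem

print(stem_str)           # hash-tables
print(stem_path)          # hash-tables
print(notes_path.suffix)  # .md
```

Both approaches arrive at the same result. The difference is mostly in how much of the code is about strings and how much is about paths.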

It all begins with creating a Path object, which will be different depending on your operating system (OS). On Windows, you’ll get a WindowsPath object, while Linux and macOS will return PosixPath:

On Windows:

>>> import pathlib
>>> desktop = pathlib.Path("C:/Users/RealPython/Desktop")
>>> desktop
WindowsPath('C:/Users/RealPython/Desktop')

On Linux and macOS:

>>> import pathlib
>>> desktop = pathlib.Path("/home/RealPython/Desktop")
>>> desktop
PosixPath('/home/RealPython/Desktop')

With these OS-aware objects, you can take advantage of the many methods and properties available, such as ones to get a list of files and folders.
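As a rough illustration of a few of those properties, the sketch below uses PureWindowsPath and PurePosixPath. These pure classes never touch the file system, so you can try both flavors on any operating system. The concrete Path objects expose the same properties. The full paths to todo.txt are assumed for illustration only:

```python
from pathlib import PurePosixPath, PureWindowsPath

# Pure paths only manipulate the path text, so both flavors work on any OS.
# The full paths below are illustrative and don't need to exist.
win = PureWindowsPath("C:/Users/RealPython/Desktop/todo.txt")
posix = PurePosixPath("/home/RealPython/Desktop/todo.txt")

for path in (win, posix):
    # Final component, its extension, and the name of the containing folder
    print(path.name, path.suffix, path.parent.name)

print(win.drive)   # C:
print(str(win))    # C:\Users\RealPython\Desktop\todo.txt
print(str(posix))  # /home/RealPython/Desktop/todo.txt
```

Note that the Windows flavor accepts forward slashes as input and renders backslashes when converted to a string.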

Note: If you’re interested in learning more about pathlib and its features, then check out Python 3’s pathlib Module: Taming the File System and the pathlib documentation.

Now, it’s time to dive into listing folder contents. Be aware that there are several ways to do this, and picking the right one will depend on your specific use case.
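To give a sense of what "several ways" can look like, here is a small sketch that lists the same directory with os.listdir(), os.scandir(), and Path.iterdir(). It builds a throwaway directory with tempfile so that it runs anywhere. The folder and file names are borrowed from the sample Desktop folder, but the setup itself is an assumption made for this example:

```python
import os
import pathlib
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    (root / "Notes").mkdir()
    (root / "todo.txt").touch()

    # os.listdir() returns a list of plain name strings.
    from_listdir = sorted(os.listdir(root))

    # os.scandir() yields os.DirEntry objects; use it as a context manager.
    with os.scandir(root) as entries:
        from_scandir = sorted(entry.name for entry in entries)

    # Path.iterdir() yields Path objects.
    from_iterdir = sorted(item.name for item in root.iterdir())

print(from_listdir)  # ['Notes', 'todo.txt']
print(from_listdir == from_scandir == from_iterdir)  # True
```

All three find the same items. They differ in what they hand back: strings, DirEntry objects, or Path objects.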

Getting a List of All Files and Folders in a Directory in Python

Before you start listing, you'll want a set of files that matches what you'll encounter in this tutorial. In the supplementary materials, you'll find a folder called Desktop. If you plan to follow along, download this folder, navigate to its parent folder, and start your Python REPL there.

You could also use your own desktop. Just start the Python REPL in the parent directory of your desktop, and the examples should work, but you'll see your own files in the output instead.

Note: You’ll mainly see WindowsPath objects as outputs in this tutorial. If you’re following along on Linux or macOS, then you’ll see PosixPath instead. That’ll be the only difference. The code you write is the same on all platforms.

If you only need to list the contents of a given directory, and you don’t need to get the contents of each subdirectory too, then you can use the Path object’s .iterdir() method. If your aim is to move through directories and subdirectories recursively, then you can jump ahead to the section on recursive listing.

The .iterdir() method, when called on a Path object, returns a generator that yields Path objects representing child items. If you wrap the generator in a list() constructor, then you can see your list of files and folders:

>>> import pathlib
>>> desktop = pathlib.Path("Desktop")

>>> # .iterdir() produces a generator
>>> desktop.iterdir()
<generator object Path.iterdir at 0x000001A8A5110740>

>>> # Which you can wrap in a list() constructor to materialize
>>> list(desktop.iterdir())
[WindowsPath('Desktop/Notes'), WindowsPath('Desktop/realpython'), WindowsPath('Desktop/scripts'), WindowsPath('Desktop/todo.txt')]

Passing the generator produced by .iterdir() to the list() constructor provides you with a list of Path objects representing all the items in the Desktop directory.
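One detail worth knowing is that .iterdir() yields children in arbitrary order, so the order of the resulting list isn't guaranteed. If you need a predictable order, you can pass the generator to sorted() instead of list(). The sketch below wraps that in a hypothetical helper called list_directory() and runs it against a temporary directory that mimics the sample Desktop folder:

```python
import pathlib
import tempfile


def list_directory(directory):
    """Return the immediate children of directory as a sorted list of Path objects.

    Hypothetical helper for illustration; it isn't part of pathlib.
    """
    return sorted(pathlib.Path(directory).iterdir())


with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    for folder in ("Notes", "realpython", "scripts"):
        (root / folder).mkdir()
    (root / "todo.txt").touch()

    names = [item.name for item in list_directory(root)]

print(names)  # ['Notes', 'realpython', 'scripts', 'todo.txt']
```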

As with all generators, you can also use a for loop to iterate over each item that the generator yields. This gives you the chance to explore some of the properties of each object:

>>> desktop = pathlib.Path("Desktop")
>>> for item in desktop.iterdir():
...     print(f"{item} - {'dir' if item.is_dir() else 'file'}")
...
Desktop\Notes - dir
Desktop\realpython - dir
Desktop\scripts - dir
Desktop\todo.txt - file
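The same .is_dir() check can also sort items into separate lists rather than just labeling them. Here is a minimal sketch, assuming a hypothetical helper named split_children() and a temporary directory standing in for Desktop:

```python
import pathlib
import tempfile


def split_children(directory):
    """Return (folders, files) as sorted lists of names found directly in directory.

    Hypothetical helper for illustration; it isn't part of pathlib.
    """
    folders, files = [], []
    for item in pathlib.Path(directory).iterdir():
        # Anything that isn't a directory is treated as a file here.
        (folders if item.is_dir() else files).append(item.name)
    return sorted(folders), sorted(files)


with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    (root / "Notes").mkdir()
    (root / "scripts").mkdir()
    (root / "todo.txt").touch()

    folders, files = split_children(root)

print(folders)  # ['Notes', 'scripts']
print(files)    # ['todo.txt']
```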

Read the full article at https://realpython.com/get-all-files-in-directory-python/

