Hi there - today we're launching a new course, Merging DataFrames with pandas, by Dhavide Aruliah.
As a Data Scientist, you'll often find that the data you need is not in a single file. It may be spread across a number of text files, spreadsheets, or databases. You want to be able to import the data of interest as a collection of DataFrames and figure out how to combine them to answer your central questions. This course is all about the act of combining, or merging, DataFrames, an essential part of any working Data Scientist's toolbox. You'll hone your pandas skills by learning how to organize, reshape, and aggregate multiple data sets to answer your specific questions.
Merging DataFrames with pandas features interactive exercises that combine high-quality video, in-browser coding, and gamification for an engaging learning experience that will make you a master at Data Science with Python!
What you'll learn:
In chapter 1, you'll learn about different techniques you can use to import multiple files into DataFrames. Having imported your data into individual DataFrames, you'll then learn how to share information between DataFrames using their Indexes. Understanding how Indexes work is essential for merging DataFrames later in the course. Start the first chapter for free!
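To give a flavor of what that looks like, here is a minimal sketch rather than material from the course itself. The two in-memory "files" and their column names (`day`, `sales`) are made up for illustration; in practice you would pass real file paths to `pd.read_csv`.

```python
import io

import pandas as pd

# Hypothetical stand-ins for two CSV files on disk (illustrative data only).
csv_sources = {
    "jan": "day,sales\nMon,10\nTue,12\n",
    "feb": "day,sales\nMon,7\nWed,9\n",
}

# Import each "file" into its own DataFrame, using the day column as the Index.
frames = {
    name: pd.read_csv(io.StringIO(text), index_col="day")
    for name, text in csv_sources.items()
}

# Arithmetic between DataFrames or Series aligns rows by Index label,
# not by position. fill_value=0 treats a day missing from one month as zero.
total = frames["jan"]["sales"].add(frames["feb"]["sales"], fill_value=0)

# reindex reshapes one DataFrame to follow another's Index;
# labels that are absent in the source (here "Tue") become NaN.
feb_on_jan_index = frames["feb"].reindex(frames["jan"].index)

print(total)
print(feb_on_jan_index)
```

The point to notice is that pandas matches rows by Index label, which is why getting the Index right at import time makes later combining much easier.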
Having learned how to import multiple DataFrames and share information using Indexes, in chapter 2 you'll learn how to perform database-style operations to combine DataFrames. In particular, you'll learn about appending and concatenating DataFrames while working with a variety of real-world datasets.
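As a rough illustration of concatenation (a sketch with made-up data, not an excerpt from the course), `pd.concat` can stack DataFrames vertically or place them side by side. Note that in pandas 2.0 and later the `DataFrame.append` method has been removed, so this sketch uses `pd.concat` for appending rows as well.

```python
import pandas as pd

# Hypothetical quarterly tables with the same columns (illustrative data only).
q1 = pd.DataFrame({"city": ["Austin", "Boston"], "units": [3, 5]})
q2 = pd.DataFrame({"city": ["Chicago"], "units": [4]})

# Stack rows; ignore_index=True renumbers the result 0..n-1.
stacked = pd.concat([q1, q2], ignore_index=True)

# keys= adds an outer Index level recording which table each row came from.
labeled = pd.concat([q1, q2], keys=["q1", "q2"])

# axis=1 concatenates column-wise, aligning rows on the Index.
prices = pd.DataFrame({"price": [2.0, 2.5]})
side_by_side = pd.concat([q1, prices], axis=1)

print(stacked)
print(labeled)
print(side_by_side)
```

Using `keys` is handy when you later need to trace a row back to its source table.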
In chapter 3, you'll learn all about merging pandas DataFrames. You'll explore different techniques for merging, and learn about left joins, right joins, inner joins, and outer joins, as well as when to use which. You'll also learn about ordered merging, which is useful when you want to merge DataFrames whose columns have natural orderings, like date-time columns.
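The difference between the join types is easiest to see on a tiny example. The sketch below uses made-up tables (`customers`, `orders`, `prices`, `volumes`) and is only meant to illustrate `pd.merge` and `pd.merge_ordered`, not to reproduce the course exercises.

```python
import pandas as pd

# Hypothetical tables sharing a cust_id key (illustrative data only).
customers = pd.DataFrame({"cust_id": [1, 2, 3], "name": ["Ana", "Ben", "Cy"]})
orders = pd.DataFrame({"cust_id": [2, 3, 4], "amount": [20, 30, 40]})

# The how= argument decides which keys survive the merge:
#   inner -> keys present in both tables
#   left  -> every key from customers
#   right -> every key from orders
#   outer -> every key from either table
row_counts = {
    how: len(pd.merge(customers, orders, on="cust_id", how=how))
    for how in ["inner", "left", "right", "outer"]
}
print(row_counts)

# Ordered merge: combine two date-keyed tables and keep the result sorted
# by date. fill_method="ffill" carries the last known value forward into gaps.
prices = pd.DataFrame(
    {"date": pd.to_datetime(["2017-01-01", "2017-01-03"]), "price": [100, 103]}
)
volumes = pd.DataFrame(
    {"date": pd.to_datetime(["2017-01-02", "2017-01-03"]), "volume": [500, 450]}
)
ordered = pd.merge_ordered(prices, volumes, on="date", fill_method="ffill")
print(ordered)
```

In the ordered merge, the first `volume` value stays missing because there is no earlier observation to carry forward.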
The last chapter will focus on applying your skills with a case study on Summer Olympics medal data.
About Dhavide: Dhavide is Director of Training at Continuum Analytics, the creator and driving force behind Anaconda, the leading Open Data Science platform powered by Python. Dhavide was previously an Associate Professor at the University of Ontario Institute of Technology (UOIT), where he served as Program Director for various undergraduate and postgraduate programs. His research interests include computational inverse problems, numerical linear algebra, and high-performance computing. The materials for this course were produced by the Continuum training team.