Channel: Planet Python
Viewing all articles
Browse latest Browse all 22462

Python Data: pandas Cheat Sheet (via yhat)

The folks over at yhat just released a cheat sheet for pandas.  You can download the cheat sheet as a PDF here.

There are a couple of important functions that I use all the time that are missing from their cheat sheet (actually, there are a lot of things missing, but it's a great starter cheat sheet).

Below are a few things I use constantly with pandas dataframes that are worth collecting in one place.

Renaming columns in a pandas dataframe:

df.rename(columns={'col1': 'Column_1', 'col2': 'Column_2'}, inplace=True)
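To make the rename call above concrete, here is a minimal sketch on a small made-up dataframe (the sample data and the extra col3 column are assumptions for illustration). It uses the non-inplace form, which returns a new dataframe and leaves the original untouched.

```python
import pandas as pd

# Hypothetical sample data; col1/col2 mirror the names used in the snippet above.
df = pd.DataFrame({"col1": [1, 2], "col2": [3, 4], "col3": [5, 6]})

# Without inplace=True, rename() returns a renamed copy.
renamed = df.rename(columns={"col1": "Column_1", "col2": "Column_2"})

# Columns missing from the mapping (col3 here) keep their names.
print(list(renamed.columns))
# The original dataframe still has its old column names.
print(list(df.columns))
```

Whether to use inplace=True or reassign the result is mostly a matter of taste; reassigning makes it easier to chain further calls.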

Iterating over a pandas dataframe:

for index, row in df.iterrows():
    pass  # DO STUFF with index and row
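As a rough illustration of what "DO STUFF" might look like, the sketch below loops over a tiny made-up dataframe (the name and value columns are assumptions). Each row arrives as a pandas Series, so cells are read by column label. It also shows itertuples(), which is usually faster when you only need to read values.

```python
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({"name": ["a", "b"], "value": [10, 20]})

doubled = []
for index, row in df.iterrows():
    # row is a Series indexed by column name.
    doubled.append((index, row["name"], row["value"] * 2))

# itertuples() yields namedtuples; columns become attributes.
names = [record.name for record in df.itertuples()]

print(doubled)
print(names)
```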

Splitting pandas dataframe into chunks:

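The function below, together with the call that follows it, splits a pandas dataframe (or a list, for that matter) into NUM_CHUNKS chunks. Each chunk takes every NUM_CHUNKS-th row, so the chunks are interleaved rather than contiguous. I use this often when working with the multiprocessing library.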

# This function creates n interleaved chunks and returns them as a list
def chunkify(lst, n):
    # range replaces the Python 2 xrange
    return [lst[i::n] for i in range(n)]

chunks = chunkify(df, NUM_CHUNKS)
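To see how the chunks come out, here is a small sketch that runs the same slicing idea on a seven-row dataframe (the data and the chunk count of 3 are assumptions for illustration, and range is used so it runs on Python 3).

```python
import pandas as pd

def chunkify(lst, n):
    # Chunk i takes items i, i + n, i + 2n, ... so chunks are interleaved.
    return [lst[i::n] for i in range(n)]

# Hypothetical sample data: seven rows, split three ways.
df = pd.DataFrame({"x": range(7)})
chunks = chunkify(df, 3)

sizes = [len(chunk) for chunk in chunks]
first_chunk_rows = list(chunks[0].index)

# Concatenating and sorting by index restores the original row order.
rebuilt = pd.concat(chunks).sort_index()

print(sizes)
print(first_chunk_rows)
```

If the row order within each chunk matters, contiguous chunks may be a better fit than this striding approach.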

Accessing the value of a specific cell:

This gives you the value of the last row’s “COLUMN” cell.  It may not be the ‘best’ way to do it, but it gets the value.

df.COLUMN.tail(1).iloc[0]
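For comparison, the sketch below puts the tail-based lookup next to a few other accessors pandas provides for single cells. The sample data and row labels are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data with string row labels.
df = pd.DataFrame({"COLUMN": [1.5, 2.5, 3.5]}, index=["a", "b", "c"])

# The approach from the snippet above.
last_via_tail = df.COLUMN.tail(1).iloc[0]

# Negative positions with iloc reach the last row directly.
last_via_iloc = df["COLUMN"].iloc[-1]

# .at looks a cell up by row label and column label.
by_label = df.at["b", "COLUMN"]

# .iat looks a cell up by integer row and column position.
by_position = df.iat[0, 0]

print(last_via_tail, last_via_iloc, by_label, by_position)
```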

Getting rows matching a condition:

The line below gets all rows in a pandas dataframe that match the criteria.  In addition to equality, you can use the other comparison operators (!=, <, >, <=, >=).

df[df.COLUMN == Criteria]
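Here is a minimal sketch of the filter above with a few different comparisons, using made-up integer data. The isin() call is an extra that is handy when matching against several values at once.

```python
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({"COLUMN": [1, 5, 10, 5]})

# Each comparison builds a boolean Series that selects the matching rows.
equal_rows = df[df.COLUMN == 5]
greater_rows = df[df.COLUMN > 4]
not_equal_rows = df[df.COLUMN != 5]

# isin() matches any value in the given list.
in_set_rows = df[df.COLUMN.isin([1, 10])]

print(equal_rows)
```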

Getting rows matching multiple conditions:

This gets the rows that match one criterion in COLUMN1 and also match another criterion in COLUMN2.

df[(df.COLUMN1 == Criteria) & (df.COLUMN2 == Criteria_2)]
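The sketch below runs the combined filter on a small made-up dataframe and adds the "or" (|) and "not" (~) variants for contrast. The parentheses around each comparison are required, because & and | bind more tightly than == in Python.

```python
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({
    "COLUMN1": ["x", "x", "y", "y"],
    "COLUMN2": [1, 2, 1, 2],
})

# & keeps rows where both conditions hold.
both = df[(df.COLUMN1 == "x") & (df.COLUMN2 == 2)]

# | keeps rows where at least one condition holds.
either = df[(df.COLUMN1 == "x") | (df.COLUMN2 == 2)]

# ~ inverts a condition.
negated = df[~(df.COLUMN1 == "x")]

print(both)
```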

The post pandas Cheat Sheet (via yhat) appeared first on Python Data.

