This post originally appeared on the DataCamp blog. Big thanks to Karlijn and all the fine folks at DataCamp for letting us share with the Yhat audience! And be sure to check out DataCamp's other cheat sheets, as well.
The Pandas library is one of the most popular tools data scientists use for data manipulation and analysis, next to matplotlib for data visualization and NumPy, the fundamental library for scientific computing in Python on which Pandas was built.
The fast, flexible, and expressive Pandas data structures are designed to make real-world data analysis significantly easier, but this might not be immediately the case for those who are just getting started with it, precisely because there is so much functionality built into the package that the options can be overwhelming.
That's where this cheat sheet might come in handy.
It's a quick guide through the basics of Pandas that you will need to get started on wrangling your data with Python.
As such, you can use it as a quick reference if you are just beginning your data science journey with Pandas or, if you haven't started yet, as a guide that makes Pandas easier to learn and use.
The cheat sheet will guide you through the basics of Pandas, going from the data structures to I/O, selection, dropping indices or columns, sorting and ranking, and retrieving basic information about the data structures you're working with, all the way to applying functions and data alignment.
In short, everything that you need to kickstart your data science learning with Python!
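To give a flavor of those topics, here is a minimal sketch that touches each one in turn. It is not taken from the cheat sheet itself: the sample data and the variable names (`s`, `df`, `city`, `visits`, and so on) are made up for illustration, and it assumes a reasonably recent version of Pandas.

```python
import io

import pandas as pd

# Data structures: a one-dimensional Series and a two-dimensional DataFrame.
# All values below are made-up sample data, not from the cheat sheet.
s = pd.Series([3, -5, 7, 4], index=["a", "b", "c", "d"])
df = pd.DataFrame({"city": ["A", "B", "C"], "visits": [30, 10, 20]})

# I/O: write to CSV and read it back (an in-memory buffer stands in for a file).
buffer = io.StringIO()
df.to_csv(buffer, index=False)
buffer.seek(0)
round_trip = pd.read_csv(buffer)

# Selection: by label, and with a boolean mask.
first_city = df.loc[0, "city"]
busy = df[df["visits"] > 15]

# Dropping: index labels from a Series, a column from a DataFrame.
s_dropped = s.drop(["a", "c"])
no_city = df.drop(columns="city")

# Sorting and ranking.
by_visits = df.sort_values(by="visits")
ranks = df["visits"].rank()

# Retrieving basic information.
shape = df.shape
column_names = list(df.columns)

# Applying functions element by element.
doubled = s.apply(lambda x: x * 2)

# Data alignment: labels that do not match produce NaN unless a fill value is given.
s2 = pd.Series([7, -2, 3], index=["a", "c", "e"])
with_nan = s + s2
filled = s.add(s2, fill_value=0)
```

Printing any of these variables in an interactive session is a quick way to see what each operation does to the data.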
Speaking of wrangling data in Python, be sure to try out Yhat's own Python IDE, Rodeo.
Have you ever seen pandas at a rodeo before?!
Here's a quick video overview of the most recent release notes. We recently added a movable terminal, block-style execution, and .deb and .rpm support, and the feedback from y'all has been awesome.
You can download Rodeo for Windows, Mac or Linux here!