{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# polars: A fast, fancy pandas alternative\n",
"\n",
"Most data folks use `pandas`. However, there is an alternative that I just wanted to bring to your attention. [polars](https://www.pola.rs) is a faster and, perhaps, more modern way to handle data in Python. Still, `pandas` is ubiquitous, so I wanted to start with that. \n",
"\n",
"Here's the [user's guide](https://pola-rs.github.io/polars-book/user-guide/). I'm not going to go through every command here. As always, think about what you want to do. Sketch it out. Then, look for the syntax to do the job.\n",
"\n",
"Why not use `pandas`? Here's the [author of pandas](https://wesmckinney.com/blog/apache-arrow-pandas-internals/) on why `pandas` isn't always the best tool for data manipulation. We're getting more advanced here, worrying about speed, being closer to the \"metal\", etc. \n",
"\n",
"[Some people](https://www.emilyriederer.com/post/py-rgo/), especially those coming to Python from other languages, are suggesting that you just start with `polars` instead.\n",
"\n",
"[Coding for Economists] discusses alternatives to `pandas`, like `polars`.\n",
"\n",
"We can insall `polars` using `pip` the usual way. Don't forget to use `! pip` in Google Colab.\n",
"\n",
"```\n",
"pip install polars\n",
"```\n",
"\n",
"You'll see my basic import statement below."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"polars.dataframe.frame.DataFrame"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Set-up\n",
"\n",
"import polars as pl\n",
"import numpy as np\n",
"import pandas as pd\n",
"\n",
"df = pl.read_csv('https://raw.githubusercontent.com/aaiken1/fin-data-analysis-python/main/data/ncbreweries.csv')\n",
"\n",
"type(df)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"See what I did there? That was `pl.read_csv` from `polars`. I've created a `polars` DataFrame.\n",
"\n",
"Now, you can read the manual to find out all of things that you can do!"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"
shape: (7, 8)describe | Name | City | Type | Beer Count | Est | Status | URL |
---|
str | str | str | str | f64 | f64 | str | str |
"count" | "251" | "251" | "251" | 251.0 | 251.0 | "251" | "251" |
"null_count" | "0" | "0" | "0" | 0.0 | 0.0 | "0" | "0" |
"mean" | null | null | null | 32.960159 | 2012.155378 | null | null |
"std" | null | null | null | 43.723385 | 8.749158 | null | null |
"min" | "217 Brew Works... | "Aberdeen" | "Brewpub" | 1.0 | 1900.0 | "Active" | "https://www.ra... |
"max" | "Zebulon Artisa... | "Winston-Salem" | "Microbrewery" | 424.0 | 2018.0 | "Closed" | "https://www.ra... |
"median" | null | null | null | 18.0 | 2014.0 | null | null |
"
],
"text/plain": [
"shape: (7, 8)\n",
"┌─────────────┬────────────┬────────────┬────────────┬──────────┬────────────┬────────┬────────────┐\n",
"│ describe ┆ Name ┆ City ┆ Type ┆ Beer ┆ Est ┆ Status ┆ URL │\n",
"│ --- ┆ --- ┆ --- ┆ --- ┆ Count ┆ --- ┆ --- ┆ --- │\n",
"│ str ┆ str ┆ str ┆ str ┆ --- ┆ f64 ┆ str ┆ str │\n",
"│ ┆ ┆ ┆ ┆ f64 ┆ ┆ ┆ │\n",
"╞═════════════╪════════════╪════════════╪════════════╪══════════╪════════════╪════════╪════════════╡\n",
"│ count ┆ 251 ┆ 251 ┆ 251 ┆ 251.0 ┆ 251.0 ┆ 251 ┆ 251 │\n",
"│ null_count ┆ 0 ┆ 0 ┆ 0 ┆ 0.0 ┆ 0.0 ┆ 0 ┆ 0 │\n",
"│ mean ┆ null ┆ null ┆ null ┆ 32.96015 ┆ 2012.15537 ┆ null ┆ null │\n",
"│ ┆ ┆ ┆ ┆ 9 ┆ 8 ┆ ┆ │\n",
"│ std ┆ null ┆ null ┆ null ┆ 43.72338 ┆ 8.749158 ┆ null ┆ null │\n",
"│ ┆ ┆ ┆ ┆ 5 ┆ ┆ ┆ │\n",
"│ min ┆ 217 Brew ┆ Aberdeen ┆ Brewpub ┆ 1.0 ┆ 1900.0 ┆ Active ┆ https://ww │\n",
"│ ┆ Works ┆ ┆ ┆ ┆ ┆ ┆ w.ratebeer │\n",
"│ ┆ ┆ ┆ ┆ ┆ ┆ ┆ .com//brew │\n",
"│ ┆ ┆ ┆ ┆ ┆ ┆ ┆ er… │\n",
"│ max ┆ Zebulon ┆ Winston-Sa ┆ Microbrewe ┆ 424.0 ┆ 2018.0 ┆ Closed ┆ https://ww │\n",
"│ ┆ Artisan ┆ lem ┆ ry ┆ ┆ ┆ ┆ w.ratebeer │\n",
"│ ┆ Ales ┆ ┆ ┆ ┆ ┆ ┆ .com//brew │\n",
"│ ┆ ┆ ┆ ┆ ┆ ┆ ┆ er… │\n",
"│ median ┆ null ┆ null ┆ null ┆ 18.0 ┆ 2014.0 ┆ null ┆ null │\n",
"└─────────────┴────────────┴────────────┴────────────┴──────────┴────────────┴────────┴────────────┘"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.describe()"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Looks a little different. I like it.\n",
"\n",
"You can select certain columns, as well. You can filter, do \"group bys\". All the usual things."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
shape: (251, 2)Name | City |
---|
str | str |
"217 Brew Works... | "Wilson" |
"3rd Rock Brewi... | "Trenton" |
"7 Clans Brewin... | "Cherokee" |
"Andrews Brewin... | "Andrews" |
"Angry Troll Br... | "Elkin" |
"Appalachian Mo... | "Boone" |
"Archetype Brew... | "Asheville" |
"Asheville Brew... | "Asheville" |
"Ass Clown Brew... | "Cornelius" |
"Aviator Brewin... | "Fuquay Varina" |
"Barking Duck B... | "Mint Hill" |
"Barrel Culture... | "Durham" |
… | … |
"Greenshields B... | "Raleigh" |
"Hams Restauran... | "Greenville" |
"Heinzelmannche... | "Sylva" |
"High Tide Brew... | "Jacksonville" |
"Hosanna Brewin... | "Fuqauy Varina" |
"Jack of the Wo... | "Asheville" |
"Loe's Brewing ... | "Hickory" |
"Sweet Taters" | "Rocky Mount" |
"Triangle Brewi... | "Durham" |
"White Rabbit B... | "Angier" |
"Williamsville ... | "Farmville" |
"Wolf Beer Comp... | "Wilmington" |
"
],
"text/plain": [
"shape: (251, 2)\n",
"┌───────────────────────────────────┬────────────┐\n",
"│ Name ┆ City │\n",
"│ --- ┆ --- │\n",
"│ str ┆ str │\n",
"╞═══════════════════════════════════╪════════════╡\n",
"│ 217 Brew Works ┆ Wilson │\n",
"│ 3rd Rock Brewing Company ┆ Trenton │\n",
"│ 7 Clans Brewing ┆ Cherokee │\n",
"│ Andrews Brewing Company ┆ Andrews │\n",
"│ … ┆ … │\n",
"│ Triangle Brewing Company ┆ Durham │\n",
"│ White Rabbit Brewing (NC) ┆ Angier │\n",
"│ Williamsville Brewery (formerly … ┆ Farmville │\n",
"│ Wolf Beer Company ┆ Wilmington │\n",
"└───────────────────────────────────┴────────────┘"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.select(\n",
" pl.col(['Name', 'City'])\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
shape: (177, 7)Name | City | Type | Beer Count | Est | Status | URL |
---|
str | str | str | i64 | i64 | str | str |
"217 Brew Works... | "Wilson" | "Microbrewery" | 10 | 2017 | "Active" | "https://www.ra... |
"3rd Rock Brewi... | "Trenton" | "Microbrewery" | 12 | 2016 | "Active" | "https://www.ra... |
"Andrews Brewin... | "Andrews" | "Microbrewery" | 18 | 2014 | "Active" | "https://www.ra... |
"Appalachian Mo... | "Boone" | "Microbrewery" | 78 | 2013 | "Active" | "https://www.ra... |
"Archetype Brew... | "Asheville" | "Microbrewery" | 15 | 2017 | "Active" | "https://www.ra... |
"Asheville Brew... | "Asheville" | "Brewpub" | 87 | 2003 | "Active" | "https://www.ra... |
"Aviator Brewin... | "Fuquay Varina" | "Microbrewery" | 59 | 2008 | "Active" | "https://www.ra... |
"Barking Duck B... | "Mint Hill" | "Microbrewery" | 16 | 2014 | "Active" | "https://www.ra... |
"Barrel Culture... | "Durham" | "Microbrewery" | 29 | 2017 | "Active" | "https://www.ra... |
"Bayne Brewing ... | "Cornelius" | "Microbrewery" | 16 | 2014 | "Active" | "https://www.ra... |
"BearWaters Bre... | "Canton" | "Microbrewery" | 39 | 2012 | "Active" | "https://www.ra... |
"Beer Army Comb... | "Trenton" | "Microbrewery" | 11 | 2012 | "Active" | "https://www.ra... |
… | … | … | … | … | … | … |
"Chesapeake Bay... | "Raleigh" | "Microbrewery" | 14 | 1999 | "Closed" | "https://www.ra... |
"Craggie Brewin... | "Asheville" | "Microbrewery" | 30 | 2009 | "Closed" | "https://www.ra... |
"Draft Line Bre... | "Fuquay-Varina" | "Microbrewery" | 19 | 2014 | "Closed" | "https://www.ra... |
"Four Friends B... | "Charlotte" | "Microbrewery" | 11 | 2009 | "Closed" | "https://www.ra... |
"G2B Gastropub ... | "Durham" | "Brewpub/Brewer... | 18 | 2015 | "Closed" | "https://www.ra... |
"Greenshields B... | "Raleigh" | "Microbrewery" | 15 | 1999 | "Closed" | "https://www.ra... |
"Hams Restauran... | "Greenville" | "Brewpub" | 26 | 2003 | "Closed" | "https://www.ra... |
"Heinzelmannche... | "Sylva" | "Microbrewery" | 18 | 2005 | "Closed" | "https://www.ra... |
"Hosanna Brewin... | "Fuqauy Varina" | "Brewpub" | 12 | 2013 | "Closed" | "https://www.ra... |
"Jack of the Wo... | "Asheville" | "Brewpub" | 13 | 2004 | "Closed" | "https://www.ra... |
"Triangle Brewi... | "Durham" | "Microbrewery" | 21 | 2007 | "Closed" | "https://www.ra... |
"White Rabbit B... | "Angier" | "Microbrewery" | 19 | 2013 | "Closed" | "https://www.ra... |
"
],
"text/plain": [
"shape: (177, 7)\n",
"┌───────────────────┬───────────────┬──────────────┬────────────┬──────┬────────┬──────────────────┐\n",
"│ Name ┆ City ┆ Type ┆ Beer Count ┆ Est ┆ Status ┆ URL │\n",
"│ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- │\n",
"│ str ┆ str ┆ str ┆ i64 ┆ i64 ┆ str ┆ str │\n",
"╞═══════════════════╪═══════════════╪══════════════╪════════════╪══════╪════════╪══════════════════╡\n",
"│ 217 Brew Works ┆ Wilson ┆ Microbrewery ┆ 10 ┆ 2017 ┆ Active ┆ https://www.rate │\n",
"│ ┆ ┆ ┆ ┆ ┆ ┆ beer.com//brewer │\n",
"│ ┆ ┆ ┆ ┆ ┆ ┆ … │\n",
"│ 3rd Rock Brewing ┆ Trenton ┆ Microbrewery ┆ 12 ┆ 2016 ┆ Active ┆ https://www.rate │\n",
"│ Company ┆ ┆ ┆ ┆ ┆ ┆ beer.com//brewer │\n",
"│ ┆ ┆ ┆ ┆ ┆ ┆ … │\n",
"│ Andrews Brewing ┆ Andrews ┆ Microbrewery ┆ 18 ┆ 2014 ┆ Active ┆ https://www.rate │\n",
"│ Company ┆ ┆ ┆ ┆ ┆ ┆ beer.com//brewer │\n",
"│ ┆ ┆ ┆ ┆ ┆ ┆ … │\n",
"│ Appalachian ┆ Boone ┆ Microbrewery ┆ 78 ┆ 2013 ┆ Active ┆ https://www.rate │\n",
"│ Mountain Brewery ┆ ┆ ┆ ┆ ┆ ┆ beer.com//brewer │\n",
"│ ┆ ┆ ┆ ┆ ┆ ┆ … │\n",
"│ … ┆ … ┆ … ┆ … ┆ … ┆ … ┆ … │\n",
"│ Hosanna Brewing ┆ Fuqauy Varina ┆ Brewpub ┆ 12 ┆ 2013 ┆ Closed ┆ https://www.rate │\n",
"│ Company ┆ ┆ ┆ ┆ ┆ ┆ beer.com//brewer │\n",
"│ ┆ ┆ ┆ ┆ ┆ ┆ … │\n",
"│ Jack of the Wood ┆ Asheville ┆ Brewpub ┆ 13 ┆ 2004 ┆ Closed ┆ https://www.rate │\n",
"│ Brewpub ┆ ┆ ┆ ┆ ┆ ┆ beer.com//brewer │\n",
"│ ┆ ┆ ┆ ┆ ┆ ┆ … │\n",
"│ Triangle Brewing ┆ Durham ┆ Microbrewery ┆ 21 ┆ 2007 ┆ Closed ┆ https://www.rate │\n",
"│ Company ┆ ┆ ┆ ┆ ┆ ┆ beer.com//brewer │\n",
"│ ┆ ┆ ┆ ┆ ┆ ┆ … │\n",
"│ White Rabbit ┆ Angier ┆ Microbrewery ┆ 19 ┆ 2013 ┆ Closed ┆ https://www.rate │\n",
"│ Brewing (NC) ┆ ┆ ┆ ┆ ┆ ┆ beer.com//brewer │\n",
"│ ┆ ┆ ┆ ┆ ┆ ┆ … │\n",
"└───────────────────┴───────────────┴──────────────┴────────────┴──────┴────────┴──────────────────┘"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.filter(\n",
" pl.col(\"Beer Count\").is_between(10, 100))"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
shape: (54, 7)Name | City | Type | Beer Count | Est | Status | URL |
---|
str | str | str | i64 | i64 | str | str |
"217 Brew Works... | "Wilson" | "Microbrewery" | 10 | 2017 | "Active" | "https://www.ra... |
"7 Clans Brewin... | "Cherokee" | "Client Brewer" | 1 | 2018 | "Active" | "https://www.ra... |
"Angry Troll Br... | "Elkin" | "Microbrewery" | 8 | 2017 | "Active" | "https://www.ra... |
"Bear Creek Bre... | "Bear Creek" | "Microbrewery" | 6 | 2012 | "Active" | "https://www.ra... |
"Beech Mountain... | "Beech Mountain... | "Microbrewery" | 7 | 2014 | "Active" | "https://www.ra... |
"Bill's Front P... | "Wilmington" | "Brewpub/Brewer... | 10 | 2016 | "Active" | "https://www.ra... |
"Biltmore Brewi... | "Asheville" | "Client Brewer" | 4 | 2010 | "Active" | "https://www.ra... |
"BottleTree Bee... | "Tryon" | "Client Brewer" | 2 | 2010 | "Active" | "https://www.ra... |
"Bright Light B... | "Fayetteville" | "Microbrewery" | 5 | 2018 | "Active" | "https://www.ra... |
"Broomtail Craf... | "Wilmington" | "Microbrewery" | 10 | 2014 | "Active" | "https://www.ra... |
"Bull City Cide... | "Durham" | "Commercial Bre... | 9 | 2014 | "Active" | "https://www.ra... |
"Bull Durham Be... | "Durham" | "Microbrewery" | 7 | 2015 | "Active" | "https://www.ra... |
… | … | … | … | … | … | … |
"Slammin' Sam B... | "Pinehurst" | "Client Brewer" | 1 | 2012 | "Active" | "https://www.ra... |
"Southern Range... | "Monroe" | "Microbrewery" | 6 | 2016 | "Active" | "https://www.ra... |
"Tarboro Brewin... | "Tarboro" | "Microbrewery" | 8 | 2016 | "Active" | "https://www.ra... |
"Tek Mountain B... | "Wilmington" | "Microbrewery" | 7 | 2016 | "Active" | "https://www.ra... |
"The Mason Jar ... | "Fuquay Varina" | "Microbrewery" | 5 | 2017 | "Active" | "https://www.ra... |
"Thristy Souls ... | "Mount Airy" | "Brewpub/Brewer... | 10 | 2018 | "Active" | "https://www.ra... |
"Tobacco Road S... | "Raleigh" | "Brewpub" | 7 | 2017 | "Active" | "https://www.ra... |
"Valley River B... | "Murphy" | "Brewpub" | 8 | 2017 | "Active" | "https://www.ra... |
"Vicious Fishes... | "Angier" | "Microbrewery" | 1 | 2017 | "Active" | "https://www.ra... |
"Waterline Brew... | "Wilmington" | "Microbrewery" | 6 | 2015 | "Active" | "https://www.ra... |
"Winding Creek ... | "Columbus" | "Microbrewery" | 9 | 2017 | "Active" | "https://www.ra... |
"York Chester B... | "Belmont" | "Microbrewery" | 8 | 2016 | "Active" | "https://www.ra... |
"
],
"text/plain": [
"shape: (54, 7)\n",
"┌────────────────────┬────────────┬───────────────┬────────────┬──────┬────────┬───────────────────┐\n",
"│ Name ┆ City ┆ Type ┆ Beer Count ┆ Est ┆ Status ┆ URL │\n",
"│ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- │\n",
"│ str ┆ str ┆ str ┆ i64 ┆ i64 ┆ str ┆ str │\n",
"╞════════════════════╪════════════╪═══════════════╪════════════╪══════╪════════╪═══════════════════╡\n",
"│ 217 Brew Works ┆ Wilson ┆ Microbrewery ┆ 10 ┆ 2017 ┆ Active ┆ https://www.rateb │\n",
"│ ┆ ┆ ┆ ┆ ┆ ┆ eer.com//brewer… │\n",
"│ 7 Clans Brewing ┆ Cherokee ┆ Client Brewer ┆ 1 ┆ 2018 ┆ Active ┆ https://www.rateb │\n",
"│ ┆ ┆ ┆ ┆ ┆ ┆ eer.com//brewer… │\n",
"│ Angry Troll ┆ Elkin ┆ Microbrewery ┆ 8 ┆ 2017 ┆ Active ┆ https://www.rateb │\n",
"│ Brewing ┆ ┆ ┆ ┆ ┆ ┆ eer.com//brewer… │\n",
"│ Bear Creek Brews ┆ Bear Creek ┆ Microbrewery ┆ 6 ┆ 2012 ┆ Active ┆ https://www.rateb │\n",
"│ ┆ ┆ ┆ ┆ ┆ ┆ eer.com//brewer… │\n",
"│ … ┆ … ┆ … ┆ … ┆ … ┆ … ┆ … │\n",
"│ Vicious Fishes ┆ Angier ┆ Microbrewery ┆ 1 ┆ 2017 ┆ Active ┆ https://www.rateb │\n",
"│ Brewery ┆ ┆ ┆ ┆ ┆ ┆ eer.com//brewer… │\n",
"│ Waterline Brewing ┆ Wilmington ┆ Microbrewery ┆ 6 ┆ 2015 ┆ Active ┆ https://www.rateb │\n",
"│ Company ┆ ┆ ┆ ┆ ┆ ┆ eer.com//brewer… │\n",
"│ Winding Creek ┆ Columbus ┆ Microbrewery ┆ 9 ┆ 2017 ┆ Active ┆ https://www.rateb │\n",
"│ Brewing Company ┆ ┆ ┆ ┆ ┆ ┆ eer.com//brewer… │\n",
"│ York Chester ┆ Belmont ┆ Microbrewery ┆ 8 ┆ 2016 ┆ Active ┆ https://www.rateb │\n",
"│ Brewing Company ┆ ┆ ┆ ┆ ┆ ┆ eer.com//brewer… │\n",
"└────────────────────┴────────────┴───────────────┴────────────┴──────┴────────┴───────────────────┘"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.filter(\n",
" (pl.col('Beer Count') <= 10) & (pl.col('Status') != \"Closed\")\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
shape: (5, 2)Type | count |
---|
str | u32 |
"Microbrewery" | 165 |
"Brewpub/Brewer... | 41 |
"Brewpub" | 33 |
"Client Brewer" | 9 |
"Commercial Bre... | 3 |
"
],
"text/plain": [
"shape: (5, 2)\n",
"┌────────────────────┬───────┐\n",
"│ Type ┆ count │\n",
"│ --- ┆ --- │\n",
"│ str ┆ u32 │\n",
"╞════════════════════╪═══════╡\n",
"│ Microbrewery ┆ 165 │\n",
"│ Brewpub/Brewery ┆ 41 │\n",
"│ Brewpub ┆ 33 │\n",
"│ Client Brewer ┆ 9 │\n",
"│ Commercial Brewery ┆ 3 │\n",
"└────────────────────┴───────┘"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.groupby(\"Type\").count().sort(by=\"count\", descending=True)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Not bad!"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "base",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.13"
},
"orig_nbformat": 4,
"vscode": {
"interpreter": {
"hash": "40d3a090f54c6569ab1632332b64b2c03c39dcf918b08424e98f38b5ae0af88f"
}
}
},
"nbformat": 4,
"nbformat_minor": 2
}