2. Python set-up
We are going to start by getting you set up in GitHub and Codespaces. I’ll go through how our assignments are going to work in GitHub Classroom. These tools let you do all of your coding in your browser using Jupyter notebooks, in an environment that acts like the common VS Code desktop setup, including version control with git and GitHub.
The Writing Code section from Coding for Economists is a great place to start. Its Code Preliminaries section also introduces some important ideas, like IDEs, packages, and VS Code.
Then, I’ll discuss how to do a local install of our tools. I’ll get you set up with the Anaconda distribution of Python, Jupyter notebooks, and VS Code. We can also use Jupyter notebooks from inside VS Code. I would encourage you to get very comfortable with the Jupyter notebook format and VS Code. We’ll go over everything in class as well.
I will also show you other ways to use Jupyter notebooks, including Google Colab.
In the end, we’ll be using Jupyter notebooks. Whether you use Jupyter in Codespaces, Google Colab, or VS Code is up to you. I’ll tend to use VS Code in class, whether that is my local install or GitHub Codespaces. Both are full-featured development environments. Jupyter notebooks in Anaconda and Google Colab are more limited.
For the local install, you can start with this video of me discussing how to download Anaconda and VS Code.
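Once the local install is done, a quick sanity check can confirm that the notebook is running the Python you expect and that the main packages are available. The sketch below is only an illustration: the package list is my assumption, not an official course requirement, and `check_environment` is a hypothetical helper name. It only asks whether each package can be found and does not import it.

```python
import importlib.util
import sys

# Packages to look for. This list is an assumption; edit it to match what you need.
PACKAGES = ["numpy", "pandas", "matplotlib", "seaborn", "polars"]


def check_environment(packages):
    """Return a dict mapping each top-level package name to True if it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in packages}


# Show which Python interpreter this notebook or script is actually using.
print(f"Python {sys.version_info.major}.{sys.version_info.minor} at {sys.executable}")

# Report each package as installed or missing.
for name, found in check_environment(PACKAGES).items():
    print(f"{name:<12} {'installed' if found else 'missing'}")
```

If a package shows up as missing, the interpreter path on the first line is worth checking first, since VS Code may be pointing at a different Python than the one Anaconda installed.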
You can read more about Python and its history in Chapter 1 of Python for Data Analysis, 3E. Chapter 2 discusses Python basics, like Jupyter notebooks, in detail.
Getting started in Python isn’t easy, even if you’re coming from another programming language, like Java, or a statistical language, like R. Data scientist (and good Twitter follow) Vicki Boykis has written about why getting started in Python can be hard.
This is a nice article on a modern Python data “stack”, especially if you’re coming from R. Note how the author suggests using seaborn and polars, two newer packages, in place of matplotlib and pandas.
Don’t get discouraged!
2.1. Other tools
This course is focused on using Python (and the VS Code IDE, Markdown, etc.) to solve financial problems. However, if you’re interested in building a full set of data/analytics tools, you need to know more.
This article, also by Vicki Boykis, outlines the need to learn three additional tools: Git, SQL, and the command line. I use all three in my day-to-day work, though I’m far from an expert in any of them. If you are thinking about working with data as a career, you should know these three tools.
Our notes touch on SQL, which is the standard database query language.
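To give a flavor of what a SQL query looks like, here is a minimal sketch that uses Python’s built-in sqlite3 module with an in-memory database. The table name, tickers, and prices are all made up for illustration and are not taken from our notes.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (ticker TEXT, date TEXT, close REAL)")

# Made-up tickers and closing prices, purely for illustration.
rows = [
    ("AAA", "2024-01-02", 100.0),
    ("AAA", "2024-01-03", 102.0),
    ("BBB", "2024-01-02", 50.0),
    ("BBB", "2024-01-03", 49.0),
]
conn.executemany("INSERT INTO prices VALUES (?, ?, ?)", rows)

# Average closing price per ticker, sorted alphabetically by ticker.
query = """
    SELECT ticker, AVG(close) AS avg_close
    FROM prices
    GROUP BY ticker
    ORDER BY ticker
"""
result = conn.execute(query).fetchall()
print(result)

conn.close()
```

The query itself is plain SQL, so the same SELECT and GROUP BY pattern would carry over to most other databases.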