Welcome

Spring 2024 with Prof. Adam Aiken

Note

This is a living textbook. I will be updating the notes and making improvements as we go.

Welcome! This online text contains my notes and code examples. There are essentially three parts to this course. First, we’ll get set up and become comfortable with our coding environment.

We also have a course GitHub repository, where I will keep data and code (e.g. Jupyter notebooks) that we use in class.

We also have a course YouTube page, where I will post the occasional video, as needed. Most of our work will of course be done in class together.

Our labs and exams are posted using GitHub Classroom. You’ll find links to these on Moodle.

The most important part is to make sure that you are up and running with GitHub, Jupyter notebooks, and GitHub Codespaces/VS Code. We’ll discuss how to get started in Chapter 2. Your coding environment can be entirely browser-based, if you prefer.
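Once your environment is running, a quick sanity check can confirm that Python and a couple of common packages are available. The snippet below is only a sketch: the function name `check_environment` is made up for illustration, and the default package list (NumPy and pandas) is an assumption about what the course will use, not an official requirements list.

```python
import importlib.util
import sys


def check_environment(packages=("numpy", "pandas")):
    """Return the Python version and whether each package can be found.

    The default package names are an assumption for illustration;
    swap in whatever your course materials actually require.
    """
    # sys.version looks like "3.11.4 (main, ...)"; keep only the version number.
    report = {"python": sys.version.split()[0]}
    for name in packages:
        # find_spec returns None when a top-level package is not installed.
        report[name] = importlib.util.find_spec(name) is not None
    return report


print(check_environment())
```

A value of `False` next to a package name simply means it is not installed in the environment you are currently running.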

Hint

Bookmark important links so that you don’t have to go searching for them.

We also have a textbook on machine learning in finance. If you’ve taken a derivatives class, you might have seen the author’s other textbook - every derivatives trader in the world has read Hull. We’ll get to these topics more in the second half of the course.

Getting help

Note

No programming background is required for this class. For both students and instructor.

Learning to code means learning how to get help. No one has all of this stuff in their head at all times. That means using our textbook, the online books above, cheatsheets, and other resources. The links posted above will help a lot.

You can find help on Stack Overflow, though “copy and paste” isn’t really the best way to learn to code. Go slowly, line by line, and try to think like a computer. Computers do exactly what you tell them to do. No more, no less.
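Here is a small illustration of what “exactly what you tell them” can look like. The prices are made up for the example and the variable names are hypothetical. The goal is the simple return from the first price to the last, and the only difference between the two lines is a pair of parentheses.

```python
# Hypothetical prices, invented purely for illustration.
prices = [100.0, 102.0, 101.0]

first = prices[0]
last = prices[-1]

# What we meant: (last - first) / first.
# What we typed: division happens before subtraction,
# so Python computes last - (first / first).
wrong = last - first / first

# Parentheses tell Python the order we actually want.
right = (last - first) / first

print("Without parentheses:", wrong)
print("With parentheses:", right)
```

Python raises no error in either case. It just follows the order of operations, which is why reading your own code slowly, one line at a time, pays off.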

Even better, you can use tools such as ChatGPT and GitHub Copilot to help you code. We’ll look at some examples as we go, but you should have one of these tools open whenever you are coding. The basic version of ChatGPT is free. GitHub Copilot is free for verified students, so sign up! You’ll be able to link it to your Codespaces and VS Code installs, as I’ll show you.

Some more suggestions for using ChatGPT in data projects:

Sources

I have pulled material from many different sources in order to create these notes and I am very grateful that they have made their work available. For example, I include commentary on examples from our book. There are also many other excellent and, often, free guides to using Python, Jupyter notebooks, and VS Code.

The Python Tutorial. This is the main tutorial from the folks who develop Python. I’ll refer to it throughout the text if you want a more in-depth look at something.

Python for Finance, 2e. This is probably the best single book for finance and Python that I’ve found. However, it does jump from the basics into more advanced material quite quickly. This textbook also has a GitHub repository that contains the code used in the book.

There are also many free resources available. The key is being able to find what’s helpful - web searches often lead you to awful AI-generated Medium posts. While these guides are not finance-related, per se, they cover material that will come up in any sort of data science project. I thank the authors for making these resources available and I use examples from them in my notes.

Coding for Economists is an excellent online book that discusses getting Python set up, importing and exploring your data, as well as more advanced econometrics topics that we won’t get into here. But, if you’re using Stata in another class, I recommend taking a look at what you can do for free in Python. You might not have that Stata license on the job! The author even tells you how to automate your VS Code set-up, if you’re into that kind of thing.

Python Programming for Data Science is another great, general resource. You’ll find a discussion of the basics, along with a Python style guide for writing readable code, details about NumPy and pandas, and tips for data wrangling and cleaning.

Python for Data Analysis, 3E is also available for free online. The author has just updated to a third edition.

DataCamp also has lots of mini-tutorials and cheatsheets that can help get you started.

I also have a collection of non-DataCamp cheat sheets.

I have also found The Self-Taught Programmer to be a useful book, especially for someone like me, who writes code for research but has had only a few formal computer science courses.

If you’ve used R in the past, here’s a guide to setting up Python to be more like R.