NASDAQ Data Link#
NASDAQ Data Link (formerly Quandl) provides access to a wide variety of financial and alternative datasets.
Getting Started#
Create an account at data.nasdaq.com/sign-up - choose Academic if you’re a student
Set up two-factor authentication (Google Authenticator recommended)
Get your API key from data.nasdaq.com/account/profile
Store the key as a GitHub Secret (recommended) or in a .env file (see the example at the end of these steps)
Install the package:
pip install nasdaq-data-link
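If you go the .env route, the file is just plain text in your project root with one KEY=value pair per line. A minimal sketch (the variable name NASDAQ_API_KEY matches the set-up code below; the value shown is a placeholder):
# .env - keep this file out of version control (add it to .gitignore)
NASDAQ_API_KEY=your_key_here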
Fig. 37 NASDAQ Data Link homepage.#
Set-up#
import nasdaqdatalink
import os
from dotenv import load_dotenv
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
# Load API key
load_dotenv()
NASDAQ_API_KEY = os.getenv('NASDAQ_API_KEY')
# Configure the API
nasdaqdatalink.ApiConfig.api_key = NASDAQ_API_KEY
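If the key fails to load (for example, the .env file is missing or the variable name doesn't match), every later request will be rejected with a 403 error, so a quick check up front saves debugging time. A minimal sketch:
# Fail fast with a clear message if the key was not found
if not NASDAQ_API_KEY:
    raise RuntimeError('NASDAQ_API_KEY not found - check your .env file or environment')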
Secure API Key Usage#
For GitHub Codespaces, use secrets:
# Recommended approach: read the key from an environment variable
# (a Codespaces secret named NASDAQ is exposed to the container this way)
import nasdaqdatalink as ndl
import os

NASDAQ_API_KEY = os.environ.get('NASDAQ')
ndl.ApiConfig.api_key = NASDAQ_API_KEY
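If you switch between working locally (with a .env file) and in Codespaces (with a secret), a small fallback lets the same notebook run in both places. A sketch, assuming the secret is named NASDAQ and the local variable is NASDAQ_API_KEY as above:
import os
import nasdaqdatalink as ndl
from dotenv import load_dotenv

# Works both locally and in Codespaces:
# prefer the Codespaces secret, then fall back to a .env value
load_dotenv()  # harmless no-op when there is no .env file
api_key = os.environ.get('NASDAQ') or os.getenv('NASDAQ_API_KEY')
ndl.ApiConfig.api_key = api_key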
Exploring Available Data#
Click EXPLORE on the main data page to see all available datasets. Much of the data is premium (paid), but you can filter for free data.
Fig. 38 Exploring NASDAQ data options.#
Example: Zillow Housing Data#
NASDAQ provides Zillow real estate data for free.
Fig. 39 Zillow housing data on NASDAQ.#
The data is organized into tables:
ZILLOW/DATA - The actual housing values
ZILLOW/REGIONS - Region identifiers and descriptions
ZILLOW/INDICATORS - Indicator descriptions
Understanding the Data Structure#
Before downloading data, you need to understand which columns each table can be filtered on (for ZILLOW/DATA these include indicator_id and region_id). Check the documentation page for each table for its filtering options.
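The ZILLOW/INDICATORS table is a handy reference: it maps indicator codes (like the ZATT code used later) to plain-language descriptions. A quick sketch to pull it with the same get_table call used throughout this section:
# Pull the indicator reference table to see what each code measures
indicators = nasdaqdatalink.get_table('ZILLOW/INDICATORS', paginate=True)
indicators.head(10)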
Exploring Regions#
Let’s first pull the regions table to understand what geographic areas are available.
regions = nasdaqdatalink.get_table('ZILLOW/REGIONS', paginate=True)
regions
(Output omitted: when no valid API key is configured, this call raises ForbiddenError (Status 403), "A valid API key is required to retrieve data.")
Note
The paginate=True parameter is important! Without it, you only get the first 10,000 rows. With pagination, the limit extends to 1,000,000 rows.
# Filter for just cities
cities = regions[regions.region_type == 'city']
cities
cities.info()
# Filter for counties
counties = regions[regions.region_type == 'county']
counties
Finding Specific Regions#
You can search the text in the region column to find specific regions; the values are semicolon-separated strings that include the state abbreviation. Let's find all North Carolina (NC) counties:
nc_counties = counties[counties['region'].str.contains(';NC')]
nc_counties
nc_counties.info()
North Carolina has exactly 100 counties, so a row count of 100 confirms the filter worked.
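A quick assertion makes that check explicit (a small sketch; it assumes the filter above returned one row per county):
# North Carolina has exactly 100 counties
assert len(nc_counties) == 100, f'Expected 100 NC counties, got {len(nc_counties)}'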
Downloading Housing Data#
Now we can use these region IDs to pull actual housing data. First, convert to a list:
nc_county_list = nc_counties['region_id'].to_list()
Now pull the data for a specific indicator. ZATT is one of the Zillow home value indicator codes; the ZILLOW/INDICATORS table describes it and the other available indicators.
Warning
Be careful with large downloads! They can exceed API limits and cause timeouts. One workaround is to request the data in smaller batches (see the sketch after the example below).
# Example: Pull NC county data (commented to avoid API calls during build)
# zillow_nc = nasdaqdatalink.get_table(
# 'ZILLOW/DATA',
# indicator_id='ZATT',
# paginate=True,
# region_id=nc_county_list,
# qopts={'columns': ['indicator_id', 'region_id', 'date', 'value']}
# )
# zillow_nc.head(25)
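One way to keep each request small is to batch the region IDs and concatenate the results. A sketch, assuming nc_county_list and the ZATT indicator from above (commented out like the example above, to avoid API calls during the build):
# batches = [nc_county_list[i:i + 25] for i in range(0, len(nc_county_list), 25)]
# frames = []
# for batch in batches:
#     frames.append(nasdaqdatalink.get_table(
#         'ZILLOW/DATA',
#         indicator_id='ZATT',
#         region_id=batch,
#         paginate=True,
#         qopts={'columns': ['indicator_id', 'region_id', 'date', 'value']}
#     ))
# zillow_nc = pd.concat(frames, ignore_index=True)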
Exporting Data#
You can export to CSV for exploration in Excel:
# Export counties to CSV
counties.to_csv('counties.csv', index=True)
Using Rapid API#
Rapid API is a marketplace for thousands of APIs - markets, sports, weather, and more. Many have free tiers.
Fig. 40 Rapid API marketplace.#
REST APIs#
Most APIs on Rapid API are REST APIs: a standardized way for programs to communicate over HTTP, exchanging data in standard formats like JSON.
Learn more at the API Learn page.
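Each API's page on Rapid API generates a request snippet with the endpoint URL and the required headers. The general shape is sketched below; the URL and host are placeholders, the X-RapidAPI-Key / X-RapidAPI-Host header names follow the usual Rapid API convention, and the key itself should come from a secret or environment variable rather than being hard-coded:
import os

# Placeholder endpoint and host - copy the real values from the API's Rapid API page
url = 'https://example-api.p.rapidapi.com/endpoint'
headers = {
    'X-RapidAPI-Key': os.environ.get('RAPIDAPI_KEY'),  # your Rapid API key, stored as a secret
    'X-RapidAPI-Host': 'example-api.p.rapidapi.com'
}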
Working with JSON Data#
API responses typically come in JSON format. Here’s how to handle them:
import requests
import pandas as pd

# Make the API request (url and headers come from the API's code snippet, as above)
response = requests.get(url, headers=headers)

# Parse the response body as JSON (a dict or list of dicts)
data_json = response.json()

# Flatten the nested JSON into a DataFrame
df = pd.json_normalize(data=data_json)
Fig. 41 JSON structure - nested dictionaries and lists.#
For deeply nested JSON, use the record_path parameter:
# Flatten nested 'events' key
df = pd.json_normalize(data=data_json, record_path=['events'])
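To see what record_path does without making an API call, here is a small self-contained example with a made-up payload that has a nested events list (the field names are hypothetical):
import pandas as pd

# Made-up response with a nested 'events' list
sample_json = {
    'league': 'Sample League',
    'events': [
        {'home': 'Team A', 'away': 'Team B', 'score': {'home': 2, 'away': 1}},
        {'home': 'Team C', 'away': 'Team D', 'score': {'home': 0, 'away': 0}},
    ],
}

# record_path picks the nested list to expand into rows;
# meta carries top-level fields (like the league name) into each row
events_df = pd.json_normalize(data=sample_json, record_path=['events'], meta=['league'])
print(events_df)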
Data on Kaggle#
Kaggle is another great source for datasets. Search their datasets for finance-related data.
Examples:
Stock price histories
Economic indicators
Kaggle data is usually pre-cleaned for competitions, but you’ll still need to do some preparation work.
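Once you've downloaded a dataset (most Kaggle datasets ship as CSV files), that preparation usually starts with a few standard pandas steps. A sketch with a hypothetical file and column names:
import pandas as pd

# Hypothetical CSV downloaded from a Kaggle dataset page
prices = pd.read_csv('stock_prices.csv', parse_dates=['Date'])

# Typical first steps: check dtypes, look for missing values, set a datetime index
prices.info()
print(prices.isna().sum())
prices = prices.set_index('Date').sort_index()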