plotly
We’ve now seen the basics of plotting with pandas, matplotlib, and seaborn. Let’s try another package, called plotly, that lets us create interactive graphics. To install the plotly package, you’ll need to use pip. As before, you can do this inside a cell in your notebook or in the terminal.
pip install plotly
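If you want to confirm that the install worked before going further, here is a minimal sketch using only the standard library. The helper name `installed_version` is my own, not part of plotly.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("plotly")
if version is None:
    print("plotly is not installed; run `pip install plotly` first")
else:
    print("plotly version:", version)
```

Run this in a notebook cell after the install. If it reports that plotly is missing, the notebook kernel may be using a different Python environment than the one pip installed into.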
I won’t do a comprehensive overview, but I can show you a couple of examples. We’ll look at Plotly Express.
Let’s start by getting our stock data back in.
# Set-up
import numpy as np
import pandas as pd
# This brings in all of matplotlib
import matplotlib as mpl
# This lets us refer to the pyplot part of matplotlib more easily. Just use plt!
import matplotlib.pyplot as plt
# Bring in Plotly Express
import plotly.express as px
# Bring in Plotly graph objects
import plotly.graph_objects as go
import plotly.offline as py
# Keeps warnings from cluttering up our notebook.
import warnings
warnings.filterwarnings('ignore')
# Include this to have plots show up in your Jupyter notebook.
%matplotlib inline
# Read in some eod prices
stocks = pd.read_csv('https://raw.githubusercontent.com/aaiken1/fin-data-analysis-python/main/data/tr_eikon_eod_data.csv',
index_col=0, parse_dates=True)
stocks.dropna(inplace=True)
from janitor import clean_names
stocks = clean_names(stocks)
stocks.info()
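The line `from janitor import clean_names` can raise an ImportError when the installed pyjanitor and pandas versions don’t work together. If that happens to you, here is a rough stand-in that uses only the standard library. The function `simple_clean_name` is a hypothetical helper of mine, and I am not claiming it matches pyjanitor’s output for every possible name. It does turn `AAPL.O` into `aapl_o` and `MSFT.O` into `msft_o`, which are the names used in the rest of this section.

```python
import re


def simple_clean_name(name):
    """Lowercase a column name and replace runs of non-alphanumerics with '_'.

    An approximation for illustration, not a re-implementation of
    janitor.clean_names.
    """
    cleaned = re.sub(r"[^0-9a-z]+", "_", str(name).strip().lower())
    # Drop any leading or trailing underscores left over from punctuation.
    return cleaned.strip("_")


print([simple_clean_name(c) for c in ["AAPL.O", "MSFT.O"]])
# ['aapl_o', 'msft_o']
```

To apply it to the DataFrame, you could write `stocks.columns = [simple_clean_name(c) for c in stocks.columns]` in place of the two janitor lines.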
Let’s make a simple line graph of prices for Apple. As a reminder, our date is our index in this DataFrame.
fig = px.line(stocks, x=stocks.index, y='aapl_o', title='Apple Price History')
fig.show()
Not bad! You can also build figures from what plotly calls graph objects. You then add traces, one per plotted series, and start to layer things together. Here, we’ll create our “blank” figure and then add two price series.
# Create traces
fig = go.Figure()
fig.add_trace(go.Scatter(x=stocks.index, y=stocks.aapl_o,
mode='lines',
name='AAPL'))
fig.add_trace(go.Scatter(x=stocks.index, y=stocks.msft_o,
mode='lines',
name='MSFT'))
fig.show()
We can create a histogram of returns too. I added a rug plot along the top, which helps you see the distribution and outliers better. I am also showing the percentage of observations in each bin, not a count.
And, I made a bunch of other changes, just to give you an idea of the syntax.
stocks['aapl_ret'] = np.log(stocks.aapl_o / stocks.aapl_o.shift(1))
fig = px.histogram(stocks, x='aapl_ret',
marginal="rug", # That thing at the top!
histnorm='percent',
opacity=0.75, # alpha
width=600, #pixels
height=400,
template="simple_white")
fig.update_layout(
title_text='Apple Return Distribution', # title of plot
xaxis_title_text='Return', # xaxis label
yaxis_title_text='Percent', # yaxis label
bargap=0.2, # gap between bars of adjacent location coordinates
bargroupgap=0.1 # gap between bars of the same location coordinates
)
fig.show()
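To make the two calculations behind this histogram concrete, here is a small sketch in plain Python. The `aapl_ret` column holds log returns, ln(P_t / P_{t-1}), and `histnorm='percent'` scales the bars so that they sum to 100. The prices and bin edges are made-up numbers, and `percent_per_bin` is a hypothetical helper of mine, not how plotly picks its bins.

```python
import math

# Made-up prices for illustration only.
prices = [100.0, 102.0, 101.0, 101.0, 104.0]

# Log return for each day after the first: ln(P_t / P_{t-1}).
# The first price has no previous price, which is why shift(1) leaves a
# missing value in the first row of the DataFrame version.
log_returns = [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]


def percent_per_bin(values, edges):
    """Percent of values in each bin [edges[i], edges[i+1]); last bin includes its right edge."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            if edges[i] <= v < edges[i + 1] or (is_last and v == edges[-1]):
                counts[i] += 1
                break
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    return [100.0 * c / total for c in counts]


# Hand-picked bin edges, chosen only for this example.
edges = [-0.02, 0.0, 0.02, 0.04]
print(percent_per_bin(log_returns, edges))
# [25.0, 50.0, 25.0]
```

One nice property of log returns is that they add up across time: the sum of the daily values equals the log return over the whole period, ln(104 / 100) in this example.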