plotly

We’ve now seen the basics of plotting with pandas, matplotlib, and seaborn. Let’s try another package, plotly, which lets us create interactive graphics. To install plotly, you’ll need to use pip. As before, you can do this inside a cell in your notebook or in the terminal.

pip install plotly
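
In a Jupyter notebook cell, the same command is usually written as %pip install plotly, so that the package is installed into the environment the notebook kernel is running in. As a minimal sketch for checking that the install worked, the snippet below looks up the installed version with the standard library. The variable name plotly_version is only for illustration.

```python
from importlib.metadata import PackageNotFoundError, version

# Look up the installed plotly version without importing plotly itself.
try:
    plotly_version = version("plotly")
except PackageNotFoundError:
    plotly_version = None

if plotly_version is None:
    print("plotly is not installed in this environment")
else:
    print(f"plotly {plotly_version} is installed")
```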

I won’t do a comprehensive overview, but I can show you a couple of examples. We’ll look mostly at Plotly Express, with a brief look at graph objects.

Let’s start by getting our stock data back in.

# Set-up

import numpy as np
import pandas as pd

# This brings in all of matplotlib
import matplotlib as mpl 

# This lets us refer to the pyplot part of matplotlib more easily. Just use plt!
import matplotlib.pyplot as plt

# Bring in Plotly Express
import plotly.express as px

# Bring in Plotly graphic objects
import plotly.graph_objects as go

import plotly.offline as py

# Keeps warnings from cluttering up our notebook. 
import warnings
warnings.filterwarnings('ignore')

# Include this to have plots show up in your Jupyter notebook.
%matplotlib inline 

# Read in some end-of-day (EOD) prices
stocks = pd.read_csv('https://raw.githubusercontent.com/aaiken1/fin-data-analysis-python/main/data/tr_eikon_eod_data.csv',
                  index_col=0, parse_dates=True)  

stocks.dropna(inplace=True)  

from janitor import clean_names

stocks = clean_names(stocks)

stocks.info()

Let’s make a simple line graph of prices for Apple. As a reminder, the date is the index of this DataFrame.

fig = px.line(stocks, x=stocks.index, y='aapl_o', title='Apple Price History')
fig.show()

Not bad! You can also create what plotly calls graph objects. You then add traces, one per data series, and start to layer things together. Here, we’ll create our “blank” figure and then add two price series.

# Create traces
fig = go.Figure()
fig.add_trace(go.Scatter(x=stocks.index, y=stocks.aapl_o,
                    mode='lines',
                    name='AAPL'))
fig.add_trace(go.Scatter(x=stocks.index, y=stocks.msft_o,
                    mode='lines',
                    name='MSFT'))
fig.show()

We can create a histogram of returns too. I added a rug at the top, which helps you see the distribution and outliers better. I am also showing the percentage of observations in each bin, not a count.

I also made several other changes, just to give you an idea of the syntax.

stocks['aapl_ret'] = np.log(stocks.aapl_o / stocks.aapl_o.shift(1))  

fig = px.histogram(stocks, x='aapl_ret', 
                   marginal="rug", # That thing at the top!
                   histnorm='percent', 
                   opacity=0.75, # alpha
                   width=600, # pixels
                   height=400,
                   template="simple_white")

fig.update_layout(
    title_text='Apple Return Distribution', # title of plot
    xaxis_title_text='Return', # xaxis label
    yaxis_title_text='Percent', # yaxis label
    bargap=0.2, # gap between bars of adjacent location coordinates
    bargroupgap=0.1 # gap between bars of the same location coordinates
)


fig.show()
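
To see what the percent normalization means, here is a minimal sketch that does the same kind of calculation by hand with numpy. The price series is simulated, not real data, and the choice of 20 bins is an assumption. Plotly picks its own bins, so the numbers will not match the chart above exactly.

```python
import numpy as np
import pandas as pd

# Simulated prices for illustration only (not real AAPL data).
rng = np.random.default_rng(0)
prices = pd.Series(100 * np.exp(np.cumsum(rng.normal(0, 0.01, 500))))

# Log return: log of today's price over yesterday's price.
# The first observation has no prior price, so it is dropped.
log_ret = np.log(prices / prices.shift(1)).dropna()

# Count observations per bin, then convert counts to percentages.
counts, edges = np.histogram(log_ret, bins=20)
percent = 100 * counts / counts.sum()

print(percent.round(1))
print(percent.sum())  # the bin percentages add up to 100, up to rounding error
```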