Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share

python - Plotly: How to display regression errors with lines between the observations and the regression line?

I've generated the following plotly chart in Python. I fitted a regression to a finite set of points and got the following chart:

enter image description here

I want to draw a vertical line between each point and the fitted curve, as in the example below:

enter image description here

I'm using plotly.graph_objects and pandas to generate these graphs, and I have no idea how to draw the lines. This is the code I'm using:

import pandas as pd
import plotly.graph_objects as go

for point, curve in zip(points, curves):

    point_plot = go.Scatter(x=df['Duration'],
                            y=df[point],
                            name=point,
                            # text=df['Nome'],
                            mode='markers+text',
                            line_color=COLOR_CODE[point],
                            textposition='top center')

    line_plot = go.Scatter(x=df['Duration'],
                            y=df[curve],
                            name='', 
                            line_color=COLOR_CODE[point],
                            mode='lines')
    

    # XXX: this doesn't solve the problem, but it's what I could think of for now
    to_bar = df[points].diff(axis=1).copy()
    to_bar['Nome'] = df['Nome']
    bar_plot = go.Bar(x=to_bar['Nome'], y=to_bar[point], name='', marker_color=COLOR_CODE[point])

                            
    fig.add_trace(line_plot, row=1, col=1)
    fig.add_trace(point_plot, row=1, col=1)
    fig.add_trace(bar_plot, row=2, col=1)

I can share the dataframe I'm using to generate this plot if you think it's needed. Any help is more than welcome.


1 Reply

You haven't provided a working code snippet with a sample of your data, so I'm going to base my suggestion on my earlier answer, Plotly: How to plot a regression line using plotly?. If your figure is limited to two series, as in your example, you can:

1. retrieve x-values from one of the series using xVals = fig.data[0]['x'], and

2. organize all points for your regression line and observation markers using a dict, errors = {}, and

3. populate that dict using:

for d in fig.data:
    errors[d['mode']]=d['y']

4. Then you can add a line shape for distance between your line and markers (your errors) using:

for i, x in enumerate(xVals):
    shapes.append(go.layout.Shape(type="line", [...]))
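The four steps above can also be sketched as one small helper. This is only a rough sketch under a few assumptions: the name error_shapes and its xref/yref parameters are made up for illustration, and it returns plain dicts (which plotly accepts for layout shapes) instead of go.layout.Shape objects, so the helper itself does not need plotly.

```python
def error_shapes(x_vals, observed, fitted, xref="x", yref="y"):
    """Return one vertical line-shape dict per observation.

    error_shapes is a hypothetical helper, not part of plotly.
    Each shape runs from the observed value to the fitted value at the same x.
    """
    if not (len(x_vals) == len(observed) == len(fitted)):
        raise ValueError("x_vals, observed and fitted must have the same length")
    return [
        dict(
            type="line",
            xref=xref,  # axis the x coordinates refer to, e.g. "x2" on a subplot
            yref=yref,  # axis the y coordinates refer to
            x0=x, y0=y_obs,  # start at the observation marker
            x1=x, y1=y_fit,  # end on the regression line
            line=dict(color="black", width=1),
            opacity=0.5,
        )
        for x, y_obs, y_fit in zip(x_vals, observed, fitted)
    ]


# Illustrative values only, not the asker's data
shapes = error_shapes([1, 2, 3], [2.0, 3.5, 7.0], [2.5, 4.0, 5.5])
```

With an existing figure you would then call fig.update_layout(shapes=shapes). If the scatter sits in a subplot, as in the question, the xref and yref arguments would probably need to point at that subplot's axes.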

Result:

enter image description here

Complete code:

import plotly.graph_objects as go
import statsmodels.api as sm
import pandas as pd
import numpy as np

# data
np.random.seed(123)
numdays=20

X = (np.random.randint(low=-20, high=20, size=numdays).cumsum()+100).tolist()
Y = (np.random.randint(low=-20, high=20, size=numdays).cumsum()+100).tolist()

df = pd.DataFrame({'X': X, 'Y':Y})

# regression
df['bestfit'] = sm.OLS(df['Y'],sm.add_constant(df['X'])).fit().fittedvalues

# plotly figure setup
fig=go.Figure()
fig.add_trace(go.Scatter(name='X vs Y', x=df['X'], y=df['Y'].values, mode='markers'))
fig.add_trace(go.Scatter(name='line of best fit', x=X, y=df['bestfit'], mode='lines'))


# plotly figure layout
fig.update_layout(xaxis_title = 'X', yaxis_title = 'Y')

# retrieve x-values from one of the series
xVals = fig.data[0]['x']

errors = {} # container for prediction errors

# organize data for errors in a dict
for d in fig.data:
    errors[d['mode']]=d['y']

shapes = [] # container for shapes

# make a line shape for each error == distance between each marker and line points
for i, x in enumerate(xVals):
    shapes.append(go.layout.Shape(type="line",
                                    x0=x,
                                    y0=errors['markers'][i],
                                    x1=x,
                                    y1=errors['lines'][i],
                                    line=dict(
                                        color = 'black',
                                        width=1),
                                    opacity=0.5,
                                    layer="above")
                 )

# include shapes in layout
fig.update_layout(shapes=shapes)
fig.show()
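As a possible alternative to layout shapes (this is not what the code above does), the error lines could be drawn as a single extra trace whose segments are separated by None values, since a scatter trace in 'lines' mode leaves a gap at missing points by default. The helper name residual_segments is an assumption made for this sketch.

```python
def residual_segments(x_vals, observed, fitted):
    """Build x/y lists describing one vertical segment per observation.

    residual_segments is a hypothetical helper, not part of plotly.
    A None after each pair of points breaks the line, so the segments
    are not connected to each other.
    """
    xs, ys = [], []
    for x, y_obs, y_fit in zip(x_vals, observed, fitted):
        xs.extend([x, x, None])
        ys.extend([y_obs, y_fit, None])
    return xs, ys


# Illustrative values only
xs, ys = residual_segments([1, 2], [2.0, 3.5], [2.5, 4.0])

# Assumed usage with the figure from the complete code above:
# xs, ys = residual_segments(df['X'], df['Y'], df['bestfit'])
# fig.add_trace(go.Scatter(x=xs, y=ys, mode='lines', name='errors',
#                          line=dict(color='black', width=1)))
```

One trade-off to consider: a trace shows up in the legend and can be toggled or placed in a subplot with row and col like any other trace, while shapes stay part of the layout.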
