Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python 3.x - Apply function on a Pandas Dataframe

I have a script (C01) that calculates the 21-period moving average of a single stock listed on the Brazilian stock exchange (B3, IBOV index). I then wrote a for loop that flags the asset as being in an upward trend when its moving average has risen for 6 consecutive periods (a simplifying hypothesis, since more variables would be needed to determine a trend properly).

However, I want to run this check for more than one asset. In the second script (C02), I want to apply a function to each column of my DataFrame and return only the names of the assets that are in an upward trend (that is, the column names). I tried to turn the for loop into a function and apply it to each column with the pandas 'apply' method (with axis = 1; I also tried axis = 'columns'), but I'm getting an error when creating the function. When I run the function through apply, the message "ValueError: Lengths must match to compare" appears. How can I fix this?

Thanks for your attention.

import numpy as np
import pandas as pd
from pandas_datareader import data as wb
from mpl_finance import candlestick_ohlc
from datetime import datetime
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

#STOCK
ativo = 'WEGE3.SA'
acao2 = ativo.upper()

#START AND END ANALYSIS
inicio = '2020-1-1'
fim = '2021-1-27'

#MAKE DATAFRAME
df00 = wb.DataReader(acao2, data_source='yahoo', start=inicio, end=fim)

df00.index.names = ['Data']
df= df00.copy(deep=True)
df['Data'] = df.index.map(mdates.date2num)

# MOVING AVERAGE
df['ema21'] = df['Close'].ewm(span=21, adjust=False).mean()
df['ema72'] = df['Close'].ewm(span=72, adjust=False).mean()

#DF PLOT
df1=df
df2=df[-120:]

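#TREND RULE
# alta stays 1 only if ema21 never fell over the last 6 periods
alta=1
for i in range(6):
  if(df2.ema21.iloc[-i-1] < df2.ema21.iloc[-i-2]):
    alta=0

# baixa stays 1 only if ema21 never rose over the last 6 periods
baixa=1
for i in range(6):
  if(df2.ema21.iloc[-i-1] > df2.ema21.iloc[-i-2]):
    baixa=0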

if (alta==1 and baixa==0):
  a1 = ativo.upper()+ ' HIGH TREND'
elif (alta==0 and baixa==1):
  a1 = ativo.upper()+ ' LOW TREND!'
else:
  a1 = ativo.upper()+ ' UNDEFINED'
  
#PLOT RESULTS
print("---------------------------------------") 
print(a1)
print("---------------------------------------")

ohlc = df[['Data', 'Open', 'High', 'Low', 'Close']]

f1, ax = plt.subplots(figsize=(14, 8))

# plot the candlesticks
candlestick_ohlc(ax, ohlc.values, width=.6, colorup='green', colordown='red')
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m'))

label_ = acao2.upper() + ' EMA21'
label_2 = acao2.upper() + ' EMA72'
ax.plot(df.index, df1['ema21'], color='black', label=label_)
ax.plot(df.index, df1['ema72'], color='blue', label=label_2)

ax.grid(False)
ax.legend()
ax.grid(True)

plt.title(acao2.upper() + ' : Gráfico Diário')
plt.show(block=True)

#C02

#START/END ANALYSIS
inicio = '2020-1-1'
fim = '2021-1-27'

#STOCKS
ativos = ['SAPR11.SA','WEGE3.SA']

#DATAFRAME
mydata = pd.DataFrame()
for t in ativos:
    mydata[t] = wb.DataReader(t, data_source='yahoo', start=inicio, end=fim)['Close']
df2 = mydata

#MOVING AVERAGE
df3 = df2.apply(lambda x: x.rolling(window=21).mean())

#MAKE FUNCTION
def trend(x):
  tendencia_alta=1
  for i in range(6):
    if(df3.columns[-i-1:] > df3.columns[-i-2:]):
      tendencia_alta=0

  print()
  if (alta==1 and baixa==0):
      a1 = ativo.upper()+ ' HIGH TREND'
  elif (alta==0 and baixa==1):
      a1 = ativo.upper()+ ' LOW TREND!'
  else:
      a1 = ativo.upper()+ ' UNDEFINED'

#TRYING TO APPLY THE FUNCTION IN EVERY DF3 COLUMN
df3.apply(trend, axis=1)
Question from: https://stackoverflow.com/questions/65926811/apply-function-on-a-pandas-dataframe
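
The "Lengths must match to compare" message quoted in the question is what pandas raises when two Index objects of different lengths are compared element-wise, which is what happens when slices of the column labels such as columns[-1:] and columns[-2:] are compared. A minimal sketch that reproduces it without any market data is shown here; the small table and its values are hypothetical stand-ins for the real moving averages. It also illustrates what 'apply' hands to the function for each value of axis.

```python
import pandas as pd

# Hypothetical moving-average table: one column per asset, one row per day
df3 = pd.DataFrame(
    {"SAPR11.SA": [1.0, 2.0, 3.0], "WEGE3.SA": [3.0, 2.0, 1.0]}
)

# Slicing the column labels gives Index objects of different lengths (1 and 2).
# Comparing them element-wise is what raises the error.
try:
    df3.columns[-1:] > df3.columns[-2:]
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# With axis=0 (the default) the function receives each column as a Series;
# with axis=1 it receives each row as a Series instead.
column_lengths = df3.apply(lambda s: len(s), axis=0)  # one entry per column
row_lengths = df3.apply(lambda s: len(s), axis=1)     # one entry per row
print(column_lengths.to_dict())
print(row_lengths.to_dict())
```

So a function meant to look at the last values of each asset's moving average probably wants axis=0 (or no axis argument at all), and should work on the Series it receives rather than on df3.columns.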

1 Reply

Something like this:

def myfunc(x):
    # Do the work here; x is the group of rows passed to the function.
    # Instead of df['column'], use x['column'],
    # because the rows are passed in as x.
    return x

df.groupby('yourcolumn').apply(myfunc)
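
For the layout described in the question (one column per asset), a sketch along the following lines may be closer to what is needed than a groupby. It assumes 'apply' is called with the default axis=0, so each column arrives as a Series, and it uses the same 21-period rolling mean as C02. The names is_uptrend and assets_in_uptrend, the periods parameter, and the synthetic prices are all hypothetical and not part of the original code.

```python
import numpy as np
import pandas as pd


def is_uptrend(col, periods=6):
    """Return True if the last `periods` changes of the Series are all >= 0."""
    last_changes = col.diff().iloc[-periods:]
    # NaN changes (not enough data) compare as False, so they never count as a trend
    return bool((last_changes >= 0).all())


def assets_in_uptrend(prices, window=21, periods=6):
    """Return the column names whose moving average did not fall for `periods` steps."""
    moving_avg = prices.rolling(window=window).mean()
    # axis=0 (the default) hands each column to the function as a Series
    flags = moving_avg.apply(is_uptrend, periods=periods)
    return [name for name, up in flags.items() if up]


# Synthetic closing prices standing in for the downloaded data (hypothetical values)
days = pd.date_range("2020-01-01", periods=60, freq="D")
prices = pd.DataFrame(
    {
        "RISING.SA": np.linspace(10.0, 20.0, 60),
        "FALLING.SA": np.linspace(20.0, 10.0, 60),
    },
    index=days,
)

uptrend_assets = assets_in_uptrend(prices)
print(uptrend_assets)  # ['RISING.SA']
```

Returning a plain boolean per column keeps the trend rule separate from the reporting step, so the list of column names can be built from the resulting Series in one line.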
