My code looks like this: I use pd.DataFrame.from_records to fill the data into the DataFrame, but it takes Wall time: 1h 40min 30s to run the query and load the data from a SQL table with 22 million rows into the DataFrame.
# I skipped some of the code, since there is no problem with the query itself; it runs fast
cur = con.cursor()

def db_select(query):  # takes the query text and returns the result as a DataFrame
    cur.execute(query)
    col = [column[0].lower() for column in cur.description]  # parse the column headers
    df = pd.DataFrame.from_records(cur, columns=col)  # fill the data into the DataFrame
    return df
Then I pass the SQL query to the function:
frame = db_select("select * from table")
How can I optimize this code to speed up the process?
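
For reference, here is a chunked-fetch sketch of the same function. This is only a minimal sketch, assuming the standard DB-API cursor method cur.fetchmany; the function name db_select_chunked and the chunk_size value are made up for illustration, and I don't know whether it actually helps at 22 million rows:

import pandas as pd

def db_select_chunked(query, chunk_size=100_000):
    # `con` is the same open DB-API connection used above
    cur = con.cursor()
    cur.execute(query)
    cols = [column[0].lower() for column in cur.description]
    chunks = []
    while True:
        # fetchmany() pulls a fixed-size batch of rows per call (standard DB-API)
        rows = cur.fetchmany(chunk_size)
        if not rows:
            break
        chunks.append(pd.DataFrame.from_records(rows, columns=cols))
    # concatenate all batches into one DataFrame
    return pd.concat(chunks, ignore_index=True)

It would be called the same way as above, e.g. frame = db_select_chunked("select * from table").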