Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


Reading Excel into a Python DataFrame starting from row 5 and including headers

How do I import Excel data into a DataFrame in Python?

Basically, the current Excel workbook runs some VBA on opening, which refreshes a pivot table and does some other stuff.

I then want to import the results of the pivot table refresh into a DataFrame in Python for further analysis.

import xlrd

wb = xlrd.open_workbook('C:UserscbMachine_LearningcMap_Joins.xlsm')

# sheet names
print(wb.sheet_names())

# number of sheets
print(wb.nsheets)

The refreshing and opening of the file works fine. But how do I select the data from the first sheet, from say row 5 (including the header) down to the last record n?


1 Reply


You can use the parse method of pandas' ExcelFile to read Excel sheets; see the pandas IO docs:

import pandas as pd

xls = pd.ExcelFile('C:UserscbMachine_LearningcMap_Joins.xlsm')

# skip the first 4 rows so that row 5 is used as the header
df = xls.parse('Sheet1', skiprows=4, index_col=None, na_values=['NA'])

skiprows=4 ignores the first 4 rows, so reading starts at row index 4 (the fifth row), which is then used as the header. parse also accepts several other options.
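The same idea can also be written with pd.read_excel, which takes the sheet position as well as its name. Below is a minimal sketch, not a definitive implementation: the helper name read_from_row and its parameters first_row and sheet are made up for illustration, and it assumes an Excel engine such as openpyxl is installed for .xlsx/.xlsm files.

```python
import pandas as pd


def read_from_row(path, first_row=5, sheet=0):
    """Read one sheet into a DataFrame, using `first_row` as the header row.

    `first_row` is 1-based, as displayed in Excel. `sheet` can be a
    0-based position (0 = first sheet) or a sheet name.
    This helper is hypothetical and only wraps pd.read_excel.
    """
    return pd.read_excel(
        path,
        sheet_name=sheet,         # 0 selects the first sheet in the workbook
        skiprows=first_row - 1,   # drop every row above the header row
        header=0,                 # first remaining row supplies column names
    )


# Usage (the file name here is a placeholder, not the path from the question):
# df = read_from_row("workbook.xlsm", first_row=5)
# print(df.head())
```

Everything below the header row is read down to the last populated row, so there is no need to know n in advance.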

