I have a small problem converting scraped data into a time series. Here are the steps that I carried out.
I get the data with the following code:
import json
import re

import requests
from bs4 import BeautifulSoup

url1 = 'http://financials.morningstar.com/finan/financials/getFinancePart.html?&callback=xxx&t=BBRI'
url2 = 'http://financials.morningstar.com/finan/financials/getKeyStatPart.html?&callback=xxx&t=BBRI'

# The response is JSONP of the form xxx({...}); the regex extracts the JSON inside the parentheses.
soup1 = BeautifulSoup(json.loads(re.findall(r'xxx\((.*)\)', requests.get(url1).text)[0])['componentData'], 'lxml')
soup2 = BeautifulSoup(json.loads(re.findall(r'xxx\((.*)\)', requests.get(url2).text)[0])['componentData'], 'lxml')

def print_table(soup):
    for i, tr in enumerate(soup.select('tr')):
        # Collect the non-empty cell texts of this table row.
        row_data = [td.text for td in tr.select('th, td') if td.text]
        if not row_data:
            continue
        # The header row has no label cell, so pad it with 'X'.
        if len(row_data) < 12:
            row_data = ['X'] + row_data
        for j, td in enumerate(row_data):
            if j == 0:
                print('{: >30}'.format(td))
            else:
                print('{: ^12}'.format(td))
        print()

print_table(soup1)
This produces the following output:
X
2010-12
2011-12
2012-12
2013-12
2014-12
2015-12
2016-12
2017-12
2018-12
2019-12
TTM
Revenue IDR Mil
30,552,600
40,203,051
43,104,711
51,133,344
59,556,636
69,813,152
82,504,537
90,844,308
99,067,098
108,468,320
105,847,159
I need to convert it into a pandas DataFrame that looks like this:
data
X Revenue IDR Mil
2010-12 30,552,600
2011-12 40,203,051
2012-12 43,104,711
2013-12 51,133,344
2014-12 59,556,636
2015-12 69,813,152
2016-12 82,504,537
2017-12 90,844,308
2018-12 99,067,098
2019-12 108,468,320
2020-12 105,847,159
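One possible way to get from the scraped rows to that frame is sketched below. It is only a sketch under a few assumptions: the rows are collected into lists (label first, then one value per period) instead of being printed, `rows_to_frame` is a hypothetical helper name, the sample values are copied from the output above so the snippet runs without a network call, and `TTM` is relabelled `2020-12` as in the desired output.

```python
import pandas as pd

# Rows in the shape print_table builds them: label first, then one value
# per period. The values are copied from the output in the question.
rows = [
    ['X', '2010-12', '2011-12', '2012-12', '2013-12', '2014-12', '2015-12',
     '2016-12', '2017-12', '2018-12', '2019-12', 'TTM'],
    ['Revenue IDR Mil', '30,552,600', '40,203,051', '43,104,711',
     '51,133,344', '59,556,636', '69,813,152', '82,504,537', '90,844,308',
     '99,067,098', '108,468,320', '105,847,159'],
]


def rows_to_frame(rows):
    """Hypothetical helper: turn [header, row, ...] into a DataFrame."""
    header, *body = rows
    # Each scraped row becomes a column; the header periods become the index.
    df = pd.DataFrame({row[0]: row[1:] for row in body}, index=header[1:])
    df.index.name = header[0]
    # Drop thousands separators, then convert the strings to numbers.
    return df.apply(
        lambda col: pd.to_numeric(col.str.replace(',', '', regex=False),
                                  errors='coerce')
    )


data = rows_to_frame(rows)

# Assumption: 'TTM' is relabelled '2020-12', as in the desired output above.
data = data.rename(index={'TTM': '2020-12'})
# Monthly periods give the frame a proper time-series index.
data.index = pd.PeriodIndex(data.index, freq='M', name='X')
print(data)
```

To use this with the live page, print_table would need to append each row_data list to a list and return it rather than printing the cells. Any cell that is not a number becomes NaN because of errors='coerce'.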
question from:
https://stackoverflow.com/questions/65933303/converting-ordinary-data-into-time-seris-dataframes-panda-python