Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


Converting ordinary data into a time series DataFrame with pandas in Python

I have a small problem converting data to a time series. Here are the steps I carried out. I scrape the data with Beautiful Soup, a library that makes it easy to scrape information from web pages. It sits atop an HTML or XML parser, providing Pythonic idioms for iterating, searching, and modifying the parse tree. My code is as follows:

import json
import re

import requests
from bs4 import BeautifulSoup

url1 = 'http://financials.morningstar.com/finan/financials/getFinancePart.html?&callback=xxx&t=BBRI'
url2 = 'http://financials.morningstar.com/finan/financials/getKeyStatPart.html?&callback=xxx&t=BBRI'

# The response is wrapped as xxx(...); capture the JSON between the parentheses.
soup1 = BeautifulSoup(json.loads(re.findall(r'xxx\((.*)\)', requests.get(url1).text)[0])['componentData'], 'lxml')
soup2 = BeautifulSoup(json.loads(re.findall(r'xxx\((.*)\)', requests.get(url2).text)[0])['componentData'], 'lxml')

def print_table(soup):
    for i, tr in enumerate(soup.select('tr')):
        # Collect the non-empty header and data cells of this row.
        row_data = [td.text for td in tr.select('th, td') if td.text]
        if not row_data:
            continue
        # Rows with fewer than 12 cells have no label, so pad them with 'X'.
        if len(row_data) < 12:
            row_data = ['X'] + row_data
        for j, td in enumerate(row_data):
            if j==0:
                print('{: >30}'.format(td))
            else:
                print('{: ^12}'.format(td))
        print()


print_table(soup1)

This produces the following output:

          X
  2010-12   
  2011-12   
  2012-12   
  2013-12   
  2014-12   
  2015-12   
  2016-12   
  2017-12   
  2018-12   
  2019-12   
    TTM     

               Revenue IDR Mil
 30,552,600 
 40,203,051 
 43,104,711 
 51,133,344 
 59,556,636 
 69,813,152 
 82,504,537 
 90,844,308 
 99,067,098 
108,468,320 
105,847,159 

I need to convert it to a pandas DataFrame like this:

data

   X        Revenue IDR Mil
  2010-12        30,552,600
  2011-12        40,203,051
  2012-12        43,104,711
  2013-12        51,133,344
  2014-12        59,556,636
  2015-12        69,813,152
  2016-12        82,504,537
  2017-12        90,844,308
  2018-12        99,067,098
  2019-12       108,468,320
  2020-12       105,847,159
Question from: https://stackoverflow.com/questions/65933303/converting-ordinary-data-into-time-seris-dataframes-panda-python


1 Reply


This is a bit simpler than what you are doing, but I think it gets you where you need to go. It is mostly adapted from Bitto Bennichan:

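from io import StringIO

import pandas as pd
import requests

url1 = 'http://financials.morningstar.com/finan/financials/getFinancePart.html?t=BBRI'
url2 = 'http://financials.morningstar.com/finan/financials/getKeyStatPart.html?t=BBRI'

# The URL omits the callback parameter, so the body is parsed as plain JSON.
lm_json = requests.get(url1).json()
# Wrap the HTML string in StringIO; newer pandas deprecates passing literal HTML.
df_list = pd.read_html(StringIO(lm_json["componentData"]))
# Transpose so that each period becomes a row instead of a column.
df = df_list[0].transpose()
print(df)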
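
The transposed table still holds the periods and revenue figures as plain strings, so one more step is needed to get an actual time series. Below is a minimal sketch of that step, assuming the labels and values look like the ones printed in the question. The `to_time_series` helper and the hard-coded sample lists are illustrative and not part of the original answer. The 'TTM' row has no date, so this sketch simply drops it.

```python
import pandas as pd

# Sample values copied from the question's printed output; with live data
# they would come from the transposed table instead (stand-in data).
periods = ['2010-12', '2011-12', '2012-12', '2013-12', '2014-12',
           '2015-12', '2016-12', '2017-12', '2018-12', '2019-12', 'TTM']
revenue = ['30,552,600', '40,203,051', '43,104,711', '51,133,344',
           '59,556,636', '69,813,152', '82,504,537', '90,844,308',
           '99,067,098', '108,468,320', '105,847,159']


def to_time_series(periods, values, name):
    """Hypothetical helper: return a Series indexed by timestamps."""
    df = pd.DataFrame({'period': periods, name: values})
    # Strip thousands separators before converting the strings to integers.
    df[name] = df[name].str.replace(',', '', regex=False).astype('int64')
    # Labels such as 'TTM' are not dates; errors='coerce' turns them into NaT.
    df['period'] = pd.to_datetime(df['period'], format='%Y-%m', errors='coerce')
    # Drop the rows without a valid date and use the dates as the index.
    return df.dropna(subset=['period']).set_index('period')[name]


series = to_time_series(periods, revenue, 'Revenue IDR Mil')
print(series)
```

Each 'YYYY-MM' label is parsed to the first day of that month. If the 'TTM' figure should be kept, it would need an explicit date assigned before parsing, and which date is appropriate depends on the reporting period.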
