web scraping - Extracting Numbers from Yahoo Financial statement, willing to pay some money through Pay Pal


posted Jan 31, 2022 in Technique[技术] by 深蓝 (71.8m points)

I am trying to extract financial data from Yahoo Finance using Python. The image linked below circles the data I am trying to retrieve. It shows how the data table is organized, but I do not know where to begin with what the picture shows.

This image shows where the numbers I'm trying to extract sit in the page source on Yahoo Finance, along with the table name and the td elements.

I realize that I must somehow use the td elements to locate the numbers I need, but I'm not sure which basic commands to use.

This is a link to an example of the data table that I'm trying to scrape.


1 Reply

深蓝 · Answer 1 · 2022-01-31T07:21:29+0000

The page you are scraping is rendered by JavaScript, and requests and urllib cannot execute JavaScript. I recommend using selenium together with BeautifulSoup to extract the data. This is what the page looks like when JavaScript is disabled:

The data you want is served from this URL:

http://financials.morningstar.com/ajax/ReportProcess4HtmlAjax.html?&t=XNAS:AAPL&region=usa&culture=en-US&ops=clear&cur=&reportType=is&period=12&dataType=A&order=asc&columnYear=5&curYearPart=1st5year&rounding=3&view=raw&r=378724&callback=jsonp1482077238548&_=1482077239651
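Since everything the endpoint needs is passed in the query string, it can help to split the URL into its parameters before requesting it. The sketch below uses only the standard library and makes no network call. The helper name with_param is hypothetical, and the idea that changing t selects a different ticker (XNAS:MSFT here) is an assumption based on the parameter's value, not something the endpoint is confirmed to support.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

URL = (
    "http://financials.morningstar.com/ajax/ReportProcess4HtmlAjax.html"
    "?&t=XNAS:AAPL&region=usa&culture=en-US&ops=clear&cur=&reportType=is"
    "&period=12&dataType=A&order=asc&columnYear=5&curYearPart=1st5year"
    "&rounding=3&view=raw&r=378724&callback=jsonp1482077238548&_=1482077239651"
)


def with_param(url, name, value):
    """Return url with one query parameter replaced (hypothetical helper)."""
    parts = urlsplit(url)
    # keep_blank_values=True preserves empty parameters such as "cur=".
    params = dict(parse_qsl(parts.query, keep_blank_values=True))
    params[name] = value
    # safe=":" keeps the "XNAS:..." form instead of percent-encoding the colon.
    return urlunsplit(parts._replace(query=urlencode(params, safe=":")))


params = dict(parse_qsl(urlsplit(URL).query, keep_blank_values=True))
print(params["t"])         # ticker-like value taken from the URL
print(params["callback"])  # name of the JSONP wrapper function

# Assumption: swapping "t" requests another symbol.
new_url = with_param(URL, "t", "XNAS:MSFT")
print(new_url)
```

The callback value matters when reading the response: the body comes back wrapped in a call to that function name, so the wrapper has to be removed before the JSON can be parsed.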

I loaded the response into bs4; from there you can extract the data yourself:

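import json

import bs4
import requests

r = requests.get('http://financials.morningstar.com/ajax/ReportProcess4HtmlAjax.html?&t=XNAS:AAPL&region=usa&culture=en-US&ops=clear&cur=&reportType=is&period=12&dataType=A&order=asc&columnYear=5&curYearPart=1st5year&rounding=3&view=raw&r=378724&callback=jsonp1482077238548&_=1482077239651')

# The body is JSONP: jsonp1482077238548({...}). Keep only the JSON between the outer
# parentheses; str.strip() removes a set of characters, not a prefix, so it is fragile here.
js = r.text[r.text.index('(') + 1:r.text.rindex(')')]
html_str = json.loads(js)['result']  # 'result' holds the table as an HTML string
soup = bs4.BeautifulSoup(html_str, 'lxml')
print(soup.prettify())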

Output (truncated):

<html>
 <body>
  <div id="baseline" style="display:none">
   <div>
    156508000000
   </div>
   <div>
    170910000000
   </div>
   <div>
    182795000000
   </div>
   <div>
    233715000000
   </div>
   <div>
    215639000000
   </div>
   <div>
    215639000000
   </div>
  </div>
  <div class="left ">
   <div class="r_xcmenu rf_table_left">
    <div class="rf_header ">
     <div class="lbl " currency="USD" fiscalyearend="September" fyenumber="9" id="unitsAndFiscalYear">
     </div>
    </div>
    <div class="rf_crow1" id="label_i1" style="_height:16px; _float:none;">
     <div class="lbl">
      Revenue
     </div>
     <div class="chart_contain_free" id="chart_i1">
      <div class="chart_icon">
      </div>
     </div>
    </div>
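
To go from this markup to actual numbers, one possible approach is to read the div elements inside the block with id="baseline" and the label inside id="label_i1". The sketch below runs on a small inline sample copied from the output above rather than on a live response, and it uses the built-in "html.parser" backend so lxml is not required. That the baseline figures belong to the Revenue row is an assumption drawn from their position in the output, not something the markup states.

```python
from bs4 import BeautifulSoup

# Reduced sample copied from the output shown above.
SAMPLE_HTML = """
<div id="baseline" style="display:none">
 <div>156508000000</div>
 <div>170910000000</div>
 <div>182795000000</div>
 <div>233715000000</div>
 <div>215639000000</div>
 <div>215639000000</div>
</div>
<div class="left ">
 <div class="rf_crow1" id="label_i1">
  <div class="lbl">Revenue</div>
 </div>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Direct child divs of the baseline block, one number per column.
baseline = soup.find("div", id="baseline")
values = [
    int(div.get_text(strip=True))
    for div in baseline.find_all("div", recursive=False)
]

# Row label for the first line item.
label_row = soup.find("div", id="label_i1")
label = label_row.find("div", class_="lbl").get_text(strip=True)

print(label, values)
```

With the real response, the same calls can be made on the soup object built from html_str; other rows would need their own id values, which are not visible in the truncated output.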
