Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


in Technique[技术] by (71.8m points)

Scraping an element as it is stored in the database

I'm scraping a website using Python with Scrapy for a university assignment. I came across an element on the website that's supposed to hold a rating. The rating is shown as an integer, but I think it's stored as a decimal in the database.

Is there a way to get the rating as it's stored in the database? If so, what tools and concepts do I need to learn to do it?

An example of a page I'm trying to scrape: https://www.old-games.org/games/bond (written in Hebrew).
The element I'm talking about is the rating, and this is its HTML code:

<td class="game_specs">7/10</td>

The desired result is to get the actual value behind the displayed "7" as it is stored in the database, for example "7.32".
The website doesn't have an API for getting the information out of it.

I've tried to search on my own, but since I'm new to the field of web development/scraping I couldn't find any solution (probably because I don't know the terminology).

Question from: https://stackoverflow.com/questions/66064245/scraping-an-element-as-it-is-stored-in-the-database


1 Reply

0 votes
by (71.8m points)

If you're scraping a website, you can't even know whether there is a database, let alone what's in it. For all you (or your code) can tell, someone is sitting there typing out those ratings manually, or a function generates them at random each time you access the page.

Another way of looking at it: if I asked you to tell me what the "real" rating was, where would you look? You can look at the displayed page, in the HTML source, and in any JavaScript or AJAX calls the page makes. If you can find the value in any of those, you can write a scraper for it; if you can't, you can't.
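
As a rough illustration of "looking in the HTML source", here is a minimal sketch that uses only the Python standard library. It reads the displayed rating from the `game_specs` cell quoted in the question, then scans the raw source for any decimal that starts with the same integer. The names `GameSpecsParser`, `displayed_rating`, `decimal_candidates` and `SAMPLE_HTML` are made up for this example, and the sample markup is an assumption based on the one `<td>` shown in the question, not the real page.

```python
import re
from html.parser import HTMLParser


class GameSpecsParser(HTMLParser):
    """Collects the text of every <td class="game_specs"> cell."""

    def __init__(self):
        super().__init__()
        self._in_cell = False
        self.cells = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "td" and "game_specs" in classes:
            self._in_cell = True
            self.cells.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.cells[-1] += data


def displayed_rating(html):
    """Return the integer shown as 'N/10' in a game_specs cell, or None."""
    parser = GameSpecsParser()
    parser.feed(html)
    for cell in parser.cells:
        match = re.fullmatch(r"\s*(\d+)\s*/\s*10\s*", cell)
        if match:
            return int(match.group(1))
    return None


def decimal_candidates(html, shown):
    """List decimals in the raw source whose integer part equals `shown`."""
    return re.findall(rf"(?<![\d.]){shown}\.\d+", html)


# Hypothetical markup, modelled on the single cell quoted in the question.
SAMPLE_HTML = '<table><tr><td class="game_specs">7/10</td></tr></table>'

shown = displayed_rating(SAMPLE_HTML)
print(shown, decimal_candidates(SAMPLE_HTML, shown))
```

If the list of candidates comes back empty, as it does for this sample, the more precise value is simply not in what the server sent, and no scraper can recover it. In a Scrapy spider, the same check could be run against `response.text`.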

Imagine for a second that there was a special trick to read the database of any website in the world. Now you could go to Amazon and apply that trick to find the personal details of the people who left reviews, or of the suppliers who are selling marketplace items!

The owner of any website can choose what information to give you, and what to keep private. There's no way to force the website to tell you the private bits unless the person operating the site has accidentally left something available publicly that they intended to be private (and looking too hard for those mistakes may well be breaking the law).

