Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - Not getting desired XPath

To summarise: how can I get an XPath that scrapes the odds in my script?

I need an XPath that returns a different value for each element.

groups = ".//div[contains(@class, 'gl-ParticipantOddsOnlyDarker gl')]"

xp_ba3 = ".//span[contains(@class, 'gl-Participa')]"

The XPaths for groups and xp_ba3 need to return the same number of elements, each with a different value, for the script to behave correctly (I believe).

Desired XPath looks like:

//div[contains(@class, 'gl-Market_HasLabels')]/following-sibling::div[contains(@class, 'gl-Market_PWidth-12-3333')][1]//div[contains(@class, 'gl-ParticipantOddsOnly')]

This XPath works on its own, but when I add it to the script and run it, it does not work.

Webpage odds I am after.

My output instead of different odds looks like:

[['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87'], ['2.87']]

How can I get the odds to work?

HTML from the website:

<div class="gl-ParticipantOddsOnlyDarker gl-ParticipantOddsOnly gl-Participant_General sl-MarketCouponAdvancedBase_LastChild">
    <span class="gl-ParticipantOddsOnly_Odds">2.45</span>
</div>


1 Reply


The main problem is that their HTML is laid out in columns instead of rows, which makes it harder to line up the related data (each game with its odds).

For the game names, you can use a CSS selector:

div.sl-CouponParticipantWithBookCloses_NameContainer

For the odds in the '1' column, you need to use an XPath:

//div[contains(@class,'sl-MarketCouponValuesExplicit33')][./div[contains(@class,'gl-MarketColumnHeader')][.='1']]//span[@class='gl-ParticipantOddsOnly_Odds']

The XPath looks for the DIVs that are the parent containers for the columns. It then keeps only the one whose header DIV has the text '1', and gets the odds SPANs from inside that one column.
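
To make those three steps concrete, here is a rough sketch that walks the same logic by hand in plain Python. It is not the Selenium call itself. The class names come from the XPath above, but the SAMPLE markup is a made-up, simplified stand-in for the real page, and odds_for_column is a hypothetical helper name. The standard library's ElementTree is used only so the example runs without a browser; it has no contains() in its XPath subset, which is why the checks are written out as Python.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup: one container div per column, each with a
# header div followed by its odds. The real page has more wrappers and classes.
SAMPLE = """
<div>
  <div class="sl-MarketCouponValuesExplicit33 gl-Market">
    <div class="gl-MarketColumnHeader">1</div>
    <div class="gl-ParticipantOddsOnly"><span class="gl-ParticipantOddsOnly_Odds">2.45</span></div>
    <div class="gl-ParticipantOddsOnly"><span class="gl-ParticipantOddsOnly_Odds">1.90</span></div>
  </div>
  <div class="sl-MarketCouponValuesExplicit33 gl-Market">
    <div class="gl-MarketColumnHeader">X</div>
    <div class="gl-ParticipantOddsOnly"><span class="gl-ParticipantOddsOnly_Odds">3.10</span></div>
    <div class="gl-ParticipantOddsOnly"><span class="gl-ParticipantOddsOnly_Odds">3.40</span></div>
  </div>
</div>
"""


def odds_for_column(root, header_text):
    """Return the odds strings from the column whose header equals header_text."""
    odds = []
    # Step 1: //div[contains(@class,'sl-MarketCouponValuesExplicit33')]
    for column in root.iter("div"):
        if "sl-MarketCouponValuesExplicit33" not in column.get("class", ""):
            continue
        # Step 2: [./div[contains(@class,'gl-MarketColumnHeader')][.='1']]
        # Only direct child divs count, and the full text must match exactly.
        has_header = any(
            "gl-MarketColumnHeader" in child.get("class", "")
            and "".join(child.itertext()) == header_text
            for child in column.findall("div")
        )
        if not has_header:
            continue
        # Step 3: //span[@class='gl-ParticipantOddsOnly_Odds'] inside that column.
        for span in column.iter("span"):
            if span.get("class") == "gl-ParticipantOddsOnly_Odds":
                # Whitespace is trimmed here for convenience.
                odds.append("".join(span.itertext()).strip())
    return odds


root = ET.fromstring(SAMPLE)
print(odds_for_column(root, "1"))  # ['2.45', '1.90']
```

In the actual script you would pass the XPath string straight to Selenium and read the text of each returned element; the walk-through above is only meant to show why the header filter stops odds from the other columns leaking in.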

You might want to add some basic validation that the two locators return the same number of elements; otherwise your odds will likely not line up with the right games. Right now both locators return the same number of elements.
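
A minimal sketch of that validation, assuming you have already pulled the text out of both sets of elements into two plain lists. pair_games_with_odds and the sample values are hypothetical and only there for illustration.

```python
def pair_games_with_odds(names, odds):
    """Pair each game name with its odds, refusing to guess on a mismatch."""
    if len(names) != len(odds):
        # A count mismatch means at least one game would get the wrong odds.
        raise ValueError(
            f"got {len(names)} game names but {len(odds)} odds; "
            "the two locators are out of step"
        )
    # Both lists are in page order, so position i in one matches position i
    # in the other.
    return list(zip(names, odds))


# Made-up values, standing in for the text of the located elements.
pairs = pair_games_with_odds(["Team A v Team B", "Team C v Team D"], ["2.45", "1.90"])
for game, price in pairs:
    print(game, price)
```

Raising instead of silently zipping matters here because zip() stops at the shorter list, so a missing element would shift or drop rows without any warning.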

