Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
542 views
in Technique[技术] by (71.8m points)

javascript - How to get text from textnodes seperated by whitespace using Selenium and Python

I am on this page:

https://fantasy.premierleague.com/statistics

When you click on any "i" icon next to a player, a popup window appears. Then, i want to get the surname of the player. This is how "inspect element" looks like ("whitespace" actually appears within a box):

<h2 class="ElementDialog__ElementHeading-gmefnd-2 ijAScJ">
 Kevin
 whitespace
 De Bruyne

What i want to do is to take the text that appears after the whitespace. I can get the full text (ie both name and surname) using this:

player_full_name = driver.find_element_by_xpath('//*[@class="ElementDialog__ElementHeading-gmefnd-2 ijAScJ"]').text

but how can i get the surname only (ie what appears after the whitespace)? Note that for other players it could have been like this:

<h2 class="ElementDialog__ElementHeading-gmefnd-2 ijAScJ">
 Gabriel Fernando
 whitespace
 de Jesus

or like this:

<h2 class="ElementDialog__ElementHeading-gmefnd-2 ijAScJ">
 Dean
 whitespace
 Henderson

ie splitting the text and taking the last one or two elements will not work.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

The surname of the player is the second or last text node within it's parent WebElement. So extract the surname e.g. De Bruyne from Kevin De Bruyne you can use either of the following Locator Strategies:

  • Using CSS_SELECTOR, childNodes and strip():

    driver.get("https://fantasy.premierleague.com/statistics")
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//table//tbody/tr/td/button"))).click()
    print( driver.execute_script('return arguments[0].lastChild.textContent;', WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "h2.ElementDialog__ElementHeading-gmefnd-2")))).strip())
    
  • Console Output:

    De Bruyne
    
  • Using CSS_SELECTOR, childNodes and splitlines():

    driver.get("https://fantasy.premierleague.com/statistics")
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//table//tbody/tr/td/button"))).click()
    print( driver.execute_script('return arguments[0].lastChild.textContent;', WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "h2.ElementDialog__ElementHeading-gmefnd-2")))).splitlines())
    
  • Console Output:

    ['De Bruyne']
    
  • Note : You have to add the following imports :

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    

References

You can find a couple of relevant detailed discussions in:


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...