I am having a problem web scraping a university course page

Question


Posted Feb 19, 2021 in Technique by 深蓝 (71.8m points)

Hi, I am trying to scrape a University of Reading course page: http://www.reading.ac.uk/ready-to-study/study/subject-area/modern-languages-and-european-studies-ug/ba-spanish-and-history.aspx but I am having trouble extracting the course duration. Can anyone help? I used the code below.

import re

# Assumes `soup` (the parsed course page) and the dict `course_data` already exist.
# Start from defaults so the final print never hits a missing key.
course_data['Duration'] = 'Not mentioned'
course_data['Duration_Time'] = 'Not mentioned'

duration_title = soup.find('li', text=re.compile(r'Course duration', re.IGNORECASE))
if duration_title:
    duration = duration_title.find_next_sibling('strong')
    if duration:
        duration_text = duration.get_text()
        # Match an integer or a decimal number, e.g. "3" or "0.5".
        duration_ = re.search(r"\d+(?:\.\d+)?", duration_text)
        if duration_ is not None:
            if duration_.group() == '1':
                course_data['Duration'] = duration_.group()
                course_data['Duration_Time'] = 'Year'
            elif duration_.group() == '0.5':
                # Half a year is reported as 6 months.
                course_data['Duration'] = '6'
                course_data['Duration_Time'] = 'Months'
            else:
                course_data['Duration'] = duration_.group()
                course_data['Duration_Time'] = 'Years'
print('Duration: ', str(course_data['Duration']) + ' / ' + course_data['Duration_Time'])

1 Reply

深蓝 · Answer 1 · 2021-02-19T03:52:05+0000

Replied Feb 19, 2021 by 深蓝 (71.8m points)

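Search by text only and drop the 'li' tag name. When a tag name is passed, BeautifulSoup tests the pattern against that tag's .string, which is None when the <li> has more than one child (for example a text node followed by a <strong> tag); in that case the original call returns None. Without the tag name, the search runs over the text nodes themselves: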

soup.find(text=re.compile(r'Course duration', re.IGNORECASE))
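
To see why the suggestion above helps, here is a minimal sketch run against made-up markup. The HTML in SAMPLE_HTML is an assumption for illustration only; the real page may be structured differently, so the sibling lookup may need adjusting. It uses the string= argument, which is the newer name for text= in BeautifulSoup.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup, NOT copied from the real course page.
SAMPLE_HTML = """
<ul>
  <li>Start date: <strong>September</strong></li>
  <li>Course duration: <strong>4 years</strong></li>
</ul>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
pattern = re.compile(r"Course duration", re.IGNORECASE)

# With a tag name, the pattern is tested against li.string, which is None
# here because each <li> has two children (a text node and a <strong> tag).
with_tag = soup.find("li", string=pattern)

# Without a tag name, the search runs over text nodes and returns the
# matching NavigableString itself.
label = soup.find(string=pattern)

# In this sample the <strong> is a sibling of the text node, not of the <li>.
strong = label.find_next_sibling("strong") if label else None
duration_text = strong.get_text(strip=True) if strong else None

# Pull out the first integer or decimal number, if any.
match = re.search(r"\d+(?:\.\d+)?", duration_text or "")

print("with tag name:", with_tag)
print("duration text:", duration_text)
print("number:", match.group() if match else None)
```

Because the text-only search returns a string node rather than the <li> tag, find_next_sibling('strong') is called on that node; if the page wraps the label differently, label.parent gives the enclosing tag to search from instead.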
