Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


html - Python: Doesn't get to the 2nd iteration of For loop

I'm trying to have Python loop through multiple pages, by using an incrementing page number at the end of the URL address.

# import get to call a get request on the site

from bs4 import BeautifulSoup
import requests
from warnings import warn

response1 = requests.get('https://lasvegas.craigslist.org/search/mcy?purveyor-input=owner&hasPic=1')  # get rid of those lame-o's that post a housing option without a pic using their filter
html_soup = BeautifulSoup(response1.text, 'html.parser')

results_num = html_soup.find('div', class_='search-legend')
results_total = int(results_num.find('span',class_='totalcount').text)  # pulled the total count of posts as the upper bound # of the pages array

pages = np.arange(0, results_total + 1, 120)
iterations = 0
print(pages)

for page in pages:
    response2 = requests.get("https://lasvegas.craigslist.org/search/mcy?purveyor-input=owner&hasPic=1"
                             + "&s="  # the parameter for defining the page number
                             + str(page))  # the page number in the pages array from earlier

    if response2.status_code != 200:
        warn('Request: {}; Status code: {}'.format(requests, response2.status_code))
iterations = iterations + 1


print(response2)

The code runs without any run-time error, but it doesn't seem to move on to the 2nd page; it appears to stop at the end of the 1st iteration. I am pulling my hair out and don't know why that is the case.

Could someone please point me in the right direction? I expect <Response [200]> to show up 3 times, but it only shows up once.

question from:https://stackoverflow.com/questions/65880776/python-doesnt-get-to-the-2nd-iteration-of-for-loop


1 Reply


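Your code has a couple of problems. The import for the numpy module (np) is missing, and the print statement that outputs the response is incorrectly indented: it sits outside the for loop, so it runs only once, after the last iteration, even though the loop itself requests every page.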
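To see the effect of the indentation in isolation, here is a minimal sketch that needs no network access. The offsets 0, 120 and 240 and both function names are made up for illustration; they only stand in for the three pages mentioned in the question.

```python
# Stand-in values for three page offsets (hypothetical, no requests are made).
pages = [0, 120, 240]


def record_outside_loop(pages):
    """Mimic the question: the statement sits after the loop body."""
    seen = []
    for page in pages:
        current = page
    seen.append(current)  # not indented under the loop: runs once, after the last iteration
    return seen


def record_inside_loop(pages):
    """Mimic the fix: the statement is part of the loop body."""
    seen = []
    for page in pages:
        current = page
        seen.append(current)  # indented under the loop: runs once per page
    return seen


print(record_outside_loop(pages))  # [240]
print(record_inside_loop(pages))   # [0, 120, 240]
```

In both versions the loop visits every page; only the position of the final statement decides how many times it runs.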

The script below works as expected:

from bs4 import BeautifulSoup
import requests
from warnings import warn
import numpy as np

response1 = requests.get('https://lasvegas.craigslist.org/search/mcy?purveyor-input=owner&hasPic=1')  # get rid of those lame-o's that post a housing option without a pic using their filter
html_soup = BeautifulSoup(response1.text, 'html.parser')

results_num = html_soup.find('div', class_='search-legend')
results_total = int(results_num.find('span',class_='totalcount').text)  # pulled the total count of posts as the upper bound # of the pages array

pages = np.arange(0, results_total + 1, 120)
iterations = 0
print(pages)

for page in pages:
    response2 = requests.get("https://lasvegas.craigslist.org/search/mcy?purveyor-input=owner&hasPic=1"
                             + "&s="  # the parameter for defining the page number
                             + str(page))  # the page number in the pages array from earlier

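    iterations = iterations + 1  # count this request

    if response2.status_code != 200:
        # warn with the request number and the unexpected status code
        warn('Request: {}; Status code: {}'.format(iterations, response2.status_code))
    print(response2)  # inside the loop, so it prints once per page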
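If numpy is only needed for the page offsets, the built-in range can produce the same values. The following is a rough sketch under that assumption: the helper names page_offsets and page_urls and the total of 300 results are hypothetical, and the URLs are only built and printed, not requested.

```python
from urllib.parse import urlencode

# Search URL taken from the question, without its query string.
BASE_URL = "https://lasvegas.craigslist.org/search/mcy"


def page_offsets(total_results, page_size=120):
    """Return the same offsets as np.arange(0, total_results + 1, page_size)."""
    return list(range(0, total_results + 1, page_size))


def page_urls(total_results, page_size=120):
    """Build one search URL per page offset (hypothetical helper)."""
    urls = []
    for offset in page_offsets(total_results, page_size):
        query = urlencode({"purveyor-input": "owner", "hasPic": 1, "s": offset})
        urls.append(BASE_URL + "?" + query)
    return urls


# 300 is a made-up total; the real value comes from the 'totalcount' span.
for url in page_urls(300):
    print(url)
```

Each printed URL could then be passed to requests.get inside the loop, exactly as in the script above.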
