I'm trying to write a script to test for the existence of a web page; it would be nice if it checked without downloading the whole page.
This is my jumping-off point. I've seen multiple examples use httplib in the same way; however, every site I check simply returns False.
import httplib
from httplib import HTTP
from urlparse import urlparse
def checkUrl(url):
    p = urlparse(url)
    h = HTTP(p[1])
    h.putrequest('HEAD', p[2])
    h.endheaders()
    return h.getreply()[0] == httplib.OK

if __name__ == "__main__":
    print checkUrl("http://www.stackoverflow.com") # True
    print checkUrl("http://stackoverflow.com/notarealpage.html") # False
Any ideas?
Edit
Someone suggested this, but their post was deleted... does urllib2 avoid downloading the whole page?
import urllib2

def checkUrl(some_url):
    try:
        urllib2.urlopen(some_url)
        return True
    except urllib2.URLError:
        return False
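If a plain urlopen does pull down the body, my guess is that something like this would force a HEAD request instead (an untested sketch; HeadRequest and checkUrlHead are just names I made up):

import urllib2

class HeadRequest(urllib2.Request):
    # urlopen() asks get_method() which HTTP verb to send, so override it
    def get_method(self):
        return "HEAD"

def checkUrlHead(url):
    try:
        urllib2.urlopen(HeadRequest(url))
        return True
    except urllib2.HTTPError:
        # 4xx/5xx responses are raised as HTTPError
        return False
    except urllib2.URLError:
        # DNS failures, refused connections, etc.
        return False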
See Question&Answers more detail:
os 与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…