I am trying to scrape both Instagram and Twitter based on geolocation.
I can run a search query, but I am having trouble reloading the page to load more results and storing the fields in a DataFrame.
I did find a couple of examples for scraping Twitter and Instagram without API keys, but they work with #tag keywords.
I am trying to scrape by geolocation and between past dates. This is as far as I have come, writing the code in Python 3.x with the latest package versions in Anaconda.
'''
Instagram - Components
"id": "1478232643287060472",
"dimensions": {"height": 1080, "width": 1080},
"owner": {"id": "351633262"},
"thumbnail_src": "https://instagram.fdel1-1.fna.fbcdn.net/t51.2885-15/s640x640/sh0.08/e35/17439262_973184322815940_668652714938335232_n.jpg",
"is_video": false,
"code": "BSDvMHOgw_4",
"date": 1490439084,
"taken-at=213385402"
"display_src": "https://instagram.fdel1-1.fna.fbcdn.net/t51.2885-15/e35/17439262_973184322815940_668652714938335232_n.jpg",
"caption": "Hakuna jambo zuri kama kumpa Mungu shukrani kwa kila jambo.. ud83dude4fud83cudffe
Its weekend
#lifeistooshorttobeunhappy
#Godisgood
#happysoul ud83dude00",
"comments": {"count": 42},
"likes": {"count": 3813}},
'''
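For reference, those fields came from the JSON that Instagram embedded in the location page's HTML. Here is a minimal sketch of pulling them out, assuming the page still ships a "window._sharedData = {...};" script tag and the key path used below (both are assumptions about the page structure at the time and may have changed):

import json
import re
import requests

instaURL = "https://www.instagram.com/explore/locations/213385402/"
html = requests.get(instaURL).text

# Assumption: the page embeds its data as "window._sharedData = {...};</script>"
match = re.search(r"window\._sharedData\s*=\s*(.*?);</script>", html, re.DOTALL)
if match:
    shared = json.loads(match.group(1))
    # Assumed key path to the media nodes of a location page.
    location = shared["entry_data"]["LocationsPage"][0]["location"]
    for node in location["media"]["nodes"]:
        print(node["code"], node["date"], node["likes"]["count"], node.get("caption", ""))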
from selenium import webdriver
from bs4 import BeautifulSoup
import pandas as pd

#geotags = pd.read_csv("geocodes.csv")
# URL-encoded search: geocode:35.68501,139.7514,30km since:2016-03-01 until:2016-03-02
query = 'geocode%3A35.68501%2C139.7514%2C30km%20since:2016-03-01%20until:2016-03-02&f=tweets'
twitterURL = 'https://twitter.com/search?q=' + query
#instaURL = "https://www.instagram.com/explore/locations/213385402/"

browser = webdriver.Firefox()
browser.get(twitterURL)
content = browser.page_source
soup = BeautifulSoup(content, "html.parser")
print(soup)
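The part I am stuck on is loading more results. One approach I am considering is to scroll the results page with Selenium so the infinite-scroll stream injects more tweets, then collect them into a pandas DataFrame and write that to CSV. A rough sketch; the CSS selectors (li.js-stream-item, p.tweet-text, span.username) are assumptions based on Twitter's old web markup and may break:

import time
import pandas as pd
from bs4 import BeautifulSoup
from selenium import webdriver

query = 'geocode%3A35.68501%2C139.7514%2C30km%20since:2016-03-01%20until:2016-03-02&f=tweets'
browser = webdriver.Firefox()
browser.get('https://twitter.com/search?q=' + query)

# Scroll to the bottom a few times so the infinite-scroll stream loads more tweets.
for _ in range(5):
    browser.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    time.sleep(3)  # give the next batch time to render

soup = BeautifulSoup(browser.page_source, "html.parser")
browser.quit()

# Assumed selectors for the old Twitter search results markup.
rows = []
for tweet in soup.select("li.js-stream-item"):
    user = tweet.select_one("span.username")
    text = tweet.select_one("p.tweet-text")
    rows.append({
        "user": user.get_text(strip=True) if user else None,
        "text": text.get_text(strip=True) if text else None,
    })

df = pd.DataFrame(rows)
df.to_csv("tweets_geocode.csv", index=False)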
For the Twitter search query I was getting a syntax error; the query string has to be a quoted Python string, as in the code above.
For Instagram I am not getting any error, but I am not able to load more posts or write the fields back to a CSV/DataFrame.
I am also trying to search by latitude and longitude in both Twitter and Instagram.
I have a list of geo coordinates in a CSV; I can use that as input or build a search query per coordinate, as in the sketch below.
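A rough sketch of building one Twitter search URL per row of the CSV, assuming geocodes.csv has "lat" and "lon" columns (adjust to the actual column names):

import urllib.parse
import pandas as pd

# Assumption: geocodes.csv has one row per location with "lat" and "lon" columns.
geotags = pd.read_csv("geocodes.csv")
radius = "30km"
since, until = "2016-03-01", "2016-03-02"

search_urls = []
for _, row in geotags.iterrows():
    raw = "geocode:{},{},{} since:{} until:{}".format(row["lat"], row["lon"], radius, since, until)
    # Percent-encode the raw query so it can be appended to the search URL.
    search_urls.append("https://twitter.com/search?q=" + urllib.parse.quote(raw) + "&f=tweets")

for url in search_urls:
    print(url)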
Any way to complete the scraping by location would be appreciated.
Thanks for the help!