I need to download a large number of Excel files (estimated 500 to 1000) from sellercentral.amazon.de. Downloading them manually is not an option, as every download takes several clicks before the Excel file pops up.
Since Amazon cannot provide me with a simple XML describing its structure, I decided to automate this myself. The first thing that came to mind was Selenium with Firefox.
The Problem:
Sellercentral requires a login as well as two-factor authentication (2FA). Once I have logged in, I can open another tab, enter sellercentral.amazon.de and am instantly logged in.
I can even open another instance of the browser and be instantly logged in there too. They might be using session cookies. The target URL to "scrape" is https://sellercentral.amazon.de/listing/download?ref=ag_dnldinv_apvu_newapvu .
But when I open the URL from my Python script with Selenium WebDriver, a new browser instance is launched in which I am not logged in, even though other Firefox instances in which I am logged in are running at the same time. So I guess the instances launched by Selenium are somehow separate from my normal ones.
What I've tried:
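I tried adding a delay after the first .get() (which opens the site), logging in manually during the delay, and then calling .get() again. My first attempt seemed to run forever: time.sleep() takes seconds, and I had passed 30000, which is more than eight hours. With a delay in seconds, and the locator call updated for Selenium 4, the script looks like this:
from selenium import webdriver
from selenium.webdriver.common.by import By
import time

URL = "https://sellercentral.amazon.de/listing/download?ref=ag_dnldinv_apvu_newapvu"

browser = webdriver.Firefox()
# Open the page; the fresh Selenium instance is not logged in yet
browser.get(URL)
# Leave time for the manual login including 2FA (time.sleep() takes seconds)
time.sleep(120)
# Reload the target page now that the session is authenticated
browser.get(URL)
# find_elements_by_tag_name() was removed in Selenium 4; use By.TAG_NAME
elements = browser.find_elements(By.TAG_NAME, "browse-node-component")
print(elements)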
What am I looking for?
I need a way to use the two-factor authentication token from Google Authenticator.
I want Selenium to open a tab in the existing Firefox instance in which I have already logged in. Then no login should be required, and the "scraping" and downloading can be done.
If there's no direct way, maybe someone can come up with a workaround?
I know Selenium cannot handle the downloads itself, as the save dialogs are no longer part of the page. I'll fix that when I get there.
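To make the 2FA point in the list above concrete: Google Authenticator codes are standard time-based one-time passwords (TOTP, RFC 6238), so a script can compute them if it has the Base32 secret that was shown when 2FA was set up. The following is only a minimal sketch using the standard library. The function name totp and its parameters are my own, and it assumes the usual defaults (HMAC-SHA1, 30-second steps, 6 digits), which I have not verified for Seller Central. Keeping the secret next to the script also weakens the second factor, so it should be stored with care.

```python
import base64
import hashlib
import hmac
import struct
import time


def totp(secret_b32, for_time=None, digits=6, step=30):
    """Compute an RFC 6238 time-based one-time password using HMAC-SHA1."""
    # Authenticator secrets are Base32; tolerate spaces, lowercase
    # letters and missing padding.
    cleaned = secret_b32.replace(" ", "").upper()
    cleaned += "=" * (-len(cleaned) % 8)
    key = base64.b32decode(cleaned)

    if for_time is None:
        for_time = time.time()

    # The moving factor is the number of whole time steps since the epoch.
    counter = int(for_time) // step
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()

    # Dynamic truncation (RFC 4226): the low nibble of the last byte
    # selects a 4-byte window, whose top bit is masked off.
    offset = digest[-1] & 0x0F
    number = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF

    return str(number % 10 ** digits).zfill(digits)
```

The returned string could then be typed into the 2FA field with send_keys() on whatever element the login page uses; I have not inspected that page, so the locator is left open here.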
Important Side-Notes:
Firefox is not a given! I'll gladly accept a solution for any browser.
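Since the browser is open, here is a rough Firefox sketch of the workaround I have in mind: start Selenium on an existing profile directory instead of a fresh temporary one, and set download preferences so that Excel files are saved without a dialog. It assumes Selenium 4 and a recent geckodriver. The helper names download_prefs and make_firefox and both directory parameters are hypothetical. I have not confirmed that the Seller Central session survives this way, and Firefox locks a profile that is in use, so the regular browser would probably have to be closed first (or a copy of the profile used).

```python
# MIME types commonly used for .xls and .xlsx downloads.
EXCEL_MIME_TYPES = [
    "application/vnd.ms-excel",
    "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
]


def download_prefs(download_dir):
    """Firefox preferences that save Excel files without asking."""
    return {
        # 2 = use the custom directory given in browser.download.dir
        "browser.download.folderList": 2,
        "browser.download.dir": download_dir,
        "browser.download.manager.showWhenStarting": False,
        # Skip the "open or save" dialog for these content types.
        "browser.helperApps.neverAsk.saveToDisk": ",".join(EXCEL_MIME_TYPES),
    }


def make_firefox(profile_dir, download_dir):
    """Start Firefox on an existing profile (hypothetical helper)."""
    # Imported here so the preference helper works without Selenium.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    options = Options()
    # "-profile <dir>" is Firefox's own command-line switch.
    options.add_argument("-profile")
    options.add_argument(profile_dir)
    for name, value in download_prefs(download_dir).items():
        options.set_preference(name, value)
    return webdriver.Firefox(options=options)
```

Calling make_firefox() with the path of the profile I logged in with (listed on about:profiles) and a target folder would return a driver to call .get() on. Chrome has a comparable --user-data-dir switch, in case that route turns out to be easier.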