Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share

ftplib - How to download a big file in Python via FTP (with monitoring & reconnect)?

UPDATE #1

The code in the question works pretty well over a stable connection (such as a local network or intranet).

UPDATE #2

I implemented an FTPClient class on top of ftplib which can:

  1. monitor download progress
  2. reconnect after a timeout or disconnect
  3. make several attempts to download a file
  4. show the current download speed.

After reconnecting, it continues the download from the point of disconnection (if the FTP server supports it). For details see my answer below.
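The reconnect-and-resume idea can be sketched roughly as follows. This is not the code from the answer, only a minimal illustration under stated assumptions: `download_with_resume` and its `connect` argument are hypothetical names, and `connect` is assumed to be a callable that returns a logged-in `ftplib.FTP` object created with a short socket timeout. The sketch relies on the `rest` argument of `retrbinary`, which sends a REST command, so it only resumes if the server supports REST (and SIZE).

```python
import ftplib
import os
import time


def download_with_resume(connect, remote_name, local_path,
                         max_attempts=5, delay=1.0):
    """Download remote_name to local_path, resuming after dropped connections.

    connect is a hypothetical zero-argument callable that returns a
    connected, logged-in ftplib.FTP object.
    Returns True on success, False if every attempt failed.
    """
    for _ in range(max_attempts):
        # Bytes already on disk become the restart offset for REST.
        offset = os.path.getsize(local_path) if os.path.exists(local_path) else 0
        ftp = None
        try:
            ftp = connect()
            ftp.voidcmd('TYPE I')  # many servers answer SIZE only in binary mode
            total = ftp.size(remote_name)
            if total is not None and offset >= total:
                # Everything arrived earlier; only the final reply was lost.
                return True
            with open(local_path, 'ab') as f:
                ftp.retrbinary('RETR ' + remote_name, f.write,
                               rest=offset or None)
            return True
        except ftplib.all_errors:
            # Timeouts, resets and FTP error replies: wait, then reconnect.
            time.sleep(delay)
        finally:
            if ftp is not None:
                ftp.close()
    return False
```

Because the offset is re-read from the local file on every attempt, a transfer whose data arrived completely but whose final reply never came is recognised as finished on the next attempt instead of being downloaded again.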


Question

I have to implement a task in Python that downloads a batch of big files every day (0.3-1.5 GB per file, 200-300 files) via FTP and then processes them. I did it with ftplib, but from time to time it hangs and cannot complete the download for some files. To fix the issue I started experimenting with KEEPALIVE settings, but I still haven't got a good result:

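# Imports needed by this fragment. It is the body of a method, so self.*,
# orig_filename, local_filename, filename and file_ext are defined elsewhere.
import ftplib
import logging
import os
import socket
from contextlib import closing

with closing(ftplib.FTP()) as ftp:
    try:
        ftp.connect(self.host, self.port, 30*60)  # 30-minute timeout
        # print(ftp.getwelcome())
        ftp.login(self.login, self.passwd)
        ftp.set_pasv(True)
        # Enable TCP keepalive on the control connection
        # (TCP_KEEPINTVL and TCP_KEEPIDLE are not available on every platform).
        ftp.sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
        ftp.sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 75)
        ftp.sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 60)
        with open(local_filename, 'w+b') as f:
            res = ftp.retrbinary('RETR %s' % orig_filename, f.write)

        # Check the reply code only; the text after "226" varies between servers.
        if not res.startswith('226'):
            logging.error('Download of file {0} is not complete.'.format(orig_filename))
            os.remove(local_filename)  # the file is already closed here
            return None

        os.rename(local_filename, self.storage + filename + file_ext)
        ftp.rename(orig_filename, orig_filename + '.copied')

        return filename + file_ext

    except Exception:
        logging.exception('Error during download from FTP')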

Details

  • It usually takes 7-15 minutes to download a file.
  • The FTP server logs always show that the files were fully downloaded, but the client hangs. This happens occasionally, not every time.

Questions

  • Could this be caused by a disconnect?
  • How can I monitor the download process and reconnect if the connection is lost?

1 Reply


Because I couldn't find any good suggestions or code samples, I implemented my own solution. Many thanks to the Stack Overflow community for some of the ideas I used in my code. I put the code on GitHub (pyFTPclient) because of its size (~120 lines).

I tested the solution on poor-quality networks (including 3G mobile internet) and it worked fine for me. Of course, it may still have some bugs.
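For readers who only want the monitoring part, here is a minimal sketch of one way to track progress and speed. It is not taken from pyFTPclient; `ProgressMonitor` and its methods are hypothetical names chosen for illustration. The idea is to wrap the write callback passed to `retrbinary`, so that every received chunk updates a byte counter and a "last data seen" timestamp.

```python
import time


class ProgressMonitor:
    """Wraps a write callback; counts bytes and tracks when data last arrived.

    Hypothetical helper for illustration, not part of ftplib.
    """

    def __init__(self, write, clock=time.monotonic):
        self._write = write
        self._clock = clock
        self._start = clock()
        self.received = 0
        self.last_data = self._start

    def __call__(self, chunk):
        # Called by retrbinary for every block of downloaded data.
        self._write(chunk)
        self.received += len(chunk)
        self.last_data = self._clock()

    def speed(self):
        """Average speed in bytes per second since the monitor was created."""
        elapsed = self._clock() - self._start
        return self.received / elapsed if elapsed > 0 else 0.0

    def stalled(self, limit):
        """True if no data has arrived for more than limit seconds."""
        return self._clock() - self.last_data > limit
```

With a connected `ftp` object and an open file `f`, it would be used as `ftp.retrbinary('RETR ' + name, ProgressMonitor(f.write))`. A separate thread or timer could poll `speed()` for display and `stalled()` to decide when to drop the connection and reconnect.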

I would appreciate any comments or suggestions. Thank you in advance.

