Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
239 views
in Technique[技术] by (71.8m points)

java - 403 rate limit on insert sometimes succeeds

I insert 100 files in a loop. For this test I have DISABLED backoff and retry, so if an insert fails with a 403, I ignore it and proceed with the next file. Out of 100 files, I get 63 403 rate limit exceptions.

However, on checking Drive, of those 63 failures, 3 actually succeeded, ie. the file was created on drive. Had I done the usual backoff and retry, I would have ended up with duplicated inserts. This confirms the behaviour I was seeing with backoff-retry enabled, ie. from my 100 file test, I am consistently seeing 3-4 duplicate insertions.

It smells like there is an asynchronous connection between the API endpoint server and the Drive storage servers which is causing non-deterministic results, especially on high volume writes.

Since this means I can't rely on "403 rate limit" to throttle my inserts, I need to know what is a safe insert rate so as not to trigger these timing bugs.

Running the code below, gives ...

Summary...
File insert attempts (a)       = 100
rate limit errors (b)          = 31
expected number of files (a-b) = 69
Actual number of files         = 73 

code...

package com.cnw.test.servlets;

import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import com.google.api.client.auth.oauth2.Credential;
import com.google.api.client.googleapis.json.GoogleJsonError;
import com.google.api.client.googleapis.json.GoogleJsonResponseException;
import com.google.api.client.http.javanet.NetHttpTransport;
import com.google.api.client.json.jackson.JacksonFactory;
import com.google.api.services.drive.Drive;
import com.google.api.services.drive.model.ChildList;
import com.google.api.services.drive.model.File;
import com.google.api.services.drive.model.File.Labels;
import com.google.api.services.drive.model.ParentReference;

import couk.cleverthinking.cnw.oauth.CredentialMediatorB;
import couk.cleverthinking.cnw.oauth.CredentialMediatorB.InvalidClientSecretsException;

@SuppressWarnings("serial")
    /**
     * 
     * AppEngine servlet to demonstrate that Drive IS performing an insert despite throwing a 403 rate limit exception.
     * 
     * All it does is create a folder, then loop to create x files. Any 403 rate limit exceptions are counted.
     * At the end, compare the expected number of file (attempted - 403) vs. the actual.
     * In a run of 100 files, I consistently see between 1 and 3 more files than expected, ie. despite throwing a 403 rate limit,
     * Drive *sometimes* creates the file anyway.
     * 
     * To run this, you will need to ...
     * 1) enter an APPNAME above
     * 2) enter a google user id above
     * 3) Have a valid stored credential for that user
     * 
     * (2) and (3) can be replaced by a manually constructed Credential 
     * 
     * Your test must generate rate limit errors, so if you have a very slow connection, you might need to run 2 or 3 in parallel. 
     * I run the test on a medium speed connection and I see 403 rate limits after 30 or so inserts.
     * Creating 100 files consistently exposes the problem.
     * 
     */
public class Hack extends HttpServlet {

    private final String APPNAME = "MyApp";  // ENTER YOUR APP NAME
    private final String GOOGLE_USER_ID_TO_FETCH_CREDENTIAL = "11222222222222222222222"; //ENTER YOUR GOOGLE USER ID
    @Override
    public void doGet(HttpServletRequest request, HttpServletResponse response) throws IOException {
        /*
         *  set up the counters
         */
        // I run this as a servlet, so I get the number of files from the request URL
        int numFiles = Integer.parseInt(request.getParameter("numfiles"));
        int fileCount = 0;
        int ratelimitCount = 0;

        /*
         * Load the Credential
         */
        CredentialMediatorB cmb = null;
        try {
            cmb = new CredentialMediatorB(request);
        } catch (InvalidClientSecretsException e) {
            e.printStackTrace();
        }
        // this fetches a stored credential, you might choose to construct one manually
        Credential credential = cmb.getStoredCredential(GOOGLE_USER_ID_TO_FETCH_CREDENTIAL);

        /*
         * Use the credential to create a drive service
         */
        Drive driveService = new Drive.Builder(new NetHttpTransport(), new JacksonFactory(), credential).setApplicationName(APPNAME).build();

        /* 
         * make a parent folder to make it easier to count the files and delete them after the test
         */
        File folderParent = new File();
        folderParent.setTitle("403parentfolder-" + numFiles);
        folderParent.setMimeType("application/vnd.google-apps.folder");
        folderParent.setParents(Arrays.asList(new ParentReference().setId("root")));
        folderParent.setLabels(new Labels().setHidden(false));
        driveService.files().list().execute();
        folderParent = driveService.files().insert(folderParent).execute();
        System.out.println("folder made with id = " + folderParent.getId());

        /*
         * store the parent folder id in a parent array for use by each child file
         */
        List<ParentReference> parents = new ArrayList<ParentReference>();
        parents.add(new ParentReference().setId(folderParent.getId()));

        /*
         * loop for each file
         */
        for (fileCount = 0; fileCount < numFiles; fileCount++) {
            /*
             * make a File object for the insert
             */
            File file = new File();
            file.setTitle("testfile-" + (fileCount+1));
            file.setParents(parents);
            file.setDescription("description");
            file.setMimeType("text/html");

            try {
                System.out.println("making file "+fileCount + " of "+numFiles);
                // call the drive service insert execute method 
                driveService.files().insert(file).setConvert(false).execute();
            } catch (GoogleJsonResponseException e) {
                GoogleJsonError error = e.getDetails();
                // look for rate errors and count them. Normally one would expo-backoff here, but this is to demonstrate that despite
                // the 403, the file DID get created
                if (error.getCode() == 403 && error.getMessage().toLowerCase().contains("rate limit")) {
                    System.out.println("rate limit exception on file " + fileCount + " of "+numFiles);
                    // increment a count of rate limit errors
                    ratelimitCount++;
                } else {
                    // just in case there is a different exception thrown
                    System.out.println("[DbSA465] Error message: " + error.getCode() + " " + error.getMessage());
                }
            }
        }

        /* 
         * all done. get the children of the folder to see how many files were actually created
         */
        ChildList children = driveService.children().list(folderParent.getId()).execute();

        /*
         * and the winner is ...
         */
        System.out.println("
Summary...");
        System.out.println("File insert attempts (a)       = " + numFiles);
        System.out.println("rate limit errors (b)          = " + ratelimitCount);
        System.out.println("expected number of files (a-b) = " + (numFiles - ratelimitCount));
        System.out.println("Actual number of files         = " + children.getItems().size() + " NB. There is a limit of 100 children in a single page, so if you're expecting more than 100, need to follow nextPageToken");
    }
}
See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

I'm assuming you're trying to do Parallel downloads...

This may not be an answer you're looking for, but this is what I've experienced in my interactions with google drive api. I use C#, so it's a bit different, but maybe it'll help.

I had to set a specific amount of threads to run at one time. If I let my program run all 100 entries at one time as separate threads, I run into the rate limit error as well.

I don't know well at all, but in my C# program, I run 3 threads (definable by the user, 3 is default)

opts = new ParallelOptions { MaxDegreeOfParallelism = 3 };
var checkforfinished = 
Parallel.ForEach(lstBackupUsers.Items.Cast<ListViewItem>(), opts, name => {
{ // my logic code here }

I did a quick search and found that Java 8 (not sure if that's what you're using) supports Parallel().forEach(), maybe that'd help you. The resource I found for this is at: http://radar.oreilly.com/2015/02/java-8-streams-api-and-parallelism.html

Hope this helps, taking my turns trying to help others on SO as people have helped me!


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...