Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share

php - YouTube Data API v3 pageToken for arbitrary page

Another question on SO revealed that pageTokens are identical for different searches, provided that the page number and maxResults settings are the same.

Version 2 of the API let you go to any arbitrary page by setting a start position, but v3 only provides next and previous tokens. There's no jumping from page 1 to page 5, even if you know there are 5 pages of results.

So how do we work around this?

1 Reply

In the small result sets I've tested, a YouTube pageToken is six characters long. Here's what I've been able to determine about the format:

char 1: Always 'C' that I've seen.
char 2-3: Encoded start position.
char 4-5: Always 'QA' that I've seen.
char 6: 'A' means list items in a position greater than or equal to the start position. 'Q' means list items before the start position.

Due to the nature of character 6, there are two different ways to represent the same page. Given maxResults=1, page 2 can be reached by setting the page token to either "CAEQAA" or "CAIQAQ". The first means to start at result number 2 (represented by characters 2-3 "AE") and list 1 item. The second means to return the one item before result number 3 (represented by characters 2-3 "AI").
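
The two tokens above can be inspected directly. The sketch below rests on an assumption that YouTube's documentation does not state: that the token is ordinary base64 text with its padding stripped. The helper name token_bytes is made up for illustration. Under that assumption, the second byte holds the start position counted from zero and the fourth byte holds the direction flag.

```php
<?php
// Hypothetical helper: decode a six-character pageToken into its raw bytes.
// ASSUMPTION: the token is standard base64 with the trailing "=" removed.
function token_bytes(string $token): array {
    // Restore the padding so the length is a multiple of 4.
    $padded = $token . str_repeat('=', (4 - strlen($token) % 4) % 4);
    // unpack('C*') yields a 1-indexed array of unsigned bytes; reindex from 0.
    return array_values(unpack('C*', base64_decode($padded)));
}

foreach (['CAEQAA', 'CAIQAQ'] as $token) {
    [$tag1, $position, $tag2, $flag] = token_bytes($token);
    echo "$token: position byte = $position, direction byte = $flag" . PHP_EOL;
}
// CAEQAA: position byte = 1, direction byte = 0
// CAIQAQ: position byte = 2, direction byte = 1
```

A position byte of 1 corresponds to result number 2, and a position byte of 2 with the flag set corresponds to "before result number 3", which agrees with the reading given above.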

Characters 2-3 are a strange base 16 encoding of the start position, counted from zero (so "AA" is result 1 and "AE" is result 2). Character 2 is the high digit and character 3 is the low digit.

Character 3 uses a list from A-Z, then a-z, then 0-9 and increments by 4 in the list for each increase of 1. The series is A,E,I,M,Q,U,Y,c,g,k,o,s,w,0,4,8. Character 2 goes from A to B to C to D and so on, advancing one letter each time character 3 wraps around. For my purposes, I'm not working with large result sets, so I haven't bothered to see what happens to the second character beyond a couple hundred results. Perhaps someone working with larger sets will provide an update as to how character 2 behaves after that.
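
The series for character 3 can be reproduced by taking every fourth symbol of the standard base64 alphabet. Here is a minimal sketch, assuming (my reading, not something YouTube documents) that characters 2-3 are plain base64 digits. The names $alphabet and start_chars are illustrative only, and the mapping has only been checked against the values quoted in this answer.

```php
<?php
// The standard base64 alphabet: A-Z, a-z, 0-9, then "+" and "/".
$alphabet = implode('', array_merge(range('A', 'Z'), range('a', 'z'), range('0', '9'))) . '+/';

// Every fourth symbol, starting at "A", gives the 16 values of character 3.
$series = [];
for ($i = 0; $i < 64; $i += 4) {
    $series[] = $alphabet[$i];
}
echo implode(',', $series) . PHP_EOL;
// A,E,I,M,Q,U,Y,c,g,k,o,s,w,0,4,8

// Hypothetical helper: characters 2-3 for a zero-based start position.
function start_chars(int $position, string $alphabet): string {
    $high = intdiv($position, 16); // character 2 advances once per 16 results
    $low  = $position % 16;        // character 3 steps 4 places per result
    return $alphabet[$high] . $alphabet[$low * 4];
}

echo start_chars(1, $alphabet) . PHP_EOL; // AE
echo start_chars(2, $alphabet) . PHP_EOL; // AI
```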

Since the string only contains a start position and an option for ">=" or "<", the same string is used in multiple cases. For instance, with 2 results per page, the start position of the second page is result 3. The pageToken for this is "CAIQAA". This is identical to the token for the third page with one result per page.
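
To make the overlap concrete, here is a small sketch that builds the "greater than or equal" token from a page size and page number and prints the two cases just mentioned. The function name forward_token is hypothetical, and the layout is the one described in this answer, so it is only meant for small result sets.

```php
<?php
// Sketch of the token layout described above (small result sets only).
function forward_token(int $limit, int $page): string {
    $position = ($page - 1) * $limit;        // start position, counted from zero
    $third = str_split('AEIMQUYcgkosw048');  // the 16 values of character 3
    return 'C'
        . chr(ord('A') + intdiv($position, 16))
        . $third[$position % 16]
        . 'QAA';
}

echo forward_token(2, 2) . PHP_EOL; // CAIQAA (page 2, two results per page)
echo forward_token(1, 3) . PHP_EOL; // CAIQAA (page 3, one result per page)
echo forward_token(1, 2) . PHP_EOL; // CAEQAA (page 2, one result per page)
```

Both of the first two calls start at result 3, so they produce the same string. The page size itself travels in maxResults, not in the token.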

Since I'm primarily a PHP person, here's the function I'm using to get the pageToken for a given page:

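function token($limit, $page) {
    // Zero-based offset of the first result on the page (result 1 is offset 0).
    $offset = ($page - 1) * $limit;
    // The 16 possible values of character 3: A,E,I,M,Q,U,Y,c,g,k,o,s,w,0,4,8.
    $third_chars = array_merge(
            range("A","Z",4),
            range("c","z",4),
            range(0,9,4));
    // Character 2 counts whole groups of 16; character 3 is the remainder.
    // Both are derived from the same offset so they stay in step at 16, 32, ...
    return 'C'.
           chr(ord('A') + intdiv($offset, 16)).
           $third_chars[$offset % 16].
           'QAA';
}
$limit = 1;
echo "With $limit result(s) per page...".PHP_EOL;
for ($i = 1; $i < 6; ++$i) {
    echo "The token for page $i is ".token($limit, $i).PHP_EOL;
}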

Please test this function in your project and update the rest of us if you find a flaw or an enhancement since YouTube hasn't provided us with an easy way to do this.
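
As one possible enhancement, here is a sketch that does not depend on the six-character layout. It rests on an assumption that YouTube has not confirmed: that the token is base64 (padding removed) of the bytes 0x08, the start position, 0x10, and the direction flag, with each number stored as a protobuf-style varint. The names varint and token_from_position are hypothetical. It reproduces the six-character tokens quoted above. The longer token it produces for larger positions has not been tested against the API here.

```php
<?php
// Encode a non-negative integer as a protobuf-style varint: 7 bits per byte,
// with the high bit set on every byte except the last.
function varint(int $n): string {
    $out = '';
    do {
        $byte = $n & 0x7F;
        $n >>= 7;
        if ($n > 0) {
            $byte |= 0x80;
        }
        $out .= chr($byte);
    } while ($n > 0);
    return $out;
}

// Hypothetical generalisation: build a token from a zero-based start position.
// ASSUMPTION: token = base64(0x08, <position>, 0x10, <flag>) without padding.
// The standard base64 alphabet is assumed as well.
function token_from_position(int $position, bool $before = false): string {
    $bytes = "\x08" . varint($position) . "\x10" . varint($before ? 1 : 0);
    return rtrim(base64_encode($bytes), '=');
}

echo token_from_position(1) . PHP_EOL;       // CAEQAA
echo token_from_position(2, true) . PHP_EOL; // CAIQAQ
echo token_from_position(150) . PHP_EOL;     // CJYBEAA
```

For a given page, the position is ($page - 1) * $limit, the same offset the function above starts from.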