Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


javascript - Getting requests from a website and retrieving the response?

I am trying to monitor a website (www.bidcactus.com). While on the website I open up Firebug, go to the net tab, and click the XHR tab.

I want to take the responses to those requests and save them to a MySQL database (I have a local one running on my computer via XAMPP).

I have been told to do a variety of things, mainly using jQuery or JavaScript, but I'm not experienced with either, so I was wondering if anyone can help me out here.

Someone suggested this link to me: Using Greasemonkey and jQuery to intercept JSON/AJAX data from a page, and process it

It's using Greasemonkey as well, which I don't know much about either...

Thanks in advance for any help

Example/more detail:
While monitoring the requests sent (via Firebug), I see the following:

http://www.bidcactus.com/CactusWeb/ItemUpdates?rnd=1310684278585
The response of this link is the following:
{"s":"uk5c","a":[{"w":"MATADORA","t":944,"p":5,"a":413173,"x":10},   
{"w":"1000BidsAintEnough","t":6,"p":863,"a":413198,"x":0}, 
{"w":"YourBidzWillBeWastedHere","t":4725,"p":21,"a":413200,"x":8}, 
{"w":"iwillpay2much","t":344,"p":9,"a":413201,"x":9}, 
{"w":"apcyclops84","t":884,"p":3,"a":413213,"x":14}, 
{"w":"goin_postal","t":165,"p":5,"a":413215,"x":12}, 
{"w":"487951","t":825,"p":10,"a":413218,"x":6}, 
{"w":"mishmash","t":3225,"p":3,"a":413222,"x":7}, 
{"w":"CrazyKatLady2","t":6464,"p":1,"a":413224,"x":2}, 
{"w":"BOSS1","t":224,"p":102,"a":413230,"x":4}, 
{"w":"serbian48","t":62,"p":2,"a":413232,"x":11}, 
{"w":"Tuffenough","t":1785,"p":1,"a":413234,"x":1}, 
{"w":"apcyclops84","t":1970,"p":1,"a":413240,"x":13}, 
{"w":"Tuffenough","t":3524,"p":1,"a":413244,"x":5}, 
{"w":"Cdm17517","t":1424,"p":1,"a":413252,"x":3}],"tau":"0"}

I understand what this information means and I think I could format it myself; however, the website keeps creating new requests with a different rnd value each time.
Example http://www.bidcactus.com/CactusWeb/ItemUpdates?rnd=XXXXXXXXXXXX
and I'm not sure how it creates them.

So I need to get the response for every request that is for item updates and send the information to a MySQL database.


1 Reply


OK, here's working code, somewhat tuned for that site (front page, no account, only).

Instructions for use:

  1. Install the GM script. Note that it is Firefox only, for now.

  2. Observe it running in Firebug's console, and tune the filter section (clearly marked), to target the data you are interested in. (Maybe the whole a array?)

    Note that it can take several seconds after "Script Start" is printed, for the ajax intercepts to start.

  3. Set up your web application and server to receive the data. The script posts JSON, so PHP, for example, would read the raw request body and decode it, like so:

     $jsonData   = json_decode (file_get_contents ('php://input'));
    
  4. Point the script to your server.

  5. Voilà. She is done.
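
Step 2 suggests that the filter might keep the whole a array rather than one row. A minimal sketch of such a filter, assuming the response shape shown in the question; selectRows is a hypothetical helper name and is not part of the script below:

```javascript
// Hypothetical helper (not in the original script): returns the rows worth
// recording from one parsed response.
// Assumes the shape {"s": "...", "a": [ {w, t, p, a, x}, ... ], "tau": "..."}.
function selectRows (jsonObj) {
    if (!jsonObj  ||  !Array.isArray (jsonObj.a))  return [];

    // Keep only rows that carry all five fields seen in the sample response.
    return jsonObj.a.filter (function (row) {
        return row  &&  ["w", "t", "p", "a", "x"].every (function (key) {
            return Object.prototype.hasOwnProperty.call (row, key);
        } );
    } );
}

// Two rows taken from the sample response in the question.
var sample = JSON.parse (
    '{"s":"uk5c","a":[{"w":"MATADORA","t":944,"p":5,"a":413173,"x":10},'
    + '{"w":"BOSS1","t":224,"p":102,"a":413230,"x":4}],"tau":"0"}'
);
var rows = selectRows (sample);
console.log (rows.length);
```

In the page-scope code, the rows returned by such a helper could be appended to payloadArray in place of the single jsonObj.a[1] row that the script records for demonstration.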


/******************************************************************************
*******************************************************************************
**  This script intercepts ajaxed data from the target web pages.
**  There are 4 main phases:
**      1)  Intercept XMLHttpRequest's made by the target page.
**      2)  Filter the data to the items of interest.
**      3)  Transfer the data from the page-scope to the GM scope.
**          NOTE:   This makes it technically possible for the target page's
**                  webmaster to hack into GM's slightly elevated scope and
**                  exploit any XSS or zero-day vulnerabilities, etc.  The risk
**                  is probably zero as long as you don't start any feuds.
**      4)  Use GM_xmlhttpRequest () to send the data to our server.
*******************************************************************************
*******************************************************************************
*/
// ==UserScript==
// @name            _Record ajax, JSON data.
// @namespace       stackoverflow.com/users/331508/
// @description     Intercepts Ajax data, filters it and then sends it to our server.
// @include         http://www.bidcactus.com/*
// ==/UserScript==

DEBUG   = true;
if (DEBUG)  console.log ('***** Script Start *****');


/******************************************************************************
*******************************************************************************
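**  PHASE 1 starts here, this is the XMLHttpRequest intercept code.
**  Note that it will not work in GM's scope.  We must inject the code to the
**  page scope.
**  The code is held in an E4X CDATA literal, used here as a multi-line string.
**  E4X was removed in Firefox 21, so later browsers need the string built
**  another way.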
*******************************************************************************
*******************************************************************************
*/
funkyFunc   = ( (<><![CDATA[

    DEBUG           = false;
    //--- This is where we will put the data we scarf. It will be a FIFO queue.
    payloadArray    = [];   //--- PHASE 3a

    (function (open) {
        XMLHttpRequest.prototype.open = function (method, url, async, user, pass)
        {
            this.addEventListener ("readystatechange", function (evt)
            {
                if (this.readyState == 4  &&  this.status == 200)  //-- Done, & status "OK".
                {
                    var jsonObj = null;
                    try {
                        jsonObj = JSON.parse (this.responseText);   // FF code.  Chrome??
                    }
                    catch (err) {
                        //if (DEBUG)  console.log (err);
                    }
                    //if (DEBUG)  console.log (this.readyState, this.status, this.responseText);

                    /******************************************************************************
                    *******************************************************************************
                    **  PHASE 2:    Filter as much as possible, at this stage.
                    **              For this site, jsonObj should be an object like so:
                    **                  { s="1bjqo", a=[15], tau="0"}
                    **              Where a is an array of objects, like:
                    **                  a   417387
                    **                  p   1
                    **                  t   826
                    **                  w   "bart69"
                    **                  x   7
                    *******************************************************************************
                    *******************************************************************************
                    */
                    //if (DEBUG)  console.log (jsonObj);
                    if (jsonObj  &&  jsonObj.a  &&  jsonObj.a.length > 1) {
                        /*--- For demonstration purposes, we will only get the 2nd row in
                            the `a` array. (Probably stands for "auction".)
                        */
                        payloadArray.push (jsonObj.a[1]);
                        if (DEBUG)  console.log (jsonObj.a[1]);
                    }
                    //--- Done at this stage!  Rest is up to the GM scope.
                }
            }, false);

            open.call (this, method, url, async, user, pass);
        };
    } ) (XMLHttpRequest.prototype.open);
]]></>).toString () );


function addJS_Node (text, s_URL)
{
    var scriptNode                      = document.createElement ('script');
    scriptNode.type                     = "text/javascript";
    if (text)  scriptNode.textContent   = text;
    if (s_URL) scriptNode.src           = s_URL;

    var targ    = document.getElementsByTagName ('head')[0] || document.body || document.documentElement;
    targ.appendChild (scriptNode);
}

addJS_Node (funkyFunc);


/******************************************************************************
*******************************************************************************
**  PHASE 3b:
**  Set up a timer to check for data from our ajax intercept.
**  Probably best to make it slightly faster than the target's
**  ajax frequency (about 1 second?).
*******************************************************************************
*******************************************************************************
*/
timerHandle = setInterval (function() { SendAnyResultsToServer (); }, 888);

function SendAnyResultsToServer ()
{
    if (unsafeWindow.payloadArray) {
        var payload     = unsafeWindow.payloadArray;
        while (payload.length) {
            var dataRow = JSON.stringify (payload[0]);
            payload.shift ();   //--- Remove the oldest row from the front of the queue.
            if (DEBUG)  console.log ('GM script, pre Ajax: ', dataRow);

            /******************************************************************************
            *******************************************************************************
            **  PHASE 4: Send the data, one row at a time, to our server.
            **  The server would grab the data with:
            **      $jsonData   = json_decode (file_get_contents ('php://input'));
            *******************************************************************************
            *******************************************************************************
            */
            GM_xmlhttpRequest ( {
                method:     "POST",
                url:        "http://localhost/db_test/ShowJSON_PostedData.php",
                data:       dataRow,
                headers:    {"Content-Type": "application/json"},
                onload:     function (response) {
                                if (DEBUG)  console.log (response.responseText);
                            }
            } );
        }
    }
}


//--- EOF
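
The script above holds its page-scope code in an E4X literal, which only older Firefox versions understand. One possible alternative, sketched here under the assumption that the rest of the script stays as it is, is to write the page-scope code as an ordinary function and turn it into self-invoking source text. The names pageScopeMain and toInjectableSource are hypothetical:

```javascript
// Hypothetical E4X-free variant of PHASE 1. The function body is what would run
// in the page's scope; it is never called directly from the GM scope.
function pageScopeMain () {
    window.payloadArray = [];   // Queue read later by the GM scope.

    var originalOpen = XMLHttpRequest.prototype.open;
    XMLHttpRequest.prototype.open = function () {
        this.addEventListener ("readystatechange", function () {
            if (this.readyState !== 4  ||  this.status !== 200)  return;

            var jsonObj = null;
            try         { jsonObj = JSON.parse (this.responseText); }
            catch (err) { return; }   // Not JSON, nothing to record.

            // Same demonstration filter as the original: 2nd row of `a` only.
            if (jsonObj  &&  jsonObj.a  &&  jsonObj.a.length > 1) {
                window.payloadArray.push (jsonObj.a[1]);
            }
        }, false);

        // Hand over to the real open() with whatever arguments the page passed.
        return originalOpen.apply (this, arguments);
    };
}

// Turns a function into source text that calls itself as soon as it is evaluated.
function toInjectableSource (fn) {
    return "(" + fn.toString () + ") ();";
}

var funkyFunc = toInjectableSource (pageScopeMain);
// In the userscript this string would then be injected with: addJS_Node (funkyFunc);
```

Because the function is only serialized and never executed in the GM scope, it must not reference any GM-scope variables; everything it needs has to live inside its own body.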


Misc notes:

  1. I tested it on the main page of that site, without logging in (I'm not about to create an account there).

  2. I tested with AdBlock, FlashBlock, NoScript, and RequestPolicy all in full effect. JS was turned on for bidcactus.com (it has to be) but no others. Turning all that crud back on shouldn't cause side effects -- but if it does, I'm not going to debug it.

  3. Code like this has to be tuned for the site and for how you browse said site [1]. It's up to you to do that. Hopefully the code is self-documented enough.

  4. Enjoy!



[1] Mainly: the @include and @exclude directives, the JSON data selection and filtering, and whether iFrames need to be blocked. Also, it is recommended that the 2 DEBUG variables (one for GM scope and one for the page scope) be set to false when the tuning is done.
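
As one example of the filtering mentioned in the note above, the intercept could look at the request URL as well as the JSON shape, so that only the item-update requests from the question are recorded. A minimal sketch; isItemUpdateUrl is a hypothetical helper, and the path it matches is taken from the URL quoted in the question. The rnd value changes on every request (the sample looks like a millisecond timestamp, though that is only an inference), so the pattern ignores it:

```javascript
// Hypothetical helper (not in the original script): true for the item-update
// polling URL seen in the question, whatever value the rnd parameter carries.
function isItemUpdateUrl (url) {
    return /\/CactusWeb\/ItemUpdates(\?|$)/.test (String (url));
}

console.log (isItemUpdateUrl ("http://www.bidcactus.com/CactusWeb/ItemUpdates?rnd=1310684278585"));
console.log (isItemUpdateUrl ("http://www.bidcactus.com/CactusWeb/SomethingElse?rnd=1"));
```

Inside the replacement open function, the readystatechange listener could be attached only when isItemUpdateUrl (url) returns true, which keeps unrelated Ajax traffic out of payloadArray.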

