First, some caveats. Add-ons of this nature (i.e. those that run on every page and scan all content) generally have a major performance impact, and users are likely to remove or disable the add-on once they notice it. It also appears that you're writing your code in .NET, which is strongly discouraged for the same reason.
Getting the contents of a cross-domain subframe is non-trivial because you will get an Access Denied error by default. The same cross-domain security restriction that applies to JavaScript also applies when your add-on attempts to read content across domains.
To get the cross-domain content from the top-level page, you must jump through some hoops which are non-trivial, particularly in .NET. Your best bet is to just run your code on each frame's DocumentComplete event, as Jeff observed.
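To make the per-frame approach concrete, here is a minimal COM-free model of it. The `Frame` and `AddOn` types are hypothetical stand-ins (they are not part of any Microsoft API); in a real add-on the handler would be your DocumentComplete event sink and the frame identity would come from the dispatch pointer passed to it. The sketch also assumes the top-level frame's event arrives last, which is the typical order for an initial page load.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for one frame's browser object and its loaded document.
struct Frame {
    int id;               // identity of the frame (stands in for the browser pointer)
    std::string origin;   // origin the frame's document was loaded from
    std::string content;  // document text the add-on wants to scan
};

// Hypothetical add-on state: remembers the top-level frame and what was scanned.
struct AddOn {
    int topLevelId;
    std::vector<std::string> scannedOrigins;
    bool pageFinished;

    explicit AddOn(int topId) : topLevelId(topId), pageFinished(false) {}

    // Called once per frame, mirroring how DocumentComplete fires per frame.
    // Each call only touches the frame that raised the event, so the handler
    // never has to reach from one origin into another.
    void OnDocumentComplete(const Frame& frame) {
        scannedOrigins.push_back(frame.origin);
        // Assumption: the top-level frame's event marks the end of the page load.
        if (frame.id == topLevelId) {
            pageFinished = true;
        }
    }
};
```

The point of the design is that every document is handled by the event raised for its own frame, so the cross-domain check never comes into play.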
If you must run your code only once, from the top-level page, then you can do so with a technique like this one:
http://support.microsoft.com/default.aspx?scid=kb;en-us;196340
    // lpDocDisp is the IDispatch pointer for the document
    IHTMLDocument2* pDocument = NULL;
    HRESULT hr = lpDocDisp->QueryInterface(IID_IHTMLDocument2, (void**)&pDocument);
    if (FAILED(hr))
        return hr;
    long iCount = 0;

    // Now, check for subframes
    // http://support.microsoft.com/default.aspx?scid=kb;en-us;196340
    IOleContainer* pContainer = NULL;
    // Get the container
    hr = lpDocDisp->QueryInterface(IID_IOleContainer, (void**)&pContainer);
    if (FAILED(hr) || (NULL == pContainer)){
        OutputDebugString("[AXHUNTER] Failed to get container\n");
        pDocument->Release();  // do not leak the document on the early return
        return hr;
    }

    LPENUMUNKNOWN pEnumerator = NULL;
    // Get an enumerator for the frames
    hr = pContainer->EnumObjects(OLECONTF_EMBEDDINGS, &pEnumerator);
    pContainer->Release();
    if (FAILED(hr) || (NULL == pEnumerator)){
        OutputDebugString("[AXHUNTER] Failed to get enumerator\n");
        pDocument->Release();  // do not leak the document on the early return
        return hr;
    }

    IUnknown* pUnk = NULL;
    ULONG uFetched = 0;
    // Enumerate all the frames
    for (UINT i = 0; S_OK == pEnumerator->Next(1, &pUnk, &uFetched); i++)
    {
        assert(NULL != pUnk);
        // Only embedded objects that are browser controls (frames) support IWebBrowser2
        IWebBrowser2* pBrowser = NULL;
        hr = pUnk->QueryInterface(IID_IWebBrowser2, (void**)&pBrowser);
        pUnk->Release();
        if (SUCCEEDED(hr))
        {
            LPDISPATCH pSubDoc = NULL;
            hr = pBrowser->get_Document(&pSubDoc);
            if (SUCCEEDED(hr) && (NULL != pSubDoc)){
                // Recurse into the subframe's document; iNested is assumed to be
                // a nesting counter declared by the enclosing function
                CrawlPage(pSubDoc, ++iNested);
                pSubDoc->Release();
            }
            pBrowser->Release();
        }
        else
        {
            OutputDebugString("[AXHUNTER] Cannot get IWebBrowser2 interface\n");
        }
    }
    pEnumerator->Release();
    // pDocument is still held here; release it once it is no longer needed
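The control flow of the snippet is easier to see with the COM plumbing removed. The following is a rough, COM-free sketch of the same recursive walk: `Doc` is a hypothetical stand-in for a document plus its embedded frames, `kMaxDepth` is an assumed safety cap that the original does not show, and the `depth` parameter plays the role that `iNested` appears to play above. This `CrawlPage` is an illustration only, not the author's function.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a document and the subframe documents embedded in it.
struct Doc {
    std::string url;
    std::vector<Doc> subframes;
};

// Assumed depth cap so a deeply nested page cannot recurse without bound.
const int kMaxDepth = 8;

// Visits the document, then each embedded subframe, passing the nesting depth
// down one level at a time. Returns the number of documents visited.
int CrawlPage(const Doc& doc, int depth, std::vector<std::string>& visited) {
    if (depth > kMaxDepth) {
        return 0;  // too deep: skip this document and everything below it
    }
    visited.push_back(doc.url);  // stands in for scanning the document
    int count = 1;
    for (const Doc& sub : doc.subframes) {
        count += CrawlPage(sub, depth + 1, visited);
    }
    return count;
}
```

Calling `CrawlPage(top, 0, visited)` on the top-level document visits it first and then each nested frame depth-first, which is the order in which the enumerator-based loop above would recurse.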