I'm trying to implement a limited web crawler in C# (for a few hundred sites only)
using HttpWebResponse.GetResponse() and Streamreader.ReadToEnd() , also tried using StreamReader.Read() and a loop to build my HTML string.
I'm only downloading pages which are about 5-10K.
It's all very slow! For example, the average GetResponse() time is about half a second, while the average StreamREader.ReadToEnd() time is about 5 seconds!
All sites should be very fast, as they are very close to my location, and have fast servers. (in Explorer takes practically nothing to D/L) and I am not using any proxy.
My Crawler has about 20 threads reading simultaneously from the same site. Could this be causing a problem?
How do I reduce StreamReader.ReadToEnd times DRASTICALLY?
See Question&Answers more detail:
os 与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…