Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

0 votes
296 views
in Technique[技术] by (71.8m points)

Why is my CNN implementation in C++ slower by orders of magnitude than the Keras Python version?

Why is my CNN implementation in C++ so slow? The Python (Anaconda) version runs at least 100 times as fast as the C++ code. The only difference I can see is in CPU usage: the Anaconda version tries to use 100% of the CPU, whereas my implementation uses 25%, according to Task Manager. I have optimized my for loops as much as possible. I am using pointers to pointers to implement N-D arrays. I use C++ structs to abstract the layers, planes, and nodes of the neural network. I create an array of planes to construct a layer and an array of layers to construct the neural network. Here's a snippet of the innermost part of my loops in the backward pass.

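// i, j, m, n, o, eta, prevplane, presplane, and prevnode come from the
// enclosing loops, which are not shown here.
for (int k = 0; k < plnext.szx; k++)
{
    for (int l = 0; l < plnext.szy; l++)
    {
        // Skip nodes whose kernel offset (n - k, o - l) falls outside the kernel.
        if (!(n - k >= 0 && n - k < plnext.kx && o - l >= 0 && o - l < plnext.ky))
            continue;

        node& presnode = layers[i].arrplanes[j].nodes[k][l];

        // Accumulate the delta for node (n, o) of the previous plane.
        prevplane.nodes[n][o].delta += presplane.kern[m][n - k][o - l].wt * presnode.delta;

        // Accumulate into chwt for the same kernel entry.
        layers[i].arrplanes[j].kern[m][n - k][o - l].chwt += eta * presnode.delta * prevnode.state;
    }
}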
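
For comparison, here is a minimal sketch of the same delta accumulation over contiguous storage instead of pointers to pointers. It is not the code from the question: `FlatPlane` and `backprop_input_delta` are hypothetical names, and the sketch assumes a single channel, `float` values, and a "valid" convolution where the input plane is (output size + kernel size - 1) in each dimension. Iterating over the kernel offsets directly visits only the valid (n - k, o - l) pairs, so the inner range check disappears. Separately, 25% CPU usage would be consistent with a single thread running on a machine with four logical cores, and a build without compiler optimizations can be much slower on its own; both are guesses about the setup, not something the snippet confirms.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical plane type: one contiguous buffer indexed as row * cols + col,
// instead of an array of row pointers.
struct FlatPlane {
    std::size_t rows;
    std::size_t cols;
    std::vector<float> data;

    FlatPlane(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0f) {}

    float& at(std::size_t r, std::size_t c) {
        assert(r < rows && c < cols);  // bounds check, compiled out with NDEBUG
        return data[r * cols + c];
    }

    float at(std::size_t r, std::size_t c) const {
        assert(r < rows && c < cols);
        return data[r * cols + c];
    }
};

// Accumulates in_delta[k + a][l + b] += kern[a][b] * out_delta[k][l].
// With n = k + a and o = l + b this is the same update as
// delta[n][o] += kern[n - k][o - l] * out_delta[k][l], but no range test is
// needed because only valid kernel offsets (a, b) are visited.
void backprop_input_delta(const FlatPlane& kern,
                          const FlatPlane& out_delta,
                          FlatPlane& in_delta) {
    // Assumed shape relation for a "valid" convolution.
    assert(in_delta.rows == out_delta.rows + kern.rows - 1);
    assert(in_delta.cols == out_delta.cols + kern.cols - 1);

    for (std::size_t k = 0; k < out_delta.rows; ++k) {
        for (std::size_t l = 0; l < out_delta.cols; ++l) {
            const float d = out_delta.at(k, l);
            for (std::size_t a = 0; a < kern.rows; ++a) {
                for (std::size_t b = 0; b < kern.cols; ++b) {
                    in_delta.at(k + a, l + b) += kern.at(a, b) * d;
                }
            }
        }
    }
}
```

The innermost loop walks `kern` and one row of `in_delta` sequentially in memory, which is friendlier to the cache and to compiler vectorization than chasing row pointers. Whether it closes the gap with Keras depends on the rest of the code and on the build settings.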
Question from: https://stackoverflow.com/questions/65952635/why-is-my-cnn-implmentation-in-c-slower-by-orders-of-magnitude-than-the-keras


1 Reply

0 votes
by (71.8m points)
Waiting for answers
