Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


c - How can I get the number of cores in a CUDA device?

I am looking for a function that returns the number of cores in my CUDA device. I know each multiprocessor has a specific number of cores, and my CUDA device has 2 multiprocessors.

I searched a lot for a device property that reports the number of cores per multiprocessor, but I couldn't find one. I use the code below, but I still need the number of cores.

  • CUDA 7.0
  • Programming language: C
  • Visual Studio 2013

Code:

void printDevProp(cudaDeviceProp devProp)
{
    // size_t fields are cast to unsigned long long so the format
    // specifier matches the argument on both 32-bit and 64-bit builds.
    printf("%s\n", devProp.name);
    printf("Major revision number:         %d\n", devProp.major);
    printf("Minor revision number:         %d\n", devProp.minor);
    printf("Total global memory:           %llu bytes\n", (unsigned long long)devProp.totalGlobalMem);
    printf("Number of multiprocessors:     %d\n", devProp.multiProcessorCount);
    printf("Total amount of shared memory per block: %llu\n", (unsigned long long)devProp.sharedMemPerBlock);
    printf("Total registers per block:     %d\n", devProp.regsPerBlock);
    printf("Warp size:                     %d\n", devProp.warpSize);
    printf("Maximum memory pitch:          %llu\n", (unsigned long long)devProp.memPitch);
    printf("Total amount of constant memory:         %llu\n", (unsigned long long)devProp.totalConstMem);
}

1 Reply


The number of cores per multiprocessor is the only "missing" piece of data. It is not provided directly in the cudaDeviceProp structure, but it can be inferred from published per-architecture data, keyed by the devProp.major and devProp.minor entries, which together make up the CUDA compute capability of the device.

Something like this should work:

#include <stdio.h>
#include "cuda_runtime_api.h"
// you must first call the cudaGetDeviceProperties() function, then pass
// the devProp structure returned to this function:
int getSPcores(cudaDeviceProp devProp)
{
    int cores = 0;  // stays 0 if the compute capability is not recognized
    int mp = devProp.multiProcessorCount;
    switch (devProp.major) {
    case 2: // Fermi
        if (devProp.minor == 1) cores = mp * 48;
        else cores = mp * 32;
        break;
    case 3: // Kepler
        cores = mp * 192;
        break;
    case 5: // Maxwell
        cores = mp * 128;
        break;
    case 6: // Pascal
        if ((devProp.minor == 1) || (devProp.minor == 2)) cores = mp * 128;
        else if (devProp.minor == 0) cores = mp * 64;
        else printf("Unknown device type\n");
        break;
    case 7: // Volta and Turing
        if ((devProp.minor == 0) || (devProp.minor == 5)) cores = mp * 64;
        else printf("Unknown device type\n");
        break;
    case 8: // Ampere
        if (devProp.minor == 0) cores = mp * 64;
        else if (devProp.minor == 6) cores = mp * 128;
        else printf("Unknown device type\n");
        break;
    default:
        printf("Unknown device type\n");
        break;
    }
    return cores;
}

(coded in browser)

"Cores" is a bit of a marketing term. The most common connotation, in my opinion, is to equate it with the SP units in the SM. That is the meaning I have demonstrated here. I've also omitted compute capability (cc) 1.x devices, as those device types are no longer supported in CUDA 7.0 and CUDA 7.5.

A pythonic version is here
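
The same lookup could also be written as a table rather than a switch, which may be easier to extend when a new compute capability appears. This is only a sketch: the names cc_entry, cc_table, cores_per_sm and total_cores are made up for illustration, the values are the ones used in the switch above, and the functions take plain ints so the block compiles without the CUDA headers.

```c
#include <stddef.h>

/* One row per compute capability: SP cores per multiprocessor (SM).
   Hypothetical helper type, not part of the CUDA runtime API. */
struct cc_entry { int major; int minor; int cores_per_sm; };

/* minor == -1 matches any minor revision of that major version.
   More specific rows must come before a wildcard row. */
static const struct cc_entry cc_table[] = {
    {2,  1,  48},  /* Fermi, cc 2.1 */
    {2, -1,  32},  /* Fermi, other 2.x */
    {3, -1, 192},  /* Kepler */
    {5, -1, 128},  /* Maxwell */
    {6,  0,  64},  /* Pascal, cc 6.0 */
    {6,  1, 128},  /* Pascal, cc 6.1 */
    {6,  2, 128},  /* Pascal, cc 6.2 */
    {7,  0,  64},  /* Volta */
    {7,  5,  64},  /* Turing */
    {8,  0,  64},  /* Ampere, cc 8.0 */
    {8,  6, 128},  /* Ampere, cc 8.6 */
};

/* Returns the cores per SM for a compute capability, or 0 if unknown. */
int cores_per_sm(int major, int minor)
{
    size_t n = sizeof cc_table / sizeof cc_table[0];
    for (size_t i = 0; i < n; ++i) {
        if (cc_table[i].major == major &&
            (cc_table[i].minor == minor || cc_table[i].minor == -1))
            return cc_table[i].cores_per_sm;
    }
    return 0;
}

/* Total cores on the device; 0 if the compute capability is unknown. */
int total_cores(int major, int minor, int mp_count)
{
    return mp_count * cores_per_sm(major, minor);
}
```

After a call to cudaGetDeviceProperties(&devProp, 0), this would be used as total_cores(devProp.major, devProp.minor, devProp.multiProcessorCount). A return value of 0 means the table has no row for that device and needs to be extended.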

