Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - Failed with error CUPTI could not be loaded or symbol could not be found

Here are my specifications, in case they help:

PC SPECS:

  • CPU: Ryzen 7 3800X
  • RAM: 2x16 GB DDR4-3200
  • GPU: RTX 2060

SOFTWARE SPECS:

  • TensorFlow: v2.4.0
  • GPU Drivers: 460.89
  • NVIDIA GPU Computing Toolkit: v10.1
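
One thing worth checking against the versions listed above is whether the installed CUDA toolkit is the one the TensorFlow build expects. The sketch below is a minimal check. It assumes the pairings from TensorFlow's published tested build configurations (2.4 with CUDA 11.0, and 2.1 through 2.3 with CUDA 10.1). `TESTED_CUDA` and `cuda_matches` are hypothetical names used for illustration, not part of TensorFlow.

```python
# Assumed pairings, taken from TensorFlow's tested build configurations.
# TESTED_CUDA and cuda_matches are illustrative names, not a TensorFlow API.
TESTED_CUDA = {
    "2.1": "10.1",
    "2.2": "10.1",
    "2.3": "10.1",
    "2.4": "11.0",
}


def cuda_matches(tf_version, installed_cuda):
    """Return True if installed_cuda is the toolkit version listed for tf_version.

    Versions are compared on their major.minor part only, and a leading
    "v" (as in "v2.4.0") is ignored.
    """
    def major_minor(version):
        parts = version.lstrip("v").split(".")
        return ".".join(parts[:2])

    expected = TESTED_CUDA.get(major_minor(tf_version))
    if expected is None:
        raise KeyError("no entry for TensorFlow " + tf_version)
    return expected == major_minor(installed_cuda)


# The versions from the question: TensorFlow v2.4.0 with toolkit v10.1.
print(cuda_matches("v2.4.0", "v10.1"))  # prints False
```

If the check comes back False, the CUPTI library on PATH may simply not be the one this TensorFlow build tries to load. That would be worth ruling out before anything else.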

The issue I'm facing is that when I run my Functional API model using TensorFlow and Keras, I encounter this error:

Epoch 1/1000
  1/244 [..............................] - ETA: 5:58 - loss: 130.6428 - accuracy: 0.1875
2021-01-24 03:48:25.322233: I tensorflow/core/profiler/lib/profiler_session.cc:136] Profiler session initializing.
2021-01-24 03:48:25.322366: I tensorflow/core/profiler/lib/profiler_session.cc:155] Profiler session started.
2021-01-24 03:48:25.322481: E tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1415] function cupti_interface_->Subscribe( &subscriber_, (CUpti_CallbackFunc)ApiCallback, this)failed with error CUPTI could not be loaded or symbol could not be found.
  2/244 [..............................] - ETA: 3:06 - loss: 130.9427 - accuracy: 0.3203
2021-01-24 03:48:26.071232: I tensorflow/core/profiler/lib/profiler_session.cc:71] Profiler session collecting data.
2021-01-24 03:48:26.071381: E tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1496] function cupti_interface_->Finalize()failed with error CUPTI could not be loaded or symbol could not be found.
2021-01-24 03:48:26.072592: I tensorflow/core/profiler/internal/gpu/cupti_collector.cc:228]  GpuTracer has collected 0 callback api events and 0 activity events.
2021-01-24 03:48:26.074144: I tensorflow/core/profiler/lib/profiler_session.cc:172] Profiler session tear down.
2021-01-24 03:48:26.124377: I tensorflow/core/profiler/rpc/client/save_profile.cc:137] Creating directory: logs/model-002\train\plugins\profile\2021_01_24_10_48_26
2021-01-24 03:48:26.153611: I tensorflow/core/profiler/rpc/client/save_profile.cc:143] Dumped gzipped tool data for trace.json.gz to logs/model-002\train\plugins\profile\2021_01_24_10_48_26\DESKTOP-U3KRM3T.trace.json.gz
2021-01-24 03:48:26.157559: I tensorflow/core/profiler/rpc/client/save_profile.cc:137] Creating directory: logs/model-002\train\plugins\profile\2021_01_24_10_48_26
2021-01-24 03:48:26.211853: I tensorflow/core/profiler/rpc/client/save_profile.cc:143] Dumped gzipped tool data for memory_profile.json.gz to logs/model-002\train\plugins\profile\2021_01_24_10_48_26\DESKTOP-U3KRM3T.memory_profile.json.gz
2021-01-24 03:48:26.328921: I tensorflow/core/profiler/rpc/client/capture_profile.cc:251] Creating directory: logs/model-002\train\plugins\profile\2021_01_24_10_48_26
Dumped tool data for xplane.pb to logs/model-002\train\plugins\profile\2021_01_24_10_48_26\DESKTOP-U3KRM3T.xplane.pb
Dumped tool data for overview_page.pb to logs/model-002\train\plugins\profile\2021_01_24_10_48_26\DESKTOP-U3KRM3T.overview_page.pb
Dumped tool data for input_pipeline.pb to logs/model-002\train\plugins\profile\2021_01_24_10_48_26\DESKTOP-U3KRM3T.input_pipeline.pb
Dumped tool data for tensorflow_stats.pb to logs/model-002\train\plugins\profile\2021_01_24_10_48_26\DESKTOP-U3KRM3T.tensorflow_stats.pb
Dumped tool data for kernel_stats.pb to logs/model-002\train\plugins\profile\2021_01_24_10_48_26\DESKTOP-U3KRM3T.kernel_stats.pb

  6/244 [..............................] - ETA: 3:14 - loss: 132.3402 - accuracy: 0.3567
Traceback (most recent call last):
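
Both failing entries in this output carry severity `E` and come from `cupti_tracer.cc`, that is, from the profiler rather than from the model itself. A minimal sketch for pulling such entries out of a longer log follows. `LOG_LINE` and `error_lines` are names made up for illustration, and the pattern assumes each log entry starts on its own line.

```python
import re

# Pattern for TensorFlow's C++ log lines: timestamp, severity letter
# (I, W, E or F), source file, line number, then the message.
LOG_LINE = re.compile(
    r"^(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+): "
    r"(?P<severity>[IWEF]) (?P<source>[^\]]+):(?P<line>\d+)\] (?P<message>.*)$"
)


def error_lines(log_text):
    """Return (source, message) pairs for every E-level line in log_text."""
    found = []
    for raw in log_text.splitlines():
        match = LOG_LINE.match(raw.strip())
        if match and match.group("severity") == "E":
            found.append((match.group("source"), match.group("message")))
    return found


# Two entries copied from the output above: one informational, one error.
sample = """\
2021-01-24 03:48:25.322366: I tensorflow/core/profiler/lib/profiler_session.cc:155] Profiler session started.
2021-01-24 03:48:26.071381: E tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1496] function cupti_interface_->Finalize()failed with error CUPTI could not be loaded or symbol could not be found.
"""

for source, message in error_lines(sample):
    print(source, "->", message)
```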

Here is the code that I'm using to run my model:

# Imports
from tensorflow.keras.regularizers import l2
from tensorflow.keras.callbacks import TensorBoard
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.utils import plot_model
import tensorflow as tf
from PIL import Image
from tqdm import tqdm
import pandas as pd
import numpy as np
import json
import os

# Prevents CUBLAS_STATUS_ALLOC_FAILED errors
config = tf.compat.v1.ConfigProto(gpu_options=tf.compat.v1.GPUOptions(per_process_gpu_memory_fraction=0.33))
config.gpu_options.allow_growth = True
session = tf.compat.v1.Session(config=config)
tf.compat.v1.keras.backend.set_session(session)

# Temporarily adds Graphviz and the CUPTI library directory to PATH (for this process only)
os.environ["PATH"] += os.pathsep + 'C:/Program Files (x86)/Graphviz/bin/'
os.environ["PATH"] += os.pathsep + 'C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v10.1/extras/CUPTI/lib64'

# Initializes checkpoint and tensorboard callbacks
checkpoint_callback = tf.keras.callbacks.ModelCheckpoint("checkpoints/model-001/cp-{epoch:01d}", save_freq="epoch")
tensorboard = TensorBoard(log_dir="logs/model-002")

# Loads metadata dataset
metadata = pd.read_csv("data/HAM10000_metadata.csv")

# Removes Lesion ID column
metadata = metadata.drop(["lesion_id"], axis=1)

# Loads translation key into memory (the file is closed automatically)
with open("data/translation.json") as translation_file:
    translation = json.load(translation_file)

# Translates all non-numerical values into float64 values
metadata = metadata.replace(translation["dx_type"])
metadata = metadata.replace(translation["localization"])
metadata = metadata.replace(translation["sex"])
metadata = metadata.replace(translation["dx"])

# "Shaves" off some data to make the model generalize better and to optimize memory usage
metadata = metadata.drop(metadata[metadata.dx == "nv"].iloc[:5300].index)

# Separates Image IDs and Labels to different variables
image_id = metadata.pop("image_id")
label = metadata.pop("dx")

# Converts metadata to a NumPy array
metadata = np.array(metadata)

# Categorical Formatting
new_label = tf.keras.utils.to_categorical(label, num_classes=7)

# Initializes image features set
image = np.zeros((len(image_id), 150, 200, 3), dtype=np.float32)

# Iterates through the image IDs in order, resizes each image (originally 600 x 450), and stores its RGB values in the image features set
for index, i in enumerate(tqdm(image_id)):
    image[index] = np.array(Image.open("data/HAM10000_images/{}.jpg".format(i)).resize((200, 150)))

# Convolutional Branch
image_input = tf.keras.layers.Input(shape=(150, 200, 3))
conv2D_1 = tf.keras.layers.Conv2D(16, kernel_size=(2, 2), activation='relu', kernel_regularizer=l2(0.01))(image_input)
conv2D_2 = tf.keras.layers.Conv2D(16, kernel_size=(2, 2), activation='relu', kernel_regularizer=l2(0.01))(conv2D_1)
max_pool_1 = tf.keras.layers.MaxPool2D((2, 2))(conv2D_2)
dropout_1 = tf.keras.layers.Dropout(0.25)(max_pool_1)
conv2D_3 = tf.keras.layers.Conv2D(32, kernel_size=(2, 2), activation='relu', kernel_regularizer=l2(0.01))(dropout_1)
conv2D_4 = tf.keras.layers.Conv2D(32, kernel_size=(2, 2), activation='relu', kernel_regularizer=l2(0.01))(conv2D_3)
max_pool_2 = tf.keras.layers.MaxPool2D((2, 2))(conv2D_4)
dropout_2 = tf.keras.layers.Dropout(0.25)(max_pool_2)
conv2D_5 = tf.keras.layers.Conv2D(64, kernel_size=(2, 2), activation='relu', kernel_regularizer=l2(0.01))(dropout_2)
conv2D_6 = tf.keras.layers.Conv2D(64, kernel_size=(2, 2), activation='relu', kernel_regularizer=l2(0.01))(conv2D_5)
max_pool_3 = tf.keras.layers.MaxPool2D((2, 2))(conv2D_6)
dropout_3 = tf.keras.layers.Dropout(0.25)(max_pool_3)
conv2D_7 = tf.keras.layers.Conv2D(128, kernel_size=(2, 2), activation='relu', kernel_regularizer=l2(0.01))(dropout_3)
conv2D_8 = tf.keras.layers.Conv2D(128, kernel_size=(2, 2), activation='relu', kernel_regularizer=l2(0.01))(conv2D_7)
max_pool_4 = tf.keras.layers.MaxPool2D((2, 2))(conv2D_8)
dropout_4 = tf.keras.layers.Dropout(0.25)(max_pool_4)
flatten = tf.keras.layers.Flatten()(dropout_4)

# Metadata Branch
metadata_input = tf.keras.layers.Input(shape=(4,))

# Concatenated Branch
concat = tf.keras.layers.Concatenate()([flatten, metadata_input])
hidden_1 = tf.keras.layers.Dense(4096, activation='relu', kernel_regularizer=l2(0.01))(concat)
dropout_5 = tf.keras.layers.Dropout(0.25)(hidden_1)
hidden_2 = tf.keras.layers.Dense(4096, activation='relu', kernel_regularizer=l2(0.01))(dropout_5)
dropout_6 = tf.keras.layers.Dropout(0.25)(hidden_2)
hidden_3 = tf.keras.layers.Dense(1024, activation='relu', kernel_regularizer=l2(0.01))(dropout_6)
dropout_7 = tf.keras.layers.Dropout(0.25)(hidden_3)

# Model Creation
output = tf.keras.layers.Dense(7, activation='softmax')(dropout_7)
model = tf.keras.Model(inputs=[image_input, metadata_input], outputs=[output])

# # Prints structure of model
# plot_model(model, to_file='model-001_plot.png', show_shapes=True, show_layer_names=True)

# Model Compilation
model.compile(optimizer=Adam(learning_rate=0.0001), loss="categorical_crossentropy", metrics=['accuracy'])

# Model Training
model.fit(x=[image, metadata], y=new_label, batch_size=32, epochs=1000, validation_split=0.2, callbacks=[tensorboard, checkpoint_callback])
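
The profiler messages in the log are triggered by the `TensorBoard` callback, which profiles a batch early in training unless told otherwise. A minimal sketch of switching that off follows, assuming the `profile_batch` argument of `tf.keras.callbacks.TensorBoard`, where 0 disables profiling. `tensorboard_options` is a hypothetical helper. Note that this would only keep the profiler from running, not fix CUPTI loading itself.

```python
def tensorboard_options(log_dir, profile=False):
    """Build keyword arguments for tf.keras.callbacks.TensorBoard.

    profile_batch=0 turns the profiler off. The value 2 is an assumption
    that mirrors the log above, where profiling ran around the second batch.
    """
    return {"log_dir": log_dir, "profile_batch": 2 if profile else 0}


options = tensorboard_options("logs/model-002")
print(options)

# With TensorFlow installed, the callback would then be created as:
# tensorboard = tf.keras.callbacks.TensorBoard(**options)
```

If training completes with profiling disabled, the traceback may have a different cause than the CUPTI messages, and the two can be investigated separately.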

Here are the solutions I've tried, none of which worked:

  • Copied CUPTI64_101.dll to C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\bin
  • Added C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\extras\CUPTI\lib64 to both PATH and LD_LIBRARY_PATH (I had to create LD_LIBRARY_PATH)
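
To confirm which CUPTI libraries the directories on PATH actually expose, a small search like the one below may help. It is a sketch under assumptions: `find_cupti_libraries` is a hypothetical helper, and the file patterns are guesses based on the `CUPTI64_101.dll` name above. `LD_LIBRARY_PATH` is read by the Linux dynamic loader, so on Windows only PATH is likely to matter.

```python
import fnmatch
import os


def find_cupti_libraries(search_path, patterns=("cupti64_*.dll", "libcupti.so*")):
    """List CUPTI libraries found in the directories of a PATH-style string.

    search_path is split on os.pathsep; directories that do not exist
    are skipped. File names are lowercased before matching.
    """
    matches = []
    for directory in search_path.split(os.pathsep):
        if not directory or not os.path.isdir(directory):
            continue
        for name in sorted(os.listdir(directory)):
            lowered = name.lower()
            if any(fnmatch.fnmatchcase(lowered, pattern) for pattern in patterns):
                matches.append(os.path.join(directory, name))
    return matches


# Print every CUPTI library visible through the current PATH.
for library in find_cupti_libraries(os.environ.get("PATH", "")):
    print(library)
```

An empty result would suggest the PATH change never reached the Python process. A result naming a different version than the one TensorFlow expects would point to a version mismatch instead.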

The weird thing is that my other models (both Functional and Sequential) work fine.

Question from: https://stackoverflow.com/questions/65869742/failed-with-error-cupti-could-not-be-loaded-or-symbol-could-not-be-found

