I am using @tensorflow/tfjs-node-gpu 3.0.0 and CUDA 10.0 on Ubuntu 20.04. The training code looks like this:
// Prepares and returns ~4GB of training data as tensors
async function prepareData () {
  // ... load the dataset and build inputTensor / outputTensor here (omitted) ...
  return {
    inputs: inputTensor,
    outputs: outputTensor
  }
}

export async function train () {
  const { inputs, outputs } = await prepareData()
  const model = getModel()
  await model.fit(inputs, outputs, {
    validationSplit: 0.2,
    epochs: EPOCHS,
    shuffle: true
  })
  await model.save('file://model')
}
When the train method is called from somewhere else in my running Node process, about 4GB of memory is allocated on the GPU (as seen with nvidia-smi). When training ends, that memory is never released.
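For context, this is roughly how train() gets called; the Express handler below is a simplified stand-in for my real route, and the tf.memory() calls are only there to compare tfjs's own bookkeeping with what nvidia-smi shows:

import express from 'express'
import * as tf from '@tensorflow/tfjs-node-gpu'
import { train } from './train' // the module shown above (path is illustrative)

const app = express()

app.post('/train', async (req, res) => {
  // Bytes/tensors tracked by the tfjs engine (not the raw CUDA allocation)
  console.log('before training:', tf.memory())
  await train()
  console.log('after training:', tf.memory())
  // nvidia-smi still reports the ~4GB allocated on the GPU at this point
  res.sendStatus(200)
})

app.listen(3000)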
I have tried tf.dispose(), tf.disposeVariables(), and wrapping the training in tf.engine().startScope()/endScope() (sketched below). The only thing that frees the memory is killing the containing Node process, which is not very practical since it's a web server.
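Roughly what those cleanup attempts looked like (a sketch of my variations, reusing the prepareData/getModel/EPOCHS from above; the function name is illustrative, and none of this changed what nvidia-smi reports):

import * as tf from '@tensorflow/tfjs-node-gpu'

export async function trainWithCleanup () {
  tf.engine().startScope()        // attempt: wrap everything in a manual engine scope
  const { inputs, outputs } = await prepareData()
  const model = getModel()
  await model.fit(inputs, outputs, {
    validationSplit: 0.2,
    epochs: EPOCHS,
    shuffle: true
  })
  await model.save('file://model')

  tf.dispose([inputs, outputs])   // attempt: dispose the data tensors explicitly
  tf.disposeVariables()           // attempt: release the model's variables
  tf.engine().endScope()          // close the scope opened above
  // GPU memory is still held until the Node process exits
}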
Is there any way to clear the allocated memory without killing the Node process?
question from:
https://stackoverflow.com/questions/65925898/tensorflow-for-node-gpu-doesnt-release-memory