Error when starting to train SAEHD - help!


  • This topic has 2 replies, 3 voices, and was last updated 1 month ago by defalafa.
  • #7601
    dvdfakfy
    Participant

      Hi!
      I did all the previous steps and now I want to start SAEHD training, but I get this error and it stops working. I restarted the computer and started over again, and I have the same problem.
      I appreciate the help!

      [n] Enable gradient clipping ( y/n ?:help ) : n
      [n] Enable pretraining mode ( y/n ?:help ) : n
      Initializing models: 20%|############6 | 1/5 [00:11<00:46, 11.74s/it]
      Error: OOM when allocating tensor with shape[576000,300] and type float on /job:localhost/replica:0/task:0/device:DML:0 by allocator DmlAllocator
      [[node inter_AB/dense1/weight/Initializer/random_uniform/mul (defined at C:\Users\david.puluc\Documents\DeepFaceLab_DirectX12\_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\framework\ops.py:1762) ]]
      Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

      #7654
      deepfakery
      Keymaster

        An OOM (‘Out of Memory’) error usually means the settings are too high for the GPU to handle. Start by lowering the batch size until the model runs. If you have to go below batch size 4, you should consider using a model with lower resolution and/or dims.
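A quick back-of-envelope check of the tensor named in the error log shows why resolution/dims matter here. The failing allocation has shape [576000, 300] in float32 (4 bytes per element), and its dimensions are driven by the model's resolution/dims settings rather than batch size; this is only a rough estimate, since the allocator also needs extra workspace and contiguous free memory on top of it:

```python
# Size of the tensor from the OP's log: shape [576000, 300], float32.
# 4 bytes per float32 element; result in MiB (1 MiB = 1024**2 bytes).
elements = 576_000 * 300
size_mib = elements * 4 / 1024 ** 2
print(f"{size_mib:.0f} MiB")  # → 659 MiB
```

So this single weight tensor alone needs roughly two-thirds of a gigabyte of VRAM before any activations or optimizer state are allocated, which is why lowering the model's dims shrinks memory use much faster than lowering batch size does.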

        #7697
        defalafa
        Participant

          Also check general RAM size – pagefile size / free disk space.

          Open Task Manager while the model is loading – under RAM and GPU you will see Python's usage rise as it grabs resources.

          Also try models_opt_on_gpu: False.

          That outsources the model optimizer into system RAM – slower, but more VRAM available.
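For reference, this setting is one of the interactive prompts when starting SAEHD training, in the same style as the prompts in the OP's log; answering n keeps the models and optimizer in system RAM (the exact prompt wording may differ slightly between DFL builds):

```
[y] Place models and optimizer on GPU ( y/n ?:help ) : n
```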
