Error when starting to train SAEHD - help!


  • This topic has 2 replies, 3 voices, and was last updated 1 month ago by defalafa.
  • #7601
    dvdfakfy
    Participant

      Hi!
      I did all the previous steps and now I want to start SAEHD training, but I get this error and it stops working. I restarted the computer and started over again, and I have the same problem.
      I appreciate the help!

      [n] Enable gradient clipping ( y/n ?:help ) : n
      [n] Enable pretraining mode ( y/n ?:help ) : n
      Initializing models: 20%|############6 | 1/5 [00:11<00:46, 11.74s/it]
      Error: OOM when allocating tensor with shape[576000,300] and type float on /job:localhost/replica:0/task:0/device:DML:0 by allocator DmlAllocator
      [[node inter_AB/dense1/weight/Initializer/random_uniform/mul (defined at C:\Users\david.puluc\Documents\DeepFaceLab_DirectX12\_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\framework\ops.py:1762) ]]
      Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

      #7654
      deepfakery
      Keymaster

        An OOM (‘Out of Memory’) error usually means the settings are too high for the GPU to handle. Start by lowering the batch size until the model runs. If you have to go below batch size 4, you should consider using a model with lower resolution and/or dims.
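A quick back-of-envelope check of the tensor named in the error log shows why resolution/dims matter here. The failing allocation has shape [576000, 300] in float32 (4 bytes per element), and its dimensions are driven by the model's resolution/dims settings rather than batch size; this is only a rough estimate, since the allocator also needs extra workspace and contiguous free memory on top of it:

```python
# Size of the tensor from the OP's log: shape [576000, 300], float32.
# 4 bytes per float32 element; result in MiB (1 MiB = 1024**2 bytes).
elements = 576_000 * 300
size_mib = elements * 4 / 1024 ** 2
print(f"{size_mib:.0f} MiB")  # → 659 MiB
```

So this single weight tensor alone needs roughly two-thirds of a gigabyte of VRAM before any activations or optimizer state are allocated, which is why lowering the model's dims shrinks memory use much faster than lowering batch size does.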

        #7697
        defalafa
        Participant

          Also check general RAM size – pagefile size / free disk space.

          Open Task Manager while the model is loading – under RAM and GPU you will see Python's usage rise as it grabs resources.

          Also try models_opt_on_gpu: False.

          That outsources the model optimizer into system RAM – slower, but more VRAM available.
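For reference, this setting is one of the interactive prompts when starting SAEHD training, in the same style as the prompts in the OP's log; answering n keeps the models and optimizer in system RAM (the exact prompt wording may differ slightly between DFL builds):

```
[y] Place models and optimizer on GPU ( y/n ?:help ) : n
```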
