RuntimeError: Worker failed with error 'CUDA out of memory. Tried to allocate 288.00 MiB. GPU 0 has a total capacity of 63.59 GiB of which 0 bytes is free. Of the allocated memory 57.90 GiB is allocated by PyTorch, and 45.49 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (pytorch.org/docs/stable/notes/cuda.html#environment-variables)', please check the stack trace above for the root cause
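
The figures in the message above can be checked with a little arithmetic. The following is a minimal sketch, not part of PyTorch: `parse_oom` and `to_mib` are hypothetical helper names, and the regular expressions assume the exact wording of this one message. It converts every figure to MiB and compares the reserved-but-unallocated amount with the size of the failed request. The message recommends `expandable_segments:True` only when that amount is large.

```python
import re

# Numeric part of the error message, copied verbatim from the log line above.
MESSAGE = (
    "CUDA out of memory. Tried to allocate 288.00 MiB. "
    "GPU 0 has a total capacity of 63.59 GiB of which 0 bytes is free. "
    "Of the allocated memory 57.90 GiB is allocated by PyTorch, "
    "and 45.49 MiB is reserved by PyTorch but unallocated."
)

# Conversion factors to MiB for the units that appear in the message.
UNIT_TO_MIB = {"bytes": 1 / (1024 * 1024), "KiB": 1 / 1024, "MiB": 1.0, "GiB": 1024.0}
UNITS = r"(bytes|KiB|MiB|GiB)"


def to_mib(value, unit):
    """Convert a (number, unit) pair taken from the message to MiB."""
    return float(value) * UNIT_TO_MIB[unit]


def parse_oom(message):
    """Extract the memory figures from the message (hypothetical helper).

    The patterns are tied to the wording of this particular message and
    may not match other PyTorch versions.
    """
    patterns = {
        "requested": r"Tried to allocate ([\d.]+) " + UNITS,
        "capacity": r"total capacity of ([\d.]+) " + UNITS,
        "free": r"of which ([\d.]+) " + UNITS + r" is free",
        "allocated": r"([\d.]+) " + UNITS + r" is allocated by PyTorch",
        "reserved_unallocated": r"([\d.]+) " + UNITS + r" is reserved by PyTorch",
    }
    stats = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, message)
        if match is None:
            raise ValueError(f"could not find '{name}' in message")
        stats[name] = to_mib(match.group(1), match.group(2))
    return stats


stats = parse_oom(MESSAGE)

# Memory on the device that this process's PyTorch allocator does not account for.
unaccounted = (
    stats["capacity"]
    - stats["free"]
    - stats["allocated"]
    - stats["reserved_unallocated"]
)

# The allocator hint is only relevant if the cached-but-unused pool is large;
# here "large" is taken to mean at least the size of the failed request.
fragmentation_likely = stats["reserved_unallocated"] >= stats["requested"]

for name, mib in stats.items():
    print(f"{name:>22}: {mib:10.2f} MiB")
print(f"{'unaccounted':>22}: {unaccounted:10.2f} MiB")
print(f"fragmentation likely: {fragmentation_likely}")
```

By this check the reserved-but-unallocated pool (45.49 MiB) is much smaller than the 288.00 MiB request, so fragmentation looks like an unlikely cause. The remaining capacity is in use outside this process's PyTorch allocator. The message does not say by what, so the stack trace and any other processes on GPU 0 are the next things to inspect.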