PyTorch Distributed Training
In this blog post, I present a simple implementation of PyTorch distributed training for CIFAR-10 classification using ResNet models wrapped in DistributedDataParallel. I also cover how to use a Docker container for distributed training and how to start distributed training with torch.distributed.launch.
Source: PyTorch Distributed Training, an article by Lei Mao.