PyTorch Distributed Training
In this blog post, I would like to present a simple implementation of PyTorch distributed training on CIFAR-10 classification using
ResNet models wrapped in DistributedDataParallel. I will also cover how to use a Docker container for distributed training and how to start distributed training using torch.distributed.launch.
Source: PyTorch Distributed Training, an article by Lei Mao.