NORDICS20 - Assets

Deep Learning on AWS

Amazon Web Services Resources EMEA


implementations. Challenges arise from the complexity of most neural networks, the high dimensionality of the dataset, and the scale of the infrastructure needed to train large models on large volumes of training data. To meet these challenges, you need elasticity and performance in your compute and storage infrastructure.

On AWS, you can build your neural network from the ground up with the AWS Deep Learning Amazon Machine Image (AWS DL AMI), which comes preconfigured with TensorFlow, PyTorch, Apache MXNet, Chainer, Microsoft Cognitive Toolkit, Gluon, Horovod, and Keras, so you can quickly deploy and run any of these frameworks and tools at scale. Additionally, you can use the preconfigured AWS Deep Learning Containers (AWS DL Containers), which come preinstalled with deep learning frameworks (TensorFlow and Apache MXNet are supported), and run them on Amazon Elastic Kubernetes Service (Amazon EKS), self-managed Kubernetes, Amazon Elastic Container Service (Amazon ECS), or directly on Amazon Elastic Compute Cloud (Amazon EC2). Lastly, you can take advantage of the Amazon SageMaker Python SDK, which provides open source APIs and containers to train and deploy models in Amazon SageMaker with several different machine learning and deep learning frameworks. We discuss the most common solutions and patterns using these services in the second half of this paper.

Step 4. Train, Retrain, and Tune the Models

Training neural networks is different from traditional machine learning implementations because the model must learn the mapping function from inputs to outputs via function approximation in a nonconvex error space with many "good" solutions.
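To make the idea of a nonconvex error space concrete, the following is a minimal sketch rather than anything taken from this paper. It assumes a hypothetical one-dimensional "loss" with two separate minima and runs plain gradient descent from several starting points. The function `loss`, its coefficients, the learning rate, and the step count are all illustrative assumptions; a real neural network has a far higher-dimensional error surface.

```python
# Illustrative only: a hypothetical 1-D nonconvex "loss" with two minima.
# Nothing here comes from the paper; the coefficients are arbitrary.

def loss(w):
    return w**4 - 3 * w**2 + w


def grad(w):
    # Analytical derivative of loss(w).
    return 4 * w**3 - 6 * w + 1


def gradient_descent(w0, lr=0.01, steps=2000):
    """Run fixed-step gradient descent starting from w0."""
    w = w0
    for _ in range(steps):
        w -= lr * grad(w)
    return w


if __name__ == "__main__":
    # Different initial weights can settle in different minima.
    for w0 in (-2.0, -0.5, 0.5, 2.0):
        w = gradient_descent(w0)
        print(f"start={w0:+.1f} -> w={w:+.4f}, loss={loss(w):+.4f}")
```

In this sketch the two negative starting points settle in one minimum and the two positive starting points settle in another, so the solution found depends on initialization. This is one reason training usually involves repeated runs and tuning rather than a single computation.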
Because we cannot directly compute the optimal set of weights via a closed-form solution (as is the case with simple linear regression models), and we cannot get global convergence guarantees, training a neural network can be challenging and usually requires much more data and compute resources than other machine learning algorithms.

AWS provides a variety of tools and services to simplify the training process of your neural networks. Throughout this paper, we discuss a variety of options that include running your self-managed deep learning environment on Amazon EC2; running a deep learning environment on Amazon EKS or Amazon ECS; or using the fully managed Amazon SageMaker service for deep learning. All of these environments use highly customized, GPU-powered hardware to reduce training time and training cost.

In addition to the model design discussed in Step 2. Choose and Optimize Your Algorithm, you also have the option of setting hyperparameters before starting the
