Deep Learning on AWS

Amazon SageMaker completely abstracts the infrastructure creation and termination tasks required for a deep learning build, train, and deploy environment. As part of the infrastructure, it also handles the installation of common drivers such as NCCL, CUDA, and cuDNN; the installation of common data-processing libraries; and the installation of the common frameworks used for deep learning today. Notebook, training, and inference environments can be created through an API call, a CLI command, the Amazon SageMaker SDK, or the AWS Management Console.

Amazon SageMaker allows you to chain a sequence of tasks into an automated workflow for an ML pipeline, such as retraining and redeploying new variants of a model with consistency. It also simplifies common deep learning tasks such as data labeling, hyperparameter tuning, and real-time and batch inference. Simplifying these tasks allows you to accelerate and scale deep learning adoption within an organization without losing control, and a standard set of fully managed tools enables better collaboration and improved time to market for deep learning projects.

Amazon SageMaker also provides a single platform for both deep learning engineers and scientists and DevOps professionals in an organization, allowing a clean handoff between the model authors who design algorithms and train models on data and the DevOps team responsible for model deployment and monitoring.

Amazon SageMaker provides pre-built container images for most of the popular frameworks. You can extend the existing Amazon SageMaker container images or build your own. Advanced deep learning engineers and scientists working at the framework level may want to use a custom DL framework build, such as a custom build of TensorFlow (TF), to try a custom operator that accelerates deep learning training for a specific use case, or may want to run two TF processes on a single instance for improved performance.
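As a rough illustration of the API-driven environment creation described above, the sketch below assembles the parameter set for SageMaker's `CreateTrainingJob` API call, which is the same operation the console, CLI, and SDK ultimately invoke. The S3 paths, IAM role ARN, container image URI, and job name are hypothetical placeholders, not values from this paper.

```python
# Sketch: assembling a SageMaker CreateTrainingJob request.
# The S3 URIs, role ARN, and container image below are hypothetical
# placeholders; substitute values from your own AWS account.

def build_training_job_request(job_name: str,
                               image_uri: str,
                               role_arn: str,
                               train_s3_uri: str,
                               output_s3_uri: str) -> dict:
    """Return the parameter dict for the sagemaker CreateTrainingJob API."""
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,   # pre-built or custom DL container
            "TrainingInputMode": "File",
        },
        "RoleArn": role_arn,
        "InputDataConfig": [{
            "ChannelName": "training",
            "DataSource": {
                "S3DataSource": {
                    "S3DataType": "S3Prefix",
                    "S3Uri": train_s3_uri,
                }
            },
        }],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "ResourceConfig": {               # GPU instance for DL training
            "InstanceType": "ml.p3.2xlarge",
            "InstanceCount": 1,
            "VolumeSizeInGB": 50,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 86400},
    }

request = build_training_job_request(
    job_name="dl-training-demo",
    image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/my-tf:latest",
    role_arn="arn:aws:iam::123456789012:role/SageMakerRole",
    train_s3_uri="s3://my-bucket/train/",
    output_s3_uri="s3://my-bucket/output/",
)
# With boto3 you would pass this dict to
# boto3.client("sagemaker").create_training_job(**request)
```

In practice the Amazon SageMaker Python SDK wraps this call in an estimator object so you rarely build the request by hand, but the fields above are the same ones the console exposes as form inputs.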
Amazon SageMaker allows you to configure and customize the environment through two approaches: script mode and bring-your-own-container mode. Script mode lets you bring a custom script (such as a TF training script) and run it on the pre-built AWS TF deep learning containers. Bring your own container allows maximum flexibility and control, because you can build the container from scratch with your custom TF build and run it on Amazon SageMaker.

DIY Partially Managed Solution: Use Kubernetes with Kubeflow on AWS

This pattern applies to customers who have decided to standardize on Kubernetes as an infrastructure layer and would like to leverage their existing investment in Kubernetes to run deep learning training and inference jobs. This setup introduces a lot of
