Deep Learning on AWS


Figure 13: Workflow for retraining and redeployment

Orchestrate Your Hyperscale Deep Learning Jobs Using AWS Batch with Amazon SageMaker as Backend in Multiple AWS Regions

Some customers have use cases that require training on very large datasets where the data must remain within the sovereign boundaries of the Region in which it was generated, whether for cost, performance, or regulatory reasons. This could be 4K video data generated locally by an autonomous vehicle, or campaign data generated locally, transferred to the nearest AWS Region, and labeled within that same Region. You can use Amazon SageMaker to train your model in the same Region as the data. Optionally, you can launch multiple Amazon SageMaker training jobs to train in parallel in each Region. You can use AWS Batch to orchestrate and monitor the jobs running on Amazon SageMaker in multiple AWS Regions from a single central Region. This event-driven architecture triggers a training job as data is uploaded from the on-premises environment to the nearest AWS Region.

You can index the data arriving in Amazon S3 in a relational table in one central place. The central table keeps an index of all the data files sourced from the different campaigns running in different geographic locations. From this central table, you can issue a query to generate an AWS Batch array job. AWS Batch array jobs are submitted just like regular jobs; however, you specify an array size (between 2 and 10,000) to define how many child jobs should run in the array. If you submit a job with an array size of 1,000, a single parent job runs and spawns 1,000 child jobs. The array job is a reference or pointer to the parent job that manages all the child jobs. This feature allows you to submit large
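As a minimal sketch of the array-job step described above, the snippet below builds the parameters for an AWS Batch `submit_job` call with `arrayProperties` set, so that one parent job fans out into child jobs (one per data shard, for example one per Region or campaign). The job name, queue, and job definition are hypothetical placeholders, not names from this document; the actual `boto3` call is shown commented out because it requires AWS credentials and pre-created Batch resources.

```python
def build_array_job_request(num_shards: int) -> dict:
    """Build the parameters for an AWS Batch array job whose child jobs
    each process one data shard. AWS Batch requires an array size
    between 2 and 10,000."""
    if not 2 <= num_shards <= 10_000:
        raise ValueError("AWS Batch array size must be between 2 and 10,000")
    return {
        "jobName": "multi-region-training",           # hypothetical name
        "jobQueue": "sagemaker-orchestration-queue",  # hypothetical queue
        "jobDefinition": "sagemaker-training-job",    # hypothetical definition
        # One parent job; AWS Batch spawns num_shards child jobs, each of
        # which can read its own index from the AWS_BATCH_JOB_ARRAY_INDEX
        # environment variable to pick its shard from the central table.
        "arrayProperties": {"size": num_shards},
    }


if __name__ == "__main__":
    request = build_array_job_request(1000)
    # To actually submit (requires credentials and existing Batch resources):
    # import boto3
    # batch = boto3.client("batch", region_name="us-east-1")
    # response = batch.submit_job(**request)
    print(request["arrayProperties"])
```

Each child job can then look up its assigned data file in the central table and launch the corresponding Amazon SageMaker training job in the Region where that data resides.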
