
Deep Learning on AWS




acceleration that you need to use resources efficiently and to reduce the cost of running inference. However, some inference workloads require an entire GPU or have low latency requirements. Solving this challenge at low cost requires a specialized, dedicated inference chip.

AWS Inferentia is a machine learning inference chip designed to deliver high performance at low cost. AWS Inferentia hardware and software support a wide spectrum of inference use cases and state-of-the-art neural networks. AWS Inferentia supports the TensorFlow, Apache MXNet, and PyTorch deep learning frameworks, as well as models that use the ONNX format. Each AWS Inferentia chip provides hundreds of TOPS (tera operations per second) of inference throughput, allowing complex models to make fast predictions. For even more performance, multiple AWS Inferentia chips can be used together to drive thousands of TOPS of throughput. AWS Inferentia will be available for use with Amazon SageMaker, Amazon EC2, and Amazon Elastic Inference. To be notified about AWS Inferentia availability, you can sign up here.

Amazon EC2 G4

We are advancing into an age where every customer interaction will be powered by AI in the backend. To meet and exceed your customers' demands, you need a compute platform that allows you to cost-effectively scale your AI-based products and services. The NVIDIA® Tesla® T4 GPU is the world's most advanced inference accelerator. Powered by NVIDIA Turing™ Tensor Cores, T4 brings revolutionary multi-precision inference performance to accelerate the diverse applications of modern AI. T4 is optimized for scale-out servers and is purpose-built to deliver state-of-the-art inference in real time. Responsiveness is key to user engagement for services such as conversational AI, recommender systems, and visual search. As models increase in accuracy and complexity, delivering the right answer right now requires exponentially larger compute capability.
Tesla T4 delivers up to 40X better low-latency throughput, so more requests can be served in real time. The new Amazon EC2 G4 instances package T4-based GPUs to provide AWS customers with a versatile platform to cost-efficiently deploy a wide range of AI services. Through AWS Marketplace, customers will be able to pair the G4 instances with NVIDIA
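As a concrete illustration of deploying on the G4 family, the sketch below builds the request parameters you would pass to boto3's ec2.run_instances() to launch a T4-backed instance. This is a minimal, hypothetical example: the AMI ID, key pair name, and the g4dn.xlarge default size are illustrative assumptions, not values taken from this document.

```python
# Sketch (assumptions noted): parameters for launching an Amazon EC2 G4
# instance, whose g4dn family carries NVIDIA T4 GPUs. The AMI ID and key
# pair name used in the usage example are placeholders, not real values.

def g4_launch_params(ami_id, key_name, instance_type="g4dn.xlarge"):
    """Return keyword arguments for boto3's ec2.run_instances()."""
    return {
        "ImageId": ami_id,              # e.g. a Deep Learning AMI of your choice
        "InstanceType": instance_type,  # G4 instance size (assumed default)
        "KeyName": key_name,
        "MinCount": 1,
        "MaxCount": 1,
    }

# Usage (requires AWS credentials, so shown commented out):
# import boto3
# ec2 = boto3.client("ec2")
# ec2.run_instances(**g4_launch_params("ami-0123456789abcdef0", "my-key"))
```

Keeping the parameter construction separate from the API call makes the launch configuration easy to inspect and test before any AWS request is made.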
