Tag: SageMaker Inference
Run ML inference on unplanned and spiky traffic using Amazon SageMaker multi-model endpoints | Amazon Web Services
Amazon SageMaker multi-model endpoints (MMEs) are a fully managed capability of SageMaker inference that allows you to deploy thousands of models on a single...
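An MME hosts many model artifacts behind a single endpoint and loads them on demand; callers select a model per request via the `TargetModel` parameter of the SageMaker Runtime `invoke_endpoint` API. A minimal sketch of assembling such a request — the endpoint name, artifact name, and payload below are hypothetical placeholders:

```python
import json

def build_mme_request(endpoint_name, target_model, payload):
    """Assemble the keyword arguments for sagemaker-runtime invoke_endpoint.

    On a multi-model endpoint, TargetModel names the model artifact
    (relative to the endpoint's S3 model prefix) that this request
    should be routed to; SageMaker loads it on first use and keeps
    it cached on the instance for subsequent requests.
    """
    return {
        "EndpointName": endpoint_name,
        "TargetModel": target_model,        # e.g. "model-42.tar.gz"
        "ContentType": "application/json",
        "Body": json.dumps(payload),
    }

# An actual invocation needs AWS credentials and a deployed endpoint:
#   import boto3
#   smr = boto3.client("sagemaker-runtime")
#   response = smr.invoke_endpoint(**build_mme_request(
#       "my-mme-endpoint", "model-42.tar.gz", {"inputs": [1, 2, 3]}))
#   result = json.loads(response["Body"].read())
```

Because all models share one endpoint and instance fleet, this routing scheme is what lets a single endpoint serve unpredictable, spiky traffic across thousands of models without a dedicated endpoint per model.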
Reduce inference time for BERT models using neural architecture search and SageMaker Automated Model Tuning
In this post, we demonstrate how to use neural architecture search (NAS) based structural pruning to compress a fine-tuned BERT model to improve model...
Llama Guard is now available in Amazon SageMaker JumpStart
Today we are excited to announce that the Llama Guard model is now available for customers using Amazon SageMaker JumpStart. Llama Guard provides input...
Identify cybersecurity anomalies in your Amazon Security Lake data using Amazon SageMaker
Customers are faced with increasing security threats and vulnerabilities across infrastructure and application resources as their digital footprint has expanded and the business impact...
Package and deploy classical ML and LLMs easily with Amazon SageMaker, part 2: Interactive User Experiences in SageMaker Studio
Amazon SageMaker is a fully managed service that enables developers and data scientists to quickly and easily build, train, and deploy machine learning (ML)...
Package and deploy classical ML and LLMs easily with Amazon SageMaker, part 1: PySDK Improvements
Amazon SageMaker is a fully managed service that enables developers and data scientists to quickly and effortlessly build, train, and deploy machine learning (ML)...
Scale foundation model inference to hundreds of models with Amazon SageMaker – Part 1
As the democratization of foundation models (FMs) becomes more prevalent and demand for AI-augmented services increases, software as a service (SaaS) providers are looking to...
Reduce model deployment costs by 50% on average using the latest features of Amazon SageMaker
As organizations deploy models to production, they are constantly looking for ways to optimize the performance of their foundation models (FMs) running on the...
Accelerate data preparation for ML in Amazon SageMaker Canvas
Data preparation is a crucial step in any machine learning (ML) workflow, yet it often involves tedious and time-consuming tasks. Amazon SageMaker Canvas now...
Democratize ML on Salesforce Data Cloud with no-code Amazon SageMaker Canvas
This post is co-authored by Daryl Martis, Director of Product, Salesforce Einstein AI. This is the third post in a series discussing the integration...
Build a contextual chatbot for financial services using Amazon SageMaker JumpStart, Llama 2 and Amazon OpenSearch Serverless with Vector Engine
The financial service (FinServ) industry has unique generative AI requirements related to domain-specific data, data security, regulatory controls, and industry compliance standards. In addition,...
Build a medical imaging AI inference pipeline with MONAI Deploy on AWS
This post is co-written with Ming (Melvin) Qin, David Bericat, and Brad Genereaux from NVIDIA. Medical imaging AI researchers and developers need a scalable,...
Deploy ML models built in Amazon SageMaker Canvas to Amazon SageMaker real-time endpoints
Amazon SageMaker Canvas now supports deploying machine learning (ML) models to real-time inferencing endpoints, allowing you to take your ML models to production and drive...