AWS Sagemaker¶
Sagemaker Clarify¶
Analyze and monitor data bias and model expandability to maintain fairness.
Features¶
- Bias monitoring. Alerts in Cloud Watch when bias is above determined thresholds
- Continuous monitoring of feature attribution drift. Alerts in Cloud Watch when significant feature attribution drift is detected
- Exportable reports and graphs detailing bias in Sagemaker Studio
Use Cases¶
- Analyze bias in the input and output data captured from SageMaker real-time endpoints
- Detect changes in how model attributes importance to features (Ex: Model starts relying more on zipcode than before)
Requirements¶
- Must enable data capture in Sagemaker Endpoints
Sagemaker Lineage Tracker¶
Track lineage of Machine Learning artifacts, such as datasets, models and experiments.
Use Cases¶
- Auditability
- Reproducibility
Sagemaker Model Monitor¶
Features¶
- sagemaker-model-monitor-analyzer built-in image: custom monitoring jobs for analysis and reporting purposes
Sagemaker Model Registry¶
Purpose built for storing, versioning and managing Machine Learning models. Allows users to organize models into model groups, which act as containers for different versions of a model.
Features¶
- Manage model versions and associate model versions within a model group
- Catalog models for production
- Associate metadata, such as training metrics, with a model
- Manage the approval status of a model
- Share models with others
- Organize model groups into collections
Sagemaker Data Wrangler¶
Features¶
- Corrupt image transform: preprocess data by simulating variations in image quality, such as blurring, noise, and compression artifacts. This can help improve the robustness of models to real-world conditions where image quality may vary
- Balanced data: oversample minority classe to resolve class imbalance
Sagemaker Canvas¶
Sagemaker JumpStart¶
Sagemaker Auto Model Tuning (AMT)¶
Use Cases¶
- Hyperparameter tunning
Sagemaker Pipelines¶
Is a service that allows you to create, automate, and manage end-to-end machine learning workflows. It provides a way to define and orchestrate the various steps involved in the machine learning process, such as data preprocessing, model training, and deployment.
Sagemaker Endpoints¶
Features¶
- multi-model endpoints: host multiple models on a single endpoint and route inference requests to the appropriate model based on the request content or other criteria. This can help reduce costs and improve resource utilization by sharing infrastructure across multiple models
- Auto-scaling: automatically adjust the number of instances serving the endpoint based on the incoming traffic. This can help ensure that the endpoint can handle varying workloads while optimizing costs by scaling down during periods of low demand.
Sagemaker Real-Time Inference¶
Use Cases¶
- Low latency applications
Sagemaker Shadow Testing¶
Use Cases¶
- Test new models in production without impacting end-users
Features¶
- Route a copy of live customer data to the new model for evaluation while maintaining the production model's operation. Compare predictions of both models to assess performance
SageMaker Batch Transform¶
Use Cases¶
- Offline, large-scale batch processing
Sagemaker Feature Store¶
Features¶
- Fully managed repository for storing, updating and retrieving machine learning features
Sagemaker Ground Truth¶
Features¶
- Human annotators for labeling data
Sagemaker Debugger¶
Use Cases¶
- Monitor training jobs in real time, detect issues, and automatically trigger actions such as stopping training or generating alerts.
Features¶
- Detect vanishing gradients, GPU underutilization, overfitting, and other common training issues. Provides real-time insights into the training process, allowing users to identify and address issues quickly.