Models
Models are one of AME's higher-level constructs; see what that means here. If you are configuring how a model should be trained, deployed, monitored, or validated, this is the right place. Models exist in an AME file alongside Datasets, Tasks, and Templates.
Model training
Model training is configured using a Task.
AME can be deployed with an MLflow instance, which is exposed to the training Task, allowing for simple storage and retrieval of models, metrics, and experiments.
# main project ame.yml
project: xgboost_project
models:
  - name: product_recommendor
    training:
      task:
        taskRef: train_my_model
tasks:
  - name: train_my_model
    fromTemplate: shared_templates.xgboost_resources
    executor:
      !poetry
      pythonVersion: 3.11
      command: python train.py
    resources:
      memory: 10G
      cpu: 4
      storage: 30G
      nvidia.com/gpu: 1
Model deployment
If AME is set up with a model registry (see supported registries here), models can be deployed for inference.
Just like for Tasks, you can, and probably should, define the resources required to perform inference with your model.
Here are configuration examples for the serving options available in AME.
MLflow
# main project ame.yml
project: xgboost_project
models:
  - name: product_recommendor
    training:
      task:
        taskRef: train_my_model
    deployment:
      resources:
        memory: 10G
        cpu: 4
        storage: 10G
        nvidia.com/gpu: 1
      autoDeploy: true
tasks:
  ...
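Once deployed, an MLflow-served model accepts scoring requests on the standard `/invocations` endpoint. The sketch below builds such a request with MLflow's `dataframe_split` input format; the hostname, column names, and values are illustrative assumptions, so substitute the URL your ingress actually exposes.

```python
import json
import urllib.request

# Assumed endpoint -- replace with the URL your ingress exposes for the model.
url = "http://models.example.com/product_recommendor/invocations"

# MLflow's scoring server accepts the "dataframe_split" input format.
payload = {
    "dataframe_split": {
        "columns": ["user_id", "basket_size"],  # illustrative feature names
        "data": [[1042, 3]],
    }
}
req = urllib.request.Request(
    url,
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
# Uncomment once the model is deployed and reachable:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```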
MLServer
MLServer support is planned for a future release, see this issue.
KServe
KServe support is planned for a future release, see this issue.
Triton
Advanced deployment configuration
Ingress
If you are hosting AME yourself, there are a number of decisions to make regarding model deployment. Currently AME does not automatically generate an ingress configuration, so one must be provided. You can provide a cluster-wide default as well as individual overrides for models.
See how to set a cluster wide default here.
Model-specific ingress can be set here. The plan is to provide better abstractions to avoid having to work with this directly, see this issue.
Setting model deployment ingress:
# main project ame.yml
project: xgboost_project
models:
  - name: product_recommendor
    deployment:
      ingressAnnotations:
        TODO
      resources:
        memory: 10G
        cpu: 4
        storage: 10G
        nvidia.com/gpu: 1
      autoDeploy: true
tasks:
  ...
Replicas
For production deployments you will likely want some degree of replication for model instances. Cluster-wide defaults can be set here. Model-specific replicas can be set like this:
# main project ame.yml
project: xgboost_project
models:
  - name: product_recommendor
    deployment:
      replicas: 3
      ...
tasks:
  ...
Image
If AME's default deployment image is insufficient for your use case, a custom image can be set. This can be changed cluster-wide here.
Model-specific deployment images can be set like this:
# main project ame.yml
project: xgboost_project
models:
  - name: product_recommendor
    deployment:
      image: my.deployment.image
      ...
tasks:
  ...
If a secret is required to access the image, remember to provide that secret to AME; a guide is here.
Model validation
AME supports validating model versions before they are deployed. To enable this, provide a Task that succeeds or fails based on the validity of a model version. See a guide here.
# main project ame.yml
project: xgboost_project
models:
  - name: product_recommendor
    validation:
      task:
        taskRef: model_validation
tasks:
  ...
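The `model_validation` Task just needs to exit non-zero when the candidate version should be rejected. A minimal sketch, assuming the validation score (here an RMSE) is passed in as a command-line argument and compared against a fixed threshold; both the argument convention and the threshold are illustrative:

```python
import sys

# Illustrative threshold -- tune for your model's acceptable error.
THRESHOLD = 0.5

def validate(rmse: float, threshold: float = THRESHOLD) -> bool:
    """Return True when the candidate model version is good enough to deploy."""
    return rmse <= threshold

if __name__ == "__main__":
    # The RMSE could instead be fetched from MLflow for the candidate run.
    rmse = float(sys.argv[1]) if len(sys.argv) > 1 else 0.42
    if not validate(rmse):
        sys.exit(1)  # non-zero exit marks the model version as invalid
```

When the script exits non-zero, the validation Task fails and AME will not deploy that model version.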
Model monitoring
Batch inference
TODO add a reference with all objects and options.