Inference model deployment

Using Helm

The repository includes two Helm charts:

custom-model-kserve-helm-chart: a KServe inference service that waits for input and can optionally be scaled through Knative and Istio
- easy to scale: Knative tracks the number of incoming requests and scales the number of pods up or down
- easy to troubleshoot, monitor and deploy (uses MLflow and MinIO storage for experiment tracking)
- easy to extend so that predictions are forwarded to the central InfluxDB, using the pre/postprocessing Transformer concept in KServe
simple-helm-chart: a model deployed as a Kubernetes Deployment that listens to a Redis channel, makes predictions and forwards them to the central InfluxDB
- simple to deploy and understand
- no scaling, troubleshooting or monitoring capabilities
- because the deployment subscribes to a Redis channel (rather than receiving input directly from a client), every replica would receive every message; the only way to scale the solution is to replace the Redis publish/subscribe channel with a Redis stream and consumer groups, so that multiple consumers share the processing load
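The stream and consumer group approach mentioned in the last point can be sketched with plain redis-cli commands. This is a minimal illustration, not part of either chart: the stream name `measurements`, the group name `predictors` and the `payload` field are hypothetical, and the connection defaults are assumptions that should be replaced with the `redis.host`, `redis.port` and `redis.pass` values of your deployment.

```shell
# Hypothetical names, not defined by the charts.
STREAM="measurements"
GROUP="predictors"
# Each replica needs its own consumer name; the pod hostname is one option.
CONSUMER="${HOSTNAME:-consumer-1}"

# Connection settings (assumed defaults; override through the environment).
REDIS_HOST="${REDIS_HOST:-127.0.0.1}"
REDIS_PORT="${REDIS_PORT:-6379}"
# redis-cli reads the password from REDISCLI_AUTH.
export REDISCLI_AUTH="${REDIS_PASS:-redis}"

rcli() { redis-cli -h "$REDIS_HOST" -p "$REDIS_PORT" "$@"; }

# Only talk to Redis when redis-cli exists and the server answers.
if command -v redis-cli >/dev/null 2>&1 && rcli PING >/dev/null 2>&1; then
  # Create the group once; MKSTREAM also creates the stream if it is missing.
  # A BUSYGROUP error on later runs only means the group already exists.
  rcli XGROUP CREATE "$STREAM" "$GROUP" '$' MKSTREAM || true

  # Producer side: append an entry instead of PUBLISH-ing to a channel.
  rcli XADD "$STREAM" '*' payload '{"speed": 42}'

  # Consumer side: '>' asks for entries not yet delivered to any consumer
  # in the group, so replicas split the work instead of duplicating it.
  rcli XREADGROUP GROUP "$GROUP" "$CONSUMER" COUNT 10 BLOCK 5000 STREAMS "$STREAM" '>'

  # After a prediction has been stored, acknowledge the entry by its ID:
  # rcli XACK "$STREAM" "$GROUP" "<entry-id>"
fi
```

Unlike publish/subscribe, entries that were read but never acknowledged stay in the group's pending list, so a replica that crashes mid-prediction does not silently lose input.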

Configuration parameters

custom-model-kserve-helm-chart

Name	Description	Default value
name	service/pod name	"custom-kserve-model"
storageUri	S3 URI where the model is stored	"s3://mlflow/5/988f6db2906641b8bcc1494c36619f9d/artifacts/model"
serviceAccountName	Service account that provides the hosts and credentials KServe needs to reach services such as S3; see the KServe documentation for details	"success6g"
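As a hedged example of overriding these parameters, the sketch below points the chart at a different model artifact with `--set`. It assumes the default `storageUri` follows the layout `s3://mlflow/<experiment id>/<run id>/artifacts/model`; the variable names are hypothetical and the IDs shown are simply the ones from the default value, to be replaced with those of your own MLflow run.

```shell
# Hypothetical variables; substitute the IDs of the MLflow run to deploy.
EXPERIMENT_ID="5"
RUN_ID="988f6db2906641b8bcc1494c36619f9d"
STORAGE_URI="s3://mlflow/${EXPERIMENT_ID}/${RUN_ID}/artifacts/model"
echo "Deploying model from ${STORAGE_URI}"

# Install or upgrade only when helm and the cloned chart are available.
if command -v helm >/dev/null 2>&1 && [ -d ./custom-model-kserve-helm-chart ]; then
  helm upgrade --install custom-model-kserve ./custom-model-kserve-helm-chart \
    --namespace custom-model-kserve --create-namespace \
    --set name="custom-kserve-model" \
    --set storageUri="${STORAGE_URI}" \
    --set serviceAccountName="success6g"
fi
```

`helm upgrade --install` is used here so the same command works for both the first deployment and later model updates.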

simple-helm-chart

Name	Description	Default value
image.repository	Deployment Docker image repository	5uperpalo/success6g_custom_kserve
image.pullPolicy	Deployment Docker image pull policy	IfNotPresent
image.tag	Deployment Docker image tag	latest
influxdb.host	Central InfluxDB host	"10.152.183.219"
influxdb.port	Central InfluxDB port	"80"
influxdb.user	Central InfluxDB username	"admin"
influxdb.pass	Central InfluxDB password	"admin_pass"
redis.host	Redis database host	"10.152.183.250"
redis.port	Redis database port	"6379"
redis.pass	Redis database password	"redis"
resources.requests.cpu	Kubernetes requested CPU	"2"
resources.requests.memory	Kubernetes requested memory	"4Gi"
resources.limits.cpu	Kubernetes CPU limit	"2"
resources.limits.memory	Kubernetes memory limit	"4Gi"
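When several of these defaults need to change, a values file is usually easier to maintain than a long list of `--set` flags. The following is a minimal sketch, assuming the dotted parameter names above map to nested keys in the chart's values file. The file name `simple-values.yaml` is hypothetical, and the addresses and resource figures are placeholders rather than recommendations.

```shell
# Write a hypothetical override file; keys not listed keep the chart defaults.
cat > simple-values.yaml <<'EOF'
redis:
  host: "10.43.128.90"     # placeholder address
influxdb:
  host: "10.17.252.101"    # placeholder address
  port: "30567"
resources:
  requests:
    cpu: "1"
    memory: "2Gi"
  limits:
    cpu: "1"
    memory: "2Gi"
EOF

# Render the manifests locally to inspect the result before installing.
# Skipped when helm or the cloned chart directory is not available.
if command -v helm >/dev/null 2>&1 && [ -d ./simple-helm-chart ]; then
  helm template simple ./simple-helm-chart --namespace simple -f simple-values.yaml
fi
```

The same file can then be passed to `helm install` with `-f simple-values.yaml`.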

Installation

From cloned repo:

custom-model-kserve-helm-chart

helm install custom-model-kserve ./custom-model-kserve-helm-chart --namespace custom-model-kserve --create-namespace

simple-helm-chart

helm install simple ./simple-helm-chart --namespace simple --create-namespace

From added Helm repo:

custom-model-kserve-helm-chart

helm repo add success6g https://5uperpalo.github.io/success6g-edge/charts/
helm install custom-model-kserve success6g/custom-model-kserve --namespace custom-model-kserve --create-namespace

simple-helm-chart

helm repo add success6g https://5uperpalo.github.io/success6g-edge/charts/
helm install simple success6g/simple --namespace simple --create-namespace
# Example with overridden Redis and InfluxDB settings:
# helm install simple success6g/simple --set redis.host="10.43.128.90" --set influxdb.host="10.17.252.101" --set influxdb.port="30567" --namespace simple --create-namespace
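
After either installation path, the releases can be checked with standard Helm and kubectl commands. This is a sketch that assumes the release names and namespaces used in the commands above, and that the KServe chart creates an InferenceService resource in its namespace.

```shell
# Release names double as namespaces in the install commands above.
RELEASES="custom-model-kserve simple"

for release in $RELEASES; do
  echo "== ${release} =="
  if command -v helm >/dev/null 2>&1 && command -v kubectl >/dev/null 2>&1; then
    # Show the release state recorded by Helm.
    helm status "$release" --namespace "$release" || echo "release ${release} not found"
    # List the pods the release created.
    kubectl get pods --namespace "$release" || echo "cannot list pods in ${release}"
  else
    echo "helm or kubectl not on PATH, skipping"
  fi
done

# For the KServe chart, the InferenceService reports readiness and its URL
# (assumes the KServe CRDs are installed in the cluster):
# kubectl get inferenceservices --namespace custom-model-kserve
```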