Configuring the chat bot
The sample retail application includes a built-in chat interface that allows customers to interact with the store using natural language. This feature can help customers find products, receive recommendations, and ask questions about store policies. In this module, we'll configure the chat component to use our Mistral-7B model served through vLLM.
Let's reconfigure the UI component to enable the chat bot functionality and point it to our vLLM endpoint:
**Kustomize Patch**

```yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../../../base-application/ui
patches:
  - path: deployment.yaml
```

**Deployment/ui**

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/created-by: eks-workshop
    app.kubernetes.io/type: app
  name: ui
  namespace: ui
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/component: service
      app.kubernetes.io/instance: ui
      app.kubernetes.io/name: ui
  template:
    metadata:
      annotations:
        prometheus.io/path: /actuator/prometheus
        prometheus.io/port: "8080"
        prometheus.io/scrape: "true"
      labels:
        app.kubernetes.io/component: service
        app.kubernetes.io/created-by: eks-workshop
        app.kubernetes.io/instance: ui
        app.kubernetes.io/name: ui
    spec:
      containers:
        - env:
            - name: RETAIL_UI_CHAT_ENABLED
              value: "true"
            - name: RETAIL_UI_CHAT_PROVIDER
              value: openai
            - name: RETAIL_UI_CHAT_MODEL
              value: /models/mistral-7b-v0.3
            - name: RETAIL_UI_CHAT_OPENAI_BASE_URL
              value: http://mistral.vllm:8080
            - name: JAVA_OPTS
              value: -XX:MaxRAMPercentage=75.0 -Djava.security.egd=file:/dev/urandom
            - name: METADATA_KUBERNETES_POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: METADATA_KUBERNETES_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: METADATA_KUBERNETES_NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
          envFrom:
            - configMapRef:
                name: ui
          image: public.ecr.aws/aws-containers/retail-store-sample-ui:1.2.1
          imagePullPolicy: IfNotPresent
          livenessProbe:
            httpGet:
              path: /actuator/health/liveness
              port: 8080
            initialDelaySeconds: 45
            periodSeconds: 20
          name: ui
          ports:
            - containerPort: 8080
              name: http
              protocol: TCP
          resources:
            limits:
              memory: 1.5Gi
            requests:
              cpu: 250m
              memory: 1.5Gi
          securityContext:
            capabilities:
              add:
                - NET_BIND_SERVICE
              drop:
                - ALL
            readOnlyRootFilesystem: true
            runAsNonRoot: true
            runAsUser: 1000
          volumeMounts:
            - mountPath: /tmp
              name: tmp-volume
      securityContext:
        fsGroup: 1000
      serviceAccountName: ui
      volumes:
        - emptyDir:
            medium: Memory
          name: tmp-volume
```

**Diff**

```diff
          app.kubernetes.io/name: ui
     spec:
       containers:
         - env:
+            - name: RETAIL_UI_CHAT_ENABLED
+              value: "true"
+            - name: RETAIL_UI_CHAT_PROVIDER
+              value: openai
+            - name: RETAIL_UI_CHAT_MODEL
+              value: /models/mistral-7b-v0.3
+            - name: RETAIL_UI_CHAT_OPENAI_BASE_URL
+              value: http://mistral.vllm:8080
             - name: JAVA_OPTS
               value: -XX:MaxRAMPercentage=75.0 -Djava.security.egd=file:/dev/urandom
             - name: METADATA_KUBERNETES_POD_NAME
               valueFrom:
```

This configuration makes the following important changes:
- Enables the chat bot component in the UI
- Configures the application to use the OpenAI model provider, which works with vLLM's OpenAI-compatible API
- Specifies the model name expected by the OpenAI-style endpoint
- Sets the endpoint URL to `http://mistral.vllm:8080`, the Kubernetes Service that exposes the vLLM Deployment (you can probe this endpoint directly, as shown below)
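Since vLLM exposes an OpenAI-compatible API, you can sanity-check the endpoint from inside the cluster before the UI talks to it. The following is a minimal probe, assuming the `mistral` Service in the `vllm` namespace is reachable on port 8080 as configured above (the pod name `curl-check` is arbitrary):

```bash
# Launch a throwaway pod and list the models served by vLLM's
# OpenAI-compatible API (same address as RETAIL_UI_CHAT_OPENAI_BASE_URL)
kubectl run curl-check --rm -it --restart=Never \
  --image=curlimages/curl -- \
  curl -s http://mistral.vllm:8080/v1/models
```

The response should include the model ID `/models/mistral-7b-v0.3`, matching the `RETAIL_UI_CHAT_MODEL` value above.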
Let's apply these changes to our running application:
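We apply the kustomization with `kubectl apply -k`. The path below is illustrative and depends on where this module's manifests live in your environment:

```bash
# Apply the kustomization containing the patched UI Deployment
# (adjust the path for your environment; this one is an assumption)
kubectl apply -k ~/environment/eks-workshop/modules/aiml/chatbot/ui
```

You should see output similar to the following: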
```text
namespace/ui unchanged
serviceaccount/ui unchanged
configmap/ui unchanged
service/ui unchanged
deployment.apps/ui configured
```
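Before moving on, it's worth confirming that the rollout finished and that the new environment variables actually landed in the pod spec:

```bash
# Wait for the updated UI pods to become ready
kubectl rollout status deployment/ui -n ui --timeout=120s

# List the environment variable names on the UI container; the
# RETAIL_UI_CHAT_* entries should now be present
kubectl get deployment ui -n ui \
  -o jsonpath='{.spec.template.spec.containers[0].env[*].name}'
```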
With these changes applied, the UI will now display a chat interface that connects to our locally deployed language model. In the next section, we'll test this configuration to see our AI-powered chat bot in action.
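Under the hood, the UI sends OpenAI-style chat completion requests to the configured base URL. If you want to preview that interaction manually, here is a sketch of an equivalent request (the exact fields the UI's client library sends may differ):

```bash
# Send a minimal OpenAI-style chat completion request to vLLM
kubectl run chat-check --rm -it --restart=Never \
  --image=curlimages/curl -- \
  curl -s http://mistral.vllm:8080/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"model": "/models/mistral-7b-v0.3", "messages": [{"role": "user", "content": "Hello!"}]}'
```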
While the UI is now configured to use the vLLM endpoint, the model needs to be fully loaded before it can respond to requests. If you encounter any delays or errors when testing, this may be because the model is still being initialized.
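To check whether the model has finished loading, you can watch the vLLM server logs. This assumes the vLLM Deployment is named `mistral` in the `vllm` namespace, matching the Service referenced above:

```bash
# Tail the vLLM logs; startup is complete once the server reports
# it is serving requests
kubectl logs deployment/mistral -n vllm --tail=20
```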