diff --git a/gpu-aware-scheduling/README.md b/gpu-aware-scheduling/README.md index 8efafc71..a51bae7a 100644 --- a/gpu-aware-scheduling/README.md +++ b/gpu-aware-scheduling/README.md @@ -36,7 +36,7 @@ Note: a shell script that shows these steps can be found [here](deploy/extender- The extender configuration files can be found under deploy/extender-configuration. GAS Scheduler Extender needs to be registered with the Kubernetes Scheduler. In order to do this a configmap should be created like the below: -```` +``` apiVersion: v1alpha1 kind: ConfigMap metadata: @@ -72,14 +72,14 @@ data: ] } -```` +``` A similar file can be found [in the deploy folder](./deploy/extender-configuration/scheduler-extender-configmap.yaml). This configmap can be created with ``kubectl apply -f ./deploy/scheduler-extender-configmap.yaml`` The scheduler requires flags passed to it in order to know the location of this config map. The flags are: -```` +``` - --policy-configmap=scheduler-extender-policy - --policy-configmap-namespace=kube-system -```` +``` If scheduler is running as a service these can be added as flags to the binary. If scheduler is running as a container - as in kubeadm - these args can be passed in the deployment file. Note: For Kubeadm set ups some additional steps may be needed. diff --git a/gpu-aware-scheduling/docs/example/README.md b/gpu-aware-scheduling/docs/example/README.md new file mode 100644 index 00000000..b0101601 --- /dev/null +++ b/gpu-aware-scheduling/docs/example/README.md @@ -0,0 +1,13 @@ +This folder has a simple example POD which uses kubernetes extended resources + +To deploy, you can run in this folder: + +``` +kubectl apply -f . +``` + +Then you can check the GPU devices of the first pod in the deployment with: + +``` +kubectl exec -it deploy/bb-example -- ls /dev/dri +``` \ No newline at end of file diff --git a/gpu-aware-scheduling/docs/example/bb_example.yaml b/gpu-aware-scheduling/docs/example/bb_example.yaml new file mode 100644 index 00000000..bd68d527 --- /dev/null +++ b/gpu-aware-scheduling/docs/example/bb_example.yaml @@ -0,0 +1,23 @@ +apiVersion: apps/v1 +kind: Deployment +metadata: + name: bb-example +spec: + replicas: 1 + selector: + matchLabels: + app: bb-example + template: + metadata: + labels: + app: bb-example + spec: + containers: + - name: gpu-resource-request + image: busybox:1.33.1 + command: ['sh', '-c', 'echo The gpu resource request app is running! && sleep 6000'] + resources: + limits: + gpu.intel.com/i915: 1 + gpu.intel.com/millicores: 100 + gpu.intel.com/memory.max: 1G diff --git a/gpu-aware-scheduling/docs/gpu_plugin/README.md b/gpu-aware-scheduling/docs/gpu_plugin/README.md new file mode 100644 index 00000000..6f9bc2f2 --- /dev/null +++ b/gpu-aware-scheduling/docs/gpu_plugin/README.md @@ -0,0 +1,8 @@ +This folder has a simple example of how to deploy the Intel GPU plugin so that it has the fractional +resource support enabled. 
+ +To deploy, you can run in this folder: + +``` +kubectl apply -k overlays/fractional_resources +``` \ No newline at end of file diff --git a/gpu-aware-scheduling/docs/gpu_plugin/base/intel-gpu-plugin.yaml b/gpu-aware-scheduling/docs/gpu_plugin/base/intel-gpu-plugin.yaml new file mode 100644 index 00000000..29c00882 --- /dev/null +++ b/gpu-aware-scheduling/docs/gpu_plugin/base/intel-gpu-plugin.yaml @@ -0,0 +1,60 @@ +apiVersion: apps/v1 +kind: DaemonSet +metadata: + name: intel-gpu-plugin + labels: + app: intel-gpu-plugin +spec: + selector: + matchLabels: + app: intel-gpu-plugin + template: + metadata: + labels: + app: intel-gpu-plugin + spec: + initContainers: + - name: intel-gpu-initcontainer + image: intel/intel-gpu-initcontainer:devel + imagePullPolicy: IfNotPresent + securityContext: + readOnlyRootFilesystem: true + volumeMounts: + - mountPath: /etc/kubernetes/node-feature-discovery/source.d/ + name: nfd-source-hooks + containers: + - name: intel-gpu-plugin + env: + - name: NODE_NAME + valueFrom: + fieldRef: + fieldPath: spec.nodeName + image: intel/intel-gpu-plugin:devel + imagePullPolicy: IfNotPresent + securityContext: + readOnlyRootFilesystem: true + volumeMounts: + - name: devfs + mountPath: /dev/dri + readOnly: true + - name: sysfs + mountPath: /sys/class/drm + readOnly: true + - name: kubeletsockets + mountPath: /var/lib/kubelet/device-plugins + volumes: + - name: devfs + hostPath: + path: /dev/dri + - name: sysfs + hostPath: + path: /sys/class/drm + - name: kubeletsockets + hostPath: + path: /var/lib/kubelet/device-plugins + - name: nfd-source-hooks + hostPath: + path: /etc/kubernetes/node-feature-discovery/source.d/ + type: DirectoryOrCreate + nodeSelector: + kubernetes.io/arch: amd64 diff --git a/gpu-aware-scheduling/docs/gpu_plugin/base/kustomization.yaml b/gpu-aware-scheduling/docs/gpu_plugin/base/kustomization.yaml new file mode 100644 index 00000000..f51925e0 --- /dev/null +++ b/gpu-aware-scheduling/docs/gpu_plugin/base/kustomization.yaml @@ -0,0 +1,2 @@ +resources: + - intel-gpu-plugin.yaml diff --git a/gpu-aware-scheduling/docs/gpu_plugin/kustomization.yaml b/gpu-aware-scheduling/docs/gpu_plugin/kustomization.yaml new file mode 100644 index 00000000..f191f3aa --- /dev/null +++ b/gpu-aware-scheduling/docs/gpu_plugin/kustomization.yaml @@ -0,0 +1,2 @@ +bases: + - base diff --git a/gpu-aware-scheduling/docs/gpu_plugin/overlays/fractional_resources/add-args.yaml b/gpu-aware-scheduling/docs/gpu_plugin/overlays/fractional_resources/add-args.yaml new file mode 100644 index 00000000..a438bab4 --- /dev/null +++ b/gpu-aware-scheduling/docs/gpu_plugin/overlays/fractional_resources/add-args.yaml @@ -0,0 +1,12 @@ +apiVersion: apps/v1 +kind: DaemonSet +metadata: + name: intel-gpu-plugin +spec: + template: + spec: + containers: + - name: intel-gpu-plugin + args: + - "-shared-dev-num=300" + - "-resource-manager" diff --git a/gpu-aware-scheduling/docs/gpu_plugin/overlays/fractional_resources/add-podresource-mount.yaml b/gpu-aware-scheduling/docs/gpu_plugin/overlays/fractional_resources/add-podresource-mount.yaml new file mode 100644 index 00000000..d127334f --- /dev/null +++ b/gpu-aware-scheduling/docs/gpu_plugin/overlays/fractional_resources/add-podresource-mount.yaml @@ -0,0 +1,17 @@ +apiVersion: apps/v1 +kind: DaemonSet +metadata: + name: intel-gpu-plugin +spec: + template: + spec: + containers: + - name: intel-gpu-plugin + volumeMounts: + - name: podresources + mountPath: /var/lib/kubelet/pod-resources + volumes: + - name: podresources + hostPath: + path: 
/var/lib/kubelet/pod-resources + \ No newline at end of file diff --git a/gpu-aware-scheduling/docs/gpu_plugin/overlays/fractional_resources/add-serviceaccount.yaml b/gpu-aware-scheduling/docs/gpu_plugin/overlays/fractional_resources/add-serviceaccount.yaml new file mode 100644 index 00000000..2926657b --- /dev/null +++ b/gpu-aware-scheduling/docs/gpu_plugin/overlays/fractional_resources/add-serviceaccount.yaml @@ -0,0 +1,8 @@ +apiVersion: apps/v1 +kind: DaemonSet +metadata: + name: intel-gpu-plugin +spec: + template: + spec: + serviceAccountName: resource-reader-sa diff --git a/gpu-aware-scheduling/docs/gpu_plugin/overlays/fractional_resources/kustomization.yaml b/gpu-aware-scheduling/docs/gpu_plugin/overlays/fractional_resources/kustomization.yaml new file mode 100644 index 00000000..991ddcd5 --- /dev/null +++ b/gpu-aware-scheduling/docs/gpu_plugin/overlays/fractional_resources/kustomization.yaml @@ -0,0 +1,10 @@ +bases: + - ../../base +resources: + - resource-cluster-role-binding.yaml + - resource-cluster-role.yaml + - resource-reader-sa.yaml +patches: + - add-serviceaccount.yaml + - add-podresource-mount.yaml + - add-args.yaml \ No newline at end of file diff --git a/gpu-aware-scheduling/docs/gpu_plugin/overlays/fractional_resources/resource-cluster-role-binding.yaml b/gpu-aware-scheduling/docs/gpu_plugin/overlays/fractional_resources/resource-cluster-role-binding.yaml new file mode 100644 index 00000000..f46439f3 --- /dev/null +++ b/gpu-aware-scheduling/docs/gpu_plugin/overlays/fractional_resources/resource-cluster-role-binding.yaml @@ -0,0 +1,12 @@ +apiVersion: rbac.authorization.k8s.io/v1 +kind: ClusterRoleBinding +metadata: + name: resource-reader-rb +subjects: +- kind: ServiceAccount + name: resource-reader-sa + namespace: default +roleRef: + kind: ClusterRole + name: resource-reader + apiGroup: rbac.authorization.k8s.io diff --git a/gpu-aware-scheduling/docs/gpu_plugin/overlays/fractional_resources/resource-cluster-role.yaml b/gpu-aware-scheduling/docs/gpu_plugin/overlays/fractional_resources/resource-cluster-role.yaml new file mode 100644 index 00000000..cca48ccc --- /dev/null +++ b/gpu-aware-scheduling/docs/gpu_plugin/overlays/fractional_resources/resource-cluster-role.yaml @@ -0,0 +1,8 @@ +apiVersion: rbac.authorization.k8s.io/v1 +kind: ClusterRole +metadata: + name: resource-reader +rules: +- apiGroups: [""] + resources: ["pods"] + verbs: ["list"] diff --git a/gpu-aware-scheduling/docs/gpu_plugin/overlays/fractional_resources/resource-reader-sa.yaml b/gpu-aware-scheduling/docs/gpu_plugin/overlays/fractional_resources/resource-reader-sa.yaml new file mode 100644 index 00000000..a2879ece --- /dev/null +++ b/gpu-aware-scheduling/docs/gpu_plugin/overlays/fractional_resources/resource-reader-sa.yaml @@ -0,0 +1,4 @@ +apiVersion: v1 +kind: ServiceAccount +metadata: + name: resource-reader-sa diff --git a/gpu-aware-scheduling/docs/nfd/README.md b/gpu-aware-scheduling/docs/nfd/README.md new file mode 100644 index 00000000..2e623114 --- /dev/null +++ b/gpu-aware-scheduling/docs/nfd/README.md @@ -0,0 +1,8 @@ +This folder has a simple example of how to deploy NFD so that it can create extended resources for +GPU Aware Scheduling + +To deploy, you can run in this folder: + +``` +kubectl apply -k . 
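+# Optionally, you can preview the patched manifests before (or after) applying them;
+# the nfd-master command in the output should include the gpu.intel.com resource label
+# flags described in the usage documentation:
+# kubectl kustomize .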
+``` \ No newline at end of file diff --git a/gpu-aware-scheduling/docs/nfd/kustom/env_vars.yaml b/gpu-aware-scheduling/docs/nfd/kustom/env_vars.yaml new file mode 100644 index 00000000..576133c8 --- /dev/null +++ b/gpu-aware-scheduling/docs/nfd/kustom/env_vars.yaml @@ -0,0 +1,19 @@ +apiVersion: apps/v1 +kind: DaemonSet +metadata: + name: nfd-worker +spec: + template: + spec: + containers: + - env: + # GPU_MEMORY_OVERRIDE is the value for gpus that don't tell memory amount via the driver + - name: GPU_MEMORY_OVERRIDE + value: "4000000000" + # GPU_MEMORY_RESERVED is the value of memory scoped out from k8s for those gpus which + # do tell the memory amount via the driver +# - name: GPU_MEMORY_RESERVED +# value: "294967295" + name: nfd-worker + +# the env var values propagate to the nfd extension hook (gpu nfd hook, installed by gpu plugin initcontainer) diff --git a/gpu-aware-scheduling/docs/nfd/kustom/external_resources.yaml b/gpu-aware-scheduling/docs/nfd/kustom/external_resources.yaml new file mode 100644 index 00000000..c5487c22 --- /dev/null +++ b/gpu-aware-scheduling/docs/nfd/kustom/external_resources.yaml @@ -0,0 +1,13 @@ +apiVersion: apps/v1 +kind: Deployment +metadata: + name: nfd-master +spec: + template: + spec: + containers: + - name: nfd-master + command: + - "nfd-master" + - "--resource-labels=gpu.intel.com/memory.max,gpu.intel.com/millicores" + - "--extra-label-ns=gpu.intel.com" diff --git a/gpu-aware-scheduling/docs/nfd/kustom/rbac.yaml b/gpu-aware-scheduling/docs/nfd/kustom/rbac.yaml new file mode 100644 index 00000000..a9cf7d98 --- /dev/null +++ b/gpu-aware-scheduling/docs/nfd/kustom/rbac.yaml @@ -0,0 +1,18 @@ +apiVersion: rbac.authorization.k8s.io/v1 +kind: ClusterRole +metadata: + name: nfd-master +rules: +- apiGroups: + - "" + resources: + - nodes +# since we are using command line flag --resource-labels to create extended resources +# this kustomize patch uncomments "- nodes/status" + - nodes/status + verbs: + - get + - patch + - update + # List only needed for --prune + - list diff --git a/gpu-aware-scheduling/docs/nfd/kustomization.yaml b/gpu-aware-scheduling/docs/nfd/kustomization.yaml new file mode 100644 index 00000000..4e3d0fcd --- /dev/null +++ b/gpu-aware-scheduling/docs/nfd/kustomization.yaml @@ -0,0 +1,7 @@ +resources: +- v0.7.0/nfd-master.yaml.template +- v0.7.0/nfd-worker-daemonset.yaml.template +patchesStrategicMerge: +- kustom/external_resources.yaml +- kustom/env_vars.yaml +- kustom/rbac.yaml diff --git a/gpu-aware-scheduling/docs/nfd/v0.7.0/nfd-master.yaml.template b/gpu-aware-scheduling/docs/nfd/v0.7.0/nfd-master.yaml.template new file mode 100644 index 00000000..6e39e631 --- /dev/null +++ b/gpu-aware-scheduling/docs/nfd/v0.7.0/nfd-master.yaml.template @@ -0,0 +1,129 @@ +apiVersion: v1 +kind: Namespace +metadata: + name: node-feature-discovery # NFD namespace +--- +apiVersion: v1 +kind: ServiceAccount +metadata: + name: nfd-master + namespace: node-feature-discovery +--- +apiVersion: rbac.authorization.k8s.io/v1 +kind: ClusterRole +metadata: + name: nfd-master +rules: +- apiGroups: + - "" + resources: + - nodes +# when using command line flag --resource-labels to create extended resources +# you will need to uncomment "- nodes/status" +# - nodes/status + verbs: + - get + - patch + - update + # List only needed for --prune + - list +--- +apiVersion: rbac.authorization.k8s.io/v1 +kind: ClusterRoleBinding +metadata: + name: nfd-master +roleRef: + apiGroup: rbac.authorization.k8s.io + kind: ClusterRole + name: nfd-master +subjects: +- kind: ServiceAccount + 
name: nfd-master + namespace: node-feature-discovery +--- +apiVersion: apps/v1 +kind: Deployment +metadata: + labels: + app: nfd-master + name: nfd-master + namespace: node-feature-discovery +spec: + replicas: 1 + selector: + matchLabels: + app: nfd-master + template: + metadata: + labels: + app: nfd-master + spec: + serviceAccount: nfd-master + affinity: + nodeAffinity: + preferredDuringSchedulingIgnoredDuringExecution: + - weight: 1 + preference: + matchExpressions: + - key: "node-role.kubernetes.io/master" + operator: In + values: [""] + tolerations: + - key: "node-role.kubernetes.io/master" + operator: "Equal" + value: "" + effect: "NoSchedule" + containers: + - env: + - name: NODE_NAME + valueFrom: + fieldRef: + fieldPath: spec.nodeName + image: k8s.gcr.io/nfd/node-feature-discovery:v0.7.0 + name: nfd-master + securityContext: + allowPrivilegeEscalation: false + capabilities: + drop: ["ALL"] + readOnlyRootFilesystem: true + runAsNonRoot: true + command: + - "nfd-master" +## Enable TLS authentication +## The example below assumes having the root certificate named ca.crt stored in +## a ConfigMap named nfd-ca-cert, and, the TLS authentication credentials stored +## in a TLS Secret named nfd-master-cert. +## Additional hardening can be enabled by specifying --verify-node-name in +## args, in which case every nfd-worker requires a individual node-specific +## TLS certificate. +# args: +# - "--ca-file=/etc/kubernetes/node-feature-discovery/trust/ca.crt" +# - "--key-file=/etc/kubernetes/node-feature-discovery/certs/tls.key" +# - "--cert-file=/etc/kubernetes/node-feature-discovery/certs/tls.crt" +# volumeMounts: +# - name: nfd-ca-cert +# mountPath: "/etc/kubernetes/node-feature-discovery/trust" +# readOnly: true +# - name: nfd-master-cert +# mountPath: "/etc/kubernetes/node-feature-discovery/certs" +# readOnly: true +# volumes: +# - name: nfd-ca-cert +# configMap: +# name: nfd-ca-cert +# - name: nfd-master-cert +# secret: +# secretName: nfd-master-cert +--- +apiVersion: v1 +kind: Service +metadata: + name: nfd-master + namespace: node-feature-discovery +spec: + selector: + app: nfd-master + ports: + - protocol: TCP + port: 8080 + type: ClusterIP diff --git a/gpu-aware-scheduling/docs/nfd/v0.7.0/nfd-worker-daemonset.yaml.template b/gpu-aware-scheduling/docs/nfd/v0.7.0/nfd-worker-daemonset.yaml.template new file mode 100644 index 00000000..1120beb3 --- /dev/null +++ b/gpu-aware-scheduling/docs/nfd/v0.7.0/nfd-worker-daemonset.yaml.template @@ -0,0 +1,188 @@ +apiVersion: apps/v1 +kind: DaemonSet +metadata: + labels: + app: nfd-worker + name: nfd-worker + namespace: node-feature-discovery +spec: + selector: + matchLabels: + app: nfd-worker + template: + metadata: + labels: + app: nfd-worker + spec: + dnsPolicy: ClusterFirstWithHostNet + containers: + - env: + - name: NODE_NAME + valueFrom: + fieldRef: + fieldPath: spec.nodeName + image: k8s.gcr.io/nfd/node-feature-discovery:v0.7.0 + name: nfd-worker + securityContext: + allowPrivilegeEscalation: false + capabilities: + drop: ["ALL"] + readOnlyRootFilesystem: true + runAsNonRoot: true + command: + - "nfd-worker" + args: + - "--sleep-interval=60s" + - "--server=nfd-master:8080" +## Enable TLS authentication (1/3) +## The example below assumes having the root certificate named ca.crt stored in +## a ConfigMap named nfd-ca-cert, and, the TLS authentication credentials stored +## in a TLS Secret named nfd-worker-cert +# - "--ca-file=/etc/kubernetes/node-feature-discovery/trust/ca.crt" +# - 
"--key-file=/etc/kubernetes/node-feature-discovery/certs/tls.key" +# - "--cert-file=/etc/kubernetes/node-feature-discovery/certs/tls.crt" + volumeMounts: + - name: host-boot + mountPath: "/host-boot" + readOnly: true + - name: host-os-release + mountPath: "/host-etc/os-release" + readOnly: true + - name: host-sys + mountPath: "/host-sys" + readOnly: true + - name: source-d + mountPath: "/etc/kubernetes/node-feature-discovery/source.d/" + readOnly: true + - name: features-d + mountPath: "/etc/kubernetes/node-feature-discovery/features.d/" + readOnly: true + - name: nfd-worker-conf + mountPath: "/etc/kubernetes/node-feature-discovery" + readOnly: true +## Enable TLS authentication (2/3) +# - name: nfd-ca-cert +# mountPath: "/etc/kubernetes/node-feature-discovery/trust" +# readOnly: true +# - name: nfd-worker-cert +# mountPath: "/etc/kubernetes/node-feature-discovery/certs" +# readOnly: true + volumes: + - name: host-boot + hostPath: + path: "/boot" + - name: host-os-release + hostPath: + path: "/etc/os-release" + - name: host-sys + hostPath: + path: "/sys" + - name: source-d + hostPath: + path: "/etc/kubernetes/node-feature-discovery/source.d/" + - name: features-d + hostPath: + path: "/etc/kubernetes/node-feature-discovery/features.d/" + - name: nfd-worker-conf + configMap: + name: nfd-worker-conf +## Enable TLS authentication (3/3) +# - name: nfd-ca-cert +# configMap: +# name: nfd-ca-cert +# - name: nfd-worker-cert +# secret: +# secretName: nfd-worker-cert +--- +apiVersion: v1 +kind: ConfigMap +metadata: + name: nfd-worker-conf + namespace: node-feature-discovery +data: + nfd-worker.conf: | + #sources: + # cpu: + # cpuid: + ## NOTE: whitelist has priority over blacklist + # attributeBlacklist: + # - "BMI1" + # - "BMI2" + # - "CLMUL" + # - "CMOV" + # - "CX16" + # - "ERMS" + # - "F16C" + # - "HTT" + # - "LZCNT" + # - "MMX" + # - "MMXEXT" + # - "NX" + # - "POPCNT" + # - "RDRAND" + # - "RDSEED" + # - "RDTSCP" + # - "SGX" + # - "SSE" + # - "SSE2" + # - "SSE3" + # - "SSE4.1" + # - "SSE4.2" + # - "SSSE3" + # attributeWhitelist: + # kernel: + # kconfigFile: "/path/to/kconfig" + # configOpts: + # - "NO_HZ" + # - "X86" + # - "DMI" + # pci: + # deviceClassWhitelist: + # - "0200" + # - "03" + # - "12" + # deviceLabelFields: + # - "class" + # - "vendor" + # - "device" + # - "subsystem_vendor" + # - "subsystem_device" + # usb: + # deviceClassWhitelist: + # - "0e" + # - "ef" + # - "fe" + # - "ff" + # deviceLabelFields: + # - "class" + # - "vendor" + # - "device" + # custom: + # - name: "my.kernel.feature" + # matchOn: + # - loadedKMod: ["example_kmod1", "example_kmod2"] + # - name: "my.pci.feature" + # matchOn: + # - pciId: + # class: ["0200"] + # vendor: ["15b3"] + # device: ["1014", "1017"] + # - pciId : + # vendor: ["8086"] + # device: ["1000", "1100"] + # - name: "my.usb.feature" + # matchOn: + # - usbId: + # class: ["ff"] + # vendor: ["03e7"] + # device: ["2485"] + # - usbId: + # class: ["fe"] + # vendor: ["1a6e"] + # device: ["089a"] + # - name: "my.combined.feature" + # matchOn: + # - pciId: + # vendor: ["15b3"] + # device: ["1014", "1017"] + # loadedKMod : ["vendor_kmod1", "vendor_kmod2"] diff --git a/gpu-aware-scheduling/docs/usage.md b/gpu-aware-scheduling/docs/usage.md index 051957a3..d90a15cf 100644 --- a/gpu-aware-scheduling/docs/usage.md +++ b/gpu-aware-scheduling/docs/usage.md @@ -4,34 +4,34 @@ This document explains how to get GAS working together with [Node Feature Discov To begin with, it will help a lot if you have been successful already using the GPU-plugin with some deployments. 
That means your HW and cluster is most likely fine with GAS also.
 
 ## GPU-plugin
-Resource management enabled version of the GPU-plugin is currently necessary for running GAS. The resource management enabled GPU-plugin version can read the necessary annotations of the PODs, and without those GPU allocations will not work correctly. It can be deployed directly via kustomized yamls by issuing:
-````
-kubectl apply -k deployments/gpu_plugin/overlays/fractional_resources
-````
+A resource management enabled version of the GPU-plugin is currently necessary for running GAS. This version can read the necessary annotations of the PODs; without those annotations, GPU allocations will not work correctly. A copy of the plugin deployment kustomization can be found in [docs/gpu_plugin](./gpu_plugin). It can be deployed by issuing:
+```
+kubectl apply -k docs/gpu_plugin/overlays/fractional_resources
+```
 
-The GPU plugin initcontainer needs to be used in order to get the extended resources created with NFD. The initcontainer installs the required NFD-hook into the host system.
+The GPU plugin initcontainer must be used in order to get the extended resources created with NFD. It is deployed by the kustomization base and installs the required NFD hook into the host system.
 
 ## NFD
 
 Basically all versions starting with [v0.6.0](https://github.com/kubernetes-sigs/node-feature-discovery/releases/tag/v0.6.0) should work. You can use it to publish the GPU extended resources and GPU-related labels printed by the hook installed by the GPU-plugin initcontainer.
 
 For picking up the labels printed by the hook installed by the GPU-plugin initcontainer, deploy nfd master with this kind of command in its yaml:
-````
+```
 command: ["nfd-master", "--resource-labels=gpu.intel.com/memory.max,gpu.intel.com/millicores", "--extra-label-ns=gpu.intel.com"]
-````
+```
 
 The above would promote two labels, "memory.max" and "millicores" to extended resources of the node that produces the labels.
 
 If you want to enable i915 capability scanning, the nfd worker needs to read debugfs, and therefore it needs to run as privileged, like this:
-````
+```
       securityContext:
         runAsNonRoot: null
         # Adding GPU info labels needs debugfs "915_capabilities" access
         # (can't just have mount for that specific file because all hosts don't have i915)
         runAsUser: 0
-````
+```
 
 In order to allow NFD to create extended resource, you will have to give it RBAC-rule to access nodes/status, like:
-````
+```
 rules:
 - apiGroups:
   - ""
@@ -40,7 +40,13 @@ rules:
 # when using command line flag --resource-labels to create extended resources
 # you will need to uncomment "- nodes/status"
   - nodes/status
-````
+```
+
+A simple example of a non-root NFD deployment kustomization can be found in [docs/nfd](./nfd). You can deploy it by running:
+
+```
+kubectl apply -k docs/nfd
+```
 
 ## Cluster nodes
 
@@ -49,13 +55,15 @@ You need some i915 GPUs in the nodes. Internal GPUs work fine for testing GAS, m
 
 ## PODs
 
 Your PODs then, needs to ask for some GPU-resources. Like this:
-````
+```
     resources:
       limits:
         gpu.intel.com/i915: 1
         gpu.intel.com/millicores: 10
         gpu.intel.com/memory.max: 10M
-````
+```
+
+A complete example pod yaml is located in [docs/example](./example).
 
 ## Summary in a chronological order
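+
+Once the steps above are done, you can check that the gpu.intel.com extended resources actually show up in the node status before deploying workloads. A quick way to verify (the exact output depends on your cluster and GPUs):
+
+```
+kubectl describe nodes | grep gpu.intel.com
+```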