mayastor
Mayastor Helm chart for Kubernetes
Installation Guide
Prerequisites
- Make sure the system requirement prerequisites are met.
- Label the storage nodes to match the mayastor.nodeSelector in values.yaml (see the example after this list).
- Create the namespace you want the chart to be installed in, or pass the --create-namespace flag in the helm install command.
kubectl create ns <mayastor-namespace>
- Create a secret if downloading the container images from a private repo.
kubectl create secret docker-registry <same-as-image.pullSecrets[0]> --docker-server="https://index.docker.io/v1/" --docker-username="<user-name>" --docker-password="<password>" --docker-email="<user-email>" -n <mayastor-namespace>
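For example, assuming the chart's default node selector of openebs.io/engine: mayastor is kept (check the exact key and value against your values.yaml), a node can be labeled as a storage node with:
kubectl label node <node-name> openebs.io/engine=mayastor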
Installing the chart via the git repo
Clone the mayastor charts repo and sync the chart dependencies:
$ helm dependency update
Install the mayastor chart using the following command:
$ helm install mayastor . -n <mayastor-namespace>
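If you keep your configuration overrides in a values file, it can be passed at install time; my-values.yaml below is only a hypothetical example file name:
$ helm install mayastor . -n <mayastor-namespace> --create-namespace -f my-values.yaml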
Installing the Chart via Helm Registry
To install the chart with the release name mymayastor:
$ helm repo add mayastor https://openebs.github.io/mayastor-extensions/
$ helm install mymayastor mayastor/mayastor
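To install into a dedicated namespace (creating it if it does not exist yet), the same flags as above apply; for example:
$ helm install mymayastor mayastor/mayastor -n <mayastor-namespace> --create-namespace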
Uninstall Helm Chart
$ helm uninstall [RELEASE_NAME]
This removes all the Kubernetes components associated with the chart and deletes the release.
See helm uninstall for command documentation.
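For example, to remove a release named mymayastor that was installed into <mayastor-namespace>:
$ helm uninstall mymayastor -n <mayastor-namespace>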
Chart Dependencies
| Repository | Name | Version |
|---|---|---|
|  | crds | 2.10.0 |
| https://charts.bitnami.com/bitnami | etcd | 12.0.14 |
| https://grafana.github.io/helm-charts | alloy | 1.0.1 |
| https://grafana.github.io/helm-charts | loki | 6.29.0 |
| https://jaegertracing.github.io/helm-charts | jaeger-operator | 2.50.1 |
| https://nats-io.github.io/k8s/helm/charts/ | nats | 0.19.14 |
| https://openebs.github.io/dynamic-localpv-provisioner | localpv-provisioner | 4.4.0 |
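When installing from a clone of the charts repo, the resolved dependency versions can be checked against this table with:
$ helm dependency list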
Values
| Key | Description | Default |
|---|---|---|
| agents.core.allowNonPersistentDevlink | Allow using non-persistent kernel devpaths for pool disks. Enabling this lets users use kernel devpaths, e.g. /dev/sda, for diskpools. However, this comes with associated risks: if the devpaths get swapped among disks, it can result in total data loss, especially if encryption is being used. | false |
| agents.core.capacity.thin.poolCommitment | The allowed pool commitment limit when dealing with thin provisioned volumes. Example: If the commitment is 250 and the pool is 10GiB we can overcommit the pool up to 25GiB (create 2 10GiB and 1 5GiB volume) but no further. | "250%" |
| agents.core.capacity.thin.snapshotCommitment | When creating snapshots for an existing volume, each replica pool must have at least this much free space percentage of the volume size. Example: if this value is 40, the pool has 40GiB free, then the max volume size allowed to be snapped on the pool is 100GiB. | "40%" |
| agents.core.capacity.thin.volumeCommitment | When creating replicas for an existing volume, each replica pool must have at least this much free space percentage of the volume size. Example: if this value is 40, the pool has 40GiB free, then the max volume size allowed to be created on the pool is 100GiB. | "40%" |
| agents.core.capacity.thin.volumeCommitmentInitial | Same as the volumeCommitment argument, but applicable only when creating replicas for a new volume. | "40%" |
| agents.core.encryptedPoolsSoftScheduling | Prefer encrypted pools for volume replicas. If a volume wasn't provisioned with an encryption storageclass, the replicas of such a volume are placed onto encrypted pools on a best-effort basis when this global option is set. This is effective only if the volume spec has already been modified via the plugin to request encryption. | false |
| agents.core.logLevel | Log level for the core service | "info" |
| agents.core.minTimeouts | Enable minimal timeouts | true |
| agents.core.poolClusterSize | Default blobstore cluster size for diskpools, in bytes. This value is used as the default blobstore cluster size on diskpools. It is set to 4MiB internally by default if nothing is specified here. The value is also configurable via the Diskpool CR, which takes precedence over this setting. This is an advanced configuration; please refer to the documentation to understand the usage and implications of this setting. | "" |
| agents.core.priorityClassName | Set PriorityClass, overrides global. If both local and global are not set, the final deployment manifest has a mayastor custom critical priority class assigned to the pod by default. Refer to the templates/_helpers.tpl and templates/mayastor/agents/core/agent-core-deployment.yaml for more details. | "" |
| agents.core.rebuild.maxConcurrent | The maximum number of system-wide rebuilds permitted at any given time. If set to an empty string, there are no limits. | "" |
| agents.core.rebuild.partial.enabled | Partial rebuild uses a log of missed IO to rebuild replicas which have become temporarily faulted, hence a bit faster, depending on the log size. | true |
| agents.core.rebuild.partial.waitPeriod | If a faulted replica comes back online within this time period then it will be rebuilt using the partial rebuild capability. Otherwise, the replica will be fully rebuilt. A blank value "" means internally derived value will be used. | "" |
| agents.core.requestTimeout | Request timeout for core agents. The default value is defined in .base.default_req_timeout. | nil |
| agents.core.resources.limits.cpu | Cpu limits for core agents | "1000m" |
| agents.core.resources.limits.memory | Memory limits for core agents | "128Mi" |
| agents.core.resources.requests.cpu | Cpu requests for core agents | "500m" |
| agents.core.resources.requests.memory | Memory requests for core agents | "32Mi" |
| agents.core.tolerations | Set tolerations, overrides global | [] |
| agents.core.volumeHealth | Enable extended volume health information, which helps generate the volume status more accurately. | true |
| agents.ha.cluster.logLevel | Log level for the ha cluster service | "info" |
| agents.ha.cluster.resources.limits.cpu | Cpu limits for ha cluster agent | "100m" |
| agents.ha.cluster.resources.limits.memory | Memory limits for ha cluster agent | "64Mi" |
| agents.ha.cluster.resources.requests.cpu | Cpu requests for ha cluster agent | "100m" |
| agents.ha.cluster.resources.requests.memory | Memory requests for ha cluster agent | "16Mi" |
| agents.ha.node.logLevel | Log level for the ha node service | "info" |
| agents.ha.node.port | Container port for the ha-node service | 50053 |
| agents.ha.node.priorityClassName | Set PriorityClass, overrides global | "" |
| agents.ha.node.resources.limits.cpu | Cpu limits for ha node agent | "100m" |
| agents.ha.node.resources.limits.memory | Memory limits for ha node agent | "64Mi" |
| agents.ha.node.resources.requests.cpu | Cpu requests for ha node agent | "100m" |
| agents.ha.node.resources.requests.memory | Memory requests for ha node agent | "64Mi" |
| agents.ha.node.tolerations | Set tolerations, overrides global | [] |
| alloy.logging_config.labels | Labels to enable scraping on; at least one of these labels should be present. | { |
| alloy.logging_config.tenant_id | X-Scope-OrgID to be populated when pushing logs. Make sure the caller also uses the same value. | "openebs" |
| apis.rest.healthProbes.liveness.enabled | Toggle liveness probe. | true |
| apis.rest.healthProbes.liveness.failureThreshold | No. of failures the liveness probe will tolerate. | 1 |
| apis.rest.healthProbes.liveness.initialDelaySeconds | No. of seconds of delay before checking the liveness status. | 0 |
| apis.rest.healthProbes.liveness.periodSeconds | No. of seconds between liveness probe checks. | 30 |
| apis.rest.healthProbes.liveness.timeoutSeconds | No. of seconds of timeout tolerance. | 5 |
| apis.rest.healthProbes.readiness.agentCoreProbeFreq | Frequency for the agent-core liveness probe. | "20s" |
| apis.rest.healthProbes.readiness.enabled | Toggle readiness probe. | true |
| apis.rest.healthProbes.readiness.failureThreshold | No. of failures the readiness probe will tolerate. | 2 |
| apis.rest.healthProbes.readiness.initialDelaySeconds | No. of seconds of delay before checking the readiness status. | 0 |
| apis.rest.healthProbes.readiness.periodSeconds | No. of seconds between readiness probe checks. | 20 |
| apis.rest.healthProbes.readiness.timeoutSeconds | No. of seconds of timeout tolerance. | 5 |
| apis.rest.logLevel | Log level for the rest service | "info" |
| apis.rest.priorityClassName | Set PriorityClass, overrides global. If both local and global are not set, the final deployment manifest has a mayastor custom critical priority class assigned to the pod by default. Refer to the templates/_helpers.tpl and templates/mayastor/apis/rest/api-rest-deployment.yaml for more details. | "" |
| apis.rest.replicaCount | Number of replicas of rest | 1 |
| apis.rest.resources.limits.cpu | Cpu limits for rest | "100m" |
| apis.rest.resources.limits.memory | Memory limits for rest | "64Mi" |
| apis.rest.resources.requests.cpu | Cpu requests for rest | "50m" |
| apis.rest.resources.requests.memory | Memory requests for rest | "32Mi" |
| apis.rest.service.type | Rest K8s service type | "ClusterIP" |
| apis.rest.tolerations | Set tolerations, overrides global | [] |
| base.cache_poll_period | Cache timeout for core agent & diskpool deployment | "30s" |
| base.default_req_timeout | Request timeout for rest & core agents | "5s" |
| base.logging.color | Enable ansi color code for Pod StdOut/StdErr | true |
| base.logging.format | Valid values for format are pretty, json and compact | "pretty" |
| base.logging.silenceLevel | Silence specific module components | nil |
| base.metrics.enabled | Enable the metrics exporter | true |
| base.metrics.port | Container port for the metrics exporter service | 9502 |
| crds.csi.volumeSnapshots.enabled | Install Volume Snapshot CRDs | true |
| crds.enabled | Disables the installation of all CRDs if set to false | true |
| csi.controller.logLevel | Log level for the csi controller | "info" |
| csi.controller.preventVolumeModeConversion | Prevent modifying the volume mode when creating a PVC from an existing VolumeSnapshot | true |
| csi.controller.priorityClassName | Set PriorityClass, overrides global | "" |
| csi.controller.resources.limits.cpu | Cpu limits for csi controller | "32m" |
| csi.controller.resources.limits.memory | Memory limits for csi controller | "128Mi" |
| csi.controller.resources.requests.cpu | Cpu requests for csi controller | "16m" |
| csi.controller.resources.requests.memory | Memory requests for csi controller | "64Mi" |
| csi.controller.tolerations | Set tolerations, overrides global | [] |
| csi.image.attacherTag | csi-attacher image release tag | "v4.8.1" |
| csi.image.provisionerTag | csi-provisioner image release tag | "v5.2.0" |
| csi.image.pullPolicy | imagePullPolicy for all CSI Sidecar images | "IfNotPresent" |
| csi.image.registrarTag | csi-node-driver-registrar image release tag | "v2.13.0" |
| csi.image.registry | Image registry to pull all CSI Sidecar images | "registry.k8s.io" |
| csi.image.repo | Image registry's namespace | "sig-storage" |
| csi.image.resizerTag | csi-resizer image release tag | "v1.13.2" |
| csi.image.snapshotControllerTag | csi-snapshot-controller image release tag | "v8.2.0" |
| csi.image.snapshotterTag | csi-snapshotter image release tag | "v8.2.0" |
| csi.node.kubeletDir | The kubeletDir directory for the csi-node plugin | "/var/lib/kubelet" |
| csi.node.nvme.ctrl_loss_tmo | The ctrl_loss_tmo (controller loss timeout) in seconds | "1980" |
| csi.node.nvme.tcpFallback | Fallback to nvme-tcp if nvme-rdma is enabled for Mayastor but rdma is not available on a particular csi-node | true |
| csi.node.port | Container port for the csi-node service | 10199 |
| csi.node.priorityClassName | Set PriorityClass, overrides global | "" |
| csi.node.resources.limits.cpu | Cpu limits for csi node plugin | "100m" |
| csi.node.resources.limits.memory | Memory limits for csi node plugin | "128Mi" |
| csi.node.resources.requests.cpu | Cpu requests for csi node plugin | "100m" |
| csi.node.resources.requests.memory | Memory requests for csi node plugin | "64Mi" |
| csi.node.tolerations | Set tolerations, overrides global | [] |
| csi.node.topology.nodeSelector | Add topology segments to the csi-node and agent-ha-node daemonset node selector | false |
| etcd.autoCompactionMode | Auto compaction mode. Since etcd keeps an exact history of its keyspace, this history should be periodically compacted to avoid performance degradation and eventual storage space exhaustion. Valid values: "periodic", "revision". - 'periodic' for duration-based retention, defaulting to hours if no time unit is provided (e.g. 5m). - 'revision' for revision-number-based retention. | "revision" |
| etcd.autoCompactionRetention | Auto compaction retention length. 0 means disable auto compaction. | "100" |
| etcd.clusterDomain | Kubernetes Cluster Domain | "cluster.local" |
| etcd.enabled | Disable when using an external etcd cluster. | true |
| etcd.externalUrl | URL of the external etcd cluster. Note: etcd.enabled must be set to false. | "" |
| etcd.extraEnvVars[0] | Raise alarms when backend size exceeds the given quota. | { |
| etcd.localpvScConfig.basePath | Host path where local etcd data is stored in. | "/var/local/{{ .Release.Name }}/localpv-hostpath/etcd" |
| etcd.localpvScConfig.reclaimPolicy | ReclaimPolicy of etcd's localpv hostpath storage class. | "Delete" |
| etcd.localpvScConfig.volumeBindingMode | VolumeBindingMode of etcd's localpv hostpath storage class. | "WaitForFirstConsumer" |
| etcd.persistence.enabled | If true, use a Persistent Volume Claim. If false, use emptyDir. | true |
| etcd.persistence.size | Volume size | "2Gi" |
| etcd.persistence.storageClass | Will define which storageClass to use in etcd's StatefulSets. Options: - | "mayastor-etcd-localpv" |
| etcd.persistentVolumeClaimRetentionPolicy.enabled | PVC's reclaimPolicy | false |
| etcd.podAntiAffinityPreset | Pod anti-affinity preset Ref: https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#inter-pod-affinity-and-anti-affinity | "hard" |
| etcd.removeMemberOnContainerTermination | Use a PreStop hook to remove the etcd members from the etcd cluster on container termination Ignored if lifecycleHooks is set or replicaCount=1 | false |
| etcd.replicaCount | Number of replicas of etcd | 3 |
| image.pullPolicy | ImagePullPolicy for our images | "IfNotPresent" |
| image.pullSecrets | docker-secrets required to pull images if the container registry from image.registry is protected | [] |
| image.registry | Image registry to pull our product images | "docker.io" |
| image.repo | Image registry's namespace | "openebs" |
| image.tag | Release tag for our images | "v2.10.0" |
| io_engine.coreList | If not empty, overrides the cpuCount and explicitly sets the list of cores. Example: --set='io_engine.coreList={30,31}' | [] |
| io_engine.cpuCount | The number of cores that each io-engine instance will bind to. | "2" |
| io_engine.envcontext | Pass additional arguments to the Environment Abstraction Layer. Example: --set {product}.envcontext=iova-mode=pa | "" |
| io_engine.logLevel | Log level for the io-engine service | "info" |
| io_engine.nodeSelector | Node selectors to designate storage nodes for diskpool creation. Note that for multi-arch image support, 'kubernetes.io/arch: amd64' should be removed. | { |
| io_engine.nvme.ioTimeout | Timeout for IOs. The default here is exaggerated for local disks, but we've observed that in shared virtual environments having a higher timeout value is beneficial. Please adjust this according to your hardware and needs. | "110s" |
| io_engine.nvme.tcp.maxQueueDepth | You may need to increase this to allow more outstanding IOs per volume. | "32" |
| io_engine.port | Container port for the io-engine service | 10124 |
| io_engine.priorityClassName | Set PriorityClass, overrides global | "" |
| io_engine.pstorRetries | Number of retries for pstor persistence before the volume target shuts itself down. | 300 |
| io_engine.resources.limits.cpu | Cpu limits for the io-engine | "" |
| io_engine.resources.limits.hugepages1Gi | Hugepage memory in 1GiB chunks | nil |
| io_engine.resources.limits.hugepages2Mi | Hugepage memory in 2MiB chunks | "2Gi" |
| io_engine.resources.limits.memory | Memory limits for the io-engine | "1Gi" |
| io_engine.resources.requests.cpu | Cpu requests for the io-engine | "" |
| io_engine.resources.requests.hugepages1Gi | Hugepage memory in 1GiB chunks | nil |
| io_engine.resources.requests.hugepages2Mi | Hugepage memory in 2MiB chunks | "2Gi" |
| io_engine.resources.requests.memory | Memory requests for the io-engine | "1Gi" |
| io_engine.runtimeClassName | Runtime class to use. Defaults to cluster standard | "" |
| io_engine.target.nvmf.iface | NVMF target interface (ip, mac, name or subnet). If RDMA is enabled, please set iface to an RDMA-capable netdev name from the host network. For example, if an rdma device mlx5_0 is available on a netdev eth0 on an RNIC, as can be seen from the rdma link command output, then this field should be set to eth0. | "" |
| io_engine.target.nvmf.ptpl | Reservations Persist Through Power Loss State | true |
| io_engine.target.nvmf.rdma | Enable the RDMA capability of the Mayastor nvmf target to accept RDMA connections if the cluster nodes have RDMA device(s) configured from an RNIC. | { |
| io_engine.tolerations | Set tolerations, overrides global | [] |
| localpv-provisioner.enabled | Enables the openebs dynamic-localpv-provisioner. If disabled, modify etcd and loki storage class accordingly. | true |
| localpv-provisioner.hostpathClass.enabled | Enable default hostpath localpv StorageClass. | false |
| localpv-provisioner.localpv.priorityClassName | Set the PriorityClass for the LocalPV Hostpath provisioner Deployment. | "{{ .Release.Name }}-cluster-critical" |
| loki.localpvScConfig.loki.basePath | Host path where local loki data is stored in. | "/var/local/{{ .Release.Name }}/localpv-hostpath/loki" |
| loki.localpvScConfig.loki.reclaimPolicy | ReclaimPolicy of loki's localpv hostpath storage class. | "Delete" |
| loki.localpvScConfig.loki.volumeBindingMode | VolumeBindingMode of loki's localpv hostpath storage class. | "WaitForFirstConsumer" |
| loki.localpvScConfig.minio.basePath | Host path where local minio data is stored in. | "/var/local/{{ .Release.Name }}/localpv-hostpath/minio" |
| loki.localpvScConfig.minio.reclaimPolicy | ReclaimPolicy of minio's localpv hostpath storage class. | "Delete" |
| loki.localpvScConfig.minio.volumeBindingMode | VolumeBindingMode of minio's localpv hostpath storage class. | "WaitForFirstConsumer" |
| nodeSelector | Node labels for pod assignment. Ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/ Note that for multi-arch image support, 'kubernetes.io/arch: amd64' should be removed and 'nodeSelector' set to an empty '{}' as the default value. | { |
| obs.callhome.enabled | Enable callhome | true |
| obs.callhome.logLevel | Log level for callhome | "info" |
| obs.callhome.priorityClassName | Set PriorityClass, overrides global | "" |
| obs.callhome.resources.limits.cpu | Cpu limits for callhome | "100m" |
| obs.callhome.resources.limits.memory | Memory limits for callhome | "32Mi" |
| obs.callhome.resources.requests.cpu | Cpu requests for callhome | "50m" |
| obs.callhome.resources.requests.memory | Memory requests for callhome | "16Mi" |
| obs.callhome.tolerations | Set tolerations, overrides global | [] |
| obs.stats.logLevel | Log level for stats | "info" |
| obs.stats.resources.limits.cpu | Cpu limits for stats | "100m" |
| obs.stats.resources.limits.memory | Memory limits for stats | "32Mi" |
| obs.stats.resources.requests.cpu | Cpu requests for stats | "50m" |
| obs.stats.resources.requests.memory | Memory requests for stats | "16Mi" |
| obs.stats.service.type | Stats K8s service type | "ClusterIP" |
| operators.pool.logLevel | Log level for diskpool operator service | "info" |
| operators.pool.priorityClassName | Set PriorityClass, overrides global | "" |
| operators.pool.resources.limits.cpu | Cpu limits for diskpool operator | "100m" |
| operators.pool.resources.limits.memory | Memory limits for diskpool operator | "32Mi" |
| operators.pool.resources.requests.cpu | Cpu requests for diskpool operator | "50m" |
| operators.pool.resources.requests.memory | Memory requests for diskpool operator | "16Mi" |
| operators.pool.tolerations | Set tolerations, overrides global | [] |
| preUpgradeHook.enabled | Enable/Disable mayastor pre-upgrade hook | true |
| preUpgradeHook.image.pullPolicy | The imagePullPolicy for the container | "IfNotPresent" |
| preUpgradeHook.image.registry | The container image registry URL for the hook job | "docker.io" |
| preUpgradeHook.image.repo | The container repository for the hook job | "openebs/kubectl" |
| preUpgradeHook.image.tag | The container image tag for the hook job | "1.25.15" |
| preUpgradeHook.imagePullSecrets | Optional array of imagePullSecrets containing private registry credentials. Ref: https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/ | [] |
| preUpgradeHook.tolerations | Node tolerations for server scheduling to nodes with taints. Ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/ | [] |
| priorityClassName | Pod scheduling priority. Setting this value will apply to all components except the external Chart dependencies. If any component has priorityClassName set, then this value would be overridden for that component. For external components like etcd, jaeger or loki, PriorityClass can only be set at component level. | "" |
| storageClass.allowVolumeExpansion | Enable volume expansion for the default StorageClass. | true |
| tolerations | Tolerations to be applied to all components except external Chart dependencies. If any component has tolerations set, then it would override this value. For external components like etcd, jaeger and loki, tolerations can only be set at component level. | [] |
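As an illustrative example (the keys are taken from the table above; the particular values are placeholders, so adjust them to your environment), individual settings can be overridden at install or upgrade time with --set flags or a values file:
$ helm install mayastor mayastor/mayastor -n <mayastor-namespace> \
    --set io_engine.cpuCount=4 \
    --set base.logging.format=json \
    --set etcd.replicaCount=3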