Helm fluentbit agent #3473

Merged: P0NDER0SA merged 19 commits into main from helm-fluentbit-telemetry on Feb 13, 2025

Contributor

P0NDER0SA commented Jan 29, 2025

What happens when your PR merges?

Fluent Bit is deployed with Helm as fb-agent-fluent-bit (release fb-agent, chart fluent/fluent-bit).

What are you changing?

Migrating the Fluent Bit deployment from Kustomize to Helm and updating it in the process.

Provide some background on the changes

https://app.zenhub.com/workspaces/notify-planning-core-6411dfb7c95fb80014e0cab0/issues/gh/cds-snc/notification-planning-core/507

Release Steps

  • Before merging, delete the old Kustomize deployment:
  • Pull the latest code from main in notification-manifests
  • Navigate to the env/staging folder
  • Run kubectl delete -f fluentbit.yaml
  • Merge the PR
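
The release steps above could be scripted roughly as follows. This is a minimal sketch, not the team's actual tooling: the local checkout path `notification-manifests`, the helper names `release_commands` and `run_release`, and the assumption that the current kubectl context already points at the staging cluster are all mine. By default it only prints the commands.

```python
import subprocess
from pathlib import Path


def release_commands(repo: Path) -> list[tuple[list[str], Path]]:
    """Return (command, working directory) pairs for the pre-merge cleanup."""
    staging = repo / "env" / "staging"
    return [
        (["git", "checkout", "main"], repo),
        (["git", "pull"], repo),
        # Removes the resources created by the old Kustomize manifest.
        (["kubectl", "delete", "-f", "fluentbit.yaml"], staging),
    ]


def run_release(repo: Path, dry_run: bool = True) -> None:
    """Print each step; execute it only when dry_run is False."""
    for cmd, cwd in release_commands(repo):
        print(f"[{cwd}] {' '.join(cmd)}")
        if not dry_run:
            subprocess.run(cmd, cwd=cwd, check=True)


if __name__ == "__main__":
    # Hypothetical checkout location; adjust to the real path.
    run_release(Path("notification-manifests"))
```

Keeping the dry run as the default makes it easy to review the exact commands before anything is deleted.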

Checklist if making changes to Kubernetes

  • I know how to get kubectl credentials in case it catches on fire

After merging this PR

  • I have verified that the tests / deployment actions succeeded
  • I have verified that any affected pods were restarted successfully
  • I have verified that I can still log into Notify production
  • I have verified that the smoke tests still pass on production
  • I have communicated the release in the #notify Slack channel.

P0NDER0SA marked this pull request as ready for review January 30, 2025 17:57
P0NDER0SA requested a review from jimleroyer as a code owner January 30, 2025 17:57

github-actions bot commented Jan 30, 2025

ingress	nginx    	259     	2025-02-13 14:33:43.742624432 +0000 UTC	deployed	nginx-ingress-1.1.2	3.4.2      

xray-daemon	xray     	258     	2025-02-13 14:33:42.347630165 +0000 UTC	deployed	aws-xray-4.0.8	3.3.12     

Comparing release=notify-documentation, chart=charts/notify-documentation
Comparing release=notify-api, chart=charts/notify-api
Comparing release=notify-admin, chart=charts/notify-admin
Comparing release=notify-document-download, chart=charts/notify-document-download
Comparing release=notify-celery, chart=charts/notify-celery
Comparing release=k8s-event-logger, chart=/tmp/helmfile500354276/amazon-cloudwatch/staging/k8s-event-logger/k8s-event-logger/1.1.8/k8s-event-logger
Comparing release=karpenter-crd, chart=/tmp/helmfile500354276/karpenter/staging/karpenter-crd/karpenter-crd/0.36.1/karpenter-crd
Comparing release=karpenter, chart=/tmp/helmfile500354276/karpenter/staging/karpenter/karpenter/0.36.1/karpenter
Comparing release=karpenter-nodepool, chart=charts/karpenter-nodepool
Comparing release=priority-classes, chart=deliveryhero/priority-class
Comparing release=secrets-store-csi-driver, chart=secrets-store-csi-driver/secrets-store-csi-driver
Comparing release=aws-secrets-provider, chart=aws-secrets-manager/secrets-store-csi-driver-provider-aws
Comparing release=kube-state-metrics, chart=prometheus-community/kube-state-metrics
Comparing release=blazer, chart=stakater/application
Comparing release=ingress, chart=charts/nginx-ingress
Comparing release=xray-daemon, chart=okgolove/aws-xray
Comparing release=ipv4-geolocate, chart=charts/ipv4-geolocate
Comparing release=fb-agent, chart=fluent/fluent-bit
********************

	Release was not present in Helm.  Diff will show entire contents as new.

********************
amazon-cloudwatch, fb-agent-fluent-bit, ClusterRole (rbac.authorization.k8s.io) has been added:
- 
+ # Source: fluent-bit/templates/clusterrole.yaml
+ apiVersion: rbac.authorization.k8s.io/v1
+ kind: ClusterRole
+ metadata:
+   name: fb-agent-fluent-bit
+   labels:
+     helm.sh/chart: fluent-bit-0.48.5
+     app.kubernetes.io/name: fluent-bit
+     app.kubernetes.io/instance: fb-agent
+     app.kubernetes.io/version: "3.2.4"
+     app.kubernetes.io/managed-by: Helm
+ rules:
+   - apiGroups:
+       - ""
+     resources:
+       - namespaces
+       - pods
+       - nodes
+       - nodes/metrics
+       - nodes/proxy
+       - events
+     verbs:
+       - get
+       - list
+       - watch
amazon-cloudwatch, fb-agent-fluent-bit, ClusterRoleBinding (rbac.authorization.k8s.io) has been added:
- 
+ # Source: fluent-bit/templates/clusterrolebinding.yaml
+ apiVersion: rbac.authorization.k8s.io/v1
+ kind: ClusterRoleBinding
+ metadata:
+   name: fb-agent-fluent-bit
+   labels:
+     helm.sh/chart: fluent-bit-0.48.5
+     app.kubernetes.io/name: fluent-bit
+     app.kubernetes.io/instance: fb-agent
+     app.kubernetes.io/version: "3.2.4"
+     app.kubernetes.io/managed-by: Helm
+ roleRef:
+   apiGroup: rbac.authorization.k8s.io
+   kind: ClusterRole
+   name: fb-agent-fluent-bit
+ subjects:
+   - kind: ServiceAccount
+     name: fb-agent-fluent-bit
+     namespace: amazon-cloudwatch
amazon-cloudwatch, fb-agent-fluent-bit, ConfigMap (v1) has been added:
- 
+ # Source: fluent-bit/templates/configmap.yaml
+ apiVersion: v1
+ kind: ConfigMap
+ metadata:
+   name: fb-agent-fluent-bit
+   namespace: amazon-cloudwatch
+   labels:
+     helm.sh/chart: fluent-bit-0.48.5
+     app.kubernetes.io/name: fluent-bit
+     app.kubernetes.io/instance: fb-agent
+     app.kubernetes.io/version: "3.2.4"
+     app.kubernetes.io/managed-by: Helm
+ data:
+   custom_parsers.conf: |
+     [PARSER]
+         Name docker_no_time
+         Format json
+         Time_Keep Off
+         Time_Key time
+         Time_Format %Y-%m-%dT%H:%M:%S.%L
+     
+   fluent-bit.conf: |
+     [SERVICE]
+         Flush                     5
+         Grace                     30
+         Log_Level                 info
+         Daemon                    off
+         Parsers_File              /fluent-bit/etc/conf/parsers.conf
+         HTTP_Server               ${HTTP_SERVER}
+         HTTP_Listen               0.0.0.0
+         HTTP_Port                 ${HTTP_PORT}
+         storage.path              /var/fluent-bit/state/flb-storage/
+         storage.sync              normal
+         storage.checksum          off
+         storage.backlog.mem_limit 5M
+         
+     @INCLUDE celery-log.conf
+     @INCLUDE notify-log.conf
+     @INCLUDE dataplane-log.conf
+     @INCLUDE host-log.conf
+     
+     [INPUT]
+         Name                tail
+         Tag                 application.*
+         Exclude_Path        /var/log/containers/cloudwatch-agent*, /var/log/containers/fluent-bit*, /var/log/containers/aws-node*, /var/log/containers/kube-proxy*, /var/log/containers/celery*
+         Path                /var/log/containers/*.log
+         multiline.parser    docker, cri
+         DB                  /var/fluent-bit/state/flb_container.db
+         Mem_Buf_Limit       50MB
+         Skip_Long_Lines     Off
+         Refresh_Interval    10
+         Rotate_Wait         30
+         storage.type        filesystem
+         Read_from_Head      ${READ_FROM_HEAD}
+     
+     [INPUT]
+         Name                tail
+         Tag                 application.*
+         Path                /var/log/containers/fluent-bit*
+         multiline.parser    docker, cri
+         DB                  /var/fluent-bit/state/flb_log.db
+         Mem_Buf_Limit       5MB
+         Skip_Long_Lines     On
+         Refresh_Interval    10
+         Read_from_Head      ${READ_FROM_HEAD}
+     
+     [INPUT]
+         Name                tail
+         Tag                 application.*
+         Path                /var/log/containers/cloudwatch-agent*
+         multiline.parser    docker, cri
+         DB                  /var/fluent-bit/state/flb_cwagent.db
+         Mem_Buf_Limit       5MB
+         Skip_Long_Lines     On
+         Refresh_Interval    10
+         Read_from_Head      ${READ_FROM_HEAD}
+     
+     [FILTER]
+         Name                kubernetes
+         Match               application.*
+         Kube_URL            https://kubernetes.default.svc:443
+         Kube_Tag_Prefix     application.var.log.containers.
+         Merge_Log           On
+         Merge_Log_Key       log_processed
+         K8S-Logging.Parser  On
+         K8S-Logging.Exclude Off
+         Labels              On
+         Annotations         On
+         Use_Kubelet         On
+         Kubelet_Port        10250
+         Buffer_Size         0
+     
+     [OUTPUT]
+         Name cloudwatch_logs
+         Match application.*
+         region ${AWS_REGION} 
+         log_stream_prefix fallback-stream
+         log_group_name /aws/containerinsights/notification-canada-ca-dev-eks-cluster/application
+         log_stream_template $kubernetes['container_name']
+         auto_create_group on
+     
+   celery-log.conf: |
+     [INPUT]
+         Name                tail
+         Tag                 celery.*
+         Path                /var/log/containers/celery*
+         multiline.parser    docker, cri
+         DB                  /var/fluent-bit/state/celery.db
+         Mem_Buf_Limit       150MB
+         Skip_Long_Lines     Off
+         Refresh_Interval    10
+         Rotate_Wait         30
+         storage.type        filesystem
+         Read_from_Head      ${READ_FROM_HEAD}
+     
+     [FILTER]
+         Name                  kubernetes
+         Match                 celery.*
+         Kube_URL              https://kubernetes.default.svc:443
+         Kube_Tag_Prefix       celery.var.log.containers.
+         Merge_Log             On
+         Merge_Log_Key         log_processed
+         K8S-Logging.Parser    On
+         K8S-Logging.Exclude   Off
+         Labels                On
+         Annotations           On
+         Use_Kubelet           On
+         Kubelet_Port          10250
+         Buffer_Size           0
+     
+     [FILTER]
+         name                  multiline
+         match                 celery.*
+         multiline.key_content log
+         multiline.parser      multiline-notify-python
+         emitter_mem_buf_limit 150MB
+     
+     [OUTPUT]
+         Name cloudwatch_logs
+         Match celery.*
+         region ${AWS_REGION} 
+         log_stream_prefix fallback-stream
+         log_group_name /aws/containerinsights/notification-canada-ca-dev-eks-cluster/application
+         log_stream_template $kubernetes['container_name']
+         auto_create_group on
+     
+   dataplane-log.conf: |
+     [INPUT]
+         Name                systemd
+         Tag                 dataplane.systemd.*
+         Systemd_Filter      _SYSTEMD_UNIT=docker.service
+         Systemd_Filter      _SYSTEMD_UNIT=containerd.service
+         Systemd_Filter      _SYSTEMD_UNIT=kubelet.service
+         DB                  /var/fluent-bit/state/systemd.db
+         Path                /var/log/journal
+         Read_From_Tail      ${READ_FROM_TAIL}
+     
+     [INPUT]
+         Name                tail
+         Tag                 dataplane.tail.*
+         Path                /var/log/containers/aws-node*, /var/log/containers/kube-proxy*
+         multiline.parser    docker, cri
+         DB                  /var/fluent-bit/state/flb_dataplane_tail.db
+         Mem_Buf_Limit       50MB
+         Skip_Long_Lines     On
+         Refresh_Interval    10
+         Rotate_Wait         30
+         storage.type        filesystem
+         Read_from_Head      ${READ_FROM_HEAD}
+     
+     [FILTER]
+         Name                modify
+         Match               dataplane.systemd.*
+         Rename              _HOSTNAME                   hostname
+         Rename              _SYSTEMD_UNIT               systemd_unit
+         Rename              MESSAGE                     message
+         Remove_regex        ^((?!hostname|systemd_unit|message).)*$
+     
+     [FILTER]
+         Name                aws
+         Match               dataplane.*
+         imds_version        v2
+     
+     [OUTPUT]
+         Name                cloudwatch_logs
+         Match               dataplane.*
+         region              ${AWS_REGION}
+         log_group_name      /aws/containerinsights/${CLUSTER_NAME}/dataplane
+         log_stream_prefix   ${HOST_NAME}-
+         auto_create_group   true
+         extra_user_agent    container-insights
+     
+   host-log.conf: |
+     [INPUT]
+         Name                tail
+         Tag                 host.dmesg
+         Path                /var/log/dmesg
+         Key                 message
+         DB                  /var/fluent-bit/state/flb_dmesg.db
+         Mem_Buf_Limit       5MB
+         Skip_Long_Lines     On
+         Refresh_Interval    10
+         Read_from_Head      ${READ_FROM_HEAD}
+     
+     [INPUT]
+         Name                tail
+         Tag                 host.messages
+         Path                /var/log/messages
+         Parser              syslog
+         DB                  /var/fluent-bit/state/flb_messages.db
+         Mem_Buf_Limit       5MB
+         Skip_Long_Lines     On
+         Refresh_Interval    10
+         Read_from_Head      ${READ_FROM_HEAD}
+     
+     [INPUT]
+         Name                tail
+         Tag                 host.secure
+         Path                /var/log/secure
+         Parser              syslog
+         DB                  /var/fluent-bit/state/flb_secure.db
+         Mem_Buf_Limit       5MB
+         Skip_Long_Lines     On
+         Refresh_Interval    10
+         Read_from_Head      ${READ_FROM_HEAD}
+     
+     [FILTER]
+         Name                aws
+         Match               host.*
+         imds_version        v2
+     
+     [OUTPUT]
+         Name                        cloudwatch_logs
+         Match                       host.*
+         region                      ${AWS_REGION}
+         log_group_name              /aws/containerinsights/${CLUSTER_NAME}/host
+         log_stream_prefix           ${HOST_NAME}.
+         auto_create_group           true
+         extra_user_agent            container-insights
+     
+   notify-log.conf: |
+     [INPUT]
+         Name                tail
+         Tag                 application.*
+         Exclude_Path        /var/log/containers/cloudwatch-agent*, /var/log/containers/fluent-bit*, /var/log/containers/aws-node*, /var/log/containers/kube-proxy*, /var/log/containers/celery*
+         Path                /var/log/containers/*.log
+         multiline.parser    docker, cri
+         DB                  /var/fluent-bit/state/flb_container.db
+         Mem_Buf_Limit       50MB
+         Skip_Long_Lines     Off
+         Refresh_Interval    10
+         Rotate_Wait         30
+         storage.type        filesystem
+         Read_from_Head      ${READ_FROM_HEAD}
+     
+     [INPUT]
+         Name                tail
+         Tag                 application.*
+         Path                /var/log/containers/fluent-bit*
+         multiline.parser    docker, cri
+         DB                  /var/fluent-bit/state/flb_log.db
+         Mem_Buf_Limit       5MB
+         Skip_Long_Lines     On
+         Refresh_Interval    10
+         Read_from_Head      ${READ_FROM_HEAD}
+     
+     [INPUT]
+         Name                tail
+         Tag                 application.*
+         Path                /var/log/containers/cloudwatch-agent*
+         multiline.parser    docker, cri
+         DB                  /var/fluent-bit/state/flb_cwagent.db
+         Mem_Buf_Limit       5MB
+         Skip_Long_Lines     On
+         Refresh_Interval    10
+         Read_from_Head      ${READ_FROM_HEAD}
+     
+     [FILTER]
+         Name                kubernetes
+         Match               application.*
+         Kube_URL            https://kubernetes.default.svc:443
+         Kube_Tag_Prefix     application.var.log.containers.
+         Merge_Log           On
+         Merge_Log_Key       log_processed
+         K8S-Logging.Parser  On
+         K8S-Logging.Exclude Off
+         Labels              On
+         Annotations         On
+         Use_Kubelet         On
+         Kubelet_Port        10250
+         Buffer_Size         0
+     
+   parsers.conf: |
+     [PARSER]
+         Name                syslog
+         Format              regex
+         Regex               ^(?<time>[^ ]* {1,2}[^ ]* [^ ]*) (?<host>[^ ]*) (?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$
+         Time_Key            time
+         Time_Format         %b %d %H:%M:%S
+     
+     [PARSER]
+         Name                container_firstline
+         Format              regex
+         Regex               (?<log>(?<="log":")\S(?!\.).*?)(?<!\\)".*(?<stream>(?<="stream":").*?)".*(?<time>\d{4}-\d{1,2}-\d{1,2}T\d{2}:\d{2}:\d{2}\.\w*).*(?=})
+         Time_Key            time
+         Time_Format         %Y-%m-%dT%H:%M:%S.%LZ
+     
+     [PARSER]
+         Name                cwagent_firstline
+         Format              regex
+         Regex               (?<log>(?<="log":")\d{4}[\/-]\d{1,2}[\/-]\d{1,2}[ T]\d{2}:\d{2}:\d{2}(?!\.).*?)(?<!\\)".*(?<stream>(?<="stream":").*?)".*(?<time>\d{4}-\d{1,2}-\d{1,2}T\d{2}:\d{2}:\d{2}\.\w*).*(?=})
+         Time_Key            time
+         Time_Format         %Y-%m-%dT%H:%M:%S.%LZ
+     
+     [MULTILINE_PARSER]
+         name          multiline-notify-python
+         type          regex
+         flush_timeout 1000
+         # rules |   state name  | regex pattern         | next state
+         # ------|---------------|-----------------------------------
+         rule      "start_state"   "/^\[.*\].*/"                "cont"
+         rule      "cont"          "/^[^\[].*/"           "cont"
amazon-cloudwatch, fb-agent-fluent-bit, DaemonSet (apps) has been added:
- 
+ # Source: fluent-bit/templates/daemonset.yaml
+ apiVersion: apps/v1
+ kind: DaemonSet
+ metadata:
+   name: fb-agent-fluent-bit
+   namespace: amazon-cloudwatch
+   labels:
+     helm.sh/chart: fluent-bit-0.48.5
+     app.kubernetes.io/name: fluent-bit
+     app.kubernetes.io/instance: fb-agent
+     app.kubernetes.io/version: "3.2.4"
+     app.kubernetes.io/managed-by: Helm
+ spec:
+   selector:
+     matchLabels:
+       app.kubernetes.io/name: fluent-bit
+       app.kubernetes.io/instance: fb-agent
+   template:
+     metadata:
+       labels:
+         app.kubernetes.io/name: fluent-bit
+         app.kubernetes.io/instance: fb-agent
+       annotations:
+         checksum/config: 556bc5b0ffe71d6fa2bb9c9294c62eb3446fb7ecdcbec56c33f50fc744d73db6
+     spec:
+       serviceAccountName: fb-agent-fluent-bit
+       priorityClassName: system-cluster-critical
+       terminationGracePeriodSeconds: 10
+       hostNetwork: true
+       dnsPolicy: ClusterFirstWithHostNet
+       initContainers:
+         - command:
+           - sh
+           - -c
+           - echo "Waiting for 10 seconds for node to sort itself out" && sleep 10
+           image: busybox:1.28
+           name: wait-for-init
+       containers:
+         - name: fluent-bit
+           image: "cr.fluentbit.io/fluent/fluent-bit:3.2.4"
+           imagePullPolicy: IfNotPresent
+           env:
+             - name: AWS_REGION
+               value: ca-central-1
+             - name: CLUSTER_NAME
+               value: notification-canada-ca-staging-eks-cluster
+             - name: HTTP_SERVER
+               value: "On"
+             - name: HTTP_PORT
+               value: "2020"
+             - name: READ_FROM_HEAD
+               value: "Off"
+             - name: READ_FROM_TAIL
+               value: "On"
+             - name: HOST_NAME
+               valueFrom:
+                 fieldRef:
+                   fieldPath: spec.nodeName
+             - name: HOSTNAME
+               valueFrom:
+                 fieldRef:
+                   apiVersion: v1
+                   fieldPath: metadata.name
+             - name: CI_VERSION
+               value: k8s/1.3.15
+           command:
+             - /fluent-bit/bin/fluent-bit
+           args:
+             - --workdir=/fluent-bit/etc
+             - --config=/fluent-bit/etc/conf/fluent-bit.conf
+           ports:
+             - name: http
+               containerPort: 2020
+               protocol: TCP
+           livenessProbe:
+             httpGet:
+               path: /
+               port: http
+           readinessProbe:
+             httpGet:
+               path: /api/v1/health
+               port: http
+           resources:
+             limits:
+               memory: 500Mi
+             requests:
+               cpu: 100m
+               memory: 100Mi
+           volumeMounts:
+             - name: config
+               mountPath: /fluent-bit/etc/conf
+             - mountPath: /var/log
+               name: varlog
+             - mountPath: /var/lib/docker/containers
+               name: varlibdockercontainers
+               readOnly: true
+             - mountPath: /etc/machine-id
+               name: etcmachineid
+               readOnly: true
+       volumes:
+         - name: config
+           configMap:
+             name: fb-agent-fluent-bit
+         - hostPath:
+             path: /var/log
+           name: varlog
+         - hostPath:
+             path: /var/lib/docker/containers
+           name: varlibdockercontainers
+         - hostPath:
+             path: /etc/machine-id
+             type: File
+           name: etcmachineid
+       tolerations:
+         - effect: NoSchedule
+           key: node-role.kubernetes.io/master
+           operator: Exists
+         - effect: NoExecute
+           key: node.kubernetes.io/unreachable
+           operator: Exists
+           tolerationSeconds: 300
+         - effect: NoExecute
+           key: node.kubernetes.io/not-ready
+           operator: Exists
+           tolerationSeconds: 300
amazon-cloudwatch, fb-agent-fluent-bit, Service (v1) has been added:
- 
+ # Source: fluent-bit/templates/service.yaml
+ apiVersion: v1
+ kind: Service
+ metadata:
+   name: fb-agent-fluent-bit
+   namespace: amazon-cloudwatch
+   labels:
+     helm.sh/chart: fluent-bit-0.48.5
+     app.kubernetes.io/name: fluent-bit
+     app.kubernetes.io/instance: fb-agent
+     app.kubernetes.io/version: "3.2.4"
+     app.kubernetes.io/managed-by: Helm
+ spec:
+   type: ClusterIP
+   ports:
+     - port: 2020
+       targetPort: http
+       protocol: TCP
+       name: http
+   selector:
+     app.kubernetes.io/name: fluent-bit
+     app.kubernetes.io/instance: fb-agent
amazon-cloudwatch, fb-agent-fluent-bit, ServiceAccount (v1) has been added:
- 
+ # Source: fluent-bit/templates/serviceaccount.yaml
+ apiVersion: v1
+ kind: ServiceAccount
+ metadata:
+   name: fb-agent-fluent-bit
+   namespace: amazon-cloudwatch
+   labels:
+     helm.sh/chart: fluent-bit-0.48.5
+     app.kubernetes.io/name: fluent-bit
+     app.kubernetes.io/instance: fb-agent
+     app.kubernetes.io/version: "3.2.4"
+     app.kubernetes.io/managed-by: Helm
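
For readers unfamiliar with Fluent Bit multiline parsers, the two rules of `multiline-notify-python` in `parsers.conf` above can be approximated in Python as shown here. This is only an illustration: Fluent Bit uses its own regex engine and also applies `flush_timeout`, which this sketch ignores, and the function name `group_lines` is hypothetical.

```python
import re

# rule "start_state": a record starts on a line that begins with "[" and contains "]"
START = re.compile(r"^\[.*\].*")
# rule "cont": any following line that does not begin with "[" continues the record
CONT = re.compile(r"^[^\[].*")


def group_lines(lines):
    """Group raw log lines into records, approximating the two rules above."""
    records, current = [], None
    for line in lines:
        if current is not None and CONT.match(line):
            current.append(line)
            continue
        if current is not None:
            # The open record ends as soon as a line fails the "cont" rule.
            records.append("\n".join(current))
            current = None
        if START.match(line):
            current = [line]
        else:
            # Lines that never enter the state machine pass through unchanged.
            records.append(line)
    if current is not None:
        records.append("\n".join(current))
    return records
```

With this grouping, a Python traceback that follows a bracketed log line is kept in the same record instead of being shipped to CloudWatch one line at a time.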

Comparing release=cert-manager, chart=jetstack/cert-manager
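
The `modify` filter in `dataplane-log.conf` above relies on a negative-lookahead regex that is easy to misread, so here is a small Python sketch of its effect. It assumes the rules are applied in the order listed (renames first, then `Remove_regex`); the helper name `apply_modify` and the sample record are made up for illustration.

```python
import re

# Matches only key names that contain none of the three kept names.
REMOVE = re.compile(r"^((?!hostname|systemd_unit|message).)*$")

RENAMES = {
    "_HOSTNAME": "hostname",
    "_SYSTEMD_UNIT": "systemd_unit",
    "MESSAGE": "message",
}


def apply_modify(record: dict) -> dict:
    """Rename the three journald fields, then drop every key REMOVE matches."""
    renamed = {RENAMES.get(key, key): value for key, value in record.items()}
    return {key: value for key, value in renamed.items() if not REMOVE.match(key)}


sample = {
    "_HOSTNAME": "ip-10-0-0-1",
    "_SYSTEMD_UNIT": "kubelet.service",
    "MESSAGE": "started",
    "PRIORITY": "6",
    "_PID": "123",
}
print(apply_modify(sample))
```

Because the pattern tests for substrings, any key whose name merely contains `hostname`, `systemd_unit` or `message` would also survive the filter.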

@@ -0,0 +1,401 @@
env:
Member

I remember we had some difficulties with fluentbit back when we enabled it with @ben851 using kustomize. Is that a copy of the same configuration? 👀

Contributor Author

I'll take this up in a Slack DM, but the quick answer is that it's the same configuration, completely migrated and modified where necessary to work as a helmfile overrides file.
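
One way to sanity-check a migrated configuration like this is to look for `[INPUT]` sections that share a position database. In the rendered ConfigMap above, `fluent-bit.conf` both includes `notify-log.conf` and repeats its three `application.*` tail inputs inline, so paths such as `flb_container.db` appear twice; whether that matters in practice is worth confirming. The sketch below is an assumption-laden helper (the names `input_db_paths` and `duplicated_db_paths` are hypothetical) that parses classic-format config text and reports shared `DB` paths.

```python
from collections import Counter


def input_db_paths(conf_text: str) -> list[str]:
    """Collect the DB value of every [INPUT] section in classic-format config text."""
    paths, section = [], None
    for raw in conf_text.splitlines():
        line = raw.strip()
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].upper()
        elif section == "INPUT" and line:
            parts = line.split(None, 1)
            if len(parts) == 2 and parts[0].upper() == "DB":
                paths.append(parts[1].strip())
    return paths


def duplicated_db_paths(*conf_texts: str) -> list[str]:
    """Return DB paths used by more than one [INPUT] across the given files."""
    counts = Counter(path for text in conf_texts for path in input_db_paths(text))
    return sorted(path for path, n in counts.items() if n > 1)
```

Passing the contents of `fluent-bit.conf` and each included file to `duplicated_db_paths` gives a quick list of inputs to review.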

Collaborator

ben851 left a comment

We should probably delete the old one first

P0NDER0SA merged commit c44c6e6 into main Feb 13, 2025
2 checks passed
P0NDER0SA deleted the helm-fluentbit-telemetry branch February 13, 2025 15:10