diff --git a/docs/guides/gettingStarted/assets/alert6.png b/docs/guides/gettingStarted/assets/alert6.png
index 15325eac44..7d921fe977 100644
Binary files a/docs/guides/gettingStarted/assets/alert6.png and b/docs/guides/gettingStarted/assets/alert6.png differ
diff --git a/i18n/en/docusaurus-plugin-content-docs-guides/current/gettingStarted/admin.mdx b/i18n/en/docusaurus-plugin-content-docs-guides/current/gettingStarted/admin.mdx
index b1e5ded99e..00edc74724 100644
--- a/i18n/en/docusaurus-plugin-content-docs-guides/current/gettingStarted/admin.mdx
+++ b/i18n/en/docusaurus-plugin-content-docs-guides/current/gettingStarted/admin.mdx
@@ -1,149 +1,170 @@
 ---
-title: 'Management Console and Monitoring System'
+title: 'Monitoring and Alert System'
 sidebar_position: 6
 ---
-# Management Console 🖥️
-### 🟢 Open Ports
+## Component Description
+When deploying with Docker Compose, the following components are deployed automatically. When deploying from source code, every component except openim-admin must be enabled manually in docker-compose.yaml.
-| TCP Port | Description | Action ⚙️ |
-|:---------:|:------------------------:|:----------------------------------------------:|
-| TCP:11002 | `http://ip:11002` Access Management Console | Open the port or use nginx reverse proxy, and disable the firewall |
+| Component Name | Description | Deployment Instructions |
+|-----------------|---------------------------------------------------|------------------------------------------------------|
+| openim-admin | Management console that provides the entry point to the monitoring pages | Automatically enabled in both Docker and source code deployments |
+| prometheus | Monitoring component that collects and stores metric data | Automatically enabled in Docker; manual enablement required for source deployment |
+| alertmanager | Component that manages and sends alerts | Automatically enabled in Docker; manual enablement required for source deployment |
+| grafana | Dashboard component that displays monitoring data | Automatically enabled in Docker; manual enablement required for source deployment |
+| node-exporter | Collects metrics from nodes (e.g., servers) | Automatically enabled in Docker; manual enablement required for source deployment |
-## 📌 Accessing the Management Console
+## Configuration File Description
-:::tip
-Enter `http://ip:11002` in your browser to access the management console. This IP is the server IP, and make sure your browser can access it. The default username and password are both admin1.
-:::
+| File Name | Description | Modification Items |
+|-----------------------------------|---------------------------|-------------------------------------------------|
+| config/config.yaml | OpenIM service configuration | Set `prometheus.enable: true` to enable monitoring |
+| config/prometheus.yml | Prometheus configuration | No modification needed |
+| config/instance-down-rules.yml | Alert rules | Contains two rules by default (instance_down, database_insert_failure_alerts) |
+| config/alertmanager.yml | Alert management configuration | Configure the sender and recipient email settings |
+| config/email.tmpl | Email alert template | Default email template; modify as needed |
+| config/templates/prometheus-dashboard.yaml | Custom dashboard | No modification needed |
-![PC Web Interface](./assets/admin.jpg)
+## Logging into the Management Console
-# Monitoring System
-This document introduces the deployment and usage of Prometheus monitoring and alerting functions for openim deployed in binary & Docker modes. For monitoring and alerting in openim deployed via k8s, please refer to the following document: https://github.com/openimsdk/helm-charts/blob/main/docs/user-guide-zh.md
+Enter `http://ip:11002` in your browser to access the management console, where `ip` is the server's OPENIM_IP; make sure your browser can reach it. The default username and password are both chatAdmin.
-## Binary Deployment of openim-server Monitoring
-The following are the steps for deploying openim-server in binary mode:
+import Image4 from './assets/admin.jpg';
-Step 1: Clone the repository and switch to the release branch
-```
-git clone https://github.com/openimsdk/open-im-server && cd open-im-server
-```
-
-Step 2: Set common environment variables.
-
-Step 3: Deploy components.
-```
-make init && docker compose up -d.
-```
-To enable monitoring, divide step 3 into three smaller steps: 3.1, 3.2, 3.3
-
-Step 3.1: Execute the make init command, open the generated config/config.yaml file, and modify as follows: prometheus.enable: true
-
-> To configure alerting functions, rewrite alertmanager.yml, email.tmpl, instance-down-rules.yml at this step. For convenience, you may skip rewriting these three files initially.
-
-![PC Web Interface](./assets/config1.png)
-
-Step 3.2: Modify the docker-compose.yml file to enable the monitoring components required: prometheus, grafana, alertmanager, node-exporter (optional). Uncomment the respective modules as shown in the image.
-
-![PC Web Interface](./assets/docker-compose1.png)
-
-![PC Web Interface](./assets/docker-compose2.png)
-
-Step 3.3: Execute docker compose up -d to install all monitoring components and the management console module.
-> Since the openim-admin management console module uses the chat service module, you also need to install our additional chat module. Documentation for installing the chat module can be found in Quick Start>Source Code Deployment>AppServer(chat).
-
-## Docker Deployment of openim-server Monitoring
-The following are the steps for deploying openim-server using Docker:
-
-Step 1: Set Common Environment Variables
-
-Step 2: Pull and Launch Image
-```
-git clone https://github.com/openim-sigs/openim-docker openim/openim-docker && export openim=$(pwd)/openim && cd $openim/openim-docker && make init && docker compose up -d
-```
-To enable monitoring, divide step 2 into three smaller steps: 2.1, 2.2, 2.3
-
-Step 2.1: Execute the following command
-```
-git clone https://github.com/openim-sigs/openim-docker openim/openim-docker && export openim=$(pwd)/openim && cd $openim/openim-docker && make init
-```
-Open the generated config/config.yaml file and modify as follows: prometheus.enable: true.
-> To configure alerting functions, rewrite alertmanager.yml, email.tmpl, instance-down-rules.yml at this step.
-
-![PC Web Interface](./assets/config1.png)
-
-Step 2.2: Modify the docker-compose.yml file to enable the monitoring components required: prometheus, grafana, alertmanager, node-exporter (optional). Uncomment the respective modules as shown in the image.
-![PC Web Interface](./assets/docker-compose1.png)
-
-![PC Web Interface](./assets/docker-compose2.png)
-
-Step 2.3: Execute docker compose up -d to install all monitoring components.
+admin
-## k8s Deployment of openim-server
-For deploying openim-server and enabling monitoring features in k8s, please refer to the document: https://github.com/openimsdk/helm-charts/blob/main/README-zh_CN.md
+## Logging into Grafana
-## Experience
-1. Access the openim management console webpage via the link, with the address: https://ip:11002/
-2. The default username and password for the management console are (admin1:admin1). Clicking the link below will open the grafana webpage.
+First, log in to the management console, then click the data monitoring menu on the left and enter the default username (admin) and password (admin) to log in to Grafana.
+![PC Web Interface](./assets/login1.png)
-![PC Web Interface](./assets/admin1.png)
+## Adding Prometheus Data Source
-3. Log into grafana using the default username and password (admin:admin).
-
-![PC Web Interface](./assets/login.png)
-
-4. Add Prometheus as a data source. In the image below, enter the URL of the Prometheus data source: http://172.28.0.1:19090 and click "Save and Test" to save the data source.
+As shown in the image below, enter the URL of the Prometheus data source, http://172.28.0.1:19090 (19090 is the default Prometheus port), and click "Save and Test" to save the data source.
 ![PC Web Interface](./assets/database.png)
 ![PC Web Interface](./assets/database2.png)
-5. Import the custom dashboard for the Docker version of openim. Click the import button as shown in the image below.
+## Importing Custom Dashboard
+
+Click the import button shown in the image below to import the dashboard.
 ![PC Web Interface](./assets/dashboard.png)
-Copy the contents of config/template/prometheus-dashboard.yaml into the area shown in the image below, then click the load button.
+Copy the contents of https://github.com/openimsdk/open-im-server/tree/main/config/templates/prometheus-dashboard.yaml into the area shown in the image below, then click the load button.
 ![PC Web Interface](./assets/dashboard2.png)
-Select your Data Source and job, and you will see custom metric information as shown below.
+Select your data source and job to see the custom metric information, as shown in the image below.
 ![PC Web Interface](./assets/dashboard3.png)
-6. Import the official node-exporter dashboard from the official website (https://grafana.com/grafana/dashboards/ ), find a node-exporter dashboard you like, and import it, for example, 1860 (Node Exporter Full).
+## Importing Node-Exporter Dashboard
+Enter dashboard ID 1860 (Node Exporter Full) to import it, or choose another node-exporter dashboard from the official website (https://grafana.com/grafana/dashboards/).
 ![PC Web Interface](./assets/dashboard4.png)
-You will see node-exporter metric information as shown below.
+The node-exporter metrics are then displayed as shown in the image below.
 ![PC Web Interface](./assets/dashboard5.png)
-# Alert System
-The system has implemented two default alert rules (instance_down, database_insert_failure_alerts) with email alerts. Simply modify the sending and receiving email configurations in the alertmanager.yml file in the config folder to receive system alert emails for the default rules.
-> To implement alerts via DingTalk, WeChat Work, etc., you need to rewrite alertmanager.yml. You can refer to the official documentation of the alert management module: https://prometheus.io/docs/alerting/latest/alertmanager/
+## Alert Configuration File Description
-## Alert Configuration File Explanation
-1. Three alert configuration files are involved: alertmanager.yml, email.tmpl, and instance-down-rules.yml, as shown below:
-![PC Web Interface](./assets/alert1.png)
-
-2. The email alert architecture is shown below. The Prometheus component loads the alert rule file instance-down-rules.yml and sends alert information that meets the conditions to the alertmanager component. The alertmanager component loads alertmanager.yml and email.tmpl files and sends emails using the configured alert email information and email template.
+1. Email alert architecture diagram: The Prometheus component loads the alert rule file instance-down-rules.yml and sends alerts whose conditions are met to the alertmanager component. The alertmanager component loads alertmanager.yml and email.tmpl, then sends emails using the configured email settings and the email template.
 ![PC Web Interface](./assets/alert2.png)
-3. Explanation of the alert rule file instance-down-rules.yaml. To add alert rules, you can add them in the instance-down-rules.yml file:
-![PC Web Interface](./assets/alert3.png)
-
-4. Explanation of the alert management file alertmanager.yml. Modify it with your real sending and receiving email configuration information to receive alert messages:
-![PC Web Interface](./assets/alert4.png)
-
-5. Explanation of the email template file email.tmpl. This file is in HTML format. The alert management module fills in the variable information and renders it into an HTML format file for sending emails. You can rewrite the template file according to your needs:
-![PC Web Interface](./assets/alert5.png)
+2. prometheus.yml file description: This file configures the path of the alert rule files, the address of the alertmanager service, and the target addresses from which monitoring data is scraped. The default configuration requires no modification.
+```
+# Alertmanager configuration
+alerting:
+  alertmanagers:
+    - static_configs:
+        - targets: ['172.28.0.1:19093']
+
+# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
+rule_files:
+  - "instance-down-rules.yml"
+```
-## Alert Deployment
-The alert module is deployed together with the monitoring module. There is no separate deployment for the alert module. Modify the alertmanager.yml, email.tmpl, instance-down-rules.yml files as needed during the deployment steps for monitoring.
+3. Alert rule file instance-down-rules.yml description: Two alert rule groups (instance_down, database_insert_failure_alerts) are defined by default. To add more alert rules, append them to this file:
+```
+groups:
+  - name: instance_down # Alert rule one: trigger an alert if a monitored instance has been down for more than one minute
+    rules:
+      - alert: InstanceDown
+        expr: up == 0
+        for: 1m
+        labels:
+          severity: critical
+        annotations:
+          summary: "Instance {{ $labels.instance }} down"
+          description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 1 minute."
+
+  - name: database_insert_failure_alerts # Alert rule two: trigger an alert if the msg_insert_redis_failed_total or msg_insert_mongo_failed_total metric increases
+    rules:
+      - alert: DatabaseInsertFailed
+        expr: (increase(msg_insert_redis_failed_total[5m]) > 0) or (increase(msg_insert_mongo_failed_total[5m]) > 0)
+        for: 1m
+        labels:
+          severity: critical
+        annotations:
+          summary: "Increase in MsgInsertRedisFailedCounter or MsgInsertMongoFailedCounter detected"
+          description: "Either MsgInsertRedisFailedCounter or MsgInsertMongoFailedCounter has increased in the last 5 minutes, indicating failures in message insert operations to Redis or MongoDB; Redis or MongoDB may have crashed."
+```
-(Monitoring System>Binary Deployment of openim-server Monitoring>Step 3.1; Monitoring System>Docker Deployment of openim-server Monitoring>Step 2.1)
+4. Alert management file alertmanager.yml description: Set the sender and recipient email settings to receive alert emails. To send alerts through DingTalk, WeChat Work, or similar channels, adapt alertmanager.yml accordingly; see the official Alertmanager documentation: https://prometheus.io/docs/alerting/latest/alertmanager/
+```
+global:
+  resolve_timeout: 5m
+  smtp_from: alert@openim.io # Alert sender email
+  smtp_smarthost: smtp.163.com:465 # Sender email SMTP address
+  smtp_auth_username: alert@openim.io # Sender email authorization username, usually the same as smtp_from
+  smtp_auth_password: YOURAUTHPASSWORD # Sender email authorization code
+  smtp_require_tls: false
+  smtp_hello: openim alert
+
+templates:
+  - /etc/alertmanager/email.tmpl # Email template
+
+route:
+  group_by: ['alertname']
+  group_wait: 5s
+  group_interval: 5s
+  repeat_interval: 5m
+  receiver: email
+receivers:
+  - name: email
+    email_configs:
+      - to: 'alert@example.com' # Recipient alert email
+        html: '{{ template "email.to.html" . }}'
+        headers: { Subject: "[OPENIM-SERVER]Alarm" } # Email subject
+        send_resolved: true
+```
-For the experience phase, you only need to modify the sending and receiving email configurations in the alertmanager.yml file to receive system alert emails for the default rules quickly.
+5. Email template file email.tmpl description: This file is an HTML template. Alertmanager fills in the variables and renders it into the HTML body of the alert email. Modify it as needed:
+```
+{{ define "email.to.html" }}
+{{ range .Alerts }}
+
+OpenIM Alert
+Alert Program: Prometheus Alert
+Severity Level: {{ .Labels.severity }}
+Alert Type: {{ .Labels.alertname }}
+Affected Host: {{ .Labels.instance }}
+Affected Service: {{ .Labels.job }}
+Alert Subject: {{ .Annotations.summary }}
+Trigger Time: {{ .StartsAt.Format "2006-01-02 15:04:05" }}
+
+{{ end }}
+{{ end }}
+```
 ## Alert Experience
-You can manually trigger the instancedown alert rule. Execute the make stop command to stop the openim-server service, wait for more than 5 minutes, and you will receive an alert email as shown below:
+You can manually trigger the instance_down alert rule. If OpenIM is deployed from source, run `make stop` to stop the openim-server service. After waiting more than 5 minutes, you will receive an alert email like the one shown below:
 ![PC Web Interface](./assets/alert6.png)
-# Log System
-If openim services are deployed in a k8s environment via helm chart, you can view loki logs through grafana, i.e., view logs of all openim services through grafana. Currently, binary and Docker deployments do not integrate the loki log collection component. To experience loki log collection, please deploy using helm chart. Details can be found at https://github.com/openimsdk/helm-charts/blob/main/docs/user-guide-zh.md
\ No newline at end of file
+## Log System
+If the OpenIM services are deployed in a k8s environment via the helm chart, you can view the Loki logs of all OpenIM services through Grafana.
+Binary and Docker deployments do not currently integrate the Loki log collection component. To try Loki log collection, deploy with the helm chart.
+For more details, please refer to the link: https://github.com/openimsdk/helm-charts/blob/main/docs/user-guide-zh.md
diff --git a/i18n/en/docusaurus-plugin-content-docs-guides/current/gettingStarted/assets/admin1.png b/i18n/en/docusaurus-plugin-content-docs-guides/current/gettingStarted/assets/admin1.png
index ab4b0402a4..2a5d99d906 100644
Binary files a/i18n/en/docusaurus-plugin-content-docs-guides/current/gettingStarted/assets/admin1.png and b/i18n/en/docusaurus-plugin-content-docs-guides/current/gettingStarted/assets/admin1.png differ
diff --git a/i18n/en/docusaurus-plugin-content-docs-guides/current/gettingStarted/assets/alert1.png b/i18n/en/docusaurus-plugin-content-docs-guides/current/gettingStarted/assets/alert1.png
index 3d5c79c1cb..05ab394391 100644
Binary files a/i18n/en/docusaurus-plugin-content-docs-guides/current/gettingStarted/assets/alert1.png and b/i18n/en/docusaurus-plugin-content-docs-guides/current/gettingStarted/assets/alert1.png differ
diff --git a/i18n/en/docusaurus-plugin-content-docs-guides/current/gettingStarted/assets/alert2.png b/i18n/en/docusaurus-plugin-content-docs-guides/current/gettingStarted/assets/alert2.png
index ad9fa77d14..e8f04ad1b1 100644
Binary files a/i18n/en/docusaurus-plugin-content-docs-guides/current/gettingStarted/assets/alert2.png and b/i18n/en/docusaurus-plugin-content-docs-guides/current/gettingStarted/assets/alert2.png differ
diff --git a/i18n/en/docusaurus-plugin-content-docs-guides/current/gettingStarted/assets/alert6.png b/i18n/en/docusaurus-plugin-content-docs-guides/current/gettingStarted/assets/alert6.png
index 15325eac44..7d921fe977 100644
Binary files a/i18n/en/docusaurus-plugin-content-docs-guides/current/gettingStarted/assets/alert6.png and b/i18n/en/docusaurus-plugin-content-docs-guides/current/gettingStarted/assets/alert6.png differ
diff --git a/i18n/en/docusaurus-plugin-content-docs-guides/current/gettingStarted/assets/login.png b/i18n/en/docusaurus-plugin-content-docs-guides/current/gettingStarted/assets/login.png
index 2fea38f5b1..69096a6dee 100644
Binary files a/i18n/en/docusaurus-plugin-content-docs-guides/current/gettingStarted/assets/login.png and b/i18n/en/docusaurus-plugin-content-docs-guides/current/gettingStarted/assets/login.png differ
diff --git a/i18n/en/docusaurus-plugin-content-docs-guides/current/gettingStarted/assets/login1.png b/i18n/en/docusaurus-plugin-content-docs-guides/current/gettingStarted/assets/login1.png
new file mode 100644
index 0000000000..c09dfdffe4
Binary files /dev/null and b/i18n/en/docusaurus-plugin-content-docs-guides/current/gettingStarted/assets/login1.png differ