Deploying a RabbitMQ cluster on k8s

Source: 7-6 Building a high-availability cluster with K8s

jaymie

2020-12-02

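Hi, I have a few questions about deploying a RabbitMQ cluster on k8s:

  1. Suppose a single pod contains three nodes. Does every node expose both ports 15672 and 5672?

  2. Do I still need to log in to the Docker container's console and run cluster_join by hand?

  3. For the mirrored queue configuration, do I also need to log in to the primary node and set it up manually?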


I also ran into a problem while setting this up:


Failed to fetch a list of nodes from Kubernetes API: 403

Here is the full log:

Starting RabbitMQ 3.8.9 on Erlang 23.1.4
Copyright (c) 2007-2020 VMware, Inc. or its affiliates.
Licensed under the MPL 2.0. Website: https://rabbitmq.com

 ##  ##      RabbitMQ 3.8.9
 ##  ##
 ##########  Copyright (c) 2007-2020 VMware, Inc. or its affiliates.
 ######  ##
 ##########  Licensed under the MPL 2.0. Website: https://rabbitmq.com

 Doc guides: https://rabbitmq.com/documentation.html
 Support:    https://rabbitmq.com/contact.html
 Tutorials:  https://rabbitmq.com/getstarted.html
 Monitoring: https://rabbitmq.com/monitoring.html

 Logs: <stdout>

 Config file(s): /etc/rabbitmq/rabbitmq.conf

 Starting broker...2020-12-02 00:14:06.880 [info] <0.272.0>
node           : rabbit@172.20.12.72
home dir       : /var/lib/rabbitmq
config file(s) : /etc/rabbitmq/rabbitmq.conf
cookie hash    : fEhBdooaT6CkwPXiM8I6gQ==
log(s)         : <stdout>
database dir   : /var/lib/rabbitmq/mnesia/rabbit@172.20.12.72
2020-12-02 00:14:07.225 [debug] <0.283.0> Lager installed handler lager_backend_throttle into lager_event
2020-12-02 00:14:08.477 [info] <0.272.0> Running boot step pre_boot defined by app rabbit
2020-12-02 00:14:08.477 [info] <0.272.0> Running boot step rabbit_core_metrics defined by app rabbit
2020-12-02 00:14:08.478 [info] <0.272.0> Running boot step rabbit_alarm defined by app rabbit
2020-12-02 00:14:08.482 [info] <0.352.0> Memory high watermark set to 3128 MiB (3280435609 bytes) of 7821 MiB (8201089024 bytes) total
2020-12-02 00:14:08.490 [info] <0.354.0> Enabling free disk space monitoring
2020-12-02 00:14:08.490 [info] <0.354.0> Disk free limit set to 50MB
2020-12-02 00:14:08.494 [info] <0.272.0> Running boot step code_server_cache defined by app rabbit
2020-12-02 00:14:08.494 [info] <0.272.0> Running boot step file_handle_cache defined by app rabbit
2020-12-02 00:14:08.495 [info] <0.357.0> Limiting to approx 1048479 file handles (943629 sockets)
2020-12-02 00:14:08.495 [info] <0.358.0> FHC read buffering:  OFF
2020-12-02 00:14:08.495 [info] <0.358.0> FHC write buffering: ON
2020-12-02 00:14:08.496 [info] <0.272.0> Running boot step worker_pool defined by app rabbit
2020-12-02 00:14:08.496 [info] <0.346.0> Will use 4 processes for default worker pool
2020-12-02 00:14:08.496 [info] <0.346.0> Starting worker pool 'worker_pool' with 4 processes in it
2020-12-02 00:14:08.496 [info] <0.272.0> Running boot step database defined by app rabbit
2020-12-02 00:14:08.497 [info] <0.272.0> Node database directory at /var/lib/rabbitmq/mnesia/rabbit@172.20.12.72 is empty. Assuming we need to join an existing cluster or initialise from scratch...
2020-12-02 00:14:08.497 [info] <0.272.0> Configured peer discovery backend: rabbit_peer_discovery_k8s
2020-12-02 00:14:08.513 [info] <0.272.0> Will try to lock with peer discovery backend rabbit_peer_discovery_k8s
2020-12-02 00:14:08.513 [info] <0.272.0> Peer discovery backend does not support locking, falling back to randomized delay
2020-12-02 00:14:08.513 [info] <0.272.0> Peer discovery backend rabbit_peer_discovery_k8s supports registration.
2020-12-02 00:14:08.513 [info] <0.272.0> Will wait for 1777 milliseconds before proceeding with registration...
2020-12-02 00:14:10.376 [error] <0.272.0> Failed to fetch a list of nodes from Kubernetes API: 403
2020-12-02 00:14:10.380 [error] <0.272.0> Peer discovery returned an error: "403". Will retry after a delay of 500 ms, 9 retries left...
2020-12-02 00:14:10.886 [error] <0.272.0> Failed to fetch a list of nodes from Kubernetes API: 403
2020-12-02 00:14:10.889 [error] <0.272.0> Peer discovery returned an error: "403". Will retry after a delay of 500 ms, 8 retries left...
2020-12-02 00:14:11.393 [error] <0.272.0> Failed to fetch a list of nodes from Kubernetes API: 403
2020-12-02 00:14:11.397 [error] <0.272.0> Peer discovery returned an error: "403". Will retry after a delay of 500 ms, 7 retries left...
2020-12-02 00:14:11.902 [error] <0.272.0> Failed to fetch a list of nodes from Kubernetes API: 403
2020-12-02 00:14:11.905 [error] <0.272.0> Peer discovery returned an error: "403". Will retry after a delay of 500 ms, 6 retries left...
2020-12-02 00:14:12.409 [error] <0.272.0> Failed to fetch a list of nodes from Kubernetes API: 403
2020-12-02 00:14:12.413 [error] <0.272.0> Peer discovery returned an error: "403". Will retry after a delay of 500 ms, 5 retries left...
2020-12-02 00:14:12.918 [error] <0.272.0> Failed to fetch a list of nodes from Kubernetes API: 403
2020-12-02 00:14:12.921 [error] <0.272.0> Peer discovery returned an error: "403". Will retry after a delay of 500 ms, 4 retries left...
2020-12-02 00:14:13.426 [error] <0.272.0> Failed to fetch a list of nodes from Kubernetes API: 403
2020-12-02 00:14:13.430 [error] <0.272.0> Peer discovery returned an error: "403". Will retry after a delay of 500 ms, 3 retries left...
2020-12-02 00:14:13.934 [error] <0.272.0> Failed to fetch a list of nodes from Kubernetes API: 403
2020-12-02 00:14:13.938 [error] <0.272.0> Peer discovery returned an error: "403". Will retry after a delay of 500 ms, 2 retries left...
2020-12-02 00:14:14.441 [error] <0.272.0> Failed to fetch a list of nodes from Kubernetes API: 403
2020-12-02 00:14:14.445 [error] <0.272.0> Peer discovery returned an error: "403". Will retry after a delay of 500 ms, 1 retries left...
2020-12-02 00:14:14.949 [error] <0.272.0> Failed to fetch a list of nodes from Kubernetes API: 403
2020-12-02 00:14:14.953 [error] <0.272.0> Peer discovery returned an error: "403". Will retry after a delay of 500 ms, 0 retries left...

BOOT FAILED
===========
Exception during startup:

2020-12-02 00:14:15.458 [info] <0.44.0> Application mnesia exited with reason: stopped
2020-12-02 00:14:15.458 [error] <0.272.0>
2020-12-02 00:14:15.458 [info] <0.44.0> Application mnesia exited with reason: stopped
2020-12-02 00:14:15.458 [error] <0.272.0> BOOT FAILED
2020-12-02 00:14:15.458 [error] <0.272.0> ===========
2020-12-02 00:14:15.458 [error] <0.272.0> Exception during startup:
2020-12-02 00:14:15.458 [error] <0.272.0>
2020-12-02 00:14:15.458 [error] <0.272.0>     rabbit_boot_steps:run_boot_steps/1 line 20
   rabbit_boot_steps:run_boot_steps/1 line 20
   rabbit_boot_steps:'-run_boot_steps/1-lc$^0/1-0-'/1 line 19
   rabbit_boot_steps:run_step/2 line 46
   rabbit_boot_steps:'-run_step/2-lc$^0/1-0-'/2 line 41
   rabbit_mnesia:init/0 line 76
   rabbit_mnesia:init_with_lock/3 line 111
   rabbit_mnesia:run_peer_discovery_with_retries/2 line 145
   rabbit_mnesia:run_peer_discovery_with_retries/2 line 138
2020-12-02 00:14:15.459 [error] <0.272.0>     rabbit_boot_steps:'-run_boot_steps/1-lc$^0/1-0-'/1 line 19
2020-12-02 00:14:15.459 [error] <0.272.0>     rabbit_boot_steps:run_step/2 line 46
2020-12-02 00:14:15.459 [error] <0.272.0>     rabbit_boot_steps:'-run_step/2-lc$^0/1-0-'/2 line 41
2020-12-02 00:14:15.459 [error] <0.272.0>     rabbit_mnesia:init/0 line 76
2020-12-02 00:14:15.459 [error] <0.272.0>     rabbit_mnesia:init_with_lock/3 line 111
2020-12-02 00:14:15.459 [error] <0.272.0>     rabbit_mnesia:run_peer_discovery_with_retries/2 line 145
2020-12-02 00:14:15.459 [error] <0.272.0>     rabbit_mnesia:run_peer_discovery_with_retries/2 line 138
error:{badmatch,ok}

2020-12-02 00:14:15.459 [error] <0.272.0> error:{badmatch,ok}
2020-12-02 00:14:15.459 [error] <0.272.0>

1 answer

Moody

2020-12-02

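Hi jaymie,

  1. The nodes should not be deployed in the same pod. Put each node in its own pod: containers in one pod share the same lifecycle, so they would all go down together.

  2. The scripts we use in the course do require you to run cluster join and configure mirrored queues by hand. If you use tooling such as Helm, those steps do not have to be done manually.

I took a look at your error. Could you check whether the script enables the rabbitmq_peer_discovery_k8s plugin for RabbitMQ? It looks like a configuration-related problem.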
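One configuration item worth checking for the 403: an HTTP 403 from the Kubernetes API means the request was authenticated but not authorized, and as far as I know the peer discovery plugin needs to read the Endpoints object of the RabbitMQ Service. Below is a minimal sketch that generates a ServiceAccount, a Role allowing `get` on `endpoints`, and a RoleBinding as JSON. The names `rabbitmq` and `rabbitmq-endpoint-reader` and the function `rabbitmq_rbac` are hypothetical, not taken from the course scripts; adjust them to your own manifests.

```python
import json


def rabbitmq_rbac(namespace, service_account="rabbitmq", role="rabbitmq-endpoint-reader"):
    """Build ServiceAccount, Role and RoleBinding objects for one namespace.

    All object names here are illustrative assumptions.
    """
    def meta(name):
        # Every object lives in the same namespace as the RabbitMQ pods.
        return {"name": name, "namespace": namespace}

    return {
        "apiVersion": "v1",
        "kind": "List",
        "items": [
            {"apiVersion": "v1", "kind": "ServiceAccount", "metadata": meta(service_account)},
            {
                "apiVersion": "rbac.authorization.k8s.io/v1",
                "kind": "Role",
                "metadata": meta(role),
                # Read-only access to Endpoints in this namespace.
                "rules": [{"apiGroups": [""], "resources": ["endpoints"], "verbs": ["get"]}],
            },
            {
                "apiVersion": "rbac.authorization.k8s.io/v1",
                "kind": "RoleBinding",
                "metadata": meta(role),
                "subjects": [
                    {"kind": "ServiceAccount", "name": service_account, "namespace": namespace}
                ],
                "roleRef": {
                    "apiGroup": "rbac.authorization.k8s.io",
                    "kind": "Role",
                    "name": role,
                },
            },
        ],
    }


if __name__ == "__main__":
    # Print the manifest so it can be piped to: kubectl apply -f -
    print(json.dumps(rabbitmq_rbac("default"), indent=2))
```

The RabbitMQ pod spec would also need `serviceAccountName` set to the same service account; otherwise the pod runs as the namespace's default account and the binding has no effect.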

jaymie
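Now I have run into another problem. I first deployed into the default namespace, but my application runs in prod and cross-namespace access did not work, so I decided to move the deployment to prod. Now it will not come up in the prod namespace at all; the error is "Failed to fetch a list of nodes from Kubernetes API: 404". I have never studied k8s, so these problems are a real headache.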
2020-12-02
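A possible way to narrow down the 404: to my understanding, the plugin asks the API server for the Endpoints object of the RabbitMQ Service in the pod's own namespace, so a 404 would be consistent with that Service not existing in prod under the expected name (for example, a manifest that still says `namespace: default`). The sketch below repeats that request with the pod's service account credentials. The host `kubernetes.default.svc`, the default service name `rabbitmq`, and the function names are assumptions; it has to run in a pod that uses the same service account and has Python available.

```python
import ssl
import urllib.error
import urllib.request
from pathlib import Path

# Standard location of the service account files mounted into a pod.
SA_DIR = Path("/var/run/secrets/kubernetes.io/serviceaccount")


def endpoints_url(namespace, service_name, host="kubernetes.default.svc", port=443):
    """URL of the Endpoints object for a Service in a namespace."""
    return f"https://{host}:{port}/api/v1/namespaces/{namespace}/endpoints/{service_name}"


def explain_status(status):
    """Map an HTTP status code to a likely place to look."""
    hints = {
        200: "OK: the Endpoints object is readable",
        401: "Unauthorized: the service account token was not accepted",
        403: "Forbidden: check the Role and RoleBinding (RBAC) for the service account",
        404: "Not found: check the Service name and the namespace it was created in",
    }
    return hints.get(status, f"Unexpected status {status}")


def probe(service_name="rabbitmq"):
    """Issue the request with the pod's own token and return the status code."""
    namespace = (SA_DIR / "namespace").read_text().strip()
    token = (SA_DIR / "token").read_text().strip()
    ctx = ssl.create_default_context(cafile=str(SA_DIR / "ca.crt"))
    req = urllib.request.Request(
        endpoints_url(namespace, service_name),
        headers={"Authorization": f"Bearer {token}"},
    )
    try:
        with urllib.request.urlopen(req, context=ctx, timeout=5) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 403, 404 and similar arrive as HTTPError; the code is what matters here.
        return err.code


if __name__ == "__main__":
    print(explain_status(probe()))
```

If this reports 404, it is worth comparing the Service name in prod with the service name RabbitMQ is configured to look up, and confirming that the Service, Role, and RoleBinding were all recreated in prod rather than left in default.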