Question: MMM master-slave failover fails

Source: 5-13 MMM architecture demo (part 2)

crazy398

2019-02-11

3 hosts:
db1: 192.168.147.142
db2: 192.168.147.141
db3: 192.168.147.139
Virtual IPs:
db1: 192.168.147.90/192.168.147.91
db2: 192.168.147.90/192.168.147.92
db3: 192.168.147.93
Relationships:
db1 and db2 are master and slave of each other
db3 is a slave of db1
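
To make the relationships above concrete, here is a minimal sketch that models the topology as a plain mapping and lists which hosts lose their replication source when one host stops. The names `REPLICATES_FROM`, `orphaned_slaves` and `repoint_plan` are made up for illustration and are not part of MMM; the sketch only restates the setup described in this post.

```python
# Replication topology as described in the post: each host maps to the
# host it replicates from. Helper names are hypothetical, not MMM APIs.
REPLICATES_FROM = {
    "db1": "db2",  # db1 and db2 replicate from each other
    "db2": "db1",
    "db3": "db1",  # db3 is a plain slave of db1
}


def orphaned_slaves(failed_host, topology=REPLICATES_FROM):
    """Return the hosts whose replication source is the failed host."""
    return sorted(h for h, src in topology.items() if src == failed_host)


def repoint_plan(failed_host, new_master, topology=REPLICATES_FROM):
    """Return {slave: new_master} for slaves that need a new source.

    The surviving master is excluded: it has nothing to repoint to
    until the failed host comes back.
    """
    return {
        h: new_master
        for h in orphaned_slaves(failed_host, topology)
        if h != new_master
    }


print(orphaned_slaves("db1"))      # ['db2', 'db3']
print(repoint_plan("db1", "db2"))  # {'db3': 'db2'}
```

In this setup, stopping db1 leaves db3 without a source unless it is repointed at db2, which matches the symptom reported here.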

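Monitor location: db2
db1 IP:
[image]
db2 IP:
[image]
db3 IP:
[image]
mmm_common.conf configuration (identical on all 3 hosts):
[image]
[image]
db3 replication status:
[image]
db2 replication status:
[image]
db1 replication status:
[image]
Shutting down db1:
[image]
db3 after the shutdown (replication fails):
[image]
db3 mysql.cnf configuration:
[image]
db2 mysql.cnf configuration:
[image]
db1 mysql.cnf configuration:
[image]
Could someone help me figure out where the problem is?
debug: 1
mysql-mmm-monitor runs on 192.168.147.141; everything else is unchanged: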

6 answers

慕粉3182897

2022-07-19

1234


crazy398

Asker

2019-02-17

//img.mukewang.com/szimg/5c6977150001c20009760478.jpg

I can't find the group anymore.

crazy398

Asker

2019-02-13

//img.mukewang.com/szimg/5c641dc80001b49c05390447.jpg

//img.mukewang.com/szimg/5c641dc900017f9a07140404.jpg

//img.mukewang.com/szimg/5c641dc90001bb4a05130418.jpg

IP configuration of the three hosts

crazy398

Asker

2019-02-13

//img.mukewang.com/szimg/5c6419b50001432c08540558.jpg

//img.mukewang.com/szimg/5c6419b60001543408950539.jpg

//img.mukewang.com/szimg/5c6419b600010db009580564.jpg

All three hosts have the mmm_agent, mmm_monitor and replication accounts configured.

crazy398

Asker

2019-02-12


debug: 1

mysql-mmm-monitor runs on 192.168.147.141; everything else is unchanged.

The logs were exported and converted with unix2dos.

Method 1: service mysql-mmm-monitor start >> mmm_log.txt, output as follows:

2019/02/12 21:45:26  INFO Check 'rep_backlog' on 'db1' is ok!

2019/02/12 21:46:36 DEBUG Core: reaped child 8319

2019/02/12 21:46:36  INFO Shutting down checker 'rep_backlog'...

2019/02/12 21:32:38  INFO Check 'rep_threads' on 'db2' is ok!

2019/02/12 21:32:38  INFO Check 'rep_threads' on 'db3' is ok!

2019/02/12 21:32:38  INFO Check 'rep_threads' on 'db1' is ok!

2019/02/12 21:34:59  WARN Check 'rep_threads' on 'db1' is in unknown state! Message: UNKNOWN: Connect error (host = 192.168.147.142:3306, user = mmm_monitor)! Can't connect to MySQL server on '192.168.147.142' (111)

2019/02/12 21:39:20  INFO Check 'rep_threads' on 'db1' is ok!

2019/02/12 21:40:45  WARN Check 'rep_threads' on 'db2' is in unknown state! Message: UNKNOWN: Connect error (host = 192.168.147.141:3306, user = mmm_monitor)! Can't connect to MySQL server on '192.168.147.141' (111)

2019/02/12 21:42:00  INFO Check 'rep_threads' on 'db2' is ok!

2019/02/12 21:43:40  WARN Check 'rep_threads' on 'db1' is in unknown state! Message: UNKNOWN: Connect error (host = 192.168.147.142:3306, user = mmm_monitor)! Can't connect to MySQL server on '192.168.147.142' (111)

2019/02/12 21:45:26  INFO Check 'rep_threads' on 'db1' is ok!

2019/02/12 21:46:36 DEBUG Core: reaped child 8322

2019/02/12 21:46:36  INFO Shutting down checker 'rep_threads'...

2019/02/12 21:46:36  INFO Shutting down checker 'ping_ip'...

:03 DEBUG Listener: Waiting for connection...

2019/02/12 21:45:06 DEBUG Listener: Waiting for connection...

2019/02/12 21:45:10 DEBUG Listener: Waiting for connection...

2019/02/12 21:45:13 DEBUG Listener: Waiting for connection...

2019/02/12 21:45:16 DEBUG Listener: Waiting for connection...

2019/02/12 21:45:19 DEBUG Listener: Waiting for connection...

2019/02/12 21:45:22 DEBUG Listener: Waiting for connection...

2019/02/12 21:45:25 DEBUG Listener: Waiting for connection...

2019/02/12 21:45:28 DEBUG Listener: Waiting for connection...

2019/02/12 21:45:31 DEBUG Listener: Waiting for connection...

2019/02/12 21:45:34 DEBUG Listener: Waiting for connection...

2019/02/12 21:45:37 DEBUG Listener: Waiting for connection...

2019/02/12 21:45:40 DEBUG Listener: Waiting for connection...

2019/02/12 21:45:43 DEBUG Listener: Waiting for connection...

2019/02/12 21:45:46 DEBUG Listener: Waiting for connection...

2019/02/12 21:45:49 DEBUG Listener: Waiting for connection...

2019/02/12 21:45:52 DEBUG Listener: Waiting for connection...

2019/02/12 21:45:55 DEBUG Listener: Waiting for connection...

2019/02/12 21:45:58 DEBUG Listener: Waiting for connection...

2019/02/12 21:46:01 DEBUG Listener: Waiting for connection...

2019/02/12 21:46:04 DEBUG Listener: Waiting for connection...

2019/02/12 21:46:07 DEBUG Listener: Waiting for connection...

2019/02/12 21:46:10 DEBUG Listener: Waiting for connection...

2019/02/12 21:46:13 DEBUG Listener: Waiting for connection...

2019/02/12 21:46:16 DEBUG Listener: Waiting for connection...

2019/02/12 21:46:19 DEBUG Listener: Waiting for connection...

2019/02/12 21:46:22 DEBUG Listener: Waiting for connection...

2019/02/12 21:46:25 DEBUG Listener: Waiting for connection...

2019/02/12 21:46:28 DEBUG Listener: Waiting for connection...

2019/02/12 21:46:31 DEBUG Listener: Waiting for connection...

2019/02/12 21:46:34 DEBUG Listener: Waiting for connection...

2019/02/12 21:46:36 DEBUG Core: reaped child 8314

2019/02/12 21:46:36 DEBUG Core: reaped child 8324

ommand 'SET_STATUS(AWAITING_RECOVERY, , db2)' to db1 (192.168.147.142:9989)

2019/02/12 21:46:21 DEBUG Received Answer: OK: Status applied successfully!|UP:2499.47

2019/02/12 21:46:24 DEBUG Sending command 'SET_STATUS(ONLINE, reader(192.168.147.91),writer(192.168.147.90), db2)' to db2 (192.168.147.141:9989)

2019/02/12 21:46:24 DEBUG Received Answer: OK: Status applied successfully!|UP:2494.70

2019/02/12 21:46:24 DEBUG Sending command 'SET_STATUS(ONLINE, reader(192.168.147.92),reader(192.168.147.93), db2)' to db3 (192.168.147.139:9989)

2019/02/12 21:46:24 DEBUG Received Answer: OK: Status applied successfully!|UP:2482.67

2019/02/12 21:46:24 DEBUG Sending command 'SET_STATUS(AWAITING_RECOVERY, , db2)' to db1 (192.168.147.142:9989)

2019/02/12 21:46:24 DEBUG Received Answer: OK: Status applied successfully!|UP:2502.48

2019/02/12 21:46:27 FATAL State of host 'db1' changed from AWAITING_RECOVERY to ONLINE because of auto_set_online(60 seconds). It was in state AWAITING_RECOVERY for 60 seconds

2019/02/12 21:46:27 DEBUG Sending command 'SET_STATUS(ONLINE, , db2)' to db1 (192.168.147.142:9989)

2019/02/12 21:46:27 DEBUG Received Answer: OK: Status applied successfully!|UP:2505.41

2019/02/12 21:46:27  INFO Moving role 'reader(192.168.147.92)' from host 'db3' to host 'db1'

2019/02/12 21:46:27 DEBUG Sending command 'SET_STATUS(ONLINE, reader(192.168.147.91),writer(192.168.147.90), db2)' to db2 (192.168.147.141:9989)

2019/02/12 21:46:27 DEBUG Received Answer: OK: Status applied successfully!|UP:2497.69

2019/02/12 21:46:27 DEBUG Sending command 'SET_STATUS(ONLINE, reader(192.168.147.93), db2)' to db3 (192.168.147.139:9989)

2019/02/12 21:46:27 DEBUG Received Answer: OK: Status applied successfully!|UP:2485.65

2019/02/12 21:46:27 DEBUG Sending command 'SET_STATUS(ONLINE, reader(192.168.147.92), db2)' to db1 (192.168.147.142:9989)

2019/02/12 21:46:27 DEBUG Received Answer: OK: Status applied successfully!|UP:2505.47

2019/02/12 21:46:30 DEBUG Sending command 'SET_STATUS(ONLINE, reader(192.168.147.91),writer(192.168.147.90), db2)' to db2 (192.168.147.141:9989)

2019/02/12 21:46:30 DEBUG Received Answer: OK: Status applied successfully!|UP:2500.69

2019/02/12 21:46:30 DEBUG Sending command 'SET_STATUS(ONLINE, reader(192.168.147.93), db2)' to db3 (192.168.147.139:9989)

2019/02/12 21:46:30 DEBUG Received Answer: OK: Status applied successfully!|UP:2488.64

2019/02/12 21:46:30 DEBUG Sending command 'SET_STATUS(ONLINE, reader(192.168.147.92), db2)' to db1 (192.168.147.142:9989)

2019/02/12 21:46:30 DEBUG Received Answer: OK: Status applied successfully!|UP:2508.46

2019/02/12 21:46:33 DEBUG Sending command 'SET_STATUS(ONLINE, reader(192.168.147.91),writer(192.168.147.90), db2)' to db2 (192.168.147.141:9989)

2019/02/12 21:46:33 DEBUG Received Answer: OK: Status applied successfully!|UP:2503.69

2019/02/12 21:46:33 DEBUG Sending command 'SET_STATUS(ONLINE, reader(192.168.147.93), db2)' to db3 (192.168.147.139:9989)

2019/02/12 21:46:33 DEBUG Received Answer: OK: Status applied successfully!|UP:2491.64

2019/02/12 21:46:33 DEBUG Sending command 'SET_STATUS(ONLINE, reader(192.168.147.92), db2)' to db1 (192.168.147.142:9989)

2019/02/12 21:46:33 DEBUG Received Answer: OK: Status applied successfully!|UP:2511.46

2019/02/12 21:46:36  INFO Signal received: exiting...

2019/02/12 21:46:36  INFO END

2019/02/12 21:46:37  INFO Child exited normally (with exitcode 0), shutting down

Ok


Method 2: /var/log/mysql-mmm/mmm_mond.log

2019/02/12 21:32:33  INFO STARTING...

2019/02/12 21:32:33  INFO Waiting for network connection...

2019/02/12 21:32:33  INFO Spawning checker 'ping_ip'...

2019/02/12 21:32:33  INFO Shutting down checker 'ping_ip'...

2019/02/12 21:32:33  INFO Network connection is available.

2019/02/12 21:32:33  INFO Performing initial checks...

2019/02/12 21:32:33  INFO Spawning checker 'mysql'...

2019/02/12 21:32:33  INFO Shutting down checker 'mysql'...

2019/02/12 21:32:33  INFO Spawning checker 'ping'...

2019/02/12 21:32:33  INFO Shutting down checker 'ping'...

2019/02/12 21:32:33  INFO Spawning checker 'rep_backlog'...

2019/02/12 21:32:34  INFO Shutting down checker 'rep_backlog'...

2019/02/12 21:32:34  INFO Spawning checker 'rep_threads'...

2019/02/12 21:32:34  INFO Shutting down checker 'rep_threads'...

2019/02/12 21:32:34  WARN No binary found for killing hosts (/usr/lib/mysql-mmm//monitor/kill_host).

2019/02/12 21:32:34  WARN auto_increment_offset should be different on both masters (db1: 1 , db2: 1)

2019/02/12 21:32:34  WARN db1: auto_increment_increment (1) should be >= 2

2019/02/12 21:32:34  WARN db2: auto_increment_increment (1) should be >= 2

2019/02/12 21:32:34  INFO Startup status:


Roles:

    Role                     Host    Stored  System  Agent

    reader(192.168.147.91)   db2     Yes     -       -    

    reader(192.168.147.93)   db3     Yes     -       -    

    reader(192.168.147.92)   db1     Yes     -       -    

    writer(192.168.147.90)   db2     Yes     -       -    


Hosts:

    Host    Master  Writable  Stored state      Agent state     

    db2     -       No        ONLINE            UNKNOWN         

    db3     db1     No        ONLINE            UNKNOWN         

    db1     -       No        ONLINE            UNKNOWN         

2019/02/12 21:32:34  INFO Monitor started in active mode.

2019/02/12 21:32:37  INFO Spawning checker 'ping_ip'...

2019/02/12 21:32:37  INFO Spawning checker 'mysql'...

2019/02/12 21:32:38  INFO Spawning checker 'ping'...

2019/02/12 21:32:38  INFO Check 'mysql' on 'db2' is ok!

2019/02/12 21:32:38  INFO Check 'mysql' on 'db3' is ok!

2019/02/12 21:32:38  INFO Spawning checker 'rep_backlog'...

2019/02/12 21:32:38  INFO Check 'mysql' on 'db1' is ok!

2019/02/12 21:32:38  INFO Check 'ping' on 'db2' is ok!

2019/02/12 21:32:38  INFO Check 'ping' on 'db3' is ok!

2019/02/12 21:32:38  INFO Check 'ping' on 'db1' is ok!

2019/02/12 21:32:38  INFO Spawning checker 'rep_threads'...

2019/02/12 21:32:38  INFO Check 'rep_backlog' on 'db2' is ok!

2019/02/12 21:32:38  INFO Check 'rep_backlog' on 'db3' is ok!

2019/02/12 21:32:38  INFO Check 'rep_backlog' on 'db1' is ok!

2019/02/12 21:32:38  INFO Check 'rep_threads' on 'db2' is ok!

2019/02/12 21:32:38  INFO Check 'rep_threads' on 'db3' is ok!

2019/02/12 21:32:38  INFO Check 'rep_threads' on 'db1' is ok!

2019/02/12 21:34:59  WARN Check 'rep_threads' on 'db1' is in unknown state! Message: UNKNOWN: Connect error (host = 192.168.147.142:3306, user = mmm_monitor)! Can't connect to MySQL server on '192.168.147.142' (111)

2019/02/12 21:34:59  WARN Check 'rep_backlog' on 'db1' is in unknown state! Message: UNKNOWN: Connect error (host = 192.168.147.142:3306, user = mmm_monitor)! Can't connect to MySQL server on '192.168.147.142' (111)

2019/02/12 21:35:08 ERROR Check 'mysql' on 'db1' has failed for 10 seconds! Message: ERROR: Connect error (host = 192.168.147.142:3306, user = mmm_monitor)! Can't connect to MySQL server on '192.168.147.142' (111)

2019/02/12 21:35:09 FATAL State of host 'db1' changed from ONLINE to HARD_OFFLINE (ping: OK, mysql: not OK)

2019/02/12 21:35:09  INFO Removing all roles from host 'db1':

2019/02/12 21:35:09  INFO     Removed role 'reader(192.168.147.92)' from host 'db1'

2019/02/12 21:35:09  INFO Orphaned role 'reader(192.168.147.92)' has been assigned to 'db3'

2019/02/12 21:39:19  INFO Check 'mysql' on 'db1' is ok!

2019/02/12 21:39:20  INFO Check 'rep_threads' on 'db1' is ok!

2019/02/12 21:39:20  INFO Check 'rep_backlog' on 'db1' is ok!

2019/02/12 21:39:21 FATAL State of host 'db1' changed from HARD_OFFLINE to AWAITING_RECOVERY

2019/02/12 21:40:21 FATAL State of host 'db1' changed from AWAITING_RECOVERY to ONLINE because of auto_set_online(60 seconds). It was in state AWAITING_RECOVERY for 60 seconds

2019/02/12 21:40:21  INFO Moving role 'reader(192.168.147.92)' from host 'db3' to host 'db1'

2019/02/12 21:40:45  WARN Check 'rep_threads' on 'db2' is in unknown state! Message: UNKNOWN: Connect error (host = 192.168.147.141:3306, user = mmm_monitor)! Can't connect to MySQL server on '192.168.147.141' (111)

2019/02/12 21:40:45  WARN Check 'rep_backlog' on 'db2' is in unknown state! Message: UNKNOWN: Connect error (host = 192.168.147.141:3306, user = mmm_monitor)! Can't connect to MySQL server on '192.168.147.141' (111)

2019/02/12 21:40:54 ERROR Check 'mysql' on 'db2' has failed for 10 seconds! Message: ERROR: Connect error (host = 192.168.147.141:3306, user = mmm_monitor)! Can't connect to MySQL server on '192.168.147.141' (111)

2019/02/12 21:40:55 FATAL State of host 'db2' changed from ONLINE to HARD_OFFLINE (ping: OK, mysql: not OK)

2019/02/12 21:40:55  INFO Removing all roles from host 'db2':

2019/02/12 21:40:55  INFO     Removed role 'reader(192.168.147.91)' from host 'db2'

2019/02/12 21:40:55  INFO     Removed role 'writer(192.168.147.90)' from host 'db2'

2019/02/12 21:40:55  INFO Orphaned role 'writer(192.168.147.90)' has been assigned to 'db1'

2019/02/12 21:40:55  INFO Orphaned role 'reader(192.168.147.91)' has been assigned to 'db3'

2019/02/12 21:42:00  INFO Check 'mysql' on 'db2' is ok!

2019/02/12 21:42:00  INFO Check 'rep_backlog' on 'db2' is ok!

2019/02/12 21:42:00  INFO Check 'rep_threads' on 'db2' is ok!

2019/02/12 21:42:01 FATAL State of host 'db2' changed from HARD_OFFLINE to AWAITING_RECOVERY

2019/02/12 21:43:02 FATAL State of host 'db2' changed from AWAITING_RECOVERY to ONLINE because of auto_set_online(60 seconds). It was in state AWAITING_RECOVERY for 61 seconds

2019/02/12 21:43:02  INFO Moving role 'reader(192.168.147.91)' from host 'db3' to host 'db2'

2019/02/12 21:43:40  WARN Check 'rep_threads' on 'db1' is in unknown state! Message: UNKNOWN: Connect error (host = 192.168.147.142:3306, user = mmm_monitor)! Can't connect to MySQL server on '192.168.147.142' (111)

2019/02/12 21:43:40  WARN Check 'rep_backlog' on 'db1' is in unknown state! Message: UNKNOWN: Connect error (host = 192.168.147.142:3306, user = mmm_monitor)! Can't connect to MySQL server on '192.168.147.142' (111)

2019/02/12 21:43:50 ERROR Check 'mysql' on 'db1' has failed for 10 seconds! Message: ERROR: Connect error (host = 192.168.147.142:3306, user = mmm_monitor)! Can't connect to MySQL server on '192.168.147.142' (111)

2019/02/12 21:43:53 FATAL State of host 'db1' changed from ONLINE to HARD_OFFLINE (ping: OK, mysql: not OK)

2019/02/12 21:43:53  INFO Removing all roles from host 'db1':

2019/02/12 21:43:53  INFO     Removed role 'reader(192.168.147.92)' from host 'db1'

2019/02/12 21:43:53  INFO     Removed role 'writer(192.168.147.90)' from host 'db1'

2019/02/12 21:43:53  INFO Orphaned role 'writer(192.168.147.90)' has been assigned to 'db2'

2019/02/12 21:43:53  INFO Orphaned role 'reader(192.168.147.92)' has been assigned to 'db3'

2019/02/12 21:45:26  INFO Check 'mysql' on 'db1' is ok!

2019/02/12 21:45:26  INFO Check 'rep_threads' on 'db1' is ok!

2019/02/12 21:45:26  INFO Check 'rep_backlog' on 'db1' is ok!

2019/02/12 21:45:27 FATAL State of host 'db1' changed from HARD_OFFLINE to AWAITING_RECOVERY

2019/02/12 21:46:27 FATAL State of host 'db1' changed from AWAITING_RECOVERY to ONLINE because of auto_set_online(60 seconds). It was in state AWAITING_RECOVERY for 60 seconds

2019/02/12 21:46:27  INFO Moving role 'reader(192.168.147.92)' from host 'db3' to host 'db1'

2019/02/12 21:46:36  INFO Signal received: exiting...

2019/02/12 21:46:36  INFO Shutting down checker 'mysql'...

2019/02/12 21:46:36  INFO Shutting down checker 'ping'...

2019/02/12 21:46:36  INFO Shutting down checker 'rep_backlog'...

2019/02/12 21:46:36  INFO Shutting down checker 'rep_threads'...

2019/02/12 21:46:36  INFO Shutting down checker 'ping_ip'...

2019/02/12 21:46:36  INFO END

2019/02/12 21:46:37  INFO Child exited normally (with exitcode 0), shutting down
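
The host state changes are easy to miss among the check messages above, so here is a rough sketch that pulls only the `State of host ... changed from ... to ...` lines out of a monitor log. The regular expression is written against the line format seen in this log only, and `state_changes` is a made-up helper name, not something shipped with MMM.

```python
import re

# Matches monitor lines such as:
# 2019/02/12 21:35:09 FATAL State of host 'db1' changed from ONLINE to HARD_OFFLINE (...)
# Assumption: the format is exactly as it appears in the log posted above.
STATE_RE = re.compile(
    r"^(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})\s+FATAL State of host "
    r"'(?P<host>[^']+)' changed from (?P<old>\w+) to (?P<new>\w+)"
)


def state_changes(lines):
    """Return (timestamp, host, old_state, new_state) for each state change."""
    changes = []
    for line in lines:
        m = STATE_RE.match(line.strip())
        if m:
            changes.append((m["ts"], m["host"], m["old"], m["new"]))
    return changes


# A few lines copied from the log above.
sample = [
    "2019/02/12 21:35:09 FATAL State of host 'db1' changed from ONLINE to HARD_OFFLINE (ping: OK, mysql: not OK)",
    "2019/02/12 21:35:09  INFO Removing all roles from host 'db1':",
    "2019/02/12 21:39:21 FATAL State of host 'db1' changed from HARD_OFFLINE to AWAITING_RECOVERY",
    "2019/02/12 21:40:21 FATAL State of host 'db1' changed from AWAITING_RECOVERY to ONLINE because of auto_set_online(60 seconds). It was in state AWAITING_RECOVERY for 60 seconds",
]

for ts, host, old, new in state_changes(sample):
    print(f"{ts}  {host}: {old} -> {new}")
```

Run against the full mmm_mond.log (for example with `state_changes(open(path))`), this should give a short timeline of when db1 and db2 went HARD_OFFLINE and came back, which can then be lined up with the replication status on db3.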




sqlercn

2019-02-12

You need to set debug to 1 to enable debug logging, run the test again, and then post the debug log here so we can take a look.
