Galera Cluster для Multi Master: не удалось запустить второй узел в Ubuntu 16.04.4 x64

Я пытаюсь протестировать репликацию master-master между двумя серверами, обратившись к этому руководству: Настроить кластер galera

Первый узел был запущен нормально после выполнения galera_new_cluster & get wsrep_cluster_size = 1, но при выполнении systemctl start mysql для второго узла я получил сообщение об ошибке, как указано ниже.

Задание для mariadb.service завершилось неудачно, так как процесс управления завершился с кодом ошибки. Подробности смотрите в "systemctl status mariadb.service" и "journalctl -xe".

Ошибка "systemctl status mariadb.service":

 Jul 18 08:03:55 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:55 140559072364800 [ERROR] WSREP: gcs/src/gcs_core.cpp:gcs_core_open():208: Failed to open backend connection: -110 (Connection timed out)
 Jul 18 08:03:55 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:55 140559072364800 [ERROR] WSREP: gcs/src/gcs.cpp:gcs_open():1458: Failed to open channel 'test_cluster' at 'gcomm://206.189.37.50,206.189.39.149': -110 (Connection timed out)
 Jul 18 08:03:55 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:55 140559072364800 [ERROR] WSREP: gcs connect failed: Connection timed out
 Jul 18 08:03:55 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:55 140559072364800 [ERROR] WSREP: wsrep::connect(gcomm://206.189.37.50,206.189.39.149) failed: 7
 Jul 18 08:03:55 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:55 140559072364800 [ERROR] Aborting
 Jul 18 08:03:56 mariadb-s2 systemd[1]: mariadb.service: Main process exited, code=exited, status=1/FAILURE
 Jul 18 08:03:56 mariadb-s2 systemd[1]: Failed to start MariaDB 10.1.34 database server.
 Jul 18 08:03:56 mariadb-s2 systemd[1]: mariadb.service: Unit entered failed state.
 Jul 18 08:03:56 mariadb-s2 systemd[1]: mariadb.service: Failed with result 'exit-code'.

Ошибка от "journalctl -xe":

Jul 18 08:03:22 mariadb-s2 mysqld[7734]: WSREP: Recovered position 5820583f-8a43-11e8-8c80-9e4cdd04427e:0
Jul 18 08:03:22 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:22 140559072364800 [Note] /usr/sbin/mysqld (mysqld 10.1.34-MariaDB-1~xenial) starting as process 7956 ...
Jul 18 08:03:22 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:22 140559072364800 [Note] WSREP: Read nil XID from storage engines, skipping position init
Jul 18 08:03:22 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:22 140559072364800 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/galera/libgalera_smm.so'
Jul 18 08:03:22 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:22 140559072364800 [Note] WSREP: wsrep_load(): Galera 25.3.23(r3789) by Codership Oy <[email protected]> loaded successfully.
Jul 18 08:03:22 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:22 140559072364800 [Note] WSREP: CRC-32C: using hardware acceleration.
Jul 18 08:03:22 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:22 140559072364800 [Note] WSREP: Found saved state: 5820583f-8a43-11e8-8c80-9e4cdd04427e:-1, safe_to_bootstrap: 1 
Jul 18 08:03:22 mariadb-s2 mysqld[7734]: WSREP: Recovered position 5820583f-8a43-11e8-8c80-9e4cdd04427e:0
Jul 18 08:03:22 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:22 140559072364800 [Note] /usr/sbin/mysqld (mysqld 10.1.34-MariaDB-1~xenial) starting as process 7956 ...
Jul 18 08:03:22 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:22 140559072364800 [Note] WSREP: Read nil XID from storage engines, skipping position init
Jul 18 08:03:22 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:22 140559072364800 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/galera/libgalera_smm.so'
Jul 18 08:03:22 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:22 140559072364800 [Note] WSREP: wsrep_load(): Galera 25.3.23(r3789) by Codership Oy <[email protected]> loaded successfully.
Jul 18 08:03:22 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:22 140559072364800 [Note] WSREP: CRC-32C: using hardware acceleration.
Jul 18 08:03:22 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:22 140559072364800 [Note] WSREP: Found saved state: 5820583f-8a43-11e8-8c80-9e4cdd04427e:-1, safe_to_bootstrap: 1
Jul 18 08:03:22 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:22 140559072364800 [Note] WSREP: Passing config to GCS: base_dir = /var/lib/mysql/; base_host = 206.189.39.149; base_port = 4567; cert.log_conflicts = no; debug = no; evs.auto_evi
Jul 18 08:03:22 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:22 140559072364800 [Note] WSREP: Assign initial position for certification: 0, protocol version: -1
Jul 18 08:03:22 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:22 140559072364800 [Note] WSREP: wsrep_sst_grab()
Jul 18 08:03:22 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:22 140559072364800 [Note] WSREP: Start replication
Jul 18 08:03:22 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:22 140559072364800 [Note] WSREP: Setting initial position to 5820583f-8a43-11e8-8c80-9e4cdd04427e:0
Jul 18 08:03:22 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:22 140559072364800 [Note] WSREP: protonet asio version 0
Jul 18 08:03:22 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:22 140559072364800 [Note] WSREP: Using CRC-32C for message checksums.
Jul 18 08:03:22 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:22 140559072364800 [Note] WSREP: backend: asio
Jul 18 08:03:22 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:22 140559072364800 [Note] WSREP: gcomm thread scheduling priority set to other:0
Jul 18 08:03:22 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:22 140559072364800 [Warning] WSREP: access file(/var/lib/mysql//gvwstate.dat) failed(No such file or directory)
Jul 18 08:03:22 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:22 140559072364800 [Note] WSREP: restore pc from disk failed
Jul 18 08:03:22 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:22 140559072364800 [Note] WSREP: GMCast version 0
Jul 18 08:03:22 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:22 140559072364800 [Note] WSREP: (0a4e92b4, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
Jul 18 08:03:22 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:22 140559072364800 [Note] WSREP: (0a4e92b4, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
Jul 18 08:03:22 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:22 140559072364800 [Note] WSREP: EVS version 0
Jul 18 08:03:22 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:22 140559072364800 [Note] WSREP: gcomm: connecting to group 'test_cluster', peer '206.189.37.50:,206.189.39.149:'
Jul 18 08:03:22 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:22 140559072364800 [Note] WSREP: (0a4e92b4, 'tcp://0.0.0.0:4567') connection established to 0a4e92b4 tcp://206.189.39.149:4567
Jul 18 08:03:22 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:22 140559072364800 [Warning] WSREP: (0a4e92b4, 'tcp://0.0.0.0:4567') address 'tcp://206.189.39.149:4567' points to own listening address, blacklisting
Jul 18 08:03:25 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:25 140559072364800 [Note] WSREP: (0a4e92b4, 'tcp://0.0.0.0:4567') connection to peer 00000000 with addr tcp://206.189.37.50:4567 timed out, no messages seen in PT3S
Jul 18 08:03:25 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:25 140559072364800 [Note] WSREP: (0a4e92b4, 'tcp://0.0.0.0:4567') connection to peer 0a4e92b4 with addr tcp://206.189.39.149:4567 timed out, no messages seen in PT3S
Jul 18 08:03:25 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:25 140559072364800 [Warning] WSREP: no nodes coming from prim view, prim not possible
Jul 18 08:03:25 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:25 140559072364800 [Note] WSREP: view(view_id(NON_PRIM,0a4e92b4,1) memb {
Jul 18 08:03:25 mariadb-s2 mysqld[7956]:         0a4e92b4,0
Jul 18 08:03:25 mariadb-s2 mysqld[7956]: } joined {
Jul 18 08:03:25 mariadb-s2 mysqld[7956]: } left {
Jul 18 08:03:25 mariadb-s2 mysqld[7956]: } partitioned {
Jul 18 08:03:25 mariadb-s2 mysqld[7956]: })
Jul 18 08:03:25 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:25 140559072364800 [Warning] WSREP: last inactive check more than PT1.5S ago (PT3.50398S), skipping check
Jul 18 08:03:29 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:29 140559072364800 [Note] WSREP: (0a4e92b4, 'tcp://0.0.0.0:4567') connection to peer 00000000 with addr tcp://206.189.37.50:4567 timed out, no messages seen in PT3S
Jul 18 08:03:34 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:34 140559072364800 [Note] WSREP: (0a4e92b4, 'tcp://0.0.0.0:4567') connection to peer 00000000 with addr tcp://206.189.37.50:4567 timed out, no messages seen in PT3S
Jul 18 08:03:38 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:38 140559072364800 [Note] WSREP: (0a4e92b4, 'tcp://0.0.0.0:4567') connection to peer 00000000 with addr tcp://206.189.37.50:4567 timed out, no messages seen in PT3S
Jul 18 08:03:43 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:43 140559072364800 [Note] WSREP: (0a4e92b4, 'tcp://0.0.0.0:4567') connection to peer 00000000 with addr tcp://206.189.37.50:4567 timed out, no messages seen in PT3S
Jul 18 08:03:47 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:47 140559072364800 [Note] WSREP: (0a4e92b4, 'tcp://0.0.0.0:4567') connection to peer 00000000 with addr tcp://206.189.37.50:4567 timed out, no messages seen in PT3S
Jul 18 08:03:49 mariadb-s2 kernel: [UFW BLOCK] IN=eth0 OUT= MAC=e2:57:b6:36:df:64:ec:38:73:0c:78:30:08:00 SRC=123.249.27.191 DST=206.189.39.149 LEN=40 TOS=0x00 PREC=0x00 TTL=113 ID=256 PROTO=TCP SPT=39069 DPT=8888 WINDOW=16384 RES=0x00 S
Jul 18 08:03:52 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:52 140559072364800 [Note] WSREP: (0a4e92b4, 'tcp://0.0.0.0:4567') connection to peer 00000000 with addr tcp://206.189.37.50:4567 timed out, no messages seen in PT3S
Jul 18 08:03:55 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:55 140559072364800 [Note] WSREP: view((empty))
Jul 18 08:03:55 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:55 140559072364800 [ERROR] WSREP: failed to open gcomm backend connection: 110: failed to reach primary view: 110 (Connection timed out)
Jul 18 08:03:55 mariadb-s2 mysqld[7956]:          at gcomm/src/pc.cpp:connect():158
Jul 18 08:03:55 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:55 140559072364800 [ERROR] WSREP: gcs/src/gcs_core.cpp:gcs_core_open():208: Failed to open backend connection: -110 (Connection timed out)
Jul 18 08:03:55 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:55 140559072364800 [ERROR] WSREP: gcs/src/gcs.cpp:gcs_open():1458: Failed to open channel 'test_cluster' at 'gcomm://206.189.37.50,206.189.39.149': -110 (Connection timed out)
Jul 18 08:03:55 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:55 140559072364800 [ERROR] WSREP: gcs connect failed: Connection timed out
Jul 18 08:03:55 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:55 140559072364800 [ERROR] WSREP: wsrep::connect(gcomm://206.189.37.50,206.189.39.149) failed: 7
Jul 18 08:03:55 mariadb-s2 mysqld[7956]: 2018-07-18  8:03:55 140559072364800 [ERROR] Aborting
Jul 18 08:03:56 mariadb-s2 systemd[1]: mariadb.service: Main process exited, code=exited, status=1/FAILURE
Jul 18 08:03:56 mariadb-s2 systemd[1]: Failed to start MariaDB 10.1.34 database server.

Конфигурация узла 1:

[mysqld]
binlog_format=ROW
default-storage-engine=innodb
innodb_autoinc_lock_mode=2
bind-address=0.0.0.0

# Galera Provider Configuration
wsrep_on=ON
wsrep_provider=/usr/lib/galera/libgalera_smm.so

# Galera Cluster Configuration
wsrep_cluster_name="test_cluster"
wsrep_cluster_address="gcomm://206.189.37.50,206.189.39.149"

# Galera Synchronization Configuration
wsrep_sst_method=rsync

# Galera Node Configuration
wsrep_node_address="206.189.37.50"
wsrep_node_name="node1"

Конфигурация узла 2:

[mysqld]
binlog_format=ROW
default-storage-engine=innodb
innodb_autoinc_lock_mode=2
bind-address=0.0.0.0

# Galera Provider Configuration
wsrep_on=ON
wsrep_provider=/usr/lib/galera/libgalera_smm.so

# Galera Cluster Configuration
wsrep_cluster_name="test_cluster"
wsrep_cluster_address="gcomm://206.189.37.50,206.189.39.149"

# Galera Synchronization Configuration
wsrep_sst_method=rsync

# Galera Node Configuration
wsrep_node_address="206.189.39.149"
wsrep_node_name="node2"

Я ссылался на многие подобные вопросы в Stack Overflow, но ни одно из предложений не помогло. Я все еще получаю ту же ошибку.


person yyii    schedule 19.07.2018    source источник
comment
Вы разблокировали TCP-порт 4567 в UFW? Это, вероятно, лучше подходит для dba.stackexchange.com.   -  person felixbuenemann    schedule 19.07.2018
comment
Наконец, оба узла могут нормально запускаться после того, как я удалил все настройки ufw и установил их снова. Спасибо!   -  person yyii    schedule 20.07.2018
comment
Я рад, что у вас все получилось! Я собираюсь предложить проверить правила UFW в качестве ответа, чтобы этот вопрос отображался как ответ.   -  person felixbuenemann    schedule 20.07.2018


Ответы (1)


Журналы показывают, что брандмауэр Ubuntu (UFW) активен, поэтому вполне вероятно, что соединения блокируются брандмауэром.

Чтобы узнать, не в этом ли проблема, вы можете запустить ufw disable. Если репликация работает с отключенным ufw, убедитесь, что вы добавили правила, разрешающие соединения с IP-адреса узла 1 на TCP-порт 4567 на узле No2 и с IP-адреса узла 2 на TCP-порт 4567 на узле No1.

После добавления правил проверьте вывод ufw list и, если все в порядке, запустите ufw enable для повторной активации брандмауэра.

Вы можете запустить man ufw, чтобы узнать больше о конфигурации UFW.

person felixbuenemann    schedule 20.07.2018