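I needed to set up redundancy for MariaDB, and while researching it
I found that Galera Cluster, which is supported for MySQL, is also supported by MariaDB.
The examples looked simple, so I decided to test it and then roll it out.
Instead I ran into the errors below and lost a lot of time on them.
I hope this post saves someone else the same trouble.
Error messages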
[ERROR] WSREP: State Transfer Request preparation failed: bind: Cannot assign requested address Can't continue, aborting.
[ERROR] WSREP: State Transfer Request preparation failed: bind: Permission denied Can't continue, aborting.
mysqld[64115]: 2021/07/19 17:35:44 socat[64416] E bind(6, {AF=2 0.0.0.0:4444}, 16): Permission denied
mysqld[64115]: WSREP_SST: [ERROR] Error while getting data from donor node: exit codes: 1 0 (20210719 17:35:44.508)
mysqld[64115]: Cannot open netlink socket: Permission denied
[Environment]
OS : CentOS 7.9
MariaDB 10.4.18
Installing MariaDB
I installed MariaDB in much the same way as the article below.
(Later I should write up an example that goes all the way through Galera Cluster configuration.)
https://tecadmin.net/install-mariadb-10-centos-redhat/
In short:
1. Create a yum repo file for installing MariaDB
2. Install with yum
3. Test
Configuring Galera Cluster
After a quick test, I followed the article below to try out a Galera Cluster configuration.
https://mariadb.com/kb/en/getting-started-with-mariadb-galera-cluster/
[Configuration file]
wsrep_on=ON
wsrep_provider=/usr/lib64/galera-4/libgalera_smm.so
wsrep_provider_options="gcache.recover=yes"
wsrep_cluster_address="gcomm://192.168.10.1,192.168.10.2"
wsrep_cluster_name="test-cluster"
wsrep_sst_method=mariabackup
wsrep_node_address="192.168.10.1"
wsrep_node_name='main_node'
binlog_format=row
default_storage_engine=InnoDB
# Note: the MariaDB Galera documentation lists innodb_autoinc_lock_mode=2; this test ran with 1.
innodb_autoinc_lock_mode=1
bind-address=0.0.0.0
Start MariaDB on the server that runs the main node:
$ sudo galera_new_cluster
On the backup node server, change only wsrep_node_address and wsrep_node_name in the configuration above.
Once the configuration is in place, start MariaDB:
$ sudo systemctl start mariadb
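The per-node change described above can be scripted. This is a minimal sketch, not part of the original setup: the function name make_node_config, the node name backup_node, and the file names in the usage line are all hypothetical.

```shell
# Derive another node's settings from the main node's configuration.
# Reads the configuration on stdin; $1 = new wsrep_node_address, $2 = new wsrep_node_name.
# Only those two keys are rewritten; every other line passes through unchanged.
make_node_config() {
  sed -e "s|^wsrep_node_address=.*|wsrep_node_address=\"$1\"|" \
      -e "s|^wsrep_node_name=.*|wsrep_node_name='$2'|"
}
```

For example: make_node_config 192.168.10.2 backup_node < main.cnf > backup.cnf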
On the main node server, MariaDB ran normally.
On the backup node server, an error occurred and MariaDB kept crashing and restarting in a loop.
Let's dig through the log..
2021-07-20 11:58:36 0 [Note] WSREP: Loading provider /usr/lib64/galera-4/libgalera_smm.so initial position: 00000000-0000-0000-0000-000000000000:-1
2021-07-20 11:58:36 0 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/galera-4/libgalera_smm.so'
2021-07-20 11:58:36 0 [Note] WSREP: wsrep_load(): Galera 4.7(ree4f10f) by Codership Oy <info@codership.com> loaded successfully.
The WSREP provider is loaded...
WSREP: Start replication
WSREP: Connecting with bootstrap option: 0
WSREP: Setting GCS initial position to 00000000-0000-0000-0000-000000000000:-1
WSREP: protonet asio version 0
WSREP: Using CRC-32C for message checksums.
WSREP: backend: asio
WSREP: gcomm thread scheduling priority set to other:0
WSREP: restore pc from disk successfully
WSREP: GMCast version 0
WSREP: (a54105b6-b3b7, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
WSREP: (a54105b6-b3b7, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
WSREP: EVS version 1
WSREP: gcomm: connecting to group 'test-cluster', peer '192.168.10.1:,192.168.10.2:'
WSREP: (a54105b6-b3b7, 'tcp://0.0.0.0:4567') Found matching local endpoint for a connection, blacklisting address tcp://192.168.10.2:4567
WSREP: (a54105b6-b3b7, 'tcp://0.0.0.0:4567') connection established to 463c57f7-9925 tcp://192.168.10.1:4567
WSREP: (a54105b6-b3b7, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers:
WSREP: gcomm: connected
WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
WSREP: Shifting CLOSED -> OPEN (TO: 0)
WSREP: Opened channel 'test-cluster'
WSREP: Starting rollbacker thread 1
WSREP: Starting applier thread 2
WSREP: EVS version upgrade 0 -> 1
WSREP: declaring 463c57f7-9925 at tcp://192.168.10.1:4567 stable
WSREP: PC protocol upgrade 0 -> 1
WSREP: Node 463c57f7-9925 state prim
WSREP: view(view_id(PRIM,463c57f7-9925,14) memb {
463c57f7-9925,0
a54105b6-b3b7,0
} joined {
} left {
} partitioned {
})
WSREP: save pc into disk
WSREP: clear restored view
WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 2
WSREP: STATE EXCHANGE: Waiting for state UUID.
WSREP: STATE EXCHANGE: sent state msg: 60da74f2-e906-11eb-93cb-aea884e9576f
WSREP: STATE EXCHANGE: got state msg: 60da74f2-e906-11eb-93cb-aea884e9576f from 0 (main)
WSREP: STATE EXCHANGE: got state msg: 60da74f2-e906-11eb-93cb-aea884e9576f from 1 (backup)
WSREP: Quorum results:
version = 6,
component = PRIMARY,
conf_id = 13,
members = 1/2 (joined/total),
act_id = 240,
last_appl. = 227,
protocols = 2/10/4 (gcs/repl/appl),
vote policy= 0,
group UUID = 23811917-e860-11eb-a643-e3c873c56ff9
WSREP: Flow-control interval: [23, 23]
WSREP: Shifting OPEN -> PRIMARY (TO: 241)
WSREP: ####### processing CC 241, local, ordered
WSREP: Process first view: 23811917-e860-11eb-a643-e3c873c56ff9 my uuid: a54105b6-e905-11eb-b3b7-132036154124
WSREP: Server backup connected to cluster at position 23811917-e860-11eb-a643-e3c873c56ff9:241 with ID a54105b6-e905-11eb-b3b7-132036154124
WSREP: Server status change disconnected -> connected
WSREP: wsrep_notify_cmd is not defined, skipping notification.
WSREP: ####### My UUID: a54105b6-e905-11eb-b3b7-132036154124
WSREP: Cert index reset to 00000000-0000-0000-0000-000000000000:-1 (proto: 10), state transfer needed: yes
The node connects to test-cluster and updates its status..
The connection seems to be established, and yet something still goes wrong.
WSREP: Running: 'wsrep_sst_mariabackup --role 'joiner' --address '192.168.10.2' --datadir '/var/lib/mysql/' --parent '4756' --mysqld-args --wsrep_start_position=00000000-0000-0000-0000-000000000000:-1'
WSREP_SST: [INFO] Streaming with xbstream (20210720 11:58:37.222)
WSREP_SST: [INFO] Using socat as streamer (20210720 11:58:37.227)
WSREP_SST: [INFO] Stale sst_in_progress file: /var/lib/mysql//sst_in_progress (20210720 11:58:37.233)
WSREP_SST: [INFO] Evaluating timeout -k 310 300 socat -u TCP-LISTEN:4444,reuseaddr stdio | mbstream -x; RC=( ${PIPESTATUS[@]} ) (20210720 11:58:37.292)
2021/07/20 11:58:37 socat[5057] E bind(6, {AF=2 0.0.0.0:4444}, 16): Permission denied
WSREP_SST: [ERROR] Error while getting data from donor node: exit codes: 1 0 (20210720 11:58:37.307)
WSREP_SST: [ERROR] Cleanup after exit with status:32 (20210720 11:58:37.311)
Cannot open netlink socket: Permission denied
Cannot open netlink socket: Permission denied
Cannot open netlink socket: Permission denied
Cannot open netlink socket: Permission denied
Cannot open netlink socket: Permission denied
Cannot open netlink socket: Permission denied
Cannot open netlink socket: Permission denied
Cannot open netlink socket: Permission denied
Cannot open netlink socket: Permission denied
Cannot open netlink socket: Permission denied
Cannot open netlink socket: Permission denied
Cannot open netlink socket: Permission denied
2021-07-20 11:58:39 0 [Note] WSREP: (a54105b6-b3b7, 'tcp://0.0.0.0:4567') turning message relay requesting off
The backup node server runs mariabackup in the joiner role..
and the socat bind fails with Permission denied.
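When the log is this long, it can help to filter it down to the failure signatures from this post. This is a rough sketch; the function name find_sst_errors is hypothetical, and the patterns are taken only from the log lines quoted here, so other failures will not match.

```shell
# Filter a MariaDB log (read from stdin) down to the failure signatures in this post:
# the socat bind error, the netlink errors, and the aborted state transfer request.
find_sst_errors() {
  grep -E 'socat\[[0-9]+\] E bind\(|Cannot open netlink socket|State Transfer Request preparation failed'
}
```

For example: sudo journalctl -u mariadb --no-pager | find_sst_errors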
[ERROR] WSREP: Process completed with error: wsrep_sst_mariabackup --role 'joiner' --address '192.168.10.2' --datadir '/var/lib/mysql/' --parent '4756' --mysqld-args --wsrep_start_position=00000000-0000-0000-0000-000000000000:-1: 32 (Broken pipe)
[ERROR] WSREP: Failed to read uuid:seqno and wsrep_gtid_domain_id from joiner script.
[Note] WSREP: IST receiver addr using tcp://192.168.10.2:4568
[Note] WSREP: SST received
[Note] WSREP: SST received: 00000000-0000-0000-0000-000000000000:-1
[ERROR] WSREP: not JOINING when sst_received() called, state: CONNECTED
terminate called after throwing an instance of 'wsrep::runtime_error'
what(): wsrep::sst_received() failed: Not connected to Primary Component
210720 11:58:48 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Server version: 10.4.18-MariaDB-log
key_buffer_size=0
read_buffer_size=131072
max_used_connections=0
max_threads=153
thread_count=4
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 336688 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
Thread pointer: 0x7fe16c000a78
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7fe1857f8ab0 thread_stack 0x49000
[ERROR] WSREP: State Transfer Request preparation failed: bind: Permission denied Can't continue, aborting.
[Note] WSREP: ReplicatorSMM::abort()
[Note] WSREP: Closing send monitor...
[Note] WSREP: Closed send monitor.
[Note] WSREP: gcomm: terminating thread
[Note] WSREP: gcomm: joining thread
[Note] WSREP: gcomm: closing backend
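In any case, the error occurred in the mariabackup process...
and the message says they will help if you file a bug report with this information..
Things I tried
1. The firewall might be the problem
These are the ports used by MariaDB and Galera Cluster.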
Standard MariaDB Port (default: 3306) - For MySQL client connections and State Snapshot Transfers that use the mysqldump method. This can be changed by setting port.
Galera Replication Port (default: 4567) - For Galera Cluster replication traffic, multicast replication uses both UDP transport and TCP on this port. Can be changed by setting wsrep_node_address.
IST Port (default: 4568) - For Incremental State Transfers. Can be changed by setting ist.recv_addr in wsrep_provider_options.
SST Port (default: 4444) - For all State Snapshot Transfer methods other than mysqldump. Can be changed by setting wsrep_sst_receive_address.
I opened all of these ports, but the problem was not resolved.
(Even turning the firewall off entirely did not help.)
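For reference, opening the ports listed above could look like the sketch below. It assumes firewalld is the firewall in use (the default on CentOS 7; the original does not say which one was used), and the function name galera_firewall_cmds is hypothetical. The function only prints the commands so they can be reviewed before anything changes.

```shell
# Print the firewalld commands that would open the ports listed above.
# Nothing is modified here; the output is meant to be reviewed first.
galera_firewall_cmds() {
  for spec in 3306/tcp 4567/tcp 4567/udp 4568/tcp 4444/tcp; do
    echo "firewall-cmd --permanent --add-port=${spec}"
  done
  echo "firewall-cmd --reload"
}

galera_firewall_cmds
```

To apply them after review: galera_firewall_cmds | sudo sh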
2. Check the permissions on the mysql.sock file
The mysql.sock file is created under /var/lib/mysql.
It is created with owner mysql:mysql, and one post suggested changing that to mysql:root.
$ rm /var/lib/mysql/mysql.sock
$ service mysql stop
$ chown -R mysql:root /var/lib/mysql
Even after applying this, mysql.sock was still created as mysql:mysql, and it had no effect.
3. Enable httpd_can_network_connect_db in the SELinux settings
$ sudo setsebool -P httpd_can_network_connect_db 1
Still no effect.
4. Disable SELinux enforcement
$ sudo sestatus
$ sudo setenforce 0
$ sudo sestatus
Turning off SELinux enforcement makes it work......
SELinux status: enabled
SELinuxfs mount: /sys/fs/selinux
SELinux root directory: /etc/selinux
Loaded policy name: targeted
Current mode: enforcing -> permissive
Mode from config file: enforcing
Policy MLS status: enabled
Policy deny_unknown status: allowed
Max kernel policy version: 31
Running setenforce 0 changes the Current mode value.
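To check the mode from a script rather than by eye, the Current mode line can be extracted from the sestatus output. This is a small sketch; the function name selinux_current_mode is hypothetical.

```shell
# Extract the current SELinux mode from `sestatus` output read on stdin.
selinux_current_mode() {
  awk -F': *' '/^Current mode/ { print $2 }'
}
```

For example: sestatus | selinux_current_mode. Note that setenforce only changes the running mode; as the "Mode from config file: enforcing" line shows, the system returns to enforcing after a reboot.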
However, turning SELinux off can create security problems.
Creating a dedicated SELinux policy, as in the example below, looks like it should solve this..!
https://galeracluster.com/library/documentation/selinux.html
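One common way to build such a policy is sketched below. It is untested against this setup and rests on assumptions: MariaDB runs in the mysqld_t domain, auditd writes to /var/log/audit/audit.log, and semanage and audit2allow are installed (on CentOS 7 they come from policycoreutils-python). The function name build_galera_policy and the module name galera_local are hypothetical.

```shell
# Build and load a local SELinux module from logged mysqld denials. Run as root.
# $1 = module name (hypothetical default: galera_local).
build_galera_policy() {
  module="${1:-galera_local}"
  # Before calling this: run `semanage permissive -a mysqld_t`, then restart
  # MariaDB on the joiner node so the denials are recorded in the audit log.
  grep mysqld /var/log/audit/audit.log | audit2allow -M "$module" || return 1
  # audit2allow -M writes <module>.te and <module>.pp to the current directory.
  semodule -i "${module}.pp" || return 1
  # Put the mysqld_t domain back into enforcing once the module is loaded.
  semanage permissive -d mysqld_t
}
```

It is worth reading the generated .te file before loading the module, since it allows everything that was denied while the domain was permissive.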
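This post is meant as an error review, so the installation and configuration details are thin.
Once I have finished creating the SELinux policy, I will write a separate post focused on installation and configuration.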