Let's build an H.A. configuration by adding two more master nodes.
[ Method ]
- Deploy a load balancer to receive the traffic. HAProxy is fairly easy to deploy. Start it and confirm it is serving the currently running master node.
 - Install the Kubernetes software on the k8smaster2 and k8smaster3 nodes. (This step would probably be easier if you built them from an image of an existing worker node.)
 - Join k8smaster2 to the cluster as a master node. In addition to the kubeadm join command used when adding worker nodes, an extra hash and flag will be needed.
 - Join k8smaster3 to the cluster as a master node.
 - Update the HAProxy configuration to use all three masters and restart it.
 - Check the traffic and the monitoring page to confirm requests are being distributed correctly.
 
[ Prerequisites ]
- Prepare three nodes (proxy node, k8smaster2, k8smaster3)
 
HAProxy, master2, master3 setup
# Add a user account and basic settings
$ useradd -s /bin/bash -m ps0107
$ passwd ps0107
$ vi /etc/sudoers.d/ps0107
  ps0107 ALL=(ALL) ALL
$ chmod 440 /etc/sudoers.d/ps0107
$ vi /etc/sudoers
  %sudo   ALL=(ALL:ALL) NOPASSWD: ALL
$ vi /etc/ssh/sshd_config
  PasswordAuthentication yes
$ service ssh restart
$ vi /etc/hosts
10.146.0.2   k8smaster   master 
10.146.0.13  k8smaster2  master2
10.146.0.14  k8smaster3  master3
10.146.0.4   k8sworker1  worker1
$ apt-get update && apt-get upgrade -y
#### On master2 and master3, install Docker and the Kubernetes software
$ apt-get install -y docker.io
$ vi /etc/apt/sources.list.d/kubernetes.list
  deb http://apt.kubernetes.io/ kubernetes-xenial main
$ curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
$ apt-get update
$ apt-get install -y kubeadm=1.15.1-00 kubelet=1.15.1-00 kubectl=1.15.1-00
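Not in the original lab steps, but it is usually a good idea to pin these packages so a routine apt-get upgrade does not move kubeadm/kubelet/kubectl past 1.15.1; a small optional sketch:
# Optional: hold the Kubernetes packages at the installed version
$ apt-mark hold kubeadm kubelet kubectl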
HAProxy installation
###################################
# Install HAProxy
###################################
# haproxy install
ps0107@proxy:~$ sudo apt-get update ; sudo apt-get install -y apache2 haproxy
# Edit the haproxy configuration
ps0107@proxy:~$ sudo vi /etc/haproxy/haproxy.cfg
global
	log /dev/log	local0
	log /dev/log	local1 notice
	chroot /var/lib/haproxy
	stats socket /run/haproxy/admin.sock mode 660 level admin
	stats timeout 30s
	user haproxy
	group haproxy
	daemon
	# Default SSL material locations
	ca-base /etc/ssl/certs
	crt-base /etc/ssl/private
	# Default ciphers to use on SSL-enabled listening sockets.
	# For more information, see ciphers(1SSL). This list is from:
	#  https://hynek.me/articles/hardening-your-web-servers-ssl-ciphers/
	ssl-default-bind-ciphers ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:ECDH+3DES:DH+3DES:RSA+AESGCM:RSA+AES:RSA+3DES:!aNULL:!MD5:!DSS
	ssl-default-bind-options no-sslv3
defaults
	log	global
	mode	tcp
	option	tcplog
	option	dontlognull
        timeout connect 5000
        timeout client  50000
        timeout server  50000
	errorfile 400 /etc/haproxy/errors/400.http
	errorfile 403 /etc/haproxy/errors/403.http
	errorfile 408 /etc/haproxy/errors/408.http
	errorfile 500 /etc/haproxy/errors/500.http
	errorfile 502 /etc/haproxy/errors/502.http
	errorfile 503 /etc/haproxy/errors/503.http
	errorfile 504 /etc/haproxy/errors/504.http
frontend proxynode
   bind *:80
   bind *:6443
   stats uri /proxystats
   default_backend k8sServers
backend k8sServers
   balance roundrobin
   server master1  10.146.0.2:6443 check
#  server master2  10.128.0.30:6443 check
#  server master3  10.128.0.66:6443 check
listen stats
     bind :9999
     mode http
     stats enable
     stats hide-version
     stats uri /stats
# Restart after changing the configuration
ps0107@proxy:~$ sudo service haproxy restart
# Check the status after the restart
ps0107@proxy:~$ sudo service haproxy status
● haproxy.service - HAProxy Load Balancer
   Loaded: loaded (/lib/systemd/system/haproxy.service; enabled; vendor preset: enabled)
   Active: active (running) since Tue 2020-02-18 06:02:59 UTC; 9s ago
     Docs: man:haproxy(1)
           file:/usr/share/doc/haproxy/configuration.txt.gz
  Process: 4116 ExecStartPre=/usr/sbin/haproxy -f $CONFIG -c -q $EXTRAOPTS (code=exited, status=0/SUCCESS)
Main PID: 4118 (haproxy-systemd)
    Tasks: 3 (limit: 4915)
   CGroup: /system.slice/haproxy.service
           ├─4118 /usr/sbin/haproxy-systemd-wrapper -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid
           ├─4121 /usr/sbin/haproxy-master
           └─4122 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds
Feb 18 06:02:59 proxy systemd[1]: Starting HAProxy Load Balancer...
Feb 18 06:02:59 proxy systemd[1]: Started HAProxy Load Balancer.
Feb 18 06:02:59 proxy haproxy-systemd-wrapper[4118]: haproxy-systemd-wrapper: executing /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds
Feb 18 06:02:59 proxy haproxy[4121]: Proxy proxynode started.
Feb 18 06:02:59 proxy haproxy[4121]: Proxy proxynode started.
Feb 18 06:02:59 proxy haproxy[4121]: Proxy k8sServers started.
Feb 18 06:02:59 proxy haproxy[4121]: Proxy k8sServers started.
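A quick way to catch typos in haproxy.cfg before (re)starting the service is HAProxy's check mode; a small sketch, assuming the default config path:
# Validate the configuration only; prints "Configuration file is valid" on success
ps0107@proxy:~$ sudo haproxy -c -f /etc/haproxy/haproxy.cfg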
Verify that HAProxy is working (currently only master1 is in the backend)
# Now edit the hosts file on master1 so that the k8smaster name resolves to the HAProxy IP.
# HAProxy will then forward the calls to master1, as currently configured.
# Edit the hosts file
ps0107@k8smaster1:~/.ssh$ cat /etc/hosts
127.0.0.1 localhost
#10.146.0.2  k8smaster  master
10.146.0.9  k8smaster   master
# Open the HAProxy stats web page
http://35.243.70.115:9999/stats
# Run kubectl a few times and then check the web page.
ps0107@k8smaster1:~/lab/LFS458/SOLUTIONS/s_16$ kubectl get node
NAME         STATUS   ROLES    AGE   VERSION
k8smaster1   Ready    master   21d   v1.15.1
k8sworker1   Ready    <none>   21d   v1.15.1
- Open http://35.243.70.115:9999/stats to see the HAProxy status page.
 - At this point only one master node is connected as a backend.
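To double-check that kubectl really goes through the proxy rather than straight to master1, you can compare the API server address in the kubeconfig with the hosts entry; a small sketch (assuming the default ~/.kube/config and that the cluster was initialized with the k8smaster name as its endpoint):
# The server URL should use the k8smaster name, which now resolves to the proxy IP
ps0107@k8smaster1:~$ kubectl config view -o jsonpath='{.clusters[0].cluster.server}'
ps0107@k8smaster1:~$ getent hosts k8smaster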
 

Generate a token and the SSL hash on master node 1
###############################################
# Generate a token and the SSL hash on master node 1
###############################################
# The token and hash generated on master1 are needed for the join.
# On master2, change the k8smaster entry to the HAProxy server IP instead of the master1 IP.
ps0107@k8smaster2:~$ sudo vi /etc/hosts
# Create a token on master1
ps0107@k8smaster1:~$ sudo kubeadm token create
nmub4r.x6pnwhwicqosxtbn
# Generate the SSL hash
ps0107@k8smaster1:~$ openssl x509 -pubkey \
> -in /etc/kubernetes/pki/ca.crt | openssl rsa \                                                                                      
> -pubin -outform der 2>/dev/null | openssl dgst \
> -sha256 -hex | sed 's/^.* //'
16e1304347fa2f5cb6f1660e582ed3501aa63e0ddce276773566098aef13d0ba
# Upload a new master certificate key so the node can join as a master rather than a worker.
ps0107@k8smaster1:~$ sudo kubeadm init phase upload-certs --upload-certs
I0219 01:57:13.393987    9787 version.go:248] remote version is much newer: v1.17.3; falling back to: stable-1.15
[upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[upload-certs] Using certificate key:
d3bc0cd2c943fafe26e0dc5903ae41669993b7cbfbb7c4b5cc6336bb99a2e365
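As a side note, kubeadm can print the token and CA cert hash in one go with --print-join-command; it outputs a worker join line, so the --control-plane and --certificate-key flags from the upload-certs step above still have to be appended manually. A sketch:
# Prints a ready-made "kubeadm join ... --token ... --discovery-token-ca-cert-hash sha256:..." line
ps0107@k8smaster1:~$ sudo kubeadm token create --print-join-command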
Join the cluster as a master from master node 2
###############################################
# Join the cluster as a master from master node 2
###############################################
# Run the join on master2
ps0107@k8smaster2:~$ sudo kubeadm join k8smaster:6443 \
> --token nmub4r.x6pnwhwicqosxtbn \
> --discovery-token-ca-cert-hash sha256:16e1304347fa2f5cb6f1660e582ed3501aa63e0ddce276773566098aef13d0ba \
> --control-plane --certificate-key d3bc0cd2c943fafe26e0dc5903ae41669993b7cbfbb7c4b5cc6336bb99a2e365
[preflight] Running pre-flight checks
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[preflight] Running pre-flight checks before initializing the new control plane instance
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[download-certs] Downloading the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [k8smaster2 localhost] and IPs [10.146.0.13 127.0.0.1 ::1]
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [k8smaster2 localhost] and IPs [10.146.0.13 127.0.0.1 ::1]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [k8smaster2 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local k8smaster] and IPs [10.96.0.1 10.146.0.13]
[certs] Generating "front-proxy-client" certificate and key
[certs] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[certs] Using the existing "sa" key
[kubeconfig] Generating kubeconfig files
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[check-etcd] Checking that the etcd cluster is healthy
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.15" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Activating the kubelet service
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
[etcd] Announced new etcd member joining to the existing etcd cluster
[etcd] Wrote Static Pod manifest for a local etcd member to "/etc/kubernetes/manifests/etcd.yaml"
[etcd] Waiting for the new etcd member to join the cluster. This can take up to 40s
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[mark-control-plane] Marking the node k8smaster2 as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node k8smaster2 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
This node has joined the cluster and a new control plane instance was created:
* Certificate signing request was sent to apiserver and approval was received.
* The Kubelet was informed of the new secure connection details.
* Control plane (master) label and taint were applied to the new node.
* The Kubernetes control plane instances scaled up.
* A new etcd member was added to the local/stacked etcd cluster.
To start administering your cluster from this node, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Run 'kubectl get nodes' to see this node join the cluster.
# Copy the config file
ps0107@k8smaster2:~$ mkdir -p $HOME/.kube                                                                                             
ps0107@k8smaster2:~$ sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
ps0107@k8smaster2:~$ sudo chown $(id -u):$(id -g) $HOME/.kube/config
# Check the node list from master1.
# Master node 2 has joined successfully. Do the same for master node 3.
ps0107@k8smaster1:~$ kubectl get nodes
NAME         STATUS   ROLES    AGE     VERSION
k8smaster1   Ready    master   21d     v1.15.1
k8smaster2   Ready    master   4m15s   v1.15.1
k8sworker1   Ready    <none>   21d     v1.15.1
# Also check the etcd pods.
ps0107@k8smaster1:~$ kubectl -n kube-system get pods | grep etcd
etcd-k8smaster1                      1/1     Running   0          21d
etcd-k8smaster2                      1/1     Running   0          117m
# Check the logs of the newly created etcd pod.
ps0107@k8smaster1:~$ kubectl -n kube-system logs -f etcd-k8smaster2 
......
2020-02-19 03:28:08.167787 I | etcdserver: start to snapshot (applied: 2930857, lastsnap: 2920856)
2020-02-19 03:28:08.173395 I | etcdserver: saved snapshot at index 2930857
2020-02-19 03:28:08.174123 I | etcdserver: compacted raft log at 2925857
2020-02-19 03:30:02.972421 I | mvcc: store.index: compact 2528020
2020-02-19 03:30:02.973830 I | mvcc: finished scheduled compaction at 2528020 (took 1.034404ms)
# Exec into an etcd pod and check the cluster status.
ps0107@k8smaster1:~$ kubectl -n kube-system  exec -it etcd-k8smaster1 -- /bin/sh
# Query with the command below.
# The master1 etcd member is currently the leader.
ETCDCTL_API=3 etcdctl -w table \
--endpoints 10.146.0.2:2379,10.146.0.13:2379 \
--cacert /etc/kubernetes/pki/etcd/ca.crt \
--cert /etc/kubernetes/pki/etcd/server.crt \
--key /etc/kubernetes/pki/etcd/server.key \
endpoint status
+------------------+------------------+---------+---------+-----------+-----------+------------+
|     ENDPOINT     |        ID        | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
+------------------+------------------+---------+---------+-----------+-----------+------------+
|  10.146.0.2:2379 | 7a2c7e572714edd4 |  3.3.10 |  5.3 MB |      true |         6 |    2936615 |
| 10.146.0.13:2379 | fb2d7152d16646c3 |  3.3.10 |  5.3 MB |     false |         6 |    2936615 |
+------------------+------------------+---------+---------+-----------+-----------+------------+
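For HAProxy to actually balance API traffic across the new master, the backend section also has to be updated and the service restarted (the "use all three masters" step from the method list). A sketch of the relevant part of /etc/haproxy/haproxy.cfg, using the master2 IP from the /etc/hosts file above:
backend k8sServers
   balance roundrobin
   server master1  10.146.0.2:6443   check
   server master2  10.146.0.13:6443  check
#  server master3  10.146.0.14:6443  check
# reload the new configuration
ps0107@proxy:~$ sudo service haproxy restart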
- Ugh.. I'm using GCP and the per-region CPU limit is 8, so I can't create any more instances.
 - For now I'll just test with the one extra master attached.
 - Open http://35.243.70.115:9999/stats and confirm that traffic is being distributed correctly.
 

You can see that one more node has been added.
Let's run a failover test.
###############################################
# Failover test
###############################################
# Stop the docker service on master1.
ps0107@k8smaster1:~$ sudo service docker stop
# Check the etcd logs.
# The connection to the master1 etcd member has been lost.
ps0107@k8smaster1:~$ kubectl -n kube-system logs -f etcd-k8smaster2
.....
2020-02-19 04:20:17.075534 I | raft: fb2d7152d16646c3 became leader at term 7
2020-02-19 04:20:17.075554 I | raft: raft.node: fb2d7152d16646c3 elected leader fb2d7152d16646c3 at term 7
2020-02-19 04:20:17.170081 W | rafthttp: lost the TCP streaming connection with peer 7a2c7e572714edd4 (stream Message reader)
2020-02-19 04:20:17.170267 W | rafthttp: lost the TCP streaming connection with peer 7a2c7e572714edd4 (stream MsgApp v2 reader)
2020-02-19 04:20:17.192833 E | rafthttp: failed to dial 7a2c7e572714edd4 on stream Message (EOF)
2020-02-19 04:20:17.192864 I | rafthttp: peer 7a2c7e572714edd4 became inactive (message send to peer failed)
2020-02-19 04:20:17.454474 W | rafthttp: lost the TCP streaming connection with peer 7a2c7e572714edd4 (stream Message writer)
2020-02-19 04:20:19.054260 W | raft: fb2d7152d16646c3 stepped down to follower since quorum is not active
# kubectl commands still run fine.
# master1 has changed to NotReady status.
ps0107@k8smaster1:~$ kubectl get nodes
NAME         STATUS     ROLES    AGE    VERSION
k8smaster1   NotReady   master   21d    v1.15.1
k8smaster2   Ready      master   138m   v1.15.1
k8sworker1   NotReady   <none>   21d    v1.15.1
# Connect to etcd and check the cluster status.
# The leader has changed to the master2 etcd member.
ps0107@k8smaster1:~$ kubectl -n kube-system  exec -it etcd-k8smaster2 -- /bin/sh
ETCDCTL_API=3 etcdctl -w table \
--endpoints 10.146.0.2:2379,10.146.0.13:2379 \
--cacert /etc/kubernetes/pki/etcd/ca.crt \
--cert /etc/kubernetes/pki/etcd/server.crt \
--key /etc/kubernetes/pki/etcd/server.key \
endpoint status
+------------------+------------------+---------+---------+-----------+-----------+------------+
|     ENDPOINT     |        ID        | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
+------------------+------------------+---------+---------+-----------+-----------+------------+
|  10.146.0.2:2379 | 7a2c7e572714edd4 |  3.3.10 |  5.3 MB |     false |        10 |    2937708 |
| 10.146.0.13:2379 | fb2d7152d16646c3 |  3.3.10 |  5.3 MB |      true |        10 |    2937708 |
+------------------+------------------+---------+---------+-----------+-----------+------------+
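A note on the "quorum is not active" message in the etcd log above: with only two etcd members the write quorum is 2 (floor(n/2)+1), so losing master1 leaves the surviving member unable to commit writes, which is why a real H.A. control plane should run at least three masters. Using the same certificate flags as above, each endpoint's health can also be checked directly (a sketch; with quorum lost the check will typically fail):
ETCDCTL_API=3 etcdctl \
--endpoints 10.146.0.2:2379,10.146.0.13:2379 \
--cacert /etc/kubernetes/pki/etcd/ca.crt \
--cert /etc/kubernetes/pki/etcd/server.crt \
--key /etc/kubernetes/pki/etcd/server.key \
endpoint health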
- Open http://35.243.70.115:9999/stats to check the node status.
 

- The node status is visible there; after stopping the docker service, master1 shows as failed.
 
Now let's bring master node 1 back to normal.
# Now start the docker service on master1 again.
ps0107@k8smaster1:~$ sudo service docker start  
# It has returned to Ready status.
ps0107@k8smaster1:~$ kubectl get node
NAME         STATUS   ROLES    AGE    VERSION
k8smaster1   Ready    master   21d    v1.15.1
k8smaster2   Ready    master   142m   v1.15.1
k8sworker1   Ready    <none>   21d    v1.15.1
# Open http://35.243.70.115:9999/stats to check the node status.
# Check the etcd logs.
# The master1 etcd member has become active again.
ps0107@k8smaster1:~$ kubectl -n kube-system logs -f etcd-k8smaster2
.....
2020-02-19 04:27:41.728243 I | rafthttp: peer 7a2c7e572714edd4 became active
2020-02-19 04:27:41.728282 I | rafthttp: established a TCP streaming connection with peer 7a2c7e572714edd4 (stream MsgApp v2 reader)
2020-02-19 04:27:41.729772 I | rafthttp: established a TCP streaming connection with peer 7a2c7e572714edd4 (stream Message reader)
2020-02-19 04:27:41.741347 W | rafthttp: closed an existing TCP streaming connection with peer 7a2c7e572714edd4 (stream MsgApp v2 writer)
.....
- Open http://35.243.70.115:9999/stats to check the node status.
 

You can see that both nodes are back to normal.
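As a final sanity check, it can be useful to confirm that a full set of control-plane pods is running on both masters; a small sketch:
# Expect an apiserver, controller-manager, scheduler and etcd pod for each master node
ps0107@k8smaster1:~$ kubectl -n kube-system get pods -o wide | grep -E 'apiserver|controller|scheduler|etcd'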