jamie.wang
Redis HA Configuration

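Introduction to Sentinel
Redis HA relies on the Redis Sentinel module. A stable version of Sentinel has shipped with the Redis distribution since 2.8, so it does not need to be installed separately.
Sentinel features
Monitoring: continuously checks the health of the master and slave nodes in the Redis deployment;
Notification: alerts users and administrators when a node has a problem;
Automatic failover: when the master fails, Sentinel promotes one of the slaves to master, reconfigures the remaining slaves to replicate from the new master, and informs clients of the new master address;
Configuration provider: clients ask Sentinel for the address of the current master and connect to it;
Sentinel itself runs as a distributed group; in this setup one Sentinel is started on each master/slave node.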
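Configuration guide
Configure the Master-Slave cluster
A robust Sentinel deployment needs at least three Sentinel instances, so three nodes are configured here: one master and two slaves.
Master: 10.111.134.44:6379
Slave1: 10.111.134.7:6379
Slave2: 10.111.134.55:6379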
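Why three? A failover has to be authorized by a majority of the Sentinel processes, so the count determines how many Sentinels can be lost. The following is a minimal sketch of that arithmetic only; the function names `majority` and `tolerated_failures` are made up for illustration and are not part of Redis.

```python
def majority(sentinels: int) -> int:
    """Smallest number of Sentinels that forms a strict majority."""
    return sentinels // 2 + 1


def tolerated_failures(sentinels: int) -> int:
    """How many Sentinels can be lost while a majority is still reachable."""
    return sentinels - majority(sentinels)


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        print(f"sentinels={n} majority={majority(n)} "
              f"tolerated_failures={tolerated_failures(n)}")
```

With two Sentinels no failure can be tolerated, while three Sentinels survive the loss of one, which is why three is the practical minimum.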
Configuration file
An example redis.conf:
daemonize yes
pidfile /var/run/redis/redis-server.pid
port 6379
bind 0.0.0.0
timeout 0
tcp-keepalive 0
loglevel notice
logfile /var/log/redis/redis-server.log
databases 16
save 900 1
save 300 10
save 60 10000
stop-writes-on-bgsave-error yes
rdbcompression yes
rdbchecksum yes
dbfilename dump.rdb
dir /var/lib/redis
slaveof 10.111.134.44 6379
min-slaves-to-write 1
min-slaves-max-lag 10
slave-serve-stale-data yes
slave-read-only yes
repl-disable-tcp-nodelay no
slave-priority 100
appendonly no
appendfilename "appendonly.aof"
appendfsync everysec
no-appendfsync-on-rewrite no
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
lua-time-limit 5000
slowlog-log-slower-than 10000
slowlog-max-len 128
notify-keyspace-events ""
hash-max-ziplist-entries 512
hash-max-ziplist-value 64
set-max-intset-entries 512
zset-max-ziplist-entries 128
zset-max-ziplist-value 64
activerehashing yes
client-output-buffer-limit normal 0 0 0
client-output-buffer-limit slave 256mb 64mb 60
client-output-buffer-limit pubsub 32mb 8mb 60
hz 10
aof-rewrite-incremental-fsync yes
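Pay attention to these lines:
bind 0.0.0.0 # listen on all network interfaces
slaveof 10.111.134.44 6379 # omit this line on the master
min-slaves-to-write 1 # the master rejects writes when fewer than 1 healthy slave is connected
min-slaves-max-lag 10 # a slave counts as healthy only if its lag is at most 10 seconds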
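The interaction of the last two settings can be modelled in a few lines. This is only an illustrative sketch of the rule, not Redis code: the function `master_accepts_writes` and its parameter names are invented here, and the lag values stand for the `lag=` field that `info replication` reports for each slave.

```python
def master_accepts_writes(slave_lags, min_slaves_to_write=1, min_slaves_max_lag=10):
    """Return True if enough slaves are within the allowed lag.

    slave_lags: lag in seconds for each currently connected slave.
    """
    # A slave is "good" only when its lag does not exceed the limit.
    good_slaves = [lag for lag in slave_lags if lag <= min_slaves_max_lag]
    return len(good_slaves) >= min_slaves_to_write


if __name__ == "__main__":
    print(master_accepts_writes([1, 0]))   # both slaves healthy
    print(master_accepts_writes([]))       # no slave connected
    print(master_accepts_writes([30]))     # one slave, lagging too far
```

With the values used in this article, the master keeps accepting writes as long as at least one slave has a lag of 10 seconds or less.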
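Configure Sentinel
Sentinel must be configured on every node; the configuration is identical on all of them.
# Listening port
port 26379
 
# Monitor the master (10.111.134.44) on port 6379; the quorum of 2 means the master is considered objectively down once two Sentinels agree that it is unreachable
sentinel monitor master1 10.111.134.44 6379 2
# A Sentinel marks the master as subjectively down if it gets no valid reply for 5 seconds
sentinel down-after-milliseconds master1 5000
# Failover timeout in milliseconds
sentinel failover-timeout master1 180000
# Number of slaves reconfigured to sync from the new master at the same time
sentinel parallel-syncs master1 1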
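How down-after-milliseconds and the quorum combine can be sketched as two small checks. This is a simplified model under stated assumptions, not Sentinel's implementation: `is_sdown` and `is_odown` are hypothetical names, and the inputs (milliseconds since the last valid reply, and one vote per Sentinel) are supplied by hand.

```python
def is_sdown(ms_since_last_valid_reply, down_after_ms=5000):
    """Subjectively down: this Sentinel alone has lost contact for too long."""
    return ms_since_last_valid_reply > down_after_ms


def is_odown(sdown_votes, quorum=2):
    """Objectively down: at least `quorum` Sentinels report the master as down."""
    return sum(1 for vote in sdown_votes if vote) >= quorum


if __name__ == "__main__":
    # Three Sentinels; two of them have had no reply for more than 5 seconds.
    silence_ms = [6000, 5200, 800]
    votes = [is_sdown(ms) for ms in silence_ms]
    print(votes, is_odown(votes))
```

Reaching the quorum only marks the master as down and lets a Sentinel attempt a failover; the failover itself still needs the vote of a majority of Sentinels.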
Startup
Start Redis
$ sudo /etc/init.d/redis-server start
Starting redis-server: redis-server.
Start Sentinel
$ sudo /home/sensu/redis-3.0.1/src/redis-sentinel /etc/redis/sentinel.conf
20561:X 29 Dec 16:43:35.366 # Sentinel runid is 82decae77e954ca8fe14ebde0da937afbadc0688
20561:X 29 Dec 16:43:35.366 # +monitor master master1 10.111.134.44 6379 quorum 2
20561:X 29 Dec 16:48:57.603 * +slave slave 10.111.134.55:6379 10.111.134.55 6379 @ master1 10.111.134.44 6379
20561:X 29 Dec 16:53:16.568 * +sentinel sentinel 10.111.134.55:26379 10.111.134.55 26379 @ master1 10.111.134.44 6379
20561:X 29 Dec 16:53:48.825 * +slave slave 10.111.134.7:6379 10.111.134.7 6379 @ master1 10.111.134.44 6379
20561:X 29 Dec 16:53:56.447 * +sentinel sentinel 10.111.134.7:26379 10.111.134.7 26379 @ master1 10.111.134.44 6379
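The log shows that the master, both slaves, and the other two Sentinels have been discovered and are being monitored.
Check the master/slave status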
$ redis-cli -h 10.111.134.44 info replication
# Replication
role:master
connected_slaves:2
min_slaves_good_slaves:2
slave0:ip=10.111.134.55,port=6379,state=online,offset=135234,lag=1
slave1:ip=10.111.134.7,port=6379,state=online,offset=135234,lag=0
master_repl_offset:135374
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:2
repl_backlog_histlen:135373
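For scripted checks, the `info replication` output can be turned into a dictionary. The sketch below is one possible way to do it with the standard library only; `parse_info` and `parse_slave_entry` are illustrative names, and the sample text is a shortened copy of the output above.

```python
def parse_info(text):
    """Parse `redis-cli info` output into a dict of string fields."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and section headers
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


def parse_slave_entry(value):
    """Turn 'ip=...,port=...,state=...' into a dict."""
    return dict(item.split("=", 1) for item in value.split(","))


SAMPLE = """# Replication
role:master
connected_slaves:2
slave0:ip=10.111.134.55,port=6379,state=online,offset=135234,lag=1
slave1:ip=10.111.134.7,port=6379,state=online,offset=135234,lag=0
"""

info = parse_info(SAMPLE)
# Keys such as slave0, slave1 describe individual slaves.
slaves = [parse_slave_entry(v) for k, v in info.items()
          if k.startswith("slave") and k[5:].isdigit()]

if __name__ == "__main__":
    print(info["role"], info["connected_slaves"])
    for slave in slaves:
        print(slave["ip"], slave["state"], slave["lag"])
```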
 
Testing
Master-Slave test
Set a value on the master
$ redis-cli -h 10.111.134.44 set test_ha 1
OK
Read it back from every node:
$ redis-cli -h 10.111.134.44 get test_ha
"1"
$ redis-cli -h 10.111.134.7 get test_ha
"1"
$ redis-cli -h 10.111.134.55 get test_ha
"1"
Set a value on a slave
$ redis-cli -h 10.111.134.55 set test_ha 2
(error) READONLY You can't write against a read only slave.
Slaves are configured as read-only (slave-read-only yes), which prevents inconsistent data.
HA test
Stop the slaves
Stop one slave:
$ ps aux | grep redis
redis    26189  0.2  0.9  38148  9644 ?        Ssl  16:48   0:02 /home/sensu/redis-3.0.1/src/redis-server 0.0.0.0:6379        
root     27265  0.0  0.2  64952  2136 pts/3    S+   16:52   0:00 sudo /home/sensu/redis-3.0.1/src/redis-sentinel /etc/redis/sentinel.conf
root     27266  0.2  0.7  38152  7984 pts/3    Sl+  16:52   0:02 /home/sensu/redis-3.0.1/src/redis-sentinel *:26379 [sentinel]     
sensu    30928  0.0  0.0  11744   896 pts/1    S+   17:10   0:00 grep redis
$ sudo kill -9 26189
Set a value on the master while one slave is down:
$ redis-cli -h 10.111.134.44 set test_ha 2
OK
Set a value after stopping both slaves:
$ redis-cli -h 10.111.134.44 set test_ha 3
(error) NOREPLICAS Not enough good slaves to write.
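The write fails because min-slaves-to-write 1 makes the master reject writes when no slave is connected.
Get:
$ redis-cli -h 10.111.134.44 get test_ha
"2"
Reads still succeed.
Stop Sentinel
Stopping Sentinel affects only monitoring and failover; GET and SET are unaffected.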
$ ps aux | grep redis
root     15603  0.0  0.2  64952  2136 pts/1    S+   16:53   0:00 sudo /home/sensu/redis-3.0.1/src/redis-sentinel /etc/redis/sentinel.conf
root     15604  0.4  0.7  38152  7980 pts/1    Sl+  16:53   0:09 /home/sensu/redis-3.0.1/src/redis-sentinel *:26379 [sentinel]     
redis    23566  0.0  0.9  38148  9592 ?        Ssl  17:30   0:00 /home/sensu/redis-3.0.1/src/redis-server 0.0.0.0:6379        
sensu    23570  0.0  0.0  11744   896 pts/0    S+   17:30   0:00 grep redis
$ sudo kill -9 15603 15604
GET and SET:
$ redis-cli -h 10.111.134.44 set test_ha 3
OK
$ redis-cli -h 10.111.134.7 get test_ha
"3"
$ redis-cli -h 10.111.134.55 get test_ha
"3"
Stop the Master
Run on the master node:
$ ps aux | grep redis
redis    20518  0.4  0.2  38148  9656 ?        Ssl  16:40   0:16 /home/sensu/redis-3.0.1/src/redis-server 0.0.0.0:6379        
root     20560  0.0  0.0  64952  2132 pts/2    S+   16:43   0:00 sudo /home/sensu/redis-3.0.1/src/redis-sentinel /etc/redis/sentinel.conf
root     20561  0.9  0.2  38152  7992 pts/2    Sl+  16:43   0:31 /home/sensu/redis-3.0.1/src/redis-sentinel *:26379 [sentinel]     
sensu    21280  0.0  0.0  11748   900 pts/3    S+   17:38   0:00 grep redis
$ sudo kill -9 20518
Check the Sentinel log
The failover starts after about 5 seconds, matching the down-after-milliseconds value we configured:
20561:X 29 Dec 17:39:27.853 # +sdown master master1 10.111.134.44 6379
20561:X 29 Dec 17:39:27.921 # +odown master master1 10.111.134.44 6379 #quorum 2/2
20561:X 29 Dec 17:39:27.921 # +new-epoch 1
20561:X 29 Dec 17:39:27.921 # +try-failover master master1 10.111.134.44 6379
20561:X 29 Dec 17:39:27.926 # +vote-for-leader 82decae77e954ca8fe14ebde0da937afbadc0688 1
20561:X 29 Dec 17:39:27.931 # 10.111.134.7:26379 voted for 82decae77e954ca8fe14ebde0da937afbadc0688 1
20561:X 29 Dec 17:39:27.933 # 10.111.134.55:26379 voted for 82decae77e954ca8fe14ebde0da937afbadc0688 1
20561:X 29 Dec 17:39:27.985 # +elected-leader master master1 10.111.134.44 6379
20561:X 29 Dec 17:39:27.985 # +failover-state-select-slave master master1 10.111.134.44 6379
20561:X 29 Dec 17:39:28.053 # +selected-slave slave 10.111.134.55:6379 10.111.134.55 6379 @ master1 10.111.134.44 6379
20561:X 29 Dec 17:39:28.053 * +failover-state-send-slaveof-noone slave 10.111.134.55:6379 10.111.134.55 6379 @ master1 10.111.134.44 6379
20561:X 29 Dec 17:39:28.137 * +failover-state-wait-promotion slave 10.111.134.55:6379 10.111.134.55 6379 @ master1 10.111.134.44 6379
20561:X 29 Dec 17:39:28.974 # +promoted-slave slave 10.111.134.55:6379 10.111.134.55 6379 @ master1 10.111.134.44 6379
20561:X 29 Dec 17:39:28.974 # +failover-state-reconf-slaves master master1 10.111.134.44 6379
20561:X 29 Dec 17:39:29.071 * +slave-reconf-sent slave 10.111.134.7:6379 10.111.134.7 6379 @ master1 10.111.134.44 6379
20561:X 29 Dec 17:39:30.066 * +slave-reconf-inprog slave 10.111.134.7:6379 10.111.134.7 6379 @ master1 10.111.134.44 6379
20561:X 29 Dec 17:39:30.067 * +slave-reconf-done slave 10.111.134.7:6379 10.111.134.7 6379 @ master1 10.111.134.44 6379
20561:X 29 Dec 17:39:30.131 # -odown master master1 10.111.134.44 6379
20561:X 29 Dec 17:39:30.132 # +failover-end master master1 10.111.134.44 6379
20561:X 29 Dec 17:39:30.132 # +switch-master master1 10.111.134.44 6379 10.111.134.55 6379
20561:X 29 Dec 17:39:30.133 * +slave slave 10.111.134.7:6379 10.111.134.7 6379 @ master1 10.111.134.55 6379
20561:X 29 Dec 17:39:30.133 * +slave slave 10.111.134.44:6379 10.111.134.44 6379 @ master1 10.111.134.55 6379
20561:X 29 Dec 17:39:35.198 # +sdown slave 10.111.134.44:6379 10.111.134.44 6379 @ master1 10.111.134.55 6379
The log shows that 10.111.134.55:6379 was elected as the new master.
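The `+switch-master` line in the log above carries the master name, the old address and the new address, so the new master can be extracted from the log mechanically. The following is a small sketch using a regular expression; `find_new_master` and `SWITCH_RE` are hypothetical names, and the field order is taken from the log lines shown in this article.

```python
import re

# +switch-master <name> <old-ip> <old-port> <new-ip> <new-port>
SWITCH_RE = re.compile(
    r"\+switch-master (?P<name>\S+) (?P<old_ip>\S+) (?P<old_port>\d+) "
    r"(?P<new_ip>\S+) (?P<new_port>\d+)"
)


def find_new_master(log_lines):
    """Return (name, ip, port) from the last +switch-master line, or None."""
    result = None
    for line in log_lines:
        match = SWITCH_RE.search(line)
        if match:
            result = (match.group("name"), match.group("new_ip"),
                      int(match.group("new_port")))
    return result


LOG = [
    "20561:X 29 Dec 17:39:30.132 # +failover-end master master1 10.111.134.44 6379",
    "20561:X 29 Dec 17:39:30.132 # +switch-master master1 10.111.134.44 6379 10.111.134.55 6379",
]

if __name__ == "__main__":
    print(find_new_master(LOG))  # ('master1', '10.111.134.55', 6379)
```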
Verify:
$ redis-cli -h 10.111.134.44 info replication
Could not connect to Redis at 10.111.134.44:6379: Connection refused
The old master is no longer reachable.
$ redis-cli -h 10.111.134.55 info replication
# Replication
role:master
connected_slaves:1
slave0:ip=10.111.134.7,port=6379,state=online,offset=46855,lag=0
master_repl_offset:46995
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:2
repl_backlog_histlen:46994
 
$ redis-cli -h 10.111.134.7 info replication
# Replication
role:slave
master_host:10.111.134.55
master_port:6379
master_link_status:up
master_last_io_seconds_ago:0
master_sync_in_progress:0
slave_repl_offset:69775
slave_priority:100
slave_read_only:1
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
 
10.111.134.55 is now the master and 10.111.134.7 is its slave.
Set and get:
$ redis-cli -h 10.111.134.55 set test_ha 4
OK
$ redis-cli -h 10.111.134.7 set test_ha 4
(error) READONLY You can't write against a read only slave.
$ redis-cli -h 10.111.134.7 get test_ha
"4"
$ redis-cli -h 10.111.134.55 get test_ha
"4"
Stop the Master again
$ ps aux | grep redis
redis     1849  0.2  0.9  38148  9680 ?        Ssl  17:24   0:02 /home/sensu/redis-3.0.1/src/redis-server 0.0.0.0:6379        
root      4612  0.0  0.2  65188  2364 pts/3    S+   17:37   0:00 sudo /home/sensu/redis-3.0.1/src/redis-sentinel /etc/redis/sentinel.conf
root      4613  0.6  0.7  38152  7988 pts/3    Sl+  17:37   0:03 /home/sensu/redis-3.0.1/src/redis-sentinel *:26379 [sentinel]     
sensu     6739  0.0  0.0  11748   900 pts/1    S+   17:48   0:00 grep redis
$ sudo kill -9 1849
This time 10.111.134.7 is the only master left.
Set and get: writes are rejected, reads still work.
$ redis-cli -h 10.111.134.55 get test_ha
Could not connect to Redis at 10.111.134.55:6379: Connection refused
$ redis-cli -h 10.111.134.7 set test_ha 5
(error) NOREPLICAS Not enough good slaves to write.
$ redis-cli -h 10.111.134.7 get test_ha
"4"
Recovery
Recover the Slave
$ sudo /etc/init.d/redis-server start
Starting redis-server: redis-server.
$ redis-cli -h 10.111.134.7 info replication
# Replication
role:master
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
 
The recovered slave has not joined the new master, because its configuration still points to 10.111.134.44.
Recover the Master
$ sudo /etc/init.d/redis-server start
Starting redis-server: redis-server.
$ redis-cli -h 10.111.134.7 info replication
# Replication
role:master
connected_slaves:1
slave0:ip=10.111.134.44,port=6379,state=online,offset=163,lag=0
master_repl_offset:163
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:2
repl_backlog_histlen:162
The recovered master has joined as a slave of the new master.
Check again a little later:
$ redis-cli -h 10.111.134.7 info replication
# Replication
role:master
connected_slaves:2
slave0:ip=10.111.134.44,port=6379,state=online,offset=73074,lag=1
slave1:ip=10.111.134.55,port=6379,state=online,offset=73351,lag=1
master_repl_offset:73490
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:2
repl_backlog_histlen:73489
 
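The second slave has now joined as well, because the master it was configured with (10.111.134.44) is reachable again.
Summary
A Redis HA deployment needs at least 3 Sentinel nodes, but it does not need 3 master/slave nodes; the official documentation explains this. Every setting involves a trade-off and should be tuned to the situation. An external monitoring tool is still required to make sure failed master and slave nodes are brought back promptly, because HA fails once all nodes are down. In addition, operations fail during the few seconds (down-after-milliseconds) between the master going down and a new master being elected.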
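Because commands fail during that short election window, a client may want to retry instead of giving up on the first error. The following is a minimal sketch of such a retry loop, not a recommendation of specific values: `retry` and `flaky_set` are hypothetical names, `flaky_set` merely simulates a master that is unreachable for the first two attempts, and a real client would also ask Sentinel for the current master address before each attempt.

```python
import time


def retry(operation, attempts=5, delay_seconds=1.0, sleep=time.sleep):
    """Call operation() until it succeeds or the attempts run out."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError as exc:
            last_error = exc
            if attempt < attempts - 1:
                sleep(delay_seconds)  # wait before the next attempt
    raise last_error


calls = {"n": 0}


def flaky_set():
    """Stand-in for a SET that fails while no master is available."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("master unreachable")
    return "OK"


if __name__ == "__main__":
    print(retry(flaky_set, delay_seconds=0.01))
```

The total retry time (attempts multiplied by delay) should comfortably exceed down-after-milliseconds plus the time the failover itself takes.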
