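<p></p>
<p style="margin-bottom: 15px;padding: 0px;border: 0px;margin-top: 0px !important">Many of our company's projects run Redis in a master-slave setup. Because developers keep performing all sorts of destructive operations, we urgently needed an architecture with built-in failure recovery, so we started testing the new Redis Cluster.</p>
<h2>I. Cluster Fundamentals</h2>
<h3 style="margin: 20px 0px 10px;padding: 0px;border: 0px;font-size: 18px">Introduction to Cluster</h3>
<p style="margin-top: 10px;margin-bottom: 15px;padding: 0px;border: 0px">Redis Cluster is a Redis deployment in which data is shared (sharded) across multiple Redis nodes.</p>
<p style="margin-top: 15px;margin-bottom: 15px;padding: 0px;border: 0px">Redis Cluster does not support multi-key commands whose keys live in different hash slots, because that would require moving data between nodes. This would keep the cluster from matching the performance of standalone Redis and could lead to unpredictable errors under high load.</p>
<p style="margin-top: 15px;margin-bottom: 15px;padding: 0px;border: 0px">Redis Cluster provides a degree of availability through partitioning: in practice it can keep processing commands when some nodes are down or unreachable.</p>
<p style="margin-top: 15px;margin-bottom: 15px;padding: 0px;border: 0px">Advantages of Redis Cluster:</p>
<ul style="margin-top: 15px;margin-bottom: 15px;padding: 0px 0px 0px 30px;border: 0px" class=" list-paddingleft-2"> <li> <p>Data is split automatically across different nodes.</p> </li> <li> <p>The cluster can keep processing commands when a subset of its nodes has failed or is unreachable.</p> </li> </ul>
<h3 style="margin: 20px 0px 10px;padding: 0px;border: 0px;font-size: 18px">Redis Consistency Guarantees</h3>
<p style="margin-top: 10px;margin-bottom: 15px;padding: 0px;border: 0px">Redis Cluster does not guarantee strong consistency. In practice this means that under certain conditions the cluster may lose writes. The first reason is that the cluster uses asynchronous replication. A write proceeds as follows:<br />The client writes a command to master B.<br />Master B replies to the client with the command status.<br />Master B propagates the write to its slaves B1, B2 and B3.<br />The master replicates the command after it has replied to the client, because waiting for replication to finish on every request would drastically reduce the rate at which the master can process commands. We have to trade off performance against consistency.</p>
<p style="margin-top: 15px;margin-bottom: 15px;padding: 0px;border: 0px">Another situation in which Redis Cluster can lose writes is a network partition in which a client is isolated together with a minority of instances that includes at least one master.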
For example, suppose the cluster has six nodes, A, B, C, A1, B1 and C1, where A, B and C are masters and A1, B1 and C1 are their respective slaves, plus a client Z1. If a network partition occurs, the cluster may split into two sides: the majority side contains A, C, A1, B1 and C1, and the minority side contains B and the client Z1. Z1 can still write to master B. If the partition is short-lived, the cluster continues to operate normally. If it lasts long enough for the majority side to elect B1 as the new master, the data Z1 wrote to B is lost.</p>
<h3 style="margin: 20px 0px 10px;padding: 0px;border: 0px;font-size: 18px">Cluster Architecture</h3>
<p style="margin-top: 10px;margin-bottom: 15px;padding: 0px;border: 0px">1. redis-cluster architecture diagram</p>
<p style="margin-top: 15px;margin-bottom: 15px;padding: 0px;border: 0px"><img src="//cto.wang/usr/uploads/2016/07/20160703170020-52.jpg" style="margin-top: 0px;margin-right: 0px;margin-bottom: 0px;margin-left: 0px;padding-top: 0px;padding-right: 0px;padding-bottom: 0px;padding-left: 0px;border-top-width: 0px;border-right-width: 0px;border-bottom-width: 0px;border-left-width: 0px;border-style: initial;border-color: initial;max-width: 100%" /></p>
<p style="margin-top: 15px;margin-bottom: 15px;padding: 0px;border: 0px">Architecture details:</p>
<ul style="margin-top: 15px;margin-bottom: 15px;padding: 0px 0px 0px 30px;border: 0px" class=" list-paddingleft-2"> <li> <p>All Redis nodes are interconnected (via a PING-PONG mechanism) and use a binary protocol internally to optimize transfer speed and bandwidth.</p> </li> <li> <p>A node is marked as failed only when more than half of the master nodes in the cluster detect it as failed.</p> </li> <li> <p>Clients connect directly to Redis nodes with no intermediate proxy layer. A client does not need to connect to every node in the cluster; connecting to any one available node is enough.</p> </li> <li> <p>redis-cluster maps all physical nodes onto the slots [0-16383], and the cluster maintains the node &lt;-&gt; slot &lt;-&gt; value mapping.</p> </li> </ul>
<p style="margin-top: 15px;margin-bottom: 15px;padding: 0px;border: 0px">2. redis-cluster election and fault tolerance</p>
<p style="margin-top: 15px;margin-bottom: 15px;padding: 0px;border: 0px"><img src="//cto.wang/usr/uploads/2016/07/20160703170020-100.jpg" style="margin-top: 0px;margin-right: 0px;margin-bottom: 0px;margin-left: 0px;padding-top: 0px;padding-right: 0px;padding-bottom: 0px;padding-left: 0px;border-top-width: 0px;border-right-width: 0px;border-bottom-width: 0px;border-left-width: 0px;border-style: initial;border-color: initial;max-width: 100%" /></p>
<ul style="margin-top: 15px;margin-bottom: 15px;padding: 0px 0px 0px 30px;border: 0px" class=" list-paddingleft-2"> <li> <p>All masters in the cluster take part in the election. If more than half of the masters have been unable to communicate with a given master for longer than cluster-node-timeout, that master is considered down.</p> </li> <li> <p>When does the whole cluster become unavailable (cluster_state:fail)? While the cluster is unavailable, every operation against it fails with the error ((error) CLUSTERDOWN The cluster is down).</p> </li> <ul class=" list-paddingleft-2"> <li> <p>If any master goes down and that master has no slave, the cluster enters the fail state. Put another way, the cluster enters the fail state whenever the slot mapping [0-16383] is incomplete.</p> </li> <li> <p>If more than half of the masters go down, the cluster enters the fail state regardless of whether they have slaves.</p> </li> </ul> </ul>
<h2>II. Installing the Cluster</h2>
<p style="margin-top: 10px;margin-bottom: 15px;padding: 0px;border: 0px">Required software: redis-3.0.6.tar.gz<br />tcl8.6.1-src.tar.gz rubygems-2.4.2.tgz redis-3.0.0.gem</p>
<p style="margin-top: 15px;margin-bottom: 15px;padding: 0px;border: 0px">Note: the official site recommends six servers. On my laptop I made do with three VMs, each running two instances.</p>
<p style="margin-top: 15px;margin-bottom: 15px;padding: 0px;border: 0px">192.168.11.12:6379/6380 <br />192.168.11.13:6379/6380 <br />192.168.11.14:6379/6380</p>
<p style="margin-top: 15px;margin-bottom: 15px;padding: 0px;border: 0px">1. Install Redis (shown for one machine only)</p>
<pre>mkdir -pv /usr/local/redis6379/{etc,log,var,data}
cd redis-3.0.6
make
make PREFIX=/usr/local/redis6379 install</pre>
<p style="margin-top: 15px;margin-bottom: 15px;padding: 0px;border: 0px">2. Configuration file (for testing only, with the minimum set of options):</p>
<pre>cat /usr/local/redis6379/etc/redis6379.conf
daemonize yes
port 6379
appendonly yes
appendfilename "appendonly-6379.aof"
cluster-enabled yes
cluster-config-file /opt/nodes-6379.conf
cluster-node-timeout 5000</pre>
<p style="margin-top: 15px;margin-bottom: 15px;padding: 0px;border: 0px">The cluster-enabled option turns on cluster mode for the instance. cluster-config-file sets the path of the node configuration file, which defaults to nodes.conf. This file does not need to be edited by hand: Redis Cluster creates it at startup and updates it automatically when needed. cluster-node-timeout is given in milliseconds, so 5000 means 5 seconds.</p>
<p style="margin-top: 15px;margin-bottom: 15px;padding: 0px;border: 0px">3. Start Redis</p>
<pre>/usr/local/redis6379/bin/redis-server /usr/local/redis6379/etc/redis6379.conf
# ps aux | grep redis
root 14968 0.1 0.9 137444 9616 ? Ssl 20:23 0:11 /usr/local/redis6379/bin/redis-server *:6379 [cluster]
root 15002 0.1 0.7 137444 7520 ? Ssl 20:23 0:11 /usr/local/redis6380/bin/redis-server *:6380 [cluster]</pre>
<p style="margin-top: 15px;margin-bottom: 15px;padding: 0px;border: 0px">4. Start the cluster</p>
<pre>[root@12 10.19.166.212 /usr/local/src/redis-3.0.6/src ] # /usr/local/src/redis-3.0.6/src/redis-trib.rb create --replicas 1 192.168.11.12:6379 192.168.11.12:6380 192.168.11.13:6379 192.168.11.13:6380 192.168.11.14:6379 192.168.11.14:6380
/usr/bin/env: ruby: No such file or directory</pre>
<p style="margin-top: 15px;margin-bottom: 15px;padding: 0px;border: 0px">The command fails because ruby cannot be found: we have not installed it yet.</p>
<p style="margin-top: 15px;margin-bottom: 15px;padding: 0px;border: 0px">5. Install Ruby and its dependencies</p>
<pre>yum -y install ruby ruby-rdoc</pre>
<p style="margin-top: 15px;margin-bottom: 15px;padding: 0px;border: 0px">Note that a dependency is still missing at this point. Trying to build the cluster right away fails again:</p>
<pre># /usr/local/src/redis-3.0.6/src/redis-trib.rb create --replicas 1 192.168.11.12:6379 192.168.11.12:6380 192.168.11.13:6379 192.168.11.13:6380 192.168.11.14:6379 192.168.11.14:6380
/usr/local/src/redis-3.0.6/src/redis-trib.rb:24:in `require': no such file to load -- rubygems (LoadError)
        from /usr/local/src/redis-3.0.6/src/redis-trib.rb:24</pre>
<p style="margin-top: 15px;margin-bottom: 15px;padding: 0px;border: 0px">So install RubyGems and the redis gem next:</p>
<pre>tar zxmf rubygems-2.4.2.tgz
cd rubygems-2.4.2   # setup.rb is inside the extracted directory
ruby setup.rb
cp bin/gem /usr/local/bin/
cd ..               # back to where redis-3.0.0.gem was downloaded
gem install -l redis-3.0.0.gem</pre>
<p style="margin-top: 15px;margin-bottom: 15px;padding: 0px;border: 0px">6. Build the cluster</p>
<pre>/usr/local/src/redis-3.0.6/src/redis-trib.rb create --replicas 1 192.168.11.12:6379 192.168.11.12:6380 192.168.11.13:6379 192.168.11.13:6380 192.168.11.14:6379 192.168.11.14:6380
>>> Creating cluster
>>> Performing hash slots allocation on 6 nodes...
Using 3 masters:
192.168.11.14:6379
192.168.11.13:6379
192.168.11.12:6379
Adding replica 192.168.11.13:6380 to 192.168.11.14:6379
Adding replica 192.168.11.14:6380 to 192.168.11.13:6379
Adding replica 192.168.11.12:6380 to 192.168.11.12:6379
M: c776fbe75505f6cc5c452cea363626804d675433 192.168.11.12:6379
   slots:10923-16383 (5461 slots) master
S: 4dec205bd3c73333f3976e202fe4282c5b72286a 192.168.11.12:6380
   replicates c776fbe75505f6cc5c452cea363626804d675433
M: 945d62135ac298c06e56ea5d3da0bdf4eda86eb0 192.168.11.13:6379
   slots:5461-10922 (5462 slots) master
S: 00b47d4d1b3b26a276a96975fe33063225b87fc9 192.168.11.13:6380
   replicates ed3f0377b8c7cdace570fcdc8eb60b2fce61bba8
M: ed3f0377b8c7cdace570fcdc8eb60b2fce61bba8 192.168.11.14:6379
   slots:0-5460 (5461 slots) master
S: a8334f26be6b35a3e75f037fa5f44779cf970b12 192.168.11.14:6380
   replicates 945d62135ac298c06e56ea5d3da0bdf4eda86eb0
Can I set the above configuration? (type 'yes' to accept): yes
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join...
>>> Performing Cluster Check (using node 192.168.11.12:6379)
M: c776fbe75505f6cc5c452cea363626804d675433 192.168.11.12:6379
   slots:10923-16383 (5461 slots) master
M: 4dec205bd3c73333f3976e202fe4282c5b72286a 192.168.11.12:6380
   slots: (0 slots) master
   replicates c776fbe75505f6cc5c452cea363626804d675433
M: 945d62135ac298c06e56ea5d3da0bdf4eda86eb0 192.168.11.13:6379
   slots:5461-10922 (5462 slots) master
M: 00b47d4d1b3b26a276a96975fe33063225b87fc9 192.168.11.13:6380
   slots: (0 slots) master
   replicates ed3f0377b8c7cdace570fcdc8eb60b2fce61bba8
M: ed3f0377b8c7cdace570fcdc8eb60b2fce61bba8 192.168.11.14:6379
   slots:0-5460 (5461 slots) master
M: a8334f26be6b35a3e75f037fa5f44779cf970b12 192.168.11.14:6380
   slots: (0 slots) master
   replicates 945d62135ac298c06e56ea5d3da0bdf4eda86eb0
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.</pre>
<p style="margin-top: 15px;margin-bottom: 15px;padding: 0px;border: 0px">The installation is now complete. <br />Notes: <br />Here redis-trib.rb create builds a new cluster, and the option --replicas 1 means we want one slave for every master in the cluster. <br />The remaining arguments are the addresses of the instances that make up the cluster: 3 masters and 3 slaves.</p>
<p style="margin-top: 15px;margin-bottom: 15px;padding: 0px;border: 0px">Alternatively, /usr/local/src/redis-3.0.6/src/redis-trib.rb create --replicas 0 192.168.11.12:6379 192.168.11.13:6379 192.168.11.14:6379 creates the cluster without any slaves.</p>
<p style="margin-top: 15px;margin-bottom: 15px;padding: 0px;border: 0px">Further nodes can then be added with redis-trib.rb add-node 192.168.11.12:6380 192.168.11.13:6379, where the first address is the new node and the second is a node that already belongs to the cluster.</p>
<p style="margin-top: 15px;margin-bottom: 15px;padding: 0px;border: 0px">Now check the nodes:</p>
<pre>/usr/local/src/redis-3.0.6/src/redis-trib.rb check 192.168.11.12:6379
>>> Performing Cluster Check (using node 192.168.11.12:6379)
M: c776fbe75505f6cc5c452cea363626804d675433 192.168.11.12:6379
   slots:10923-16383 (5461 slots) master
   1 additional replica(s)
S: a8334f26be6b35a3e75f037fa5f44779cf970b12 192.168.11.14:6380
   slots: (0 slots) slave
   replicates 945d62135ac298c06e56ea5d3da0bdf4eda86eb0
M: 945d62135ac298c06e56ea5d3da0bdf4eda86eb0 192.168.11.13:6379
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
S: 00b47d4d1b3b26a276a96975fe33063225b87fc9 192.168.11.13:6380
   slots: (0 slots) slave
   replicates ed3f0377b8c7cdace570fcdc8eb60b2fce61bba8
M: ed3f0377b8c7cdace570fcdc8eb60b2fce61bba8 192.168.11.14:6379
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
S: 4dec205bd3c73333f3976e202fe4282c5b72286a 192.168.11.12:6380
   slots: (0 slots) slave
   replicates c776fbe75505f6cc5c452cea363626804d675433
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.</pre>
<p style="margin-top: 15px;margin-bottom: 15px;padding: 0px;border: 0px">7. Test writes:</p>
<pre>/usr/local/redis6379/bin/redis-cli -c -h 192.168.11.12 -p 6379
192.168.11.12:6379> set aa 123
-> Redirected to slot [1180] located at 192.168.11.14:6379
OK
192.168.11.14:6379>
[root@13 ~]# /usr/local/redis6379/bin/redis-cli -c -h 192.168.11.12 -p 6380
192.168.11.12:6380> get aa
-> Redirected to slot [1180] located at 192.168.11.14:6379
"123"
192.168.11.14:6379></pre>
<p style="margin-top: 15px;margin-bottom: 15px;padding: 0px;border: 0px">Test node failure (shutting nodes down):</p>
<pre>[root@12 10.19.166.212 /usr/local/src ] /usr/local/redis6379/bin/redis-cli -c -h 192.168.11.12 -p 6379 shutdown
[root@12 10.19.166.212 /usr/local/src ] /usr/local/src/redis-3.0.6/src/redis-trib.rb check 192.168.11.13:6379
>>> Performing Cluster Check (using node 192.168.11.13:6379)
M: 945d62135ac298c06e56ea5d3da0bdf4eda86eb0 192.168.11.13:6379
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
M: ed3f0377b8c7cdace570fcdc8eb60b2fce61bba8 192.168.11.14:6379
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
M: 4dec205bd3c73333f3976e202fe4282c5b72286a 192.168.11.12:6380
   slots:10923-16383 (5461 slots) master
   0 additional replica(s)
S: 00b47d4d1b3b26a276a96975fe33063225b87fc9 192.168.11.13:6380
   slots: (0 slots) slave
   replicates ed3f0377b8c7cdace570fcdc8eb60b2fce61bba8
S: a8334f26be6b35a3e75f037fa5f44779cf970b12 192.168.11.14:6380
   slots: (0 slots) slave
   replicates 945d62135ac298c06e56ea5d3da0bdf4eda86eb0
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
/usr/local/redis6379/bin/redis-cli -c -h 192.168.11.12 -p 6380 shutdown
/usr/local/redis6379/bin/redis-cli -c -h 192.168.11.14 -p 6379
192.168.11.14:6379> get aa
(error) CLUSTERDOWN The cluster is down
192.168.11.14:6379> get bb
(error) CLUSTERDOWN The cluster is down
192.168.11.14:6379> quit</pre>
<p style="margin-top: 15px;margin-bottom: 15px;padding: 0px;border: 0px">With both instances that served slots 10923-16383 stopped, slot coverage is incomplete and the cluster reports CLUSTERDOWN. Once the stopped instances are running again (presumably restarted; that step is not shown here), reads succeed:</p>
<p style="margin-top: 15px;margin-bottom: 15px;padding: 0px;border: 0px">/usr/local/redis6379/bin/redis-cli -c -h 192.168.11.14 -p 6379 <br />192.168.11.14:6379> get aa <br />"123" <br />192.168.11.14:6379> get bb <br />-> Redirected to slot [8620] located at 192.168.11.13:6379 <br />"234"</p>
<p style="margin-top: 15px;padding: 0px;border: 0px;margin-bottom: 0px !important">This completes the installation.</p>
<p></p>
Last modified: December 10, 2021, 10:53 AM © Reposting permitted with attribution
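<p style="margin-top: 15px;margin-bottom: 15px;padding: 0px;border: 0px">A note on the write test: the redirections to slot 1180 (key aa) and slot 8620 (key bb) can be reproduced offline. The sketch below assumes the key-to-slot rule described in the Redis Cluster specification, namely CRC16 (XMODEM variant) of the key modulo 16384, plus the hash-tag rule for keys that contain a non-empty {...} section. The function name key_slot is only illustrative and is not part of Redis or any client library. Python's binascii.crc_hqx stands in for the CRC16 implementation.</p>

```python
import binascii

SLOT_COUNT = 16384  # slots 0-16383, as described in the architecture section


def key_slot(key: str) -> int:
    """Return the hash slot for a key (illustrative helper, not a Redis API)."""
    data = key.encode("utf-8")
    # Hash-tag rule (assumed from the cluster spec): if the key contains a
    # non-empty {...} section, only the text between the first "{" and the
    # first "}" after it is hashed.
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            data = data[start + 1:end]
    # crc_hqx with an initial value of 0 computes CRC-16/XMODEM.
    return binascii.crc_hqx(data, 0) % SLOT_COUNT


if __name__ == "__main__":
    for key in ("aa", "bb"):
        print(key, key_slot(key))
```

<p style="margin-top: 15px;margin-bottom: 15px;padding: 0px;border: 0px">Run as a script, this prints slot 1180 for aa and slot 8620 for bb, matching the redirections in the transcripts. With the slot ranges created in step 6, 1180 falls in 0-5460 (192.168.11.14:6379) and 8620 falls in 5461-10922 (192.168.11.13:6379).</p>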