My new project Sophia requires a ton of database calls. When I say a ton, I mean a TON. Thankfully, the tech world has JUST started vetting exactly the solution I need: Redis clustering.

Redis is a super fast in-memory key:value store like memcached, but it supports more data types. Everything is kept in memory and synced to disk on whatever schedule you configure (every few minutes with the default save rules).
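For context, that sync schedule comes from the save directives in redis.conf (the same ones you'll see in the full config further down). Each directive means "snapshot if at least N keys changed in the last S seconds" - a couple of lines like these:

# snapshot if at least 10 keys changed in the last 300 seconds
save 300 10
# snapshot if at least 1 key changed in the last 900 seconds
save 900 1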

Redis clustering is still in beta and very new. In fact, to even get your hands on it, you have to compile it from the unstable branch on GitHub. With that said, clustering has been in the works for about TWO YEARS now, and when set up right, it seems to work.

Documentation is missing large chunks of information and real-world examples are nearly nonexistent right now. Let's fix that!

I'll be deploying my Redis shards on CentOS 6.5 and will be using a separate server for each node, so YMMV.

Run the following setup procedure on each of your servers.

First, install packages.

yum install wget make gcc man tcl ruby rubygems -y
gem install redis

Next, get the source and install it:

wget -O /tmp/redis.tar.gz "https://github.com/antirez/redis/archive/3.0.0-beta3.tar.gz" -q
cd /tmp
tar zxf redis.tar.gz
cd redis*
make
make install

Make some directories redis wants…

mkdir -p /var/lib/redis
mkdir -p /var/log/redis

At this point you'll want to set up an init script. You can use mine if you want. I can't remember where I got it at this point, but it works so far. Run the code below to install it.

cat << 'EOF' > /etc/init.d/redis-server
#!/bin/sh
#
# redis - this script starts and stops the redis-server daemon
#
# chkconfig:   - 85 15
# description:  Redis is a persistent key-value database
# processname: redis-server
# config:      /etc/redis/redis.conf
# config:      /etc/sysconfig/redis
# pidfile:     /var/run/redis.pid
# Source function library.
. /etc/rc.d/init.d/functions
# Source networking configuration.
. /etc/sysconfig/network
# Check that networking is up.
[ "$NETWORKING" = "no" ] && exit 0
redis="/usr/local/bin/redis-server"
prog=$(basename $redis)
REDIS_CONF_FILE="/etc/redis/redis.conf"
[ -f /etc/sysconfig/redis ] && . /etc/sysconfig/redis
lockfile=/var/lock/subsys/redis
start() {
    [ -x $redis ] || exit 5
    [ -f $REDIS_CONF_FILE ] || exit 6
    echo -n $"Starting $prog: "
    daemon $redis $REDIS_CONF_FILE
    retval=$?
    echo
    [ $retval -eq 0 ] && touch $lockfile
    return $retval
}
stop() {
    echo -n $"Stopping $prog: "
    killproc $prog -QUIT
    retval=$?
    echo
    [ $retval -eq 0 ] && rm -f $lockfile
    return $retval
}
restart() {
    stop
    start
}
reload() {
    echo -n $"Reloading $prog: "
    killproc $redis -HUP
    RETVAL=$?
    echo
}
force_reload() {
    restart
}
rh_status() {
    status $prog
}
rh_status_q() {
    rh_status >/dev/null 2>&1
}
case "$1" in
    start)
        rh_status_q && exit 0
        $1
        ;;
    stop)
        rh_status_q || exit 0
        $1
        ;;
    restart|configtest)
        $1
        ;;
    reload)
        rh_status_q || exit 7
        $1
        ;;
    force-reload)
        force_reload
        ;;
    status)
        rh_status
        ;;
    condrestart|try-restart)
        rh_status_q || exit 0
        restart
        ;;
    *)
        echo $"Usage: $0 {start|stop|status|restart|condrestart|try-restart|reload|force-reload}"
        exit 2
esac
EOF

chmod u+x /etc/init.d/redis-server
chkconfig --add redis-server
chkconfig redis-server on

Now, set a sysctl value that Redis asks for:

sysctl vm.overcommit_memory=1
echo "vm.overcommit_memory = 1" >> /etc/sysctl.conf

To make our lives easier, let's put redis-trib in our path. redis-trib.rb is the CLI tool used to manage clustering.

find / -name redis-trib.rb -exec cp -f {} /usr/local/bin/redis-trib.rb \;
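If you still have the build tree from the install step lying around, the script also lives in the source at src/redis-trib.rb, so a direct copy works too (the path below assumes the tarball extracted to /tmp/redis-3.0.0-beta3):

cp /tmp/redis-3.0.0-beta3/src/redis-trib.rb /usr/local/bin/redis-trib.rb
chmod +x /usr/local/bin/redis-trib.rb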

A cluster-enabled configuration is generally not compatible with configs from Redis 2.x, and clustering does not support passwords. Here is the config I used in testing - feel free to use it. It goes in /etc/redis/redis.conf.

activerehashing yes
aof-rewrite-incremental-fsync yes
appendfilename "appendonly.aof"
appendfsync everysec
appendonly no
auto-aof-rewrite-min-size 64mb
auto-aof-rewrite-percentage 100
client-output-buffer-limit normal 0 0 0
client-output-buffer-limit pubsub 32mb 8mb 60
client-output-buffer-limit slave 256mb 64mb 60
cluster-config-file nodes-6379.conf
cluster-enabled yes
cluster-node-timeout 15000
daemonize yes
databases 16
dbfilename dump.rdb
dir /var/lib/redis
hash-max-ziplist-entries 512
hash-max-ziplist-value 64
hll-sparse-max-bytes 3000
hz 10
list-max-ziplist-entries 512
list-max-ziplist-value 64
logfile "/var/log/redis/redis.log"
loglevel notice
lua-time-limit 5000
no-appendfsync-on-rewrite no
notify-keyspace-events ""
pidfile /var/run/redis.pid
port 6379
rdbchecksum yes
rdbcompression yes
repl-disable-tcp-nodelay no
save 300 10
save 60 10000
save 900 1
set-max-intset-entries 512
slave-priority 100
slave-read-only yes
slave-serve-stale-data yes
slowlog-log-slower-than 10000
slowlog-max-len 128
stop-writes-on-bgsave-error yes
tcp-backlog 511
tcp-keepalive 0
timeout 0
zset-max-ziplist-entries 128
zset-max-ziplist-value 64

Start Redis up…

service redis-server start

Great! Your CentOS servers should all be running Redis now!
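A quick sanity check never hurts before moving on - a plain PING to each node should come back with PONG:

# run on (or against) each node; expect PONG
redis-cli -h 127.0.0.1 -p 6379 ping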

Now, SSH to one of your redis servers and start configuring clustering…

To set up your cluster, we use redis-trib.rb. Keep in mind that all server addresses must be IPs (hostnames will break everything!) and you must specify a port. If anything goes wrong at this stage, you should probably tear down everything and start over. The Redis cluster seems to remember its cluster configuration no matter what files or configs I delete.
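In theory, the clustering state lives in each node's cluster config file (nodes-6379.conf from the config above) along with its RDB/AOF data, so a "start over" attempt on every node looks roughly like this - a sketch, assuming dir points at /var/lib/redis (adjust if your dir setting differs):

service redis-server stop
# wipe cluster state and data for this node (destructive!)
rm -f /var/lib/redis/nodes-6379.conf /var/lib/redis/dump.rdb /var/lib/redis/appendonly.aof
service redis-server start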

Here is the cluster create command. Swap in your own server IPs. Make sure your servers have static IPs, because if they change, the cluster won't update itself. The --replicas switch is optional and requires an extra server for each master. At minimum you'll want either 3 masters, or 3 masters and 3 slaves.

Note! If you use a hostname, this will break! Use IPs only!

Option 1) Adding 6 servers (3 masters and 3 replicas)

./redis-trib.rb create --replicas 1 192.168.1.200:6379 192.168.1.201:6379 192.168.1.202:6379 192.168.1.203:6379 192.168.1.204:6379 192.168.1.205:6379

Option 2) Adding 3 servers (3 masters, no replicas)

./redis-trib.rb create 192.168.1.200:6379 192.168.1.201:6379 192.168.1.202:6379

You should see output like this:

[root@redis8 ~]# ./redis-trib.rb create --replicas 1 192.168.1.107:6379 192.168.1.108:6379 192.168.1.109:6379 192.168.1.110:6379 192.168.1.104:6379 192.168.1.89:6379
>>> Creating cluster
Connecting to node 192.168.1.107:6379: OK
Connecting to node 192.168.1.108:6379: OK
Connecting to node 192.168.1.109:6379: OK
Connecting to node 192.168.1.110:6379: OK
Connecting to node 192.168.1.104:6379: OK
Connecting to node 192.168.1.89:6379: OK
>>> Performing hash slots allocation on 6 nodes...
Using 3 masters:
192.168.1.89:6379
192.168.1.104:6379
192.168.1.109:6379
Adding replica 192.168.1.108:6379 to 192.168.1.89:6379
Adding replica 192.168.1.107:6379 to 192.168.1.104:6379
Adding replica 192.168.1.110:6379 to 192.168.1.109:6379
S: 39287c43c3efc6c93904afd460aa200ebd06b4de 192.168.1.107:6379
   replicates e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40
S: b8d80cbac1bd6fe84b8d6b492823607dfd0d8552 192.168.1.108:6379
   replicates 9add269cc27885bf129f43ef2c92fa090963ecce
M: 399e4195a145545dacc1aaa3daa65a0869d8688a 192.168.1.109:6379
   slots:10923-16383 (5461 slots) master
S: 45e813b92ce6cdec92757b2ea0f23412133dbfe2 192.168.1.110:6379
   replicates 399e4195a145545dacc1aaa3daa65a0869d8688a
M: e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40 192.168.1.104:6379
   slots:5461-10922 (5462 slots) master
M: 9add269cc27885bf129f43ef2c92fa090963ecce 192.168.1.89:6379
   slots:0-5460 (5461 slots) master
Can I set the above configuration? (type 'yes' to accept): yes
>>> Nodes configuration updated
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join......
>>> Performing Cluster Check (using node 192.168.1.107:6379)
M: 39287c43c3efc6c93904afd460aa200ebd06b4de 192.168.1.107:6379
   slots: (0 slots) master
   replicates e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40
M: b8d80cbac1bd6fe84b8d6b492823607dfd0d8552 192.168.1.108:6379
   slots: (0 slots) master
   replicates 9add269cc27885bf129f43ef2c92fa090963ecce
M: 399e4195a145545dacc1aaa3daa65a0869d8688a 192.168.1.109:6379
   slots:10923-16383 (5461 slots) master
M: 45e813b92ce6cdec92757b2ea0f23412133dbfe2 192.168.1.110:6379
   slots: (0 slots) master
   replicates 399e4195a145545dacc1aaa3daa65a0869d8688a
M: e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40 192.168.1.104:6379
   slots:5461-10922 (5462 slots) master
M: 9add269cc27885bf129f43ef2c92fa090963ecce 192.168.1.89:6379
   slots:0-5460 (5461 slots) master
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.

You now have a running Redis cluster! The commands that show the current state of the cluster (that I know of so far) are CLUSTER INFO and CLUSTER NODES.

So now you have a cluster that looks something like this:

127.0.0.1:6379> cluster nodes
e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40 192.168.1.104:6379 master - 0 1400558545169 0 connected 5461-10922
39287c43c3efc6c93904afd460aa200ebd06b4de :0 myself,slave e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40 0 0 5 connected
b8d80cbac1bd6fe84b8d6b492823607dfd0d8552 192.168.1.108:6379 slave 9add269cc27885bf129f43ef2c92fa090963ecce 0 1400558544165 4 connected
9add269cc27885bf129f43ef2c92fa090963ecce 192.168.1.89:6379 master - 0 1400558546170 3 connected 0-5460
45e813b92ce6cdec92757b2ea0f23412133dbfe2 192.168.1.110:6379 slave 399e4195a145545dacc1aaa3daa65a0869d8688a 0 1400558543163 2 connected
399e4195a145545dacc1aaa3daa65a0869d8688a 192.168.1.109:6379 master - 0 1400558542160 1 connected 10923-16383
127.0.0.1:6379> cluster info
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:5
cluster_stats_messages_sent:592
cluster_stats_messages_received:592
127.0.0.1:6379>

The hash at the beginning of every line is the cluster node ID. The numbers at the end are important because they show where your slots reside. Slots are "shares" of keys that get divided across your nodes, and a node can own any number of them. When a new key is stored, it is hashed and mapped to a slot between 0 and 16383. The cluster then looks up which member is responsible for that slot and stores the key there.
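You can ask any node which slot a key maps to with CLUSTER KEYSLOT, which is handy for checking where a given key will land (the key name here is just an example):

# slot = CRC16(key) mod 16384; this prints the slot number for the key
redis-cli -p 6379 cluster keyslot user:1001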

Subsequent requests for the key (to any node) get redirected to the node that holds it. This is fast because every node keeps a TCP session open to every other node. It is still best to use a cluster-aware Redis library (like Predis) to cut down on redirects and get better query speeds.
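You can watch the redirect from plain redis-cli: without -c, asking the wrong node for a key returns a MOVED error pointing at the owner; with -c, redis-cli follows the redirect for you. A quick sketch (key and value are just examples):

# may answer with a MOVED error if this node doesn't own the slot
redis-cli -p 6379 set somekey somevalue
# -c puts redis-cli in cluster mode, so it follows MOVED/ASK redirects automatically
redis-cli -c -p 6379 set somekey somevalue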

A good command to know is the check command. This command verifies that all your nodes are online and every slot is accounted for. Here is what the command looks like. You can run it from any of your nodes:

[root@redis8 ~]# ./redis-trib.rb check 127.0.0.1:6379
Connecting to node 127.0.0.1:6379: OK
Connecting to node 192.168.1.104:6379: OK
Connecting to node 192.168.1.108:6379: OK
Connecting to node 192.168.1.89:6379: OK
Connecting to node 192.168.1.110:6379: OK
Connecting to node 192.168.1.109:6379: OK
>>> Performing Cluster Check (using node 127.0.0.1:6379)
S: 39287c43c3efc6c93904afd460aa200ebd06b4de 127.0.0.1:6379
   slots: (0 slots) slave
   replicates e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40
M: e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40 192.168.1.104:6379
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
S: b8d80cbac1bd6fe84b8d6b492823607dfd0d8552 192.168.1.108:6379
   slots: (0 slots) slave
   replicates 9add269cc27885bf129f43ef2c92fa090963ecce
M: 9add269cc27885bf129f43ef2c92fa090963ecce 192.168.1.89:6379
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
S: 45e813b92ce6cdec92757b2ea0f23412133dbfe2 192.168.1.110:6379
   slots: (0 slots) slave
   replicates 399e4195a145545dacc1aaa3daa65a0869d8688a
M: 399e4195a145545dacc1aaa3daa65a0869d8688a 192.168.1.109:6379
   slots:10923-16383 (5461 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.

If you ever have an issue with your cluster, try running a fix with redis-trib:

[root@redis8 ~]# ./redis-trib.rb fix 127.0.0.1:6379
Connecting to node 127.0.0.1:6379: OK
Connecting to node 192.168.1.104:6379: OK
Connecting to node 192.168.1.108:6379: OK
Connecting to node 192.168.1.89:6379: OK
Connecting to node 192.168.1.110:6379: OK
Connecting to node 192.168.1.109:6379: OK
>>> Performing Cluster Check (using node 127.0.0.1:6379)
S: 39287c43c3efc6c93904afd460aa200ebd06b4de 127.0.0.1:6379
   slots: (0 slots) slave
   replicates e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40
M: e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40 192.168.1.104:6379
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
S: b8d80cbac1bd6fe84b8d6b492823607dfd0d8552 192.168.1.108:6379
   slots: (0 slots) slave
   replicates 9add269cc27885bf129f43ef2c92fa090963ecce
M: 9add269cc27885bf129f43ef2c92fa090963ecce 192.168.1.89:6379
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
S: 45e813b92ce6cdec92757b2ea0f23412133dbfe2 192.168.1.110:6379
   slots: (0 slots) slave
   replicates 399e4195a145545dacc1aaa3daa65a0869d8688a
M: 399e4195a145545dacc1aaa3daa65a0869d8688a 192.168.1.109:6379
   slots:10923-16383 (5461 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.

Some day you might want to move slots to rebalance your load. That is called a reshard. Here is how you do that:

[root@redis8 ~]# ./redis-trib.rb reshard 127.0.0.1:6379
Connecting to node 127.0.0.1:6379: OK
Connecting to node 192.168.1.104:6379: OK
Connecting to node 192.168.1.108:6379: OK
Connecting to node 192.168.1.89:6379: OK
Connecting to node 192.168.1.110:6379: OK
Connecting to node 192.168.1.109:6379: OK
>>> Performing Cluster Check (using node 127.0.0.1:6379)
S: 39287c43c3efc6c93904afd460aa200ebd06b4de 127.0.0.1:6379
   slots: (0 slots) slave
   replicates e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40
M: e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40 192.168.1.104:6379
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
S: b8d80cbac1bd6fe84b8d6b492823607dfd0d8552 192.168.1.108:6379
   slots: (0 slots) slave
   replicates 9add269cc27885bf129f43ef2c92fa090963ecce
M: 9add269cc27885bf129f43ef2c92fa090963ecce 192.168.1.89:6379
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
S: 45e813b92ce6cdec92757b2ea0f23412133dbfe2 192.168.1.110:6379
   slots: (0 slots) slave
   replicates 399e4195a145545dacc1aaa3daa65a0869d8688a
M: 399e4195a145545dacc1aaa3daa65a0869d8688a 192.168.1.109:6379
   slots:10923-16383 (5461 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
How many slots do you want to move (from 1 to 16384)? 10
What is the receiving node ID? 399e4195a145545dacc1aaa3daa65a0869d8688a
Please enter all the source node IDs.
  Type 'all' to use all the nodes as source nodes for the hash slots.
  Type 'done' once you entered all the source nodes IDs.
Source node #1:all
Ready to move 10 slots.
  Source nodes:
    M: e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40 192.168.1.104:6379
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
    M: 9add269cc27885bf129f43ef2c92fa090963ecce 192.168.1.89:6379
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
  Destination node:
    M: 399e4195a145545dacc1aaa3daa65a0869d8688a 192.168.1.109:6379
   slots:10923-16383 (5461 slots) master
   1 additional replica(s)
  Resharding plan:
    Moving slot 5461 from e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40
    Moving slot 5462 from e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40
    Moving slot 5463 from e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40
    Moving slot 5464 from e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40
    Moving slot 5465 from e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40
    Moving slot 5466 from e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40
    Moving slot 0 from 9add269cc27885bf129f43ef2c92fa090963ecce
    Moving slot 1 from 9add269cc27885bf129f43ef2c92fa090963ecce
    Moving slot 2 from 9add269cc27885bf129f43ef2c92fa090963ecce
    Moving slot 3 from 9add269cc27885bf129f43ef2c92fa090963ecce
Do you want to proceed with the proposed reshard plan (yes/no)? yes
Moving slot 5461 from 192.168.1.104:6379 to 192.168.1.109:6379:
Moving slot 5462 from 192.168.1.104:6379 to 192.168.1.109:6379:
Moving slot 5463 from 192.168.1.104:6379 to 192.168.1.109:6379:
Moving slot 5464 from 192.168.1.104:6379 to 192.168.1.109:6379:
Moving slot 5465 from 192.168.1.104:6379 to 192.168.1.109:6379:
Moving slot 5466 from 192.168.1.104:6379 to 192.168.1.109:6379:
Moving slot 0 from 192.168.1.89:6379 to 192.168.1.109:6379:
Moving slot 1 from 192.168.1.89:6379 to 192.168.1.109:6379:
Moving slot 2 from 192.168.1.89:6379 to 192.168.1.109:6379:
Moving slot 3 from 192.168.1.89:6379 to 192.168.1.109:6379:

Another thing you’ll eventually have to do is remove a node. Pretty easy! That works in two phases. First, move all the slots off the node. Second, delete it from the cluster. Example:

[root@redis8 ~]# ./redis-trib.rb reshard 127.0.0.1:6379
Connecting to node 127.0.0.1:6379: OK
Connecting to node 192.168.1.104:6379: OK
Connecting to node 192.168.1.108:6379: OK
Connecting to node 192.168.1.89:6379: OK
Connecting to node 192.168.1.110:6379: OK
Connecting to node 192.168.1.109:6379: OK
>>> Performing Cluster Check (using node 127.0.0.1:6379)
S: 39287c43c3efc6c93904afd460aa200ebd06b4de 127.0.0.1:6379
   slots: (0 slots) slave
   replicates e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40
M: e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40 192.168.1.104:6379
   slots:5467-10922 (5456 slots) master
   1 additional replica(s)
S: b8d80cbac1bd6fe84b8d6b492823607dfd0d8552 192.168.1.108:6379
   slots: (0 slots) slave
   replicates 9add269cc27885bf129f43ef2c92fa090963ecce
M: 9add269cc27885bf129f43ef2c92fa090963ecce 192.168.1.89:6379
   slots:4-5460 (5457 slots) master
   1 additional replica(s)
S: 45e813b92ce6cdec92757b2ea0f23412133dbfe2 192.168.1.110:6379
   slots: (0 slots) slave
   replicates 399e4195a145545dacc1aaa3daa65a0869d8688a
M: 399e4195a145545dacc1aaa3daa65a0869d8688a 192.168.1.109:6379
   slots:0-3,5461-5466,10923-16383 (5471 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
How many slots do you want to move (from 1 to 16384)? 5457
What is the receiving node ID? 399e4195a145545dacc1aaa3daa65a0869d8688a
Please enter all the source node IDs.
  Type 'all' to use all the nodes as source nodes for the hash slots.
  Type 'done' once you entered all the source nodes IDs.
Source node #1:9add269cc27885bf129f43ef2c92fa090963ecce
Source node #2:done
Ready to move 5457 slots.
  Source nodes:
    M: 9add269cc27885bf129f43ef2c92fa090963ecce 192.168.1.89:6379
   slots:4-5460 (5457 slots) master
   1 additional replica(s)
  Destination node:
    M: 399e4195a145545dacc1aaa3daa65a0869d8688a 192.168.1.109:6379
   slots:0-3,5461-5466,10923-16383 (5471 slots) master
   1 additional replica(s)
  Resharding plan:
    Moving slot 4 from 9add269cc27885bf129f43ef2c92fa090963ecce
    Moving slot 5 from 9add269cc27885bf129f43ef2c92fa090963ecce
.... (lots of shard stuff here) ....
[root@redis8 ~]# ./redis-trib.rb del-node 127.0.0.1:6379 9add269cc27885bf129f43ef2c92fa090963ecce
>>> Removing node 9add269cc27885bf129f43ef2c92fa090963ecce from cluster 127.0.0.1:6379
Connecting to node 127.0.0.1:6379: OK
Connecting to node 192.168.1.104:6379: OK
Connecting to node 192.168.1.108:6379: OK
Connecting to node 192.168.1.89:6379: OK
Connecting to node 192.168.1.110:6379: OK
Connecting to node 192.168.1.109:6379: OK
>>> Sending CLUSTER FORGET messages to the cluster...
>>> 192.168.1.108:6379 as replica of 192.168.1.104:6379
>>> SHUTDOWN the node.
[root@redis8 ~]# redis-cli
127.0.0.1:6379> cluster nodes
e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40 192.168.1.104:6379 master - 0 1400559289088 0 connected 5467-10922
39287c43c3efc6c93904afd460aa200ebd06b4de :0 myself,slave e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40 0 0 5 connected
b8d80cbac1bd6fe84b8d6b492823607dfd0d8552 192.168.1.108:6379 slave e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40 0 1400559282071 4 connected
45e813b92ce6cdec92757b2ea0f23412133dbfe2 192.168.1.110:6379 slave 399e4195a145545dacc1aaa3daa65a0869d8688a 0 1400559288085 6 connected
399e4195a145545dacc1aaa3daa65a0869d8688a 192.168.1.109:6379 master - 0 1400559285078 6 connected 0-5466 10923-16383

Notice that cluster nodes now shows that the slave of the master we removed has been re-tasked to replicate a different server. Let's get rid of that extra slave for fun.

[root@redis8 ~]# ./redis-trib.rb del-node 127.0.0.1:6379 39287c43c3efc6c93904afd460aa200ebd06b4de
>>> Removing node 39287c43c3efc6c93904afd460aa200ebd06b4de from cluster 127.0.0.1:6379
Connecting to node 127.0.0.1:6379: OK
Connecting to node 192.168.1.104:6379: OK
Connecting to node 192.168.1.108:6379: OK
Connecting to node 192.168.1.110:6379: OK
Connecting to node 192.168.1.109:6379: OK
>>> Sending CLUSTER FORGET messages to the cluster...
>>> SHUTDOWN the node.
127.0.0.1:6379> cluster nodes
399e4195a145545dacc1aaa3daa65a0869d8688a 192.168.1.109:6379 master - 0 1400559428764 6 connected 0-5466 10923-16383
b8d80cbac1bd6fe84b8d6b492823607dfd0d8552 :0 myself,slave e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40 0 0 4 connected
e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40 192.168.1.104:6379 master - 0 1400559429767 0 connected 5467-10922
45e813b92ce6cdec92757b2ea0f23412133dbfe2 192.168.1.110:6379 slave 399e4195a145545dacc1aaa3daa65a0869d8688a 0 1400559427762 6 connected

Now, let's add our master back into the cluster. All of this is being done with minimal degradation to regular traffic. Notice that the existing node is specified THIRD in this case; for most other redis-trib.rb commands, the existing node is the SECOND argument.

[root@redis9 ~]# ./redis-trib.rb add-node 192.168.1.107:6379 127.0.0.1:6379
>>> Adding node 192.168.1.107:6379 to cluster 127.0.0.1:6379
Connecting to node 127.0.0.1:6379: OK
Connecting to node 192.168.1.109:6379: OK
Connecting to node 192.168.1.104:6379: OK
Connecting to node 192.168.1.110:6379: OK
>>> Performing Cluster Check (using node 127.0.0.1:6379)
S: b8d80cbac1bd6fe84b8d6b492823607dfd0d8552 127.0.0.1:6379
   slots: (0 slots) slave
   replicates e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40
M: 399e4195a145545dacc1aaa3daa65a0869d8688a 192.168.1.109:6379
   slots:0-5466,10923-16383 (10928 slots) master
   1 additional replica(s)
M: e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40 192.168.1.104:6379
   slots:5467-10922 (5456 slots) master
   1 additional replica(s)
S: 45e813b92ce6cdec92757b2ea0f23412133dbfe2 192.168.1.110:6379
   slots: (0 slots) slave
   replicates 399e4195a145545dacc1aaa3daa65a0869d8688a
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
Connecting to node 192.168.1.107:6379: OK
>>> Send CLUSTER MEET to node 192.168.1.107:6379 to make it join the cluster.
[OK] New node added correctly.

For some reason, my cluster didn't ACTUALLY join here. Going to the new node and issuing CLUSTER INFO showed the cluster status as 'fail'. None of the other nodes knew about it, either. I was able to recover from the broken state by connecting to the new node and issuing the following:

[root@redis8 ~]# redis-cli
127.0.0.1:6379> cluster meet 192.168.1.109 6379
OK
127.0.0.1:6379> cluster info
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:5
cluster_size:2
cluster_current_epoch:7
cluster_stats_messages_sent:13
cluster_stats_messages_received:13
127.0.0.1:6379> cluster nodes
45e813b92ce6cdec92757b2ea0f23412133dbfe2 192.168.1.110:6379 slave 399e4195a145545dacc1aaa3daa65a0869d8688a 0 1400560196589 6 connected
b8d80cbac1bd6fe84b8d6b492823607dfd0d8552 192.168.1.108:6379 slave e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40 0 1400560193637 0 connected
4bc676c1fde0c1b83a3a7850abddda80c601a183 127.0.0.1:6379 myself,master - 0 0 7 connected
399e4195a145545dacc1aaa3daa65a0869d8688a 192.168.1.109:6379 master - 0 1400560195587 6 connected 0-5466 10923-16383
e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40 192.168.1.104:6379 master - 0 1400560194642 0 connected 5467-10922

So now we have five nodes, but our new node isn't hosting any slots. Let's give it some (the same reshard procedure we used earlier):

[root@redis8 ~]# ./redis-trib.rb reshard 127.0.0.1:6379
Connecting to node 127.0.0.1:6379: OK
Connecting to node 192.168.1.110:6379: OK
Connecting to node 192.168.1.108:6379: OK
Connecting to node 192.168.1.109:6379: OK
Connecting to node 192.168.1.104:6379: OK
>>> Performing Cluster Check (using node 127.0.0.1:6379)
M: 4bc676c1fde0c1b83a3a7850abddda80c601a183 127.0.0.1:6379
   slots: (0 slots) master
   0 additional replica(s)
S: 45e813b92ce6cdec92757b2ea0f23412133dbfe2 192.168.1.110:6379
   slots: (0 slots) slave
   replicates 399e4195a145545dacc1aaa3daa65a0869d8688a
S: b8d80cbac1bd6fe84b8d6b492823607dfd0d8552 192.168.1.108:6379
   slots: (0 slots) slave
   replicates e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40
M: 399e4195a145545dacc1aaa3daa65a0869d8688a 192.168.1.109:6379
   slots:0-5466,10923-16383 (10928 slots) master
   1 additional replica(s)
M: e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40 192.168.1.104:6379
   slots:5467-10922 (5456 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
How many slots do you want to move (from 1 to 16384)? 5467
What is the receiving node ID? 4bc676c1fde0c1b83a3a7850abddda80c601a183
Please enter all the source node IDs.
  Type 'all' to use all the nodes as source nodes for the hash slots.
  Type 'done' once you entered all the source nodes IDs.
Source node #1:all
Ready to move 5467 slots.
  Source nodes:
    M: 399e4195a145545dacc1aaa3daa65a0869d8688a 192.168.1.109:6379
   slots:0-5466,10923-16383 (10928 slots) master
   1 additional replica(s)
    M: e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40 192.168.1.104:6379
   slots:5467-10922 (5456 slots) master
   1 additional replica(s)
  Destination node:
    M: 4bc676c1fde0c1b83a3a7850abddda80c601a183 127.0.0.1:6379
   slots: (0 slots) master
   0 additional replica(s)
  Resharding plan:
    Moving slot 0 from 399e4195a145545dacc1aaa3daa65a0869d8688a
    Moving slot 1 from 399e4195a145545dacc1aaa3daa65a0869d8688a
    Moving slot 2 from 399e4195a145545dacc1aaa3daa65a0869d8688a
... (lots of stuff here)...
Moving slot 7284 from 192.168.1.104:6379 to 127.0.0.1:6379:
Moving slot 7285 from 192.168.1.104:6379 to 127.0.0.1:6379:
Moving slot 7286 from 192.168.1.104:6379 to 127.0.0.1:6379:
[root@redis8 ~]# redis-cli
127.0.0.1:6379> cluster nodes
45e813b92ce6cdec92757b2ea0f23412133dbfe2 192.168.1.110:6379 slave 399e4195a145545dacc1aaa3daa65a0869d8688a 0 1400560399076 6 connected
b8d80cbac1bd6fe84b8d6b492823607dfd0d8552 192.168.1.108:6379 slave e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40 0 1400560395066 0 connected
4bc676c1fde0c1b83a3a7850abddda80c601a183 127.0.0.1:6379 myself,master - 0 0 7 connected 0-3646 5467-7286
399e4195a145545dacc1aaa3daa65a0869d8688a 192.168.1.109:6379 master - 0 1400560397072 6 connected 3647-5466 10923-16383
e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40 192.168.1.104:6379 master - 0 1400560398074 0 connected 7287-10922

Great! But what about the slave that this master used to have? Let's add that back into the cluster. If we simply issue the command, redis-trib will find the masters with the fewest slaves, pick one at random, and set up the new slave under it. Notice that, again, the add-node command requires you to specify the NEW node SECOND and the EXISTING node THIRD. This must be run from an existing node in the cluster.

[root@redis12 ~]# ./redis-trib.rb add-node --slave 192.168.1.89:6379 127.0.0.1:6379
>>> Adding node 192.168.1.89:6379 to cluster 127.0.0.1:6379
Connecting to node 127.0.0.1:6379: OK
Connecting to node 192.168.1.107:6379: OK
Connecting to node 192.168.1.110:6379: OK
Connecting to node 192.168.1.109:6379: OK
Connecting to node 192.168.1.108:6379: OK
>>> Performing Cluster Check (using node 127.0.0.1:6379)
M: e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40 127.0.0.1:6379
   slots:7287-10922 (3636 slots) master
   1 additional replica(s)
M: 4bc676c1fde0c1b83a3a7850abddda80c601a183 192.168.1.107:6379
   slots:0-3646,5467-7286 (5467 slots) master
   0 additional replica(s)
S: 45e813b92ce6cdec92757b2ea0f23412133dbfe2 192.168.1.110:6379
   slots: (0 slots) slave
   replicates 399e4195a145545dacc1aaa3daa65a0869d8688a
M: 399e4195a145545dacc1aaa3daa65a0869d8688a 192.168.1.109:6379
   slots:3647-5466,10923-16383 (7281 slots) master
   1 additional replica(s)
S: b8d80cbac1bd6fe84b8d6b492823607dfd0d8552 192.168.1.108:6379
   slots: (0 slots) slave
   replicates e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
Automatically selected master 192.168.1.107:6379
Connecting to node 192.168.1.89:6379: OK
>>> Send CLUSTER MEET to node 192.168.1.89:6379 to make it join the cluster.
Waiting for the cluster to join...
>>> Configure node as replica of 192.168.1.107:6379.
[OK] New node added correctly.

Sometimes this command gets stuck waiting for the node to join. To unstick it, you can connect to the node being added and manually give it the cluster meet command like below. That will let the first command finish up successfully.

[root@redis13 ~]# redis-cli
127.0.0.1:6379> CLUSTER MEET 192.168.1.104 6379
OK

Run the usual commands to verify the node was added as a slave into the cluster:

[root@redis13 ~]# redis-cli
127.0.0.1:6379> cluster info
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:8
cluster_stats_messages_sent:259
cluster_stats_messages_received:259
127.0.0.1:6379> cluster nodes
e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40 192.168.1.104:6379 master - 0 1400561937685 0 connected 7287-10922
45e813b92ce6cdec92757b2ea0f23412133dbfe2 192.168.1.110:6379 slave 399e4195a145545dacc1aaa3daa65a0869d8688a 0 1400561936681 6 connected
b8d80cbac1bd6fe84b8d6b492823607dfd0d8552 192.168.1.108:6379 slave e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40 0 1400561938687 0 connected
4bc676c1fde0c1b83a3a7850abddda80c601a183 192.168.1.107:6379 master - 0 1400561939689 7 connected 0-3646 5467-7286
399e4195a145545dacc1aaa3daa65a0869d8688a 192.168.1.109:6379 master - 0 1400561933671 6 connected 3647-5466 10923-16383
6d05e673926dba35be5ed2a00847596369527853 127.0.0.1:6379 myself,slave 4bc676c1fde0c1b83a3a7850abddda80c601a183 0 0 8 connected
[root@redis13 ~]# ./redis-trib.rb check 127.0.0.1:6379
Connecting to node 127.0.0.1:6379: OK
Connecting to node 192.168.1.104:6379: OK
Connecting to node 192.168.1.110:6379: OK
Connecting to node 192.168.1.108:6379: OK
Connecting to node 192.168.1.107:6379: OK
Connecting to node 192.168.1.109:6379: OK
>>> Performing Cluster Check (using node 127.0.0.1:6379)
S: 6d05e673926dba35be5ed2a00847596369527853 127.0.0.1:6379
   slots: (0 slots) slave
   replicates 4bc676c1fde0c1b83a3a7850abddda80c601a183
M: e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40 192.168.1.104:6379
   slots:7287-10922 (3636 slots) master
   1 additional replica(s)
S: 45e813b92ce6cdec92757b2ea0f23412133dbfe2 192.168.1.110:6379
   slots: (0 slots) slave
   replicates 399e4195a145545dacc1aaa3daa65a0869d8688a
S: b8d80cbac1bd6fe84b8d6b492823607dfd0d8552 192.168.1.108:6379
   slots: (0 slots) slave
   replicates e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40
M: 4bc676c1fde0c1b83a3a7850abddda80c601a183 192.168.1.107:6379
   slots:0-3646,5467-7286 (5467 slots) master
   1 additional replica(s)
M: 399e4195a145545dacc1aaa3daa65a0869d8688a 192.168.1.109:6379
   slots:3647-5466,10923-16383 (7281 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.

Okay! Now let's test failover. There is a CLUSTER FAILOVER command that does this. Open redis-cli on a slave in the cluster and issue the command to see it happen. Notice that the previous master is now a slave!

[root@redis13 ~]# redis-cli
127.0.0.1:6379> CLUSTER FAILOVER
OK
127.0.0.1:6379> cluster nodes
e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40 192.168.1.104:6379 master - 0 1400562197385 0 connected 7287-10922
45e813b92ce6cdec92757b2ea0f23412133dbfe2 192.168.1.110:6379 slave 399e4195a145545dacc1aaa3daa65a0869d8688a 0 1400562195379 6 connected
b8d80cbac1bd6fe84b8d6b492823607dfd0d8552 192.168.1.108:6379 slave e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40 0 1400562196382 0 connected
4bc676c1fde0c1b83a3a7850abddda80c601a183 192.168.1.107:6379 slave 6d05e673926dba35be5ed2a00847596369527853 0 1400562194375 9 connected
399e4195a145545dacc1aaa3daa65a0869d8688a 192.168.1.109:6379 master - 0 1400562198386 6 connected 3647-5466 10923-16383
6d05e673926dba35be5ed2a00847596369527853 127.0.0.1:6379 myself,master - 0 0 9 connected 0-3646 5467-7286

Let's try a truly violent failover scenario. From a master, enter DEBUG SEGFAULT to cause a segfault. This will crash the master and the slave will take over in its place!

127.0.0.1:6379> DEBUG SEGFAULT
Could not connect to Redis at 127.0.0.1:6379: Connection refused
(0.74s)
not connected>

Connect to a node that is still online to see status:

[root@redis12 ~]# ./redis-trib.rb check 127.0.0.1:6379
Connecting to node 127.0.0.1:6379: OK
Connecting to node 192.168.1.107:6379: OK
Connecting to node 192.168.1.110:6379: OK
Connecting to node 192.168.1.109:6379: OK
Connecting to node 192.168.1.108:6379: OK
>>> Performing Cluster Check (using node 127.0.0.1:6379)
M: e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40 127.0.0.1:6379
   slots:7287-10922 (3636 slots) master
   1 additional replica(s)
M: 4bc676c1fde0c1b83a3a7850abddda80c601a183 192.168.1.107:6379
   slots:0-3646,5467-7286 (5467 slots) master
   0 additional replica(s)
S: 45e813b92ce6cdec92757b2ea0f23412133dbfe2 192.168.1.110:6379
   slots: (0 slots) slave
   replicates 399e4195a145545dacc1aaa3daa65a0869d8688a
M: 399e4195a145545dacc1aaa3daa65a0869d8688a 192.168.1.109:6379
   slots:3647-5466,10923-16383 (7281 slots) master
   1 additional replica(s)
S: b8d80cbac1bd6fe84b8d6b492823607dfd0d8552 192.168.1.108:6379
   slots: (0 slots) slave
   replicates e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.

Looks good! After a node crashes out of the cluster, it forgets about the cluster it was part of. I am assuming this is because re-syncing back into a cluster is effectively the same as rejoining. We know how to rejoin our node (as a slave, of course). Let's give it a whirl from a node still in the cluster:

[root@redis12 ~]# ./redis-trib.rb add-node --slave 192.168.1.89:6379 127.0.0.1:6379
>>> Adding node 192.168.1.89:6379 to cluster 127.0.0.1:6379
Connecting to node 127.0.0.1:6379: OK
Connecting to node 192.168.1.107:6379: OK
Connecting to node 192.168.1.110:6379: OK
Connecting to node 192.168.1.109:6379: OK
Connecting to node 192.168.1.108:6379: OK
>>> Performing Cluster Check (using node 127.0.0.1:6379)
M: e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40 127.0.0.1:6379
   slots:7287-10922 (3636 slots) master
   1 additional replica(s)
M: 4bc676c1fde0c1b83a3a7850abddda80c601a183 192.168.1.107:6379
   slots:0-3646,5467-7286 (5467 slots) master
   0 additional replica(s)
S: 45e813b92ce6cdec92757b2ea0f23412133dbfe2 192.168.1.110:6379
   slots: (0 slots) slave
   replicates 399e4195a145545dacc1aaa3daa65a0869d8688a
M: 399e4195a145545dacc1aaa3daa65a0869d8688a 192.168.1.109:6379
   slots:3647-5466,10923-16383 (7281 slots) master
   1 additional replica(s)
S: b8d80cbac1bd6fe84b8d6b492823607dfd0d8552 192.168.1.108:6379
   slots: (0 slots) slave
   replicates e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
Automatically selected master 192.168.1.107:6379
Connecting to node 192.168.1.89:6379: OK
>>> Send CLUSTER MEET to node 192.168.1.89:6379 to make it join the cluster.
Waiting for the cluster to join....................................................
>>> Configure node as replica of 192.168.1.107:6379.
[OK] New node added correctly.

Note that I DID have to go to the node I was adding and manually issue the CLUSTER MEET command to get the cluster add-node command to finish running.

[root@redis13 ~]# redis-cli
127.0.0.1:6379> CLUSTER MEET 192.168.1.104 6379
OK

Finally, let's see if our cluster is healthy after this last violent failover test.

127.0.0.1:6379> cluster nodes
45e813b92ce6cdec92757b2ea0f23412133dbfe2 192.168.1.110:6379 slave 399e4195a145545dacc1aaa3daa65a0869d8688a 0 1400562781567 6 connected
399e4195a145545dacc1aaa3daa65a0869d8688a 192.168.1.109:6379 master - 0 1400562782569 6 connected 3647-5466 10923-16383
b8d80cbac1bd6fe84b8d6b492823607dfd0d8552 192.168.1.108:6379 slave e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40 0 1400562785576 0 connected
328614ee3c0ece8899d0273a9530e409e81b1c0f 127.0.0.1:6379 myself,slave 4bc676c1fde0c1b83a3a7850abddda80c601a183 0 0 11 connected
4bc676c1fde0c1b83a3a7850abddda80c601a183 192.168.1.107:6379 master - 0 1400562786579 10 connected 0-3646 5467-7286
e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40 192.168.1.104:6379 master - 0 1400562784573 0 connected 7287-10922
127.0.0.1:6379> cluster info
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:11
cluster_stats_messages_sent:226
cluster_stats_messages_received:226
[root@redis13 ~]# ./redis-trib.rb check 127.0.0.1:6379
Connecting to node 127.0.0.1:6379: OK
Connecting to node 192.168.1.110:6379: OK
Connecting to node 192.168.1.109:6379: OK
Connecting to node 192.168.1.108:6379: OK
Connecting to node 192.168.1.107:6379: OK
Connecting to node 192.168.1.104:6379: OK
>>> Performing Cluster Check (using node 127.0.0.1:6379)
S: 328614ee3c0ece8899d0273a9530e409e81b1c0f 127.0.0.1:6379
   slots: (0 slots) slave
   replicates 4bc676c1fde0c1b83a3a7850abddda80c601a183
S: 45e813b92ce6cdec92757b2ea0f23412133dbfe2 192.168.1.110:6379
   slots: (0 slots) slave
   replicates 399e4195a145545dacc1aaa3daa65a0869d8688a
M: 399e4195a145545dacc1aaa3daa65a0869d8688a 192.168.1.109:6379
   slots:3647-5466,10923-16383 (7281 slots) master
   1 additional replica(s)
S: b8d80cbac1bd6fe84b8d6b492823607dfd0d8552 192.168.1.108:6379
   slots: (0 slots) slave
   replicates e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40
M: 4bc676c1fde0c1b83a3a7850abddda80c601a183 192.168.1.107:6379
   slots:0-3646,5467-7286 (5467 slots) master
   1 additional replica(s)
M: e98c99b8bc7e53b71d0f7b78d7aba1dce8e27d40 192.168.1.104:6379
   slots:7287-10922 (3636 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.

It’s good!

We have covered adding nodes, removing nodes, resharding, failover and fault tolerance. Have any questions? Let me know in the comments!
