- Very fast (110000 SETs/sec, 81000 GETs/sec), in-RAM, highly available key-value data store with persistence and (relaxed) consistency.
- Widely used: Twitter, GitHub, Weibo, Pinterest, Snapchat, Craigslist, Digg, StackOverflow, Flickr, and more.
- Key: string, Value: can be a string, list, set, sorted set or hash
- Documentation
Using Redis
- In lab
# ssh into the VM 3 times (one terminal each), then
docker run --name some-redis -d redis redis-server --appendonly yes # single server
docker run -it --link some-redis:redis --rm redis redis-cli -h redis -p 6379 # client1: container running cli to server
docker run -it --link some-redis:redis --rm redis redis-cli -h redis -p 6379 # client2: a second cli container
docker container ls
- Run through the redis tutorial.
SET server:name "fido"
GET server:name
SET connections 10
# set if not exists
SETNX connections "sid"
# Non-atomic increment (GET then SET): both clients read 10 and both write 11, so one update is lost
client1: GET connections
client2: GET connections
client1: SET connections 11
client2: SET connections 11
# Atomic operations
INCR connections
INCR connections
DEL connections
INCR connections
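The difference between the GET/SET interleaving and INCR above can be sketched in plain Python. This is only a model: `TinyStore` is a hypothetical in-memory stand-in, not a Redis client, and the interleaving of the two clients is written out by hand.

```python
class TinyStore:
    """Hypothetical in-memory stand-in for a Redis server (not a real client)."""

    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value):
        self.data[key] = str(value)

    def incr(self, key):
        # Read-modify-write happens as one step; a missing key counts as 0,
        # which is why INCR after DEL yields 1.
        value = int(self.data.get(key, "0")) + 1
        self.data[key] = str(value)
        return value

    def delete(self, key):
        self.data.pop(key, None)


# Non-atomic: both clients read 10 before either of them writes.
racy = TinyStore()
racy.set("connections", 10)
seen_by_client1 = int(racy.get("connections"))
seen_by_client2 = int(racy.get("connections"))
racy.set("connections", seen_by_client1 + 1)
racy.set("connections", seen_by_client2 + 1)
lost_update_result = int(racy.get("connections"))  # 11, not 12

# Atomic: each INCR is applied in full before the next one starts.
safe = TinyStore()
safe.set("connections", 10)
safe.incr("connections")
atomic_result = safe.incr("connections")  # 12

safe.delete("connections")
after_delete = safe.incr("connections")  # 1
```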
# Keys with limited lifespan
SET resource:lock "Redis Demo"
EXPIRE resource:lock 120
TTL resource:lock
TTL resource:lock
TTL resource:lock
TTL resource:lock
SET resource:lock "Redis Demo 1"
# TTL = -1 => infinite lifespan
# TTL = -2 => expired/does not exist
EXPIRE resource:lock 120
TTL resource:lock
SET resource:lock "Redis Demo 2"
TTL resource:lock
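A minimal sketch of the TTL rules exercised above, assuming a hypothetical `ExpiringStore` class with the clock passed in explicitly (so the example does not need to sleep). It models three things: the countdown, -2 for a missing key, and a plain SET discarding the old expiry.

```python
class ExpiringStore:
    """Hypothetical model of key expiry; `now` is a time in seconds."""

    def __init__(self):
        self.values = {}
        self.deadlines = {}  # key -> absolute expiry time

    def _purge(self, key, now):
        # Drop the key if its deadline has passed.
        if key in self.deadlines and now >= self.deadlines[key]:
            self.values.pop(key, None)
            self.deadlines.pop(key, None)

    def set(self, key, value, now):
        self._purge(key, now)
        self.values[key] = value
        self.deadlines.pop(key, None)  # a plain SET discards any old TTL

    def expire(self, key, seconds, now):
        self._purge(key, now)
        if key not in self.values:
            return 0
        self.deadlines[key] = now + seconds
        return 1

    def ttl(self, key, now):
        self._purge(key, now)
        if key not in self.values:
            return -2  # expired or never existed
        if key not in self.deadlines:
            return -1  # exists, no expiry set
        return self.deadlines[key] - now


store = ExpiringStore()
store.set("resource:lock", "Redis Demo", now=0)
store.expire("resource:lock", 120, now=0)
ttl_after_7s = store.ttl("resource:lock", now=7)          # 113
ttl_after_expiry = store.ttl("resource:lock", now=121)    # -2
store.set("resource:lock", "Redis Demo 1", now=122)
ttl_no_expiry = store.ttl("resource:lock", now=122)       # -1
store.expire("resource:lock", 120, now=122)
store.set("resource:lock", "Redis Demo 2", now=130)
ttl_after_overwrite = store.ttl("resource:lock", now=130)  # -1
```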
# Redis datastructures: lists
# RPUSH, LPUSH, LLEN, LRANGE, LPOP, and RPOP.
RPUSH friends "Alice"
RPUSH friends "Bob"
LPUSH friends "Sam"
LLEN friends
# similar to Python's [a:b], except that b is included.
LRANGE friends 0 -1
LRANGE friends 0 1
LRANGE friends 1 2
LPOP friends
RPOP friends
RPOP friends
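The inclusive-range rule can be compared with Python slicing directly. The helper `lrange` below is a hypothetical re-implementation for illustration, assuming the usual rule that negative indexes count from the end of the list.

```python
def lrange(items, start, stop):
    """Return items[start..stop] with BOTH ends included, LRANGE-style."""
    n = len(items)
    if start < 0:
        start = max(n + start, 0)
    if stop < 0:
        stop = n + stop
    if start > stop or start >= n:
        return []
    stop = min(stop, n - 1)
    return items[start:stop + 1]


# State after RPUSH Alice, RPUSH Bob, LPUSH Sam.
friends = ["Sam", "Alice", "Bob"]

everything = lrange(friends, 0, -1)  # ["Sam", "Alice", "Bob"]
first_two = lrange(friends, 0, 1)    # ["Sam", "Alice"]; friends[0:1] would give one item
last_two = lrange(friends, 1, 2)     # ["Alice", "Bob"]
```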
# Set
# SADD, SREM, SISMEMBER, SMEMBERS, SUNION
SADD superpowers "flight"
SADD superpowers "x-ray vision"
SADD superpowers "reflexes"
SREM superpowers "reflexes"
# value in set
SISMEMBER superpowers "flight"
SISMEMBER superpowers "reflexes"
SMEMBERS superpowers
SADD birdpowers "pecking"
SADD birdpowers "flight"
SUNION superpowers birdpowers
# Sorted sets: Each member has a score
ZADD hackers 1940 "Alan Kay"
ZADD hackers 1906 "Grace Hopper"
ZADD hackers 1953 "Richard Stallman"
ZADD hackers 1965 "Yukihiro Matsumoto"
ZADD hackers 1916 "Claude Shannon"
ZADD hackers 1969 "Linus Torvalds"
ZADD hackers 1957 "Sophie Wilson"
ZADD hackers 1912 "Alan Turing"
ZRANGE hackers 2 4
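To see what ZRANGE returns here, the sorted set can be modelled as a Python dict ordered by score. `zrange` is a hypothetical helper that only handles non-negative indexes; ties are broken by member name.

```python
hackers = {
    "Alan Kay": 1940,
    "Grace Hopper": 1906,
    "Richard Stallman": 1953,
    "Yukihiro Matsumoto": 1965,
    "Claude Shannon": 1916,
    "Linus Torvalds": 1969,
    "Sophie Wilson": 1957,
    "Alan Turing": 1912,
}


def zrange(scores, start, stop):
    """Members ordered by score, positions start..stop inclusive (non-negative only)."""
    ordered = sorted(scores, key=lambda member: (scores[member], member))
    return ordered[start:stop + 1]


third_to_fifth = zrange(hackers, 2, 4)
# ["Claude Shannon", "Alan Kay", "Richard Stallman"]
```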
# Hashes: like a Python dictionary
HSET user:1000 name "John Smith"
HSET user:1000 email "john.smith@example.com"
HSET user:1000 password "s3cret"
HGETALL user:1000
HMSET user:1001 name "Mary Jones" password "hidden" email "mjones@example.com"
HGET user:1001 name
# Numeric values in hashes have INC/DEC operations...
HSET user:1000 visits 10
HINCRBY user:1000 visits 1
HINCRBY user:1000 visits 10
HDEL user:1000 visits
HINCRBY user:1000 visits 1
- Atomic operations like INCR, and transactions, but both are limited in clustered environments (multi-key operations need all keys in the same hash slot).
- Also has PUB/SUB
client1: subscribe csc409 csc207
client2: publish csc409 "Welcome to csc409"
client1: unsubscribe
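A rough model of the PUB/SUB exchange above, using a hypothetical `TinyBroker` class in place of the server. It assumes messages are fire-and-forget: a client only receives what is published while it is subscribed.

```python
from collections import defaultdict


class TinyBroker:
    """Hypothetical in-memory stand-in for the server side of PUB/SUB."""

    def __init__(self):
        self.subscribers = defaultdict(set)  # channel -> client names
        self.inboxes = defaultdict(list)     # client name -> received messages

    def subscribe(self, client, *channels):
        for channel in channels:
            self.subscribers[channel].add(client)

    def unsubscribe(self, client):
        # With no channel given, leave every channel.
        for members in self.subscribers.values():
            members.discard(client)

    def publish(self, channel, message):
        receivers = self.subscribers.get(channel, set())
        for client in receivers:
            self.inboxes[client].append((channel, message))
        return len(receivers)  # number of clients that got the message


broker = TinyBroker()
missed = broker.publish("csc409", "too early")             # nobody listening yet
broker.subscribe("client1", "csc409", "csc207")
delivered = broker.publish("csc409", "Welcome to csc409")  # reaches client1
broker.unsubscribe("client1")
after_unsubscribe = broker.publish("csc409", "anyone there?")
client1_inbox = broker.inboxes["client1"]
```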
Reminders...
# Run redis server
docker run --name some-redis -d redis redis-server --appendonly yes # single server
# Run another redis container executing redis-cli in it, connecting our first container
docker run -it --link some-redis:redis --rm redis redis-cli -h redis -p 6379
docker stop CONTAINER_ID
docker start CONTAINER_ID
docker exec -it CONTAINER_ID /bin/bash # shell inside container
redis-benchmark
Redis vs Postgresql
They are for very different purposes!!
- Postgresql: Store on disk => slow, cheap, large storage
- Redis: Store in RAM => fast, expensive, small storage
- Postgresql: strong properties: ACID; replicate/scale carefully (replication => reduced performance in many cases)
- Redis: weaker properties, very scalable (more hosts => better performance), high availability, some atomic operations
- Postgresql: Very general purpose tool, lots of use cases
- Redis: More limited use cases, typically a high speed cache or message queue.
- Big difference: redis = fast + small data + high availability. A way to put lots of RAM to use.
- Main competitors are things like memcached, MongoDB.
Redis Architecture: Single Server
- Stop the server => lose everything, so redis adds persistence
- RDB: Occasionally, or on demand (SAVE command), store current state of memory in a compact binary format on disk.
- AOF: Continually log all commands executed by this server.
docker container ls # looking for some-redis
docker exec -it CONTAINER_ID /bin/bash # shell inside container
bash: redis-benchmark # executable (not redis command)
redis-cli: SAVE # redis command, then take a look at the db file (or use BGSAVE)
bash: ls -al
RDB Pros and Cons
- + RDB compact single-file, good for versioning state of system.
- + Disaster recovery: Quickly store small .rdb offsite.
- + Good performance: Occasionally redis forks a child process
to store state of memory. Parent does no disk I/O.
- + Disaster recovery: Faster restarts than with AOF.
- - lose all data written since the last snapshot
- - fork may be slow on large datasets impacting performance of server
AOF Pros and Cons
- + better durability, tunable (via fsync). Note cost of fsync.
- + AOF log is append only, so potentially lose only last command
- - AOF larger than RDB, but has a rewrite/compress option.
- - AOF slower than RDB (depending on fsync policy)
- - AOF slower on restarts than RDB
Advice: Use both.
RDB snapshotting algorithm
fork # so takes advantage of Copy On Write
parent: continues to serve requests
child: write RAM to tmp.rdb
child: mv tmp.rdb dump.rdb
child: terminates
# on redis restart, load dump.rdb
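The "write to tmp, then mv" step above can be sketched in Python. This is a sketch under assumptions: JSON stands in for the compact binary RDB format, there is no fork (the state is simply written by the same process), and `save_snapshot` / `load_snapshot` are hypothetical names.

```python
import json
import os
import tempfile


def save_snapshot(state, directory, name="dump.rdb"):
    """Write the whole state to a temp file, then rename it over the old dump."""
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as tmp:
        json.dump(state, tmp)
        tmp.flush()
        os.fsync(tmp.fileno())  # make sure the bytes are on disk before the rename
    final_path = os.path.join(directory, name)
    # The rename is atomic: readers see the old dump or the new one, never half a file.
    os.replace(tmp_path, final_path)
    return final_path


def load_snapshot(directory, name="dump.rdb"):
    """On restart, rebuild memory from the last dump."""
    with open(os.path.join(directory, name)) as dump:
        return json.load(dump)


with tempfile.TemporaryDirectory() as directory:
    memory = {"server:name": "fido", "connections": "10"}
    save_snapshot(memory, directory)
    memory["connections"] = "11"  # written after the snapshot, so not in the dump
    restored = load_snapshot(directory)
```

Here `restored` still holds connections = "10", which is the "lose all data since last snapshot" drawback listed under RDB cons.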
AOF algorithm
write to log whenever a command modifies a key
# on redis restart, run the log
# How durable? Depends on how you configure fsync
# fsync on each command/every second/never fsync tradeoffs
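A minimal sketch of append-then-replay, assuming JSON lines as a stand-in for the real AOF format and only three commands (SET, INCR, DEL). The function names and the `fsync_policy` parameter are hypothetical; only the "always" policy forces the write to disk here, any other value leaves flushing to the OS.

```python
import json
import os
import tempfile


def append_command(log_file, command, fsync_policy="always"):
    """Log one state-changing command; fsync per command if policy is 'always'."""
    log_file.write(json.dumps(command) + "\n")
    if fsync_policy == "always":
        log_file.flush()
        os.fsync(log_file.fileno())  # most durable, and the slowest option


def apply_command(state, command):
    name, key, *args = command
    if name == "SET":
        state[key] = args[0]
    elif name == "INCR":
        state[key] = str(int(state.get(key, "0")) + 1)
    elif name == "DEL":
        state.pop(key, None)


def replay(path):
    """On restart, rebuild memory by running the log from the top."""
    state = {}
    with open(path) as log_file:
        for line in log_file:
            apply_command(state, json.loads(line))
    return state


commands = [
    ["SET", "connections", "10"],
    ["INCR", "connections"],
    ["INCR", "connections"],
    ["SET", "server:name", "fido"],
    ["DEL", "server:name"],
]

with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "appendonly.aof")
    with open(path, "a") as log_file:
        for command in commands:
            append_command(log_file, command)
    recovered = replay(path)
```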
AOF rewriting
To keep AOF small, create a shorter version of the
log with the same end RAM state
ex: increment key 1000 times vs increment key by 1000
Algorithm
redis-client: BGREWRITEAOF # redis command
fork rewriter job
parent: continues serving requests; new commands still go to the old log and are also accumulated in RAM
child: rewrites current.aof to tmp.aof
child: mv tmp.aof current.aof
child: tell parent "DONE"
child: terminates
parent: logs accumulated backlog to current.aof
parent: continues as in AOF algorithm above
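The idea of the rewrite can be sketched on in-memory lists: replay the old log to get the end state, emit one SET per key, then append whatever arrived while the rewrite was running. The helpers below are hypothetical and only know SET and INCR; no fork or files are involved.

```python
def apply_command(state, command):
    name, key, *args = command
    if name == "SET":
        state[key] = args[0]
    elif name == "INCR":
        state[key] = str(int(state.get(key, "0")) + 1)


def replay(log):
    state = {}
    for command in log:
        apply_command(state, command)
    return state


def rewrite(log):
    """Shortest log with the same end state: one SET per surviving key."""
    state = replay(log)
    return [["SET", key, value] for key, value in sorted(state.items())]


old_log = [["INCR", "counter"] for _ in range(1000)]
compact = rewrite(old_log)
compact_length = len(compact)  # 1 command instead of 1000

# Commands that arrived while the rewrite was running are appended afterwards.
backlog = [["INCR", "counter"], ["SET", "server:name", "fido"]]
new_log = compact + backlog

same_end_state = replay(new_log) == replay(old_log + backlog)
```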
Redis Architecture: Replication
- Master (write/read) - slave (read) replication: slaves are eventually consistent.
- Async replication, fast since master does not wait for slave to update.
client-master: set
master: performs operation
master-client: done
master-slave: set
- Master is single writer, many slaves readers. Clients can read from any slave.
- Possible non-durability: master fails after "done" to client.
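- Not strongly consistent in the CAP sense: a client that writes to the master and immediately reads from a slave may get the old value.
- There is a WAIT command to tune consistency (the caller blocks until a given number of slaves acknowledge the write, or a timeout).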
- Slaves can have other slaves, cascading replication.
- Master can have persistence turned off, slaves can have it turned on.
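The stale-read window of async replication can be shown with a small simulation. `Master` and `Replica` are hypothetical classes, and `propagate` is called by hand to stand in for the network delivering the write later.

```python
class Replica:
    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data.get(key)


class Master:
    def __init__(self, replicas):
        self.data = {}
        self.replicas = replicas
        self.pending = []  # writes acknowledged to the client but not yet replicated

    def set(self, key, value):
        self.data[key] = value
        self.pending.append((key, value))
        return "OK"  # the client gets "done" before any replica is updated

    def propagate(self):
        # Stand-in for the asynchronous master -> slave stream.
        for key, value in self.pending:
            for replica in self.replicas:
                replica.data[key] = value
        self.pending.clear()


replica = Replica()
master = Master([replica])

reply = master.set("server:name", "fido")
stale_read = replica.get("server:name")       # None: write not replicated yet
unreplicated_writes = len(master.pending)     # lost if the master dies right now
master.propagate()
eventual_read = replica.get("server:name")    # "fido"
```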
Healing
Link between master and slave fails
- Slave connects to master
- Slave asks for partial re-sync
- If partial fails, slave asks for complete re-sync.
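One way to picture partial vs complete re-sync is a bounded backlog of recent writes on the master. The sketch below is an assumption-heavy model, not the actual protocol: the offset counts writes, the backlog is a fixed-size deque, and the class and method names are hypothetical.

```python
from collections import deque


class Replica:
    def __init__(self):
        self.data = {}
        self.offset = 0  # how many of the master's writes this replica has applied


class Master:
    def __init__(self, backlog_size):
        self.data = {}
        self.offset = 0
        self.backlog = deque(maxlen=backlog_size)  # most recent (offset, key, value)

    def set(self, key, value):
        self.data[key] = value
        self.offset += 1
        self.backlog.append((self.offset, key, value))

    def sync(self, replica):
        oldest = self.backlog[0][0] if self.backlog else self.offset + 1
        if replica.offset + 1 >= oldest:
            # Everything the replica missed is still in the backlog.
            for offset, key, value in self.backlog:
                if offset > replica.offset:
                    replica.data[key] = value
            replica.offset = self.offset
            return "partial"
        # Too far behind: ship the whole dataset.
        replica.data = dict(self.data)
        replica.offset = self.offset
        return "full"


master = Master(backlog_size=4)
replica = Replica()
master.set("a", 1)
master.sync(replica)

# Short outage: 2 missed writes still fit in the backlog.
master.set("b", 2)
master.set("c", 3)
short_outage = master.sync(replica)

# Long outage: 10 missed writes, the backlog only remembers the last 4.
for i in range(10):
    master.set(f"k{i}", i)
long_outage = master.sync(replica)
in_sync = replica.data == master.data
```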
Redis Architecture: Hash Partitioning (=Sharding)
- 16384 hashslots in a redis cluster (2^14, so a slot number fits in 14 bits)
- Each node is responsible for a range/ collection of hashslots.
- HASH_SLOT = CRC16(key) mod 16384 // CRC16 is very quick to compute
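- Can also shard on part of a key instead of the full key: with a hash tag, only the text between the first { and the next } is hashed, so {user1000}.following and {user1000}.followers land in the same slot.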
- Redis also supports other partitioning schemes.
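The slot formula can be tried out in Python. This sketch assumes the CRC16 variant is the XMODEM one (which `binascii.crc_hqx` with initial value 0 computes) and applies the hash-tag rule: if the key has a non-empty {...} section, only that part is hashed. The three-node slot layout is made up for illustration.

```python
import binascii

SLOTS = 16384


def key_slot(key):
    """HASH_SLOT = CRC16(key) mod 16384, hashing only the {tag} if one is present."""
    data = key.encode()
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        if end != -1 and end != start + 1:  # ignore an empty "{}"
            data = data[start + 1:end]
    return binascii.crc_hqx(data, 0) % SLOTS


# Hypothetical 3-node cluster, each node owning a contiguous range of slots.
NODE_RANGES = {
    "node-a": range(0, 5461),
    "node-b": range(5461, 10923),
    "node-c": range(10923, 16384),
}


def node_for(key):
    slot = key_slot(key)
    for node, slots in NODE_RANGES.items():
        if slot in slots:
            return node


following_slot = key_slot("{user1000}.following")
followers_slot = key_slot("{user1000}.followers")
same_slot = following_slot == followers_slot  # both hash only "user1000"
owner = node_for("server:name")
```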
Redis Architecture: Cluster
- Full Mesh: Each node is connected to all others.
- Supports re-sharding on the fly.
Redis Architecture: Sentinel
- What if Master becomes unavailable, who watches the watcher?
References