ElastiCache Redis is the unsung hero of the Redis ecosystem. It has been instrumental in the growth and adoption of Redis. But Redis is a tale seemingly as old as time. It is a much-respected open source software project which has grown over a decade or more to include broad functionality - and it has an active developer community with very good reason. In case you missed it, you can read more about the Momento team’s love of Redis in our Momento vs Redis blog.
Ironically, ElastiCache Redis and other managed Redis services are not cloud native.
This calls for a truly cloud-native solution. That’s exactly what we’ve built at Momento, and here’s why:
- We believe that a fully reimagined, clean-slate implementation of the developer interfaces to such data requirements can result in faster innovation.
- We believe that starting with a truly serverless approach (via our distributed architecture) gives smoother operational results (availability, scalability, consistent latency at the tail). Not every developer should be (or wants to be) an expert in operating ElastiCache Redis, and learning the hard way can be painful and costly for a business.
- We believe there are a host of other opportunities to solve additional data-related problems in a simpler and more-streamlined way - all backed by our distributed foundation and the serverless experience. Take a look at our Momento Topics product for an example of this. No need to learn about a bunch of low-level components or commands and carefully stitch them together to suit your purpose - it’s ready to go.
Read on to dive into exactly why ElastiCache Redis and other managed Redis services aren’t cloud native and how Momento Cache is upping the ante.
8 reasons ElastiCache Redis is not cloud native
1. Nodes, clusters, network complexity vs high-level resource abstraction
If you take a look through the documentation for services like DynamoDB and S3, you’ll find no mention of nodes, clusters, subnets, AZs, etc. There’s just an endpoint, and a high-level abstraction for the resource - “table” or “bucket”. Momento follows the same truly serverless approach, allowing you to simply call “CreateCache()”. If you find this kind of infrastructure terminology in the documentation for a product, it portends non-optimal outcomes. Look carefully, because some vendors try to pretend they’re offering serverless by using an abstracted name - dig further into details like pricing and maintenance/security information and you may find the equivalent of nodes, network choices around AZs and more. This distinction is important, because if you’re seeing this it suggests that the architecture for the managed service is not truly distributed, leading to more of the constraints around scale, operational overhead and cost discussed in the factors below. Best practice is to operate isolated resources - easy and fine if it’s a “bucket” - but if it’s a “cluster” you’re probably signing yourself up for minimum buy-in, maintenance windows, and scaling pain (all on a per-cluster basis). The effort, pain, and risk multiply as your architecture grows and becomes more micro-service oriented. Don’t forget that each service is likely present in a number of separate environments (dev, integration, staging, prod).
2. Maintenance windows
In a distributed environment served by a routing layer that handles API abstraction, product improvements roll out by exposing additional APIs or parameters and adding support for them to SDKs. Such is the case with DynamoDB, S3, and Momento Cache. There are no minor or major versions. No maintenance windows - so updates are imperceptible and have no impact on your customer experience.
With the managed Redis alternatives: if you’re using caching purely to reduce spend then it may be acceptable to fall back to the authoritative source during maintenance, but you’ll need to plan for doing this at low traffic times. On the other hand, if you are using ElastiCache Redis as a primary store for advanced data structures, you have more to plan for - potentially your service is down during this window - for an unknown period. Depending on the managed Redis product, you may see the maintenance windows clearly described as such, or it may be more subtle: “brief interruptions and possible emergent upgrades”.
3. Minimum buy-in (not scale-to-zero) and overprovisioning
Most managed Redis options do not offer a pricing model which allows you to pay only for the requests you make. This means maintaining a provisioned level of capability for a minimum granularity - typically per hour. Often this will be per node, and if you want better availability than a single node can offer, you’ll need to think about running multiple nodes across multiple AZs. Generally you’ll run this configuration all month long - so your minimum buy-in is probably the cost of running 3 nodes throughout the month. While the nodes can be small, your minimum for a reasonably available Redis store (per service) is likely $35/mo or more. In the interests of reducing costs, many operators start to make trade-offs in lower environments (dev/test), like running a single node with no replication. This adds further management complexity - and what happens if you have no resiliency for your in-memory store and the node reboots? That’s right, you’re starting your test effort over.
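To make the minimum buy-in arithmetic concrete, here is a rough sketch. The hourly node price is an assumed, illustrative figure (not a quoted price from any provider), and the function name is ours. With that assumption, three small nodes running all month land at roughly the $35 floor mentioned above, and the floor repeats for every environment you run.

```python
HOURS_PER_MONTH = 730  # average hours in a month (8760 / 12)

# Assumed hourly price for a small cache node; check your provider's price list.
ASSUMED_NODE_HOURLY_USD = 0.016


def monthly_minimum(node_hourly_usd: float, nodes: int = 3, environments: int = 1) -> float:
    """Cost of keeping `nodes` provisioned all month in each environment."""
    return round(node_hourly_usd * HOURS_PER_MONTH * nodes * environments, 2)


if __name__ == "__main__":
    # One service in a single environment.
    print(monthly_minimum(ASSUMED_NODE_HOURLY_USD))
    # The same service across dev, integration, staging, and prod.
    print(monthly_minimum(ASSUMED_NODE_HOURLY_USD, environments=4))
```

The point of the sketch is that the cost is independent of traffic: the same amount is due whether the cache serves 5 requests or 5 billion.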
Wouldn’t it be nice to not have to worry about any of this? With Momento Cache, you get the same high availability and performance for every cache, and you only pay for the requests you make - 5 requests per month or 5 billion. Need to ramp up from 5 requests per second to 5 million? No problem, it’s a smooth path with no interruptions and no operational effort to trip over.
4. Security and access control not given high priority
Opinionated statement: TLS should not be optional (as it is in ElastiCache Redis). There should be no room for intentional or accidental configuration for clear text transfer of your data and your access credentials. Momento Cache encrypts all data over the network, and also encrypts your data in storage (including memory).
All data and operations in Redis have the same authorization - it’s all or nothing. Momento provides for granular authorization (superuser, read-only, write-only, and per key) with expirable API authentication keys and short-lived client access tokens. Compare this with Redis (regardless of provider) and you’ll find that there is plenty of reason to try to hide your Redis data behind a complex network configuration. All access to Redis should be gated via application code that you’ll need to build and manage.
5. Scale choices which tightly couple different workload properties
Adding to the aforementioned pain points of minimum buy-in and node/cluster complexity, managed Redis will generally require that you choose (and commit to in production) a particular ratio of CPU, storage volume (memory size), and network bandwidth or request rate. You might have different options for this, such as a range of node types or sizes. Or you might get to choose different service tiers - but if the “tier” bounds the ratio of CPU:memory:throughput, ask yourself if it’s just a node by another name. Workloads change in nature over time, and there is often seasonal variability too (daily heavy and slow periods, monthly bulk operations, annual peak events). Each type of data operation has particular demands in terms of CPU, and the weight can vary depending on the size of an advanced data structure. You see, at any given time, one or more of these resource properties is likely to be far less efficiently utilized than others. Being truly serverless, Momento Cache does not ask you to concern yourself with this shifting balance - it just accommodates your needs at that moment.
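As a rough illustration of that coupling, the sketch below picks the smallest node type that covers a workload on every dimension and reports how much of each dimension is actually used. The node catalogue and its numbers are hypothetical and exist only for this example. They do not describe any provider's real node types.

```python
# Hypothetical node catalogue, smallest first: name -> (vCPUs, memory GiB, network Gbps)
NODE_TYPES = {
    "small": (2, 8, 1),
    "medium": (4, 16, 2),
    "large": (8, 32, 5),
}


def smallest_fit(cpu: float, mem_gib: float, net_gbps: float):
    """Return the smallest node type covering every dimension, plus per-dimension utilization."""
    demand = (cpu, mem_gib, net_gbps)
    for name, capacity in NODE_TYPES.items():
        if all(d <= c for d, c in zip(demand, capacity)):
            utilization = tuple(round(d / c, 2) for d, c in zip(demand, capacity))
            return name, utilization
    raise ValueError("no single node type fits this workload")


if __name__ == "__main__":
    # A memory-heavy workload: memory forces the largest node, while CPU and network sit mostly idle.
    print(smallest_fit(cpu=2, mem_gib=30, net_gbps=0.5))
```

Whichever dimension is tightest dictates the node type, and the remaining dimensions are paid for but underused.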
6. Hot keys and hot shards
In most sharded stores (including managed Redis), there’s potential for a single key (or a small number of keys in the same shard) to overwhelm the backing nodes when there’s an unanticipated spike or concentration in traffic. This can “brown out” collocated data. Momento’s distributed architecture allows for timely addition of a second layer of caching for extremely hot keys, protecting the primary cache from this traffic, and continuing to serve the hot keys with even better performance - so you can weather the storm.
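The general idea of shielding a primary cache from hot keys can be sketched as follows. This is a simplified illustration of the pattern, not Momento's implementation: the class name, the promotion threshold, and the use of a plain dict as the "primary" store are all assumptions made for the example.

```python
from collections import Counter


class HotKeyShield:
    """Serve very hot keys from a small second tier so the primary tier is spared."""

    def __init__(self, primary: dict, hot_threshold: int = 3):
        self.primary = primary              # stands in for the sharded primary cache
        self.hot_tier = {}                  # second layer holding only promoted hot keys
        self.reads = Counter()              # primary reads observed per key
        self.hot_threshold = hot_threshold  # reads before a key is treated as hot
        self.primary_reads = 0              # how often the primary was touched

    def get(self, key):
        if key in self.hot_tier:
            return self.hot_tier[key]       # hot path: the primary is not touched
        self.primary_reads += 1
        value = self.primary.get(key)
        self.reads[key] += 1
        if value is not None and self.reads[key] >= self.hot_threshold:
            self.hot_tier[key] = value      # promote: later reads skip the primary
        return value


if __name__ == "__main__":
    shield = HotKeyShield({"sale-banner": "50% off"})
    for _ in range(1000):
        shield.get("sale-banner")
    print(shield.primary_reads)  # only the reads before promotion reached the primary
```

A production version would also need expiry and invalidation for promoted keys. The sketch leaves those out to keep the traffic-shielding effect visible.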
7. Scaling in node-sized increments with risk of impact
A managed Redis product which is defined by nodes and clusters can be complex to scale up or scale down. To support changes in read load, replica nodes can be added or removed - but the minimum increment is the size (and cost) of one node - and you’ll probably find that there’s a limit on the number of replica nodes. Adding replicas increases the CPU load on the leader, because it has to handle the work of replication - this might mean a different node type is required. Vertical scaling of node types is also possible - to change node type for the cluster when necessary. Unfortunately, none of these changes are likely to happen without affecting the performance of your application as perceived by users. The Redis clients often have to reconnect, may see an increase in misses during these scaling activities, etc. There may be support for sharding - a “cluster mode”, which can help some with all of this. But it’s still likely to result in some impact when scaling and adding/removing shards - remember those maintenance windows mentioned above? Momento Cache customers don’t have to think about any of this - we handle all the scaling automatically on the back-end, and we handle changes very smoothly - prewarming storage nodes before placing them into service.
As a result of the pain of scaling, most managed Redis implementations we evaluate when migrating customers to Momento Cache are massively overprovisioned. Operators tend not to want to scale down when load is low, and they tend to build in a large buffer of underutilized capacity to cover growth and unanticipated spikes. They are then hesitant to scale out proactively because of the potential for impact and cost. Unfortunately, all too often scale-out must be performed as an emergency - when there is ongoing operational impact.
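The cost of scaling in node-sized increments can be shown with a little arithmetic. In the sketch below, the per-replica throughput and the replica limit are made-up inputs, and the function name is ours. It rounds read load up to whole replicas and reports the fraction of provisioned read capacity left idle.

```python
import math


def replicas_for(read_rps: int, rps_per_replica: int, max_replicas: int) -> tuple[int, float]:
    """Whole replicas needed for a read load, and the idle fraction of what gets provisioned."""
    needed = max(1, math.ceil(read_rps / rps_per_replica))
    if needed > max_replicas:
        raise ValueError("read load exceeds the replica limit; a larger node type is needed")
    idle_fraction = 1 - read_rps / (needed * rps_per_replica)
    return needed, idle_fraction


if __name__ == "__main__":
    # Load just over one replica's capacity forces a second full replica.
    print(replicas_for(read_rps=51_000, rps_per_replica=50_000, max_replicas=5))
```

Crossing a replica boundary by a small margin means paying for a whole extra node, which is one reason operators end up holding a large idle buffer.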
8. BYO authorization and RESTful data interface
As mentioned above regarding security, you’ll need to build and operate your Redis store in an isolated network because the authentication and authorization support is weak - and you’ll need to build an application layer to manage access to Redis. If the application is complex this is perhaps an incremental effort - but what if it is simply providing for HTTPS access to POST and GET data by key? This is a key pattern for many architectures. Momento Cache provides for direct RESTful requests to data by key, over HTTPS - all with Momento’s strong authentication and authorization. It’s easy to directly integrate your cache data into your broader application architecture, and to do so securely.
It’s easy, perhaps, to think of Momento Cache as “serverless ElastiCache Redis”. And we see that as something of a compliment - we’re truly serverless (not just managed, and not just using the word as a marketing play with some terminology and price structure shortcuts). And some of what we’re doing supports caching and advanced data structures to solve for many of the data challenges that the Redis project has focused on. But Momento services aim to do much more than this - they solve for operational pain and they make it simpler to build quickly, securely, and reliably. As our ecosystem of data solutions grows, you’ll see more products built to perfectly solve particular developer needs right out of the box - very different from being handed some data structures and a lot of commands that must be combined just so to cover a particular pattern.