I'd start with a token-bucket per key in Redis, but to avoid a single-region bottleneck I'd shard the counters and use a sliding-window approximation. For cross-region consistency I'd accept slight over-counting and reconcile asynchronously, rather than pay the latency cost of strong consistency on every request