Caching System
The WispHub API implements an in-memory caching system using Least Recently Used (LRU) eviction with Time-To-Live (TTL) support. This is the primary performance optimization that enables sub-5ms response times for frequently accessed data.

Why Caching?

Conversational bots (such as WhatsApp bots) generate high-frequency, repetitive queries to the API:

- Multiple users asking about their service status simultaneously
- Repeated lookups of the same client data within short time windows
- Frequent access to internet plan information during verification flows

Without caching, each of these queries would incur:

- A network roundtrip to WispHub Net (200-500ms latency)
- A database query on WispHub's infrastructure
- Response serialization and network return
async_lru Implementation
The API uses the async_lru library, which provides asynchronous LRU caching compatible with FastAPI's async architecture.
Installation
requirements.txt
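The dependency is the async-lru package on PyPI (the version pin below is illustrative, not taken from the actual file):

```
async-lru>=2.0.0
```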
Basic Usage Pattern
Decorator Parameters
maxsize

Type: int
Purpose: Maximum number of cached results to store
Examples:
- maxsize=1: Single cached value (e.g., the entire client list)
- maxsize=32: Multiple cached values (e.g., individual plan details)

ttl

Type: int (seconds)
Purpose: Time-To-Live - how long cached data remains valid
Examples:
- ttl=300: Cache expires after 5 minutes
- ttl=900: Cache expires after 15 minutes
Client List Caching
The most critical cache is the global client list, defined in app/services/clients_service.py.
Why maxsize=1?
The entire client list is treated as a single cached entity because:

- Search operations need access to all clients for filtering
- Memory efficiency: storing the list once vs. storing individual clients
- Consistency: all clients are refreshed together, preventing stale partial data
Why ttl=300 (5 minutes)?
Balancing freshness vs. performance:

- Too short (under 1 min): Excessive load on WispHub Net
- Too long (over 10 min): Risk of showing stale client data
- 5 minutes: Sweet spot for conversational bot patterns
During peak hours with 100+ concurrent users, the 5-minute TTL reduces WispHub Net requests from ~6,000/hour to ~12/hour (99.8% reduction).
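A back-of-envelope check of the reduction claimed above (the uncached request volume is the figure assumed in the text):

```python
SECONDS_PER_HOUR = 3600
TTL = 300  # client-list TTL in seconds

requests_without_cache = 6000                      # ~100+ concurrent users, per the text
requests_with_cache = SECONDS_PER_HOUR // TTL      # one upstream refresh per TTL window

reduction = 1 - requests_with_cache / requests_without_cache
print(f"{requests_with_cache}/hour, {reduction:.1%} reduction")  # 12/hour, 99.8% reduction
```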
Internet Plans Caching
Internet plans change infrequently and are heavily referenced. They are cached in app/services/internet_plans_service.py.
Why maxsize=32?
Most WispHub deployments have 5-20 active plans. A cache size of 32 provides:

- Room for all current plans
- Headroom for seasonal/promotional plans
- Historical plan lookups without eviction
Why ttl=900 (15 minutes)?
Plan pricing and specifications change rarely:

- New plans: Added monthly or quarterly
- Price changes: Typically announced in advance
- 15-minute staleness is acceptable for this data type
Cache Flow Visualization
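The original diagram is not reproduced here; roughly, the request flow looks like this:

```
Incoming request
      │
      ▼
Cache lookup ──(hit, TTL valid)──▶ Return cached data (~4ms)
      │
 (miss or expired)
      │
      ▼
Fetch from WispHub Net (200-500ms) ──▶ Store in cache ──▶ Return data
```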
Cache Invalidation
When Cache is Invalidated
- TTL Expiration: Automatic after configured seconds
- Server Restart: Cache is in-memory and is lost on restart
- LRU Eviction: When maxsize is exceeded (least recently used items)
Implications
If a client's data is updated in WispHub Net:

- Worst case delay: Up to 5 minutes (the client list TTL)
- Average case delay: ~2.5 minutes (half the TTL)
- Best case: Immediate (if the cache has already expired)
Memory Considerations
Estimated Memory Usage
Client List Cache
- Typical deployment: 500-2000 clients
- Per client: ~500 bytes (serialized)
- Total: ~1MB for 2000 clients
Performance Metrics
Response Time Comparison
| Operation | Without Cache | With Cache (Hit) | Improvement |
|---|---|---|---|
| List all clients | 800ms | 4ms | 200x faster |
| Search clients | 850ms | 6ms | 141x faster |
| List plans | 300ms | 3ms | 100x faster |
| Get client by ID | 250ms | 5ms | 50x faster |
Cache Hit Ratio
During typical bot operations:

- Client list queries: 98% hit ratio
- Plan lookups: 95% hit ratio
- Overall: 96% of requests served from cache
Monitoring Cache Performance
The async_lru library provides a cache_info() method for monitoring, but it is not exposed in the current API implementation. It reports:

- hits: Number of cache hits
- misses: Number of cache misses
- maxsize: Configured maximum size
- currsize: Current cache size
Best Practices
Choose TTL Wisely
Balance data freshness requirements against load reduction. Monitor your data change frequency.
Size Appropriately
Set maxsize based on actual data volume plus headroom. Monitor memory usage.

Handle Cache Misses
Always have fallback logic for when WispHub Net is unavailable during cache refresh.
Make TTL Configurable

Make cache TTL values configurable via environment variables for easy tuning, and document the chosen values.
Advanced: Future Enhancements
Potential improvements to the caching system:

- Redis-backed caching: Share the cache across multiple API instances
- Webhook invalidation: WispHub Net pushes updates to invalidate specific cache entries
- Conditional requests: Use ETags to validate cache freshness with WispHub Net
- Tiered caching: Different TTLs for different client states (active vs. suspended)
- Cache warming: Proactively refresh the cache before TTL expiration during off-peak hours
