@@ -375,6 +375,92 @@ def __init__(
375375 - Example: 100ms RTT + SSL = ~300-500ms handshake time
376376 - Consider TLS session resumption to reduce reconnection overhead
377377
378+ socket_keepalive (Optional[bool], default=None):
379+ Main description: Enable TCP keepalive to detect dead connections.
380+
381+ What is TCP keepalive:
382+ TCP keepalive is a mechanism where the operating system periodically sends
383+ small probe packets on idle connections to verify the remote endpoint is
384+ still reachable. If the remote side doesn't respond after several probes,
385+ the connection is considered dead and closed. This happens at the TCP level,
386+ below the application layer.
387+
388+ Why keepalive is needed:
389+ By default the Redis server never closes idle client connections (its timeout setting
390+ defaults to 0), but network failures, crashes on either end, or intermediate devices
391+ (firewalls, NAT, proxies) can leave "half-open" connections, where one side believes the
392+ connection is alive while the other side is unreachable. Without keepalive, these dead
393+ connections can accumulate and hold resources until something tries to use them.
394+
395+ How keepalive improves reconnection:
396+ When keepalive detects a dead connection, the operating system marks the socket as failed,
397+ so the next read or write on it raises an error right away instead of blocking until a
398+ timeout expires. redis-py can then discard the connection and establish a new one
399+ without first retrying operations on a dead socket.
400+
401+ Recommended values:
402+ - Production systems: True (recommended for all connections)
403+ - Connection pools: True (essential - affects all pool connections)
404+ - Development/testing: False or None (for simplicity)
405+ Trade-offs:
406+ - True: Detects dead connections at the cost of a small amount of probe traffic, sent only while a connection is idle
407+ - False: No probe traffic, but a dead idle connection may go unnoticed until the next command fails or times out
408+ Related parameters: socket_keepalive_options, health_check_interval
409+ Common issues:
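410+ - Firewall interference: Some firewalls and middleboxes drop keepalive probes
411+ - Resource usage: Probes consume a small amount of bandwidth on idle connections
412+ - Timing overlap: Probe timing is independent of application-level health checks (health_check_interval), so the two may duplicate work
413+ - NAT timeouts: Probes keep NAT table entries alive only if the idle time is shorter than the NAT timeout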
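At the socket level, enabling keepalive comes down to setting `SO_KEEPALIVE` on the TCP socket. The sketch below uses only the standard library to show that step in isolation. The helper name `make_keepalive_socket` is hypothetical, and this is an illustration of the underlying OS option rather than a copy of redis-py's connection code.

```python
import socket


def make_keepalive_socket() -> socket.socket:
    """Create an unconnected TCP socket with keepalive probing enabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # SO_KEEPALIVE asks the OS to probe the peer once the connection is idle.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    return sock


sock = make_keepalive_socket()
# getsockopt returns a non-zero integer when the option is enabled.
keepalive_enabled = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
sock.close()
print(f"keepalive enabled: {keepalive_enabled}")
```

With redis-py, the corresponding step is passing `socket_keepalive=True` when constructing the client or connection pool.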
414+
415+ socket_keepalive_options (Optional[Mapping[int, Union[int, bytes]]], default=None):
416+ Main description: Advanced TCP keepalive socket options.
417+
418+ Available options reference:
419+ - Python socket module: import socket; help(socket) or dir(socket)
420+ - Common constants: socket.TCP_KEEPIDLE, socket.TCP_KEEPINTVL, socket.TCP_KEEPCNT
421+ - Platform-specific: socket.TCP_KEEPALIVE (macOS counterpart of TCP_KEEPIDLE), socket.TCP_USER_TIMEOUT (Linux)
422+ - Online reference: https://docs.python.org/3/library/socket.html#socket-families
423+ - System documentation: man 7 tcp (Linux), man 4 tcp (BSD/macOS)
424+
425+ Recommended values:
426+ - Linux: {socket.TCP_KEEPIDLE: 30, socket.TCP_KEEPINTVL: 10, socket.TCP_KEEPCNT: 3}
427+ - macOS: {socket.TCP_KEEPALIVE: 30, socket.TCP_KEEPINTVL: 10, socket.TCP_KEEPCNT: 3}
428+ - Windows: {socket.TCP_KEEPIDLE: 30, socket.TCP_KEEPINTVL: 10, socket.TCP_KEEPCNT: 3} (only where the running Windows and Python versions expose these constants; check with hasattr())
429+ - Default: None (use system defaults)
430+ - Custom: Tune based on network characteristics
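Because the constant names differ between platforms, one way to get the recommended values without hard-coding a per-OS dictionary is to build the mapping from whatever constants exist at runtime. This is a minimal sketch under that assumption; `build_keepalive_options` is a hypothetical helper, not part of redis-py.

```python
import socket
from typing import Dict


def build_keepalive_options(idle: int = 30, interval: int = 10, count: int = 3) -> Dict[int, int]:
    """Map the keepalive constants available on this platform to the given values."""
    options: Dict[int, int] = {}
    # Idle time before the first probe: TCP_KEEPIDLE on Linux, TCP_KEEPALIVE on macOS.
    for name in ("TCP_KEEPIDLE", "TCP_KEEPALIVE"):
        if hasattr(socket, name):
            options[getattr(socket, name)] = idle
            break
    # Seconds between probes once probing has started.
    if hasattr(socket, "TCP_KEEPINTVL"):
        options[socket.TCP_KEEPINTVL] = interval
    # Number of unanswered probes before the connection is declared dead.
    if hasattr(socket, "TCP_KEEPCNT"):
        options[socket.TCP_KEEPCNT] = count
    return options


keepalive_opts = build_keepalive_options()
print(keepalive_opts)
```

The resulting dictionary could then be passed as `socket_keepalive_options` together with `socket_keepalive=True`. On a platform that exposes none of these constants the dictionary is empty, which leaves the system defaults in place.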
431+
432+ How to discover available options:
433+ ```python
434+ import socket
435+ # List all TCP-related constants
436+ tcp_options = [attr for attr in dir(socket) if attr.startswith('TCP_')]
437+ print(tcp_options)
438+
439+ # Check if specific option exists on your platform
440+ if hasattr(socket, 'TCP_KEEPIDLE'):
441+     print(f"TCP_KEEPIDLE = {socket.TCP_KEEPIDLE}")
442+
443+ # Example (Linux constant names): first probe after 30 s idle, then one every 10 s, give up after 3 unanswered probes
444+ keepalive_opts = {socket.TCP_KEEPIDLE: 30, socket.TCP_KEEPINTVL: 10, socket.TCP_KEEPCNT: 3}
445+ ```
446+
447+ Trade-offs:
448+ - Custom options: Fine-tuned detection but platform-specific
449+ - System defaults: Portable but may not be optimal
450+ Related parameters: socket_keepalive (must be True for these options to take effect)
451+ Use cases:
452+ - High-availability systems: Aggressive keepalive settings
453+ - Satellite/slow networks: Longer intervals
454+ - Container environments: Shorter intervals for faster detection
455+ Common issues:
456+ - Platform differences: Not all TCP options are available on every OS (use hasattr() to check)
457+ - Invalid options: An unsupported option or value makes setsockopt() fail, so the connection attempt fails
458+ - Firewall interference: Aggressive settings may be blocked
460+ Performance implications:
461+ - More frequent keepalive packets increase network usage
462+ - Faster dead connection detection improves reliability
463+
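The trade-off between probe frequency and detection speed can be estimated with simple arithmetic: the first probe goes out after the idle time, and the connection is declared dead after `count` unanswered probes spaced `interval` seconds apart. The sketch below assumes Linux-style semantics and gives an approximation only; the function name and the "aggressive" values are hypothetical illustrations.

```python
def worst_case_detection_seconds(idle: int, interval: int, count: int) -> int:
    """Approximate time from last traffic until a dead peer is detected."""
    # Idle period before the first probe, plus one interval per unanswered probe.
    return idle + interval * count


# Recommended values from this docstring: 30 s idle, 10 s interval, 3 probes.
recommended = worst_case_detection_seconds(idle=30, interval=10, count=3)
# Hypothetical aggressive profile for comparison.
aggressive = worst_case_detection_seconds(idle=10, interval=5, count=3)
print(f"recommended: ~{recommended}s, aggressive: ~{aggressive}s")
```

Under these assumptions the recommended values detect a dead peer after roughly 60 seconds of silence, while the aggressive profile does so in about 25 seconds at the cost of more frequent probes.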
378464 To specify a retry policy for specific errors, you have two options:
379465
380466 1. Set the `retry_on_error` to a list of the error/s to retry on, and