-
Notifications
You must be signed in to change notification settings - Fork 2.5k
fix(init): conn init should be thread safe #3555
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
|
|
@@ -78,18 +78,21 @@ type Conn struct { | |||||
| // background operations that may execute commands, like re-authentication. | ||||||
| used atomic.Bool | ||||||
|
|
||||||
| // Inited flag to mark connection as initialized, this is almost the same as usable | ||||||
| // inited flag to mark connection as initialized, this is almost the same as usable | ||||||
| // but it is used to make sure we don't initialize a network connection twice | ||||||
| // On handoff, the network connection is replaced, but the Conn struct is reused | ||||||
| // this flag will be set to false when the network connection is replaced and | ||||||
| // set to true after the new network connection is initialized | ||||||
| Inited atomic.Bool | ||||||
| inited atomic.Bool | ||||||
|
|
||||||
| pooled bool | ||||||
| pubsub bool | ||||||
| closed atomic.Bool | ||||||
| createdAt time.Time | ||||||
| expiresAt time.Time | ||||||
| // Initializing flag to mark connection as initializing | ||||||
| // This is used to prevent concurrent initialization of the network connection | ||||||
| initializing atomic.Bool | ||||||
| pooled bool | ||||||
| pubsub bool | ||||||
| closed atomic.Bool | ||||||
| createdAt time.Time | ||||||
| expiresAt time.Time | ||||||
|
|
||||||
| // maintenanceNotifications upgrade support: relaxed timeouts during migrations/failovers | ||||||
| // Using atomic operations for lock-free access to avoid mutex contention | ||||||
|
|
@@ -306,7 +309,27 @@ func (cn *Conn) IsPubSub() bool { | |||||
| } | ||||||
|
|
||||||
| func (cn *Conn) IsInited() bool { | ||||||
| return cn.Inited.Load() | ||||||
| return cn.inited.Load() | ||||||
| } | ||||||
|
|
||||||
| func (cn *Conn) SetInited(inited bool) { | ||||||
| cn.inited.Store(inited) | ||||||
| } | ||||||
|
|
||||||
| func (cn *Conn) CompareAndSwapInited(old, new bool) bool { | ||||||
| return cn.inited.CompareAndSwap(old, new) | ||||||
| } | ||||||
|
|
||||||
| func (cn *Conn) IsInitializing() bool { | ||||||
| return cn.initializing.Load() | ||||||
| } | ||||||
|
|
||||||
| func (cn *Conn) SetInitializing(initializing bool) { | ||||||
| cn.initializing.Store(initializing) | ||||||
| } | ||||||
|
|
||||||
| func (cn *Conn) CompareAndSwapInitializing(old, new bool) bool { | ||||||
| return cn.initializing.CompareAndSwap(old, new) | ||||||
| } | ||||||
|
|
||||||
| // SetRelaxedTimeout sets relaxed timeouts for this connection during maintenanceNotifications upgrades. | ||||||
|
|
@@ -478,8 +501,17 @@ func (cn *Conn) GetNetConn() net.Conn { | |||||
|
|
||||||
| // SetNetConnAndInitConn replaces the underlying connection and executes the initialization. | ||||||
| func (cn *Conn) SetNetConnAndInitConn(ctx context.Context, netConn net.Conn) error { | ||||||
| // max retries of 100ms * 20 = 2 second | ||||||
| maxRetries := 20 | ||||||
| for cn.IsInitializing() || cn.IsUsed() { | ||||||
| time.Sleep(100 * time.Millisecond) | ||||||
| maxRetries-- | ||||||
| if maxRetries <= 0 { | ||||||
| return fmt.Errorf("failed to set net conn after %d attempts due to high contention", maxRetries) | ||||||
|
||||||
| return fmt.Errorf("failed to set net conn after %d attempts due to high contention", maxRetries) | |
| return fmt.Errorf("failed to set net conn after %d attempts due to high contention", 20) |
Copilot
AI
Oct 22, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The retry loop blocks the goroutine with fixed 100ms sleeps, which could cause unnecessary delays. Consider using exponential backoff or a more efficient synchronization mechanism like a condition variable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Corrected 'second' to 'seconds' for grammatical accuracy.