Is the Mutex+Chan Version of Once Better Than sync.Once?
In my previous blog, packages.Load jitters, I said the jitters were caused by spawning too many goroutines, so synchronization took a lot of time.
However, at first I thought the Lock in sync.Once was the expensive part, so I tried to replace sync.Once with a Mutex + Chan implementation. It turns out sync.Once is still better.
Every call of sync.Once tries to lock
After checking the atomic value, Do tries to lock the mutex before executing f, to ensure f is executed only once. This also means that when Do is called simultaneously and f takes a long time to execute, the other goroutines will keep trying to acquire the mutex lock.
func (o *Once) Do(f func()) {
    if atomic.LoadUint32(&o.done) == 0 {
        // Outlined slow-path to allow inlining of the fast-path.
        o.doSlow(f)
    }
}

func (o *Once) doSlow(f func()) {
    o.m.Lock()
    defer o.m.Unlock()
    if o.done == 0 {
        defer atomic.StoreUint32(&o.done, 1)
        f()
    }
}
Usually a goroutine spins for a while before the mutex puts it to sleep (see my previous blog about mutex), but it can also go to sleep directly if it fails to acquire the lock.
Use Mutex+Chan to implement Once
Here, I use TryLock to acquire the mutex and let the channel state track whether the function has been initialized. TryLock attempts to acquire the lock only once, while the channel lets the waiting goroutines sleep immediately until they are woken up.
type Once struct {
    mu    *sync.Mutex
    state chan struct{}
}

func NewOnce() *Once {
    return &Once{
        state: make(chan struct{}, 1),
        mu:    &sync.Mutex{},
    }
}

func (o *Once) Do(f func()) {
    // Only the first caller wins TryLock; the mutex is never released,
    // so every later TryLock fails immediately.
    if o.mu.TryLock() {
        f()
        close(o.state) // wake all waiters at once
        return
    }
    // Losers sleep here until the channel is closed after f finishes.
    <-o.state
}
Conclusion
The benchmark shows the Mutex+Chan version is roughly 2.5x slower than sync.Once. Repeatedly acquiring the mutex inside sync.Once is not a problem; it is the channel synchronization that is slower.
I had overlooked the cost of channels and the efficiency of mutex locks.
To reduce synchronization overhead, focus on not spawning unnecessary goroutines.
BenchmarkMutexChan
BenchmarkMutexChan-10 1 2210972750 ns/op
BenchmarkSyncOnce
BenchmarkSyncOnce-10 2 746455666 ns/op
BenchmarkMutexChan
BenchmarkMutexChan-10 1 2269286375 ns/op
BenchmarkSyncOnce
BenchmarkSyncOnce-10 2 767407875 ns/op
BenchmarkMutexChan
BenchmarkMutexChan-10 1 1914167458 ns/op
BenchmarkSyncOnce
BenchmarkSyncOnce-10 2 762076584 ns/op
Benchmark Code
const (
    TURN    = 1_000_000 // goroutines spawned per benchmark iteration
    SleepNs = 1_000_000 // f sleeps 1ms
)

func f() {
    time.Sleep(SleepNs * time.Nanosecond)
}

type Once struct {
    mu    *sync.Mutex
    state chan struct{}
}

func NewOnce() *Once {
    return &Once{
        state: make(chan struct{}, 1),
        mu:    &sync.Mutex{},
    }
}

func (o *Once) Do(f func()) {
    if o.mu.TryLock() {
        f()
        close(o.state)
        return
    }
    <-o.state
}

func BenchmarkMutexChan(b *testing.B) {
    for n := 0; n < b.N; n++ {
        once := NewOnce()
        wg := sync.WaitGroup{}
        barrier := sync.WaitGroup{}
        barrier.Add(1)
        for i := 0; i < TURN; i++ {
            wg.Add(1)
            go func() {
                defer wg.Done()
                barrier.Wait() // release all goroutines at the same time
                once.Do(f)
            }()
        }
        barrier.Done()
        wg.Wait()
    }
}

func BenchmarkSyncOnce(b *testing.B) {
    for n := 0; n < b.N; n++ {
        once := sync.Once{}
        wg := sync.WaitGroup{}
        barrier := sync.WaitGroup{}
        barrier.Add(1)
        for i := 0; i < TURN; i++ {
            wg.Add(1)
            go func() {
                defer wg.Done()
                barrier.Wait()
                once.Do(f)
            }()
        }
        barrier.Done()
        wg.Wait()
    }
}