golang/tools: reduce jitter of packages.Load¶

In CL 614975, I have migrated deprecated loader from ssa package. However, miller has reported the tests became extremely slows. In an operation-system with a slow file system, such as plan9, the test slowed by more than 50x.

Oops, a un-intentional change:( I have submitted a CL to load 300+ packages by packages.Load once instead of loading them many. Alan thought it's a bravo change:)

Moreover, when I involved to investigate why the packages.Load slows so much, I have learned a lot. This blog records how I investigate them.

Who to blame, go or tools?¶

The packages.Load calls go list underlying to list all the packages that matches the patterns. When we recorded the elapsed time of packages.Load, sometimes it takes more than 2s while the typical elapsed time is 300ms. Alan suspects go kills some time by sleep intentionally.

But when comparing time took by go list and packages.Load, the execution time of go has a low variance. It's the tools own issue.

Why¶

During construct the full DAG from a preliminary one, the packages uses a naive way. The parent spawns go routines for each of its imports and waits them done.

When we use packages.Load in parallel, the synchronization efforts are a lot so some of the packages.Load jitter.

func (ld *loader) loadRecursive(lpkg *loaderPackage) {
    lpkg.loadOnce.Do(func() {
        // Load the direct dependencies, in parallel.
        var wg sync.WaitGroup
        for _, ipkg := range lpkg.Imports {
            imp := ld.pkgs[ipkg.ID]
            wg.Add(1)
            go func(imp *loaderPackage) {
                ld.loadRecursive(imp)
                wg.Done()
            }(imp)
        }
        wg.Wait()
        ld.loadPackage(lpkg)
    })
}

How to fix it¶

Alan's approach is better by finding all leaves of DAG and trigger the parent packages when all its import packages are done.

enqueue = func(lpkg *loaderPackage) {
    g.Go(func() error {
        // Parse and type-check.
        ld.loadPackage(lpkg)

        // Notify each waiting predecessor,
        // and enqueue it when it becomes a leaf.
        for _, pred := range lpkg.preds {
            if pred.unfinishedSuccs.Add(-1) == 0 { // decref
                enqueue(pred)
            }
        }

        return nil
    })
}

How to fix it(my bad approach)¶

At beginning, I have a fix but I don't think it's better than alan's. I have tried a top-down approach and resolved all the import packages inside the spawning go routines. It spawns huge amount of go routines for every package and actually I cannot prevent the duplication. As a result, too many go routines are spawned and it requires a lot of synchronization efforts.

func (ld *loader) enqueue(lpkg *loaderPackage) {
    ld.g.Go(func() error {
        limiter <- struct{}{}
        defer func() { <-limiter }()
        // some imports are not finished, so need to wait
        if lpkg.unfinishedImports.Load() != 0 {
            for _, ipkg := range lpkg.Imports {
                imp := ld.pkgs[ipkg.ID]
                ld.enqueue(imp)
            }
            return nil
        }

        // this package is done already, need do nothing
        if lpkg.done.Load() {
            return nil
        }

        if !lpkg.running.CompareAndSwap(false, true) {
            return nil // another place is running, skip it
        }

        // all imports are ready, so load lpkg as well
        ld.loadPackage(lpkg)

        // after lpkg is ready, decrease all parents reference
        // if this makes parent ready, enqueue parent to start load as well
        for pred := range lpkg.preds {
            if pred.unfinishedImports.Add(-1) == 0 {
                ld.enqueue(pred)
            }
        }
        lpkg.done.Store(true)
        lpkg.running.Store(false)
        return nil
    })
}

Limit spawning go routines¶

Besides that, we can also use a channel to limit the spawned go routine numbers with capacity runtime.GOMAXPROCS(0).

var cpuLimit = make(chan bool, runtime.GOMAXPROCS(0))
func (ld *loader) loadPackage(lpkg *loaderPackage) {
    cpuLimit <- true              // acquire a token
    defer func() { <-cpuLimit }() // release a token

    typErr := types.NewChecker(tc, ld.Fset, lpkg.Types, lpkg.TypesInfo).Files(lpkg.Syntax)
}

Conclusion¶

By reducing spawning go routines for a DAG and limiting the executing of spawned go routines, we reduce the jitter caused by synchronization of too many go routines.

Note that it doesn't improve the function efficiency.