garble/shared.go


// Copyright (c) 2020, The Garble Authors.
// See LICENSE for licensing information.
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/gob"
	"encoding/json"
	"errors"
	"fmt"
	"log"
	"os"
	"os/exec"
	"path/filepath"
	"strings"
	"time"

	"golang.org/x/mod/module"
)

//go:generate ./scripts/gen-go-std-tables.sh

// sharedCacheType is shared as a read-only cache between the many garble toolexec
// sub-processes.
//
// Note that we fill this cache once from the root process in saveListedPackages,
// store it into a temporary file via gob encoding, and then reuse that file
// in each of the garble toolexec sub-processes.
type sharedCacheType struct {
	ExecPath          string   // absolute path to the garble binary being used
	ForwardBuildFlags []string // build flags fed to the original "garble ..." command
	CacheDir          string   // absolute path to the GARBLE_CACHE directory being used

	// ListedPackages contains data obtained via 'go list -json -export -deps'.
	// This allows us to obtain the non-obfuscated export data of all dependencies,
	// useful for type checking of the packages as we obfuscate them.
	ListedPackages map[string]*listedPackage

	// We can't use garble's own module version, as it may not exist.
	// We can't use the stamped VCS information either,
	// as uncommitted changes simply show up as "dirty".
	//
	// The only unique way to identify garble's version without being published
	// or committed is to use its content ID from the build cache.
	BinaryContentID []byte

	GOGARBLE string

	// GoVersion is the version of the Go toolchain currently being used,
	// as reported by "go env GOVERSION" and compatible with go/version.
	// Note that the version of Go that built the garble binary might be newer.
	// Also note that a devel version like "go1.22-231f290e51" is
	// currently represented as "go1.22", as the suffix is ignored by go/version.
	GoVersion string

	// Filled directly from "go env".
	// Keep in sync with fetchGoEnv.
	GoEnv struct {
		GOOS      string // i.e. the GOOS build target
		GOMOD     string
		GOVERSION string
		GOROOT    string
	}
}

var sharedCache *sharedCacheType

// loadSharedCache loads the shared data passed from the entry garble process.
func loadSharedCache() error {
	if sharedCache != nil {
		panic("shared cache loaded twice?")
	}

	startTime := time.Now()
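	f, err := os.Open(filepath.Join(sharedTempDir, "main-cache.gob"))
	if err != nil {
		// Use an interpreted string literal so that \n is an actual newline;
		// inside a raw (backquoted) literal it would be printed verbatim.
		return fmt.Errorf("cannot open shared file: %v\ndid you run \"go [command] -toolexec=garble\" instead of \"garble [command]\"?", err)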
	}
	defer func() {
		log.Printf("shared cache loaded in %s from %s", debugSince(startTime), f.Name())
	}()
start using original action IDs (#251)

When we obfuscate a name, what we do is hash the name with the action ID of the package that contains the name.

To ensure that the hash changes if the garble tool changes, we used the action ID of the obfuscated build, which is different than the original action ID, as we include garble's own content ID in "go tool compile -V=full" via -toolexec.

Let's call that the "obfuscated action ID". Remember that a content ID is roughly the hash of a binary or object file, and an action ID contains the hash of a package's source code plus the content IDs of its dependencies.

This had the advantage that it did what we wanted. However, it had one massive drawback: when we compile a package, we only have the obfuscated action IDs of its dependencies. This is because one can't have the content ID of dependent packages before they are built.

Usually, this is not a problem, because hashing a foreign name means it comes from a dependency, where we already have the obfuscated action ID. However, that's not always the case.

First, go:linkname directives can point to any symbol that ends up in the binary, even if the package is not a dependency. So garble could only support linkname targets belonging to dependencies. This is at the root of why we could not obfuscate the runtime; it contains linkname directives targeting the net package, for example, which depends on runtime.

Second, some other places did not have an easy access to obfuscated action IDs, like transformAsm, which had to recover it from a temporary file stored by transformCompile.

Plus, this was all pretty expensive, as each toolexec sub-process had to make repeated calls to buildidOf with the object files of dependencies. We even had to use extra calls to "go list" in the case of indirect dependencies, as their export files do not appear in importcfg files.

All in all, the old method was complex and expensive.

A better mechanism is to use the original action IDs directly, as listed by "go list" without garble in the picture.

This would mean that the hashing does not change if garble changes, meaning weaker obfuscation. To regain that property, we define the "garble action ID", which is just the original action ID hashed together with garble's own content ID.

This is practically the same as the obfuscated build ID we used before, but since it doesn't go through "go tool compile -V=full" and the obfuscated build itself, we can work out *all* the garble action IDs upfront, before the obfuscated build even starts.

This fixes all of our problems. Now we know all garble build IDs upfront, so a bunch of hacks can be entirely removed. Plus, since we know them upfront, we can also cache them and avoid repeated calls to "go tool buildid".

While at it, make use of the new BuildID field in Go 1.16's "list -json -export". This avoids the vast majority of "go tool buildid" calls, as the only ones that remain are 2 on the garble binary itself.

The numbers for Go 1.16 look very good:

	name     old time/op       new time/op       delta
	Build-8  146ms ± 4%        101ms ± 1%        -31.01%  (p=0.002 n=6+6)

	name     old bin-B         new bin-B         delta
	Build-8  6.61M ± 0%        6.60M ± 0%        -0.09%   (p=0.002 n=6+6)

	name     old sys-time/op   new sys-time/op   delta
	Build-8  321ms ± 7%        202ms ± 6%        -37.11%  (p=0.002 n=6+6)

	name     old user-time/op  new user-time/op  delta
	Build-8  538ms ± 4%        414ms ± 4%        -23.12%  (p=0.002 n=6+6)
	defer f.Close()
	if err := gob.NewDecoder(f).Decode(&sharedCache); err != nil {
		return fmt.Errorf("cannot decode shared file: %v", err)
	}
	return nil
}
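The #251 commit message describes the "garble action ID" as the original action ID hashed together with garble's own content ID. A minimal sketch of that idea follows. The function name `garbleActionIDSketch`, the choice of SHA-256, and the plain concatenation of the two inputs are assumptions made for illustration; this section does not show how garble actually combines them.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// garbleActionIDSketch is a hypothetical helper, not garble's real code.
// It hashes a package's original action ID together with the content ID
// of the garble binary, so the result changes whenever either one changes.
func garbleActionIDSketch(origActionID, garbleContentID []byte) [sha256.Size]byte {
	h := sha256.New()
	h.Write(origActionID)
	h.Write(garbleContentID)
	var sum [sha256.Size]byte
	copy(sum[:], h.Sum(nil))
	return sum
}

func main() {
	// The byte strings below are made-up placeholders, not real build IDs.
	pkgID := []byte("original-action-id-of-some-package")
	a := garbleActionIDSketch(pkgID, []byte("garble-content-id-v1"))
	b := garbleActionIDSketch(pkgID, []byte("garble-content-id-v1"))
	c := garbleActionIDSketch(pkgID, []byte("garble-content-id-v2"))

	fmt.Println("same inputs give the same ID:", a == b)
	fmt.Println("a new garble build gives a new ID:", a != c)
}
```

Because both inputs are available from "go list" and the garble binary before any obfuscated build runs, IDs of this kind can be computed for every package upfront, which is the property the commit message relies on.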
// saveSharedCache creates a temporary directory to share between garble processes.
// This directory also includes the gob-encoded cache global.
func saveSharedCache() (string, error) {
	if sharedCache == nil {
		panic("saving a missing cache?")
	}
	dir, err := os.MkdirTemp("", "garble-shared")
	if err != nil {
		return "", err
	}
	cachePath := filepath.Join(dir, "main-cache.gob")
	if err := writeGobExclusive(cachePath, &sharedCache); err != nil {
		return "", err
	}
fail if we are unexpectedly overwriting files (#418)

While investigating a bug report, I noticed that garble was writing to the same temp file twice. At best, writing to the same path on disk twice is wasteful, as the design is careful to be deterministic and use unique paths. At worst, the two writes could cause races at the filesystem level.

To prevent either of those situations, we now create files with os.OpenFile and os.O_EXCL, meaning that we will error if the file already exists. That change uncovered a number of such unintended cases.

First, transformAsm would write obfuscated Go files twice. This is because the Go toolchain actually runs:

	[...]/asm -gensymabis [...] foo.s bar.s
	[...]/asm [...] foo.s bar.s

That is, the first run is only meant to generate symbol ABIs, which are then used by the compiler. We need to obfuscate at that first stage, because the symbol ABI descriptions need to use obfuscated names.

However, having already obfuscated the assembly on the first stage, there is no need to do so again on the second stage. If we detect gensymabis is missing, we simply reuse the previous files.

This first situation doesn't seem racy, but obfuscating the Go assembly files twice is certainly unnecessary.

Second, saveKnownReflectAPIs wrote a gob file to the build cache. Since the build cache can be kept between builds, and since the build cache uses reproducible paths for each build, running the same "garble build" twice could overwrite those files.

This could actually cause races at the filesystem level; if two concurrent builds write to the same gob file on disk, one of them could end up using a partially-written file.

Note that this is the only of the three cases not using temporary files. As such, it is expected that the file may already exist. In such a case, we simply avoid overwriting it rather than failing.

Third, when "garble build -a" was used, and when we needed an export file not listed in importcfg, we would end up calling roughly:

	go list -export -toolexec=garble -a <dependency>

This meant we would re-build and re-obfuscate those packages. Which is unfortunate, because the parent process already did via:

	go build -toolexec=garble -a <main>

The repeated dependency builds tripped the new os.O_EXCL check, as we would try to overwrite the same obfuscated Go files. Beyond being wasteful, this could again cause subtle filesystem races. To fix the problem, avoid passing flags like "-a" to nested go commands.

Overall, we should likely be using safer ways to write to disk, be it via either atomic writes or locked files. However, for now, catching duplicate writes is a big step. I have left a self-assigned TODO for further improvements.

CI on the pull request found a failure on test-gotip. The failure reproduces on master, so it seems to be related to gotip, and not a regression introduced by this change. For now, disable test-gotip until we can investigate.
	return dir, nil
}
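// createExclusive creates the named file for reading and writing,
// failing if the file already exists.
func createExclusive(name string) (*os.File, error) {
	return os.OpenFile(name, os.O_RDWR|os.O_CREATE|os.O_EXCL, 0o666)
}

// writeFileExclusive writes data to a new file created via createExclusive.
func writeFileExclusive(name string, data []byte) error {
	f, err := createExclusive(name)
	if err != nil {
		return err
	}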
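	// Always close the file, and return the first error we get.
	_, err = f.Write(data)
	if err2 := f.Close(); err == nil {
		err = err2
	}
	return err
}

// writeGobExclusive gob-encodes val into a new file created via createExclusive.
func writeGobExclusive(name string, val any) error {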
	f, err := createExclusive(name)
	if err != nil {
		return err
	}
	// Always close the file, and return the first error we get.
	err = gob.NewEncoder(f).Encode(val)
	if err2 := f.Close(); err == nil {
		err = err2
	}
	return err
}
// listedPackage contains the 'go list -json -export' fields obtained by the
// root process, shared with all garble sub-processes via a file.
type listedPackage struct {
	Name       string
	ImportPath string
	ForTest    string
	Export     string
BuildID string
Deps []string
ImportMap map[string]string
Standard bool
Dir string
CompiledGoFiles []string
IgnoredGoFiles []string
Imports []string
Error *packageError // to report package loading errors to the user
// The fields below are not part of 'go list', but are still reused
// between garble processes. Use "Garble" as a prefix to ensure no
// collisions with the JSON fields from 'go list'.
// GarbleActionID is a hash combining the action ID from BuildID
// with garble's own inputs, as per addGarbleToHash.
// It is set even when ToObfuscate is false, as it is also used for random
// seeds and build cache paths, and not just to obfuscate names.
GarbleActionID [sha256.Size]byte `json:"-"`
// ToObfuscate records whether the package should be obfuscated.
// When true, GarbleActionID must not be empty.
ToObfuscate bool `json:"-"`
}
type packageError struct {
Pos string
Err string
}
func (p *listedPackage) obfuscatedImportPath() string {
// We can't obfuscate these standard library import paths,
// as the toolchain expects to recognize the packages by them:
//
// * runtime: it is special in many ways
// * reflect: its presence limits dead code elimination
// * embed: its presence enables using //go:embed
// * others like syscall are allowed by import path to have more ABI tricks
//
// TODO: collect directly from cmd/internal/objabi/pkgspecial.go,
// in this particular case from allowAsmABIPkgs.
switch p.ImportPath {
case "runtime", "reflect", "embed", "syscall", "runtime/internal/startlinetest":
return p.ImportPath
}
// Intrinsics are matched by package import path as well.
if compilerIntrinsicsPkgs[p.ImportPath] {
return p.ImportPath
}
if !p.ToObfuscate {
return p.ImportPath
}
newPath := hashWithPackage(p, p.ImportPath)
log.Printf("import path %q hashed with %x to %q", p.ImportPath, p.GarbleActionID, newPath)
return newPath
}
// garbleBuildFlags are always passed to top-level build commands such as
// "go build", "go list", or "go test".
var garbleBuildFlags = []string{"-trimpath", "-buildvcs=false"}
// appendListedPackages gets information about the given packages and,
// when mainBuild is true, all of their transitive dependencies.
func appendListedPackages(packages []string, mainBuild bool) error {
startTime := time.Now()
args := []string{
"list",
// Similar flags to what go/packages uses.
"-json", "-export", "-compiled", "-e",
}
if mainBuild {
// When loading the top-level packages we are building,
// we want to transitively load all their dependencies as well.
// That is not the case when loading standard library packages,
// as runtimeLinknamed already contains transitive dependencies.
args = append(args, "-deps")
}
args = append(args, garbleBuildFlags...)
args = append(args, sharedCache.ForwardBuildFlags...)
if !mainBuild {
// If the top-level build included the -mod or -modfile flags,
// they should be used when loading the top-level packages.
// However, when loading standard library packages,
// using those flags would likely result in an error,
// as the standard library uses its own Go module and vendoring.
args = append(args, "-mod=", "-modfile=")
}
args = append(args, packages...)
cmd := exec.Command("go", args...)
defer func() {
log.Printf("original build info obtained in %s via: go %s", debugSince(startTime), strings.Join(args, " "))
}()
stdout, err := cmd.StdoutPipe()
if err != nil {
return err
}
var stderr bytes.Buffer
cmd.Stderr = &stderr
if err := cmd.Start(); err != nil {
return fmt.Errorf("go list error: %v", err)
}
dec := json.NewDecoder(stdout)
if sharedCache.ListedPackages == nil {
sharedCache.ListedPackages = make(map[string]*listedPackage)
	}
	var pkgErrors strings.Builder
	for dec.More() {
		var pkg listedPackage
		if err := dec.Decode(&pkg); err != nil {
			return err
		}
		if perr := pkg.Error; perr != nil {
			if pkg.Standard && len(pkg.CompiledGoFiles) == 0 && len(pkg.IgnoredGoFiles) > 0 {
				// Some packages in runtimeLinknamed need a build tag to be importable,
				// like crypto/internal/boring/fipstls with boringcrypto,
				// so any pkg.Error should be ignored when the build tag isn't set.
			} else {
				if pkgErrors.Len() > 0 {
					pkgErrors.WriteString("\n")
				}
				if perr.Pos != "" {
					pkgErrors.WriteString(perr.Pos)
					pkgErrors.WriteString(": ")
				}
				// Error messages sometimes include a trailing newline.
				pkgErrors.WriteString(strings.TrimRight(perr.Err, "\n"))
			}
		}
		// Note that we use the `-e` flag above with `go list`.
		// If a package fails to load, the Incomplete and Error fields will be set.
		// We still record failed packages in the ListedPackages map,
		// because some like crypto/internal/boring/fipstls simply fall under
		// "build constraints exclude all Go files" and can be ignored.
		// Real build errors will still be surfaced by `go build -toolexec` later.
		if sharedCache.ListedPackages[pkg.ImportPath] != nil {
			return fmt.Errorf("duplicate package: %q", pkg.ImportPath)
		}
		if pkg.BuildID != "" {
			actionID := decodeBuildIDHash(splitActionID(pkg.BuildID))
			pkg.GarbleActionID = addGarbleToHash(actionID)
		}
		sharedCache.ListedPackages[pkg.ImportPath] = &pkg
	}
	if err := cmd.Wait(); err != nil {
		return fmt.Errorf("go list error: %v:\nargs: %q\n%s", err, args, stderr.Bytes())
	}
	if pkgErrors.Len() > 0 {
		return errors.New(pkgErrors.String())
	}
	anyToObfuscate := false
	for path, pkg := range sharedCache.ListedPackages {
		// If "GOGARBLE=foo/bar", "foo/bar_test" should also match.
		if pkg.ForTest != "" {
			path = pkg.ForTest
		}
		switch {
		// We do not support obfuscating the runtime nor its dependencies.
		case runtimeAndDeps[path],
			// "unknown pc" crashes on windows in the cgo test otherwise.
			path == "runtime/cgo":
		// No point in obfuscating empty packages, like OS-specific ones that don't match.
		case len(pkg.CompiledGoFiles) == 0:
		// Test main packages like "foo/bar.test" are always obfuscated,
		// just like unnamed and plugin main packages.
		case pkg.Name == "main" && strings.HasSuffix(path, ".test"),
			path == "command-line-arguments",
			strings.HasPrefix(path, "plugin/unnamed"),
			module.MatchPrefixPatterns(sharedCache.GOGARBLE, path):
			pkg.ToObfuscate = true
			anyToObfuscate = true
			if len(pkg.GarbleActionID) == 0 {
				return fmt.Errorf("package %q to be obfuscated lacks build id?", pkg.ImportPath)
			}
		}
	}
	// Don't error if the user ran: GOGARBLE='*' garble build runtime
	if !anyToObfuscate && !module.MatchPrefixPatterns(sharedCache.GOGARBLE, "runtime") {
		return fmt.Errorf("GOGARBLE=%q does not match any packages to be built", sharedCache.GOGARBLE)
	}
	return nil
}
var listedRuntimeLinknamed = false
var ErrNotFound = errors.New("not found")
var ErrNotDependency = errors.New("not a dependency")
// listPackage gets the listedPackage information for a certain package.
func listPackage(from *listedPackage, path string) (*listedPackage, error) {
	if path == from.ImportPath {
		return from, nil
	}
	// If the path is listed in the top-level ImportMap, use its mapping instead.
	// This is a common scenario when dealing with vendored packages in GOROOT.
	// The map is flat, so we don't need to recurse.
	if path2 := from.ImportMap[path]; path2 != "" {
		path = path2
	}
	pkg, ok := sharedCache.ListedPackages[path]
	// A std package may list any other package in std, even those it doesn't depend on.
	// This is due to how runtime linkname-implements std packages,
	// such as sync/atomic or reflect, without importing them in any way.
	// A few other cases don't involve runtime, like time/tzdata linknaming to time,
	// but luckily those few cases are covered by runtimeLinknamed as well.
	//
	// If ListedPackages lacks such a package we fill it via runtimeLinknamed.
	// TODO: can we instead add runtimeLinknamed to the top-level "go list" args?
	if from.Standard {
		if ok {
			return pkg, nil
		}
		if listedRuntimeLinknamed {
			return nil, fmt.Errorf("package %q still missing after go list call", path)
		}
		startTime := time.Now()
		missing := make([]string, 0, len(runtimeLinknamed))
		for _, linknamed := range runtimeLinknamed {
			switch {
			case sharedCache.ListedPackages[linknamed] != nil:
// We already have it; skip.
case sharedCache.GoEnv.GOOS != "js" && linknamed == "syscall/js":
// GOOS-specific package.
case sharedCache.GoEnv.GOOS != "darwin" && sharedCache.GoEnv.GOOS != "ios" && linknamed == "crypto/x509/internal/macos":
// GOOS-specific package.
default:
missing = append(missing, linknamed)
}
}
// We don't need any information about their dependencies, in this case.
if err := appendListedPackages(missing, false); err != nil {
return nil, fmt.Errorf("failed to load missing runtime-linknamed packages: %v", err)
}
pkg, ok := sharedCache.ListedPackages[path]
if !ok {
return nil, fmt.Errorf("std listed another std package that we can't find: %s", path)
}
listedRuntimeLinknamed = true
log.Printf("listed %d missing runtime-linknamed packages in %s", len(missing), debugSince(startTime))
return pkg, nil
}
if !ok {
return nil, fmt.Errorf("list %s: %w", path, ErrNotFound)
}
// Packages outside std can list any package,
// as long as they depend on it directly or indirectly.
for _, dep := range from.Deps {
if dep == pkg.ImportPath {
return pkg, nil
}
}
// As a special case, any package can list runtime or its dependencies,
// since those are always an implicit dependency.
// We need to handle this ourselves as runtime does not appear in Deps.
// TODO: it might be faster to bring back a "runtimeAndDeps" map or func.
if pkg.ImportPath == "runtime" {
return pkg, nil
}
for _, dep := range sharedCache.ListedPackages["runtime"].Deps {
if dep == pkg.ImportPath {
return pkg, nil
}
}
return nil, fmt.Errorf("list %s: %w", path, ErrNotDependency)
}
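// The TODO above suggests that a "runtimeAndDeps" map might be faster than
// scanning runtime's Deps slice on every lookup. What follows is only a
// minimal, standalone sketch of that idea, not garble's implementation:
// the listedPackage type and the buildRuntimeAndDeps helper are hypothetical
// names introduced for illustration, and the sample Deps are made up.

```go
package main

import "fmt"

// listedPackage is a hypothetical, trimmed-down stand-in for the entries
// stored in ListedPackages; only the fields used below are modelled.
type listedPackage struct {
	ImportPath string
	Deps       []string
}

// buildRuntimeAndDeps returns a set containing "runtime" plus every import
// path in runtime's Deps, so that membership checks are a single map lookup
// rather than a linear scan. This helper is an assumption, not garble's API.
func buildRuntimeAndDeps(listed map[string]*listedPackage) map[string]bool {
	set := map[string]bool{"runtime": true}
	if rt := listed["runtime"]; rt != nil {
		for _, dep := range rt.Deps {
			set[dep] = true
		}
	}
	return set
}

func main() {
	// Sample data for illustration only; real Deps come from "go list".
	listed := map[string]*listedPackage{
		"runtime": {ImportPath: "runtime", Deps: []string{"internal/abi", "unsafe"}},
	}
	set := buildRuntimeAndDeps(listed)
	fmt.Println(set["runtime"], set["unsafe"], set["net/http"]) // prints "true true false"
}
```

// With such a set built once, the two runtime checks at the end of
// listPackage could in principle collapse into one lookup. Whether that is
// measurably faster in practice is untested here.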