|
|
|
// Copyright (c) 2019, The Garble Authors.
|
|
|
|
// See LICENSE for licensing information.
|
|
|
|
|
|
|
|
package main
|
|
|
|
|
|
|
|
import (
|
|
|
|
"bytes"
|
|
|
|
"encoding/base64"
|
|
|
|
"encoding/binary"
|
|
|
|
"encoding/gob"
|
use "go env -json" to collect env info all at once
In the worst case scenario, when GOPRIVATE isn't set at all, we would
run these three commands:
* "go env GOPRIVATE", to fetch GOPRIVATE itself
* "go list -m", for GOPRIVATE's fallback
* "go version", to check the version of Go being used
Now that we support Go 1.16 and later, all these three can be obtained
via "go env -json":
$ go env -json GOPRIVATE GOMOD GOVERSION
{
"GOMOD": "/home/mvdan/src/garble/go.mod",
"GOPRIVATE": "",
"GOVERSION": "go1.16.3"
}
Note that we don't get the module path directly, but we can use the
x/mod/modfile Go API to parse it from the GOMOD file cheaply.
Notably, this also simplifies our Go version checking logic, as now we
get just the version string without the "go version" prefix and
"GOOS/GOARCH" suffix we don't care about.
This makes our code a bit more maintainable and robust. When running a
short incremental build, we can also see a small speed-up, as saving two
"go" invocations can save a few milliseconds:
name old time/op new time/op delta
Build/Cache-8 168ms ± 0% 166ms ± 1% -1.26% (p=0.009 n=6+6)
name old bin-B new bin-B delta
Build/Cache-8 6.36M ± 0% 6.36M ± 0% +0.12% (p=0.002 n=6+6)
name old sys-time/op new sys-time/op delta
Build/Cache-8 222ms ± 2% 219ms ± 3% ~ (p=0.589 n=6+6)
name old user-time/op new user-time/op delta
Build/Cache-8 857ms ± 1% 846ms ± 1% -1.31% (p=0.041 n=6+6)
4 years ago
|
|
|
"encoding/json"
|
fail if we are unexpectedly overwriting files (#418)
While investigating a bug report,
I noticed that garble was writing to the same temp file twice.
At best, writing to the same path on disk twice is wasteful,
as the design is careful to be deterministic and use unique paths.
At worst, the two writes could cause races at the filesystem level.
To prevent either of those situations,
we now create files with os.OpenFile and os.O_EXCL,
meaning that we will error if the file already exists.
That change uncovered a number of such unintended cases.
First, transformAsm would write obfuscated Go files twice.
This is because the Go toolchain actually runs:
[...]/asm -gensymabis [...] foo.s bar.s
[...]/asm [...] foo.s bar.s
That is, the first run is only meant to generate symbol ABIs,
which are then used by the compiler.
We need to obfuscate at that first stage,
because the symbol ABI descriptions need to use obfuscated names.
However, having already obfuscated the assembly on the first stage,
there is no need to do so again on the second stage.
If we detect gensymabis is missing, we simply reuse the previous files.
This first situation doesn't seem racy,
but obfuscating the Go assembly files twice is certainly unnecessary.
Second, saveKnownReflectAPIs wrote a gob file to the build cache.
Since the build cache can be kept between builds,
and since the build cache uses reproducible paths for each build,
running the same "garble build" twice could overwrite those files.
This could actually cause races at the filesystem level;
if two concurrent builds write to the same gob file on disk,
one of them could end up using a partially-written file.
Note that this is the only of the three cases not using temporary files.
As such, it is expected that the file may already exist.
In such a case, we simply avoid overwriting it rather than failing.
Third, when "garble build -a" was used,
and when we needed an export file not listed in importcfg,
we would end up calling roughly:
go list -export -toolexec=garble -a <dependency>
This meant we would re-build and re-obfuscate those packages.
Which is unfortunate, because the parent process already did via:
go build -toolexec=garble -a <main>
The repeated dependency builds tripped the new os.O_EXCL check,
as we would try to overwrite the same obfuscated Go files.
Beyond being wasteful, this could again cause subtle filesystem races.
To fix the problem, avoid passing flags like "-a" to nested go commands.
Overall, we should likely be using safer ways to write to disk,
be it via either atomic writes or locked files.
However, for now, catching duplicate writes is a big step.
I have left a self-assigned TODO for further improvements.
CI on the pull request found a failure on test-gotip.
The failure reproduces on master, so it seems to be related to gotip,
and not a regression introduced by this change.
For now, disable test-gotip until we can investigate.
4 years ago
|
|
|
"errors"
|
|
|
|
"flag"
|
|
|
|
"fmt"
|
|
|
|
"go/ast"
|
|
|
|
"go/importer"
|
|
|
|
"go/parser"
|
|
|
|
"go/token"
|
|
|
|
"go/types"
|
|
|
|
"io"
|
fail if we are unexpectedly overwriting files (#418)
While investigating a bug report,
I noticed that garble was writing to the same temp file twice.
At best, writing to the same path on disk twice is wasteful,
as the design is careful to be deterministic and use unique paths.
At worst, the two writes could cause races at the filesystem level.
To prevent either of those situations,
we now create files with os.OpenFile and os.O_EXCL,
meaning that we will error if the file already exists.
That change uncovered a number of such unintended cases.
First, transformAsm would write obfuscated Go files twice.
This is because the Go toolchain actually runs:
[...]/asm -gensymabis [...] foo.s bar.s
[...]/asm [...] foo.s bar.s
That is, the first run is only meant to generate symbol ABIs,
which are then used by the compiler.
We need to obfuscate at that first stage,
because the symbol ABI descriptions need to use obfuscated names.
However, having already obfuscated the assembly on the first stage,
there is no need to do so again on the second stage.
If we detect gensymabis is missing, we simply reuse the previous files.
This first situation doesn't seem racy,
but obfuscating the Go assembly files twice is certainly unnecessary.
Second, saveKnownReflectAPIs wrote a gob file to the build cache.
Since the build cache can be kept between builds,
and since the build cache uses reproducible paths for each build,
running the same "garble build" twice could overwrite those files.
This could actually cause races at the filesystem level;
if two concurrent builds write to the same gob file on disk,
one of them could end up using a partially-written file.
Note that this is the only of the three cases not using temporary files.
As such, it is expected that the file may already exist.
In such a case, we simply avoid overwriting it rather than failing.
Third, when "garble build -a" was used,
and when we needed an export file not listed in importcfg,
we would end up calling roughly:
go list -export -toolexec=garble -a <dependency>
This meant we would re-build and re-obfuscate those packages.
Which is unfortunate, because the parent process already did via:
go build -toolexec=garble -a <main>
The repeated dependency builds tripped the new os.O_EXCL check,
as we would try to overwrite the same obfuscated Go files.
Beyond being wasteful, this could again cause subtle filesystem races.
To fix the problem, avoid passing flags like "-a" to nested go commands.
Overall, we should likely be using safer ways to write to disk,
be it via either atomic writes or locked files.
However, for now, catching duplicate writes is a big step.
I have left a self-assigned TODO for further improvements.
CI on the pull request found a failure on test-gotip.
The failure reproduces on master, so it seems to be related to gotip,
and not a regression introduced by this change.
For now, disable test-gotip until we can investigate.
4 years ago
|
|
|
"io/fs"
|
use "go env -json" to collect env info all at once
In the worst case scenario, when GOPRIVATE isn't set at all, we would
run these three commands:
* "go env GOPRIVATE", to fetch GOPRIVATE itself
* "go list -m", for GOPRIVATE's fallback
* "go version", to check the version of Go being used
Now that we support Go 1.16 and later, all these three can be obtained
via "go env -json":
$ go env -json GOPRIVATE GOMOD GOVERSION
{
"GOMOD": "/home/mvdan/src/garble/go.mod",
"GOPRIVATE": "",
"GOVERSION": "go1.16.3"
}
Note that we don't get the module path directly, but we can use the
x/mod/modfile Go API to parse it from the GOMOD file cheaply.
Notably, this also simplifies our Go version checking logic, as now we
get just the version string without the "go version" prefix and
"GOOS/GOARCH" suffix we don't care about.
This makes our code a bit more maintainable and robust. When running a
short incremental build, we can also see a small speed-up, as saving two
"go" invocations can save a few milliseconds:
name old time/op new time/op delta
Build/Cache-8 168ms ± 0% 166ms ± 1% -1.26% (p=0.009 n=6+6)
name old bin-B new bin-B delta
Build/Cache-8 6.36M ± 0% 6.36M ± 0% +0.12% (p=0.002 n=6+6)
name old sys-time/op new sys-time/op delta
Build/Cache-8 222ms ± 2% 219ms ± 3% ~ (p=0.589 n=6+6)
name old user-time/op new user-time/op delta
Build/Cache-8 857ms ± 1% 846ms ± 1% -1.31% (p=0.041 n=6+6)
4 years ago
|
|
|
"io/ioutil"
|
|
|
|
"log"
|
|
|
|
mathrand "math/rand"
|
|
|
|
"os"
|
|
|
|
"os/exec"
|
|
|
|
"path/filepath"
|
|
|
|
"regexp"
|
|
|
|
"runtime"
|
|
|
|
"runtime/debug"
|
reimplement import path obfuscation without goobj2 (#242)
We used to rely on a parallel implementation of an object file parser
and writer to be able to obfuscate import paths. After compiling each
package, we would parse the object file, replace the import paths, and
write the updated object file in-place.
That worked well, in most cases. Unfortunately, it had some flaws:
* Complexity. Even when most of the code is maintained in a separate
module, the import_obfuscation.go file was still close to a thousand
lines of code.
* Go compatibility. The object file format changes between Go releases,
so we were supporting Go 1.15, but not 1.16. Fixing the object file
package to work with 1.16 would probably break 1.15 support.
* Bugs. For example, we recently had to add a workaround for #224, since
import paths containing dots after the domain would end up escaped.
Another example is #190, which seems to be caused by the object file
parser or writer corrupting the compiled code and causing segfaults in
some rare edge cases.
Instead, let's drop that method entirely, and force the compiler and
linker to do the work for us. The steps necessary when compiling a
package to obfuscate are:
1) Replace its "package foo" lines with the obfuscated package path. No
need to separate the package path and name, since the obfuscated path
does not contain slashes.
2) Replace the "-p pkg/foo" flag with the obfuscated path.
3) Replace the "import" spec lines with the obfuscated package paths,
for those dependencies which were obfuscated.
4) Replace the "-importcfg [...]" file with a version that uses the
obfuscated paths instead.
The linker also needs that last step, since it also uses an importcfg
file to find object files.
There are three noteworthy drawbacks to this new method:
1) Since we no longer write object files, we can't use them to store
data to be cached. As such, the -debugdir flag goes back to using the
"-a" build flag to always rebuild all packages. On the plus side,
that caching didn't work very well; see #176.
2) The package name "main" remains in all declarations under it, not
just "func main", since we can only rename entire packages. This
seems fine, as it gives little information to the end user.
3) The -tiny mode no longer sets all lines to 0, since it did that by
modifying object files. As a temporary measure, we instead set all
top-level declarations to be on line 1. A TODO is added to hopefully
improve this again in the near future.
The upside is that we get rid of all the issues mentioned before. Plus,
garble now nearly works with Go 1.16, with the exception of two very
minor bugs that look fixable. A follow-up PR will take care of that and
start testing on 1.16.
Fixes #176.
Fixes #190.
4 years ago
|
|
|
"strconv"
|
|
|
|
"strings"
|
|
|
|
"unicode"
|
|
|
|
"unicode/utf8"
|
|
|
|
|
use "go env -json" to collect env info all at once
In the worst case scenario, when GOPRIVATE isn't set at all, we would
run these three commands:
* "go env GOPRIVATE", to fetch GOPRIVATE itself
* "go list -m", for GOPRIVATE's fallback
* "go version", to check the version of Go being used
Now that we support Go 1.16 and later, all these three can be obtained
via "go env -json":
$ go env -json GOPRIVATE GOMOD GOVERSION
{
"GOMOD": "/home/mvdan/src/garble/go.mod",
"GOPRIVATE": "",
"GOVERSION": "go1.16.3"
}
Note that we don't get the module path directly, but we can use the
x/mod/modfile Go API to parse it from the GOMOD file cheaply.
Notably, this also simplifies our Go version checking logic, as now we
get just the version string without the "go version" prefix and
"GOOS/GOARCH" suffix we don't care about.
This makes our code a bit more maintainable and robust. When running a
short incremental build, we can also see a small speed-up, as saving two
"go" invocations can save a few milliseconds:
name old time/op new time/op delta
Build/Cache-8 168ms ± 0% 166ms ± 1% -1.26% (p=0.009 n=6+6)
name old bin-B new bin-B delta
Build/Cache-8 6.36M ± 0% 6.36M ± 0% +0.12% (p=0.002 n=6+6)
name old sys-time/op new sys-time/op delta
Build/Cache-8 222ms ± 2% 219ms ± 3% ~ (p=0.589 n=6+6)
name old user-time/op new user-time/op delta
Build/Cache-8 857ms ± 1% 846ms ± 1% -1.31% (p=0.041 n=6+6)
4 years ago
|
|
|
"golang.org/x/mod/modfile"
|
|
|
|
"golang.org/x/mod/module"
|
|
|
|
"golang.org/x/mod/semver"
|
|
|
|
"golang.org/x/tools/go/ast/astutil"
|
|
|
|
|
|
|
|
"mvdan.cc/garble/internal/literals"
|
|
|
|
)
|
|
|
|
|
|
|
|
var (
|
|
|
|
flagSet = flag.NewFlagSet("garble", flag.ContinueOnError)
|
|
|
|
|
|
|
|
version = "(devel)" // to match the default from runtime/debug
|
|
|
|
)
|
|
|
|
|
|
|
|
var (
|
|
|
|
flagObfuscateLiterals bool
|
|
|
|
flagGarbleTiny bool
|
|
|
|
flagDebugDir string
|
|
|
|
flagSeed string
|
|
|
|
)
|
|
|
|
|
|
|
|
func init() {
|
|
|
|
flagSet.Usage = usage
|
|
|
|
flagSet.BoolVar(&flagObfuscateLiterals, "literals", false, "Obfuscate literals such as strings")
|
|
|
|
flagSet.BoolVar(&flagGarbleTiny, "tiny", false, "Optimize for binary size, losing some ability to reverse the process")
|
|
|
|
flagSet.StringVar(&flagDebugDir, "debugdir", "", "Write the obfuscated source to a directory, e.g. -debugdir=out")
|
|
|
|
flagSet.StringVar(&flagSeed, "seed", "", "Provide a base64-encoded seed, e.g. -seed=o9WDTZ4CN4w\nFor a random seed, provide -seed=random")
|
|
|
|
}
|
|
|
|
|
|
|
|
func usage() {
|
|
|
|
fmt.Fprintf(os.Stderr, `
|
|
|
|
Garble obfuscates Go code by wrapping the Go toolchain.
|
|
|
|
|
|
|
|
garble [garble flags] command [go flags] [go arguments]
|
|
|
|
|
|
|
|
For example, to build an obfuscated program:
|
|
|
|
|
|
|
|
garble build ./cmd/foo
|
|
|
|
|
|
|
|
Similarly, to combine garble flags and Go build flags:
|
|
|
|
|
|
|
|
garble -literals build -tags=purego ./cmd/foo
|
|
|
|
|
|
|
|
The following commands are supported:
|
|
|
|
|
|
|
|
build replace "go build"
|
|
|
|
test replace "go test"
|
|
|
|
version print Garble version
|
|
|
|
reverse de-obfuscate output such as stack traces
|
|
|
|
|
|
|
|
To learn more about a command, run "garble help <command>".
|
fix and re-enable "garble test" (#268)
With the many refactors building up to v0.1.0, we broke "garble test" as
we no longer dealt with test packages well.
Luckily, now that we can depend on TOOLEXEC_IMPORTPATH, we can support
the test command again, as we can always figure out what package we're
currently compiling, without having to track a "main" package.
Note that one major pitfall there is test packages, where
TOOLEXEC_IMPORTPATH does not agree with ImportPath from "go list -json".
However, we can still work around that with a bit of glue code, which is
also copiously documented.
The second change necessary is to consider test packages private
depending on whether their non-test package is private or not. This can
be done via the ForTest field in "go list -json".
The third change is to obfuscate "_testmain.go" files, which are the
code-generated main functions which actually run tests. We used to not
need to obfuscate them, since test function names are never obfuscated
and we used to not obfuscate import paths at compilation time. Now we do
rewrite import paths, so we must do that for "_testmain.go" too.
The fourth change is to re-enable test.txt, and expand it with more
sanity checks and edge cases.
Finally, document "garble test" again.
Fixes #241.
4 years ago
|
|
|
|
|
|
|
garble accepts the following flags before a command:
|
|
|
|
|
|
|
|
`[1:])
|
|
|
|
flagSet.PrintDefaults()
|
|
|
|
fmt.Fprintf(os.Stderr, `
|
|
|
|
|
|
|
|
For more information, see https://github.com/burrowers/garble.
|
|
|
|
`[1:])
|
|
|
|
}
|
|
|
|
|
|
|
|
func main() { os.Exit(main1()) }
|
|
|
|
|
|
|
|
var (
|
|
|
|
fset = token.NewFileSet()
|
|
|
|
sharedTempDir = os.Getenv("GARBLE_SHARED")
|
|
|
|
|
simplify globals, split hash.go (#191)
The previous globals worked, but were unnecessarily complex. For
example, we passed the fromPath variable around, but it's really a
static global, since we only compile or link a single package in each Go
process. Use such global variables instead of passing them around, which
currently include the package's import path, its build ID, and its
import config path.
Also split all the hashing and build ID code into hash.go, since that's
a relatively well contained 200 lines of code that doesn't need to make
main.go any bigger. We also split the code to alter Go's own version to
a separate function, so that it can be moved out of main.go as well.
5 years ago
|
|
|
// origImporter is a go/types importer which uses the original versions
|
|
|
|
// of packages, without any obfuscation. This is helpful to make
|
|
|
|
// decisions on how to obfuscate our input code.
|
wrap types.Importer to canonicalize import paths
The docs for go/importer.ForCompiler say:
The lookup function is called each time the resulting importer
needs to resolve an import path. In this mode the importer can
only be invoked with canonical import paths (not relative or
absolute ones); it is assumed that the translation to canonical
import paths is being done by the client of the importer.
We use a lookup func for two reasons: first, to support modules, and
second, to be able to use our information from "go list -json -export".
However, go/types does not canonicalize import paths before calling
ImportFrom. This is somewhat understandable; it doesn't know whether an
importer was created with a lookup func, and ImportFrom only requires
the input path to be canonicalized in that scenario. When the lookup
func is nil, the importer canonicalizes by itself via go/build.Import.
Before this change, the added crossbuild test would fail:
> garble build net/http
[stderr]
# vendor/golang.org/x/crypto/chacha20
typecheck error: /usr/lib/go/src/vendor/golang.org/x/crypto/chacha20/chacha_generic.go:10:2: could not import crypto/cipher (can't find import: "crypto/cipher")
# vendor/golang.org/x/text/secure/bidirule
typecheck error: /usr/lib/go/src/vendor/golang.org/x/text/secure/bidirule/bidirule.go:12:2: could not import errors (can't find import: "errors")
# vendor/golang.org/x/crypto/cryptobyte
typecheck error: /usr/lib/go/src/vendor/golang.org/x/crypto/cryptobyte/asn1.go:8:16: could not import encoding/asn1 (can't find import: "encoding/asn1")
# vendor/golang.org/x/text/unicode/norm
typecheck error: /usr/lib/go/src/vendor/golang.org/x/text/unicode/norm/composition.go:7:8: could not import unicode/utf8 (can't find import: "unicode/utf8")
This is because we'd fall back to importer.Default, which only knows how
to find packages in $GOROOT/pkg. Those are missing for cross-builds,
unsurprisingly, as those built archives end up in the build cache.
After this change, we properly support importing std-vendored packages,
so we can get rid of the importer.Default workaround. And, by extension,
cross-builds now work as well.
Note that, in the added test script, the full build of the binary fails,
as there seems to be some sort of linker problem:
> garble build
[stderr]
# test/main
d9rqJyxo.uoqIiDs5: relocation target runtime.os9A16A3 not defined
We leave that as a TODO for now, as this change is subtle enough as it
is.
4 years ago
|
|
|
origImporter = importerWithMap(importer.ForCompiler(fset, "gc", func(path string) (io.ReadCloser, error) {
|
simplify globals, split hash.go (#191)
The previous globals worked, but were unnecessarily complex. For
example, we passed the fromPath variable around, but it's really a
static global, since we only compile or link a single package in each Go
process. Use such global variables instead of passing them around, which
currently include the package's import path, its build ID, and its
import config path.
Also split all the hashing and build ID code into hash.go, since that's
a relatively well contained 200 lines of code that doesn't need to make
main.go any bigger. We also split the code to alter Go's own version to
a separate function, so that it can be moved out of main.go as well.
5 years ago
|
|
|
pkg, err := listPackage(path)
|
|
|
|
if err != nil {
|
|
|
|
return nil, err
|
|
|
|
}
|
|
|
|
return os.Open(pkg.Export)
|
wrap types.Importer to canonicalize import paths
The docs for go/importer.ForCompiler say:
The lookup function is called each time the resulting importer
needs to resolve an import path. In this mode the importer can
only be invoked with canonical import paths (not relative or
absolute ones); it is assumed that the translation to canonical
import paths is being done by the client of the importer.
We use a lookup func for two reasons: first, to support modules, and
second, to be able to use our information from "go list -json -export".
However, go/types does not canonicalize import paths before calling
ImportFrom. This is somewhat understandable; it doesn't know whether an
importer was created with a lookup func, and ImportFrom only requires
the input path to be canonicalized in that scenario. When the lookup
func is nil, the importer canonicalizes by itself via go/build.Import.
Before this change, the added crossbuild test would fail:
> garble build net/http
[stderr]
# vendor/golang.org/x/crypto/chacha20
typecheck error: /usr/lib/go/src/vendor/golang.org/x/crypto/chacha20/chacha_generic.go:10:2: could not import crypto/cipher (can't find import: "crypto/cipher")
# vendor/golang.org/x/text/secure/bidirule
typecheck error: /usr/lib/go/src/vendor/golang.org/x/text/secure/bidirule/bidirule.go:12:2: could not import errors (can't find import: "errors")
# vendor/golang.org/x/crypto/cryptobyte
typecheck error: /usr/lib/go/src/vendor/golang.org/x/crypto/cryptobyte/asn1.go:8:16: could not import encoding/asn1 (can't find import: "encoding/asn1")
# vendor/golang.org/x/text/unicode/norm
typecheck error: /usr/lib/go/src/vendor/golang.org/x/text/unicode/norm/composition.go:7:8: could not import unicode/utf8 (can't find import: "unicode/utf8")
This is because we'd fall back to importer.Default, which only knows how
to find packages in $GOROOT/pkg. Those are missing for cross-builds,
unsurprisingly, as those built archives end up in the build cache.
After this change, we properly support importing std-vendored packages,
so we can get rid of the importer.Default workaround. And, by extension,
cross-builds now work as well.
Note that, in the added test script, the full build of the binary fails,
as there seems to be some sort of linker problem:
> garble build
[stderr]
# test/main
d9rqJyxo.uoqIiDs5: relocation target runtime.os9A16A3 not defined
We leave that as a TODO for now, as this change is subtle enough as it
is.
4 years ago
|
|
|
}).(types.ImporterFrom).ImportFrom)
|
|
|
|
|
start using original action IDs (#251)
When we obfuscate a name, what we do is hash the name with the action ID
of the package that contains the name. To ensure that the hash changes
if the garble tool changes, we used the action ID of the obfuscated
build, which is different than the original action ID, as we include
garble's own content ID in "go tool compile -V=full" via -toolexec.
Let's call that the "obfuscated action ID". Remember that a content ID
is roughly the hash of a binary or object file, and an action ID
contains the hash of a package's source code plus the content IDs of its
dependencies.
This had the advantage that it did what we wanted. However, it had one
massive drawback: when we compile a package, we only have the obfuscated
action IDs of its dependencies. This is because one can't have the
content ID of dependent packages before they are built.
Usually, this is not a problem, because hashing a foreign name means it
comes from a dependency, where we already have the obfuscated action ID.
However, that's not always the case.
First, go:linkname directives can point to any symbol that ends up in
the binary, even if the package is not a dependency. So garble could
only support linkname targets belonging to dependencies. This is at the
root of why we could not obfuscate the runtime; it contains linkname
directives targeting the net package, for example, which depends on runtime.
Second, some other places did not have an easy access to obfuscated
action IDs, like transformAsm, which had to recover it from a temporary
file stored by transformCompile.
Plus, this was all pretty expensive, as each toolexec sub-process had to
make repeated calls to buildidOf with the object files of dependencies.
We even had to use extra calls to "go list" in the case of indirect
dependencies, as their export files do not appear in importcfg files.
All in all, the old method was complex and expensive. A better mechanism
is to use the original action IDs directly, as listed by "go list"
without garble in the picture.
This would mean that the hashing does not change if garble changes,
meaning weaker obfuscation. To regain that property, we define the
"garble action ID", which is just the original action ID hashed together
with garble's own content ID.
This is practically the same as the obfuscated build ID we used before,
but since it doesn't go through "go tool compile -V=full" and the
obfuscated build itself, we can work out *all* the garble action IDs
upfront, before the obfuscated build even starts.
This fixes all of our problems. Now we know all garble build IDs
upfront, so a bunch of hacks can be entirely removed. Plus, since we
know them upfront, we can also cache them and avoid repeated calls to
"go tool buildid".
While at it, make use of the new BuildID field in Go 1.16's "list -json
-export". This avoids the vast majority of "go tool buildid" calls, as
the only ones that remain are 2 on the garble binary itself.
The numbers for Go 1.16 look very good:
name old time/op new time/op delta
Build-8 146ms ± 4% 101ms ± 1% -31.01% (p=0.002 n=6+6)
name old bin-B new bin-B delta
Build-8 6.61M ± 0% 6.60M ± 0% -0.09% (p=0.002 n=6+6)
name old sys-time/op new sys-time/op delta
Build-8 321ms ± 7% 202ms ± 6% -37.11% (p=0.002 n=6+6)
name old user-time/op new user-time/op delta
Build-8 538ms ± 4% 414ms ± 4% -23.12% (p=0.002 n=6+6)
4 years ago
|
|
|
// Basic information about the package being currently compiled or linked.
|
|
|
|
curPkg *listedPackage
|
|
|
|
|
stop relying on nested "go list -toolexec" calls (#422)
We rely on importcfg files to load type info for obfuscated packages.
We use this type information to remember what names we didn't obfuscate.
Unfortunately, indirect dependencies aren't listed in importcfg files,
so we relied on extra "go list -toolexec" calls to locate object files.
This worked fine, but added a significant amount of complexity.
The extra "go list -export -toolexec=garble" invocations weren't slow,
as they avoided rebuilding or re-obfuscating thanks to the build cache.
Still, it was hard to reason about how garble runs during a build
if we might have multiple layers of -toolexec invocations.
Instead, record the export files we encounter in an incremental map,
and persist it in the build cache via the gob file we're already using.
This way, each garble invocation knows where all object files are,
even those for indirect imports.
One wrinkle is that importcfg files can point to temporary object files.
In that case, figure out its final location in the build cache.
This requires hard-coding a bit of knowledge about how GOCACHE works,
but it seems relatively harmless given how it's very little code.
Plus, if GOCACHE ever changes, it will be obvious when our code breaks.
Finally, add a TODO about potentially saving even more work.
4 years ago
|
|
|
garbledImporter = importer.ForCompiler(fset, "gc", func(path string) (io.ReadCloser, error) {
|
|
|
|
pkgfile := cachedOutput.KnownObjectFiles[path]
|
|
|
|
if pkgfile == "" {
|
|
|
|
panic(fmt.Sprintf("missing in cachedOutput.KnownObjectFiles: %q", path))
|
|
|
|
}
|
|
|
|
return os.Open(pkgfile)
|
|
|
|
}).(types.ImporterFrom)
|
|
|
|
|
start using original action IDs (#251)
When we obfuscate a name, what we do is hash the name with the action ID
of the package that contains the name. To ensure that the hash changes
if the garble tool changes, we used the action ID of the obfuscated
build, which is different than the original action ID, as we include
garble's own content ID in "go tool compile -V=full" via -toolexec.
Let's call that the "obfuscated action ID". Remember that a content ID
is roughly the hash of a binary or object file, and an action ID
contains the hash of a package's source code plus the content IDs of its
dependencies.
This had the advantage that it did what we wanted. However, it had one
massive drawback: when we compile a package, we only have the obfuscated
action IDs of its dependencies. This is because one can't have the
content ID of dependent packages before they are built.
Usually, this is not a problem, because hashing a foreign name means it
comes from a dependency, where we already have the obfuscated action ID.
However, that's not always the case.
First, go:linkname directives can point to any symbol that ends up in
the binary, even if the package is not a dependency. So garble could
only support linkname targets belonging to dependencies. This is at the
root of why we could not obfuscate the runtime; it contains linkname
directives targeting the net package, for example, which depends on runtime.
Second, some other places did not have an easy access to obfuscated
action IDs, like transformAsm, which had to recover it from a temporary
file stored by transformCompile.
Plus, this was all pretty expensive, as each toolexec sub-process had to
make repeated calls to buildidOf with the object files of dependencies.
We even had to use extra calls to "go list" in the case of indirect
dependencies, as their export files do not appear in importcfg files.
All in all, the old method was complex and expensive. A better mechanism
is to use the original action IDs directly, as listed by "go list"
without garble in the picture.
This would mean that the hashing does not change if garble changes,
meaning weaker obfuscation. To regain that property, we define the
"garble action ID", which is just the original action ID hashed together
with garble's own content ID.
This is practically the same as the obfuscated build ID we used before,
but since it doesn't go through "go tool compile -V=full" and the
obfuscated build itself, we can work out *all* the garble action IDs
upfront, before the obfuscated build even starts.
This fixes all of our problems. Now we know all garble build IDs
upfront, so a bunch of hacks can be entirely removed. Plus, since we
know them upfront, we can also cache them and avoid repeated calls to
"go tool buildid".
While at it, make use of the new BuildID field in Go 1.16's "list -json
-export". This avoids the vast majority of "go tool buildid" calls, as
the only ones that remain are 2 on the garble binary itself.
The numbers for Go 1.16 look very good:
name old time/op new time/op delta
Build-8 146ms ± 4% 101ms ± 1% -31.01% (p=0.002 n=6+6)
name old bin-B new bin-B delta
Build-8 6.61M ± 0% 6.60M ± 0% -0.09% (p=0.002 n=6+6)
name old sys-time/op new sys-time/op delta
Build-8 321ms ± 7% 202ms ± 6% -37.11% (p=0.002 n=6+6)
name old user-time/op new user-time/op delta
Build-8 538ms ± 4% 414ms ± 4% -23.12% (p=0.002 n=6+6)
4 years ago
|
|
|
opts *flagOptions
|
|
|
|
)
|
|
|
|
|
wrap types.Importer to canonicalize import paths
The docs for go/importer.ForCompiler say:
The lookup function is called each time the resulting importer
needs to resolve an import path. In this mode the importer can
only be invoked with canonical import paths (not relative or
absolute ones); it is assumed that the translation to canonical
import paths is being done by the client of the importer.
We use a lookup func for two reasons: first, to support modules, and
second, to be able to use our information from "go list -json -export".
However, go/types does not canonicalize import paths before calling
ImportFrom. This is somewhat understandable; it doesn't know whether an
importer was created with a lookup func, and ImportFrom only requires
the input path to be canonicalized in that scenario. When the lookup
func is nil, the importer canonicalizes by itself via go/build.Import.
Before this change, the added crossbuild test would fail:
> garble build net/http
[stderr]
# vendor/golang.org/x/crypto/chacha20
typecheck error: /usr/lib/go/src/vendor/golang.org/x/crypto/chacha20/chacha_generic.go:10:2: could not import crypto/cipher (can't find import: "crypto/cipher")
# vendor/golang.org/x/text/secure/bidirule
typecheck error: /usr/lib/go/src/vendor/golang.org/x/text/secure/bidirule/bidirule.go:12:2: could not import errors (can't find import: "errors")
# vendor/golang.org/x/crypto/cryptobyte
typecheck error: /usr/lib/go/src/vendor/golang.org/x/crypto/cryptobyte/asn1.go:8:16: could not import encoding/asn1 (can't find import: "encoding/asn1")
# vendor/golang.org/x/text/unicode/norm
typecheck error: /usr/lib/go/src/vendor/golang.org/x/text/unicode/norm/composition.go:7:8: could not import unicode/utf8 (can't find import: "unicode/utf8")
This is because we'd fall back to importer.Default, which only knows how
to find packages in $GOROOT/pkg. Those are missing for cross-builds,
unsurprisingly, as those built archives end up in the build cache.
After this change, we properly support importing std-vendored packages,
so we can get rid of the importer.Default workaround. And, by extension,
cross-builds now work as well.
Note that, in the added test script, the full build of the binary fails,
as there seems to be some sort of linker problem:
> garble build
[stderr]
# test/main
d9rqJyxo.uoqIiDs5: relocation target runtime.os9A16A3 not defined
We leave that as a TODO for now, as this change is subtle enough as it
is.
4 years ago
|
|
|
type importerWithMap func(path, dir string, mode types.ImportMode) (*types.Package, error)
|
|
|
|
|
|
|
|
func (fn importerWithMap) Import(path string) (*types.Package, error) {
|
|
|
|
panic("should never be called")
|
|
|
|
}
|
|
|
|
|
|
|
|
func (fn importerWithMap) ImportFrom(path, dir string, mode types.ImportMode) (*types.Package, error) {
|
|
|
|
if path2 := curPkg.ImportMap[path]; path2 != "" {
|
|
|
|
path = path2
|
|
|
|
}
|
|
|
|
return fn(path, dir, mode)
|
|
|
|
}
|
|
|
|
|
avoid one more call to 'go tool buildid' (#253)
We use it to get the content ID of garble's binary, which is used for
both the garble action IDs, as well as 'go tool compile -V=full'.
Since those two happen in separate processes, both used to call 'go tool
buildid' separately. Store it in the gob cache the first time, and reuse
it the second time.
Since each call to cmd/go costs about 10ms (new process, running its
many init funcs, etc), this results in a nice speed-up for our small
benchmark. Most builds will take many seconds though, so note that a
~15ms speedup there will likely not be noticeable.
While at it, simplify the buildInfo global, as now it just contains a
map representation of the -importcfg contents. It now has better names,
docs, and a simpler representation.
We also stop using the term "garbled import", as it was a bit confusing.
"obfuscated types.Package" is a much better description.
name old time/op new time/op delta
Build-8 106ms ± 1% 92ms ± 0% -14.07% (p=0.010 n=6+4)
name old bin-B new bin-B delta
Build-8 6.60M ± 0% 6.60M ± 0% -0.01% (p=0.002 n=6+6)
name old sys-time/op new sys-time/op delta
Build-8 208ms ± 5% 149ms ± 3% -28.27% (p=0.004 n=6+5)
name old user-time/op new user-time/op delta
Build-8 433ms ± 3% 384ms ± 3% -11.35% (p=0.002 n=6+6)
4 years ago
|
|
|
func obfuscatedTypesPackage(path string) *types.Package {
|
fix a number of issues involving types from indirect imports
obfuscatedTypesPackage is used to figure out if a name in a dependency
package was obfuscated or not. For example, if that package used
reflection on a named type, it wasn't obfuscated, so we must have the
same information to not obfuscate the same name downstream.
obfuscatedTypesPackage could return nil if the package was indirectly
imported, though. This can happen if a direct import has a function that
returns an indirect type, or if a direct import exposes a name that's a
type alias to an indirect type.
We sort of dealt with this in two pieces of code by checking for
obfPkg!=nil, but a third one did not have this check and caused a panic
in the added test case:
--- FAIL: TestScripts/reflect (0.81s)
testscript.go:397:
> env GOPRIVATE=test/main
> garble build
[stderr]
# test/main
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x8a5e39]
More importantly though, the nil check only avoids panics. It doesn't
fix the root cause of the problem: that importcfg does not contain
indirectly imported packages. The added test case would still fail, as
we would obfuscate a type in the main package, but not in the indirectly
imported package where the type is defined.
To fix this, resurrect a bit of code from earlier garble versions, which
uses "go list -toolexec=garble" to fetch a package's export file. This
lets us fill the indirect import gaps in importcfg, working around the
problem entirely.
This solution is still not particularly great, so we add a TODO about
possibly rethinking this in the future. It does add some overhead and
complexity, though thankfully indirect imports should be uncommon.
This fixes a few panics while building the protobuf module.
4 years ago
|
|
|
if path == curPkg.ImportPath {
|
|
|
|
panic("called obfuscatedTypesPackage on the current package?")
|
|
|
|
}
|
stop relying on nested "go list -toolexec" calls (#422)
We rely on importcfg files to load type info for obfuscated packages.
We use this type information to remember what names we didn't obfuscate.
Unfortunately, indirect dependencies aren't listed in importcfg files,
so we relied on extra "go list -toolexec" calls to locate object files.
This worked fine, but added a significant amount of complexity.
The extra "go list -export -toolexec=garble" invocations weren't slow,
as they avoided rebuilding or re-obfuscating thanks to the build cache.
Still, it was hard to reason about how garble runs during a build
if we might have multiple layers of -toolexec invocations.
Instead, record the export files we encounter in an incremental map,
and persist it in the build cache via the gob file we're already using.
This way, each garble invocation knows where all object files are,
even those for indirect imports.
One wrinkle is that importcfg files can point to temporary object files.
In that case, figure out its final location in the build cache.
This requires hard-coding a bit of knowledge about how GOCACHE works,
but it seems relatively harmless given how it's very little code.
Plus, if GOCACHE ever changes, it will be obvious when our code breaks.
Finally, add a TODO about potentially saving even more work.
4 years ago
|
|
|
if pkg := cachedTypesPackages[path]; pkg != nil {
|
|
|
|
return pkg
|
|
|
|
}
|
|
|
|
pkg, err := garbledImporter.ImportFrom(path, opts.GarbleDir, 0)
|
|
|
|
if err != nil {
|
stop relying on nested "go list -toolexec" calls (#422)
We rely on importcfg files to load type info for obfuscated packages.
We use this type information to remember what names we didn't obfuscate.
Unfortunately, indirect dependencies aren't listed in importcfg files,
so we relied on extra "go list -toolexec" calls to locate object files.
This worked fine, but added a significant amount of complexity.
The extra "go list -export -toolexec=garble" invocations weren't slow,
as they avoided rebuilding or re-obfuscating thanks to the build cache.
Still, it was hard to reason about how garble runs during a build
if we might have multiple layers of -toolexec invocations.
Instead, record the export files we encounter in an incremental map,
and persist it in the build cache via the gob file we're already using.
This way, each garble invocation knows where all object files are,
even those for indirect imports.
One wrinkle is that importcfg files can point to temporary object files.
In that case, figure out its final location in the build cache.
This requires hard-coding a bit of knowledge about how GOCACHE works,
but it seems relatively harmless given how it's very little code.
Plus, if GOCACHE ever changes, it will be obvious when our code breaks.
Finally, add a TODO about potentially saving even more work.
4 years ago
|
|
|
panic(err)
|
|
|
|
}
|
stop relying on nested "go list -toolexec" calls (#422)
We rely on importcfg files to load type info for obfuscated packages.
We use this type information to remember what names we didn't obfuscate.
Unfortunately, indirect dependencies aren't listed in importcfg files,
so we relied on extra "go list -toolexec" calls to locate object files.
This worked fine, but added a significant amount of complexity.
The extra "go list -export -toolexec=garble" invocations weren't slow,
as they avoided rebuilding or re-obfuscating thanks to the build cache.
Still, it was hard to reason about how garble runs during a build
if we might have multiple layers of -toolexec invocations.
Instead, record the export files we encounter in an incremental map,
and persist it in the build cache via the gob file we're already using.
This way, each garble invocation knows where all object files are,
even those for indirect imports.
One wrinkle is that importcfg files can point to temporary object files.
In that case, figure out its final location in the build cache.
This requires hard-coding a bit of knowledge about how GOCACHE works,
but it seems relatively harmless given how it's very little code.
Plus, if GOCACHE ever changes, it will be obvious when our code breaks.
Finally, add a TODO about potentially saving even more work.
4 years ago
|
|
|
cachedTypesPackages[path] = pkg // cache for later use
|
avoid one more call to 'go tool buildid' (#253)
We use it to get the content ID of garble's binary, which is used for
both the garble action IDs, as well as 'go tool compile -V=full'.
Since those two happen in separate processes, both used to call 'go tool
buildid' separately. Store it in the gob cache the first time, and reuse
it the second time.
Since each call to cmd/go costs about 10ms (new process, running its
many init funcs, etc), this results in a nice speed-up for our small
benchmark. Most builds will take many seconds though, so note that a
~15ms speedup there will likely not be noticeable.
While at it, simplify the buildInfo global, as now it just contains a
map representation of the -importcfg contents. It now has better names,
docs, and a simpler representation.
We also stop using the term "garbled import", as it was a bit confusing.
"obfuscated types.Package" is a much better description.
name old time/op new time/op delta
Build-8 106ms ± 1% 92ms ± 0% -14.07% (p=0.010 n=6+4)
name old bin-B new bin-B delta
Build-8 6.60M ± 0% 6.60M ± 0% -0.01% (p=0.002 n=6+6)
name old sys-time/op new sys-time/op delta
Build-8 208ms ± 5% 149ms ± 3% -28.27% (p=0.004 n=6+5)
name old user-time/op new user-time/op delta
Build-8 433ms ± 3% 384ms ± 3% -11.35% (p=0.002 n=6+6)
4 years ago
|
|
|
return pkg
|
|
|
|
}
|
|
|
|
|
stop relying on nested "go list -toolexec" calls (#422)
We rely on importcfg files to load type info for obfuscated packages.
We use this type information to remember what names we didn't obfuscate.
Unfortunately, indirect dependencies aren't listed in importcfg files,
so we relied on extra "go list -toolexec" calls to locate object files.
This worked fine, but added a significant amount of complexity.
The extra "go list -export -toolexec=garble" invocations weren't slow,
as they avoided rebuilding or re-obfuscating thanks to the build cache.
Still, it was hard to reason about how garble runs during a build
if we might have multiple layers of -toolexec invocations.
Instead, record the export files we encounter in an incremental map,
and persist it in the build cache via the gob file we're already using.
This way, each garble invocation knows where all object files are,
even those for indirect imports.
One wrinkle is that importcfg files can point to temporary object files.
In that case, figure out its final location in the build cache.
This requires hard-coding a bit of knowledge about how GOCACHE works,
but it seems relatively harmless given how it's very little code.
Plus, if GOCACHE ever changes, it will be obvious when our code breaks.
Finally, add a TODO about potentially saving even more work.
4 years ago
|
|
|
var cachedTypesPackages = make(map[string]*types.Package)
|
|
|
|
|
|
|
|
func main1() int {
|
|
|
|
if err := flagSet.Parse(os.Args[1:]); err != nil {
|
|
|
|
return 2
|
|
|
|
}
|
|
|
|
log.SetPrefix("[garble] ")
|
|
|
|
args := flagSet.Args()
|
|
|
|
if len(args) < 1 {
|
|
|
|
usage()
|
|
|
|
return 2
|
|
|
|
}
|
|
|
|
if err := mainErr(args); err != nil {
|
|
|
|
if code, ok := err.(errJustExit); ok {
|
|
|
|
os.Exit(int(code))
|
|
|
|
}
|
|
|
|
fmt.Fprintln(os.Stderr, err)
|
|
|
|
|
|
|
|
// If the build failed and a random seed was used,
|
|
|
|
// the failure might not reproduce with a different seed.
|
|
|
|
// Print it before we exit.
|
|
|
|
if flagSeed == "random" {
|
|
|
|
fmt.Fprintf(os.Stderr, "random seed: %s\n", base64.RawStdEncoding.EncodeToString(opts.Seed))
|
|
|
|
}
|
|
|
|
return 1
|
|
|
|
}
|
|
|
|
return 0
|
|
|
|
}
|
|
|
|
|
|
|
|
type errJustExit int
|
|
|
|
|
|
|
|
func (e errJustExit) Error() string { return fmt.Sprintf("exit: %d", e) }
|
|
|
|
|
|
|
|
func goVersionOK() bool {
|
|
|
|
const (
|
|
|
|
minGoVersionSemver = "v1.17.0"
|
|
|
|
suggestedGoVersion = "1.17.x"
|
|
|
|
)
|
|
|
|
|
|
|
|
rxVersion := regexp.MustCompile(`go\d+\.\d+(\.\d)?`)
|
|
|
|
version := rxVersion.FindString(cache.GoEnv.GOVERSION)
|
use "go env -json" to collect env info all at once
In the worst case scenario, when GOPRIVATE isn't set at all, we would
run these three commands:
* "go env GOPRIVATE", to fetch GOPRIVATE itself
* "go list -m", for GOPRIVATE's fallback
* "go version", to check the version of Go being used
Now that we support Go 1.16 and later, all these three can be obtained
via "go env -json":
$ go env -json GOPRIVATE GOMOD GOVERSION
{
"GOMOD": "/home/mvdan/src/garble/go.mod",
"GOPRIVATE": "",
"GOVERSION": "go1.16.3"
}
Note that we don't get the module path directly, but we can use the
x/mod/modfile Go API to parse it from the GOMOD file cheaply.
Notably, this also simplifies our Go version checking logic, as now we
get just the version string without the "go version" prefix and
"GOOS/GOARCH" suffix we don't care about.
This makes our code a bit more maintainable and robust. When running a
short incremental build, we can also see a small speed-up, as saving two
"go" invocations can save a few milliseconds:
name old time/op new time/op delta
Build/Cache-8 168ms ± 0% 166ms ± 1% -1.26% (p=0.009 n=6+6)
name old bin-B new bin-B delta
Build/Cache-8 6.36M ± 0% 6.36M ± 0% +0.12% (p=0.002 n=6+6)
name old sys-time/op new sys-time/op delta
Build/Cache-8 222ms ± 2% 219ms ± 3% ~ (p=0.589 n=6+6)
name old user-time/op new user-time/op delta
Build/Cache-8 857ms ± 1% 846ms ± 1% -1.31% (p=0.041 n=6+6)
4 years ago
|
|
|
if version == "" {
|
|
|
|
// Go 1.15.x and older do not have GOVERSION yet.
|
|
|
|
// We could go the extra mile and fetch it via 'go version',
|
|
|
|
// but we'd have to error anyway.
|
|
|
|
fmt.Fprintf(os.Stderr, "Go version is too old; please upgrade to Go %s or a newer devel version\n", suggestedGoVersion)
|
|
|
|
return false
|
|
|
|
}
|
|
|
|
|
use "go env -json" to collect env info all at once
In the worst case scenario, when GOPRIVATE isn't set at all, we would
run these three commands:
* "go env GOPRIVATE", to fetch GOPRIVATE itself
* "go list -m", for GOPRIVATE's fallback
* "go version", to check the version of Go being used
Now that we support Go 1.16 and later, all these three can be obtained
via "go env -json":
$ go env -json GOPRIVATE GOMOD GOVERSION
{
"GOMOD": "/home/mvdan/src/garble/go.mod",
"GOPRIVATE": "",
"GOVERSION": "go1.16.3"
}
Note that we don't get the module path directly, but we can use the
x/mod/modfile Go API to parse it from the GOMOD file cheaply.
Notably, this also simplifies our Go version checking logic, as now we
get just the version string without the "go version" prefix and
"GOOS/GOARCH" suffix we don't care about.
This makes our code a bit more maintainable and robust. When running a
short incremental build, we can also see a small speed-up, as saving two
"go" invocations can save a few milliseconds:
name old time/op new time/op delta
Build/Cache-8 168ms ± 0% 166ms ± 1% -1.26% (p=0.009 n=6+6)
name old bin-B new bin-B delta
Build/Cache-8 6.36M ± 0% 6.36M ± 0% +0.12% (p=0.002 n=6+6)
name old sys-time/op new sys-time/op delta
Build/Cache-8 222ms ± 2% 219ms ± 3% ~ (p=0.589 n=6+6)
name old user-time/op new user-time/op delta
Build/Cache-8 857ms ± 1% 846ms ± 1% -1.31% (p=0.041 n=6+6)
4 years ago
|
|
|
versionSemver := "v" + strings.TrimPrefix(version, "go")
|
|
|
|
if semver.Compare(versionSemver, minGoVersionSemver) < 0 {
|
|
|
|
fmt.Fprintf(os.Stderr, "Go version %q is too old; please upgrade to Go %s\n", version, suggestedGoVersion)
|
|
|
|
return false
|
|
|
|
}
|
|
|
|
|
|
|
|
return true
|
|
|
|
}
|
|
|
|
|
|
|
|
func mainErr(args []string) error {
|
|
|
|
// If we recognize an argument, we're not running within -toolexec.
|
|
|
|
switch command, args := args[0], args[1:]; command {
|
|
|
|
case "help":
|
|
|
|
if hasHelpFlag(args) || len(args) > 1 {
|
|
|
|
fmt.Fprintf(os.Stderr, "usage: garble help [command]\n")
|
|
|
|
return errJustExit(2)
|
|
|
|
}
|
|
|
|
if len(args) == 1 {
|
|
|
|
return mainErr([]string{args[0], "-h"})
|
|
|
|
}
|
|
|
|
usage()
|
|
|
|
return errJustExit(2)
|
|
|
|
case "version":
|
|
|
|
if hasHelpFlag(args) || len(args) > 0 {
|
|
|
|
fmt.Fprintf(os.Stderr, "usage: garble version\n")
|
|
|
|
return errJustExit(2)
|
|
|
|
}
|
|
|
|
// don't overwrite the version if it was set by -ldflags=-X
|
|
|
|
if info, ok := debug.ReadBuildInfo(); ok && version == "(devel)" {
|
|
|
|
mod := &info.Main
|
|
|
|
if mod.Replace != nil {
|
|
|
|
mod = mod.Replace
|
|
|
|
}
|
|
|
|
version = mod.Version
|
|
|
|
}
|
|
|
|
fmt.Println(version)
|
|
|
|
return nil
|
|
|
|
case "reverse":
|
|
|
|
return commandReverse(args)
|
fix and re-enable "garble test" (#268)
With the many refactors building up to v0.1.0, we broke "garble test" as
we no longer dealt with test packages well.
Luckily, now that we can depend on TOOLEXEC_IMPORTPATH, we can support
the test command again, as we can always figure out what package we're
currently compiling, without having to track a "main" package.
Note that one major pitfall there is test packages, where
TOOLEXEC_IMPORTPATH does not agree with ImportPath from "go list -json".
However, we can still work around that with a bit of glue code, which is
also copiously documented.
The second change necessary is to consider test packages private
depending on whether their non-test package is private or not. This can
be done via the ForTest field in "go list -json".
The third change is to obfuscate "_testmain.go" files, which are the
code-generated main functions which actually run tests. We used to not
need to obfuscate them, since test function names are never obfuscated
and we used to not obfuscate import paths at compilation time. Now we do
rewrite import paths, so we must do that for "_testmain.go" too.
The fourth change is to re-enable test.txt, and expand it with more
sanity checks and edge cases.
Finally, document "garble test" again.
Fixes #241.
4 years ago
|
|
|
case "build", "test":
|
|
|
|
cmd, err := toolexecCmd(command, args)
|
|
|
|
if err != nil {
|
|
|
|
return err
|
|
|
|
}
|
|
|
|
cmd.Stdout = os.Stdout
|
|
|
|
cmd.Stderr = os.Stderr
|
|
|
|
return cmd.Run()
|
|
|
|
}
|
|
|
|
|
|
|
|
if !filepath.IsAbs(args[0]) {
|
|
|
|
// -toolexec gives us an absolute path to the tool binary to
|
|
|
|
// run, so this is most likely misuse of garble by a user.
|
|
|
|
return fmt.Errorf("unknown command: %q", args[0])
|
|
|
|
}
|
|
|
|
|
|
|
|
// We're in a toolexec sub-process, not directly called by the user.
|
|
|
|
// Load the shared data and wrap the tool, like the compiler or linker.
|
|
|
|
|
start using original action IDs (#251)
When we obfuscate a name, what we do is hash the name with the action ID
of the package that contains the name. To ensure that the hash changes
if the garble tool changes, we used the action ID of the obfuscated
build, which is different than the original action ID, as we include
garble's own content ID in "go tool compile -V=full" via -toolexec.
Let's call that the "obfuscated action ID". Remember that a content ID
is roughly the hash of a binary or object file, and an action ID
contains the hash of a package's source code plus the content IDs of its
dependencies.
This had the advantage that it did what we wanted. However, it had one
massive drawback: when we compile a package, we only have the obfuscated
action IDs of its dependencies. This is because one can't have the
content ID of dependent packages before they are built.
Usually, this is not a problem, because hashing a foreign name means it
comes from a dependency, where we already have the obfuscated action ID.
However, that's not always the case.
First, go:linkname directives can point to any symbol that ends up in
the binary, even if the package is not a dependency. So garble could
only support linkname targets belonging to dependencies. This is at the
root of why we could not obfuscate the runtime; it contains linkname
directives targeting the net package, for example, which depends on runtime.
Second, some other places did not have an easy access to obfuscated
action IDs, like transformAsm, which had to recover it from a temporary
file stored by transformCompile.
Plus, this was all pretty expensive, as each toolexec sub-process had to
make repeated calls to buildidOf with the object files of dependencies.
We even had to use extra calls to "go list" in the case of indirect
dependencies, as their export files do not appear in importcfg files.
All in all, the old method was complex and expensive. A better mechanism
is to use the original action IDs directly, as listed by "go list"
without garble in the picture.
This would mean that the hashing does not change if garble changes,
meaning weaker obfuscation. To regain that property, we define the
"garble action ID", which is just the original action ID hashed together
with garble's own content ID.
This is practically the same as the obfuscated build ID we used before,
but since it doesn't go through "go tool compile -V=full" and the
obfuscated build itself, we can work out *all* the garble action IDs
upfront, before the obfuscated build even starts.
This fixes all of our problems. Now we know all garble build IDs
upfront, so a bunch of hacks can be entirely removed. Plus, since we
know them upfront, we can also cache them and avoid repeated calls to
"go tool buildid".
While at it, make use of the new BuildID field in Go 1.16's "list -json
-export". This avoids the vast majority of "go tool buildid" calls, as
the only ones that remain are 2 on the garble binary itself.
The numbers for Go 1.16 look very good:
name old time/op new time/op delta
Build-8 146ms ± 4% 101ms ± 1% -31.01% (p=0.002 n=6+6)
name old bin-B new bin-B delta
Build-8 6.61M ± 0% 6.60M ± 0% -0.09% (p=0.002 n=6+6)
name old sys-time/op new sys-time/op delta
Build-8 321ms ± 7% 202ms ± 6% -37.11% (p=0.002 n=6+6)
name old user-time/op new user-time/op delta
Build-8 538ms ± 4% 414ms ± 4% -23.12% (p=0.002 n=6+6)
4 years ago
|
|
|
if err := loadSharedCache(); err != nil {
|
|
|
|
return err
|
|
|
|
}
|
|
|
|
opts = &cache.Options
|
|
|
|
|
|
|
|
_, tool := filepath.Split(args[0])
|
|
|
|
if runtime.GOOS == "windows" {
|
|
|
|
tool = strings.TrimSuffix(tool, ".exe")
|
|
|
|
}
|
simplify globals, split hash.go (#191)
The previous globals worked, but were unnecessarily complex. For
example, we passed the fromPath variable around, but it's really a
static global, since we only compile or link a single package in each Go
process. Use such global variables instead of passing them around, which
currently include the package's import path, its build ID, and its
import config path.
Also split all the hashing and build ID code into hash.go, since that's
a relatively well contained 200 lines of code that doesn't need to make
main.go any bigger. We also split the code to alter Go's own version to
a separate function, so that it can be moved out of main.go as well.
5 years ago
|
|
|
if len(args) == 2 && args[1] == "-V=full" {
|
|
|
|
return alterToolVersion(tool, args)
|
|
|
|
}
|
|
|
|
|
refactor "current package" with TOOLEXEC_IMPORTPATH (#266)
Now that we've dropped support for Go 1.15.x, we can finally rely on
this environment variable for toolexec calls, present in Go 1.16.
Before, we had hacky ways of trying to figure out the current package's
import path, mostly from the -p flag. The biggest rough edge there was
that, for main packages, that was simply the package name, and not its
full import path.
To work around that, we had a restriction on a single main package, so
we could work around that issue. That restriction is now gone.
The new code is simpler, especially because we can set curPkg in a
single place for all toolexec transform funcs.
Since we can always rely on curPkg not being nil now, we can also start
reusing listedPackage.Private and avoid the majority of repeated calls
to isPrivate. The function is cheap, but still not free.
isPrivate itself can also get simpler. We no longer have to worry about
the "main" edge case. Plus, the sanity check for invalid package paths
is now unnecessary; we only got malformed paths from goobj2, and we now
require exact matches with the ImportPath field from "go list -json".
Another effect of clearing up the "main" edge case is that -debugdir now
uses the right directory for main packages. We also start using
consistent debugdir paths in the tests, for the sake of being easier to
read and maintain.
Finally, note that commandReverse did not need the extra call to "go
list -toolexec", as the "shared" call stored in the cache is enough. We
still call toolexecCmd to get said cache, which should probably be
simplified in a future PR.
While at it, replace the use of the "-std" compiler flag with the
Standard field from "go list -json".
4 years ago
|
|
|
toolexecImportPath := os.Getenv("TOOLEXEC_IMPORTPATH")
|
fix and re-enable "garble test" (#268)
With the many refactors building up to v0.1.0, we broke "garble test" as
we no longer dealt with test packages well.
Luckily, now that we can depend on TOOLEXEC_IMPORTPATH, we can support
the test command again, as we can always figure out what package we're
currently compiling, without having to track a "main" package.
Note that one major pitfall there is test packages, where
TOOLEXEC_IMPORTPATH does not agree with ImportPath from "go list -json".
However, we can still work around that with a bit of glue code, which is
also copiously documented.
The second change necessary is to consider test packages private
depending on whether their non-test package is private or not. This can
be done via the ForTest field in "go list -json".
The third change is to obfuscate "_testmain.go" files, which are the
code-generated main functions which actually run tests. We used to not
need to obfuscate them, since test function names are never obfuscated
and we used to not obfuscate import paths at compilation time. Now we do
rewrite import paths, so we must do that for "_testmain.go" too.
The fourth change is to re-enable test.txt, and expand it with more
sanity checks and edge cases.
Finally, document "garble test" again.
Fixes #241.
4 years ago
|
|
|
|
refactor "current package" with TOOLEXEC_IMPORTPATH (#266)
Now that we've dropped support for Go 1.15.x, we can finally rely on
this environment variable for toolexec calls, present in Go 1.16.
Before, we had hacky ways of trying to figure out the current package's
import path, mostly from the -p flag. The biggest rough edge there was
that, for main packages, that was simply the package name, and not its
full import path.
To work around that, we had a restriction on a single main package, so
we could work around that issue. That restriction is now gone.
The new code is simpler, especially because we can set curPkg in a
single place for all toolexec transform funcs.
Since we can always rely on curPkg not being nil now, we can also start
reusing listedPackage.Private and avoid the majority of repeated calls
to isPrivate. The function is cheap, but still not free.
isPrivate itself can also get simpler. We no longer have to worry about
the "main" edge case. Plus, the sanity check for invalid package paths
is now unnecessary; we only got malformed paths from goobj2, and we now
require exact matches with the ImportPath field from "go list -json".
Another effect of clearing up the "main" edge case is that -debugdir now
uses the right directory for main packages. We also start using
consistent debugdir paths in the tests, for the sake of being easier to
read and maintain.
Finally, note that commandReverse did not need the extra call to "go
list -toolexec", as the "shared" call stored in the cache is enough. We
still call toolexecCmd to get said cache, which should probably be
simplified in a future PR.
While at it, replace the use of the "-std" compiler flag with the
Standard field from "go list -json".
4 years ago
|
|
|
curPkg = cache.ListedPackages[toolexecImportPath]
|
|
|
|
if curPkg == nil {
|
|
|
|
return fmt.Errorf("TOOLEXEC_IMPORTPATH not found in listed packages: %s", toolexecImportPath)
|
|
|
|
}
|
|
|
|
|
|
|
|
transform := transformFuncs[tool]
|
|
|
|
transformed := args[1:]
|
|
|
|
// log.Println(tool, transformed)
|
|
|
|
if transform != nil {
|
|
|
|
var err error
|
reimplement import path obfuscation without goobj2 (#242)
We used to rely on a parallel implementation of an object file parser
and writer to be able to obfuscate import paths. After compiling each
package, we would parse the object file, replace the import paths, and
write the updated object file in-place.
That worked well, in most cases. Unfortunately, it had some flaws:
* Complexity. Even when most of the code is maintained in a separate
module, the import_obfuscation.go file was still close to a thousand
lines of code.
* Go compatibility. The object file format changes between Go releases,
so we were supporting Go 1.15, but not 1.16. Fixing the object file
package to work with 1.16 would probably break 1.15 support.
* Bugs. For example, we recently had to add a workaround for #224, since
import paths containing dots after the domain would end up escaped.
Another example is #190, which seems to be caused by the object file
parser or writer corrupting the compiled code and causing segfaults in
some rare edge cases.
Instead, let's drop that method entirely, and force the compiler and
linker to do the work for us. The steps necessary when compiling a
package to obfuscate are:
1) Replace its "package foo" lines with the obfuscated package path. No
need to separate the package path and name, since the obfuscated path
does not contain slashes.
2) Replace the "-p pkg/foo" flag with the obfuscated path.
3) Replace the "import" spec lines with the obfuscated package paths,
for those dependencies which were obfuscated.
4) Replace the "-importcfg [...]" file with a version that uses the
obfuscated paths instead.
The linker also needs that last step, since it also uses an importcfg
file to find object files.
There are three noteworthy drawbacks to this new method:
1) Since we no longer write object files, we can't use them to store
data to be cached. As such, the -debugdir flag goes back to using the
"-a" build flag to always rebuild all packages. On the plus side,
that caching didn't work very well; see #176.
2) The package name "main" remains in all declarations under it, not
just "func main", since we can only rename entire packages. This
seems fine, as it gives little information to the end user.
3) The -tiny mode no longer sets all lines to 0, since it did that by
modifying object files. As a temporary measure, we instead set all
top-level declarations to be on line 1. A TODO is added to hopefully
improve this again in the near future.
The upside is that we get rid of all the issues mentioned before. Plus,
garble now nearly works with Go 1.16, with the exception of two very
minor bugs that look fixable. A follow-up PR will take care of that and
start testing on 1.16.
Fixes #176.
Fixes #190.
4 years ago
|
|
|
if transformed, err = transform(transformed); err != nil {
|
|
|
|
return err
|
|
|
|
}
|
|
|
|
}
|
keep importmap entries in the right direction
For packages that we alter, we parse and modify the importcfg file.
Parsing is necessary so we can locate obfuscated object files,
which we use to remember what identifiers were obfuscated.
Modifying the files is necessary when we obfuscate import paths,
and those import paths have entries in an importcfg file.
However, we made one crucial mistake when writing the code.
When handling importmap entries such as:
importmap golang.org/x/net/idna=vendor/golang.org/x/net/idna
we would name the two sides beforePath and afterPath, respectively.
They were added to importMap with afterPath as the key,
but when we iterated over the map to write a modified importcfg file,
we would then assume the key is beforepath.
All in all, we would end up writing the opposite direction:
importmap vendor/golang.org/x/net/idna=golang.org/x/net/idna
This would ultimately result in the importmap never being useful,
and some rather confusing error messages such as:
cannot find package golang.org/x/net/idna (using -importcfg)
Add a test case that reproduces this error,
and fix the code so it always uses beforePath as the key.
Note that we were also updating importCfgEntries with such entries.
I could not reproduce any failure when just removing that code,
nor could I explain why it was added there in the first place.
As such, remove that bit of code as well.
Finally, a reasonable question might be why we never noticed the bug.
In practice, such "importmap"s, represented as ImportMap by "go list",
only currently appear for packages vendored into the standard library.
Until very recently, we didn't support obfuscating most of std,
so we would usually not alter the affected importcfg files.
Now that we do parse and modify them, the bug surfaced.
Fixes #408.
4 years ago
|
|
|
// log.Println(tool, transformed)
|
|
|
|
cmd := exec.Command(args[0], transformed...)
|
|
|
|
cmd.Stdout = os.Stdout
|
|
|
|
cmd.Stderr = os.Stderr
|
|
|
|
if err := cmd.Run(); err != nil {
|
|
|
|
return err
|
|
|
|
}
|
|
|
|
return nil
|
|
|
|
}
|
|
|
|
|
|
|
|
func hasHelpFlag(flags []string) bool {
|
|
|
|
for _, f := range flags {
|
|
|
|
switch f {
|
|
|
|
case "-h", "-help", "--help":
|
|
|
|
return true
|
|
|
|
}
|
|
|
|
}
|
|
|
|
return false
|
|
|
|
}
|
|
|
|
|
|
|
|
// toolexecCmd builds an *exec.Cmd which is set up for running "go <command>"
|
|
|
|
// with -toolexec=garble and the supplied arguments.
|
|
|
|
//
|
|
|
|
// Note that it uses and modifies global state; in general, it should only be
|
|
|
|
// called once from mainErr in the top-level garble process.
|
|
|
|
func toolexecCmd(command string, args []string) (*exec.Cmd, error) {
|
|
|
|
// Split the flags from the package arguments, since we'll need
|
|
|
|
// to run 'go list' on the same set of packages.
|
|
|
|
flags, args := splitFlagsFromArgs(args)
|
|
|
|
if hasHelpFlag(flags) {
|
|
|
|
out, _ := exec.Command("go", command, "-h").CombinedOutput()
|
|
|
|
fmt.Fprintf(os.Stderr, `
|
|
|
|
usage: garble [garble flags] %s [arguments]
|
|
|
|
|
|
|
|
This command wraps "go %s". Below is its help:
|
|
|
|
|
|
|
|
%s`[1:], command, command, out)
|
|
|
|
return nil, errJustExit(2)
|
|
|
|
}
|
|
|
|
|
start using original action IDs (#251)
When we obfuscate a name, what we do is hash the name with the action ID
of the package that contains the name. To ensure that the hash changes
if the garble tool changes, we used the action ID of the obfuscated
build, which is different than the original action ID, as we include
garble's own content ID in "go tool compile -V=full" via -toolexec.
Let's call that the "obfuscated action ID". Remember that a content ID
is roughly the hash of a binary or object file, and an action ID
contains the hash of a package's source code plus the content IDs of its
dependencies.
This had the advantage that it did what we wanted. However, it had one
massive drawback: when we compile a package, we only have the obfuscated
action IDs of its dependencies. This is because one can't have the
content ID of dependent packages before they are built.
Usually, this is not a problem, because hashing a foreign name means it
comes from a dependency, where we already have the obfuscated action ID.
However, that's not always the case.
First, go:linkname directives can point to any symbol that ends up in
the binary, even if the package is not a dependency. So garble could
only support linkname targets belonging to dependencies. This is at the
root of why we could not obfuscate the runtime; it contains linkname
directives targeting the net package, for example, which depends on runtime.
Second, some other places did not have an easy access to obfuscated
action IDs, like transformAsm, which had to recover it from a temporary
file stored by transformCompile.
Plus, this was all pretty expensive, as each toolexec sub-process had to
make repeated calls to buildidOf with the object files of dependencies.
We even had to use extra calls to "go list" in the case of indirect
dependencies, as their export files do not appear in importcfg files.
All in all, the old method was complex and expensive. A better mechanism
is to use the original action IDs directly, as listed by "go list"
without garble in the picture.
This would mean that the hashing does not change if garble changes,
meaning weaker obfuscation. To regain that property, we define the
"garble action ID", which is just the original action ID hashed together
with garble's own content ID.
This is practically the same as the obfuscated build ID we used before,
but since it doesn't go through "go tool compile -V=full" and the
obfuscated build itself, we can work out *all* the garble action IDs
upfront, before the obfuscated build even starts.
This fixes all of our problems. Now we know all garble build IDs
upfront, so a bunch of hacks can be entirely removed. Plus, since we
know them upfront, we can also cache them and avoid repeated calls to
"go tool buildid".
While at it, make use of the new BuildID field in Go 1.16's "list -json
-export". This avoids the vast majority of "go tool buildid" calls, as
the only ones that remain are 2 on the garble binary itself.
The numbers for Go 1.16 look very good:
name old time/op new time/op delta
Build-8 146ms ± 4% 101ms ± 1% -31.01% (p=0.002 n=6+6)
name old bin-B new bin-B delta
Build-8 6.61M ± 0% 6.60M ± 0% -0.09% (p=0.002 n=6+6)
name old sys-time/op new sys-time/op delta
Build-8 321ms ± 7% 202ms ± 6% -37.11% (p=0.002 n=6+6)
name old user-time/op new user-time/op delta
Build-8 538ms ± 4% 414ms ± 4% -23.12% (p=0.002 n=6+6)
4 years ago
|
|
|
if err := setFlagOptions(); err != nil {
|
|
|
|
return nil, err
|
|
|
|
}
|
|
|
|
|
start using original action IDs (#251)
When we obfuscate a name, what we do is hash the name with the action ID
of the package that contains the name. To ensure that the hash changes
if the garble tool changes, we used the action ID of the obfuscated
build, which is different than the original action ID, as we include
garble's own content ID in "go tool compile -V=full" via -toolexec.
Let's call that the "obfuscated action ID". Remember that a content ID
is roughly the hash of a binary or object file, and an action ID
contains the hash of a package's source code plus the content IDs of its
dependencies.
This had the advantage that it did what we wanted. However, it had one
massive drawback: when we compile a package, we only have the obfuscated
action IDs of its dependencies. This is because one can't have the
content ID of dependent packages before they are built.
Usually, this is not a problem, because hashing a foreign name means it
comes from a dependency, where we already have the obfuscated action ID.
However, that's not always the case.
First, go:linkname directives can point to any symbol that ends up in
the binary, even if the package is not a dependency. So garble could
only support linkname targets belonging to dependencies. This is at the
root of why we could not obfuscate the runtime; it contains linkname
directives targeting the net package, for example, which depends on runtime.
Second, some other places did not have an easy access to obfuscated
action IDs, like transformAsm, which had to recover it from a temporary
file stored by transformCompile.
Plus, this was all pretty expensive, as each toolexec sub-process had to
make repeated calls to buildidOf with the object files of dependencies.
We even had to use extra calls to "go list" in the case of indirect
dependencies, as their export files do not appear in importcfg files.
All in all, the old method was complex and expensive. A better mechanism
is to use the original action IDs directly, as listed by "go list"
without garble in the picture.
This would mean that the hashing does not change if garble changes,
meaning weaker obfuscation. To regain that property, we define the
"garble action ID", which is just the original action ID hashed together
with garble's own content ID.
This is practically the same as the obfuscated build ID we used before,
but since it doesn't go through "go tool compile -V=full" and the
obfuscated build itself, we can work out *all* the garble action IDs
upfront, before the obfuscated build even starts.
This fixes all of our problems. Now we know all garble build IDs
upfront, so a bunch of hacks can be entirely removed. Plus, since we
know them upfront, we can also cache them and avoid repeated calls to
"go tool buildid".
While at it, make use of the new BuildID field in Go 1.16's "list -json
-export". This avoids the vast majority of "go tool buildid" calls, as
the only ones that remain are 2 on the garble binary itself.
The numbers for Go 1.16 look very good:
name old time/op new time/op delta
Build-8 146ms ± 4% 101ms ± 1% -31.01% (p=0.002 n=6+6)
name old bin-B new bin-B delta
Build-8 6.61M ± 0% 6.60M ± 0% -0.09% (p=0.002 n=6+6)
name old sys-time/op new sys-time/op delta
Build-8 321ms ± 7% 202ms ± 6% -37.11% (p=0.002 n=6+6)
name old user-time/op new user-time/op delta
Build-8 538ms ± 4% 414ms ± 4% -23.12% (p=0.002 n=6+6)
4 years ago
|
|
|
// Here is the only place we initialize the cache.
|
|
|
|
// The sub-processes will parse it from a shared gob file.
|
|
|
|
cache = &sharedCache{Options: *opts}
|
|
|
|
|
|
|
|
// Note that we also need to pass build flags to 'go list', such
|
|
|
|
// as -tags.
|
fail if we are unexpectedly overwriting files (#418)
While investigating a bug report,
I noticed that garble was writing to the same temp file twice.
At best, writing to the same path on disk twice is wasteful,
as the design is careful to be deterministic and use unique paths.
At worst, the two writes could cause races at the filesystem level.
To prevent either of those situations,
we now create files with os.OpenFile and os.O_EXCL,
meaning that we will error if the file already exists.
That change uncovered a number of such unintended cases.
First, transformAsm would write obfuscated Go files twice.
This is because the Go toolchain actually runs:
[...]/asm -gensymabis [...] foo.s bar.s
[...]/asm [...] foo.s bar.s
That is, the first run is only meant to generate symbol ABIs,
which are then used by the compiler.
We need to obfuscate at that first stage,
because the symbol ABI descriptions need to use obfuscated names.
However, having already obfuscated the assembly on the first stage,
there is no need to do so again on the second stage.
If we detect gensymabis is missing, we simply reuse the previous files.
This first situation doesn't seem racy,
but obfuscating the Go assembly files twice is certainly unnecessary.
Second, saveKnownReflectAPIs wrote a gob file to the build cache.
Since the build cache can be kept between builds,
and since the build cache uses reproducible paths for each build,
running the same "garble build" twice could overwrite those files.
This could actually cause races at the filesystem level;
if two concurrent builds write to the same gob file on disk,
one of them could end up using a partially-written file.
Note that this is the only of the three cases not using temporary files.
As such, it is expected that the file may already exist.
In such a case, we simply avoid overwriting it rather than failing.
Third, when "garble build -a" was used,
and when we needed an export file not listed in importcfg,
we would end up calling roughly:
go list -export -toolexec=garble -a <dependency>
This meant we would re-build and re-obfuscate those packages.
Which is unfortunate, because the parent process already did via:
go build -toolexec=garble -a <main>
The repeated dependency builds tripped the new os.O_EXCL check,
as we would try to overwrite the same obfuscated Go files.
Beyond being wasteful, this could again cause subtle filesystem races.
To fix the problem, avoid passing flags like "-a" to nested go commands.
Overall, we should likely be using safer ways to write to disk,
be it via either atomic writes or locked files.
However, for now, catching duplicate writes is a big step.
I have left a self-assigned TODO for further improvements.
CI on the pull request found a failure on test-gotip.
The failure reproduces on master, so it seems to be related to gotip,
and not a regression introduced by this change.
For now, disable test-gotip until we can investigate.
4 years ago
|
|
|
cache.ForwardBuildFlags, _ = filterForwardBuildFlags(flags)
|
|
|
|
if command == "test" {
|
fail if we are unexpectedly overwriting files (#418)
While investigating a bug report,
I noticed that garble was writing to the same temp file twice.
At best, writing to the same path on disk twice is wasteful,
as the design is careful to be deterministic and use unique paths.
At worst, the two writes could cause races at the filesystem level.
To prevent either of those situations,
we now create files with os.OpenFile and os.O_EXCL,
meaning that we will error if the file already exists.
That change uncovered a number of such unintended cases.
First, transformAsm would write obfuscated Go files twice.
This is because the Go toolchain actually runs:
[...]/asm -gensymabis [...] foo.s bar.s
[...]/asm [...] foo.s bar.s
That is, the first run is only meant to generate symbol ABIs,
which are then used by the compiler.
We need to obfuscate at that first stage,
because the symbol ABI descriptions need to use obfuscated names.
However, having already obfuscated the assembly on the first stage,
there is no need to do so again on the second stage.
If we detect gensymabis is missing, we simply reuse the previous files.
This first situation doesn't seem racy,
but obfuscating the Go assembly files twice is certainly unnecessary.
Second, saveKnownReflectAPIs wrote a gob file to the build cache.
Since the build cache can be kept between builds,
and since the build cache uses reproducible paths for each build,
running the same "garble build" twice could overwrite those files.
This could actually cause races at the filesystem level;
if two concurrent builds write to the same gob file on disk,
one of them could end up using a partially-written file.
Note that this is the only of the three cases not using temporary files.
As such, it is expected that the file may already exist.
In such a case, we simply avoid overwriting it rather than failing.
Third, when "garble build -a" was used,
and when we needed an export file not listed in importcfg,
we would end up calling roughly:
go list -export -toolexec=garble -a <dependency>
This meant we would re-build and re-obfuscate those packages.
Which is unfortunate, because the parent process already did via:
go build -toolexec=garble -a <main>
The repeated dependency builds tripped the new os.O_EXCL check,
as we would try to overwrite the same obfuscated Go files.
Beyond being wasteful, this could again cause subtle filesystem races.
To fix the problem, avoid passing flags like "-a" to nested go commands.
Overall, we should likely be using safer ways to write to disk,
be it via either atomic writes or locked files.
However, for now, catching duplicate writes is a big step.
I have left a self-assigned TODO for further improvements.
CI on the pull request found a failure on test-gotip.
The failure reproduces on master, so it seems to be related to gotip,
and not a regression introduced by this change.
For now, disable test-gotip until we can investigate.
4 years ago
|
|
|
cache.ForwardBuildFlags = append(cache.ForwardBuildFlags, "-test")
|
|
|
|
}
|
|
|
|
|
use "go env -json" to collect env info all at once
In the worst case scenario, when GOPRIVATE isn't set at all, we would
run these three commands:
* "go env GOPRIVATE", to fetch GOPRIVATE itself
* "go list -m", for GOPRIVATE's fallback
* "go version", to check the version of Go being used
Now that we support Go 1.16 and later, all these three can be obtained
via "go env -json":
$ go env -json GOPRIVATE GOMOD GOVERSION
{
"GOMOD": "/home/mvdan/src/garble/go.mod",
"GOPRIVATE": "",
"GOVERSION": "go1.16.3"
}
Note that we don't get the module path directly, but we can use the
x/mod/modfile Go API to parse it from the GOMOD file cheaply.
Notably, this also simplifies our Go version checking logic, as now we
get just the version string without the "go version" prefix and
"GOOS/GOARCH" suffix we don't care about.
This makes our code a bit more maintainable and robust. When running a
short incremental build, we can also see a small speed-up, as saving two
"go" invocations can save a few milliseconds:
name old time/op new time/op delta
Build/Cache-8 168ms ± 0% 166ms ± 1% -1.26% (p=0.009 n=6+6)
name old bin-B new bin-B delta
Build/Cache-8 6.36M ± 0% 6.36M ± 0% +0.12% (p=0.002 n=6+6)
name old sys-time/op new sys-time/op delta
Build/Cache-8 222ms ± 2% 219ms ± 3% ~ (p=0.589 n=6+6)
name old user-time/op new user-time/op delta
Build/Cache-8 857ms ± 1% 846ms ± 1% -1.31% (p=0.041 n=6+6)
4 years ago
|
|
|
if err := fetchGoEnv(); err != nil {
|
|
|
|
return nil, err
|
|
|
|
}
|
|
|
|
|
use "go env -json" to collect env info all at once
In the worst case scenario, when GOPRIVATE isn't set at all, we would
run these three commands:
* "go env GOPRIVATE", to fetch GOPRIVATE itself
* "go list -m", for GOPRIVATE's fallback
* "go version", to check the version of Go being used
Now that we support Go 1.16 and later, all these three can be obtained
via "go env -json":
$ go env -json GOPRIVATE GOMOD GOVERSION
{
"GOMOD": "/home/mvdan/src/garble/go.mod",
"GOPRIVATE": "",
"GOVERSION": "go1.16.3"
}
Note that we don't get the module path directly, but we can use the
x/mod/modfile Go API to parse it from the GOMOD file cheaply.
Notably, this also simplifies our Go version checking logic, as now we
get just the version string without the "go version" prefix and
"GOOS/GOARCH" suffix we don't care about.
This makes our code a bit more maintainable and robust. When running a
short incremental build, we can also see a small speed-up, as saving two
"go" invocations can save a few milliseconds:
name old time/op new time/op delta
Build/Cache-8 168ms ± 0% 166ms ± 1% -1.26% (p=0.009 n=6+6)
name old bin-B new bin-B delta
Build/Cache-8 6.36M ± 0% 6.36M ± 0% +0.12% (p=0.002 n=6+6)
name old sys-time/op new sys-time/op delta
Build/Cache-8 222ms ± 2% 219ms ± 3% ~ (p=0.589 n=6+6)
name old user-time/op new user-time/op delta
Build/Cache-8 857ms ± 1% 846ms ± 1% -1.31% (p=0.041 n=6+6)
4 years ago
|
|
|
if !goVersionOK() {
|
|
|
|
return nil, errJustExit(1)
|
use "go env -json" to collect env info all at once
In the worst case scenario, when GOPRIVATE isn't set at all, we would
run these three commands:
* "go env GOPRIVATE", to fetch GOPRIVATE itself
* "go list -m", for GOPRIVATE's fallback
* "go version", to check the version of Go being used
Now that we support Go 1.16 and later, all these three can be obtained
via "go env -json":
$ go env -json GOPRIVATE GOMOD GOVERSION
{
"GOMOD": "/home/mvdan/src/garble/go.mod",
"GOPRIVATE": "",
"GOVERSION": "go1.16.3"
}
Note that we don't get the module path directly, but we can use the
x/mod/modfile Go API to parse it from the GOMOD file cheaply.
Notably, this also simplifies our Go version checking logic, as now we
get just the version string without the "go version" prefix and
"GOOS/GOARCH" suffix we don't care about.
This makes our code a bit more maintainable and robust. When running a
short incremental build, we can also see a small speed-up, as saving two
"go" invocations can save a few milliseconds:
name old time/op new time/op delta
Build/Cache-8 168ms ± 0% 166ms ± 1% -1.26% (p=0.009 n=6+6)
name old bin-B new bin-B delta
Build/Cache-8 6.36M ± 0% 6.36M ± 0% +0.12% (p=0.002 n=6+6)
name old sys-time/op new sys-time/op delta
Build/Cache-8 222ms ± 2% 219ms ± 3% ~ (p=0.589 n=6+6)
name old user-time/op new user-time/op delta
Build/Cache-8 857ms ± 1% 846ms ± 1% -1.31% (p=0.041 n=6+6)
4 years ago
|
|
|
}
|
|
|
|
|
|
|
|
var err error
|
|
|
|
cache.ExecPath, err = os.Executable()
|
|
|
|
if err != nil {
|
|
|
|
return nil, err
|
|
|
|
}
|
|
|
|
|
|
|
|
if err := setListedPackages(args); err != nil {
|
|
|
|
return nil, err
|
|
|
|
}
|
|
|
|
|
start using original action IDs (#251)
When we obfuscate a name, what we do is hash the name with the action ID
of the package that contains the name. To ensure that the hash changes
if the garble tool changes, we used the action ID of the obfuscated
build, which is different than the original action ID, as we include
garble's own content ID in "go tool compile -V=full" via -toolexec.
Let's call that the "obfuscated action ID". Remember that a content ID
is roughly the hash of a binary or object file, and an action ID
contains the hash of a package's source code plus the content IDs of its
dependencies.
This had the advantage that it did what we wanted. However, it had one
massive drawback: when we compile a package, we only have the obfuscated
action IDs of its dependencies. This is because one can't have the
content ID of dependent packages before they are built.
Usually, this is not a problem, because hashing a foreign name means it
comes from a dependency, where we already have the obfuscated action ID.
However, that's not always the case.
First, go:linkname directives can point to any symbol that ends up in
the binary, even if the package is not a dependency. So garble could
only support linkname targets belonging to dependencies. This is at the
root of why we could not obfuscate the runtime; it contains linkname
directives targeting the net package, for example, which depends on runtime.
Second, some other places did not have an easy access to obfuscated
action IDs, like transformAsm, which had to recover it from a temporary
file stored by transformCompile.
Plus, this was all pretty expensive, as each toolexec sub-process had to
make repeated calls to buildidOf with the object files of dependencies.
We even had to use extra calls to "go list" in the case of indirect
dependencies, as their export files do not appear in importcfg files.
All in all, the old method was complex and expensive. A better mechanism
is to use the original action IDs directly, as listed by "go list"
without garble in the picture.
This would mean that the hashing does not change if garble changes,
meaning weaker obfuscation. To regain that property, we define the
"garble action ID", which is just the original action ID hashed together
with garble's own content ID.
This is practically the same as the obfuscated build ID we used before,
but since it doesn't go through "go tool compile -V=full" and the
obfuscated build itself, we can work out *all* the garble action IDs
upfront, before the obfuscated build even starts.
This fixes all of our problems. Now we know all garble build IDs
upfront, so a bunch of hacks can be entirely removed. Plus, since we
know them upfront, we can also cache them and avoid repeated calls to
"go tool buildid".
While at it, make use of the new BuildID field in Go 1.16's "list -json
-export". This avoids the vast majority of "go tool buildid" calls, as
the only ones that remain are 2 on the garble binary itself.
The numbers for Go 1.16 look very good:
name old time/op new time/op delta
Build-8 146ms ± 4% 101ms ± 1% -31.01% (p=0.002 n=6+6)
name old bin-B new bin-B delta
Build-8 6.61M ± 0% 6.60M ± 0% -0.09% (p=0.002 n=6+6)
name old sys-time/op new sys-time/op delta
Build-8 321ms ± 7% 202ms ± 6% -37.11% (p=0.002 n=6+6)
name old user-time/op new user-time/op delta
Build-8 538ms ± 4% 414ms ± 4% -23.12% (p=0.002 n=6+6)
4 years ago
|
|
|
sharedTempDir, err = saveSharedCache()
|
|
|
|
if err != nil {
|
|
|
|
return nil, err
|
|
|
|
}
|
|
|
|
os.Setenv("GARBLE_SHARED", sharedTempDir)
|
|
|
|
defer os.Remove(sharedTempDir)
|
|
|
|
|
|
|
|
goArgs := []string{
|
|
|
|
command,
|
|
|
|
"-trimpath",
|
|
|
|
"-toolexec=" + cache.ExecPath,
|
|
|
|
}
|
reimplement import path obfuscation without goobj2 (#242)
We used to rely on a parallel implementation of an object file parser
and writer to be able to obfuscate import paths. After compiling each
package, we would parse the object file, replace the import paths, and
write the updated object file in-place.
That worked well, in most cases. Unfortunately, it had some flaws:
* Complexity. Even when most of the code is maintained in a separate
module, the import_obfuscation.go file was still close to a thousand
lines of code.
* Go compatibility. The object file format changes between Go releases,
so we were supporting Go 1.15, but not 1.16. Fixing the object file
package to work with 1.16 would probably break 1.15 support.
* Bugs. For example, we recently had to add a workaround for #224, since
import paths containing dots after the domain would end up escaped.
Another example is #190, which seems to be caused by the object file
parser or writer corrupting the compiled code and causing segfaults in
some rare edge cases.
Instead, let's drop that method entirely, and force the compiler and
linker to do the work for us. The steps necessary when compiling a
package to obfuscate are:
1) Replace its "package foo" lines with the obfuscated package path. No
need to separate the package path and name, since the obfuscated path
does not contain slashes.
2) Replace the "-p pkg/foo" flag with the obfuscated path.
3) Replace the "import" spec lines with the obfuscated package paths,
for those dependencies which were obfuscated.
4) Replace the "-importcfg [...]" file with a version that uses the
obfuscated paths instead.
The linker also needs that last step, since it also uses an importcfg
file to find object files.
There are three noteworthy drawbacks to this new method:
1) Since we no longer write object files, we can't use them to store
data to be cached. As such, the -debugdir flag goes back to using the
"-a" build flag to always rebuild all packages. On the plus side,
that caching didn't work very well; see #176.
2) The package name "main" remains in all declarations under it, not
just "func main", since we can only rename entire packages. This
seems fine, as it gives little information to the end user.
3) The -tiny mode no longer sets all lines to 0, since it did that by
modifying object files. As a temporary measure, we instead set all
top-level declarations to be on line 1. A TODO is added to hopefully
improve this again in the near future.
The upside is that we get rid of all the issues mentioned before. Plus,
garble now nearly works with Go 1.16, with the exception of two very
minor bugs that look fixable. A follow-up PR will take care of that and
start testing on 1.16.
Fixes #176.
Fixes #190.
4 years ago
|
|
|
if flagDebugDir != "" {
|
|
|
|
// In case the user deletes the debug directory,
|
|
|
|
// and a previous build is cached,
|
|
|
|
// rebuild all packages to re-fill the debug dir.
|
|
|
|
goArgs = append(goArgs, "-a")
|
|
|
|
}
|
|
|
|
if command == "test" {
|
|
|
|
// vet is generally not useful on obfuscated code; keep it
|
|
|
|
// disabled by default.
|
|
|
|
goArgs = append(goArgs, "-vet=off")
|
|
|
|
}
|
|
|
|
goArgs = append(goArgs, flags...)
|
|
|
|
goArgs = append(goArgs, args...)
|
|
|
|
|
|
|
|
return exec.Command("go", goArgs...), nil
|
|
|
|
}
|
|
|
|
|
reimplement import path obfuscation without goobj2 (#242)
We used to rely on a parallel implementation of an object file parser
and writer to be able to obfuscate import paths. After compiling each
package, we would parse the object file, replace the import paths, and
write the updated object file in-place.
That worked well, in most cases. Unfortunately, it had some flaws:
* Complexity. Even when most of the code is maintained in a separate
module, the import_obfuscation.go file was still close to a thousand
lines of code.
* Go compatibility. The object file format changes between Go releases,
so we were supporting Go 1.15, but not 1.16. Fixing the object file
package to work with 1.16 would probably break 1.15 support.
* Bugs. For example, we recently had to add a workaround for #224, since
import paths containing dots after the domain would end up escaped.
Another example is #190, which seems to be caused by the object file
parser or writer corrupting the compiled code and causing segfaults in
some rare edge cases.
Instead, let's drop that method entirely, and force the compiler and
linker to do the work for us. The steps necessary when compiling a
package to obfuscate are:
1) Replace its "package foo" lines with the obfuscated package path. No
need to separate the package path and name, since the obfuscated path
does not contain slashes.
2) Replace the "-p pkg/foo" flag with the obfuscated path.
3) Replace the "import" spec lines with the obfuscated package paths,
for those dependencies which were obfuscated.
4) Replace the "-importcfg [...]" file with a version that uses the
obfuscated paths instead.
The linker also needs that last step, since it also uses an importcfg
file to find object files.
There are three noteworthy drawbacks to this new method:
1) Since we no longer write object files, we can't use them to store
data to be cached. As such, the -debugdir flag goes back to using the
"-a" build flag to always rebuild all packages. On the plus side,
that caching didn't work very well; see #176.
2) The package name "main" remains in all declarations under it, not
just "func main", since we can only rename entire packages. This
seems fine, as it gives little information to the end user.
3) The -tiny mode no longer sets all lines to 0, since it did that by
modifying object files. As a temporary measure, we instead set all
top-level declarations to be on line 1. A TODO is added to hopefully
improve this again in the near future.
The upside is that we get rid of all the issues mentioned before. Plus,
garble now nearly works with Go 1.16, with the exception of two very
minor bugs that look fixable. A follow-up PR will take care of that and
start testing on 1.16.
Fixes #176.
Fixes #190.
4 years ago
|
|
|
var transformFuncs = map[string]func([]string) (args []string, _ error){
|
|
|
|
"asm": transformAsm,
|
|
|
|
"compile": transformCompile,
|
|
|
|
"link": transformLink,
|
|
|
|
}
|
|
|
|
|
|
|
|
func transformAsm(args []string) ([]string, error) {
|
deprecate using GOPRIVATE in favor of GOGARBLE (#427)
Piggybacking off of GOPRIVATE is great for a number of reasons:
* People tend to obfuscate private code, whose package paths will
generally be in GOPRIVATE already
* Its meaning and syntax are well understood
* It allows all the flexibility we need without adding our own env var
or config option
However, using GOPRIVATE directly has one main drawback.
It's fairly common to also want to obfuscate public dependencies,
to make the code in private packages even harder to follow.
However, using "GOPRIVATE=*" will result in two main downsides:
* GONOPROXY defaults to GOPRIVATE, so the proxy would be entirely disabled.
Downloading modules, such as when adding or updating dependencies,
or when the local cache is cold, can be less reliable.
* GONOSUMDB defaults to GOPRIVATE, so the sumdb would be entirely disabled.
Adding entries to go.sum, such as when adding or updating dependencies,
can be less secure.
We will continue to consume GOPRIVATE as a fallback,
but we now expect users to set GOGARBLE instead.
The new logic is documented in the README.
While here, rewrite some uses of "private" with "to obfuscate",
to make the code easier to follow and harder to misunderstand.
Fixes #276.
3 years ago
|
|
|
if !curPkg.ToObfuscate {
|
|
|
|
return args, nil // we're not obfuscating this package
|
|
|
|
}
|
|
|
|
|
|
|
|
flags, paths := splitFlagsFromFiles(args, ".s")
|
|
|
|
|
|
|
|
// When assembling, the import path can make its way into the output
|
|
|
|
// object file.
|
deprecate using GOPRIVATE in favor of GOGARBLE (#427)
Piggybacking off of GOPRIVATE is great for a number of reasons:
* People tend to obfuscate private code, whose package paths will
generally be in GOPRIVATE already
* Its meaning and syntax are well understood
* It allows all the flexibility we need without adding our own env var
or config option
However, using GOPRIVATE directly has one main drawback.
It's fairly common to also want to obfuscate public dependencies,
to make the code in private packages even harder to follow.
However, using "GOPRIVATE=*" will result in two main downsides:
* GONOPROXY defaults to GOPRIVATE, so the proxy would be entirely disabled.
Downloading modules, such as when adding or updating dependencies,
or when the local cache is cold, can be less reliable.
* GONOSUMDB defaults to GOPRIVATE, so the sumdb would be entirely disabled.
Adding entries to go.sum, such as when adding or updating dependencies,
can be less secure.
We will continue to consume GOPRIVATE as a fallback,
but we now expect users to set GOGARBLE instead.
The new logic is documented in the README.
While here, rewrite some uses of "private" with "to obfuscate",
to make the code easier to follow and harder to misunderstand.
Fixes #276.
3 years ago
|
|
|
if curPkg.Name != "main" && curPkg.ToObfuscate {
|
refactor "current package" with TOOLEXEC_IMPORTPATH (#266)
Now that we've dropped support for Go 1.15.x, we can finally rely on
this environment variable for toolexec calls, present in Go 1.16.
Before, we had hacky ways of trying to figure out the current package's
import path, mostly from the -p flag. The biggest rough edge there was
that, for main packages, that was simply the package name, and not its
full import path.
To work around that, we had a restriction on a single main package, so
we could work around that issue. That restriction is now gone.
The new code is simpler, especially because we can set curPkg in a
single place for all toolexec transform funcs.
Since we can always rely on curPkg not being nil now, we can also start
reusing listedPackage.Private and avoid the majority of repeated calls
to isPrivate. The function is cheap, but still not free.
isPrivate itself can also get simpler. We no longer have to worry about
the "main" edge case. Plus, the sanity check for invalid package paths
is now unnecessary; we only got malformed paths from goobj2, and we now
require exact matches with the ImportPath field from "go list -json".
Another effect of clearing up the "main" edge case is that -debugdir now
uses the right directory for main packages. We also start using
consistent debugdir paths in the tests, for the sake of being easier to
read and maintain.
Finally, note that commandReverse did not need the extra call to "go
list -toolexec", as the "shared" call stored in the cache is enough. We
still call toolexecCmd to get said cache, which should probably be
simplified in a future PR.
While at it, replace the use of the "-std" compiler flag with the
Standard field from "go list -json".
4 years ago
|
|
|
flags = flagSetValue(flags, "-p", curPkg.obfuscatedImportPath())
|
|
|
|
}
|
|
|
|
|
|
|
|
flags = alterTrimpath(flags)
|
avoid reproducibility issues with full rebuilds
We were using temporary filenames for modified Go and assembly files.
For example, an obfuscated "encoding/json/encode.go" would end up as:
/tmp/garble-shared123/encode.go.456.go
where "123" and "456" are random numbers, usually longer.
This was usually fine for two reasons:
1) We would add "/tmp/garble-shared123/" to -trimpath, so the temporary
directory and its random number would be invisible.
2) We would add "//line" directives to the source files, replacing
the filename with obfuscated versions excluding any random number.
Unfortunately, this broke in multiple ways. Most notably, assembly files
do not have any line directives, and it's not clear that there's any
support for them. So the random number in their basename could end up in
the binary, breaking reproducibility.
Another issue is that the -trimpath addition described above was only
done for cmd/compile, not cmd/asm, so assembly filenames included the
randomized temporary directory.
To fix the issues above, the same "encoding/json/encode.go" would now
end up as:
/tmp/garble-shared123/encoding/json/encode.go
Such a path is still unique even though the "456" random number is gone,
as import paths are unique within a single build.
This fixes issues with the base name of each file, so we no longer rely
on line directives as the only way to remove the second original random
number.
We still rely on -trimpath to get rid of the temporary directory in
filenames. To fix its problem with assembly files, also amend the
-trimpath flag when running the assembler tool.
Finally, add a test that reproducible builds still work when a full
rebuild is done. We choose goprivate.txt for such a test as its
stdimporter package imports a number of std packages, including uses of
assembly and cgo.
For the time being, we don't use such a "full rebuild" reproducibility
test in other test scripts, as this step is expensive, rebuilding many
packages from scratch.
This issue went unnoticed for over a year because such random numbers
"123" and "456" were created when a package was obfuscated, and that
only happened once per package version as long as the build cache was
kept intact.
When clearing the build cache, or forcing a rebuild with -a, one gets
new random numbers, and thus a different binary resulting from the same
build input. That's not something that most users would do regularly,
and our tests did not cover that edge case either, until now.
Fixes #328.
4 years ago
|
|
|
|
fail if we are unexpectedly overwriting files (#418)
While investigating a bug report,
I noticed that garble was writing to the same temp file twice.
At best, writing to the same path on disk twice is wasteful,
as the design is careful to be deterministic and use unique paths.
At worst, the two writes could cause races at the filesystem level.
To prevent either of those situations,
we now create files with os.OpenFile and os.O_EXCL,
meaning that we will error if the file already exists.
That change uncovered a number of such unintended cases.
First, transformAsm would write obfuscated Go files twice.
This is because the Go toolchain actually runs:
[...]/asm -gensymabis [...] foo.s bar.s
[...]/asm [...] foo.s bar.s
That is, the first run is only meant to generate symbol ABIs,
which are then used by the compiler.
We need to obfuscate at that first stage,
because the symbol ABI descriptions need to use obfuscated names.
However, having already obfuscated the assembly on the first stage,
there is no need to do so again on the second stage.
If we detect gensymabis is missing, we simply reuse the previous files.
This first situation doesn't seem racy,
but obfuscating the Go assembly files twice is certainly unnecessary.
Second, saveKnownReflectAPIs wrote a gob file to the build cache.
Since the build cache can be kept between builds,
and since the build cache uses reproducible paths for each build,
running the same "garble build" twice could overwrite those files.
This could actually cause races at the filesystem level;
if two concurrent builds write to the same gob file on disk,
one of them could end up using a partially-written file.
Note that this is the only of the three cases not using temporary files.
As such, it is expected that the file may already exist.
In such a case, we simply avoid overwriting it rather than failing.
Third, when "garble build -a" was used,
and when we needed an export file not listed in importcfg,
we would end up calling roughly:
go list -export -toolexec=garble -a <dependency>
This meant we would re-build and re-obfuscate those packages.
Which is unfortunate, because the parent process already did via:
go build -toolexec=garble -a <main>
The repeated dependency builds tripped the new os.O_EXCL check,
as we would try to overwrite the same obfuscated Go files.
Beyond being wasteful, this could again cause subtle filesystem races.
To fix the problem, avoid passing flags like "-a" to nested go commands.
Overall, we should likely be using safer ways to write to disk,
be it via either atomic writes or locked files.
However, for now, catching duplicate writes is a big step.
I have left a self-assigned TODO for further improvements.
CI on the pull request found a failure on test-gotip.
The failure reproduces on master, so it seems to be related to gotip,
and not a regression introduced by this change.
For now, disable test-gotip until we can investigate.
4 years ago
|
|
|
// If the assembler is running just for -gensymabis,
|
|
|
|
// don't obfuscate the source, as we are not assembling yet.
|
|
|
|
// The assembler will run again later; obfuscating twice is just wasteful.
|
|
|
|
symabis := false
|
|
|
|
for _, arg := range args {
|
|
|
|
if arg == "-gensymabis" {
|
|
|
|
symabis = true
|
|
|
|
break
|
|
|
|
}
|
|
|
|
}
|
|
|
|
newPaths := make([]string, 0, len(paths))
|
|
|
|
if !symabis {
|
|
|
|
var newPaths []string
|
|
|
|
for _, path := range paths {
|
|
|
|
name := filepath.Base(path)
|
|
|
|
pkgDir := filepath.Join(sharedTempDir, filepath.FromSlash(curPkg.ImportPath))
|
|
|
|
newPath := filepath.Join(pkgDir, name)
|
|
|
|
newPaths = append(newPaths, newPath)
|
|
|
|
}
|
|
|
|
return append(flags, newPaths...), nil
|
|
|
|
}
|
|
|
|
|
|
|
|
// We need to replace all function references with their obfuscated name
|
|
|
|
// counterparts.
|
|
|
|
// Luckily, all func names in Go assembly files are immediately followed
|
|
|
|
// by the unicode "middle dot", like:
|
|
|
|
//
|
|
|
|
// TEXT ·privateAdd(SB),$0-24
|
|
|
|
const middleDot = '·'
|
|
|
|
middleDotLen := utf8.RuneLen(middleDot)
|
|
|
|
|
|
|
|
for _, path := range paths {
|
|
|
|
|
|
|
|
// Read the entire file into memory.
|
|
|
|
// If we find issues with large files, we can use bufio.
|
|
|
|
content, err := os.ReadFile(path)
|
|
|
|
if err != nil {
|
|
|
|
return nil, err
|
|
|
|
}
|
|
|
|
|
|
|
|
// Find all middle-dot names, and replace them.
|
|
|
|
remaining := content
|
|
|
|
var buf bytes.Buffer
|
|
|
|
for {
|
|
|
|
i := bytes.IndexRune(remaining, middleDot)
|
|
|
|
if i < 0 {
|
|
|
|
buf.Write(remaining)
|
|
|
|
remaining = nil
|
|
|
|
break
|
|
|
|
}
|
|
|
|
|
|
|
|
// We want to replace "OP ·foo" and "OP $·foo",
|
|
|
|
// but not "OP somepkg·foo" just yet.
|
|
|
|
// "somepkg" is often runtime, syscall, etc.
|
|
|
|
// We don't obfuscate any of those for now.
|
|
|
|
//
|
|
|
|
// TODO: we'll likely need to deal with this
|
|
|
|
// when we start obfuscating the runtime.
|
|
|
|
// When we do, note that we can't hash with curPkg.
|
|
|
|
localName := false
|
|
|
|
if i >= 0 {
|
|
|
|
switch remaining[i-1] {
|
|
|
|
case ' ', '\t', '$':
|
|
|
|
localName = true
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
i += middleDotLen
|
|
|
|
buf.Write(remaining[:i])
|
|
|
|
remaining = remaining[i:]
|
|
|
|
|
|
|
|
// The name ends at the first rune which cannot be part
|
|
|
|
// of a Go identifier, such as a comma or space.
|
|
|
|
nameEnd := 0
|
|
|
|
for nameEnd < len(remaining) {
|
|
|
|
c, size := utf8.DecodeRune(remaining[nameEnd:])
|
|
|
|
if !unicode.IsLetter(c) && c != '_' && !unicode.IsDigit(c) {
|
|
|
|
break
|
|
|
|
}
|
|
|
|
nameEnd += size
|
|
|
|
}
|
|
|
|
name := string(remaining[:nameEnd])
|
|
|
|
remaining = remaining[nameEnd:]
|
|
|
|
|
|
|
|
if !localName {
|
|
|
|
buf.WriteString(name)
|
|
|
|
continue
|
|
|
|
}
|
|
|
|
|
|
|
|
newName := hashWith(curPkg.GarbleActionID, name)
|
|
|
|
// log.Printf("%q hashed with %x to %q", name, curPkg.GarbleActionID, newName)
|
|
|
|
buf.WriteString(newName)
|
|
|
|
}
|
|
|
|
|
small improvements towards obfuscating the runtime
I spent a couple of days trying to obfuscate all of std.
Ultimately I failed at making it fully work,
especially when it comes to the runtime package,
but I did fix a few problems along the way, as seen here.
First, fix the TODO to allow handleDirectives and transformGo to run on
runtime packages as well, if they are considered private. Note that this
is never true right now, but it matters once we remove runtimeRelated.
Second, modify parsedebugvars in a way that doesn't break typechecking.
We can remove AST nodes or even modify them in simple ways,
but if we add new AST nodes after typechecking,
those will lack type information.
We were replacing the entire body, running into that problem.
Instead, carefully cut the body to set some defaults,
but remove everything from the point GODEBUG is read.
Finally, add commented-out debug prints of transformed asm files.
For #193.
4 years ago
|
|
|
// Uncomment for some quick debugging. Do not delete.
|
deprecate using GOPRIVATE in favor of GOGARBLE (#427)
Piggybacking off of GOPRIVATE is great for a number of reasons:
* People tend to obfuscate private code, whose package paths will
generally be in GOPRIVATE already
* Its meaning and syntax are well understood
* It allows all the flexibility we need without adding our own env var
or config option
However, using GOPRIVATE directly has one main drawback.
It's fairly common to also want to obfuscate public dependencies,
to make the code in private packages even harder to follow.
However, using "GOPRIVATE=*" will result in two main downsides:
* GONOPROXY defaults to GOPRIVATE, so the proxy would be entirely disabled.
Downloading modules, such as when adding or updating dependencies,
or when the local cache is cold, can be less reliable.
* GONOSUMDB defaults to GOPRIVATE, so the sumdb would be entirely disabled.
Adding entries to go.sum, such as when adding or updating dependencies,
can be less secure.
We will continue to consume GOPRIVATE as a fallback,
but we now expect users to set GOGARBLE instead.
The new logic is documented in the README.
While here, rewrite some uses of "private" with "to obfuscate",
to make the code easier to follow and harder to misunderstand.
Fixes #276.
3 years ago
|
|
|
// if curPkg.ToObfuscate {
|
small improvements towards obfuscating the runtime
I spent a couple of days trying to obfuscate all of std.
Ultimately I failed at making it fully work,
especially when it comes to the runtime package,
but I did fix a few problems along the way, as seen here.
First, fix the TODO to allow handleDirectives and transformGo to run on
runtime packages as well, if they are considered private. Note that this
is never true right now, but it matters once we remove runtimeRelated.
Second, modify parsedebugvars in a way that doesn't break typechecking.
We can remove AST nodes or even modify them in simple ways,
but if we add new AST nodes after typechecking,
those will lack type information.
We were replacing the entire body, running into that problem.
Instead, carefully cut the body to set some defaults,
but remove everything from the point GODEBUG is read.
Finally, add commented-out debug prints of transformed asm files.
For #193.
4 years ago
|
|
|
// fmt.Fprintf(os.Stderr, "\n-- %s --\n%s", path, buf.Bytes())
|
|
|
|
// }
|
|
|
|
|
avoid reproducibility issues with full rebuilds
We were using temporary filenames for modified Go and assembly files.
For example, an obfuscated "encoding/json/encode.go" would end up as:
/tmp/garble-shared123/encode.go.456.go
where "123" and "456" are random numbers, usually longer.
This was usually fine for two reasons:
1) We would add "/tmp/garble-shared123/" to -trimpath, so the temporary
directory and its random number would be invisible.
2) We would add "//line" directives to the source files, replacing
the filename with obfuscated versions excluding any random number.
Unfortunately, this broke in multiple ways. Most notably, assembly files
do not have any line directives, and it's not clear that there's any
support for them. So the random number in their basename could end up in
the binary, breaking reproducibility.
Another issue is that the -trimpath addition described above was only
done for cmd/compile, not cmd/asm, so assembly filenames included the
randomized temporary directory.
To fix the issues above, the same "encoding/json/encode.go" would now
end up as:
/tmp/garble-shared123/encoding/json/encode.go
Such a path is still unique even though the "456" random number is gone,
as import paths are unique within a single build.
This fixes issues with the base name of each file, so we no longer rely
on line directives as the only way to remove the second original random
number.
We still rely on -trimpath to get rid of the temporary directory in
filenames. To fix its problem with assembly files, also amend the
-trimpath flag when running the assembler tool.
Finally, add a test that reproducible builds still work when a full
rebuild is done. We choose goprivate.txt for such a test as its
stdimporter package imports a number of std packages, including uses of
assembly and cgo.
For the time being, we don't use such a "full rebuild" reproducibility
test in other test scripts, as this step is expensive, rebuilding many
packages from scratch.
This issue went unnoticed for over a year because such random numbers
"123" and "456" were created when a package was obfuscated, and that
only happened once per package version as long as the build cache was
kept intact.
When clearing the build cache, or forcing a rebuild with -a, one gets
new random numbers, and thus a different binary resulting from the same
build input. That's not something that most users would do regularly,
and our tests did not cover that edge case either, until now.
Fixes #328.
4 years ago
|
|
|
name := filepath.Base(path)
|
|
|
|
if path, err := writeTemp(name, buf.Bytes()); err != nil {
|
|
|
|
return nil, err
|
|
|
|
} else {
|
|
|
|
newPaths = append(newPaths, path)
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
return append(flags, newPaths...), nil
|
|
|
|
}
|
|
|
|
|
|
|
|
// writeTemp is a mix between os.CreateTemp and os.WriteFile, as it writes a
|
avoid reproducibility issues with full rebuilds
We were using temporary filenames for modified Go and assembly files.
For example, an obfuscated "encoding/json/encode.go" would end up as:
/tmp/garble-shared123/encode.go.456.go
where "123" and "456" are random numbers, usually longer.
This was usually fine for two reasons:
1) We would add "/tmp/garble-shared123/" to -trimpath, so the temporary
directory and its random number would be invisible.
2) We would add "//line" directives to the source files, replacing
the filename with obfuscated versions excluding any random number.
Unfortunately, this broke in multiple ways. Most notably, assembly files
do not have any line directives, and it's not clear that there's any
support for them. So the random number in their basename could end up in
the binary, breaking reproducibility.
Another issue is that the -trimpath addition described above was only
done for cmd/compile, not cmd/asm, so assembly filenames included the
randomized temporary directory.
To fix the issues above, the same "encoding/json/encode.go" would now
end up as:
/tmp/garble-shared123/encoding/json/encode.go
Such a path is still unique even though the "456" random number is gone,
as import paths are unique within a single build.
This fixes issues with the base name of each file, so we no longer rely
on line directives as the only way to remove the second original random
number.
We still rely on -trimpath to get rid of the temporary directory in
filenames. To fix its problem with assembly files, also amend the
-trimpath flag when running the assembler tool.
Finally, add a test that reproducible builds still work when a full
rebuild is done. We choose goprivate.txt for such a test as its
stdimporter package imports a number of std packages, including uses of
assembly and cgo.
For the time being, we don't use such a "full rebuild" reproducibility
test in other test scripts, as this step is expensive, rebuilding many
packages from scratch.
This issue went unnoticed for over a year because such random numbers
"123" and "456" were created when a package was obfuscated, and that
only happened once per package version as long as the build cache was
kept intact.
When clearing the build cache, or forcing a rebuild with -a, one gets
new random numbers, and thus a different binary resulting from the same
build input. That's not something that most users would do regularly,
and our tests did not cover that edge case either, until now.
Fixes #328.
4 years ago
|
|
|
// named source file in sharedTempDir given an input buffer.
|
|
|
|
//
|
avoid reproducibility issues with full rebuilds
We were using temporary filenames for modified Go and assembly files.
For example, an obfuscated "encoding/json/encode.go" would end up as:
/tmp/garble-shared123/encode.go.456.go
where "123" and "456" are random numbers, usually longer.
This was usually fine for two reasons:
1) We would add "/tmp/garble-shared123/" to -trimpath, so the temporary
directory and its random number would be invisible.
2) We would add "//line" directives to the source files, replacing
the filename with obfuscated versions excluding any random number.
Unfortunately, this broke in multiple ways. Most notably, assembly files
do not have any line directives, and it's not clear that there's any
support for them. So the random number in their basename could end up in
the binary, breaking reproducibility.
Another issue is that the -trimpath addition described above was only
done for cmd/compile, not cmd/asm, so assembly filenames included the
randomized temporary directory.
To fix the issues above, the same "encoding/json/encode.go" would now
end up as:
/tmp/garble-shared123/encoding/json/encode.go
Such a path is still unique even though the "456" random number is gone,
as import paths are unique within a single build.
This fixes issues with the base name of each file, so we no longer rely
on line directives as the only way to remove the second original random
number.
We still rely on -trimpath to get rid of the temporary directory in
filenames. To fix its problem with assembly files, also amend the
-trimpath flag when running the assembler tool.
Finally, add a test that reproducible builds still work when a full
rebuild is done. We choose goprivate.txt for such a test as its
stdimporter package imports a number of std packages, including uses of
assembly and cgo.
For the time being, we don't use such a "full rebuild" reproducibility
test in other test scripts, as this step is expensive, rebuilding many
packages from scratch.
This issue went unnoticed for over a year because such random numbers
"123" and "456" were created when a package was obfuscated, and that
only happened once per package version as long as the build cache was
kept intact.
When clearing the build cache, or forcing a rebuild with -a, one gets
new random numbers, and thus a different binary resulting from the same
build input. That's not something that most users would do regularly,
and our tests did not cover that edge case either, until now.
Fixes #328.
4 years ago
|
|
|
// Note that the file is created under a directory tree following curPkg's
|
|
|
|
// import path, mimicking how files are laid out in modules and GOROOT.
|
|
|
|
func writeTemp(name string, content []byte) (string, error) {
|
avoid reproducibility issues with full rebuilds
We were using temporary filenames for modified Go and assembly files.
For example, an obfuscated "encoding/json/encode.go" would end up as:
/tmp/garble-shared123/encode.go.456.go
where "123" and "456" are random numbers, usually longer.
This was usually fine for two reasons:
1) We would add "/tmp/garble-shared123/" to -trimpath, so the temporary
directory and its random number would be invisible.
2) We would add "//line" directives to the source files, replacing
the filename with obfuscated versions excluding any random number.
Unfortunately, this broke in multiple ways. Most notably, assembly files
do not have any line directives, and it's not clear that there's any
support for them. So the random number in their basename could end up in
the binary, breaking reproducibility.
Another issue is that the -trimpath addition described above was only
done for cmd/compile, not cmd/asm, so assembly filenames included the
randomized temporary directory.
To fix the issues above, the same "encoding/json/encode.go" would now
end up as:
/tmp/garble-shared123/encoding/json/encode.go
Such a path is still unique even though the "456" random number is gone,
as import paths are unique within a single build.
This fixes issues with the base name of each file, so we no longer rely
on line directives as the only way to remove the second original random
number.
We still rely on -trimpath to get rid of the temporary directory in
filenames. To fix its problem with assembly files, also amend the
-trimpath flag when running the assembler tool.
Finally, add a test that reproducible builds still work when a full
rebuild is done. We choose goprivate.txt for such a test as its
stdimporter package imports a number of std packages, including uses of
assembly and cgo.
For the time being, we don't use such a "full rebuild" reproducibility
test in other test scripts, as this step is expensive, rebuilding many
packages from scratch.
This issue went unnoticed for over a year because such random numbers
"123" and "456" were created when a package was obfuscated, and that
only happened once per package version as long as the build cache was
kept intact.
When clearing the build cache, or forcing a rebuild with -a, one gets
new random numbers, and thus a different binary resulting from the same
build input. That's not something that most users would do regularly,
and our tests did not cover that edge case either, until now.
Fixes #328.
4 years ago
|
|
|
pkgDir := filepath.Join(sharedTempDir, filepath.FromSlash(curPkg.ImportPath))
|
|
|
|
if err := os.MkdirAll(pkgDir, 0o777); err != nil {
|
|
|
|
return "", err
|
|
|
|
}
|
avoid reproducibility issues with full rebuilds
We were using temporary filenames for modified Go and assembly files.
For example, an obfuscated "encoding/json/encode.go" would end up as:
/tmp/garble-shared123/encode.go.456.go
where "123" and "456" are random numbers, usually longer.
This was usually fine for two reasons:
1) We would add "/tmp/garble-shared123/" to -trimpath, so the temporary
directory and its random number would be invisible.
2) We would add "//line" directives to the source files, replacing
the filename with obfuscated versions excluding any random number.
Unfortunately, this broke in multiple ways. Most notably, assembly files
do not have any line directives, and it's not clear that there's any
support for them. So the random number in their basename could end up in
the binary, breaking reproducibility.
Another issue is that the -trimpath addition described above was only
done for cmd/compile, not cmd/asm, so assembly filenames included the
randomized temporary directory.
To fix the issues above, the same "encoding/json/encode.go" would now
end up as:
/tmp/garble-shared123/encoding/json/encode.go
Such a path is still unique even though the "456" random number is gone,
as import paths are unique within a single build.
This fixes issues with the base name of each file, so we no longer rely
on line directives as the only way to remove the second original random
number.
We still rely on -trimpath to get rid of the temporary directory in
filenames. To fix its problem with assembly files, also amend the
-trimpath flag when running the assembler tool.
Finally, add a test that reproducible builds still work when a full
rebuild is done. We choose goprivate.txt for such a test as its
stdimporter package imports a number of std packages, including uses of
assembly and cgo.
For the time being, we don't use such a "full rebuild" reproducibility
test in other test scripts, as this step is expensive, rebuilding many
packages from scratch.
This issue went unnoticed for over a year because such random numbers
"123" and "456" were created when a package was obfuscated, and that
only happened once per package version as long as the build cache was
kept intact.
When clearing the build cache, or forcing a rebuild with -a, one gets
new random numbers, and thus a different binary resulting from the same
build input. That's not something that most users would do regularly,
and our tests did not cover that edge case either, until now.
Fixes #328.
4 years ago
|
|
|
dstPath := filepath.Join(pkgDir, name)
|
fail if we are unexpectedly overwriting files (#418)
While investigating a bug report,
I noticed that garble was writing to the same temp file twice.
At best, writing to the same path on disk twice is wasteful,
as the design is careful to be deterministic and use unique paths.
At worst, the two writes could cause races at the filesystem level.
To prevent either of those situations,
we now create files with os.OpenFile and os.O_EXCL,
meaning that we will error if the file already exists.
That change uncovered a number of such unintended cases.
First, transformAsm would write obfuscated Go files twice.
This is because the Go toolchain actually runs:
[...]/asm -gensymabis [...] foo.s bar.s
[...]/asm [...] foo.s bar.s
That is, the first run is only meant to generate symbol ABIs,
which are then used by the compiler.
We need to obfuscate at that first stage,
because the symbol ABI descriptions need to use obfuscated names.
However, having already obfuscated the assembly on the first stage,
there is no need to do so again on the second stage.
If we detect gensymabis is missing, we simply reuse the previous files.
This first situation doesn't seem racy,
but obfuscating the Go assembly files twice is certainly unnecessary.
Second, saveKnownReflectAPIs wrote a gob file to the build cache.
Since the build cache can be kept between builds,
and since the build cache uses reproducible paths for each build,
running the same "garble build" twice could overwrite those files.
This could actually cause races at the filesystem level;
if two concurrent builds write to the same gob file on disk,
one of them could end up using a partially-written file.
Note that this is the only of the three cases not using temporary files.
As such, it is expected that the file may already exist.
In such a case, we simply avoid overwriting it rather than failing.
Third, when "garble build -a" was used,
and when we needed an export file not listed in importcfg,
we would end up calling roughly:
go list -export -toolexec=garble -a <dependency>
This meant we would re-build and re-obfuscate those packages.
Which is unfortunate, because the parent process already did via:
go build -toolexec=garble -a <main>
The repeated dependency builds tripped the new os.O_EXCL check,
as we would try to overwrite the same obfuscated Go files.
Beyond being wasteful, this could again cause subtle filesystem races.
To fix the problem, avoid passing flags like "-a" to nested go commands.
Overall, we should likely be using safer ways to write to disk,
be it via either atomic writes or locked files.
However, for now, catching duplicate writes is a big step.
I have left a self-assigned TODO for further improvements.
CI on the pull request found a failure on test-gotip.
The failure reproduces on master, so it seems to be related to gotip,
and not a regression introduced by this change.
For now, disable test-gotip until we can investigate.
4 years ago
|
|
|
if err := writeFileExclusive(dstPath, content); err != nil {
|
|
|
|
return "", err
|
|
|
|
}
|
avoid reproducibility issues with full rebuilds
We were using temporary filenames for modified Go and assembly files.
For example, an obfuscated "encoding/json/encode.go" would end up as:
/tmp/garble-shared123/encode.go.456.go
where "123" and "456" are random numbers, usually longer.
This was usually fine for two reasons:
1) We would add "/tmp/garble-shared123/" to -trimpath, so the temporary
directory and its random number would be invisible.
2) We would add "//line" directives to the source files, replacing
the filename with obfuscated versions excluding any random number.
Unfortunately, this broke in multiple ways. Most notably, assembly files
do not have any line directives, and it's not clear that there's any
support for them. So the random number in their basename could end up in
the binary, breaking reproducibility.
Another issue is that the -trimpath addition described above was only
done for cmd/compile, not cmd/asm, so assembly filenames included the
randomized temporary directory.
To fix the issues above, the same "encoding/json/encode.go" would now
end up as:
/tmp/garble-shared123/encoding/json/encode.go
Such a path is still unique even though the "456" random number is gone,
as import paths are unique within a single build.
This fixes issues with the base name of each file, so we no longer rely
on line directives as the only way to remove the second original random
number.
We still rely on -trimpath to get rid of the temporary directory in
filenames. To fix its problem with assembly files, also amend the
-trimpath flag when running the assembler tool.
Finally, add a test that reproducible builds still work when a full
rebuild is done. We choose goprivate.txt for such a test as its
stdimporter package imports a number of std packages, including uses of
assembly and cgo.
For the time being, we don't use such a "full rebuild" reproducibility
test in other test scripts, as this step is expensive, rebuilding many
packages from scratch.
This issue went unnoticed for over a year because such random numbers
"123" and "456" were created when a package was obfuscated, and that
only happened once per package version as long as the build cache was
kept intact.
When clearing the build cache, or forcing a rebuild with -a, one gets
new random numbers, and thus a different binary resulting from the same
build input. That's not something that most users would do regularly,
and our tests did not cover that edge case either, until now.
Fixes #328.
4 years ago
|
|
|
return dstPath, nil
|
|
|
|
}
|
|
|
|
|
reimplement import path obfuscation without goobj2 (#242)
We used to rely on a parallel implementation of an object file parser
and writer to be able to obfuscate import paths. After compiling each
package, we would parse the object file, replace the import paths, and
write the updated object file in-place.
That worked well, in most cases. Unfortunately, it had some flaws:
* Complexity. Even when most of the code is maintained in a separate
module, the import_obfuscation.go file was still close to a thousand
lines of code.
* Go compatibility. The object file format changes between Go releases,
so we were supporting Go 1.15, but not 1.16. Fixing the object file
package to work with 1.16 would probably break 1.15 support.
* Bugs. For example, we recently had to add a workaround for #224, since
import paths containing dots after the domain would end up escaped.
Another example is #190, which seems to be caused by the object file
parser or writer corrupting the compiled code and causing segfaults in
some rare edge cases.
Instead, let's drop that method entirely, and force the compiler and
linker to do the work for us. The steps necessary when compiling a
package to obfuscate are:
1) Replace its "package foo" lines with the obfuscated package path. No
need to separate the package path and name, since the obfuscated path
does not contain slashes.
2) Replace the "-p pkg/foo" flag with the obfuscated path.
3) Replace the "import" spec lines with the obfuscated package paths,
for those dependencies which were obfuscated.
4) Replace the "-importcfg [...]" file with a version that uses the
obfuscated paths instead.
The linker also needs that last step, since it also uses an importcfg
file to find object files.
There are three noteworthy drawbacks to this new method:
1) Since we no longer write object files, we can't use them to store
data to be cached. As such, the -debugdir flag goes back to using the
"-a" build flag to always rebuild all packages. On the plus side,
that caching didn't work very well; see #176.
2) The package name "main" remains in all declarations under it, not
just "func main", since we can only rename entire packages. This
seems fine, as it gives little information to the end user.
3) The -tiny mode no longer sets all lines to 0, since it did that by
modifying object files. As a temporary measure, we instead set all
top-level declarations to be on line 1. A TODO is added to hopefully
improve this again in the near future.
The upside is that we get rid of all the issues mentioned before. Plus,
garble now nearly works with Go 1.16, with the exception of two very
minor bugs that look fixable. A follow-up PR will take care of that and
start testing on 1.16.
Fixes #176.
Fixes #190.
4 years ago
|
|
|
func transformCompile(args []string) ([]string, error) {
|
|
|
|
var err error
|
|
|
|
flags, paths := splitFlagsFromFiles(args, ".go")
|
always use the compiler's -dwarf=false flag (#96)
First, our original append line was completely ineffective; we never
used that "flags" slice again. Second, we only attempted to use the flag
when we obfuscated a package.
In fact, we never care about debugging information here, so for any
package we compile, we can add "-dwarf=false". At the moment, we compile
all packages, even if they aren't to be obfuscated, due to the lack of
access to the build cache.
As such, we save a significant amount of work. The numbers below were
obtained on a quiet machine with "go test -bench=. -benchtime=10x", six
times before and after the change.
name old time/op new time/op delta
Build-8 2.06s ± 4% 1.87s ± 2% -9.21% (p=0.002 n=6+6)
name old sys-time/op new sys-time/op delta
Build-8 1.51s ± 2% 1.46s ± 1% -3.12% (p=0.004 n=6+5)
name old user-time/op new user-time/op delta
Build-8 11.9s ± 2% 10.8s ± 1% -8.71% (p=0.002 n=6+6)
While at it, only do CI builds on pushes and PRs to the master branch,
so that my PRs created from the same repo don't trigger duplicate
builds.
5 years ago
|
|
|
|
|
|
|
// We will force the linker to drop DWARF via -w, so don't spend time
|
|
|
|
// generating it.
|
|
|
|
flags = append(flags, "-dwarf=false")
|
|
|
|
|
|
|
|
for i, path := range paths {
|
|
|
|
if filepath.Base(path) == "_gomod_.go" {
|
|
|
|
// never include module info
|
|
|
|
paths = append(paths[:i], paths[i+1:]...)
|
|
|
|
break
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
var files []*ast.File
|
|
|
|
for _, path := range paths {
|
|
|
|
file, err := parser.ParseFile(fset, path, nil, parser.ParseComments)
|
|
|
|
if err != nil {
|
reimplement import path obfuscation without goobj2 (#242)
We used to rely on a parallel implementation of an object file parser
and writer to be able to obfuscate import paths. After compiling each
package, we would parse the object file, replace the import paths, and
write the updated object file in-place.
That worked well, in most cases. Unfortunately, it had some flaws:
* Complexity. Even when most of the code is maintained in a separate
module, the import_obfuscation.go file was still close to a thousand
lines of code.
* Go compatibility. The object file format changes between Go releases,
so we were supporting Go 1.15, but not 1.16. Fixing the object file
package to work with 1.16 would probably break 1.15 support.
* Bugs. For example, we recently had to add a workaround for #224, since
import paths containing dots after the domain would end up escaped.
Another example is #190, which seems to be caused by the object file
parser or writer corrupting the compiled code and causing segfaults in
some rare edge cases.
Instead, let's drop that method entirely, and force the compiler and
linker to do the work for us. The steps necessary when compiling a
package to obfuscate are:
1) Replace its "package foo" lines with the obfuscated package path. No
need to separate the package path and name, since the obfuscated path
does not contain slashes.
2) Replace the "-p pkg/foo" flag with the obfuscated path.
3) Replace the "import" spec lines with the obfuscated package paths,
for those dependencies which were obfuscated.
4) Replace the "-importcfg [...]" file with a version that uses the
obfuscated paths instead.
The linker also needs that last step, since it also uses an importcfg
file to find object files.
There are three noteworthy drawbacks to this new method:
1) Since we no longer write object files, we can't use them to store
data to be cached. As such, the -debugdir flag goes back to using the
"-a" build flag to always rebuild all packages. On the plus side,
that caching didn't work very well; see #176.
2) The package name "main" remains in all declarations under it, not
just "func main", since we can only rename entire packages. This
seems fine, as it gives little information to the end user.
3) The -tiny mode no longer sets all lines to 0, since it did that by
modifying object files. As a temporary measure, we instead set all
top-level declarations to be on line 1. A TODO is added to hopefully
improve this again in the near future.
The upside is that we get rid of all the issues mentioned before. Plus,
garble now nearly works with Go 1.16, with the exception of two very
minor bugs that look fixable. A follow-up PR will take care of that and
start testing on 1.16.
Fixes #176.
Fixes #190.
4 years ago
|
|
|
return nil, err
|
|
|
|
}
|
|
|
|
files = append(files, file)
|
|
|
|
}
|
|
|
|
tf := newTransformer()
|
|
|
|
if err := tf.typecheck(files); err != nil {
|
|
|
|
return nil, err
|
|
|
|
}
|
|
|
|
|
|
|
|
flags = alterTrimpath(flags)
|
|
|
|
|
fail if we are unexpectedly overwriting files (#418)
While investigating a bug report,
I noticed that garble was writing to the same temp file twice.
At best, writing to the same path on disk twice is wasteful,
as the design is careful to be deterministic and use unique paths.
At worst, the two writes could cause races at the filesystem level.
To prevent either of those situations,
we now create files with os.OpenFile and os.O_EXCL,
meaning that we will error if the file already exists.
That change uncovered a number of such unintended cases.
First, transformAsm would write obfuscated Go files twice.
This is because the Go toolchain actually runs:
[...]/asm -gensymabis [...] foo.s bar.s
[...]/asm [...] foo.s bar.s
That is, the first run is only meant to generate symbol ABIs,
which are then used by the compiler.
We need to obfuscate at that first stage,
because the symbol ABI descriptions need to use obfuscated names.
However, having already obfuscated the assembly on the first stage,
there is no need to do so again on the second stage.
If we detect gensymabis is missing, we simply reuse the previous files.
This first situation doesn't seem racy,
but obfuscating the Go assembly files twice is certainly unnecessary.
Second, saveKnownReflectAPIs wrote a gob file to the build cache.
Since the build cache can be kept between builds,
and since the build cache uses reproducible paths for each build,
running the same "garble build" twice could overwrite those files.
This could actually cause races at the filesystem level;
if two concurrent builds write to the same gob file on disk,
one of them could end up using a partially-written file.
Note that this is the only of the three cases not using temporary files.
As such, it is expected that the file may already exist.
In such a case, we simply avoid overwriting it rather than failing.
Third, when "garble build -a" was used,
and when we needed an export file not listed in importcfg,
we would end up calling roughly:
go list -export -toolexec=garble -a <dependency>
This meant we would re-build and re-obfuscate those packages.
Which is unfortunate, because the parent process already did via:
go build -toolexec=garble -a <main>
The repeated dependency builds tripped the new os.O_EXCL check,
as we would try to overwrite the same obfuscated Go files.
Beyond being wasteful, this could again cause subtle filesystem races.
To fix the problem, avoid passing flags like "-a" to nested go commands.
Overall, we should likely be using safer ways to write to disk,
be it via either atomic writes or locked files.
However, for now, catching duplicate writes is a big step.
I have left a self-assigned TODO for further improvements.
CI on the pull request found a failure on test-gotip.
The failure reproduces on master, so it seems to be related to gotip,
and not a regression introduced by this change.
For now, disable test-gotip until we can investigate.
4 years ago
|
|
|
// Note that if the file already exists in the cache from another build,
|
|
|
|
// we don't need to write to it again thanks to the hash.
|
|
|
|
// TODO: as an optimization, just load that one gob file.
|
stop relying on nested "go list -toolexec" calls (#422)
We rely on importcfg files to load type info for obfuscated packages.
We use this type information to remember what names we didn't obfuscate.
Unfortunately, indirect dependencies aren't listed in importcfg files,
so we relied on extra "go list -toolexec" calls to locate object files.
This worked fine, but added a significant amount of complexity.
The extra "go list -export -toolexec=garble" invocations weren't slow,
as they avoided rebuilding or re-obfuscating thanks to the build cache.
Still, it was hard to reason about how garble runs during a build
if we might have multiple layers of -toolexec invocations.
Instead, record the export files we encounter in an incremental map,
and persist it in the build cache via the gob file we're already using.
This way, each garble invocation knows where all object files are,
even those for indirect imports.
One wrinkle is that importcfg files can point to temporary object files.
In that case, figure out its final location in the build cache.
This requires hard-coding a bit of knowledge about how GOCACHE works,
but it seems relatively harmless given how it's very little code.
Plus, if GOCACHE ever changes, it will be obvious when our code breaks.
Finally, add a TODO about potentially saving even more work.
4 years ago
|
|
|
if err := loadCachedOutputs(); err != nil {
|
|
|
|
return nil, err
|
|
|
|
}
|
stop relying on nested "go list -toolexec" calls (#422)
We rely on importcfg files to load type info for obfuscated packages.
We use this type information to remember what names we didn't obfuscate.
Unfortunately, indirect dependencies aren't listed in importcfg files,
so we relied on extra "go list -toolexec" calls to locate object files.
This worked fine, but added a significant amount of complexity.
The extra "go list -export -toolexec=garble" invocations weren't slow,
as they avoided rebuilding or re-obfuscating thanks to the build cache.
Still, it was hard to reason about how garble runs during a build
if we might have multiple layers of -toolexec invocations.
Instead, record the export files we encounter in an incremental map,
and persist it in the build cache via the gob file we're already using.
This way, each garble invocation knows where all object files are,
even those for indirect imports.
One wrinkle is that importcfg files can point to temporary object files.
In that case, figure out its final location in the build cache.
This requires hard-coding a bit of knowledge about how GOCACHE works,
but it seems relatively harmless given how it's very little code.
Plus, if GOCACHE ever changes, it will be obvious when our code breaks.
Finally, add a TODO about potentially saving even more work.
4 years ago
|
|
|
|
|
|
|
tf.findReflectFunctions(files)
|
stop relying on nested "go list -toolexec" calls (#422)
We rely on importcfg files to load type info for obfuscated packages.
We use this type information to remember what names we didn't obfuscate.
Unfortunately, indirect dependencies aren't listed in importcfg files,
so we relied on extra "go list -toolexec" calls to locate object files.
This worked fine, but added a significant amount of complexity.
The extra "go list -export -toolexec=garble" invocations weren't slow,
as they avoided rebuilding or re-obfuscating thanks to the build cache.
Still, it was hard to reason about how garble runs during a build
if we might have multiple layers of -toolexec invocations.
Instead, record the export files we encounter in an incremental map,
and persist it in the build cache via the gob file we're already using.
This way, each garble invocation knows where all object files are,
even those for indirect imports.
One wrinkle is that importcfg files can point to temporary object files.
In that case, figure out its final location in the build cache.
This requires hard-coding a bit of knowledge about how GOCACHE works,
but it seems relatively harmless given how it's very little code.
Plus, if GOCACHE ever changes, it will be obvious when our code breaks.
Finally, add a TODO about potentially saving even more work.
4 years ago
|
|
|
newImportCfg, err := processImportCfg(flags)
|
|
|
|
if err != nil {
|
|
|
|
return nil, err
|
|
|
|
}
|
|
|
|
|
fail if we are unexpectedly overwriting files (#418)
While investigating a bug report,
I noticed that garble was writing to the same temp file twice.
At best, writing to the same path on disk twice is wasteful,
as the design is careful to be deterministic and use unique paths.
At worst, the two writes could cause races at the filesystem level.
To prevent either of those situations,
we now create files with os.OpenFile and os.O_EXCL,
meaning that we will error if the file already exists.
That change uncovered a number of such unintended cases.
First, transformAsm would write obfuscated Go files twice.
This is because the Go toolchain actually runs:
[...]/asm -gensymabis [...] foo.s bar.s
[...]/asm [...] foo.s bar.s
That is, the first run is only meant to generate symbol ABIs,
which are then used by the compiler.
We need to obfuscate at that first stage,
because the symbol ABI descriptions need to use obfuscated names.
However, having already obfuscated the assembly on the first stage,
there is no need to do so again on the second stage.
If we detect gensymabis is missing, we simply reuse the previous files.
This first situation doesn't seem racy,
but obfuscating the Go assembly files twice is certainly unnecessary.
Second, saveKnownReflectAPIs wrote a gob file to the build cache.
Since the build cache can be kept between builds,
and since the build cache uses reproducible paths for each build,
running the same "garble build" twice could overwrite those files.
This could actually cause races at the filesystem level;
if two concurrent builds write to the same gob file on disk,
one of them could end up using a partially-written file.
Note that this is the only of the three cases not using temporary files.
As such, it is expected that the file may already exist.
In such a case, we simply avoid overwriting it rather than failing.
Third, when "garble build -a" was used,
and when we needed an export file not listed in importcfg,
we would end up calling roughly:
go list -export -toolexec=garble -a <dependency>
This meant we would re-build and re-obfuscate those packages.
Which is unfortunate, because the parent process already did via:
go build -toolexec=garble -a <main>
The repeated dependency builds tripped the new os.O_EXCL check,
as we would try to overwrite the same obfuscated Go files.
Beyond being wasteful, this could again cause subtle filesystem races.
To fix the problem, avoid passing flags like "-a" to nested go commands.
Overall, we should likely be using safer ways to write to disk,
be it via either atomic writes or locked files.
However, for now, catching duplicate writes is a big step.
I have left a self-assigned TODO for further improvements.
CI on the pull request found a failure on test-gotip.
The failure reproduces on master, so it seems to be related to gotip,
and not a regression introduced by this change.
For now, disable test-gotip until we can investigate.
4 years ago
|
|
|
if err := writeGobExclusive(
|
|
|
|
garbleExportFile(curPkg),
|
stop relying on nested "go list -toolexec" calls (#422)
We rely on importcfg files to load type info for obfuscated packages.
We use this type information to remember what names we didn't obfuscate.
Unfortunately, indirect dependencies aren't listed in importcfg files,
so we relied on extra "go list -toolexec" calls to locate object files.
This worked fine, but added a significant amount of complexity.
The extra "go list -export -toolexec=garble" invocations weren't slow,
as they avoided rebuilding or re-obfuscating thanks to the build cache.
Still, it was hard to reason about how garble runs during a build
if we might have multiple layers of -toolexec invocations.
Instead, record the export files we encounter in an incremental map,
and persist it in the build cache via the gob file we're already using.
This way, each garble invocation knows where all object files are,
even those for indirect imports.
One wrinkle is that importcfg files can point to temporary object files.
In that case, figure out its final location in the build cache.
This requires hard-coding a bit of knowledge about how GOCACHE works,
but it seems relatively harmless given how it's very little code.
Plus, if GOCACHE ever changes, it will be obvious when our code breaks.
Finally, add a TODO about potentially saving even more work.
4 years ago
|
|
|
cachedOutput,
|
fail if we are unexpectedly overwriting files (#418)
While investigating a bug report,
I noticed that garble was writing to the same temp file twice.
At best, writing to the same path on disk twice is wasteful,
as the design is careful to be deterministic and use unique paths.
At worst, the two writes could cause races at the filesystem level.
To prevent either of those situations,
we now create files with os.OpenFile and os.O_EXCL,
meaning that we will error if the file already exists.
That change uncovered a number of such unintended cases.
First, transformAsm would write obfuscated Go files twice.
This is because the Go toolchain actually runs:
[...]/asm -gensymabis [...] foo.s bar.s
[...]/asm [...] foo.s bar.s
That is, the first run is only meant to generate symbol ABIs,
which are then used by the compiler.
We need to obfuscate at that first stage,
because the symbol ABI descriptions need to use obfuscated names.
However, having already obfuscated the assembly on the first stage,
there is no need to do so again on the second stage.
If we detect gensymabis is missing, we simply reuse the previous files.
This first situation doesn't seem racy,
but obfuscating the Go assembly files twice is certainly unnecessary.
Second, saveKnownReflectAPIs wrote a gob file to the build cache.
Since the build cache can be kept between builds,
and since the build cache uses reproducible paths for each build,
running the same "garble build" twice could overwrite those files.
This could actually cause races at the filesystem level;
if two concurrent builds write to the same gob file on disk,
one of them could end up using a partially-written file.
Note that this is the only of the three cases not using temporary files.
As such, it is expected that the file may already exist.
In such a case, we simply avoid overwriting it rather than failing.
Third, when "garble build -a" was used,
and when we needed an export file not listed in importcfg,
we would end up calling roughly:
go list -export -toolexec=garble -a <dependency>
This meant we would re-build and re-obfuscate those packages.
Which is unfortunate, because the parent process already did via:
go build -toolexec=garble -a <main>
The repeated dependency builds tripped the new os.O_EXCL check,
as we would try to overwrite the same obfuscated Go files.
Beyond being wasteful, this could again cause subtle filesystem races.
To fix the problem, avoid passing flags like "-a" to nested go commands.
Overall, we should likely be using safer ways to write to disk,
be it via either atomic writes or locked files.
However, for now, catching duplicate writes is a big step.
I have left a self-assigned TODO for further improvements.
CI on the pull request found a failure on test-gotip.
The failure reproduces on master, so it seems to be related to gotip,
and not a regression introduced by this change.
For now, disable test-gotip until we can investigate.
4 years ago
|
|
|
); err != nil && !errors.Is(err, fs.ErrExist) {
|
|
|
|
return nil, err
|
|
|
|
}
|
|
|
|
|
|
|
|
// We can't obfuscate literals in the runtime and its dependencies,
|
|
|
|
// because obfuscated literals sometimes escape to heap,
|
|
|
|
// and that's not allowed in the runtime itself.
|
|
|
|
if runtimeAndDeps[curPkg.ImportPath] {
|
|
|
|
opts.ObfuscateLiterals = false
|
|
|
|
}
|
|
|
|
|
|
|
|
// Literal obfuscation uses math/rand, so seed it deterministically.
|
|
|
|
randSeed := opts.Seed
|
|
|
|
if len(randSeed) == 0 {
|
start using original action IDs (#251)
When we obfuscate a name, what we do is hash the name with the action ID
of the package that contains the name. To ensure that the hash changes
if the garble tool changes, we used the action ID of the obfuscated
build, which is different than the original action ID, as we include
garble's own content ID in "go tool compile -V=full" via -toolexec.
Let's call that the "obfuscated action ID". Remember that a content ID
is roughly the hash of a binary or object file, and an action ID
contains the hash of a package's source code plus the content IDs of its
dependencies.
This had the advantage that it did what we wanted. However, it had one
massive drawback: when we compile a package, we only have the obfuscated
action IDs of its dependencies. This is because one can't have the
content ID of dependent packages before they are built.
Usually, this is not a problem, because hashing a foreign name means it
comes from a dependency, where we already have the obfuscated action ID.
However, that's not always the case.
First, go:linkname directives can point to any symbol that ends up in
the binary, even if the package is not a dependency. So garble could
only support linkname targets belonging to dependencies. This is at the
root of why we could not obfuscate the runtime; it contains linkname
directives targeting the net package, for example, which depends on runtime.
Second, some other places did not have an easy access to obfuscated
action IDs, like transformAsm, which had to recover it from a temporary
file stored by transformCompile.
Plus, this was all pretty expensive, as each toolexec sub-process had to
make repeated calls to buildidOf with the object files of dependencies.
We even had to use extra calls to "go list" in the case of indirect
dependencies, as their export files do not appear in importcfg files.
All in all, the old method was complex and expensive. A better mechanism
is to use the original action IDs directly, as listed by "go list"
without garble in the picture.
This would mean that the hashing does not change if garble changes,
meaning weaker obfuscation. To regain that property, we define the
"garble action ID", which is just the original action ID hashed together
with garble's own content ID.
This is practically the same as the obfuscated build ID we used before,
but since it doesn't go through "go tool compile -V=full" and the
obfuscated build itself, we can work out *all* the garble action IDs
upfront, before the obfuscated build even starts.
This fixes all of our problems. Now we know all garble build IDs
upfront, so a bunch of hacks can be entirely removed. Plus, since we
know them upfront, we can also cache them and avoid repeated calls to
"go tool buildid".
While at it, make use of the new BuildID field in Go 1.16's "list -json
-export". This avoids the vast majority of "go tool buildid" calls, as
the only ones that remain are 2 on the garble binary itself.
The numbers for Go 1.16 look very good:
name old time/op new time/op delta
Build-8 146ms ± 4% 101ms ± 1% -31.01% (p=0.002 n=6+6)
name old bin-B new bin-B delta
Build-8 6.61M ± 0% 6.60M ± 0% -0.09% (p=0.002 n=6+6)
name old sys-time/op new sys-time/op delta
Build-8 321ms ± 7% 202ms ± 6% -37.11% (p=0.002 n=6+6)
name old user-time/op new user-time/op delta
Build-8 538ms ± 4% 414ms ± 4% -23.12% (p=0.002 n=6+6)
4 years ago
|
|
|
randSeed = curPkg.GarbleActionID
|
|
|
|
}
|
|
|
|
// log.Printf("seeding math/rand with %x\n", randSeed)
|
|
|
|
mathrand.Seed(int64(binary.BigEndian.Uint64(randSeed)))
|
|
|
|
|
|
|
|
tf.prefillIgnoreObjects(files)
|
|
|
|
// log.Println(flags)
|
|
|
|
|
|
|
|
// If this is a package to obfuscate, swap the -p flag with the new
|
|
|
|
// package path.
|
refactor "current package" with TOOLEXEC_IMPORTPATH (#266)
Now that we've dropped support for Go 1.15.x, we can finally rely on
this environment variable for toolexec calls, present in Go 1.16.
Before, we had hacky ways of trying to figure out the current package's
import path, mostly from the -p flag. The biggest rough edge there was
that, for main packages, that was simply the package name, and not its
full import path.
To work around that, we had a restriction on a single main package, so
we could work around that issue. That restriction is now gone.
The new code is simpler, especially because we can set curPkg in a
single place for all toolexec transform funcs.
Since we can always rely on curPkg not being nil now, we can also start
reusing listedPackage.Private and avoid the majority of repeated calls
to isPrivate. The function is cheap, but still not free.
isPrivate itself can also get simpler. We no longer have to worry about
the "main" edge case. Plus, the sanity check for invalid package paths
is now unnecessary; we only got malformed paths from goobj2, and we now
require exact matches with the ImportPath field from "go list -json".
Another effect of clearing up the "main" edge case is that -debugdir now
uses the right directory for main packages. We also start using
consistent debugdir paths in the tests, for the sake of being easier to
read and maintain.
Finally, note that commandReverse did not need the extra call to "go
list -toolexec", as the "shared" call stored in the cache is enough. We
still call toolexecCmd to get said cache, which should probably be
simplified in a future PR.
While at it, replace the use of the "-std" compiler flag with the
Standard field from "go list -json".
4 years ago
|
|
|
newPkgPath := ""
|
deprecate using GOPRIVATE in favor of GOGARBLE (#427)
Piggybacking off of GOPRIVATE is great for a number of reasons:
* People tend to obfuscate private code, whose package paths will
generally be in GOPRIVATE already
* Its meaning and syntax are well understood
* It allows all the flexibility we need without adding our own env var
or config option
However, using GOPRIVATE directly has one main drawback.
It's fairly common to also want to obfuscate public dependencies,
to make the code in private packages even harder to follow.
However, using "GOPRIVATE=*" will result in two main downsides:
* GONOPROXY defaults to GOPRIVATE, so the proxy would be entirely disabled.
Downloading modules, such as when adding or updating dependencies,
or when the local cache is cold, can be less reliable.
* GONOSUMDB defaults to GOPRIVATE, so the sumdb would be entirely disabled.
Adding entries to go.sum, such as when adding or updating dependencies,
can be less secure.
We will continue to consume GOPRIVATE as a fallback,
but we now expect users to set GOGARBLE instead.
The new logic is documented in the README.
While here, rewrite some uses of "private" with "to obfuscate",
to make the code easier to follow and harder to misunderstand.
Fixes #276.
3 years ago
|
|
|
if curPkg.Name != "main" && curPkg.ToObfuscate {
|
|
|
|
newPkgPath = curPkg.obfuscatedImportPath()
|
reimplement import path obfuscation without goobj2 (#242)
We used to rely on a parallel implementation of an object file parser
and writer to be able to obfuscate import paths. After compiling each
package, we would parse the object file, replace the import paths, and
write the updated object file in-place.
That worked well, in most cases. Unfortunately, it had some flaws:
* Complexity. Even when most of the code is maintained in a separate
module, the import_obfuscation.go file was still close to a thousand
lines of code.
* Go compatibility. The object file format changes between Go releases,
so we were supporting Go 1.15, but not 1.16. Fixing the object file
package to work with 1.16 would probably break 1.15 support.
* Bugs. For example, we recently had to add a workaround for #224, since
import paths containing dots after the domain would end up escaped.
Another example is #190, which seems to be caused by the object file
parser or writer corrupting the compiled code and causing segfaults in
some rare edge cases.
Instead, let's drop that method entirely, and force the compiler and
linker to do the work for us. The steps necessary when compiling a
package to obfuscate are:
1) Replace its "package foo" lines with the obfuscated package path. No
need to separate the package path and name, since the obfuscated path
does not contain slashes.
2) Replace the "-p pkg/foo" flag with the obfuscated path.
3) Replace the "import" spec lines with the obfuscated package paths,
for those dependencies which were obfuscated.
4) Replace the "-importcfg [...]" file with a version that uses the
obfuscated paths instead.
The linker also needs that last step, since it also uses an importcfg
file to find object files.
There are three noteworthy drawbacks to this new method:
1) Since we no longer write object files, we can't use them to store
data to be cached. As such, the -debugdir flag goes back to using the
"-a" build flag to always rebuild all packages. On the plus side,
that caching didn't work very well; see #176.
2) The package name "main" remains in all declarations under it, not
just "func main", since we can only rename entire packages. This
seems fine, as it gives little information to the end user.
3) The -tiny mode no longer sets all lines to 0, since it did that by
modifying object files. As a temporary measure, we instead set all
top-level declarations to be on line 1. A TODO is added to hopefully
improve this again in the near future.
The upside is that we get rid of all the issues mentioned before. Plus,
garble now nearly works with Go 1.16, with the exception of two very
minor bugs that look fixable. A follow-up PR will take care of that and
start testing on 1.16.
Fixes #176.
Fixes #190.
4 years ago
|
|
|
flags = flagSetValue(flags, "-p", newPkgPath)
|
|
|
|
}
|
|
|
|
|
always use the compiler's -dwarf=false flag (#96)
First, our original append line was completely ineffective; we never
used that "flags" slice again. Second, we only attempted to use the flag
when we obfuscated a package.
In fact, we never care about debugging information here, so for any
package we compile, we can add "-dwarf=false". At the moment, we compile
all packages, even if they aren't to be obfuscated, due to the lack of
access to the build cache.
As such, we save a significant amount of work. The numbers below were
obtained on a quiet machine with "go test -bench=. -benchtime=10x", six
times before and after the change.
name old time/op new time/op delta
Build-8 2.06s ± 4% 1.87s ± 2% -9.21% (p=0.002 n=6+6)
name old sys-time/op new sys-time/op delta
Build-8 1.51s ± 2% 1.46s ± 1% -3.12% (p=0.004 n=6+5)
name old user-time/op new user-time/op delta
Build-8 11.9s ± 2% 10.8s ± 1% -8.71% (p=0.002 n=6+6)
While at it, only do CI builds on pushes and PRs to the master branch,
so that my PRs created from the same repo don't trigger duplicate
builds.
5 years ago
|
|
|
newPaths := make([]string, 0, len(files))
|
|
|
|
|
|
|
|
for i, file := range files {
|
avoid reproducibility issues with full rebuilds
We were using temporary filenames for modified Go and assembly files.
For example, an obfuscated "encoding/json/encode.go" would end up as:
/tmp/garble-shared123/encode.go.456.go
where "123" and "456" are random numbers, usually longer.
This was usually fine for two reasons:
1) We would add "/tmp/garble-shared123/" to -trimpath, so the temporary
directory and its random number would be invisible.
2) We would add "//line" directives to the source files, replacing
the filename with obfuscated versions excluding any random number.
Unfortunately, this broke in multiple ways. Most notably, assembly files
do not have any line directives, and it's not clear that there's any
support for them. So the random number in their basename could end up in
the binary, breaking reproducibility.
Another issue is that the -trimpath addition described above was only
done for cmd/compile, not cmd/asm, so assembly filenames included the
randomized temporary directory.
To fix the issues above, the same "encoding/json/encode.go" would now
end up as:
/tmp/garble-shared123/encoding/json/encode.go
Such a path is still unique even though the "456" random number is gone,
as import paths are unique within a single build.
This fixes issues with the base name of each file, so we no longer rely
on line directives as the only way to remove the second original random
number.
We still rely on -trimpath to get rid of the temporary directory in
filenames. To fix its problem with assembly files, also amend the
-trimpath flag when running the assembler tool.
Finally, add a test that reproducible builds still work when a full
rebuild is done. We choose goprivate.txt for such a test as its
stdimporter package imports a number of std packages, including uses of
assembly and cgo.
For the time being, we don't use such a "full rebuild" reproducibility
test in other test scripts, as this step is expensive, rebuilding many
packages from scratch.
This issue went unnoticed for over a year because such random numbers
"123" and "456" were created when a package was obfuscated, and that
only happened once per package version as long as the build cache was
kept intact.
When clearing the build cache, or forcing a rebuild with -a, one gets
new random numbers, and thus a different binary resulting from the same
build input. That's not something that most users would do regularly,
and our tests did not cover that edge case either, until now.
Fixes #328.
4 years ago
|
|
|
name := filepath.Base(paths[i])
|
|
|
|
if curPkg.ImportPath == "runtime" && opts.Tiny {
|
|
|
|
// strip unneeded runtime code
|
avoid reproducibility issues with full rebuilds
We were using temporary filenames for modified Go and assembly files.
For example, an obfuscated "encoding/json/encode.go" would end up as:
/tmp/garble-shared123/encode.go.456.go
where "123" and "456" are random numbers, usually longer.
This was usually fine for two reasons:
1) We would add "/tmp/garble-shared123/" to -trimpath, so the temporary
directory and its random number would be invisible.
2) We would add "//line" directives to the source files, replacing
the filename with obfuscated versions excluding any random number.
Unfortunately, this broke in multiple ways. Most notably, assembly files
do not have any line directives, and it's not clear that there's any
support for them. So the random number in their basename could end up in
the binary, breaking reproducibility.
Another issue is that the -trimpath addition described above was only
done for cmd/compile, not cmd/asm, so assembly filenames included the
randomized temporary directory.
To fix the issues above, the same "encoding/json/encode.go" would now
end up as:
/tmp/garble-shared123/encoding/json/encode.go
Such a path is still unique even though the "456" random number is gone,
as import paths are unique within a single build.
This fixes issues with the base name of each file, so we no longer rely
on line directives as the only way to remove the second original random
number.
We still rely on -trimpath to get rid of the temporary directory in
filenames. To fix its problem with assembly files, also amend the
-trimpath flag when running the assembler tool.
Finally, add a test that reproducible builds still work when a full
rebuild is done. We choose goprivate.txt for such a test as its
stdimporter package imports a number of std packages, including uses of
assembly and cgo.
For the time being, we don't use such a "full rebuild" reproducibility
test in other test scripts, as this step is expensive, rebuilding many
packages from scratch.
This issue went unnoticed for over a year because such random numbers
"123" and "456" were created when a package was obfuscated, and that
only happened once per package version as long as the build cache was
kept intact.
When clearing the build cache, or forcing a rebuild with -a, one gets
new random numbers, and thus a different binary resulting from the same
build input. That's not something that most users would do regularly,
and our tests did not cover that edge case either, until now.
Fixes #328.
4 years ago
|
|
|
stripRuntime(name, file)
|
small improvements towards obfuscating the runtime
I spent a couple of days trying to obfuscate all of std.
Ultimately I failed at making it fully work,
especially when it comes to the runtime package,
but I did fix a few problems along the way, as seen here.
First, fix the TODO to allow handleDirectives and transformGo to run on
runtime packages as well, if they are considered private. Note that this
is never true right now, but it matters once we remove runtimeRelated.
Second, modify parsedebugvars in a way that doesn't break typechecking.
We can remove AST nodes or even modify them in simple ways,
but if we add new AST nodes after typechecking,
those will lack type information.
We were replacing the entire body, running into that problem.
Instead, carefully cut the body to set some defaults,
but remove everything from the point GODEBUG is read.
Finally, add commented-out debug prints of transformed asm files.
For #193.
4 years ago
|
|
|
}
|
|
|
|
if strings.HasPrefix(name, "_cgo_") {
|
avoid reproducibility issues with full rebuilds
We were using temporary filenames for modified Go and assembly files.
For example, an obfuscated "encoding/json/encode.go" would end up as:
/tmp/garble-shared123/encode.go.456.go
where "123" and "456" are random numbers, usually longer.
This was usually fine for two reasons:
1) We would add "/tmp/garble-shared123/" to -trimpath, so the temporary
directory and its random number would be invisible.
2) We would add "//line" directives to the source files, replacing
the filename with obfuscated versions excluding any random number.
Unfortunately, this broke in multiple ways. Most notably, assembly files
do not have any line directives, and it's not clear that there's any
support for them. So the random number in their basename could end up in
the binary, breaking reproducibility.
Another issue is that the -trimpath addition described above was only
done for cmd/compile, not cmd/asm, so assembly filenames included the
randomized temporary directory.
To fix the issues above, the same "encoding/json/encode.go" would now
end up as:
/tmp/garble-shared123/encoding/json/encode.go
Such a path is still unique even though the "456" random number is gone,
as import paths are unique within a single build.
This fixes issues with the base name of each file, so we no longer rely
on line directives as the only way to remove the second original random
number.
We still rely on -trimpath to get rid of the temporary directory in
filenames. To fix its problem with assembly files, also amend the
-trimpath flag when running the assembler tool.
Finally, add a test that reproducible builds still work when a full
rebuild is done. We choose goprivate.txt for such a test as its
stdimporter package imports a number of std packages, including uses of
assembly and cgo.
For the time being, we don't use such a "full rebuild" reproducibility
test in other test scripts, as this step is expensive, rebuilding many
packages from scratch.
This issue went unnoticed for over a year because such random numbers
"123" and "456" were created when a package was obfuscated, and that
only happened once per package version as long as the build cache was
kept intact.
When clearing the build cache, or forcing a rebuild with -a, one gets
new random numbers, and thus a different binary resulting from the same
build input. That's not something that most users would do regularly,
and our tests did not cover that edge case either, until now.
Fixes #328.
4 years ago
|
|
|
// Don't obfuscate cgo code, since it's generated and it gets messy.
|
small improvements towards obfuscating the runtime
I spent a couple of days trying to obfuscate all of std.
Ultimately I failed at making it fully work,
especially when it comes to the runtime package,
but I did fix a few problems along the way, as seen here.
First, fix the TODO to allow handleDirectives and transformGo to run on
runtime packages as well, if they are considered private. Note that this
is never true right now, but it matters once we remove runtimeRelated.
Second, modify parsedebugvars in a way that doesn't break typechecking.
We can remove AST nodes or even modify them in simple ways,
but if we add new AST nodes after typechecking,
those will lack type information.
We were replacing the entire body, running into that problem.
Instead, carefully cut the body to set some defaults,
but remove everything from the point GODEBUG is read.
Finally, add commented-out debug prints of transformed asm files.
For #193.
4 years ago
|
|
|
} else {
|
|
|
|
tf.handleDirectives(file.Comments)
|
|
|
|
file = tf.transformGo(file)
|
|
|
|
}
|
refactor "current package" with TOOLEXEC_IMPORTPATH (#266)
Now that we've dropped support for Go 1.15.x, we can finally rely on
this environment variable for toolexec calls, present in Go 1.16.
Before, we had hacky ways of trying to figure out the current package's
import path, mostly from the -p flag. The biggest rough edge there was
that, for main packages, that was simply the package name, and not its
full import path.
To work around that, we had a restriction on a single main package, so
we could work around that issue. That restriction is now gone.
The new code is simpler, especially because we can set curPkg in a
single place for all toolexec transform funcs.
Since we can always rely on curPkg not being nil now, we can also start
reusing listedPackage.Private and avoid the majority of repeated calls
to isPrivate. The function is cheap, but still not free.
isPrivate itself can also get simpler. We no longer have to worry about
the "main" edge case. Plus, the sanity check for invalid package paths
is now unnecessary; we only got malformed paths from goobj2, and we now
require exact matches with the ImportPath field from "go list -json".
Another effect of clearing up the "main" edge case is that -debugdir now
uses the right directory for main packages. We also start using
consistent debugdir paths in the tests, for the sake of being easier to
read and maintain.
Finally, note that commandReverse did not need the extra call to "go
list -toolexec", as the "shared" call stored in the cache is enough. We
still call toolexecCmd to get said cache, which should probably be
simplified in a future PR.
While at it, replace the use of the "-std" compiler flag with the
Standard field from "go list -json".
4 years ago
|
|
|
if newPkgPath != "" {
|
reimplement import path obfuscation without goobj2 (#242)
We used to rely on a parallel implementation of an object file parser
and writer to be able to obfuscate import paths. After compiling each
package, we would parse the object file, replace the import paths, and
write the updated object file in-place.
That worked well, in most cases. Unfortunately, it had some flaws:
* Complexity. Even when most of the code is maintained in a separate
module, the import_obfuscation.go file was still close to a thousand
lines of code.
* Go compatibility. The object file format changes between Go releases,
so we were supporting Go 1.15, but not 1.16. Fixing the object file
package to work with 1.16 would probably break 1.15 support.
* Bugs. For example, we recently had to add a workaround for #224, since
import paths containing dots after the domain would end up escaped.
Another example is #190, which seems to be caused by the object file
parser or writer corrupting the compiled code and causing segfaults in
some rare edge cases.
Instead, let's drop that method entirely, and force the compiler and
linker to do the work for us. The steps necessary when compiling a
package to obfuscate are:
1) Replace its "package foo" lines with the obfuscated package path. No
need to separate the package path and name, since the obfuscated path
does not contain slashes.
2) Replace the "-p pkg/foo" flag with the obfuscated path.
3) Replace the "import" spec lines with the obfuscated package paths,
for those dependencies which were obfuscated.
4) Replace the "-importcfg [...]" file with a version that uses the
obfuscated paths instead.
The linker also needs that last step, since it also uses an importcfg
file to find object files.
There are three noteworthy drawbacks to this new method:
1) Since we no longer write object files, we can't use them to store
data to be cached. As such, the -debugdir flag goes back to using the
"-a" build flag to always rebuild all packages. On the plus side,
that caching didn't work very well; see #176.
2) The package name "main" remains in all declarations under it, not
just "func main", since we can only rename entire packages. This
seems fine, as it gives little information to the end user.
3) The -tiny mode no longer sets all lines to 0, since it did that by
modifying object files. As a temporary measure, we instead set all
top-level declarations to be on line 1. A TODO is added to hopefully
improve this again in the near future.
The upside is that we get rid of all the issues mentioned before. Plus,
garble now nearly works with Go 1.16, with the exception of two very
minor bugs that look fixable. A follow-up PR will take care of that and
start testing on 1.16.
Fixes #176.
Fixes #190.
4 years ago
|
|
|
file.Name.Name = newPkgPath
|
|
|
|
}
|
|
|
|
|
rework the position obfuscator (#282)
First, rename line_obfuscator.go to position.go. We obfuscate filenames,
not just line numbers, and "obfuscator" is a bit redundant.
Second, use "/*line :x*/" comments rather than the "//line :x" form, as
the former allows us to insert them in any position without adding
unnecessary newlines. This will be important for changing the position
of call sites, which will be important for "garble reverse".
Third, do not rely on go/ast to remove and add comments. Since they are
free-floating, we can very easily end up with misplaced comments,
especially as the literal obfuscator heavily modifies the AST.
The new method prints and re-parses the file, to ensure all node
positions are consistent with a buffer, buf1. Then, we copy the contents
into a new buffer, buf2, while inserting the comments that we need.
The new method also modifies line numbers at the very end of obfuscating
a Go file, instead of at the very beginning. That's going to be more
robust long-term, as we will also obfuscate line numbers for any
additions or modifications to the AST.
Fourth, detachedDirectives is unnecessary, as we can accomplish the same
with two simple prefix matches.
Finally, this means we can stop using detachedComments entirely, as
printFile already inserts the comments we need.
For #5.
4 years ago
|
|
|
src, err := printFile(file)
|
|
|
|
if err != nil {
|
|
|
|
return nil, err
|
|
|
|
}
|
|
|
|
|
|
|
|
// Uncomment for some quick debugging. Do not delete.
|
deprecate using GOPRIVATE in favor of GOGARBLE (#427)
Piggybacking off of GOPRIVATE is great for a number of reasons:
* People tend to obfuscate private code, whose package paths will
generally be in GOPRIVATE already
* Its meaning and syntax are well understood
* It allows all the flexibility we need without adding our own env var
or config option
However, using GOPRIVATE directly has one main drawback.
It's fairly common to also want to obfuscate public dependencies,
to make the code in private packages even harder to follow.
However, using "GOPRIVATE=*" will result in two main downsides:
* GONOPROXY defaults to GOPRIVATE, so the proxy would be entirely disabled.
Downloading modules, such as when adding or updating dependencies,
or when the local cache is cold, can be less reliable.
* GONOSUMDB defaults to GOPRIVATE, so the sumdb would be entirely disabled.
Adding entries to go.sum, such as when adding or updating dependencies,
can be less secure.
We will continue to consume GOPRIVATE as a fallback,
but we now expect users to set GOGARBLE instead.
The new logic is documented in the README.
While here, rewrite some uses of "private" with "to obfuscate",
to make the code easier to follow and harder to misunderstand.
Fixes #276.
3 years ago
|
|
|
// if curPkg.ToObfuscate {
|
avoid reproducibility issues with full rebuilds
We were using temporary filenames for modified Go and assembly files.
For example, an obfuscated "encoding/json/encode.go" would end up as:
/tmp/garble-shared123/encode.go.456.go
where "123" and "456" are random numbers, usually longer.
This was usually fine for two reasons:
1) We would add "/tmp/garble-shared123/" to -trimpath, so the temporary
directory and its random number would be invisible.
2) We would add "//line" directives to the source files, replacing
the filename with obfuscated versions excluding any random number.
Unfortunately, this broke in multiple ways. Most notably, assembly files
do not have any line directives, and it's not clear that there's any
support for them. So the random number in their basename could end up in
the binary, breaking reproducibility.
Another issue is that the -trimpath addition described above was only
done for cmd/compile, not cmd/asm, so assembly filenames included the
randomized temporary directory.
To fix the issues above, the same "encoding/json/encode.go" would now
end up as:
/tmp/garble-shared123/encoding/json/encode.go
Such a path is still unique even though the "456" random number is gone,
as import paths are unique within a single build.
This fixes issues with the base name of each file, so we no longer rely
on line directives as the only way to remove the second original random
number.
We still rely on -trimpath to get rid of the temporary directory in
filenames. To fix its problem with assembly files, also amend the
-trimpath flag when running the assembler tool.
Finally, add a test that reproducible builds still work when a full
rebuild is done. We choose goprivate.txt for such a test as its
stdimporter package imports a number of std packages, including uses of
assembly and cgo.
For the time being, we don't use such a "full rebuild" reproducibility
test in other test scripts, as this step is expensive, rebuilding many
packages from scratch.
This issue went unnoticed for over a year because such random numbers
"123" and "456" were created when a package was obfuscated, and that
only happened once per package version as long as the build cache was
kept intact.
When clearing the build cache, or forcing a rebuild with -a, one gets
new random numbers, and thus a different binary resulting from the same
build input. That's not something that most users would do regularly,
and our tests did not cover that edge case either, until now.
Fixes #328.
4 years ago
|
|
|
// fmt.Fprintf(os.Stderr, "\n-- %s/%s --\n%s", curPkg.ImportPath, name, src)
|
|
|
|
// }
|
|
|
|
|
avoid reproducibility issues with full rebuilds
We were using temporary filenames for modified Go and assembly files.
For example, an obfuscated "encoding/json/encode.go" would end up as:
/tmp/garble-shared123/encode.go.456.go
where "123" and "456" are random numbers, usually longer.
This was usually fine for two reasons:
1) We would add "/tmp/garble-shared123/" to -trimpath, so the temporary
directory and its random number would be invisible.
2) We would add "//line" directives to the source files, replacing
the filename with obfuscated versions excluding any random number.
Unfortunately, this broke in multiple ways. Most notably, assembly files
do not have any line directives, and it's not clear that there's any
support for them. So the random number in their basename could end up in
the binary, breaking reproducibility.
Another issue is that the -trimpath addition described above was only
done for cmd/compile, not cmd/asm, so assembly filenames included the
randomized temporary directory.
To fix the issues above, the same "encoding/json/encode.go" would now
end up as:
/tmp/garble-shared123/encoding/json/encode.go
Such a path is still unique even though the "456" random number is gone,
as import paths are unique within a single build.
This fixes issues with the base name of each file, so we no longer rely
on line directives as the only way to remove the second original random
number.
We still rely on -trimpath to get rid of the temporary directory in
filenames. To fix its problem with assembly files, also amend the
-trimpath flag when running the assembler tool.
Finally, add a test that reproducible builds still work when a full
rebuild is done. We choose goprivate.txt for such a test as its
stdimporter package imports a number of std packages, including uses of
assembly and cgo.
For the time being, we don't use such a "full rebuild" reproducibility
test in other test scripts, as this step is expensive, rebuilding many
packages from scratch.
This issue went unnoticed for over a year because such random numbers
"123" and "456" were created when a package was obfuscated, and that
only happened once per package version as long as the build cache was
kept intact.
When clearing the build cache, or forcing a rebuild with -a, one gets
new random numbers, and thus a different binary resulting from the same
build input. That's not something that most users would do regularly,
and our tests did not cover that edge case either, until now.
Fixes #328.
4 years ago
|
|
|
if path, err := writeTemp(name, src); err != nil {
|
reimplement import path obfuscation without goobj2 (#242)
We used to rely on a parallel implementation of an object file parser
and writer to be able to obfuscate import paths. After compiling each
package, we would parse the object file, replace the import paths, and
write the updated object file in-place.
That worked well, in most cases. Unfortunately, it had some flaws:
* Complexity. Even when most of the code is maintained in a separate
module, the import_obfuscation.go file was still close to a thousand
lines of code.
* Go compatibility. The object file format changes between Go releases,
so we were supporting Go 1.15, but not 1.16. Fixing the object file
package to work with 1.16 would probably break 1.15 support.
* Bugs. For example, we recently had to add a workaround for #224, since
import paths containing dots after the domain would end up escaped.
Another example is #190, which seems to be caused by the object file
parser or writer corrupting the compiled code and causing segfaults in
some rare edge cases.
Instead, let's drop that method entirely, and force the compiler and
linker to do the work for us. The steps necessary when compiling a
package to obfuscate are:
1) Replace its "package foo" lines with the obfuscated package path. No
need to separate the package path and name, since the obfuscated path
does not contain slashes.
2) Replace the "-p pkg/foo" flag with the obfuscated path.
3) Replace the "import" spec lines with the obfuscated package paths,
for those dependencies which were obfuscated.
4) Replace the "-importcfg [...]" file with a version that uses the
obfuscated paths instead.
The linker also needs that last step, since it also uses an importcfg
file to find object files.
There are three noteworthy drawbacks to this new method:
1) Since we no longer write object files, we can't use them to store
data to be cached. As such, the -debugdir flag goes back to using the
"-a" build flag to always rebuild all packages. On the plus side,
that caching didn't work very well; see #176.
2) The package name "main" remains in all declarations under it, not
just "func main", since we can only rename entire packages. This
seems fine, as it gives little information to the end user.
3) The -tiny mode no longer sets all lines to 0, since it did that by
modifying object files. As a temporary measure, we instead set all
top-level declarations to be on line 1. A TODO is added to hopefully
improve this again in the near future.
The upside is that we get rid of all the issues mentioned before. Plus,
garble now nearly works with Go 1.16, with the exception of two very
minor bugs that look fixable. A follow-up PR will take care of that and
start testing on 1.16.
Fixes #176.
Fixes #190.
4 years ago
|
|
|
return nil, err
|
|
|
|
} else {
|
|
|
|
newPaths = append(newPaths, path)
|
|
|
|
}
|
reimplement import path obfuscation without goobj2 (#242)
We used to rely on a parallel implementation of an object file parser
and writer to be able to obfuscate import paths. After compiling each
package, we would parse the object file, replace the import paths, and
write the updated object file in-place.
That worked well, in most cases. Unfortunately, it had some flaws:
* Complexity. Even when most of the code is maintained in a separate
module, the import_obfuscation.go file was still close to a thousand
lines of code.
* Go compatibility. The object file format changes between Go releases,
so we were supporting Go 1.15, but not 1.16. Fixing the object file
package to work with 1.16 would probably break 1.15 support.
* Bugs. For example, we recently had to add a workaround for #224, since
import paths containing dots after the domain would end up escaped.
Another example is #190, which seems to be caused by the object file
parser or writer corrupting the compiled code and causing segfaults in
some rare edge cases.
Instead, let's drop that method entirely, and force the compiler and
linker to do the work for us. The steps necessary when compiling a
package to obfuscate are:
1) Replace its "package foo" lines with the obfuscated package path. No
need to separate the package path and name, since the obfuscated path
does not contain slashes.
2) Replace the "-p pkg/foo" flag with the obfuscated path.
3) Replace the "import" spec lines with the obfuscated package paths,
for those dependencies which were obfuscated.
4) Replace the "-importcfg [...]" file with a version that uses the
obfuscated paths instead.
The linker also needs that last step, since it also uses an importcfg
file to find object files.
There are three noteworthy drawbacks to this new method:
1) Since we no longer write object files, we can't use them to store
data to be cached. As such, the -debugdir flag goes back to using the
"-a" build flag to always rebuild all packages. On the plus side,
that caching didn't work very well; see #176.
2) The package name "main" remains in all declarations under it, not
just "func main", since we can only rename entire packages. This
seems fine, as it gives little information to the end user.
3) The -tiny mode no longer sets all lines to 0, since it did that by
modifying object files. As a temporary measure, we instead set all
top-level declarations to be on line 1. A TODO is added to hopefully
improve this again in the near future.
The upside is that we get rid of all the issues mentioned before. Plus,
garble now nearly works with Go 1.16, with the exception of two very
minor bugs that look fixable. A follow-up PR will take care of that and
start testing on 1.16.
Fixes #176.
Fixes #190.
4 years ago
|
|
|
if opts.DebugDir != "" {
|
refactor "current package" with TOOLEXEC_IMPORTPATH (#266)
Now that we've dropped support for Go 1.15.x, we can finally rely on
this environment variable for toolexec calls, present in Go 1.16.
Before, we had hacky ways of trying to figure out the current package's
import path, mostly from the -p flag. The biggest rough edge there was
that, for main packages, that was simply the package name, and not its
full import path.
To work around that, we had a restriction on a single main package, so
we could work around that issue. That restriction is now gone.
The new code is simpler, especially because we can set curPkg in a
single place for all toolexec transform funcs.
Since we can always rely on curPkg not being nil now, we can also start
reusing listedPackage.Private and avoid the majority of repeated calls
to isPrivate. The function is cheap, but still not free.
isPrivate itself can also get simpler. We no longer have to worry about
the "main" edge case. Plus, the sanity check for invalid package paths
is now unnecessary; we only got malformed paths from goobj2, and we now
require exact matches with the ImportPath field from "go list -json".
Another effect of clearing up the "main" edge case is that -debugdir now
uses the right directory for main packages. We also start using
consistent debugdir paths in the tests, for the sake of being easier to
read and maintain.
Finally, note that commandReverse did not need the extra call to "go
list -toolexec", as the "shared" call stored in the cache is enough. We
still call toolexecCmd to get said cache, which should probably be
simplified in a future PR.
While at it, replace the use of the "-std" compiler flag with the
Standard field from "go list -json".
4 years ago
|
|
|
osPkgPath := filepath.FromSlash(curPkg.ImportPath)
|
reimplement import path obfuscation without goobj2 (#242)
We used to rely on a parallel implementation of an object file parser
and writer to be able to obfuscate import paths. After compiling each
package, we would parse the object file, replace the import paths, and
write the updated object file in-place.
That worked well, in most cases. Unfortunately, it had some flaws:
* Complexity. Even when most of the code is maintained in a separate
module, the import_obfuscation.go file was still close to a thousand
lines of code.
* Go compatibility. The object file format changes between Go releases,
so we were supporting Go 1.15, but not 1.16. Fixing the object file
package to work with 1.16 would probably break 1.15 support.
* Bugs. For example, we recently had to add a workaround for #224, since
import paths containing dots after the domain would end up escaped.
Another example is #190, which seems to be caused by the object file
parser or writer corrupting the compiled code and causing segfaults in
some rare edge cases.
Instead, let's drop that method entirely, and force the compiler and
linker to do the work for us. The steps necessary when compiling a
package to obfuscate are:
1) Replace its "package foo" lines with the obfuscated package path. No
need to separate the package path and name, since the obfuscated path
does not contain slashes.
2) Replace the "-p pkg/foo" flag with the obfuscated path.
3) Replace the "import" spec lines with the obfuscated package paths,
for those dependencies which were obfuscated.
4) Replace the "-importcfg [...]" file with a version that uses the
obfuscated paths instead.
The linker also needs that last step, since it also uses an importcfg
file to find object files.
There are three noteworthy drawbacks to this new method:
1) Since we no longer write object files, we can't use them to store
data to be cached. As such, the -debugdir flag goes back to using the
"-a" build flag to always rebuild all packages. On the plus side,
that caching didn't work very well; see #176.
2) The package name "main" remains in all declarations under it, not
just "func main", since we can only rename entire packages. This
seems fine, as it gives little information to the end user.
3) The -tiny mode no longer sets all lines to 0, since it did that by
modifying object files. As a temporary measure, we instead set all
top-level declarations to be on line 1. A TODO is added to hopefully
improve this again in the near future.
The upside is that we get rid of all the issues mentioned before. Plus,
garble now nearly works with Go 1.16, with the exception of two very
minor bugs that look fixable. A follow-up PR will take care of that and
start testing on 1.16.
Fixes #176.
Fixes #190.
4 years ago
|
|
|
pkgDebugDir := filepath.Join(opts.DebugDir, osPkgPath)
|
|
|
|
if err := os.MkdirAll(pkgDebugDir, 0o755); err != nil {
|
|
|
|
return nil, err
|
|
|
|
}
|
|
|
|
|
avoid reproducibility issues with full rebuilds
We were using temporary filenames for modified Go and assembly files.
For example, an obfuscated "encoding/json/encode.go" would end up as:
/tmp/garble-shared123/encode.go.456.go
where "123" and "456" are random numbers, usually longer.
This was usually fine for two reasons:
1) We would add "/tmp/garble-shared123/" to -trimpath, so the temporary
directory and its random number would be invisible.
2) We would add "//line" directives to the source files, replacing
the filename with obfuscated versions excluding any random number.
Unfortunately, this broke in multiple ways. Most notably, assembly files
do not have any line directives, and it's not clear that there's any
support for them. So the random number in their basename could end up in
the binary, breaking reproducibility.
Another issue is that the -trimpath addition described above was only
done for cmd/compile, not cmd/asm, so assembly filenames included the
randomized temporary directory.
To fix the issues above, the same "encoding/json/encode.go" would now
end up as:
/tmp/garble-shared123/encoding/json/encode.go
Such a path is still unique even though the "456" random number is gone,
as import paths are unique within a single build.
This fixes issues with the base name of each file, so we no longer rely
on line directives as the only way to remove the second original random
number.
We still rely on -trimpath to get rid of the temporary directory in
filenames. To fix its problem with assembly files, also amend the
-trimpath flag when running the assembler tool.
Finally, add a test that reproducible builds still work when a full
rebuild is done. We choose goprivate.txt for such a test as its
stdimporter package imports a number of std packages, including uses of
assembly and cgo.
For the time being, we don't use such a "full rebuild" reproducibility
test in other test scripts, as this step is expensive, rebuilding many
packages from scratch.
This issue went unnoticed for over a year because such random numbers
"123" and "456" were created when a package was obfuscated, and that
only happened once per package version as long as the build cache was
kept intact.
When clearing the build cache, or forcing a rebuild with -a, one gets
new random numbers, and thus a different binary resulting from the same
build input. That's not something that most users would do regularly,
and our tests did not cover that edge case either, until now.
Fixes #328.
4 years ago
|
|
|
debugFilePath := filepath.Join(pkgDebugDir, name)
|
|
|
|
if err := os.WriteFile(debugFilePath, src, 0o666); err != nil {
|
reimplement import path obfuscation without goobj2 (#242)
We used to rely on a parallel implementation of an object file parser
and writer to be able to obfuscate import paths. After compiling each
package, we would parse the object file, replace the import paths, and
write the updated object file in-place.
That worked well, in most cases. Unfortunately, it had some flaws:
* Complexity. Even when most of the code is maintained in a separate
module, the import_obfuscation.go file was still close to a thousand
lines of code.
* Go compatibility. The object file format changes between Go releases,
so we were supporting Go 1.15, but not 1.16. Fixing the object file
package to work with 1.16 would probably break 1.15 support.
* Bugs. For example, we recently had to add a workaround for #224, since
import paths containing dots after the domain would end up escaped.
Another example is #190, which seems to be caused by the object file
parser or writer corrupting the compiled code and causing segfaults in
some rare edge cases.
Instead, let's drop that method entirely, and force the compiler and
linker to do the work for us. The steps necessary when compiling a
package to obfuscate are:
1) Replace its "package foo" lines with the obfuscated package path. No
need to separate the package path and name, since the obfuscated path
does not contain slashes.
2) Replace the "-p pkg/foo" flag with the obfuscated path.
3) Replace the "import" spec lines with the obfuscated package paths,
for those dependencies which were obfuscated.
4) Replace the "-importcfg [...]" file with a version that uses the
obfuscated paths instead.
The linker also needs that last step, since it also uses an importcfg
file to find object files.
There are three noteworthy drawbacks to this new method:
1) Since we no longer write object files, we can't use them to store
data to be cached. As such, the -debugdir flag goes back to using the
"-a" build flag to always rebuild all packages. On the plus side,
that caching didn't work very well; see #176.
2) The package name "main" remains in all declarations under it, not
just "func main", since we can only rename entire packages. This
seems fine, as it gives little information to the end user.
3) The -tiny mode no longer sets all lines to 0, since it did that by
modifying object files. As a temporary measure, we instead set all
top-level declarations to be on line 1. A TODO is added to hopefully
improve this again in the near future.
The upside is that we get rid of all the issues mentioned before. Plus,
garble now nearly works with Go 1.16, with the exception of two very
minor bugs that look fixable. A follow-up PR will take care of that and
start testing on 1.16.
Fixes #176.
Fixes #190.
4 years ago
|
|
|
return nil, err
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
reimplement import path obfuscation without goobj2 (#242)
We used to rely on a parallel implementation of an object file parser
and writer to be able to obfuscate import paths. After compiling each
package, we would parse the object file, replace the import paths, and
write the updated object file in-place.
That worked well, in most cases. Unfortunately, it had some flaws:
* Complexity. Even when most of the code is maintained in a separate
module, the import_obfuscation.go file was still close to a thousand
lines of code.
* Go compatibility. The object file format changes between Go releases,
so we were supporting Go 1.15, but not 1.16. Fixing the object file
package to work with 1.16 would probably break 1.15 support.
* Bugs. For example, we recently had to add a workaround for #224, since
import paths containing dots after the domain would end up escaped.
Another example is #190, which seems to be caused by the object file
parser or writer corrupting the compiled code and causing segfaults in
some rare edge cases.
Instead, let's drop that method entirely, and force the compiler and
linker to do the work for us. The steps necessary when compiling a
package to obfuscate are:
1) Replace its "package foo" lines with the obfuscated package path. No
need to separate the package path and name, since the obfuscated path
does not contain slashes.
2) Replace the "-p pkg/foo" flag with the obfuscated path.
3) Replace the "import" spec lines with the obfuscated package paths,
for those dependencies which were obfuscated.
4) Replace the "-importcfg [...]" file with a version that uses the
obfuscated paths instead.
The linker also needs that last step, since it also uses an importcfg
file to find object files.
There are three noteworthy drawbacks to this new method:
1) Since we no longer write object files, we can't use them to store
data to be cached. As such, the -debugdir flag goes back to using the
"-a" build flag to always rebuild all packages. On the plus side,
that caching didn't work very well; see #176.
2) The package name "main" remains in all declarations under it, not
just "func main", since we can only rename entire packages. This
seems fine, as it gives little information to the end user.
3) The -tiny mode no longer sets all lines to 0, since it did that by
modifying object files. As a temporary measure, we instead set all
top-level declarations to be on line 1. A TODO is added to hopefully
improve this again in the near future.
The upside is that we get rid of all the issues mentioned before. Plus,
garble now nearly works with Go 1.16, with the exception of two very
minor bugs that look fixable. A follow-up PR will take care of that and
start testing on 1.16.
Fixes #176.
Fixes #190.
4 years ago
|
|
|
flags = flagSetValue(flags, "-importcfg", newImportCfg)
|
|
|
|
|
reimplement import path obfuscation without goobj2 (#242)
We used to rely on a parallel implementation of an object file parser
and writer to be able to obfuscate import paths. After compiling each
package, we would parse the object file, replace the import paths, and
write the updated object file in-place.
That worked well, in most cases. Unfortunately, it had some flaws:
* Complexity. Even when most of the code is maintained in a separate
module, the import_obfuscation.go file was still close to a thousand
lines of code.
* Go compatibility. The object file format changes between Go releases,
so we were supporting Go 1.15, but not 1.16. Fixing the object file
package to work with 1.16 would probably break 1.15 support.
* Bugs. For example, we recently had to add a workaround for #224, since
import paths containing dots after the domain would end up escaped.
Another example is #190, which seems to be caused by the object file
parser or writer corrupting the compiled code and causing segfaults in
some rare edge cases.
Instead, let's drop that method entirely, and force the compiler and
linker to do the work for us. The steps necessary when compiling a
package to obfuscate are:
1) Replace its "package foo" lines with the obfuscated package path. No
need to separate the package path and name, since the obfuscated path
does not contain slashes.
2) Replace the "-p pkg/foo" flag with the obfuscated path.
3) Replace the "import" spec lines with the obfuscated package paths,
for those dependencies which were obfuscated.
4) Replace the "-importcfg [...]" file with a version that uses the
obfuscated paths instead.
The linker also needs that last step, since it also uses an importcfg
file to find object files.
There are three noteworthy drawbacks to this new method:
1) Since we no longer write object files, we can't use them to store
data to be cached. As such, the -debugdir flag goes back to using the
"-a" build flag to always rebuild all packages. On the plus side,
that caching didn't work very well; see #176.
2) The package name "main" remains in all declarations under it, not
just "func main", since we can only rename entire packages. This
seems fine, as it gives little information to the end user.
3) The -tiny mode no longer sets all lines to 0, since it did that by
modifying object files. As a temporary measure, we instead set all
top-level declarations to be on line 1. A TODO is added to hopefully
improve this again in the near future.
The upside is that we get rid of all the issues mentioned before. Plus,
garble now nearly works with Go 1.16, with the exception of two very
minor bugs that look fixable. A follow-up PR will take care of that and
start testing on 1.16.
Fixes #176.
Fixes #190.
4 years ago
|
|
|
return append(flags, newPaths...), nil
|
|
|
|
}
|
|
|
|
|
|
|
|
// handleDirectives looks at all the comments in a file containing build
|
|
|
|
// directives, and does the necessary for the obfuscation process to work.
|
|
|
|
//
|
|
|
|
// Right now, this means recording what local names are used with go:linkname,
|
|
|
|
// and rewriting those directives to use obfuscated name from other packages.
|
rework the position obfuscator (#282)
First, rename line_obfuscator.go to position.go. We obfuscate filenames,
not just line numbers, and "obfuscator" is a bit redundant.
Second, use "/*line :x*/" comments rather than the "//line :x" form, as
the former allows us to insert them in any position without adding
unnecessary newlines. This will be important for changing the position
of call sites, which will be important for "garble reverse".
Third, do not rely on go/ast to remove and add comments. Since they are
free-floating, we can very easily end up with misplaced comments,
especially as the literal obfuscator heavily modifies the AST.
The new method prints and re-parses the file, to ensure all node
positions are consistent with a buffer, buf1. Then, we copy the contents
into a new buffer, buf2, while inserting the comments that we need.
The new method also modifies line numbers at the very end of obfuscating
a Go file, instead of at the very beginning. That's going to be more
robust long-term, as we will also obfuscate line numbers for any
additions or modifications to the AST.
Fourth, detachedDirectives is unnecessary, as we can accomplish the same
with two simple prefix matches.
Finally, this means we can stop using detachedComments entirely, as
printFile already inserts the comments we need.
For #5.
4 years ago
|
|
|
func (tf *transformer) handleDirectives(comments []*ast.CommentGroup) {
|
|
|
|
for _, group := range comments {
|
|
|
|
for _, comment := range group.List {
|
|
|
|
if !strings.HasPrefix(comment.Text, "//go:linkname ") {
|
|
|
|
continue
|
|
|
|
}
|
|
|
|
fields := strings.Fields(comment.Text)
|
|
|
|
if len(fields) != 3 {
|
|
|
|
// TODO: the 2nd argument is optional, handle when it's not present
|
rework the position obfuscator (#282)
First, rename line_obfuscator.go to position.go. We obfuscate filenames,
not just line numbers, and "obfuscator" is a bit redundant.
Second, use "/*line :x*/" comments rather than the "//line :x" form, as
the former allows us to insert them in any position without adding
unnecessary newlines. This will be important for changing the position
of call sites, which will be important for "garble reverse".
Third, do not rely on go/ast to remove and add comments. Since they are
free-floating, we can very easily end up with misplaced comments,
especially as the literal obfuscator heavily modifies the AST.
The new method prints and re-parses the file, to ensure all node
positions are consistent with a buffer, buf1. Then, we copy the contents
into a new buffer, buf2, while inserting the comments that we need.
The new method also modifies line numbers at the very end of obfuscating
a Go file, instead of at the very beginning. That's going to be more
robust long-term, as we will also obfuscate line numbers for any
additions or modifications to the AST.
Fourth, detachedDirectives is unnecessary, as we can accomplish the same
with two simple prefix matches.
Finally, this means we can stop using detachedComments entirely, as
printFile already inserts the comments we need.
For #5.
4 years ago
|
|
|
continue
|
|
|
|
}
|
|
|
|
// This directive has two arguments: "go:linkname localName newName"
|
|
|
|
|
|
|
|
// obfuscate the local name, if the current package is obfuscated
|
deprecate using GOPRIVATE in favor of GOGARBLE (#427)
Piggybacking off of GOPRIVATE is great for a number of reasons:
* People tend to obfuscate private code, whose package paths will
generally be in GOPRIVATE already
* Its meaning and syntax are well understood
* It allows all the flexibility we need without adding our own env var
or config option
However, using GOPRIVATE directly has one main drawback.
It's fairly common to also want to obfuscate public dependencies,
to make the code in private packages even harder to follow.
However, using "GOPRIVATE=*" will result in two main downsides:
* GONOPROXY defaults to GOPRIVATE, so the proxy would be entirely disabled.
Downloading modules, such as when adding or updating dependencies,
or when the local cache is cold, can be less reliable.
* GONOSUMDB defaults to GOPRIVATE, so the sumdb would be entirely disabled.
Adding entries to go.sum, such as when adding or updating dependencies,
can be less secure.
We will continue to consume GOPRIVATE as a fallback,
but we now expect users to set GOGARBLE instead.
The new logic is documented in the README.
While here, rewrite some uses of "private" with "to obfuscate",
to make the code easier to follow and harder to misunderstand.
Fixes #276.
3 years ago
|
|
|
if curPkg.ToObfuscate {
|
|
|
|
fields[1] = hashWith(curPkg.GarbleActionID, fields[1])
|
|
|
|
}
|
|
|
|
|
rework the position obfuscator (#282)
First, rename line_obfuscator.go to position.go. We obfuscate filenames,
not just line numbers, and "obfuscator" is a bit redundant.
Second, use "/*line :x*/" comments rather than the "//line :x" form, as
the former allows us to insert them in any position without adding
unnecessary newlines. This will be important for changing the position
of call sites, which will be important for "garble reverse".
Third, do not rely on go/ast to remove and add comments. Since they are
free-floating, we can very easily end up with misplaced comments,
especially as the literal obfuscator heavily modifies the AST.
The new method prints and re-parses the file, to ensure all node
positions are consistent with a buffer, buf1. Then, we copy the contents
into a new buffer, buf2, while inserting the comments that we need.
The new method also modifies line numbers at the very end of obfuscating
a Go file, instead of at the very beginning. That's going to be more
robust long-term, as we will also obfuscate line numbers for any
additions or modifications to the AST.
Fourth, detachedDirectives is unnecessary, as we can accomplish the same
with two simple prefix matches.
Finally, this means we can stop using detachedComments entirely, as
printFile already inserts the comments we need.
For #5.
4 years ago
|
|
|
// If the new name is of the form "pkgpath.Name", and
|
|
|
|
// we've obfuscated "Name" in that package, rewrite the
|
|
|
|
// directive to use the obfuscated name.
|
|
|
|
newName := fields[2]
|
|
|
|
dotCnt := strings.Count(newName, ".")
|
|
|
|
if dotCnt < 1 {
|
|
|
|
// probably a malformed linkname directive
|
rework the position obfuscator (#282)
First, rename line_obfuscator.go to position.go. We obfuscate filenames,
not just line numbers, and "obfuscator" is a bit redundant.
Second, use "/*line :x*/" comments rather than the "//line :x" form, as
the former allows us to insert them in any position without adding
unnecessary newlines. This will be important for changing the position
of call sites, which will be important for "garble reverse".
Third, do not rely on go/ast to remove and add comments. Since they are
free-floating, we can very easily end up with misplaced comments,
especially as the literal obfuscator heavily modifies the AST.
The new method prints and re-parses the file, to ensure all node
positions are consistent with a buffer, buf1. Then, we copy the contents
into a new buffer, buf2, while inserting the comments that we need.
The new method also modifies line numbers at the very end of obfuscating
a Go file, instead of at the very beginning. That's going to be more
robust long-term, as we will also obfuscate line numbers for any
additions or modifications to the AST.
Fourth, detachedDirectives is unnecessary, as we can accomplish the same
with two simple prefix matches.
Finally, this means we can stop using detachedComments entirely, as
printFile already inserts the comments we need.
For #5.
4 years ago
|
|
|
continue
|
|
|
|
}
|
|
|
|
|
|
|
|
// If the package path has multiple dots, split on the
|
|
|
|
// last one.
|
|
|
|
lastDotIdx := strings.LastIndex(newName, ".")
|
|
|
|
pkgPath, name := newName[:lastDotIdx], newName[lastDotIdx+1:]
|
|
|
|
|
rework the position obfuscator (#282)
First, rename line_obfuscator.go to position.go. We obfuscate filenames,
not just line numbers, and "obfuscator" is a bit redundant.
Second, use "/*line :x*/" comments rather than the "//line :x" form, as
the former allows us to insert them in any position without adding
unnecessary newlines. This will be important for changing the position
of call sites, which will be important for "garble reverse".
Third, do not rely on go/ast to remove and add comments. Since they are
free-floating, we can very easily end up with misplaced comments,
especially as the literal obfuscator heavily modifies the AST.
The new method prints and re-parses the file, to ensure all node
positions are consistent with a buffer, buf1. Then, we copy the contents
into a new buffer, buf2, while inserting the comments that we need.
The new method also modifies line numbers at the very end of obfuscating
a Go file, instead of at the very beginning. That's going to be more
robust long-term, as we will also obfuscate line numbers for any
additions or modifications to the AST.
Fourth, detachedDirectives is unnecessary, as we can accomplish the same
with two simple prefix matches.
Finally, this means we can stop using detachedComments entirely, as
printFile already inserts the comments we need.
For #5.
4 years ago
|
|
|
lpkg, err := listPackage(pkgPath)
|
|
|
|
if err != nil {
|
|
|
|
// probably a made up symbol name, replace the comment
|
|
|
|
// in case the local name was obfuscated.
|
|
|
|
comment.Text = strings.Join(fields, " ")
|
|
|
|
continue
|
rework the position obfuscator (#282)
First, rename line_obfuscator.go to position.go. We obfuscate filenames,
not just line numbers, and "obfuscator" is a bit redundant.
Second, use "/*line :x*/" comments rather than the "//line :x" form, as
the former allows us to insert them in any position without adding
unnecessary newlines. This will be important for changing the position
of call sites, which will be important for "garble reverse".
Third, do not rely on go/ast to remove and add comments. Since they are
free-floating, we can very easily end up with misplaced comments,
especially as the literal obfuscator heavily modifies the AST.
The new method prints and re-parses the file, to ensure all node
positions are consistent with a buffer, buf1. Then, we copy the contents
into a new buffer, buf2, while inserting the comments that we need.
The new method also modifies line numbers at the very end of obfuscating
a Go file, instead of at the very beginning. That's going to be more
robust long-term, as we will also obfuscate line numbers for any
additions or modifications to the AST.
Fourth, detachedDirectives is unnecessary, as we can accomplish the same
with two simple prefix matches.
Finally, this means we can stop using detachedComments entirely, as
printFile already inserts the comments we need.
For #5.
4 years ago
|
|
|
}
|
deprecate using GOPRIVATE in favor of GOGARBLE (#427)
Piggybacking off of GOPRIVATE is great for a number of reasons:
* People tend to obfuscate private code, whose package paths will
generally be in GOPRIVATE already
* Its meaning and syntax are well understood
* It allows all the flexibility we need without adding our own env var
or config option
However, using GOPRIVATE directly has one main drawback.
It's fairly common to also want to obfuscate public dependencies,
to make the code in private packages even harder to follow.
However, using "GOPRIVATE=*" will result in two main downsides:
* GONOPROXY defaults to GOPRIVATE, so the proxy would be entirely disabled.
Downloading modules, such as when adding or updating dependencies,
or when the local cache is cold, can be less reliable.
* GONOSUMDB defaults to GOPRIVATE, so the sumdb would be entirely disabled.
Adding entries to go.sum, such as when adding or updating dependencies,
can be less secure.
We will continue to consume GOPRIVATE as a fallback,
but we now expect users to set GOGARBLE instead.
The new logic is documented in the README.
While here, rewrite some uses of "private" with "to obfuscate",
to make the code easier to follow and harder to misunderstand.
Fixes #276.
3 years ago
|
|
|
if lpkg.ToObfuscate {
|
|
|
|
// The name exists and was obfuscated; obfuscate
|
|
|
|
// the new name.
|
|
|
|
newName := hashWith(lpkg.GarbleActionID, name)
|
|
|
|
newPkgPath := pkgPath
|
|
|
|
if pkgPath != "main" {
|
|
|
|
newPkgPath = lpkg.obfuscatedImportPath()
|
|
|
|
}
|
|
|
|
fields[2] = newPkgPath + "." + newName
|
rework the position obfuscator (#282)
First, rename line_obfuscator.go to position.go. We obfuscate filenames,
not just line numbers, and "obfuscator" is a bit redundant.
Second, use "/*line :x*/" comments rather than the "//line :x" form, as
the former allows us to insert them in any position without adding
unnecessary newlines. This will be important for changing the position
of call sites, which will be important for "garble reverse".
Third, do not rely on go/ast to remove and add comments. Since they are
free-floating, we can very easily end up with misplaced comments,
especially as the literal obfuscator heavily modifies the AST.
The new method prints and re-parses the file, to ensure all node
positions are consistent with a buffer, buf1. Then, we copy the contents
into a new buffer, buf2, while inserting the comments that we need.
The new method also modifies line numbers at the very end of obfuscating
a Go file, instead of at the very beginning. That's going to be more
robust long-term, as we will also obfuscate line numbers for any
additions or modifications to the AST.
Fourth, detachedDirectives is unnecessary, as we can accomplish the same
with two simple prefix matches.
Finally, this means we can stop using detachedComments entirely, as
printFile already inserts the comments we need.
For #5.
4 years ago
|
|
|
}
|
|
|
|
|
rework the position obfuscator (#282)
First, rename line_obfuscator.go to position.go. We obfuscate filenames,
not just line numbers, and "obfuscator" is a bit redundant.
Second, use "/*line :x*/" comments rather than the "//line :x" form, as
the former allows us to insert them in any position without adding
unnecessary newlines. This will be important for changing the position
of call sites, which will be important for "garble reverse".
Third, do not rely on go/ast to remove and add comments. Since they are
free-floating, we can very easily end up with misplaced comments,
especially as the literal obfuscator heavily modifies the AST.
The new method prints and re-parses the file, to ensure all node
positions are consistent with a buffer, buf1. Then, we copy the contents
into a new buffer, buf2, while inserting the comments that we need.
The new method also modifies line numbers at the very end of obfuscating
a Go file, instead of at the very beginning. That's going to be more
robust long-term, as we will also obfuscate line numbers for any
additions or modifications to the AST.
Fourth, detachedDirectives is unnecessary, as we can accomplish the same
with two simple prefix matches.
Finally, this means we can stop using detachedComments entirely, as
printFile already inserts the comments we need.
For #5.
4 years ago
|
|
|
comment.Text = strings.Join(fields, " ")
|
reimplement import path obfuscation without goobj2 (#242)
We used to rely on a parallel implementation of an object file parser
and writer to be able to obfuscate import paths. After compiling each
package, we would parse the object file, replace the import paths, and
write the updated object file in-place.
That worked well, in most cases. Unfortunately, it had some flaws:
* Complexity. Even when most of the code is maintained in a separate
module, the import_obfuscation.go file was still close to a thousand
lines of code.
* Go compatibility. The object file format changes between Go releases,
so we were supporting Go 1.15, but not 1.16. Fixing the object file
package to work with 1.16 would probably break 1.15 support.
* Bugs. For example, we recently had to add a workaround for #224, since
import paths containing dots after the domain would end up escaped.
Another example is #190, which seems to be caused by the object file
parser or writer corrupting the compiled code and causing segfaults in
some rare edge cases.
Instead, let's drop that method entirely, and force the compiler and
linker to do the work for us. The steps necessary when compiling a
package to obfuscate are:
1) Replace its "package foo" lines with the obfuscated package path. No
need to separate the package path and name, since the obfuscated path
does not contain slashes.
2) Replace the "-p pkg/foo" flag with the obfuscated path.
3) Replace the "import" spec lines with the obfuscated package paths,
for those dependencies which were obfuscated.
4) Replace the "-importcfg [...]" file with a version that uses the
obfuscated paths instead.
The linker also needs that last step, since it also uses an importcfg
file to find object files.
There are three noteworthy drawbacks to this new method:
1) Since we no longer write object files, we can't use them to store
data to be cached. As such, the -debugdir flag goes back to using the
"-a" build flag to always rebuild all packages. On the plus side,
that caching didn't work very well; see #176.
2) The package name "main" remains in all declarations under it, not
just "func main", since we can only rename entire packages. This
seems fine, as it gives little information to the end user.
3) The -tiny mode no longer sets all lines to 0, since it did that by
modifying object files. As a temporary measure, we instead set all
top-level declarations to be on line 1. A TODO is added to hopefully
improve this again in the near future.
The upside is that we get rid of all the issues mentioned before. Plus,
garble now nearly works with Go 1.16, with the exception of two very
minor bugs that look fixable. A follow-up PR will take care of that and
start testing on 1.16.
Fixes #176.
Fixes #190.
4 years ago
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
// cannotObfuscate is a list of some packages the runtime depends on, or
|
|
|
|
// packages which the runtime points to via go:linkname.
|
|
|
|
//
|
|
|
|
// Once we support go:linkname well and once we can obfuscate the runtime
|
|
|
|
// package, this entire map can likely go away.
|
|
|
|
//
|
|
|
|
// TODO: investigate and resolve each one of these
|
|
|
|
var cannotObfuscate = map[string]bool{
|
|
|
|
// not a "real" package
|
|
|
|
"unsafe": true,
|
|
|
|
|
|
|
|
// some linkname failure
|
|
|
|
"time": true,
|
|
|
|
"runtime/pprof": true,
|
|
|
|
|
|
|
|
// all kinds of stuff breaks when obfuscating the runtime
|
|
|
|
"syscall": true,
|
|
|
|
"internal/abi": true,
|
|
|
|
|
|
|
|
// rebuilds don't work
|
|
|
|
"os/signal": true,
|
|
|
|
|
|
|
|
// cgo breaks otherwise
|
|
|
|
"runtime/cgo": true,
|
|
|
|
|
|
|
|
// garble reverse breaks otherwise
|
|
|
|
"runtime/debug": true,
|
|
|
|
|
|
|
|
// cgo heavy net doesn't like to be obfuscated
|
|
|
|
"net": true,
|
|
|
|
|
|
|
|
// some linkname failure
|
|
|
|
"crypto/x509/internal/macos": true,
|
|
|
|
}
|
|
|
|
|
|
|
|
// Obtained from "go list -deps runtime" on Go master (1.18) as of Nov 2021.
|
|
|
|
// Note that the same command on Go 1.17 results in a subset of this list.
|
|
|
|
var runtimeAndDeps = map[string]bool{
|
|
|
|
"internal/goarch": true,
|
|
|
|
"unsafe": true,
|
|
|
|
"internal/abi": true,
|
|
|
|
"internal/cpu": true,
|
|
|
|
"internal/bytealg": true,
|
|
|
|
"internal/goexperiment": true,
|
|
|
|
"internal/goos": true,
|
|
|
|
"runtime/internal/atomic": true,
|
|
|
|
"runtime/internal/math": true,
|
|
|
|
"runtime/internal/sys": true,
|
|
|
|
"runtime": true,
|
|
|
|
}
|
|
|
|
|
deprecate using GOPRIVATE in favor of GOGARBLE (#427)
Piggybacking off of GOPRIVATE is great for a number of reasons:
* People tend to obfuscate private code, whose package paths will
generally be in GOPRIVATE already
* Its meaning and syntax are well understood
* It allows all the flexibility we need without adding our own env var
or config option
However, using GOPRIVATE directly has one main drawback.
It's fairly common to also want to obfuscate public dependencies,
to make the code in private packages even harder to follow.
However, using "GOPRIVATE=*" will result in two main downsides:
* GONOPROXY defaults to GOPRIVATE, so the proxy would be entirely disabled.
Downloading modules, such as when adding or updating dependencies,
or when the local cache is cold, can be less reliable.
* GONOSUMDB defaults to GOPRIVATE, so the sumdb would be entirely disabled.
Adding entries to go.sum, such as when adding or updating dependencies,
can be less secure.
We will continue to consume GOPRIVATE as a fallback,
but we now expect users to set GOGARBLE instead.
The new logic is documented in the README.
While here, rewrite some uses of "private" with "to obfuscate",
to make the code easier to follow and harder to misunderstand.
Fixes #276.
3 years ago
|
|
|
// toObfuscate checks if a package should be obfuscated given its import path.
|
|
|
|
// If you are holding a listedPackage, reuse its ToObfuscate field instead.
|
|
|
|
func toObfuscate(path string) bool {
|
fix and re-enable "garble test" (#268)
With the many refactors building up to v0.1.0, we broke "garble test" as
we no longer dealt with test packages well.
Luckily, now that we can depend on TOOLEXEC_IMPORTPATH, we can support
the test command again, as we can always figure out what package we're
currently compiling, without having to track a "main" package.
Note that one major pitfall there is test packages, where
TOOLEXEC_IMPORTPATH does not agree with ImportPath from "go list -json".
However, we can still work around that with a bit of glue code, which is
also copiously documented.
The second change necessary is to consider test packages private
depending on whether their non-test package is private or not. This can
be done via the ForTest field in "go list -json".
The third change is to obfuscate "_testmain.go" files, which are the
code-generated main functions which actually run tests. We used to not
need to obfuscate them, since test function names are never obfuscated
and we used to not obfuscate import paths at compilation time. Now we do
rewrite import paths, so we must do that for "_testmain.go" too.
The fourth change is to re-enable test.txt, and expand it with more
sanity checks and edge cases.
Finally, document "garble test" again.
Fixes #241.
4 years ago
|
|
|
// We don't support obfuscating these yet.
|
|
|
|
if cannotObfuscate[path] || runtimeAndDeps[path] {
|
|
|
|
return false
|
|
|
|
}
|
fix and re-enable "garble test" (#268)
With the many refactors building up to v0.1.0, we broke "garble test" as
we no longer dealt with test packages well.
Luckily, now that we can depend on TOOLEXEC_IMPORTPATH, we can support
the test command again, as we can always figure out what package we're
currently compiling, without having to track a "main" package.
Note that one major pitfall there is test packages, where
TOOLEXEC_IMPORTPATH does not agree with ImportPath from "go list -json".
However, we can still work around that with a bit of glue code, which is
also copiously documented.
The second change necessary is to consider test packages private
depending on whether their non-test package is private or not. This can
be done via the ForTest field in "go list -json".
The third change is to obfuscate "_testmain.go" files, which are the
code-generated main functions which actually run tests. We used to not
need to obfuscate them, since test function names are never obfuscated
and we used to not obfuscate import paths at compilation time. Now we do
rewrite import paths, so we must do that for "_testmain.go" too.
The fourth change is to re-enable test.txt, and expand it with more
sanity checks and edge cases.
Finally, document "garble test" again.
Fixes #241.
4 years ago
|
|
|
// These are main packages, so we must always obfuscate them.
|
refactor "current package" with TOOLEXEC_IMPORTPATH (#266)
Now that we've dropped support for Go 1.15.x, we can finally rely on
this environment variable for toolexec calls, present in Go 1.16.
Before, we had hacky ways of trying to figure out the current package's
import path, mostly from the -p flag. The biggest rough edge there was
that, for main packages, that was simply the package name, and not its
full import path.
To work around that, we had a restriction on a single main package, so
we could work around that issue. That restriction is now gone.
The new code is simpler, especially because we can set curPkg in a
single place for all toolexec transform funcs.
Since we can always rely on curPkg not being nil now, we can also start
reusing listedPackage.Private and avoid the majority of repeated calls
to isPrivate. The function is cheap, but still not free.
isPrivate itself can also get simpler. We no longer have to worry about
the "main" edge case. Plus, the sanity check for invalid package paths
is now unnecessary; we only got malformed paths from goobj2, and we now
require exact matches with the ImportPath field from "go list -json".
Another effect of clearing up the "main" edge case is that -debugdir now
uses the right directory for main packages. We also start using
consistent debugdir paths in the tests, for the sake of being easier to
read and maintain.
Finally, note that commandReverse did not need the extra call to "go
list -toolexec", as the "shared" call stored in the cache is enough. We
still call toolexecCmd to get said cache, which should probably be
simplified in a future PR.
While at it, replace the use of the "-std" compiler flag with the
Standard field from "go list -json".
4 years ago
|
|
|
if path == "command-line-arguments" || strings.HasPrefix(path, "plugin/unnamed") {
|
|
|
|
return true
|
|
|
|
}
|
deprecate using GOPRIVATE in favor of GOGARBLE (#427)
Piggybacking off of GOPRIVATE is great for a number of reasons:
* People tend to obfuscate private code, whose package paths will
generally be in GOPRIVATE already
* Its meaning and syntax are well understood
* It allows all the flexibility we need without adding our own env var
or config option
However, using GOPRIVATE directly has one main drawback.
It's fairly common to also want to obfuscate public dependencies,
to make the code in private packages even harder to follow.
However, using "GOPRIVATE=*" will result in two main downsides:
* GONOPROXY defaults to GOPRIVATE, so the proxy would be entirely disabled.
Downloading modules, such as when adding or updating dependencies,
or when the local cache is cold, can be less reliable.
* GONOSUMDB defaults to GOPRIVATE, so the sumdb would be entirely disabled.
Adding entries to go.sum, such as when adding or updating dependencies,
can be less secure.
We will continue to consume GOPRIVATE as a fallback,
but we now expect users to set GOGARBLE instead.
The new logic is documented in the README.
While here, rewrite some uses of "private" with "to obfuscate",
to make the code easier to follow and harder to misunderstand.
Fixes #276.
3 years ago
|
|
|
return module.MatchPrefixPatterns(cache.GOGARBLE, path)
|
|
|
|
}
|
|
|
|
|
stop relying on nested "go list -toolexec" calls (#422)
We rely on importcfg files to load type info for obfuscated packages.
We use this type information to remember what names we didn't obfuscate.
Unfortunately, indirect dependencies aren't listed in importcfg files,
so we relied on extra "go list -toolexec" calls to locate object files.
This worked fine, but added a significant amount of complexity.
The extra "go list -export -toolexec=garble" invocations weren't slow,
as they avoided rebuilding or re-obfuscating thanks to the build cache.
Still, it was hard to reason about how garble runs during a build
if we might have multiple layers of -toolexec invocations.
Instead, record the export files we encounter in an incremental map,
and persist it in the build cache via the gob file we're already using.
This way, each garble invocation knows where all object files are,
even those for indirect imports.
One wrinkle is that importcfg files can point to temporary object files.
In that case, figure out its final location in the build cache.
This requires hard-coding a bit of knowledge about how GOCACHE works,
but it seems relatively harmless given how it's very little code.
Plus, if GOCACHE ever changes, it will be obvious when our code breaks.
Finally, add a TODO about potentially saving even more work.
4 years ago
|
|
|
// processImportCfg parses the importcfg file passed to a compile or link step,
|
|
|
|
// adding missing entries to KnownObjectFiles to be stored in the build cache.
|
|
|
|
// It also builds a new importcfg file to account for obfuscated import paths.
|
avoid one more call to 'go tool buildid' (#253)
We use it to get the content ID of garble's binary, which is used for
both the garble action IDs, as well as 'go tool compile -V=full'.
Since those two happen in separate processes, both used to call 'go tool
buildid' separately. Store it in the gob cache the first time, and reuse
it the second time.
Since each call to cmd/go costs about 10ms (new process, running its
many init funcs, etc), this results in a nice speed-up for our small
benchmark. Most builds will take many seconds though, so note that a
~15ms speedup there will likely not be noticeable.
While at it, simplify the buildInfo global, as now it just contains a
map representation of the -importcfg contents. It now has better names,
docs, and a simpler representation.
We also stop using the term "garbled import", as it was a bit confusing.
"obfuscated types.Package" is a much better description.
name old time/op new time/op delta
Build-8 106ms ± 1% 92ms ± 0% -14.07% (p=0.010 n=6+4)
name old bin-B new bin-B delta
Build-8 6.60M ± 0% 6.60M ± 0% -0.01% (p=0.002 n=6+6)
name old sys-time/op new sys-time/op delta
Build-8 208ms ± 5% 149ms ± 3% -28.27% (p=0.004 n=6+5)
name old user-time/op new user-time/op delta
Build-8 433ms ± 3% 384ms ± 3% -11.35% (p=0.002 n=6+6)
4 years ago
|
|
|
func processImportCfg(flags []string) (newImportCfg string, _ error) {
|
start using original action IDs (#251)
When we obfuscate a name, what we do is hash the name with the action ID
of the package that contains the name. To ensure that the hash changes
if the garble tool changes, we used the action ID of the obfuscated
build, which is different than the original action ID, as we include
garble's own content ID in "go tool compile -V=full" via -toolexec.
Let's call that the "obfuscated action ID". Remember that a content ID
is roughly the hash of a binary or object file, and an action ID
contains the hash of a package's source code plus the content IDs of its
dependencies.
This had the advantage that it did what we wanted. However, it had one
massive drawback: when we compile a package, we only have the obfuscated
action IDs of its dependencies. This is because one can't have the
content ID of dependent packages before they are built.
Usually, this is not a problem, because hashing a foreign name means it
comes from a dependency, where we already have the obfuscated action ID.
However, that's not always the case.
First, go:linkname directives can point to any symbol that ends up in
the binary, even if the package is not a dependency. So garble could
only support linkname targets belonging to dependencies. This is at the
root of why we could not obfuscate the runtime; it contains linkname
directives targeting the net package, for example, which depends on runtime.
Second, some other places did not have an easy access to obfuscated
action IDs, like transformAsm, which had to recover it from a temporary
file stored by transformCompile.
Plus, this was all pretty expensive, as each toolexec sub-process had to
make repeated calls to buildidOf with the object files of dependencies.
We even had to use extra calls to "go list" in the case of indirect
dependencies, as their export files do not appear in importcfg files.
All in all, the old method was complex and expensive. A better mechanism
is to use the original action IDs directly, as listed by "go list"
without garble in the picture.
This would mean that the hashing does not change if garble changes,
meaning weaker obfuscation. To regain that property, we define the
"garble action ID", which is just the original action ID hashed together
with garble's own content ID.
This is practically the same as the obfuscated build ID we used before,
but since it doesn't go through "go tool compile -V=full" and the
obfuscated build itself, we can work out *all* the garble action IDs
upfront, before the obfuscated build even starts.
This fixes all of our problems. Now we know all garble build IDs
upfront, so a bunch of hacks can be entirely removed. Plus, since we
know them upfront, we can also cache them and avoid repeated calls to
"go tool buildid".
While at it, make use of the new BuildID field in Go 1.16's "list -json
-export". This avoids the vast majority of "go tool buildid" calls, as
the only ones that remain are 2 on the garble binary itself.
The numbers for Go 1.16 look very good:
name old time/op new time/op delta
Build-8 146ms ± 4% 101ms ± 1% -31.01% (p=0.002 n=6+6)
name old bin-B new bin-B delta
Build-8 6.61M ± 0% 6.60M ± 0% -0.09% (p=0.002 n=6+6)
name old sys-time/op new sys-time/op delta
Build-8 321ms ± 7% 202ms ± 6% -37.11% (p=0.002 n=6+6)
name old user-time/op new user-time/op delta
Build-8 538ms ± 4% 414ms ± 4% -23.12% (p=0.002 n=6+6)
4 years ago
|
|
|
importCfg := flagValue(flags, "-importcfg")
|
|
|
|
if importCfg == "" {
|
reimplement import path obfuscation without goobj2 (#242)
We used to rely on a parallel implementation of an object file parser
and writer to be able to obfuscate import paths. After compiling each
package, we would parse the object file, replace the import paths, and
write the updated object file in-place.
That worked well, in most cases. Unfortunately, it had some flaws:
* Complexity. Even when most of the code is maintained in a separate
module, the import_obfuscation.go file was still close to a thousand
lines of code.
* Go compatibility. The object file format changes between Go releases,
so we were supporting Go 1.15, but not 1.16. Fixing the object file
package to work with 1.16 would probably break 1.15 support.
* Bugs. For example, we recently had to add a workaround for #224, since
import paths containing dots after the domain would end up escaped.
Another example is #190, which seems to be caused by the object file
parser or writer corrupting the compiled code and causing segfaults in
some rare edge cases.
Instead, let's drop that method entirely, and force the compiler and
linker to do the work for us. The steps necessary when compiling a
package to obfuscate are:
1) Replace its "package foo" lines with the obfuscated package path. No
need to separate the package path and name, since the obfuscated path
does not contain slashes.
2) Replace the "-p pkg/foo" flag with the obfuscated path.
3) Replace the "import" spec lines with the obfuscated package paths,
for those dependencies which were obfuscated.
4) Replace the "-importcfg [...]" file with a version that uses the
obfuscated paths instead.
The linker also needs that last step, since it also uses an importcfg
file to find object files.
There are three noteworthy drawbacks to this new method:
1) Since we no longer write object files, we can't use them to store
data to be cached. As such, the -debugdir flag goes back to using the
"-a" build flag to always rebuild all packages. On the plus side,
that caching didn't work very well; see #176.
2) The package name "main" remains in all declarations under it, not
just "func main", since we can only rename entire packages. This
seems fine, as it gives little information to the end user.
3) The -tiny mode no longer sets all lines to 0, since it did that by
modifying object files. As a temporary measure, we instead set all
top-level declarations to be on line 1. A TODO is added to hopefully
improve this again in the near future.
The upside is that we get rid of all the issues mentioned before. Plus,
garble now nearly works with Go 1.16, with the exception of two very
minor bugs that look fixable. A follow-up PR will take care of that and
start testing on 1.16.
Fixes #176.
Fixes #190.
4 years ago
|
|
|
return "", fmt.Errorf("could not find -importcfg argument")
|
|
|
|
}
|
|
|
|
data, err := os.ReadFile(importCfg)
|
|
|
|
if err != nil {
|
reimplement import path obfuscation without goobj2 (#242)
We used to rely on a parallel implementation of an object file parser
and writer to be able to obfuscate import paths. After compiling each
package, we would parse the object file, replace the import paths, and
write the updated object file in-place.
That worked well, in most cases. Unfortunately, it had some flaws:
* Complexity. Even when most of the code is maintained in a separate
module, the import_obfuscation.go file was still close to a thousand
lines of code.
* Go compatibility. The object file format changes between Go releases,
so we were supporting Go 1.15, but not 1.16. Fixing the object file
package to work with 1.16 would probably break 1.15 support.
* Bugs. For example, we recently had to add a workaround for #224, since
import paths containing dots after the domain would end up escaped.
Another example is #190, which seems to be caused by the object file
parser or writer corrupting the compiled code and causing segfaults in
some rare edge cases.
Instead, let's drop that method entirely, and force the compiler and
linker to do the work for us. The steps necessary when compiling a
package to obfuscate are:
1) Replace its "package foo" lines with the obfuscated package path. No
need to separate the package path and name, since the obfuscated path
does not contain slashes.
2) Replace the "-p pkg/foo" flag with the obfuscated path.
3) Replace the "import" spec lines with the obfuscated package paths,
for those dependencies which were obfuscated.
4) Replace the "-importcfg [...]" file with a version that uses the
obfuscated paths instead.
The linker also needs that last step, since it also uses an importcfg
file to find object files.
There are three noteworthy drawbacks to this new method:
1) Since we no longer write object files, we can't use them to store
data to be cached. As such, the -debugdir flag goes back to using the
"-a" build flag to always rebuild all packages. On the plus side,
that caching didn't work very well; see #176.
2) The package name "main" remains in all declarations under it, not
just "func main", since we can only rename entire packages. This
seems fine, as it gives little information to the end user.
3) The -tiny mode no longer sets all lines to 0, since it did that by
modifying object files. As a temporary measure, we instead set all
top-level declarations to be on line 1. A TODO is added to hopefully
improve this again in the near future.
The upside is that we get rid of all the issues mentioned before. Plus,
garble now nearly works with Go 1.16, with the exception of two very
minor bugs that look fixable. A follow-up PR will take care of that and
start testing on 1.16.
Fixes #176.
Fixes #190.
4 years ago
|
|
|
return "", err
|
|
|
|
}
|
|
|
|
|
stop relying on nested "go list -toolexec" calls (#422)
We rely on importcfg files to load type info for obfuscated packages.
We use this type information to remember what names we didn't obfuscate.
Unfortunately, indirect dependencies aren't listed in importcfg files,
so we relied on extra "go list -toolexec" calls to locate object files.
This worked fine, but added a significant amount of complexity.
The extra "go list -export -toolexec=garble" invocations weren't slow,
as they avoided rebuilding or re-obfuscating thanks to the build cache.
Still, it was hard to reason about how garble runs during a build
if we might have multiple layers of -toolexec invocations.
Instead, record the export files we encounter in an incremental map,
and persist it in the build cache via the gob file we're already using.
This way, each garble invocation knows where all object files are,
even those for indirect imports.
One wrinkle is that importcfg files can point to temporary object files.
In that case, figure out its final location in the build cache.
This requires hard-coding a bit of knowledge about how GOCACHE works,
but it seems relatively harmless given how it's very little code.
Plus, if GOCACHE ever changes, it will be obvious when our code breaks.
Finally, add a TODO about potentially saving even more work.
4 years ago
|
|
|
var packagefiles, importmaps [][2]string
|
avoid one more call to 'go tool buildid' (#253)
We use it to get the content ID of garble's binary, which is used for
both the garble action IDs, as well as 'go tool compile -V=full'.
Since those two happen in separate processes, both used to call 'go tool
buildid' separately. Store it in the gob cache the first time, and reuse
it the second time.
Since each call to cmd/go costs about 10ms (new process, running its
many init funcs, etc), this results in a nice speed-up for our small
benchmark. Most builds will take many seconds though, so note that a
~15ms speedup there will likely not be noticeable.
While at it, simplify the buildInfo global, as now it just contains a
map representation of the -importcfg contents. It now has better names,
docs, and a simpler representation.
We also stop using the term "garbled import", as it was a bit confusing.
"obfuscated types.Package" is a much better description.
name old time/op new time/op delta
Build-8 106ms ± 1% 92ms ± 0% -14.07% (p=0.010 n=6+4)
name old bin-B new bin-B delta
Build-8 6.60M ± 0% 6.60M ± 0% -0.01% (p=0.002 n=6+6)
name old sys-time/op new sys-time/op delta
Build-8 208ms ± 5% 149ms ± 3% -28.27% (p=0.004 n=6+5)
name old user-time/op new user-time/op delta
Build-8 433ms ± 3% 384ms ± 3% -11.35% (p=0.002 n=6+6)
4 years ago
|
|
|
|
reimplement import path obfuscation without goobj2 (#242)
We used to rely on a parallel implementation of an object file parser
and writer to be able to obfuscate import paths. After compiling each
package, we would parse the object file, replace the import paths, and
write the updated object file in-place.
That worked well, in most cases. Unfortunately, it had some flaws:
* Complexity. Even when most of the code is maintained in a separate
module, the import_obfuscation.go file was still close to a thousand
lines of code.
* Go compatibility. The object file format changes between Go releases,
so we were supporting Go 1.15, but not 1.16. Fixing the object file
package to work with 1.16 would probably break 1.15 support.
* Bugs. For example, we recently had to add a workaround for #224, since
import paths containing dots after the domain would end up escaped.
Another example is #190, which seems to be caused by the object file
parser or writer corrupting the compiled code and causing segfaults in
some rare edge cases.
Instead, let's drop that method entirely, and force the compiler and
linker to do the work for us. The steps necessary when compiling a
package to obfuscate are:
1) Replace its "package foo" lines with the obfuscated package path. No
need to separate the package path and name, since the obfuscated path
does not contain slashes.
2) Replace the "-p pkg/foo" flag with the obfuscated path.
3) Replace the "import" spec lines with the obfuscated package paths,
for those dependencies which were obfuscated.
4) Replace the "-importcfg [...]" file with a version that uses the
obfuscated paths instead.
The linker also needs that last step, since it also uses an importcfg
file to find object files.
There are three noteworthy drawbacks to this new method:
1) Since we no longer write object files, we can't use them to store
data to be cached. As such, the -debugdir flag goes back to using the
"-a" build flag to always rebuild all packages. On the plus side,
that caching didn't work very well; see #176.
2) The package name "main" remains in all declarations under it, not
just "func main", since we can only rename entire packages. This
seems fine, as it gives little information to the end user.
3) The -tiny mode no longer sets all lines to 0, since it did that by
modifying object files. As a temporary measure, we instead set all
top-level declarations to be on line 1. A TODO is added to hopefully
improve this again in the near future.
The upside is that we get rid of all the issues mentioned before. Plus,
garble now nearly works with Go 1.16, with the exception of two very
minor bugs that look fixable. A follow-up PR will take care of that and
start testing on 1.16.
Fixes #176.
Fixes #190.
4 years ago
|
|
|
for _, line := range strings.SplitAfter(string(data), "\n") {
|
|
|
|
line = strings.TrimSpace(line)
|
|
|
|
if line == "" || strings.HasPrefix(line, "#") {
|
|
|
|
continue
|
|
|
|
}
|
|
|
|
i := strings.Index(line, " ")
|
|
|
|
if i < 0 {
|
|
|
|
continue
|
|
|
|
}
|
|
|
|
verb := line[:i]
|
|
|
|
switch verb {
|
|
|
|
case "importmap":
|
|
|
|
args := strings.TrimSpace(line[i+1:])
|
|
|
|
j := strings.Index(args, "=")
|
|
|
|
if j < 0 {
|
|
|
|
continue
|
|
|
|
}
|
|
|
|
beforePath, afterPath := args[:j], args[j+1:]
|
stop relying on nested "go list -toolexec" calls (#422)
We rely on importcfg files to load type info for obfuscated packages.
We use this type information to remember what names we didn't obfuscate.
Unfortunately, indirect dependencies aren't listed in importcfg files,
so we relied on extra "go list -toolexec" calls to locate object files.
This worked fine, but added a significant amount of complexity.
The extra "go list -export -toolexec=garble" invocations weren't slow,
as they avoided rebuilding or re-obfuscating thanks to the build cache.
Still, it was hard to reason about how garble runs during a build
if we might have multiple layers of -toolexec invocations.
Instead, record the export files we encounter in an incremental map,
and persist it in the build cache via the gob file we're already using.
This way, each garble invocation knows where all object files are,
even those for indirect imports.
One wrinkle is that importcfg files can point to temporary object files.
In that case, figure out its final location in the build cache.
This requires hard-coding a bit of knowledge about how GOCACHE works,
but it seems relatively harmless given how it's very little code.
Plus, if GOCACHE ever changes, it will be obvious when our code breaks.
Finally, add a TODO about potentially saving even more work.
4 years ago
|
|
|
importmaps = append(importmaps, [2]string{beforePath, afterPath})
|
|
|
|
case "packagefile":
|
|
|
|
args := strings.TrimSpace(line[i+1:])
|
|
|
|
j := strings.Index(args, "=")
|
|
|
|
if j < 0 {
|
|
|
|
continue
|
|
|
|
}
|
|
|
|
importPath, objectPath := args[:j], args[j+1:]
|
|
|
|
|
stop relying on nested "go list -toolexec" calls (#422)
We rely on importcfg files to load type info for obfuscated packages.
We use this type information to remember what names we didn't obfuscate.
Unfortunately, indirect dependencies aren't listed in importcfg files,
so we relied on extra "go list -toolexec" calls to locate object files.
This worked fine, but added a significant amount of complexity.
The extra "go list -export -toolexec=garble" invocations weren't slow,
as they avoided rebuilding or re-obfuscating thanks to the build cache.
Still, it was hard to reason about how garble runs during a build
if we might have multiple layers of -toolexec invocations.
Instead, record the export files we encounter in an incremental map,
and persist it in the build cache via the gob file we're already using.
This way, each garble invocation knows where all object files are,
even those for indirect imports.
One wrinkle is that importcfg files can point to temporary object files.
In that case, figure out its final location in the build cache.
This requires hard-coding a bit of knowledge about how GOCACHE works,
but it seems relatively harmless given how it's very little code.
Plus, if GOCACHE ever changes, it will be obvious when our code breaks.
Finally, add a TODO about potentially saving even more work.
4 years ago
|
|
|
packagefiles = append(packagefiles, [2]string{importPath, objectPath})
|
|
|
|
|
|
|
|
if prev := cachedOutput.KnownObjectFiles[importPath]; prev != "" {
|
|
|
|
// Nothing to do; recorded by one of our dependencies.
|
|
|
|
} else if strings.HasSuffix(objectPath, "_pkg_.a") {
|
|
|
|
// The path is inside a temporary directory, to be deleted soon.
|
|
|
|
// Record the final location within the build cache instead.
|
|
|
|
finalObjectPath, err := gocachePathForFile(objectPath)
|
|
|
|
if err != nil {
|
|
|
|
return "", err
|
|
|
|
}
|
|
|
|
cachedOutput.KnownObjectFiles[importPath] = finalObjectPath
|
|
|
|
} else {
|
|
|
|
cachedOutput.KnownObjectFiles[importPath] = objectPath
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
// log.Printf("%#v", buildInfo)
|
reimplement import path obfuscation without goobj2 (#242)
We used to rely on a parallel implementation of an object file parser
and writer to be able to obfuscate import paths. After compiling each
package, we would parse the object file, replace the import paths, and
write the updated object file in-place.
That worked well, in most cases. Unfortunately, it had some flaws:
* Complexity. Even when most of the code is maintained in a separate
module, the import_obfuscation.go file was still close to a thousand
lines of code.
* Go compatibility. The object file format changes between Go releases,
so we were supporting Go 1.15, but not 1.16. Fixing the object file
package to work with 1.16 would probably break 1.15 support.
* Bugs. For example, we recently had to add a workaround for #224, since
import paths containing dots after the domain would end up escaped.
Another example is #190, which seems to be caused by the object file
parser or writer corrupting the compiled code and causing segfaults in
some rare edge cases.
Instead, let's drop that method entirely, and force the compiler and
linker to do the work for us. The steps necessary when compiling a
package to obfuscate are:
1) Replace its "package foo" lines with the obfuscated package path. No
need to separate the package path and name, since the obfuscated path
does not contain slashes.
2) Replace the "-p pkg/foo" flag with the obfuscated path.
3) Replace the "import" spec lines with the obfuscated package paths,
for those dependencies which were obfuscated.
4) Replace the "-importcfg [...]" file with a version that uses the
obfuscated paths instead.
The linker also needs that last step, since it also uses an importcfg
file to find object files.
There are three noteworthy drawbacks to this new method:
1) Since we no longer write object files, we can't use them to store
data to be cached. As such, the -debugdir flag goes back to using the
"-a" build flag to always rebuild all packages. On the plus side,
that caching didn't work very well; see #176.
2) The package name "main" remains in all declarations under it, not
just "func main", since we can only rename entire packages. This
seems fine, as it gives little information to the end user.
3) The -tiny mode no longer sets all lines to 0, since it did that by
modifying object files. As a temporary measure, we instead set all
top-level declarations to be on line 1. A TODO is added to hopefully
improve this again in the near future.
The upside is that we get rid of all the issues mentioned before. Plus,
garble now nearly works with Go 1.16, with the exception of two very
minor bugs that look fixable. A follow-up PR will take care of that and
start testing on 1.16.
Fixes #176.
Fixes #190.
4 years ago
|
|
|
|
|
|
|
// Produce the modified importcfg file.
|
|
|
|
// This is mainly replacing the obfuscated paths.
|
|
|
|
// Note that we range over maps, so this is non-deterministic, but that
|
|
|
|
// should not matter as the file is treated like a lookup table.
|
|
|
|
newCfg, err := os.CreateTemp(sharedTempDir, "importcfg")
|
reimplement import path obfuscation without goobj2 (#242)
We used to rely on a parallel implementation of an object file parser
and writer to be able to obfuscate import paths. After compiling each
package, we would parse the object file, replace the import paths, and
write the updated object file in-place.
That worked well, in most cases. Unfortunately, it had some flaws:
* Complexity. Even when most of the code is maintained in a separate
module, the import_obfuscation.go file was still close to a thousand
lines of code.
* Go compatibility. The object file format changes between Go releases,
so we were supporting Go 1.15, but not 1.16. Fixing the object file
package to work with 1.16 would probably break 1.15 support.
* Bugs. For example, we recently had to add a workaround for #224, since
import paths containing dots after the domain would end up escaped.
Another example is #190, which seems to be caused by the object file
parser or writer corrupting the compiled code and causing segfaults in
some rare edge cases.
Instead, let's drop that method entirely, and force the compiler and
linker to do the work for us. The steps necessary when compiling a
package to obfuscate are:
1) Replace its "package foo" lines with the obfuscated package path. No
need to separate the package path and name, since the obfuscated path
does not contain slashes.
2) Replace the "-p pkg/foo" flag with the obfuscated path.
3) Replace the "import" spec lines with the obfuscated package paths,
for those dependencies which were obfuscated.
4) Replace the "-importcfg [...]" file with a version that uses the
obfuscated paths instead.
The linker also needs that last step, since it also uses an importcfg
file to find object files.
There are three noteworthy drawbacks to this new method:
1) Since we no longer write object files, we can't use them to store
data to be cached. As such, the -debugdir flag goes back to using the
"-a" build flag to always rebuild all packages. On the plus side,
that caching didn't work very well; see #176.
2) The package name "main" remains in all declarations under it, not
just "func main", since we can only rename entire packages. This
seems fine, as it gives little information to the end user.
3) The -tiny mode no longer sets all lines to 0, since it did that by
modifying object files. As a temporary measure, we instead set all
top-level declarations to be on line 1. A TODO is added to hopefully
improve this again in the near future.
The upside is that we get rid of all the issues mentioned before. Plus,
garble now nearly works with Go 1.16, with the exception of two very
minor bugs that look fixable. A follow-up PR will take care of that and
start testing on 1.16.
Fixes #176.
Fixes #190.
4 years ago
|
|
|
if err != nil {
|
|
|
|
return "", err
|
|
|
|
}
|
stop relying on nested "go list -toolexec" calls (#422)
We rely on importcfg files to load type info for obfuscated packages.
We use this type information to remember what names we didn't obfuscate.
Unfortunately, indirect dependencies aren't listed in importcfg files,
so we relied on extra "go list -toolexec" calls to locate object files.
This worked fine, but added a significant amount of complexity.
The extra "go list -export -toolexec=garble" invocations weren't slow,
as they avoided rebuilding or re-obfuscating thanks to the build cache.
Still, it was hard to reason about how garble runs during a build
if we might have multiple layers of -toolexec invocations.
Instead, record the export files we encounter in an incremental map,
and persist it in the build cache via the gob file we're already using.
This way, each garble invocation knows where all object files are,
even those for indirect imports.
One wrinkle is that importcfg files can point to temporary object files.
In that case, figure out its final location in the build cache.
This requires hard-coding a bit of knowledge about how GOCACHE works,
but it seems relatively harmless given how it's very little code.
Plus, if GOCACHE ever changes, it will be obvious when our code breaks.
Finally, add a TODO about potentially saving even more work.
4 years ago
|
|
|
for _, pair := range importmaps {
|
|
|
|
beforePath, afterPath := pair[0], pair[1]
|
deprecate using GOPRIVATE in favor of GOGARBLE (#427)
Piggybacking off of GOPRIVATE is great for a number of reasons:
* People tend to obfuscate private code, whose package paths will
generally be in GOPRIVATE already
* Its meaning and syntax are well understood
* It allows all the flexibility we need without adding our own env var
or config option
However, using GOPRIVATE directly has one main drawback.
It's fairly common to also want to obfuscate public dependencies,
to make the code in private packages even harder to follow.
However, using "GOPRIVATE=*" will result in two main downsides:
* GONOPROXY defaults to GOPRIVATE, so the proxy would be entirely disabled.
Downloading modules, such as when adding or updating dependencies,
or when the local cache is cold, can be less reliable.
* GONOSUMDB defaults to GOPRIVATE, so the sumdb would be entirely disabled.
Adding entries to go.sum, such as when adding or updating dependencies,
can be less secure.
We will continue to consume GOPRIVATE as a fallback,
but we now expect users to set GOGARBLE instead.
The new logic is documented in the README.
While here, rewrite some uses of "private" with "to obfuscate",
to make the code easier to follow and harder to misunderstand.
Fixes #276.
3 years ago
|
|
|
if toObfuscate(afterPath) {
|
|
|
|
lpkg, err := listPackage(beforePath)
|
start using original action IDs (#251)
When we obfuscate a name, what we do is hash the name with the action ID
of the package that contains the name. To ensure that the hash changes
if the garble tool changes, we used the action ID of the obfuscated
build, which is different than the original action ID, as we include
garble's own content ID in "go tool compile -V=full" via -toolexec.
Let's call that the "obfuscated action ID". Remember that a content ID
is roughly the hash of a binary or object file, and an action ID
contains the hash of a package's source code plus the content IDs of its
dependencies.
This had the advantage that it did what we wanted. However, it had one
massive drawback: when we compile a package, we only have the obfuscated
action IDs of its dependencies. This is because one can't have the
content ID of dependent packages before they are built.
Usually, this is not a problem, because hashing a foreign name means it
comes from a dependency, where we already have the obfuscated action ID.
However, that's not always the case.
First, go:linkname directives can point to any symbol that ends up in
the binary, even if the package is not a dependency. So garble could
only support linkname targets belonging to dependencies. This is at the
root of why we could not obfuscate the runtime; it contains linkname
directives targeting the net package, for example, which depends on runtime.
Second, some other places did not have an easy access to obfuscated
action IDs, like transformAsm, which had to recover it from a temporary
file stored by transformCompile.
Plus, this was all pretty expensive, as each toolexec sub-process had to
make repeated calls to buildidOf with the object files of dependencies.
We even had to use extra calls to "go list" in the case of indirect
dependencies, as their export files do not appear in importcfg files.
All in all, the old method was complex and expensive. A better mechanism
is to use the original action IDs directly, as listed by "go list"
without garble in the picture.
This would mean that the hashing does not change if garble changes,
meaning weaker obfuscation. To regain that property, we define the
"garble action ID", which is just the original action ID hashed together
with garble's own content ID.
This is practically the same as the obfuscated build ID we used before,
but since it doesn't go through "go tool compile -V=full" and the
obfuscated build itself, we can work out *all* the garble action IDs
upfront, before the obfuscated build even starts.
This fixes all of our problems. Now we know all garble build IDs
upfront, so a bunch of hacks can be entirely removed. Plus, since we
know them upfront, we can also cache them and avoid repeated calls to
"go tool buildid".
While at it, make use of the new BuildID field in Go 1.16's "list -json
-export". This avoids the vast majority of "go tool buildid" calls, as
the only ones that remain are 2 on the garble binary itself.
The numbers for Go 1.16 look very good:
name old time/op new time/op delta
Build-8 146ms ± 4% 101ms ± 1% -31.01% (p=0.002 n=6+6)
name old bin-B new bin-B delta
Build-8 6.61M ± 0% 6.60M ± 0% -0.09% (p=0.002 n=6+6)
name old sys-time/op new sys-time/op delta
Build-8 321ms ± 7% 202ms ± 6% -37.11% (p=0.002 n=6+6)
name old user-time/op new user-time/op delta
Build-8 538ms ± 4% 414ms ± 4% -23.12% (p=0.002 n=6+6)
4 years ago
|
|
|
if err != nil {
|
|
|
|
panic(err) // shouldn't happen
|
|
|
|
}
|
|
|
|
|
|
|
|
// Note that beforePath is not the canonical path.
|
|
|
|
// For beforePath="vendor/foo", afterPath and
|
|
|
|
// lpkg.ImportPath can be just "foo".
|
|
|
|
// Don't use obfuscatedImportPath here.
|
|
|
|
beforePath = hashWith(lpkg.GarbleActionID, beforePath)
|
|
|
|
|
|
|
|
afterPath = lpkg.obfuscatedImportPath()
|
reimplement import path obfuscation without goobj2 (#242)
We used to rely on a parallel implementation of an object file parser
and writer to be able to obfuscate import paths. After compiling each
package, we would parse the object file, replace the import paths, and
write the updated object file in-place.
That worked well, in most cases. Unfortunately, it had some flaws:
* Complexity. Even when most of the code is maintained in a separate
module, the import_obfuscation.go file was still close to a thousand
lines of code.
* Go compatibility. The object file format changes between Go releases,
so we were supporting Go 1.15, but not 1.16. Fixing the object file
package to work with 1.16 would probably break 1.15 support.
* Bugs. For example, we recently had to add a workaround for #224, since
import paths containing dots after the domain would end up escaped.
Another example is #190, which seems to be caused by the object file
parser or writer corrupting the compiled code and causing segfaults in
some rare edge cases.
Instead, let's drop that method entirely, and force the compiler and
linker to do the work for us. The steps necessary when compiling a
package to obfuscate are:
1) Replace its "package foo" lines with the obfuscated package path. No
need to separate the package path and name, since the obfuscated path
does not contain slashes.
2) Replace the "-p pkg/foo" flag with the obfuscated path.
3) Replace the "import" spec lines with the obfuscated package paths,
for those dependencies which were obfuscated.
4) Replace the "-importcfg [...]" file with a version that uses the
obfuscated paths instead.
The linker also needs that last step, since it also uses an importcfg
file to find object files.
There are three noteworthy drawbacks to this new method:
1) Since we no longer write object files, we can't use them to store
data to be cached. As such, the -debugdir flag goes back to using the
"-a" build flag to always rebuild all packages. On the plus side,
that caching didn't work very well; see #176.
2) The package name "main" remains in all declarations under it, not
just "func main", since we can only rename entire packages. This
seems fine, as it gives little information to the end user.
3) The -tiny mode no longer sets all lines to 0, since it did that by
modifying object files. As a temporary measure, we instead set all
top-level declarations to be on line 1. A TODO is added to hopefully
improve this again in the near future.
The upside is that we get rid of all the issues mentioned before. Plus,
garble now nearly works with Go 1.16, with the exception of two very
minor bugs that look fixable. A follow-up PR will take care of that and
start testing on 1.16.
Fixes #176.
Fixes #190.
4 years ago
|
|
|
}
|
|
|
|
fmt.Fprintf(newCfg, "importmap %s=%s\n", beforePath, afterPath)
|
|
|
|
}
|
stop relying on nested "go list -toolexec" calls (#422)
We rely on importcfg files to load type info for obfuscated packages.
We use this type information to remember what names we didn't obfuscate.
Unfortunately, indirect dependencies aren't listed in importcfg files,
so we relied on extra "go list -toolexec" calls to locate object files.
This worked fine, but added a significant amount of complexity.
The extra "go list -export -toolexec=garble" invocations weren't slow,
as they avoided rebuilding or re-obfuscating thanks to the build cache.
Still, it was hard to reason about how garble runs during a build
if we might have multiple layers of -toolexec invocations.
Instead, record the export files we encounter in an incremental map,
and persist it in the build cache via the gob file we're already using.
This way, each garble invocation knows where all object files are,
even those for indirect imports.
One wrinkle is that importcfg files can point to temporary object files.
In that case, figure out its final location in the build cache.
This requires hard-coding a bit of knowledge about how GOCACHE works,
but it seems relatively harmless given how it's very little code.
Plus, if GOCACHE ever changes, it will be obvious when our code breaks.
Finally, add a TODO about potentially saving even more work.
4 years ago
|
|
|
for _, pair := range packagefiles {
|
|
|
|
impPath, pkgfile := pair[0], pair[1]
|
deprecate using GOPRIVATE in favor of GOGARBLE (#427)
Piggybacking off of GOPRIVATE is great for a number of reasons:
* People tend to obfuscate private code, whose package paths will
generally be in GOPRIVATE already
* Its meaning and syntax are well understood
* It allows all the flexibility we need without adding our own env var
or config option
However, using GOPRIVATE directly has one main drawback.
It's fairly common to also want to obfuscate public dependencies,
to make the code in private packages even harder to follow.
However, using "GOPRIVATE=*" will result in two main downsides:
* GONOPROXY defaults to GOPRIVATE, so the proxy would be entirely disabled.
Downloading modules, such as when adding or updating dependencies,
or when the local cache is cold, can be less reliable.
* GONOSUMDB defaults to GOPRIVATE, so the sumdb would be entirely disabled.
Adding entries to go.sum, such as when adding or updating dependencies,
can be less secure.
We will continue to consume GOPRIVATE as a fallback,
but we now expect users to set GOGARBLE instead.
The new logic is documented in the README.
While here, rewrite some uses of "private" with "to obfuscate",
to make the code easier to follow and harder to misunderstand.
Fixes #276.
3 years ago
|
|
|
if toObfuscate(impPath) {
|
|
|
|
lpkg, err := listPackage(impPath)
|
start using original action IDs (#251)
When we obfuscate a name, what we do is hash the name with the action ID
of the package that contains the name. To ensure that the hash changes
if the garble tool changes, we used the action ID of the obfuscated
build, which is different than the original action ID, as we include
garble's own content ID in "go tool compile -V=full" via -toolexec.
Let's call that the "obfuscated action ID". Remember that a content ID
is roughly the hash of a binary or object file, and an action ID
contains the hash of a package's source code plus the content IDs of its
dependencies.
This had the advantage that it did what we wanted. However, it had one
massive drawback: when we compile a package, we only have the obfuscated
action IDs of its dependencies. This is because one can't have the
content ID of dependent packages before they are built.
Usually, this is not a problem, because hashing a foreign name means it
comes from a dependency, where we already have the obfuscated action ID.
However, that's not always the case.
First, go:linkname directives can point to any symbol that ends up in
the binary, even if the package is not a dependency. So garble could
only support linkname targets belonging to dependencies. This is at the
root of why we could not obfuscate the runtime; it contains linkname
directives targeting the net package, for example, which depends on runtime.
Second, some other places did not have an easy access to obfuscated
action IDs, like transformAsm, which had to recover it from a temporary
file stored by transformCompile.
Plus, this was all pretty expensive, as each toolexec sub-process had to
make repeated calls to buildidOf with the object files of dependencies.
We even had to use extra calls to "go list" in the case of indirect
dependencies, as their export files do not appear in importcfg files.
All in all, the old method was complex and expensive. A better mechanism
is to use the original action IDs directly, as listed by "go list"
without garble in the picture.
This would mean that the hashing does not change if garble changes,
meaning weaker obfuscation. To regain that property, we define the
"garble action ID", which is just the original action ID hashed together
with garble's own content ID.
This is practically the same as the obfuscated build ID we used before,
but since it doesn't go through "go tool compile -V=full" and the
obfuscated build itself, we can work out *all* the garble action IDs
upfront, before the obfuscated build even starts.
This fixes all of our problems. Now we know all garble build IDs
upfront, so a bunch of hacks can be entirely removed. Plus, since we
know them upfront, we can also cache them and avoid repeated calls to
"go tool buildid".
While at it, make use of the new BuildID field in Go 1.16's "list -json
-export". This avoids the vast majority of "go tool buildid" calls, as
the only ones that remain are 2 on the garble binary itself.
The numbers for Go 1.16 look very good:
name old time/op new time/op delta
Build-8 146ms ± 4% 101ms ± 1% -31.01% (p=0.002 n=6+6)
name old bin-B new bin-B delta
Build-8 6.61M ± 0% 6.60M ± 0% -0.09% (p=0.002 n=6+6)
name old sys-time/op new sys-time/op delta
Build-8 321ms ± 7% 202ms ± 6% -37.11% (p=0.002 n=6+6)
name old user-time/op new user-time/op delta
Build-8 538ms ± 4% 414ms ± 4% -23.12% (p=0.002 n=6+6)
4 years ago
|
|
|
if err != nil {
|
|
|
|
panic(err) // shouldn't happen
|
|
|
|
}
|
|
|
|
impPath = lpkg.obfuscatedImportPath()
|
reimplement import path obfuscation without goobj2 (#242)
We used to rely on a parallel implementation of an object file parser
and writer to be able to obfuscate import paths. After compiling each
package, we would parse the object file, replace the import paths, and
write the updated object file in-place.
That worked well, in most cases. Unfortunately, it had some flaws:
* Complexity. Even when most of the code is maintained in a separate
module, the import_obfuscation.go file was still close to a thousand
lines of code.
* Go compatibility. The object file format changes between Go releases,
so we were supporting Go 1.15, but not 1.16. Fixing the object file
package to work with 1.16 would probably break 1.15 support.
* Bugs. For example, we recently had to add a workaround for #224, since
import paths containing dots after the domain would end up escaped.
Another example is #190, which seems to be caused by the object file
parser or writer corrupting the compiled code and causing segfaults in
some rare edge cases.
Instead, let's drop that method entirely, and force the compiler and
linker to do the work for us. The steps necessary when compiling a
package to obfuscate are:
1) Replace its "package foo" lines with the obfuscated package path. No
need to separate the package path and name, since the obfuscated path
does not contain slashes.
2) Replace the "-p pkg/foo" flag with the obfuscated path.
3) Replace the "import" spec lines with the obfuscated package paths,
for those dependencies which were obfuscated.
4) Replace the "-importcfg [...]" file with a version that uses the
obfuscated paths instead.
The linker also needs that last step, since it also uses an importcfg
file to find object files.
There are three noteworthy drawbacks to this new method:
1) Since we no longer write object files, we can't use them to store
data to be cached. As such, the -debugdir flag goes back to using the
"-a" build flag to always rebuild all packages. On the plus side,
that caching didn't work very well; see #176.
2) The package name "main" remains in all declarations under it, not
just "func main", since we can only rename entire packages. This
seems fine, as it gives little information to the end user.
3) The -tiny mode no longer sets all lines to 0, since it did that by
modifying object files. As a temporary measure, we instead set all
top-level declarations to be on line 1. A TODO is added to hopefully
improve this again in the near future.
The upside is that we get rid of all the issues mentioned before. Plus,
garble now nearly works with Go 1.16, with the exception of two very
minor bugs that look fixable. A follow-up PR will take care of that and
start testing on 1.16.
Fixes #176.
Fixes #190.
4 years ago
|
|
|
}
|
stop relying on nested "go list -toolexec" calls (#422)
We rely on importcfg files to load type info for obfuscated packages.
We use this type information to remember what names we didn't obfuscate.
Unfortunately, indirect dependencies aren't listed in importcfg files,
so we relied on extra "go list -toolexec" calls to locate object files.
This worked fine, but added a significant amount of complexity.
The extra "go list -export -toolexec=garble" invocations weren't slow,
as they avoided rebuilding or re-obfuscating thanks to the build cache.
Still, it was hard to reason about how garble runs during a build
if we might have multiple layers of -toolexec invocations.
Instead, record the export files we encounter in an incremental map,
and persist it in the build cache via the gob file we're already using.
This way, each garble invocation knows where all object files are,
even those for indirect imports.
One wrinkle is that importcfg files can point to temporary object files.
In that case, figure out its final location in the build cache.
This requires hard-coding a bit of knowledge about how GOCACHE works,
but it seems relatively harmless given how it's very little code.
Plus, if GOCACHE ever changes, it will be obvious when our code breaks.
Finally, add a TODO about potentially saving even more work.
4 years ago
|
|
|
fmt.Fprintf(newCfg, "packagefile %s=%s\n", impPath, pkgfile)
|
reimplement import path obfuscation without goobj2 (#242)
We used to rely on a parallel implementation of an object file parser
and writer to be able to obfuscate import paths. After compiling each
package, we would parse the object file, replace the import paths, and
write the updated object file in-place.
That worked well, in most cases. Unfortunately, it had some flaws:
* Complexity. Even when most of the code is maintained in a separate
module, the import_obfuscation.go file was still close to a thousand
lines of code.
* Go compatibility. The object file format changes between Go releases,
so we were supporting Go 1.15, but not 1.16. Fixing the object file
package to work with 1.16 would probably break 1.15 support.
* Bugs. For example, we recently had to add a workaround for #224, since
import paths containing dots after the domain would end up escaped.
Another example is #190, which seems to be caused by the object file
parser or writer corrupting the compiled code and causing segfaults in
some rare edge cases.
Instead, let's drop that method entirely, and force the compiler and
linker to do the work for us. The steps necessary when compiling a
package to obfuscate are:
1) Replace its "package foo" lines with the obfuscated package path. No
need to separate the package path and name, since the obfuscated path
does not contain slashes.
2) Replace the "-p pkg/foo" flag with the obfuscated path.
3) Replace the "import" spec lines with the obfuscated package paths,
for those dependencies which were obfuscated.
4) Replace the "-importcfg [...]" file with a version that uses the
obfuscated paths instead.
The linker also needs that last step, since it also uses an importcfg
file to find object files.
There are three noteworthy drawbacks to this new method:
1) Since we no longer write object files, we can't use them to store
data to be cached. As such, the -debugdir flag goes back to using the
"-a" build flag to always rebuild all packages. On the plus side,
that caching didn't work very well; see #176.
2) The package name "main" remains in all declarations under it, not
just "func main", since we can only rename entire packages. This
seems fine, as it gives little information to the end user.
3) The -tiny mode no longer sets all lines to 0, since it did that by
modifying object files. As a temporary measure, we instead set all
top-level declarations to be on line 1. A TODO is added to hopefully
improve this again in the near future.
The upside is that we get rid of all the issues mentioned before. Plus,
garble now nearly works with Go 1.16, with the exception of two very
minor bugs that look fixable. A follow-up PR will take care of that and
start testing on 1.16.
Fixes #176.
Fixes #190.
4 years ago
|
|
|
}
|
fix and re-enable "garble test" (#268)
With the many refactors building up to v0.1.0, we broke "garble test" as
we no longer dealt with test packages well.
Luckily, now that we can depend on TOOLEXEC_IMPORTPATH, we can support
the test command again, as we can always figure out what package we're
currently compiling, without having to track a "main" package.
Note that one major pitfall there is test packages, where
TOOLEXEC_IMPORTPATH does not agree with ImportPath from "go list -json".
However, we can still work around that with a bit of glue code, which is
also copiously documented.
The second change necessary is to consider test packages private
depending on whether their non-test package is private or not. This can
be done via the ForTest field in "go list -json".
The third change is to obfuscate "_testmain.go" files, which are the
code-generated main functions which actually run tests. We used to not
need to obfuscate them, since test function names are never obfuscated
and we used to not obfuscate import paths at compilation time. Now we do
rewrite import paths, so we must do that for "_testmain.go" too.
The fourth change is to re-enable test.txt, and expand it with more
sanity checks and edge cases.
Finally, document "garble test" again.
Fixes #241.
4 years ago
|
|
|
|
|
|
|
// Uncomment to debug the transformed importcfg. Do not delete.
|
|
|
|
// newCfg.Seek(0, 0)
|
|
|
|
// io.Copy(os.Stderr, newCfg)
|
|
|
|
|
reimplement import path obfuscation without goobj2 (#242)
We used to rely on a parallel implementation of an object file parser
and writer to be able to obfuscate import paths. After compiling each
package, we would parse the object file, replace the import paths, and
write the updated object file in-place.
That worked well, in most cases. Unfortunately, it had some flaws:
* Complexity. Even when most of the code is maintained in a separate
module, the import_obfuscation.go file was still close to a thousand
lines of code.
* Go compatibility. The object file format changes between Go releases,
so we were supporting Go 1.15, but not 1.16. Fixing the object file
package to work with 1.16 would probably break 1.15 support.
* Bugs. For example, we recently had to add a workaround for #224, since
import paths containing dots after the domain would end up escaped.
Another example is #190, which seems to be caused by the object file
parser or writer corrupting the compiled code and causing segfaults in
some rare edge cases.
Instead, let's drop that method entirely, and force the compiler and
linker to do the work for us. The steps necessary when compiling a
package to obfuscate are:
1) Replace its "package foo" lines with the obfuscated package path. No
need to separate the package path and name, since the obfuscated path
does not contain slashes.
2) Replace the "-p pkg/foo" flag with the obfuscated path.
3) Replace the "import" spec lines with the obfuscated package paths,
for those dependencies which were obfuscated.
4) Replace the "-importcfg [...]" file with a version that uses the
obfuscated paths instead.
The linker also needs that last step, since it also uses an importcfg
file to find object files.
There are three noteworthy drawbacks to this new method:
1) Since we no longer write object files, we can't use them to store
data to be cached. As such, the -debugdir flag goes back to using the
"-a" build flag to always rebuild all packages. On the plus side,
that caching didn't work very well; see #176.
2) The package name "main" remains in all declarations under it, not
just "func main", since we can only rename entire packages. This
seems fine, as it gives little information to the end user.
3) The -tiny mode no longer sets all lines to 0, since it did that by
modifying object files. As a temporary measure, we instead set all
top-level declarations to be on line 1. A TODO is added to hopefully
improve this again in the near future.
The upside is that we get rid of all the issues mentioned before. Plus,
garble now nearly works with Go 1.16, with the exception of two very
minor bugs that look fixable. A follow-up PR will take care of that and
start testing on 1.16.
Fixes #176.
Fixes #190.
4 years ago
|
|
|
if err := newCfg.Close(); err != nil {
|
|
|
|
return "", err
|
|
|
|
}
|
|
|
|
return newCfg.Name(), nil
|
|
|
|
}
|
|
|
|
|
|
|
|
type (
|
|
|
|
funcFullName = string
|
|
|
|
reflectParameterPosition = int
|
|
|
|
)
|
detect more std API calls which use reflection
Before, we would just notice direct calls to reflect's TypeOf and
ValueOf. Any other uses of reflection, such as encoding/json or
google.golang.org/protobuf, would require hints as documented by the
README.
Issue #162 outlines some ways we could fix this issue in a general way,
automatically detecting what functions use reflection on their parameters,
even for third party API funcs.
However, that goal is pretty significant in terms of code and effort.
As a temporary improvement, we can expand the list of "known" reflection
APIs via a static table.
Since this table is keyed by "func full name" strings, we could
potentially include third party APIs, such as:
google.golang.org/protobuf/proto.Marshal
However, for now simply include all the std APIs we know about.
If we fail to do the proper fix for automatic detection in the future,
we can then fall back to expanding this global table for third parties.
Update the README's docs, to clarify that the hint is not always
necessary anymore.
Also update the reflect.txt test to stop using the hint for encoding/json,
and to also start testing text/template with a method call.
While at it, I noticed that we weren't testing the println outputs,
as they'd go to stderr - fix that too.
Updates #162.
4 years ago
|
|
|
|
stop relying on nested "go list -toolexec" calls (#422)
We rely on importcfg files to load type info for obfuscated packages.
We use this type information to remember what names we didn't obfuscate.
Unfortunately, indirect dependencies aren't listed in importcfg files,
so we relied on extra "go list -toolexec" calls to locate object files.
This worked fine, but added a significant amount of complexity.
The extra "go list -export -toolexec=garble" invocations weren't slow,
as they avoided rebuilding or re-obfuscating thanks to the build cache.
Still, it was hard to reason about how garble runs during a build
if we might have multiple layers of -toolexec invocations.
Instead, record the export files we encounter in an incremental map,
and persist it in the build cache via the gob file we're already using.
This way, each garble invocation knows where all object files are,
even those for indirect imports.
One wrinkle is that importcfg files can point to temporary object files.
In that case, figure out its final location in the build cache.
This requires hard-coding a bit of knowledge about how GOCACHE works,
but it seems relatively harmless given how it's very little code.
Plus, if GOCACHE ever changes, it will be obvious when our code breaks.
Finally, add a TODO about potentially saving even more work.
4 years ago
|
|
|
// cachedOutput contains information that will be stored as per garbleExportFile.
|
|
|
|
var cachedOutput = struct {
|
|
|
|
// KnownObjectFiles is filled from -importcfg in the current obfuscated build.
|
|
|
|
// As such, it records export data for the dependencies which might be
|
deprecate using GOPRIVATE in favor of GOGARBLE (#427)
Piggybacking off of GOPRIVATE is great for a number of reasons:
* People tend to obfuscate private code, whose package paths will
generally be in GOPRIVATE already
* Its meaning and syntax are well understood
* It allows all the flexibility we need without adding our own env var
or config option
However, using GOPRIVATE directly has one main drawback.
It's fairly common to also want to obfuscate public dependencies,
to make the code in private packages even harder to follow.
However, using "GOPRIVATE=*" will result in two main downsides:
* GONOPROXY defaults to GOPRIVATE, so the proxy would be entirely disabled.
Downloading modules, such as when adding or updating dependencies,
or when the local cache is cold, can be less reliable.
* GONOSUMDB defaults to GOPRIVATE, so the sumdb would be entirely disabled.
Adding entries to go.sum, such as when adding or updating dependencies,
can be less secure.
We will continue to consume GOPRIVATE as a fallback,
but we now expect users to set GOGARBLE instead.
The new logic is documented in the README.
While here, rewrite some uses of "private" with "to obfuscate",
to make the code easier to follow and harder to misunderstand.
Fixes #276.
3 years ago
|
|
|
// themselves obfuscated, depending on GOGARBLE.
|
stop relying on nested "go list -toolexec" calls (#422)
We rely on importcfg files to load type info for obfuscated packages.
We use this type information to remember what names we didn't obfuscate.
Unfortunately, indirect dependencies aren't listed in importcfg files,
so we relied on extra "go list -toolexec" calls to locate object files.
This worked fine, but added a significant amount of complexity.
The extra "go list -export -toolexec=garble" invocations weren't slow,
as they avoided rebuilding or re-obfuscating thanks to the build cache.
Still, it was hard to reason about how garble runs during a build
if we might have multiple layers of -toolexec invocations.
Instead, record the export files we encounter in an incremental map,
and persist it in the build cache via the gob file we're already using.
This way, each garble invocation knows where all object files are,
even those for indirect imports.
One wrinkle is that importcfg files can point to temporary object files.
In that case, figure out its final location in the build cache.
This requires hard-coding a bit of knowledge about how GOCACHE works,
but it seems relatively harmless given how it's very little code.
Plus, if GOCACHE ever changes, it will be obvious when our code breaks.
Finally, add a TODO about potentially saving even more work.
4 years ago
|
|
|
//
|
|
|
|
// TODO: We rely on obfuscated type information to know what names we didn't
|
|
|
|
// obfuscate. Instead, directly record what names we chose not to obfuscate,
|
|
|
|
// which should then avoid having to go through go/types.
|
|
|
|
KnownObjectFiles map[string]string
|
|
|
|
|
|
|
|
// KnownReflectAPIs is a static record of what std APIs use reflection on their
|
|
|
|
// parameters, so we can avoid obfuscating types used with them.
|
|
|
|
//
|
|
|
|
// TODO: we're not including fmt.Printf, as it would have many false positives,
|
|
|
|
// unless we were smart enough to detect which arguments get used as %#v or %T.
|
|
|
|
KnownReflectAPIs map[funcFullName][]reflectParameterPosition
|
|
|
|
}{
|
|
|
|
KnownObjectFiles: map[string]string{},
|
|
|
|
KnownReflectAPIs: map[funcFullName][]reflectParameterPosition{
|
|
|
|
"reflect.TypeOf": {0},
|
|
|
|
"reflect.ValueOf": {0},
|
|
|
|
},
|
|
|
|
}
|
|
|
|
|
|
|
|
// garbleExportFile returns an absolute path to a build cache entry
|
|
|
|
// which belongs to garble and corresponds to the given Go package.
|
|
|
|
//
|
|
|
|
// Unlike pkg.Export, it is only read and written by garble itself.
|
|
|
|
// Also unlike pkg.Export, it includes GarbleActionID,
|
|
|
|
// so its path will change if the obfuscated build changes.
|
|
|
|
//
|
|
|
|
// The purpose of such a file is to store garble-specific information
|
|
|
|
// in the build cache, to be reused at a later time.
|
|
|
|
// The file should have the same lifetime as pkg.Export,
|
|
|
|
// as it lives under the same cache directory that gets trimmed automatically.
|
|
|
|
func garbleExportFile(pkg *listedPackage) string {
|
|
|
|
trimmed := strings.TrimSuffix(pkg.Export, "-d")
|
|
|
|
if trimmed == pkg.Export {
|
|
|
|
panic(fmt.Sprintf("unexpected export path of %s: %q", pkg.ImportPath, pkg.Export))
|
|
|
|
}
|
|
|
|
return trimmed + "-garble-" + hashToString(pkg.GarbleActionID) + "-d"
|
|
|
|
}
|
|
|
|
|
stop relying on nested "go list -toolexec" calls (#422)
We rely on importcfg files to load type info for obfuscated packages.
We use this type information to remember what names we didn't obfuscate.
Unfortunately, indirect dependencies aren't listed in importcfg files,
so we relied on extra "go list -toolexec" calls to locate object files.
This worked fine, but added a significant amount of complexity.
The extra "go list -export -toolexec=garble" invocations weren't slow,
as they avoided rebuilding or re-obfuscating thanks to the build cache.
Still, it was hard to reason about how garble runs during a build
if we might have multiple layers of -toolexec invocations.
Instead, record the export files we encounter in an incremental map,
and persist it in the build cache via the gob file we're already using.
This way, each garble invocation knows where all object files are,
even those for indirect imports.
One wrinkle is that importcfg files can point to temporary object files.
In that case, figure out its final location in the build cache.
This requires hard-coding a bit of knowledge about how GOCACHE works,
but it seems relatively harmless given how it's very little code.
Plus, if GOCACHE ever changes, it will be obvious when our code breaks.
Finally, add a TODO about potentially saving even more work.
4 years ago
|
|
|
func loadCachedOutputs() error {
|
|
|
|
for _, path := range curPkg.Deps {
|
|
|
|
pkg, err := listPackage(path)
|
|
|
|
if err != nil {
|
stop relying on nested "go list -toolexec" calls (#422)
We rely on importcfg files to load type info for obfuscated packages.
We use this type information to remember what names we didn't obfuscate.
Unfortunately, indirect dependencies aren't listed in importcfg files,
so we relied on extra "go list -toolexec" calls to locate object files.
This worked fine, but added a significant amount of complexity.
The extra "go list -export -toolexec=garble" invocations weren't slow,
as they avoided rebuilding or re-obfuscating thanks to the build cache.
Still, it was hard to reason about how garble runs during a build
if we might have multiple layers of -toolexec invocations.
Instead, record the export files we encounter in an incremental map,
and persist it in the build cache via the gob file we're already using.
This way, each garble invocation knows where all object files are,
even those for indirect imports.
One wrinkle is that importcfg files can point to temporary object files.
In that case, figure out its final location in the build cache.
This requires hard-coding a bit of knowledge about how GOCACHE works,
but it seems relatively harmless given how it's very little code.
Plus, if GOCACHE ever changes, it will be obvious when our code breaks.
Finally, add a TODO about potentially saving even more work.
4 years ago
|
|
|
panic(err) // shouldn't happen
|
|
|
|
}
|
|
|
|
if pkg.Export == "" {
|
|
|
|
continue // nothing to load
|
|
|
|
}
|
|
|
|
// this function literal is used for the deferred close
|
|
|
|
err = func() error {
|
|
|
|
filename := garbleExportFile(pkg)
|
|
|
|
f, err := os.Open(filename)
|
|
|
|
if err != nil {
|
|
|
|
return err
|
|
|
|
}
|
|
|
|
defer f.Close()
|
|
|
|
|
stop relying on nested "go list -toolexec" calls (#422)
We rely on importcfg files to load type info for obfuscated packages.
We use this type information to remember what names we didn't obfuscate.
Unfortunately, indirect dependencies aren't listed in importcfg files,
so we relied on extra "go list -toolexec" calls to locate object files.
This worked fine, but added a significant amount of complexity.
The extra "go list -export -toolexec=garble" invocations weren't slow,
as they avoided rebuilding or re-obfuscating thanks to the build cache.
Still, it was hard to reason about how garble runs during a build
if we might have multiple layers of -toolexec invocations.
Instead, record the export files we encounter in an incremental map,
and persist it in the build cache via the gob file we're already using.
This way, each garble invocation knows where all object files are,
even those for indirect imports.
One wrinkle is that importcfg files can point to temporary object files.
In that case, figure out its final location in the build cache.
This requires hard-coding a bit of knowledge about how GOCACHE works,
but it seems relatively harmless given how it's very little code.
Plus, if GOCACHE ever changes, it will be obvious when our code breaks.
Finally, add a TODO about potentially saving even more work.
4 years ago
|
|
|
// Decode appends new entries to the existing maps
|
|
|
|
if err := gob.NewDecoder(f).Decode(&cachedOutput); err != nil {
|
|
|
|
return fmt.Errorf("gob decode: %w", err)
|
|
|
|
}
|
|
|
|
return nil
|
|
|
|
}()
|
|
|
|
if err != nil {
|
|
|
|
return err
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
return nil
|
|
|
|
}
|
|
|
|
|
|
|
|
func (tf *transformer) findReflectFunctions(files []*ast.File) {
|
|
|
|
visitReflect := func(node ast.Node) {
|
|
|
|
funcDecl, ok := node.(*ast.FuncDecl)
|
|
|
|
if !ok {
|
|
|
|
return
|
|
|
|
}
|
|
|
|
|
|
|
|
funcObj := tf.info.ObjectOf(funcDecl.Name).(*types.Func)
|
|
|
|
|
|
|
|
var paramNames []string
|
|
|
|
for _, param := range funcDecl.Type.Params.List {
|
|
|
|
for _, name := range param.Names {
|
|
|
|
paramNames = append(paramNames, name.Name)
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
ast.Inspect(funcDecl, func(node ast.Node) bool {
|
|
|
|
call, ok := node.(*ast.CallExpr)
|
|
|
|
if !ok {
|
|
|
|
return true
|
|
|
|
}
|
|
|
|
sel, ok := call.Fun.(*ast.SelectorExpr)
|
|
|
|
if !ok {
|
|
|
|
return true
|
|
|
|
}
|
|
|
|
fnType, _ := tf.info.ObjectOf(sel.Sel).(*types.Func)
|
|
|
|
if fnType == nil || fnType.Pkg() == nil {
|
|
|
|
return true
|
|
|
|
}
|
|
|
|
|
|
|
|
fullName := fnType.FullName()
|
|
|
|
var identifiers []string
|
stop relying on nested "go list -toolexec" calls (#422)
We rely on importcfg files to load type info for obfuscated packages.
We use this type information to remember what names we didn't obfuscate.
Unfortunately, indirect dependencies aren't listed in importcfg files,
so we relied on extra "go list -toolexec" calls to locate object files.
This worked fine, but added a significant amount of complexity.
The extra "go list -export -toolexec=garble" invocations weren't slow,
as they avoided rebuilding or re-obfuscating thanks to the build cache.
Still, it was hard to reason about how garble runs during a build
if we might have multiple layers of -toolexec invocations.
Instead, record the export files we encounter in an incremental map,
and persist it in the build cache via the gob file we're already using.
This way, each garble invocation knows where all object files are,
even those for indirect imports.
One wrinkle is that importcfg files can point to temporary object files.
In that case, figure out its final location in the build cache.
This requires hard-coding a bit of knowledge about how GOCACHE works,
but it seems relatively harmless given how it's very little code.
Plus, if GOCACHE ever changes, it will be obvious when our code breaks.
Finally, add a TODO about potentially saving even more work.
4 years ago
|
|
|
for _, argPos := range cachedOutput.KnownReflectAPIs[fullName] {
|
|
|
|
arg := call.Args[argPos]
|
|
|
|
|
|
|
|
ident, ok := arg.(*ast.Ident)
|
|
|
|
if !ok {
|
|
|
|
continue
|
|
|
|
}
|
|
|
|
|
|
|
|
obj := tf.info.ObjectOf(ident)
|
|
|
|
if obj.Parent() == funcObj.Scope() {
|
|
|
|
identifiers = append(identifiers, ident.Name)
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
if identifiers == nil {
|
|
|
|
return true
|
|
|
|
}
|
|
|
|
|
|
|
|
var argumentPosReflect []int
|
|
|
|
for _, ident := range identifiers {
|
|
|
|
for paramPos, paramName := range paramNames {
|
|
|
|
if ident == paramName {
|
|
|
|
argumentPosReflect = append(argumentPosReflect, paramPos)
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
stop relying on nested "go list -toolexec" calls (#422)
We rely on importcfg files to load type info for obfuscated packages.
We use this type information to remember what names we didn't obfuscate.
Unfortunately, indirect dependencies aren't listed in importcfg files,
so we relied on extra "go list -toolexec" calls to locate object files.
This worked fine, but added a significant amount of complexity.
The extra "go list -export -toolexec=garble" invocations weren't slow,
as they avoided rebuilding or re-obfuscating thanks to the build cache.
Still, it was hard to reason about how garble runs during a build
if we might have multiple layers of -toolexec invocations.
Instead, record the export files we encounter in an incremental map,
and persist it in the build cache via the gob file we're already using.
This way, each garble invocation knows where all object files are,
even those for indirect imports.
One wrinkle is that importcfg files can point to temporary object files.
In that case, figure out its final location in the build cache.
This requires hard-coding a bit of knowledge about how GOCACHE works,
but it seems relatively harmless given how it's very little code.
Plus, if GOCACHE ever changes, it will be obvious when our code breaks.
Finally, add a TODO about potentially saving even more work.
4 years ago
|
|
|
cachedOutput.KnownReflectAPIs[funcObj.FullName()] = argumentPosReflect
|
|
|
|
|
|
|
|
return true
|
|
|
|
})
|
|
|
|
}
|
|
|
|
|
stop relying on nested "go list -toolexec" calls (#422)
We rely on importcfg files to load type info for obfuscated packages.
We use this type information to remember what names we didn't obfuscate.
Unfortunately, indirect dependencies aren't listed in importcfg files,
so we relied on extra "go list -toolexec" calls to locate object files.
This worked fine, but added a significant amount of complexity.
The extra "go list -export -toolexec=garble" invocations weren't slow,
as they avoided rebuilding or re-obfuscating thanks to the build cache.
Still, it was hard to reason about how garble runs during a build
if we might have multiple layers of -toolexec invocations.
Instead, record the export files we encounter in an incremental map,
and persist it in the build cache via the gob file we're already using.
This way, each garble invocation knows where all object files are,
even those for indirect imports.
One wrinkle is that importcfg files can point to temporary object files.
In that case, figure out its final location in the build cache.
This requires hard-coding a bit of knowledge about how GOCACHE works,
but it seems relatively harmless given how it's very little code.
Plus, if GOCACHE ever changes, it will be obvious when our code breaks.
Finally, add a TODO about potentially saving even more work.
4 years ago
|
|
|
lenPrevKnownReflectAPIs := len(cachedOutput.KnownReflectAPIs)
|
|
|
|
for _, file := range files {
|
|
|
|
for _, decl := range file.Decls {
|
|
|
|
visitReflect(decl)
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
// if a new reflectAPI is found we need to Re-evaluate all functions which might be using that API
|
stop relying on nested "go list -toolexec" calls (#422)
We rely on importcfg files to load type info for obfuscated packages.
We use this type information to remember what names we didn't obfuscate.
Unfortunately, indirect dependencies aren't listed in importcfg files,
so we relied on extra "go list -toolexec" calls to locate object files.
This worked fine, but added a significant amount of complexity.
The extra "go list -export -toolexec=garble" invocations weren't slow,
as they avoided rebuilding or re-obfuscating thanks to the build cache.
Still, it was hard to reason about how garble runs during a build
if we might have multiple layers of -toolexec invocations.
Instead, record the export files we encounter in an incremental map,
and persist it in the build cache via the gob file we're already using.
This way, each garble invocation knows where all object files are,
even those for indirect imports.
One wrinkle is that importcfg files can point to temporary object files.
In that case, figure out its final location in the build cache.
This requires hard-coding a bit of knowledge about how GOCACHE works,
but it seems relatively harmless given how it's very little code.
Plus, if GOCACHE ever changes, it will be obvious when our code breaks.
Finally, add a TODO about potentially saving even more work.
4 years ago
|
|
|
if len(cachedOutput.KnownReflectAPIs) > lenPrevKnownReflectAPIs {
|
|
|
|
tf.findReflectFunctions(files)
|
|
|
|
}
|
detect more std API calls which use reflection
Before, we would just notice direct calls to reflect's TypeOf and
ValueOf. Any other uses of reflection, such as encoding/json or
google.golang.org/protobuf, would require hints as documented by the
README.
Issue #162 outlines some ways we could fix this issue in a general way,
automatically detecting what functions use reflection on their parameters,
even for third party API funcs.
However, that goal is pretty significant in terms of code and effort.
As a temporary improvement, we can expand the list of "known" reflection
APIs via a static table.
Since this table is keyed by "func full name" strings, we could
potentially include third party APIs, such as:
google.golang.org/protobuf/proto.Marshal
However, for now simply include all the std APIs we know about.
If we fail to do the proper fix for automatic detection in the future,
we can then fall back to expanding this global table for third parties.
Update the README's docs, to clarify that the hint is not always
necessary anymore.
Also update the reflect.txt test to stop using the hint for encoding/json,
and to also start testing text/template with a method call.
While at it, I noticed that we weren't testing the println outputs,
as they'd go to stderr - fix that too.
Updates #162.
4 years ago
|
|
|
}
|
|
|
|
|
|
|
|
// prefillIgnoreObjects collects objects which should not be obfuscated,
|
|
|
|
// such as those used as arguments to reflect.TypeOf or reflect.ValueOf.
|
|
|
|
// Since we obfuscate one package at a time, we only detect those if the type
|
|
|
|
// definition and the reflect usage are both in the same package.
|
|
|
|
func (tf *transformer) prefillIgnoreObjects(files []*ast.File) {
|
|
|
|
tf.ignoreObjects = make(map[types.Object]bool)
|
|
|
|
|
|
|
|
visit := func(node ast.Node) bool {
|
|
|
|
call, ok := node.(*ast.CallExpr)
|
|
|
|
if !ok {
|
|
|
|
return true
|
|
|
|
}
|
|
|
|
|
|
|
|
ident, ok := call.Fun.(*ast.Ident)
|
|
|
|
if !ok {
|
|
|
|
sel, ok := call.Fun.(*ast.SelectorExpr)
|
|
|
|
if !ok {
|
|
|
|
return true
|
|
|
|
}
|
|
|
|
|
|
|
|
ident = sel.Sel
|
|
|
|
}
|
|
|
|
|
|
|
|
fnType, _ := tf.info.ObjectOf(ident).(*types.Func)
|
detect more std API calls which use reflection
Before, we would just notice direct calls to reflect's TypeOf and
ValueOf. Any other uses of reflection, such as encoding/json or
google.golang.org/protobuf, would require hints as documented by the
README.
Issue #162 outlines some ways we could fix this issue in a general way,
automatically detecting what functions use reflection on their parameters,
even for third party API funcs.
However, that goal is pretty significant in terms of code and effort.
As a temporary improvement, we can expand the list of "known" reflection
APIs via a static table.
Since this table is keyed by "func full name" strings, we could
potentially include third party APIs, such as:
google.golang.org/protobuf/proto.Marshal
However, for now simply include all the std APIs we know about.
If we fail to do the proper fix for automatic detection in the future,
we can then fall back to expanding this global table for third parties.
Update the README's docs, to clarify that the hint is not always
necessary anymore.
Also update the reflect.txt test to stop using the hint for encoding/json,
and to also start testing text/template with a method call.
While at it, I noticed that we weren't testing the println outputs,
as they'd go to stderr - fix that too.
Updates #162.
4 years ago
|
|
|
if fnType == nil || fnType.Pkg() == nil {
|
|
|
|
return true
|
|
|
|
}
|
|
|
|
|
detect more std API calls which use reflection
Before, we would just notice direct calls to reflect's TypeOf and
ValueOf. Any other uses of reflection, such as encoding/json or
google.golang.org/protobuf, would require hints as documented by the
README.
Issue #162 outlines some ways we could fix this issue in a general way,
automatically detecting what functions use reflection on their parameters,
even for third party API funcs.
However, that goal is pretty significant in terms of code and effort.
As a temporary improvement, we can expand the list of "known" reflection
APIs via a static table.
Since this table is keyed by "func full name" strings, we could
potentially include third party APIs, such as:
google.golang.org/protobuf/proto.Marshal
However, for now simply include all the std APIs we know about.
If we fail to do the proper fix for automatic detection in the future,
we can then fall back to expanding this global table for third parties.
Update the README's docs, to clarify that the hint is not always
necessary anymore.
Also update the reflect.txt test to stop using the hint for encoding/json,
and to also start testing text/template with a method call.
While at it, I noticed that we weren't testing the println outputs,
as they'd go to stderr - fix that too.
Updates #162.
4 years ago
|
|
|
fullName := fnType.FullName()
|
|
|
|
// log.Printf("%s: %s", fset.Position(node.Pos()), fullName)
|
stop relying on nested "go list -toolexec" calls (#422)
We rely on importcfg files to load type info for obfuscated packages.
We use this type information to remember what names we didn't obfuscate.
Unfortunately, indirect dependencies aren't listed in importcfg files,
so we relied on extra "go list -toolexec" calls to locate object files.
This worked fine, but added a significant amount of complexity.
The extra "go list -export -toolexec=garble" invocations weren't slow,
as they avoided rebuilding or re-obfuscating thanks to the build cache.
Still, it was hard to reason about how garble runs during a build
if we might have multiple layers of -toolexec invocations.
Instead, record the export files we encounter in an incremental map,
and persist it in the build cache via the gob file we're already using.
This way, each garble invocation knows where all object files are,
even those for indirect imports.
One wrinkle is that importcfg files can point to temporary object files.
In that case, figure out its final location in the build cache.
This requires hard-coding a bit of knowledge about how GOCACHE works,
but it seems relatively harmless given how it's very little code.
Plus, if GOCACHE ever changes, it will be obvious when our code breaks.
Finally, add a TODO about potentially saving even more work.
4 years ago
|
|
|
for _, argPos := range cachedOutput.KnownReflectAPIs[fullName] {
|
|
|
|
arg := call.Args[argPos]
|
|
|
|
argType := tf.info.TypeOf(arg)
|
|
|
|
tf.recordIgnore(argType, tf.pkg.Path())
|
|
|
|
}
|
|
|
|
|
|
|
|
return true
|
|
|
|
}
|
|
|
|
for _, file := range files {
|
|
|
|
for _, group := range file.Comments {
|
|
|
|
for _, comment := range group.List {
|
|
|
|
name := strings.TrimPrefix(comment.Text, "//export ")
|
|
|
|
if name == comment.Text {
|
|
|
|
continue
|
|
|
|
}
|
|
|
|
name = strings.TrimSpace(name)
|
|
|
|
obj := tf.pkg.Scope().Lookup(name)
|
|
|
|
if obj == nil {
|
|
|
|
continue // not found; skip
|
|
|
|
}
|
|
|
|
tf.ignoreObjects[obj] = true
|
|
|
|
}
|
|
|
|
}
|
|
|
|
ast.Inspect(file, visit)
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
// transformer holds all the information and state necessary to obfuscate a
|
|
|
|
// single Go package.
|
|
|
|
type transformer struct {
|
|
|
|
// The type-checking results; the package itself, and the Info struct.
|
|
|
|
pkg *types.Package
|
|
|
|
info *types.Info
|
|
|
|
|
|
|
|
// ignoreObjects records all the objects we cannot obfuscate. An object
|
|
|
|
// is any named entity, such as a declared variable or type.
|
|
|
|
//
|
|
|
|
// This map is initialized by prefillIgnoreObjects at the start,
|
|
|
|
// and extra entries from dependencies are added by transformGo,
|
|
|
|
// for the sake of caching type lookups.
|
|
|
|
// So far, it records:
|
|
|
|
//
|
|
|
|
// * Types which are used for reflection.
|
|
|
|
// * Identifiers used in constant expressions.
|
|
|
|
// * Declarations exported via "//export".
|
|
|
|
// * Types or variables from external packages which were not obfuscated.
|
|
|
|
ignoreObjects map[types.Object]bool
|
hash field names equally in all packages
Packages P1 and P2 can define identical struct types T1 and T2, and one
can convert from type T1 to T2 or vice versa.
The spec defines two identical struct types as:
Two struct types are identical if they have the same sequence of
fields, and if corresponding fields have the same names, and
identical types, and identical tags. Non-exported field names
from different packages are always different.
Unfortunately, garble broke this: since we obfuscated field names
differently depending on the package, cross-package conversions like the
case above would result in typechecking errors.
To fix this, implement Joe Tsai's idea: hash struct field names with the
string representation of the entire struct. This way, identical struct
types will have their field names obfuscated in the same way in all
packages across a build.
Note that we had to refactor "reverse" a bit to start using transformer,
since now it needs to keep track of struct types as well.
This failure was affecting the build of google.golang.org/protobuf,
since it makes regular use of cross-package struct conversions.
Note that the protobuf module still fails to build, but for other
reasons. The package that used to fail now succeeds, so the build gets a
bit further than before. #240 tracks adding relevant third-party Go
modules to CI, so we'll track the other remaining failures there.
Fixes #310.
4 years ago
|
|
|
|
handle aliases to foreign named types properly
When such an alias name was used to define an embedded field, we handled
that case gracefully via the code using:
tf.info.Uses[node].(*types.TypeName)
Unfortunately, when the same field name was used elsewhere, such as a
composite literal, tf.Info.Uses gave us a *types.Var, not a
*types.TypeName, meaning we could no longer tell if this was an alias,
or what it pointed to.
Thus, we failed to obfuscate the name properly in the added test case:
> garble build
[stderr]
# test/main/sub
xxWZf66u.go:36: unknown field 'foreignAlias' in struct literal of type smhWelwn
It doesn't seem like any of the go/types APIs allows us to obtain the
*types.TypeName directly in this scenario. Thus, use a trick that we
used before: after typechecking, but before obfuscating, record all
embedded struct field *types.Var which are aliases via a map, where the
value holds the *types.TypeName for the alias.
Updates #349.
4 years ago
|
|
|
// recordTypeDone helps avoid cycles in recordType.
|
hash field names equally in all packages
Packages P1 and P2 can define identical struct types T1 and T2, and one
can convert from type T1 to T2 or vice versa.
The spec defines two identical struct types as:
Two struct types are identical if they have the same sequence of
fields, and if corresponding fields have the same names, and
identical types, and identical tags. Non-exported field names
from different packages are always different.
Unfortunately, garble broke this: since we obfuscated field names
differently depending on the package, cross-package conversions like the
case above would result in typechecking errors.
To fix this, implement Joe Tsai's idea: hash struct field names with the
string representation of the entire struct. This way, identical struct
types will have their field names obfuscated in the same way in all
packages across a build.
Note that we had to refactor "reverse" a bit to start using transformer,
since now it needs to keep track of struct types as well.
This failure was affecting the build of google.golang.org/protobuf,
since it makes regular use of cross-package struct conversions.
Note that the protobuf module still fails to build, but for other
reasons. The package that used to fail now succeeds, so the build gets a
bit further than before. #240 tracks adding relevant third-party Go
modules to CI, so we'll track the other remaining failures there.
Fixes #310.
4 years ago
|
|
|
recordTypeDone map[types.Type]bool
|
handle aliases to foreign named types properly
When such an alias name was used to define an embedded field, we handled
that case gracefully via the code using:
tf.info.Uses[node].(*types.TypeName)
Unfortunately, when the same field name was used elsewhere, such as a
composite literal, tf.Info.Uses gave us a *types.Var, not a
*types.TypeName, meaning we could no longer tell if this was an alias,
or what it pointed to.
Thus, we failed to obfuscate the name properly in the added test case:
> garble build
[stderr]
# test/main/sub
xxWZf66u.go:36: unknown field 'foreignAlias' in struct literal of type smhWelwn
It doesn't seem like any of the go/types APIs allows us to obtain the
*types.TypeName directly in this scenario. Thus, use a trick that we
used before: after typechecking, but before obfuscating, record all
embedded struct field *types.Var which are aliases via a map, where the
value holds the *types.TypeName for the alias.
Updates #349.
4 years ago
|
|
|
|
|
|
|
// fieldToStruct helps locate struct types from any of their field
|
|
|
|
// objects. Useful when obfuscating field names.
|
|
|
|
fieldToStruct map[*types.Var]*types.Struct
|
|
|
|
|
|
|
|
// fieldToAlias helps tell if an embedded struct field object is a type
|
|
|
|
// alias. Useful when obfuscating field names.
|
|
|
|
fieldToAlias map[*types.Var]*types.TypeName
|
hash field names equally in all packages
Packages P1 and P2 can define identical struct types T1 and T2, and one
can convert from type T1 to T2 or vice versa.
The spec defines two identical struct types as:
Two struct types are identical if they have the same sequence of
fields, and if corresponding fields have the same names, and
identical types, and identical tags. Non-exported field names
from different packages are always different.
Unfortunately, garble broke this: since we obfuscated field names
differently depending on the package, cross-package conversions like the
case above would result in typechecking errors.
To fix this, implement Joe Tsai's idea: hash struct field names with the
string representation of the entire struct. This way, identical struct
types will have their field names obfuscated in the same way in all
packages across a build.
Note that we had to refactor "reverse" a bit to start using transformer,
since now it needs to keep track of struct types as well.
This failure was affecting the build of google.golang.org/protobuf,
since it makes regular use of cross-package struct conversions.
Note that the protobuf module still fails to build, but for other
reasons. The package that used to fail now succeeds, so the build gets a
bit further than before. #240 tracks adding relevant third-party Go
modules to CI, so we'll track the other remaining failures there.
Fixes #310.
4 years ago
|
|
|
}
|
|
|
|
|
|
|
|
// newTransformer helps initialize some maps.
|
|
|
|
func newTransformer() *transformer {
|
|
|
|
return &transformer{
|
|
|
|
info: &types.Info{
|
|
|
|
Types: make(map[ast.Expr]types.TypeAndValue),
|
|
|
|
Defs: make(map[*ast.Ident]types.Object),
|
|
|
|
Uses: make(map[*ast.Ident]types.Object),
|
|
|
|
},
|
|
|
|
recordTypeDone: make(map[types.Type]bool),
|
|
|
|
fieldToStruct: make(map[*types.Var]*types.Struct),
|
handle aliases to foreign named types properly
When such an alias name was used to define an embedded field, we handled
that case gracefully via the code using:
tf.info.Uses[node].(*types.TypeName)
Unfortunately, when the same field name was used elsewhere, such as a
composite literal, tf.Info.Uses gave us a *types.Var, not a
*types.TypeName, meaning we could no longer tell if this was an alias,
or what it pointed to.
Thus, we failed to obfuscate the name properly in the added test case:
> garble build
[stderr]
# test/main/sub
xxWZf66u.go:36: unknown field 'foreignAlias' in struct literal of type smhWelwn
It doesn't seem like any of the go/types APIs allows us to obtain the
*types.TypeName directly in this scenario. Thus, use a trick that we
used before: after typechecking, but before obfuscating, record all
embedded struct field *types.Var which are aliases via a map, where the
value holds the *types.TypeName for the alias.
Updates #349.
4 years ago
|
|
|
fieldToAlias: make(map[*types.Var]*types.TypeName),
|
hash field names equally in all packages
Packages P1 and P2 can define identical struct types T1 and T2, and one
can convert from type T1 to T2 or vice versa.
The spec defines two identical struct types as:
Two struct types are identical if they have the same sequence of
fields, and if corresponding fields have the same names, and
identical types, and identical tags. Non-exported field names
from different packages are always different.
Unfortunately, garble broke this: since we obfuscated field names
differently depending on the package, cross-package conversions like the
case above would result in typechecking errors.
To fix this, implement Joe Tsai's idea: hash struct field names with the
string representation of the entire struct. This way, identical struct
types will have their field names obfuscated in the same way in all
packages across a build.
Note that we had to refactor "reverse" a bit to start using transformer,
since now it needs to keep track of struct types as well.
This failure was affecting the build of google.golang.org/protobuf,
since it makes regular use of cross-package struct conversions.
Note that the protobuf module still fails to build, but for other
reasons. The package that used to fail now succeeds, so the build gets a
bit further than before. #240 tracks adding relevant third-party Go
modules to CI, so we'll track the other remaining failures there.
Fixes #310.
4 years ago
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
func (tf *transformer) typecheck(files []*ast.File) error {
|
|
|
|
origTypesConfig := types.Config{Importer: origImporter}
|
|
|
|
pkg, err := origTypesConfig.Check(curPkg.ImportPath, fset, files, tf.info)
|
|
|
|
if err != nil {
|
|
|
|
return fmt.Errorf("typecheck error: %v", err)
|
|
|
|
}
|
|
|
|
tf.pkg = pkg
|
|
|
|
|
|
|
|
// Run recordType on all types reachable via types.Info.
|
|
|
|
// A bit hacky, but I could not find an easier way to do this.
|
|
|
|
for _, obj := range tf.info.Defs {
|
|
|
|
if obj != nil {
|
|
|
|
tf.recordType(obj.Type())
|
|
|
|
}
|
|
|
|
}
|
handle aliases to foreign named types properly
When such an alias name was used to define an embedded field, we handled
that case gracefully via the code using:
tf.info.Uses[node].(*types.TypeName)
Unfortunately, when the same field name was used elsewhere, such as a
composite literal, tf.Info.Uses gave us a *types.Var, not a
*types.TypeName, meaning we could no longer tell if this was an alias,
or what it pointed to.
Thus, we failed to obfuscate the name properly in the added test case:
> garble build
[stderr]
# test/main/sub
xxWZf66u.go:36: unknown field 'foreignAlias' in struct literal of type smhWelwn
It doesn't seem like any of the go/types APIs allows us to obtain the
*types.TypeName directly in this scenario. Thus, use a trick that we
used before: after typechecking, but before obfuscating, record all
embedded struct field *types.Var which are aliases via a map, where the
value holds the *types.TypeName for the alias.
Updates #349.
4 years ago
|
|
|
for name, obj := range tf.info.Uses {
|
hash field names equally in all packages
Packages P1 and P2 can define identical struct types T1 and T2, and one
can convert from type T1 to T2 or vice versa.
The spec defines two identical struct types as:
Two struct types are identical if they have the same sequence of
fields, and if corresponding fields have the same names, and
identical types, and identical tags. Non-exported field names
from different packages are always different.
Unfortunately, garble broke this: since we obfuscated field names
differently depending on the package, cross-package conversions like the
case above would result in typechecking errors.
To fix this, implement Joe Tsai's idea: hash struct field names with the
string representation of the entire struct. This way, identical struct
types will have their field names obfuscated in the same way in all
packages across a build.
Note that we had to refactor "reverse" a bit to start using transformer,
since now it needs to keep track of struct types as well.
This failure was affecting the build of google.golang.org/protobuf,
since it makes regular use of cross-package struct conversions.
Note that the protobuf module still fails to build, but for other
reasons. The package that used to fail now succeeds, so the build gets a
bit further than before. #240 tracks adding relevant third-party Go
modules to CI, so we'll track the other remaining failures there.
Fixes #310.
4 years ago
|
|
|
if obj != nil {
|
handle aliases to foreign named types properly
When such an alias name was used to define an embedded field, we handled
that case gracefully via the code using:
tf.info.Uses[node].(*types.TypeName)
Unfortunately, when the same field name was used elsewhere, such as a
composite literal, tf.Info.Uses gave us a *types.Var, not a
*types.TypeName, meaning we could no longer tell if this was an alias,
or what it pointed to.
Thus, we failed to obfuscate the name properly in the added test case:
> garble build
[stderr]
# test/main/sub
xxWZf66u.go:36: unknown field 'foreignAlias' in struct literal of type smhWelwn
It doesn't seem like any of the go/types APIs allows us to obtain the
*types.TypeName directly in this scenario. Thus, use a trick that we
used before: after typechecking, but before obfuscating, record all
embedded struct field *types.Var which are aliases via a map, where the
value holds the *types.TypeName for the alias.
Updates #349.
4 years ago
|
|
|
if obj, ok := obj.(*types.TypeName); ok && obj.IsAlias() {
|
|
|
|
vr, _ := tf.info.Defs[name].(*types.Var)
|
|
|
|
if vr != nil {
|
|
|
|
tf.fieldToAlias[vr] = obj
|
|
|
|
}
|
|
|
|
}
|
hash field names equally in all packages
Packages P1 and P2 can define identical struct types T1 and T2, and one
can convert from type T1 to T2 or vice versa.
The spec defines two identical struct types as:
Two struct types are identical if they have the same sequence of
fields, and if corresponding fields have the same names, and
identical types, and identical tags. Non-exported field names
from different packages are always different.
Unfortunately, garble broke this: since we obfuscated field names
differently depending on the package, cross-package conversions like the
case above would result in typechecking errors.
To fix this, implement Joe Tsai's idea: hash struct field names with the
string representation of the entire struct. This way, identical struct
types will have their field names obfuscated in the same way in all
packages across a build.
Note that we had to refactor "reverse" a bit to start using transformer,
since now it needs to keep track of struct types as well.
This failure was affecting the build of google.golang.org/protobuf,
since it makes regular use of cross-package struct conversions.
Note that the protobuf module still fails to build, but for other
reasons. The package that used to fail now succeeds, so the build gets a
bit further than before. #240 tracks adding relevant third-party Go
modules to CI, so we'll track the other remaining failures there.
Fixes #310.
4 years ago
|
|
|
tf.recordType(obj.Type())
|
|
|
|
}
|
|
|
|
}
|
|
|
|
for _, tv := range tf.info.Types {
|
|
|
|
tf.recordType(tv.Type)
|
|
|
|
}
|
|
|
|
return nil
|
|
|
|
}
|
|
|
|
|
|
|
|
// recordType visits every reachable type after typechecking a package.
|
|
|
|
// Right now, all it does is fill the fieldToStruct field.
|
|
|
|
// Since types can be recursive, we need a map to avoid cycles.
|
|
|
|
func (tf *transformer) recordType(t types.Type) {
|
|
|
|
if tf.recordTypeDone[t] {
|
|
|
|
return
|
|
|
|
}
|
|
|
|
tf.recordTypeDone[t] = true
|
|
|
|
switch t := t.(type) {
|
|
|
|
case interface{ Elem() types.Type }:
|
|
|
|
tf.recordType(t.Elem())
|
|
|
|
case *types.Named:
|
|
|
|
tf.recordType(t.Underlying())
|
|
|
|
}
|
|
|
|
strct, _ := t.(*types.Struct)
|
|
|
|
if strct == nil {
|
|
|
|
return
|
|
|
|
}
|
|
|
|
for i := 0; i < strct.NumFields(); i++ {
|
|
|
|
field := strct.Field(i)
|
|
|
|
tf.fieldToStruct[field] = strct
|
|
|
|
|
|
|
|
if field.Embedded() {
|
|
|
|
tf.recordType(field.Type())
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
// transformGo obfuscates the provided Go syntax file.
|
|
|
|
func (tf *transformer) transformGo(file *ast.File) *ast.File {
|
|
|
|
if opts.ObfuscateLiterals {
|
|
|
|
file = literals.Obfuscate(file, tf.info, fset, tf.ignoreObjects)
|
|
|
|
}
|
|
|
|
|
|
|
|
pre := func(cursor *astutil.Cursor) bool {
|
|
|
|
node, ok := cursor.Node().(*ast.Ident)
|
|
|
|
if !ok {
|
|
|
|
return true
|
|
|
|
}
|
|
|
|
if node.Name == "_" {
|
|
|
|
return true // unnamed remains unnamed
|
|
|
|
}
|
|
|
|
if strings.HasPrefix(node.Name, "_C") || strings.Contains(node.Name, "_cgo") {
|
|
|
|
return true // don't mess with cgo-generated code
|
|
|
|
}
|
|
|
|
obj := tf.info.ObjectOf(node)
|
|
|
|
if obj == nil {
|
obfuscate all variable names, even local ones (#420)
In the added test case, "garble -literals build" would fail:
--- FAIL: TestScripts/literals (8.29s)
testscript.go:397:
> env GOPRIVATE=test/main
> garble -literals build
[stderr]
# test/main
Usz1FmFm.go:1: cannot call non-function string (type int), declared at Usz1FmFm.go:1
Usz1FmFm.go:1: string is not a type
Usz1FmFm.go:1: cannot call non-function append (type int), declared at Usz1FmFm.go:1
That is, for input code such as:
var append int
println("foo")
_ = append
We'd end up with obfuscated code like:
var append int
println(func() string {
// obfuscation...
x = append(x, ...)
// obfuscation...
return string(x)
})
_ = append
Which would then break, as the code is shadowing the "append" builtin.
To work around this, always obfuscate variable names, so we end up with:
var mwu1xuNz int
println(func() string {
// obfuscation...
x = append(x, ...)
// obfuscation...
return string(x)
})
_ = mwu1xuNz
This change shouldn't make the quality of our obfuscation stronger,
as local variable names do not currently end up in Go binaries.
However, this does make garble more consistent in treating identifiers,
and it completely avoids any issues related to shadowing builtins.
Moreover, this also paves the way for publishing obfuscated source code,
such as #369.
Fixes #417.
4 years ago
|
|
|
_, isImplicit := tf.info.Defs[node]
|
|
|
|
_, parentIsFile := cursor.Parent().(*ast.File)
|
|
|
|
if isImplicit && !parentIsFile {
|
|
|
|
// In a type switch like "switch foo := bar.(type) {",
|
|
|
|
// "foo" is being declared as a symbolic variable,
|
|
|
|
// as it is only actually declared in each "case SomeType:".
|
|
|
|
//
|
|
|
|
// As such, the symbolic "foo" in the syntax tree has no object,
|
|
|
|
// but it is still recorded under Defs with a nil value.
|
|
|
|
// We still want to obfuscate that syntax tree identifier,
|
|
|
|
// so if we detect the case, create a dummy types.Var for it.
|
|
|
|
//
|
|
|
|
// Note that "package mypkg" also denotes a nil object in Defs,
|
|
|
|
// and we don't want to treat that "mypkg" as a variable,
|
|
|
|
// so avoid that case by checking the type of cursor.Parent.
|
|
|
|
obj = types.NewVar(node.Pos(), tf.pkg, node.Name, nil)
|
|
|
|
} else {
|
|
|
|
return true
|
|
|
|
}
|
|
|
|
}
|
|
|
|
pkg := obj.Pkg()
|
|
|
|
if vr, ok := obj.(*types.Var); ok && vr.Embedded() {
|
obfuscate alias names like any other objects
Before this change, we would try to never obfuscate alias names. That
was far from ideal, as they can end up in field names via anonymous
fields.
Even then, we would sometimes still fail to build, because we would
inconsistently obfuscate alias names. For example, in the added test
case:
--- FAIL: TestScripts/syntax (0.23s)
testscript.go:397:
> env GOPRIVATE='test/main,private.source'
> garble build
[stderr]
# test/main/sub
Lv_a8gRD.go:15: undefined: KCvSpxmQ
To fix this problem, we set obj to be the TypeName corresponding to the
alias when it is used as an embedded field. We can then make the right
choice when obfuscating the name.
Right now, all aliases will be obfuscated. A TODO exists about not
obfuscating alias names when they're used as embedded fields in a struct
type in the same package, and that package is used for reflection -
since then, the alias name ends up as the field name.
With these changes, the protobuf module now builds.
4 years ago
|
|
|
// The docs for ObjectOf say:
|
|
|
|
//
|
|
|
|
// If id is an embedded struct field, ObjectOf returns the
|
|
|
|
// field (*Var) it defines, not the type (*TypeName) it uses.
|
|
|
|
//
|
|
|
|
// If this embedded field is a type alias, we want to
|
|
|
|
// handle the alias's TypeName instead of treating it as
|
|
|
|
// the type the alias points to.
|
obfuscate alias names like any other objects
Before this change, we would try to never obfuscate alias names. That
was far from ideal, as they can end up in field names via anonymous
fields.
Even then, we would sometimes still fail to build, because we would
inconsistently obfuscate alias names. For example, in the added test
case:
--- FAIL: TestScripts/syntax (0.23s)
testscript.go:397:
> env GOPRIVATE='test/main,private.source'
> garble build
[stderr]
# test/main/sub
Lv_a8gRD.go:15: undefined: KCvSpxmQ
To fix this problem, we set obj to be the TypeName corresponding to the
alias when it is used as an embedded field. We can then make the right
choice when obfuscating the name.
Right now, all aliases will be obfuscated. A TODO exists about not
obfuscating alias names when they're used as embedded fields in a struct
type in the same package, and that package is used for reflection -
since then, the alias name ends up as the field name.
With these changes, the protobuf module now builds.
4 years ago
|
|
|
//
|
|
|
|
// Alternatively, if we don't have an alias, we want to
|
|
|
|
// use the embedded type, not the field.
|
handle aliases to foreign named types properly
When such an alias name was used to define an embedded field, we handled
that case gracefully via the code using:
tf.info.Uses[node].(*types.TypeName)
Unfortunately, when the same field name was used elsewhere, such as a
composite literal, tf.Info.Uses gave us a *types.Var, not a
*types.TypeName, meaning we could no longer tell if this was an alias,
or what it pointed to.
Thus, we failed to obfuscate the name properly in the added test case:
> garble build
[stderr]
# test/main/sub
xxWZf66u.go:36: unknown field 'foreignAlias' in struct literal of type smhWelwn
It doesn't seem like any of the go/types APIs allows us to obtain the
*types.TypeName directly in this scenario. Thus, use a trick that we
used before: after typechecking, but before obfuscating, record all
embedded struct field *types.Var which are aliases via a map, where the
value holds the *types.TypeName for the alias.
Updates #349.
4 years ago
|
|
|
if tname := tf.fieldToAlias[vr]; tname != nil {
|
|
|
|
if !tname.IsAlias() {
|
|
|
|
panic("fieldToAlias recorded a non-alias TypeName?")
|
|
|
|
}
|
obfuscate alias names like any other objects
Before this change, we would try to never obfuscate alias names. That
was far from ideal, as they can end up in field names via anonymous
fields.
Even then, we would sometimes still fail to build, because we would
inconsistently obfuscate alias names. For example, in the added test
case:
--- FAIL: TestScripts/syntax (0.23s)
testscript.go:397:
> env GOPRIVATE='test/main,private.source'
> garble build
[stderr]
# test/main/sub
Lv_a8gRD.go:15: undefined: KCvSpxmQ
To fix this problem, we set obj to be the TypeName corresponding to the
alias when it is used as an embedded field. We can then make the right
choice when obfuscating the name.
Right now, all aliases will be obfuscated. A TODO exists about not
obfuscating alias names when they're used as embedded fields in a struct
type in the same package, and that package is used for reflection -
since then, the alias name ends up as the field name.
With these changes, the protobuf module now builds.
4 years ago
|
|
|
obj = tname
|
|
|
|
} else {
|
|
|
|
named := namedType(obj.Type())
|
|
|
|
if named == nil {
|
|
|
|
return true // unnamed type (probably a basic type, e.g. int)
|
|
|
|
}
|
|
|
|
// If the field embeds an alias,
|
|
|
|
// and the field is declared in a dependency,
|
|
|
|
// fieldToAlias might not tell us about the alias.
|
|
|
|
// We lack the *ast.Ident for the field declaration,
|
|
|
|
// so we can't see it in types.Info.Uses.
|
|
|
|
//
|
|
|
|
// Instead, detect such a "foreign alias embed".
|
|
|
|
// If we embed a final named type,
|
|
|
|
// but the field name does not match its name,
|
|
|
|
// then it must have been done via an alias.
|
|
|
|
// We dig out the alias's TypeName via locateForeignAlias.
|
|
|
|
if named.Obj().Name() != node.Name {
|
|
|
|
tname := locateForeignAlias(vr.Pkg().Path(), node.Name)
|
|
|
|
tf.fieldToAlias[vr] = tname // to reuse it later
|
|
|
|
obj = tname
|
|
|
|
} else {
|
|
|
|
obj = named.Obj()
|
|
|
|
}
|
|
|
|
}
|
|
|
|
pkg = obj.Pkg()
|
|
|
|
}
|
|
|
|
if pkg == nil {
|
|
|
|
return true // universe scope
|
|
|
|
}
|
|
|
|
|
|
|
|
if pkg.Name() == "main" && obj.Exported() && obj.Parent() == pkg.Scope() {
|
|
|
|
// TODO: only do this when -buildmode is plugin? what
|
|
|
|
// about other -buildmode options?
|
|
|
|
return true // could be a Go plugin API
|
|
|
|
}
|
|
|
|
|
|
|
|
if pkg.Path() == "embed" {
|
|
|
|
// The Go compiler needs to detect types such as embed.FS.
|
|
|
|
// That will fail if we change the import path or type name.
|
|
|
|
// Leave it as is.
|
|
|
|
// Luckily, the embed package just declares the FS type.
|
|
|
|
return true
|
|
|
|
}
|
|
|
|
|
|
|
|
// We don't want to obfuscate this object.
|
|
|
|
if tf.ignoreObjects[obj] {
|
|
|
|
return true
|
|
|
|
}
|
|
|
|
|
|
|
|
path := pkg.Path()
|
refactor "current package" with TOOLEXEC_IMPORTPATH (#266)
Now that we've dropped support for Go 1.15.x, we can finally rely on
this environment variable for toolexec calls, present in Go 1.16.
Before, we had hacky ways of trying to figure out the current package's
import path, mostly from the -p flag. The biggest rough edge there was
that, for main packages, that was simply the package name, and not its
full import path.
To work around that, we had a restriction on a single main package, so
we could work around that issue. That restriction is now gone.
The new code is simpler, especially because we can set curPkg in a
single place for all toolexec transform funcs.
Since we can always rely on curPkg not being nil now, we can also start
reusing listedPackage.Private and avoid the majority of repeated calls
to isPrivate. The function is cheap, but still not free.
isPrivate itself can also get simpler. We no longer have to worry about
the "main" edge case. Plus, the sanity check for invalid package paths
is now unnecessary; we only got malformed paths from goobj2, and we now
require exact matches with the ImportPath field from "go list -json".
Another effect of clearing up the "main" edge case is that -debugdir now
uses the right directory for main packages. We also start using
consistent debugdir paths in the tests, for the sake of being easier to
read and maintain.
Finally, note that commandReverse did not need the extra call to "go
list -toolexec", as the "shared" call stored in the cache is enough. We
still call toolexecCmd to get said cache, which should probably be
simplified in a future PR.
While at it, replace the use of the "-std" compiler flag with the
Standard field from "go list -json".
4 years ago
|
|
|
lpkg, err := listPackage(path)
|
|
|
|
if err != nil {
|
|
|
|
panic(err) // shouldn't happen
|
|
|
|
}
|
deprecate using GOPRIVATE in favor of GOGARBLE (#427)
Piggybacking off of GOPRIVATE is great for a number of reasons:
* People tend to obfuscate private code, whose package paths will
generally be in GOPRIVATE already
* Its meaning and syntax are well understood
* It allows all the flexibility we need without adding our own env var
or config option
However, using GOPRIVATE directly has one main drawback.
It's fairly common to also want to obfuscate public dependencies,
to make the code in private packages even harder to follow.
However, using "GOPRIVATE=*" will result in two main downsides:
* GONOPROXY defaults to GOPRIVATE, so the proxy would be entirely disabled.
Downloading modules, such as when adding or updating dependencies,
or when the local cache is cold, can be less reliable.
* GONOSUMDB defaults to GOPRIVATE, so the sumdb would be entirely disabled.
Adding entries to go.sum, such as when adding or updating dependencies,
can be less secure.
We will continue to consume GOPRIVATE as a fallback,
but we now expect users to set GOGARBLE instead.
The new logic is documented in the README.
While here, rewrite some uses of "private" with "to obfuscate",
to make the code easier to follow and harder to misunderstand.
Fixes #276.
3 years ago
|
|
|
if !lpkg.ToObfuscate {
|
|
|
|
return true // we're not obfuscating this package
|
|
|
|
}
|
hash field names equally in all packages
Packages P1 and P2 can define identical struct types T1 and T2, and one
can convert from type T1 to T2 or vice versa.
The spec defines two identical struct types as:
Two struct types are identical if they have the same sequence of
fields, and if corresponding fields have the same names, and
identical types, and identical tags. Non-exported field names
from different packages are always different.
Unfortunately, garble broke this: since we obfuscated field names
differently depending on the package, cross-package conversions like the
case above would result in typechecking errors.
To fix this, implement Joe Tsai's idea: hash struct field names with the
string representation of the entire struct. This way, identical struct
types will have their field names obfuscated in the same way in all
packages across a build.
Note that we had to refactor "reverse" a bit to start using transformer,
since now it needs to keep track of struct types as well.
This failure was affecting the build of google.golang.org/protobuf,
since it makes regular use of cross-package struct conversions.
Note that the protobuf module still fails to build, but for other
reasons. The package that used to fail now succeeds, so the build gets a
bit further than before. #240 tracks adding relevant third-party Go
modules to CI, so we'll track the other remaining failures there.
Fixes #310.
4 years ago
|
|
|
hashToUse := lpkg.GarbleActionID
|
|
|
|
|
handle aliases to foreign named types properly
When such an alias name was used to define an embedded field, we handled
that case gracefully via the code using:
tf.info.Uses[node].(*types.TypeName)
Unfortunately, when the same field name was used elsewhere, such as a
composite literal, tf.Info.Uses gave us a *types.Var, not a
*types.TypeName, meaning we could no longer tell if this was an alias,
or what it pointed to.
Thus, we failed to obfuscate the name properly in the added test case:
> garble build
[stderr]
# test/main/sub
xxWZf66u.go:36: unknown field 'foreignAlias' in struct literal of type smhWelwn
It doesn't seem like any of the go/types APIs allows us to obtain the
*types.TypeName directly in this scenario. Thus, use a trick that we
used before: after typechecking, but before obfuscating, record all
embedded struct field *types.Var which are aliases via a map, where the
value holds the *types.TypeName for the alias.
Updates #349.
4 years ago
|
|
|
// log.Printf("%s: %#v %T", fset.Position(node.Pos()), node, obj)
|
|
|
|
switch obj := obj.(type) {
|
|
|
|
case *types.Var:
|
fix a number of issues involving types from indirect imports
obfuscatedTypesPackage is used to figure out if a name in a dependency
package was obfuscated or not. For example, if that package used
reflection on a named type, it wasn't obfuscated, so we must have the
same information to not obfuscate the same name downstream.
obfuscatedTypesPackage could return nil if the package was indirectly
imported, though. This can happen if a direct import has a function that
returns an indirect type, or if a direct import exposes a name that's a
type alias to an indirect type.
We sort of dealt with this in two pieces of code by checking for
obfPkg!=nil, but a third one did not have this check and caused a panic
in the added test case:
--- FAIL: TestScripts/reflect (0.81s)
testscript.go:397:
> env GOPRIVATE=test/main
> garble build
[stderr]
# test/main
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x8a5e39]
More importantly though, the nil check only avoids panics. It doesn't
fix the root cause of the problem: that importcfg does not contain
indirectly imported packages. The added test case would still fail, as
we would obfuscate a type in the main package, but not in the indirectly
imported package where the type is defined.
To fix this, resurrect a bit of code from earlier garble versions, which
uses "go list -toolexec=garble" to fetch a package's export file. This
lets us fill the indirect import gaps in importcfg, working around the
problem entirely.
This solution is still not particularly great, so we add a TODO about
possibly rethinking this in the future. It does add some overhead and
complexity, though thankfully indirect imports should be uncommon.
This fixes a few panics while building the protobuf module.
4 years ago
|
|
|
if !obj.IsField() {
|
obfuscate all variable names, even local ones (#420)
In the added test case, "garble -literals build" would fail:
--- FAIL: TestScripts/literals (8.29s)
testscript.go:397:
> env GOPRIVATE=test/main
> garble -literals build
[stderr]
# test/main
Usz1FmFm.go:1: cannot call non-function string (type int), declared at Usz1FmFm.go:1
Usz1FmFm.go:1: string is not a type
Usz1FmFm.go:1: cannot call non-function append (type int), declared at Usz1FmFm.go:1
That is, for input code such as:
var append int
println("foo")
_ = append
We'd end up with obfuscated code like:
var append int
println(func() string {
// obfuscation...
x = append(x, ...)
// obfuscation...
return string(x)
})
_ = append
Which would then break, as the code is shadowing the "append" builtin.
To work around this, always obfuscate variable names, so we end up with:
var mwu1xuNz int
println(func() string {
// obfuscation...
x = append(x, ...)
// obfuscation...
return string(x)
})
_ = mwu1xuNz
This change shouldn't make the quality of our obfuscation stronger,
as local variable names do not currently end up in Go binaries.
However, this does make garble more consistent in treating identifiers,
and it completely avoids any issues related to shadowing builtins.
Moreover, this also paves the way for publishing obfuscated source code,
such as #369.
Fixes #417.
4 years ago
|
|
|
// Identifiers denoting variables are always obfuscated.
|
fix a number of issues involving types from indirect imports
obfuscatedTypesPackage is used to figure out if a name in a dependency
package was obfuscated or not. For example, if that package used
reflection on a named type, it wasn't obfuscated, so we must have the
same information to not obfuscate the same name downstream.
obfuscatedTypesPackage could return nil if the package was indirectly
imported, though. This can happen if a direct import has a function that
returns an indirect type, or if a direct import exposes a name that's a
type alias to an indirect type.
We sort of dealt with this in two pieces of code by checking for
obfPkg!=nil, but a third one did not have this check and caused a panic
in the added test case:
--- FAIL: TestScripts/reflect (0.81s)
testscript.go:397:
> env GOPRIVATE=test/main
> garble build
[stderr]
# test/main
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x8a5e39]
More importantly though, the nil check only avoids panics. It doesn't
fix the root cause of the problem: that importcfg does not contain
indirectly imported packages. The added test case would still fail, as
we would obfuscate a type in the main package, but not in the indirectly
imported package where the type is defined.
To fix this, resurrect a bit of code from earlier garble versions, which
uses "go list -toolexec=garble" to fetch a package's export file. This
lets us fill the indirect import gaps in importcfg, working around the
problem entirely.
This solution is still not particularly great, so we add a TODO about
possibly rethinking this in the future. It does add some overhead and
complexity, though thankfully indirect imports should be uncommon.
This fixes a few panics while building the protobuf module.
4 years ago
|
|
|
break
|
|
|
|
}
|
|
|
|
// From this point on, we deal with struct fields.
|
|
|
|
|
hash field names equally in all packages
Packages P1 and P2 can define identical struct types T1 and T2, and one
can convert from type T1 to T2 or vice versa.
The spec defines two identical struct types as:
Two struct types are identical if they have the same sequence of
fields, and if corresponding fields have the same names, and
identical types, and identical tags. Non-exported field names
from different packages are always different.
Unfortunately, garble broke this: since we obfuscated field names
differently depending on the package, cross-package conversions like the
case above would result in typechecking errors.
To fix this, implement Joe Tsai's idea: hash struct field names with the
string representation of the entire struct. This way, identical struct
types will have their field names obfuscated in the same way in all
packages across a build.
Note that we had to refactor "reverse" a bit to start using transformer,
since now it needs to keep track of struct types as well.
This failure was affecting the build of google.golang.org/protobuf,
since it makes regular use of cross-package struct conversions.
Note that the protobuf module still fails to build, but for other
reasons. The package that used to fail now succeeds, so the build gets a
bit further than before. #240 tracks adding relevant third-party Go
modules to CI, so we'll track the other remaining failures there.
Fixes #310.
4 years ago
|
|
|
// Fields don't get hashed with the package's action ID.
|
|
|
|
// They get hashed with the type of their parent struct.
|
|
|
|
// This is because one struct can be converted to another,
|
|
|
|
// as long as the underlying types are identical,
|
|
|
|
// even if the structs are defined in different packages.
|
|
|
|
//
|
|
|
|
// TODO: Consider only doing this for structs where all
|
|
|
|
// fields are exported. We only need this special case
|
|
|
|
// for cross-package conversions, which can't work if
|
|
|
|
// any field is unexported. If that is done, add a test
|
|
|
|
// that ensures unexported fields from different
|
|
|
|
// packages result in different obfuscated names.
|
fix a number of issues involving types from indirect imports
obfuscatedTypesPackage is used to figure out if a name in a dependency
package was obfuscated or not. For example, if that package used
reflection on a named type, it wasn't obfuscated, so we must have the
same information to not obfuscate the same name downstream.
obfuscatedTypesPackage could return nil if the package was indirectly
imported, though. This can happen if a direct import has a function that
returns an indirect type, or if a direct import exposes a name that's a
type alias to an indirect type.
We sort of dealt with this in two pieces of code by checking for
obfPkg!=nil, but a third one did not have this check and caused a panic
in the added test case:
--- FAIL: TestScripts/reflect (0.81s)
testscript.go:397:
> env GOPRIVATE=test/main
> garble build
[stderr]
# test/main
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x8a5e39]
More importantly though, the nil check only avoids panics. It doesn't
fix the root cause of the problem: that importcfg does not contain
indirectly imported packages. The added test case would still fail, as
we would obfuscate a type in the main package, but not in the indirectly
imported package where the type is defined.
To fix this, resurrect a bit of code from earlier garble versions, which
uses "go list -toolexec=garble" to fetch a package's export file. This
lets us fill the indirect import gaps in importcfg, working around the
problem entirely.
This solution is still not particularly great, so we add a TODO about
possibly rethinking this in the future. It does add some overhead and
complexity, though thankfully indirect imports should be uncommon.
This fixes a few panics while building the protobuf module.
4 years ago
|
|
|
strct := tf.fieldToStruct[obj]
|
|
|
|
if strct == nil {
|
|
|
|
panic("could not find for " + node.Name)
|
hash field names equally in all packages
Packages P1 and P2 can define identical struct types T1 and T2, and one
can convert from type T1 to T2 or vice versa.
The spec defines two identical struct types as:
Two struct types are identical if they have the same sequence of
fields, and if corresponding fields have the same names, and
identical types, and identical tags. Non-exported field names
from different packages are always different.
Unfortunately, garble broke this: since we obfuscated field names
differently depending on the package, cross-package conversions like the
case above would result in typechecking errors.
To fix this, implement Joe Tsai's idea: hash struct field names with the
string representation of the entire struct. This way, identical struct
types will have their field names obfuscated in the same way in all
packages across a build.
Note that we had to refactor "reverse" a bit to start using transformer,
since now it needs to keep track of struct types as well.
This failure was affecting the build of google.golang.org/protobuf,
since it makes regular use of cross-package struct conversions.
Note that the protobuf module still fails to build, but for other
reasons. The package that used to fail now succeeds, so the build gets a
bit further than before. #240 tracks adding relevant third-party Go
modules to CI, so we'll track the other remaining failures there.
Fixes #310.
4 years ago
|
|
|
}
|
fix a number of issues involving types from indirect imports
obfuscatedTypesPackage is used to figure out if a name in a dependency
package was obfuscated or not. For example, if that package used
reflection on a named type, it wasn't obfuscated, so we must have the
same information to not obfuscate the same name downstream.
obfuscatedTypesPackage could return nil if the package was indirectly
imported, though. This can happen if a direct import has a function that
returns an indirect type, or if a direct import exposes a name that's a
type alias to an indirect type.
We sort of dealt with this in two pieces of code by checking for
obfPkg!=nil, but a third one did not have this check and caused a panic
in the added test case:
--- FAIL: TestScripts/reflect (0.81s)
testscript.go:397:
> env GOPRIVATE=test/main
> garble build
[stderr]
# test/main
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x8a5e39]
More importantly though, the nil check only avoids panics. It doesn't
fix the root cause of the problem: that importcfg does not contain
indirectly imported packages. The added test case would still fail, as
we would obfuscate a type in the main package, but not in the indirectly
imported package where the type is defined.
To fix this, resurrect a bit of code from earlier garble versions, which
uses "go list -toolexec=garble" to fetch a package's export file. This
lets us fill the indirect import gaps in importcfg, working around the
problem entirely.
This solution is still not particularly great, so we add a TODO about
possibly rethinking this in the future. It does add some overhead and
complexity, though thankfully indirect imports should be uncommon.
This fixes a few panics while building the protobuf module.
4 years ago
|
|
|
// TODO: We should probably strip field tags here.
|
|
|
|
// Do we need to do anything else to make a
|
|
|
|
// struct type "canonical"?
|
|
|
|
fieldsHash := []byte(strct.String())
|
|
|
|
hashToUse = addGarbleToHash(fieldsHash)
|
hash field names equally in all packages
Packages P1 and P2 can define identical struct types T1 and T2, and one
can convert from type T1 to T2 or vice versa.
The spec defines two identical struct types as:
Two struct types are identical if they have the same sequence of
fields, and if corresponding fields have the same names, and
identical types, and identical tags. Non-exported field names
from different packages are always different.
Unfortunately, garble broke this: since we obfuscated field names
differently depending on the package, cross-package conversions like the
case above would result in typechecking errors.
To fix this, implement Joe Tsai's idea: hash struct field names with the
string representation of the entire struct. This way, identical struct
types will have their field names obfuscated in the same way in all
packages across a build.
Note that we had to refactor "reverse" a bit to start using transformer,
since now it needs to keep track of struct types as well.
This failure was affecting the build of google.golang.org/protobuf,
since it makes regular use of cross-package struct conversions.
Note that the protobuf module still fails to build, but for other
reasons. The package that used to fail now succeeds, so the build gets a
bit further than before. #240 tracks adding relevant third-party Go
modules to CI, so we'll track the other remaining failures there.
Fixes #310.
4 years ago
|
|
|
|
|
|
|
// If the struct of this field was not obfuscated, do not obfuscate
|
|
|
|
// any of that struct's fields.
|
|
|
|
parent, ok := cursor.Parent().(*ast.SelectorExpr)
|
|
|
|
if !ok {
|
|
|
|
break
|
|
|
|
}
|
|
|
|
named := namedType(tf.info.TypeOf(parent.X))
|
|
|
|
if named == nil {
|
|
|
|
break // TODO(mvdan): add a test
|
|
|
|
}
|
|
|
|
if name := named.Obj().Name(); strings.HasPrefix(name, "_Ctype") {
|
|
|
|
// A field accessor on a cgo type, such as a C struct.
|
|
|
|
// We're not obfuscating cgo names.
|
|
|
|
return true
|
|
|
|
}
|
|
|
|
if path != curPkg.ImportPath {
|
|
|
|
obfPkg := obfuscatedTypesPackage(path)
|
|
|
|
if obfPkg.Scope().Lookup(named.Obj().Name()) != nil {
|
|
|
|
tf.recordIgnore(named, path)
|
|
|
|
return true
|
|
|
|
}
|
|
|
|
}
|
|
|
|
case *types.TypeName:
|
|
|
|
// If the type was not obfuscated in the package were it was defined,
|
|
|
|
// do not obfuscate it here.
|
fix a number of issues involving types from indirect imports
obfuscatedTypesPackage is used to figure out if a name in a dependency
package was obfuscated or not. For example, if that package used
reflection on a named type, it wasn't obfuscated, so we must have the
same information to not obfuscate the same name downstream.
obfuscatedTypesPackage could return nil if the package was indirectly
imported, though. This can happen if a direct import has a function that
returns an indirect type, or if a direct import exposes a name that's a
type alias to an indirect type.
We sort of dealt with this in two pieces of code by checking for
obfPkg!=nil, but a third one did not have this check and caused a panic
in the added test case:
--- FAIL: TestScripts/reflect (0.81s)
testscript.go:397:
> env GOPRIVATE=test/main
> garble build
[stderr]
# test/main
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x8a5e39]
More importantly though, the nil check only avoids panics. It doesn't
fix the root cause of the problem: that importcfg does not contain
indirectly imported packages. The added test case would still fail, as
we would obfuscate a type in the main package, but not in the indirectly
imported package where the type is defined.
To fix this, resurrect a bit of code from earlier garble versions, which
uses "go list -toolexec=garble" to fetch a package's export file. This
lets us fill the indirect import gaps in importcfg, working around the
problem entirely.
This solution is still not particularly great, so we add a TODO about
possibly rethinking this in the future. It does add some overhead and
complexity, though thankfully indirect imports should be uncommon.
This fixes a few panics while building the protobuf module.
4 years ago
|
|
|
if path != curPkg.ImportPath {
|
|
|
|
named := namedType(obj.Type())
|
|
|
|
if named == nil {
|
|
|
|
break // TODO(mvdan): add a test
|
|
|
|
}
|
fix a number of issues involving types from indirect imports
obfuscatedTypesPackage is used to figure out if a name in a dependency
package was obfuscated or not. For example, if that package used
reflection on a named type, it wasn't obfuscated, so we must have the
same information to not obfuscate the same name downstream.
obfuscatedTypesPackage could return nil if the package was indirectly
imported, though. This can happen if a direct import has a function that
returns an indirect type, or if a direct import exposes a name that's a
type alias to an indirect type.
We sort of dealt with this in two pieces of code by checking for
obfPkg!=nil, but a third one did not have this check and caused a panic
in the added test case:
--- FAIL: TestScripts/reflect (0.81s)
testscript.go:397:
> env GOPRIVATE=test/main
> garble build
[stderr]
# test/main
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x8a5e39]
More importantly though, the nil check only avoids panics. It doesn't
fix the root cause of the problem: that importcfg does not contain
indirectly imported packages. The added test case would still fail, as
we would obfuscate a type in the main package, but not in the indirectly
imported package where the type is defined.
To fix this, resurrect a bit of code from earlier garble versions, which
uses "go list -toolexec=garble" to fetch a package's export file. This
lets us fill the indirect import gaps in importcfg, working around the
problem entirely.
This solution is still not particularly great, so we add a TODO about
possibly rethinking this in the future. It does add some overhead and
complexity, though thankfully indirect imports should be uncommon.
This fixes a few panics while building the protobuf module.
4 years ago
|
|
|
obfPkg := obfuscatedTypesPackage(path)
|
|
|
|
if obfPkg.Scope().Lookup(obj.Name()) != nil {
|
|
|
|
tf.recordIgnore(named, path)
|
|
|
|
return true
|
|
|
|
}
|
|
|
|
}
|
|
|
|
case *types.Func:
|
|
|
|
sign := obj.Type().(*types.Signature)
|
|
|
|
if obj.Exported() && sign.Recv() != nil {
|
|
|
|
return true // might implement an interface
|
|
|
|
}
|
|
|
|
switch node.Name {
|
|
|
|
case "main", "init", "TestMain":
|
|
|
|
return true // don't break them
|
|
|
|
}
|
|
|
|
if strings.HasPrefix(node.Name, "Test") && isTestSignature(sign) {
|
|
|
|
return true // don't break tests
|
|
|
|
}
|
|
|
|
default:
|
|
|
|
return true // we only want to rename the above
|
|
|
|
}
|
fix garbling names belonging to indirect imports (#203)
main.go includes a lengthy comment that documents this edge case, why it
happened, and how we are fixing it. To summarize, we should no longer
error with a build error in those cases. Read the comment for details.
A few other minor changes were done to allow writing this patch.
First, the actionID and contentID funcs were renamed, since they started
to collide with variable names.
Second, the logging has been improved a bit, which allowed me to debug
the issue.
Third, the "cache" global shared by all garble sub-processes now
includes the necessary parameters to run "go list -toolexec", including
the path to garble and the build flags being used.
Thanks to lu4p for writing a test case, which also applied gofmt to that
testdata Go file.
Fixes #180.
Closes #181, since it includes its test case.
4 years ago
|
|
|
|
|
|
|
origName := node.Name
|
|
|
|
_ = origName // used for debug prints below
|
|
|
|
|
hash field names equally in all packages
Packages P1 and P2 can define identical struct types T1 and T2, and one
can convert from type T1 to T2 or vice versa.
The spec defines two identical struct types as:
Two struct types are identical if they have the same sequence of
fields, and if corresponding fields have the same names, and
identical types, and identical tags. Non-exported field names
from different packages are always different.
Unfortunately, garble broke this: since we obfuscated field names
differently depending on the package, cross-package conversions like the
case above would result in typechecking errors.
To fix this, implement Joe Tsai's idea: hash struct field names with the
string representation of the entire struct. This way, identical struct
types will have their field names obfuscated in the same way in all
packages across a build.
Note that we had to refactor "reverse" a bit to start using transformer,
since now it needs to keep track of struct types as well.
This failure was affecting the build of google.golang.org/protobuf,
since it makes regular use of cross-package struct conversions.
Note that the protobuf module still fails to build, but for other
reasons. The package that used to fail now succeeds, so the build gets a
bit further than before. #240 tracks adding relevant third-party Go
modules to CI, so we'll track the other remaining failures there.
Fixes #310.
4 years ago
|
|
|
node.Name = hashWith(hashToUse, node.Name)
|
fix a number of issues involving types from indirect imports
obfuscatedTypesPackage is used to figure out if a name in a dependency
package was obfuscated or not. For example, if that package used
reflection on a named type, it wasn't obfuscated, so we must have the
same information to not obfuscate the same name downstream.
obfuscatedTypesPackage could return nil if the package was indirectly
imported, though. This can happen if a direct import has a function that
returns an indirect type, or if a direct import exposes a name that's a
type alias to an indirect type.
We sort of dealt with this in two pieces of code by checking for
obfPkg!=nil, but a third one did not have this check and caused a panic
in the added test case:
--- FAIL: TestScripts/reflect (0.81s)
testscript.go:397:
> env GOPRIVATE=test/main
> garble build
[stderr]
# test/main
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x8a5e39]
More importantly though, the nil check only avoids panics. It doesn't
fix the root cause of the problem: that importcfg does not contain
indirectly imported packages. The added test case would still fail, as
we would obfuscate a type in the main package, but not in the indirectly
imported package where the type is defined.
To fix this, resurrect a bit of code from earlier garble versions, which
uses "go list -toolexec=garble" to fetch a package's export file. This
lets us fill the indirect import gaps in importcfg, working around the
problem entirely.
This solution is still not particularly great, so we add a TODO about
possibly rethinking this in the future. It does add some overhead and
complexity, though thankfully indirect imports should be uncommon.
This fixes a few panics while building the protobuf module.
4 years ago
|
|
|
// log.Printf("%q (%T) hashed with %x to %q", origName, obj, hashToUse, node.Name)
|
|
|
|
return true
|
|
|
|
}
|
|
|
|
post := func(cursor *astutil.Cursor) bool {
|
|
|
|
imp, ok := cursor.Node().(*ast.ImportSpec)
|
|
|
|
if !ok {
|
|
|
|
return true
|
|
|
|
}
|
|
|
|
path, err := strconv.Unquote(imp.Path.Value)
|
|
|
|
if err != nil {
|
|
|
|
panic(err) // should never happen
|
|
|
|
}
|
|
|
|
// We're importing an obfuscated package.
|
|
|
|
// Replace the import path with its obfuscated version.
|
|
|
|
// If the import was unnamed, give it the name of the
|
|
|
|
// original package name, to keep references working.
|
|
|
|
lpkg, err := listPackage(path)
|
|
|
|
if err != nil {
|
|
|
|
panic(err) // should never happen
|
|
|
|
}
|
deprecate using GOPRIVATE in favor of GOGARBLE (#427)
Piggybacking off of GOPRIVATE is great for a number of reasons:
* People tend to obfuscate private code, whose package paths will
generally be in GOPRIVATE already
* Its meaning and syntax are well understood
* It allows all the flexibility we need without adding our own env var
or config option
However, using GOPRIVATE directly has one main drawback.
It's fairly common to also want to obfuscate public dependencies,
to make the code in private packages even harder to follow.
However, using "GOPRIVATE=*" will result in two main downsides:
* GONOPROXY defaults to GOPRIVATE, so the proxy would be entirely disabled.
Downloading modules, such as when adding or updating dependencies,
or when the local cache is cold, can be less reliable.
* GONOSUMDB defaults to GOPRIVATE, so the sumdb would be entirely disabled.
Adding entries to go.sum, such as when adding or updating dependencies,
can be less secure.
We will continue to consume GOPRIVATE as a fallback,
but we now expect users to set GOGARBLE instead.
The new logic is documented in the README.
While here, rewrite some uses of "private" with "to obfuscate",
to make the code easier to follow and harder to misunderstand.
Fixes #276.
3 years ago
|
|
|
if !lpkg.ToObfuscate {
|
|
|
|
return true
|
|
|
|
}
|
|
|
|
newPath := lpkg.obfuscatedImportPath()
|
|
|
|
imp.Path.Value = strconv.Quote(newPath)
|
|
|
|
if imp.Name == nil {
|
|
|
|
imp.Name = &ast.Ident{Name: lpkg.Name}
|
|
|
|
}
|
|
|
|
return true
|
|
|
|
}
|
|
|
|
|
|
|
|
return astutil.Apply(file, pre, post).(*ast.File)
|
|
|
|
}
|
|
|
|
|
|
|
|
// locateForeignAlias finds the TypeName for an alias by the name aliasName,
|
|
|
|
// which must be declared in one of the dependencies of dependentImportPath.
|
|
|
|
func locateForeignAlias(dependentImportPath, aliasName string) *types.TypeName {
|
|
|
|
var found *types.TypeName
|
|
|
|
lpkg, err := listPackage(dependentImportPath)
|
|
|
|
if err != nil {
|
|
|
|
panic(err) // shouldn't happen
|
|
|
|
}
|
|
|
|
for _, importedPath := range lpkg.Imports {
|
|
|
|
pkg2, err := origImporter.ImportFrom(importedPath, opts.GarbleDir, 0)
|
|
|
|
if err != nil {
|
|
|
|
panic(err)
|
|
|
|
}
|
|
|
|
tname, ok := pkg2.Scope().Lookup(aliasName).(*types.TypeName)
|
|
|
|
if ok && tname.IsAlias() {
|
|
|
|
if found != nil {
|
|
|
|
// We assume that the alias is declared exactly
|
|
|
|
// once in the set of direct imports.
|
|
|
|
// This might not be the case, e.g. if two
|
|
|
|
// imports declare the same alias name.
|
|
|
|
//
|
|
|
|
// TODO: Think how we could solve that
|
|
|
|
// efficiently, if it happens in practice.
|
|
|
|
panic(fmt.Sprintf("found multiple TypeNames for %s", aliasName))
|
|
|
|
}
|
|
|
|
found = tname
|
|
|
|
}
|
|
|
|
}
|
|
|
|
if found == nil {
|
|
|
|
// This should never happen.
|
|
|
|
// If package A embeds an alias declared in a dependency,
|
|
|
|
// it must show up in the form of "B.Alias",
|
|
|
|
// so A must import B and B must declare "Alias".
|
|
|
|
panic(fmt.Sprintf("could not find TypeName for %s", aliasName))
|
|
|
|
}
|
|
|
|
return found
|
|
|
|
}
|
|
|
|
|
record types into ignoreObjects more reliably
Our previous logic only took care of fairly simple types, such as a
simple struct or a pointer to a struct. If we had a struct embedding
another struct, we'd fail to record the objects for the fields in the
inner struct, and that would lead to miscompilation:
> garble build
[stderr]
# test/main
LZmt64Nm.go:7: outer.InnerField undefined (type *CcUt1wkQ.EmbeddingOuter has no field or method InnerField)
To fix this issue, make the function that records all objects under a
types.Type smarter. Since it now does more than just dealing with
structs, it's also renamed.
Since the function now walks types properly, we get to remove the extra
ast.Inspect in recordReflectArgs, which is nice.
We also make it a method, to avoid the map parameter. A boolean
parameter is also added, since we need this feature to only look at the
current package when looking at reflect calls.
Finally, we add a test case, a simplified version of the original bug
report.
Fixes #315.
4 years ago
|
|
|
// recordIgnore adds any named types (including fields) under typ to
|
|
|
|
// ignoreObjects.
|
|
|
|
//
|
|
|
|
// Only the names declared in package pkgPath are recorded. This is to ensure
|
|
|
|
// that reflection detection only happens within the package declaring a type.
|
|
|
|
// Detecting it in downstream packages could result in inconsistencies.
|
|
|
|
func (tf *transformer) recordIgnore(t types.Type, pkgPath string) {
|
record types into ignoreObjects more reliably
Our previous logic only took care of fairly simple types, such as a
simple struct or a pointer to a struct. If we had a struct embedding
another struct, we'd fail to record the objects for the fields in the
inner struct, and that would lead to miscompilation:
> garble build
[stderr]
# test/main
LZmt64Nm.go:7: outer.InnerField undefined (type *CcUt1wkQ.EmbeddingOuter has no field or method InnerField)
To fix this issue, make the function that records all objects under a
types.Type smarter. Since it now does more than just dealing with
structs, it's also renamed.
Since the function now walks types properly, we get to remove the extra
ast.Inspect in recordReflectArgs, which is nice.
We also make it a method, to avoid the map parameter. A boolean
parameter is also added, since we need this feature to only look at the
current package when looking at reflect calls.
Finally, we add a test case, a simplified version of the original bug
report.
Fixes #315.
4 years ago
|
|
|
switch t := t.(type) {
|
|
|
|
case *types.Named:
|
|
|
|
obj := t.Obj()
|
|
|
|
if obj.Pkg() == nil || obj.Pkg().Path() != pkgPath {
|
|
|
|
return // not from the specified package
|
record types into ignoreObjects more reliably
Our previous logic only took care of fairly simple types, such as a
simple struct or a pointer to a struct. If we had a struct embedding
another struct, we'd fail to record the objects for the fields in the
inner struct, and that would lead to miscompilation:
> garble build
[stderr]
# test/main
LZmt64Nm.go:7: outer.InnerField undefined (type *CcUt1wkQ.EmbeddingOuter has no field or method InnerField)
To fix this issue, make the function that records all objects under a
types.Type smarter. Since it now does more than just dealing with
structs, it's also renamed.
Since the function now walks types properly, we get to remove the extra
ast.Inspect in recordReflectArgs, which is nice.
We also make it a method, to avoid the map parameter. A boolean
parameter is also added, since we need this feature to only look at the
current package when looking at reflect calls.
Finally, we add a test case, a simplified version of the original bug
report.
Fixes #315.
4 years ago
|
|
|
}
|
|
|
|
if tf.ignoreObjects[obj] {
|
|
|
|
return // prevent endless recursion
|
|
|
|
}
|
|
|
|
tf.ignoreObjects[obj] = true
|
|
|
|
|
|
|
|
// Record the underlying type, too.
|
|
|
|
tf.recordIgnore(t.Underlying(), pkgPath)
|
record types into ignoreObjects more reliably
Our previous logic only took care of fairly simple types, such as a
simple struct or a pointer to a struct. If we had a struct embedding
another struct, we'd fail to record the objects for the fields in the
inner struct, and that would lead to miscompilation:
> garble build
[stderr]
# test/main
LZmt64Nm.go:7: outer.InnerField undefined (type *CcUt1wkQ.EmbeddingOuter has no field or method InnerField)
To fix this issue, make the function that records all objects under a
types.Type smarter. Since it now does more than just dealing with
structs, it's also renamed.
Since the function now walks types properly, we get to remove the extra
ast.Inspect in recordReflectArgs, which is nice.
We also make it a method, to avoid the map parameter. A boolean
parameter is also added, since we need this feature to only look at the
current package when looking at reflect calls.
Finally, we add a test case, a simplified version of the original bug
report.
Fixes #315.
4 years ago
|
|
|
|
|
|
|
case *types.Struct:
|
|
|
|
for i := 0; i < t.NumFields(); i++ {
|
|
|
|
field := t.Field(i)
|
|
|
|
|
|
|
|
// This check is similar to the one in *types.Named.
|
|
|
|
// It's necessary for unnamed struct types,
|
|
|
|
// as they aren't named but still have named fields.
|
|
|
|
if field.Pkg() == nil || field.Pkg().Path() != pkgPath {
|
|
|
|
return // not from the specified package
|
|
|
|
}
|
|
|
|
|
record types into ignoreObjects more reliably
Our previous logic only took care of fairly simple types, such as a
simple struct or a pointer to a struct. If we had a struct embedding
another struct, we'd fail to record the objects for the fields in the
inner struct, and that would lead to miscompilation:
> garble build
[stderr]
# test/main
LZmt64Nm.go:7: outer.InnerField undefined (type *CcUt1wkQ.EmbeddingOuter has no field or method InnerField)
To fix this issue, make the function that records all objects under a
types.Type smarter. Since it now does more than just dealing with
structs, it's also renamed.
Since the function now walks types properly, we get to remove the extra
ast.Inspect in recordReflectArgs, which is nice.
We also make it a method, to avoid the map parameter. A boolean
parameter is also added, since we need this feature to only look at the
current package when looking at reflect calls.
Finally, we add a test case, a simplified version of the original bug
report.
Fixes #315.
4 years ago
|
|
|
// Record the field itself, too.
|
|
|
|
tf.ignoreObjects[field] = true
|
|
|
|
|
|
|
|
tf.recordIgnore(field.Type(), pkgPath)
|
record types into ignoreObjects more reliably
Our previous logic only took care of fairly simple types, such as a
simple struct or a pointer to a struct. If we had a struct embedding
another struct, we'd fail to record the objects for the fields in the
inner struct, and that would lead to miscompilation:
> garble build
[stderr]
# test/main
LZmt64Nm.go:7: outer.InnerField undefined (type *CcUt1wkQ.EmbeddingOuter has no field or method InnerField)
To fix this issue, make the function that records all objects under a
types.Type smarter. Since it now does more than just dealing with
structs, it's also renamed.
Since the function now walks types properly, we get to remove the extra
ast.Inspect in recordReflectArgs, which is nice.
We also make it a method, to avoid the map parameter. A boolean
parameter is also added, since we need this feature to only look at the
current package when looking at reflect calls.
Finally, we add a test case, a simplified version of the original bug
report.
Fixes #315.
4 years ago
|
|
|
}
|
|
|
|
|
|
|
|
case interface{ Elem() types.Type }:
|
|
|
|
// Get past pointers, slices, etc.
|
|
|
|
tf.recordIgnore(t.Elem(), pkgPath)
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
// named tries to obtain the *types.Named behind a type, if there is one.
|
|
|
|
// This is useful to obtain "testing.T" from "*testing.T", or to obtain the type
|
|
|
|
// declaration object from an embedded field.
|
|
|
|
func namedType(t types.Type) *types.Named {
|
|
|
|
switch t := t.(type) {
|
|
|
|
case *types.Named:
|
|
|
|
return t
|
|
|
|
case interface{ Elem() types.Type }:
|
|
|
|
return namedType(t.Elem())
|
|
|
|
default:
|
|
|
|
return nil
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
// isTestSignature returns true if the signature matches "func _(*testing.T)".
|
|
|
|
func isTestSignature(sign *types.Signature) bool {
|
|
|
|
if sign.Recv() != nil {
|
|
|
|
return false // test funcs don't have receivers
|
|
|
|
}
|
|
|
|
params := sign.Params()
|
|
|
|
if params.Len() != 1 {
|
|
|
|
return false // too many parameters for a test func
|
|
|
|
}
|
|
|
|
named := namedType(params.At(0).Type())
|
|
|
|
if named == nil {
|
|
|
|
return false // the only parameter isn't named, like "string"
|
|
|
|
}
|
|
|
|
obj := named.Obj()
|
|
|
|
return obj != nil && obj.Pkg().Path() == "testing" && obj.Name() == "T"
|
|
|
|
}
|
|
|
|
|
reimplement import path obfuscation without goobj2 (#242)
We used to rely on a parallel implementation of an object file parser
and writer to be able to obfuscate import paths. After compiling each
package, we would parse the object file, replace the import paths, and
write the updated object file in-place.
That worked well, in most cases. Unfortunately, it had some flaws:
* Complexity. Even when most of the code is maintained in a separate
module, the import_obfuscation.go file was still close to a thousand
lines of code.
* Go compatibility. The object file format changes between Go releases,
so we were supporting Go 1.15, but not 1.16. Fixing the object file
package to work with 1.16 would probably break 1.15 support.
* Bugs. For example, we recently had to add a workaround for #224, since
import paths containing dots after the domain would end up escaped.
Another example is #190, which seems to be caused by the object file
parser or writer corrupting the compiled code and causing segfaults in
some rare edge cases.
Instead, let's drop that method entirely, and force the compiler and
linker to do the work for us. The steps necessary when compiling a
package to obfuscate are:
1) Replace its "package foo" lines with the obfuscated package path. No
need to separate the package path and name, since the obfuscated path
does not contain slashes.
2) Replace the "-p pkg/foo" flag with the obfuscated path.
3) Replace the "import" spec lines with the obfuscated package paths,
for those dependencies which were obfuscated.
4) Replace the "-importcfg [...]" file with a version that uses the
obfuscated paths instead.
The linker also needs that last step, since it also uses an importcfg
file to find object files.
There are three noteworthy drawbacks to this new method:
1) Since we no longer write object files, we can't use them to store
data to be cached. As such, the -debugdir flag goes back to using the
"-a" build flag to always rebuild all packages. On the plus side,
that caching didn't work very well; see #176.
2) The package name "main" remains in all declarations under it, not
just "func main", since we can only rename entire packages. This
seems fine, as it gives little information to the end user.
3) The -tiny mode no longer sets all lines to 0, since it did that by
modifying object files. As a temporary measure, we instead set all
top-level declarations to be on line 1. A TODO is added to hopefully
improve this again in the near future.
The upside is that we get rid of all the issues mentioned before. Plus,
garble now nearly works with Go 1.16, with the exception of two very
minor bugs that look fixable. A follow-up PR will take care of that and
start testing on 1.16.
Fixes #176.
Fixes #190.
4 years ago
|
|
|
func transformLink(args []string) ([]string, error) {
|
initial support for build caching (#142)
As per the discussion in https://github.com/golang/go/issues/41145, it
turns out that we don't need special support for build caching in
-toolexec. We can simply modify the behavior of "[...]/compile -V=full"
and "[...]/link -V=full" so that they include garble's own version and
options in the printed build ID.
The part of the build ID that matters is the last, since it's the
"content ID" which is used to work out whether there is a need to redo
the action (build) or not. Since cmd/go parses the last word in the
output as "buildID=...", we simply add "+garble buildID=_/_/_/${hash}".
The slashes let us imitate a full binary build ID, but we assume that
the other components such as the action ID are not necessary, since the
only reader here is cmd/go and it only consumes the content ID.
The reported content ID includes the tool's original content ID,
garble's own content ID from the built binary, and the garble options
which modify how we obfuscate code. If any of the three changes, we
should use a different build cache key. GOPRIVATE also affects caching,
since a different GOPRIVATE value means that we might have to garble a
different set of packages.
Include tests, which mainly check that 'garble build -v' prints package
lines when we expect to always need to rebuild packages, and that it
prints nothing when we should be reusing the build cache even when the
built binary is missing.
After this change, 'go test' on Go 1.15.2 stabilizes at about 8s on my
machine, whereas it used to be at around 25s before.
5 years ago
|
|
|
// We can't split by the ".a" extension, because cached object files
|
|
|
|
// lack any extension.
|
reimplement import path obfuscation without goobj2 (#242)
We used to rely on a parallel implementation of an object file parser
and writer to be able to obfuscate import paths. After compiling each
package, we would parse the object file, replace the import paths, and
write the updated object file in-place.
That worked well, in most cases. Unfortunately, it had some flaws:
* Complexity. Even when most of the code is maintained in a separate
module, the import_obfuscation.go file was still close to a thousand
lines of code.
* Go compatibility. The object file format changes between Go releases,
so we were supporting Go 1.15, but not 1.16. Fixing the object file
package to work with 1.16 would probably break 1.15 support.
* Bugs. For example, we recently had to add a workaround for #224, since
import paths containing dots after the domain would end up escaped.
Another example is #190, which seems to be caused by the object file
parser or writer corrupting the compiled code and causing segfaults in
some rare edge cases.
Instead, let's drop that method entirely, and force the compiler and
linker to do the work for us. The steps necessary when compiling a
package to obfuscate are:
1) Replace its "package foo" lines with the obfuscated package path. No
need to separate the package path and name, since the obfuscated path
does not contain slashes.
2) Replace the "-p pkg/foo" flag with the obfuscated path.
3) Replace the "import" spec lines with the obfuscated package paths,
for those dependencies which were obfuscated.
4) Replace the "-importcfg [...]" file with a version that uses the
obfuscated paths instead.
The linker also needs that last step, since it also uses an importcfg
file to find object files.
There are three noteworthy drawbacks to this new method:
1) Since we no longer write object files, we can't use them to store
data to be cached. As such, the -debugdir flag goes back to using the
"-a" build flag to always rebuild all packages. On the plus side,
that caching didn't work very well; see #176.
2) The package name "main" remains in all declarations under it, not
just "func main", since we can only rename entire packages. This
seems fine, as it gives little information to the end user.
3) The -tiny mode no longer sets all lines to 0, since it did that by
modifying object files. As a temporary measure, we instead set all
top-level declarations to be on line 1. A TODO is added to hopefully
improve this again in the near future.
The upside is that we get rid of all the issues mentioned before. Plus,
garble now nearly works with Go 1.16, with the exception of two very
minor bugs that look fixable. A follow-up PR will take care of that and
start testing on 1.16.
Fixes #176.
Fixes #190.
4 years ago
|
|
|
flags, args := splitFlagsFromArgs(args)
|
|
|
|
|
avoid one more call to 'go tool buildid' (#253)
We use it to get the content ID of garble's binary, which is used for
both the garble action IDs, as well as 'go tool compile -V=full'.
Since those two happen in separate processes, both used to call 'go tool
buildid' separately. Store it in the gob cache the first time, and reuse
it the second time.
Since each call to cmd/go costs about 10ms (new process, running its
many init funcs, etc), this results in a nice speed-up for our small
benchmark. Most builds will take many seconds though, so note that a
~15ms speedup there will likely not be noticeable.
While at it, simplify the buildInfo global, as now it just contains a
map representation of the -importcfg contents. It now has better names,
docs, and a simpler representation.
We also stop using the term "garbled import", as it was a bit confusing.
"obfuscated types.Package" is a much better description.
name old time/op new time/op delta
Build-8 106ms ± 1% 92ms ± 0% -14.07% (p=0.010 n=6+4)
name old bin-B new bin-B delta
Build-8 6.60M ± 0% 6.60M ± 0% -0.01% (p=0.002 n=6+6)
name old sys-time/op new sys-time/op delta
Build-8 208ms ± 5% 149ms ± 3% -28.27% (p=0.004 n=6+5)
name old user-time/op new user-time/op delta
Build-8 433ms ± 3% 384ms ± 3% -11.35% (p=0.002 n=6+6)
4 years ago
|
|
|
newImportCfg, err := processImportCfg(flags)
|
|
|
|
if err != nil {
|
reimplement import path obfuscation without goobj2 (#242)
We used to rely on a parallel implementation of an object file parser
and writer to be able to obfuscate import paths. After compiling each
package, we would parse the object file, replace the import paths, and
write the updated object file in-place.
That worked well, in most cases. Unfortunately, it had some flaws:
* Complexity. Even when most of the code is maintained in a separate
module, the import_obfuscation.go file was still close to a thousand
lines of code.
* Go compatibility. The object file format changes between Go releases,
so we were supporting Go 1.15, but not 1.16. Fixing the object file
package to work with 1.16 would probably break 1.15 support.
* Bugs. For example, we recently had to add a workaround for #224, since
import paths containing dots after the domain would end up escaped.
Another example is #190, which seems to be caused by the object file
parser or writer corrupting the compiled code and causing segfaults in
some rare edge cases.
Instead, let's drop that method entirely, and force the compiler and
linker to do the work for us. The steps necessary when compiling a
package to obfuscate are:
1) Replace its "package foo" lines with the obfuscated package path. No
need to separate the package path and name, since the obfuscated path
does not contain slashes.
2) Replace the "-p pkg/foo" flag with the obfuscated path.
3) Replace the "import" spec lines with the obfuscated package paths,
for those dependencies which were obfuscated.
4) Replace the "-importcfg [...]" file with a version that uses the
obfuscated paths instead.
The linker also needs that last step, since it also uses an importcfg
file to find object files.
There are three noteworthy drawbacks to this new method:
1) Since we no longer write object files, we can't use them to store
data to be cached. As such, the -debugdir flag goes back to using the
"-a" build flag to always rebuild all packages. On the plus side,
that caching didn't work very well; see #176.
2) The package name "main" remains in all declarations under it, not
just "func main", since we can only rename entire packages. This
seems fine, as it gives little information to the end user.
3) The -tiny mode no longer sets all lines to 0, since it did that by
modifying object files. As a temporary measure, we instead set all
top-level declarations to be on line 1. A TODO is added to hopefully
improve this again in the near future.
The upside is that we get rid of all the issues mentioned before. Plus,
garble now nearly works with Go 1.16, with the exception of two very
minor bugs that look fixable. A follow-up PR will take care of that and
start testing on 1.16.
Fixes #176.
Fixes #190.
4 years ago
|
|
|
return nil, err
|
|
|
|
}
|
|
|
|
|
|
|
|
// Make sure -X works with obfuscated identifiers.
|
|
|
|
// To cover both obfuscated and non-obfuscated names,
|
|
|
|
// duplicate each flag with a obfuscated version.
|
|
|
|
flagValueIter(flags, "-X", func(val string) {
|
|
|
|
// val is in the form of "pkg.name=str"
|
|
|
|
i := strings.IndexByte(val, '=')
|
|
|
|
if i <= 0 {
|
|
|
|
return
|
|
|
|
}
|
|
|
|
name := val[:i]
|
|
|
|
str := val[i+1:]
|
|
|
|
j := strings.LastIndexByte(name, '.')
|
|
|
|
if j <= 0 {
|
|
|
|
return
|
|
|
|
}
|
|
|
|
pkg := name[:j]
|
|
|
|
name = name[j+1:]
|
|
|
|
|
refactor "current package" with TOOLEXEC_IMPORTPATH (#266)
Now that we've dropped support for Go 1.15.x, we can finally rely on
this environment variable for toolexec calls, present in Go 1.16.
Before, we had hacky ways of trying to figure out the current package's
import path, mostly from the -p flag. The biggest rough edge there was
that, for main packages, that was simply the package name, and not its
full import path.
To work around that, we had a restriction on a single main package, so
we could work around that issue. That restriction is now gone.
The new code is simpler, especially because we can set curPkg in a
single place for all toolexec transform funcs.
Since we can always rely on curPkg not being nil now, we can also start
reusing listedPackage.Private and avoid the majority of repeated calls
to isPrivate. The function is cheap, but still not free.
isPrivate itself can also get simpler. We no longer have to worry about
the "main" edge case. Plus, the sanity check for invalid package paths
is now unnecessary; we only got malformed paths from goobj2, and we now
require exact matches with the ImportPath field from "go list -json".
Another effect of clearing up the "main" edge case is that -debugdir now
uses the right directory for main packages. We also start using
consistent debugdir paths in the tests, for the sake of being easier to
read and maintain.
Finally, note that commandReverse did not need the extra call to "go
list -toolexec", as the "shared" call stored in the cache is enough. We
still call toolexecCmd to get said cache, which should probably be
simplified in a future PR.
While at it, replace the use of the "-std" compiler flag with the
Standard field from "go list -json".
4 years ago
|
|
|
// If the package path is "main", it's the current top-level
|
|
|
|
// package we are linking.
|
|
|
|
// Otherwise, find it in the cache.
|
|
|
|
lpkg := curPkg
|
|
|
|
if pkg != "main" {
|
|
|
|
lpkg = cache.ListedPackages[pkg]
|
|
|
|
}
|
ignore -ldflags=-X flags mentioning unknown packages
That would panic, since the *listedPackage would be nil for a package
path we aren't aware of:
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x88 pc=0x126b57d]
goroutine 1 [running]:
main.transformLink.func1(0x7ffeefbff28b, 0x5d)
mvdan.cc/garble@v0.0.0-20210302140807-b03cd08c0946/main.go:1260 +0x17d
main.flagValueIter(0xc0000a8e20, 0x2f, 0x2f, 0x12e278e, 0x2, 0xc000129e28)
mvdan.cc/garble@v0.0.0-20210302140807-b03cd08c0946/main.go:1410 +0x1e9
main.transformLink(0xc0000a8e20, 0x30, 0x36, 0x4, 0xc000114648, 0x23, 0x12dfd60, 0x0)
mvdan.cc/garble@v0.0.0-20210302140807-b03cd08c0946/main.go:1241 +0x1b9
main.mainErr(0xc0000a8e10, 0x31, 0x37, 0x37, 0x0)
mvdan.cc/garble@v0.0.0-20210302140807-b03cd08c0946/main.go:287 +0x389
main.main1(0xc000096058)
mvdan.cc/garble@v0.0.0-20210302140807-b03cd08c0946/main.go:150 +0xe7
main.main()
mvdan.cc/garble@v0.0.0-20210302140807-b03cd08c0946/main.go:83 +0x25
The linker ignores such unknown references, so we should too.
Fixes #259.
4 years ago
|
|
|
if lpkg == nil {
|
|
|
|
// We couldn't find the package.
|
|
|
|
// Perhaps a typo, perhaps not part of the build.
|
|
|
|
// cmd/link ignores those, so we should too.
|
|
|
|
return
|
|
|
|
}
|
refactor "current package" with TOOLEXEC_IMPORTPATH (#266)
Now that we've dropped support for Go 1.15.x, we can finally rely on
this environment variable for toolexec calls, present in Go 1.16.
Before, we had hacky ways of trying to figure out the current package's
import path, mostly from the -p flag. The biggest rough edge there was
that, for main packages, that was simply the package name, and not its
full import path.
To work around that, we had a restriction on a single main package, so
we could work around that issue. That restriction is now gone.
The new code is simpler, especially because we can set curPkg in a
single place for all toolexec transform funcs.
Since we can always rely on curPkg not being nil now, we can also start
reusing listedPackage.Private and avoid the majority of repeated calls
to isPrivate. The function is cheap, but still not free.
isPrivate itself can also get simpler. We no longer have to worry about
the "main" edge case. Plus, the sanity check for invalid package paths
is now unnecessary; we only got malformed paths from goobj2, and we now
require exact matches with the ImportPath field from "go list -json".
Another effect of clearing up the "main" edge case is that -debugdir now
uses the right directory for main packages. We also start using
consistent debugdir paths in the tests, for the sake of being easier to
read and maintain.
Finally, note that commandReverse did not need the extra call to "go
list -toolexec", as the "shared" call stored in the cache is enough. We
still call toolexecCmd to get said cache, which should probably be
simplified in a future PR.
While at it, replace the use of the "-std" compiler flag with the
Standard field from "go list -json".
4 years ago
|
|
|
// As before, the main package must remain as "main".
|
reimplement import path obfuscation without goobj2 (#242)
We used to rely on a parallel implementation of an object file parser
and writer to be able to obfuscate import paths. After compiling each
package, we would parse the object file, replace the import paths, and
write the updated object file in-place.
That worked well, in most cases. Unfortunately, it had some flaws:
* Complexity. Even when most of the code is maintained in a separate
module, the import_obfuscation.go file was still close to a thousand
lines of code.
* Go compatibility. The object file format changes between Go releases,
so we were supporting Go 1.15, but not 1.16. Fixing the object file
package to work with 1.16 would probably break 1.15 support.
* Bugs. For example, we recently had to add a workaround for #224, since
import paths containing dots after the domain would end up escaped.
Another example is #190, which seems to be caused by the object file
parser or writer corrupting the compiled code and causing segfaults in
some rare edge cases.
Instead, let's drop that method entirely, and force the compiler and
linker to do the work for us. The steps necessary when compiling a
package to obfuscate are:
1) Replace its "package foo" lines with the obfuscated package path. No
need to separate the package path and name, since the obfuscated path
does not contain slashes.
2) Replace the "-p pkg/foo" flag with the obfuscated path.
3) Replace the "import" spec lines with the obfuscated package paths,
for those dependencies which were obfuscated.
4) Replace the "-importcfg [...]" file with a version that uses the
obfuscated paths instead.
The linker also needs that last step, since it also uses an importcfg
file to find object files.
There are three noteworthy drawbacks to this new method:
1) Since we no longer write object files, we can't use them to store
data to be cached. As such, the -debugdir flag goes back to using the
"-a" build flag to always rebuild all packages. On the plus side,
that caching didn't work very well; see #176.
2) The package name "main" remains in all declarations under it, not
just "func main", since we can only rename entire packages. This
seems fine, as it gives little information to the end user.
3) The -tiny mode no longer sets all lines to 0, since it did that by
modifying object files. As a temporary measure, we instead set all
top-level declarations to be on line 1. A TODO is added to hopefully
improve this again in the near future.
The upside is that we get rid of all the issues mentioned before. Plus,
garble now nearly works with Go 1.16, with the exception of two very
minor bugs that look fixable. A follow-up PR will take care of that and
start testing on 1.16.
Fixes #176.
Fixes #190.
4 years ago
|
|
|
newPkg := pkg
|
refactor "current package" with TOOLEXEC_IMPORTPATH (#266)
Now that we've dropped support for Go 1.15.x, we can finally rely on
this environment variable for toolexec calls, present in Go 1.16.
Before, we had hacky ways of trying to figure out the current package's
import path, mostly from the -p flag. The biggest rough edge there was
that, for main packages, that was simply the package name, and not its
full import path.
To work around that, we had a restriction on a single main package, so
we could work around that issue. That restriction is now gone.
The new code is simpler, especially because we can set curPkg in a
single place for all toolexec transform funcs.
Since we can always rely on curPkg not being nil now, we can also start
reusing listedPackage.Private and avoid the majority of repeated calls
to isPrivate. The function is cheap, but still not free.
isPrivate itself can also get simpler. We no longer have to worry about
the "main" edge case. Plus, the sanity check for invalid package paths
is now unnecessary; we only got malformed paths from goobj2, and we now
require exact matches with the ImportPath field from "go list -json".
Another effect of clearing up the "main" edge case is that -debugdir now
uses the right directory for main packages. We also start using
consistent debugdir paths in the tests, for the sake of being easier to
read and maintain.
Finally, note that commandReverse did not need the extra call to "go
list -toolexec", as the "shared" call stored in the cache is enough. We
still call toolexecCmd to get said cache, which should probably be
simplified in a future PR.
While at it, replace the use of the "-std" compiler flag with the
Standard field from "go list -json".
4 years ago
|
|
|
if pkg != "main" {
|
|
|
|
newPkg = lpkg.obfuscatedImportPath()
|
reimplement import path obfuscation without goobj2 (#242)
We used to rely on a parallel implementation of an object file parser
and writer to be able to obfuscate import paths. After compiling each
package, we would parse the object file, replace the import paths, and
write the updated object file in-place.
That worked well, in most cases. Unfortunately, it had some flaws:
* Complexity. Even when most of the code is maintained in a separate
module, the import_obfuscation.go file was still close to a thousand
lines of code.
* Go compatibility. The object file format changes between Go releases,
so we were supporting Go 1.15, but not 1.16. Fixing the object file
package to work with 1.16 would probably break 1.15 support.
* Bugs. For example, we recently had to add a workaround for #224, since
import paths containing dots after the domain would end up escaped.
Another example is #190, which seems to be caused by the object file
parser or writer corrupting the compiled code and causing segfaults in
some rare edge cases.
Instead, let's drop that method entirely, and force the compiler and
linker to do the work for us. The steps necessary when compiling a
package to obfuscate are:
1) Replace its "package foo" lines with the obfuscated package path. No
need to separate the package path and name, since the obfuscated path
does not contain slashes.
2) Replace the "-p pkg/foo" flag with the obfuscated path.
3) Replace the "import" spec lines with the obfuscated package paths,
for those dependencies which were obfuscated.
4) Replace the "-importcfg [...]" file with a version that uses the
obfuscated paths instead.
The linker also needs that last step, since it also uses an importcfg
file to find object files.
There are three noteworthy drawbacks to this new method:
1) Since we no longer write object files, we can't use them to store
data to be cached. As such, the -debugdir flag goes back to using the
"-a" build flag to always rebuild all packages. On the plus side,
that caching didn't work very well; see #176.
2) The package name "main" remains in all declarations under it, not
just "func main", since we can only rename entire packages. This
seems fine, as it gives little information to the end user.
3) The -tiny mode no longer sets all lines to 0, since it did that by
modifying object files. As a temporary measure, we instead set all
top-level declarations to be on line 1. A TODO is added to hopefully
improve this again in the near future.
The upside is that we get rid of all the issues mentioned before. Plus,
garble now nearly works with Go 1.16, with the exception of two very
minor bugs that look fixable. A follow-up PR will take care of that and
start testing on 1.16.
Fixes #176.
Fixes #190.
4 years ago
|
|
|
}
|
refactor "current package" with TOOLEXEC_IMPORTPATH (#266)
Now that we've dropped support for Go 1.15.x, we can finally rely on
this environment variable for toolexec calls, present in Go 1.16.
Before, we had hacky ways of trying to figure out the current package's
import path, mostly from the -p flag. The biggest rough edge there was
that, for main packages, that was simply the package name, and not its
full import path.
To work around that, we had a restriction on a single main package, so
we could work around that issue. That restriction is now gone.
The new code is simpler, especially because we can set curPkg in a
single place for all toolexec transform funcs.
Since we can always rely on curPkg not being nil now, we can also start
reusing listedPackage.Private and avoid the majority of repeated calls
to isPrivate. The function is cheap, but still not free.
isPrivate itself can also get simpler. We no longer have to worry about
the "main" edge case. Plus, the sanity check for invalid package paths
is now unnecessary; we only got malformed paths from goobj2, and we now
require exact matches with the ImportPath field from "go list -json".
Another effect of clearing up the "main" edge case is that -debugdir now
uses the right directory for main packages. We also start using
consistent debugdir paths in the tests, for the sake of being easier to
read and maintain.
Finally, note that commandReverse did not need the extra call to "go
list -toolexec", as the "shared" call stored in the cache is enough. We
still call toolexecCmd to get said cache, which should probably be
simplified in a future PR.
While at it, replace the use of the "-std" compiler flag with the
Standard field from "go list -json".
4 years ago
|
|
|
newName := hashWith(lpkg.GarbleActionID, name)
|
reimplement import path obfuscation without goobj2 (#242)
We used to rely on a parallel implementation of an object file parser
and writer to be able to obfuscate import paths. After compiling each
package, we would parse the object file, replace the import paths, and
write the updated object file in-place.
That worked well, in most cases. Unfortunately, it had some flaws:
* Complexity. Even when most of the code is maintained in a separate
module, the import_obfuscation.go file was still close to a thousand
lines of code.
* Go compatibility. The object file format changes between Go releases,
so we were supporting Go 1.15, but not 1.16. Fixing the object file
package to work with 1.16 would probably break 1.15 support.
* Bugs. For example, we recently had to add a workaround for #224, since
import paths containing dots after the domain would end up escaped.
Another example is #190, which seems to be caused by the object file
parser or writer corrupting the compiled code and causing segfaults in
some rare edge cases.
Instead, let's drop that method entirely, and force the compiler and
linker to do the work for us. The steps necessary when compiling a
package to obfuscate are:
1) Replace its "package foo" lines with the obfuscated package path. No
need to separate the package path and name, since the obfuscated path
does not contain slashes.
2) Replace the "-p pkg/foo" flag with the obfuscated path.
3) Replace the "import" spec lines with the obfuscated package paths,
for those dependencies which were obfuscated.
4) Replace the "-importcfg [...]" file with a version that uses the
obfuscated paths instead.
The linker also needs that last step, since it also uses an importcfg
file to find object files.
There are three noteworthy drawbacks to this new method:
1) Since we no longer write object files, we can't use them to store
data to be cached. As such, the -debugdir flag goes back to using the
"-a" build flag to always rebuild all packages. On the plus side,
that caching didn't work very well; see #176.
2) The package name "main" remains in all declarations under it, not
just "func main", since we can only rename entire packages. This
seems fine, as it gives little information to the end user.
3) The -tiny mode no longer sets all lines to 0, since it did that by
modifying object files. As a temporary measure, we instead set all
top-level declarations to be on line 1. A TODO is added to hopefully
improve this again in the near future.
The upside is that we get rid of all the issues mentioned before. Plus,
garble now nearly works with Go 1.16, with the exception of two very
minor bugs that look fixable. A follow-up PR will take care of that and
start testing on 1.16.
Fixes #176.
Fixes #190.
4 years ago
|
|
|
flags = append(flags, fmt.Sprintf("-X=%s.%s=%s", newPkg, newName, str))
|
|
|
|
})
|
|
|
|
|
update support for Go 1.17 in time for beta1
Back in early April we added initial support for Go 1.17,
working on a commit from master at that time. For that to work, we just
needed to add a couple of packages to runtimeRelated and tweak printFile
a bit to not break the new "//go:build" directives.
A significant amount of changes have landed since, though, and the tests
broke in multiple ways.
Most notably, the new register ABI is enabled by default for GOOS=amd64.
That affected garble indirectly in two ways: there's a new internal
package to add to runtimeRelated, and we must make reverse.txt more
clever in making its output constant across ABIs.
Another noticeable change is that Go 1.17 changes how its own version is
injected into the runtime package. It used to be via a constant in
runtime/internal/sys, such as:
const TheVersion = `devel ...`
Since we couldn't override such constants via the linker's -X flag,
we had to directly alter the declaration while compiling.
Thankfully, Go 1.17 simply uses a "var buildVersion string" in the
runtime package, and its value is injected by the linker.
This means we can now override it with the linker's -X flag.
We make the code to alter TheVersion for Go 1.16 a bit more clever,
to not break the package when building with Go 1.17.
Finally, our hack to work around ambiguous TOOLEXEC_IMPORTPATH values
now only kicks in for non-test packages, since Go 1.17 includes our
upstream fix. Otherwise, some tests would end up with the ".test"
variant suffix added a second time:
test/bar [test/bar.test] [test/bar [test/bar.test].test]
All the code to keep compatibility with Go 1.16.x remains in place.
We're still leaving TODOs to remind ourselves to remove it or simplify
it once we remove support for 1.16.x.
The 1.17 development freeze has already been in place for a month,
and beta1 is due to come this week, so it's unlikely that Go will change
in any considerable way at this point. Hence, we can say that support
for 1.17 is done.
Fixes #347.
4 years ago
|
|
|
// Starting in Go 1.17, Go's version is implicitly injected by the linker.
|
|
|
|
// It's the same method as -X, so we can override it with an extra flag.
|
|
|
|
flags = append(flags, "-X=runtime.buildVersion=unknown")
|
|
|
|
|
|
|
|
// Ensure we strip the -buildid flag, to not leak any build IDs for the
|
|
|
|
// link operation or the main package's compilation.
|
|
|
|
flags = flagSetValue(flags, "-buildid", "")
|
|
|
|
|
|
|
|
// Strip debug information and symbol tables.
|
|
|
|
flags = append(flags, "-w", "-s")
|
reimplement import path obfuscation without goobj2 (#242)
We used to rely on a parallel implementation of an object file parser
and writer to be able to obfuscate import paths. After compiling each
package, we would parse the object file, replace the import paths, and
write the updated object file in-place.
That worked well, in most cases. Unfortunately, it had some flaws:
* Complexity. Even when most of the code is maintained in a separate
module, the import_obfuscation.go file was still close to a thousand
lines of code.
* Go compatibility. The object file format changes between Go releases,
so we were supporting Go 1.15, but not 1.16. Fixing the object file
package to work with 1.16 would probably break 1.15 support.
* Bugs. For example, we recently had to add a workaround for #224, since
import paths containing dots after the domain would end up escaped.
Another example is #190, which seems to be caused by the object file
parser or writer corrupting the compiled code and causing segfaults in
some rare edge cases.
Instead, let's drop that method entirely, and force the compiler and
linker to do the work for us. The steps necessary when compiling a
package to obfuscate are:
1) Replace its "package foo" lines with the obfuscated package path. No
need to separate the package path and name, since the obfuscated path
does not contain slashes.
2) Replace the "-p pkg/foo" flag with the obfuscated path.
3) Replace the "import" spec lines with the obfuscated package paths,
for those dependencies which were obfuscated.
4) Replace the "-importcfg [...]" file with a version that uses the
obfuscated paths instead.
The linker also needs that last step, since it also uses an importcfg
file to find object files.
There are three noteworthy drawbacks to this new method:
1) Since we no longer write object files, we can't use them to store
data to be cached. As such, the -debugdir flag goes back to using the
"-a" build flag to always rebuild all packages. On the plus side,
that caching didn't work very well; see #176.
2) The package name "main" remains in all declarations under it, not
just "func main", since we can only rename entire packages. This
seems fine, as it gives little information to the end user.
3) The -tiny mode no longer sets all lines to 0, since it did that by
modifying object files. As a temporary measure, we instead set all
top-level declarations to be on line 1. A TODO is added to hopefully
improve this again in the near future.
The upside is that we get rid of all the issues mentioned before. Plus,
garble now nearly works with Go 1.16, with the exception of two very
minor bugs that look fixable. A follow-up PR will take care of that and
start testing on 1.16.
Fixes #176.
Fixes #190.
4 years ago
|
|
|
|
|
|
|
flags = flagSetValue(flags, "-importcfg", newImportCfg)
|
|
|
|
return append(flags, args...), nil
|
|
|
|
}
|
|
|
|
|
|
|
|
func splitFlagsFromArgs(all []string) (flags, args []string) {
|
|
|
|
for i := 0; i < len(all); i++ {
|
|
|
|
arg := all[i]
|
|
|
|
if !strings.HasPrefix(arg, "-") {
|
always use the compiler's -dwarf=false flag (#96)
First, our original append line was completely ineffective; we never
used that "flags" slice again. Second, we only attempted to use the flag
when we obfuscated a package.
In fact, we never care about debugging information here, so for any
package we compile, we can add "-dwarf=false". At the moment, we compile
all packages, even if they aren't to be obfuscated, due to the lack of
access to the build cache.
As such, we save a significant amount of work. The numbers below were
obtained on a quiet machine with "go test -bench=. -benchtime=10x", six
times before and after the change.
name old time/op new time/op delta
Build-8 2.06s ± 4% 1.87s ± 2% -9.21% (p=0.002 n=6+6)
name old sys-time/op new sys-time/op delta
Build-8 1.51s ± 2% 1.46s ± 1% -3.12% (p=0.004 n=6+5)
name old user-time/op new user-time/op delta
Build-8 11.9s ± 2% 10.8s ± 1% -8.71% (p=0.002 n=6+6)
While at it, only do CI builds on pushes and PRs to the master branch,
so that my PRs created from the same repo don't trigger duplicate
builds.
5 years ago
|
|
|
return all[:i:i], all[i:]
|
|
|
|
}
|
|
|
|
if booleanFlags[arg] || strings.Contains(arg, "=") {
|
|
|
|
// Either "-bool" or "-name=value".
|
|
|
|
continue
|
|
|
|
}
|
|
|
|
// "-name value", so the next arg is part of this flag.
|
|
|
|
i++
|
|
|
|
}
|
|
|
|
return all, nil
|
|
|
|
}
|
|
|
|
|
|
|
|
func alterTrimpath(flags []string) []string {
|
avoid reproducibility issues with full rebuilds
We were using temporary filenames for modified Go and assembly files.
For example, an obfuscated "encoding/json/encode.go" would end up as:
/tmp/garble-shared123/encode.go.456.go
where "123" and "456" are random numbers, usually longer.
This was usually fine for two reasons:
1) We would add "/tmp/garble-shared123/" to -trimpath, so the temporary
directory and its random number would be invisible.
2) We would add "//line" directives to the source files, replacing
the filename with obfuscated versions excluding any random number.
Unfortunately, this broke in multiple ways. Most notably, assembly files
do not have any line directives, and it's not clear that there's any
support for them. So the random number in their basename could end up in
the binary, breaking reproducibility.
Another issue is that the -trimpath addition described above was only
done for cmd/compile, not cmd/asm, so assembly filenames included the
randomized temporary directory.
To fix the issues above, the same "encoding/json/encode.go" would now
end up as:
/tmp/garble-shared123/encoding/json/encode.go
Such a path is still unique even though the "456" random number is gone,
as import paths are unique within a single build.
This fixes issues with the base name of each file, so we no longer rely
on line directives as the only way to remove the second original random
number.
We still rely on -trimpath to get rid of the temporary directory in
filenames. To fix its problem with assembly files, also amend the
-trimpath flag when running the assembler tool.
Finally, add a test that reproducible builds still work when a full
rebuild is done. We choose goprivate.txt for such a test as its
stdimporter package imports a number of std packages, including uses of
assembly and cgo.
For the time being, we don't use such a "full rebuild" reproducibility
test in other test scripts, as this step is expensive, rebuilding many
packages from scratch.
This issue went unnoticed for over a year because such random numbers
"123" and "456" were created when a package was obfuscated, and that
only happened once per package version as long as the build cache was
kept intact.
When clearing the build cache, or forcing a rebuild with -a, one gets
new random numbers, and thus a different binary resulting from the same
build input. That's not something that most users would do regularly,
and our tests did not cover that edge case either, until now.
Fixes #328.
4 years ago
|
|
|
// If the value of -trimpath doesn't contain the separator ';', the 'go
|
|
|
|
// build' command is most likely not using '-trimpath'.
|
|
|
|
trimpath := flagValue(flags, "-trimpath")
|
|
|
|
|
|
|
|
// Add our temporary dir to the beginning of -trimpath, so that we don't
|
|
|
|
// leak temporary dirs. Needs to be at the beginning, since there may be
|
|
|
|
// shorter prefixes later in the list, such as $PWD if TMPDIR=$PWD/tmp.
|
|
|
|
return flagSetValue(flags, "-trimpath", sharedTempDir+"=>;"+trimpath)
|
avoid reproducibility issues with full rebuilds
We were using temporary filenames for modified Go and assembly files.
For example, an obfuscated "encoding/json/encode.go" would end up as:
/tmp/garble-shared123/encode.go.456.go
where "123" and "456" are random numbers, usually longer.
This was usually fine for two reasons:
1) We would add "/tmp/garble-shared123/" to -trimpath, so the temporary
directory and its random number would be invisible.
2) We would add "//line" directives to the source files, replacing
the filename with obfuscated versions excluding any random number.
Unfortunately, this broke in multiple ways. Most notably, assembly files
do not have any line directives, and it's not clear that there's any
support for them. So the random number in their basename could end up in
the binary, breaking reproducibility.
Another issue is that the -trimpath addition described above was only
done for cmd/compile, not cmd/asm, so assembly filenames included the
randomized temporary directory.
To fix the issues above, the same "encoding/json/encode.go" would now
end up as:
/tmp/garble-shared123/encoding/json/encode.go
Such a path is still unique even though the "456" random number is gone,
as import paths are unique within a single build.
This fixes issues with the base name of each file, so we no longer rely
on line directives as the only way to remove the second original random
number.
We still rely on -trimpath to get rid of the temporary directory in
filenames. To fix its problem with assembly files, also amend the
-trimpath flag when running the assembler tool.
Finally, add a test that reproducible builds still work when a full
rebuild is done. We choose goprivate.txt for such a test as its
stdimporter package imports a number of std packages, including uses of
assembly and cgo.
For the time being, we don't use such a "full rebuild" reproducibility
test in other test scripts, as this step is expensive, rebuilding many
packages from scratch.
This issue went unnoticed for over a year because such random numbers
"123" and "456" were created when a package was obfuscated, and that
only happened once per package version as long as the build cache was
kept intact.
When clearing the build cache, or forcing a rebuild with -a, one gets
new random numbers, and thus a different binary resulting from the same
build input. That's not something that most users would do regularly,
and our tests did not cover that edge case either, until now.
Fixes #328.
4 years ago
|
|
|
}
|
|
|
|
|
fail if we are unexpectedly overwriting files (#418)
While investigating a bug report,
I noticed that garble was writing to the same temp file twice.
At best, writing to the same path on disk twice is wasteful,
as the design is careful to be deterministic and use unique paths.
At worst, the two writes could cause races at the filesystem level.
To prevent either of those situations,
we now create files with os.OpenFile and os.O_EXCL,
meaning that we will error if the file already exists.
That change uncovered a number of such unintended cases.
First, transformAsm would write obfuscated Go files twice.
This is because the Go toolchain actually runs:
[...]/asm -gensymabis [...] foo.s bar.s
[...]/asm [...] foo.s bar.s
That is, the first run is only meant to generate symbol ABIs,
which are then used by the compiler.
We need to obfuscate at that first stage,
because the symbol ABI descriptions need to use obfuscated names.
However, having already obfuscated the assembly on the first stage,
there is no need to do so again on the second stage.
If we detect gensymabis is missing, we simply reuse the previous files.
This first situation doesn't seem racy,
but obfuscating the Go assembly files twice is certainly unnecessary.
Second, saveKnownReflectAPIs wrote a gob file to the build cache.
Since the build cache can be kept between builds,
and since the build cache uses reproducible paths for each build,
running the same "garble build" twice could overwrite those files.
This could actually cause races at the filesystem level;
if two concurrent builds write to the same gob file on disk,
one of them could end up using a partially-written file.
Note that this is the only of the three cases not using temporary files.
As such, it is expected that the file may already exist.
In such a case, we simply avoid overwriting it rather than failing.
Third, when "garble build -a" was used,
and when we needed an export file not listed in importcfg,
we would end up calling roughly:
go list -export -toolexec=garble -a <dependency>
This meant we would re-build and re-obfuscate those packages.
Which is unfortunate, because the parent process already did via:
go build -toolexec=garble -a <main>
The repeated dependency builds tripped the new os.O_EXCL check,
as we would try to overwrite the same obfuscated Go files.
Beyond being wasteful, this could again cause subtle filesystem races.
To fix the problem, avoid passing flags like "-a" to nested go commands.
Overall, we should likely be using safer ways to write to disk,
be it via either atomic writes or locked files.
However, for now, catching duplicate writes is a big step.
I have left a self-assigned TODO for further improvements.
CI on the pull request found a failure on test-gotip.
The failure reproduces on master, so it seems to be related to gotip,
and not a regression introduced by this change.
For now, disable test-gotip until we can investigate.
4 years ago
|
|
|
// forwardBuildFlags is obtained from 'go help build' as of Go 1.17.
|
|
|
|
var forwardBuildFlags = map[string]bool{
|
|
|
|
// These shouldn't be used in nested cmd/go calls.
|
|
|
|
"-a": false,
|
|
|
|
"-n": false,
|
|
|
|
"-x": false,
|
|
|
|
"-v": false,
|
|
|
|
|
|
|
|
// These are always set by garble.
|
|
|
|
"-trimpath": false,
|
|
|
|
"-toolexec": false,
|
|
|
|
|
|
|
|
"-p": true,
|
|
|
|
"-race": true,
|
|
|
|
"-msan": true,
|
|
|
|
"-work": true,
|
|
|
|
"-asmflags": true,
|
|
|
|
"-buildmode": true,
|
|
|
|
"-compiler": true,
|
|
|
|
"-gccgoflags": true,
|
|
|
|
"-gcflags": true,
|
|
|
|
"-installsuffix": true,
|
|
|
|
"-ldflags": true,
|
|
|
|
"-linkshared": true,
|
|
|
|
"-mod": true,
|
|
|
|
"-modcacherw": true,
|
|
|
|
"-modfile": true,
|
|
|
|
"-pkgdir": true,
|
|
|
|
"-tags": true,
|
|
|
|
"-overlay": true,
|
|
|
|
}
|
|
|
|
|
|
|
|
// booleanFlags is obtained from 'go help build' and 'go help testflag' as of Go 1.17.
|
|
|
|
var booleanFlags = map[string]bool{
|
|
|
|
// Shared build flags.
|
|
|
|
"-a": true,
|
|
|
|
"-i": true,
|
|
|
|
"-n": true,
|
|
|
|
"-v": true,
|
|
|
|
"-x": true,
|
|
|
|
"-race": true,
|
|
|
|
"-msan": true,
|
|
|
|
"-linkshared": true,
|
|
|
|
"-modcacherw": true,
|
|
|
|
"-trimpath": true,
|
|
|
|
|
|
|
|
// Test flags (TODO: support its special -args flag)
|
|
|
|
"-c": true,
|
|
|
|
"-json": true,
|
|
|
|
"-cover": true,
|
|
|
|
"-failfast": true,
|
|
|
|
"-short": true,
|
|
|
|
"-benchmem": true,
|
|
|
|
}
|
|
|
|
|
fail if we are unexpectedly overwriting files (#418)
While investigating a bug report,
I noticed that garble was writing to the same temp file twice.
At best, writing to the same path on disk twice is wasteful,
as the design is careful to be deterministic and use unique paths.
At worst, the two writes could cause races at the filesystem level.
To prevent either of those situations,
we now create files with os.OpenFile and os.O_EXCL,
meaning that we will error if the file already exists.
That change uncovered a number of such unintended cases.
First, transformAsm would write obfuscated Go files twice.
This is because the Go toolchain actually runs:
[...]/asm -gensymabis [...] foo.s bar.s
[...]/asm [...] foo.s bar.s
That is, the first run is only meant to generate symbol ABIs,
which are then used by the compiler.
We need to obfuscate at that first stage,
because the symbol ABI descriptions need to use obfuscated names.
However, having already obfuscated the assembly on the first stage,
there is no need to do so again on the second stage.
If we detect gensymabis is missing, we simply reuse the previous files.
This first situation doesn't seem racy,
but obfuscating the Go assembly files twice is certainly unnecessary.
Second, saveKnownReflectAPIs wrote a gob file to the build cache.
Since the build cache can be kept between builds,
and since the build cache uses reproducible paths for each build,
running the same "garble build" twice could overwrite those files.
This could actually cause races at the filesystem level;
if two concurrent builds write to the same gob file on disk,
one of them could end up using a partially-written file.
Note that this is the only of the three cases not using temporary files.
As such, it is expected that the file may already exist.
In such a case, we simply avoid overwriting it rather than failing.
Third, when "garble build -a" was used,
and when we needed an export file not listed in importcfg,
we would end up calling roughly:
go list -export -toolexec=garble -a <dependency>
This meant we would re-build and re-obfuscate those packages.
Which is unfortunate, because the parent process already did via:
go build -toolexec=garble -a <main>
The repeated dependency builds tripped the new os.O_EXCL check,
as we would try to overwrite the same obfuscated Go files.
Beyond being wasteful, this could again cause subtle filesystem races.
To fix the problem, avoid passing flags like "-a" to nested go commands.
Overall, we should likely be using safer ways to write to disk,
be it via either atomic writes or locked files.
However, for now, catching duplicate writes is a big step.
I have left a self-assigned TODO for further improvements.
CI on the pull request found a failure on test-gotip.
The failure reproduces on master, so it seems to be related to gotip,
and not a regression introduced by this change.
For now, disable test-gotip until we can investigate.
4 years ago
|
|
|
func filterForwardBuildFlags(flags []string) (filtered []string, firstUnknown string) {
|
|
|
|
for i := 0; i < len(flags); i++ {
|
|
|
|
arg := flags[i]
|
|
|
|
name := arg
|
|
|
|
if i := strings.IndexByte(arg, '='); i > 0 {
|
|
|
|
name = arg[:i]
|
|
|
|
}
|
|
|
|
|
fail if we are unexpectedly overwriting files (#418)
While investigating a bug report,
I noticed that garble was writing to the same temp file twice.
At best, writing to the same path on disk twice is wasteful,
as the design is careful to be deterministic and use unique paths.
At worst, the two writes could cause races at the filesystem level.
To prevent either of those situations,
we now create files with os.OpenFile and os.O_EXCL,
meaning that we will error if the file already exists.
That change uncovered a number of such unintended cases.
First, transformAsm would write obfuscated Go files twice.
This is because the Go toolchain actually runs:
[...]/asm -gensymabis [...] foo.s bar.s
[...]/asm [...] foo.s bar.s
That is, the first run is only meant to generate symbol ABIs,
which are then used by the compiler.
We need to obfuscate at that first stage,
because the symbol ABI descriptions need to use obfuscated names.
However, having already obfuscated the assembly on the first stage,
there is no need to do so again on the second stage.
If we detect gensymabis is missing, we simply reuse the previous files.
This first situation doesn't seem racy,
but obfuscating the Go assembly files twice is certainly unnecessary.
Second, saveKnownReflectAPIs wrote a gob file to the build cache.
Since the build cache can be kept between builds,
and since the build cache uses reproducible paths for each build,
running the same "garble build" twice could overwrite those files.
This could actually cause races at the filesystem level;
if two concurrent builds write to the same gob file on disk,
one of them could end up using a partially-written file.
Note that this is the only of the three cases not using temporary files.
As such, it is expected that the file may already exist.
In such a case, we simply avoid overwriting it rather than failing.
Third, when "garble build -a" was used,
and when we needed an export file not listed in importcfg,
we would end up calling roughly:
go list -export -toolexec=garble -a <dependency>
This meant we would re-build and re-obfuscate those packages.
Which is unfortunate, because the parent process already did via:
go build -toolexec=garble -a <main>
The repeated dependency builds tripped the new os.O_EXCL check,
as we would try to overwrite the same obfuscated Go files.
Beyond being wasteful, this could again cause subtle filesystem races.
To fix the problem, avoid passing flags like "-a" to nested go commands.
Overall, we should likely be using safer ways to write to disk,
be it via either atomic writes or locked files.
However, for now, catching duplicate writes is a big step.
I have left a self-assigned TODO for further improvements.
CI on the pull request found a failure on test-gotip.
The failure reproduces on master, so it seems to be related to gotip,
and not a regression introduced by this change.
For now, disable test-gotip until we can investigate.
4 years ago
|
|
|
buildFlag := forwardBuildFlags[name]
|
|
|
|
if buildFlag {
|
|
|
|
filtered = append(filtered, arg)
|
|
|
|
} else {
|
|
|
|
firstUnknown = name
|
|
|
|
}
|
|
|
|
if booleanFlags[arg] || strings.Contains(arg, "=") {
|
|
|
|
// Either "-bool" or "-name=value".
|
|
|
|
continue
|
|
|
|
}
|
|
|
|
// "-name value", so the next arg is part of this flag.
|
|
|
|
if i++; buildFlag && i < len(flags) {
|
|
|
|
filtered = append(filtered, flags[i])
|
|
|
|
}
|
|
|
|
}
|
|
|
|
return filtered, firstUnknown
|
|
|
|
}
|
|
|
|
|
|
|
|
// splitFlagsFromFiles splits args into a list of flag and file arguments. Since
|
|
|
|
// we can't rely on "--" being present, and we don't parse all flags upfront, we
|
|
|
|
// rely on finding the first argument that doesn't begin with "-" and that has
|
|
|
|
// the extension we expect for the list of paths.
|
|
|
|
//
|
|
|
|
// This function only makes sense for lower-level tool commands, such as
|
|
|
|
// "compile" or "link", since their arguments are predictable.
|
|
|
|
func splitFlagsFromFiles(all []string, ext string) (flags, paths []string) {
|
|
|
|
for i, arg := range all {
|
|
|
|
if !strings.HasPrefix(arg, "-") && strings.HasSuffix(arg, ext) {
|
|
|
|
return all[:i:i], all[i:]
|
|
|
|
}
|
|
|
|
}
|
|
|
|
return all, nil
|
|
|
|
}
|
|
|
|
|
|
|
|
// flagValue retrieves the value of a flag such as "-foo", from strings in the
|
|
|
|
// list of arguments like "-foo=bar" or "-foo" "bar". If the flag is repeated,
|
|
|
|
// the last value is returned.
|
|
|
|
func flagValue(flags []string, name string) string {
|
|
|
|
lastVal := ""
|
|
|
|
flagValueIter(flags, name, func(val string) {
|
|
|
|
lastVal = val
|
|
|
|
})
|
|
|
|
return lastVal
|
|
|
|
}
|
|
|
|
|
|
|
|
// flagValueIter retrieves all the values for a flag such as "-foo", like
|
|
|
|
// flagValue. The difference is that it allows handling complex flags, such as
|
|
|
|
// those whose values compose a list.
|
|
|
|
func flagValueIter(flags []string, name string, fn func(string)) {
|
|
|
|
for i, arg := range flags {
|
|
|
|
if val := strings.TrimPrefix(arg, name+"="); val != arg {
|
|
|
|
// -name=value
|
|
|
|
fn(val)
|
|
|
|
}
|
|
|
|
if arg == name { // -name ...
|
|
|
|
if i+1 < len(flags) {
|
|
|
|
// -name value
|
|
|
|
fn(flags[i+1])
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
func flagSetValue(flags []string, name, value string) []string {
|
|
|
|
for i, arg := range flags {
|
|
|
|
if strings.HasPrefix(arg, name+"=") {
|
|
|
|
// -name=value
|
|
|
|
flags[i] = name + "=" + value
|
|
|
|
return flags
|
|
|
|
}
|
|
|
|
if arg == name { // -name ...
|
|
|
|
if i+1 < len(flags) {
|
|
|
|
// -name value
|
|
|
|
flags[i+1] = value
|
|
|
|
return flags
|
|
|
|
}
|
|
|
|
return flags
|
|
|
|
}
|
|
|
|
}
|
|
|
|
return append(flags, name+"="+value)
|
|
|
|
}
|
|
|
|
|
use "go env -json" to collect env info all at once
In the worst case scenario, when GOPRIVATE isn't set at all, we would
run these three commands:
* "go env GOPRIVATE", to fetch GOPRIVATE itself
* "go list -m", for GOPRIVATE's fallback
* "go version", to check the version of Go being used
Now that we support Go 1.16 and later, all these three can be obtained
via "go env -json":
$ go env -json GOPRIVATE GOMOD GOVERSION
{
"GOMOD": "/home/mvdan/src/garble/go.mod",
"GOPRIVATE": "",
"GOVERSION": "go1.16.3"
}
Note that we don't get the module path directly, but we can use the
x/mod/modfile Go API to parse it from the GOMOD file cheaply.
Notably, this also simplifies our Go version checking logic, as now we
get just the version string without the "go version" prefix and
"GOOS/GOARCH" suffix we don't care about.
This makes our code a bit more maintainable and robust. When running a
short incremental build, we can also see a small speed-up, as saving two
"go" invocations can save a few milliseconds:
name old time/op new time/op delta
Build/Cache-8 168ms ± 0% 166ms ± 1% -1.26% (p=0.009 n=6+6)
name old bin-B new bin-B delta
Build/Cache-8 6.36M ± 0% 6.36M ± 0% +0.12% (p=0.002 n=6+6)
name old sys-time/op new sys-time/op delta
Build/Cache-8 222ms ± 2% 219ms ± 3% ~ (p=0.589 n=6+6)
name old user-time/op new user-time/op delta
Build/Cache-8 857ms ± 1% 846ms ± 1% -1.31% (p=0.041 n=6+6)
4 years ago
|
|
|
func fetchGoEnv() error {
|
|
|
|
out, err := exec.Command("go", "env", "-json",
|
stop relying on nested "go list -toolexec" calls (#422)
We rely on importcfg files to load type info for obfuscated packages.
We use this type information to remember what names we didn't obfuscate.
Unfortunately, indirect dependencies aren't listed in importcfg files,
so we relied on extra "go list -toolexec" calls to locate object files.
This worked fine, but added a significant amount of complexity.
The extra "go list -export -toolexec=garble" invocations weren't slow,
as they avoided rebuilding or re-obfuscating thanks to the build cache.
Still, it was hard to reason about how garble runs during a build
if we might have multiple layers of -toolexec invocations.
Instead, record the export files we encounter in an incremental map,
and persist it in the build cache via the gob file we're already using.
This way, each garble invocation knows where all object files are,
even those for indirect imports.
One wrinkle is that importcfg files can point to temporary object files.
In that case, figure out its final location in the build cache.
This requires hard-coding a bit of knowledge about how GOCACHE works,
but it seems relatively harmless given how it's very little code.
Plus, if GOCACHE ever changes, it will be obvious when our code breaks.
Finally, add a TODO about potentially saving even more work.
4 years ago
|
|
|
"GOPRIVATE", "GOMOD", "GOVERSION", "GOCACHE",
|
use "go env -json" to collect env info all at once
In the worst case scenario, when GOPRIVATE isn't set at all, we would
run these three commands:
* "go env GOPRIVATE", to fetch GOPRIVATE itself
* "go list -m", for GOPRIVATE's fallback
* "go version", to check the version of Go being used
Now that we support Go 1.16 and later, all these three can be obtained
via "go env -json":
$ go env -json GOPRIVATE GOMOD GOVERSION
{
"GOMOD": "/home/mvdan/src/garble/go.mod",
"GOPRIVATE": "",
"GOVERSION": "go1.16.3"
}
Note that we don't get the module path directly, but we can use the
x/mod/modfile Go API to parse it from the GOMOD file cheaply.
Notably, this also simplifies our Go version checking logic, as now we
get just the version string without the "go version" prefix and
"GOOS/GOARCH" suffix we don't care about.
This makes our code a bit more maintainable and robust. When running a
short incremental build, we can also see a small speed-up, as saving two
"go" invocations can save a few milliseconds:
name old time/op new time/op delta
Build/Cache-8 168ms ± 0% 166ms ± 1% -1.26% (p=0.009 n=6+6)
name old bin-B new bin-B delta
Build/Cache-8 6.36M ± 0% 6.36M ± 0% +0.12% (p=0.002 n=6+6)
name old sys-time/op new sys-time/op delta
Build/Cache-8 222ms ± 2% 219ms ± 3% ~ (p=0.589 n=6+6)
name old user-time/op new user-time/op delta
Build/Cache-8 857ms ± 1% 846ms ± 1% -1.31% (p=0.041 n=6+6)
4 years ago
|
|
|
).CombinedOutput()
|
|
|
|
if err != nil {
|
|
|
|
fmt.Fprintf(os.Stderr, `Can't find Go toolchain: %v
|
|
|
|
|
|
|
|
This is likely due to go not being installed/setup correctly.
|
|
|
|
|
|
|
|
How to install Go: https://golang.org/doc/install
|
|
|
|
`, err)
|
|
|
|
return errJustExit(1)
|
use "go env -json" to collect env info all at once
In the worst case scenario, when GOPRIVATE isn't set at all, we would
run these three commands:
* "go env GOPRIVATE", to fetch GOPRIVATE itself
* "go list -m", for GOPRIVATE's fallback
* "go version", to check the version of Go being used
Now that we support Go 1.16 and later, all these three can be obtained
via "go env -json":
$ go env -json GOPRIVATE GOMOD GOVERSION
{
"GOMOD": "/home/mvdan/src/garble/go.mod",
"GOPRIVATE": "",
"GOVERSION": "go1.16.3"
}
Note that we don't get the module path directly, but we can use the
x/mod/modfile Go API to parse it from the GOMOD file cheaply.
Notably, this also simplifies our Go version checking logic, as now we
get just the version string without the "go version" prefix and
"GOOS/GOARCH" suffix we don't care about.
This makes our code a bit more maintainable and robust. When running a
short incremental build, we can also see a small speed-up, as saving two
"go" invocations can save a few milliseconds:
name old time/op new time/op delta
Build/Cache-8 168ms ± 0% 166ms ± 1% -1.26% (p=0.009 n=6+6)
name old bin-B new bin-B delta
Build/Cache-8 6.36M ± 0% 6.36M ± 0% +0.12% (p=0.002 n=6+6)
name old sys-time/op new sys-time/op delta
Build/Cache-8 222ms ± 2% 219ms ± 3% ~ (p=0.589 n=6+6)
name old user-time/op new user-time/op delta
Build/Cache-8 857ms ± 1% 846ms ± 1% -1.31% (p=0.041 n=6+6)
4 years ago
|
|
|
}
|
|
|
|
if err := json.Unmarshal(out, &cache.GoEnv); err != nil {
|
|
|
|
return err
|
|
|
|
}
|
deprecate using GOPRIVATE in favor of GOGARBLE (#427)
Piggybacking off of GOPRIVATE is great for a number of reasons:
* People tend to obfuscate private code, whose package paths will
generally be in GOPRIVATE already
* Its meaning and syntax are well understood
* It allows all the flexibility we need without adding our own env var
or config option
However, using GOPRIVATE directly has one main drawback.
It's fairly common to also want to obfuscate public dependencies,
to make the code in private packages even harder to follow.
However, using "GOPRIVATE=*" will result in two main downsides:
* GONOPROXY defaults to GOPRIVATE, so the proxy would be entirely disabled.
Downloading modules, such as when adding or updating dependencies,
or when the local cache is cold, can be less reliable.
* GONOSUMDB defaults to GOPRIVATE, so the sumdb would be entirely disabled.
Adding entries to go.sum, such as when adding or updating dependencies,
can be less secure.
We will continue to consume GOPRIVATE as a fallback,
but we now expect users to set GOGARBLE instead.
The new logic is documented in the README.
While here, rewrite some uses of "private" with "to obfuscate",
to make the code easier to follow and harder to misunderstand.
Fixes #276.
3 years ago
|
|
|
cache.GOGARBLE = os.Getenv("GOGARBLE")
|
|
|
|
if cache.GOGARBLE != "" {
|
|
|
|
// GOGARBLE is non-empty; nothing to do.
|
|
|
|
} else if cache.GoEnv.GOPRIVATE != "" {
|
|
|
|
// GOGARBLE is empty and GOPRIVATE is non-empty.
|
|
|
|
// Set GOGARBLE to GOPRIVATE's value.
|
|
|
|
cache.GOGARBLE = cache.GoEnv.GOPRIVATE
|
|
|
|
} else {
|
|
|
|
// If GOPRIVATE isn't set and we're in a module, use its module
|
|
|
|
// path as a GOPRIVATE default. Include a _test variant too.
|
|
|
|
// TODO(mvdan): we shouldn't need the _test variant here,
|
|
|
|
// as the import path should not include it; only the package name.
|
use "go env -json" to collect env info all at once
In the worst case scenario, when GOPRIVATE isn't set at all, we would
run these three commands:
* "go env GOPRIVATE", to fetch GOPRIVATE itself
* "go list -m", for GOPRIVATE's fallback
* "go version", to check the version of Go being used
Now that we support Go 1.16 and later, all these three can be obtained
via "go env -json":
$ go env -json GOPRIVATE GOMOD GOVERSION
{
"GOMOD": "/home/mvdan/src/garble/go.mod",
"GOPRIVATE": "",
"GOVERSION": "go1.16.3"
}
Note that we don't get the module path directly, but we can use the
x/mod/modfile Go API to parse it from the GOMOD file cheaply.
Notably, this also simplifies our Go version checking logic, as now we
get just the version string without the "go version" prefix and
"GOOS/GOARCH" suffix we don't care about.
This makes our code a bit more maintainable and robust. When running a
short incremental build, we can also see a small speed-up, as saving two
"go" invocations can save a few milliseconds:
name old time/op new time/op delta
Build/Cache-8 168ms ± 0% 166ms ± 1% -1.26% (p=0.009 n=6+6)
name old bin-B new bin-B delta
Build/Cache-8 6.36M ± 0% 6.36M ± 0% +0.12% (p=0.002 n=6+6)
name old sys-time/op new sys-time/op delta
Build/Cache-8 222ms ± 2% 219ms ± 3% ~ (p=0.589 n=6+6)
name old user-time/op new user-time/op delta
Build/Cache-8 857ms ± 1% 846ms ± 1% -1.31% (p=0.041 n=6+6)
4 years ago
|
|
|
if mod, err := ioutil.ReadFile(cache.GoEnv.GOMOD); err == nil {
|
|
|
|
modpath := modfile.ModulePath(mod)
|
|
|
|
if modpath != "" {
|
deprecate using GOPRIVATE in favor of GOGARBLE (#427)
Piggybacking off of GOPRIVATE is great for a number of reasons:
* People tend to obfuscate private code, whose package paths will
generally be in GOPRIVATE already
* Its meaning and syntax are well understood
* It allows all the flexibility we need without adding our own env var
or config option
However, using GOPRIVATE directly has one main drawback.
It's fairly common to also want to obfuscate public dependencies,
to make the code in private packages even harder to follow.
However, using "GOPRIVATE=*" will result in two main downsides:
* GONOPROXY defaults to GOPRIVATE, so the proxy would be entirely disabled.
Downloading modules, such as when adding or updating dependencies,
or when the local cache is cold, can be less reliable.
* GONOSUMDB defaults to GOPRIVATE, so the sumdb would be entirely disabled.
Adding entries to go.sum, such as when adding or updating dependencies,
can be less secure.
We will continue to consume GOPRIVATE as a fallback,
but we now expect users to set GOGARBLE instead.
The new logic is documented in the README.
While here, rewrite some uses of "private" with "to obfuscate",
to make the code easier to follow and harder to misunderstand.
Fixes #276.
3 years ago
|
|
|
cache.GOGARBLE = modpath + "," + modpath + "_test"
|
use "go env -json" to collect env info all at once
In the worst case scenario, when GOPRIVATE isn't set at all, we would
run these three commands:
* "go env GOPRIVATE", to fetch GOPRIVATE itself
* "go list -m", for GOPRIVATE's fallback
* "go version", to check the version of Go being used
Now that we support Go 1.16 and later, all these three can be obtained
via "go env -json":
$ go env -json GOPRIVATE GOMOD GOVERSION
{
"GOMOD": "/home/mvdan/src/garble/go.mod",
"GOPRIVATE": "",
"GOVERSION": "go1.16.3"
}
Note that we don't get the module path directly, but we can use the
x/mod/modfile Go API to parse it from the GOMOD file cheaply.
Notably, this also simplifies our Go version checking logic, as now we
get just the version string without the "go version" prefix and
"GOOS/GOARCH" suffix we don't care about.
This makes our code a bit more maintainable and robust. When running a
short incremental build, we can also see a small speed-up, as saving two
"go" invocations can save a few milliseconds:
name old time/op new time/op delta
Build/Cache-8 168ms ± 0% 166ms ± 1% -1.26% (p=0.009 n=6+6)
name old bin-B new bin-B delta
Build/Cache-8 6.36M ± 0% 6.36M ± 0% +0.12% (p=0.002 n=6+6)
name old sys-time/op new sys-time/op delta
Build/Cache-8 222ms ± 2% 219ms ± 3% ~ (p=0.589 n=6+6)
name old user-time/op new user-time/op delta
Build/Cache-8 857ms ± 1% 846ms ± 1% -1.31% (p=0.041 n=6+6)
4 years ago
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
return nil
|
|
|
|
}
|