Commit Graph

79 Commits (master)

Author SHA1 Message Date
Daniel Martí 20a92460d5 all: use cmd.Environ rather than os.Environ
Added in Go 1.19, this keeps os/exec's default environment logic,
such as ensuring that $PWD is always set.
2 months ago
Daniel Martí d138afaf32 don't panic when we can error as easily
Panicking in small helpers or in funcs that don't return error
has proved useful to keep code easier to maintain,
particularly for cases that should typically never happen.

However, in these cases we can error just as easily.
In particular, I was getting a panic whenever I forgot
that I was running garble with Go master (1.23), which is over the top.
3 months ago
Daniel Martí ad2ecc7f2f drop Go 1.21 and start using go/version
Needing to awkwardly treat Go versions as if they were semver
is no longer necessary thanks to go/version being in Go 1.22.0 now.
3 months ago
pagran e8fe80d627
add trash block generator (#825)
add trash block generator

For making static code analysis even more difficult, added feature for
generating trash blocks that will never be executed. In combination
with control flow flattening makes it hard to separate trash code from
the real one, plus it causes a large number of trash references to
different methods.

Trash blocks contain 2 types of statements:
1. Function/method call with writing the results into local variables
and passing them to other calls
2. Shuffling or assigning random values to local variables
4 months ago
pagran 3a9c9aa3d4
fix shuffle obfuscation compiler optimization
In some cases, compiler could optimize the shuffle obfuscator,
causing exposing the obfuscated literal.
As a fix, added xor encryption of array indexes.
5 months ago
Daniel Martí d283d8479c add and test initial support for Go 1.22
The Go 1.21 linker patches luckily rebased on master as of
de5b418bea70aaf27de1f47e9b5813940d1e15a4 just fine.
The addition of the strings import in the second patch was removed,
since the file in Go 1.22 now has this package import.

We can remove the Go 1.20 linker patches too, since we no longer support
that Go version in the upcoming release.

Start treating runtime/internal/startlinetest as part of the runtime,
since otherwise its test-only trickery breaks "garble build std":

    # runtime/internal/startlinetest
    [...]/XS7r7lPHkTG.s:23: ABI selector only permitted when compiling runtime, reference was to "HGoWHDsKwh.AlfA2or7Nnb"
    asm: assembly of $WORK/.tmp/garble-shared1535203339/HGoWHDsKwh/XS7r7lPHkTG.s failed

While here, update actions/checkout and staticcheck in CI.
6 months ago
pagran 5e80f12be7
implement flattening hardening
Without hardening, obfuscation is vulnerable to analysis via symbolic
execution because all keys are opened, and it is easy to trace their
connections. Added extendable (contribution-friendly) hardening
mechanism that makes it harder to determine relationship between key and
execution block through key obfuscation.

There are 2 hardeners implemented and both are compatible with literal
obfuscation, which can make analysis even more difficult.
6 months ago
pagran afe1aad916 Removing obsolete TODO
indirect import issue has already been fixed
10 months ago
pagran 9612b29423
add generic function support for control flow obfuscation 10 months ago
pagran 260cad2a3f
add "max" flag value and limits for control flow obfuscation parameters 11 months ago
pagran 0e2e483472
add control flow obfuscation
Implemented control flow flattening with additional features such as block splitting and junk jumps
11 months ago
Daniel Martí 6dd5c53a91 internal/linker: place files under GARBLE_CACHE
This means we now have a unified cache directory for garble,
which is now documented in the README.

I considered using the same hash-based cache used for pkgCache,
but decided against it since that cache implementation only stores
regular files without any executable bits set.
We could force the cache package to do what we want here,
but I'm leaning against it for now given that single files work OK.
12 months ago
Daniel Martí 7d1bd13778 replace our caching inside GOCACHE with GARBLE_CACHE
For each Go package we obfuscate, we need to store information about
how we obfuscated it, which is needed when obfuscating its dependents.
For example, if A depends on B to use the type B.Foo, A needs to know
whether or not B.Foo was obfuscated; it depends on B's use of reflect.

We record this information in a gob file, which is cached on disk.
To avoid rolling our own custom cache, and since garble is so closely
connected with cmd/go already, we piggybacked off of Go's GOCACHE.
In particular, for each build cache entry per `go list`'s Export field,
we would store a "garble" sibling file with that gob content.

However, this was brittle for two reasons:

1) We were doing this without cmd/go's permission or knowledge.
   We were careful to use filename suffixes similar to Export files,
   meaning that `go clean` and other commands would treat them the same.
   However, this could confuse cmd/go at any point in the future.

2) cmd/go trims cache entries in GOCACHE regularly, to keep the size of
   the build and test caches under control. Right now, this means that
   every 24h, any file not accessed in the last five days is deleted.
   However, that trimming heuristic is done per-file.
   If the trimming removed Garble's sibling file but not the original
   Export file, this could cause errors such as
   "cannot load garble export file" which users already ran into.

Instead, start using github.com/rogpeppe/go-internal/cache,
an exported copy of cmd/go's own cache implementation for GOCACHE.
Since we need an entirely separate directory, we introduce GARBLE_CACHE,
defaulting to the "garble" directory inside the user's cache directory.
For example, on Linux this would be ~/.cache/garble.

Inside GARBLE_CACHE, our gob file cache will be under "build",
which helps clarify that this cache is used when obfuscating Go builds,
and allows placing other kinds of caches inside GARBLE_CACHE.
For example, we already have a need for storing linker binaries,
which for now still use their own caching mechanism.

This commit does not make our cache properly resistant to removed files.
The proof is that our seed.txtar testscript still fails the second case.
However, we do rewrite all of our caching logic away from Export files,
which in itself is a considerable refactor, and we add a few TODOs.

One notable change is how we load gob files from dependencies
when building the cache entry for the current package.
We used to load the gob files from all packages in the Deps field.
However, that is the list of all _transitive_ dependencies.
Since these gob files are already flat, meaning they contain information
about all of their transitive dependencies as well, we need only load
the gob files from the direct dependencies, the Imports field.

Performance is largely unchanged, since the behavior is similar.
However, the change from Deps to Imports saves us some work,
which can be seen in the reduced mallocs per obfuscated build.

It's unclear why the binary size isn't stable.
When reverting the Deps to Imports change, it then settles at 5.386Mi,
which is almost exactly in between the two measurements below.
I'm not sure why, but that metric appears to be slightly unstable.

    goos: linux
    goarch: amd64
    pkg: mvdan.cc/garble
    cpu: AMD Ryzen 7 PRO 5850U with Radeon Graphics
            │    old     │             new              │
            │   sec/op   │   sec/op    vs base          │
    Build-8   11.09 ± 1%   11.08 ± 1%  ~ (p=0.796 n=10)

            │     old      │                 new                 │
            │    bin-B     │    bin-B      vs base               │
    Build-8   5.390Mi ± 0%   5.382Mi ± 0%  -0.14% (p=0.000 n=10)

            │      old      │               new               │
            │ cached-sec/op │ cached-sec/op  vs base          │
    Build-8     415.5m ± 4%     421.6m ± 1%  ~ (p=0.190 n=10)

            │     old     │                new                 │
            │ mallocs/op  │ mallocs/op   vs base               │
    Build-8   35.43M ± 0%   34.05M ± 0%  -3.89% (p=0.000 n=10)

            │    old     │             new              │
            │ sys-sec/op │ sys-sec/op  vs base          │
    Build-8   5.662 ± 1%   5.701 ± 2%  ~ (p=0.280 n=10)
12 months ago
Daniel Martí 83ee4d0509 internal/literals: add fuzzer
To inevstigate #721, I wrote this fuzzer to see if any particular
combination of string literals and literal obfuscators would result
in a broken program.

I didn't find anything, but I reckon this fuzzer can still be useful.
1 year ago
Daniel Martí 6933b42920 internal/linker: add README with docs
Copying these examples in each of the PRs doesn't feel necessary.
1 year ago
Daniel Martí 5adb4ac800 internal/linker: generate 1.20 patches again
From https://github.com/burrowers/go-patches/pull/3.
1 year ago
Daniel Martí 6e1e750755 internal/linker: add working patches for Go 1.21 (master)
From https://github.com/burrowers/go-patches/pull/4.
1 year ago
Daniel Martí 14daadbf85 internal/linker: patch each major Go version separately
Go 1.21 already breaks one of the three patches.
The complexity of the patches will likely increase,
and we only ever need to actively maintain the highest set of patches,
so splitting by major version feels natural.
1 year ago
Daniel Martí d6fd552245 internal/linker: show `git apply` stderr on error
Otherwise we don't see why a patch failed to apply.
1 year ago
Dominic Breuker b587d8c01a
use the "simple" obfuscator for large literals
Changes literal obfuscation such that literals of any size will be obfuscated,
but beyond `maxSize` we only use the `simple` obfuscator.
This one seems to apply AND, OR, or XOR operators byte-wise and should be safe to use,
unlike some of the other obfuscators which are quadratic on the literal size or worse.

The test for literals is changed a bit to verify that obfuscation is applied.
The code written to the `extra_literals.go` file by the test helper now ensures
that Go does not optimize the literals away when we build the binary.
We also append a unique string to all literals so that we can test that
an unobfuscated build contains this string while an obfuscated build does not.
1 year ago
Daniel Martí cf49f7f3b1 use the host's GOEXE when building the linker
We're building the linker binary for the host GOOS,
not the target GOOS that we happen to be building for.

I noticed that, after running `go test`, my garble cache
would contain both link and link.exe, which made no sense
as I run linux and not windows.

`go env` has GOHOSTOS to mirror GOOS, but there is no
GOHOSTEXE to mirror GOEXE, so we reconstruct it from
runtime.GOOS, which is equivalent to GOHOSTOS.

Add a regression test as well.
1 year ago
pagran b87830eb97
use git in english when matching its output
See db58416d7f.

Fixes #698.
1 year ago
Daniel Martí 059c1d68e2 use fewer build flags when building std or cmd
When we use `go list` on the standard library, we need to be careful
about what flags are passed from the top-level build command,
because some flags are not going to be appropriate.
In particular, GOFLAGS=-modfile=... resulted in a failure,
reproduced via the GOFLAGS variable added to linker.txtar:

	go: inconsistent vendoring in /home/mvdan/tip/src:
		golang.org/x/crypto@v0.5.1-0.20230203195927-310bfa40f1e4: is marked as explicit in vendor/modules.txt, but not explicitly required in go.mod
		golang.org/x/net@v0.7.0: is marked as explicit in vendor/modules.txt, but not explicitly required in go.mod
		golang.org/x/sys@v0.5.1-0.20230208141308-4fee21c92339: is marked as explicit in vendor/modules.txt, but not explicitly required in go.mod
		golang.org/x/text@v0.7.1-0.20230207171107-30dadde3188b: is marked as explicit in vendor/modules.txt, but not explicitly required in go.mod

		To ignore the vendor directory, use -mod=readonly or -mod=mod.
		To sync the vendor directory, run:
			go mod vendor

To work around this problem, reset the -mod and -modfile flags when
calling "go list" on the standard library, as those are the only two
flags which alter how we load the main module in a build.

The code which builds a modified cmd/link has a similar problem;
it already reset GOOS and GOARCH, but it could similarly run into
problems if other env vars like GOFLAGS were set.
To be on the safe side, we also disable GOENV and GOEXPERIMENT,
which we borrow from Go's bootstrapping commands.
1 year ago
pagran 86b7e334ba
implement funcInfo.entryoff encryption
At linker stage, we now encrypt funcInfo.entryoff value with a simple algorithm (1 xor + 1 mul). 
This makes it harder to relate function metadata (e.g. name) to function itself in binary, almost without affecting performance.
1 year ago
Daniel Martí 859a5877c9 internal/literals: re-enable the seed obfuscator
Now that we dropped Go 1.19, we can use it again,
since the referenced bug was fixed in Go 1.20.

This is a separate commit, as this change does alter the way we
obfuscate Go builds, so it's not just a cleanup.
1 year ago
pagran 9a8608f061
internal/literals: add benchmark to measure the run-time overhead 1 year ago
pagran b0f8dfb409
prevent "git apply" from walking parent directories
Even when setting git's current directory to a temporary directory,
it could find a git repository in a parent directory and still malfunction.
Use the --git-dir flag to ensure that walking doesn't happen at all.

While here, ensure that "git apply" is always applying our patches,
and add a regression test to linker.txtar when not testing with -short.
1 year ago
pagran f7bde1d40e
don't use the current directory to patch and build cmd/link
We were noticing sporadic `go test` failures where running
an obfuscated binary could panic with:

    fatal error: invalid function symbol table

It turns out that this could happen when modinfo.txtar runs first;
when it patched the linker with `git -C apply`, its current directory
had a valid git repository initialized, and apparently that somehow
prevents the patches from being applied.

It's unclear whether this is git's fault or our own, but in any case,
using a temporary working directory is easier and fixes the bug.
Do the same for `go build`, just in case.
1 year ago
pagran e04c605c93 preallocate data in shuffle and split literal obfuscator 1 year ago
lu4p 5b2193351f Decrease binary size for -literals
Only string literals over 8 characters in length are now being
obfuscated. This leads to around 20% smaller binaries when building with
-literals.

Fixes #618
1 year ago
pagran 45394ac13f
update linker patches from our fork 1 year ago
pagran 22c177f088
remove all unexported func names with -tiny via the linker
The patch to the linker does this when generating the pclntab,
which is the binary section containing func names.
When `-tiny` is being used, we look for unexported funcs,
and set their names to the offset `0` - a shared empty string.

We also avoid including the original name in the binary,
which saves a significant amount of space.
The following stats were collected on GOOS=linux,
which show that `-tiny` is now about 4% smaller:

	go build                  1203067
	garble build               782336
	(old) garble -tiny build   688128
	(new) garble -tiny build   659456
1 year ago
pagran 6ace03322f
patch and rebuild cmd/link to modify the magic value in pclntab
This value is hard-coded in the linker and written in a header.
We could rewrite the final binary, like we used to do with import paths,
but that would require once again maintaining libraries to do so.

Instead, we're now modifying the linker to do what we want.
It's not particularly hard, as every Go install has its source code,
and rebuilding a slightly modified linker only takes a few seconds at most.

Thanks to `go build -overlay`, we only need to copy the files we modify,
and right now we're just modifying one file in the toolchain.
We use a git patch, as the change is fairly static and small,
and the patch is easier to understand and maintain.

The other side of this change is in the runtime,
as it also hard-codes the magic value when loading information.
We modify the code via syntax trees in that case, like `-tiny` does,
because the change is tiny (one literal) and the affected lines of code
are modified regularly between major Go releases.

Since rebuilding a slightly modified linker can take a few seconds,
and Go's build cache does not cache linked binaries,
we keep our own cached version of the rebuilt binary in `os.UserCacheDir`.

The feature isn't perfect, and will be improved in the future.
See the TODOs about the added dependency on `git`,
or how we are currently only able to cache one linker binary at once.

Fixes #622.
1 year ago
Daniel Martí d955196470 avoid using math/rand's global funcs like Seed and Intn
Go 1.20 is starting to deprecate the use of math/rand's global state,
per https://go.dev/issue/56319 and https://go.dev/issue/20661.
The reasoning is sound:

	Deprecated: Programs that call Seed and then expect a specific sequence
	of results from the global random source (using functions such as Int)
	can be broken when a dependency changes how much it consumes from the
	global random source. To avoid such breakages, programs that need a
	specific result sequence should use NewRand(NewSource(seed)) to obtain a
	random generator that other packages cannot access.

Aside from the tests, we used math/rand only for obfuscating literals,
which caused a deterministic series of calls like Intn. Our call to Seed
was also deterministic, per either GarbleActionID or the -seed flag.

However, our determinism was fragile. If any of our dependencies or
other packages made any calls to math/rand's global funcs, then our
determinism could be broken entirely, and it's hard to notice.

Start using separate math/rand.Rand objects for each use case.
Also make uses of crypto/rand use "cryptorand" for consistency.

Note that this requires a bit of a refactor in internal/literals
to start passing around Rand objects. We also do away with unnecessary
short funcs, especially since math/rand's Read never errors,
and we can obtain a byte via math/rand's Uint32.
1 year ago
Daniel Martí 3c7141e801 update the state of a few TODOs related to upstream Go
The generics issue has been fixed for the upcoming Go 1.20.
Include that version as a reminder for when we can drop Go 1.19.

The fs.SkipAll proposal is also implemented for Go 1.20.

The BinaryContentID comment was a little bit trickier.
We did get stamped VCS information some time ago,
but it only provides us with the current commit info and a dirty bit.
That is not enough for our use of the build cache,
because we want any uncommitted changes to garble to cause rebuilds.

I don't think we'll get any better than using garble's own build ID.
Reword the quasi-TODO to instead explain what we're doing and why.
2 years ago
Daniel Martí 21bd89ff73 slight simplifications and alloc reductions
Reuse a buffer and a map across loop iterations, because we can.

Make recordTypeDone only track named types, as that is enough to detect
type cycles. Without named types, there can be no cycles.

These two reduce allocs by a fraction of a percent:

	name      old time/op         new time/op         delta
	Build-16          10.4s ± 2%          10.4s ± 1%    ~     (p=0.739 n=10+10)

	name      old bin-B           new bin-B           delta
	Build-16          5.51M ± 0%          5.51M ± 0%    ~     (all equal)

	name      old cached-time/op  new cached-time/op  delta
	Build-16          391ms ± 9%          407ms ± 7%    ~     (p=0.095 n=10+9)

	name      old mallocs/op      new mallocs/op      delta
	Build-16          34.5M ± 0%          34.4M ± 0%  -0.12%  (p=0.000 n=10+10)

	name      old sys-time/op     new sys-time/op     delta
	Build-16          5.87s ± 5%          5.82s ± 5%    ~     (p=0.182 n=10+9)

It doesn't seem like much, but remember that these stats are for the
entire set of processes, where garble only accounts for about 10% of the
total wall time when compared to the compiler or linker. So a ~0.1%
decrease globally is still significant.

linkerVariableStrings is also indexed by *types.Var rather than types.Object,
since -ldflags=-X only supports setting the string value of variables.
This shouldn't make a significant difference in terms of allocs,
but at least the map is less prone to confusion with other object types.
To ensure the new code doesn't trip up on non-variables, we add test cases.

Finally, for the sake of clarity, index into the types.Info maps like
Defs and Uses rather than calling ObjectOf if we know whether the
identifier we have is a definition of a name or the use of a defined name.
This isn't better in terms of performance, as ObjectOf is a tiny method,
but just like with linkerVariableStrings before, the new code is clearer.
2 years ago
lu4p 84ba444b7c
Disable seed obfuscator (#535)
The seed obfuscator uses a type declaration in order to declare a function,
which returns a function with the same type.

This breaks when obfuscating literals inside generic functions, because
type declarations inside generic functions are not currently supported.

Therefore the obfuscator gets disabled until
https://github.com/golang/go/issues/47631 is fixed.
2 years ago
shellhazard 22e3d30216
support code taking the address of a []byte literal (#530) 2 years ago
lu4p d555639657 Remove unused imports via go/types.
Fixes #481
2 years ago
Daniel Martí 2d4cc49d50 CI: bump gotip to February
While here, fix two typos.
2 years ago
Daniel Martí c9341790d4 avoid obfuscating literals set via -ldflags=-X
The -X linker flag sets a string variable to a given value,
which is often used to inject strings such as versions.

The way garble's literal obfuscation works,
we replace string literals with anonymous functions which,
when evaluated, result in the original string.

Both of these features work fine separately,
but when intersecting, they break. For example, given:

	var myVar = "original"
	[...]
	-ldflags=-X=main.myVar=replaced

The -X flag effectively replaces the initial value,
and -literals adds code to be run at init time:

	var myVar = "replaced"
	func init() { myVar = func() string { ... } }

Since the init func runs later, -literals breaks -X.
To avoid that problem,
don't obfuscate literals whose variables are set via -ldflags=-X.

We also leave TODOs about obfuscating those in the future,
but we're also leaving regression tests to ensure we get it right.

Fixes #323.
2 years ago
Daniel Martí d25e718d0c stop passing ignoreObjects to literals.Obfuscate
Literal obfuscation uses constant folding now,
so it no longer needs to record identifiers to ignore.
Remove the parameter and the outdated bit of docs.
2 years ago
Daniel Martí 4f0657a19a prepare for v0.5.0
While here, add a TODO I forgot about, and run gofumpt.

Also bump all test timeouts slightly,
as the Mac and Windows hosted runners are a bit slow
and I've hit failures twice recently.
2 years ago
Daniel Martí 29ea99fc5f CI: test on GOARCH=386
Note that this cross-compilation disables cgo by default,
and so the cgo.txt test script isn't run on GOARCH=386.
That seems fine for now, as the test isn't arch-specific.

This testing uncovered one build failure in internal/literals;
the comparison between int and math.MaxUint32 is invalid on 32-bit.
To fix that build failure, use int64 consistently.

One test also incorrectly assumed amd64; it now supports 386 too.
For any other architecture, it's being skipped for now.

I also had to increase the -race test timeout,
as it usually takes 8-9m on GitHub Actions,
and the timeout would sometimes trigger.

Finally, use "go env" rather than "go version" on CI,
which gives us much more useful information,
and also includes Go's own version now via GOVERSION.

Fixes #426.
2 years ago
lu4p a645929151
obfuscate literals via constant folding
Constants don't need to be added to ignoreObjs anymore,
because go/types now does this work for us.

Fixes #360
3 years ago
hasheddan 6b632d07e2 Fix minor typo in RecordUsedAsConstants docstring
Updates RecordUsedAsConstants docstring with minor typo fix for
identifiers.

Signed-off-by: hasheddan <georgedanielmangum@gmail.com>
3 years ago
lu4p 3ab59000f3 Follow up: Obfuscate more byte slice literals 3 years ago
Daniel Martí 691a44cecb avoid breaking const declarations using iotas
With the -literals flag, we try to convert some const declarations to
vars, as long as that doesn't break typechecking. We really only do that
for typed constant strings, really.

There was a quirk: if a numerical constant had a type and used iota, we
would not obfuscate its value, but we would still convert the
declaration from const to var. Since iotas only work within const
declarations, that would break compilation:

	> garble -literals build
	[stderr]
	# test/main
	FeWE3zwi.go:19: undefined: iota
	exit status 2

To fix the problem, make the logic more conservative: only obfuscate
constant declarations where the values are typed strings, meaning that
any numerical constants are left entirely untouched.

This fixes the build of google.golang.org/protobuf/runtime/protoiface
with -literals turned on.
3 years ago
lu4p c1672cdc0d Obfuscate more byte slice literals
Slices with hex, octal, binary and rune elements are now obfuscated.
3 years ago
lu4p 552a6bcfb0 Obfuscate literals in string slices and arrays
Fixes #354
3 years ago