garble

Commit Graph

Author	SHA1	Message	Date
Daniel Martí	9cb4a6f0c8	amend panic message after decodeHash got renamed	3 months ago
Daniel Martí	6f0e46f80b	strip struct tags when hashing structs for type identity This was a long standing TODO, and a user finally ran into it. The fix isn't terribly straightforward, but I'm pretty happy with it. Add a test case where the same struct field is identical with no tag, with the "tagged1" json tag, and again with "tagged2". While here, we add a test case for a regular named field too. Fixes #801.	6 months ago
Daniel Martí	978fd6d518	appease Go 1.22's stricter base64 sanity checks We were using an alphabet with a duplicate character on purpose. Go 1.21 was perfectly fine with that, but 1.22 started noticing: panic: encoding alphabet includes duplicate symbols I can't fault the new sanity check, because it makes sense in general. What we are doing here is slightly bizarre, because we don't care about decoding the name hashes at all. Appease the sanity check by replacing dashes with duplicate characters as a follow-up step. While here, use "a" rather than "z", which is more common and less likely to be noticeable.	7 months ago
pagran	0e2e483472	add control flow obfuscation Implemented control flow flattening with additional features such as block splitting and junk jumps	11 months ago
Daniel Martí	cee53a7868	make GarbleActionID a full sha256 hash This is in preparation for the switch to Go's cache package, whose ActionID type is also a full sha256 hash with 32 bytes. We were using "short" hashes as shown by `go tool buildid`, since that was consistent and 15 bytes was generally enough.	1 year ago
Daniel Martí	414e3b7f70	tidy our build ID hash code a bit First, rename "component" to "hash", since it's shorter and more useful. A full build ID is two or four hashes joined with slashes. Second, add sanity checks that buildIDHashLength is being followed. Otherwise the use of []byte could lead to human error. Third, move all the hash encoding and decoding logic together.	1 year ago
Daniel Martí	0c9a59127a	rename cache global to sharedCache Since we will start importing github.com/rogpeppe/go-internal/cache, and I don't want to have to rename it or leave confusion around.	1 year ago
pagran	86b7e334ba	implement funcInfo.entryoff encryption At linker stage, we now encrypt funcInfo.entryoff value with a simple algorithm (1 xor + 1 mul). This makes it harder to relate function metadata (e.g. name) to function itself in binary, almost without affecting performance.	1 year ago
pagran	9a8608f061	internal/literals: add benchmark to measure the run-time overhead	1 year ago
pagran	6ace03322f	patch and rebuild cmd/link to modify the magic value in pclntab This value is hard-coded in the linker and written in a header. We could rewrite the final binary, like we used to do with import paths, but that would require once again maintaining libraries to do so. Instead, we're now modifying the linker to do what we want. It's not particularly hard, as every Go install has its source code, and rebuilding a slightly modified linker only takes a few seconds at most. Thanks to `go build -overlay`, we only need to copy the files we modify, and right now we're just modifying one file in the toolchain. We use a git patch, as the change is fairly static and small, and the patch is easier to understand and maintain. The other side of this change is in the runtime, as it also hard-codes the magic value when loading information. We modify the code via syntax trees in that case, like `-tiny` does, because the change is tiny (one literal) and the affected lines of code are modified regularly between major Go releases. Since rebuilding a slightly modified linker can take a few seconds, and Go's build cache does not cache linked binaries, we keep our own cached version of the rebuilt binary in `os.UserCacheDir`. The feature isn't perfect, and will be improved in the future. See the TODOs about the added dependency on `git`, or how we are currently only able to cache one linker binary at once. Fixes #622.	1 year ago
Daniel Martí	41df1f8725	slightly reduce the range of hashed name lengths v0.8.0 was improved to obfuscate names so that they didn't all end up with the same length. We went from a length fixed to 8 to a range between 8 and 15. Since the new mechanism chooses a length evenly between 8 and 15, the average hashed name length went from 8 to 11.5. While this improved obfuscation, it also increased the size of obfuscated binaries by quite a bit. We do need to use a reasonably large length to avoid collisions within packages, but we also don't want it to be large enough to cause too much bloat. Reduce the minimum and maximum from 8 and 15 to 6 and 12. This means that the average goes fro 11.5 to 9, much closer to the old average. We could consider a range of 4 to 12, but 4 bytes is short enough where collisions become likely in practice, even if it's just the minimum. And a range of 6 to 10 shrinks the range a bit too much. name old time/op new time/op delta Build-16 20.6s ± 0% 20.6s ± 0% ~ (p=0.421 n=5+5) name old bin-B new bin-B delta Build-16 5.77M ± 0% 5.66M ± 0% -1.92% (p=0.008 n=5+5) name old cached-time/op new cached-time/op delta Build-16 705ms ± 4% 703ms ± 2% ~ (p=0.690 n=5+5) name old mallocs/op new mallocs/op delta Build-16 25.0M ± 0% 25.0M ± 0% -0.08% (p=0.008 n=5+5) name old sys-time/op new sys-time/op delta Build-16 8.11s ± 2% 8.11s ± 2% ~ (p=0.841 n=5+5) Updates #618.	1 year ago
Daniel Martí	7ead5998bc	drop shadowed and unused global consts I forgot these in the recent hash length randomness refactor.	1 year ago
Daniel Martí	481e3a1f09	default to GOGARBLE=, stop using GOPRIVATE We can drop the code that kicked in when GOGARBLE was empty. We can also add the value in addGarbleToHash unconditionally, as we never allow it to be empty. In the tests, remove all GOGARBLE lines where it just meant "obfuscate everything" or "obfuscate the entire main module". cgo.txtar had "obfuscate everything" as a separate step, so remove it entirely. linkname.txtar started failing because the imported package did not import strings, so listPackage errored out. This wasn't a problem when strings itself wasn't obfuscated, as transformLinkname silently left strings.IndexByte untouched. It is a problem when IndexByte does get obfuscated. Make that kind of listPackage error visible, and fix it. reflect.txtar started failing with "unreachable method" runtime throws. It's not clear to me why; it appears that GOGARBLE= makes the linker think that ExportedMethodName is suddenly unreachable. Work around the problem by making the method explicitly reachable, and leave a TODO as a reminder to investigate. Finally, gogarble.txtar no longer needs to test for GOPRIVATE. The rest of the test is left the same, as we still want the various values for GOGARBLE to continue to work just like before. Fixes #594.	1 year ago
Daniel Martí	b6a0284f84	immprove how hashWithCustomSalt comes up with its random lengths The last change made it so that hashWithCustomSalt does not always end up with 8 base64 characters, which is a good change for the sake of avoiding easy patterns in obfuscated code. However, the docs weren't updated accordingly, and it wasn't particularly clear that the byte giving us randomness wasn't part of the resulting base64-encoded name. First, refactor the code to only feed as many checksum bytes to the base64 encoder as necessary, which is 12. This shrinks b64NameBuffer and saves us some base64 encoding work. Second, use the first checksum byte that we don't use, the 13th, as the source of the randomness. Note how before we used a base64-encoded byte for the randomness, which isn't great as that byte was only one of 63 characters, whereas a checksum byte is one of 256. Third, update the docs so that the code is as clear as possible. This is particularly important given that we have no tests. With debug prints in the gogarble.txt test, we can see that the randomness in hash lengths is working as intended: # test/main/stdimporter hashLength = 13 hashLength = 8 hashLength = 12 hashLength = 15 hashLength = 10 hashLength = 15 hashLength = 9 hashLength = 8 hashLength = 15 hashLength = 15 hashLength = 12 hashLength = 10 hashLength = 13 hashLength = 13 hashLength = 8 hashLength = 15 hashLength = 11 Finally, add a regression test that will complain if we end up with hashed names that reuse the same length too often. Out of eight hashed names, the test will fail if six end up with the same length, as that is incredibly unlikely given that each should pick one of eight lengths with a fair distribution.	2 years ago
Azrotronik	73b77ce6be	randomize the length of hashes used for identifiers and filenames Otherwise all of those names share the same exact length, which can be a rather easy pattern to spot that garble was used.	2 years ago
Daniel Martí	c1c90fee13	make obfuscation fully deterministic with -seed The default behavior of garble is to seed via the build inputs, including the build IDs of the entire Go build of each package. This works well as a default, and does give us determinism, but it means that building for different platforms will result in different obfuscation per platform. Instead, when -seed is provided, don't use any other hash seed or salt. This means that a particular Go name will be obfuscated the same way as long as the seed, package path, and name itself remain constant. In other words, when the user supplies a custom -seed, we assume they know what they're doing in terms of storage and rotation. Expand the README docs with more examples and detail. Fixes #449.	2 years ago
Daniel Martí	ad87a0e2bc	don't let -debug affect the build cache hashes Noticed while debugging #492, as adding -debug to a previously run command would unexpectedly rebuild all packages in the build, as if -a was given. While here, remove commented out testscript that were kept in error.	2 years ago
Daniel Martí	70b1cb2fd8	CI: start enforcing vet and staticcheck Fix a staticcheck warning about unused code, as well as an unparam warning and a missing copyright header. We also bump the action versions to their latest releases, and drop unnecessary "name" fields for self-describing steps. Note that we drop the "go env" commands, as setup-go does that now. Finally, I did briefly try to add caching, but then realised it didn't help us at all. Document why.	2 years ago
Daniel Martí	cffb5acd11	remove another allocation per hashed name We use a buffer to turn a hash into a name via base64. We can reuse that buffer to save one repeated allocation. The returned value is still unique, as we convert to a string. name old time/op new time/op delta Build-16 9.26s ± 3% 9.39s ± 2% ~ (p=0.151 n=5+5) name old bin-B new bin-B delta Build-16 5.16M ± 0% 5.16M ± 0% ~ (all equal) name old cached-time/op new cached-time/op delta Build-16 314ms ± 5% 324ms ± 6% ~ (p=0.310 n=5+5) name old mallocs/op new mallocs/op delta Build-16 29.8M ± 0% 29.4M ± 0% -1.38% (p=0.008 n=5+5) name old sys-time/op new sys-time/op delta Build-16 4.70s ± 4% 4.66s ± 2% ~ (p=0.548 n=5+5)	2 years ago
Daniel Martí	cdc1efd95b	avoid allocating twice for every name we hash name old time/op new time/op delta Build-16 9.25s ± 2% 9.26s ± 3% ~ (p=0.841 n=5+5) name old bin-B new bin-B delta Build-16 5.16M ± 0% 5.16M ± 0% ~ (all equal) name old cached-time/op new cached-time/op delta Build-16 316ms ± 5% 314ms ± 5% ~ (p=1.000 n=5+5) name old mallocs/op new mallocs/op delta Build-16 30.7M ± 0% 29.8M ± 0% -2.90% (p=0.008 n=5+5) name old sys-time/op new sys-time/op delta Build-16 4.78s ± 4% 4.70s ± 4% ~ (p=0.690 n=5+5)	2 years ago
Daniel Martí	ff803004d0	avoid build ID mismatches when using -debugdir A recent change added the -debugdir value to addGarbleToHash, which is part of the hash seed for all obfuscation taking place. In principle, it was okay to add, just like any other garble flag. In practice, it broke the added test case: > garble -debugdir ./debug1 build [stderr] # test/main FBi9xa6e.(*ac0bCOhR).String: relocation target FBi9xa6e.rV55e6H9 not defined qmECK6zf.init.0: relocation target qmECK6zf.eUU08z98 not defined [...] This is because -debugdir gets turned into an absolute path, and not all garble processes ended up using it consistently. The fix is rather simple; since -debugdir doesn't affect obfuscation, don't include it in the build hash seeding at all. Fixes #451.	2 years ago
Daniel Martí	06d5972223	add -debug to aid in debugging (#431 ) Not really meant for end users, but they might still debug failures before filing bugs. We add the -debug flag itself, as well as machinery to deduplicate output lines. There are quite a lot of them otherwise, which aren't helpful and simply add noise. In the future, if we always want to output a debug log line, such as "choosing not to obfuscate here because X", we can simply insert the unique position string. Finally, turn all commented-out log.Printf calls to debugf. Improve a few log lines to be more human-friendly, and also add a few extras like how long it takes to load files. We can improve the logging further in the future. This seems like a good starting point.	2 years ago
Daniel Martí	5f74a1c9f0	unify the definition and storage of flag values The parent garble process parses the original flags, as provided by the user via the command line. Previously, those got stored in the shared cache file, so that child processes spawned by toolexec could see them. Unfortunately, this made the code relatively easy to misuse. A child process would always see flagLiterals as zero value, given that it should never see such a flag argument directly. Similarly, one would have to be careful with cached options, as they could only be consumed after the cache file is loaded. Simplify the situation by deduplicating the storage of flags. Now, the parent passes all flags onto children via toolexec. One exception is GarbleDir, which now becomes an env var. This seems in line with other top-level dirs like GARBLE_SHARED. Finally, we turn -seed into a flag.Value, which lets us implement its "set" behavior as part of flag.Parse. Overall, we barely reduce the amount of code involved, but we certainly remove a couple of footguns. As part of the cleanup, we also introduce appendFlags.	2 years ago
Daniel Martí	fceb19f6da	deprecate using GOPRIVATE in favor of GOGARBLE (#427 ) Piggybacking off of GOPRIVATE is great for a number of reasons: * People tend to obfuscate private code, whose package paths will generally be in GOPRIVATE already * Its meaning and syntax are well understood * It allows all the flexibility we need without adding our own env var or config option However, using GOPRIVATE directly has one main drawback. It's fairly common to also want to obfuscate public dependencies, to make the code in private packages even harder to follow. However, using "GOPRIVATE=" will result in two main downsides: GONOPROXY defaults to GOPRIVATE, so the proxy would be entirely disabled. Downloading modules, such as when adding or updating dependencies, or when the local cache is cold, can be less reliable. * GONOSUMDB defaults to GOPRIVATE, so the sumdb would be entirely disabled. Adding entries to go.sum, such as when adding or updating dependencies, can be less secure. We will continue to consume GOPRIVATE as a fallback, but we now expect users to set GOGARBLE instead. The new logic is documented in the README. While here, rewrite some uses of "private" with "to obfuscate", to make the code easier to follow and harder to misunderstand. Fixes #276.	2 years ago
Daniel Martí	b5bef981ee	stop relying on nested "go list -toolexec" calls (#422 ) We rely on importcfg files to load type info for obfuscated packages. We use this type information to remember what names we didn't obfuscate. Unfortunately, indirect dependencies aren't listed in importcfg files, so we relied on extra "go list -toolexec" calls to locate object files. This worked fine, but added a significant amount of complexity. The extra "go list -export -toolexec=garble" invocations weren't slow, as they avoided rebuilding or re-obfuscating thanks to the build cache. Still, it was hard to reason about how garble runs during a build if we might have multiple layers of -toolexec invocations. Instead, record the export files we encounter in an incremental map, and persist it in the build cache via the gob file we're already using. This way, each garble invocation knows where all object files are, even those for indirect imports. One wrinkle is that importcfg files can point to temporary object files. In that case, figure out its final location in the build cache. This requires hard-coding a bit of knowledge about how GOCACHE works, but it seems relatively harmless given how it's very little code. Plus, if GOCACHE ever changes, it will be obvious when our code breaks. Finally, add a TODO about potentially saving even more work.	3 years ago
Daniel Martí	29356f30f7	update runtimeAndDeps for Go 1.18 In particular, internal/abi now has some actual code, so obfuscating those literals was breaking as expected. Document how to update the list in the future as well. The change above gets "go test" to just one test failure on: go version devel go1.18-578ada410d Tue Nov 9 22:58:24 2021 +0000 linux/amd64 We also move the doc about why we disable GarbleLiterals, so that it's next to where the disabling happens. While here, we also rename GarbleLiterals to ObfuscateLiterals, as we have been trying to move away from "to garble" as a verb. Finally, limit the verbosity of diffoscope. One test was failing for me, and diffoscope printed thousands of lines. Not particularly useful when I'm trying to skim test results. Usually, seeing a few dozen lines of output is enough. Updates #385.	3 years ago
Daniel Martí	c9b0b07853	hash field names equally in all packages Packages P1 and P2 can define identical struct types T1 and T2, and one can convert from type T1 to T2 or vice versa. The spec defines two identical struct types as: Two struct types are identical if they have the same sequence of fields, and if corresponding fields have the same names, and identical types, and identical tags. Non-exported field names from different packages are always different. Unfortunately, garble broke this: since we obfuscated field names differently depending on the package, cross-package conversions like the case above would result in typechecking errors. To fix this, implement Joe Tsai's idea: hash struct field names with the string representation of the entire struct. This way, identical struct types will have their field names obfuscated in the same way in all packages across a build. Note that we had to refactor "reverse" a bit to start using transformer, since now it needs to keep track of struct types as well. This failure was affecting the build of google.golang.org/protobuf, since it makes regular use of cross-package struct conversions. Note that the protobuf module still fails to build, but for other reasons. The package that used to fail now succeeds, so the build gets a bit further than before. #240 tracks adding relevant third-party Go modules to CI, so we'll track the other remaining failures there. Fixes #310.	3 years ago
Daniel Martí	3afc993266	use "go env -json" to collect env info all at once In the worst case scenario, when GOPRIVATE isn't set at all, we would run these three commands: * "go env GOPRIVATE", to fetch GOPRIVATE itself * "go list -m", for GOPRIVATE's fallback * "go version", to check the version of Go being used Now that we support Go 1.16 and later, all these three can be obtained via "go env -json": $ go env -json GOPRIVATE GOMOD GOVERSION { "GOMOD": "/home/mvdan/src/garble/go.mod", "GOPRIVATE": "", "GOVERSION": "go1.16.3" } Note that we don't get the module path directly, but we can use the x/mod/modfile Go API to parse it from the GOMOD file cheaply. Notably, this also simplifies our Go version checking logic, as now we get just the version string without the "go version" prefix and "GOOS/GOARCH" suffix we don't care about. This makes our code a bit more maintainable and robust. When running a short incremental build, we can also see a small speed-up, as saving two "go" invocations can save a few milliseconds: name old time/op new time/op delta Build/Cache-8 168ms ± 0% 166ms ± 1% -1.26% (p=0.009 n=6+6) name old bin-B new bin-B delta Build/Cache-8 6.36M ± 0% 6.36M ± 0% +0.12% (p=0.002 n=6+6) name old sys-time/op new sys-time/op delta Build/Cache-8 222ms ± 2% 219ms ± 3% ~ (p=0.589 n=6+6) name old user-time/op new user-time/op delta Build/Cache-8 857ms ± 1% 846ms ± 1% -1.31% (p=0.041 n=6+6)	3 years ago
Daniel Martí	10ec00b37a	make flags like -literals and GOPRIVATE affect hashing (#288 ) In `6898d61637`, we switched from using action IDs from "go list -toolexec=garble" to those from the original "go list". We still wanted the obfuscation and hashing to change if the version of garble changes, so we hashed that "original action ID" with garble's own content ID, and called the new hash "garble action ID". While working on a different patch, I noticed something weird: with the new mechanism, adding or removing flags like -literals did not alter those hashes, unlike the old method. This is because the old method used ownContentID, which includes such bits of information, but the new method does not. Change that, and add a test that locks in the behavior we want. In seed.txt, we check that a single function name gets hashed in particular ways in different scenarios. Note that we use a mix of "cmp" and "! bincmp", since the former has no negated form. While at it, the seed.txt test is revamped a bit. Now, we only run with -literals once, as this test is mainly about -seed. We also declare seed strings once, as environment variables, which makes it easier to track what each step is doing.	3 years ago
Daniel Martí	b03cd08c09	avoid one more call to 'go tool buildid' (#253 ) We use it to get the content ID of garble's binary, which is used for both the garble action IDs, as well as 'go tool compile -V=full'. Since those two happen in separate processes, both used to call 'go tool buildid' separately. Store it in the gob cache the first time, and reuse it the second time. Since each call to cmd/go costs about 10ms (new process, running its many init funcs, etc), this results in a nice speed-up for our small benchmark. Most builds will take many seconds though, so note that a ~15ms speedup there will likely not be noticeable. While at it, simplify the buildInfo global, as now it just contains a map representation of the -importcfg contents. It now has better names, docs, and a simpler representation. We also stop using the term "garbled import", as it was a bit confusing. "obfuscated types.Package" is a much better description. name old time/op new time/op delta Build-8 106ms ± 1% 92ms ± 0% -14.07% (p=0.010 n=6+4) name old bin-B new bin-B delta Build-8 6.60M ± 0% 6.60M ± 0% -0.01% (p=0.002 n=6+6) name old sys-time/op new sys-time/op delta Build-8 208ms ± 5% 149ms ± 3% -28.27% (p=0.004 n=6+5) name old user-time/op new user-time/op delta Build-8 433ms ± 3% 384ms ± 3% -11.35% (p=0.002 n=6+6)	3 years ago
Daniel Martí	6898d61637	start using original action IDs (#251 ) When we obfuscate a name, what we do is hash the name with the action ID of the package that contains the name. To ensure that the hash changes if the garble tool changes, we used the action ID of the obfuscated build, which is different than the original action ID, as we include garble's own content ID in "go tool compile -V=full" via -toolexec. Let's call that the "obfuscated action ID". Remember that a content ID is roughly the hash of a binary or object file, and an action ID contains the hash of a package's source code plus the content IDs of its dependencies. This had the advantage that it did what we wanted. However, it had one massive drawback: when we compile a package, we only have the obfuscated action IDs of its dependencies. This is because one can't have the content ID of dependent packages before they are built. Usually, this is not a problem, because hashing a foreign name means it comes from a dependency, where we already have the obfuscated action ID. However, that's not always the case. First, go:linkname directives can point to any symbol that ends up in the binary, even if the package is not a dependency. So garble could only support linkname targets belonging to dependencies. This is at the root of why we could not obfuscate the runtime; it contains linkname directives targeting the net package, for example, which depends on runtime. Second, some other places did not have an easy access to obfuscated action IDs, like transformAsm, which had to recover it from a temporary file stored by transformCompile. Plus, this was all pretty expensive, as each toolexec sub-process had to make repeated calls to buildidOf with the object files of dependencies. We even had to use extra calls to "go list" in the case of indirect dependencies, as their export files do not appear in importcfg files. All in all, the old method was complex and expensive. A better mechanism is to use the original action IDs directly, as listed by "go list" without garble in the picture. This would mean that the hashing does not change if garble changes, meaning weaker obfuscation. To regain that property, we define the "garble action ID", which is just the original action ID hashed together with garble's own content ID. This is practically the same as the obfuscated build ID we used before, but since it doesn't go through "go tool compile -V=full" and the obfuscated build itself, we can work out all the garble action IDs upfront, before the obfuscated build even starts. This fixes all of our problems. Now we know all garble build IDs upfront, so a bunch of hacks can be entirely removed. Plus, since we know them upfront, we can also cache them and avoid repeated calls to "go tool buildid". While at it, make use of the new BuildID field in Go 1.16's "list -json -export". This avoids the vast majority of "go tool buildid" calls, as the only ones that remain are 2 on the garble binary itself. The numbers for Go 1.16 look very good: name old time/op new time/op delta Build-8 146ms ± 4% 101ms ± 1% -31.01% (p=0.002 n=6+6) name old bin-B new bin-B delta Build-8 6.61M ± 0% 6.60M ± 0% -0.09% (p=0.002 n=6+6) name old sys-time/op new sys-time/op delta Build-8 321ms ± 7% 202ms ± 6% -37.11% (p=0.002 n=6+6) name old user-time/op new user-time/op delta Build-8 538ms ± 4% 414ms ± 4% -23.12% (p=0.002 n=6+6)	3 years ago
Daniel Martí	a223147093	use more bits for the obfuscated name hashes (#248 ) We've been using four base64 characters for obfuscated names for a while. And that has mostly worked, since most packages only have up to a few hundred exported or unexported names at a time. However, we have already encountered two collisions in the wild, which can be reproduced with one seed but not another: [...] PsaN.hQyW is a field, not a method [...] byte is not a type In both of those cases, we happened to run into a collision by chance. And that's not terribly unlikely to begin with; even with just 100 names, the probability of a collision was about 0.03%. It dramatically goes up if there are more names; with 500, we're already around 0.75%. It's clear that four base64 chars is not enough to properly avoid collisions in the vast majority of cases. But how many characters are enough? The target should be that, even with a very large package and lots of names, we should still practically never have a collision. I did some basic estimation with "lots of names" being ten thousand, with "practically never" being a one in a million chance. We need to go all the way up to eight characters to reach that probability. It's entirely possible that 7 or even 6 characters would be enough for most users. However, collisions result in confusing errors which are also hard to reproduce for us unless we can use exactly the same seed and source code for a build. So, play it safe, and use 8 characters. The constant now also has documentation explaining how we arrived at that figure.	3 years ago
Daniel Martí	840cf9b68d	make hashWith a bit smarter (#238 ) It used to return five-byte strings, and now it returns four bytes with nearly the same number of bits of entropy. It also avoids the exported vs unexported dance if the name isn't an identifier, which is now common with import paths. See the added docs for more details.	3 years ago
Daniel Martí	e64fccd367	better document and position the hash base64 encoding (#234 ) We now document why we use a custom base64 charset. The old "b64" name was also too generic, so it might have been misused for other purposes.	3 years ago
Daniel Martí	79c775e218	obfuscate unexported names like exported ones (#227 ) In `90fa325da7`, the obfuscation logic was changed to use hashes for exported names, but incremental names starting at just one letter for unexported names. Presumably, this was done for the sake of binary size. I argue that this is not a good idea for the default mode for a number of reasons: 1) It makes reversing of stack traces nearly impossible for unexported names, since replacing an obfuscated name "c" with "originalName" would trigger too many false positives by matching single characters. 2) Exported and unexported names aren't different. We need to know how names were obfuscated at a later time in both cases, thanks to use cases like -ldflags=-X. Using short names for one but not the other doesn't make a lot of sense, and makes the logic inconsistent. 3) Shaving off three bytes for unexported names doesn't seem like a huge deal for the default mode, when we already have -tiny to optimize for size. This saves us a bit of work, but most importantly, simplifies the obfuscation state as we no longer need to carry privateNameMap between the compile and link stages. name old time/op new time/op delta Build-8 153ms ± 2% 150ms ± 2% ~ (p=0.065 n=6+6) name old bin-B new bin-B delta Build-8 7.09M ± 0% 7.08M ± 0% -0.24% (p=0.002 n=6+6) name old sys-time/op new sys-time/op delta Build-8 296ms ± 5% 277ms ± 6% -6.50% (p=0.026 n=6+6) name old user-time/op new user-time/op delta Build-8 562ms ± 1% 558ms ± 3% ~ (p=0.329 n=5+6) Note that I do not oppose using short names for both exported and unexported names in the future for -tiny, since reversing of stack traces will by design not work there. The code can be resurrected from the git history if we want to improve -tiny that way in the future, as we'd need to store state in header files again. Another major cleanup we can do here is to no longer use the garbledImports map. From a look at obfuscateImports, we hash a package's import path with its action ID, much like exported names, so we can simply re-do that hashing for the linker's -X flag. garbledImports does have some logic to handle duplicate package names, but it's worth noting that should not affect package paths, as they are always unique. That area of code could probably do with some simplification in the future, too. While at it, make hashWith panic if either parameter is empty. obfuscateImports was hashing the main package path without a salt due to a bug, so we want to catch those in the future. Finally, make some tiny spacing and typo tweaks to the README.	3 years ago
Daniel Martí	249501b5e9	fix garbling names belonging to indirect imports (#203 ) main.go includes a lengthy comment that documents this edge case, why it happened, and how we are fixing it. To summarize, we should no longer error with a build error in those cases. Read the comment for details. A few other minor changes were done to allow writing this patch. First, the actionID and contentID funcs were renamed, since they started to collide with variable names. Second, the logging has been improved a bit, which allowed me to debug the issue. Third, the "cache" global shared by all garble sub-processes now includes the necessary parameters to run "go list -toolexec", including the path to garble and the build flags being used. Thanks to lu4p for writing a test case, which also applied gofmt to that testdata Go file. Fixes #180. Closes #181, since it includes its test case.	3 years ago
lu4p	cf290b8e6d	Share data between processes via a shared file. (#192 ) Previously garble heavily used env vars to share data between processes. This also makes it easy to share complex data between processes. The complexity of main.go is considerably reduced.	4 years ago
Daniel Martí	dfa622fe50	simplify globals, split hash.go (#191 ) The previous globals worked, but were unnecessarily complex. For example, we passed the fromPath variable around, but it's really a static global, since we only compile or link a single package in each Go process. Use such global variables instead of passing them around, which currently include the package's import path, its build ID, and its import config path. Also split all the hashing and build ID code into hash.go, since that's a relatively well contained 200 lines of code that doesn't need to make main.go any bigger. We also split the code to alter Go's own version to a separate function, so that it can be moved out of main.go as well.	4 years ago

38 Commits (master)