More

indygreg2 · 2024-12-04T15:29:57 1733326197

I could never justify the time investment to upstream a lot of my python-build-standalone work. I made some attempts. But it always felt like I was swimming against a heavy current and the talk to meaningful action ratio was too high. The payoff would be there. But it was the kind of work someone would have to pay me to do: not how I would choose to spend my free time on nights and weekends.

I’m optimistic the Astral folks will have better success than me and I support them in their efforts. They have viable, popular solutions in hand. Hopefully that helps convert others to their cause. “If you build it they will come.”

indygreg2 · on Aug 9, 2022

Some of Apple's code signing is open source (mostly in SecurityFramework). But not enough is open source to be able to build a modern `codesign`. The source you linked is ~10 years old and woefully out of date, for example!

I don't believe there are any Apple open source references for how notarization works (at least none before it was a public App Store Connect API).

There are even times when Apple's open source releases trail functionality they are shipping in macOS. For example, Apple recently added an alternative DER encoding of entitlements, which are expressed as a plist. I don't believe Apple ever published code for how the DER encoding works. Instead, we needed to use Apple's tooling as an oracle to incrementally derive the encoding.

indygreg2 · on Aug 9, 2022

Note that gon is a glorified front-end for executing processes like `codesign`, `altool`, and even `ditto` for zip file generation.

This Rust implementation, by contrast, has all the functionality implemented in pure Rust: there is no calling out to external processes for anything. You could drop the statically linked `rcodesign` executable into a Linux container with no other files and it would work.

That's not to discredit gon or its authors: it is a fantastic tool for streamlining common functionality. But the mechanism is completely different.

infotogivenm · on Aug 10, 2022

Never knew it wrapped shell commands - good to know!

indygreg2 · on Aug 8, 2022

XAR signing is effectively just an RFC 5652 CMS signature plus some minimal data structure manipulation. Code at https://github.com/indygreg/PyOxidizer/blob/faa7dfcea5d66bf5....

Mach-O and bundles, by contrast, require a myriad of additional data structures requiring thousands of lines of code to support. To my knowledge, nobody else has implemented signing of these far-more-complicated primitives. (Existing Mach-O signing solutions just do ad-hoc signing and/or don't handle Mach-O in the context of a bundle.)

ckatri · on Aug 9, 2022

> Existing Mach-O signing solutions just do ad-hoc signing and/or don't handle Mach-O in the context of a bundle.

I can assure you that saurik's ldid[0] does. Or the updated fork that I maintain at ProcursusTeam/ldid[1]. You can use -K to sign with a cert. You can find full documentation in the manpage[2].

[0] http://git.saurik.com/ldid.git

[1] https://github.com/ProcursusTeam/ldid

[2] https://man.cameronkatri.com/ldid/ldid.1

indygreg2 · on Aug 8, 2022

rcodesign can sign and notarize applications written in any language. From rcodesign's perspective, Rust is an implementation detail of the tool itself.

indygreg2 · on May 26, 2022

While this IFUNC feature does exist and it is useful, when I performed binary analysis on every package in Ubuntu in January, I found that only ~11 distinct packages have IFUNCs. It certainly looks like this ELF feature is not really used much [in open source] outside of GNU toolchain-level software!

https://gregoryszorc.com/blog/2022/01/09/bulk-analyze-linux-...

burntsushi · on May 27, 2022

Do note that IFUNC is a convenience. You don't actually need to use IFUNCs to write target specific code that dispatches dynamically at runtime.

For example, ripgrep's dependencies dispatch dynamically at runtime by querying CPUID, but nothing uses GCC's "IFUNC" thingy. So it's likely that much more software is utilizing target specific code than not. Still, it's probably less than one would like.

(I think this is less of a response to you and more of a response to this entire thread. It seems like some folks are conflating "IFUNC" with "all forms of dynamic dispatching based on CPUID.")

TkTech · on May 26, 2022

I've wanted to use them many times in the past, but the limited support on other compilers (looking at you MSVC) always made it a non-starter. If I have to support some other method of feature detection anyways, there's no point.

arthur2e5 · on May 27, 2022

The way ifunc (well, actually language-level FMV) works in GCC and clang is that the input source code, not the command-line switches, specifies what ISA extensions to build for on a candidate function. This naturally means that packagers and other vendors are not using it as much as you hope: they would need to have a separate patch to add these attributes for each architecture and that’s just not maintainable.

Even Intel’s Clear Linux does not bother to patch individual libraries. They just use the glibc library multi-versioning feature to load from different directories depending on the cpuid.

In my opinion, GCC and Clang could make the whole thing more ergonomic. Ideally you declare a function as "interesting for multi-versioning" in the source using an attribute, then in the command line define what -march's to actually clone for. Kinda like ICC’s /Qax. (On second thought preprocessor defs are sufficient, duh.)

jeffbee · on May 26, 2022

The use of ifunc in libc, openssl, and zlib covers just about everything people want from it. You want that optimized memcpy, you only need it in one place.

indygreg2 · on Feb 19, 2022

The index is a power user feature. Its forced presence in Git effectively constitutes a usability barrier for new users. After all, a VCS is effectively a glorified abstraction for "save a file." Any barrier imposed between changing a file and committing it can get in the way and confuse people. The Git index does this.

Furthermore, the index is effectively a pseudo commit without a commit message. Any workflow using the index can be implemented in terms of actual commits itself.

I think because Git doesn't have strong usability in general and especially around history rewriting, many Git users feel that the index or an index equivalent is somehow a required feature of a VCS because Git's shortcomings give that illusion. However, if you use a VCS with better history rewriting (such as Mercurial with evolve), you'll likely come around to my opinion that the index can be jettisoned without meaningful loss of functionality or productivity.

cryptonector · on Feb 19, 2022

I don't deny that the index is a power feature and that it's difficult to explain it to newbies.

Perhaps there's room for new UIs.

All I'm saying is I need this power. And it has to be easy to reach for it.

indygreg2 · on Jan 10, 2022

I once helped maintain a nearly full featured implementation of make in Python (pymake) that grew out of the Firefox project.

As much as I would like to support efforts like this, I feel like it is ultimately doomed to suffer from usability limitations because make does not have a static DAG. Rather, the DAG evolves dynamically as make evaluates targets. There are pesky problems like $(call) and $(eval) where you can dynamically inject make expressions into a partially evaluated DAG. And targets can generate files which are later loaded via include directives.

Dumping the make database in a machine readable format (for visualization or otherwise) would be incredibly valuable for debugging and could improve understanding. But since many make files have N>>1 "snapshots" of the internal DAG during their execution, there is a really thorny problem around when and which of these "snapshots" to use and how to stitch them together. Many make files aren't "static" enough to support exporting a single, complete, and valuable usable snapshot of their internal DAG.

If debugging is the target goal, I think a potentially better approach would be to define a function that takes an output filename and optional list of targets and writes out the point-in-time build graph when that function is called. This way makefile authors could insert probes in the make file (including at the bottom of a make file so the function is called on file load) and capture the state(s) of the build graph exactly when they need to. There could also be a specially named variable holding the names of targets that when pre- or post-evaluated would trigger the dumping of the database.

Best of luck to the person doing this work. Debugging make is notoriously hard and any progress in this area will be much welcomed by its many users.

loo · on Jan 10, 2022

This must be why after 45 years make still has no "list targets" option.

gjvc · on Jan 10, 2022

Does pymake handle colons in filenames any better / easier / etc than GNU make? (I have a task where I have a bunch of files to transform and they have HH:MM:DD timestamps embedded in the basename of the file)

indygreg2 · on April 8, 2021

I removed the background image and made the text blacker. Might take a force refresh to pick up the CSS file change.

Is it good enough or are further tweaks needed? If more, my web design skills are mediocre, so actionable feedback would be appreciated.

NavinF · on April 8, 2021

Looks perfect to me after ctrl+shift+r. I'm glad you removed that background image, it was rather pointless.

indygreg2 · on April 5, 2020

I converted PyOxidizer's configuration files from TOML to Starlark because I found it effectively impossible to express complex primitives in a static configuration file and the static nature was constraining end-user utility.

A common solution to this problem is to invent some kind of templating or pre-evaluation of your static config file. But I find these solutions quickly externalize a lot of complexity and are frustrating because it is often difficult to debug their evaluation.

At the point you want to do programming-like things in a config file, you might as well use a "real" programming language. Yes, it is complex in its own way. But if your target audience is programmers, I think it is an easy decision to justify.

I'm extremely happy with Starlark and PyOxidizer's configuration files are vastly more powerful than the TOML ones were.

https://pyoxidizer.readthedocs.io/en/stable/config.html

mixmastamyk · on April 5, 2020

> I found it effectively impossible to express complex primitives in a static configuration file

I don't quite understand this part. Any examples?

indygreg2 · on April 5, 2020

Compare https://pyoxidizer.readthedocs.io/en/v0.4.0/config.html#file... to https://pyoxidizer.readthedocs.io/en/stable/config.html#file....

In PyOxidizer's case, I wanted to create virtual pipelines of actions to perform. In TOML, we could create sections to express each stage in a logical pipeline. But if you wanted to share stages between pipelines, you were out of luck. With Starlark, you can define a stage as a function and have multiple functions reference it because is "just" calling a function.

I suppose I could have defined names for stages and made this work in TOML. So let's use a slightly more complicated example.

PyOxidizer config files need to allow filtering of resources. Essentially, call fn(x) to determine whether something is relevant. In the TOML world, we had to define explicit stages that applied filtering semantics: there were config primitives dedicated to applying filtering logic. PyOxidizer's config files had to expose primitives that could perform filtering logic desired by end-users. By contrast, Starlark exposes an iterable of objects and config files can examine attributes of each object and deploy their own logic for determining whether said object is relevant. This is far more powerful, as config files can define their own filtering rules in a programming language without being constrained by what the TOML-based config syntax supports.