Linux-libre architecture and how to modify it for other use cases?

Sun Oct 24 15:27:39 UTC 2021

Hello, Denis, apologies for the huge delay in getting back to you.

On Oct 13, 2021, "Denis 'GNUtoo' Carikli" <GNUtoo at cyberdimension.org> wrote:

> As far as I know there are only two smartphones (The Openmoko
> Freerunner and the Librem5) where the internal WiFi works without
> requiring distributions to ship nonfree firmwares (this is because the
> nonfree firmwares are integrated in a dedicated memory connected
> directly to the WiFi chip).

Ooh, I didn't know the Freerunner was like that.  I've always assumed it
wouldn't work, and left it at that.

> I also made a Makefile.deblob (that I attached) that downloads
> Linux

I can imagine why you'd want to have something like that, but I'm afraid
that's not welcome here, because of the pointer to non-Free Linux.

IMHO a script with that link wouldn't comply with the GNU FSDG.

> The part that I didn't manage to do yet and that looks complicated for
> me is to find a way for Replicant to reuse the deblobing part without
> reusing the firmware blocking part in a way that requires the least
> amount of maintenance.

It's not so easy, indeed.  Though commenting out some of the
reject_firmware and clean_blob entries in deblob-<kver>, and 'blobname'
commands in deblob-check would likely get you the sources you wish, the
blob requests and names would still be caught by the generic patterns
that help us find new blob requests, so you'd either have to comment
those out too, or add 'accept' commands to match and avoid flagging
them.

> The idea here is to enable people not familiar with linux-libre to very
> easily rebase the Replicant changes on top of more recent kernels
> versions, like we do with Linux, but while still keeping deblobed
> kernels.

That's a laudable goal, but Linux changes so much from one version to
another that there's no easy way to keep up without getting familiar
with the cleaning-up scripts :-(

> I guess the code is complex because the problem it tries to solve is
> far from trivial (automatically detecting nonfree firmwares looks really
> impressive to me) and that it needs regular expressions anyway.

*nod*

> What I didn't understand well is what deblob-check is supposed to do
> beside finding binary code.

deblob-check has multiple behaviors depending on the command line
options.

It scans source files, recognizing known acceptable patterns, flagging
suspicious or known-unacceptable ones.

It cleans up source files, keeping acceptable patterns and turning
suspicious and unacceptable ones into /*(DEBLOBBED)*/.
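To make the scan/clean distinction concrete, here is a minimal sketch in
Python.  It is not the deblob-check code: the function name, the way
patterns are passed in, and the sample patterns are all assumptions made
for illustration.  Matches of a suspicious pattern are replaced by the
marker unless an acceptable pattern covers the same text.

```python
import re

MARKER = "/*(DEBLOBBED)*/"

def clean(text, accept, suspicious):
    """Return (cleaned_text, flagged_matches).

    Hypothetical helper: 'accept' and 'suspicious' are lists of Python
    regexps, standing in for the real pattern tables.
    """
    # Spans of text covered by known-acceptable patterns.
    accepted = [m.span() for pat in accept for m in re.finditer(pat, text)]

    def covered(span):
        return any(a <= span[0] and span[1] <= b for a, b in accepted)

    flagged = []

    def replace(match):
        if covered(match.span()):
            return match.group(0)      # acceptable: keep as is
        flagged.append(match.group(0)) # scanning mode would report this
        return MARKER                  # cleaning mode replaces it

    combined = "|".join("(?:%s)" % p for p in suspicious)
    return re.sub(combined, replace, text), flagged


source = ('request_firmware(&fw, "vendor/blob.bin", dev);\n'
          'request_firmware(&fw, "free/image.bin", dev);\n')
cleaned, flagged = clean(source,
                         accept=[r'"free/[^"]*"'],
                         suspicious=[r'"[^"]*\.bin"'])
print(cleaned)
print(flagged)
```

Scanning corresponds to looking at the flagged list only; cleaning
corresponds to writing back the rewritten text.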

It is also usable to scan entire tarballs for remaining problems.  We
even have patterns to flag some unexpected transformations to help us
catch new blob names or firmware requests in files that are cleaned up,
and that would otherwise become syntax errors, such as firmware-loading
calls mangled into r/*(DEBLOBBED)*/e(...) or common blob extensions
mangled into "firmware/*(DEBLOBBED)*/ (we clean up extensions followed
by quotes when they're not covered by an 'accept' pattern).
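As an illustration of flagging such mangled results, the sketch below
looks for a marker glued directly to an identifier character, as in the
two examples above.  The regexp and the function name are assumptions,
not the patterns deblob-check actually uses.

```python
import re

MARKER = re.escape("/*(DEBLOBBED)*/")

# A marker immediately preceded or followed by a word character suggests
# that only part of a token was replaced.
MANGLED = re.compile(r"\w%s|%s\w" % (MARKER, MARKER))

def find_mangled(text):
    """Return (line_number, line) pairs that contain a glued marker."""
    return [(n, line)
            for n, line in enumerate(text.splitlines(), 1)
            if MANGLED.search(line)]


sample = ('r/*(DEBLOBBED)*/e(&fw, name, dev);\n'
          'name = /*(DEBLOBBED)*/;\n'
          'name = "firmware/*(DEBLOBBED)*/;\n')
for number, line in find_mangled(sample):
    print(number, line)
```

The second sample line is a clean replacement of a whole token, so it is
not reported.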

> Is it supposed to actually patch the code as well? 

Yeah, deblob-<kver> runs it in clean_blob, so that all the knowledge
about acceptable and suspicious patterns is put to use to clean up just
what we want.

> Does it handle the various backends (like sed, python, perl, awk) in a
> generic way? 

It does, they're supposed to behave in equivalent ways, but there are
some gotchas.  The patterns have to be written in specific ways that
enable them to be mechanically edited to suit the selected backend, for
one.  The order of patterns is not relevant when using automata-based
regexp engines (sed and awk), but it does matter when using python and
perl.
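One likely reason for the ordering difference, sketched in Python:
backtracking engines take the first alternative that matches, whereas
POSIX-style engines prefer the longest match regardless of order.  The
helper names and patterns below are made up for the demonstration.

```python
import re

def first_alternative_match(patterns, text):
    """What a backtracking engine (python, perl) does with a|b|c."""
    combined = "|".join("(?:%s)" % p for p in patterns)
    m = re.match(combined, text)
    return m.group(0) if m else None

def longest_match(patterns, text):
    """Emulate leftmost-longest matching at the start of 'text'."""
    matches = [m.group(0)
               for m in (re.match(p, text) for p in patterns) if m]
    return max(matches, key=len) if matches else None


text = "firmware.bin"
print(first_alternative_match(["firm", "firmware"], text))
print(first_alternative_match(["firmware", "firm"], text))
print(longest_match(["firm", "firmware"], text))
print(longest_match(["firmware", "firm"], text))
```

With the ordered alternation the result depends on which pattern comes
first; with the longest-match emulation it does not.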

> Are the regex generic? Or are some primitive reimplemented in the
> various backends?

We use a slightly constrained form of sed regexps (sed was originally
the only backend), which enables us to turn them into regexps for the
other backends.  There's code to catch some common mistakes early:

  # Check that all regular expressions match our requirements.
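To give an idea of what turning a sed regexp into one for another
backend can involve, here is a rough sketch that converts GNU sed basic
regexp syntax into Python syntax by flipping which of ( ) { } | + ? are
escaped.  It is an assumption about the general technique, not the
transformation deblob-check performs, and it ignores bracket expressions
and sed-only escapes such as \< and \>.

```python
import re

# Special when backslash-escaped in GNU sed basic regexps, but special
# when NOT escaped in Python.
FLIPPED = set("(){}|+?")

def sed_bre_to_python(pattern):
    """Hypothetical converter; covers only the escapes listed above."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            # \( in sed becomes ( in Python; other escapes are kept.
            out.append(nxt if nxt in FLIPPED else ch + nxt)
            i += 2
        elif ch in FLIPPED:
            out.append("\\" + ch)  # literal ( in sed becomes \( in Python
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


converted = sed_bre_to_python(r'request_firmware(\(a\|b\))')
print(converted)
print(bool(re.fullmatch(converted, "request_firmware(b)")))
```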

There's also an option to compile each pattern separately, when using
python: set DEBLOB_CHECK_PYTHON_REGEX=debug in the environment, and it
will report any (transformed) regexp that fails to compile in python.
This helps me locate mismatched groupings after adding a few dozen
patterns for a new release.
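The idea of compiling each pattern separately can be sketched as
follows.  This is only an illustration of the approach; the function
name is hypothetical and the output format of the real debug mode is not
shown here.

```python
import re

def find_bad_patterns(patterns):
    """Return (index, pattern, error message) for each pattern that
    fails to compile on its own."""
    bad = []
    for index, pattern in enumerate(patterns):
        try:
            re.compile(pattern)
        except re.error as exc:
            bad.append((index, pattern, str(exc)))
    return bad


patterns = [r"(a|b)", r"(a|b", r"a)"]
for index, pattern, message in find_bad_patterns(patterns):
    print(index, repr(pattern), message)
```

Compiling one big alternation would only say that something is wrong
somewhere; compiling one pattern at a time points at the offender.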

> And how does deblob-5.10 and deblob-check are supposed to interact
> together?

If deblob-check is present, deblob-5.10 will call it to clean up just
the specific not-known-good bits.  There are a few sed-based custom
cleaning-up commands, for changes that would require patterns,
substitutions or context-aware matching that can't be expressed in
deblob-check, but the bulk of the changes are made by deblob-check.

If deblob-check is not present, files that it would clean up are
deleted, and Kconfig files are edited so that the corresponding
configurations are marked as depending on NONFREE, so that they can't be
enabled.
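The Kconfig part of that fallback could look roughly like the sketch
below.  The function name is hypothetical, and placing the new line
right after the 'config' line is an assumption; the actual script does
this with its own shell and sed code.

```python
import re

def mark_nonfree(kconfig_text, symbol):
    """Add 'depends on NONFREE' right after the entry for 'symbol'."""
    header = re.compile(r"(menu)?config %s\s*" % re.escape(symbol))
    out = []
    for line in kconfig_text.splitlines(keepends=True):
        out.append(line)
        if header.fullmatch(line):
            if not line.endswith("\n"):
                out.append("\n")
            out.append("\tdepends on NONFREE\n")
    return "".join(out)


kconfig = ('config FOO\n'
           '\ttristate "Foo"\n'
           '\n'
           'config FOOBAR\n'
           '\tbool "Bar"\n')
print(mark_nonfree(kconfig, "FOO"))
```

Since nothing defines NONFREE, an entry that depends on it cannot be
enabled.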

> Also each run of the scripts takes a very long time, so doing complete
> tests would take quite some time. 

*nod*.  I've considered means to run it in parallel, but at this point
the script-based cleaning up technology is regarded as legacy,
maintenance mode; focus for future development is on the
history-rewriting subproject.  It will still rely on deblob-check to
catch and flag suspicious bits, but the actual cleaning up is going to
be mostly manual, done hopefully once at the point of introduction of
the code requiring cleaning up.

> Are there faster setups?

No, the cleaning up cycles are indeed long.  That's one of the
motivations to switch to the history-rewriting project, which will only
involve checking and merging new patches for each release.

> Could some of the faster setups suit my needs? 

I suppose if you were to make the changes you wish as a patch on top of
some GNU Linux-libre release, you might be able to cherry-pick them onto
other releases.  Whether or not that would work better for you, I can't
tell, but it would be sort of future-proof, in that it would work even
with future releases for which cleaning-up scripts will no longer be
available.

> Or would it be possible for me to somehow parallelize the deblob
> scripts without too much efforts?

It is likely possible to rewrite the deblob script into a Makefile or
somesuch, changing the editing primitives (shell functions) so as to
output a Makefile that groups all changes to each file into:

file: [deblob-check] [deblob-<kver>.make]
	@<edit1>
	@<edit2>

but I envision various pitfalls with this arrangement.  I mean, it can
work for one-shot runs, but you'd have to restore the original tree
before running it again, and touching files could leave them unchanged.
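A sketch of the generating side, in Python rather than shell for
brevity: given a list of (file, edit command) pairs, group them by file
and emit one rule per file.  Everything here is hypothetical, including
the function name, the edit commands and the prerequisite names, and it
does nothing about the pitfalls mentioned above.

```python
def emit_makefile(edits, prerequisites=("deblob-check", "deblob.make")):
    """Group (path, command) pairs by path; return Makefile text."""
    grouped = {}
    for path, command in edits:
        grouped.setdefault(path, []).append(command)

    lines = ["all: " + " ".join(grouped), ""]
    for path, commands in grouped.items():
        lines.append("%s: %s" % (path, " ".join(prerequisites)))
        for command in commands:
            # Recipe lines start with a tab; '$' must be doubled for make.
            lines.append("\t@" + command.replace("$", "$$"))
        lines.append("")
    return "\n".join(lines)


edits = [("drivers/a.c", "clean-a"),
         ("drivers/b.c", "clean-b"),
         ("drivers/a.c", "clean-a2")]
print(emit_makefile(edits))
```

With one rule per file, make -j could then run the edits for different
files in parallel, while edits to the same file stay in order.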

I hope this helps.  I had planned to add more detailed thoughts on the
general structure of deblob-check, but then you'd have to wait
longer, and those bits probably wouldn't help you much anyway.

-- 
Alexandre Oliva, happy hacker                https://FSFLA.org/blogs/lxo/
   Free Software Activist                       GNU Toolchain Engineer
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about <https://stallmansupport.org>