*.py packaging, again

Jeff Johnson n3npq at mac.com
Thu Jul 14 14:36:37 CEST 2011

On Jul 14, 2011, at 8:04 AM, Tomasz Pala wrote:

> On Thu, Jul 14, 2011 at 07:21:35 -0400, Jeff Johnson wrote:
>>>> compiles stuff? If so, can we then launch a suid helper that injects
>>>> the newly created files into the rpm package that contained the
>>>> original .py?
>>> OMG...
>>> Please don't mess with the rpm database at nondeterministic times (like
>>> some runtime) and do NOT use SUID.
>>> This is an excellent example of what I've called 'overcomplicated'.
>> And your comments (which I agree with) are negatively phrased.
>> State what you want to see positively.
> 1. I don't know python internals, but if these cache files are not
> strictly machine-dependent (i.e. they don't differ on machines having
> the same arch) they should be shipped in compiled form (otherwise we
> might as well end up with %_make; %configure; ... in %build of every
> package) - this is not Gentoo.

Historically, the *.py[co] files have been packaged in *.rpm as if the
content is both per-arch and per-python-version. I don't know what
the implementation reality actually is, or whether the choices are unnecessarily
restrictive and overly paranoid "future proofing" or …

.. but if the *.py[co] are portable (whatever that means), then
the portability should be reflected in the design of the packaging.
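
A minimal sketch of how one might check the per-python-version half of
that question, assuming CPython 3.4 or later. The helper name
pyc_written_by_this_interpreter is hypothetical. As far as I know, the
first four bytes of a *.pyc are a magic number tied to the interpreter's
bytecode format, and the Python tutorial describes compiled modules as
platform-independent, but neither claim comes from this thread.

```python
import importlib.util
import sys


def pyc_written_by_this_interpreter(pyc_path):
    """Return True if the .pyc header carries this interpreter's magic number.

    Hypothetical helper: it compares only the first four bytes and does
    not validate the rest of the file.
    """
    with open(pyc_path, "rb") as f:
        return f.read(4) == importlib.util.MAGIC_NUMBER


if __name__ == "__main__":
    # The cache tag names the implementation and version; it is what
    # ends up in the file names under __pycache__.
    print(sys.implementation.cache_tag)
    print(importlib.util.cache_from_source("example.py"))
```

If the magic number is the only thing that varies, the packaging could
key the compiled files on python version alone rather than on arch.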

There are aspects of eggs, where eggs can load other eggs dynamically,
that are very different from other static content delivered in *.rpm,
and the dynamic side effects are purely on the install system and
are not easily captured on the build system. E.g. what other eggs
might be installed as a side effect likely has dynamic state associated
that cannot be reproduced within a build for all usage cases.

> 2. if anything wants to alter the rpm database, it must be done during the
> transaction or shortly after it (i.e. within the installation process); it's
> not acceptable to make changes there during regular system use (on an
> unprivileged account) - after all it's an MD5 repo and might be
> hard-locked by admin.

Well that isn't necessarily true or even needed (depending on what
is attempted).

Headers can be digitally signed, and mostly do not contain any critically
important information that can't be found in the original *.rpm file.

There are a few (mostly timestamps) pieces of information that can
be assigned only during a "transaction", but that does NOT mean
"within installation process" necessarily.

It would mean that any attempt to register a "package" in an rpmdb must
	1) carry digests or attempt signing
	2) faithfully include a "transaction id" and an "installation time"
just like an RPM installation does.
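
The two requirements above can be sketched as a plain record. This is
only an illustration: the function names and the field names
(install_tid, install_time, files) are hypothetical and are not rpm's
header format, and SHA-256 is an assumed digest choice.

```python
import hashlib
import time


def file_digest(path):
    """Hex SHA-256 of a file's contents, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def build_registration(name, paths, tid=None):
    """Build a hypothetical registration record for a set of files.

    If no transaction id is supplied, the current time is used for it,
    so both timestamps are assigned when the record is built.
    """
    now = int(time.time())
    return {
        "name": name,
        "install_tid": now if tid is None else tid,
        "install_time": now,
        "files": {p: file_digest(p) for p in sorted(paths)},
    }
```

Nothing here touches an rpmdb; the record is only what a registration
request would have to carry.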

And yes -- if you choose to treat an rpmdb with concepts like
"locked" and "privileged" -- then other installers will be
constrained in what is permitted.

Both "locked" and "privileged" can be avoided if an rpm tool
which is already "privileged" pulls information rather than having
the python JIT'er push directly into an rpmdb.

This introduces a TOCTTOU window between when an egg is installed (and
might queue a request for rpmdb registration) and when the registration
is actually handled. There are several approaches to avoiding TOCTTOU
queueing latency if you think a bit.
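
A rough sketch of the pull model, assuming a spool directory that the
unprivileged side can write to and a privileged tool that drains it.
The function names, the ".req" suffix and the JSON encoding are all
hypothetical choices made for this example.

```python
import json
import os
import tempfile


def queue_request(spool_dir, request):
    """Unprivileged side: drop a request into the spool.

    The file is written under a temporary name and then renamed, so the
    puller never sees a half-written request.
    """
    fd, tmp = tempfile.mkstemp(dir=spool_dir, prefix=".tmp-")
    with os.fdopen(fd, "w") as f:
        json.dump(request, f)
    final = os.path.join(
        spool_dir, os.path.basename(tmp)[len(".tmp-"):] + ".req")
    os.replace(tmp, final)
    return final


def pull_requests(spool_dir):
    """Privileged side: yield each queued request and remove its file.

    Requests are visited in file name order, which is not the order in
    which they were queued.
    """
    for name in sorted(os.listdir(spool_dir)):
        if not name.endswith(".req"):
            continue
        path = os.path.join(spool_dir, name)
        with open(path) as f:
            request = json.load(f)
        os.unlink(path)
        yield request
```

The time a request sits in the spool is the window in question.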

> 3. install -> compile -> save to rpmdb is pointless - the saved MD5 might
> already be compromised by some malicious software and thus undetectable
> even via a verification run from an outside source (i.e. rescuecd).
> Checksums must be calculated in a clean environment and be comparable
> between different systems (so we're back to point 1).

And the TOCTTOU window can be eliminated by
	1) including the digests in the queued request
	2) signing the queued request
and having RPM undertake verification before servicing the rpmdb
registration request.
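
The two steps above, plus the verification, might look like the sketch
below. It is an assumption-laden illustration: HMAC with a shared key
stands in for a real public-key signature, and the envelope layout and
function names are hypothetical, not anything rpm provides.

```python
import hashlib
import hmac
import json


def _read_bytes(path):
    with open(path, "rb") as f:
        return f.read()


def sign_request(request, key):
    """Serialize the request (digests included) and attach a MAC."""
    payload = json.dumps(request, sort_keys=True)
    sig = hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()
    return {"payload": payload, "sig": sig}


def verify_request(envelope, key, read_file=_read_bytes):
    """Check the MAC, then re-digest every file named in the request.

    Returns True only if the request is untampered and each file still
    matches the digest recorded when the request was queued.
    """
    payload = envelope["payload"]
    expected = hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, envelope["sig"]):
        return False
    request = json.loads(payload)
    for path, digest in request["files"].items():
        if hashlib.sha256(read_file(path)).hexdigest() != digest:
            return False
    return True
```

A file changed between queueing and servicing then fails the digest
comparison instead of being registered silently.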

Other issues like "clean" and "safe" can only be determined by
examining explicit threat models or auditing an actual implementation.

> Otherwise we might just put some
> rm -f /usr/share/python/**/__pycache__/* in cron or in %post of python
> pkgs to ensure that no malicious files or leftovers will harm our system.
> If the files are to be regenerated by a regular user, why not remove them
> once a week to save space?

That's one approach to caching:
	Don't do it! Disk space is expensive!

>> Or prepare yourself for
>> the SUID audit that will inevitably be attempted.
> SUID is broken by design and obsoleted (and disabled entirely on some
> machines). Are we talking about CAP_DAC_OVERRIDE?

CAP_DAC_OVERRIDE isn't _THAT_ much better than SUID imho. But
you can advocate whatever security scheme you wish: CAP_DAC_OVERRIDE
is certainly a finer-grained privilege than SUID is.

73 de Jeff

More information about the pld-devel-en mailing list