Terrible performance of Python dependency generator

Jeffrey Johnson n3npq at me.com
Mon Nov 23 10:35:32 CET 2015


> On Nov 23, 2015, at 4:16 AM, Jacek Konieczny <jajcus at jajcus.net> wrote:
> 
> On 2015-11-22 22:03, Jeffrey Johnson wrote:
>> Dependencies are automatically generated only for executable files.
> 
> That is not true for Python dependencies and this would not work for
> Python dependencies.
> 

(aside)
“Only executable files” SHOULD be true for all automated dependencies imho,
as that is what rpm dependencies were originally designed for, to verify that
executables had all necessary prerequisites. YMMV, everyone’s does.

> There are two useful types of Python dependencies:
> 
> 1. python(abi) – this is extracted from .pyc or .pyo files. These are
> not the executable scripts, but non-executable library files in /usr/lib
> or /usr/share. Checking a single *.py[co] file would do for the whole
> package. On the other hand, this dependency is a bit redundant, because
> files for each python abi are going to a different directory and the
> directory dependency should be enough.
> 
> 2. pythonegg(*) – these are extracted from meta-data in *.egg-info
> directories. A package usually contains only one such directory.
> 
> Currently it works because all /usr/{lib*,share}/pythonX.Y/* files are
> passed to pythoneggs.py. Among these files there would be some *.pyc and
> some file from the egg-info directory, so all the important dependencies
> would be extracted.
> 
> Examining only the executables would return only the '/usr/bin/python',
> or even '/bin/sh' dependency.
> 
> I guess I will hack rpmfc.c to run Python helper only for a single
> py[co] file and a single file in every egg-info directory.
> 

Whatever works for you …
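
(editor's illustration)
The selection proposed in the quoted text (one py[co] file for the whole
package, plus one file per egg-info directory) could be sketched roughly as
below. This is not rpmfc.c code; the function name select_helper_inputs and
the rule of keeping the first match seen are assumptions made for
illustration only.

```python
def select_helper_inputs(paths):
    """Pick one *.py[co] file plus one file per *.egg-info directory.

    Hypothetical helper: keeps the first match it sees in each category.
    """
    selected = []
    seen_bytecode = False
    seen_egg_dirs = set()
    for path in paths:
        parts = path.split("/")
        # Index of the first *directory* component ending in .egg-info, if any.
        egg_index = next(
            (i for i, part in enumerate(parts[:-1]) if part.endswith(".egg-info")),
            None,
        )
        if egg_index is not None:
            egg_dir = "/".join(parts[: egg_index + 1])
            if egg_dir not in seen_egg_dirs:
                seen_egg_dirs.add(egg_dir)
                selected.append(path)
        elif path.endswith((".pyc", ".pyo")) and not seen_bytecode:
            seen_bytecode = True
            selected.append(path)
    return selected
```

With this, the Python helper would be started for a handful of files per
package rather than for every file under the pythonX.Y directories.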

>> So
>> using %files -f manifest, one can make a pass in %install to generate
>> the manifest, and doing both
>> 	1) add a %attr marker to set the execute bits
>> 	2) chmod -x on the file in %buildroot
>> 
>> and then generate dependencies manually (using a two pass build to
>> edit Requires: etc into the spec file).
> 
> Sounds like a very ugly hack.
> 

Yep.

> BTW we don't need a manifest to preserve proper file permissions as in
> PLD we _always_ provide permissions explicitly in %files. So we could
> just chmod -R a-x all the Python files. But that is not what file
> permissions are for!
> 

(aside)
There are other benefits to a manifest, particularly when filtering
large trees of files (which you surely have with drupal) to split
into sub packages. But you can package however you wish.

>> The better fix would be to use the embedded python interpreter to
>> avoid repeatedly invoking a shell that invokes python.
> 
> That wouldn't work much better than not repeating a stupid check for each file.
> 

It's not the check, but the overhead of invoking python for every file, that
you are seeing.
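
(editor's illustration)
The alternative to one interpreter start per file is a single process that
receives the whole file list. A minimal sketch follows, assuming the file
list arrives as one path per line and that the python(abi) version can be
read off the pythonX.Y directory name, as the quoted text suggests. The
function name and the exact output format are assumptions, not the behaviour
of the real helper scripts.

```python
import re

# Matches /usr/lib*/pythonX.Y/ and /usr/share/pythonX.Y/ prefixes.
_PY_DIR = re.compile(r"^/usr/(?:lib[^/]*|share)/python(\d+\.\d+)/")


def python_abi_requires(lines):
    """Read one path per line; return unique python(abi) requirements.

    Hypothetical helper: the whole list is handled in one process, so the
    interpreter start-up cost is paid once instead of once per file.
    """
    versions = set()
    for line in lines:
        match = _PY_DIR.match(line.strip())
        if match:
            versions.add(match.group(1))
    return ["python(abi) = %s" % version for version in sorted(versions)]
```

Any iterable of lines works as input, including sys.stdin, so a wrapper
script could feed it the complete file list in one call.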

>> But the fundamental problem is with user overridable external
>> helper scripts that conform to ancient expectations of the helper API
>> and still must classify files and generate cross referenced tag data
>> dynamically.
> 
> The 'ancient expectations of the helper API' actually made some sense in
> terms of performance (single process to handle a file list). Executing
> any external process for every file is plain stupid.
> 

Yes, the ancient API was dirt simple and was preserved. What becomes
problematic is that the metadata has changed so that the dependencies are
attached to each file in a package.

The original API was a single shell script … these days there are
too many types of dependencies to handle in one single shell script.

> And Python (and probably not only Python) dependencies are not per-file,
> but per python package. Linking dependency checks to specific files is
> quite artificial.
> 

We disagree here. There is functionality within rpm that disables dependencies
attached to a file when that file is excluded.

Of course you can put every file in its own package and choose not to
install that package to achieve the same effect.

But automatic dependencies are a file attribute carried in package metadata,
including pythonegg(…), not a package attribute imposed on the files within.
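
(editor's illustration)
The per-file model described above can be pictured as a mapping from each
file to the dependencies it contributes, with the effective requirements
being the union over the files that are actually included. This is only a
sketch of the idea, not rpm's internal data structure; the function name
package_requires and the example dependency strings are hypothetical.

```python
def package_requires(file_deps, excluded=()):
    """Union of the dependencies of every file that is not excluded.

    file_deps maps a file path to the dependencies attached to that file.
    """
    skipped = set(excluded)
    requires = set()
    for path, deps in file_deps.items():
        if path not in skipped:
            requires.update(deps)
    return sorted(requires)
```

Excluding a file removes whatever dependencies only that file carried,
which is the effect that is lost when dependencies are treated purely as a
package-level attribute.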

73 de Jeff
> Jacek
> _______________________________________________
> pld-devel-en mailing list
> pld-devel-en at lists.pld-linux.org
> http://lists.pld-linux.org/mailman/listinfo/pld-devel-en


