Terrible performance of Python dependency generator

Jacek Konieczny jajcus at jajcus.net
Sun Nov 22 21:39:58 CET 2015


Hi,

We will probably need to rebuild the python-* packages again and I
already hate that. A package such as python-django takes 45 minutes to
build and most of that time is spent in the auto-dependency generator.
That is insane! It should not take that long!

/usr/lib/rpm/pythoneggs.py is used to find the dependencies and it is
not that slow by itself… but it is called twice (Provides + Requires)
for each file in /usr/share/pythonX.Y. Big Python packages have lots
of files there, and most of them add no extra dependency
information.

That is strange, as the dependency helpers accept a list of file names
on their stdin… and RPM (in lib/rpmfc.c) always feeds them a single
file name. Why is that?
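
To make the difference concrete, here is a minimal sketch of the two
calling patterns. It is not the rpmfc code: run_per_file, run_batched
and ECHO_HELPER are hypothetical names, and the helper is a tiny
stand-in that only echoes what it reads, not pythoneggs.py itself.

```python
import subprocess
import sys

# Stand-in helper (an assumption for illustration): reads file names
# from stdin, one per line, and prints one "dependency" per name.
ECHO_HELPER = [
    sys.executable,
    "-c",
    "import sys\nfor line in sys.stdin:\n    print('dep:' + line.strip())",
]


def run_per_file(helper_cmd, filenames):
    """Spawn the helper once per file name (one process per file)."""
    outputs = []
    for name in filenames:
        result = subprocess.run(
            helper_cmd,
            input=name + "\n",
            capture_output=True,
            text=True,
            check=True,
        )
        outputs.extend(result.stdout.splitlines())
    return outputs


def run_batched(helper_cmd, filenames):
    """Spawn the helper once and feed it the whole list on stdin."""
    result = subprocess.run(
        helper_cmd,
        input="".join(name + "\n" for name in filenames),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()
```

Both functions return the same lines for this stand-in helper, but the
first one pays the interpreter start-up cost once per file. Note that
with a single batched call the output is no longer tied to a particular
input file, which may matter if the caller needs per-file results.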

I can even see a buffer for a file list in the code (iob_python in the
rpmfc_s struct), but it does not seem to be used.

I tried to invent some smart hack to limit the number of files
examined – usually checking a single *.py file and the
*.egg-info/PKG-INFO should be enough – but I was not able to inject
this into the weird rpmfc logic. I also do not quite understand what
that logic is supposed to do (what are those 'colors' and which files
should be python-colored).
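
The filtering idea, taken on its own, could look roughly like this.
It is only a sketch of the heuristic described above, assuming the
file list is available as plain paths; select_candidates is a
hypothetical name and nothing here reflects how rpmfc classifies
files.

```python
import posixpath


def select_candidates(paths):
    """Keep every *.egg-info/PKG-INFO plus the first *.py file seen."""
    selected = []
    have_py = False
    for path in paths:
        name = posixpath.basename(path)
        parent = posixpath.basename(posixpath.dirname(path))
        if name == "PKG-INFO" and parent.endswith(".egg-info"):
            # Egg metadata: this is where the dependency data lives.
            selected.append(path)
        elif name.endswith(".py") and not have_py:
            # A single module is kept as a representative Python file.
            selected.append(path)
            have_py = True
    return selected
```

For a package with thousands of modules this would reduce the list
passed to the helper to a handful of entries. Whether one *.py file is
always enough is an assumption that would need checking.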

Can this be fixed somehow? How did we end up with this?

Jacek

