Git migration: subdirs under packages/

Paweł Sikora pluto at agmk.net
Mon Jun 25 20:49:22 CEST 2012


On Monday 25 of June 2012 19:43:42 Paweł Sikora wrote:
> On Monday 25 of June 2012 18:52:32 Jan Rękorajski wrote:
> > Small suggestion - add the ability for slug.py to work with packages
> > in one-level hash directories based on the first letter
> > of the package name, like this:
> > 
> > packages/[0-9A-Za-z]/<package dir>
> > 
> > Rationale: having all packages checked out is RPITA, entering/listing
> > the packages/ directory is painfully slow - much slower than with CVS.
> > A one-level hash will greatly speed things up.
> 
> this is a bad workaround. the core problem is in glibc's readdir()
> which calls the getdents syscall multiple times with a small 32k buffer.
> e.g., for rpm/packages, `ls -1` produces:
> 
> (...)
> getdents(3, /* 913 entries */, 32768)   = 32760
> getdents(3, /* 911 entries */, 32768)   = 32744
> getdents(3, /* 914 entries */, 32768)   = 32736
> getdents(3, /* 906 entries */, 32768)   = 32760
> getdents(3, /* 919 entries */, 32768)   = 32752
> getdents(3, /* 919 entries */, 32768)   = 32768
> getdents(3, /* 917 entries */, 32768)   = 32744
> getdents(3, /* 919 entries */, 32768)   = 32744
> getdents(3, /* 917 entries */, 32768)   = 32744
> getdents(3, /* 907 entries */, 32768)   = 32728
> getdents(3, /* 915 entries */, 32768)   = 32736
> getdents(3, /* 918 entries */, 32768)   = 32752
> getdents(3, /* 918 entries */, 32768)   = 32744
> getdents(3, /* 921 entries */, 32768)   = 32752
> getdents(3, /* 907 entries */, 32768)   = 32752
> getdents(3, /* 465 entries */, 32768)   = 16784
> getdents(3, /* 0 entries */, 32768)     = 0
> (...)

...and the major performance issue is the `mc` listing algorithm for a custom
view with the 'size' column: it ends up calling lstat() on every entry (~15k times).
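
a rough way to see the difference between a names-only listing and a listing
that needs a size for each entry is the sketch below. it is only an
illustration, not how `ls` or `mc` are implemented: the function names, the
N_ENTRIES constant and the temporary directory are made up for the example,
and on a warm cache the gap will look smaller than on a freshly mounted
checkout.

```python
import os
import tempfile
import time

# Hypothetical entry count, chosen to match the ~15k entries mentioned above.
N_ENTRIES = 15000


def list_names_only(path):
    """Enumerate entries without stat-ing them (roughly what `ls -1` needs)."""
    with os.scandir(path) as it:
        return [entry.name for entry in it]


def list_with_sizes(path):
    """Enumerate entries and lstat() each one, as a 'size' column requires."""
    sizes = {}
    for name in os.listdir(path):
        sizes[name] = os.lstat(os.path.join(path, name)).st_size
    return sizes


def timed(func, path):
    """Return (number of entries, elapsed seconds) for one listing call."""
    start = time.perf_counter()
    result = func(path)
    return len(result), time.perf_counter() - start


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Build a throwaway directory with many subdirectories.
        for i in range(N_ENTRIES):
            os.mkdir(os.path.join(tmp, "pkg%05d" % i))
        for func in (list_names_only, list_with_sizes):
            count, seconds = timed(func, tmp)
            print("%-16s %6d entries  %.4f s" % (func.__name__, count, seconds))
```

the first function costs a handful of getdents calls, the second adds one
lstat() per entry on top of that.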


