TEST build OK: openjdk8.spec - openjdk 8 and 9 hangs on glibc 2.34 in vserver guest

Jan Palus atler at pld-linux.org
Wed Nov 24 21:30:03 CET 2021


On 24.11.2021 21:04, Jan Palus wrote:
> > More fun with that.
> > 
> > working system, glibc 2.34, on any kernel
> > 
> > mkdir /test/
> > rsync -avPH / /test/ --exclude /test/ --exclude /proc --exclude /sys
> > mount /proc /test/proc -o bind
> > chroot /test/; java -version - hangs at some retry
> > 
> > mkdir -p /test/sys/devices/system/cpu
> > chroot /test/; java -version - no hangs
> > (so proc is mounted but /sys is not; just dir exists)
> 
> Great finding! Indeed the following is a quick reproducer for any java8
> version -- openjdk8 302/312, icedtea8 and even Oracle JDK8 202:
> 
> systemd-run --wait -t -p InaccessiblePaths=/sys /usr/lib64/jvm/openjdk8/bin/java -version
> 
> So far couldn't reproduce with openjdk11 and FWIW it does not reproduce
> on aarch64.

If I had to guess... The code that opens /sys/devices/system/cpu lives
in glibc:

https://sourceware.org/git?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/getsysstats.c;h=1391e360b8f8e86ce39a56d4bf1117f37d07eed9;hb=ae37d06c7d127817ba43850f0f898b793d42aea7#l95
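
For anyone who does not want to click through, here is a rough sketch of
the shape of that lookup as I read it. This is not the glibc source: the
names is_cpu_entry and count_cpu_dirs are made up for illustration, and
the sketch only looks at entry names, not at entry types.

```c
#include <dirent.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns 1 if name looks like "cpu<N>"
   (for example "cpu0"), 0 otherwise ("cpu", "cpufreq", ...). */
int is_cpu_entry(const char *name)
{
    if (strncmp(name, "cpu", 3) != 0)
        return 0;

    char *endp;
    unsigned long nr = strtoul(name + 3, &endp, 10);

    /* Require at least one digit and nothing after the number. */
    return nr != ULONG_MAX && endp != name + 3 && *endp == '\0';
}

/* Hypothetical helper: counts "cpu<N>" entries under dir_path.
   Returns 1 when the directory cannot be opened, and the number of
   matching entries (possibly 0) when it can. */
int count_cpu_dirs(const char *dir_path)
{
    DIR *dir = opendir(dir_path);
    if (dir == NULL)
        return 1;               /* missing or inaccessible directory */

    int count = 0;
    struct dirent *d;
    while ((d = readdir(dir)) != NULL)
        if (is_cpu_entry(d->d_name))
            ++count;

    closedir(dir);
    return count;               /* an empty directory yields 0 */
}
```

Calling count_cpu_dirs("/sys/devices/system/cpu") inside the chroot
should show the same split: 1 without the directory, 0 with an empty one.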

Notice that the result differs between the case where the directory does
not exist (1) and the case where it exists but is empty (0). Now, looking
at the java8 code, there's a juicy comment right next to where it gets the
number of CPUs:

    // Most versions of linux have a bug where the number of processors are
    // determined by looking at the /proc file system.  In a chroot environment,
    // the system call returns 1.  This causes the VM to act as if it is
    // a single processor and elide locking (see is_MP() call).
    static bool unsafe_chroot_detected = false;
    static const char *unstable_chroot_error = "/proc file system not found.\n"
                         "Java may be unstable running multithreaded in a chroot "
                         "environment on Linux when /proc filesystem is not mounted.";
    
    void os::Linux::initialize_system_info() {
      set_processor_count(sysconf(_SC_NPROCESSORS_CONF));

But... the code in glibc does not look at /proc... at least not in 2.34.

I suppose this is the commit to blame for the regression:

https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=f13fb81ad3159543741e9132685335002a6d5df2

glibc 2.33 used to look at /proc too, but it no longer does in 2.34.

Will check how openjdk11 handles that, but if any of this is true I'd say the
easiest fix would be to always lock and not optimize based on cpu count.
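
To make the guess concrete, here is a small sketch of the decision I have
in mind. It assumes the VM's single-processor check boils down to "the
reported count is exactly 1", which I have not verified. The function
names are hypothetical and none of this is HotSpot code.

```c
#include <stdbool.h>
#include <unistd.h>

/* What the C library reports as the number of configured CPUs. */
long reported_cpu_count(void)
{
    return sysconf(_SC_NPROCESSORS_CONF);
}

/* Assumed check: anything other than exactly 1 counts as
   multiprocessor, so a bogus 0 would still be treated as MP. */
bool looks_multiprocessor(long reported_cpus)
{
    return reported_cpus != 1;
}

/* Proposed policy: with always_lock set, the reported CPU count no
   longer decides whether locking is elided. */
bool should_lock(long reported_cpus, bool always_lock)
{
    return always_lock || looks_multiprocessor(reported_cpus);
}
```

Under that assumption a reported count of 1 (no /sys at all) disables
locking, a count of 0 (empty directory) keeps it, and always_lock makes
the count irrelevant.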

