SOURCES (LINUX_2_6): linux-2.6-improve-text-size.patch (NEW) - sma...
pluto
pluto at pld-linux.org
Wed Dec 28 22:00:24 CET 2005
Author: pluto Date: Wed Dec 28 21:00:24 2005 GMT
Module: SOURCES Tag: LINUX_2_6
---- Log message:
- smaller/faster code :)
---- Files affected:
SOURCES:
linux-2.6-improve-text-size.patch (NONE -> 1.1.2.1) (NEW)
---- Diffs:
================================================================
Index: SOURCES/linux-2.6-improve-text-size.patch
diff -u /dev/null SOURCES/linux-2.6-improve-text-size.patch:1.1.2.1
--- /dev/null Wed Dec 28 22:00:24 2005
+++ SOURCES/linux-2.6-improve-text-size.patch Wed Dec 28 22:00:19 2005
@@ -0,0 +1,81 @@
+this patchset (for the 2.6.16 tree) consists of two patches:
+
+ gcc-no-forced-inlining.patch
+ gcc-unit-at-a-time.patch
+
+the purpose of these patches is to reduce the kernel's .text size, in
+particular if CONFIG_CC_OPTIMIZE_FOR_SIZE is specified. The effect of
+the patches on x86 is:
+
+    text    data     bss     dec     hex filename
+ 3286166  869852  387260 4543278  45532e vmlinux-orig
+ 3194123  955168  387260 4536551  4538e7 vmlinux-inline
+ 3119495  884960  387748 4392203  43050b vmlinux-inline+units
+  437271   77646   32192  547109   85925 vmlinux-tiny-orig
+  452694   77646   32192  562532   89564 vmlinux-tiny-inline
+  431891   77422   32128  541441   84301 vmlinux-tiny-inline+units
+
+i.e. a 5.3% .text reduction (!) with a larger .config, and a 1.2% .text
+reduction with a smaller .config.
+
+i've also done test-builds with CC_OPTIMIZE_FOR_SIZE disabled:
+
+    text    data     bss     dec     hex filename
+ 4080998  870384  387260 5338642  517612 vmlinux-speed-orig
+ 4084421  872024  387260 5343705  5189d9 vmlinux-speed-inline
+ 4010957  834048  387748 5232753  4fd871 vmlinux-speed-inline+units
+
+so the more flexible inlining did not result in many changes [which is
+good, we want gcc to inline those in the optimized-for-speed case], but
+unit-at-a-time optimization resulted in smaller code - very likely
+meaning speed advantages as well.
+
+unit-at-a-time still increases the kernel stack footprint somewhat (by
+about 5% in the CC_OPTIMIZE_FOR_SIZE case), but not by the insane degree
+gcc3 used to, which prompted the original -fno-unit-at-a-time addition.
+
+so i think the combination of the two patches is a win both for small
+and for large systems. In fact the 5.3% .text reduction for embedded
+kernels is very significant.
+
+ arch/i386/Makefile | 6 +++---
+ include/linux/compiler-gcc4.h | 9 +++++----
+ 2 files changed, 8 insertions(+), 7 deletions(-)
+
+--- a/include/linux/compiler-gcc4.h
++++ b/include/linux/compiler-gcc4.h
+@@ -3,14 +3,15 @@
+ /* These definitions are for GCC v4.x. */
+ #include <linux/compiler-gcc.h>
+
+-#define inline inline __attribute__((always_inline))
+-#define __inline__ __inline__ __attribute__((always_inline))
+-#define __inline __inline __attribute__((always_inline))
++#define inline inline
++#define __inline__ __inline__
++#define __inline __inline
+ #define __deprecated __attribute__((deprecated))
+ #define __attribute_used__ __attribute__((__used__))
+ #define __attribute_pure__ __attribute__((pure))
+ #define __attribute_const__ __attribute__((__const__))
+-#define noinline __attribute__((noinline))
++#define noinline __attribute__((noinline))
++#define __always_inline inline __attribute__((always_inline))
+ #define __must_check __attribute__((warn_unused_result))
+ #define __compiler_offsetof(a,b) __builtin_offsetof(a,b)
+
+--- a/arch/i386/Makefile
++++ b/arch/i386/Makefile
+@@ -42,9 +42,9 @@ include $(srctree)/arch/i386/Makefile.cp
+ GCC_VERSION := $(call cc-version)
+ cflags-$(CONFIG_REGPARM) += $(shell if [ $(GCC_VERSION) -ge 0300 ] ; then echo "-mregparm=3"; fi ;)
+
+-# Disable unit-at-a-time mode, it makes gcc use a lot more stack
+-# due to the lack of sharing of stacklots.
+-CFLAGS += $(call cc-option,-fno-unit-at-a-time)
++# Disable unit-at-a-time mode on pre-gcc-4.0 compilers, it makes gcc use
++# a lot more stack due to the lack of sharing of stacklots:
++CFLAGS += $(shell if [ $(GCC_VERSION) -lt 0400 ] ; then echo $(call cc-option,-fno-unit-at-a-time); fi ;)
+
+ CFLAGS += $(cflags-y)
+
================================================================
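Note on the Makefile hunk: the version gate can be tried outside of make as a
plain shell function. This is only a minimal sketch, not part of the patch. The
function name unit_at_a_time_flag is hypothetical, and the version argument is
assumed to be the four-digit form (0304 for gcc 3.4, 0400 for gcc 4.0) that the
hunk compares against 0400. Inside $(shell ...) only what the command writes to
stdout ends up in CFLAGS, which is why the sketch prints the flag with echo.

```shell
#!/bin/sh
# Hypothetical helper (not from the patch): print the extra compiler flag
# for a given GCC version in four-digit form, e.g. 0304 or 0400.
unit_at_a_time_flag() {
    gcc_version="$1"
    # [ ... -lt ... ] compares decimal integers, so the leading zero is harmless.
    if [ "$gcc_version" -lt 0400 ]; then
        # Pre-4.0 compilers: turn unit-at-a-time mode off.
        echo "-fno-unit-at-a-time"
    fi
    # gcc 4.0 and later: print nothing, so no flag is added.
}

unit_at_a_time_flag 0304   # gcc 3.4: prints the flag
unit_at_a_time_flag 0400   # gcc 4.0: prints nothing
```

In the real Makefile the flag is additionally wrapped in cc-option, so it is
only emitted when the compiler accepts it; the sketch leaves that probe out.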