SOURCES: kernel-desktop-suspend2.patch (NEW) - all-in-one suspend2...

sparky sparky at pld-linux.org
Mon May 1 17:20:54 CEST 2006


Author: sparky                       Date: Mon May  1 15:20:54 2006 GMT
Module: SOURCES                       Tag: HEAD
---- Log message:
- all-in-one suspend2 patch, with tweaks for applying on preemptrt-patched
  kernel; won't apply without preempt-rt

---- Files affected:
SOURCES:
   kernel-desktop-suspend2.patch (NONE -> 1.1)  (NEW)

---- Diffs:

================================================================
Index: SOURCES/kernel-desktop-suspend2.patch
diff -u /dev/null SOURCES/kernel-desktop-suspend2.patch:1.1
--- /dev/null	Mon May  1 17:20:54 2006
+++ SOURCES/kernel-desktop-suspend2.patch	Mon May  1 17:20:49 2006
@@ -0,0 +1,16284 @@
+diff -Nur linux-2.6.16.11/Documentation/kernel-parameters.txt linux-2.6.16.11.suspend2/Documentation/kernel-parameters.txt
+--- linux-2.6.16.11/Documentation/kernel-parameters.txt	2006-05-01 00:10:59.000000000 +0000
++++ linux-2.6.16.11.suspend2/Documentation/kernel-parameters.txt	2006-05-01 00:12:25.000000000 +0000
+@@ -72,6 +72,7 @@
+ 	SERIAL	Serial support is enabled.
+ 	SMP	The kernel is an SMP kernel.
+ 	SPARC	Sparc architecture is enabled.
++	SUSPEND2 Suspend2 is enabled.
+ 	SWSUSP	Software suspend is enabled.
+ 	TS	Appropriate touchscreen support is enabled.
+ 	USB	USB support is enabled.
+@@ -1049,6 +1050,8 @@
+ 	noresume	[SWSUSP] Disables resume and restores original swap
+ 			space.
+ 
++	noresume2	[SUSPEND2] Disables resuming and restores original swap signature.
++ 
+ 	no-scroll	[VGA] Disables scrollback.
+ 			This is required for the Braillex ib80-piezo Braille
+ 			reader made by F.H. Papenmeier (Germany).
+@@ -1319,6 +1322,11 @@
+ 	resume=		[SWSUSP]
+ 			Specify the partition device for software suspend
+ 
++ 	resume2=	[SUSPEND2] Specify the storage device for Suspend2.
++			Format: <writer>:<writer-parameters>.
++			See Documentation/power/suspend2.txt for details of the
++			formats	for available image writers.
++
+ 	rhash_entries=	[KNL,NET]
+ 			Set number of hash buckets for route cache
+ 
+diff -Nur linux-2.6.16.11/Documentation/power/internals.txt linux-2.6.16.11.suspend2/Documentation/power/internals.txt
+--- linux-2.6.16.11/Documentation/power/internals.txt	1970-01-01 00:00:00.000000000 +0000
++++ linux-2.6.16.11.suspend2/Documentation/power/internals.txt	2006-05-01 00:12:25.000000000 +0000
+@@ -0,0 +1,362 @@
++		Software Suspend 2.2 Internal Documentation.
++				Version 1
++
++1.  Introduction.
++
++    Software Suspend 2.2 is an addition to the Linux Kernel, designed to
++    allow the user to quickly shut down and quickly boot a computer, without
++    needing to close documents or programs. It is equivalent to the
++    hibernate facility in some laptops. This implementation, however,
++    requires no special BIOS or hardware support.
++
++    The code in these files is based upon the original implementation
++    prepared by Gabor Kuti and additional work by Pavel Machek and a
++    host of others. This code has been substantially reworked by Nigel
++    Cunningham, again with the help and testing of many others, not the
++    least of whom is Michael Frank. At its heart, however, the operation is
++    essentially the same as Gabor's version.
++
++2.  Overview of operation.
++
++    The basic sequence of operations is as follows:
++
++	a. Quiesce all other activity.
++	b. Ensure enough memory and storage space are available, and attempt
++	   to free memory/storage if necessary.
++	c. Allocate the required memory and storage space.
++	d. Write the image.
++	e. Power down.
++
++    There are a number of complicating factors which mean that things are
++    not as simple as the above would imply, however...
++
++    o The activity of each process must be stopped at a point where it will
++    not be holding locks necessary for saving the image, or unexpectedly
++    restart operations due to something like a timeout and thereby make
++    our image inconsistent.
++
++    o It is desirable that we sync outstanding I/O to disk before calculating
++    image statistics. This reduces corruption if one should suspend but
++    then not resume, and also makes later parts of the operation safer (see
++    below).
++
++    o We need to get as close as we can to an atomic copy of the data.
++    Inconsistencies in the image will result in inconsistent memory contents at
++    resume time, and thus in instability of the system and/or file system
++    corruption. This would appear to imply a maximum image size of one half of
++    the amount of RAM, but we have a solution... (again, below).
++
++    o In 2.6, we choose to play nicely with the other suspend-to-disk
++    implementations.
++
++3.  Detailed description of internals.
++
++    a. Quiescing activity.
++
++    Safely quiescing the system is achieved using two methods.
++
++    First, we note that the vast majority of processes don't need to run during
++    suspend. They can be 'frozen'. We therefore implement a refrigerator
++    routine, which processes enter and in which they remain until the cycle is
++    complete. Processes enter the refrigerator via try_to_freeze() invocations
++    at appropriate places.  A process cannot be frozen in any old place. It
++    must not be holding locks that will be needed for writing the image or
++    freezing other processes. For this reason, userspace processes generally
++    enter the refrigerator via the signal handling code, and kernel threads at
++    the place in their event loops where they drop locks and yield to other
++    processes or sleep.
++
++    The second part of our method for quiescing the system involves freezing
++    the filesystems. We use the standard freeze_bdev and thaw_bdev functions to
++    ensure that all of the user's data is synced to disk before we begin to
++    write the image.
++
++    Quiescing the system works most quickly and reliably when we add one more
++    element to the algorithm: separating the freezing of userspace processes
++    from the freezing of kernel space processes, and doing the filesystem freeze
++    in between. The filesystem freeze needs to be done while kernel threads such
++    as kjournald can still run. At the same time, though, everything will be less
++    racy and run more quickly if we stop userspace submitting more I/O work
++    while we're trying to quiesce.
++
++    Quiescing the system is therefore done in three steps:
++	- Freeze userspace
++	- Freeze filesystems
++	- Freeze kernel threads
++
++    If we need to free memory, we thaw kernel threads and filesystems, but not
++    userspace. We can then free caches without worrying about deadlocks due to
++    swap files being on frozen filesystems or such like.
++
++    b. Ensure enough memory & storage are available.
++
++    We have a number of constraints to meet to be able to successfully suspend
++    and resume.
++
++    First, the image will be written in two parts, described below. One of these
++    parts needs to have an atomic copy made, which of course implies a maximum
++    size of one half of the amount of system memory. The other part ('pageset2')
++    is not atomically copied, and can therefore be as large or small as desired.
++
++    Second, we have constraints on the amount of storage available. In these
++    calculations, we may also consider any compression that will be done. The
++    cryptoapi module allows the user to configure an expected compression ratio.
++   
++    Third, the user can specify an arbitrary limit on the image size, in
++    megabytes. This limit is treated as a soft limit, so that we don't fail the
++    attempt to suspend if we cannot meet this constraint.
++
++    c. Allocate the required memory and storage space.
++
++    Having done the initial freeze, we determine whether the above constraints
++    are met, and seek to allocate the metadata for the image. If the constraints
++    are not met, or we fail to allocate the required space for the metadata, we
++    seek to free the amount of memory that we calculate is needed and try again.
++    We allow up to four iterations of this loop before aborting the cycle. If we
++    do fail, it should only be because of a bug in Suspend's calculations.
++    
++    These steps are merged together in the prepare_image function, found in
++    prepare_image.c. The functions are merged because of the cyclical nature
++    of the problem of calculating how much memory and storage is needed. Since
++    the data structures containing the information about the image must
++    themselves take memory and use storage, the amount of memory and storage
++    required changes as we prepare the image. Since the changes are not large,
++    only one or two iterations will be required to achieve a solution.
++
++    d. Write the image.
++
++    We previously mentioned the need to create an atomic copy of the data, and
++    the half-of-memory limitation that is implied in this. This limitation is
++    circumvented by dividing the memory to be saved into two parts, called
++    pagesets.
++
++    Pageset2 contains the page cache - the pages on the active and inactive
++    lists. These pages are saved first and reloaded last. While saving these
++    pages, the swapwriter module carefully ensures that the work of writing
++    the pages doesn't make the image inconsistent. Pages added to the LRU
++    lists are immediately shot down, and careful accounting for available
++    memory aids debugging. No atomic copy of these pages needs to be made.
++
++    Writing the image requires memory, of course, and at this point we have
++    also not yet suspended the drivers. To avoid the possibility of remaining
++    activity corrupting the image, we allocate a special memory pool. Calls
++    to __alloc_pages and __free_pages_ok are then diverted to use our memory
++    pool. Pages in the memory pool are saved as part of pageset1 regardless of
++    whether or not they are used.
++
++    Once pageset2 has been saved, we suspend the drivers and save the CPU
++    context before making an atomic copy of pageset1, resuming the drivers
++    and saving the atomic copy. After saving the two pagesets, we just need to
++    save our metadata before powering down.
++
++    Having saved pageset2 pages, we can safely overwrite their contents with
++    the atomic copy of pageset1. This is how we manage to overcome the half of
++    memory limitation. Pageset2 is normally far larger than pageset1, and
++    pageset1 is normally much smaller than half of the memory, with the result
++    that pageset2 pages can be safely overwritten with the atomic copy of
++    pageset1. This is where we need to be careful about syncing, however.
++    Pageset2 will probably contain filesystem meta data. If this is overwritten
++    with pageset1 and then a sync occurs, the filesystem will be corrupted -
++    at least until resume time and another sync of the restored data. Since
++    there is a possibility that the user might not resume or (may it never be!)
++    that suspend might oops, we do our utmost to avoid syncing filesystems after
++    copying pageset1.
++
++    e. Power down.
++
++    Powering down uses standard kernel routines. Prior to this, however, we
++    suspend drivers again, ensuring that write caches are flushed.
++
++4.  The method of writing the image.
++
++    Suspend2 contains an internal API which is designed to simplify the
++    implementation of new methods of transforming the image to be written and
++    writing the image itself. In early versions of Suspend2, compression support
++    was inlined in the image writing code, and the data structures and code for
++    managing swap were intertwined with the rest of the code. A number of people
++    had expressed interest in implementing image encryption, and alternative
++    methods of storing the image. This internal API makes that possible by
++    implementing 'modules'.
++
++    A module is a single file which encapsulates the functionality needed
++    to transform a pageset of data (encryption or compression, for example),
++    or to write the pageset to a device. The former type of module is called
++    a 'page-transformer', the latter a 'writer'.
++
++    Modules are linked together in pipeline fashion. There may be zero or more
++    page transformers in a pipeline, and there is always exactly one writer.
++    The pipeline follows this pattern:
++
++		---------------------------------
++		|          Suspend2 Core        |
++		---------------------------------
++				|
++				|
++		---------------------------------
++		|	Page transformer 1	|
++		---------------------------------
++				|
++				|
++		---------------------------------
++		|	Page transformer 2	|
++		---------------------------------
++				|
++				|
++		---------------------------------
++		|            Writer		|
++		---------------------------------
++
++    During the writing of an image, the core code feeds pages one at a time
++    to the first module. This module performs whatever transformations it
++    implements on the incoming data, completely consuming the incoming data and
++    feeding output in a similar manner to the next module. A module may buffer
++    its output.
++
++    During reading, the pipeline works in the reverse direction. The core code
++    calls the first module with the address of a buffer which should be filled.
++    (Note that the buffer size is always PAGE_SIZE at this time). This module
++    will in turn request data from the next module and so on down until the
++    writer is made to read from the stored image.
++
++    Part of the definition of the structure of a module thus looks like this:
++
++        int (*rw_init) (int rw, int stream_number);
++        int (*rw_cleanup) (int rw);
++        int (*write_chunk) (struct page *buffer_page);
++        int (*read_chunk) (struct page *buffer_page, int sync);
++
++    It should be noted that the _cleanup routine may be called before the
++    full stream of data has been read or written. While writing the image,
++    the user may (depending upon settings) choose to abort suspending, and
++    if we are in the midst of writing the last portion of the image, a portion
++    of the second pageset may be reread.
++
++    In addition to the above routines for writing the data, all modules have a
++    number of other routines:
++
++    TYPE indicates whether the module is a page transformer or a writer.
++    #define TRANSFORMER_MODULE 1
++    #define WRITER_MODULE 2
++
++    NAME is the name of the module, used in generic messages.
++
++    MODULE_LIST is used to link the module into the list of all modules.
++
++    MEMORY_NEEDED returns the number of pages of memory required by the module
++    to do its work.
++
++    STORAGE_NEEDED returns the number of pages in the suspend header required
++    to store the module's configuration data.
++
++    PRINT_DEBUG_INFO fills a buffer with information to be displayed about the
++    operation or settings of the module.
++
++    SAVE_CONFIG_INFO returns a buffer of PAGE_SIZE or smaller (the size is the
++    return code), containing the module's configuration info. This information
++    will be written in the image header and restored at resume time. Since this
++    buffer is allocated after the atomic copy of the kernel is made, you don't
++    need to worry about the buffer being freed.
++
++    LOAD_CONFIG_INFO gives the module a pointer to the configuration info
++    which was saved during suspending. Once again, the module doesn't need to
++    worry about freeing the buffer. The kernel will be overwritten with the
++    original kernel, so no memory leak will occur.
++
++    OPS contains the operations specific to transformers and writers. These are
++    described below.
++
++    The complete definition of struct suspend_module_ops is:
++
++	struct suspend_module_ops {
++	        /* Functions common to all modules */
++	        int type;
++	        char *name;
++	        struct module *module;
++	        int disabled;
++	        struct list_head module_list;
++
++	        /* List of filters or writers */
++	        struct list_head list, type_list;
++
++	        /*
++	         * Requirements for memory and storage in
++	         * the image header.
++	         */
++	        unsigned long (*memory_needed) (void);
++	        unsigned long (*storage_needed) (void);
++
++	        /*
++	         * Debug info
++	         */
++	        int (*print_debug_info) (char *buffer, int size);
++	        int (*save_config_info) (char *buffer);
++	        void (*load_config_info) (char *buffer, int len);
++
++	        /*
++	         * Initialise & cleanup - general routines called
++	         * at the start and end of a cycle.
++	         */
++	        int (*initialise) (int starting_cycle);
++	        void (*cleanup) (int finishing_cycle);
++
++	        /*
++	         * Calls for allocating storage (writers only).
++	         *
++	         * Header space is allocated separately. Note that allocation
++	         * of space for the header might result in allocated space
++	         * being stolen from the main pool if there is no unallocated
++	         * space. We have to be able to allocate enough space for
++	         * the header. We can eat memory to ensure there is enough
++	         * for the main pool.
++	         */
++
++	        int (*storage_available) (void);
++	        int (*allocate_header_space) (int space_requested);
++	        int (*allocate_storage) (int space_requested);
++	        int (*storage_allocated) (void);
++	        int (*release_storage) (void);
++
++	        /*
++	         * Routines used in image I/O.
++	         */
++	        int (*rw_init) (int rw, int stream_number);
++	        int (*rw_cleanup) (int rw);
++	        int (*write_chunk) (struct page *buffer_page);
++	        int (*read_chunk) (struct page *buffer_page, int sync);
++
++	        /* Reset module if image exists but reading aborted */
++	        void (*noresume_reset) (void);
++
++	        /* Read and write the metadata */
++	        int (*write_header_init) (void);
++	        int (*write_header_cleanup) (void);
++
++	        int (*read_header_init) (void);
++	        int (*read_header_cleanup) (void);
++
++	        int (*rw_header_chunk) (int rw, char *buffer_start, int buffer_size);
++
++	        /* Attempt to parse an image location */
++	        int (*parse_sig_location) (char *buffer, int only_writer);
++
++	        /* Determine whether image exists that we can restore */
++	        int (*image_exists) (void);
++
++	        /* Mark the image as having tried to resume */
++	        void (*mark_resume_attempted) (void);
++
++	        /* Destroy image if one exists */
++	        int (*invalidate_image) (void);
++	};
++
++
++	Expected compression returns the expected ratio between the amount of
++	data sent to this module and the amount of data it passes to the next
++	module. The value is used by the core code to calculate the amount of
++	space required to write the image. If the ratio is not achieved, the
++	writer will complain when it runs out of space with data still to
++	write, and the core code will abort the suspend.
++
++	transformer_list links together page transformers, in the order in
++	which they register, which is in turn determined by order in the
++	Makefile.
+diff -Nur linux-2.6.16.11/Documentation/power/suspend2.txt linux-2.6.16.11.suspend2/Documentation/power/suspend2.txt
+--- linux-2.6.16.11/Documentation/power/suspend2.txt	1970-01-01 00:00:00.000000000 +0000
++++ linux-2.6.16.11.suspend2/Documentation/power/suspend2.txt	2006-05-01 00:12:25.000000000 +0000
+@@ -0,0 +1,663 @@
++	--- Suspend2, version 2.2 ---
++
++1.  What is it?
++2.  Why would you want it?
++3.  What do you need to use it?
++4.  Why not just use the version already in the kernel?
++5.  How do you use it?
++6.  What do all those entries in /proc/suspend2 do?
++7.  How do you get support?
++8.  I think I've found a bug. What should I do?
++9.  When will XXX be supported?
++10. How does it work?
++11. Who wrote Suspend2?
++
++1. What is it?
++
++   Imagine you're sitting at your computer, working away. For some reason, you
++   need to turn off your computer for a while - perhaps it's time to go home
++   for the day. When you come back to your computer next, you're going to want
++   to carry on where you left off. Now imagine that you could push a button and
++   have your computer store the contents of its memory to disk and power down.
++   Then, when you next start up your computer, it loads that image back into
++   memory and you can carry on from where you were, just as if you'd never
++   turned the computer off. Far less time to start up, no reopening
++   applications and finding what directory you put that file in yesterday.
++   That's what Suspend2 does.
++
++2. Why would you want it?
++
++   Why wouldn't you want it?
++   
++   Being able to save the state of your system and quickly restore it improves
++   your productivity - you get a useful system in far less time than through
++   the normal boot process.
++   
++3. What do you need to use it?
++
++   a. Kernel Support.
++
++   i) The Suspend2 patch.
++   
++   Suspend2 is part of the Linux Kernel. This version is not part of Linus's
++   2.6 tree at the moment, so you will need to download the kernel source and
++   apply the latest patch. Having done that, enable the appropriate options in
++   make [menu|x]config (under General Setup), compile and install your kernel.
++   Suspend2 works with SMP, Highmem, preemption, x86-32, PPC and x86_64.
++
++   Suspend2 patches are available from http://suspend2.net.
++
++   ii) Compression and encryption support.
++
++   Compression and encryption support are implemented via the
++   cryptoapi. You will therefore want to select any Cryptoapi transforms that
++   you want to use on your image from the Cryptoapi menu while configuring
++   your kernel.
++
++   You can also tell Suspend to write its image to an encrypted and/or
++   compressed filesystem/swap partition. In that case, you don't need to do
++   anything special for Suspend2 when it comes to kernel configuration.
++
++   iii) Configuring other options.
++
++   While you're configuring your kernel, try to configure as much as possible
++   to build as modules. We recommend this because there are a number of drivers
++   that are still in the process of implementing proper power management
++   support. In those cases, the best way to work around their current lack is
++   to build them as modules and remove the modules while suspending. You might
++   also bug the driver authors to get their support up to speed, or even help!
++
++   b. Storage.
++
++   i) Swap.
++
++   Suspend2 can store the suspend image in your swap partition, a swap file or
++   a combination thereof. Whichever combination you choose, you will probably
++   want to create enough swap space to store the largest image you could have,
++   plus the space you'd normally use for swap. A good rule of thumb would be
++   to calculate the amount of swap you'd want without using Suspend2, and then
++   add the amount of memory you have. This swapspace can be arranged in any way
++   you'd like. It can be in one partition or file, or spread over a number. The
++   only requirement is that they be active when you start a suspend cycle.
++   
++   There is one exception to this requirement. Suspend2 has the ability to turn
++   on one swap file or partition at the start of suspending and turn it back off
++   at the end. If you want to ensure you have enough memory to store an image
++   when your memory is fully used, you might want to make one swap partition or
++   file for 'normal' use, and another for Suspend2 to activate & deactivate
++   automatically. (Further details below).
++
++   ii) Normal files.
++
++   Suspend2 includes a 'filewriter'. The filewriter can store
++   your image in a simple file. Since Linux has the idea of everything being
++   a file, this is more powerful than it initially sounds. If, for example,
++   you were to set up a network block device file, you could suspend to a
++   network server. This has been tested and works to a point, but nbd itself
++   isn't stateless enough for our purposes.
++
++   Take extra care when setting up the filewriter. If you just type commands
++   without thinking and then try to suspend, you could cause irreversible
++   corruption on your filesystems! Make sure you have backups. Also, because
++   the filewriter is comparatively new, it's not as well tested as the
++   swapwriter. Be aware that there may be bugs that could cause damage to your
++   data even if you are careful! You have been warned!
++
++   Most people will only want to suspend to a local file. To achieve that, do
++   something along the lines of:
++
++   echo "Suspend2" > /suspend-file
++   dd if=/dev/zero bs=1M count=512 >> /suspend-file
++
++   This will create a 512MB file called /suspend-file. To get Suspend2 to use
++   it:
++
++   echo /suspend-file > /proc/suspend2/filewriter_target
++
++   Then
++
++   cat /proc/suspend2/resume2
++
++   Put the results of this into your bootloader's configuration (see also step
++   c, below):
++
++   ---EXAMPLE-ONLY-DON'T-COPY-AND-PASTE---
++   # cat /proc/suspend2/resume2
++   file:/dev/hda2:0x1e001
++   
++   In this example, we would edit the append= line of our lilo.conf|menu.lst
++   so that it included:
++
++   resume2=file:/dev/hda2:0x1e001
++   ---EXAMPLE-ONLY-DON'T-COPY-AND-PASTE---
++ 
++   For those who are thinking 'Could I make the file sparse?', the answer is
++   'No!'. At the moment, there is no way for Suspend2 to fill in the holes in
++   a sparse file while suspending. In the longer term (post merge!), I'd like
++   to change things so that the file could be dynamically resized as needed.
++   Right now, however, that's not possible.
++
++   c. Bootloader configuration.
++   
++   Using Suspend2 also requires that you add an extra parameter to 
++   your lilo.conf or equivalent. Here's an example for a swap partition:
++
++   append="resume2=swap:/dev/hda1"
++
++   This would tell Suspend2 that /dev/hda1 is a swap partition you 
++   have. Suspend2 will use the swap signature of this partition as a
++   pointer to your data when you suspend. This means that (in this example)
++   /dev/hda1 doesn't need to be _the_ swap partition where all of your data
++   is actually stored. It just needs to be a swap partition that has a
++   valid signature.
++
++   You don't need to have a swap partition for this purpose. Suspend2
++   can also use a swap file, but usage is a little more complex. Having made
++   your swap file, turn it on and do 
++
++   cat /proc/suspend2/headerlocations
++
++   (this assumes you've already compiled your kernel with Suspend2
++   support and booted it). The results of the cat command will tell you
++   what you need to put in lilo.conf:
++
++   For swap partitions like /dev/hda1, simply use resume2=/dev/hda1.
++   For swapfile `swapfile`, use resume2=swap:/dev/hda2:0x242d@4096.
++
++   If the swapfile changes for any reason (it is moved to a different
++   location, it is deleted and recreated, or the filesystem is
++   defragmented) then you will have to check
++   /proc/suspend2/headerlocations for a new resume_block value.
++
++   Once you've compiled and installed the kernel, adjusted your lilo.conf
++   and rerun lilo, you should only need to reboot for the most basic part
++   of Suspend2 to be ready.
++
++   If you only compile in the swapwriter, or only compile in the filewriter,
++   you don't need to add the "swap:" part of the resume2= parameters above.
++   resume2=/dev/hda2:0x242d@4096 will work just as well.
++
++   d. The hibernate script.
++
++   Since the driver model in 2.6 kernels is still being developed, you may need
++   to do more, however. Users of Suspend2 usually start the process via a script
++   which prepares for the suspend, tells the kernel to do its stuff and then
++   restores things afterwards. This script might involve:
++
++   - Switching to a text console and back if X doesn't like the video card
++     status on resume.
++   - Un/reloading PCMCIA support since it doesn't play well with suspend.
<<Diff was trimmed, longer than 597 lines>>
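
Note (not part of the patch): section 4 of Documentation/power/internals.txt
in the diff above describes a pipeline in which the core feeds pages one at a
time to the first module, each page transformer consumes its input and feeds
its output to the next module, and exactly one writer sits at the end. The
write path of that idea might be sketched in userspace C roughly as follows.
Every name here (sketch_module, sketch_xor_write, SKETCH_PAGE_SIZE and so on)
is hypothetical and chosen for illustration; the XOR step merely stands in for
compression or encryption, and an in-memory array stands in for the storage
device. The real struct suspend_module_ops has many more members and
different signatures.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096	/* stand-in for the kernel's PAGE_SIZE */
#define SKETCH_MAX_PAGES 8	/* capacity of the pretend storage device */

/* Hypothetical, simplified module: only the write path is modelled. */
struct sketch_module {
	const char *name;
	struct sketch_module *next;	/* next stage; NULL for the writer */
	int (*write_chunk)(struct sketch_module *self,
			   const unsigned char *page);
};

/* In-memory stand-in for the device the writer stores pages on. */
static unsigned char sketch_storage[SKETCH_MAX_PAGES][SKETCH_PAGE_SIZE];
static size_t sketch_pages_stored;

/* Page transformer: XOR stands in for compression or encryption. */
static int sketch_xor_write(struct sketch_module *self,
			    const unsigned char *page)
{
	unsigned char out[SKETCH_PAGE_SIZE];
	size_t i;

	assert(self->next != NULL);	/* a transformer is never last */
	for (i = 0; i < SKETCH_PAGE_SIZE; i++)
		out[i] = page[i] ^ 0x5a;
	/* The input is fully consumed; feed the result to the next module. */
	return self->next->write_chunk(self->next, out);
}

/* Writer: the final stage, which stores the page it is given. */
static int sketch_writer_write(struct sketch_module *self,
			       const unsigned char *page)
{
	(void)self;
	if (sketch_pages_stored >= SKETCH_MAX_PAGES)
		return -1;	/* out of space: the caller would abort */
	memcpy(sketch_storage[sketch_pages_stored++], page, SKETCH_PAGE_SIZE);
	return 0;
}

/* Pipeline: core -> xor transformer -> writer. */
static struct sketch_module sketch_writer = {
	"writer", NULL, sketch_writer_write
};
static struct sketch_module sketch_xor = {
	"xor", &sketch_writer, sketch_xor_write
};

/* Core: hand one page to the first module in the pipeline. */
static int sketch_core_write_page(const unsigned char *page)
{
	return sketch_xor.write_chunk(&sketch_xor, page);
}
```

Calling sketch_core_write_page() with a buffer of SKETCH_PAGE_SIZE bytes
pushes that page through the transformer and into sketch_storage; a non-zero
return corresponds to the case where the writer runs out of space and the
core would abort the suspend.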

