SOURCES: kernel-desktop-suspend2.patch (NEW) - all-in-one suspend2...
sparky
sparky at pld-linux.org
Mon May 1 17:20:54 CEST 2006
Author: sparky Date: Mon May 1 15:20:54 2006 GMT
Module: SOURCES Tag: HEAD
---- Log message:
- all-in-one suspend2 patch, with tweaks for applying on preemptrt-patched
kernel; won't apply without preempt-rt
---- Files affected:
SOURCES:
kernel-desktop-suspend2.patch (NONE -> 1.1) (NEW)
---- Diffs:
================================================================
Index: SOURCES/kernel-desktop-suspend2.patch
diff -u /dev/null SOURCES/kernel-desktop-suspend2.patch:1.1
--- /dev/null Mon May 1 17:20:54 2006
+++ SOURCES/kernel-desktop-suspend2.patch Mon May 1 17:20:49 2006
@@ -0,0 +1,16284 @@
+diff -Nur linux-2.6.16.11/Documentation/kernel-parameters.txt linux-2.6.16.11.suspend2/Documentation/kernel-parameters.txt
+--- linux-2.6.16.11/Documentation/kernel-parameters.txt 2006-05-01 00:10:59.000000000 +0000
++++ linux-2.6.16.11.suspend2/Documentation/kernel-parameters.txt 2006-05-01 00:12:25.000000000 +0000
+@@ -72,6 +72,7 @@
+ SERIAL Serial support is enabled.
+ SMP The kernel is an SMP kernel.
+ SPARC Sparc architecture is enabled.
++ SUSPEND2 Suspend2 is enabled.
+ SWSUSP Software suspend is enabled.
+ TS Appropriate touchscreen support is enabled.
+ USB USB support is enabled.
+@@ -1049,6 +1050,8 @@
+ noresume [SWSUSP] Disables resume and restores original swap
+ space.
+
++ noresume2 [SUSPEND2] Disables resuming and restores original swap signature.
++
+ no-scroll [VGA] Disables scrollback.
+ This is required for the Braillex ib80-piezo Braille
+ reader made by F.H. Papenmeier (Germany).
+@@ -1319,6 +1322,11 @@
+ resume= [SWSUSP]
+ Specify the partition device for software suspend
+
++ resume2= [SUSPEND2] Specify the storage device for Suspend2.
++ Format: <writer>:<writer-parameters>.
++ See Documentation/power/suspend2.txt for details of the
++ formats for available image writers.
++
+ rhash_entries= [KNL,NET]
+ Set number of hash buckets for route cache
+
+diff -Nur linux-2.6.16.11/Documentation/power/internals.txt linux-2.6.16.11.suspend2/Documentation/power/internals.txt
+--- linux-2.6.16.11/Documentation/power/internals.txt 1970-01-01 00:00:00.000000000 +0000
++++ linux-2.6.16.11.suspend2/Documentation/power/internals.txt 2006-05-01 00:12:25.000000000 +0000
+@@ -0,0 +1,362 @@
++ Software Suspend 2.2 Internal Documentation.
++ Version 1
++
++1. Introduction.
++
++ Software Suspend 2.2 is an addition to the Linux Kernel, designed to
++ allow the user to quickly shut down and quickly boot a computer, without
++ needing to close documents or programs. It is equivalent to the
++ hibernate facility in some laptops. This implementation, however,
++ requires no special BIOS or hardware support.
++
++ The code in these files is based upon the original implementation
++ prepared by Gabor Kuti and additional work by Pavel Machek and a
++ host of others. This code has been substantially reworked by Nigel
++ Cunningham, again with the help and testing of many others, not the
++ least of whom is Michael Frank. At its heart, however, the operation is
++ essentially the same as Gabor's version.
++
++2. Overview of operation.
++
++ The basic sequence of operations is as follows:
++
++ a. Quiesce all other activity.
++ b. Ensure enough memory and storage space are available, and attempt
++ to free memory/storage if necessary.
++ c. Allocate the required memory and storage space.
++ d. Write the image.
++ e. Power down.
++
++ There are a number of complicating factors which mean that things are
++ not as simple as the above would imply, however...
++
++ o The activity of each process must be stopped at a point where it will
++ not be holding locks necessary for saving the image, or unexpectedly
++ restart operations due to something like a timeout and thereby make
++ our image inconsistent.
++
++ o It is desirable that we sync outstanding I/O to disk before calculating
++ image statistics. This reduces corruption if one should suspend but
++ then not resume, and also makes later parts of the operation safer (see
++ below).
++
++ o We need to get as close as we can to an atomic copy of the data.
++ Inconsistencies in the image will result in inconsistent memory contents at
++ resume time, and thus in instability of the system and/or file system
++ corruption. This would appear to imply a maximum image size of one half of
++ the amount of RAM, but we have a solution... (again, below).
++
++ o In 2.6, we choose to play nicely with the other suspend-to-disk
++ implementations.
++
++3. Detailed description of internals.
++
++ a. Quiescing activity.
++
++ Safely quiescing the system is achieved using two methods.
++
++ First, we note that the vast majority of processes don't need to run during
++ suspend. They can be 'frozen'. We therefore implement a refrigerator
++ routine, which processes enter and in which they remain until the cycle is
++ complete. Processes enter the refrigerator via try_to_freeze() invocations
++ at appropriate places. A process cannot be frozen in any old place. It
++ must not be holding locks that will be needed for writing the image or
++ freezing other processes. For this reason, userspace processes generally
++ enter the refrigerator via the signal handling code, and kernel threads at
++ the place in their event loops where they drop locks and yield to other
++ processes or sleep.
++
++ The second part of our method for quiescing the system involves freezing
++ the filesystems. We use the standard freeze_bdev and thaw_bdev functions to
++ ensure that all of the user's data is synced to disk before we begin to
++ write the image.
++
++ Quiescing the system works most quickly and reliably when we add one more
++ element to the algorithm: separating the freezing of userspace processes
++ from the freezing of kernel space processes, and doing the filesystem freeze
++ in between. The filesystem freeze needs to be done while kernel threads such
++ as kjournald can still run. At the same time, though, everything will be less
++ racy and run more quickly if we stop userspace submitting more I/O work
++ while we're trying to quiesce.
++
++ Quiescing the system is therefore done in three steps:
++ - Freeze userspace
++ - Freeze filesystems
++ - Freeze kernel threads
++
++ If we need to free memory, we thaw kernel threads and filesystems, but not
++ userspace. We can then free caches without worrying about deadlocks due to
++ swap files being on frozen filesystems or such like.
++
++ b. Ensure enough memory & storage are available.
++
++ We have a number of constraints to meet to be able to successfully suspend
++ and resume.
++
++ First, the image will be written in two parts, described below. One of these
++ parts needs to have an atomic copy made, which of course implies a maximum
++ size of one half of the amount of system memory. The other part ('pageset2')
++ is not atomically copied, and can therefore be as large or small as desired.
++
++ Second, we have constraints on the amount of storage available. In these
++ calculations, we may also consider any compression that will be done. The
++ cryptoapi module allows the user to configure an expected compression ratio.
++
++ Third, the user can specify an arbitrary limit on the image size, in
++ megabytes. This limit is treated as a soft limit, so that we don't fail the
++ attempt to suspend if we cannot meet this constraint.
++
++ c. Allocate the required memory and storage space.
++
++ Having done the initial freeze, we determine whether the above constraints
++ are met, and seek to allocate the metadata for the image. If the constraints
++ are not met, or we fail to allocate the required space for the metadata, we
++ seek to free the amount of memory that we calculate is needed and try again.
++ We allow up to four iterations of this loop before aborting the cycle. If we
++ do fail, it should only be because of a bug in Suspend's calculations.
++
++ These steps are merged together in the prepare_image function, found in
++ prepare_image.c. The functions are merged because of the cyclical nature
++ of the problem of calculating how much memory and storage is needed. Since
++ the data structures containing the information about the image must
++ themselves take memory and use storage, the amount of memory and storage
++ required changes as we prepare the image. Since the changes are not large,
++ only one or two iterations will be required to achieve a solution.
++
++ d. Write the image.
++
++ We previously mentioned the need to create an atomic copy of the data, and
++ the half-of-memory limitation that is implied in this. This limitation is
++ circumvented by dividing the memory to be saved into two parts, called
++ pagesets.
++
++ Pageset2 contains the page cache - the pages on the active and inactive
++ lists. These pages are saved first and reloaded last. While saving these
++ pages, the swapwriter module carefully ensures that the work of writing
++ the pages doesn't make the image inconsistent. Pages added to the LRU
++ lists are immediately shot down, and careful accounting for available
++ memory aids debugging. No atomic copy of these pages needs to be made.
++
++ Writing the image requires memory, of course, and at this point we have
++ also not yet suspended the drivers. To avoid the possibility of remaining
++ activity corrupting the image, we allocate a special memory pool. Calls
++ to __alloc_pages and __free_pages_ok are then diverted to use our memory
++ pool. Pages in the memory pool are saved as part of pageset1 regardless of
++ whether or not they are used.
++
++ Once pageset2 has been saved, we suspend the drivers and save the CPU
++ context before making an atomic copy of pageset1, resuming the drivers
++ and saving the atomic copy. After saving the two pagesets, we just need to
++ save our metadata before powering down.
++
++ Having saved pageset2 pages, we can safely overwrite their contents with
++ the atomic copy of pageset1. This is how we manage to overcome the half of
++ memory limitation. Pageset2 is normally far larger than pageset1, and
++ pageset1 is normally much smaller than half of the memory, with the result
++ that pageset2 pages can be safely overwritten with the atomic copy of
++ pageset1. This is where we need to be careful about syncing, however.
++ Pageset2 will probably contain filesystem metadata. If this is overwritten
++ with pageset1 and then a sync occurs, the filesystem will be corrupted -
++ at least until resume time and another sync of the restored data. Since
++ there is a possibility that the user might not resume or (may it never be!)
++ that suspend might oops, we do our utmost to avoid syncing filesystems after
++ copying pageset1.
++
++ e. Power down.
++
++ Powering down uses standard kernel routines. Prior to this, however, we
++ suspend drivers again, ensuring that write caches are flushed.
++
++4. The method of writing the image.
++
++ Suspend2 contains an internal API which is designed to simplify the
++ implementation of new methods of transforming the image to be written and
++ writing the image itself. In early versions of Suspend2, compression support
++ was inlined in the image writing code, and the data structures and code for
++ managing swap were intertwined with the rest of the code. A number of people
++ had expressed interest in implementing image encryption, and alternative
++ methods of storing the image. This internal API makes that possible by
++ implementing 'modules'.
++
++ A module is a single file which encapsulates the functionality needed
++ to transform a pageset of data (encryption or compression, for example),
++ or to write the pageset to a device. The former type of module is called
++ a 'page-transformer', the latter a 'writer'.
++
++ Modules are linked together in pipeline fashion. There may be zero or more
++ page transformers in a pipeline, and there is always exactly one writer.
++ The pipeline follows this pattern:
++
++ ---------------------------------
++ | Suspend2 Core |
++ ---------------------------------
++ |
++ |
++ ---------------------------------
++ | Page transformer 1 |
++ ---------------------------------
++ |
++ |
++ ---------------------------------
++ | Page transformer 2 |
++ ---------------------------------
++ |
++ |
++ ---------------------------------
++ | Writer |
++ ---------------------------------
++
++ During the writing of an image, the core code feeds pages one at a time
++ to the first module. This module performs whatever transformations it
++ implements on the incoming data, completely consuming the incoming data and
++ feeding output in a similar manner to the next module. A module may buffer
++ its output.
++
++ During reading, the pipeline works in the reverse direction. The core code
++ calls the first module with the address of a buffer which should be filled.
++ (Note that the buffer size is always PAGE_SIZE at this time). This module
++ will in turn request data from the next module and so on down until the
++ writer is made to read from the stored image.
++
++ Part of the definition of the structure of a module thus looks like this:
++
++ int (*rw_init) (int rw, int stream_number);
++ int (*rw_cleanup) (int rw);
++ int (*write_chunk) (struct page *buffer_page);
++ int (*read_chunk) (struct page *buffer_page, int sync);
++
++ It should be noted that the _cleanup routine may be called before the
++ full stream of data has been read or written. While writing the image,
++ the user may (depending upon settings) choose to abort suspending, and
++ if we are in the midst of writing the last portion of the image, a portion
++ of the second pageset may be reread.
++
++ In addition to the above routines for writing the data, all modules have a
++ number of other routines:
++
++ TYPE indicates whether the module is a page transformer or a writer.
++ #define TRANSFORMER_MODULE 1
++ #define WRITER_MODULE 2
++
++ NAME is the name of the module, used in generic messages.
++
++ MODULE_LIST is used to link the module into the list of all modules.
++
++ MEMORY_NEEDED returns the number of pages of memory required by the module
++ to do its work.
++
++ STORAGE_NEEDED returns the number of pages in the suspend header required
++ to store the module's configuration data.
++
++ PRINT_DEBUG_INFO fills a buffer with information to be displayed about the
++ operation or settings of the module.
++
++ SAVE_CONFIG_INFO returns a buffer of PAGE_SIZE or smaller (the size is the
++ return code), containing the module's configuration info. This information
++ will be written in the image header and restored at resume time. Since this
++ buffer is allocated after the atomic copy of the kernel is made, you don't
++ need to worry about the buffer being freed.
++
++ LOAD_CONFIG_INFO gives the module a pointer to the configuration info
++ which was saved during suspending. Once again, the module doesn't need to
++ worry about freeing the buffer. The kernel will be overwritten with the
++ original kernel, so no memory leak will occur.
++
++ OPS contains the operations specific to transformers and writers. These are
++ described below.
++
++ The complete definition of struct suspend_module_ops is:
++
++ struct suspend_module_ops {
++ /* Functions common to all modules */
++ int type;
++ char *name;
++ struct module *module;
++ int disabled;
++ struct list_head module_list;
++
++ /* List of filters or writers */
++ struct list_head list, type_list;
++
++ /*
++ * Requirements for memory and storage in
++ * the image header.
++ */
++ unsigned long (*memory_needed) (void);
++ unsigned long (*storage_needed) (void);
++
++ /*
++ * Debug info
++ */
++ int (*print_debug_info) (char *buffer, int size);
++ int (*save_config_info) (char *buffer);
++ void (*load_config_info) (char *buffer, int len);
++
++ /*
++ * Initialise & cleanup - general routines called
++ * at the start and end of a cycle.
++ */
++ int (*initialise) (int starting_cycle);
++ void (*cleanup) (int finishing_cycle);
++
++ /*
++ * Calls for allocating storage (writers only).
++ *
++ * Header space is allocated separately. Note that allocation
++ * of space for the header might result in allocated space
++ * being stolen from the main pool if there is no unallocated
++ * space. We have to be able to allocate enough space for
++ * the header. We can eat memory to ensure there is enough
++ * for the main pool.
++ */
++
++ int (*storage_available) (void);
++ int (*allocate_header_space) (int space_requested);
++ int (*allocate_storage) (int space_requested);
++ int (*storage_allocated) (void);
++ int (*release_storage) (void);
++
++ /*
++ * Routines used in image I/O.
++ */
++ int (*rw_init) (int rw, int stream_number);
++ int (*rw_cleanup) (int rw);
++ int (*write_chunk) (struct page *buffer_page);
++ int (*read_chunk) (struct page *buffer_page, int sync);
++
++ /* Reset module if image exists but reading aborted */
++ void (*noresume_reset) (void);
++
++ /* Read and write the metadata */
++ int (*write_header_init) (void);
++ int (*write_header_cleanup) (void);
++
++ int (*read_header_init) (void);
++ int (*read_header_cleanup) (void);
++
++ int (*rw_header_chunk) (int rw, char *buffer_start, int buffer_size);
++
++ /* Attempt to parse an image location */
++ int (*parse_sig_location) (char *buffer, int only_writer);
++
++ /* Determine whether image exists that we can restore */
++ int (*image_exists) (void);
++
++ /* Mark the image as having tried to resume */
++ void (*mark_resume_attempted) (void);
++
++ /* Destroy image if one exists */
++ int (*invalidate_image) (void);
++ };
++
++
++ Expected compression returns the expected ratio between the amount of
++ data sent to this module and the amount of data it passes to the next
++ module. The value is used by the core code to calculate the amount of
++ space required to write the image. If the ratio is not achieved, the
++ writer will complain when it runs out of space with data still to
++ write, and the core code will abort the suspend.
++
++ transformer_list links together page transformers, in the order in
++ which they register, which is in turn determined by order in the
++ Makefile.
+diff -Nur linux-2.6.16.11/Documentation/power/suspend2.txt linux-2.6.16.11.suspend2/Documentation/power/suspend2.txt
+--- linux-2.6.16.11/Documentation/power/suspend2.txt 1970-01-01 00:00:00.000000000 +0000
++++ linux-2.6.16.11.suspend2/Documentation/power/suspend2.txt 2006-05-01 00:12:25.000000000 +0000
+@@ -0,0 +1,663 @@
++ --- Suspend2, version 2.2 ---
++
++1. What is it?
++2. Why would you want it?
++3. What do you need to use it?
++4. Why not just use the version already in the kernel?
++5. How do you use it?
++6. What do all those entries in /proc/suspend2 do?
++7. How do you get support?
++8. I think I've found a bug. What should I do?
++9. When will XXX be supported?
++10. How does it work?
++11. Who wrote Suspend2?
++
++1. What is it?
++
++ Imagine you're sitting at your computer, working away. For some reason, you
++ need to turn off your computer for a while - perhaps it's time to go home
++ for the day. When you come back to your computer next, you're going to want
++ to carry on where you left off. Now imagine that you could push a button and
++ have your computer store the contents of its memory to disk and power down.
++ Then, when you next start up your computer, it loads that image back into
++ memory and you can carry on from where you were, just as if you'd never
++ turned the computer off. Far less time to start up, no reopening
++ applications and finding what directory you put that file in yesterday.
++ That's what Suspend2 does.
++
++2. Why would you want it?
++
++ Why wouldn't you want it?
++
++ Being able to save the state of your system and quickly restore it improves
++ your productivity - you get a useful system in far less time than through
++ the normal boot process.
++
++3. What do you need to use it?
++
++ a. Kernel Support.
++
++ i) The Suspend2 patch.
++
++ Suspend2 is part of the Linux Kernel. This version is not part of Linus's
++ 2.6 tree at the moment, so you will need to download the kernel source and
++ apply the latest patch. Having done that, enable the appropriate options in
++ make [menu|x]config (under General Setup), compile and install your kernel.
++ Suspend2 works with SMP, Highmem, preemption, x86-32, PPC and x86_64.
++
++ Suspend2 patches are available from http://suspend2.net.
++
++ ii) Compression and encryption support.
++
++ Compression and encryption support are implemented via the
++ cryptoapi. You will therefore want to select any Cryptoapi transforms that
++ you want to use on your image from the Cryptoapi menu while configuring
++ your kernel.
++
++ You can also tell Suspend to write its image to an encrypted and/or
++ compressed filesystem/swap partition. In that case, you don't need to do
++ anything special for Suspend2 when it comes to kernel configuration.
++
++ iii) Configuring other options.
++
++ While you're configuring your kernel, try to configure as much as possible
++ to build as modules. We recommend this because there are a number of drivers
++ that are still in the process of implementing proper power management
++ support. In those cases, the best way to work around their current lack is
++ to build them as modules and remove the modules while suspending. You might
++ also bug the driver authors to get their support up to speed, or even help!
++
++ b. Storage.
++
++ i) Swap.
++
++ Suspend2 can store the suspend image in your swap partition, a swap file or
++ a combination thereof. Whichever combination you choose, you will probably
++ want to create enough swap space to store the largest image you could have,
++ plus the space you'd normally use for swap. A good rule of thumb would be
++ to calculate the amount of swap you'd want without using Suspend2, and then
++ add the amount of memory you have. This swapspace can be arranged in any way
++ you'd like. It can be in one partition or file, or spread over a number. The
++ only requirement is that they be active when you start a suspend cycle.
++
++ There is one exception to this requirement. Suspend2 has the ability to turn
++ on one swap file or partition at the start of suspending and turn it back off
++ at the end. If you want to ensure you have enough memory to store an image
++ when your memory is fully used, you might want to make one swap partition or
++ file for 'normal' use, and another for Suspend2 to activate & deactivate
++ automatically. (Further details below).
++
++ ii) Normal files.
++
++ Suspend2 includes a 'filewriter'. The filewriter can store
++ your image in a simple file. Since Linux has the idea of everything being
++ a file, this is more powerful than it initially sounds. If, for example,
++ you were to set up a network block device file, you could suspend to a
++ network server. This has been tested and works to a point, but nbd itself
++ isn't stateless enough for our purposes.
++
++ Take extra care when setting up the filewriter. If you just type commands
++ without thinking and then try to suspend, you could cause irreversible
++ corruption on your filesystems! Make sure you have backups. Also, because
++ the filewriter is comparatively new, it's not as well tested as the
++ swapwriter. Be aware that there may be bugs that could cause damage to your
++ data even if you are careful! You have been warned!
++
++ Most people will only want to suspend to a local file. To achieve that, do
++ something along the lines of:
++
++ echo "Suspend2" > /suspend-file
++ dd if=/dev/zero bs=1M count=512 >> /suspend-file
++
++ This will create a 512MB file called /suspend-file. To get Suspend2 to use
++ it:
++
++ echo /suspend-file > /proc/suspend2/filewriter_target
++
++ Then
++
++ cat /proc/suspend2/resume2
++
++ Put the results of this into your bootloader's configuration (see also step
++ c, below):
++
++ ---EXAMPLE-ONLY-DON'T-COPY-AND-PASTE---
++ # cat /proc/suspend2/resume2
++ file:/dev/hda2:0x1e001
++
++ In this example, we would edit the append= line of our lilo.conf|menu.lst
++ so that it included:
++
++ resume2=file:/dev/hda2:0x1e001
++ ---EXAMPLE-ONLY-DON'T-COPY-AND-PASTE---
++
++ For those who are thinking 'Could I make the file sparse?', the answer is
++ 'No!'. At the moment, there is no way for Suspend2 to fill in the holes in
++ a sparse file while suspending. In the longer term (post merge!), I'd like
++ to change things so that the file could be dynamically resized as needed.
++ Right now, however, that's not possible.
++
++ c. Bootloader configuration.
++
++ Using Suspend2 also requires that you add an extra parameter to
++ your lilo.conf or equivalent. Here's an example for a swap partition:
++
++ append="resume2=swap:/dev/hda1"
++
++ This would tell Suspend2 that /dev/hda1 is a swap partition you
++ have. Suspend2 will use the swap signature of this partition as a
++ pointer to your data when you suspend. This means that (in this example)
++ /dev/hda1 doesn't need to be _the_ swap partition where all of your data
++ is actually stored. It just needs to be a swap partition that has a
++ valid signature.
++
++ You don't need to have a swap partition for this purpose. Suspend2
++ can also use a swap file, but usage is a little more complex. Having made
++ your swap file, turn it on and do
++
++ cat /proc/suspend2/headerlocations
++
++ (this assumes you've already compiled your kernel with Suspend2
++ support and booted it). The results of the cat command will tell you
++ what you need to put in lilo.conf:
++
++ For swap partitions like /dev/hda1, simply use resume2=/dev/hda1.
++ For swapfile `swapfile`, use resume2=swap:/dev/hda2:0x242d@4096.
++
++ If the swapfile changes for any reason (it is moved to a different
++ location, it is deleted and recreated, or the filesystem is
++ defragmented) then you will have to check
++ /proc/suspend2/headerlocations for a new resume_block value.
++
++ Once you've compiled and installed the kernel, adjusted your lilo.conf
++ and rerun lilo, you should only need to reboot for the most basic part
++ of Suspend2 to be ready.
++
++ If you only compile in the swapwriter, or only compile in the filewriter,
++ you don't need to add the "swap:" part of the resume2= parameters above.
++ resume2=/dev/hda2:0x242d@4096 will work just as well.
++
++ d. The hibernate script.
++
++ Since the driver model in 2.6 kernels is still being developed, you may need
++ to do more, however. Users of Suspend2 usually start the process via a script
++ which prepares for the suspend, tells the kernel to do its stuff and then
++ restores things afterwards. This script might involve:
++
++ - Switching to a text console and back if X doesn't like the video card
++ status on resume.
++ - Un/reloading PCMCIA support since it doesn't play well with suspend.
<<Diff was trimmed, longer than 597 lines>>
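
Note on the module pipeline in section 4 of internals.txt (in the diff above): the
way pages flow from the core through page transformers to the writer, and back
again on resume, can be illustrated with a small user-space C sketch. Nothing
below is Suspend2 code. struct demo_module, the XOR "transformer", the in-memory
"writer", CHUNK_SIZE (standing in for PAGE_SIZE) and demo_roundtrip() are all
hypothetical names invented for illustration, and the real write_chunk and
read_chunk callbacks take a struct page * rather than a byte buffer.

```c
#include <assert.h>
#include <string.h>

#define CHUNK_SIZE 16   /* stand-in for PAGE_SIZE, kept tiny for the demo */
#define MAX_CHUNKS 8

/* Hypothetical, simplified module; NOT the kernel's struct suspend_module_ops. */
struct demo_module {
    const char *name;
    struct demo_module *next;   /* next stage in the pipeline; NULL for the writer */
    int (*write_chunk)(struct demo_module *self, unsigned char *buf);
    int (*read_chunk)(struct demo_module *self, unsigned char *buf);
};

/* "Writer": the last stage, which stores chunks in a static array. */
static unsigned char storage[MAX_CHUNKS][CHUNK_SIZE];
static int stored, read_pos;

static int writer_write(struct demo_module *self, unsigned char *buf)
{
    (void)self;
    if (stored >= MAX_CHUNKS)
        return -1;              /* out of space: the caller must abort */
    memcpy(storage[stored++], buf, CHUNK_SIZE);
    return 0;
}

static int writer_read(struct demo_module *self, unsigned char *buf)
{
    (void)self;
    if (read_pos >= stored)
        return -1;              /* nothing left in the stored image */
    memcpy(buf, storage[read_pos++], CHUNK_SIZE);
    return 0;
}

/* "Page transformer": XOR stands in for compression or encryption. */
static void xor_chunk(unsigned char *buf)
{
    for (int i = 0; i < CHUNK_SIZE; i++)
        buf[i] ^= 0x5a;
}

static int xor_write(struct demo_module *self, unsigned char *buf)
{
    unsigned char out[CHUNK_SIZE];

    assert(self->next != NULL); /* a transformer always has a stage after it */
    memcpy(out, buf, CHUNK_SIZE);   /* leave the caller's chunk untouched */
    xor_chunk(out);
    return self->next->write_chunk(self->next, out);
}

static int xor_read(struct demo_module *self, unsigned char *buf)
{
    int ret;

    assert(self->next != NULL);
    /* Reading pulls data up from the next stage, then undoes the transform. */
    ret = self->next->read_chunk(self->next, buf);
    if (ret == 0)
        xor_chunk(buf);
    return ret;
}

static struct demo_module writer = { "writer", NULL, writer_write, writer_read };
static struct demo_module xform = { "xor", &writer, xor_write, xor_read };

/* Returns 0 if one chunk survives a write/read round trip through the pipeline. */
int demo_roundtrip(void)
{
    unsigned char page[CHUNK_SIZE] = "suspend2 page";
    unsigned char back[CHUNK_SIZE];

    stored = read_pos = 0;
    if (xform.write_chunk(&xform, page) != 0)
        return -1;
    if (memcmp(storage[0], page, CHUNK_SIZE) == 0)
        return -2;              /* the writer should only see transformed data */
    if (xform.read_chunk(&xform, back) != 0)
        return -3;
    return memcmp(page, back, CHUNK_SIZE) == 0 ? 0 : -4;
}
```

Calling demo_roundtrip() from a main() of your own exercises both directions. The
"core" only ever talks to the first module, which is the property the document
describes; adding a second transformer would just mean chaining another
demo_module in front of the writer.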