SOURCES (LINUX_2_6_11): inotify-2.6.12-rc3.patch (NEW) - recovered...
pluto
pluto at pld-linux.org
Sun Jun 26 12:10:59 CEST 2005
Author: pluto Date: Sun Jun 26 10:10:59 2005 GMT
Module: SOURCES Tag: LINUX_2_6_11
---- Log message:
- recovered (deleted by accident).
---- Files affected:
SOURCES:
inotify-2.6.12-rc3.patch (1.1.2.2.2.1 -> 1.1.2.2.2.2) (NEW)
---- Diffs:
================================================================
Index: SOURCES/inotify-2.6.12-rc3.patch
diff -u /dev/null SOURCES/inotify-2.6.12-rc3.patch:1.1.2.2.2.2
--- /dev/null Sun Jun 26 12:10:59 2005
+++ SOURCES/inotify-2.6.12-rc3.patch Sun Jun 26 12:10:54 2005
@@ -0,0 +1,1958 @@
+Subject: [patch] latest inotify.
+From: Robert Love <rml at novell.com>
+
+Below is the latest inotify, against 2.6.12-rc3.
+
+Changes since the last post:
+
+ - Explicitly define IN_ALL_EVENTS
+ - Couple of bug fixes related to the recent changes
+ - Fix for Viro's race
+ - Add some rationale to the documentation
+ - Misc. cleanup and such
+
+Enjoy.
+
+ Robert Love
+
+
+inotify!
+
+inotify is intended to correct the deficiencies of dnotify, particularly
+its inability to scale and its terrible user interface:
+
+ * dnotify requires the opening of one fd for each directory
+ that you intend to watch. This quickly results in too many
+ open files and pins removable media, preventing unmount.
+ * dnotify is directory-based. You only learn about changes to
+ directories. Sure, a change to a file in a directory affects
+ the directory, but you are then forced to keep a cache of
+ stat structures.
+ * dnotify's interface to user-space is awful. Signals?
+
+inotify provides a more usable, simple, powerful solution to file change
+notification:
+
+ * inotify's interface is a device node, not SIGIO. You open a
+ single fd to the device node, which is select()-able.
+ * inotify has an event that says "the filesystem that the item
+ you were watching is on was unmounted."
+ * inotify can watch directories or files.
+
+Inotify is currently used by Beagle (a desktop search infrastructure),
+Gamin (a FAM replacement), and other projects.
+
+Signed-off-by: Robert Love <rml at novell.com>
+
+ Documentation/filesystems/inotify.txt | 123 ++++
+ fs/Kconfig | 13
+ fs/Makefile | 1
+ fs/attr.c | 33 -
+ fs/compat.c | 12
+ fs/file_table.c | 3
+ fs/inode.c | 6
+ fs/inotify.c | 971 ++++++++++++++++++++++++++++++++++
+ fs/namei.c | 30 -
+ fs/open.c | 4
+ fs/read_write.c | 15
+ fs/xattr.c | 5
+ include/linux/fs.h | 6
+ include/linux/fsnotify.h | 230 ++++++++
+ include/linux/inotify.h | 126 ++++
+ include/linux/sched.h | 4
+ kernel/user.c | 4
+ 17 files changed, 1530 insertions(+), 56 deletions(-)
+
+diff -urN linux-2.6.12-rc3/Documentation/filesystems/inotify.txt linux/Documentation/filesystems/inotify.txt
+--- linux-2.6.12-rc3/Documentation/filesystems/inotify.txt 1969-12-31 19:00:00.000000000 -0500
++++ linux/Documentation/filesystems/inotify.txt 2005-04-28 16:36:13.000000000 -0400
+@@ -0,0 +1,123 @@
++ inotify
++ a powerful yet simple file change notification system
++
++
++
++Document started 15 Mar 2005 by Robert Love <rml at novell.com>
++
++(i) User Interface
++
++Inotify is controlled by a device node, /dev/inotify. If you do not use udev,
++this device may need to be created manually. The first step is to open it:
++
++ int dev_fd = open ("/dev/inotify", O_RDONLY);
++
++Change events are managed by "watches". A watch is an (object,mask) pair where
++the object is a file or directory and the mask is a bitmask of one or more
++inotify events that the application wishes to receive. See <linux/inotify.h>
++for valid events. A watch is referenced by a watch descriptor, or wd.
++
++Watches are added via a file descriptor.
++
++Watches on a directory will return events on any files inside of the directory.
++
++Adding a watch is simple,
++
++ /* 'wd' represents the watch on fd with mask */
++ struct inotify_request req = { fd, mask };
++ int wd = ioctl (dev_fd, INOTIFY_WATCH, &req);
++
++You can add a large number of files via something like
++
++ for each file to watch {
++ struct inotify_request req;
++ int file_fd;
++
++ file_fd = open (file, O_RDONLY);
++ if (file_fd < 0) {
++ perror ("open");
++ break;
++ }
++
++ req.fd = file_fd;
++ req.mask = mask;
++
++ wd = ioctl (dev_fd, INOTIFY_WATCH, &req);
++
++ close (file_fd);
++ }
++
++You can update an existing watch in the same manner, by passing in a new mask.
++
++An existing watch is removed via the INOTIFY_IGNORE ioctl, for example
++
++ ioctl (dev_fd, INOTIFY_IGNORE, wd);
++
++Events are provided in the form of an inotify_event structure that is read(2)
++from /dev/inotify. The filename is of dynamic length and follows the struct.
++It is of size len. The filename is padded with null bytes to ensure proper
++alignment. This padding is reflected in len.
++
++You can slurp multiple events by passing a large buffer, for example
++
++ ssize_t len = read (dev_fd, buf, BUF_LEN);
++
++This returns as many events as are available and fit in BUF_LEN.
++
++/dev/inotify also supports select() and poll().
++
++You can find the size of the current event queue via the FIONREAD ioctl.
++
++All watches are destroyed and cleaned up on close.
++
++
++(ii) Internal Kernel Implementation
++
++Each open inotify device is associated with an inotify_device structure.
++
++Each watch is associated with an inotify_watch structure. Watches are chained
++off of each associated device and each associated inode.
++
++See fs/inotify.c for the locking and lifetime rules.
++
++
++(iii) Rationale
++
++Q: What is the design decision behind not tying the watch to the
++open fd of the watched object?
++
++A: Watches are associated with an open inotify device, not an
++open file. This solves the primary problem with dnotify:
++keeping the file open pins the file and thus, worse, pins the
++mount. Dnotify is therefore infeasible for use on a desktop
++system with removable media as the media cannot be unmounted.
++
++Q: What is the design decision behind using an fd-per-device as
++opposed to an fd-per-watch?
++
++A: An fd-per-watch quickly consumes more file descriptors than
++are allowed, more fd's than are feasible to manage, and more
++fd's than are ideally select()-able. Yes, root can bump the
++per-process fd limit and yes, users can use epoll, but requiring
++both is silly and an extraneous requirement. A watch consumes
++less memory than an open file, so separating the number spaces is
++thus sensible. The current design is what user-space developers
++want: Users open the device, once, and add n watches, requiring
++but one fd and no twiddling with fd limits.
++Opening /dev/inotify two thousand times is silly. If we can
++implement user-space's preferences cleanly--and we can, the idr
++layer makes stuff like this trivial--then we should.
++
++Q: Why a device node?
++
++A: The second biggest problem with dnotify is that the user
++interface sucks ass. Signals are a terrible, terrible interface
++for file notification. Or for anything, for that matter. The
++ideal solution, from all perspectives, is a file descriptor based
++one that allows basic file I/O and poll/select. Obtaining the
++fd and managing the watches could have been done either via a
++device file or a family of new system calls. We decided to
++implement a device file because adding three or four new system
++calls that mirrored open, close, and ioctl seemed silly. A
++character device makes sense from user-space and was easy to
++implement inside of the kernel.
+diff -urN linux-2.6.12-rc3/fs/attr.c linux/fs/attr.c
+--- linux-2.6.12-rc3/fs/attr.c 2005-04-27 11:49:45.000000000 -0400
++++ linux/fs/attr.c 2005-04-27 11:50:41.000000000 -0400
+@@ -10,7 +10,7 @@
+ #include <linux/mm.h>
+ #include <linux/string.h>
+ #include <linux/smp_lock.h>
+-#include <linux/dnotify.h>
++#include <linux/fsnotify.h>
+ #include <linux/fcntl.h>
+ #include <linux/quotaops.h>
+ #include <linux/security.h>
+@@ -107,31 +107,8 @@
+ out:
+ return error;
+ }
+-
+ EXPORT_SYMBOL(inode_setattr);
+
+-int setattr_mask(unsigned int ia_valid)
+-{
+- unsigned long dn_mask = 0;
+-
+- if (ia_valid & ATTR_UID)
+- dn_mask |= DN_ATTRIB;
+- if (ia_valid & ATTR_GID)
+- dn_mask |= DN_ATTRIB;
+- if (ia_valid & ATTR_SIZE)
+- dn_mask |= DN_MODIFY;
+- /* both times implies a utime(s) call */
+- if ((ia_valid & (ATTR_ATIME|ATTR_MTIME)) == (ATTR_ATIME|ATTR_MTIME))
+- dn_mask |= DN_ATTRIB;
+- else if (ia_valid & ATTR_ATIME)
+- dn_mask |= DN_ACCESS;
+- else if (ia_valid & ATTR_MTIME)
+- dn_mask |= DN_MODIFY;
+- if (ia_valid & ATTR_MODE)
+- dn_mask |= DN_ATTRIB;
+- return dn_mask;
+-}
+-
+ int notify_change(struct dentry * dentry, struct iattr * attr)
+ {
+ struct inode *inode = dentry->d_inode;
+@@ -197,11 +174,9 @@
+ if (ia_valid & ATTR_SIZE)
+ up_write(&dentry->d_inode->i_alloc_sem);
+
+- if (!error) {
+- unsigned long dn_mask = setattr_mask(ia_valid);
+- if (dn_mask)
+- dnotify_parent(dentry, dn_mask);
+- }
++ if (!error)
++ fsnotify_change(dentry, ia_valid);
++
+ return error;
+ }
+
+diff -urN linux-2.6.12-rc3/fs/compat.c linux/fs/compat.c
+--- linux-2.6.12-rc3/fs/compat.c 2005-04-27 11:49:46.000000000 -0400
++++ linux/fs/compat.c 2005-04-27 11:50:41.000000000 -0400
+@@ -37,7 +37,7 @@
+ #include <linux/ctype.h>
+ #include <linux/module.h>
+ #include <linux/dirent.h>
+-#include <linux/dnotify.h>
++#include <linux/fsnotify.h>
+ #include <linux/highuid.h>
+ #include <linux/sunrpc/svc.h>
+ #include <linux/nfsd/nfsd.h>
+@@ -1307,9 +1307,13 @@
+ out:
+ if (iov != iovstack)
+ kfree(iov);
+- if ((ret + (type == READ)) > 0)
+- dnotify_parent(file->f_dentry,
+- (type == READ) ? DN_ACCESS : DN_MODIFY);
++ if ((ret + (type == READ)) > 0) {
++ struct dentry *dentry = file->f_dentry;
++ if (type == READ)
++ fsnotify_access(dentry);
++ else
++ fsnotify_modify(dentry);
++ }
+ return ret;
+ }
+
+diff -urN linux-2.6.12-rc3/fs/file_table.c linux/fs/file_table.c
+--- linux-2.6.12-rc3/fs/file_table.c 2005-03-02 02:37:47.000000000 -0500
++++ linux/fs/file_table.c 2005-04-27 11:50:41.000000000 -0400
+@@ -16,6 +16,7 @@
+ #include <linux/eventpoll.h>
+ #include <linux/mount.h>
+ #include <linux/cdev.h>
++#include <linux/fsnotify.h>
+
+ /* sysctl tunables... */
+ struct files_stat_struct files_stat = {
+@@ -123,6 +124,8 @@
+ struct inode *inode = dentry->d_inode;
+
+ might_sleep();
++
++ fsnotify_close(file);
+ /*
+ * The function eventpoll_release() should be the first called
+ * in the file cleanup chain.
+diff -urN linux-2.6.12-rc3/fs/inode.c linux/fs/inode.c
+--- linux-2.6.12-rc3/fs/inode.c 2005-04-27 11:49:46.000000000 -0400
++++ linux/fs/inode.c 2005-04-27 11:50:41.000000000 -0400
+@@ -21,6 +21,7 @@
+ #include <linux/pagemap.h>
+ #include <linux/cdev.h>
+ #include <linux/bootmem.h>
++#include <linux/inotify.h>
+
+ /*
+ * This is needed for the following functions:
+@@ -129,6 +130,10 @@
+ #ifdef CONFIG_QUOTA
+ memset(&inode->i_dquot, 0, sizeof(inode->i_dquot));
+ #endif
++#ifdef CONFIG_INOTIFY
++ INIT_LIST_HEAD(&inode->inotify_watches);
++ sema_init(&inode->inotify_sem, 1);
++#endif
+ inode->i_pipe = NULL;
+ inode->i_bdev = NULL;
+ inode->i_cdev = NULL;
+@@ -355,6 +360,7 @@
+
+ down(&iprune_sem);
+ spin_lock(&inode_lock);
++ inotify_unmount_inodes(&sb->s_inodes);
+ busy = invalidate_list(&sb->s_inodes, &throw_away);
+ spin_unlock(&inode_lock);
+
+diff -urN linux-2.6.12-rc3/fs/inotify.c linux/fs/inotify.c
+--- linux-2.6.12-rc3/fs/inotify.c 1969-12-31 19:00:00.000000000 -0500
++++ linux/fs/inotify.c 2005-04-28 16:31:13.000000000 -0400
+@@ -0,0 +1,971 @@
++/*
++ * fs/inotify.c - inode-based file event notifications
++ *
++ * Authors:
++ * John McCutchan <ttb at tentacle.dhs.org>
++ * Robert Love <rml at novell.com>
++ *
++ * Copyright (C) 2005 John McCutchan
++ *
++ * This program is free software; you can redistribute it and/or modify it
++ * under the terms of the GNU General Public License as published by the
++ * Free Software Foundation; either version 2, or (at your option) any
++ * later version.
++ *
++ * This program is distributed in the hope that it will be useful, but
++ * WITHOUT ANY WARRANTY; without even the implied warranty of
++ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
++ * General Public License for more details.
++ */
++
++#include <linux/module.h>
++#include <linux/kernel.h>
++#include <linux/sched.h>
++#include <linux/spinlock.h>
++#include <linux/idr.h>
++#include <linux/slab.h>
++#include <linux/fs.h>
++#include <linux/file.h>
++#include <linux/namei.h>
++#include <linux/poll.h>
++#include <linux/device.h>
++#include <linux/miscdevice.h>
++#include <linux/init.h>
++#include <linux/list.h>
++#include <linux/writeback.h>
++#include <linux/inotify.h>
++
++#include <asm/ioctls.h>
++
++static atomic_t inotify_cookie;
++
++static kmem_cache_t *watch_cachep;
++static kmem_cache_t *event_cachep;
++
++static int max_user_devices;
++static int max_user_watches;
++static unsigned int max_queued_events;
++
++/*
++ * Lock ordering:
++ *
++ * dentry->d_lock (used to keep d_move() away from dentry->d_parent)
++ * iprune_sem (synchronize versus shrink_icache_memory())
++ * inode_lock (protects the super_block->s_inodes list)
++ * inode->inotify_sem (protects inode->inotify_watches and watches->i_list)
++ * inotify_dev->sem (protects inotify_device and watches->d_list)
++ */
++
++/*
++ * Lifetimes of the three main data structures--inotify_device, inode, and
++ * inotify_watch--are managed by reference count.
++ *
++ * inotify_device: Lifetime is from open until release. Additional references
++ * can bump the count via get_inotify_dev() and drop the count via
++ * put_inotify_dev().
++ *
++ * inotify_watch: Lifetime is from create_watch() to destroy_watch().
++ * Additional references can bump the count via get_inotify_watch() and drop
++ * the count via put_inotify_watch().
++ *
++ * inode: Pinned so long as the inode is associated with a watch, from
++ * create_watch() to put_inotify_watch().
++ */
++
++/*
++ * struct inotify_device - represents an open instance of an inotify device
++ *
++ * This structure is protected by the semaphore 'sem'.
++ */
++struct inotify_device {
++ wait_queue_head_t wq; /* wait queue for i/o */
++ struct idr idr; /* idr mapping wd -> watch */
++ struct semaphore sem; /* protects this bad boy */
++ struct list_head events; /* list of queued events */
++ struct list_head watches; /* list of watches */
++ atomic_t count; /* reference count */
++ struct user_struct *user; /* user who opened this dev */
++ unsigned int queue_size; /* size of the queue (bytes) */
++ unsigned int event_count; /* number of pending events */
++ unsigned int max_events; /* maximum number of events */
++};
++
++/*
++ * struct inotify_kernel_event - An inotify event, originating from a watch and
++ * queued for user-space. A list of these is attached to each instance of the
++ * device. In read(), this list is walked and all events that can fit in the
++ * buffer are returned.
++ *
++ * Protected by dev->sem of the device in which we are queued.
++ */
++struct inotify_kernel_event {
++ struct inotify_event event; /* the user-space event */
++ struct list_head list; /* entry in inotify_device's list */
++ char *name; /* filename, if any */
++};
++
++/*
++ * struct inotify_watch - represents a watch request on a specific inode
++ *
++ * d_list is protected by dev->sem of the associated watch->dev.
++ * i_list and mask are protected by inode->inotify_sem of the associated inode.
++ * dev, inode, and wd are never written to once the watch is created.
++ */
++struct inotify_watch {
++ struct list_head d_list; /* entry in inotify_device's list */
++ struct list_head i_list; /* entry in inode's list */
++ atomic_t count; /* reference count */
++ struct inotify_device *dev; /* associated device */
++ struct inode *inode; /* associated inode */
++ s32 wd; /* watch descriptor */
++ u32 mask; /* event mask for this watch */
++};
++
++static ssize_t show_max_queued_events(struct class_device *class, char *buf)
++{
++ return sprintf(buf, "%u\n", max_queued_events);
++}
++
++static ssize_t store_max_queued_events(struct class_device *class,
++ const char *buf, size_t count)
++{
++ unsigned int max;
++
++ if (sscanf(buf, "%u", &max) > 0 && max > 0) {
++ max_queued_events = max;
++ return strlen(buf);
++ }
++ return -EINVAL;
++}
++
++static ssize_t show_max_user_devices(struct class_device *class, char *buf)
++{
++ return sprintf(buf, "%d\n", max_user_devices);
++}
++
++static ssize_t store_max_user_devices(struct class_device *class,
++ const char *buf, size_t count)
++{
++ int max;
++
++ if (sscanf(buf, "%d", &max) > 0 && max > 0) {
++ max_user_devices = max;
++ return strlen(buf);
++ }
++ return -EINVAL;
++}
++
++static ssize_t show_max_user_watches(struct class_device *class, char *buf)
++{
++ return sprintf(buf, "%d\n", max_user_watches);
++}
++
++static ssize_t store_max_user_watches(struct class_device *class,
++ const char *buf, size_t count)
++{
++ int max;
++
++ if (sscanf(buf, "%d", &max) > 0 && max > 0) {
++ max_user_watches = max;
++ return strlen(buf);
++ }
++ return -EINVAL;
++}
++
++static CLASS_DEVICE_ATTR(max_queued_events, S_IRUGO | S_IWUSR,
++ show_max_queued_events, store_max_queued_events);
++static CLASS_DEVICE_ATTR(max_user_devices, S_IRUGO | S_IWUSR,
++ show_max_user_devices, store_max_user_devices);
++static CLASS_DEVICE_ATTR(max_user_watches, S_IRUGO | S_IWUSR,
++ show_max_user_watches, store_max_user_watches);
++
++static inline void get_inotify_dev(struct inotify_device *dev)
++{
++ atomic_inc(&dev->count);
++}
++
++static inline void put_inotify_dev(struct inotify_device *dev)
++{
++ if (atomic_dec_and_test(&dev->count)) {
++ atomic_dec(&dev->user->inotify_devs);
++ free_uid(dev->user);
++ kfree(dev);
++ }
++}
++
++static inline void get_inotify_watch(struct inotify_watch *watch)
++{
++ atomic_inc(&watch->count);
++}
++
++/*
++ * put_inotify_watch - decrements the ref count on a given watch. cleans up
++ * the watch and its references if the count reaches zero.
++ */
++static inline void put_inotify_watch(struct inotify_watch *watch)
++{
++ if (atomic_dec_and_test(&watch->count)) {
++ put_inotify_dev(watch->dev);
++ iput(watch->inode);
++ kmem_cache_free(watch_cachep, watch);
++ }
++}
++
++/*
++ * kernel_event - create a new kernel event with the given parameters
++ *
++ * This function can sleep.
++ */
++static struct inotify_kernel_event * kernel_event(s32 wd, u32 mask, u32 cookie,
++ const char *name)
++{
++ struct inotify_kernel_event *kevent;
++
++ kevent = kmem_cache_alloc(event_cachep, GFP_KERNEL);
++ if (unlikely(!kevent))
++ return NULL;
++
++ /* we hand this out to user-space, so zero it just in case */
++ memset(&kevent->event, 0, sizeof(struct inotify_event));
++
++ kevent->event.wd = wd;
++ kevent->event.mask = mask;
++ kevent->event.cookie = cookie;
++
++ INIT_LIST_HEAD(&kevent->list);
++
++ if (name) {
++ size_t len, rem, event_size = sizeof(struct inotify_event);
++
++ /*
++ * We need to pad the filename so as to properly align an
++ * array of inotify_event structures. Because the structure is
++ * small and the common case is a small filename, we just round
++ * up to the next multiple of the structure's sizeof. This is
++ * simple and safe for all architectures.
++ */
++ len = strlen(name) + 1;
++ rem = event_size - len;
++ if (len > event_size) {
++ rem = event_size - (len % event_size);
++ if (len % event_size == 0)
++ rem = 0;
++ }
++ len += rem;
++
++ kevent->name = kmalloc(len, GFP_KERNEL);
++ if (unlikely(!kevent->name)) {
<<Diff was trimmed, longer than 597 lines>>
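The patch text above describes two things that are easy to get wrong from user-space: the filename that follows each event is padded to a multiple of the event structure's size, and a single read(2) may return several such variable-length records back to back. What follows is only a minimal user-space sketch of both, not code from the patch. The struct sketch_event layout and the helper names padded_name_len() and count_events() are hypothetical; the real structure is defined in <linux/inotify.h>, which is not visible in the trimmed diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for the fixed-size event header. The field
 * names follow the uses seen in kernel_event() (wd, mask, cookie) plus
 * the 'len' described in the documentation; the real definition in
 * <linux/inotify.h> may differ.
 */
struct sketch_event {
	int32_t  wd;      /* watch descriptor */
	uint32_t mask;    /* event mask */
	uint32_t cookie;  /* cookie tying related events together */
	uint32_t len;     /* padded length of the name that follows */
};

/*
 * Round strlen(name) + 1 up to the next multiple of event_size, which
 * is the rounding the comment in kernel_event() describes.
 */
static size_t padded_name_len(const char *name, size_t event_size)
{
	size_t len, rem;

	assert(name != NULL && event_size > 0);
	len = strlen(name) + 1;
	rem = len % event_size;
	return rem ? len + (event_size - rem) : len;
}

/*
 * Walk a buffer as it might come back from read(2) and count the
 * complete records in it. Each record is a header followed by 'len'
 * bytes of padded name. A truncated trailing record is not counted.
 */
static size_t count_events(const char *buf, size_t nbytes)
{
	size_t off = 0, n = 0;

	while (off + sizeof(struct sketch_event) <= nbytes) {
		struct sketch_event ev;

		/* memcpy avoids any alignment assumption about buf */
		memcpy(&ev, buf + off, sizeof(ev));
		if (ev.len > nbytes - off - sizeof(ev))
			break;
		off += sizeof(ev) + ev.len;
		n++;
	}
	return n;
}
```

Because the padded name length is always a multiple of the header size in this scheme, every record in the buffer starts at an offset that is itself a multiple of the header size, which is what keeps an array of events properly aligned.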