Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

fscache: Rewrite documentation

Rewrite the fscache documentation.

Changes
=======
ver #3:
- The volume coherency data is now an arbitrarily-sized blob, not a u64.

ver #2:
- Put quoting around some bits of C being referred to in the docs[1].
- Stripped the markup off the ref to the netfs lib doc[2].

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
cc: linux-cachefs@redhat.com
Link: https://lore.kernel.org/r/20211130175119.63d0e7aa@canb.auug.org.au/ [1]
Link: https://lore.kernel.org/r/20211130162311.105fcfa5@canb.auug.org.au/ [2]
Link: https://lore.kernel.org/r/163819672252.215744.15454333549935901588.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/163906986754.143852.17703291789683936950.stgit@warthog.procyon.org.uk/ # v2
Link: https://lore.kernel.org/r/163967193834.1823006.15991526817786159772.stgit@warthog.procyon.org.uk/ # v3
Link: https://lore.kernel.org/r/164021585970.640689.3162537597817521032.stgit@warthog.procyon.org.uk/ # v4

+934 -2364
+359 -607
Documentation/filesystems/caching/backend-api.rst
··· 1 1 .. SPDX-License-Identifier: GPL-2.0 2 2 3 - ========================== 4 - FS-Cache Cache backend API 5 - ========================== 3 + ================= 4 + Cache Backend API 5 + ================= 6 6 7 7 The FS-Cache system provides an API by which actual caches can be supplied to 8 8 FS-Cache for it to then serve out to network filesystems and other interested 9 - parties. 9 + parties. This API is used by:: 10 10 11 - This API is declared in <linux/fscache-cache.h>. 12 - 13 - 14 - Initialising and Registering a Cache 15 - ==================================== 16 - 17 - To start off, a cache definition must be initialised and registered for each 18 - cache the backend wants to make available. For instance, CacheFS does this in 19 - the fill_super() operation on mounting. 20 - 21 - The cache definition (struct fscache_cache) should be initialised by calling:: 22 - 23 - void fscache_init_cache(struct fscache_cache *cache, 24 - struct fscache_cache_ops *ops, 25 - const char *idfmt, 26 - ...); 27 - 28 - Where: 29 - 30 - * "cache" is a pointer to the cache definition; 31 - 32 - * "ops" is a pointer to the table of operations that the backend supports on 33 - this cache; and 34 - 35 - * "idfmt" is a format and printf-style arguments for constructing a label 36 - for the cache. 11 + #include <linux/fscache-cache.h>. 
37 12 38 13 39 - The cache should then be registered with FS-Cache by passing a pointer to the 40 - previously initialised cache definition to:: 14 + Overview 15 + ======== 16 + 17 + Interaction with the API is handled on three levels: cache, volume and data 18 + storage, and each level has its own type of cookie object: 19 + 20 + ======================= ======================= 21 + COOKIE C TYPE 22 + ======================= ======================= 23 + Cache cookie struct fscache_cache 24 + Volume cookie struct fscache_volume 25 + Data storage cookie struct fscache_cookie 26 + ======================= ======================= 27 + 28 + Cookies are used to provide some filesystem data to the cache, manage state and 29 + pin the cache during access in addition to acting as reference points for the 30 + API functions. Each cookie has a debugging ID that is included in trace points 31 + to make it easier to correlate traces. Note, though, that debugging IDs are 32 + simply allocated from incrementing counters and will eventually wrap. 33 + 34 + The cache backend and the network filesystem can both ask for cache cookies - 35 + and if they ask for one of the same name, they'll get the same cookie. Volume 36 + and data cookies, however, are created at the behest of the filesystem only. 37 + 38 + 39 + Cache Cookies 40 + ============= 41 + 42 + Caches are represented in the API by cache cookies. These are objects of 43 + type:: 44 + 45 + struct fscache_cache { 46 + void *cache_priv; 47 + unsigned int debug_id; 48 + char *name; 49 + ... 50 + }; 51 + 52 + There are a few fields that the cache backend might be interested in. The 53 + ``debug_id`` can be used in tracing to match lines referring to the same cache 54 + and ``name`` is the name the cache was registered with. The ``cache_priv`` 55 + member is private data provided by the cache when it is brought online. The 56 + other fields are for internal use. 
57 + 58 + 59 + Registering a Cache 60 + =================== 61 + 62 + When a cache backend wants to bring a cache online, it should first register 63 + the cache name and that will get it a cache cookie. This is done with:: 64 + 65 + struct fscache_cache *fscache_acquire_cache(const char *name); 66 + 67 + This will look up and potentially create a cache cookie. The cache cookie may 68 + have already been created by a network filesystem looking for it, in which case 69 + that cache cookie will be used. If the cache cookie is not in use by another 70 + cache, it will be moved into the preparing state, otherwise it will return 71 + busy. 72 + 73 + If successful, the cache backend can then start setting up the cache. In the 74 + event that the initialisation fails, the cache backend should call:: 75 + 76 + void fscache_relinquish_cache(struct fscache_cache *cache); 77 + 78 + to reset and discard the cookie. 79 + 80 + 81 + Bringing a Cache Online 82 + ======================= 83 + 84 + Once the cache is set up, it can be brought online by calling:: 41 85 42 86 int fscache_add_cache(struct fscache_cache *cache, 43 - struct fscache_object *fsdef, 44 - const char *tagname); 87 + const struct fscache_cache_ops *ops, 88 + void *cache_priv); 45 89 46 - Two extra arguments should also be supplied: 47 - 48 - * "fsdef" which should point to the object representation for the FS-Cache 49 - master index in this cache. Netfs primary index entries will be created 50 - here. FS-Cache keeps the caller's reference to the index object if 51 - successful and will release it upon withdrawal of the cache. 52 - 53 - * "tagname" which, if given, should be a text string naming this cache. If 54 - this is NULL, the identifier will be used instead. For CacheFS, the 55 - identifier is set to name the underlying block device and the tag can be 56 - supplied by mount. 57 - 58 - This function may return -ENOMEM if it ran out of memory or -EEXIST if the tag 59 - is already in use. 
0 will be returned on success. 90 + This stores the cache operations table pointer and cache private data into the 91 + cache cookie and moves the cache to the active state, thereby allowing accesses 92 + to take place. 60 93 61 94 62 - Unregistering a Cache 63 - ===================== 95 + Withdrawing a Cache From Service 96 + ================================ 64 97 65 - A cache can be withdrawn from the system by calling this function with a 66 - pointer to the cache definition:: 98 + The cache backend can withdraw a cache from service by calling this function:: 67 99 68 100 void fscache_withdraw_cache(struct fscache_cache *cache); 69 101 70 - In CacheFS's case, this is called by put_super(). 102 + This moves the cache to the withdrawn state to prevent new cache- and 103 + volume-level accesses from starting and then waits for outstanding cache-level 104 + accesses to complete. 105 + 106 + The cache must then go through the data storage objects it has and tell fscache 107 + to withdraw them, calling:: 108 + 109 + void fscache_withdraw_cookie(struct fscache_cookie *cookie); 110 + 111 + on the cookie that each object belongs to. This schedules the specified cookie 112 + for withdrawal. This gets offloaded to a workqueue. The cache backend can 113 + test for completion by calling:: 114 + 115 + bool fscache_are_objects_withdrawn(struct fscache_cookie *cache); 116 + 117 + Once all the cookies are withdrawn, a cache backend can withdraw all the 118 + volumes, calling:: 119 + 120 + void fscache_withdraw_volume(struct fscache_volume *volume); 121 + 122 + to tell fscache that a volume has been withdrawn. This waits for all 123 + outstanding accesses on the volume to complete before returning. 124 + 125 + When the cache is completely withdrawn, fscache should be notified by 126 + calling:: 127 + 128 + void fscache_relinquish_cache(struct fscache_cache *cache); 129 + 130 + to clear fields in the cookie and discard the caller's ref on it. 
71 131 72 132 73 - Security 74 - ======== 133 + Volume Cookies 134 + ============== 75 135 76 - The cache methods are executed one of two contexts: 136 + Within a cache, the data storage objects are organised into logical volumes. 137 + These are represented in the API as objects of type:: 77 138 78 - (1) that of the userspace process that issued the netfs operation that caused 79 - the cache method to be invoked, or 139 + struct fscache_volume { 140 + struct fscache_cache *cache; 141 + void *cache_priv; 142 + unsigned int debug_id; 143 + char *key; 144 + unsigned int key_hash; 145 + ... 146 + u8 coherency_len; 147 + u8 coherency[]; 148 + }; 80 149 81 - (2) that of one of the processes in the FS-Cache thread pool. 150 + There are a number of fields here that are of interest to the caching backend: 82 151 83 - In either case, this may not be an appropriate context in which to access the 84 - cache. 152 + * ``cache`` - The parent cache cookie. 85 153 86 - The calling process's fsuid, fsgid and SELinux security identities may need to 87 - be masqueraded for the duration of the cache driver's access to the cache. 88 - This is left to the cache to handle; FS-Cache makes no effort in this regard. 154 + * ``cache_priv`` - A place for the cache to stash private data. 155 + 156 + * ``debug_id`` - A debugging ID for logging in tracepoints. 157 + 158 + * ``key`` - A printable string with no '/' characters in it that represents 159 + the index key for the volume. The key is NUL-terminated and padded out to 160 + a multiple of 4 bytes. 161 + 162 + * ``key_hash`` - A hash of the index key. This should work out the same, no 163 + matter the cpu arch and endianness. 164 + 165 + * ``coherency`` - A piece of coherency data that should be checked when the 166 + volume is bound to in the cache. 167 + 168 + * ``coherency_len`` - The amount of data in the coherency buffer. 
89 169 90 170 91 - Control and Statistics Presentation 92 - =================================== 171 + Data Storage Cookies 172 + ==================== 93 173 94 - The cache may present data to the outside world through FS-Cache's interfaces 95 - in sysfs and procfs - the former for control and the latter for statistics. 96 - 97 - A sysfs directory called /sys/fs/fscache/<cachetag>/ is created if CONFIG_SYSFS 98 - is enabled. This is accessible through the kobject struct fscache_cache::kobj 99 - and is for use by the cache as it sees fit. 100 - 101 - 102 - Relevant Data Structures 103 - ======================== 104 - 105 - * Index/Data file FS-Cache representation cookie:: 174 + A volume is a logical group of data storage objects, each of which is 175 + represented to the network filesystem by a cookie. Cookies are represented in 176 + the API as objects of type:: 106 177 107 178 struct fscache_cookie { 108 - struct fscache_object_def *def; 109 - struct fscache_netfs *netfs; 110 - void *netfs_data; 179 + struct fscache_volume *volume; 180 + void *cache_priv; 181 + unsigned long flags; 182 + unsigned int debug_id; 183 + unsigned int inval_counter; 184 + loff_t object_size; 185 + u8 advice; 186 + u32 key_hash; 187 + u8 key_len; 188 + u8 aux_len; 111 189 ... 112 190 }; 113 191 114 - The fields that might be of use to the backend describe the object 115 - definition, the netfs definition and the netfs's data for this cookie. 116 - The object definition contain functions supplied by the netfs for loading 117 - and matching index entries; these are required to provide some of the 118 - cache operations. 192 + The fields in the cookie that are of interest to the cache backend are: 193 + 194 + * ``volume`` - The parent volume cookie. 195 + 196 + * ``cache_priv`` - A place for the cache to stash private data. 
197 + 198 + * ``flags`` - A collection of bit flags, including: 199 + 200 + * FSCACHE_COOKIE_NO_DATA_TO_READ - There is no data available in the 201 + cache to be read as the cookie has been created or invalidated. 202 + 203 + * FSCACHE_COOKIE_NEEDS_UPDATE - The coherency data and/or object size has 204 + been changed and needs committing. 205 + 206 + * FSCACHE_COOKIE_LOCAL_WRITE - The netfs's data has been modified 207 + locally, so the cache object may be in an incoherent state with respect 208 + to the server. 209 + 210 + * FSCACHE_COOKIE_HAVE_DATA - The backend should set this if it 211 + successfully stores data into the cache. 212 + 213 + * FSCACHE_COOKIE_RETIRED - The cookie was invalidated when it was 214 + relinquished and the cached data should be discarded. 215 + 216 + * ``debug_id`` - A debugging ID for logging in tracepoints. 217 + 218 + * ``inval_counter`` - The number of invalidations done on the cookie. 219 + 220 + * ``advice`` - Information about how the cookie is to be used. 221 + 222 + * ``key_hash`` - A hash of the index key. This should work out the same, no 223 + matter the cpu arch and endianness. 224 + 225 + * ``key_len`` - The length of the index key. 226 + 227 + * ``aux_len`` - The length of the coherency data buffer. 228 + 229 + Each cookie has an index key, which may be stored inline to the cookie or 230 + elsewhere. A pointer to this can be obtained by calling:: 231 + 232 + void *fscache_get_key(struct fscache_cookie *cookie); 233 + 234 + The index key is a binary blob, the storage for which is padded out to a 235 + multiple of 4 bytes. 236 + 237 + Each cookie also has a buffer for coherency data. This may also be inline or 238 + detached from the cookie and a pointer is obtained by calling:: 239 + 240 + void *fscache_get_aux(struct fscache_cookie *cookie); 119 241 120 242 121 - * In-cache object representation:: 122 243 123 - struct fscache_object { 124 - int debug_id; 125 - enum { 126 - FSCACHE_OBJECT_RECYCLING, 127 - ... 
128 - } state; 129 - spinlock_t lock 130 - struct fscache_cache *cache; 131 - struct fscache_cookie *cookie; 244 + Cookie Accounting 245 + ================= 246 + 247 + Data storage cookies are counted and this is used to block cache withdrawal 248 + completion until all objects have been destroyed. The following functions are 249 + provided to the cache to deal with that:: 250 + 251 + void fscache_count_object(struct fscache_cache *cache); 252 + void fscache_uncount_object(struct fscache_cache *cache); 253 + void fscache_wait_for_objects(struct fscache_cache *cache); 254 + 255 + The count function records the allocation of an object in a cache and the 256 + uncount function records its destruction. Warning: by the time the uncount 257 + function returns, the cache may have been destroyed. 258 + 259 + The wait function can be used during the withdrawal procedure to wait for 260 + fscache to finish withdrawing all the objects in the cache. When it completes, 261 + there will be no remaining objects referring to the cache object or any volume 262 + objects. 263 + 264 + 265 + Cache Management API 266 + ==================== 267 + 268 + The cache backend implements the cache management API by providing a table of 269 + operations that fscache can use to manage various aspects of the cache. These 270 + are held in a structure of type:: 271 + 272 + struct fscache_cache_ops { 273 + const char *name; 132 274 ... 133 275 }; 134 276 135 - Structures of this type should be allocated by the cache backend and 136 - passed to FS-Cache when requested by the appropriate cache operation. In 137 - the case of CacheFS, they're embedded in CacheFS's internal object 138 - structures. 277 + This contains a printable name for the cache backend driver plus a number of 278 + pointers to methods to allow fscache to request management of the cache: 139 279 140 - The debug_id is a simple integer that can be used in debugging messages 141 - that refer to a particular object. 
In such a case it should be printed 142 - using "OBJ%x" to be consistent with FS-Cache. 280 + * Set up a volume cookie [optional]:: 143 281 144 - Each object contains a pointer to the cookie that represents the object it 145 - is backing. An object should retired when put_object() is called if it is 146 - in state FSCACHE_OBJECT_RECYCLING. The fscache_object struct should be 147 - initialised by calling fscache_object_init(object). 282 + void (*acquire_volume)(struct fscache_volume *volume); 148 283 284 + This method is called when a volume cookie is being created. The caller 285 + holds a cache-level access pin to prevent the cache from going away for 286 + the duration. This method should set up the resources to access a volume 287 + in the cache and should not return until it has done so. 149 288 150 - * FS-Cache operation record:: 289 + If successful, it can set ``cache_priv`` to its own data. 151 290 152 - struct fscache_operation { 153 - atomic_t usage; 154 - struct fscache_object *object; 155 - unsigned long flags; 156 - #define FSCACHE_OP_EXCLUSIVE 157 - void (*processor)(struct fscache_operation *op); 158 - void (*release)(struct fscache_operation *op); 159 - ... 160 - }; 161 291 162 - FS-Cache has a pool of threads that it uses to give CPU time to the 163 - various asynchronous operations that need to be done as part of driving 164 - the cache. These are represented by the above structure. The processor 165 - method is called to give the op CPU time, and the release method to get 166 - rid of it when its usage count reaches 0. 292 + * Clean up volume cookie [optional]:: 167 293 168 - An operation can be made exclusive upon an object by setting the 169 - appropriate flag before enqueuing it with fscache_enqueue_operation(). If 170 - an operation needs more processing time, it should be enqueued again. 
294 + void (*free_volume)(struct fscache_volume *volume); 171 295 296 + This method is called when a volume cookie is being released if 297 + ``cache_priv`` is set. 172 298 173 - * FS-Cache retrieval operation record:: 174 299 175 - struct fscache_retrieval { 176 - struct fscache_operation op; 177 - struct address_space *mapping; 178 - struct list_head *to_do; 179 - ... 180 - }; 300 + * Look up a cookie in the cache [mandatory]:: 181 301 182 - A structure of this type is allocated by FS-Cache to record retrieval and 183 - allocation requests made by the netfs. This struct is then passed to the 184 - backend to do the operation. The backend may get extra refs to it by 185 - calling fscache_get_retrieval() and refs may be discarded by calling 186 - fscache_put_retrieval(). 302 + bool (*lookup_cookie)(struct fscache_cookie *cookie); 187 303 188 - A retrieval operation can be used by the backend to do retrieval work. To 189 - do this, the retrieval->op.processor method pointer should be set 190 - appropriately by the backend and fscache_enqueue_retrieval() called to 191 - submit it to the thread pool. CacheFiles, for example, uses this to queue 192 - page examination when it detects PG_lock being cleared. 304 + This method is called to look up/create the resources needed to access the 305 + data storage for a cookie. It is called from a worker thread with a 306 + volume-level access pin in the cache to prevent it from being withdrawn. 193 307 194 - The to_do field is an empty list available for the cache backend to use as 195 - it sees fit. 308 + True should be returned if successful and false otherwise. If false is 309 + returned, the withdraw_cookie op (see below) will be called. 196 310 311 + If lookup fails, but the object could still be created (e.g. 
it hasn't 312 + been cached before), then:: 197 313 198 - * FS-Cache storage operation record:: 314 + void fscache_cookie_lookup_negative( 315 + struct fscache_cookie *cookie); 199 316 200 - struct fscache_storage { 201 - struct fscache_operation op; 202 - pgoff_t store_limit; 203 - ... 204 - }; 317 + can be called to let the network filesystem proceed and start downloading 318 + stuff whilst the cache backend gets on with the job of creating things. 205 319 206 - A structure of this type is allocated by FS-Cache to record outstanding 207 - writes to be made. FS-Cache itself enqueues this operation and invokes 208 - the write_page() method on the object at appropriate times to effect 209 - storage. 320 + If successful, ``cookie->cache_priv`` can be set. 210 321 211 322 212 - Cache Operations 213 - ================ 323 + * Withdraw an object without any cookie access counts held [mandatory]:: 214 324 215 - The cache backend provides FS-Cache with a table of operations that can be 216 - performed on the denizens of the cache. These are held in a structure of type: 325 + void (*withdraw_cookie)(struct fscache_cookie *cookie); 217 326 218 - :: 327 + This method is called to withdraw a cookie from service. It will be 328 + called when the cookie is relinquished by the netfs, withdrawn or culled 329 + by the cache backend or closed after a period of non-use by fscache. 219 330 220 - struct fscache_cache_ops 331 + The caller doesn't hold any access pins, but it is called from a 332 + non-reentrant work item to manage races between the various ways 333 + withdrawal can occur. 221 334 222 - * Name of cache provider [mandatory]:: 335 + The cookie will have the ``FSCACHE_COOKIE_RETIRED`` flag set on it if the 336 + associated data is to be removed from the cache. 223 337 224 - const char *name 225 338 226 - This isn't strictly an operation, but should be pointed at a string naming 227 - the backend. 
339 + * Change the size of a data storage object [mandatory]:: 228 340 341 + void (*resize_cookie)(struct netfs_cache_resources *cres, 342 + loff_t new_size); 229 343 230 - * Allocate a new object [mandatory]:: 344 + This method is called to inform the cache backend of a change in size of 345 + the netfs file due to local truncation. The cache backend should make all 346 + of the changes it needs to make before returning as this is done under the 347 + netfs inode mutex. 231 348 232 - struct fscache_object *(*alloc_object)(struct fscache_cache *cache, 233 - struct fscache_cookie *cookie) 349 + The caller holds a cookie-level access pin to prevent a race with 350 + withdrawal and the netfs must have the cookie marked in-use to prevent 351 + garbage collection or culling from removing any resources. 234 352 235 - This method is used to allocate a cache object representation to back a 236 - cookie in a particular cache. fscache_object_init() should be called on 237 - the object to initialise it prior to returning. 238 353 239 - This function may also be used to parse the index key to be used for 240 - multiple lookup calls to turn it into a more convenient form. FS-Cache 241 - will call the lookup_complete() method to allow the cache to release the 242 - form once lookup is complete or aborted. 354 + * Invalidate a data storage object [mandatory]:: 243 355 356 + bool (*invalidate_cookie)(struct fscache_cookie *cookie); 244 357 245 - * Look up and create object [mandatory]:: 358 + This is called when the network filesystem detects a third-party 359 + modification or when an O_DIRECT write is made locally. This requests 360 + that the cache backend should throw away all the data in the cache for 361 + this object and start afresh. It should return true if successful and 362 + false otherwise. 246 363 247 - void (*lookup_object)(struct fscache_object *object) 364 + On entry, new I/O operations are blocked. 
Once the cache is in a position 365 + to accept I/O again, the backend should release the block by calling:: 248 366 249 - This method is used to look up an object, given that the object is already 250 - allocated and attached to the cookie. This should instantiate that object 251 - in the cache if it can. 367 + void fscache_resume_after_invalidation(struct fscache_cookie *cookie); 252 368 253 - The method should call fscache_object_lookup_negative() as soon as 254 - possible if it determines the object doesn't exist in the cache. If the 255 - object is found to exist and the netfs indicates that it is valid then 256 - fscache_obtained_object() should be called once the object is in a 257 - position to have data stored in it. Similarly, fscache_obtained_object() 258 - should also be called once a non-present object has been created. 369 + If the method returns false, caching will be withdrawn for this cookie. 259 370 260 - If a lookup error occurs, fscache_object_lookup_error() should be called 261 - to abort the lookup of that object. 262 371 372 + * Prepare to make local modifications to the cache [mandatory]:: 263 373 264 - * Release lookup data [mandatory]:: 374 + void (*prepare_to_write)(struct fscache_cookie *cookie); 265 375 266 - void (*lookup_complete)(struct fscache_object *object) 376 + This method is called when the network filesystem finds that it is going 377 + to need to modify the contents of the cache due to local writes or 378 + truncations. This gives the cache a chance to note that a cache object 379 + may be incoherent with respect to the server and may need writing back 380 + later. This may also cause the cached data to be scrapped on later 381 + rebinding if not properly committed. 267 382 268 - This method is called to ask the cache to release any resources it was 269 - using to perform a lookup. 
270 383 384 + * Begin an operation for the netfs lib [mandatory]:: 271 385 272 - * Increment object refcount [mandatory]:: 386 + bool (*begin_operation)(struct netfs_cache_resources *cres, 387 + enum fscache_want_state want_state); 273 388 274 - struct fscache_object *(*grab_object)(struct fscache_object *object) 389 + This method is called when an I/O operation is being set up (read, write 390 + or resize). The caller holds an access pin on the cookie and must have 391 + marked the cookie as in-use. 275 392 276 - This method is called to increment the reference count on an object. It 277 - may fail (for instance if the cache is being withdrawn) by returning NULL. 278 - It should return the object pointer if successful. 393 + If it can, the backend should attach any resources it needs to keep around 394 + to the netfs_cache_resources object and return true. 279 395 396 + If it can't complete the setup, it should return false. 280 397 281 - * Lock/Unlock object [mandatory]:: 398 + The want_state parameter indicates the state the caller needs the cache 399 + object to be in and what it wants to do during the operation: 282 400 283 - void (*lock_object)(struct fscache_object *object) 284 - void (*unlock_object)(struct fscache_object *object) 401 + * ``FSCACHE_WANT_PARAMS`` - The caller just wants to access cache 402 + object parameters; it doesn't need to do data I/O yet. 285 403 286 - These methods are used to exclusively lock an object. It must be possible 287 - to schedule with the lock held, so a spinlock isn't sufficient. 404 + * ``FSCACHE_WANT_READ`` - The caller wants to read data. 288 405 406 + * ``FSCACHE_WANT_WRITE`` - The caller wants to write to or resize the 407 + cache object. 289 408 290 - * Pin/Unpin object [optional]:: 409 + Note that there won't necessarily be anything attached to the cookie's 410 + cache_priv yet if the cookie is still being created. 
291 411 292 - int (*pin_object)(struct fscache_object *object) 293 - void (*unpin_object)(struct fscache_object *object) 294 412 295 - These methods are used to pin an object into the cache. Once pinned an 296 - object cannot be reclaimed to make space. Return -ENOSPC if there's not 297 - enough space in the cache to permit this. 413 + Data I/O API 414 + ============ 298 415 416 + A cache backend provides a data I/O API through the netfs library's ``struct 417 + netfs_cache_ops`` attached to a ``struct netfs_cache_resources`` by the 418 + ``begin_operation`` method described above. 299 419 300 - * Check coherency state of an object [mandatory]:: 420 + See Documentation/filesystems/netfs_library.rst for a description. 301 421 302 - int (*check_consistency)(struct fscache_object *object) 303 422 304 - This method is called to have the cache check the saved auxiliary data of 305 - the object against the netfs's idea of the state. 0 should be returned 306 - if they're consistent and -ESTALE otherwise. -ENOMEM and -ERESTARTSYS 307 - may also be returned. 308 - 309 - * Update object [mandatory]:: 310 - 311 - int (*update_object)(struct fscache_object *object) 312 - 313 - This is called to update the index entry for the specified object. The 314 - new information should be in object->cookie->netfs_data. This can be 315 - obtained by calling object->cookie->def->get_aux()/get_attr(). 316 - 317 - 318 - * Invalidate data object [mandatory]:: 319 - 320 - int (*invalidate_object)(struct fscache_operation *op) 321 - 322 - This is called to invalidate a data object (as pointed to by op->object). 323 - All the data stored for this object should be discarded and an 324 - attr_changed operation should be performed. The caller will follow up 325 - with an object update operation. 326 - 327 - fscache_op_complete() must be called on op before returning. 
328 - 329 - 330 - * Discard object [mandatory]:: 331 - 332 - void (*drop_object)(struct fscache_object *object) 333 - 334 - This method is called to indicate that an object has been unbound from its 335 - cookie, and that the cache should release the object's resources and 336 - retire it if it's in state FSCACHE_OBJECT_RECYCLING. 337 - 338 - This method should not attempt to release any references held by the 339 - caller. The caller will invoke the put_object() method as appropriate. 340 - 341 - 342 - * Release object reference [mandatory]:: 343 - 344 - void (*put_object)(struct fscache_object *object) 345 - 346 - This method is used to discard a reference to an object. The object may 347 - be freed when all the references to it are released. 348 - 349 - 350 - * Synchronise a cache [mandatory]:: 351 - 352 - void (*sync)(struct fscache_cache *cache) 353 - 354 - This is called to ask the backend to synchronise a cache with its backing 355 - device. 356 - 357 - 358 - * Dissociate a cache [mandatory]:: 359 - 360 - void (*dissociate_pages)(struct fscache_cache *cache) 361 - 362 - This is called to ask a cache to perform any page dissociations as part of 363 - cache withdrawal. 364 - 365 - 366 - * Notification that the attributes on a netfs file changed [mandatory]:: 367 - 368 - int (*attr_changed)(struct fscache_object *object); 369 - 370 - This is called to indicate to the cache that certain attributes on a netfs 371 - file have changed (for example the maximum size a file may reach). The 372 - cache can read these from the netfs by calling the cookie's get_attr() 373 - method. 374 - 375 - The cache may use the file size information to reserve space on the cache. 376 - It should also call fscache_set_store_limit() to indicate to FS-Cache the 377 - highest byte it's willing to store for an object. 378 - 379 - This method may return -ve if an error occurred or the cache object cannot 380 - be expanded. In such a case, the object will be withdrawn from service. 
381 - 382 - This operation is run asynchronously from FS-Cache's thread pool, and 383 - storage and retrieval operations from the netfs are excluded during the 384 - execution of this operation. 385 - 386 - 387 - * Reserve cache space for an object's data [optional]:: 388 - 389 - int (*reserve_space)(struct fscache_object *object, loff_t size); 390 - 391 - This is called to request that cache space be reserved to hold the data 392 - for an object and the metadata used to track it. Zero size should be 393 - taken as request to cancel a reservation. 394 - 395 - This should return 0 if successful, -ENOSPC if there isn't enough space 396 - available, or -ENOMEM or -EIO on other errors. 397 - 398 - The reservation may exceed the current size of the object, thus permitting 399 - future expansion. If the amount of space consumed by an object would 400 - exceed the reservation, it's permitted to refuse requests to allocate 401 - pages, but not required. An object may be pruned down to its reservation 402 - size if larger than that already. 403 - 404 - 405 - * Request page be read from cache [mandatory]:: 406 - 407 - int (*read_or_alloc_page)(struct fscache_retrieval *op, 408 - struct page *page, 409 - gfp_t gfp) 410 - 411 - This is called to attempt to read a netfs page from the cache, or to 412 - reserve a backing block if not. FS-Cache will have done as much checking 413 - as it can before calling, but most of the work belongs to the backend. 414 - 415 - If there's no page in the cache, then -ENODATA should be returned if the 416 - backend managed to reserve a backing block; -ENOBUFS or -ENOMEM if it 417 - didn't. 418 - 419 - If there is suitable data in the cache, then a read operation should be 420 - queued and 0 returned. When the read finishes, fscache_end_io() should be 421 - called. 422 - 423 - The fscache_mark_pages_cached() should be called for the page if any cache 424 - metadata is retained. 
This will indicate to the netfs that the page needs 425 - explicit uncaching. This operation takes a pagevec, thus allowing several 426 - pages to be marked at once. 427 - 428 - The retrieval record pointed to by op should be retained for each page 429 - queued and released when I/O on the page has been formally ended. 430 - fscache_get/put_retrieval() are available for this purpose. 431 - 432 - The retrieval record may be used to get CPU time via the FS-Cache thread 433 - pool. If this is desired, the op->op.processor should be set to point to 434 - the appropriate processing routine, and fscache_enqueue_retrieval() should 435 - be called at an appropriate point to request CPU time. For instance, the 436 - retrieval routine could be enqueued upon the completion of a disk read. 437 - The to_do field in the retrieval record is provided to aid in this. 438 - 439 - If an I/O error occurs, fscache_io_error() should be called and -ENOBUFS 440 - returned if possible or fscache_end_io() called with a suitable error 441 - code. 442 - 443 - fscache_put_retrieval() should be called after a page or pages are dealt 444 - with. This will complete the operation when all pages are dealt with. 445 - 446 - 447 - * Request pages be read from cache [mandatory]:: 448 - 449 - int (*read_or_alloc_pages)(struct fscache_retrieval *op, 450 - struct list_head *pages, 451 - unsigned *nr_pages, 452 - gfp_t gfp) 453 - 454 - This is like the read_or_alloc_page() method, except it is handed a list 455 - of pages instead of one page. Any pages on which a read operation is 456 - started must be added to the page cache for the specified mapping and also 457 - to the LRU. Such pages must also be removed from the pages list and 458 - ``*nr_pages`` decremented per page. 
459 - 460 - If there was an error such as -ENOMEM, then that should be returned; else 461 - if one or more pages couldn't be read or allocated, then -ENOBUFS should 462 - be returned; else if one or more pages couldn't be read, then -ENODATA 463 - should be returned. If all the pages are dispatched then 0 should be 464 - returned. 465 - 466 - 467 - * Request page be allocated in the cache [mandatory]:: 468 - 469 - int (*allocate_page)(struct fscache_retrieval *op, 470 - struct page *page, 471 - gfp_t gfp) 472 - 473 - This is like the read_or_alloc_page() method, except that it shouldn't 474 - read from the cache, even if there's data there that could be retrieved. 475 - It should, however, set up any internal metadata required such that 476 - the write_page() method can write to the cache. 477 - 478 - If there's no backing block available, then -ENOBUFS should be returned 479 - (or -ENOMEM if there were other problems). If a block is successfully 480 - allocated, then the netfs page should be marked and 0 returned. 481 - 482 - 483 - * Request pages be allocated in the cache [mandatory]:: 484 - 485 - int (*allocate_pages)(struct fscache_retrieval *op, 486 - struct list_head *pages, 487 - unsigned *nr_pages, 488 - gfp_t gfp) 489 - 490 - This is an multiple page version of the allocate_page() method. pages and 491 - nr_pages should be treated as for the read_or_alloc_pages() method. 492 - 493 - 494 - * Request page be written to cache [mandatory]:: 495 - 496 - int (*write_page)(struct fscache_storage *op, 497 - struct page *page); 498 - 499 - This is called to write from a page on which there was a previously 500 - successful read_or_alloc_page() call or similar. FS-Cache filters out 501 - pages that don't have mappings. 502 - 503 - This method is called asynchronously from the FS-Cache thread pool. It is 504 - not required to actually store anything, provided -ENODATA is then 505 - returned to the next read of this page. 
506 - 507 - If an error occurred, then a negative error code should be returned, 508 - otherwise zero should be returned. FS-Cache will take appropriate action 509 - in response to an error, such as withdrawing this object. 510 - 511 - If this method returns success then FS-Cache will inform the netfs 512 - appropriately. 513 - 514 - 515 - * Discard retained per-page metadata [mandatory]:: 516 - 517 - void (*uncache_page)(struct fscache_object *object, struct page *page) 518 - 519 - This is called when a netfs page is being evicted from the pagecache. The 520 - cache backend should tear down any internal representation or tracking it 521 - maintains for this page. 522 - 523 - 524 - FS-Cache Utilities 525 - ================== 423 + Miscellaneous Functions 424 + ======================= 526 425 527 426 FS-Cache provides some utilities that a cache backend may make use of: 528 427 529 428 * Note occurrence of an I/O error in a cache:: 530 429 531 - void fscache_io_error(struct fscache_cache *cache) 430 + void fscache_io_error(struct fscache_cache *cache); 532 431 533 - This tells FS-Cache that an I/O error occurred in the cache. After this 534 - has been called, only resource dissociation operations (object and page 535 - release) will be passed from the netfs to the cache backend for the 536 - specified cache. 432 + This tells FS-Cache that an I/O error occurred in the cache. This 433 + prevents any new I/O from being started on the cache. 537 434 538 435 This does not actually withdraw the cache. That must be done separately. 

+  * Note cessation of caching on a cookie due to failure::

-  * Invoke the retrieval I/O completion function::
+        void fscache_caching_failed(struct fscache_cookie *cookie);

-        void fscache_end_io(struct fscache_retrieval *op, struct page *page,
-                            int error);
+     This notes that a the caching that was being done on a cookie failed in
+     some way, for instance the backing storage failed to be created or
+     invalidation failed and that no further I/O operations should take place
+     on it until the cache is reset.

-     This is called to note the end of an attempt to retrieve a page. The
-     error value should be 0 if successful and an error otherwise.
+  * Count I/O requests::

+        void fscache_count_read(void);
+        void fscache_count_write(void);

-  * Record that one or more pages being retrieved or allocated have been dealt
-    with::
+     These record reads and writes from/to the cache. The numbers are
+     displayed in /proc/fs/fscache/stats.

-        void fscache_retrieval_complete(struct fscache_retrieval *op,
-                                        int n_pages);
+  * Count out-of-space errors::

-     This is called to record the fact that one or more pages have been dealt
-     with and are no longer the concern of this operation. When the number of
-     pages remaining in the operation reaches 0, the operation will be
-     completed.
+        void fscache_count_no_write_space(void);
+        void fscache_count_no_create_space(void);

+     These record ENOSPC errors in the cache, divided into failures of data
+     writes and failures of filesystem object creations (e.g. mkdir).

-  * Record operation completion::
+  * Count objects culled::

-        void fscache_op_complete(struct fscache_operation *op);
+        void fscache_count_culled(void);

-     This is called to record the completion of an operation. This deducts
-     this operation from the parent object's run state, potentially permitting
-     one or more pending operations to start running.
+     This records the culling of an object.

+  * Get the cookie from a set of cache resources::

-  * Set highest store limit::
+        struct fscache_cookie *fscache_cres_cookie(struct netfs_cache_resources *cres)

-        void fscache_set_store_limit(struct fscache_object *object,
-                                     loff_t i_size);
+     Pull a pointer to the cookie from the cache resources. This may return a
+     NULL cookie if no cookie was set.

-     This sets the limit FS-Cache imposes on the highest byte it's willing to
-     try and store for a netfs. Any page over this limit is automatically
-     rejected by fscache_read_alloc_page() and co with -ENOBUFS.

+API Function Reference
+======================

-  * Mark pages as being cached::
-
-        void fscache_mark_pages_cached(struct fscache_retrieval *op,
-                                       struct pagevec *pagevec);
-
-     This marks a set of pages as being cached. After this has been called,
-     the netfs must call fscache_uncache_page() to unmark the pages.
-
-
-  * Perform coherency check on an object::
-
-        enum fscache_checkaux fscache_check_aux(struct fscache_object *object,
-                                                const void *data,
-                                                uint16_t datalen);
-
-     This asks the netfs to perform a coherency check on an object that has
-     just been looked up. The cookie attached to the object will determine the
-     netfs to use. data and datalen should specify where the auxiliary data
-     retrieved from the cache can be found.
-
-     One of three values will be returned:
-
-        FSCACHE_CHECKAUX_OKAY
-                The coherency data indicates the object is valid as is.
-
-        FSCACHE_CHECKAUX_NEEDS_UPDATE
-                The coherency data needs updating, but otherwise the object is
-                valid.
609 - 610 - FSCACHE_CHECKAUX_OBSOLETE 611 - The coherency data indicates that the object is obsolete and should 612 - be discarded. 613 - 614 - 615 - * Initialise a freshly allocated object:: 616 - 617 - void fscache_object_init(struct fscache_object *object); 618 - 619 - This initialises all the fields in an object representation. 620 - 621 - 622 - * Indicate the destruction of an object:: 623 - 624 - void fscache_object_destroyed(struct fscache_cache *cache); 625 - 626 - This must be called to inform FS-Cache that an object that belonged to a 627 - cache has been destroyed and deallocated. This will allow continuation 628 - of the cache withdrawal process when it is stopped pending destruction of 629 - all the objects. 630 - 631 - 632 - * Indicate negative lookup on an object:: 633 - 634 - void fscache_object_lookup_negative(struct fscache_object *object); 635 - 636 - This is called to indicate to FS-Cache that a lookup process for an object 637 - found a negative result. 638 - 639 - This changes the state of an object to permit reads pending on lookup 640 - completion to go off and start fetching data from the netfs server as it's 641 - known at this point that there can't be any data in the cache. 642 - 643 - This may be called multiple times on an object. Only the first call is 644 - significant - all subsequent calls are ignored. 645 - 646 - 647 - * Indicate an object has been obtained:: 648 - 649 - void fscache_obtained_object(struct fscache_object *object); 650 - 651 - This is called to indicate to FS-Cache that a lookup process for an object 652 - produced a positive result, or that an object was created. This should 653 - only be called once for any particular object. 
654 - 655 - This changes the state of an object to indicate: 656 - 657 - (1) if no call to fscache_object_lookup_negative() has been made on 658 - this object, that there may be data available, and that reads can 659 - now go and look for it; and 660 - 661 - (2) that writes may now proceed against this object. 662 - 663 - 664 - * Indicate that object lookup failed:: 665 - 666 - void fscache_object_lookup_error(struct fscache_object *object); 667 - 668 - This marks an object as having encountered a fatal error (usually EIO) 669 - and causes it to move into a state whereby it will be withdrawn as soon 670 - as possible. 671 - 672 - 673 - * Indicate that a stale object was found and discarded:: 674 - 675 - void fscache_object_retrying_stale(struct fscache_object *object); 676 - 677 - This is called to indicate that the lookup procedure found an object in 678 - the cache that the netfs decided was stale. The object has been 679 - discarded from the cache and the lookup will be performed again. 680 - 681 - 682 - * Indicate that the caching backend killed an object:: 683 - 684 - void fscache_object_mark_killed(struct fscache_object *object, 685 - enum fscache_why_object_killed why); 686 - 687 - This is called to indicate that the cache backend preemptively killed an 688 - object. The why parameter should be set to indicate the reason: 689 - 690 - FSCACHE_OBJECT_IS_STALE 691 - - the object was stale and needs discarding. 692 - 693 - FSCACHE_OBJECT_NO_SPACE 694 - - there was insufficient cache space 695 - 696 - FSCACHE_OBJECT_WAS_RETIRED 697 - - the object was retired when relinquished. 698 - 699 - FSCACHE_OBJECT_WAS_CULLED 700 - - the object was culled to make space. 
701 - 702 - 703 - * Get and release references on a retrieval record:: 704 - 705 - void fscache_get_retrieval(struct fscache_retrieval *op); 706 - void fscache_put_retrieval(struct fscache_retrieval *op); 707 - 708 - These two functions are used to retain a retrieval record while doing 709 - asynchronous data retrieval and block allocation. 710 - 711 - 712 - * Enqueue a retrieval record for processing:: 713 - 714 - void fscache_enqueue_retrieval(struct fscache_retrieval *op); 715 - 716 - This enqueues a retrieval record for processing by the FS-Cache thread 717 - pool. One of the threads in the pool will invoke the retrieval record's 718 - op->op.processor callback function. This function may be called from 719 - within the callback function. 720 - 721 - 722 - * List of object state names:: 723 - 724 - const char *fscache_object_states[]; 725 - 726 - For debugging purposes, this may be used to turn the state that an object 727 - is in into a text string for display purposes. 479 + .. kernel-doc:: include/linux/fscache-cache.h
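As a rough illustration of how the helpers in the new "Miscellaneous Functions" section might be combined, the sketch below models a backend write path in user space. It is not kernel code: the `struct fscache_cache` layout, the stub bodies, the counters, and the `demo_backend_write()` helper are all hypothetical stand-ins, and returning -ENOBUFS once an I/O error has been noted is an assumption made only for this example. Only the three function names and their signatures come from the documentation above.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * User-space stand-ins for the kernel type and helpers named in the
 * documentation. The real definitions live in the kernel; these stubs
 * only record that they were called.
 */
struct fscache_cache {
	bool io_error;		/* hypothetical flag set by the stub below */
};

static unsigned int n_writes;		/* models the write counter */
static unsigned int n_no_write_space;	/* models the ENOSPC-on-write counter */

static void fscache_io_error(struct fscache_cache *cache)
{
	cache->io_error = true;	/* models "no new I/O may be started" */
}

static void fscache_count_write(void)
{
	n_writes++;
}

static void fscache_count_no_write_space(void)
{
	n_no_write_space++;
}

/*
 * Hypothetical backend helper: account for one write whose outcome from
 * the backing store is storage_result (0 or a negative errno value).
 */
static int demo_backend_write(struct fscache_cache *cache, int storage_result)
{
	if (cache->io_error)
		return -ENOBUFS;	/* assumption: refuse new I/O after an error */

	fscache_count_write();

	if (storage_result == -ENOSPC)
		fscache_count_no_write_space();	/* data write ran out of space */
	else if (storage_result == -EIO)
		fscache_io_error(cache);	/* note the error; withdrawal is separate */

	return storage_result;
}
```

In this model, a write that fails with -EIO marks the cache so that later calls are refused before they are counted, which mirrors the statement that noting an I/O error prevents new I/O but does not itself withdraw the cache.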
+3 -3
Documentation/filesystems/caching/cachefiles.rst
 .. SPDX-License-Identifier: GPL-2.0

-===============================================
-CacheFiles: CACHE ON ALREADY MOUNTED FILESYSTEM
-===============================================
+===================================
+Cache on Already Mounted Filesystem
+===================================

 .. Contents:

+151 -368
Documentation/filesystems/caching/fscache.rst
··· 10 10 This facility is a general purpose cache for network filesystems, though it 11 11 could be used for caching other things such as ISO9660 filesystems too. 12 12 13 - FS-Cache mediates between cache backends (such as CacheFS) and network 13 + FS-Cache mediates between cache backends (such as CacheFiles) and network 14 14 filesystems:: 15 15 16 16 +---------+ 17 - | | +--------------+ 18 - | NFS |--+ | | 19 - | | | +-->| CacheFS | 20 - +---------+ | +----------+ | | /dev/hda5 | 21 - | | | | +--------------+ 22 - +---------+ +-->| | | 23 - | | | |--+ 24 - | AFS |----->| FS-Cache | 25 - | | | |--+ 26 - +---------+ +-->| | | 27 - | | | | +--------------+ 28 - +---------+ | +----------+ | | | 29 - | | | +-->| CacheFiles | 30 - | ISOFS |--+ | /var/cache | 31 - | | +--------------+ 17 + | | +--------------+ 18 + | NFS |--+ | | 19 + | | | +-->| CacheFS | 20 + +---------+ | +----------+ | | /dev/hda5 | 21 + | | | | +--------------+ 22 + +---------+ +-------------->| | | 23 + | | +-------+ | |--+ 24 + | AFS |----->| | | FS-Cache | 25 + | | | netfs |-->| |--+ 26 + +---------+ +-->| lib | | | | 27 + | | | | | | +--------------+ 28 + +---------+ | +-------+ +----------+ | | | 29 + | | | +-->| CacheFiles | 30 + | 9P |--+ | /var/cache | 31 + | | +--------------+ 32 32 +---------+ 33 33 34 34 Or to look at it another way, FS-Cache is a module that provides a caching ··· 84 84 one-off access of a small portion of it (such as might be done with the 85 85 "file" program). 86 86 87 - It instead serves the cache out in PAGE_SIZE chunks as and when requested by 88 - the netfs('s) using it. 87 + It instead serves the cache out in chunks as and when requested by the netfs 88 + using it. 89 89 90 90 91 91 FS-Cache provides the following facilities: 92 92 93 - (1) More than one cache can be used at once. Caches can be selected 93 + * More than one cache can be used at once. Caches can be selected 94 94 explicitly by use of tags. 
95 95 96 - (2) Caches can be added / removed at any time. 96 + * Caches can be added / removed at any time, even whilst being accessed. 97 97 98 - (3) The netfs is provided with an interface that allows either party to 98 + * The netfs is provided with an interface that allows either party to 99 99 withdraw caching facilities from a file (required for (2)). 100 100 101 - (4) The interface to the netfs returns as few errors as possible, preferring 101 + * The interface to the netfs returns as few errors as possible, preferring 102 102 rather to let the netfs remain oblivious. 103 103 104 - (5) Cookies are used to represent indices, files and other objects to the 105 - netfs. The simplest cookie is just a NULL pointer - indicating nothing 106 - cached there. 104 + * There are three types of cookie: cache, volume and data file cookies. 105 + Cache cookies represent the cache as a whole and are not normally visible 106 + to the netfs; the netfs gets a volume cookie to represent a collection of 107 + files (typically something that a netfs would get for a superblock); and 108 + data file cookies are used to cache data (something that would be got for 109 + an inode). 107 110 108 - (6) The netfs is allowed to propose - dynamically - any index hierarchy it 109 - desires, though it must be aware that the index search function is 110 - recursive, stack space is limited, and indices can only be children of 111 - indices. 111 + * Volumes are matched using a key. This is a printable string that is used 112 + to encode all the information that might be needed to distinguish one 113 + superblock, say, from another. This would be a compound of things like 114 + cell name or server address, volume name or share path. It must be a 115 + valid pathname. 112 116 113 - (7) Data I/O is done direct to and from the netfs's pages. The netfs 114 - indicates that page A is at index B of the data-file represented by cookie 115 - C, and that it should be read or written. 
The cache backend may or may 116 - not start I/O on that page, but if it does, a netfs callback will be 117 - invoked to indicate completion. The I/O may be either synchronous or 118 - asynchronous. 117 + * Cookies are matched using a key. This is a binary blob and is used to 118 + represent the object within a volume (so the volume key need not form 119 + part of the blob). This might include things like an inode number and 120 + uniquifier or a file handle. 119 121 120 - (8) Cookies can be "retired" upon release. At this point FS-Cache will mark 121 - them as obsolete and the index hierarchy rooted at that point will get 122 - recycled. 122 + * Cookie resources are set up and pinned by marking the cookie in-use. 123 + This prevents the backing resources from being culled. Timed garbage 124 + collection is employed to eliminate cookies that haven't been used for a 125 + short while, thereby reducing resource overload. This is intended to be 126 + used when a file is opened or closed. 123 127 124 - (9) The netfs provides a "match" function for index searches. In addition to 125 - saying whether a match was made or not, this can also specify that an 126 - entry should be updated or deleted. 128 + A cookie can be marked in-use multiple times simultaneously; each mark 129 + must be unused. 127 130 128 - (10) As much as possible is done asynchronously. 131 + * Begin/end access functions are provided to delay cache withdrawal for the 132 + duration of an operation and prevent structs from being freed whilst 133 + we're looking at them. 129 134 135 + * Data I/O is done by asynchronous DIO to/from a buffer described by the 136 + netfs using an iov_iter. 130 137 131 - FS-Cache maintains a virtual indexing tree in which all indices, files, objects 132 - and pages are kept. 
Bits of this tree may actually reside in one or more 133 - caches:: 138 + * An invalidation facility is available to discard data from the cache and 139 + to deal with I/O that's in progress that is accessing old data. 134 140 135 - FSDEF 136 - | 137 - +------------------------------------+ 138 - | | 139 - NFS AFS 140 - | | 141 - +--------------------------+ +-----------+ 142 - | | | | 143 - homedir mirror afs.org redhat.com 144 - | | | 145 - +------------+ +---------------+ +----------+ 146 - | | | | | | 147 - 00001 00002 00007 00125 vol00001 vol00002 148 - | | | | | 149 - +---+---+ +-----+ +---+ +------+------+ +-----+----+ 150 - | | | | | | | | | | | | | 151 - PG0 PG1 PG2 PG0 XATTR PG0 PG1 DIRENT DIRENT DIRENT R/W R/O Bak 152 - | | 153 - PG0 +-------+ 154 - | | 155 - 00001 00003 156 - | 157 - +---+---+ 158 - | | | 159 - PG0 PG1 PG2 160 - 161 - In the example above, you can see two netfs's being backed: NFS and AFS. These 162 - have different index hierarchies: 163 - 164 - * The NFS primary index contains per-server indices. Each server index is 165 - indexed by NFS file handles to get data file objects. Each data file 166 - objects can have an array of pages, but may also have further child 167 - objects, such as extended attributes and directory entries. Extended 168 - attribute objects themselves have page-array contents. 169 - 170 - * The AFS primary index contains per-cell indices. Each cell index contains 171 - per-logical-volume indices. Each of volume index contains up to three 172 - indices for the read-write, read-only and backup mirrors of those volumes. 173 - Each of these contains vnode data file objects, each of which contains an 174 - array of pages. 175 - 176 - The very top index is the FS-Cache master index in which individual netfs's 177 - have entries. 178 - 179 - Any index object may reside in more than one cache, provided it only has index 180 - children. 
Any index with non-index object children will be assumed to only 181 - reside in one cache. 141 + * Cookies can be "retired" upon release, thereby causing the object to be 142 + removed from the cache. 182 143 183 144 184 145 The netfs API to FS-Cache can be found in: ··· 150 189 151 190 Documentation/filesystems/caching/backend-api.rst 152 191 153 - A description of the internal representations and object state machine can be 154 - found in: 155 - 156 - Documentation/filesystems/caching/object.rst 157 - 158 192 159 193 Statistical Information 160 194 ======================= ··· 157 201 If FS-Cache is compiled with the following options enabled:: 158 202 159 203 CONFIG_FSCACHE_STATS=y 160 - CONFIG_FSCACHE_HISTOGRAM=y 161 204 162 - then it will gather certain statistics and display them through a number of 163 - proc files. 205 + then it will gather certain statistics and display them through: 164 206 165 - /proc/fs/fscache/stats 166 - ---------------------- 207 + /proc/fs/fscache/stats 167 208 168 - This shows counts of a number of events that can happen in FS-Cache: 209 + This shows counts of a number of events that can happen in FS-Cache: 169 210 170 211 +--------------+-------+-------------------------------------------------------+ 171 212 |CLASS |EVENT |MEANING | 172 213 +==============+=======+=======================================================+ 173 - |Cookies |idx=N |Number of index cookies allocated | 214 + |Cookies |n=N |Number of data storage cookies allocated | 174 215 + +-------+-------------------------------------------------------+ 175 - | |dat=N |Number of data storage cookies allocated | 216 + | |v=N |Number of volume index cookies allocated | 176 217 + +-------+-------------------------------------------------------+ 177 - | |spc=N |Number of special cookies allocated | 178 - +--------------+-------+-------------------------------------------------------+ 179 - |Objects |alc=N |Number of objects allocated | 218 + | |vcol=N |Number of volume 
index key collisions | 180 219 + +-------+-------------------------------------------------------+ 181 - | |nal=N |Number of object allocation failures | 182 - + +-------+-------------------------------------------------------+ 183 - | |avl=N |Number of objects that reached the available state | 184 - + +-------+-------------------------------------------------------+ 185 - | |ded=N |Number of objects that reached the dead state | 186 - +--------------+-------+-------------------------------------------------------+ 187 - |ChkAux |non=N |Number of objects that didn't have a coherency check | 188 - + +-------+-------------------------------------------------------+ 189 - | |ok=N |Number of objects that passed a coherency check | 190 - + +-------+-------------------------------------------------------+ 191 - | |upd=N |Number of objects that needed a coherency data update | 192 - + +-------+-------------------------------------------------------+ 193 - | |obs=N |Number of objects that were declared obsolete | 194 - +--------------+-------+-------------------------------------------------------+ 195 - |Pages |mrk=N |Number of pages marked as being cached | 196 - | |unc=N |Number of uncache page requests seen | 220 + | |voom=N |Number of OOM events when allocating volume cookies | 197 221 +--------------+-------+-------------------------------------------------------+ 198 222 |Acquire |n=N |Number of acquire cookie requests seen | 199 223 + +-------+-------------------------------------------------------+ 200 - | |nul=N |Number of acq reqs given a NULL parent | 201 - + +-------+-------------------------------------------------------+ 202 - | |noc=N |Number of acq reqs rejected due to no cache available | 203 - + +-------+-------------------------------------------------------+ 204 224 | |ok=N |Number of acq reqs succeeded | 205 - + +-------+-------------------------------------------------------+ 206 - | |nbf=N |Number of acq reqs rejected due to error | 207 225 + 
+-------+-------------------------------------------------------+ 208 226 | |oom=N |Number of acq reqs failed on ENOMEM | 209 227 +--------------+-------+-------------------------------------------------------+ 210 - |Lookups |n=N |Number of lookup calls made on cache backends | 228 + |LRU |n=N |Number of cookies currently on the LRU | 211 229 + +-------+-------------------------------------------------------+ 212 - | |neg=N |Number of negative lookups made | 230 + | |exp=N |Number of cookies expired off of the LRU | 213 231 + +-------+-------------------------------------------------------+ 214 - | |pos=N |Number of positive lookups made | 232 + | |rmv=N |Number of cookies removed from the LRU | 215 233 + +-------+-------------------------------------------------------+ 216 - | |crt=N |Number of objects created by lookup | 234 + | |drp=N |Number of LRU'd cookies relinquished/withdrawn | 217 235 + +-------+-------------------------------------------------------+ 218 - | |tmo=N |Number of lookups timed out and requeued | 236 + | |at=N |Time till next LRU cull (jiffies) | 237 + +--------------+-------+-------------------------------------------------------+ 238 + |Invals |n=N |Number of invalidations | 219 239 +--------------+-------+-------------------------------------------------------+ 220 240 |Updates |n=N |Number of update cookie requests seen | 221 241 + +-------+-------------------------------------------------------+ 222 - | |nul=N |Number of upd reqs given a NULL parent | 242 + | |rsz=N |Number of resize requests | 223 243 + +-------+-------------------------------------------------------+ 224 - | |run=N |Number of upd reqs granted CPU time | 244 + | |rsn=N |Number of skipped resize requests | 225 245 +--------------+-------+-------------------------------------------------------+ 226 246 |Relinqs |n=N |Number of relinquish cookie requests seen | 227 247 + +-------+-------------------------------------------------------+ 228 - | |nul=N |Number of rlq reqs 
given a NULL parent | 248 + | |rtr=N |Number of rlq reqs with retire=true | 229 249 + +-------+-------------------------------------------------------+ 230 - | |wcr=N |Number of rlq reqs waited on completion of creation | 250 + | |drop=N |Number of cookies no longer blocking re-acquisition | 231 251 +--------------+-------+-------------------------------------------------------+ 232 - |AttrChg |n=N |Number of attribute changed requests seen | 252 + |NoSpace |nwr=N |Number of write requests refused due to lack of space | 233 253 + +-------+-------------------------------------------------------+ 234 - | |ok=N |Number of attr changed requests queued | 254 + | |ncr=N |Number of create requests refused due to lack of space | 235 255 + +-------+-------------------------------------------------------+ 236 - | |nbf=N |Number of attr changed rejected -ENOBUFS | 237 - + +-------+-------------------------------------------------------+ 238 - | |oom=N |Number of attr changed failed -ENOMEM | 239 - + +-------+-------------------------------------------------------+ 240 - | |run=N |Number of attr changed ops given CPU time | 256 + | |cull=N |Number of objects culled to make space | 241 257 +--------------+-------+-------------------------------------------------------+ 242 - |Allocs |n=N |Number of allocation requests seen | 258 + |IO |rd=N |Number of read operations in the cache | 243 259 + +-------+-------------------------------------------------------+ 244 - | |ok=N |Number of successful alloc reqs | 245 - + +-------+-------------------------------------------------------+ 246 - | |wt=N |Number of alloc reqs that waited on lookup completion | 247 - + +-------+-------------------------------------------------------+ 248 - | |nbf=N |Number of alloc reqs rejected -ENOBUFS | 249 - + +-------+-------------------------------------------------------+ 250 - | |int=N |Number of alloc reqs aborted -ERESTARTSYS | 251 - + 
+-------+-------------------------------------------------------+ 252 - | |ops=N |Number of alloc reqs submitted | 253 - + +-------+-------------------------------------------------------+ 254 - | |owt=N |Number of alloc reqs waited for CPU time | 255 - + +-------+-------------------------------------------------------+ 256 - | |abt=N |Number of alloc reqs aborted due to object death | 257 - +--------------+-------+-------------------------------------------------------+ 258 - |Retrvls |n=N |Number of retrieval (read) requests seen | 259 - + +-------+-------------------------------------------------------+ 260 - | |ok=N |Number of successful retr reqs | 261 - + +-------+-------------------------------------------------------+ 262 - | |wt=N |Number of retr reqs that waited on lookup completion | 263 - + +-------+-------------------------------------------------------+ 264 - | |nod=N |Number of retr reqs returned -ENODATA | 265 - + +-------+-------------------------------------------------------+ 266 - | |nbf=N |Number of retr reqs rejected -ENOBUFS | 267 - + +-------+-------------------------------------------------------+ 268 - | |int=N |Number of retr reqs aborted -ERESTARTSYS | 269 - + +-------+-------------------------------------------------------+ 270 - | |oom=N |Number of retr reqs failed -ENOMEM | 271 - + +-------+-------------------------------------------------------+ 272 - | |ops=N |Number of retr reqs submitted | 273 - + +-------+-------------------------------------------------------+ 274 - | |owt=N |Number of retr reqs waited for CPU time | 275 - + +-------+-------------------------------------------------------+ 276 - | |abt=N |Number of retr reqs aborted due to object death | 277 - +--------------+-------+-------------------------------------------------------+ 278 - |Stores |n=N |Number of storage (write) requests seen | 279 - + +-------+-------------------------------------------------------+ 280 - | |ok=N |Number of successful store reqs | 281 - + 
+-------+-------------------------------------------------------+ 282 - | |agn=N |Number of store reqs on a page already pending storage | 283 - + +-------+-------------------------------------------------------+ 284 - | |nbf=N |Number of store reqs rejected -ENOBUFS | 285 - + +-------+-------------------------------------------------------+ 286 - | |oom=N |Number of store reqs failed -ENOMEM | 287 - + +-------+-------------------------------------------------------+ 288 - | |ops=N |Number of store reqs submitted | 289 - + +-------+-------------------------------------------------------+ 290 - | |run=N |Number of store reqs granted CPU time | 291 - + +-------+-------------------------------------------------------+ 292 - | |pgs=N |Number of pages given store req processing time | 293 - + +-------+-------------------------------------------------------+ 294 - | |rxd=N |Number of store reqs deleted from tracking tree | 295 - + +-------+-------------------------------------------------------+ 296 - | |olm=N |Number of store reqs over store limit | 297 - +--------------+-------+-------------------------------------------------------+ 298 - |VmScan |nos=N |Number of release reqs against pages with no | 299 - | | |pending store | 300 - + +-------+-------------------------------------------------------+ 301 - | |gon=N |Number of release reqs against pages stored by | 302 - | | |time lock granted | 303 - + +-------+-------------------------------------------------------+ 304 - | |bsy=N |Number of release reqs ignored due to in-progress store| 305 - + +-------+-------------------------------------------------------+ 306 - | |can=N |Number of page stores cancelled due to release req | 307 - +--------------+-------+-------------------------------------------------------+ 308 - |Ops |pend=N |Number of times async ops added to pending queues | 309 - + +-------+-------------------------------------------------------+ 310 - | |run=N |Number of times async ops given CPU time | 311 
- + +-------+-------------------------------------------------------+ 312 - | |enq=N |Number of times async ops queued for processing | 313 - + +-------+-------------------------------------------------------+ 314 - | |can=N |Number of async ops cancelled | 315 - + +-------+-------------------------------------------------------+ 316 - | |rej=N |Number of async ops rejected due to object | 317 - | | |lookup/create failure | 318 - + +-------+-------------------------------------------------------+ 319 - | |ini=N |Number of async ops initialised | 320 - + +-------+-------------------------------------------------------+ 321 - | |dfr=N |Number of async ops queued for deferred release | 322 - + +-------+-------------------------------------------------------+ 323 - | |rel=N |Number of async ops released | 324 - | | |(should equal ini=N when idle) | 325 - + +-------+-------------------------------------------------------+ 326 - | |gc=N |Number of deferred-release async ops garbage collected | 327 - +--------------+-------+-------------------------------------------------------+ 328 - |CacheOp |alo=N |Number of in-progress alloc_object() cache ops | 329 - + +-------+-------------------------------------------------------+ 330 - | |luo=N |Number of in-progress lookup_object() cache ops | 331 - + +-------+-------------------------------------------------------+ 332 - | |luc=N |Number of in-progress lookup_complete() cache ops | 333 - + +-------+-------------------------------------------------------+ 334 - | |gro=N |Number of in-progress grab_object() cache ops | 335 - + +-------+-------------------------------------------------------+ 336 - | |upo=N |Number of in-progress update_object() cache ops | 337 - + +-------+-------------------------------------------------------+ 338 - | |dro=N |Number of in-progress drop_object() cache ops | 339 - + +-------+-------------------------------------------------------+ 340 - | |pto=N |Number of in-progress put_object() cache ops | 
341 - + +-------+-------------------------------------------------------+ 342 - | |syn=N |Number of in-progress sync_cache() cache ops | 343 - + +-------+-------------------------------------------------------+ 344 - | |atc=N |Number of in-progress attr_changed() cache ops | 345 - + +-------+-------------------------------------------------------+ 346 - | |rap=N |Number of in-progress read_or_alloc_page() cache ops | 347 - + +-------+-------------------------------------------------------+ 348 - | |ras=N |Number of in-progress read_or_alloc_pages() cache ops | 349 - + +-------+-------------------------------------------------------+ 350 - | |alp=N |Number of in-progress allocate_page() cache ops | 351 - + +-------+-------------------------------------------------------+ 352 - | |als=N |Number of in-progress allocate_pages() cache ops | 353 - + +-------+-------------------------------------------------------+ 354 - | |wrp=N |Number of in-progress write_page() cache ops | 355 - + +-------+-------------------------------------------------------+ 356 - | |ucp=N |Number of in-progress uncache_page() cache ops | 357 - + +-------+-------------------------------------------------------+ 358 - | |dsp=N |Number of in-progress dissociate_pages() cache ops | 359 - +--------------+-------+-------------------------------------------------------+ 360 - |CacheEv |nsp=N |Number of object lookups/creations rejected due to | 361 - | | |lack of space | 362 - + +-------+-------------------------------------------------------+ 363 - | |stl=N |Number of stale objects deleted | 364 - + +-------+-------------------------------------------------------+ 365 - | |rtr=N |Number of objects retired when relinquished | 366 - + +-------+-------------------------------------------------------+ 367 - | |cul=N |Number of objects culled | 260 + | |wr=N |Number of write operations in the cache | 368 261 +--------------+-------+-------------------------------------------------------+ 369 262 263 + 
Netfslib will also add some stats counters of its own. 370 264 371 265 372 - /proc/fs/fscache/histogram 373 - -------------------------- 266 + Cache List 267 + ========== 374 268 375 - :: 269 + FS-Cache provides a list of cache cookies: 376 270 377 - cat /proc/fs/fscache/histogram 378 - JIFS SECS OBJ INST OP RUNS OBJ RUNS RETRV DLY RETRIEVLS 379 - ===== ===== ========= ========= ========= ========= ========= 380 - 381 - This shows the breakdown of the number of times each amount of time 382 - between 0 jiffies and HZ-1 jiffies a variety of tasks took to run. The 383 - columns are as follows: 384 - 385 - ========= ======================================================= 386 - COLUMN TIME MEASUREMENT 387 - ========= ======================================================= 388 - OBJ INST Length of time to instantiate an object 389 - OP RUNS Length of time a call to process an operation took 390 - OBJ RUNS Length of time a call to process an object event took 391 - RETRV DLY Time between an requesting a read and lookup completing 392 - RETRIEVLS Time between beginning and end of a retrieval 393 - ========= ======================================================= 394 - 395 - Each row shows the number of events that took a particular range of times. 396 - Each step is 1 jiffy in size. The JIFS column indicates the particular 397 - jiffy range covered, and the SECS field the equivalent number of seconds. 
398 - 399 - 400 - 401 - Object List 402 - =========== 403 - 404 - If CONFIG_FSCACHE_OBJECT_LIST is enabled, the FS-Cache facility will maintain a 405 - list of all the objects currently allocated and allow them to be viewed 406 - through:: 407 - 408 - /proc/fs/fscache/objects 271 + /proc/fs/fscache/caches 409 272 410 273 This will look something like:: 411 274 412 - [root@andromeda ~]# head /proc/fs/fscache/objects 413 - OBJECT PARENT STAT CHLDN OPS OOP IPR EX READS EM EV F S | NETFS_COOKIE_DEF TY FL NETFS_DATA OBJECT_KEY, AUX_DATA 414 - ======== ======== ==== ===== === === === == ===== == == = = | ================ == == ================ ================ 415 - 17e4b 2 ACTV 0 0 0 0 0 0 7b 4 0 0 | NFS.fh DT 0 ffff88001dd82820 010006017edcf8bbc93b43298fdfbe71e50b57b13a172c0117f38472, e567634700000000000000000000000063f2404a000000000000000000000000c9030000000000000000000063f2404a 416 - 1693a 2 ACTV 0 0 0 0 0 0 7b 4 0 0 | NFS.fh DT 0 ffff88002db23380 010006017edcf8bbc93b43298fdfbe71e50b57b1e0162c01a2df0ea6, 420ebc4a000000000000000000000000420ebc4a0000000000000000000000000e1801000000000000000000420ebc4a 275 + # cat /proc/fs/fscache/caches 276 + CACHE REF VOLS OBJS ACCES S NAME 277 + ======== ===== ===== ===== ===== = =============== 278 + 00000001 2 1 2123 1 A default 417 279 418 - where the first set of columns before the '|' describe the object: 280 + where the columns are: 419 281 420 282 ======= =============================================================== 421 283 COLUMN DESCRIPTION 422 284 ======= =============================================================== 423 - OBJECT Object debugging ID (appears as OBJ%x in some debug messages) 424 - PARENT Debugging ID of parent object 425 - STAT Object state 426 - CHLDN Number of child objects of this object 427 - OPS Number of outstanding operations on this object 428 - OOP Number of outstanding child object management operations 429 - IPR 430 - EX Number of outstanding exclusive operations 431 - READS Number of
outstanding read operations 432 - EM Object's event mask 433 - EV Events raised on this object 434 - F Object flags 435 - S Object work item busy state mask (1:pending 2:running) 285 + CACHE Cache cookie debug ID (also appears in traces) 286 + REF Number of references on the cache cookie 287 + VOLS Number of volume cookies in this cache 288 + OBJS Number of cache objects in use 289 + ACCES Number of accesses pinning the cache 290 + S State 291 + NAME Name of the cache. 436 292 ======= =============================================================== 437 293 438 - and the second set of columns describe the object's cookie, if present: 294 + The state can be (-) Inactive, (P)reparing, (A)ctive, (E)rror or (W)ithdrawing. 439 295 440 - ================ ====================================================== 441 - COLUMN DESCRIPTION 442 - ================ ====================================================== 443 - NETFS_COOKIE_DEF Name of netfs cookie definition 444 - TY Cookie type (IX - index, DT - data, hex - special) 445 - FL Cookie flags 446 - NETFS_DATA Netfs private data stored in the cookie 447 - OBJECT_KEY Object key } 1 column, with separating comma 448 - AUX_DATA Object aux data } presence may be configured 449 - ================ ====================================================== 450 296 451 - The data shown may be filtered by attaching a key to an appropriate keyring 452 - before viewing the file.
Something like:: 297 + Volume List 298 + =========== 453 299 454 - keyctl add user fscache:objlist <restrictions> @s 300 + FS-Cache provides a list of volume cookies: 455 301 456 - where <restrictions> are a selection of the following letters: 302 + /proc/fs/fscache/volumes 457 303 458 - == ========================================================= 459 - K Show hexdump of object key (don't show if not given) 460 - A Show hexdump of object aux data (don't show if not given) 461 - == ========================================================= 304 + This will look something like:: 462 305 463 - and the following paired letters: 306 + VOLUME REF nCOOK ACC FL CACHE KEY 307 + ======== ===== ===== === == =============== ================ 308 + 00000001 55 54 1 00 default afs,example.com,100058 464 309 465 - == ========================================================= 466 - C Show objects that have a cookie 467 - c Show objects that don't have a cookie 468 - B Show objects that are busy 469 - b Show objects that aren't busy 470 - W Show objects that have pending writes 471 - w Show objects that don't have pending writes 472 - R Show objects that have outstanding reads 473 - r Show objects that don't have outstanding reads 474 - S Show objects that have work queued 475 - s Show objects that don't have work queued 476 - == ========================================================= 310 + where the columns are: 477 311 478 - If neither side of a letter pair is given, then both are implied. 
For example: 312 + ======= =============================================================== 313 + COLUMN DESCRIPTION 314 + ======= =============================================================== 315 + VOLUME The volume cookie debug ID (also appears in traces) 316 + REF Number of references on the volume cookie 317 + nCOOK Number of cookies in the volume 318 + ACC Number of accesses pinning the cache 319 + FL Flags on the volume cookie 320 + CACHE Name of the cache or "-" 321 + KEY The indexing key for the volume 322 + ======= =============================================================== 479 323 480 - keyctl add user fscache:objlist KB @s 481 324 482 - shows objects that are busy, and lists their object keys, but does not dump 483 - their auxiliary data. It also implies "CcWwRrSs", but as 'B' is given, 'b' is 484 - not implied. 325 + Cookie List 326 + =========== 485 327 486 - By default all objects and all fields will be shown. 328 + FS-Cache provides a list of cookies: 329 + 330 + /proc/fs/fscache/cookies 331 + 332 + This will look something like:: 333 + 334 + # head /proc/fs/fscache/cookies 335 + COOKIE VOLUME REF ACT ACC S FL DEF 336 + ======== ======== === === === = == ================ 337 + 00000435 00000001 1 0 -1 - 08 0000000201d080070000000000000000, 0000000000000000 338 + 00000436 00000001 1 0 -1 - 00 0000005601d080080000000000000000, 0000000000000051 339 + 00000437 00000001 1 0 -1 - 08 00023b3001d0823f0000000000000000, 0000000000000000 340 + 00000438 00000001 1 0 -1 - 08 0000005801d0807b0000000000000000, 0000000000000000 341 + 00000439 00000001 1 0 -1 - 08 00023b3201d080a10000000000000000, 0000000000000000 342 + 0000043a 00000001 1 0 -1 - 08 00023b3401d080a30000000000000000, 0000000000000000 343 + 0000043b 00000001 1 0 -1 - 08 00023b3601d080b30000000000000000, 0000000000000000 344 + 0000043c 00000001 1 0 -1 - 08 00023b3801d080b40000000000000000, 0000000000000000 345 + 346 + where the columns are: 347 + 348 + ======= 
=============================================================== 349 + COLUMN DESCRIPTION 350 + ======= =============================================================== 351 + COOKIE The cookie debug ID (also appears in traces) 352 + VOLUME The parent volume cookie debug ID 353 + REF Number of references on the cookie 354 + ACT Number of times the cookie is marked as in use 355 + ACC Number of access pins in the cookie 356 + S State of the cookie 357 + FL Flags on the cookie 358 + DEF Key, auxiliary data 359 + ======= =============================================================== 487 360 488 361 489 362 Debugging ··· 334 549 3 8 Cookie management Function entry trace 335 550 4 16 Function exit trace 336 551 5 32 General 337 - 6 64 Page handling Function entry trace 338 - 7 128 Function exit trace 339 - 8 256 General 340 - 9 512 Operation management Function entry trace 552 + 6-8 (Not used) 553 + 9 512 I/O operation management Function entry trace 341 554 10 1024 Function exit trace 342 555 11 2048 General 343 556 ======= ======= =============================== ======================= ··· 343 560 The appropriate set of values should be OR'd together and the result written to 344 561 the control file. For example:: 345 562 346 - echo $((1|8|64)) >/sys/module/fscache/parameters/debug 563 + echo $((1|8|512)) >/sys/module/fscache/parameters/debug 347 564 348 565 will turn on all function entry debugging.
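The mask in the example above is the OR of the per-bit values from the debug table. As a rough sketch of that arithmetic (not part of the patch), the helper below builds the same value from bit numbers. The name fscache_debug_mask is hypothetical and does not exist in any kernel tooling; the mapping of bit 0 to the value 1 is assumed from the `1` term in the example, and bits 3 and 9 come from the table rows shown above.

```shell
#!/bin/sh
# Sketch only: fscache_debug_mask is a hypothetical helper, not kernel tooling.
# It ORs together (1 << bit) for every bit number passed as an argument.
fscache_debug_mask() {
    mask=0
    for bit in "$@"; do
        mask=$((mask | (1 << bit)))
    done
    echo "$mask"
}

# Bits 0, 3 and 9 carry the values 1, 8 and 512 used in the example above.
entry_mask=$(fscache_debug_mask 0 3 9)
echo "$entry_mask"

# Applying the mask needs root and a loaded fscache module, so it is left
# as a comment here:
#   echo "$entry_mask" >/sys/module/fscache/parameters/debug
```

Working from bit numbers rather than literal values makes it easier to see which rows of the table are being enabled, which matters here because the patch retires bits 6-8 and the old example's value 64 no longer selects anything.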
+1 -3
Documentation/filesystems/caching/index.rst
··· 7 7 :maxdepth: 2 8 8 9 9 fscache 10 - object 10 + netfs-api 11 11 backend-api 12 12 cachefiles 13 - netfs-api 14 - operations
+410 -854
Documentation/filesystems/caching/netfs-api.rst
··· 1 1 .. SPDX-License-Identifier: GPL-2.0 2 2 3 - =============================== 4 - FS-Cache Network Filesystem API 5 - =============================== 3 + ============================== 4 + Network Filesystem Caching API 5 + ============================== 6 6 7 - There's an API by which a network filesystem can make use of the FS-Cache 8 - facilities. This is based around a number of principles: 7 + Fscache provides an API by which a network filesystem can make use of local 8 + caching facilities. The API is arranged around a number of principles: 9 9 10 - (1) Caches can store a number of different object types. There are two main 11 - object types: indices and files. The first is a special type used by 12 - FS-Cache to make finding objects faster and to make retiring of groups of 13 - objects easier. 10 + (1) A cache is logically organised into volumes and data storage objects 11 + within those volumes. 14 12 15 - (2) Every index, file or other object is represented by a cookie. This cookie 16 - may or may not have anything associated with it, but the netfs doesn't 17 - need to care. 13 + (2) Volumes and data storage objects are represented by various types of 14 + cookie. 18 15 19 - (3) Barring the top-level index (one entry per cached netfs), the index 20 - hierarchy for each netfs is structured according the whim of the netfs. 16 + (3) Cookies have keys that distinguish them from their peers. 21 17 22 - This API is declared in <linux/fscache.h>. 18 + (4) Cookies have coherency data that allows a cache to determine if the 19 + cached data is still valid. 20 + 21 + (5) I/O is done asynchronously where possible. 22 + 23 + This API is used by:: 24 + 25 + #include <linux/fscache.h>. 23 26 24 27 .. 
This document contains the following sections: 25 28 26 - (1) Network filesystem definition 27 - (2) Index definition 28 - (3) Object definition 29 - (4) Network filesystem (un)registration 30 - (5) Cache tag lookup 31 - (6) Index registration 32 - (7) Data file registration 33 - (8) Miscellaneous object registration 34 - (9) Setting the data file size 35 - (10) Page alloc/read/write 36 - (11) Page uncaching 37 - (12) Index and data file consistency 38 - (13) Cookie enablement 39 - (14) Miscellaneous cookie operations 40 - (15) Cookie unregistration 41 - (16) Index invalidation 42 - (17) Data file invalidation 43 - (18) FS-Cache specific page flags. 44 - 45 - 46 - Network Filesystem Definition 47 - ============================= 48 - 49 - FS-Cache needs a description of the network filesystem. This is specified 50 - using a record of the following structure:: 51 - 52 - struct fscache_netfs { 53 - uint32_t version; 54 - const char *name; 55 - struct fscache_cookie *primary_index; 56 - ... 57 - }; 58 - 59 - This first two fields should be filled in before registration, and the third 60 - will be filled in by the registration function; any other fields should just be 61 - ignored and are for internal use only. 62 - 63 - The fields are: 64 - 65 - (1) The name of the netfs (used as the key in the toplevel index). 66 - 67 - (2) The version of the netfs (if the name matches but the version doesn't, the 68 - entire in-cache hierarchy for this netfs will be scrapped and begun 69 - afresh). 70 - 71 - (3) The cookie representing the primary index will be allocated according to 72 - another parameter passed into the registration function. 
73 - 74 - For example, kAFS (linux/fs/afs/) uses the following definitions to describe 75 - itself:: 76 - 77 - struct fscache_netfs afs_cache_netfs = { 78 - .version = 0, 79 - .name = "afs", 80 - }; 81 - 82 - 83 - Index Definition 84 - ================ 85 - 86 - Indices are used for two purposes: 87 - 88 - (1) To aid the finding of a file based on a series of keys (such as AFS's 89 - "cell", "volume ID", "vnode ID"). 90 - 91 - (2) To make it easier to discard a subset of all the files cached based around 92 - a particular key - for instance to mirror the removal of an AFS volume. 93 - 94 - However, since it's unlikely that any two netfs's are going to want to define 95 - their index hierarchies in quite the same way, FS-Cache tries to impose as few 96 - restraints as possible on how an index is structured and where it is placed in 97 - the tree. The netfs can even mix indices and data files at the same level, but 98 - it's not recommended. 99 - 100 - Each index entry consists of a key of indeterminate length plus some auxiliary 101 - data, also of indeterminate length. 102 - 103 - There are some limits on indices: 104 - 105 - (1) Any index containing non-index objects should be restricted to a single 106 - cache. Any such objects created within an index will be created in the 107 - first cache only. The cache in which an index is created can be 108 - controlled by cache tags (see below). 109 - 110 - (2) The entry data must be atomically journallable, so it is limited to about 111 - 400 bytes at present. At least 400 bytes will be available. 112 - 113 - (3) The depth of the index tree should be judged with care as the search 114 - function is recursive. Too many layers will run the kernel out of stack. 
115 - 116 - 117 - Object Definition 118 - ================= 119 - 120 - To define an object, a structure of the following type should be filled out:: 121 - 122 - struct fscache_cookie_def 123 - { 124 - uint8_t name[16]; 125 - uint8_t type; 126 - 127 - struct fscache_cache_tag *(*select_cache)( 128 - const void *parent_netfs_data, 129 - const void *cookie_netfs_data); 130 - 131 - enum fscache_checkaux (*check_aux)(void *cookie_netfs_data, 132 - const void *data, 133 - uint16_t datalen, 134 - loff_t object_size); 135 - 136 - void (*get_context)(void *cookie_netfs_data, void *context); 137 - 138 - void (*put_context)(void *cookie_netfs_data, void *context); 139 - 140 - void (*mark_pages_cached)(void *cookie_netfs_data, 141 - struct address_space *mapping, 142 - struct pagevec *cached_pvec); 143 - }; 144 - 145 - This has the following fields: 146 - 147 - (1) The type of the object [mandatory]. 148 - 149 - This is one of the following values: 150 - 151 - FSCACHE_COOKIE_TYPE_INDEX 152 - This defines an index, which is a special FS-Cache type. 153 - 154 - FSCACHE_COOKIE_TYPE_DATAFILE 155 - This defines an ordinary data file. 156 - 157 - Any other value between 2 and 255 158 - This defines an extraordinary object such as an XATTR. 159 - 160 - (2) The name of the object type (NUL terminated unless all 16 chars are used) 161 - [optional]. 162 - 163 - (3) A function to select the cache in which to store an index [optional]. 164 - 165 - This function is invoked when an index needs to be instantiated in a cache 166 - during the instantiation of a non-index object. Only the immediate index 167 - parent for the non-index object will be queried. Any indices above that 168 - in the hierarchy may be stored in multiple caches. This function does not 169 - need to be supplied for any non-index object or any index that will only 170 - have index children. 
171 - 172 - If this function is not supplied or if it returns NULL then the first 173 - cache in the parent's list will be chosen, or failing that, the first 174 - cache in the master list. 175 - 176 - (4) A function to check the auxiliary data [optional]. 177 - 178 - This function will be called to check that a match found in the cache for 179 - this object is valid. For instance with AFS it could check the auxiliary 180 - data against the data version number returned by the server to determine 181 - whether the index entry in a cache is still valid. 182 - 183 - If this function is absent, it will be assumed that matching objects in a 184 - cache are always valid. 185 - 186 - The function is also passed the cache's idea of the object size and may 187 - use this to manage coherency also. 188 - 189 - If present, the function should return one of the following values: 190 - 191 - FSCACHE_CHECKAUX_OKAY 192 - - the entry is okay as is 193 - 194 - FSCACHE_CHECKAUX_NEEDS_UPDATE 195 - - the entry requires update 196 - 197 - FSCACHE_CHECKAUX_OBSOLETE 198 - - the entry should be deleted 199 - 200 - This function can also be used to extract data from the auxiliary data in 201 - the cache and copy it into the netfs's structures. 202 - 203 - (5) A pair of functions to manage contexts for the completion callback 204 - [optional]. 205 - 206 - The cache read/write functions are passed a context which is then passed 207 - to the I/O completion callback function. To ensure this context remains 208 - valid until after the I/O completion is called, two functions may be 209 - provided: one to get an extra reference on the context, and one to drop a 210 - reference to it. 211 - 212 - If the context is not used or is a type of object that won't go out of 213 - scope, then these functions are not required. These functions are not 214 - required for indices as indices may not contain data. These functions may 215 - be called in interrupt context and so may not sleep. 
216 - 217 - (6) A function to mark a page as retaining cache metadata [optional]. 218 - 219 - This is called by the cache to indicate that it is retaining in-memory 220 - information for this page and that the netfs should uncache the page when 221 - it has finished. This does not indicate whether there's data on the disk 222 - or not. Note that several pages at once may be presented for marking. 223 - 224 - The PG_fscache bit is set on the pages before this function would be 225 - called, so the function need not be provided if this is sufficient. 226 - 227 - This function is not required for indices as they're not permitted data. 228 - 229 - (7) A function to unmark all the pages retaining cache metadata [mandatory]. 230 - 231 - This is called by FS-Cache to indicate that a backing store is being 232 - unbound from a cookie and that all the marks on the pages should be 233 - cleared to prevent confusion. Note that the cache will have torn down all 234 - its tracking information so that the pages don't need to be explicitly 235 - uncached. 236 - 237 - This function is not required for indices as they're not permitted data. 238 - 239 - 240 - Network Filesystem (Un)registration 241 - =================================== 242 - 243 - The first step is to declare the network filesystem to the cache. This also 244 - involves specifying the layout of the primary index (for AFS, this would be the 245 - "cell" level). 246 - 247 - The registration function is:: 248 - 249 - int fscache_register_netfs(struct fscache_netfs *netfs); 250 - 251 - It just takes a pointer to the netfs definition. It returns 0 or an error as 252 - appropriate. 
253 - 254 - For kAFS, registration is done as follows:: 255 - 256 - ret = fscache_register_netfs(&afs_cache_netfs); 257 - 258 - The last step is, of course, unregistration:: 259 - 260 - void fscache_unregister_netfs(struct fscache_netfs *netfs); 261 - 262 - 263 - Cache Tag Lookup 264 - ================ 265 - 266 - FS-Cache permits the use of more than one cache. To permit particular index 267 - subtrees to be bound to particular caches, the second step is to look up cache 268 - representation tags. This step is optional; it can be left entirely up to 269 - FS-Cache as to which cache should be used. The problem with doing that is that 270 - FS-Cache will always pick the first cache that was registered. 271 - 272 - To get the representation for a named tag:: 273 - 274 - struct fscache_cache_tag *fscache_lookup_cache_tag(const char *name); 275 - 276 - This takes a text string as the name and returns a representation of a tag. It 277 - will never return an error. It may return a dummy tag, however, if it runs out 278 - of memory; this will inhibit caching with this tag. 279 - 280 - Any representation so obtained must be released by passing it to this function:: 281 - 282 - void fscache_release_cache_tag(struct fscache_cache_tag *tag); 283 - 284 - The tag will be retrieved by FS-Cache when it calls the object definition 285 - operation select_cache(). 286 - 287 - 288 - Index Registration 289 - ================== 290 - 291 - The third step is to inform FS-Cache about part of an index hierarchy that can 292 - be used to locate files. 
This is done by requesting a cookie for each index in 293 - the path to the file:: 294 - 295 - struct fscache_cookie * 296 - fscache_acquire_cookie(struct fscache_cookie *parent, 297 - const struct fscache_object_def *def, 298 - const void *index_key, 299 - size_t index_key_len, 300 - const void *aux_data, 301 - size_t aux_data_len, 302 - void *netfs_data, 303 - loff_t object_size, 304 - bool enable); 305 - 306 - This function creates an index entry in the index represented by parent, 307 - filling in the index entry by calling the operations pointed to by def. 308 - 309 - A unique key that represents the object within the parent must be pointed to by 310 - index_key and is of length index_key_len. 311 - 312 - An optional blob of auxiliary data that is to be stored within the cache can be 313 - pointed to with aux_data and should be of length aux_data_len. This would 314 - typically be used for storing coherency data. 315 - 316 - The netfs may pass an arbitrary value in netfs_data and this will be presented 317 - to it in the event of any calling back. This may also be used in tracing or 318 - logging of messages. 319 - 320 - The cache tracks the size of the data attached to an object and this set to be 321 - object_size. For indices, this should be 0. This value will be passed to the 322 - ->check_aux() callback. 323 - 324 - Note that this function never returns an error - all errors are handled 325 - internally. It may, however, return NULL to indicate no cookie. It is quite 326 - acceptable to pass this token back to this function as the parent to another 327 - acquisition (or even to the relinquish cookie, read page and write page 328 - functions - see below). 329 - 330 - Note also that no indices are actually created in a cache until a non-index 331 - object needs to be created somewhere down the hierarchy. Furthermore, an index 332 - may be created in several different caches independently at different times. 
333 - This is all handled transparently, and the netfs doesn't see any of it. 334 - 335 - A cookie will be created in the disabled state if enabled is false. A cookie 336 - must be enabled to do anything with it. A disabled cookie can be enabled by 337 - calling fscache_enable_cookie() (see below). 338 - 339 - For example, with AFS, a cell would be added to the primary index. This index 340 - entry would have a dependent inode containing volume mappings within this cell:: 341 - 342 - cell->cache = 343 - fscache_acquire_cookie(afs_cache_netfs.primary_index, 344 - &afs_cell_cache_index_def, 345 - cell->name, strlen(cell->name), 346 - NULL, 0, 347 - cell, 0, true); 348 - 349 - And then a particular volume could be added to that index by ID, creating 350 - another index for vnodes (AFS inode equivalents):: 351 - 352 - volume->cache = 353 - fscache_acquire_cookie(volume->cell->cache, 354 - &afs_volume_cache_index_def, 355 - &volume->vid, sizeof(volume->vid), 356 - NULL, 0, 357 - volume, 0, true); 29 + (1) Overview 30 + (2) Volume registration 31 + (3) Data file registration 32 + (4) Declaring a cookie to be in use 33 + (5) Resizing a data file (truncation) 34 + (6) Data I/O API 35 + (7) Data file coherency 36 + (8) Data file invalidation 37 + (9) Write back resource management 38 + (10) Caching of local modifications 39 + (11) Page release and invalidation 40 + 41 + 42 + Overview 43 + ======== 44 + 45 + The fscache hierarchy is organised on two levels from a network filesystem's 46 + point of view. The upper level represents "volumes" and the lower level 47 + represents "data storage objects". These are represented by two types of 48 + cookie, hereafter referred to as "volume cookies" and "cookies". 49 + 50 + A network filesystem acquires a volume cookie for a volume using a volume key, 51 + which represents all the information that defines that volume (e.g. cell name 52 + or server address, volume ID or share name). 
This must be rendered as a 53 + printable string that can be used as a directory name (i.e. no '/' characters 54 + and shouldn't begin with a '.'). The maximum name length is one less than the 55 + maximum size of a filename component (allowing the cache backend one char for 56 + its own purposes). 57 + 58 + A filesystem would typically have a volume cookie for each superblock. 59 + 60 + The filesystem then acquires a cookie for each file within that volume using an 61 + object key. Object keys are binary blobs and only need to be unique within 62 + their parent volume. The cache backend is responsible for rendering the binary 63 + blob into something it can use and may employ hash tables, trees or whatever to 64 + improve its ability to find an object. This is transparent to the network 65 + filesystem. 66 + 67 + A filesystem would typically have a cookie for each inode, and would acquire it 68 + in iget and relinquish it when evicting the cookie. 69 + 70 + Once it has a cookie, the filesystem needs to mark the cookie as being in use. 71 + This causes fscache to send the cache backend off to look up/create resources 72 + for the cookie in the background, to check its coherency and, if necessary, to 73 + mark the object as being under modification. 74 + 75 + A filesystem would typically "use" the cookie in its file open routine and 76 + unuse it in file release and it needs to use the cookie around calls to 77 + truncate the cookie locally. It *also* needs to use the cookie when the 78 + pagecache becomes dirty and unuse it when writeback is complete. This is 79 + slightly tricky, and provision is made for it. 80 + 81 + When performing a read, write or resize on a cookie, the filesystem must first 82 + begin an operation. This copies the resources into a holding struct and puts 83 + extra pins into the cache to stop cache withdrawal from tearing down the 84 + structures being used.
The actual operation can then be issued and conflicting 85 + invalidations can be detected upon completion. 86 + 87 + The filesystem is expected to use netfslib to access the cache, but that's not 88 + actually required and it can use the fscache I/O API directly. 89 + 90 + 91 + Volume Registration 92 + =================== 93 + 94 + The first step for a network filesystem is to acquire a volume cookie for the 95 + volume it wants to access:: 96 + 97 + struct fscache_volume * 98 + fscache_acquire_volume(const char *volume_key, 99 + const char *cache_name, 100 + const void *coherency_data, 101 + size_t coherency_len); 102 + 103 + This function creates a volume cookie with the specified volume key as its name 104 + and notes the coherency data. 105 + 106 + The volume key must be a printable string with no '/' characters in it. It 107 + should begin with the name of the filesystem and should be no longer than 254 108 + characters. It should uniquely represent the volume and will be matched with 109 + what's stored in the cache. 110 + 111 + The caller may also specify the name of the cache to use. If specified, 112 + fscache will look up or create a cache cookie of that name and will use a cache 113 + of that name if it is online or comes online. If no cache name is specified, 114 + it will use the first cache that comes to hand and set the name to that. 115 + 116 + The specified coherency data is stored in the cookie and will be matched 117 + against coherency data stored on disk. The data pointer may be NULL if no data 118 + is provided. If the coherency data doesn't match, the entire cache volume will 119 + be invalidated. 120 + 121 + This function can return errors such as EBUSY if the volume key is already in 122 + use by an acquired volume or ENOMEM if an allocation failure occurred. It may 123 + also return a NULL volume cookie if fscache is not enabled. It is safe to 124 + pass a NULL cookie to any function that takes a volume cookie.
This will 125 + cause that function to do nothing. 126 + 127 + 128 + When the network filesystem has finished with a volume, it should relinquish it 129 + by calling:: 130 + 131 + void fscache_relinquish_volume(struct fscache_volume *volume, 132 + const void *coherency_data, 133 + bool invalidate); 134 + 135 + This will cause the volume to be committed or removed, and if sealed the 136 + coherency data will be set to the value supplied. The amount of coherency data 137 + must match the length specified when the volume was acquired. Note that all 138 + data cookies obtained in this volume must be relinquished before the volume is 139 + relinquished. 358 140 359 141 360 142 Data File Registration 361 143 ====================== 362 144 363 - The fourth step is to request a data file be created in the cache. This is 364 - identical to index cookie acquisition. The only difference is that the type in 365 - the object definition should be something other than index type:: 366 - 367 - vnode->cache = 368 - fscache_acquire_cookie(volume->cache, 369 - &afs_vnode_cache_object_def, 370 - &key, sizeof(key), 371 - &aux, sizeof(aux), 372 - vnode, vnode->status.size, true); 373 - 374 - 375 - Miscellaneous Object Registration 376 - ================================= 377 - 378 - An optional step is to request an object of miscellaneous type be created in 379 - the cache. This is almost identical to index cookie acquisition. The only 380 - difference is that the type in the object definition should be something other 381 - than index type. While the parent object could be an index, it's more likely 382 - it would be some other type of object such as a data file:: 383 - 384 - xattr->cache = 385 - fscache_acquire_cookie(vnode->cache, 386 - &afs_xattr_cache_object_def, 387 - &xattr->name, strlen(xattr->name), 388 - NULL, 0, 389 - xattr, strlen(xattr->val), true); 390 - 391 - Miscellaneous objects might be used to store extended attributes or directory 392 - entries for example. 
393 - 394 - 395 - Setting the Data File Size 396 - ========================== 397 - 398 - The fifth step is to set the physical attributes of the file, such as its size. 399 - This doesn't automatically reserve any space in the cache, but permits the 400 - cache to adjust its metadata for data tracking appropriately:: 401 - 402 - int fscache_attr_changed(struct fscache_cookie *cookie); 403 - 404 - The cache will return -ENOBUFS if there is no backing cache or if there is no 405 - space to allocate any extra metadata required in the cache. 406 - 407 - Note that attempts to read or write data pages in the cache over this size may 408 - be rebuffed with -ENOBUFS. 409 - 410 - This operation schedules an attribute adjustment to happen asynchronously at 411 - some point in the future, and as such, it may happen after the function returns 412 - to the caller. The attribute adjustment excludes read and write operations. 413 - 414 - 415 - Page alloc/read/write 416 - ===================== 417 - 418 - And the sixth step is to store and retrieve pages in the cache. There are 419 - three functions that are used to do this. 420 - 421 - Note: 422 - 423 - (1) A page should not be re-read or re-allocated without uncaching it first. 424 - 425 - (2) A read or allocated page must be uncached when the netfs page is released 426 - from the pagecache. 427 - 428 - (3) A page should only be written to the cache if previous read or allocated. 429 - 430 - This permits the cache to maintain its page tracking in proper order. 
431 - 432 - 433 - PAGE READ 434 - --------- 435 - 436 - Firstly, the netfs should ask FS-Cache to examine the caches and read the 437 - contents cached for a particular page of a particular file if present, or else 438 - allocate space to store the contents if not:: 439 - 440 - typedef 441 - void (*fscache_rw_complete_t)(struct page *page, 442 - void *context, 443 - int error); 444 - 445 - int fscache_read_or_alloc_page(struct fscache_cookie *cookie, 446 - struct page *page, 447 - fscache_rw_complete_t end_io_func, 448 - void *context, 449 - gfp_t gfp); 450 - 451 - The cookie argument must specify a cookie for an object that isn't an index, 452 - the page specified will have the data loaded into it (and is also used to 453 - specify the page number), and the gfp argument is used to control how any 454 - memory allocations made are satisfied. 455 - 456 - If the cookie indicates the inode is not cached: 457 - 458 - (1) The function will return -ENOBUFS. 459 - 460 - Else if there's a copy of the page resident in the cache: 461 - 462 - (1) The mark_pages_cached() cookie operation will be called on that page. 463 - 464 - (2) The function will submit a request to read the data from the cache's 465 - backing device directly into the page specified. 466 - 467 - (3) The function will return 0. 468 - 469 - (4) When the read is complete, end_io_func() will be invoked with: 470 - 471 - * The netfs data supplied when the cookie was created. 472 - 473 - * The page descriptor. 474 - 475 - * The context argument passed to the above function. This will be 476 - maintained with the get_context/put_context functions mentioned above. 477 - 478 - * An argument that's 0 on success or negative for an error code. 479 - 480 - If an error occurs, it should be assumed that the page contains no usable 481 - data. fscache_readpages_cancel() may need to be called. 
482 - 483 - end_io_func() will be called in process context if the read is results in 484 - an error, but it might be called in interrupt context if the read is 485 - successful. 486 - 487 - Otherwise, if there's not a copy available in cache, but the cache may be able 488 - to store the page: 489 - 490 - (1) The mark_pages_cached() cookie operation will be called on that page. 491 - 492 - (2) A block may be reserved in the cache and attached to the object at the 493 - appropriate place. 494 - 495 - (3) The function will return -ENODATA. 496 - 497 - This function may also return -ENOMEM or -EINTR, in which case it won't have 498 - read any data from the cache. 499 - 500 - 501 - Page Allocate 502 - ------------- 503 - 504 - Alternatively, if there's not expected to be any data in the cache for a page 505 - because the file has been extended, a block can simply be allocated instead:: 506 - 507 - int fscache_alloc_page(struct fscache_cookie *cookie, 508 - struct page *page, 509 - gfp_t gfp); 510 - 511 - This is similar to the fscache_read_or_alloc_page() function, except that it 512 - never reads from the cache. It will return 0 if a block has been allocated, 513 - rather than -ENODATA as the other would. One or the other must be performed 514 - before writing to the cache. 515 - 516 - The mark_pages_cached() cookie operation will be called on the page if 517 - successful. 
518 - 519 - 520 - Page Write 521 - ---------- 522 - 523 - Secondly, if the netfs changes the contents of the page (either due to an 524 - initial download or if a user performs a write), then the page should be 525 - written back to the cache:: 526 - 527 - int fscache_write_page(struct fscache_cookie *cookie, 528 - struct page *page, 529 - loff_t object_size, 530 - gfp_t gfp); 531 - 532 - The cookie argument must specify a data file cookie, the page specified should 533 - contain the data to be written (and is also used to specify the page number), 534 - object_size is the revised size of the object and the gfp argument is used to 535 - control how any memory allocations made are satisfied. 536 - 537 - The page must have first been read or allocated successfully and must not have 538 - been uncached before writing is performed. 539 - 540 - If the cookie indicates the inode is not cached then: 541 - 542 - (1) The function will return -ENOBUFS. 543 - 544 - Else if space can be allocated in the cache to hold this page: 545 - 546 - (1) PG_fscache_write will be set on the page. 547 - 548 - (2) The function will submit a request to write the data to cache's backing 549 - device directly from the page specified. 550 - 551 - (3) The function will return 0. 552 - 553 - (4) When the write is complete PG_fscache_write is cleared on the page and 554 - anyone waiting for that bit will be woken up. 555 - 556 - Else if there's no space available in the cache, -ENOBUFS will be returned. It 557 - is also possible for the PG_fscache_write bit to be cleared when no write took 558 - place if unforeseen circumstances arose (such as a disk error). 559 - 560 - Writing takes place asynchronously. 
561 - 562 - 563 - Multiple Page Read 564 - ------------------ 565 - 566 - A facility is provided to read several pages at once, as requested by the 567 - readpages() address space operation:: 568 - 569 - int fscache_read_or_alloc_pages(struct fscache_cookie *cookie, 570 - struct address_space *mapping, 571 - struct list_head *pages, 572 - int *nr_pages, 573 - fscache_rw_complete_t end_io_func, 574 - void *context, 575 - gfp_t gfp); 576 - 577 - This works in a similar way to fscache_read_or_alloc_page(), except: 578 - 579 - (1) Any page it can retrieve data for is removed from pages and nr_pages and 580 - dispatched for reading to the disk. Reads of adjacent pages on disk may 581 - be merged for greater efficiency. 582 - 583 - (2) The mark_pages_cached() cookie operation will be called on several pages 584 - at once if they're being read or allocated. 585 - 586 - (3) If there was an general error, then that error will be returned. 587 - 588 - Else if some pages couldn't be allocated or read, then -ENOBUFS will be 589 - returned. 590 - 591 - Else if some pages couldn't be read but were allocated, then -ENODATA will 592 - be returned. 593 - 594 - Otherwise, if all pages had reads dispatched, then 0 will be returned, the 595 - list will be empty and ``*nr_pages`` will be 0. 596 - 597 - (4) end_io_func will be called once for each page being read as the reads 598 - complete. It will be called in process context if error != 0, but it may 599 - be called in interrupt context if there is no error. 600 - 601 - Note that a return of -ENODATA, -ENOBUFS or any other error does not preclude 602 - some of the pages being read and some being allocated. Those pages will have 603 - been marked appropriately and will need uncaching. 
604 - 605 - 606 - Cancellation of Unread Pages 607 - ---------------------------- 608 - 609 - If one or more pages are passed to fscache_read_or_alloc_pages() but not then 610 - read from the cache and also not read from the underlying filesystem then 611 - those pages will need to have any marks and reservations removed. This can be 612 - done by calling:: 613 - 614 - void fscache_readpages_cancel(struct fscache_cookie *cookie, 615 - struct list_head *pages); 616 - 617 - prior to returning to the caller. The cookie argument should be as passed to 618 - fscache_read_or_alloc_pages(). Every page in the pages list will be examined 619 - and any that have PG_fscache set will be uncached. 620 - 621 - 622 - Page Uncaching 623 - ============== 624 - 625 - To uncache a page, this function should be called:: 626 - 627 - void fscache_uncache_page(struct fscache_cookie *cookie, 628 - struct page *page); 629 - 630 - This function permits the cache to release any in-memory representation it 631 - might be holding for this netfs page. This function must be called once for 632 - each page on which the read or write page functions above have been called to 633 - make sure the cache's in-memory tracking information gets torn down. 634 - 635 - Note that pages can't be explicitly deleted from the a data file. The whole 636 - data file must be retired (see the relinquish cookie function below). 637 - 638 - Furthermore, note that this does not cancel the asynchronous read or write 639 - operation started by the read/alloc and write functions, so the page 640 - invalidation functions must use:: 641 - 642 - bool fscache_check_page_write(struct fscache_cookie *cookie, 643 - struct page *page); 644 - 645 - to see if a page is being written to the cache, and:: 646 - 647 - void fscache_wait_on_page_write(struct fscache_cookie *cookie, 648 - struct page *page); 649 - 650 - to wait for it to finish if it is. 
651 - 652 - 653 - When releasepage() is being implemented, a special FS-Cache function exists to 654 - manage the heuristics of coping with vmscan trying to eject pages, which may 655 - conflict with the cache trying to write pages to the cache (which may itself 656 - need to allocate memory):: 657 - 658 - bool fscache_maybe_release_page(struct fscache_cookie *cookie, 659 - struct page *page, 660 - gfp_t gfp); 661 - 662 - This takes the netfs cookie, and the page and gfp arguments as supplied to 663 - releasepage(). It will return false if the page cannot be released yet for 664 - some reason and if it returns true, the page has been uncached and can now be 665 - released. 666 - 667 - To make a page available for release, this function may wait for an outstanding 668 - storage request to complete, or it may attempt to cancel the storage request - 669 - in which case the page will not be stored in the cache this time. 670 - 671 - 672 - Bulk Image Page Uncache 673 - ----------------------- 674 - 675 - A convenience routine is provided to perform an uncache on all the pages 676 - attached to an inode. This assumes that the pages on the inode correspond on a 677 - 1:1 basis with the pages in the cache:: 678 - 679 - void fscache_uncache_all_inode_pages(struct fscache_cookie *cookie, 680 - struct inode *inode); 681 - 682 - This takes the netfs cookie that the pages were cached with and the inode that 683 - the pages are attached to. This function will wait for pages to finish being 684 - written to the cache and for the cache to finish with the page generally. No 685 - error is returned. 
686 - 687 - 688 - Index and Data File consistency 689 - =============================== 690 - 691 - To find out whether auxiliary data for an object is up to data within the 692 - cache, the following function can be called:: 693 - 694 - int fscache_check_consistency(struct fscache_cookie *cookie, 695 - const void *aux_data); 696 - 697 - This will call back to the netfs to check whether the auxiliary data associated 698 - with a cookie is correct; if aux_data is non-NULL, it will update the auxiliary 699 - data buffer first. It returns 0 if it is and -ESTALE if it isn't; it may also 700 - return -ENOMEM and -ERESTARTSYS. 701 - 702 - To request an update of the index data for an index or other object, the 703 - following function should be called:: 704 - 705 - void fscache_update_cookie(struct fscache_cookie *cookie, 706 - const void *aux_data); 707 - 708 - This function will update the cookie's auxiliary data buffer from aux_data if 709 - that is non-NULL and then schedule this to be stored on disk. The update 710 - method in the parent index definition will be called to transfer the data. 711 - 712 - Note that partial updates may happen automatically at other times, such as when 713 - data blocks are added to a data file object. 714 - 715 - 716 - Cookie Enablement 717 - ================= 718 - 719 - Cookies exist in one of two states: enabled and disabled. If a cookie is 720 - disabled, it ignores all attempts to acquire child cookies; check, update or 721 - invalidate its state; allocate, read or write backing pages - though it is 722 - still possible to uncache pages and relinquish the cookie. 723 - 724 - The initial enablement state is set by fscache_acquire_cookie(), but the cookie 725 - can be enabled or disabled later. 
To disable a cookie, call:: 726 - 727 - void fscache_disable_cookie(struct fscache_cookie *cookie, 728 - const void *aux_data, 729 - bool invalidate); 730 - 731 - If the cookie is not already disabled, this locks the cookie against other 732 - enable and disable ops, marks the cookie as being disabled, discards or 733 - invalidates any backing objects and waits for cessation of activity on any 734 - associated object before unlocking the cookie. 735 - 736 - All possible failures are handled internally. The caller should consider 737 - calling fscache_uncache_all_inode_pages() afterwards to make sure all page 738 - markings are cleared up. 739 - 740 - Cookies can be enabled or reenabled with:: 741 - 742 - void fscache_enable_cookie(struct fscache_cookie *cookie, 743 - const void *aux_data, 744 - loff_t object_size, 745 - bool (*can_enable)(void *data), 746 - void *data) 747 - 748 - If the cookie is not already enabled, this locks the cookie against other 749 - enable and disable ops, invokes can_enable() and, if the cookie is not an index 750 - cookie, will begin the procedure of acquiring backing objects. 751 - 752 - The optional can_enable() function is passed the data argument and returns a 753 - ruling as to whether or not enablement should actually be permitted to begin. 754 - 755 - All possible failures are handled internally. The cookie will only be marked 756 - as enabled if provisional backing objects are allocated. 757 - 758 - The object's data size is updated from object_size and is passed to the 759 - ->check_aux() function. 760 - 761 - In both cases, the cookie's auxiliary data buffer is updated from aux_data if 762 - that is non-NULL inside the enablement lock before proceeding. 
763 - 764 - 765 - Miscellaneous Cookie operations 766 - =============================== 767 - 768 - There are a number of operations that can be used to control cookies: 769 - 770 - * Cookie pinning:: 771 - 772 - int fscache_pin_cookie(struct fscache_cookie *cookie); 773 - void fscache_unpin_cookie(struct fscache_cookie *cookie); 774 - 775 - These operations permit data cookies to be pinned into the cache and to 776 - have the pinning removed. They are not permitted on index cookies. 777 - 778 - The pinning function will return 0 if successful, -ENOBUFS in the cookie 779 - isn't backed by a cache, -EOPNOTSUPP if the cache doesn't support pinning, 780 - -ENOSPC if there isn't enough space to honour the operation, -ENOMEM or 781 - -EIO if there's any other problem. 782 - 783 - * Data space reservation:: 784 - 785 - int fscache_reserve_space(struct fscache_cookie *cookie, loff_t size); 786 - 787 - This permits a netfs to request cache space be reserved to store up to the 788 - given amount of a file. It is permitted to ask for more than the current 789 - size of the file to allow for future file expansion. 790 - 791 - If size is given as zero then the reservation will be cancelled. 792 - 793 - The function will return 0 if successful, -ENOBUFS in the cookie isn't 794 - backed by a cache, -EOPNOTSUPP if the cache doesn't support reservations, 795 - -ENOSPC if there isn't enough space to honour the operation, -ENOMEM or 796 - -EIO if there's any other problem. 797 - 798 - Note that this doesn't pin an object in a cache; it can still be culled to 799 - make space if it's not in use. 
800 - 801 - 802 - Cookie Unregistration 803 - ===================== 804 - 805 - To get rid of a cookie, this function should be called:: 145 + Once it has a volume cookie, a network filesystem can use it to acquire a 146 + cookie for data storage:: 147 + 148 + struct fscache_cookie * 149 + fscache_acquire_cookie(struct fscache_volume *volume, 150 + u8 advice, 151 + const void *index_key, 152 + size_t index_key_len, 153 + const void *aux_data, 154 + size_t aux_data_len, 155 + loff_t object_size) 156 + 157 + This creates the cookie in the volume using the specified index key. The index 158 + key is a binary blob of the given length and must be unique for the volume. 159 + This is saved into the cookie. There are no restrictions on the content, but 160 + its length shouldn't exceed about three quarters of the maximum filename length 161 + to allow for encoding. 162 + 163 + The caller should also pass in a piece of coherency data in aux_data. A buffer 164 + of size aux_data_len will be allocated and the coherency data copied in. It is 165 + assumed that the size is invariant over time. The coherency data is used to 166 + check the validity of data in the cache. Functions are provided by which the 167 + coherency data can be updated. 168 + 169 + The file size of the object being cached should also be provided. This may be 170 + used to trim the data and will be stored with the coherency data. 171 + 172 + This function never returns an error, though it may return a NULL cookie on 173 + allocation failure or if fscache is not enabled. It is safe to pass in a NULL 174 + volume cookie and pass the NULL cookie returned to any function that takes it. 175 + This will cause that function to do nothing. 
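The index key length guidance above can be illustrated with a small userspace sketch. This is not kernel code and does not call fscache: the key layout (``struct example_index_key``), the helper names and the exact reading of "about three quarters of the maximum filename length" as ``NAME_MAX * 3 / 4`` are all assumptions made for illustration only.

```c
#include <limits.h>   /* NAME_MAX on POSIX systems */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#ifndef NAME_MAX
#define NAME_MAX 255  /* assumed fallback; Linux defines NAME_MAX as 255 */
#endif

/* Hypothetical key layout, not taken from any real network filesystem. */
struct example_index_key {
	uint64_t ino;
	uint32_t generation;
};

/* Assumed interpretation of "about three quarters of the maximum
 * filename length"; the text gives no exact figure.
 */
#define EXAMPLE_MAX_KEY_LEN ((size_t)NAME_MAX * 3 / 4)

/* Check that a binary index key leaves headroom for the backend's encoding. */
static bool example_key_len_ok(size_t key_len)
{
	return key_len > 0 && key_len <= EXAMPLE_MAX_KEY_LEN;
}

/* Pack the key fields into buf without struct padding.
 * Returns the key length, or 0 if the key does not fit.
 */
static size_t example_pack_key(const struct example_index_key *key,
			       unsigned char *buf, size_t buf_len)
{
	size_t len = sizeof(key->ino) + sizeof(key->generation);

	if (buf_len < len || !example_key_len_ok(len))
		return 0;
	memcpy(buf, &key->ino, sizeof(key->ino));
	memcpy(buf + sizeof(key->ino), &key->generation,
	       sizeof(key->generation));
	return len;
}
```

In a real filesystem the packed buffer and its length would be what gets passed as ``index_key`` and ``index_key_len``; packing field by field avoids feeding uninitialised padding bytes into a key that must compare equal across mounts.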
176 + 177 + 178 + When the network filesystem has finished with a cookie, it should relinquish it 179 + by calling:: 806 180 807 181 void fscache_relinquish_cookie(struct fscache_cookie *cookie, 808 - const void *aux_data, 809 182 bool retire); 810 183 811 - If retire is non-zero, then the object will be marked for recycling, and all 812 - copies of it will be removed from all active caches in which it is present. 813 - Not only that but all child objects will also be retired. 814 - 815 - If retire is zero, then the object may be available again when next the 816 - acquisition function is called. Retirement here will overrule the pinning on a 817 - cookie. 818 - 819 - The cookie's auxiliary data will be updated from aux_data if that is non-NULL 820 - so that the cache can lazily update it on disk. 821 - 822 - One very important note - relinquish must NOT be called for a cookie unless all 823 - the cookies for "child" indices, objects and pages have been relinquished 824 - first. 184 + This will cause fscache to either commit the storage backing the cookie or 185 + delete it. 825 186 826 187 827 - Index Invalidation 828 - ================== 188 + Marking A Cookie In-Use 189 + ======================= 829 190 830 - There is no direct way to invalidate an index subtree. To do this, the caller 831 - should relinquish and retire the cookie they have, and then acquire a new one. 
191 + Once a cookie has been acquired by a network filesystem, the filesystem should 192 + tell fscache when it intends to use the cookie (typically done on file open) 193 + and should say when it has finished with it (typically on file close):: 194 + 195 + void fscache_use_cookie(struct fscache_cookie *cookie, 196 + bool will_modify); 197 + void fscache_unuse_cookie(struct fscache_cookie *cookie, 198 + const void *aux_data, 199 + const loff_t *object_size); 200 + 201 + The *use* function tells fscache that the filesystem will use the cookie and, additionally, 202 + indicates whether the user intends to modify the contents locally. If not yet 203 + done, this will trigger the cache backend to go and gather the resources it 204 + needs to access/store data in the cache. This is done in the background, and 205 + so may not be complete by the time the function returns. 206 + 207 + The *unuse* function indicates that a filesystem has finished using a cookie. 208 + It optionally updates the stored coherency data and object size and then 209 + decreases the in-use counter. When the last user unuses the cookie, it is 210 + scheduled for garbage collection. If not reused within a short time, the 211 + resources will be released to reduce system resource consumption. 212 + 213 + A cookie must be marked in-use before it can be accessed for read, write or 214 + resize - and an in-use mark must be kept whilst there is dirty data in the 215 + pagecache in order to avoid an oops due to trying to open a file during process 216 + exit. 217 + 218 + Note that in-use marks are cumulative. For each time a cookie is marked 219 + in-use, it must be unused. 
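The cumulative nature of in-use marks can be pictured with a small userspace model. This is only a sketch of the counting rule described above, not the fscache implementation: ``struct model_cookie`` and the ``model_*`` functions are hypothetical stand-ins, and the real functions also take coherency data and an object size, which the model leaves out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for a cookie; not the kernel's struct fscache_cookie. */
struct model_cookie {
	unsigned int n_active;  /* outstanding in-use marks */
	bool will_modify;       /* some user declared an intent to modify */
	bool gc_scheduled;      /* set when the last in-use mark is dropped */
};

/* Models the "use" side: each call adds one in-use mark. */
static void model_use_cookie(struct model_cookie *cookie, bool will_modify)
{
	if (!cookie)
		return;         /* a NULL cookie makes the call a no-op */
	cookie->n_active++;
	cookie->gc_scheduled = false;
	if (will_modify)
		cookie->will_modify = true;
}

/* Models the "unuse" side: each call removes one in-use mark, and the
 * last one schedules the cookie's resources for garbage collection.
 */
static void model_unuse_cookie(struct model_cookie *cookie)
{
	if (!cookie || cookie->n_active == 0)
		return;
	if (--cookie->n_active == 0)
		cookie->gc_scheduled = true;
}
```

With this model, two opens followed by one close leave ``n_active`` at 1 and nothing scheduled; only the second close drops the count to zero, which mirrors the rule that every use must be paired with an unuse.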
220 + 221 + 222 + Resizing A Data File (Truncation) 223 + ================================= 224 + 225 + If a network filesystem file is resized locally by truncation, the following 226 + should be called to notify the cache:: 227 + 228 + void fscache_resize_cookie(struct fscache_cookie *cookie, 229 + loff_t new_size); 230 + 231 + The caller must have first marked the cookie in-use. The cookie and the new 232 + size are passed in and the cache is synchronously resized. This is expected to 233 + be called from the ``->setattr()`` inode operation under the inode lock. 234 + 235 + 236 + Data I/O API 237 + ============ 238 + 239 + To do data I/O operations directly through a cookie, the following functions 240 + are available:: 241 + 242 + int fscache_begin_read_operation(struct netfs_cache_resources *cres, 243 + struct fscache_cookie *cookie); 244 + int fscache_read(struct netfs_cache_resources *cres, 245 + loff_t start_pos, 246 + struct iov_iter *iter, 247 + enum netfs_read_from_hole read_hole, 248 + netfs_io_terminated_t term_func, 249 + void *term_func_priv); 250 + int fscache_write(struct netfs_cache_resources *cres, 251 + loff_t start_pos, 252 + struct iov_iter *iter, 253 + netfs_io_terminated_t term_func, 254 + void *term_func_priv); 255 + 256 + The *begin* function sets up an operation, attaching the resources required to 257 + the cache resources block from the cookie. Assuming it doesn't return an error 258 + (for instance, it will return -ENOBUFS if given a NULL cookie, but otherwise do 259 + nothing), then one of the other two functions can be issued. 260 + 261 + The *read* and *write* functions initiate a direct-IO operation. Both take the 262 + previously set up cache resources block, an indication of the start file 263 + position, and an I/O iterator that describes the buffer and indicates the amount of 264 + data. 265 + 266 + The read function also takes a parameter to indicate how it should handle a 267 + partially populated region (a hole) in the disk content. 
This may be to ignore 268 + it, skip over an initial hole and place zeros in the buffer or give an error. 269 + 270 + The read and write functions can be given an optional termination function that 271 + will be run on completion:: 272 + 273 + typedef 274 + void (*netfs_io_terminated_t)(void *priv, ssize_t transferred_or_error, 275 + bool was_async); 276 + 277 + If a termination function is given, the operation will be run asynchronously 278 + and the termination function will be called upon completion. If not given, the 279 + operation will be run synchronously. Note that in the asynchronous case, it is 280 + possible for the operation to complete before the function returns. 281 + 282 + Both the read and write functions end the operation when they complete, 283 + detaching any pinned resources. 284 + 285 + The read operation will fail with ESTALE if invalidation occurred whilst the 286 + operation was ongoing. 287 + 288 + 289 + Data File Coherency 290 + =================== 291 + 292 + To request an update of the coherency data and file size on a cookie, the 293 + following should be called:: 294 + 295 + void fscache_update_cookie(struct fscache_cookie *cookie, 296 + const void *aux_data, 297 + const loff_t *object_size); 298 + 299 + This will update the cookie's coherency data and/or file size. 832 300 833 301 834 302 Data File Invalidation 835 303 ====================== 836 304 837 305 Sometimes it will be necessary to invalidate an object that contains data. 838 - Typically this will be necessary when the server tells the netfs of a foreign 839 - change - at which point the netfs has to throw away all the state it had for an 840 - inode and reload from the server. 306 + Typically this will be necessary when the server informs the network filesystem 307 + of a remote third-party change - at which point the filesystem has to throw 308 + away the state and cached data that it had for a file and reload from the 309 + server. 
841 310 842 - To indicate that a cache object should be invalidated, the following function 843 - can be called:: 311 + To indicate that a cache object should be invalidated, the following should be 312 + called:: 844 313 845 - void fscache_invalidate(struct fscache_cookie *cookie); 314 + void fscache_invalidate(struct fscache_cookie *cookie, 315 + const void *aux_data, 316 + loff_t size, 317 + unsigned int flags); 846 318 847 - This can be called with spinlocks held as it defers the work to a thread pool. 848 - All extant storage, retrieval and attribute change ops at this point are 849 - cancelled and discarded. Some future operations will be rejected until the 850 - cache has had a chance to insert a barrier in the operations queue. After 851 - that, operations will be queued again behind the invalidation operation. 319 + This increases the invalidation counter in the cookie to cause outstanding 320 + reads to fail with -ESTALE, sets the coherency data and file size from the 321 + information supplied, blocks new I/O on the cookie and dispatches the cache to 322 + go and get rid of the old data. 852 323 853 - The invalidation operation will perform an attribute change operation and an 854 - auxiliary data update operation as it is very likely these will have changed. 855 - 856 - Using the following function, the netfs can wait for the invalidation operation 857 - to have reached a point at which it can start submitting ordinary operations 858 - once again:: 859 - 860 - void fscache_wait_on_invalidate(struct fscache_cookie *cookie); 324 + Invalidation runs asynchronously in a worker thread so that it doesn't block 325 + too much. 861 326 862 327 863 - FS-cache Specific Page Flag 864 - =========================== 328 + Write-Back Resource Management 329 + ============================== 865 330 866 - FS-Cache makes use of a page flag, PG_private_2, for its own purpose. This is 867 - given the alternative name PG_fscache. 
331 + To write data to the cache from network filesystem writeback, the cache 332 + resources required need to be pinned at the point the modification is made (for 333 + instance when the page is marked dirty) as it's not possible to open a file in 334 + a thread that's exiting. 868 335 869 - PG_fscache is used to indicate that the page is known by the cache, and that 870 - the cache must be informed if the page is going to go away. It's an indication 871 - to the netfs that the cache has an interest in this page, where an interest may 872 - be a pointer to it, resources allocated or reserved for it, or I/O in progress 873 - upon it. 336 + The following facilities are provided to manage this: 874 337 875 - The netfs can use this information in methods such as releasepage() to 876 - determine whether it needs to uncache a page or update it. 338 + * An inode flag, ``I_PINNING_FSCACHE_WB``, is provided to indicate that an 339 + in-use is held on the cookie for this inode. It can only be changed if the 340 + inode lock is held. 877 341 878 - Furthermore, if this bit is set, releasepage() and invalidatepage() operations 879 - will be called on a page to get rid of it, even if PG_private is not set. This 880 - allows caching to attempted on a page before read_cache_pages() to be called 881 - after fscache_read_or_alloc_pages() as the former will try and release pages it 882 - was given under certain circumstances. 342 + * A flag, ``unpinned_fscache_wb`` is placed in the ``writeback_control`` 343 + struct that gets set if ``__writeback_single_inode()`` clears 344 + ``I_PINNING_FSCACHE_WB`` because all the dirty pages were cleared. 883 345 884 - This bit does not overlap with such as PG_private. This means that FS-Cache 885 - can be used with a filesystem that uses the block buffering code. 
346 + To support this, the following functions are provided:: 886 347 887 - There are a number of operations defined on this flag:: 348 + int fscache_set_page_dirty(struct page *page, 349 + struct fscache_cookie *cookie); 350 + void fscache_unpin_writeback(struct writeback_control *wbc, 351 + struct fscache_cookie *cookie); 352 + void fscache_clear_inode_writeback(struct fscache_cookie *cookie, 353 + struct inode *inode, 354 + const void *aux); 888 355 889 - int PageFsCache(struct page *page); 890 - void SetPageFsCache(struct page *page) 891 - void ClearPageFsCache(struct page *page) 892 - int TestSetPageFsCache(struct page *page) 893 - int TestClearPageFsCache(struct page *page) 356 + The *set* function is intended to be called from the filesystem's 357 + ``set_page_dirty`` address space operation. If ``I_PINNING_FSCACHE_WB`` is not 358 + set, it sets that flag and increments the use count on the cookie (the caller 359 + must already have called ``fscache_use_cookie()``). 894 360 895 - These functions are bit test, bit set, bit clear, bit test and set and bit 896 - test and clear operations on PG_fscache. 361 + The *unpin* function is intended to be called from the filesystem's 362 + ``write_inode`` superblock operation. It cleans up after writing by unusing 363 + the cookie if unpinned_fscache_wb is set in the writeback_control struct. 364 + 365 + The *clear* function is intended to be called from the netfs's ``evict_inode`` 366 + superblock operation. It must be called *after* 367 + ``truncate_inode_pages_final()``, but *before* ``clear_inode()``. This cleans 368 + up any hanging ``I_PINNING_FSCACHE_WB``. It also allows the coherency data to 369 + be updated. 
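The interplay between the pinning flag and the cookie's use count can be followed in a small userspace model. This is a sketch of the bookkeeping described above, not the kernel implementation: the ``model_*`` types and functions are hypothetical, ``pinning_wb`` stands in for ``I_PINNING_FSCACHE_WB``, and the assumption that the evict-time cleanup drops the pinned use is an inference from "cleans up any hanging" flag rather than something the text spells out.

```c
#include <stdbool.h>

/* Userspace stand-ins; not the kernel's struct inode or writeback_control. */
struct model_inode {
	bool pinning_wb;               /* models I_PINNING_FSCACHE_WB */
	unsigned int cookie_use_count; /* models the cookie's in-use counter */
};

struct model_wbc {
	bool unpinned_fscache_wb;
};

/* Models the "set" step: the first dirtying pins the cookie once. */
static void model_set_page_dirty(struct model_inode *inode)
{
	if (!inode->pinning_wb) {
		inode->pinning_wb = true;
		inode->cookie_use_count++;
	}
}

/* Models core writeback clearing the flag once all dirty pages are clean. */
static void model_writeback_done(struct model_inode *inode,
				 struct model_wbc *wbc)
{
	if (inode->pinning_wb) {
		inode->pinning_wb = false;
		wbc->unpinned_fscache_wb = true;
	}
}

/* Models the "unpin" step run from write_inode. */
static void model_unpin_writeback(struct model_inode *inode,
				  const struct model_wbc *wbc)
{
	if (wbc->unpinned_fscache_wb && inode->cookie_use_count > 0)
		inode->cookie_use_count--;
}

/* Models the "clear" step run at eviction; dropping the use here is an
 * assumption about what cleaning up a hanging flag involves.
 */
static void model_clear_inode_writeback(struct model_inode *inode)
{
	if (inode->pinning_wb) {
		inode->pinning_wb = false;
		if (inode->cookie_use_count > 0)
			inode->cookie_use_count--;
	}
}
```

Starting from a use count of 1 (the file is open), dirtying several pages raises the count to 2 exactly once, and a completed writeback followed by the unpin step brings it back to 1. If the inode is evicted while the flag is still set, the clear step releases the leftover pin instead.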
370 + 371 + 372 + Caching of Local Modifications 373 + ============================== 374 + 375 + If a network filesystem has locally modified data that it wants to write to the 376 + cache, it needs to mark the pages to indicate that a write is in progress, and 377 + if the mark is already present, it needs to wait for it to be removed first 378 + (presumably due to an already in-progress operation). This prevents multiple 379 + competing DIO writes to the same storage in the cache. 380 + 381 + Firstly, the netfs should determine if caching is available by doing something 382 + like:: 383 + 384 + bool caching = fscache_cookie_enabled(cookie); 385 + 386 + If caching is to be attempted, pages should be waited for and then marked using 387 + the following functions provided by the netfs helper library:: 388 + 389 + void set_page_fscache(struct page *page); 390 + void wait_on_page_fscache(struct page *page); 391 + int wait_on_page_fscache_killable(struct page *page); 392 + 393 + Once all the pages in the span are marked, the netfs can ask fscache to 394 + schedule a write of that region:: 395 + 396 + void fscache_write_to_cache(struct fscache_cookie *cookie, 397 + struct address_space *mapping, 398 + loff_t start, size_t len, loff_t i_size, 399 + netfs_io_terminated_t term_func, 400 + void *term_func_priv, 401 + bool caching); 402 + 403 + And if an error occurs before that point is reached, the marks can be removed 404 + by calling:: 405 + 406 + void fscache_clear_page_bits(struct fscache_cookie *cookie, 407 + struct address_space *mapping, 408 + loff_t start, size_t len, 409 + bool caching); 410 + 411 + In both of these functions, the cookie representing the cache object to be 412 + written to and a pointer to the mapping to which the source pages are attached 413 + are passed in; start and len indicate the region that's going to be 414 + written (it doesn't have to align to page boundaries necessarily, but it does 415 + have to align to DIO boundaries on
the backing filesystem). The caching 416 + parameter indicates whether caching should be attempted; if it is false, the functions 417 + do nothing. 418 + 419 + The write function takes some additional parameters: i_size indicates the size 420 + of the netfs file and term_func indicates an optional completion function, to 421 + which term_func_priv will be passed, along with the error or amount written. 422 + 423 + Note that the write function will always run asynchronously and will unmark all 424 + the pages upon completion before calling term_func. 425 + 426 + 427 + Page Release and Invalidation 428 + ============================= 429 + 430 + Fscache keeps track of whether we have any data in the cache yet for a cache 431 + object we've just created. It knows it doesn't have to do any reading until it 432 + has done a write and then the page it wrote from has been released by the VM, 433 + after which it *has* to look in the cache. 434 + 435 + To inform fscache that a page might now be in the cache, the following function 436 + should be called from the ``releasepage`` address space op:: 437 + 438 + void fscache_note_page_release(struct fscache_cookie *cookie); 439 + 440 + if the page has been released (i.e. releasepage returned true). 441 + 442 + Page release and page invalidation should also wait for any mark left on the 443 + page to say that a DIO write is underway from that page:: 444 + 445 + void wait_on_page_fscache(struct page *page); 446 + int wait_on_page_fscache_killable(struct page *page); 447 + 448 + 449 + API Function Reference 450 + ====================== 451 + 452 + .. kernel-doc:: include/linux/fscache.h
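The mark, write and unmark sequence from "Caching of Local Modifications" can be modelled in user space as below. This is a minimal sketch, not kernel code: all ``model_*`` names are hypothetical, regions are counted in whole pages rather than bytes, and the write completes synchronously for simplicity, whereas the text states that the real write function always runs asynchronously. Following the text, the model treats ``caching == false`` as a no-op.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NPAGES 8

/* Hypothetical stand-in for a mapping: one "write in progress" mark per page. */
struct model_mapping {
	bool marked[MODEL_NPAGES];
};

/* Hypothetical completion callback: gets priv plus the error or amount written. */
typedef void (*model_term_func_t)(void *priv, long transferred_or_error);

/* Example callback that stores the result where priv points. */
static void model_record_result(void *priv, long transferred_or_error)
{
	*(long *)priv = transferred_or_error;
}

/* Models calling set_page_fscache() on each page of a span. */
static void model_mark_span(struct model_mapping *m, size_t first, size_t count)
{
	for (size_t i = first; i < first + count && i < MODEL_NPAGES; i++)
		m->marked[i] = true;
}

/* Models fscache_clear_page_bits(): does nothing unless caching is true. */
static void model_clear_page_bits(struct model_mapping *m, size_t first,
				  size_t count, bool caching)
{
	if (!caching)
		return;
	for (size_t i = first; i < first + count && i < MODEL_NPAGES; i++)
		m->marked[i] = false;
}

/* Models fscache_write_to_cache(): unmark the pages, then call term_func. */
static void model_write_to_cache(struct model_mapping *m, size_t first,
				 size_t count, model_term_func_t term_func,
				 void *term_func_priv, bool caching)
{
	if (!caching)
		return;		/* per the text, nothing happens in this case */

	/* The data would be written to the cache at this point. */
	model_clear_page_bits(m, first, count, true);
	if (term_func)
		term_func(term_func_priv, (long)count);
}
```

A caller in this model marks a span, hands it to ``model_write_to_cache()``, and can rely on the marks being gone by the time its callback runs. On an error path before the write is issued, ``model_clear_page_bits()`` removes the marks instead.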
-313
Documentation/filesystems/caching/object.rst
··· 1 - .. SPDX-License-Identifier: GPL-2.0 2 - 3 - ==================================================== 4 - In-Kernel Cache Object Representation and Management 5 - ==================================================== 6 - 7 - By: David Howells <dhowells@redhat.com> 8 - 9 - .. Contents: 10 - 11 - (*) Representation 12 - 13 - (*) Object management state machine. 14 - 15 - - Provision of cpu time. 16 - - Locking simplification. 17 - 18 - (*) The set of states. 19 - 20 - (*) The set of events. 21 - 22 - 23 - Representation 24 - ============== 25 - 26 - FS-Cache maintains an in-kernel representation of each object that a netfs is 27 - currently interested in. Such objects are represented by the fscache_cookie 28 - struct and are referred to as cookies. 29 - 30 - FS-Cache also maintains a separate in-kernel representation of the objects that 31 - a cache backend is currently actively caching. Such objects are represented by 32 - the fscache_object struct. The cache backends allocate these upon request, and 33 - are expected to embed them in their own representations. These are referred to 34 - as objects. 35 - 36 - There is a 1:N relationship between cookies and objects. A cookie may be 37 - represented by multiple objects - an index may exist in more than one cache - 38 - or even by no objects (it may not be cached). 39 - 40 - Furthermore, both cookies and objects are hierarchical. 
The two hierarchies 41 - correspond, but the cookies tree is a superset of the union of the object trees 42 - of multiple caches:: 43 - 44 - NETFS INDEX TREE : CACHE 1 : CACHE 2 45 - : : 46 - : +-----------+ : 47 - +----------->| IObject | : 48 - +-----------+ | : +-----------+ : 49 - | ICookie |-------+ : | : 50 - +-----------+ | : | : +-----------+ 51 - | +------------------------------>| IObject | 52 - | : | : +-----------+ 53 - | : V : | 54 - | : +-----------+ : | 55 - V +----------->| IObject | : | 56 - +-----------+ | : +-----------+ : | 57 - | ICookie |-------+ : | : V 58 - +-----------+ | : | : +-----------+ 59 - | +------------------------------>| IObject | 60 - +-----+-----+ : | : +-----------+ 61 - | | : | : | 62 - V | : V : | 63 - +-----------+ | : +-----------+ : | 64 - | ICookie |------------------------->| IObject | : | 65 - +-----------+ | : +-----------+ : | 66 - | V : | : V 67 - | +-----------+ : | : +-----------+ 68 - | | ICookie |-------------------------------->| IObject | 69 - | +-----------+ : | : +-----------+ 70 - V | : V : | 71 - +-----------+ | : +-----------+ : | 72 - | DCookie |------------------------->| DObject | : | 73 - +-----------+ | : +-----------+ : | 74 - | : : | 75 - +-------+-------+ : : | 76 - | | : : | 77 - V V : : V 78 - +-----------+ +-----------+ : : +-----------+ 79 - | DCookie | | DCookie |------------------------>| DObject | 80 - +-----------+ +-----------+ : : +-----------+ 81 - : : 82 - 83 - In the above illustration, ICookie and IObject represent indices and DCookie 84 - and DObject represent data storage objects. Indices may have representation in 85 - multiple caches, but currently, non-index objects may not. Objects of any type 86 - may also be entirely unrepresented. 87 - 88 - As far as the netfs API goes, the netfs is only actually permitted to see 89 - pointers to the cookies. The cookies themselves and any objects attached to 90 - those cookies are hidden from it. 
91 - 92 - 93 - Object Management State Machine 94 - =============================== 95 - 96 - Within FS-Cache, each active object is managed by its own individual state 97 - machine. The state for an object is kept in the fscache_object struct, in 98 - object->state. A cookie may point to a set of objects that are in different 99 - states. 100 - 101 - Each state has an action associated with it that is invoked when the machine 102 - wakes up in that state. There are four logical sets of states: 103 - 104 - (1) Preparation: states that wait for the parent objects to become ready. The 105 - representations are hierarchical, and it is expected that an object must 106 - be created or accessed with respect to its parent object. 107 - 108 - (2) Initialisation: states that perform lookups in the cache and validate 109 - what's found and that create on disk any missing metadata. 110 - 111 - (3) Normal running: states that allow netfs operations on objects to proceed 112 - and that update the state of objects. 113 - 114 - (4) Termination: states that detach objects from their netfs cookies, that 115 - delete objects from disk, that handle disk and system errors and that free 116 - up in-memory resources. 117 - 118 - 119 - In most cases, transitioning between states is in response to signalled events. 120 - When a state has finished processing, it will usually set the mask of events in 121 - which it is interested (object->event_mask) and relinquish the worker thread. 122 - Then when an event is raised (by calling fscache_raise_event()), if the event 123 - is not masked, the object will be queued for processing (by calling 124 - fscache_enqueue_object()). 125 - 126 - 127 - Provision of CPU Time 128 - --------------------- 129 - 130 - The work to be done by the various states was given CPU time by the threads of 131 - the slow work facility. 
This was used in preference to the workqueue facility 132 - because: 133 - 134 - (1) Threads may be completely occupied for very long periods of time by a 135 - particular work item. These state actions may be doing sequences of 136 - synchronous, journalled disk accesses (lookup, mkdir, create, setxattr, 137 - getxattr, truncate, unlink, rmdir, rename). 138 - 139 - (2) Threads may do little actual work, but may rather spend a lot of time 140 - sleeping on I/O. This means that single-threaded and 1-per-CPU-threaded 141 - workqueues don't necessarily have the right numbers of threads. 142 - 143 - 144 - Locking Simplification 145 - ---------------------- 146 - 147 - Because only one worker thread may be operating on any particular object's 148 - state machine at once, this simplifies the locking, particularly with respect 149 - to disconnecting the netfs's representation of a cache object (fscache_cookie) 150 - from the cache backend's representation (fscache_object) - which may be 151 - requested from either end. 152 - 153 - 154 - The Set of States 155 - ================= 156 - 157 - The object state machine has a set of states that it can be in. There are 158 - preparation states in which the object sets itself up and waits for its parent 159 - object to transit to a state that allows access to its children: 160 - 161 - (1) State FSCACHE_OBJECT_INIT. 162 - 163 - Initialise the object and wait for the parent object to become active. In 164 - the cache, it is expected that it will not be possible to look an object 165 - up from the parent object, until that parent object itself has been looked 166 - up. 167 - 168 - There are initialisation states in which the object sets itself up and accesses 169 - disk for the object metadata: 170 - 171 - (2) State FSCACHE_OBJECT_LOOKING_UP. 172 - 173 - Look up the object on disk, using the parent as a starting point. 
174 - FS-Cache expects the cache backend to probe the cache to see whether this 175 - object is represented there, and if it is, to see if it's valid (coherency 176 - management). 177 - 178 - The cache should call fscache_object_lookup_negative() to indicate lookup 179 - failure for whatever reason, and should call fscache_obtained_object() to 180 - indicate success. 181 - 182 - At the completion of lookup, FS-Cache will let the netfs go ahead with 183 - read operations, no matter whether the file is yet cached. If not yet 184 - cached, read operations will be immediately rejected with ENODATA until 185 - the first known page is uncached - as to that point there can be no data 186 - to be read out of the cache for that file that isn't currently also held 187 - in the pagecache. 188 - 189 - (3) State FSCACHE_OBJECT_CREATING. 190 - 191 - Create an object on disk, using the parent as a starting point. This 192 - happens if the lookup failed to find the object, or if the object's 193 - coherency data indicated what's on disk is out of date. In this state, 194 - FS-Cache expects the cache to create 195 - 196 - The cache should call fscache_obtained_object() if creation completes 197 - successfully, fscache_object_lookup_negative() otherwise. 198 - 199 - At the completion of creation, FS-Cache will start processing write 200 - operations the netfs has queued for an object. If creation failed, the 201 - write ops will be transparently discarded, and nothing recorded in the 202 - cache. 203 - 204 - There are some normal running states in which the object spends its time 205 - servicing netfs requests: 206 - 207 - (4) State FSCACHE_OBJECT_AVAILABLE. 208 - 209 - A transient state in which pending operations are started, child objects 210 - are permitted to advance from FSCACHE_OBJECT_INIT state, and temporary 211 - lookup data is freed. 212 - 213 - (5) State FSCACHE_OBJECT_ACTIVE. 214 - 215 - The normal running state. 
In this state, requests the netfs makes will be 216 - passed on to the cache. 217 - 218 - (6) State FSCACHE_OBJECT_INVALIDATING. 219 - 220 - The object is undergoing invalidation. When the state comes here, it 221 - discards all pending read, write and attribute change operations as it is 222 - going to clear out the cache entirely and reinitialise it. It will then 223 - continue to the FSCACHE_OBJECT_UPDATING state. 224 - 225 - (7) State FSCACHE_OBJECT_UPDATING. 226 - 227 - The state machine comes here to update the object in the cache from the 228 - netfs's records. This involves updating the auxiliary data that is used 229 - to maintain coherency. 230 - 231 - And there are terminal states in which an object cleans itself up, deallocates 232 - memory and potentially deletes stuff from disk: 233 - 234 - (8) State FSCACHE_OBJECT_LC_DYING. 235 - 236 - The object comes here if it is dying because of a lookup or creation 237 - error. This would be due to a disk error or system error of some sort. 238 - Temporary data is cleaned up, and the parent is released. 239 - 240 - (9) State FSCACHE_OBJECT_DYING. 241 - 242 - The object comes here if it is dying due to an error, because its parent 243 - cookie has been relinquished by the netfs or because the cache is being 244 - withdrawn. 245 - 246 - Any child objects waiting on this one are given CPU time so that they too 247 - can destroy themselves. This object waits for all its children to go away 248 - before advancing to the next state. 249 - 250 - (10) State FSCACHE_OBJECT_ABORT_INIT. 251 - 252 - The object comes to this state if it was waiting on its parent in 253 - FSCACHE_OBJECT_INIT, but its parent died. The object will destroy itself 254 - so that the parent may proceed from the FSCACHE_OBJECT_DYING state. 255 - 256 - (11) State FSCACHE_OBJECT_RELEASING. 257 - (12) State FSCACHE_OBJECT_RECYCLING. 
258 - 259 - The object comes to one of these two states when dying once it is rid of 260 - all its children, if it is dying because the netfs relinquished its 261 - cookie. In the first state, the cached data is expected to persist, and 262 - in the second it will be deleted. 263 - 264 - (13) State FSCACHE_OBJECT_WITHDRAWING. 265 - 266 - The object transits to this state if the cache decides it wants to 267 - withdraw the object from service, perhaps to make space, but also due to 268 - error or just because the whole cache is being withdrawn. 269 - 270 - (14) State FSCACHE_OBJECT_DEAD. 271 - 272 - The object transits to this state when the in-memory object record is 273 - ready to be deleted. The object processor shouldn't ever see an object in 274 - this state. 275 - 276 - 277 - The Set of Events 278 - ----------------- 279 - 280 - There are a number of events that can be raised to an object state machine: 281 - 282 - FSCACHE_OBJECT_EV_UPDATE 283 - The netfs requested that an object be updated. The state machine will ask 284 - the cache backend to update the object, and the cache backend will ask the 285 - netfs for details of the change through its cookie definition ops. 286 - 287 - FSCACHE_OBJECT_EV_CLEARED 288 - This is signalled in two circumstances: 289 - 290 - (a) when an object's last child object is dropped and 291 - 292 - (b) when the last operation outstanding on an object is completed. 293 - 294 - This is used to proceed from the dying state. 295 - 296 - FSCACHE_OBJECT_EV_ERROR 297 - This is signalled when an I/O error occurs during the processing of some 298 - object. 299 - 300 - FSCACHE_OBJECT_EV_RELEASE, FSCACHE_OBJECT_EV_RETIRE 301 - These are signalled when the netfs relinquishes a cookie it was using. 302 - The event selected depends on whether the netfs asks for the backing 303 - object to be retired (deleted) or retained. 304 - 305 - FSCACHE_OBJECT_EV_WITHDRAW 306 - This is signalled when the cache backend wants to withdraw an object. 
307 - This means that the object will have to be detached from the netfs's 308 - cookie. 309 - 310 - Because the withdrawing releasing/retiring events are all handled by the object 311 - state machine, it doesn't matter if there's a collision with both ends trying 312 - to sever the connection at the same time. The state machine can just pick 313 - which one it wants to honour, and that effects the other.
-210
Documentation/filesystems/caching/operations.rst
··· 1 - .. SPDX-License-Identifier: GPL-2.0 2 - 3 - ================================ 4 - Asynchronous Operations Handling 5 - ================================ 6 - 7 - By: David Howells <dhowells@redhat.com> 8 - 9 - .. Contents: 10 - 11 - (*) Overview. 12 - 13 - (*) Operation record initialisation. 14 - 15 - (*) Parameters. 16 - 17 - (*) Procedure. 18 - 19 - (*) Asynchronous callback. 20 - 21 - 22 - Overview 23 - ======== 24 - 25 - FS-Cache has an asynchronous operations handling facility that it uses for its 26 - data storage and retrieval routines. Its operations are represented by 27 - fscache_operation structs, though these are usually embedded into some other 28 - structure. 29 - 30 - This facility is available to and expected to be used by the cache backends, 31 - and FS-Cache will create operations and pass them off to the appropriate cache 32 - backend for completion. 33 - 34 - To make use of this facility, <linux/fscache-cache.h> should be #included. 35 - 36 - 37 - Operation Record Initialisation 38 - =============================== 39 - 40 - An operation is recorded in an fscache_operation struct:: 41 - 42 - struct fscache_operation { 43 - union { 44 - struct work_struct fast_work; 45 - struct slow_work slow_work; 46 - }; 47 - unsigned long flags; 48 - fscache_operation_processor_t processor; 49 - ... 50 - }; 51 - 52 - Someone wanting to issue an operation should allocate something with this 53 - struct embedded in it. They should initialise it by calling:: 54 - 55 - void fscache_operation_init(struct fscache_operation *op, 56 - fscache_operation_release_t release); 57 - 58 - with the operation to be initialised and the release function to use. 59 - 60 - The op->flags parameter should be set to indicate the CPU time provision and 61 - the exclusivity (see the Parameters section). 62 - 63 - The op->fast_work, op->slow_work and op->processor flags should be set as 64 - appropriate for the CPU time provision (see the Parameters section). 
65 - 66 - FSCACHE_OP_WAITING may be set in op->flags prior to each submission of the 67 - operation and waited for afterwards. 68 - 69 - 70 - Parameters 71 - ========== 72 - 73 - There are a number of parameters that can be set in the operation record's flag 74 - parameter. There are three options for the provision of CPU time in these 75 - operations: 76 - 77 - (1) The operation may be done synchronously (FSCACHE_OP_MYTHREAD). A thread 78 - may decide it wants to handle an operation itself without deferring it to 79 - another thread. 80 - 81 - This is, for example, used in read operations for calling readpages() on 82 - the backing filesystem in CacheFiles. Although readpages() does an 83 - asynchronous data fetch, the determination of whether pages exist is done 84 - synchronously - and the netfs does not proceed until this has been 85 - determined. 86 - 87 - If this option is to be used, FSCACHE_OP_WAITING must be set in op->flags 88 - before submitting the operation, and the operating thread must wait for it 89 - to be cleared before proceeding:: 90 - 91 - wait_on_bit(&op->flags, FSCACHE_OP_WAITING, 92 - TASK_UNINTERRUPTIBLE); 93 - 94 - 95 - (2) The operation may be fast asynchronous (FSCACHE_OP_FAST), in which case it 96 - will be given to keventd to process. Such an operation is not permitted 97 - to sleep on I/O. 98 - 99 - This is, for example, used by CacheFiles to copy data from a backing fs 100 - page to a netfs page after the backing fs has read the page in. 101 - 102 - If this option is used, op->fast_work and op->processor must be 103 - initialised before submitting the operation:: 104 - 105 - INIT_WORK(&op->fast_work, do_some_work); 106 - 107 - 108 - (3) The operation may be slow asynchronous (FSCACHE_OP_SLOW), in which case it 109 - will be given to the slow work facility to process. Such an operation is 110 - permitted to sleep on I/O. 
111 - 112 - This is, for example, used by FS-Cache to handle background writes of 113 - pages that have just been fetched from a remote server. 114 - 115 - If this option is used, op->slow_work and op->processor must be 116 - initialised before submitting the operation:: 117 - 118 - fscache_operation_init_slow(op, processor) 119 - 120 - 121 - Furthermore, operations may be one of two types: 122 - 123 - (1) Exclusive (FSCACHE_OP_EXCLUSIVE). Operations of this type may not run in 124 - conjunction with any other operation on the object being operated upon. 125 - 126 - An example of this is the attribute change operation, in which the file 127 - being written to may need truncation. 128 - 129 - (2) Shareable. Operations of this type may be running simultaneously. It's 130 - up to the operation implementation to prevent interference between other 131 - operations running at the same time. 132 - 133 - 134 - Procedure 135 - ========= 136 - 137 - Operations are used through the following procedure: 138 - 139 - (1) The submitting thread must allocate the operation and initialise it 140 - itself. Normally this would be part of a more specific structure with the 141 - generic op embedded within. 142 - 143 - (2) The submitting thread must then submit the operation for processing using 144 - one of the following two functions:: 145 - 146 - int fscache_submit_op(struct fscache_object *object, 147 - struct fscache_operation *op); 148 - 149 - int fscache_submit_exclusive_op(struct fscache_object *object, 150 - struct fscache_operation *op); 151 - 152 - The first function should be used to submit non-exclusive ops and the 153 - second to submit exclusive ones. The caller must still set the 154 - FSCACHE_OP_EXCLUSIVE flag. 155 - 156 - If successful, both functions will assign the operation to the specified 157 - object and return 0. -ENOBUFS will be returned if the object specified is 158 - permanently unavailable. 
159 - 160 - The operation manager will defer operations on an object that is still 161 - undergoing lookup or creation. The operation will also be deferred if an 162 - operation of conflicting exclusivity is in progress on the object. 163 - 164 - If the operation is asynchronous, the manager will retain a reference to 165 - it, so the caller should put their reference to it by passing it to:: 166 - 167 - void fscache_put_operation(struct fscache_operation *op); 168 - 169 - (3) If the submitting thread wants to do the work itself, and has marked the 170 - operation with FSCACHE_OP_MYTHREAD, then it should monitor 171 - FSCACHE_OP_WAITING as described above and check the state of the object if 172 - necessary (the object might have died while the thread was waiting). 173 - 174 - When it has finished doing its processing, it should call 175 - fscache_op_complete() and fscache_put_operation() on it. 176 - 177 - (4) The operation holds an effective lock upon the object, preventing other 178 - exclusive ops conflicting until it is released. The operation can be 179 - enqueued for further immediate asynchronous processing by adjusting the 180 - CPU time provisioning option if necessary, eg:: 181 - 182 - op->flags &= ~FSCACHE_OP_TYPE; 183 - op->flags |= ~FSCACHE_OP_FAST; 184 - 185 - and calling:: 186 - 187 - void fscache_enqueue_operation(struct fscache_operation *op) 188 - 189 - This can be used to allow other things to have use of the worker thread 190 - pools. 191 - 192 - 193 - Asynchronous Callback 194 - ===================== 195 - 196 - When used in asynchronous mode, the worker thread pool will invoke the 197 - processor method with a pointer to the operation. This should then get at the 198 - container struct by using container_of():: 199 - 200 - static void fscache_write_op(struct fscache_operation *_op) 201 - { 202 - struct fscache_storage *op = 203 - container_of(_op, struct fscache_storage, op); 204 - ... 
205 - } 206 - 207 - The caller holds a reference on the operation, and will invoke 208 - fscache_put_operation() when the processor function returns. The processor 209 - function is at liberty to call fscache_enqueue_operation() or to take extra 210 - references.
+10 -6
Documentation/filesystems/netfs_library.rst
··· 454 454 void *term_func_priv); 455 455 456 456 int (*prepare_write)(struct netfs_cache_resources *cres, 457 - loff_t *_start, size_t *_len, loff_t i_size); 457 + loff_t *_start, size_t *_len, loff_t i_size, 458 + bool no_space_allocated_yet); 458 459 459 460 int (*write)(struct netfs_cache_resources *cres, 460 461 loff_t start_pos, ··· 516 515 517 516 * ``prepare_write()`` 518 517 519 - [Required] Called to adjust a write to the cache and check that there is 520 - sufficient space in the cache. The start and length values indicate the 521 - size of the write that netfslib is proposing, and this can be adjusted by 522 - the cache to respect DIO boundaries. The file size is passed for 523 - information. 518 + [Required] Called to prepare a write to the cache to take place. This 519 + involves checking to see whether the cache has sufficient space to honour 520 + the write. ``*_start`` and ``*_len`` indicate the region to be written; the 521 + region can be shrunk or it can be expanded to a page boundary either way as 522 + necessary to align for direct I/O. i_size holds the size of the object and 523 + is provided for reference. no_space_allocated_yet is set to true if the 524 + caller is certain that no data has been written to that region - for example 525 + if it tried to do a read from there already. 524 526 525 527 * ``write()`` 526 528