Log-structured file system

Title: Log-structured file system  
Author: World Heritage Encyclopedia
Language: English
Subject: JFFS, Journaling file system, YAFFS, Log-structured File System (BSD), Logic File System
Collection: Bell Labs, Computer File Systems, Fault-Tolerant Computer Systems
Publisher: World Heritage Encyclopedia

A log-structured file system is a file system in which data and metadata are written sequentially to a circular buffer, called a log.[1] The design was first proposed in 1988 by John K. Ousterhout and Fred Douglis, and first implemented in 1992 by John K. Ousterhout and Mendel Rosenblum.

Contents

  • Rationale
  • Implementations
  • Disadvantages
  • See also
  • References

Rationale

Conventional file systems tend to lay out files with great care for spatial locality and make in-place changes to their data structures in order to perform well on optical and magnetic disks, which tend to seek relatively slowly.

The design of log-structured file systems is based on the hypothesis that this approach will no longer be effective: ever-increasing memory sizes on modern computers would let almost all reads be satisfied from the memory cache, leaving I/O dominated by writes. A log-structured file system thus treats its storage as a circular log and writes sequentially to the head of the log.

This has several important side effects:

  • Write throughput on optical and magnetic disks is improved because writes can be batched into large sequential runs and costly seeks are kept to a minimum.
  • Writes create multiple, chronologically advancing versions of both file data and metadata. Some implementations make these old file versions nameable and accessible, a feature sometimes called time-travel or snapshotting. This is very similar to a versioning file system.
  • Recovery from crashes is simpler. Upon its next mount, the file system does not need to walk all its data structures to fix any inconsistencies, but can reconstruct its state from the last consistent point in the log.
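
The versioning and recovery effects listed above can be illustrated with a toy append-only log. This is a minimal sketch, not how any particular file system is implemented: the names `ToyLog`, `Record`, `checkpoint` and `recover` are hypothetical, the "disk" is a Python list, and a checkpoint is assumed to be a saved copy of the in-memory index plus the log length it covers.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    seq: int     # position of this record in the log
    path: str    # file the record belongs to
    data: bytes  # file contents as of this write


@dataclass
class ToyLog:
    records: list = field(default_factory=list)  # the "disk": append-only
    index: dict = field(default_factory=dict)    # in memory: path -> newest seq

    def write(self, path, data):
        """Append a new version; older versions stay in the log untouched."""
        seq = len(self.records)
        self.records.append(Record(seq, path, data))
        self.index[path] = seq
        return seq

    def read(self, path, as_of=None):
        """Return the newest version, or the newest one with seq <= as_of."""
        if as_of is None:
            return self.records[self.index[path]].data
        for rec in reversed(self.records[: as_of + 1]):
            if rec.path == path:
                return rec.data
        raise KeyError(path)

    def checkpoint(self):
        """Snapshot of the index plus the log length it covers."""
        return len(self.records), dict(self.index)

    @classmethod
    def recover(cls, records, checkpoint=None):
        """Rebuild the index by replaying only records after the checkpoint."""
        start, saved = checkpoint if checkpoint else (0, {})
        log = cls(records=list(records), index=dict(saved))
        for rec in log.records[start:]:
            log.index[rec.path] = rec.seq
        return log
```

Reading with `as_of` set to an earlier sequence number returns an older version of a file, which is the "time-travel" idea in miniature. After a simulated crash, `ToyLog.recover(records, checkpoint)` replays only the tail of the log instead of scanning every structure.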

Log-structured file systems, however, must reclaim free space from the tail of the log to prevent the file system from becoming full when the head of the log wraps around to meet it. The tail can release space and move forward by skipping over data for which newer versions exist farther ahead in the log. If there are no newer versions, then the data is moved and appended to the head.
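
The tail-reclamation rule just described can be sketched as follows. This is a simplified model under stated assumptions: the class `CircularLog` and its methods are hypothetical, each slot holds one `(key, value)` pair, and the log is cleaned one slot at a time only when it is completely full.

```python
class CircularLog:
    def __init__(self, capacity):
        self.slots = [None] * capacity  # each slot holds (key, value) or None
        self.head = 0                   # next slot to write
        self.tail = 0                   # oldest slot not yet reclaimed
        self.used = 0                   # occupied slots between tail and head
        self.live = {}                  # key -> slot index of its newest version

    def _append(self, key, value):
        self.slots[self.head] = (key, value)
        self.live[key] = self.head
        self.head = (self.head + 1) % len(self.slots)
        self.used += 1

    def _reclaim_one(self):
        """Advance the tail by one slot; return True if space was gained."""
        pos = self.tail
        key, value = self.slots[pos]
        self.slots[pos] = None
        self.tail = (pos + 1) % len(self.slots)
        self.used -= 1
        if self.live.get(key) == pos:
            # No newer version exists: move the data to the head of the log.
            self._append(key, value)
            return False
        # A newer version exists farther ahead: the old copy is simply skipped.
        return True

    def write(self, key, value):
        if self.used == len(self.slots):
            # Try at most one full pass over the log looking for stale data.
            for _ in range(len(self.slots)):
                if self._reclaim_one():
                    break
            else:
                raise OSError("log is full of live data")
        self._append(key, value)

    def read(self, key):
        return self.slots[self.live[key]][1]
```

In this model, moving live data from the tail to the head gains no space, so a log that contains only live data cannot accept further writes. That is the overhead the segment-based designs below try to reduce.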

To reduce the overhead incurred by this garbage collection, most implementations avoid purely circular logs and divide up their storage into segments. The head of the log simply advances into non-adjacent segments which are already free. If space is needed, the least-full segments are reclaimed first. This decreases the I/O load of the garbage collector, but becomes increasingly ineffective as the file system fills up and nears capacity.
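
A greedy "least-full segments first" policy might look like the sketch below. It is an illustration under assumptions, not any implementation's actual cleaner: the function name `pick_segments_to_clean`, its parameters, and the per-segment live-block counts are all hypothetical.

```python
def pick_segments_to_clean(live_counts, segment_size, blocks_needed):
    """Choose segments to clean, emptiest first.

    live_counts maps a segment id to its number of live blocks.
    Returns (chosen segment ids, blocks freed, live blocks copied).
    """
    chosen, freed, copied = [], 0, 0
    # Visit segments in order of increasing live data.
    for seg, live in sorted(live_counts.items(), key=lambda kv: kv[1]):
        if freed >= blocks_needed:
            break
        if live == segment_size:
            break  # only fully live segments remain; cleaning them frees nothing
        chosen.append(seg)
        freed += segment_size - live  # net space gained from this segment
        copied += live                # live blocks the cleaner must rewrite
    return chosen, freed, copied


# Example: four segments of 8 blocks each, 10 free blocks wanted.
segments = {"s0": 7, "s1": 1, "s2": 4, "s3": 8}
print(pick_segments_to_clean(segments, segment_size=8, blocks_needed=10))
# prints (['s1', 's2'], 11, 5)
```

The ratio of `copied` to `freed` is the cleaner's overhead. When every segment is nearly full, each one cleaned yields only a few free blocks while most of its contents must be copied, which is why the approach degrades as the file system nears capacity.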

Implementations

  • John K. Ousterhout and Mendel Rosenblum implemented the first log-structured file system for the Sprite operating system in 1992.[2][3]
  • BSD-LFS, an implementation by Margo Seltzer, was added to 4.4BSD and later ported to 386BSD. It lacked support for snapshots. It was removed from FreeBSD and OpenBSD, but lives on in NetBSD.
  • Plan 9's Fossil file system is also log-structured and supports snapshots.
  • NILFS is a log-structured file system implementation for Linux by NTT/Verio which supports snapshots.
  • Google Summer of Code 2005. Both projects have been abandoned.
  • LFS is another log-structured file system for Linux developed by Charles University, Prague. It was to include support for snapshots and indexed directories, but development has since ceased.
  • ULFS is a User-Level Log-structured File System (http://ulfs.sf.net) using FUSE (http://fuse.sf.net).
  • Write Anywhere File Layout (WAFL) by NetApp is a file layout that supports large, high-performance RAID arrays, quick restarts without lengthy consistency checks after a crash or power failure, and growing the file system's size quickly. It builds on log-structured file system concepts and supports snapshots and off-line data deduplication (http://community.netapp.com/fukiw75442/attachments/fukiw75442/data-ontap-discussions/2334/1/WAFL.pdf).
  • LSFS is a log-structured file system with writable snapshots and inline data deduplication created by StarWind Software (https://www.starwindsoftware.com/vm-centric-storage-lsfs).
  • CASL is a proprietary log-structured file system that uses solid-state devices to cache traditional hard drives (http://www.nimblestorage.com/products/architecture/).
  • ObjectiveFS is a log-structured FUSE file system that uses cloud object stores, e.g. Amazon S3, Google Cloud Storage and private cloud object stores (https://www.objectivefs.com).

Some kinds of storage media, such as flash memory and CD-RW, slowly degrade as they are written to and have a limited number of erase/write cycles at any one location. Log-structured file systems are sometimes used on these media because they make fewer in-place writes and thus prolong the life of the device by wear leveling. The more common such file systems include:

  • UDF is a file system commonly used on optical discs.
  • JFFS and its successor JFFS2 are simple Linux file systems intended for raw flash-based devices.
  • UBIFS is a file system for raw NAND flash media and is also intended to replace JFFS2.
  • LogFS is a scalable flash filesystem for Linux that works on both raw flash media and block devices, intended to replace JFFS2.
  • YAFFS is a raw NAND flash-specific file system for many operating systems (including Linux).
  • F2FS is a file system designed for NAND flash memory-based storage devices on Linux.

Disadvantages

The design rationale for log-structured file systems assumes that most reads will be optimized away by ever-enlarging memory caches. This assumption does not always hold:

  • On magnetic media—where seeks are relatively expensive—the log structure may actually make reads much slower, since it fragments files that conventional file systems normally keep contiguous with in-place writes.
  • On flash memory, where seek times are usually negligible, the log structure may not confer a worthwhile performance gain, because write fragmentation has much less impact on write throughput. However, many flash-based devices cannot rewrite part of a block in place: they must first perform a slow erase cycle on the whole block before rewriting it. Batching all writes into one block can therefore help performance compared with scattering them across many blocks, each of which must be copied into a buffer, erased, and written back.
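
The erase-block point can be made concrete with a rough count of erase cycles. This is a back-of-the-envelope sketch under assumed parameters: the block size of 64 pages and both function names are hypothetical, and the model ignores later garbage collection of the log as well as any translation layer inside the device.

```python
import math

PAGES_PER_BLOCK = 64  # assumed erase-block size, in pages


def erases_scattered(page_numbers, pages_per_block=PAGES_PER_BLOCK):
    """In-place updates: every distinct block touched is copied, erased, rewritten."""
    return len({page // pages_per_block for page in page_numbers})


def erases_batched(page_numbers, pages_per_block=PAGES_PER_BLOCK):
    """Log-style updates: the pages are packed into as few fresh blocks as possible."""
    return math.ceil(len(page_numbers) / pages_per_block)


# Eight page updates scattered across the device.
updates = [3, 70, 130, 200, 260, 400, 500, 640]
print(erases_scattered(updates))  # prints 8
print(erases_batched(updates))    # prints 1
```

In this model the eight scattered updates touch eight different erase blocks, while the batched version fits them into one. The saving shrinks when updates already fall within the same block.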

See also

References

  1. Arpaci-Dusseau, Remzi H.; Arpaci-Dusseau, Andrea C. (2014). Log-structured File Systems (PDF). Arpaci-Dusseau Books.
  2. Rosenblum, Mendel; Ousterhout, John K. (June 1990). "The LFS Storage Manager". Proceedings of the 1990 Summer Usenix. pp. 315-324.
  3. Rosenblum, Mendel; Ousterhout, John K. (February 1992). "The Design and Implementation of a Log-Structured File System". ACM Transactions on Computer Systems, Vol. 10, Issue 1. pp. 26-52.