On all Unix-like systems, the primary repository of information is the file tree, rooted at “/”. The file tree is a hierarchical set of directories, each of which may contain filesystem objects (FSOs).
In Linux, filesystem objects (FSOs) may be ordinary files, directories, symbolic links, named pipes (also called first-in first-outs or FIFOs), sockets (see below), character special (device) files, or block special (device) files (in Linux, this list is given in the find(1) command). Other Unix-like systems have an identical or similar list of FSO types.
Filesystem objects are collected on filesystems, which can be mounted and unmounted on directories in the file tree. A filesystem type (e.g., ext2 and FAT) is a specific set of conventions for arranging data on the disk to optimize speed, reliability, and so on; many people use the term “filesystem” as a synonym for the filesystem type.
Different Unix-like systems support different filesystem types. Filesystems may have slightly different sets of access control attributes and access controls can be affected by options selected at mount time. On Linux, the ext2 filesystems is currently the most popular filesystem, but Linux supports a vast number of filesystems. Most Unix-like systems tend to support multiple filesystems too.
Most filesystems on Unix-like systems store at least the following:
owning UID and GID - identifies the “owner” of the filesystem object. Only the owner or root can change the access control attributes unless otherwise noted.
permission bits - read, write, execute bits for each of user (owner), group, and other. For ordinary files, read, write, and execute have their typical meanings. In directories, the “read” permission is necessary to display a directory’s contents, while the “execute” permission is sometimes called “search” permission and is necessary to actually enter the directory to use its contents. In a directory “write” permission on a directory permits adding, removing, and renaming files in that directory; if you only want to permit adding, set the sticky bit noted below. Note that the permission values of symbolic links are never used; it’s only the values of their containing directories and the linked-to file that matter.
“sticky” bit - when set on a directory, unlinks (removes) and renames of files in that directory are limited to the file owner, the directory owner, or root privileges. This is a very common Unix extension and is specified in the Open Group’s Single Unix Specification version 2. Old versions of Unix called this the “save program text” bit and used this to indicate executable files that should stay in memory. Systems that did this ensured that only root could set this bit (otherwise users could have crashed systems by forcing “everything” into memory). In Linux, this bit has no effect on ordinary files and ordinary users can modify this bit on the files they own: Linux’s virtual memory management makes this old use irrelevant.
setuid, setgid - when set on an executable file, executing the file will set the process’ effective UID or effective GID to the value of the file’s owning UID or GID (respectively). All Unix-like systems support this. In Linux and System V systems, when setgid is set on a file that does not have any execute privileges, this indicates a file that is subject to mandatory locking during access (if the filesystem is mounted to support mandatory locking); this overload of meaning surprises many and is not universal across Unix-like systems. In fact, the Open Group’s Single Unix Specification version 2 for chmod(3) permits systems to ignore requests to turn on setgid for files that aren’t executable if such a setting has no meaning. In Linux and Solaris, when setgid is set on a directory, files created in the directory will have their GID automatically reset to that of the directory’s GID. The purpose of this approach is to support “project directories”: users can save files into such specially-set directories and the group owner automatically changes. However, setting the setgid bit on directories is not specified by standards such as the Single Unix Specification [Open Group 1997].
timestamps - access and modification times are stored for each filesystem object. However, the owner is allowed to set these values arbitrarily (see touch(1)), so be careful about trusting this information. All Unix-like systems support this.
The following attributes are Linux-unique extensions on the ext2 filesystem, though many other filesystems have similar functionality:
immutable bit - no changes to the filesystem object are allowed; only root can set or clear this bit. This is only supported by ext2 and is not portable across all Unix systems (or even all Linux filesystems).
append-only bit - only appending to the filesystem object are allowed; only root can set or clear this bit. This is only supported by ext2 and is not portable across all Unix systems (or even all Linux filesystems).
Other common extensions include some sort of bit indicating “cannot delete this file”.
Some Unix-like systems also support extended attributes (known as in the Macintosh world as “resource forks”), which are essentially name/value pairs associated with files or directories but not stored inside the data of the file or directory itself. Extended attributes can store more detailed access control information, a MIME type, and so on. Linux kernel 2.6 adds this capability, but since many systems and filesystems don’t support it, many programs choose not to use them.
Some Unix-like systems support POSIX access control lists (ACLs), which allow users to specify in more detail who specifically can access a file and how. See Section 3.2.2 for more information.
Many of these values can be influenced at mount time, so that, for example, certain bits can be treated as though they had a certain value (regardless of their values on the media). See mount(1) for more information about this. These bits are useful, but be aware that some of these are intended to simplify ease-of-use and aren’t really sufficient to prevent certain actions. For example, on Linux, mounting with “noexec” will disable execution of programs on that file system; as noted in the manual, it’s intended for mounting filesystems containing binaries for incompatible systems. On Linux, this option won’t completely prevent someone from running the files; they can copy the files somewhere else to run them, or even use the command “/lib/ld-linux.so.2” to run the file directly.
Some filesystems don’t support some of these access control values; again, see mount(1) for how these filesystems are handled. In particular, many Unix-like systems support MS-DOS disks, which by default support very few of these attributes (and there’s not standard way to define these attributes). In that case, Unix-like systems emulate the standard attributes (possibly implementing them through special on-disk files), and these attributes are generally influenced by the mount(1) command.
It’s important to note that, for adding and removing files, only the permission bits and owner of the file’s directory really matter unless the Unix-like system supports more complex schemes (such as POSIX ACLs). Unless the system has other extensions, and stock Linux 2.2 and 2.4 do not, a file that has no permissions in its permission bits can still be removed if its containing directory permits it (exception: directories marked as "sticky" have special rules). Also, if an ancestor directory permits its children to be changed by some user or group, then any of that directory’s descendants can be replaced by that user or group.
It’s worth noting that in Linux, the Linux ext2 filesystem by default reserves a small amount of space for the root user. This is a partial defense against denial-of-service attacks; even if a user fills a disk that is shared with the root user, the root user has a little space left over (e.g., for critical functions). The default is 5% of the filesystem space; see mke2fs(8), in particular its “-m” option.
The original Unix access control bits (user, group and other values for read, write, and execute) has been remarkably effective for a variety of uses. Still, a number of users have complained that this model was too difficult to use in some circumstances when sharing data between people. Many people wanted to add sets of groups, or describe special rights for a number of specific groups, to a given file or directory, and the original approach didn’t make that easy.
The IEEE formed a POSIX standard working group to identify common interfaces for a large number of security-related interfaces, including how to create more complicated access control lists (termed "POSIX ACLs"). However, after 13 years of work, the group disbanded without ever agreeing on final draft standards. The IEEE draft standard specifications (IEEE 1003.1e and IEEE 1003.2c) were last edited on October 14th, 1997, and were officially disbanded on March 10th, 1999. I believe a key reason that this effort failed was because the specification tried to cover too many different areas. As a result, it wasn’t possible to gain consensus on everything they were specifying, and the lengthy time meant that eventually everyone gave up. Copies of the draft standards are available for free.
Interestingly enough, the story doesn’t end there. Although few vendors were interested in implementing all the interfaces devised by the working group, there was a lot of interest in implementing more flexible access control lists. While there were other ways to implement access control lists, the working group had come up with a reasonable approach and written it down. Most importantly, they gave a detailed and reasonable justification of why implementors should do it this way. This is more important than it might first appear - although more sophisticated ACLs are an old idea, the problem is that users wanted an upward-compatible approach that wouldn’t cause problems with the many existing applications. A "pure ACL" approach where the old approach would be ignored would have required re-examination of many existing programs to make sure they didn’t cause security problems - any miss might have caused a security lapse. Several other alternatives were considered as well by the working group, and after careful examination they created their final approach, which emphasized compatibility with existing applications.
As a result, developers of Unix-like systems have slowly started to add POSIX access control lists, more or less as they were described in the last working draft. This includes more recent versions of SGI Irix, Sun Solaris, FreeBSD, and the Linux kernel 2.6 (which adds POSIX access control lists as well as extended attributes). For more information on the Linux kernel implementation of these and some userspace tools, see http://acl.bestbits.at.
However, while it’s important to write programs that work with POSIX ACLs, it may not be wise yet to depend on them if you’re writing portable applications. Versions of the Linux kernel before 2.6 didn’t have POSIX ACLs, and it’s worth noting that many user-space tools (particularly backup programs like tar) and filesystem formats do not necessarily support them either. Although the NFSv4 specification supports POSIX ACLs, many NFS implementations do not or only partially support them. In short, POSIX ACLs are slowly becoming available, but you may have teething pains in some cases if you depend on them extensively.
In POSIX ACLs, an FSO may have an additional set of "ACL entries" that are used for determining who can access the FSO; every directory can also have a set of default ACL entries used when an FSO is created inside it. Each ACL entry can be one of a number of different types, and each entry also what accesses are granted (r for read, w for write, x for execute). Unfortunately, the POSIX draft names for these ACL entry types are really ugly; it’s actually a simple system, complicated by bad names. There are "short form" and "long form" ways of displaying and setting this information.
Here are their official names, with an explanation, and the short and long form:
Table 3-1. POSIX ACL Entry Types
|POSIX ACL Entry Name||Meaning||Short Form||Long Form|
|ACL_USER_OBJ||The rights of the owner||u::||user::|
|ACL_USER||The rights of some specific user, other than the owner||u:USERNAME:||user:USERNAME:|
|ACL_GROUP_OBJ||The rights of the group that owns the file||g::||group::|
|ACL_GROUP||The rights of some other group that doesn’t own the file||g:GROUPNAME:||group:GROUPNAME:|
|ACL_OTHER||The rights of anyone not otherwise covered.||o::||other::|
|ACL_MASK||The maximum possible rights for everyone, except for the owner and OTHER.||m::||mask:GROUPNAME:|
The "mask" is the gimmick that makes these extended POSIX ACLs work well with programs not designed to work with them. If you specify any specific users or groups other than the owner or group owner (i.e., you use ACL_USER or ACL_GROUP), then you atuomaticaly have to have a mask entry. For more information on POSIX ACLs, see acl(5).
At creation time, the following rules apply. On most Unix systems, when a new filesystem object is created via creat(2) or open(2), the FSO UID is set to the process’ EUID and the FSO’s GID is set to the process’ EGID. Linux works slightly differently due to its FSUID extensions; the FSO’s UID is set to the process’ FSUID, and the FSO GID is set to the process’ FSGUID; if the containing directory’s setgid bit is set or the filesystem’s “GRPID” flag is set, the FSO GID is actually set to the GID of the containing directory. Many systems, including Sun Solaris and Linux, also support the setgid directory extensions. As noted earlier, this special case supports “project” directories: to make a “project” directory, create a special group for the project, create a directory for the project owned by that group, then make the directory setgid: files placed there are automatically owned by the project. Similarly, if a new subdirectory is created inside a directory with the setgid bit set (and the filesystem GRPID isn’t set), the new subdirectory will also have its setgid bit set (so that project subdirectories will “do the right thing”.); in all other cases the setgid is clear for a new file. This is the rationale for the “user-private group” scheme (used by Red Hat Linux and some others). In this scheme, every user is a member of a “private” group with just themselves as members, so their defaults can permit the group to read and write any file (since they’re the only member of the group). Thus, when the file’s group membership is transferred this way, read and write privileges are transferred too. FSO basic access control values (read, write, execute) are computed from (requested values & ~ umask of process). New files always start with a clear sticky bit and clear setuid bit. For more information on POSIX ACLs, see acl(5).
You can set most of these values with chmod(2), fchmod(2), or chmod(1) but see also chown(1), and chgrp(1). In Linux, some of the Linux-unique attributes are manipulated using chattr(1).
Note that in Linux, only root can change the owner of a given file. Some Unix-like systems allow ordinary users to transfer ownership of their files to another, but this causes complications and is forbidden by Linux. For example, if you’re trying to limit disk usage, allowing such operations would allow users to claim that large files actually belonged to some other “victim”. For more information on POSIX ACLs, see acl(5).
Under Linux and most Unix-like systems, reading and writing attribute values are only checked when the file is opened; they are not re-checked on every read or write. Still, a large number of calls do check these attributes, since the filesystem is so central to Unix-like systems. Calls that check these attributes include open(2), creat(2), link(2), unlink(2), rename(2), mknod(2), symlink(2), and socket(2). For more information on POSIX ACLs, see acl(5).
Over the years conventions have been built on “what files to place where”. Where possible, please follow conventional use when placing information in the hierarchy. For example, place global configuration information in /etc. The Filesystem Hierarchy Standard (FHS) tries to define these conventions in a logical manner, and is widely used by Linux systems. The FHS is an update to the previous Linux Filesystem Structure standard (FSSTND), incorporating lessons learned and approaches from Linux, BSD, and System V systems. See http://www.pathname.com/fhs for more information about the FHS. A summary of these conventions is in hier(5) for Linux and hier(7) for Solaris. Sometimes different conventions disagree; where possible, make these situations configurable at compile or installation time.
I should note that the FHS has been adopted by the Linux Standard Base which is developing and promoting a set of standards to increase compatibility among Linux distributions and to enable software applications to run on any compliant Linux system.