4 Answers
It's actually pretty simple, at least if you don't need the implementation details.

First off, on Linux all file systems (ext2, ext3, btrfs, reiserfs, tmpfs, zfs, ...) are implemented in the kernel. Some may offload work to userland code through FUSE, and some come only in the form of a kernel module (native ZFS is a notable example of the latter due to licensing restrictions), but either way there remains a kernel component. This is an important basic point.

When a program wants to read from a file, it will issue various system library calls which ultimately end up in the kernel in the form of an open(), read(), close() sequence (possibly with lseek() thrown in for good measure). The kernel takes the provided path and filename, and through the file system and device I/O layer translates these to physical read requests (and in many cases also write requests -- think for example atime updates) to some underlying storage.
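The open(), read(), close() sequence can be seen fairly directly from Python, whose os module exposes thin wrappers around those system calls. This is only a minimal sketch: the function name read_via_syscalls and the chunk size are made up for illustration, and the demo at the bottom assumes a Linux system with procfs mounted on /proc.

```python
import os


def read_via_syscalls(path, chunk_size=4096):
    """Read a whole file using thin wrappers around open(2), read(2), close(2)."""
    fd = os.open(path, os.O_RDONLY)  # open(2): returns a file descriptor
    try:
        chunks = []
        while True:
            chunk = os.read(fd, chunk_size)  # read(2): b"" signals end of file
            if not chunk:
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        os.close(fd)  # close(2)


if __name__ == "__main__":
    # Assumption: procfs is mounted on /proc; skip the demo otherwise.
    if os.path.exists("/proc/version"):
        print(read_via_syscalls("/proc/version").decode())
```

The same function works unchanged on an ordinary file on disk, which is the point: the caller cannot tell from the system call sequence which file system driver answered.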

However, it doesn't have to translate those requests specifically to physical, persistent storage. The kernel's contract is that issuing that particular set of system calls will provide the contents of the file in question. Where exactly in our physical realm the "file" exists is secondary to this.

What is usually mounted on /proc is known as procfs. That is a special file system type, but since it is a file system, it really is no different from e.g. an ext3 file system mounted somewhere. So the request gets passed to the procfs file system driver code, which knows about all these files and directories and returns particular pieces of information from the kernel data structures.

The "storage layer" in this case is the kernel data structures, and procfs provides a clean, convenient interface to accessing those. Do keep in mind that mounting procfs at /proc is simply convention; you could just as easily mount it elsewhere. In fact, that is sometimes done, for example in chroot jails when the process running there needs access to /proc for some reason.

It works the same way if you write a value to some file; at the kernel level, that translates to a series of open(), lseek(), write(), close() calls which again get passed to the file system driver; again, in this particular case, the procfs code.

The particular reason why you see file returning empty is that many of the files exposed by procfs are exposed with a size of 0 bytes. The 0 byte size is likely an optimization on the kernel side (many of the files in /proc are dynamic and can easily vary in length, possibly even from one read to the next, and calculating the length of each file on every directory read would potentially be very expensive). Going by the comments to this answer, which you can verify on your own system by running through strace or a similar tool, file first issues a stat() call to detect any special files, and then takes the opportunity to, if the file size is reported as 0, abort and report the file as being empty.
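To see the mismatch between the size stat() reports and what a read actually returns, something like the following can be used. It is a sketch only: reported_vs_actual_size is a made-up helper name, and the demo assumes procfs is mounted on /proc.

```python
import os


def reported_vs_actual_size(path):
    """Return (size reported by stat(2), number of bytes actually read)."""
    reported = os.stat(path).st_size
    with open(path, "rb") as f:
        actual = len(f.read())
    return reported, actual


if __name__ == "__main__":
    # Assumption: procfs is mounted on /proc; skip the demo otherwise.
    if os.path.exists("/proc/version"):
        reported, actual = reported_vs_actual_size("/proc/version")
        print(f"stat() says {reported} bytes, read() returned {actual} bytes")
```

For a regular file on disk the two numbers agree; for many procfs files the first is expected to be 0 while the second is not, which is what leads file to report "empty" without -s.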

This behavior is actually documented and can be overridden by specifying -s or --special-files on the file invocation, although as stated in the manual page that may have side effects. The quote below is from the BSD file 5.11 man page, dated Oct 17 2011.

Normally, file only attempts to read and determine the type of argument files which stat(2) reports are ordinary files. This prevents problems, because reading special files may have peculiar consequences. Specifying the -s option causes file to also read argument files which are block or character special files. This is useful for determining the filesystem types of the data in raw disk partitions, which are block special files. This option also causes file to disregard the file size as reported by stat(2) since on some systems it reports a zero size for raw disk partitions.

When you look at it with strace file /proc/version or ltrace -S file /proc/version, the optimization is rather small. It does a stat() call first and finds that the size is 0, thus skipping the open() - but before that it's loading several magic files.
–
ott-- Jul 15 '13 at 12:50

@ott-- That indeed is an odd sequence of events, but it may be related to the fact that you can pass multiple file names to file. This way, file pre-loads the magic files, then processes the command line parameter by parameter; instead of moving the magic file loading into the "do this just before trying to determine what kind of file this particular one is" part of the code, which would increase complexity. Calling stat() and acting on its return value is essentially harmless; adding complexity in keeping track of additional internal state risks introducing bugs.
–
Michael Kjörling Jul 15 '13 at 14:20

@Gilles The -s option is meant for block/char special devices. Finally I looked at the file source, and at the end of fsmagic.c I saw this explanation of why, with -s, it returns ASCII text instead of empty: If stat() tells us the file has zero length, report here that the file is empty, so we can skip all the work of opening and reading the file. But if the -s option has been given, we skip this optimization, since on some systems, stat() reports zero size for raw disk partitions.
–
ott-- Jul 15 '13 at 15:45

In this directory, you can control how the kernel views devices, adjust kernel settings, add devices to the kernel and remove them again. In this directory you can directly view the memory usage and I/O statistics.

You can see which disks are mounted and what file systems are used. In short, every single aspect of your Linux system can be examined from this directory, if you know what to look for.

The /proc directory is not a normal directory. If you were to boot from a boot CD and look at that directory on your hard drive, you would see it as being empty. When you look at it under your normal running system it can be quite large. However, it doesn't seem to be using any hard disk space. This is because it is a virtual file system.

Since the /proc file system is a virtual file system and resides in memory, a new /proc file system is created every time your Linux machine reboots.

In other words, it is just a means of easily peeking and poking at the guts of the Linux system through a file and directory type interface. When you look at a file in the /proc directory, you are looking directly at a range of memory in the Linux kernel and seeing what it can see.

The layers in the file system

Examples:

Inside /proc, there is a directory for each running process, named with its process ID. These directories contain files that have useful information about the processes, such as:

exe: which is a symbolic link to the file on disk the process was started from.

cwd: which is a symbolic link to the working directory of the process.

wchan: which, when read, returns the waiting channel the process is on.

maps: which, when read, returns the memory maps of the process.
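As a small illustration of the per-process entries listed above, the exe and cwd links can be resolved with readlink. This is a hedged sketch: process_links and its parameters are invented for the example, and it assumes procfs is mounted on /proc (the proc_root parameter exists so a different mount point can be passed).

```python
import os


def process_links(pid="self", proc_root="/proc"):
    """Resolve the exe and cwd symbolic links of one process directory."""
    base = os.path.join(proc_root, str(pid))
    return {name: os.readlink(os.path.join(base, name)) for name in ("exe", "cwd")}


if __name__ == "__main__":
    # /proc/self refers to the process doing the lookup (here: this script).
    if os.path.isdir("/proc/self"):
        for name, target in process_links().items():
            print(f"{name} -> {target}")
```

Reading the links of another user's process usually requires sufficient privileges, so expect a permission error in that case.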

/proc/uptime returns two decimal values in seconds, separated by a space: the uptime of the system and the amount of time spent idle.
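Parsing /proc/uptime takes only a few lines. In this sketch the helper name parse_uptime is made up, and the second value is treated as idle time as described in proc(5); the demo assumes procfs is mounted on /proc.

```python
import os


def parse_uptime(text):
    """Split the contents of /proc/uptime into (uptime_seconds, idle_seconds)."""
    uptime, idle = text.split()[:2]
    return float(uptime), float(idle)


if __name__ == "__main__":
    # Assumption: procfs is mounted on /proc; skip the demo otherwise.
    if os.path.exists("/proc/uptime"):
        with open("/proc/uptime") as f:
            up, idle = parse_uptime(f.read())
        print(f"up for {up:.0f} s, idle for {idle:.0f} s")
```

On a multi-core machine the idle figure can exceed the uptime, since idle time is accumulated across CPUs.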

"If you were to boot from a boot cd and look at that directory on your hard drive you would see it as being empty." That is not specific to /proc; it is true of any mount point where the file system has not been mounted. If you boot from that same boot CD and do something like mount -t proc proc /mnt/proc, you will see the currently running kernel's /proc.
–
Michael Kjörling Jul 15 '13 at 12:11

In simplest terms, it's a way to talk to the kernel using the normal methods of reading and writing files, instead of calling the kernel directly. It's in line with Unix's "everything is a file" philosophy.

The files in /proc don't physically exist anywhere, but the kernel reacts to the files you read and write within there, and instead of writing to storage, it reports information or does something.

Similarly, the files in /dev aren't really files in the traditional sense (although on some systems the files in /dev may actually exist on disk, they won't have much to them other than what device they refer to). They enable you to talk to a device using the normal Unix file I/O API, or anything that uses it, like shells.

It is more that, to *nix, only a file can be secured. Since access control lists are persisted in the file system, it is convenient to secure privileged resources using the common mechanism already provided by the file system driver. This simplifies the implementation of tools that access kernel structures and allows them to run without elevated permissions by instead reading from the virtual files of the proc file system.
–
Pekka Jul 15 '13 at 20:34

Inside the /proc directory, there are two types of content: numbered directories and system information files.

/proc is a virtual file system. For example, if you do ls -l /proc/stat, you’ll notice that it has a size of 0 bytes, but if you do “cat /proc/stat”, you’ll see some content inside the file.

Do an ls -l /proc, and you’ll see a lot of directories named with just numbers. These numbers represent process IDs (PIDs). The files inside each numbered directory correspond to the process with that particular PID.
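The numbered directories can be picked out programmatically as well. The following is a sketch under the assumption that procfs is mounted on /proc; list_pids and its proc_root parameter are illustrative names, not part of any existing tool.

```python
import os


def list_pids(proc_root="/proc"):
    """Return the sorted PIDs found as all-digit directory names under proc_root."""
    pids = []
    for entry in os.listdir(proc_root):
        # Only directories whose names consist solely of digits are processes.
        if entry.isdigit() and os.path.isdir(os.path.join(proc_root, entry)):
            pids.append(int(entry))
    return sorted(pids)


if __name__ == "__main__":
    if os.path.isdir("/proc"):
        pids = list_pids()
        print(f"{len(pids)} processes visible, e.g. {pids[:5]}")
```

The result is a snapshot: processes can start or exit between the directory listing and any later access to /proc/PID.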

Some of the files available under /proc, such as cpuinfo, meminfo, and loadavg, contain system information.

Some Linux commands read the information from these /proc files and display it. For example, the free command reads the memory information from the /proc/meminfo file, formats it, and displays it.
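A rough idea of what such a tool does with /proc/meminfo is sketched below. This is not how free itself is implemented; parse_meminfo is a made-up helper that assumes the usual "Name: value kB" line layout and simply keeps the first number on each line.

```python
import os


def parse_meminfo(text):
    """Map each field name in /proc/meminfo-style text to its integer value."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if not sep or not fields:
            continue  # skip blank or malformed lines
        info[name.strip()] = int(fields[0])  # most values are in kB
    return info


if __name__ == "__main__":
    # Assumption: procfs is mounted on /proc; skip the demo otherwise.
    if os.path.exists("/proc/meminfo"):
        with open("/proc/meminfo") as f:
            mem = parse_meminfo(f.read())
        print(f"MemTotal: {mem.get('MemTotal')} kB, MemFree: {mem.get('MemFree')} kB")
```

Which fields are present varies between kernel versions, hence the use of .get() in the demo rather than direct indexing.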

To learn more about the individual /proc files, do “man 5 proc”.

/proc/cmdline – Kernel command line
/proc/cpuinfo – Information about the processors.
/proc/devices – List of device drivers configured into the currently running kernel.
/proc/dma – Shows which DMA channels are being used at the moment.
/proc/fb – Frame Buffer devices.
/proc/filesystems – File systems supported by the kernel.
/proc/interrupts – Number of interrupts per CPU per IRQ.
/proc/iomem – This file shows the current map of the system’s memory for its various devices
/proc/ioports – provides a list of currently registered port regions used for input or output communication with a device
/proc/loadavg – Contains the load average of the system
  The first three columns give the load average over the last 1, 5, and 15 minutes.
  The fourth column shows the number of currently runnable processes and the total number of processes.
  The last column displays the most recently assigned process ID.
/proc/locks – Displays the files currently locked by the kernel
  Sample line:
  1: POSIX ADVISORY WRITE 14375 08:03:114727 0 EOF
/proc/meminfo – Current utilization of primary memory on the system
/proc/misc – This file lists miscellaneous drivers registered on the miscellaneous major device, which is number 10
/proc/modules – Displays a list of all modules that have been loaded by the system
/proc/mounts – This file provides a quick list of all mounts in use by the system
/proc/partitions – Very detailed information on the various partitions currently available to the system
/proc/pci – Full listing of every PCI device on your system
/proc/stat – Keeps track of a variety of different statistics about the system since it was last restarted
/proc/swaps – Measures swap space and its utilization
/proc/uptime – Contains information about uptime of the system
/proc/version – Version of the Linux kernel, gcc, name of the Linux flavor installed.
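As an example of consuming one of the files in this list, the five columns of /proc/loadavg can be split as follows. This is a sketch: parse_loadavg and the dictionary keys are invented names, and the column meanings follow proc(5), where the three averages cover 1, 5, and 15 minutes.

```python
import os


def parse_loadavg(text):
    """Parse /proc/loadavg-style text into a dictionary of its five columns."""
    fields = text.split()
    runnable, total = fields[3].split("/")  # e.g. "1/80"
    return {
        "load1": float(fields[0]),
        "load5": float(fields[1]),
        "load15": float(fields[2]),
        "runnable": int(runnable),
        "total": int(total),
        "last_pid": int(fields[4]),
    }


if __name__ == "__main__":
    # Assumption: procfs is mounted on /proc; skip the demo otherwise.
    if os.path.exists("/proc/loadavg"):
        with open("/proc/loadavg") as f:
            print(parse_loadavg(f.read()))
```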

This sounds to me more like "how to use what is in /proc?" rather than "how does /proc work?". Useful information, but not necessarily answering this particular question.
–
Michael Kjörling Jul 15 '13 at 12:30

Each file in /proc is runtime information, meaning that when you cat /proc/meminfo, part of the kernel runs a function that generates the file contents.
–
Shailesh Jul 15 '13 at 12:41