Upon completion of this lesson, you will be able to:
The Unix file system is a hierarchical file system used in Unix-like operating systems. In this tutorial, we will discuss the implementation of the Unix file system, the role of inodes, and how to manipulate and access files, directories, and file information on Unix using the C programming language.
The Unix file system organizes files and directories in a tree-like structure. At the root of the tree is the root directory (“/”), which contains all other directories and files, either directly or through its subdirectories. Directories can contain other directories and files, and files can be stored in any directory. Each file and directory is identified by a path, which lists the directories from the root directory down to that file or directory, separated by slashes (for example, /home/user/notes.txt).
The Unix file system uses inodes to store information about files and directories. The inode (short for index node) is a crucial data structure that holds the metadata of a file or directory, such as its owner, permissions, timestamps, size, and the location of its data on the disk. Each file or directory has an inode associated with it, and the system uses the inode to locate and access the file or directory.
Below is a detailed explanation of the key components in a typical UNIX inode structure:
Inode number (i-number): A unique identifier for the inode within the file system. This number is used by the system to reference the inode and associated file or directory.
File type: Specifies the type of file associated with the inode, such as a regular file, directory, symbolic link, character device, or block device.
File mode: Represents the file permissions, including read, write, and execute permissions for the file’s owner, group, and others. It also contains additional flags, such as setuid, setgid, and sticky bit.
Link count: Indicates the number of hard links (directory entries) that refer to the inode. When the link count drops to zero and no process still has the file open, the system can free the inode and its data blocks.
Owner (UID) and Group (GID): Specifies the user identifier (UID) and group identifier (GID) of the file’s owner and group, respectively.
File size: Represents the size of the file in bytes.
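Timestamps: There are usually three timestamps in the inode structure: the last access time (atime), the last modification time of the file’s contents (mtime), and the last change time of the inode itself (ctime).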
Block pointers: These are pointers to the blocks of data that make up the file. Typically, an inode contains a set of direct pointers, indirect pointers, double indirect pointers, and possibly triple indirect pointers.
Extended attributes: Some UNIX file systems support extended attributes, which are additional metadata attached to the inode. These attributes can store information like access control lists (ACLs) and other file properties.
The exact structure of the inode and the information it contains can vary depending on the specific UNIX file system implementation, such as ext4, XFS, or HFS+. However, the core concepts remain largely the same across different file systems.
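In the C programming language, files and directories are manipulated through two sets of functions: the C standard library (fopen, fclose, fread, fwrite) and the POSIX interfaces available on Unix-like systems (opendir, readdir, closedir, stat, chmod, chown). Some of the most commonly used functions are: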
fopen: Opens a file and returns a pointer to a FILE structure that can be used to read from or write to the file.
fclose: Closes a file that was opened with fopen.
fread and fwrite: Read data from and write data to a file opened with fopen.
opendir: Opens a directory and returns a pointer to a DIR structure that can be used to read the contents of the directory.
readdir: Reads the next entry in a directory opened with opendir.
closedir: Closes a directory that was opened with opendir.
stat: Fills a struct stat with information about a file or directory, including its size, permissions, and owner.
chmod: Changes the permissions of a file or directory.
chown: Changes the owner of a file or directory.
Here is an example program that demonstrates how to use some of these functions:
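#include <stdio.h>
#include <stdlib.h>
#include <dirent.h>
#include <sys/stat.h>

int main(int argc, char *argv[]) {
    DIR *dir;
    struct dirent *ent;
    struct stat statbuf;
    char path[4096];

    if (argc < 2) {
        fprintf(stderr, "Usage: %s <directory>\n", argv[0]);
        exit(EXIT_FAILURE);
    }

    if ((dir = opendir(argv[1])) == NULL) {
        perror("opendir");
        exit(EXIT_FAILURE);
    }

    while ((ent = readdir(dir)) != NULL) {
        printf("%s\n", ent->d_name);

        /* d_name is relative to the directory being read, so prefix it
           with the directory path before calling stat. */
        int len = snprintf(path, sizeof(path), "%s/%s", argv[1], ent->d_name);
        if (len < 0 || (size_t) len >= sizeof(path)) {
            fprintf(stderr, "Path too long: %s/%s\n", argv[1], ent->d_name);
            continue;
        }

        if (stat(path, &statbuf) < 0) {
            perror("stat");
            continue;
        }

        printf("\t Size: %lld bytes\n", (long long) statbuf.st_size);
        printf("\t Permissions: %o\n", (unsigned int) (statbuf.st_mode & 0777));
        printf("\t Owner: %d\n", (int) statbuf.st_uid);
    }

    closedir(dir);
    return 0;
}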
This program takes a directory name as a command line argument, opens the directory with opendir, reads the contents of the directory with readdir, and prints information about each file in the directory using stat.
open and fopen are both C functions that are used to open files, but they have some differences.
open is a low-level function defined by POSIX that maps closely to the underlying system call, while fopen is a higher-level function that is part of the C standard library.
open returns a file descriptor, which is an integer that represents the file being opened. This file descriptor can be used with other low-level I/O functions like read and write to manipulate the file. open allows for greater control over file access modes and file permissions, as it takes a bitmask of flags that specify these parameters.
On the other hand, fopen returns a pointer to a FILE structure, which is a higher-level abstraction of the file. FILE provides a buffered interface for reading and writing data, and it is designed to be used with other higher-level I/O functions like fread, fwrite, fprintf, and fscanf. fopen provides less control over file access modes and permissions than open, but it is simpler to use.
Here’s an example of using open to open a file:
#include <stdio.h>   /* perror */
#include <stdlib.h>  /* exit, EXIT_FAILURE */
#include <fcntl.h>   /* open, O_RDWR */
#include <unistd.h>  /* close */

int main(void) {
    int fd = open("file.txt", O_RDWR);
    if (fd == -1) {
        perror("open");
        exit(EXIT_FAILURE);
    }
    // Use the file descriptor with read and write operations
    // ...
    close(fd);
    return 0;
}
And here’s an example of using fopen to open a file:
#include <stdio.h>
#include <stdlib.h>  /* exit, EXIT_FAILURE */

int main(void) {
    FILE *fp = fopen("file.txt", "r+");
    if (fp == NULL) {
        perror("fopen");
        exit(EXIT_FAILURE);
    }
    // Use the FILE pointer with fread, fwrite, fprintf, fscanf, etc.
    // ...
    fclose(fp);
    return 0;
}
In summary, open is a lower-level function that provides greater control over file access modes and permissions, while fopen is a higher-level function that provides a buffered interface for reading and writing data.
ioctl
The ioctl function in C is a system call used for performing various input/output control operations on a file descriptor. It allows you to manipulate the underlying device parameters or properties of the file associated with the given file descriptor. The name ioctl stands for “input/output control.”
The ioctl function is particularly useful for managing device drivers and handling specialized or non-standard operations that don’t have dedicated system calls. Common use cases for ioctl include configuring serial ports, managing terminal settings, or interacting with various hardware devices.
The ioctl function has the following signature (as declared in <sys/ioctl.h> on Linux with glibc; other systems may declare the request parameter with a different integer type):
int ioctl(int fd, unsigned long request, ...);
fd: The file descriptor for the device or file you want to perform the control operation on.
request: A constant that specifies the control operation to be performed.
…: Zero or more additional arguments, depending on the specific request used. These arguments are typically used to pass data to or receive data from the ioctl call.
The ioctl function usually returns 0 on success, although some requests return another non-negative value as their result. It returns -1 on failure, in which case the errno variable is set to indicate the error.
Here’s an example C program that traverses a directory recursively and puts all files found in the directories into an array:
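#include <stdio.h>
#include <stdlib.h>
#include <dirent.h>
#include <string.h>
#include <sys/stat.h>

#define MAX_FILES 1024

void traverse_directory(const char *dir_path, char **files, int *file_count);

int main(int argc, char *argv[]) {
    if (argc < 2) {
        fprintf(stderr, "Usage: %s <directory>\n", argv[0]);
        return EXIT_FAILURE;
    }

    char **files = malloc(MAX_FILES * sizeof(char *));
    if (files == NULL) {
        perror("malloc");
        return EXIT_FAILURE;
    }
    int file_count = 0;

    traverse_directory(argv[1], files, &file_count);

    for (int i = 0; i < file_count; i++) {
        printf("%s\n", files[i]);
        free(files[i]);
    }
    free(files);
    return 0;
}

void traverse_directory(const char *dir_path, char **files, int *file_count) {
    DIR *dir;
    struct dirent *entry;
    struct stat statbuf;

    if ((dir = opendir(dir_path)) == NULL) {
        fprintf(stderr, "Error opening directory %s\n", dir_path);
        exit(EXIT_FAILURE);
    }

    while ((entry = readdir(dir)) != NULL) {
        char path[1024];

        /* Skip the current and parent directory entries. */
        if (strcmp(entry->d_name, ".") == 0 || strcmp(entry->d_name, "..") == 0) {
            continue;
        }

        /* snprintf never writes past the buffer; skip paths that do not fit. */
        int len = snprintf(path, sizeof(path), "%s/%s", dir_path, entry->d_name);
        if (len < 0 || (size_t) len >= sizeof(path)) {
            fprintf(stderr, "Path too long: %s/%s\n", dir_path, entry->d_name);
            continue;
        }

        if (stat(path, &statbuf) == -1) {
            fprintf(stderr, "Error getting file stats for %s\n", path);
            continue;
        }

        if (S_ISDIR(statbuf.st_mode)) {
            traverse_directory(path, files, file_count);
        } else if (S_ISREG(statbuf.st_mode)) {
            /* Stop storing paths once the fixed-size array is full. */
            if (*file_count >= MAX_FILES) {
                fprintf(stderr, "Too many files; skipping %s\n", path);
                continue;
            }
            char *filename = malloc(strlen(path) + 1);
            if (filename == NULL) {
                perror("malloc");
                exit(EXIT_FAILURE);
            }
            strcpy(filename, path);
            files[(*file_count)++] = filename;
        }
    }

    closedir(dir);
}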
This program takes a directory path as a command-line argument and recursively traverses the directory, adding any files found to a dynamically allocated array of strings. The traverse_directory function takes a directory path, the array of files, and a pointer to an integer that keeps track of the number of files in the array. It opens the directory, reads each entry, and determines whether it is a file or directory. If it is a directory, the function recursively calls itself on the directory. If it is a file, the function adds the file path to the array of files.
Finally, the program prints out each file path in the array and frees the memory allocated for each string in the array.
Note that the files array holds at most MAX_FILES (1024) entries. Adjust the value of MAX_FILES if you expect the directory tree to contain more or fewer files.
The Unix file system is a hierarchical file system used in Unix-like operating systems. Inodes are used to store information about files and directories, and the C programming language, together with the POSIX interfaces, provides functions for manipulating and accessing files, directories, and file information.
Elements of the lesson were generated by OpenAI’s GPT-3.5 and GPT-4.