CITS2002-cfind

Background

Unix-based systems have long supported a utility program named find which, unsurprisingly, is used to find files and directories in a file-system. While the features provided by find are extensive, find is often criticised for its arcane syntax (see man find).

There is a need for an easier-to-use utility, with an easy-to-remember syntax, even if it supports far fewer of the infrequently used features.

Goal

The goal of this project is to implement a system utility, named cfind, to find file-system entries matching specific criteria. Successful completion of the project will develop your understanding of some advanced features of the C99 programming language, and your understanding of the role of a number of system-calls responsible for file and directory access and file-system attributes.

Program operation

cfind is invoked using the command:

prompt> cfind  [options]  pathname  [stat-expression]

which specifies optional command-line arguments, a single mandatory file-system pathname (either relative or absolute), and an optional single conditional expression. If invoked correctly, cfind will print the names of all files and directories at or below the provided pathname for which the provided expression evaluates to true. The ‘starting’ pathname may be either a file or a directory.

A stat expression is a character string providing a Boolean predicate, written in a C-like syntax, that is evaluated against an individual file-system entry. The supported syntax provides arithmetic, equality and relational expressions, dates and times, and access to fields of the file-entry’s struct stat structure.

The default conditional expression is just ‘1’, which evaluates to true. Examples of stat expressions are given in the project’s clarifications.

Note that the stat expression will usually contain characters with a special meaning to the shell (such as whitespace characters, double-quotes, less-than/greater-than signs, and square-brackets) so it’s recommended that the full stat expression be provided within single-quote characters.

A library provides the stat expression-based functions. The anticipated use of the functions is to:

  1. First call the compile_stat_expression() function with the Boolean predicate to later be evaluated. If a valid expression, the function builds internal state to represent the expression, and returns an instance of the STAT_EXPRESSION datatype.
  2. This STAT_EXPRESSION instance, a directory-entry’s filename (not its full pathname), and an instance of a struct stat structure are passed to the evaluate_stat_expression() function to determine if the attributes of the indicated file-entry match the expression.
  3. Finally, when done, the STAT_EXPRESSION value is passed to free_stat_expression() to release allocated memory.

Command-line options, which appear before the mandatory pathname, indicate how the file-system should be traversed and how the output should be formatted.

The project’s sample solution defines the project’s definitive operation.

The cfind utility is to support the following command-line options:

-a
Normally, file-entries beginning with the ‘.’ character are ignored. Specifying -a requests that all entries be considered.

-c
Print only the count of the number of matching file-entries, then exit. Do not print or do anything else.

-d depth
Normally the indicated filepath is recursively searched (completely). Specifying the -d option limits the search to the indicated depth, descending at most depth levels (a non-negative integer) of directories. -d 0 means only apply the tests and actions to the command-line pathname, and is obviously implied if the command-line pathname is a file.
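
The depth-limited traversal described for -d (together with the hidden-entry rule of -a) might be sketched as below. This is only an illustration under stated assumptions, not the sample solution: the function name walk(), its parameters, the MAX_PATH_LEN buffer size, and the convention that a negative max_depth means "no limit" are all hypothetical. A real cfind would evaluate the stat expression where this sketch simply prints the pathname.

```c
#include <dirent.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

#define MAX_PATH_LEN 4096   // illustrative buffer size (an assumption)

// Visits path and, if it is a directory, the entries beneath it, descending
// at most max_depth levels (a negative max_depth means "no limit").
// Returns the number of entries visited.
int walk(const char *path, int depth, int max_depth, bool show_all)
{
    struct stat sb;

    if (stat(path, &sb) != 0) {
        perror(path);
        return 0;
    }

    printf("%s\n", path);   // a real cfind would test the stat expression here
    int visited = 1;

    // Stop at files, and at directories sitting on the depth limit.
    if (!S_ISDIR(sb.st_mode) || (max_depth >= 0 && depth >= max_depth)) {
        return visited;
    }

    DIR *dirp = opendir(path);
    if (dirp == NULL) {
        perror(path);
        return visited;
    }

    struct dirent *dp;
    while ((dp = readdir(dirp)) != NULL) {
        const char *name = dp->d_name;

        // Always skip "." and "..", otherwise the recursion would revisit them.
        if (strcmp(name, ".") == 0 || strcmp(name, "..") == 0) {
            continue;
        }
        // Without -a, skip entries whose names begin with '.'.
        if (!show_all && name[0] == '.') {
            continue;
        }

        char child[MAX_PATH_LEN];
        int len = snprintf(child, sizeof child, "%s/%s", path, name);
        if (len < 0 || (size_t)len >= sizeof child) {
            fprintf(stderr, "%s/%s: pathname too long\n", path, name);
            continue;
        }
        visited += walk(child, depth + 1, max_depth, show_all);
    }
    closedir(dirp);
    return visited;
}
```

Called as walk(pathname, 0, 0, false), the sketch visits only the starting pathname, matching the description of -d 0. Because stat() follows symbolic links, a link pointing back up the tree is followed until the pathname no longer fits the buffer.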

-l
Print a long listing of matching file-entries, printing in order (left to right): inode, each entry’s permissions, number of links, owner’s name, group-owner’s name, size, modification date, and entry-name (similar to the output of /bin/ls -l -i ).
The listing is sorted by name (the default).

-r
Reverse the order of any sorting options.

-s
Print matching file-entries, sorted by size. If both -s and -t are provided, -t takes precedence.

-t
Print matching file-entries, sorted by modification time. If both -s and -t are provided, -t takes precedence.
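
The interaction of -r, -s, and -t can be expressed as a single comparison function for qsort(). The sketch below rests on several assumptions: the ENTRY type, the three flag variables, and sort_entries() are hypothetical names, and sorting in ascending order (smallest or oldest first, with ties broken by name) is a guess. The sample solution remains the definitive reference for the actual ordering.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <time.h>

// One matching file-entry, holding just the fields needed for sorting.
typedef struct {
    const char *name;
    off_t       size;
    time_t      mtime;
} ENTRY;

// Sorting flags, file-scope because qsort() passes no extra context.
static bool sort_by_size  = false;
static bool sort_by_time  = false;
static bool reverse_order = false;

static int compare_entries(const void *a, const void *b)
{
    const ENTRY *x = a;
    const ENTRY *y = b;
    int result = 0;

    if (sort_by_time) {                 // -t takes precedence over -s
        result = (x->mtime > y->mtime) - (x->mtime < y->mtime);
    } else if (sort_by_size) {
        result = (x->size > y->size) - (x->size < y->size);
    }
    if (result == 0) {                  // default order, and the tie-break
        result = strcmp(x->name, y->name);
    }
    return reverse_order ? -result : result;
}

// Sorts the n entries in place according to the requested options.
void sort_entries(ENTRY *entries, size_t n,
                  bool by_size, bool by_time, bool reversed)
{
    sort_by_size  = by_size;
    sort_by_time  = by_time;
    reverse_order = reversed;
    qsort(entries, n, sizeof entries[0], compare_entries);
}
```

Comparing with (x > y) - (x < y) rather than subtracting avoids overflow when sizes or times differ by more than an int can hold.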

-u
Attempt to unlink (remove) as many matching file-entries as possible. The cfind utility should exit with failure if any attempt to unlink a file-entry was unsuccessful.
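
One way to attempt every removal while still reporting failure at exit is to remember whether any attempt failed. The sketch below assumes a hypothetical function remove_entries() that receives the matching pathnames in an order where a directory's contents precede the directory itself, because rmdir() only removes empty directories. How the sample solution orders its removals is not stated here.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

// Attempts to remove every pathname; keeps going after a failure.
// Returns EXIT_SUCCESS only if every removal succeeded.
int remove_entries(const char *paths[], size_t n)
{
    int status = EXIT_SUCCESS;

    for (size_t i = 0; i < n; ++i) {
        struct stat sb;

        if (stat(paths[i], &sb) != 0) {
            perror(paths[i]);
            status = EXIT_FAILURE;
            continue;
        }

        // Directories need rmdir(); everything else is unlinked.
        int result = S_ISDIR(sb.st_mode) ? rmdir(paths[i]) : unlink(paths[i]);
        if (result != 0) {
            perror(paths[i]);
            status = EXIT_FAILURE;      // remember the failure, but continue
        }
    }
    return status;
}
```

The returned value can be passed directly to exit(), so a single failed removal makes the whole invocation exit with failure while the remaining entries are still attempted.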

Program requirements

  1. Your project, and its executable program, must be named cfind.
  2. Your project must be developed using multiple C99 source files and must employ a Makefile, employing variable definitions and automatic variables, to compile and link your project’s files and the provided library. While you can read the source-code of the library (to learn how it works), you should not use its source in your project - just link against its compiled library.
  3. If any error is detected during its execution, your project must use fprintf(stderr, ...) or perror() (as appropriate) to print an error message.
  4. It is anticipated that a successful project will need to use (at least) the standard C99 and POSIX functions: getopt(), opendir(), readdir(), closedir(), stat(), rmdir(), unlink(), perror(), and exit().
  5. Your project must employ sound programming practices, including the use of meaningful comments, well chosen identifier names, appropriate choice of basic data-structures and data-types, and appropriate choice of control-flow constructs.
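
The option handling anticipated by requirement 4 could be built around getopt(). The following is a sketch under assumptions rather than a prescribed design: the OPTIONS structure, its field names, parse_options(), and the use of -1 for "no depth limit" are hypothetical. The _POSIX_C_SOURCE definition is there because some C libraries hide getopt() when compiling with a strict -std=c99 flag.

```c
#define _POSIX_C_SOURCE 200809L   // exposes getopt() under strict -std=c99

#include <limits.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

// The command-line switches collected in one place (names are illustrative).
typedef struct {
    bool show_all;      // -a
    bool count_only;    // -c
    bool long_listing;  // -l
    bool reverse;       // -r
    bool by_size;       // -s
    bool by_time;       // -t
    bool do_unlink;     // -u
    int  max_depth;     // -d depth, or -1 for no limit
} OPTIONS;

// Parses the options and returns the index of the first non-option
// argument (the pathname), or -1 if the options were invalid.
int parse_options(int argc, char *argv[], OPTIONS *opts)
{
    *opts = (OPTIONS){ .max_depth = -1 };   // all other fields start false

    int opt;
    while ((opt = getopt(argc, argv, "acd:lrstu")) != -1) {
        switch (opt) {
        case 'a': opts->show_all     = true; break;
        case 'c': opts->count_only   = true; break;
        case 'l': opts->long_listing = true; break;
        case 'r': opts->reverse      = true; break;
        case 's': opts->by_size      = true; break;
        case 't': opts->by_time      = true; break;
        case 'u': opts->do_unlink    = true; break;
        case 'd': {
            char *end;
            long depth = strtol(optarg, &end, 10);

            // Reject empty, non-numeric, negative, or oversized depths.
            if (*optarg == '\0' || *end != '\0' || depth < 0 || depth > INT_MAX) {
                fprintf(stderr, "%s: invalid depth '%s'\n", argv[0], optarg);
                return -1;
            }
            opts->max_depth = (int)depth;
            break;
        }
        default:        // getopt() has already printed a message to stderr
            return -1;
        }
    }
    return optind;
}
```

After a successful call, argv[index] is the mandatory pathname and argv[index + 1], if present, is the stat expression. The caller still has to check that the pathname was actually supplied.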

Assessment

The project may be completed individually or in teams of two. You are strongly encouraged to work with someone else - this will enable you to discuss your initial design, and to assist each other to develop and debug your joint solution.

During the marking, attention will obviously be given to the correctness of your solution. However, a correct and efficient solution should not be considered as the perfect, nor necessarily desirable, form of solution. Preference will be given to well presented, well documented solutions that use the appropriate features of the language to complete tasks in an easy to understand and easy to follow manner. That is, do not expect to receive full marks for your project simply because it works correctly. Remember, a computer program should not only convey a message to the computer, but also to other human programmers.

Half of the possible marks will come from the correctness of your solution. The remaining marks will come from your programming style, including your use of meaningful comments, well chosen identifier names, appropriate choice of basic data-structures, data-types, control-flow constructs, and

Your project will be marked on CSSE computers running OS-X. No allowance will be made for a program that “works at home” but not on CSSE computers, so be sure that your code compiles and executes correctly on these machines before you submit it.