This thesis demonstrates a practical methodology for file usage analysis and resource usage prediction using trace data from a production system. A VAX 11/780 system running Berkeley UNIX was instrumented to gather file usage data, in the form of file-related system calls, and resource usage data for each process.

First, a user-oriented analysis was performed on the file usage data collected from the first measurement. The key aspect of this analysis is a characterization of users and files. Two characterization measures are employed: accesses-per-byte and file size. This new approach is shown to distinguish differences among files as well as among users, which can be used in efficient file system design and in creating realistic test workloads for simulations. A multi-stage gamma distribution is shown to closely model the file usage measures. Even though overall file sharing is small, some files belonging to a bulletin board system are accessed by many users, simultaneously and otherwise.

Next, the file usage data from the second measurement was analyzed using a few simple measures based on the notion of a file reference. The measures used are: fraction referenced, file size, reference time, number of references, and inter-reference time. Neither the users nor the files were characterized in this analysis. It was shown that in most references, files were accessed completely, substantiating the argument for using the accesses-per-byte measure in the user-oriented analysis. It was also shown that most file references lasted for a short time, and that inter-reference time was 2 to 3 orders of magnitude larger than reference time.

Finally, a probabilistic resource usage prediction scheme was developed, using the process resource usage data. Given the identity of the program being run, the scheme predicts the CPU time, file I/O, and memory requirements of a process at the beginning of its life.
The scheme bases its predictions on a state-transition model of the program's resource usage in its past executions. The states of the model are the resource regions obtained from an off-line cluster analysis of processes run on the system. The proposed method is shown to work on data collected from a VAX 11/780 running 4.3 BSD UNIX. (Abstract shortened with permission of author.)
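
The state-transition idea behind the prediction scheme can be illustrated with a minimal sketch. It is not the thesis's implementation and rests on several assumptions: resource regions are represented as integer cluster labels already produced by an off-line cluster analysis, the model is first-order (the next region depends only on the region of the program's previous execution), and the class name `ResourcePredictor` and its methods are hypothetical.

```python
from collections import Counter, defaultdict


class ResourcePredictor:
    """Hypothetical per-program, first-order state-transition predictor.

    A "state" is assumed to be an integer label for a resource region
    (a cluster of CPU time, file I/O, and memory usage).
    """

    def __init__(self):
        # transitions[program][previous_state] counts the states that followed.
        self.transitions = defaultdict(lambda: defaultdict(Counter))
        # Region observed in the most recent execution of each program.
        self.last_state = {}

    def observe(self, program, state):
        """Record the resource region of a finished execution."""
        prev = self.last_state.get(program)
        if prev is not None:
            self.transitions[program][prev][state] += 1
        self.last_state[program] = state

    def predict(self, program):
        """Return (most likely next region, estimated probability), or None."""
        prev = self.last_state.get(program)
        if prev is None:
            return None  # program has never been seen
        counts = self.transitions.get(program, {}).get(prev)
        if not counts:
            # No transition seen out of this region yet: assume it repeats.
            return prev, 1.0
        total = sum(counts.values())
        # Sorting first makes ties resolve to the smallest label.
        best = max(sorted(counts), key=lambda s: counts[s])
        return best, counts[best] / total


predictor = ResourcePredictor()
for region in [0, 1, 0, 1, 0, 0]:  # made-up history for a program named "cc"
    predictor.observe("cc", region)
print(predictor.predict("cc"))
```

In this sketch the prediction is made before the process runs, using only the program's identity and its execution history, which mirrors the setting described in the abstract; the mapping from a predicted region back to concrete CPU time, file I/O, and memory figures is left out.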