section 7.5: File Access

page 160

We've come an amazingly long way without ever having to open a file (we've been relying exclusively on those predefined standard input and output streams) but now it's time to take the plunge.

The concept of a file pointer is an important one. It would theoretically be possible to mention the name of a file each time you wanted to read from or write to it. But such an approach would have a number of drawbacks: the library would have to look the file up by name on every call, and it would have nowhere obvious to remember how far into the file you had gotten. Instead, the usual approach (and the one taken in C's stdio library) is that you mention the name of the file once, at the time you open it. Thereafter, you use some little token--in this case, the file pointer--which keeps track (both for your sake and the library's) of which file you're talking about. Whenever you want to read from or write to one of the files you're working with, you identify that file by using the file pointer you obtained from fopen when you opened the file. (It is possible to have several files open, as long as you use distinct variables to store the file pointers.)
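To make the ``name once, pointer thereafter'' idea concrete, here is a minimal sketch of a function that has two files open at the same time, each with its own file pointer. The function name files_equal and its return convention are inventions for this illustration, not anything from the book.

```c
#include <stdio.h>

/* Return 1 if the two named files have identical contents, 0 if they
   differ, and -1 if either one cannot be opened.
   (files_equal is a hypothetical helper, made up for this example.) */
int files_equal(const char *name1, const char *name2)
{
    /* Each file name is mentioned exactly once, here. */
    FILE *fp1 = fopen(name1, "r");
    FILE *fp2 = fopen(name2, "r");
    int c1, c2;
    int result;

    if (fp1 == NULL || fp2 == NULL) {
        /* Close whichever one did open before giving up. */
        if (fp1 != NULL)
            fclose(fp1);
        if (fp2 != NULL)
            fclose(fp2);
        return -1;
    }

    /* From here on, the files are identified only by fp1 and fp2. */
    do {
        c1 = getc(fp1);
        c2 = getc(fp2);
    } while (c1 == c2 && c1 != EOF);

    /* The loop stops at the first difference, or with both at EOF. */
    result = (c1 == c2);

    fclose(fp1);
    fclose(fp2);
    return result;
}
```

A call such as files_equal("old.txt", "new.txt") would then compare two files without the rest of the program ever seeing a file pointer; those two file names are, of course, just placeholders.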

Not only do you not need to know the details of a FILE structure, you don't even need to know what the ``buffer'' is whose location the structure records.

In general, the only declaration you need in order to use a file pointer is that of the pointer variable itself:

	FILE *fp;

You should never need to type the line

	FILE *fopen(char *name, char *mode);

because it's provided for you in <stdio.h>.

If you skipped section 6.7, you don't know about typedef, but don't worry. Just assume that FILE is a type, like int, except one that is defined by <stdio.h> instead of being built into the language. Furthermore, note that you will never be using variables of type FILE; you will always be using pointers to this type, or FILE *.

A ``binary file'' is one which is treated as an arbitrary series of byte values, as opposed to a text file, which is treated as lines of characters. We won't be working with binary files, but if you ever do, remember to use fopen modes like "rb" and "wb" when opening them.
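If you do end up handling binary data, the pattern might look something like the sketch below. It uses the standard functions fwrite and fread, which these notes have not covered, and the helper names save_bytes and load_bytes are made up for the illustration. On Unix-like systems the "b" makes no difference; on systems that translate line endings in text mode, it keeps bytes such as 0x0A and 0x0D from being altered.

```c
#include <stdio.h>

/* Write n bytes from buf to the named file, byte for byte.
   Returns the number of bytes written, or 0 on failure.
   (save_bytes and load_bytes are hypothetical helper names.) */
size_t save_bytes(const char *name, const unsigned char *buf, size_t n)
{
    FILE *fp = fopen(name, "wb");   /* "b": no text-mode translation */
    size_t written;

    if (fp == NULL)
        return 0;
    written = fwrite(buf, 1, n, fp);
    if (fclose(fp) == EOF)          /* a failed flush counts as failure */
        return 0;
    return written;
}

/* Read up to max bytes from the named file into buf.
   Returns the number of bytes actually read (0 if it cannot be opened). */
size_t load_bytes(const char *name, unsigned char *buf, size_t max)
{
    FILE *fp = fopen(name, "rb");
    size_t got;

    if (fp == NULL)
        return 0;
    got = fread(buf, 1, max, fp);
    fclose(fp);
    return got;
}
```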

page 161

We won't worry too much about error handling for now, but if you start writing production programs, it's something you'll want to learn about. It's extremely annoying for a program to say ``can't open file'' without saying why. (Some particularly unhelpful programs don't even tell you which file they couldn't open.)
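As a taste of what a more helpful message could look like, here is one possible wrapper around fopen that says both which file failed and why. It is only a sketch: the name open_or_complain is invented, and it leans on errno and strerror, which these notes have not introduced. The C standard does not promise that fopen sets errno (POSIX systems do), so the code falls back to a plainer message when errno is still zero.

```c
#include <stdio.h>
#include <string.h>
#include <errno.h>

/* Try to open a file; on failure, print the file name and, where the
   system supplies one, the reason. Returns the file pointer, or NULL.
   (open_or_complain is a hypothetical name for this illustration.) */
FILE *open_or_complain(const char *name, const char *mode)
{
    FILE *fp;

    errno = 0;                      /* so a stale value can't mislead us */
    fp = fopen(name, mode);
    if (fp == NULL) {
        if (errno != 0)
            fprintf(stderr, "can't open %s: %s\n", name, strerror(errno));
        else
            fprintf(stderr, "can't open %s\n", name);
    }
    return fp;
}
```

The caller still has to check for NULL and decide what to do next; the wrapper only takes care of the complaining.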

On this page we learn about four new functions, getc, putc, fprintf, and fscanf, which are just like functions that we've already been using except that they let you specify a file pointer to tell them which file (or other I/O stream) to read from or write to. (Note that for putc, the extra FILE * argument comes last, while for fprintf and fscanf, it comes first.)
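The argument orders are easier to remember once you have seen all four calls side by side. The following sketch exercises each of them; the function names copy_upper, write_pair, and read_pair are made up for the example.

```c
#include <stdio.h>
#include <ctype.h>

/* Copy in to out one character at a time, uppercasing along the way.
   Returns the number of characters copied.
   getc takes the FILE * as its only argument; putc takes it last. */
long copy_upper(FILE *in, FILE *out)
{
    int c;      /* int, not char, so that EOF can be told apart */
    long n = 0;

    while ((c = getc(in)) != EOF) {
        putc(toupper(c), out);
        n++;
    }
    return n;
}

/* Write two integers on a line; fprintf takes the FILE * first.
   Returns whatever fprintf returns (negative on error). */
int write_pair(FILE *out, int x, int y)
{
    return fprintf(out, "%d %d\n", x, y);
}

/* Read two integers back; fscanf also takes the FILE * first.
   Returns 1 if both were read, 0 otherwise. */
int read_pair(FILE *in, int *x, int *y)
{
    return fscanf(in, "%d %d", x, y) == 2;
}
```

Because stdin and stdout are themselves file pointers, copy_upper(stdin, stdout) works just as well as a call on two files you opened yourself.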

page 162

cat is about the most basic and important file-handling program there is (even if its name is a bit obscure). The cat program on page 162 is a bit like the ``hello, world'' program on page 6--it may seem trivial, but if you can get it to work, you're over the biggest first hurdle when it comes to handling files at all.

Compare the cat program (and especially its filecopy function) to the file copying program on page 16 of section 1.5.1--cat is essentially the same program, except that it accepts filenames on the command line.
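To see how little has to change, here is a rough sketch in the spirit of that program, not a copy of the book's code. The copying loop is the familiar getchar/putchar loop with file pointers added; the driver, here a hypothetical function called cat_files, just opens each name in turn.

```c
#include <stdio.h>

/* Copy everything from ifp to ofp, one character at a time. */
void filecopy(FILE *ifp, FILE *ofp)
{
    int c;

    while ((c = getc(ifp)) != EOF)
        putc(c, ofp);
}

/* Concatenate the n named files onto out; with no names, copy stdin.
   Returns the number of files that could not be opened.
   (cat_files is a hypothetical name for this illustration.) */
int cat_files(int n, char *names[], FILE *out)
{
    int i, failures = 0;
    FILE *fp;

    if (n == 0) {
        filecopy(stdin, out);
        return 0;
    }
    for (i = 0; i < n; i++) {
        if ((fp = fopen(names[i], "r")) == NULL) {
            fprintf(stderr, "cat: can't open %s\n", names[i]);
            failures++;
            continue;       /* skip this one, carry on with the rest */
        }
        filecopy(fp, out);
        fclose(fp);
    }
    return failures;
}
```

A main function containing nothing but return cat_files(argc - 1, argv + 1, stdout) != 0; would turn this into a command-line program. Whether to stop at the first unopenable file or press on, as this sketch does, is a design choice.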

Since the authors advise calling fclose in part to ``flush the buffer in which putc is collecting output,'' you may wonder why the program at the top of the page does not call fclose on its output stream. The reason can be found in the next sentence: an implicit fclose happens automatically for any streams which remain open when the program exits normally.

In general, it's a good idea to close any streams you open, but not to close the preopened streams such as stdin and stdout. (Since ``the system'' opened them for you as your program was starting up, it's appropriate to let it close them for you as your program exits.)
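One practical consequence of the buffering is that a write error may not surface until the buffer is flushed, which can be as late as the call to fclose. A careful program therefore checks fclose's return value (it returns EOF on failure) for files it has written. Here is a minimal sketch; the function name save_text is made up for the example.

```c
#include <stdio.h>

/* Write text to the named file. Returns 0 on success, -1 on failure.
   (save_text is a hypothetical helper for this illustration.) */
int save_text(const char *name, const char *text)
{
    FILE *fp = fopen(name, "w");
    int ok;

    if (fp == NULL)
        return -1;

    ok = (fputs(text, fp) != EOF);

    /* fclose flushes whatever is still sitting in the buffer, so a
       write error may only show up here. */
    if (fclose(fp) == EOF)
        ok = 0;

    return ok ? 0 : -1;
}
```

Note that the function closes only the stream it opened itself; stdin and stdout are left alone.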
