Nice Clean Example

Of Files, Streams and Reader/Writers

This article discusses the relationship between files, Streams, and the Readers and Writers in the .NET platform. Streams are a central concept in .NET.

The .NET platform provides an embarrassment of riches when it comes to the different ways to get data from Point A to Point B. When you're first learning the platform you come across Streams and different Readers and Writers. It can take a while to figure out what each one is, what role it plays, and how it relates to doing I/O.

The issue appears in doing I/O to lots of things, but the first place you find it is when you want to create a text file and discover that you use a 'StreamWriter' class to do this. WTF?

Where is the eff-ing file handle and the Read() function?

Why the complexity? Well, some history. I used to be part of a development team working on an operating system for the just-invented Intel 386. Working on an OS gives you a unique perspective on the issues of application programmers since you have to decide how they will get to various resources such as files.

I came across a design principle in the API of Unix. In that model "everything was a file" from the perspective of an application, whether it was a disk file, the terminal, the keyboard, serial devices, memory, etc. The goal was noble: to have a consistent interface that a program could use. Need data? Then you use the read() API call. One Interface, All The World!

The devil was in the details, of course. Reading and writing a file was not the same as reading and writing a socket, or main memory, or a serial port, and so emerged, like Godzilla from the ocean, the unwieldy function ioctl(). It turned out to be the catch-all "set configuration parameters" function, whether to set the baud rate, set IP addresses, or tweak any number of other device settings. A look at almost any program would reveal a tortuously complex call to ioctl() somewhere, which had a seemingly unlimited number of switches, parameters, secondary control structures, and the like.

You Want the Handle? You can't Truth the Handle!

Still, the idea of presenting a consistent interface for I/O is a powerful one, and I see this idea expressed in Streams, which are an important, nay essential, concept in the .NET environment.

In Unix if you wanted to do I/O to a file, you called open() and got back a file descriptor (or fopen() from the C library and got back a file pointer), a handle to let you do file operations, including reading and writing. Good Old Win32 had the CreateFile() function, and it returned a file handle, also a context for file operations that you could use in function calls to read and write, amongst other uses.

However, in .NET you don't operate in the same way. There are still calls to create files, but they don't return a context pointer or handle. They return a Stream object that is connected to the file, and the primary purpose of that Stream object is to read and write data, not to set parameters on the item.

/*
 * When you open a file it returns a Stream object, not a generic handle
 */
     FileStream fs = File.Open("c:/autoexec.bat", FileMode.Open);

When in Redmond, ...

This impacts and frames the way you look at the tasks. If you want to read and write to a file, you're ready to go. What if you want to set the last access time on a file? You don't use the Stream object to do that. Instead you use a static method of the File class, like this:

/*
 * Set last access time to right now
 */
     File.SetLastAccessTime("c:/autoexec.bat", System.DateTime.Now);

Note that this takes a path as an argument, not a "handle to an open file", as would be the case in other APIs. This is a little disconcerting, and in fact the call to SetLastWriteTime in this next example will fail on Windows with an IOException: File.Open with only a FileMode opens the file with no sharing, so the file is considered to be in use by another process.

    FileStream fs = File.Open(somepath, FileMode.Open);
    File.SetLastWriteTime(somepath, System.DateTime.Now); // THIS WILL FAIL
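One way around the failure, sketched here under the assumption that you are done with the stream before you need to stamp the file, is to close the stream first. The class and method names (TouchAfterClose, WriteThenStamp) are made up for illustration, as are the temp file and the timestamp.

```csharp
using System;
using System.IO;

public static class TouchAfterClose
{
    // Hypothetical helper: write a byte, release the file, then stamp it.
    public static DateTime WriteThenStamp(string path, DateTime stamp)
    {
        // The using block disposes the FileStream, which releases the file.
        using (FileStream fs = File.Open(path, FileMode.Create))
        {
            fs.WriteByte(0x2A);
        }

        // No stream holds the file any more, so the path-based call can proceed.
        File.SetLastWriteTime(path, stamp);
        return File.GetLastWriteTime(path);
    }

    public static void Main()
    {
        string path = Path.GetTempFileName();      // illustrative scratch file
        DateTime stamp = new DateTime(2001, 2, 3, 4, 5, 6);
        Console.WriteLine(WriteThenStamp(path, stamp));
        File.Delete(path);
    }
}
```

The using block is doing the real work here: it guarantees the stream is closed even if the write throws.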

I think it's a good trade, however. The number of times that you need to set parameters is a tiny fraction of the number of times that you do I/O. It just takes some getting used to, like everything new.

It's easiest for me to think of a Stream as a "software wrapper around a physical medium", where a medium can be a file, a socket, a named pipe, a Console, or a chunk of memory, to name a few. When thinking about getting data in or out of the program, a Stream will usually come into play somehow. Not all the time, of course: database operations don't use the stream concept. But it's frequent enough that you have to make friends with it.

How "Readers" and "Writers" fit in the picture.

Here are some sample Stream types, geared towards their specific environment:

  • FileStream -- Disk Files
  • NetworkStream -- TCP/IP Sockets
  • MemoryStream -- A chunk of Memory (can be very handy)

However, if you look at the methods of each of the Stream classes you will see that the Write and Read methods operate on an array of bytes, which is not the easiest data type to work with.
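To make the byte-array point concrete, here is a minimal sketch of pushing a string through a bare Stream with no Reader or Writer. It assumes UTF-8 as the encoding and uses a MemoryStream so nothing touches the disk. RawStreamDemo and RoundTrip are names I made up for the example.

```csharp
using System;
using System.IO;
using System.Text;

public static class RawStreamDemo
{
    // Hypothetical helper: write text as bytes, then read the bytes back.
    public static string RoundTrip(string text)
    {
        using (MemoryStream ms = new MemoryStream())
        {
            // Stream.Write only understands bytes, so encode the string first.
            byte[] outgoing = Encoding.UTF8.GetBytes(text);
            ms.Write(outgoing, 0, outgoing.Length);

            // Rewind, then read until we have everything we wrote.
            ms.Position = 0;
            byte[] incoming = new byte[outgoing.Length];
            int total = 0;
            while (total < incoming.Length)
            {
                int n = ms.Read(incoming, total, incoming.Length - total);
                if (n == 0) break;   // end of stream
                total += n;
            }

            // Decode the bytes back into a string.
            return Encoding.UTF8.GetString(incoming, 0, total);
        }
    }

    public static void Main()
    {
        Console.WriteLine(RoundTrip("this is a string"));
    }
}
```

All of the encoding, offset, and count bookkeeping is exactly what you would rather not write by hand every time.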

A Reader/Writer wraps a stream to provide easier access to the stream

A Reader/Writer has a Stream inside and lets you use more familiar data types, like a string, in the Read and Write methods. It's just there to make the Stream easier to use.

/*
 * Open a file, get a FileStream
 */
     FileStream fs = File.Open(somepath, FileMode.Open);
/*
 * I can call fs.Write if I manage an array of bytes of data.
 * Wrap this in a StreamWriter so I can write a string out.
 */
     StreamWriter sw = new StreamWriter(fs);
     sw.Write("this is a string"); // Woo-Hoo!!
/*
 * Closing the writer flushes its buffer and closes the FileStream too.
 * Without this the string may never reach the file.
 */
     sw.Close();
  

The flexibility of a Reader or Writer is that you can wrap any kind of stream with one, letting you use one class to send data to many different kinds of output.
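Here is a rough sketch of that flexibility: one method that takes any Stream and writes the same string to it, tried against a MemoryStream and a FileStream. The names (AnyStreamDemo, WriteGreeting) and the temp file are mine. The UTF-8-without-byte-order-mark encoding and the leaveOpen flag are choices I made for the example, not requirements.

```csharp
using System;
using System.IO;
using System.Text;

public static class AnyStreamDemo
{
    // Hypothetical helper: works with any writable Stream.
    public static void WriteGreeting(Stream destination)
    {
        // leaveOpen: true, so disposing the writer flushes it
        // but leaves the caller's stream open.
        using (StreamWriter sw = new StreamWriter(
            destination, new UTF8Encoding(false), 1024, true))
        {
            sw.Write("this is a string");
        }
    }

    public static void Main()
    {
        // Same method, a chunk of memory as the destination.
        using (MemoryStream ms = new MemoryStream())
        {
            WriteGreeting(ms);
            Console.WriteLine("memory: " + ms.Length + " bytes");
        }

        // Same method, a disk file as the destination.
        string path = Path.GetTempFileName();   // illustrative scratch file
        using (FileStream fs = File.Open(path, FileMode.Create))
        {
            WriteGreeting(fs);
            Console.WriteLine("file: " + fs.Length + " bytes");
        }
        File.Delete(path);
    }
}
```

WriteGreeting never learns what kind of stream it was handed, which is the whole point.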

There are really only two Reader/Writer pairs that you are ever likely to use.

  • StreamReader/StreamWriter -- Read/Write data as text
  • BinaryReader/BinaryWriter -- Read/Write data as binary

The difference between Text and Binary in this context is that if you write the value 37 out as text it goes out as the 2-byte ASCII string "37" (hex 33 37). If you write it as a binary byte it goes out as 1 byte with a hex value of 25; written as a 32-bit int, it goes out as the 4 bytes 25 00 00 00.
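You can see the difference for yourself with a small sketch like the one below, which writes 37 into a MemoryStream three ways and dumps the bytes in hex. The class and method names are made up for the example. Note that BinaryWriter picks its overload from the argument type, so 37 as a byte is one byte while 37 as an int is four.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Text;

public static class TextVersusBinary
{
    // Write the value as text through a StreamWriter.
    public static byte[] AsText(int value)
    {
        MemoryStream ms = new MemoryStream();
        using (StreamWriter sw = new StreamWriter(ms, new UTF8Encoding(false)))
        {
            sw.Write(value.ToString(CultureInfo.InvariantCulture));
        }
        // ToArray still works after the writer has closed the MemoryStream.
        return ms.ToArray();
    }

    // Write the value as a single binary byte through a BinaryWriter.
    public static byte[] AsByte(int value)
    {
        MemoryStream ms = new MemoryStream();
        using (BinaryWriter bw = new BinaryWriter(ms))
        {
            bw.Write((byte)value);
        }
        return ms.ToArray();
    }

    // Write the value as a 32-bit int (BinaryWriter is little-endian).
    public static byte[] AsInt32(int value)
    {
        MemoryStream ms = new MemoryStream();
        using (BinaryWriter bw = new BinaryWriter(ms))
        {
            bw.Write(value);
        }
        return ms.ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(BitConverter.ToString(AsText(37)));   // 33-37
        Console.WriteLine(BitConverter.ToString(AsByte(37)));   // 25
        Console.WriteLine(BitConverter.ToString(AsInt32(37)));  // 25-00-00-00
    }
}
```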

You will see references to TextReader, TextWriter, StringReader and StringWriter. Just ignore them until you are much deeper into the environment, and use either "StreamReader/StreamWriter" or "BinaryReader/BinaryWriter".

The way that the readers and writers are named isn't as consistent as other aspects of the platform, unfortunately. For instance, it would seem like "TextReader" would be the natural partner to "BinaryReader", but it's not; "StreamReader" is. It would really seem like StreamReader should be the name of the base class, since it's a reader that goes to a Stream, but TextReader is.
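For when you do get deeper in, here is a small sketch of what that base class buys you: a method written against TextReader accepts a StreamReader (text coming from a Stream) and a StringReader (text coming from a string in memory) alike. BaseClassDemo and FirstLine are names I made up for the example.

```csharp
using System;
using System.IO;
using System.Text;

public static class BaseClassDemo
{
    // Hypothetical helper: accepts any TextReader, whatever is behind it.
    public static string FirstLine(TextReader reader)
    {
        return reader.ReadLine();
    }

    public static void Main()
    {
        string text = "alpha\nbeta";

        // A StringReader reads straight from a string.
        using (TextReader fromString = new StringReader(text))
        {
            Console.WriteLine(FirstLine(fromString));
        }

        // A StreamReader reads from a Stream, here a MemoryStream.
        using (MemoryStream ms = new MemoryStream(Encoding.UTF8.GetBytes(text)))
        using (TextReader fromStream = new StreamReader(ms))
        {
            Console.WriteLine(FirstLine(fromStream));
        }
    }
}
```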

In summary

Streams are a central software concept in the .NET environment. You do most I/O through Streams. Although I/O to files is the most obvious place where Streams are used, I/O to many different items involves Streams. A Stream is a wrapper around a file, socket, pipe, memory, or another Stream. You do I/O through Streams with their Read() and Write() methods. Because they all have the same base class, you can write software that can get data from or send data to any number of places without regard to the specifics.

The Read() and Write() methods of Streams use an array of bytes as their parameters. Readers and Writers are software wrappers around Streams that make it simpler to use general data types like strings. There are two essential Readers and Writers, StreamReader/StreamWriter to read and write data as text and BinaryReader/BinaryWriter to read and write data in binary.
