Go: io.Reader gotchas

I’ve really come to appreciate the elegance in the io abstractions in Go. The seemingly simple patterns of io.Reader and io.Writer open up a world of easily composable data pipelines.

Need to add compression? Just wrap the Writer with a gzip.Writer, etc.

But there are some subtleties to be aware off, that might bite you.

Let’s have a look at the description of io.Reader.Read():

Read(p []byte) (n int, err error)

Read reads up to len(p) bytes into p. It returns the number of bytes read (0 <= n <= len(p)) and any error encountered. Even if Read returns n < len(p), it may use all of p as scratch space during the call. If some data is available but not len(p) bytes, Read conventionally returns what is available instead of waiting for more.

This is fairly straightforward. You call Read() with a byte slice, which it may fill up. The key point here being may. Most IO sources (e.g. a file) will generally read the full buffer, until you reach the end of the file.

But not all of them. For instance, a gzip.Writer tends to do incomplete reads, requiring multiple Read() calls.

Recommendation: If you need to read a buffer in full, use io.ReadFull() instead of Read().

When Read encounters an error or end-of-file condition after successfully reading n > 0 bytes, it returns the number of bytes read. It may return the (non-nil) error from the same call or return the error (and n == 0) from a subsequent call. An instance of this general case is that a Reader returning a non-zero number of bytes at the end of the input stream may return either err == EOF or err == nil. The next Read should return 0, EOF.

Callers should always process the n > 0 bytes returned before considering the error err. Doing so correctly handles I/O errors that happen after reading some bytes and also both of the allowed EOF behaviors.

This means it’s perfectly legal to return both n (and thus read a number of bytes) and an error at the same time.

It also means that the standard pattern of immediately checking for an error is wrong:

// Don't do this
n, err := in.Read(buf)
if err != nil {
    // Handle err
}
// Do something with n and buf

Always process n / buf first, then check for the presence of an error.

Implementations of Read are discouraged from returning a zero byte count with a nil error, except when len(p) == 0. Callers should treat a return of 0 and nil as indicating that nothing happened; in particular it does not indicate EOF.

The important take-away here: always check for err == io.EOF, some implementations might give you an empty read even if there is still data to come.


Running into either of these corner cases is generally rare, since most IO sources are quite well-behaved. But being aware of the corner cases will save you a massive amount of debugging once you do run into them.

November 25, 2019 19:47 #go

Related

Comments