Sunday, September 29, 2013

Code As If…

Sorry for how slow things have been here lately; I'm still suffering from WWDC work backlog, plus I've been spending a lot of time on the new book. This is the first book I've tried to write while also doing full-time client work and it's taking a bit of a toll on me.

I've had a blog post floating around my head since writing Beginning iPhone Development that has never become completely formed. It's about how the process of writing about code has changed the way I write code. I think the thoughts are finally ready to congeal into solid form. So, here goes.


The other day, I saw somebody wearing a T-shirt that said "Dance Like Nobody's Watching". I'm not much of a dancer, but I like the sentiment. However, dancing is not coding. The worst thing you can do is code like nobody's watching. You should code like everybody's watching you.

One thing I noticed after I'd been writing code for public consumption for a while is that I've become much more critical of my own code than I was before I started putting it out there for the world to see. A lot of that comes from pure vanity - I want to minimize the number of stupid mistakes that people are going to call me out on. There are a lot of eyes on the code I write for books and for this blog, and some of those eyes belong to people who are, quite frankly, smarter than me.

It's impossible to always put out perfect code, but I do like to avoid giving the impression of being a sloppy coder or, worse, of being a complete hack. The process I go through with code that goes into a book or blog post goes something like this:
  1. Make it work
  2. Make it work well
  3. Make it read well
The process isn't quite as linear as that makes it look, but that works as a very basic statement of the process. The last step includes the idea of refactoring, but it includes more than that. It's proofreading my code after I'm happy with the way it functions in order to look for ways the code could be written better. I'm not talking about performance here, that goes under #2. This is purely about maintainability and readability. Are my methods and variables using logical names? Is my code easy to read? Is the formatting consistent? Was I being lazy anywhere and doing quick copy and paste coding. Step 3 is about trying to be my own worst critic. You wouldn't send a proposal or other written document out without proofreading it. Your code shouldn't be any different.

I thought that I would revert back to my old ways when I wasn't writing code for public consumption, but it turns out that once acquired, this habit is hard to shake. Frankly, like most developers, I had always assumed that the third step, or at least the more picayune aspects of it, were a waste of time when doing production work.

That hasn't turned out to be true in my experience. In fact, it's been quite the opposite. In almost every case, I spend less time writing code because my code is now more readable, more precise, and generally quite a bit shorter than the way I used to write code. Every time I have to change, fix, or extend existing code, I spend less time doing it, and those little bits add up over the course of even a short job. In the log run, all those little things pay huge dividends.

Here's an example I was working on this weekend for OpenGL ES 2.0 for iOS 4. I'm putting together an Objective-C class to make creating and managing multiple programs and shaders in OpenGL easier. One piece of functionality I wanted to include was the ability to get the program and shader logs out of OpenGL. Without that information, it's awfully hard to identify why a shader won't compile or a program won't link. The end result of my first two steps ("Make it work", "Make it work well"), was this:

- (NSString *)vertexShaderLog
{
    GLint logLength = 0, charsWritten = 0;

    glGetShaderiv(vertShader, GL_INFO_LOG_LENGTH, &logLength);

    if (logLength > 0)
    {
        char *log = malloc(logLength);
        glGetShaderInfoLog(vertShader, logLength, &charsWritten, log);
        NSString *ret = [[NSString alloc] initWithBytes:log 
            length:logLength 
            encoding:NSUTF8StringEncoding];
        free(log);
        return ret;
    }

    return @"No Log Data";
}
- (NSString *)fragmentShaderLog
{
    GLint logLength = 0, charsWritten = 0;

    glGetShaderiv(fragShader, GL_INFO_LOG_LENGTH, &logLength);

    if (logLength > 0)
    {
        char *log = malloc(logLength);
        glGetShaderInfoLog(fragShader, logLength, &charsWritten, log);
        NSString *ret = [[NSString alloc] initWithBytes:log 
            length:logLength 
            encoding:NSUTF8StringEncoding];
        free(log);
        return ret;
    }

    return @"No Log Data";
}
- (NSString *)programLog
{
    GLint logLength = 0, charsWritten = 0;
    
    glGetProgramiv(program, GL_INFO_LOG_LENGTH, &logLength);
    
    if (logLength > 0)
    {
        char *log = malloc(logLength);
        glGetProgramInfoLog(program, logLength, &charsWritten, log);
        NSString *ret = [[NSString alloc] initWithBytes:log 
                                                 length:logLength 
                                               encoding:NSUTF8StringEncoding];
        free(log);
        return ret;
    }
    
    return @"No Log Data";
}

It was obvious, even as I was writing this code that I wasn't going to leave it like this. You can get away with this kind of unnecessary duplication of code in most projects because it functions fine and most clients and many development managers don't have the technical chops to read your code, but it violates the DRY principle something fierce. This is hardly the worse example of copy-and-paste coding I've ever seen, but code like this shouldn't ever go into production.

I tested this code to make sure the methods worked, then I immediately refactored the two shader log methods by adding a private method declared in an extension. I then changed the two existing methods to simply call the private method, so I was then using the same code for both scenarios.

- (NSString *)shaderLogForShader:(GLuint)shader
{
    GLint logLength = 0, charsWritten = 0;

     glGetShaderiv(shader, GL_INFO_LOG_LENGTH, &logLength);

     if (logLength > 0)
     {
         char *log = malloc(logLength);
         glGetShaderInfoLog(shader, logLength, &charsWritten, log);
         NSString *ret = [[NSString alloc] initWithBytes:log 
             length:logLength 
             encoding:NSUTF8StringEncoding];
         free(log);
         return ret;
     }

     return @"No Log Data";
}
- (NSString *)vertexShaderLog
{
    return [self shaderLogForShader:vertShader];
}
- (NSString *)fragmentShaderLog
{
    return [self shaderLogForShader:fragShader];
}
- (NSString *)programLog
{
    GLint logLength = 0, charsWritten = 0;
    
    glGetProgramiv(program, GL_INFO_LOG_LENGTH, &logLength);
    
    if (logLength > 0)
    {
        char *log = malloc(logLength);
        glGetProgramInfoLog(program, logLength, &charsWritten, log);
        NSString *ret = [[NSString alloc] initWithBytes:log 
                                                 length:logLength 
                                               encoding:NSUTF8StringEncoding];
        free(log);
        return ret;
    }
    
    return @"No Log Data";
}

That's definitely better than the original version. I got rid of the obvious duplication between the two nearly identical shader log methods. The only difference between the original two methods was the GLuint (unsigned integer) value passed into the two OpenGL ES function calls.

This code still bothered me, though. Look how similar the shader and program log functionality is. It's identical except for the two function calls and, I realized, those two function calls take exactly the same parameters. It had been a while since I had used them, so it took my brain a few seconds to drudge up how C function pointer worked, but once it did, the path for refactoring this code became obvious. A single method could be used for all three of these cases if I took function pointers to the two OpenGL API calls as method parameters.

Now, this second refactoring is one a lot of people would probably say isn't worth the hassle. It's too much. It doesn't pass a "cost-benefit analysis". The result is not going to be that much shorter, so how can you justify the time? I used to think exactly the same way before spending so much time writing code for public consumption, but now I've become convinced that refactoring like this does pass the cost-benefit analysis. I wasn't putting enough weight on the value of good code when doing that analysis. When you come back to code weeks, months, or years later, you won't remember all the little details. With well-written code, you won't have to.

The truth (in my experience, at least) is that refactored code is enough easier to maintain that it's worth taking the time to do it. Like everything, the more you do it, the better you get at it, and the faster you can do it. Eventually, it becomes second nature. You end up writing better code in roughly the same amount of time that you used to write sloppy-but-functional code. Plus, every time you visit that code in the future, you have a little less code to search through. Whenever you have to debug that code, you have a little less code to step through. Every time you fix the code, you have less code to fix.

If I had an error in the original version of these log functions, I would almost certainly have the same exact error in three places. When I discovered such an error, do you think I would remember to fix it in all three places? Maybe, if I wrote the code recently enough to remember. If somebody else were fixing my code, do you think they'd know they had to fix it in three places? Maybe… but more likely they wouldn't.

Here's the final version of the method I settled on for the book¹:

typedef void (*GLInfoFunction)(GLuint program, GLenum pname, GLint* params);
typedef void (*GLLogFunction) (GLuint program, GLsizei bufsize, GLsizei* length, GLchar* infolog);

- (NSString *)logForOpenGLObject:(GLuint)object 
                    infoCallback:(GLInfoFunction)infoFunc 
                         logFunc:(GLLogFunction)logFunc
{
    GLint logLength = 0, charsWritten = 0;
    
    infoFunc(object, GL_INFO_LOG_LENGTH, &logLength);
    
    if (logLength > 0)
    {
        char *log = malloc(logLength);
        logFunc(object, logLength, &charsWritten, log);
        NSString *ret = [[NSString alloc] initWithBytes:log 
                                                 length:logLength 
                                               encoding:NSUTF8StringEncoding];
        free(log);
        return ret;
    }
    
    return @"No Log Data"; 
}
- (NSString *)vertexShaderLog
{
    return [self logForOpenGLObject:vertShader 
                       infoCallback:(GLInfoFunction)&glGetShaderiv 
                            logFunc:(GLLogFunction)&glGetShaderInfoLog];
    
}
- (NSString *)fragmentShaderLog
{
    return [self logForOpenGLObject:fragShader 
                       infoCallback:(GLInfoFunction)&glGetShaderiv 
                            logFunc:(GLLogFunction)&glGetShaderInfoLog];
}
- (NSString *)programLog
{
    return [self logForOpenGLObject:program 
                       infoCallback:(GLInfoFunction)&glGetProgramiv 
                            logFunc:(GLLogFunction)&glGetProgramInfoLog];
}

Now, if you work for somebody else, especially if you work in a large corporation, you may be measured on a series of metrics including something like "Lines of Code Written" (LOC). While I understand the need for accountability, most corporate metrics are (to be blunt) a bullshit abuse of statistics. The absolute worst metric ever devised is LOC. If you're working somewhere that's still in the coding dark ages and measuring you by LOC, run like hell, or at least lobby for a change in that policy. Measuring performance based on LOC encourages even good engineers to write sloppy, horrible code.

In my example here, I spent some of my time improving my code by removing 25 lines of code. I spent time taking it from 64 lines of code down to 39 lines of code. As an developer, I shouldn't be punished for doing that!

In real-life situations, we often convince ourselves that there's not time for refactoring while we code. After all, we have to make that deadline, and there's always a deadline! We can always come back and refactor the code later when we have more time, right?

Only you won't. You won't come back until you absolutely need to, and by that point, it'd probably be faster to rewrite it. If you work for somebody else, they're unlikely to let you go back and spend time "fixing" something that works perfectly well. That's especially true if you have a non-technical manager. Besides, you'll probably be onto your next deadline already and feeling time pressure all over again.

In the long run, good code is worth the time it takes. Often, it's worth it in the short run as well.

Make time to refactor, and not after the fact make it part of your development process. Do it up front, before it goes to test, because you won't come back. Proofread your code before you ever send out a build.

Or, in other words, code as if everyone was able to see and understand your code.

Addendum


There's another point I want to make. Not only should you code as if there were lots of eyes on your code, you should actually get lots of eyes on your code if you can. You will miss stuff even if you make proofreading and refactoring part of your development process. You will sometimes even miss stupid stuff. Case in point, I got an e-mail shortly after posting this from Brent Simmons of NetNewsWire fame. Brent very kindly took the time to proofread the code and pointed out one bug and several improvements that could be made.

Here are his suggestions:
  1. Return early if logLength < 1, so that the main functionality doesn't have to be indented.
  2. Return an autoreleased string, per Apple's coding conventions
  3. Return nil in the case of no log data, rather than a string that would have to be compared to determine if there was log data or not.
  4. Change the name of the char * and NSString variables to something slightly more meaningful. (Since the method is logForSomething, it returns a variable named log.)
I absolutely agree with every single one of these suggestions and am a little embarrassed about leaking the string, to be perfectly honest. Here's Brent's rewrite of my function that I plan to use in the book (with Brent's permission):

- (NSString *)logForOpenGLObject:(GLuint)object 
                   infoCallback:(GLInfoFunction)infoFunc 
                        logFunc:(GLLogFunction)logFunc
{
   GLint logLength = 0, charsWritten = 0;

   infoFunc(object, GL_INFO_LOG_LENGTH, &logLength);    
   if (logLength < 1)
    return nil;

   char *logBytes = malloc(logLength);
   logFunc(object, logLength, &charsWritten, logBytes);
   NSString *log = [[[NSString alloc] initWithBytes:logBytes 
                          length:logLength 
                          encoding:NSUTF8StringEncoding] autorelease];
   free(logBytes);
   return log;
}

No comments:

Post a Comment