Sunday, April 7, 2013

Economics of Good Code

While investigating some of the more obscure features of the Scala language, I ran into a nice new feature in Scala 2.10 that makes writing extensions both more efficient and more natural than the usual pimp-my-library approach. In a pet project that I did for fun with my son, we came up with some interesting code that deals with distances in the solar system. Reference materials give these distances in many different units: light-minutes, light-hours, au (astronomical units), and kilometers. The data needs to be normalized into a standard unit, and we chose kilometers. So let's represent a couple of simple data points in a map. First, the facts:

Distances from the Sun
Earth: 8.3 light-minutes
Saturn: 1.3 light-hours

Now let's create a simple map that represents this in Scala:

 val c = 299792.458 // Speed of light in km/s
 val sunDistances = Map(
   "Earth"  -> 8.3 * c * 60,      // light-minutes to km
   "Saturn" -> 1.3 * c * 60 * 60  // light-hours to km
 )

Well, this works, but it is neither elegant nor very readable. It also scales poorly and becomes error-prone with a large data set, because every entry repeats the conversion arithmetic by hand. Let's see how we can solve this in a better way using value classes and implicit classes as the extension mechanism:

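 object Distance {

   val c = 299792.458d // Speed of light in km/s

   // Implicit value class: adds unit methods to Double without a wrapper allocation
   implicit class AstroDistance(val d: Double) extends AnyVal {
     def lightMinutes: Double = d * c * 60     // light-minutes to km
     def lightHours: Double = d * c * 60 * 60  // light-hours to km
   }
 }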

With this, our sun distances map changes to the following:

   import scala.language.postfixOps // needed for postfix calls like `8.3 lightMinutes`
   import Distance._
   val sunDistances = Map(
     "Earth"  -> (8.3 lightMinutes),
     "Saturn" -> (1.3 lightHours)
   )
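
The introduction also lists au and kilometers among the units found in reference material. A minimal sketch of how the same pattern might cover them is shown here. The names DistanceUnits, AstroUnits, kmPerAu, km and au are hypothetical and not part of the original project, and the Jupiter entry is an approximate figure used only for illustration.

```scala
object DistanceUnits {
  val c = 299792.458        // speed of light in km/s
  val kmPerAu = 149597870.7 // kilometers per astronomical unit (IAU 2012 definition)

  // Hypothetical extension of AstroDistance: every method returns kilometers.
  implicit class AstroUnits(val d: Double) extends AnyVal {
    def km: Double = d                       // already in the target unit
    def au: Double = d * kmPerAu             // astronomical units to km
    def lightMinutes: Double = d * c * 60    // light-minutes to km
    def lightHours: Double = d * c * 60 * 60 // light-hours to km
  }
}

object DistanceUnitsDemo {
  import DistanceUnits._

  // Dot notation avoids the need for the postfixOps language import.
  val sunDistances: Map[String, Double] = Map(
    "Earth"   -> 8.3.lightMinutes,
    "Jupiter" -> 5.2.au, // approximate value, for illustration only
    "Saturn"  -> 1.3.lightHours
  )

  def main(args: Array[String]): Unit =
    sunDistances.foreach { case (planet, dist) =>
      println(f"$planet%-8s $dist%.0f km")
    }
}
```

Under this sketch, supporting a new unit is a one-line change in a single place, and every map entry still names its unit next to the number.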

As you can see, the code is far more concise. First, the implicit class converts a Double into an AstroDistance, which lets us call the lightMinutes or lightHours methods directly on a Double. Second, because AstroDistance is a value class, the compiler eliminates the wrapper object creation. As a result, the calls to "lightMinutes" and "lightHours" compile down to static method calls on the underlying Double and have virtually no performance overhead, while still achieving the desired expressiveness.
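
To make the difference concrete, here is a hand-written sketch that contrasts an ordinary wrapper class with the static-style call a value class roughly reduces to. It is an illustration and not the compiler's actual output; the names ExplicitDistance, Wrapped and lightMinutesStatic are made up for this example.

```scala
object ExplicitDistance {
  val c = 299792.458 // speed of light in km/s

  // An ordinary wrapper class (no AnyVal): each use allocates an object.
  class Wrapped(val d: Double) {
    def lightMinutes: Double = d * c * 60
  }

  // Roughly the shape a value class method takes after compilation:
  // a static-style function that receives the underlying Double directly.
  def lightMinutesStatic(d: Double): Double = d * c * 60

  def main(args: Array[String]): Unit = {
    val viaWrapper = new Wrapped(8.3).lightMinutes // allocates a wrapper object
    val viaStatic  = lightMinutesStatic(8.3)       // no allocation
    println(viaWrapper == viaStatic)               // same arithmetic either way
  }
}
```

Both calls perform the same arithmetic. The value class lets us write the first form while paying, in the common case, only for the second.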

But how many developers will make use of such features to better their code? At our workplaces, developers are under pressure to crank out code and features as fast as possible. There is little incentive to write concise, easy-to-maintain code. In many cases, people don't even come back to look at their code again unless they have to fix a bug. Given these constraints, which version of the code would developers choose to write? If you guessed the first example, we're on the same page. The second version takes a bit more learning to write, so most developers with deadlines looming over their heads will not even think about the "better" way to write it. Yet the second version, with the units spelled out, will be easier to read and to maintain for years to come. While we may not see the difference at this scale, imagine millions of lines of code written like the first example, and it becomes apparent that spending time on these minor coding tricks can result in far greater savings over time.

Let's now add an economic flair to our discussion. The first example is definitely less costly to write if it is throw-away code. The second version requires more skill, but it makes the code reusable and reduces the cost of long-term maintenance. What kind of code we should write depends on the circumstances. Code is almost always written to stay around for a relatively long time (years), so it seems we should generally opt for the better code. Yet time pressure almost always results in short-term thinking, and in unknowingly opting to pay the bigger cost in the long run.
