Programming Best Practices

When it comes to programming, there are as many schools of thought on best practices as there are practitioners. For my Bass Connections Team, I’ve compiled a list of 42 of what I think are solid best practices for programming, targeted to Matlab. These come from varied sources (references below) as well as personal experience. These best practices are also available as a 2-page PDF.

The Basics

  1. Use variables instead of hard-coded numbers, and place those variables at the top of your scripts and functions
  2. If you do something more than once – write a function for it
  3. Comment your code thoroughly
    • Include explanations of the purpose and overall design of your code.
    • Add comments to tricky parts of your code especially if it won’t be obvious to someone else
    • Document and comment your functions. Explain what the inputs and outputs are, how to use them, and the overall purpose of your code
  4. Use built-in or pre-packaged functions when available; don’t always assume you have to start from scratch. Google it or check out Matlab Central before you start coding up something complex
  5. Use the Matlab debugger to fix errors in your code
  6. Write scripts for each figure or set of figures you need to make so you can easily reproduce your results
  7. Use structures instead of feeding large numbers of parameters into a function; pass the structures to and from functions. This keeps the functions modular and the variables obvious. It also helps organize the variables.
  8. Optimize software only after it works correctly. Use Matlab’s Profiler to help optimize your code
  9. Make incremental changes. Work in small steps with frequent feedback and course correction.
  10. Plan for bugs and create test scripts and debugging tools liberally.
  11. Use a version control system. Recommendation: Git.
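
As a minimal sketch of two of the points above (named variables at the top instead of hard-coded numbers, and a structure passed to a function instead of a long parameter list), consider the following. The structure name Signal, its fields, and the two helper functions are hypothetical and exist only for illustration.

```matlab
% Named values at the top instead of hard-coded numbers scattered below.
% All names and values here are hypothetical, chosen for illustration.
Signal.sampleRateHz    = 1000;
Signal.durationSec     = 2;
Signal.toneFrequencyHz = 50;

% Each function takes one structure rather than three separate arguments.
% Anonymous functions keep this sketch in a single script.
makeTimeVector = @(S) (0 : S.sampleRateHz * S.durationSec - 1) / S.sampleRateHz;
makeTone       = @(S) sin(2 * pi * S.toneFrequencyHz * makeTimeVector(S));

tone     = makeTone(Signal);
nSamples = numel(tone);
```

Adding a new parameter later means adding one field to Signal; the function signatures stay the same.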

Naming Conventions: Variables, Functions, and Files

  1. Use descriptive variable and function names, written either in camel case or with underscores. Choose one and be consistent.
    • For camel case (e.g. camelCase)
      • Always start variable names with a lowercase letter, then capitalize the first letter of each new word
      • Acronyms are a word (e.g. signalSnr not signalSNR)
    • Distinguish structures by capitalizing the first letter
      • For camel case: FirstLetter instead of firstLetter
      • For underscores: First_letter instead of first_letter
    • Distinguish constants by capitalizing all letters and using underscores (e.g. ALL_LETTERS)
  2. Loop variables should have meaningful names that start with i* (or j*), e.g. iRow, iCol
  3. Length/size indicators start with n*: nRows, nCols
  4. Inside of a loop use c* to indicate the current value of something
casesToConsider = cat(2,5:10,20:10:50);
nCases = length(casesToConsider);

for iCase = 1:nCases
   cCase = casesToConsider(iCase);
   % ...
end
  5. Variables with a large scope should have meaningful names; variables with a small scope can have short names
  6. Avoid using a keyword or special value name for a variable name
  7. A convention on pluralization should be followed consistently (always singular or always plural)
  8. Negated Boolean variable names should be avoided. Just like it’s poor grammar to use a double negative, it’s better to use isFound than ~isNotFound
  9. Ensure that all references to files are system independent. Use functions such as filesep, fullfile, and fileparts that create file paths (independent of the root folder of the computer or of whether the system is PC or Mac) so that all code is easily portable from one system to another.
iofun_dir = ['toolbox' filesep 'matlab' filesep 'iofun']

On Windows this produces: toolbox\matlab\iofun

On Mac / Unix systems this produces: toolbox/matlab/iofun
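
The same path can be built with fullfile and taken apart again with fileparts, the other two functions mentioned above. This is a small sketch; the file name example.mat is made up for illustration.

```matlab
% Build the same folder path without writing filesep by hand
iofunDir = fullfile('toolbox', 'matlab', 'iofun');

% Append a file name (example.mat is a hypothetical file)
dataFile = fullfile(iofunDir, 'example.mat');

% Split a full path back into folder, base name, and extension
[dataFolder, dataName, dataExt] = fileparts(dataFile);
```

Here dataFolder matches iofunDir, dataName is 'example', and dataExt is '.mat', whichever separator the system uses.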

  10. Classes: call all methods using dot notation rather than functional notation. For example:
returnedValue = object.methodName(args,...)

instead of

returnedValue = methodName(object,args,...)

Enhanced Code Readability

  1. Floating point constants should always be written with a digit before the decimal point: 0.5 not .5
  2. Surround operators (+, -, &, |, /, *, etc.) and equals signs with spaces: a + b not a+b
  3. Commas should be followed by a space: [1, 2, 344, 3] not [1,2,344,3]
  4. Use alignment of equals signs, spaces, and commas wherever it enhances readability, e.g.:
weightedPopulation = (doctorWeight * nDoctors) + ...
                     (lawyerWeight * nLawyers) ;
  5. In conditional statements (if, elseif, …, else, end) the most common case should come first and the exception in the else-part of an if-else-expression.
  6. Complex conditional expressions should be avoided. Introduce temporary logical variables instead.
  7. Any block of code appearing in more than one m-file should be considered for packaging as a function.
  8. Any function used by only one other function should be packaged as its subfunction in the same file.
  9. Avoid the if(0) or if(true) constructs; they are difficult to find later
  10. Use strings and SWITCH as a proxy for enumerated types; this is easy to use, fast, and very readable:
fruit = 'apple';
switch(fruit)
   case 'apple'
      fun1;
   case 'banana'
      fun2;
end
  11. Keep functions small. They may be re-used by other functions; besides, it makes them easier to read and helps keep you organized.
  12. Keep lines short by using the continuation character (ellipsis: ... ).
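
To illustrate the advice on complex conditional expressions, here is a small sketch that replaces one long condition with temporary logical variables. The variable names and limits are hypothetical.

```matlab
% Hypothetical measurement and limits, for illustration only
measuredValue = 7;
lowerLimit    = 0;
upperLimit    = 10;

% Each temporary logical variable names one part of the condition
isInRange = (measuredValue >= lowerLimit) && (measuredValue <= upperLimit);
isInteger = (measuredValue == floor(measuredValue));
isValid   = isInRange && isInteger;

% The most common case comes first, the exception in the else-part
if isValid
    status = 'accepted';
else
    status = 'rejected';
end
```

The if statement now reads almost like a sentence, and each piece of the condition can be inspected separately in the debugger.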

Comments

  1. Comments should explain the code and do more than just restate the code.
  2. There should be a space between the % and the text.
  3. Comments should have the same indentation as the statements referred to.
  4. Function header comments should discuss any special requirements for the input arguments and the format and meaning of output arguments.
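
A function header along these lines might look like the sketch below. The function rectangleArea and its argument names are hypothetical; the point is the layout of the header comment, which states the input requirements and the meaning of the output.

```matlab
function areaSqM = rectangleArea(widthM, heightM)
% RECTANGLEAREA Area of one or more rectangles.
%   areaSqM = RECTANGLEAREA(widthM, heightM) returns the area in square
%   meters.
%
%   Inputs:
%     widthM  - rectangle width in meters; numeric, nonnegative
%     heightM - rectangle height in meters; numeric, nonnegative, and
%               either a scalar or the same size as widthM
%
%   Output:
%     areaSqM - area in square meters, one value per rectangle

% Reject inputs that violate the requirements stated in the header
validateattributes(widthM,  {'numeric'}, {'nonnegative'});
validateattributes(heightM, {'numeric'}, {'nonnegative'});

% Element-wise product so that array inputs are handled too
areaSqM = widthM .* heightM;
end
```

Saved as rectangleArea.m, the header text is what help rectangleArea displays.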

Avoiding Errors with Proactive Coding Practices

  1. Floating point comparisons should be made with caution. Round-off error can make exact equality tests fail even when the values are mathematically equal, e.g.
shortSide = 3;
longSide  = 5;
otherSide = 4;
longSide^2 == (shortSide^2 + otherSide^2)

ans = 1

scaleFactor = 0.01;
(scaleFactor*longSide)^2 == ((scaleFactor*shortSide)^2 + ...
                             (scaleFactor*otherSide)^2)

ans = 0
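
One common way around this is to compare against a tolerance instead of testing exact equality. The sketch below reuses the values from the example above; the tolerance of 1e-12 is an assumption and should be chosen to suit the scale and precision of your own data.

```matlab
shortSide   = 3;
longSide    = 5;
otherSide   = 4;
scaleFactor = 0.01;

leftSide  = (scaleFactor * longSide)^2;
rightSide = (scaleFactor * shortSide)^2 + (scaleFactor * otherSide)^2;

% Assumed relative tolerance; adjust for the problem at hand
relativeTolerance = 1e-12;

% Treat the two sides as equal if they differ by a tiny relative amount
isEqualWithinTolerance = abs(leftSide - rightSide) <= ...
    relativeTolerance * max(abs(leftSide), abs(rightSide));
```

With this test the scaled triangle is still recognized as a right triangle.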

  2. Always tell sum(), mean(), etc. which dimension to operate along. Even though sum(X) and sum(X,1) do the same thing for a matrix X, they do different things for a row vector.
  3. Use EVAL, FEVAL, EXIST, and other self-modifying code sparingly. It may be really fun and very flexible, but it’s very confusing for anyone else who has to debug it. Do not use EXIST to repair faults in programming logic.
  4. Vectorize operations instead of using loops when possible to improve performance.
  5. Avoid MEX files when possible.
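
The points about sum() and about vectorizing can both be seen in a short sketch. The matrices and variable names below are made up for illustration.

```matlab
X         = [1, 2, 3; 4, 5, 6];
rowVector = [1, 2, 3];

% For a matrix, the default and the explicit dimension agree
defaultSumOfMatrix = sum(X);             % [5, 7, 9]
columnSumsOfMatrix = sum(X, 1);          % [5, 7, 9]

% For a row vector they differ: the default collapses the row
defaultSumOfRow = sum(rowVector);        % 6
columnSumsOfRow = sum(rowVector, 1);     % [1, 2, 3]

% Loop version: square each element one at a time
nValues     = numel(rowVector);
squaresLoop = zeros(1, nValues);
for iValue = 1:nValues
    squaresLoop(iValue) = rowVector(iValue)^2;
end

% Vectorized version: one element-wise operation, no loop
squaresVectorized = rowVector .^ 2;
```

If code expecting column sums is handed a single-row matrix, sum(X) silently returns a scalar, while sum(X, 1) keeps returning one value per column.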

References

  • Aruliah et al. “Best Practices for Scientific Computing,” arXiv, 2012. (link accessed 9/9/2015)
  • Hull, Doug. “Top 10 MATLAB code practices that make me cry,” Mathworks Blog Post, 2010. (link accessed 9/9/2015)
  • Johnson, Richard. “MATLAB Programming Style Guidelines,” 2002. (link accessed 9/9/2015)
  • Robbins, Michael. “Good Matlab Programming Practices for the Non-Programmer.” (link accessed 9/9/2015)
  • Springer, Michael and Zeba Wunderlich. “Matlab Good Programming Practices,” 2009. (link accessed 9/9/2015)
  • Verner, Eric. “Best Practices for Scientific Computing,” 2012. (link accessed 9/9/2015)
  • Wilson et al. “Best Practices for Scientific Computing,” PLoS Biol 12(1): e1001745. doi:10.1371/journal.pbio.1001745. (link accessed 9/9/2015)

Acknowledgements: Kenneth Morton for his excellent contributions to and review of these best practices.

http://www.kylebradbury.org/blog/programming-best-practices/