Octave, GNU Octave, Matlab, Scientific Computing, Language, Interpreter, Compiler, C++, LAPACK, Fortran, Fun, GNU

Monday, June 18, 2007

Octave-Lint (2)

Much as I'd like to be informal, being private remains a greater love. While thinking about the previous post on Octave-Lint over lunch, I came up with a few more useful features. In the spirit of a note-to-self, I document them here. I know that looking at tools like Pychecker or Pylint can save us time.

The following constructs make syntactic sense and pass the parser, but make no semantic sense, so Lint must trap them:
  1. x = x + stdfunc(arg); % Here, if Lint knows the standard Octave functions, it has a clue to the return type and can do a little type-checking analysis to produce warnings. E.g., x = x + strlen('hello world'); makes sense, whereas y = y + getenv("HOME") probably does not, and z = strcat(system('ls'), 'files') definitely does not. Lint knows that system() always returns a single integer as its first value, which is incompatible with strcat(). Such analysis is not currently performed statically, so one discovers these problems only at runtime, which is expensive in my opinion.
  2. Index overflow / underflow checks. In modern programming languages (Java, Python?) index overflow and underflow are flagged as runtime exceptions or, even better, discovered statically from the code. In Octave, matrices and vectors are indexed starting at 1, which lets us compute and flag array/vector indices that seem to underflow. The direct algorithm would be to pick out and flag usages such as array(index) where index <= 0 or index is non-integral. Directly negative indices are uncommon; we may catch a few 0-index usages, and sometimes fractional indices. The key problem remains identifying computed indices that may underflow or overflow. Overflow is an equally important problem for most programs, where arrays tend to be pre-allocated rather than grown in the x=[x; new_row]; style.
  3. Matrix dimension conformity in products. This is a minefield I step into every now and then. Multiplying two matrices gives A (m x n) * B (n x l) = C (m x l). This conformity rule of matrix multiplication must be enforced so that errors are flagged at compile time whenever we have an idea of the dimensions in the program. This will be tricky to implement.
  4. Corrections for misspelt variables can be suggested by looking them up in the symbol table accumulated during the parse-tree walking phase. String edit distance algorithms will be useful here.
  5. Redefined or modified predefined constants can be flagged as errors/warnings. Candidates like i, e, eps, and pi are immutable for me. Typically, many programmers write for i=1:10; end and tend to use the loop index after the loop, which can be 'caught' by Lint.
Essentially, the learned behaviour of an experienced programmer can be taught to the machine as rules specified by Lint, and we could improve the range of tools available for users to fall back on.
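
The first three checks in the list might look roughly like the sketch below. It is written in Python purely for illustration, since nothing here fixes the language Octave-Lint would be written in. Every name in it (RETURN_TYPES, ARGUMENT_TYPES, check_call, check_literal_index, check_product) is hypothetical, the type table covers only the functions mentioned above, and the index check handles only literal indices, not computed ones.

```python
# Hypothetical sketch: none of these names belong to Octave or to an existing tool.

# Type of the FIRST return value of a few standard functions (item 1).
RETURN_TYPES = {
    "strlen": "number",
    "getenv": "string",
    "system": "number",   # first return value is an integer status
    "strcat": "string",
}

# Argument type each function expects (a deliberate simplification).
ARGUMENT_TYPES = {
    "strcat": "string",
    "strlen": "string",
}


def check_call(outer, inner):
    """Warn when the result of inner() is passed to outer() with a mismatched type."""
    expected = ARGUMENT_TYPES.get(outer)
    actual = RETURN_TYPES.get(inner)
    if expected is None or actual is None:
        return None  # unknown function: stay silent rather than guess
    if expected != actual:
        return "warning: %s() expects a %s but %s() returns a %s" % (
            outer, expected, inner, actual)
    return None


def check_literal_index(index):
    """Warn about a literal index that is non-integral or below 1 (item 2)."""
    if index != int(index):
        return "warning: index %r is not an integer" % (index,)
    if index < 1:
        return "warning: index %r is below 1; Octave indices start at 1" % (index,)
    return None


def check_product(shape_a, shape_b):
    """Return the shape of A * B, or raise ValueError if the shapes do not conform (item 3).

    Assumes both operands are true matrices; scalar operands are not modelled.
    """
    rows_a, cols_a = shape_a
    rows_b, cols_b = shape_b
    if cols_a != rows_b:
        raise ValueError("nonconformant product: %dx%d * %dx%d" % (
            rows_a, cols_a, rows_b, cols_b))
    return (rows_a, cols_b)
```

With this table, check_call("strcat", "system") yields a warning while check_call("strcat", "getenv") stays silent, and check_product((2, 3), (3, 4)) gives (2, 4). The hard part, inferring the shapes and index values to feed into these checks, is left out entirely.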
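
Two of these rules, suggesting corrections for misspelt variables and flagging redefined constants, are simple enough to sketch. The Python below is only an illustration under assumptions: the function names, the max_distance threshold of 2, and the idea of passing the symbol table as a plain list of names are all hypothetical, and the constant set is just the four candidates named above.

```python
# Hypothetical sketch: names and thresholds are illustrative assumptions.

PREDEFINED_CONSTANTS = {"i", "e", "eps", "pi"}  # the candidates named in item 5


def edit_distance(a, b):
    """Levenshtein distance between two strings, computed row by row."""
    previous = list(range(len(b) + 1))
    for row, char_a in enumerate(a, start=1):
        current = [row]
        for col, char_b in enumerate(b, start=1):
            substitution = previous[col - 1] + (char_a != char_b)
            current.append(min(previous[col] + 1,      # deletion
                               current[col - 1] + 1,   # insertion
                               substitution))
        previous = current
    return previous[-1]


def suggest_names(unknown, symbol_table, max_distance=2):
    """Return known names within max_distance edits of unknown, closest first."""
    scored = [(edit_distance(unknown, name), name) for name in symbol_table]
    return [name for distance, name in sorted(scored) if distance <= max_distance]


def check_assignment(target):
    """Return a warning if an assignment target is a predefined constant, else None."""
    if target in PREDEFINED_CONSTANTS:
        return "warning: assignment redefines predefined constant '%s'" % target
    return None
```

For example, suggest_names("totl", ["total", "count"]) returns ["total"], and check_assignment("pi") returns a warning while check_assignment("x") returns None. A loop such as for i=1:10 would trip the same rule if the loop variable is treated as an assignment target.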

I have a post coming soon on Fortran & Octave, covering work with LAPACK and known design issues in such an interface.

Cheers,
Muthu

Creative Commons License