Octave, GNU Octave, Matlab, Scientific Computing, Language, Interpreter, Compiler, C++, LAPACK, Fortran, Fun , GNU

Showing posts with label AST-Walker. Show all posts
Showing posts with label AST-Walker. Show all posts

Wednesday, June 27, 2007

Octave-2-Matlab & Back (2)

Doxygen is such a nice tool. I cannot over emphasize its utility when trying to read a large codebase like GNU Octave's. So in trying to solve our parse-tree walker to convert Octave(Matlab) code to the other Matlab(Octave), we use the inheritance class list for GNU Octave Interpreter.

The following, is the class inheritance for the nodes on the parse-tree, that are in the AST. Now the interpreter or a conversion tool like Octave-2-Matlab will walk this tree. The AST Walker implements the visitor pattern supported by the tree object.



We can now look into the organization of the AST Walker, which is neatly documented/illustrated in this working piece of code at /cvs/octave/src/pt-pr-code.{cc|h}. This AST walker inherits from the base class as follows,

with tree_walker, a pure virtual base class, that encourages polymorphic design. Some of the members of tree_walker include

Now a debugger, code conversion tool, lint-like checker or whatever has to conveniently invoke the parser for the AST, and then walk it for whatever operations it needs to perform on the tree.

virtual void visit_index_expression (tree_index_expression &)=0
virtual void visit_matrix (tree_matrix &)=0
virtual void visit_cell (tree_cell &)=0
virtual void visit_multi_assignment (tree_multi_assignment &)=0
virtual void visit_no_op_command (tree_no_op_command &)=0
virtual void visit_constant (tree_constant &)=0
virtual void visit_fcn_handle (tree_fcn_handle &)=0
virtual void visit_parameter_list (tree_parameter_list &)=0

and many more functions that correspond to each language-object that derives from tree, and relevant functions that 'execute' that command or expression.

So the organization of all this code would start at the front end like:

   1 main
2 begin
3 AST=call-parser()
4 cwalk=Code_Converting_Walker(OCT2MAT)
5 cwalk.set_tree(AST)
6 cwalk.visit()
7 cleanup abracadabra
8 end
So that settles it all, if you know what to write within the pure virtual functions, visit_index_expression (tree_index_expression &)=0 and such. This conversion code is put simply in the previous blogpost by walking this tree in a post-order fashion.

Next time we could see how to write the Oct2Mat converter based on the parse tree, or atleast drive the tree-printer class code.

Cheers,
Muthu

Monday, June 25, 2007

Octave-2-Matlab and back

Octave-Lint (3)
A few more additional features that could be copied from existing tools in a similar areas. Immediately we can think of all the warnings that GCC would spit out, when you compile C/C++ code using 'gcc -Wall code.cc'. Specifically, flagging unused variables, data type conversions in known stdlibrary of Octave functions and such. Its time, I could show the code, instead of waxing eloquent about it. Just that I dont have anything to show.

Octave -> Matlab -> Octave
Next in a similar spirit of the AST walker static-checker we could also have a program transformation tool for converting Octave -> Matlab code. I could 'ask' for type inference whenever it is 'confused' by 'ambiguous' nature of the code. This is exciting, in terms of the possibilities it opens up for us. It is going to be a long project, but defnitely rewarding one.
The Octave parser is widely acknowledged on our mailing lists as a superset of Matlab
language. Using the Octave parser, you could in theory convert Matlab code the other-way
round too. Now this is double bonanza, a kind of buy-1 and you get-1 free. Am I excited?

The starting point of the program transformation tool would be from Paul Kienzle's work on
Oct2Mat conversion script. A few things are simple to understand, and require plain substituion. Paul's code does it using an AWK script, and clearly mentions not wanting to Octave's parse tree for doing the conversion. Paul attempts to do it using filters in AWK, and aims for an independent program for doing the conversion. Since our aims are to take advantage of whatever we have (GPL, Octave, Parser) then we can stand on the shoulders of giants instead of 'lets start at the very beginning, Do Re Mi'. Paul's code helps us easily identify the transformation rules to be applied.

The Octave->Matlab transformation rules, applied in Paul's code include
  1. gsub("#" , "%"); convert # to %
  2. gsub("[(][)]",""); convert () to ' ' as Matlab 4.0 and Octave-2.1.50 dont support empty arguments. This is not a problem anymore.
  3. gsub("gset[^;%]*;",""); a graphics command conversion.
  4. gsub("gset[^;]*%","%"); a graphics command conversion.
  5. gsub("gset[^;]*$",""); a graphics command conversion.
  6. gsub("endfunction",""); Octave has endif, endfor, endfunction etc, so those get converted to end, or in Matlab nothing at all. Infact Matlab complains if you have the end keyword at the end of a function file.
  7. gsub("endif","end"); Same as 6.
  8. gsub("endwhile","end"); Same as 6.
  9. gsub("endfor","end"); Same as 6.
  10. gsub("end_try_catch","end"); Same as 6.
  11. gsub(/&&/,"\\&"); Matlab didnot have the short circuit logical operators initially, so we had to use the logical &.
  12. gsub("SEEK_CUR",0); They also seemingly lacked options for STDIO processing.
  13. gsub("SEEK_END",1); Same as 12.
  14. gsub("SEEK_SET",-1); Same as 12.
  15. gsub("usage","error"); Our usage() and print_usage() are replaced by error() function.
  16. gsub("__error_text__","lasterr"); Some Octave specific code
  17. gsub("unwind_protect_cleanup",""); Again Octave was first here, w.r.t try/catch ...
  18. gsub("end_unwind_protect",""); ... doing it the LISP way.
  19. gsub("unwind_protect",""); Same as [17, 18].
  20. gsub(/\|\|/,"|"); Logical short-circuit OR operator was NOT there in Matlab first. We got here earlier.
  21. gsub("!","~"); They dont know whats Not !
  22. gsub("[\\]$","..."); Line continuations are OK for us. Not them!

These rules are easily loaded as actions for a AST walker, for operators in Octave. We just generate the do_convert_oct_to_mat() on each node. As an example, the node for the operator short-circuit && will have the conversion rule embedded in this routine as follows,

string short_ckt_and::do_convert_oct_to_mat() {
l_str=left_node.do_convert_oct_to_mat();
r_str=right_node.do_convert_oct_to_mat();
str=l_str + " & " + r_str;
return str;
}

So basically it would be a really syntactically correct way of doing things for bulilding transformation tools from the existing code base. As explained earlied since Octave being a superset of Matlab we could also turn the world the other-way round.

Ofcourse the validity of the converter can be verified by doing the rountripping, O-2-M-2-O and running the code to see if we have the regression tests validated. John Eaton (JWE), has a set of regression cases for parser, which is again something we can re-use.

It would be fun to see this happen, and I suspect this is much lower hanging fruit than writing an evaluating AST-walker for building the profiler or a debugger.

Cheers,
Muthu

Creative Commons License