GSoC 2018: Autolev Parser (using ANTLRv4): Final Report

About Me:

I am Nikhil Pappu, an undergraduate Computer Science student at the International Institute of Information Technology, Bangalore.

About the Project:

Autolev (now superseded by MotionGenesis) is a domain-specific language used for symbolic multibody dynamics. The SymPy mechanics module now has enough power and functionality to be a fully featured symbolic dynamics module. The parser translates Autolev (version 4.1) code to SymPy code by making use of SymPy’s math libraries and the mechanics module.

The parser has been built using the ANTLR framework and its main purpose is to help former users of Autolev to get familiarized with multibody dynamics in SymPy.

The Plan:

The plan was to build a parser using ANTLR that could parse Autolev code to SymPy code. Overall, I think I was able to achieve most of what I had hoped for. I faced difficulties in some areas of the parser because Autolev and Python are very different in nature, and the parser has some issues as a result. I have specified all the details in the documentation I have written.

Work Done:

I made a parser for the Autolev language which is now a part of SymPy in the parsing module. I have written the code for the parser using the ANTLR framework. I have also included a bunch of tests for testing the rules of the parser code.

The majority of the work was a part of PR #14758. I made a second PR #15006 for the changes I had made after the main PR.

I have written documentation for the parser which is a part of these PRs: #15046, #15066 and #15067.

I have also written a PyDy for Autolev Users guide which is a part of PR #15077. This guide is meant to be a quick reference for looking up Autolev-PyDy equivalents.

Future Work:

  1. The parser has been built by referring to and parsing codes from the Autolev Tutorial and the book Dynamics Online: Theory and Implementation with Autolev. Basically, the process involved going through each of these codes, validating the parser results and improving the rules where required to make sure the codes parsed well.

    As of now, a large number of codes of Dynamics Online have been parsed. Completing all the remaining codes of the book would make the parser more complete.

  2. There are some limitations and issues with the parser and these have been discussed in the documentation. The plan is to fix these in order of priority.
  3. The parser is currently built using a kind of Concrete Syntax Tree (CST) using the ANTLR framework. It would be ideal to switch from a CST to an Abstract Syntax Tree (AST). This way, the parser code will be independent of the ANTLR grammar which makes it a lot more flexible. It would also be easier to make changes to the grammar and the rules of the parser.
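To make the CST versus AST point concrete, here is a minimal sketch in plain Python. It is not the parser's code: the tuple-based CST, the rule names (expr, term, atom) and the Name and Add classes are all hypothetical and only stand in for what a grammar-shaped tree and a grammar-independent tree might look like. The idea is that code generation (emit) only ever sees the AST, so a change in the grammar would only touch to_ast.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Name:
    """AST leaf: a symbol name."""
    id: str


@dataclass
class Add:
    """AST node: the sum of two sub-expressions."""
    left: "Expr"
    right: "Expr"


Expr = Union[Name, Add]

# Hypothetical CST: (rule_name, child, child, ...); leaves are token strings.
# It mirrors the grammar, including wrapper rules and the '+' token.
cst = ("expr",
       ("term", ("atom", "a")),
       "+",
       ("term", ("atom", "b")))


def to_ast(node):
    """Collapse grammar-shaped CST nodes into a small AST."""
    rule, *children = node
    if rule == "atom":
        return Name(children[0])
    if rule == "term":
        # A pure wrapper rule: it disappears in the AST.
        return to_ast(children[0])
    if rule == "expr":
        # Children look like: term ('+' term)*
        result = to_ast(children[0])
        for child in children[2::2]:
            result = Add(result, to_ast(child))
        return result
    raise ValueError(f"unknown rule: {rule}")


def emit(ast):
    """Generate output text from the AST only (no grammar knowledge)."""
    if isinstance(ast, Name):
        return ast.id
    return f"({emit(ast.left)} + {emit(ast.right)})"


ast = to_ast(cst)
print(emit(ast))
```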

I would like to keep contributing to SymPy. I will be doing a lot of math in college, especially related to data science, so I would love to contribute in areas like Probability and Algebra among others. I would also like to help newcomers feel comfortable with the environment.

Conclusion:

I would like to thank my mentors Ondřej Čertík and Jason Moore for believing in me and taking time out from their busy schedules to guide me throughout the project. I would also like to thank Aaron Meurer for looking over GSoC as the org admin and making sure that we all had a great experience working with SymPy.

Links:

Main PR: #14758

Updated parser code PRs: #15006 and #15013

Documentation PRs: #15046, #15066 and #15067

PyDy for Autolev Users guide PR: #15077

Weekly Blog link: https://nkhlpappu.wordpress.com/


Autolev Parser: Status Update

I have made some changes to the parser code to parse more files since #14758 was merged. I have also made the changes that were suggested in that PR after it had been merged. I have opened a new PR #15006 for the updated parser code. I have also opened #15013 to include tests for physics functions, which I didn’t do in the initial PR. The GitLab repo autolev-test-examples is in good shape now and is hosted under the sympy user account.

I am currently writing the documentation in which I shall include how to use the parser, gotchas, limitations, issues and future improvements. I shall also include a rewritten version of the PyDy for Autolev Users guide in it.

I shall then write the output tests (tests that compare the outputs of Autolev against those of SymPy) for most of the test examples in the GitLab repo, and include them in a directory called output-tests there. I think it’s good to put them there, as I don’t see the need to run them on Travis: changing the parser code won’t affect them. Plus, they will sit next to the test examples they are based on. We could still run them on Travis from there if required, I suppose.
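One possible shape for such an output test is a numerical comparison: evaluate the expression reported by each tool at a few random points and check that the values agree. The sketch below is only an illustration under that assumption, not the tests in the repo; the helper agree_numerically and the three example functions are hypothetical, and the expressions are made up rather than taken from Autolev or SymPy output.

```python
import math
import random


def agree_numerically(f, g, n_args, trials=20, rel_tol=1e-9, seed=0):
    """Return True if f and g agree at `trials` random sample points."""
    rng = random.Random(seed)  # fixed seed keeps the check reproducible
    for _ in range(trials):
        args = [rng.uniform(0.1, 2.0) for _ in range(n_args)]
        if not math.isclose(f(*args), g(*args),
                            rel_tol=rel_tol, abs_tol=1e-12):
            return False
    return True


# Hypothetical: the same quantity written in two different forms,
# as two tools might print it.
def from_autolev(m, g, q):
    return m * g * math.sin(q) + m * g * math.sin(q)


def from_sympy(m, g, q):
    return 2 * g * m * math.sin(q)


# Hypothetical: a genuinely different expression that should be flagged.
def mismatched(m, g, q):
    return 2 * g * m * math.cos(q)


print(agree_numerically(from_autolev, from_sympy, 3))
print(agree_numerically(from_autolev, mismatched, 3))
```

A numerical check like this tolerates results that are equal but printed in a different form, which a plain text comparison of the two outputs would not.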

Finally, I shall wrap things up with the Final Report and Submission.


Autolev Parser: Status Update

Hello Everyone. I have been working on getting the PR #14758 into shape and now it is finally merged. I have written my own tests for the PR so as to not include copyrighted files that belong to the creators of Autolev.

I am now working on a test-examples repo which serves as a showcase of the parser and also as a source of additional tests. The repo is private on GitLab as it contains copyrighted files. You can request access at the repo link above. Files from this repo can be copied over to the test_examples folder of parsing/autolev to use them as tests. From now on, I will be adding more examples from the PyDy example repo, the Autolev Tutorial and Dynamics Online to this repo while improving the parser code to parse all of them. I am also making note of things like errors and inaccuracies to include them in the documentation.

I will open another PR once I have made enough changes to the parser code.

Here is my plan for the future of this project:

Till the end of GSoC:

  1. Work on getting the test-examples repo in good shape.
  2. Write extensive documentation (explaining what the parser can do, how to use it, limitations, issues, future improvements etc.).
  3. Work on as many Dynamics Online codes (which I shall include in the repo) as possible: wrap up Ch4 and hopefully get half of Ch5 done, as it is quite big.

Post GSoC:

  1. Finish the rest of the Dynamics Online Book (Whatever is left of Ch5 and also Ch6 which I think is less important).
  2. Work through the issues I will be listing in the documentation, one by one, after discussing possible fixes. Some of these might require changes in the parser, others require changes in the SymPy code, and for a few I do not yet have much of an idea about the fix.

Thanks,

Nikhil

Autolev Parser: Status Update

I have been working on improving the parser by parsing Dynamics Online codes, planning how to go about writing tests and other aspects of the project, and getting the PR into shape.

I am currently working on writing tests to cover all the rules of the parser. I should be done with this in 2 days.

This is the plan I have for the third phase:

  1. Make the PR merge ready:
    1. Finish the tests for the parser rules and get the PR merged.
    2. Open a new PR to work on further improvements.
  2. additional_tests (will be added in a private BitBucket repo). Here I shall go through many codes from these sources and improve the parser to parse most of these. I shall take notes on little details and errors so that I can include them in the documentation.
    1. PyDy example repo (mass spring damper, double pendulum, chaos pendulum examples)
    2. Dynamics Online Chapters 1 – 4
    3. Autolev Tutorial Examples (5.1 – 5.7)
  3. Documentation (What the parser can do, How it should be used, Limitations, Future improvements etc)
  4. Make the parser parse Dynamics Online Chapter 5 codes and the Bicycle Model.
  5. Final Report

Autolev Parser: Status Update

I have a PR for a working parser now with some test cases. The Travis errors I had previously have been fixed.

I am currently going through the chapters of the book Dynamics Online: Theory and Implementation with Autolev and parsing most of the Autolev codes I come across. I feel this would help to make the parser more complete. After getting the desired parsed code, I am also running it and checking that the results are the same as, or similar to, the Autolev responses in the .ALL files.

I have parsed the codes of Chapter 1 and 2 of the book and am currently working on Chapter 3. There are 6 Chapters overall and the bulk of the stuff is concentrated in Chapters 4 and 5.

After parsing the codes of this book, I shall update the parser code and the tests in the PR. I will add more test cases as well. I will also send in a file containing all the parsed codes of Dynamics Online.

A lot of the codes are parsing completely fine. A few I feel are quite difficult to parse to SymPy code using a parser and they wouldn’t even be in the spirit of SymPy/Python if parsed exactly. I have marked these for later. A few of them are producing slightly altered expressions or in some cases errors in SymPy. I am classifying all the codes appropriately based on criteria like this.

After parsing the book I plan on finishing up the leftover parts of the Autolev Tutorial examples and making sure the Bicycle Model Autolev code is parsed.

I will then go on to do a complete code cleanup (general cleanup, using standard conventions and better variable names, adding more comments etc).

Finally, I will wrap things up by writing the Documentation and a Final Report. In these I shall discuss: what the parser can do, how it should be used (there are some minor things in some cases that the user should note to get a proper SymPy parse), limitations and future improvements.


Autolev Parser (using ANTLR v4)

In this post, I shall discuss my project: Autolev Parser (using ANTLR v4). This is the timeline I am proposing. Please suggest any changes if required.

Timeline:

Finalize this after discussing things with the mentors.

Phase 1:

Weeks 1 and 2 (May 15 – May 28): Refactor the grammar and the preprocessing steps. Finish parsing the mathematical entities of Autolev.  Start off with parsing the symbolic dynamics part of Autolev. (Pull request #1)

Weeks 3 and 4 (May 29 – Jun 11): Work on parsing the symbolic dynamics part of Autolev using sympy.physics.vector and sympy.physics.mechanics. (Pull request #2)

 

Phase 2:

Week 5 (Jun 12 – Jun 18): Work further on the symbolic dynamics part. Work on the more difficult parts of the parser, i.e. translations which are quite indirect and require more code and tweaking as opposed to calling equivalent SymPy commands. (Pull request #3)

Weeks 6, 7 and 8 (Jun 19 – Jul 9): Work on parsing the solvers and visualization parts. Use sympy.solvers and numerical solvers from the SciPy stack to translate the algebraic, nonlinear and ODE solvers. Use pydy.codegen to solve the equations of motion numerically and use sympy.physics.units to manipulate the units before plugging the values in. Use matplotlib and pydy.viz to translate the plotting parts. (Pull request #4)

 

Phase 3:

Week 9 (Jul 10 – Jul 16): Implement the error handling and recovery mechanism discussed in the Design section.

Weeks 10, 11 and 12 (Jul 17 – Aug 6): Refactor and tie up all the things done earlier and work on writing lots of test cases and well documented examples. Run benchmarking tests and make sure the results obtained are correct. (Pull request #5)

Week 13 (Aug 6 – Aug 14): Wrap everything up. Refactor the code, tests and examples and do a cleanup. (Pull request #6)

 

Code Setup:

Now I shall discuss some of the technical details about how my code is set up. You can find the code on my GitHub here.

  1. The antlr_essentials directory contains the antlr jar file and some bat files. If you want to run antlr commands, first specify the classpath of this jar file in the command prompt.
  2. The test_files directory contains the example test files on which I run the code. The files example5.1-example5.9 are examples of the same names in the Autolev Tutorial. I have also included whipple.txt which is the bicycle model Autolev code. The directory also contains png image files of the parse trees generated when the grammar is run on these test files.
  3. Autolev.g4 is the grammar file. AutolevLexer.py, AutolevParser.py and AutolevListener.py are files automatically generated by ANTLR. A listener is basically a tree walker which triggers events when it enters and exits a rule. One can use this mechanism by subclassing AutolevListener.
  4. The file myListener.py contains the myListener class which simply subclasses AutolevListener. This is where I will be putting the majority of the parser code.
  5. The file autolev.py initializes the parser, lexer and listener and is the file you would want to run. Running ‘python autolev.py input.txt’ on the file input.txt which contains Autolev code would generate the parsed SymPy output in an output file of choice.
  6. The pydy_for_autolev_users.rst is a guide and I will keep updating it as I keep parsing more of the Autolev language.
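To illustrate the listener mechanism mentioned in point 3, here is a small self-contained sketch of the pattern in plain Python. It does not use ANTLR or the generated AutolevListener: the Node, BaseListener, RecordingListener and walk names are all hypothetical stand-ins, and the real generated listener has one enter/exit method pair per grammar rule rather than the single generic pair used here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A toy parse-tree node: a rule name plus child nodes."""
    rule: str
    children: List["Node"] = field(default_factory=list)


class BaseListener:
    """Plays the role of a generated listener: every hook is a no-op."""

    def enter_rule(self, node):
        pass

    def exit_rule(self, node):
        pass


class RecordingListener(BaseListener):
    """Subclass that overrides the hooks, as myListener would."""

    def __init__(self):
        self.events = []

    def enter_rule(self, node):
        self.events.append(("enter", node.rule))

    def exit_rule(self, node):
        self.events.append(("exit", node.rule))


def walk(listener, node):
    """Depth-first walk that fires enter/exit events around each node."""
    listener.enter_rule(node)
    for child in node.children:
        walk(listener, child)
    listener.exit_rule(node)


# Hypothetical tree for a single assignment statement.
tree = Node("prog", [Node("assignment", [Node("expr")])])
listener = RecordingListener()
walk(listener, tree)
for event in listener.events:
    print(event)
```

Because the exit event for a rule fires only after all of its children have been visited, an exit hook is a natural place to assemble output from whatever the child rules produced.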

I haven’t put much of an emphasis on code readability. Should this be an aspect to focus on? I am thinking the parsed output and the results are of primary concern. One reason I feel this way is that this isn’t core SymPy code, the project is niche, and the majority of the code is based in ANTLR. Please tell me if you don’t think the same.

Status of the Project:

I have finished writing the grammar for the Autolev language. I am now partway through parsing the mathematical entities.

Parsing Mathematical Entities :
# Constants ✓
# Variables ✓
# Imaginary ✓
# MotionVariables ✓
# Specifieds ✓
# Expression reconstruction ✓
# Assignments ✓
# Math commands (deal with function calls) and expressions
# Reserved names and constants (T, Pi etc)
# Matrices

I have written code to parse the basic mathematical variable declarations such as constants, variables, specifieds etc. I have also written code for expression reconstruction and assignment. I will now work on all the math commands, reserved names and matrices. I will be handling vectors and dyadics in the physics part as sympy.physics.vector is more appropriate in the sense of Autolev and symbolic dynamics. Go over pydy_for_autolev_users.rst to get an idea of the different mathematical commands and physical entities. You can have a look at input1.txt and the parsed output in output1.txt in the test_files directory of autolev-parser on my GitHub.
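As a toy illustration of what translating a declaration means, the sketch below turns a single Autolev-style constants line into a line of Python text that would create SymPy symbols. It is a standalone string transformation, not the ANTLR-based parser: the function translate_constants is hypothetical, the exact output format (including real=True) is an assumption made for the example, and lowercasing the names assumes that Autolev treats names case-insensitively.

```python
def translate_constants(line):
    """Turn 'CONSTANTS A, B' into a Python statement creating symbols.

    Hypothetical helper; only handles a plain comma-separated name list.
    """
    keyword, _, rest = line.strip().partition(" ")
    if keyword.lower() != "constants":
        raise ValueError(f"not a constants declaration: {line!r}")
    # Assumption: names are case-insensitive, so normalise to lowercase.
    names = [name.strip().lower() for name in rest.split(",") if name.strip()]
    if not names:
        raise ValueError("no names declared")
    targets = ", ".join(names)
    symbol_list = " ".join(names)
    return f"{targets} = symbols('{symbol_list}', real=True)"


print(translate_constants("CONSTANTS A, B"))
```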

I will get started with the symbolic dynamics part next once I am done with the mathematical entities.

Do you think I am headed in the right direction with the project? Please let me know what you think. Any feedback is much appreciated.
