Before software can be reusable, it first has to be usable.
- Ralph Johnson
Now, the operations I was talking about amount to, in my skeletal code structure modelled on NVector_Serial, upwards of 1900 lines of code, and even reading through all of that to understand which operations were required, and how they should be implemented, was quite overwhelming to begin with. But thankfully, good programmers stick to good programming practices, and so did the ones who wrote this.
Most of the code in the different operations accesses data from the content field, i.e. the ColumnVector, through four macros conveniently defined in the header file, and those were what I needed to work with first and foremost to get the example running.
NV_CONTENT_C is used every time the ColumnVector has to be accessed from the code to apply a specific operation on it. Since the content field is actually a void pointer, with no associated data type, we typecast it here to get the ColumnVector back. NV_LENGTH_C uses the numel() function of the ColumnVector class to return the exact number of elements in the vector, which is needed for element-wise operations throughout the code. NV_DATA_C provides a way to access the pointer to the actual block of memory allocated for the data stored in this ColumnVector. Finally, the NV_Ith_C macro allows any element in the vector to be accessed by its index, once again proving very useful for element-wise operations.

Using these macros, and quite some debugging to make sure only the required functions were implemented (some of the optional ones don't work as intended), I was able to get the idaHeat2Dkry.c example working with my custom implementation of the N_Vector. However, this was only temporary, as optional functions not used here could still be required by IDALS or even by the matrix / linear solver implementations. One such instance was N_VSetArrayPointer(), which didn't work as intended because our macro returned a data member private to the class, so it wasn't possible to write to it. This will be tackled later in the project.
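To make this concrete, here is a minimal sketch of how these four macros could be defined over an N_Vector whose content field holds a ColumnVector. The definitions are my reading of the description above rather than the project's actual code, and the read-only data() accessor is an assumption that happens to match the write problem just mentioned.

```cpp
// Hedged sketch: accessor macros for an N_Vector backed by an Octave
// ColumnVector.  Macro names match the post; exact definitions are assumed.
#include <octave/dColVector.h>          // ColumnVector
#include <sundials/sundials_nvector.h>  // generic N_Vector

// content is a void *, so cast it back to the ColumnVector stored there
#define NV_CONTENT_C(v)  (static_cast<ColumnVector *> ((v)->content))

// number of elements, via ColumnVector::numel ()
#define NV_LENGTH_C(v)   (NV_CONTENT_C (v)->numel ())

// pointer to the underlying block of doubles (read-only in this sketch,
// which is exactly the limitation N_VSetArrayPointer runs into)
#define NV_DATA_C(v)     (NV_CONTENT_C (v)->data ())

// i-th element, convenient for element-wise loops
#define NV_Ith_C(v, i)   (NV_DATA_C (v)[i])
```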
Thankfully, all of SUNDIALS has been written so that it can be linked from C++, with extern "C" being used to prevent name mangling by the C++ compiler. Hence, I wrote the files for my custom implementations, as well as the example (which would eventually be replaced by the __ode15__ file), in C++, once again taking care to wrap them with extern "C", and was able to successfully make it work.
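As an illustration, a header for such an implementation ends up looking roughly like the sketch below; the constructor and destructor names are hypothetical, not the project's actual ones, and the index type is assumed to be sunindextype.

```cpp
// Hedged sketch of a C++ header whose C-facing entry points are wrapped in
// extern "C" so the C parts of SUNDIALS can call them without name mangling.
// N_VNew_Col / N_VDestroy_Col are hypothetical names for the
// ColumnVector-backed constructor and destructor.
#include <sundials/sundials_nvector.h>

#ifdef __cplusplus
extern "C" {
#endif

N_Vector N_VNew_Col (sunindextype length);   // create a new vector
void N_VDestroy_Col (N_Vector v);            // free it again

#ifdef __cplusplus
}
#endif
```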
The drill remained pretty much the same: identify the class to be used in Octave, link the correct libraries, find the relevant functions to implement the macros, and test the code example.
Switching up the example

The example being used earlier, idaHeat2Dkry.c, was a simple equation that did not require the use of the sparse matrix. So I had to go back to the base example most similar to Ode15, idaHeat2Dklu.c. This also required me to link the KLU library correctly, which changed the Makefile as well, but it worked successfully.
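For context, the KLU-specific part of that example boils down to creating a sparse matrix and handing it, together with the KLU linear solver, to IDA. The sketch below uses the newer SUNDIALS function names (older releases use SUNKLU and IDADlsSetLinearSolver instead), so treat the exact calls as an assumption rather than a copy of the example.

```cpp
// Hedged sketch: attaching a sparse Jacobian matrix and the KLU direct
// linear solver to an already-initialized IDA solver object.
#include <ida/ida.h>
#include <sunmatrix/sunmatrix_sparse.h>
#include <sunlinsol/sunlinsol_klu.h>

// ida_mem must already be set up by IDACreate/IDAInit, uu is the solution
// vector, neq the problem size and nnz an upper bound on the nonzeros.
static int attach_klu (void *ida_mem, N_Vector uu,
                       sunindextype neq, sunindextype nnz)
{
  SUNMatrix A = SUNSparseMatrix (neq, neq, nnz, CSC_MAT);  // CSC storage
  SUNLinearSolver LS = SUNLinSol_KLU (uu, A);              // KLU solver
  return IDASetLinearSolver (ida_mem, LS, A);              // hand both to IDA
}
```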
Integrating into Octave’s source tree
Since both the implementations were somewhat working, I thought of adding both of them to the Octave source tree and testing them alongside Ode15. To do this, I added all four files, the header and source for each implementation, to octave/libinterp/dldfcn, i.e. the same path where __ode15__.cc resides, and added them to the module.mk file so they could be included in the Makefile auto-generated by Autotools.
Here, I found out that N_VSetArrayPointer() is in fact required, and the fact that it does not work causes the entire solver to collapse, so I'll be working on getting that fixed.
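To spell the problem out, here is a hedged sketch (mine, not the project's source) of why this function breaks with the macros above: SUNDIALS expects N_VSetArrayPointer to make the vector use a caller-supplied data array, but ColumnVector::data() only hands out a const pointer, so there is nothing we are allowed to write through or redirect.

```cpp
// Hedged sketch of the failing operation for a ColumnVector-backed vector.
#include <octave/dColVector.h>
#include <sundials/sundials_nvector.h>
#include <sundials/sundials_types.h>

static void N_VSetArrayPointer_Col (realtype *v_data, N_Vector v)
{
  ColumnVector *col = static_cast<ColumnVector *> (v->content);

  // col->data () = v_data;   // not possible: data () returns const double *
  //                          // and is not assignable
  // One option to explore later (an assumption, not a decision yet) would be
  // to copy v_data through a writable pointer such as col->fortran_vec ().
  (void) v_data;
  (void) col;
}
```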
There is also the question of performance: when code is vectorized, Octave delegates an operation to an underlying implementation which, among other optimizations, may use special vector hardware instructions or could conceivably even perform the computations in parallel. In general, if the code is vectorized, the underlying implementation has more freedom about the assumptions it can make in order to achieve faster execution.
Hence this part is crucial, and I will be spending the next one to two weeks going over the ridiculously large amount of unoptimized code, trying to improve it with wrappers around the actual ColumnVector functions and getting these custom implementations right.
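As a flavour of what such a wrapper could look like, here is a minimal sketch, assuming the content layout described above and not taken from the project, of the linear-sum operation (z = a*x + b*y) expressed through ColumnVector arithmetic instead of an explicit element-wise loop:

```cpp
// Hedged sketch: z = a*x + b*y via ColumnVector arithmetic, so Octave's own
// (potentially vectorized) kernels do the work instead of a hand-written
// element-wise loop.  The content layout is assumed, as above.
#include <octave/dColVector.h>
#include <sundials/sundials_nvector.h>
#include <sundials/sundials_types.h>

static void N_VLinearSum_Col (realtype a, N_Vector x,
                              realtype b, N_Vector y, N_Vector z)
{
  const ColumnVector& xv = *static_cast<ColumnVector *> (x->content);
  const ColumnVector& yv = *static_cast<ColumnVector *> (y->content);
  ColumnVector& zv = *static_cast<ColumnVector *> (z->content);

  // One ColumnVector expression instead of a loop over every element
  zv = a * xv + b * yv;
}
```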