Tuesday, September 6, 2011

SICSA2011

From 29 August to 1 September there was a small summer school at St Andrews, Scotland: Advanced techniques in computer algebra systems development.

It covered CAS internals, memory management and garbage collection, parallelization, thread safety, etc.

In particular, here is my talk about the Singular memory management library: omalloc.

Wednesday, March 16, 2011

Testing contributed libraries

A new Singular release is in the pipeline. In the new release, some libraries previously distributed as experimental will be updated and promoted to standard. We wanted to send an email to the authors of experimental libraries to remind them to
  • send us the latest version of their libraries and
  • check whether their libraries conform to the requirements for standard libraries.
Apart from the examples provided in the documentation, each library should provide a list of commands and the corresponding output to check the correctness of all code paths. These will be included in the Singular test suite to verify that the library works as intended on all platforms supported by Singular in future releases.

While working on the email, it became apparent that the instructions for adding tests for libraries are buried in the source code. For easy access before they are published in the online manual for the next release, here are the lines from the file Tst/README with instructions on providing test data.

To add a new test for a library/command to the test-suite:
  • For most library files, two sets of test files need to be provided.
    • Short tests should test the essential functionality of the library/command in a relatively short time (say, in no more than 30s).
    • Long tests should test the functionality of the library/command in detail so that, if possible, all relevant cases/results are tested. Nevertheless, such a test should not run longer than, say, 10 minutes.
    If useful tests generally execute in a short time, it is enough to have short tests only.
  • Each set should include
    • a tst file containing the code to execute the test (details of the format of these files are explained below),
    • and a res file with the corresponding output.
  • The test files should follow this naming convention (replace xx with the library name):
    • xx_s.tst: Singular code for short and basic tests
    • xx_s.res: Output of xx_s.tst
    • xx_l.tst: Singular code for long and extended tests
    • xx_l.res: Output of xx_l.tst
    or, alternatively:
    • xx.tst: Singular code for short tests only
    • xx.res: Output of xx.tst
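
The naming convention above can be checked mechanically before submitting a library. The following is a minimal Python sketch, not part of the Singular test suite; the helper names expected_test_files and missing_test_files are hypothetical and only illustrate the rule.

```python
def expected_test_files(lib_name, long_tests=True):
    """Return the tst/res file names the naming convention asks for.

    lib_name is the library name without the .lib suffix.
    Hypothetical helper: Singular itself does not ship this function.
    """
    if long_tests:
        # Separate short (_s) and long (_l) test sets.
        stems = [lib_name + "_s", lib_name + "_l"]
    else:
        # Short tests only: plain xx.tst / xx.res.
        stems = [lib_name]
    return [stem + "." + ext for stem in stems for ext in ("tst", "res")]


def missing_test_files(lib_name, present, long_tests=True):
    """List the expected files that do not occur in the iterable `present`."""
    have = set(present)
    return [name for name in expected_test_files(lib_name, long_tests)
            if name not in have]
```

For example, missing_test_files("xx", ["xx_s.tst", "xx_s.res"]) reports that xx_l.tst and xx_l.res are still to be provided.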
Rules for providing tst files:
  1. tst files always start with the following three commands as preamble:
    LIB "tst.lib";
    tst_init();
    tst_ignore("CVS ID $Id$"); // or version number here
    tst_init() writes some general info to stdout (like date, uname, hostname, version, etc.). The library tst.lib (contained in the Singular distribution) provides, among others, the routines tst_init() and tst_ignore().
  2. tst files should end with the following statements:
    tst_status(1); $
    which enables (automatic) checks of the timing/memory performance of Singular.
  3. All system-dependent output (like run-times, memory usages, pathnames, dates, etc.) should generally be avoided.
  4. After time/memory critical sections of the tst files, the command
    tst_status();
    should be inserted. This enables (automatic) checks of the timing/memory performance of Singular since the last call to tst_status() (resp. since the start-up of Singular).
  5. If system-dependent output cannot be avoided, the routine tst_ignore() should be used:
    tst_ignore(val [, keyword]): 'val' can have arbitrary type for which a string conversion exists; if present, keyword must be one of the following strings: "time", "memory"
    tst_ignore() outputs 'val' by adding the following prefix:
    no keyword -- // tst_ignore:
    "time" keyword -- // tst_ignore: time:
    "memory" keyword -- // tst_ignore: memory:
    which causes automatic tests to ignore these lines when doing a diff on result files. 
An example of a short tst file can be found here:
http://www.singular.uni-kl.de:8002/trac/browser/trunk/Tst/Short/gmssin_s.tst
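
To illustrate why the tst_ignore() prefix matters, here is a minimal Python sketch of a comparison that skips such lines. It is not the actual Singular test driver; the exact matching rule (a line starting with "// tst_ignore:") and the function names are assumptions made for illustration.

```python
import difflib

# Prefix written by tst_ignore(), as described in the rules above.
IGNORE_PREFIX = "// tst_ignore:"


def strip_ignored(lines):
    """Drop lines marked by tst_ignore() and strip trailing newlines."""
    return [line.rstrip("\n") for line in lines
            if not line.startswith(IGNORE_PREFIX)]


def diff_results(expected, actual):
    """Return a unified diff of two outputs, ignoring tst_ignore lines.

    An empty list means the outputs agree on everything that is
    not system dependent.
    """
    return list(difflib.unified_diff(strip_ignored(expected),
                                     strip_ignored(actual),
                                     "expected", "actual", lineterm=""))
```

Two result files that differ only in a line such as "// tst_ignore: time: ..." then compare as equal, while any difference in the mathematical output still shows up in the diff.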

You can download the newest version of tst.lib from
ftp://www.mathematik.uni-kl.de/pub/Math/Singular/singular-cd/LIBRARIES/tst.lib


Monday, February 14, 2011

Dynamic vs. static vs. shared

It is little known that Singular has a kind of plug-in mechanism. We call it dynamic modules, and it uses dlopen to add extended kernel functionality to Singular while keeping the binary executable small. (It might be better called a shared module, as we will see below.)

The module itself is compiled and linked like a shared library. Well, maybe the linker options are a little unusual. I realized this while preparing the pyobject extension. The latter enables Singular to handle objects from Python; in particular, it enables Singular to load and execute routines from our sister project PolyBoRi.

I finished the extension itself some weeks ago, but at first it was statically linked into the Singular binary. Hence, my personal development copy of the Singular binary became dependent on libpython. To avoid this, the Singular team came up with the idea of turning the extension into a dynamic module. The change was not too complicated in principle, since I had already gained some experience with dynamic modules while preparing pyobject's predecessor, psico. The extension is still deactivated by default, but it can easily be activated while configuring:
./configure --with-python

After rebuilding (make install), a Singular session can seamlessly access PolyBoRi:
> python_import("polybori");
> def r = declare_ring(list(Block("x", 10), Block("y", 10)));
> list polybori_ideal = (x(1)+x(2),x(2)+y(1));
> def result = groebner_basis(polybori_ideal);
> result;
[x(1) + y(1), x(2) + y(1)]
> Auf Wiedersehen.


After that, things got complicated and the real work started. My pyobject.so (the plug-in) depended on the runtime library libpython (and on the dependencies of the latter). In general, this is not bad: if you build Singular from scratch, building succeeds if and only if the dependencies were there in the first place. It is also not bad if a package management system (rpm, deb, and the like) does the building, because it resolves such dependencies for you. To cut a long story short: if you want to use Python, you'll have to install Python on your system. By the way, I don't know of any full-featured distribution (Linux or Unix) that runs without it, so this is not really a challenge. (It would be one if you wanted to install Singular on an embedded system. But that's a challenge anyway.)

On the other hand, the Singular team distributes fall-back binaries for users who are not able to build Singular on their own. These binaries must not have any external dependencies, because system libraries vary from distro to distro.

What about the dynamic modules which already come with Singular?
Answer:
> ldd p_Procs_FieldQ.so
        statically linked
This was a surprise. Since dynamic modules are meant to keep the binary small, I did not expect the modules themselves to be linked statically: by incorporating a whole bunch of system library code, the modules become quite fat.

To be honest, until I typed that ldd command above, I thought that shared and static were mutually exclusive. They are not: the opposite of static linking is dynamic linking, while the opposite of a shared library is a static library. A shared library can itself be linked either dynamically or statically against the system libraries. So statically linked dynamic modules do make sense in the Singular context of providing fall-back binaries.

How can I achieve this with pyobject.so? Just adding -shared -Xlinker -static to the Makefile did not do the job. The module was compiled and linked without error, but loading it yielded a dlopen error: the symbol _Py_NoneStruct was not found.

A quick Google search suggested that there is something special about PyNone, so I avoided using it. But then the next unresolved symbol turned up. D'oh! dlopen only complains about the first missing symbol. Actually, I had never linked against libpython at all, because I had put the libraries in the wrong order, and with -static the order does matter. Also, linking a shared object like this does not complain about missing symbols, because they might be resolved at run time from the binary or from other shared libraries. The latter was ruled out by the -static flag, and so my module ended up broken.
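
Why the library order matters can be seen in a toy model of a linker that makes a single left-to-right pass over its inputs. The Python sketch below is a deliberate simplification (each archive is treated as one unit, and the symbol sets are illustrative, not taken from the real libraries); it is not how any actual linker is implemented.

```python
def link(inputs, allow_undefined=False):
    """Toy single-pass linker: return the symbols left unresolved.

    Each input is a tuple (name, kind, defines, needs), where kind is
    "object" or "archive". Objects are always pulled in. An archive is
    pulled in only if it defines a symbol that is still undefined at
    the moment the linker reaches it; otherwise it is skipped for good.
    allow_undefined mimics building a shared object, where leftover
    symbols are not reported at link time.
    """
    defined, undefined = set(), set()
    for name, kind, defines, needs in inputs:
        if kind == "archive" and not (undefined & set(defines)):
            continue  # nothing needs this archive yet, so it is dropped
        defined |= set(defines)
        undefined |= set(needs)
        undefined -= defined
    if undefined and not allow_undefined:
        raise RuntimeError("undefined symbols: " + ", ".join(sorted(undefined)))
    return sorted(undefined)


# Illustrative symbol sets (assumptions, heavily simplified).
pyobject = ("pyobject.dl_o", "object", (), ("_Py_NoneStruct",))
libpython = ("libpython2.6.a", "archive", ("_Py_NoneStruct",), ("dlopen",))
libdl = ("libdl.a", "archive", ("dlopen",), ())

wrong_order = link([libpython, libdl, pyobject], allow_undefined=True)
right_order = link([pyobject, libpython, libdl], allow_undefined=True)
```

In this model, wrong_order still contains _Py_NoneStruct although the link step itself reported no error, while right_order is empty. That matches the behaviour described above: the failure only shows up later, when the module is loaded.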

Finally, I ended up with the following linker call:
  g++ pyobject.dl_o -Xlinker -static -nodefaultlibs -Xlinker -export-dynamic -shared -L/usr/lib64/python2.6 -L/usr/lib64 -lpython2.6  -lpthread -ldl  -lutil -lc -o pyobject.so

where -lpthread -ldl -lutil satisfy the dependencies of libpython2.6, and the construct -nodefaultlibs ... -lc avoids trying to link against some unused shared C++ libraries (which, of course, is not possible statically).

Note that this only works if all libraries are available in their static lib***.a variants. Moreover, to get position-independent code (i.e. code suitable for a shared library), each of them has to be compiled with -fPIC, which is not common for .a libraries. So ./configure --with-python tests for all of this and falls back to the classical dynamically linked shared modules in case of failure. Also, the user can force the build of these lightweight modules by typing
  ./configure --with-python=module,dynamic
at the command prompt. In addition,
   ./configure --with-python=embed
will incorporate the pyobject extension completely into the binary.

My best,
  Alexander
P.S.: I'm aware that there are more dependencies besides libpython, for instance the Python standard library (written in Python itself), which needs to be bundled to provide a completely distro-independent binary distribution. But that's another story.