=============
Multilanguage
=============

.. Modified: 2019-02-23/17:05-0500 btiffin
.. Copyright 2016 Brian Tiffin
.. This file is part of the Unicon Programming document
.. GPL 3.0+ :ref:`license`

.. image:: images/unicon.png
   :align: center

.. only:: html

    :ref:`genindex`
    :floatright:`Unicon`

.. index:: multilanguage

.. _multilanguage:

Multilanguage programming
=========================

By necessity, this chapter is not solely Unicon focused.

Development with multiple programming languages, across multiple paradigms
can be highly rewarding when applied to non-trivial application development.
This can be as basic as adding a scripting engine to a program, or far more
sophisticated, with multiple modules integrated into a cohesive whole.

For example, Emacs, a text editor now over 30 years old, merged a C
programming core with Lisp scripting.  Not the first multilanguage
environment, but a relatively famous, and highly successful one.

Unicon is well placed for ``general purpose programming``, the feature set
allowing for many problems to be solved with only Unicon source code.  This
is especially true when programming in the small, utility programs and light
applications.

One thing missing from base Unicon is end user scripting.  Extending a core
Unicon program means writing new Unicon source and recompiling the
application.  Emacs has shown that allowing the end user to extend a program
with scripts is a powerful attribute, providing a path to add useful features
that were not originally designed into an application.

Along with scripting, there are vast quantities of libraries written in other
languages that may benefit a Unicon program, alleviating the need to write
new code to solve an already solved problem.

Luckily, Unicon allows for extending programs with foreign language libraries
and exposing some powerful scripting engines (with some small effort).

A small downside is that Unicon requires a specific interface
protocol,\ [#protocol]_ so there are usually a few lines of C language source
involved.  This extra effort is fairly boilerplate, and working samples
already exist, making this a rather convenient inconvenience.

Unicon ships with multiple forms of C integration (:ref:`loadfunc` and
:ref:`callout` to name two), and from there, all kinds of potential is
exposed.

Multilanguage programming with Unicon will usually be a two or three language
mix, ``Unicon``, ``C``, and any other language in use for the modules at
hand.  This is possible because almost every programming language in
existence provides access to an :ref:`ABI`, application binary interface,
compatible with platform specific C compilers.  C becomes the crossroads of a
*many to one to many* integration hub.

The required C layer is usually small in terms of binary size and source code
line count, and as mentioned earlier, there are existing examples to ease the
burden on a Unicon developer.

See :ref:`slang`, :ref:`COBOL integration `, :ref:`Javascripting ` and
:ref:`Mini Ruby ` for some example integrations.

.. [#protocol] The Unicon foreign function interface protocol is in place to
   manage the high level Unicon :doc:`datatypes ` expectations.  Unicon data
   types are held in encoded descriptors, and these need conversion to and
   from the native ABI.

.. index:: loadfunc

loadfunc
--------

`loadfunc` is a handy piece of kit.  It allows Unicon to load libraries and
then define Unicon functions that will invoke special C functions.

The protocol\ [#protocol]_ requires a small piece of C that accepts
``int argc, descriptor argv[]``, or a count of arguments and pointer to an
array of Unicon descriptors.  Argument 0 is reserved for the return result,
so ``argc`` is always at least 1.  The ``descriptor`` data structures are
defined in :file:`ipl/cfuncs/icall.h` and that header file must be included
in the source that defines the callable functions.

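
As a rough illustration of that calling convention, here is a stand-alone
model of the ``argc``/``argv`` protocol.  The ``demo_descriptor`` type, the
``demo_addtwo`` name and the return codes are hypothetical stand-ins, chosen
so the sketch compiles without the Unicon sources; a real wrapper would
include :file:`ipl/cfuncs/icall.h` and work with the ``descriptor`` type
defined there.

.. sourcecode:: c

    #include <assert.h>
    #include <stddef.h>

    /* Hypothetical stand-in for a Unicon descriptor: a type tag plus a value.
       The real descriptor is defined in icall.h and is laid out differently. */
    typedef struct {
        char tag;    /* 'i' marks an integer, 'n' marks null (sketch only) */
        long ival;   /* integer payload, meaningful when tag == 'i' */
    } demo_descriptor;

    /* Model of a loadfunc style entry point that adds two integers.
       argv[0] is the result slot; the arguments sit in argv[1] and argv[2].
       Returns 0 on success and -1 on failure (a convention of this sketch). */
    int demo_addtwo(int argc, demo_descriptor argv[])
    {
        assert(argv != NULL);

        /* Refuse calls that cannot have supplied two arguments. */
        if (argc < 2) return -1;

        /* Both arguments must already be integers in this simplified model. */
        if (argv[1].tag != 'i' || argv[2].tag != 'i') return -1;

        /* Hand the result back through the reserved slot. */
        argv[0].tag = 'i';
        argv[0].ival = argv[1].ival + argv[2].ival;
        return 0;
    }

In a real wrapper, the hand written tag test and result assignment would be
replaced by the ``ArgInteger``, ``IntegerVal`` and ``RetInteger`` macros from
:file:`icall.h`.
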
The ``descriptor`` is what gives Unicon its variant data type powers.  A data
item can be a string, a number, a list, a real, etcetera and the structure
can be assigned to a variable (or passed to C functions).  Variables do not
have a type, and can reference any data, determined by the descriptor.

The :doc:`programs` section has many examples of using `loadfunc`.

The small wrappers, written in C, are then free to call any other C function
as a native call.  Results from these functions can then be passed back to
Unicon by setting the ``argv[0]`` element.  There are macros to make this
very easy.  ``Fail``, ``Return``, ``RetString``, ``RetInteger``, ``RetReal``,
and so on, all documented in :file:`ipl/cfuncs/icall.h`.

There are also helper macros for testing the Unicon type passed into the
wrapper, and for converting the descriptor values to native C data types.
``IconType`` is used for testing.  ``ArgString`` will ensure (and convert if
necessary) a particular descriptor is a C ``char`` pointer.  ``StringVal``
will return the ``char *``.  Same for ``ArgInteger``, ``ArgReal``,
``ArgList`` (and others) along with the corresponding ``IntegerVal``,
``RealVal``, ``ListVal``, etcetera.  Read :file:`icall.h` for all current
details.

.. index:: native

.. _cnative:

C Native
--------

.. note::

    Superseded, see `libffi` below.  But read through this section as it
    details a lot of important background.  The libffi version uses a very
    similar Unicon programmer interface, but offers a lot more platforms from
    a much more mature and well tested codebase.

Here is an experimental enhancement to Unicon that allows directly calling
almost any C function, without need for a wrapper.  This builds on `loadfunc`
and defines 2 (or 3) new loadable functions.  ``addLibrary`` will add a
dynamic shared object library to the loader search path.  ``native`` allows
for calling C functions directly, by string name and an enumerated constant
to determine the return type.
Other arguments are tested by Unicon type and passed to C having built a call
frame, using inline assembler.

The listing below provides support for x86_64, System V ABI calling
conventions.  Other platform specific assembler will be required to add
support for other systems.

This is relatively straightforward inline assembler, 17 instructions for the
initial trials.  It allows up to 6 arguments, mixed integer, real or
pointer/handle argument types and handles void returns along with integral
(numbers/pointers), and ``double`` floating point data for use as Unicon
`real`, along with `string` types.

The code needs to be loaded into Unicon space, using `loadfunc`.
``addLibrary`` allows Unicon to add libraries to the dynamic search path.
``native`` is then used to look up the call vectors (by string name, similar
to `loadfunc`) but then marshals the other arguments and sets up a valid
Unicon return descriptor.  No other wrappers are required.

First the new functions:

.. literalinclude:: programs/uniffi/native.c
   :language: c
   :start-after: +*/

.. only:: html

    .. rst-class:: rightalign

        :download:`programs/uniffi/native.c`

Before the sample run, here are the two small support files for enumerating
the encoded return types, and other related paperwork.  These two files must
be kept in sync.

A Unicon ``$include`` file:

.. literalinclude:: programs/uniffi/natives.inc
   :language: unicon
   :start-after: ##+

.. only:: html

    .. rst-class:: rightalign

        :download:`programs/uniffi/natives.inc`

(Note the filename extension, ``.inc``; it is a Unicon preprocessor include
file.)

A small C ``#include`` header.

.. literalinclude:: programs/uniffi/natives.h
   :language: c
   :start-after: +*/

.. only:: html

    .. rst-class:: rightalign

        :download:`programs/uniffi/natives.h`

Now a test head of small C functions to try out various call frames.

.. literalinclude:: programs/uniffi/testnative.c
   :language: c
   :start-after: +*/

.. only:: html

    .. rst-class:: rightalign

        :download:`programs/uniffi/testnative.c`

Unicon code to put it all in motion:

.. literalinclude:: programs/uniffi/testffi.icn
   :language: unicon
   :start-after: ##+

.. only:: html

    .. rst-class:: rightalign

        :download:`programs/uniffi/testffi.icn`

Two commands to prep the new support function and build the test heads:

.. command-output:: gcc -o uninative.so -shared -fPIC native.c
   :cwd: programs/uniffi

.. command-output:: gcc -o libtestnative.so -shared -fPIC testnative.c
   :cwd: programs/uniffi

And finally, the sample run:

.. command-output:: unicon -s testffi.icn -x
   :cwd: programs/uniffi

Woohoo, swing your partner round and round.  Calling C without any wrapper
functions.

--------

.. index:: libharu, hpdf, PDF

libharu
.......

This makes things a little easier when it comes to some of the more feature
rich libraries available.  The first trial is ``libharu``, a C library for
creating PDF documents.

.. literalinclude:: programs/uniffi/haru-v1.icn
   :language: unicon
   :start-after: ##+

.. only:: html

    .. rst-class:: rightalign

        :download:`programs/uniffi/haru-v1.icn`

And then generating a password protected sample PDF.

.. command-output:: unicon -s -DPROTECTED haru-v1.icn -x
   :cwd: programs/uniffi

In a less short example, the password PROTECTED option would be tested at
runtime, not at compile time, along with other control options for things
like supported page sizes, compression and default character encodings.

.. command-output:: ls -l harutest-pass.pdf
   :cwd: programs/uniffi

--------

.. index:: GnuCOBOL

GnuCOBOL
........

Calling COBOL modules is now a breeze.

.. literalinclude:: programs/uniffi/cobolnative.cob
   :language: cobol
   :start-after: *>+<*

.. only:: html

    .. rst-class:: rightalign

        :download:`programs/uniffi/cobolnative.cob`

A caller:

.. literalinclude:: programs/uniffi/testcob.icn
   :language: unicon
   :start-after: ##+

.. only:: html

    .. rst-class:: rightalign

        :download:`programs/uniffi/testcob.icn`

A build, and a run:

.. command-output:: cobc -m cobolnative.cob -Wno-unfinished
   :cwd: programs/uniffi

.. command-output:: unicon -s testcob.icn -x
   :cwd: programs/uniffi

And there is Unicon calling COBOL, with no additional wrappers, besides the
``native`` functions.  Data passed, and results returned to Unicon.

Supported platforms
...................

Currently, the assembler required to make this work is x86_64 GNU/Linux only.
That will change if/when people show interest.

.. note::

    libffi fixes that support problem, lots of platforms supported.  libffi
    supersedes, but does not displace, the above information.

--------

.. index:: libffi, uniffi

.. _libffi:
.. _uniffi:

libffi
------

After playing with the C Native interface, and actually looking into how many
platforms would require the assembly layer, I bumped into ``libffi``.

``libffi`` is a library that does the call frame setup, very much like the
native interface described above.  Except it is mature, well tested, and
already supports many tens of platforms and various compiler systems.

So, while the above interface works, and will be available, the ``libffi``
system is already ahead of the game, and will be put to use instead of hand
rolled assembler.

https://sourceware.org/libffi/

The user interface from a Unicon point of view, is almost identical to what
is shown above, but extended with a few more features, like support for
``struct``, something that would have been a little trickier with hand rolled
assembly developed from scratch.

A new ``native(...)`` function is defined in the same way with `loadfunc`,
but instead of including inline assembler, it uses ``ffi_prep_cif`` and
``ffi_call``.

FFI
    Foreign Function Interface

CIF
    Call InterFace

The only visible change is that ``uninative`` is now ``uniffi``, and
``-lffi`` is included for access to the ``libffi`` features.

Using ``libffi`` suffers the same problem of ``float`` versus ``double`` that
was in the previous native sequence.
A Unicon Real is a ``double`` and sometimes :t:`C` requires a ``float``.  A
new interface was developed to help get around this problem; a datatype
override can be included by using a `list` structure as part of any arguments
passed on to :t:`C`.

As an explanation:

The current marshalling layer pulls data from Unicon and tests the type of
data passed.  Integer values are mapped to ``long``, Real values are mapped
to ``double``.  For example, the following code just works:

.. sourcecode:: unicon

    ans := native("j0", TYPEDOUBLE, &pi)

That code will call the ``libm`` Bessel function of the first kind, order 0,
which assumes a ``double`` floating point input and returns a ``double``.
The Unicon type does not need any further conversion.  A real is effectively
a double.

The problem comes when you want to call a lower precision ``float`` version.
The first crack at a solution was to have ``native(...)`` and
``nativeFloat(...)`` routines.  The ``nativeFloat(...)`` function assumed
that all real values were to be demoted to ``float`` values.  It worked, but
it meant that you could never mix ``float`` and ``double`` arguments.

The new sequence drops the ``nativeFloat`` function and provides an override
option to the Unicon programmer.  The new experiment uses `list` data for
those times when an override is required.

.. sourcecode:: unicon

    # pass pi as a real (as double), get back a real (as double)
    ans := native("j0", TYPEDOUBLE, &pi)
    write("j0(pi) = ", ans)

    # pass pi as a real (as float), get back a real (as float)
    ans := native("j0f", TYPEFLOAT, [&pi, TYPEFLOAT])
    write("j0f(pi) = ", ans)

    # pass pi as a normal double, pass e as a float, get back a double
    ans := native("mymath", TYPEDOUBLE, &pi, [&e, TYPEFLOAT])

The override allows freely mixed ``float`` and ``double`` arguments pulled
from the Unicon Real type.
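
The override rule described above can be modelled in a few lines of
:t:`C`.  This is only a sketch of the idea, not the code in
:file:`uniffi.c`: the ``sk_`` names and type codes are hypothetical, and the
real constants (``TYPEDOUBLE``, ``TYPEFLOAT`` and so on) come from the
support headers.  It shows the default mapping (integer to ``long``, real to
``double``) and the demotion to ``float`` when an override is present.

.. sourcecode:: c

    #include <assert.h>
    #include <stddef.h>

    /* Hypothetical type codes for this sketch only. */
    typedef enum { SK_LONG, SK_DOUBLE, SK_FLOAT } sk_ctype;

    /* A simplified view of one Unicon argument: an integer or a real,
       optionally carrying a type override (the [value, TYPE] list form). */
    typedef struct {
        int      is_real;       /* 0: integer in ival, 1: real in rval */
        long     ival;
        double   rval;
        int      has_override;  /* nonzero when a [value, TYPE] pair was passed */
        sk_ctype override;      /* requested C type when has_override is set */
    } sk_arg;

    /* Storage for the marshalled C value. */
    typedef union { long l; double d; float f; } sk_slot;

    /* Pick the C type for one argument and store the converted value.
       Defaults: integer -> long, real -> double.  An SK_FLOAT override on a
       real demotes it to float.  Returns the chosen type code, or -1 when
       this sketch does not support the requested combination. */
    int sk_marshal(const sk_arg *a, sk_slot *out)
    {
        assert(a != NULL && out != NULL);

        if (!a->is_real) {
            if (a->has_override && a->override != SK_LONG) return -1;
            out->l = a->ival;
            return SK_LONG;
        }
        if (a->has_override && a->override == SK_FLOAT) {
            out->f = (float)a->rval;   /* explicit demotion, precision is lost */
            return SK_FLOAT;
        }
        if (a->has_override && a->override != SK_DOUBLE) return -1;
        out->d = a->rval;
        return SK_DOUBLE;
    }

Because the decision is made per argument, a single call can carry one real
as a ``double`` and another as a ``float``, which the earlier all-or-nothing
``nativeFloat`` approach could not do.
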
An alternative was an overly burdensome type specifier required for every
argument, which makes the source code less friendly to write and a little bit
harder to read with all the extra type specifiers getting in the way.

As a bonus, it also allows other hard to qualify options in Unicon, such as

.. sourcecode:: unicon

    # pass integer i (by address) get back an integer
    ans := native("indirect", TYPEINT, [i, TYPESTAR])
    write("indirect( ", i, ") = ", ans)

If the :t:`C` function is defined as

.. sourcecode:: c

    int indirect(int *i)
    {
        return *i * 2;
    }

The Unicon code listing above will call the function by passing the address
*of a copy of* the integer value.  This is not quite the same as pass by
reference, but is pass by content.  The called routine gets a pointer (common
in :t:`C` library functions) but will not be able to change the source data.
This is a limitation, but it protects the immutable property of Unicon base
data types.  This limitation will stand.

The type override opens the possibility of calling :t:`C` functions with
pointers, without falling back to writing a wrapper (unless the routine
actually needs to change the referenced data for proper functioning).  At
that point, to protect Unicon immutable data, an extra wrapper routine would
need to be written, with the burden on the Unicon programmer to follow the
`loadfunc` model of C integration.  *It's not a huge burden really, but does
require a little bit of C code.*

There may be a future addition to the :file:`uniffi.c` feature set to allow
changing referenced data, by passing a list from Unicon.  The caller would
then be able to reassign the output values using normal Unicon assignment
operators.

.. note::

    There will *never* be a feature added that allows immutable Unicon data
    to be changed in place with ``uniffi``.  That goes too far against the
    grain and the spirit of Unicon programming.

The current ``libffi`` interface tests just as well as the hand rolled
assembler.
*To be honest, it actually works better; many edge and corner cases have been
debugged in libffi, and the cross platform support is a complete boon*.

.. literalinclude:: programs/uniffi/uniffi.c
   :language: c
   :start-after: +*/

.. only:: html

    .. rst-class:: rightalign

        :download:`programs/uniffi/uniffi.c`

And an updated test head:

.. literalinclude:: programs/uniffi/uniffi.icn
   :language: unicon
   :start-after: ##+

.. only:: html

    .. rst-class:: rightalign

        :download:`programs/uniffi/uniffi.icn`

Nearly the same commands to prep the support function and build the test
heads, as listed above under `cnative`.  The only addition is ``-lffi`` to
the :t:`gcc` compile line, and a change of filename from :file:`native.c` to
:file:`uniffi.c`.  (I've taken to pronouncing that as "unify").

.. command-output:: gcc -o uniffi.so -shared -fPIC uniffi.c -lffi
   :cwd: programs/uniffi

The :file:`testnative.c` :t:`C` code remains almost the same, with
``libm.so`` added to test the ``j0`` and ``j0f`` functions (and to test that
loaded libraries stick around and function lookups are cumulative).

.. command-output:: gcc -o testnative.so -shared -fPIC testnative.c
   :cwd: programs/uniffi

Here is a test run with a new top level Unicon filename :file:`uniffi.icn`
with the new list ``[arg, TYPE]`` specifiers included in some of the tests:

.. command-output:: unicon -s uniffi.icn -x
   :cwd: programs/uniffi

Unified Foreign Function Interface.  (Uniconified FFI).

Running your Unicon on a Mac, Windows, 32bit, 64bit, FreeBSD or a z/Linux
mainframe?  ``uniffi`` has your back.  Call as many library functions as you
like, all from Unicon sources, no wrappers required.

The GnuCOBOL sample listed in `cnative` is identical.

The ``libharu`` integration now uses the type override system, as the PDF
generator prefers ``float`` data arguments.  It makes the listing a little
less easy to read, with all the type overrides, but not too bad, given the
level of flexibility provided.

.. literalinclude:: programs/uniffi/haru.icn
   :language: unicon
   :start-after: ##+

.. only:: html

    .. rst-class:: rightalign

        :download:`programs/uniffi/haru.icn`

What comes next?  That is up to the creativity, imagination and needs of
august Unicon developers.  No need to write any :t:`C` to get at :t:`C`, or
any language that uses the :t:`C` application binary interface.  As a (not
completely) informed guess, I'd peg that at well over 75% of currently
available computing resources, world wide.  Even an Android phone plays nice
with the :t:`C` ABI internally.

Other features of ``libffi`` can be exposed as needs are determined.  The
``ffi_call`` layer supports ``sysv``, ``unix64``, ``win64``, ``stdcall``,
``fastcall``, ``thiscall`` and ``cdecl`` call conventions.  Possibly others,
depending on platform specific builds.  By default, ``uniffi`` uses the
calling convention most appropriate for the system used during builds by
using ``FFI_DEFAULT_ABI`` when preparing the Call InterFace block.

--------

.. index:: BaCon

.. _baconffi:

baconffi
--------

Putting `libffi` to use with `BaCon`, the BASIC Converter.

Like the GnuCOBOL example, this routine just sums two numbers passed in as
arguments and returns the resulting integer.

.. literalinclude:: programs/uniffi/basicnative.bac
   :language: basic
   :start-after: REM +

.. only:: html

    .. rst-class:: rightalign

        :download:`programs/uniffi/basicnative.bac`

This source is then converted to C, and a shared library is created.

.. command-output:: bacon -q -f basicnative.bac
   :cwd: programs/uniffi

A slightly more sophisticated BaCon program, that embeds some inline
assembler.  *Sample derived from a thread on the BaCon forums by vovchik and
Axelfish*, http://basic-converter.proboards.com/thread/752/assembler-bacon

.. literalinclude:: programs/uniffi/asmmix.bac
   :language: text
   :start-after: REM +

.. only:: html

    .. rst-class:: rightalign

        :download:`programs/uniffi/asmmix.bac`

Again, convert the source and compile to a shared library.

.. command-output:: bacon -q -f asmmix.bac
   :cwd: programs/uniffi

A Unicon test head to load in the ``basic`` function defined in
``basicnative.bac`` and then invoke the function through the `uniffi`
interface.  Same sequence for the ``asmmix`` call.

.. literalinclude:: programs/uniffi/baconffi.icn
   :language: unicon
   :start-after: ##+

.. only:: html

    .. rst-class:: rightalign

        :download:`programs/uniffi/baconffi.icn`

Sample run:

.. command-output:: unicon -s baconffi.icn -x
   :cwd: programs/uniffi

And mixing Unicon with BASIC becomes another easy thing to do.

BaCon is an extraordinarily powerful BASIC translator.  Lots of features.

http://www.basic-converter.org/

.. only:: html

    ..

    --------

    :ref:`genindex` | Previous: :doc:`programs` | Next: :doc:`theory`

|