=============
Multilanguage
=============
.. Modified: 2019-02-23/17:05-0500 btiffin
.. Copyright 2016 Brian Tiffin
.. This file is part of the Unicon Programming document
.. GPL 3.0+ :ref:`license`
.. image:: images/unicon.png
:align: center
.. only:: html
:ref:`genindex`
:floatright:`Unicon`
.. index:: multilanguage
.. _multilanguage:
Multilanguage programming
=========================
By necessity, this chapter is not solely Unicon focused. Development with
multiple programming languages, across multiple paradigms can be highly
rewarding when applied to non-trivial application development. This can be as
basic as adding a scripting engine to a program, or far more sophisticated,
with multiple modules integrated into a cohesive whole.
For example, Emacs, a text editor now over 30 years old, merged a C
programming core with Lisp scripting. Not the first multilanguage environment,
but a relatively famous, and highly successful one.
Unicon is well placed for ``general purpose programming``, the feature set
allowing for many problems to be solved with only Unicon source code. This is
especially true when programming in the small, utility programs and light
applications. One thing missing from base Unicon is end user scripting.
Extending a core Unicon program means writing new Unicon source and
recompiling the application. Emacs has shown that allowing the end user to
extend a program with scripts is a powerful attribute, providing a path to add
useful features that were not originally designed into an application.
Along with scripting, there are vast quantities of libraries written in other
languages that may benefit a Unicon program, alleviating the need to write new
code to solve an already solved problem.
Luckily, Unicon allows for extending programs with foreign language libraries
and exposing some powerful scripting engines (with some small effort).
A small downside is that Unicon requires a specific interface protocol,\
[#protocol]_ so there are usually a few lines of C language source involved.
This extra effort is fairly boilerplate, and working samples already exist,
making this a rather convenient inconvenience.
Unicon ships with multiple forms of C integration (:ref:`loadfunc` and
:ref:`callout` to name two), and from there, all kinds of potential is exposed.
Multilanguage programming with Unicon will usually be a two or three language
mix, ``Unicon``, ``C``, and any other language in use for the modules at
hand. This is possible because almost every programming language in existence
provides access to an :ref:`ABI`, application binary interface, compatible
with platform specific C compilers. C becomes the crossroads of a *many to
one to many* integration hub. The required C layer is usually small in terms
of binary size and source code line count, and as mentioned earlier, there are
existing examples to ease the burden on a Unicon developer.
See :ref:`slang`, :ref:`COBOL integration `, :ref:`Javascripting
` and :ref:`Mini Ruby ` for some example integrations.
.. [#protocol]
The Unicon foreign function interface protocol is in place to manage
the high level Unicon :doc:`datatypes ` expectations. Unicon
data types are held in encoded descriptors, and these need conversion to
and from the native ABI.
.. index:: loadfunc
loadfunc
--------
`loadfunc` is a handy piece of kit. It allows Unicon to load libraries and
then define Unicon functions that will invoke special C functions. The
protocol\ [#protocol]_ requires a small piece of C that accepts
``int argc, descriptor argv[]``: a count of arguments and a pointer to an
array of Unicon descriptors.
Argument 0 is reserved for the return result, so ``argc`` is always at least
1.
The ``descriptor`` data structures are defined in :file:`ipl/cfuncs/icall.h`
and that header file must be included in the source that defines the callable
functions.
The ``descriptor`` is what gives Unicon its variant data type powers. A data
item can be a string, a number, a list, a real, etcetera, and the structure can
be assigned to a variable (or passed to C functions). Variables do not have a
type, and can reference any data, determined by the descriptor. The
:doc:`programs` section has many examples of using `loadfunc`.
The small wrappers, written in C, are then free to call any other C function
as a native call. Results from these functions can then be passed back to
Unicon by setting the ``argv[0]`` element. There are macros to make this very
easy. ``Fail``, ``Return``, ``RetString``, ``RetInteger``, ``RetReal``, and
so on, all documented in :file:`ipl/cfuncs/icall.h`.
There are also helper macros for testing the Unicon type passed into the
wrapper, and for converting the descriptor values to native C data types.
``IconType`` is used for testing. ``ArgString`` will ensure (and convert if
necessary) a particular descriptor is a C ``char`` pointer. ``StringVal``
will return the ``char *``. Same for ``ArgInteger``, ``ArgReal``, ``ArgList``
(and others) along with the corresponding ``IntegerVal``, ``RealVal``,
``ListVal``, etcetera. Read :file:`icall.h` for all current details.
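The shape of such a wrapper can be sketched without the real header. The
block below is only an illustration: ``sketch_desc``, ``sketch_type`` and
``sketch_add`` are hypothetical stand-ins, not the ``descriptor`` type or the
macros from :file:`icall.h`, and the sketch assumes that ``argc`` counts the
arguments that follow the result slot (check :file:`icall.h` for the actual
rule). It shows the three steps described above: test the incoming types,
do the native work, and place the result in ``argv[0]``.

```c
/* Hypothetical stand-in for the descriptor declared in icall.h;
   the real layout and helper macros are different. */
enum sketch_type { SK_NULL, SK_INTEGER, SK_REAL };

typedef struct {
    enum sketch_type type;          /* says which union member is live */
    union {
        long   integer;
        double real;
    } value;
} sketch_desc;

/* Wrapper shape: argv[0] receives the result, argv[1..argc] hold the
   arguments (assumption of this sketch).  Returns 0 on success and -1
   to signal failure; these codes are this sketch's own convention. */
int sketch_add(int argc, sketch_desc argv[])
{
    if (argc < 2)
        return -1;                  /* not enough arguments */
    if (argv[1].type != SK_INTEGER || argv[2].type != SK_INTEGER)
        return -1;                  /* wrong argument types */

    argv[0].type = SK_INTEGER;      /* the result goes in slot 0 */
    argv[0].value.integer = argv[1].value.integer + argv[2].value.integer;
    return 0;
}
```

In a real wrapper the type tests and the result assignment would be written
with the ``ArgInteger``, ``IntegerVal`` and ``RetInteger`` style macros
mentioned above rather than by touching the structure directly.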
.. index:: native
.. _cnative:
C Native
--------
.. note::
Superseded, see `libffi` below. But read through this section as it
details a lot of important background. The libffi version uses a very
similar Unicon programmer interface, but offers a lot more platforms
from a much more mature and well tested codebase.
Here is an experimental enhancement to Unicon that allows directly calling
almost any C function, without need for a wrapper. This builds on `loadfunc`
and defines 2 (or 3) new loadable functions. ``addLibrary`` will add a
dynamic shared object library to the loader search path. ``native`` allows
for calling C functions directly, by string name and an enumerated constant to
determine the return type. Other arguments are tested by Unicon type and
passed to C having built a call frame, using inline assembler.
The listing below provides support for x86_64, System V ABI calling
conventions. Other platform specific assembler will be required to add
support for other systems. This is relatively straightforward inline
assembler, 17 instructions for the initial trials. It allows up to 6
arguments, mixed integer, real or pointer/handle argument types and handles
void returns along with integral (numbers/pointers), and ``double`` floating
point data for use as Unicon `real`, along with `string` types.
The code needs to be loaded into Unicon space, using `loadfunc`.
``addLibrary`` allows Unicon to add libraries to the dynamic search path.
``native`` is then used to look up the call vectors (by string name, similar to
`loadfunc`), then marshals the other arguments and sets up a valid Unicon
return descriptor. No other wrappers are required.
First the new functions:
.. literalinclude:: programs/uniffi/native.c
:language: c
:start-after: +*/
.. only:: html
.. rst-class:: rightalign
:download:`programs/uniffi/native.c`
Before the sample run, here are the two small support files for enumerating
the encoded return types, and other related paperwork. These two files must
be kept in synch.
A Unicon ``$include`` file:
.. literalinclude:: programs/uniffi/natives.inc
:language: unicon
:start-after: ##+
.. only:: html
.. rst-class:: rightalign
:download:`programs/uniffi/natives.inc`
Note the ``.inc`` filename extension; it is a Unicon preprocessor include file.
A small C ``#include`` header.
.. literalinclude:: programs/uniffi/natives.h
:language: c
:start-after: +*/
.. only:: html
.. rst-class:: rightalign
:download:`programs/uniffi/natives.h`
Now a test head of small C functions to try out various call frames.
.. literalinclude:: programs/uniffi/testnative.c
:language: c
:start-after: +*/
.. only:: html
.. rst-class:: rightalign
:download:`programs/uniffi/testnative.c`
Unicon code to put it all in motion:
.. literalinclude:: programs/uniffi/testffi.icn
:language: unicon
:start-after: ##+
.. only:: html
.. rst-class:: rightalign
:download:`programs/uniffi/testffi.icn`
Two commands to prep the new support function and build the test heads:
.. command-output:: gcc -o uninative.so -shared -fPIC native.c
:cwd: programs/uniffi
.. command-output:: gcc -o libtestnative.so -shared -fPIC testnative.c
:cwd: programs/uniffi
And finally, the sample run:
.. command-output:: unicon -s testffi.icn -x
:cwd: programs/uniffi
Woohoo, swing your partner round and round. Calling C without any wrapper
functions.
--------
.. index:: libharu, hpdf, PDF
libharu
.......
This makes things a little easier when it comes to some of the more feature
rich libraries available. The first trial is ``libharu``, a C library for
creating PDF documents.
.. literalinclude:: programs/uniffi/haru-v1.icn
:language: unicon
:start-after: ##+
.. only:: html
.. rst-class:: rightalign
:download:`programs/uniffi/haru-v1.icn`
And then generating a password protected sample PDF.
.. command-output:: unicon -s -DPROTECTED haru-v1.icn -x
:cwd: programs/uniffi
In a less short example, the password PROTECTED option would be tested at
runtime, not at compile time, along with other control options for things like
supported page sizes, compression and default character encodings.
.. command-output:: ls -l harutest-pass.pdf
:cwd: programs/uniffi
--------
.. index:: GnuCOBOL
GnuCOBOL
........
Calling COBOL modules is now a breeze.
.. literalinclude:: programs/uniffi/cobolnative.cob
:language: cobol
:start-after: *>+<*
.. only:: html
.. rst-class:: rightalign
:download:`programs/uniffi/cobolnative.cob`
A caller:
.. literalinclude:: programs/uniffi/testcob.icn
:language: unicon
:start-after: ##+
.. only:: html
.. rst-class:: rightalign
:download:`programs/uniffi/testcob.icn`
A build, and a run:
.. command-output:: cobc -m cobolnative.cob -Wno-unfinished
:cwd: programs/uniffi
.. command-output:: unicon -s testcob.icn -x
:cwd: programs/uniffi
And there is Unicon calling COBOL, with no additional wrappers, besides the
``native`` functions. Data passed, and results returned to Unicon.
Supported platforms
...................
Currently, the assembler required to make this work is x86_64 GNU/Linux only.
That will change if/when people show interest.
.. note::
libffi fixes that support problem, lots of platforms supported.
libffi supersedes, but does not displace, the above information.
--------
.. index:: libffi, uniffi
.. _libffi:
.. _uniffi:
libffi
------
After playing with the C Native interface, and actually looking into how
many platforms would require the assembly layer, I bumped into ``libffi``.
``libffi`` is a library that does the call frame setup, very much like the
native interface described above. Except it is mature, well tested, and
already supports many tens of platforms and various compiler systems. So,
while the above interface works, and will be available, the ``libffi`` system
is already ahead of the game, and will be put to use instead of hand rolled
assembler.
https://sourceware.org/libffi/
The user interface from a Unicon point of view, is almost identical to what is
shown above, but extended with a few more features, like support for
``struct``, something that would have been a little trickier with hand rolled
assembly developed from scratch.
A new ``native(...)`` function is defined in the same way with `loadfunc`, but
instead of including inline assembler, it uses ``ffi_prep_cif`` and
``ffi_call``.
FFI
    Foreign Function Interface

CIF
    Call InterFace

The only visible change is ``uninative`` is now ``uniffi``, and ``-lffi`` is
included for access to the ``libffi`` features.
Using ``libffi`` suffers the same problem of ``float`` versus ``double`` that
was in the previous native sequence. A Unicon Real is a ``double`` and
sometimes :t:`C` requires a ``float``. A new interface was developed to help
get around this problem; a datatype override can be included by using a `list`
structure as part of any arguments passed on to :t:`C`.
As an explanation:
The current marshalling layer pulls data from Unicon and tests the type of
data passed. Integer values are mapped to ``long``, Real values are mapped to
``double``.
For example, the following code just works:

.. sourcecode:: unicon

    ans := native("j0", TYPEDOUBLE, &pi)

That code will call the ``libm`` Bessel function of the first kind, order 0,
which assumes a ``double`` floating point input and returns a ``double``. The
Unicon type does not need any further conversion. A real is effectively a
double.
The problem comes when you want to call a lower precision ``float`` version.
The first crack at a solution was to have ``native(...)`` and
``nativeFloat(...)`` routines. The ``nativeFloat(...)`` function assumed that
all real values were to be demoted to ``float`` values. It worked, but it
meant that you could never mix ``float`` and ``double`` arguments. The new
sequence drops the ``nativeFloat`` function and provides an override option to
the Unicon programmer.
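A small experiment shows why the marshalling layer has to know which of the
two types the callee expects. This is a minimal sketch, not code from
:file:`uniffi.c`; the helper names ``first_bytes_as_float`` and ``demote``
are made up for the illustration. A ``float`` and a ``double`` differ in size
and encoding, so a callee that reads a ``float`` where a ``double`` was stored
does not see a rounded version of the value, it sees unrelated bits. What
actually happens in a mismatched call depends on the platform ABI.

```c
#include <string.h>

/* Reinterpret the first sizeof(float) bytes of a double as a float,
   mimicking a callee that reads a float where a double was stored. */
float first_bytes_as_float(double d)
{
    float f;
    memcpy(&f, &d, sizeof f);
    return f;
}

/* The explicit conversion a marshalling layer has to perform instead. */
float demote(double d)
{
    return (float)d;
}
```

For a value such as pi the two helpers return different numbers, and the
demoted ``float`` no longer compares equal to the original ``double``, which
is why the conversion has to be requested deliberately.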
The new experiment uses `list` data for those times when an override is
required.

.. sourcecode:: unicon

    # pass pi as a real (as double), get back a real (as double)
    ans := native("j0", TYPEDOUBLE, &pi)
    write("j0(pi) = ", ans)

    # pass pi as a real (as float), get back a real (as float)
    ans := native("j0f", TYPEFLOAT, [&pi, TYPEFLOAT])
    write("j0f(pi) = ", ans)

    # pass pi as a normal double, pass e as a float, get back a double
    ans := native("mymath", TYPEDOUBLE, &pi, [&e, TYPEFLOAT])

The override allows freely mixed ``float`` and ``double`` arguments pulled
from the Unicon Real type.
An alternative was an overly burdensome type specifier required for every
argument, which would make the source code less friendly to write and a little bit
harder to read with all the extra type specifiers getting in the way.
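The per-argument decision can be pictured on the C side as follows. This is
a sketch under stated assumptions: ``sketch_code``, ``sketch_slot`` and
``marshal_real`` are hypothetical names, and the enumeration values are not
the ones defined in :file:`natives.h`. A real is stored as a ``double``
unless the caller supplied an override asking for ``float``.

```c
/* Hypothetical type codes; the real constants live in natives.h. */
enum sketch_code { SK_DEFAULT, SK_FLOAT, SK_DOUBLE };

/* One slot of an outgoing argument list. */
typedef struct {
    enum sketch_code code;          /* which union member is live */
    union {
        float  f;
        double d;
    } as;
} sketch_slot;

/* Marshal a Unicon-style real: double by default, float on request. */
sketch_slot marshal_real(double value, enum sketch_code override)
{
    sketch_slot slot;

    if (override == SK_FLOAT) {
        slot.code = SK_FLOAT;
        slot.as.f = (float)value;   /* demote only when asked to */
    } else {
        slot.code = SK_DOUBLE;
        slot.as.d = value;          /* default mapping for a real */
    }
    return slot;
}
```

Because the choice is made per argument, one call can mix ``float`` and
``double`` slots, which the earlier all-or-nothing ``nativeFloat`` approach
could not do.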
As a bonus, it also allows other hard to qualify options in Unicon, such as

.. sourcecode:: unicon

    # pass integer i (by address) get back an integer
    ans := native("indirect", TYPEINT, [i, TYPESTAR])
    write("indirect( ", i, ") = ", ans)

If the :t:`C` function is defined as

.. sourcecode:: c

    int
    indirect(int *i)
    {
        return *i * 2;
    }

The Unicon code listing above will call the function by passing the address
*of a copy of* the integer value. This is not quite the same as pass by
reference, but is pass by content. The called routine gets a pointer (common
in :t:`C` library functions) but will not be able to change the source data.
This is a limitation, but it protects the immutable property of Unicon base
data types. This limitation will stand. The type override opens the
possibility of calling :t:`C` functions with pointers, without falling back
to writing a wrapper (unless the routine actually needs to change the
referenced data for proper functioning). At that point, to protect Unicon
immutable data, an extra wrapper routine would need to be written, leaving it to
the Unicon programmer to follow the `loadfunc` model of C integration. *It's
not a huge burden really, but does require a little bit of C code.*
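Pass by content is easy to demonstrate in plain C. The sketch below is an
illustration only; ``double_in_place`` and ``call_with_copy`` are
hypothetical names and do not appear in :file:`uniffi.c`. The callee is
handed the address of a temporary copy, so even a routine that writes through
its pointer cannot disturb the caller's value.

```c
/* Stand-in for a C library routine that writes through its pointer. */
int double_in_place(int *p)
{
    *p *= 2;
    return *p;
}

/* Pass by content: the callee receives the address of a temporary
   copy, so the value at *source is never modified. */
int call_with_copy(int (*fn)(int *), const int *source)
{
    int copy = *source;
    return fn(&copy);
}
```

The result still comes back through the return value, but any change the
callee made to the pointed-to integer is discarded along with the copy.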
There may be a future addition to the :file:`uniffi.c` feature set to allow
changing referenced data, by passing a list from Unicon. The caller would then
be able to reassign the output values using normal Unicon assignment
operators.
.. note::
There will *never* be a feature added that allows immutable Unicon data
to be changed in place with ``uniffi``. That goes too far against the
grain and the spirit of Unicon programming.
The current ``libffi`` interface tests just as well as the hand rolled
assembler. *To be honest it actually works better, many edge and corner cases
have been debugged in libffi, and the cross platform support is a complete
boon*.
.. literalinclude:: programs/uniffi/uniffi.c
:language: c
:start-after: +*/
.. only:: html
.. rst-class:: rightalign
:download:`programs/uniffi/uniffi.c`
And an updated test head:
.. literalinclude:: programs/uniffi/uniffi.icn
:language: unicon
:start-after: ##+
.. only:: html
.. rst-class:: rightalign
:download:`programs/uniffi/uniffi.icn`
Nearly the same commands to prep the support function and build the test
heads, as listed above under `cnative`. The only addition is ``-lffi`` to the
:t:`gcc` compile line, and a change of filename from :file:`native.c` to
:file:`uniffi.c`. (I've taken to pronouncing that as "unify").
.. command-output:: gcc -o uniffi.so -shared -fPIC uniffi.c -lffi
:cwd: programs/uniffi
The :file:`testnative.c` :t:`C` code remains almost the same, but added
``libm.so`` to test the ``j0`` and ``j0f`` functions (and to test that loaded
libraries stick around and function lookups are cumulative).
.. command-output:: gcc -o testnative.so -shared -fPIC testnative.c
:cwd: programs/uniffi
Here is a test run with a new top level Unicon filename :file:`uniffi.icn`
with the new list ``[arg, TYPE]`` specifiers included in some of the tests:
.. command-output:: unicon -s uniffi.icn -x
:cwd: programs/uniffi
Unified Foreign Function Interface. (Uniconified FFI). Running your Unicon
on a Mac, Windows, 32bit, 64bit, FreeBSD or a z/Linux mainframe? ``uniffi``
has your back. Call as many library functions as you like, all from Unicon
sources, no wrappers required.
The GnuCOBOL sample listed in `cnative` is identical.
The ``libharu`` integration now uses the type override system, as the PDF
generator prefers ``float`` data arguments. It makes the listing
a little less easy to read, with all the type overrides, but not too bad,
given the level of flexibility provided.
.. literalinclude:: programs/uniffi/haru.icn
:language: unicon
:start-after: ##+
.. only:: html
.. rst-class:: rightalign
:download:`programs/uniffi/haru.icn`
What comes next? That is up to the creativity, imagination and needs of
august Unicon developers. No need to write any :t:`C` to get at :t:`C`, or any
language that uses the :t:`C` application binary interface. As a (not
completely) informed guess, I'd peg that at well over 75% of currently
available computing resources, world wide. Even an Android phone plays nice
with the :t:`C` ABI internally.
Other features of ``libffi`` can be exposed as needs are determined. The
``ffi_call`` layer supports ``sysv``, ``unix64``, ``win64``, ``stdcall``,
``fastcall``, ``thiscall`` and ``cdecl`` call conventions. Possibly others,
depending on platform specific builds. By default, ``uniffi`` uses the calling
convention most appropriate for the system used during builds by using
``FFI_DEFAULT_ABI`` when preparing the Call InterFace block.
--------
.. index:: BaCon
.. _baconffi:
baconffi
--------
Putting `libffi` to use with `BaCon`, the BASIC Converter.
Like the GnuCOBOL example, this routine just sums two numbers passed in as
arguments and returns the resulting integer.
.. literalinclude:: programs/uniffi/basicnative.bac
:language: basic
:start-after: REM +
.. only:: html
.. rst-class:: rightalign
:download:`programs/uniffi/basicnative.bac`
This source is then converted to C, and a shared library is created.
.. command-output:: bacon -q -f basicnative.bac
:cwd: programs/uniffi
A slightly more sophisticated BaCon program, that embeds some inline
assembler. *Sample derived from a thread on the BaCon forums by vovchik and
Axelfish*, http://basic-converter.proboards.com/thread/752/assembler-bacon
.. literalinclude:: programs/uniffi/asmmix.bac
:language: text
:start-after: REM +
.. only:: html
.. rst-class:: rightalign
:download:`programs/uniffi/asmmix.bac`
Again, convert the source and compile to a shared library.
.. command-output:: bacon -q -f asmmix.bac
:cwd: programs/uniffi
A Unicon test head to load in the ``basic`` function defined in
``basicnative.bac`` and then invoke the function through the `uniffi`
interface. Same sequence for the ``asmmix`` call.
.. literalinclude:: programs/uniffi/baconffi.icn
:language: unicon
:start-after: ##+
.. only:: html
.. rst-class:: rightalign
:download:`programs/uniffi/baconffi.icn`
Sample run:
.. command-output:: unicon -s baconffi.icn -x
:cwd: programs/uniffi
And mixing Unicon with BASIC becomes another easy thing to do.
BaCon is an extraordinarily powerful BASIC translator. Lots of features.
http://www.basic-converter.org/
.. only:: html
..
--------
:ref:`genindex` | Previous: :doc:`programs` | Next: :doc:`theory`
|