News from the Ada programming language world

IO stream composition and serialization with Ada Utility Library

5 March 2022 at 22:48

To provide this IO stream combination, the [Ada Utility Library](https://github.com/stcarrez/ada-util) defines two limited interface types: `Input_Stream` and `Output_Stream`. The `Input_Stream` interface defines only a `Read` procedure, while the `Output_Stream` interface defines the `Write`, `Flush` and `Close` procedures. Stream objects that implement these interfaces can be combined together.

[IO Stream Composition and Serialization](Ada/ada-util-streams.png)

The [Ada Utility Library](https://github.com/stcarrez/ada-util) provides stream types that implement these interfaces to read or write files, sockets and system pipes. In many cases, the concrete type implements both interfaces so that both reading and writing are possible. This is the case for `File_Stream`, which reads or writes a file, and for `Socket_Stream`, which handles sockets by using GNAT sockets. The `Pipe_Stream`, for its part, launches an external program and either reads its output through the `Input_Stream` interface or feeds it some input through the `Output_Stream` interface.

The [Ada Utility Library](https://github.com/stcarrez/ada-util) also provides stream objects that transform the data through various data encoders. The library supports the following encoders:

  • Base 16, Base 64,
  • AES encryption or decryption,
  • LZMA compression or decompression

Other encoders could be added and it is always possible to provide custom transformations by implementing the `Input_Stream` and `Output_Stream` interfaces.
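As an illustration of how such a custom transformation can be chained onto another stream, here is a minimal, self-contained sketch. It does not use the Ada Utility Library: the `Sketch` package, its String-based `Write` and the `Console_Stream` and `Upper_Stream` types are hypothetical simplifications (the real interfaces also define `Flush` and `Close`, and their exact profiles are not shown here).

```Ada
with Ada.Characters.Handling;
with Ada.Text_IO;

procedure Stream_Sketch is

   package Sketch is
      --  Hypothetical, simplified interface: only Write, and on a String.
      type Output_Stream is limited interface;
      procedure Write (Stream : in out Output_Stream;
                       Data   : in String) is abstract;

      --  Terminal stream: prints what it receives.
      type Console_Stream is limited new Output_Stream with null record;
      overriding
      procedure Write (Stream : in out Console_Stream;
                       Data   : in String);

      --  Transforming stream: upper-cases the data, then forwards it
      --  to the stream designated by Target.
      type Upper_Stream is limited new Output_Stream with record
         Target : access Output_Stream'Class;
      end record;
      overriding
      procedure Write (Stream : in out Upper_Stream;
                       Data   : in String);
   end Sketch;

   package body Sketch is
      overriding
      procedure Write (Stream : in out Console_Stream;
                       Data   : in String) is
      begin
         Ada.Text_IO.Put_Line (Data);
      end Write;

      overriding
      procedure Write (Stream : in out Upper_Stream;
                       Data   : in String) is
      begin
         Stream.Target.Write (Ada.Characters.Handling.To_Upper (Data));
      end Write;
   end Sketch;

   Console : aliased Sketch.Console_Stream;
   Upper   : Sketch.Upper_Stream;
begin
   --  Chain the streams: Upper -> Console.
   Upper.Target := Console'Unchecked_Access;
   Upper.Write ("hello streams");
end Stream_Sketch;
```

The caller only writes to the first stream of the chain; each stream decides what to do with the data before handing it to the next one, which is the same idea the encoders listed above rely on.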

The last part that completes the IO stream framework is the serialization framework. It defines interfaces and types to read or write a CSV, XML, JSON or HTTP form stream. The serialization framework uses the `Input_Stream` interface to read the content and the `Output_Stream` interface to write it. Its operations are defined so that these streams can be read or written independently of their representation format.

LZMA Compression

Let's have a look at compressing a file by using the `Util.Streams` framework. First we need a `File_Stream` that is configured to read the file to compress and we need another `File_Stream` configured for writing to save in another file. The first file is opened by using the `Open` procedure and the `In_File` mode while the second one is using `Create` and the `Out_File` mode. The `File_Stream` is using the Ada `Stream_IO` standard package to access files.

```Ada
with Util.Streams.Files;

  In_Stream  : aliased Util.Streams.Files.File_Stream;
  Out_Stream : aliased Util.Streams.Files.File_Stream;

  --  Open the source file for reading and create the destination file.
  In_Stream.Open (Mode => Ada.Streams.Stream_IO.In_File, Name => Source);
  Out_Stream.Create (Mode => Ada.Streams.Stream_IO.Out_File, Name => Destination);
```

In the middle of these two streams, we are going to use a `Compress_Stream` whose job is to compress the data and write the compressed result to a target stream. The compression stream is configured by using the `Initialize` procedure and it is configured to write on the `Out_Stream` file stream. The compression stream needs a buffer and its size is configured with the `Size` parameter.

```Ada
with Util.Streams.Lzma;

  Compressor : aliased Util.Streams.Lzma.Compress_Stream;

  --  The compressor writes its compressed output on Out_Stream.
  Compressor.Initialize (Output => Out_Stream'Unchecked_Access, Size => 32768);
```

To feed the compressor stream with the input file, we are going to use the `Copy` procedure. This procedure reads the content from the `In_Stream` and writes what is read to the `Compressor` stream.

```Ada

  Util.Streams.Copy (From => In_Stream, Into => Compressor);

```
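To make the idea behind `Copy` concrete, here is a minimal sketch of a read-then-write loop written only with the standard `Ada.Streams.Stream_IO` package. It is not the library's implementation: the file names, the 4096-byte buffer and the generated test data are assumptions chosen so that the sketch runs on its own.

```Ada
with Ada.Streams.Stream_IO;
with Ada.Text_IO;

procedure Copy_Sketch is
   use Ada.Streams;

   --  Hypothetical file names used only by this sketch.
   Source_Name : constant String := "copy_sketch.in";
   Target_Name : constant String := "copy_sketch.out";

   Input  : Stream_IO.File_Type;
   Output : Stream_IO.File_Type;
   Buffer : Stream_Element_Array (1 .. 4096);
   Last   : Stream_Element_Offset;
   Total  : Stream_Element_Offset := 0;
begin
   --  Create a small input file so that the sketch is self-contained.
   Stream_IO.Create (Input, Stream_IO.Out_File, Source_Name);
   Stream_IO.Write (Input, Stream_Element_Array'(1 .. 10_000 => 65));
   Stream_IO.Close (Input);

   Stream_IO.Open (Input, Stream_IO.In_File, Source_Name);
   Stream_IO.Create (Output, Stream_IO.Out_File, Target_Name);

   --  Read a block, write what was read, stop when nothing is left.
   loop
      Stream_IO.Read (Input, Buffer, Last);
      exit when Last < Buffer'First;
      Stream_IO.Write (Output, Buffer (Buffer'First .. Last));
      Total := Total + (Last - Buffer'First + 1);
   end loop;

   Stream_IO.Close (Input);
   Stream_IO.Close (Output);
   Ada.Text_IO.Put_Line ("Bytes copied:" & Stream_Element_Offset'Image (Total));
end Copy_Sketch;
```

In the library version, the destination is any `Output_Stream`, so the same loop can feed a compressor or a cipher instead of a file.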

Flushing and closing the files is automatically handled by a `Finalize` procedure on the `File_Stream` type.

Complete source example: [compress.adb](https://github.com/stcarrez/ada-util/tree/master/samples/compress.adb)

LZMA Decompression

The LZMA decompression is very close to the LZMA compression but instead it uses the `Decompress_Stream`. The complete decompression method is the following:

```Ada
with Ada.Streams.Stream_IO;
with Util.Streams.Files;
with Util.Streams.Lzma;

procedure Decompress_File (Source      : in String;
                           Destination : in String) is
  In_Stream    : aliased Util.Streams.Files.File_Stream;
  Out_Stream   : aliased Util.Streams.Files.File_Stream;
  Decompressor : aliased Util.Streams.Lzma.Decompress_Stream;
begin
  In_Stream.Open (Mode => Ada.Streams.Stream_IO.In_File, Name => Source);
  Out_Stream.Create (Mode => Ada.Streams.Stream_IO.Out_File, Name => Destination);
  --  The decompressor reads the compressed data from In_Stream.
  Decompressor.Initialize (Input => In_Stream'Unchecked_Access, Size => 32768);
  Util.Streams.Copy (From => Decompressor, Into => Out_Stream);
end Decompress_File;
```

Complete source example: [decompress.adb](https://github.com/stcarrez/ada-util/tree/master/samples/decompress.adb)

AES Encryption

Encryption is a little more complex because the encryption key must be configured. Encryption is provided by the `Encoding_Stream`, which uses a `Secret_Key` to hold the encryption key. The `Secret_Key` is a limited type and cannot be copied. One way to build the encryption key is to use the PBKDF2 algorithm described in [RFC 8018](https://tools.ietf.org/html/rfc8018). The user password is passed to the PBKDF2 algorithm, configured here to use HMAC-SHA256. In this example the hash is iterated 20000 times to produce the final encryption key.

```Ada
with Util.Streams.AES;
with Util.Encoders.AES;
with Util.Encoders.KDF.PBKDF2_HMAC_SHA256;

  Cipher       : aliased Util.Streams.AES.Encoding_Stream;
  Password_Key : constant Util.Encoders.Secret_Key := Util.Encoders.Create (Password);
  Salt         : constant Util.Encoders.Secret_Key := Util.Encoders.Create ("fake-salt");
  Key          : Util.Encoders.Secret_Key (Length => Util.Encoders.AES.AES_256_Length);
  ...
     PBKDF2_HMAC_SHA256 (Password => Password_Key,
                         Salt     => Salt,
                         Counter  => 20000,
                         Result   => Key);
```

The encoding stream is able to produce or consume another stream. For the encryption, we use the first mode and call the `Produces` procedure to configure the encryption to write on the `Out_Stream` file. Once configured, the `Set_Key` procedure must be called with the encryption key and the encryption method. The initialization vector (`IV`) can be configured by using the `Set_IV` procedure (not used by the example). As soon as the encryption key is configured, the encryption can start and the `Cipher` encoding stream can be used as an `Output_Stream`: we can write on it and it will encrypt the content before passing the result to the next stream. This means that we can use the same `Copy` procedure to read the input file and pass it through the encryption encoder.

```Ada

  Cipher.Produces (Output => Out_Stream'Unchecked_Access, Size => 32768);
  Cipher.Set_Key (Secret => Key, Mode => Util.Encoders.AES.ECB);
  Util.Streams.Copy (From => In_Stream, Into => Cipher);

```

Complete source example: [encrypt.adb](https://github.com/stcarrez/ada-util/tree/master/samples/encrypt.adb)

AES Decryption

Decryption is similar but it uses the `Decoding_Stream` type. Below is the complete example to decrypt the file:

```Ada
with Ada.Streams.Stream_IO;
with Util.Streams.Files;
with Util.Streams.AES;
with Util.Encoders.AES;
with Util.Encoders.KDF.PBKDF2_HMAC_SHA256;

procedure Decrypt_File (Source      : in String;
                        Destination : in String;
                        Password    : in String) is
  In_Stream    : aliased Util.Streams.Files.File_Stream;
  Out_Stream   : aliased Util.Streams.Files.File_Stream;
  Decipher     : aliased Util.Streams.AES.Decoding_Stream;
  Password_Key : constant Util.Encoders.Secret_Key := Util.Encoders.Create (Password);
  Salt         : constant Util.Encoders.Secret_Key := Util.Encoders.Create ("fake-salt");
  Key          : Util.Encoders.Secret_Key (Length => Util.Encoders.AES.AES_256_Length);
begin
  --  Generate a derived key from the password.
  Util.Encoders.KDF.PBKDF2_HMAC_SHA256 (Password => Password_Key,
                                        Salt     => Salt,
                                        Counter  => 20000,
                                        Result   => Key);
  --  Setup file -> input and cipher -> output file streams.
  In_Stream.Open (Ada.Streams.Stream_IO.In_File, Source);
  Out_Stream.Create (Mode => Ada.Streams.Stream_IO.Out_File, Name => Destination);
  --  Out_Stream is a local object, hence 'Unchecked_Access as in the other examples.
  Decipher.Produces (Output => Out_Stream'Unchecked_Access, Size => 32768);
  Decipher.Set_Key (Secret => Key, Mode => Util.Encoders.AES.ECB);
  --  Copy input to output through the cipher.
  Util.Streams.Copy (From => In_Stream, Into => Decipher);
end Decrypt_File;
```

Complete source example: [decrypt.adb](https://github.com/stcarrez/ada-util/tree/master/samples/decrypt.adb)

Stream composition: LZMA > AES

Now, if we want to compress the stream before encryption, we can do this by connecting the `Compressor` to the `Cipher` stream and we only have to use the `Compressor` instead of the `Cipher` in the call to `Copy`.

```Ada

  In_Stream.Open (Ada.Streams.Stream_IO.In_File, Source);
  Out_Stream.Create (Mode => Ada.Streams.Stream_IO.Out_File, Name => Destination);
  Cipher.Produces (Output => Out_Stream'Unchecked_Access, Size => 32768);
  Cipher.Set_Key (Secret => Key, Mode => Util.Encoders.AES.ECB);
  Compressor.Initialize (Output => Cipher'Unchecked_Access, Size => 4096);
  Util.Streams.Copy (From => In_Stream, Into => Compressor);

```

When `Copy` is called, the following will happen:

  • first, it reads the `In_Stream` source file,
  • the data is written to the `Compressor` stream,
  • the `Compressor` stream runs the LZMA compression and writes on the `Cipher` stream,
  • the `Cipher` stream encrypts the data and writes on the `Out_Stream`,
  • the `Out_Stream` writes on the destination file.

Complete source example: [lzma_encrypt.adb](https://github.com/stcarrez/ada-util/tree/master/samples/lzma_encrypt.adb)

More stream composition: LZMA > AES > Base64

We can easily change the stream composition to encode in Base64 after the encryption. We only have to declare an instance of the Base64 `Encoding_Stream` and configure the encryption stream to write on the Base64 stream instead of the output file. The Base64 stream is configured to write on the output stream.

```Ada
  In_Stream  : aliased Util.Streams.Files.File_Stream;
  Out_Stream : aliased Util.Streams.Files.File_Stream;
  Base64     : aliased Util.Streams.Base64.Encoding_Stream;
  Cipher     : aliased Util.Streams.AES.Encoding_Stream;
  Compressor : aliased Util.Streams.Lzma.Compress_Stream;

  In_Stream.Open (Ada.Streams.Stream_IO.In_File, Source);
  Out_Stream.Create (Mode => Ada.Streams.Stream_IO.Out_File, Name => Destination);
  Base64.Produces (Output => Out_Stream'Unchecked_Access, Size => 32768);
  Cipher.Produces (Output => Base64'Unchecked_Access, Size => 32768);
  Cipher.Set_Key (Secret => Key, Mode => Util.Encoders.AES.ECB);
  Compressor.Initialize (Output => Cipher'Unchecked_Access, Size => 4096);
```

Complete source example: [lzma_encrypt_b64.adb](https://github.com/stcarrez/ada-util/tree/master/samples/lzma_encrypt_b64.adb)

Serialization

Serialization is achieved by using the `Util.Serialize.IO` package, its child packages and their specific types. The parent package defines the limited `Output_Stream` interface, which inherits from the `Util.Streams.Output_Stream` interface. This makes it possible to define specific operations to write various Ada types, and it also provides a common set of abstractions to write the JSON, XML, CSV and FORM (`x-www-form-urlencoded`) formats.

Each target format is supported by a child package, so you only have to use the `Output_Stream` type declared in one of the `JSON`, `XML`, `CSV` or `Form` child packages and use it transparently. There are some constraints if you want to switch from one output format to another while keeping the same code. These constraints come from the nature of the different formats: `XML` has a notion of entity and attribute, but other formats do not differentiate entities from attributes.

  • A `Start_Document` procedure must be called first. Not all serialization methods need it, but it is required for JSON to produce a correct output.
  • A `Write_Entity` procedure writes an XML entity of the given name. When used in JSON, it writes a JSON attribute.
  • A `Start_Entity` procedure prepares the start of an XML entity or a JSON structure with a given name.
  • A `Write_Attribute` procedure writes an XML attribute after a `Start_Entity`. When used in JSON, it writes a JSON attribute.
  • An `End_Entity` procedure terminates an XML entity or a JSON structure that was opened by `Start_Entity`.
  • At the end, the `End_Document` procedure must be called to finish the output correctly and terminate the JSON or XML content.

```Ada
procedure Write (Stream : in out Util.Serialize.IO.Output_Stream'Class) is
begin
  Stream.Start_Document;
  Stream.Start_Entity ("person");
  Stream.Write_Entity ("name", "Harry Potter");
  Stream.Write_Entity ("gender", "male");
  Stream.Write_Entity ("age", 17);
  Stream.End_Entity ("person");
  Stream.End_Document;
end Write;
```

JSON Serialization

With the above `Write` procedure, if we want to produce a JSON stream, we only have to set up a JSON serializer. The JSON serializer is connected to a `Print_Stream`, which provides a buffer and helper operations to write text content. An instance of the `Print_Stream` is declared in `Output` and configured with a buffer size. The JSON serializer is then connected to it by calling the `Initialize` procedure and giving the `Output` parameter.

After writing the content, the JSON is stored in the `Output` print stream and it can be retrieved by using the `To_String` function.


```Ada
with Ada.Text_IO;
with Util.Serialize.IO.JSON;
with Util.Streams.Texts;
procedure Serialize is
  Output : aliased Util.Streams.Texts.Print_Stream;
  Stream : Util.Serialize.IO.JSON.Output_Stream;
begin
  Output.Initialize (Size => 10000);
  Stream.Initialize (Output => Output'Unchecked_Access);
  Write (Stream);
  Ada.Text_IO.Put_Line (Util.Streams.Texts.To_String (Output));
end Serialize;
```

The `Write` procedure described above produces the following JSON content:

```C
{"person":{"name":"Harry Potter","gender":"male","age": 17}}
```

Complete source example: [serialize.adb](https://github.com/stcarrez/ada-util/tree/master/samples/serialize.adb)

XML Serialization

Switching to an XML serialization is easy: replace `JSON` by `XML` in the package to use the XML serializer instead.

```Ada
with Ada.Text_IO;
with Util.Serialize.IO.XML;
with Util.Streams.Texts;
procedure Serialize is
  Output : aliased Util.Streams.Texts.Print_Stream;
  Stream : Util.Serialize.IO.XML.Output_Stream;
begin
  Output.Initialize (Size => 10000);
  Stream.Initialize (Output => Output'Unchecked_Access);
  Write (Stream);
  Ada.Text_IO.Put_Line (Util.Streams.Texts.To_String (Output));
end Serialize;
```

This time, the same `Write` procedure produces the following XML content:

```C
<person><name>Harry Potter</name><gender>male</gender><age>17</age></person>
```

Complete source example: [serialize_xml.adb](https://github.com/stcarrez/ada-util/tree/master/samples/serialize_xml.adb)

A curiosity with LZMA data compression

10 August 2021 at 09:42

Uncompressed file: 1'029'744 bytes.

Compressed size (excluding Zip or 7z archive metadata; data is not preprocessed):

| Bytes | Compressed / Uncompressed ratio | Format | Software |
|---|---|---|---|
| 172'976 | 16.80% | PPMd | 7-Zip 21.02 alpha |
| 130'280 | 12.65% | BZip2 | Zip 3.0 |
| 119'327 | 11.59% | BZip2 | 7-Zip 21.02 alpha |
| 61'584 | 5.98% | LZMA | Zip-Ada v.57 |
| 50'398 | 4.89% | LZMA2 | 7-Zip 21.02 alpha |
| 50'396 | 4.89% | LZMA | 7-Zip 21.02 alpha |
| 42'439 | 4.12% | LZMA | Zip-Ada v.58 (preview) |
| 41'661 | 4.05% | LZMA | Zip-Ada (current research branch) |

Conclusion: the Zip-Ada (current research branch) compresses that data 17.3% better than 7-Zip v.21.02!
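For readers who want to check the figure, the 17.3% appears to be the relative size reduction between the two LZMA results listed above (50'396 bytes for 7-Zip, 41'661 bytes for the research branch). The small program below recomputes it; the procedure and constant names are only illustrative.

```Ada
with Ada.Text_IO;

procedure Lzma_Gain is
   --  Compressed sizes in bytes, taken from the figures listed above.
   Seven_Zip_Size : constant Float := 50_396.0;  --  LZMA, 7-Zip 21.02 alpha
   Zip_Ada_Size   : constant Float := 41_661.0;  --  LZMA, Zip-Ada research branch

   --  Relative size reduction, in percent.
   Gain : constant Float :=
     100.0 * (Seven_Zip_Size - Zip_Ada_Size) / Seven_Zip_Size;

   package Float_Text_IO is new Ada.Text_IO.Float_IO (Float);
begin
   Float_Text_IO.Put (Gain, Fore => 1, Aft => 1, Exp => 0);
   Ada.Text_IO.Put_Line ("%");   --  prints 17.3%
end Lzma_Gain;
```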

The file (zipped to its smallest compressed size, 4.05%) can be downloaded here. It is part of the old Canterbury corpus benchmark file collection (file name: kennedy.xls).

Please don't draw any conclusion: the test data is a relatively small, special binary file with lots of redundancy.
But that result is a hint that some more juice can be extracted from the LZMA format.

The open-source Zip-Ada project can be found here and here.

Some research with LZMA...

28 November 2020 at 19:54

A rare case where Zip-Ada's LZMA encoder is much better than LZMA SDK's. Rare but still interesting, and with standard LZMA parameters (no specific tuning for that file):


The compressed size with current revision (rev.#882) of Zip-Ada is slightly worse (42,559 bytes).

The file is part of the classic Canterbury Corpus compression benchmark data set.

Zip-Ada v.57

3 October 2020 at 20:55
 New in v.57 [rev. 799]:

  - UnZip: fixed bad decoding case for the Shrink (LZW) format,
        on some data compressed only by PKZIP up to v.1.10,
        release date 1990-03-15.
  - Zip.Create: added Zip_Entry_Stream_Type for doing output
        streaming into Zip archives.
  - Zip.Compress: Preselection method detects Audacity files (.aup, .au)
        and compresses them better.

***

Zip-Ada is a pure Ada library for dealing with the Zip compressed
archive file format. It supplies:
 - compression with the following sub-formats ("methods"):
     Store, Reduce, Shrink (LZW), Deflate and LZMA
 - decompression for the following sub-formats ("methods"):
     Store, Reduce, Shrink (LZW), Implode, Deflate, Deflate64,
     BZip2 and LZMA
 - encryption and decryption (portable Zip 2.0 encryption scheme)
 - unconditional portability - within limits of compiler's provided
     integer types and target architecture capacity
 - input archive to decompress can be any kind of indexed data stream
 - output archive to build can be any kind of indexed data stream
 - input data to compress can be any kind of data stream
 - output data to extract can be any kind of data stream
 - cross format compatibility with the most various tools and file formats
     based on the Zip format: 7-zip, Info-Zip's Zip, WinZip, PKZip,
     Java's JARs, OpenDocument files, MS Office 2007+,
     Google Chrome extensions, Mozilla extensions, E-Pub documents
     and many others
 - task safety: this library can be used ad libitum in parallel processing
 - endian-neutral I/O

***

Main site & contact info:
  http://unzip-ada.sf.net
Project site & subversion repository:
  https://sf.net/projects/unzip-ada/
GitHub clone with git repository:
  https://github.com/zertovitch/zip-ada

Enjoy!

AZip 2.40 - Windows Explorer context menus

3 October 2020 at 18:00

New release (2.40) of AZip.

The long-awaited Windows Explorer integration is there:



 
Context menu for a file

Context menu for a folder

This integration is activated upon installation or on demand via the Manage button:

Configuration

 

This new version is based on the Zip-Ada library v.57 and includes its recent developments.

Enjoy!

Zip-Ada for Audacity backups

22 September 2020 at 18:21

Audacity is a free, open source, audio editor, available here.

If you want to back up your Audacity project, you can do it manually with "Save Lossless Copy of Project..." under the name, say, X, which will create X.aup (project file), a folder X_data, and, in there, a file called "Audio Track.wav".

Some drawbacks:

  • It is a manual operation.
  • It is blocked during playback.
  • Envelopes are applied to the "Audio Track.wav" data. So the data is altered and is no longer a real lossless copy of the project. Actually this operation is something between a backup and an export of the project to a foreign format.

A solution: Zip-Ada.

The latest commit (rev. 796) adds to the Preselection method a specific configuration for detecting Audacity files, so they are compressed better than with default settings.

Funny detail: that configuration makes, in most cases, the compression better than the best available compression with 7-Zip (v.19.00, "ultra" mode, .7z archive).

The compressing process is also around twice as fast as 7-Zip in "ultra" mode. This is no magic, since the "LZ" part of the LZMA compression scheme spends less time finding matches, in the chosen configuration for Zip-Ada.


A backup script could look like this (here for Windows' cmd):

rem --------------------------
rem Nice date YYYY-MM-DD_HH.MM
rem --------------------------

set year=%date:~-4,4%

set month=%date:~-7,2%
if "%month:~0,1%" equ " " set month=0%month:~1,1%

set day=%date:~-10,2%
if "%day:~0,1%" equ " " set day=0%day:~1,1%

set hour=%time:~0,2%
if "%hour:~0,1%" equ " " set hour=0%hour:~1,1%

set min=%time:~3,2%

set nice_date=%year%-%month%-%day%_%hour%.%min%

rem --------------------------

set audacity_project=The Cure - A Forest

zipada -ep2 "%audacity_project%_%nice_date%" "%audacity_project%.aup" "%audacity_project%_data\e08\d08\*.au"

AZip in action for duplicating a Thunderbird profile

18 September 2020 at 07:16

You want to copy your Thunderbird profile from machine A to machine B (with all mail accounts, passwords, settings, feeds, newsgroups, ...)? Actually it is very easy. From the user storage (on Windows, %appdata%, which you reach with Windows key+R and typing %appdata%), you copy the entire Thunderbird folder of machine A to the equivalent location on machine B, and that's it. The new active profile will be automatically selected since the file profiles.ini will be overwritten on the way.

Now, if you want or need to use a cloud drive or a USB stick for the operation, it's better to wrap everything in a Zip file (a single file instead of hundreds) to save time. Plus, you can store the Zip file in case of an emergency (losing data on both A and B machines).

With AZip, it's pretty easy: 

  • Shut down Thunderbird on both machines.
  • On machine A: drag & drop the Thunderbird folder on an empty AZip window.
  • Copy or move the Zip file.
  • On machine B: extract everything with another drag & drop, from AZip to the Explorer window with the %appdata% path. When asked "Use archive's folder names for output", say "Yes". When asked "Do you want to replace this file ?", say "All".

That's it!

Here are a few screenshots:

Folder tree view

You can squeeze the data to a smaller size (the LZMA format will be most of the time chosen over Deflate) with the "Recompress" button (third from the right).

After recompression


Using Ada LZMA to compress and decompress LZMA files

16 December 2015 at 10:25

Setup of Ada LZMA binding

First download the Ada LZMA binding at http://download.vacs.fr/ada-lzma/ada-lzma-1.0.0.tar.gz or clone it from git@github.com:stcarrez/ada-lzma.git, then configure, build and install the library with the following commands:

./configure
make
make install

After these steps, you are ready to use the binding and you can add the following line at the beginning of your GNAT project file:


with "lzma";

Import Declaration

To use the Ada LZMA packages, you will first import the following packages in your Ada source code:


with Lzma.Base;
with Lzma.Container;
with Lzma.Check;

LZMA Stream Declaration and Initialization

The liblzma library uses the lzma_stream type to hold and control the data for the lzma operations. The lzma_stream must be initialized at the beginning of the compression or decompression and must be kept until the compression or decompression is finished. To use it, you must declare the LZMA stream as follows:


Stream  : aliased Lzma.Base.lzma_stream := Lzma.Base.LZMA_STREAM_INIT;

Most of the liblzma functions return a status value of type lzma_ret, so you may declare a result variable like this:


Result : Lzma.Base.lzma_ret;

Initialization of the lzma_stream

After the lzma_stream is declared, you must configure it either for compression or for decompression.

Initialize for compression

To configure the lzma_stream for compression, you will use the lzma_easy_encoder function. The Preset parameter controls the compression level. Higher values provide better compression but are slower and require more memory for the program.


Result := Lzma.Container.lzma_easy_encoder (Stream'Unchecked_Access, Lzma.Container.LZMA_PRESET_DEFAULT,
                                            Lzma.Check.LZMA_CHECK_CRC64);
if Result /= Lzma.Base.LZMA_OK then
  Ada.Text_IO.Put_Line ("Error initializing the encoder");
end if;

Initialize for decompression

For the decompression, you will use the lzma_stream_decoder:


Result := Lzma.Container.lzma_stream_decoder (Stream'Unchecked_Access,
                                              Long_Long_Integer'Last,
                                              Lzma.Container.LZMA_CONCATENATED);

Compress or decompress the data

The compression and decompression is done by the lzma_code function which is called several times until it returns LZMA_STREAM_END code. Setup the stream 'next_out', 'avail_out', 'next_in' and 'avail_in' and call the lzma_code operation with the action (Lzma.Base.LZMA_RUN or Lzma.Base.LZMA_FINISH):


Result := Lzma.Base.lzma_code (Stream'Unchecked_Access, Action);

Release the LZMA stream

Close the LZMA stream:


    Lzma.Base.lzma_end (Stream'Unchecked_Access);

Sources

To better understand and use the library, use the source Luke

Download
