Moving on to beancount

For the past three years I have been recording my finances with tools from the ledger family. While I had no major issues with them, there were a few minor annoyances, most notably the inability to record capital gains in hledger without a virtual posting and the lack of a nice visual representation of my finances. I knew about beancount but was always a bit sceptical about its data format, which is not really compatible with the other ledger tools. The deciding factors to take the plunge were the fava web interface and the comprehensive inventory system, which makes recording capital gains a breeze.
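To illustrate what the inventory system does for you: every lot is tracked together with its cost basis, so a sale can reference that basis and the capital gain simply falls out of the balancing act. A sketch with made-up accounts and numbers:

2017-01-10 * "Buy 10 FOO"
  Assets:Broker                 10 FOO {100.00 EUR}
  Assets:Checking         -1000.00 EUR

2017-06-01 * "Sell 10 FOO"
  Assets:Broker                -10 FOO {100.00 EUR} @ 120.00 EUR
  Assets:Checking          1200.00 EUR
  Income:CapitalGains      -200.00 EUR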

Importing Deutsche Bank data

Moving the ledger data to the beancount format was pretty straightforward using the ledger2beancount tool. Unfortunately, I had to rewrite the import tooling from scratch because beancount does not provide a command similar to hledger's CSV import. On the other hand, it was relatively simple to come up with these little bits of Python:

import os
import re
import codecs
import csv
import datetime
from beancount.core import number, data, amount
from beancount.ingest import importer

class Target(object):
    def __init__(self, account, payee=None, narration=None):
        self.account = account
        self.payee = payee
        self.narration = narration

class DeutscheBankImporter(importer.ImporterProtocol):
    def __init__(self, account, default, mapping):
        self.account = account
        self.default = default
        self.mapping = mapping

    def identify(self, fname):
        # match the file name pattern of Deutsche Bank CSV exports
        return re.match(r"Kontoumsaetze_\d+_\d+_\d+_\d+\.csv",
            os.path.basename(fname.name))

    def file_account(self, fname):
        return self.account

    def extract(self, fname):
        # the exports are encoded as ISO-8859-1 rather than UTF-8
        with codecs.open(fname.name, 'r', 'iso-8859-1') as fp:
            lines = fp.readlines()

        # drop the header noise at the top and the trailing summary line
        lines = lines[5:]
        lines = lines[:-1]
        entries = []

        def fix_decimals(s):
            # convert German number formatting, e.g. "1.234,56" -> "1234.56"
            return s.replace('.', '').replace(',', '.')

        for index, row in enumerate(csv.reader(lines, delimiter=';')):
            meta = data.new_metadata(fname.name, index)
            date = datetime.datetime.strptime(row[0], '%d.%m.%Y').date()
            desc = row[4]
            payee = row[3]
            credit = fix_decimals(row[15]) if row[15] != '' else None
            debit = fix_decimals(row[16]) if row[16] != '' else None
            currency = row[17]
            account = self.default
            num = number.D(credit if credit else debit)
            units = amount.Amount(num, currency)

            # map the transaction to a real account and optionally
            # override payee and narration
            for p, t in self.mapping.items():
                if p in desc:
                    account = t.account

                    if t.narration:
                        desc = t.narration

                    if t.payee:
                        payee = t.payee

            frm = data.Posting(self.account, units, None, None, None, None)
            to = data.Posting(account, -units, None, None, None, None)
            txn = data.Transaction(meta, date, "*", payee, desc,
                    data.EMPTY_SET, data.EMPTY_SET, [frm, to])

            entries.append(txn)

        return entries

that you would plug into your import config like this:

mappings = {
    'Salary':
        Target('Income:Salary', 'Foo Company'),
    'Walmart':
        Target('Expenses:Food:Groceries'),
}
CONFIG = [
    DeutscheBankImporter('Assets:Checking', 'Expenses:ReplaceMe', mappings)
]
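
With the config in place, extraction boils down to running bean-extract on the downloaded CSV export and appending the result to the ledger (file names below are placeholders):

$ bean-extract config.py Kontoumsaetze_<...>.csv >> main.beancount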

Yes, that’s my answer to this statement from the official documentation:

My standard answer is that while it would be fun to have [automatic categorization], if you have a text editor with account name completion configured properly, it’s a breeze to do this manually and you don’t really need it.

On to the next years …


Trying vcpkg

Developing C and C++ projects in a cross-platform manner is difficult not only because of different system APIs but also because dependencies have to be tracked, installed and integrated into the build system du jour. While this is largely a solved problem on Linux thanks to system package managers and de facto standards such as pkg-config, the situation on Windows is tricky because both are missing. Rust and Go sidestep the problem by building static binaries from source using their integrated package managers. In a similar manner, a Microsoft team is trying to combine package management and build integration on Linux, Mac and Windows with their open source vcpkg package manager.

The main idea is simple: provide 1) a repository with build and dependency descriptions for a wide range of libraries, 2) a CLI tool to actually build said libraries locally and 3) means to make use of these libraries. Each of these points has its pros and cons, which you can see when trying to build the following C toy example on Linux, which depends on the GLib utility library:

#include <glib.h>

int
main (int argc, char *argv[])
{
    g_print ("Hello world\n");
    return 0;
}

With vcpkg you first have to clone the repository and then build the vcpkg binary itself. Easier said than done, because the binary requires g++-7 to build. The bootstrap itself boils down to this (URL and script name as of the time of writing):
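
git clone https://github.com/Microsoft/vcpkg.git
cd vcpkg
./bootstrap-vcpkg.sh

You then use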

vcpkg install glib

to build GLib and all its dependencies, which works well but has one big caveat: using so-called CONTROL files, maintainers specify the version of a library and all of its dependencies. Unfortunately, the build dependencies are not versioned but taken from the current state of the repository. This means the only way to pin the version of library A is to never upgrade the repository, which in turn means you cannot upgrade to a newer version of library B. In the end, vcpkg is just like a system package manager, and we are nowhere near the Rust build story.
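To give you an idea, a CONTROL file is just a short stanza of key-value fields. Note how the Build-Depends entries are plain names without any version constraints (version and dependency list below are made up for illustration):

Source: glib
Version: 2.52.3
Build-Depends: zlib, pcre, libffi, gettext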

Now, how do we use our library? Well, there’s the first problem. At the moment, the only officially supported way is to write a CMake build script such as this

cmake_minimum_required(VERSION 3.0)
project(hello)

find_path(GLIB_INCLUDE_DIR glib.h)
find_library(GLIB_LIBRARY glib)

add_executable(hello hello.c)
target_link_libraries(hello PRIVATE ${GLIB_LIBRARY})
target_include_directories(hello PRIVATE ${GLIB_INCLUDE_DIR})

and passing their toolchain file to the configure process of CMake

cd <builddir> && cmake <sourcedir> -DCMAKE_TOOLCHAIN_FILE=<vcpkgroot>/scripts/buildsystems/vcpkg.cmake

For projects that have native CMake support you can use the find_package command, but as you can see we have to rely on the find_library and find_path commands for GLib. Except, it does not work: CMake can find neither GLib nor sqlite3 using find_package as outlined in the documentation. Could we at least build something by hand? Kind of. All libraries and header files are installed under <vcpkgroot>/installed/x64-linux and its various sub-directories. However, you have to hardcode the paths because pkg-config files are not installed, thus it’s worse than using system development libraries.
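Compiling the toy program by hand then looks something like this, with every path spelled out manually (the exact include sub-directory and library name depend on how the port installs GLib):

gcc hello.c -o hello \
    -I<vcpkgroot>/installed/x64-linux/include/glib-2.0 \
    -L<vcpkgroot>/installed/x64-linux/lib \
    -lglib-2.0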

Because of these issues, I would not consider vcpkg an alternative package and build manager for Linux at the time of writing. There are some good ideas, but spack has been doing pretty much the same for a while now, and its authors have put way more thought into their product. Let’s hope it will get better.


Using a Focusrite 18i8 under Linux

Focusrite’s Scarlett USB audio interfaces are cost-effective units with a wide range of input and output options. Out of the box they can be used by most digital audio workstations on Linux because they are USB class-compliant. However, on anything but the entry-level models (Solo, 2i2 and 2i4), you will certainly miss a few features for recording that just cannot be accessed due to a lack of hardware controls. For these you need the Focusrite Control software which – as you can imagine – is available only for Mac and Windows. Fortunately, there is a device-specific ALSA driver for the Scarlett devices that exposes mix controls to set gains on the different ins and outs as well as controls to enable Hi-Z or Pad modes on certain inputs. Unfortunately, using alsamixer to set these values is a huge PITA.
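To get an idea why, every gain in the mix matrix is exposed as an individual ALSA control that you have to hunt down and set by hand, along these lines (the numid and value are purely illustrative):

$ amixer -c <card> controls            # list all matrix and switch controls
$ amixer -c <card> cset numid=42 96    # set a single cross-point gain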

Robin Gareus, a very active contributor to Ardour, JACK and LV2, wrote the scarlett-mixer for his Scarlett 18i6, which allows one to control all these parameters through an easy-to-use graphical user interface. So I put two and two together, made the appropriate changes and added support for the 18i8. No longer do I need to resort to the borderline crazy Alsa Json Gateway software 🙌


OpenCL resource tracking

I wrote about the woes I had with the abysmal OpenCL implementations by both AMD and NVIDIA. Now, it’s not always the fault of others, and I recently had to debug a GPU memory leak originating from our own software. The first thing I needed to identify was whether all OpenCL resources were referenced and dereferenced equally. However, the entire system is (as usual, I suppose) a dynamic pile of layers of abstractions, so simple reasoning from the code was out of the question rather quickly. I remembered the GObject tracking library gobject-list, took its interception and backtrace bits, and re-implemented the OpenCL calls dealing with resource allocation and unreferencing to get the LD_PRELOAD library libocl-stat.so. Usage is similar to gobject-list:

$ LD_PRELOAD=/path/to/libocl-stat.so ./app

with example output like this:

OpenCL objects alive
====================
Contexts        1/1
Command queues  0/1
Buffers         0/9
Samplers        0/0
Kernels         3/11

Memory leaks
============
Leaking         0 B
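
For the curious, the core of such an LD_PRELOAD shim is simple: export functions with the same names as the OpenCL entry points, forward to the real implementation resolved via dlsym and count along the way. Here is a minimal sketch covering the buffer calls only; the actual library also records backtraces and handles the remaining resource types:

#define _GNU_SOURCE

#include <dlfcn.h>
#include <CL/cl.h>

/* number of buffers currently alive */
static int n_buffers = 0;

cl_mem
clCreateBuffer (cl_context context, cl_mem_flags flags, size_t size,
                void *host_ptr, cl_int *errcode_ret)
{
    /* resolve the vendor implementation once, then count and forward */
    static cl_mem (*real) (cl_context, cl_mem_flags, size_t, void *, cl_int *) = NULL;

    if (real == NULL)
        real = (cl_mem (*) (cl_context, cl_mem_flags, size_t, void *, cl_int *))
            dlsym (RTLD_NEXT, "clCreateBuffer");

    n_buffers++;
    return real (context, flags, size, host_ptr, errcode_ret);
}

cl_int
clReleaseMemObject (cl_mem memobj)
{
    static cl_int (*real) (cl_mem) = NULL;

    if (real == NULL)
        real = (cl_int (*) (cl_mem)) dlsym (RTLD_NEXT, "clReleaseMemObject");

    n_buffers--;
    return real (memobj);
}

Compiled with gcc -shared -fPIC -ldl, this yields a shared object you can pass to LD_PRELOAD just like above.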