Bloerg – Moving on to beancount

Moving on to beancount

14 May 2018

For the past three years I have been recording my finances with tools from the ledger family. While I had no major issues with them, there were a few minor annoyances, most notably recording capital gains in hledger without a virtual posting and a lack of a nice visual representation of my finances. I knew about beancount but was always a bit sceptical about its data format which is not really compatible with the other ledger tools. The deciding factor to take the plunge was the fava web interface and the comprehensive inventory system which makes recording capital gains a breeze.

Importing Deutsche Bank data

Moving the ledger data to the beancount format was pretty straightforward using the ledger2beancount tool. Unfortunately, I had to re-write the import tool from scratch because beancount does not provide a command similar to hledger csv. On the other hand, it was relatively simple to come up with these little bits of Python:

import os
import re
import codecs
import csv
import datetime
from beancount.core import number, data, amount
from beancount.ingest import importer

class Target(object):
    def __init__(self, account, payee=None, narration=None):
        self.account = account
        self.payee = payee
        self.narration = narration

class DeutscheBankImporter(importer.ImporterProtocol):
    def __init__(self, account, default, mapping):
        self.account = account
        self.default = default
        self.mapping = mapping

    def identify(self, fname):
        return re.match(r"Kontoumsaetze_\d+_\d+_\d+_\d+.csv",
            os.path.basename(fname.name))

    def file_account(self, fname):
        return self.account

    def extract(self, fname):
        fp = codecs.open(fname.name, 'r', 'iso-8859-1')
        lines = fp.readlines()

        # drop top and bottom stuff
        lines = lines[5:]
        lines = lines[:-1]
        entries = []

        def fix_decimals(s):
            return s.replace('.', '').replace(',', '.')

        for index, row in enumerate(csv.reader(lines, delimiter=';')):
            meta = data.new_metadata(fname.name, index)
            date = datetime.datetime.strptime(row[0], '%d.%m.%Y').date()
            desc = row[4]
            payee = row[3]
            credit = fix_decimals(row[15]) if row[15] != '' else None
            debit = fix_decimals(row[16]) if row[16] != '' else None
            currency = row[17]
            account = self.default
            num = number.D(credit if credit else debit)
            units = amount.Amount(num, currency)

            for p, t in self.mapping.items():
                if p in desc:
                    account = t.account

                    if t.narration:
                        desc = t.narration

                    if t.payee:
                        payee = t.payee

            frm = data.Posting(self.account, units, None, None, None, None)
            to = data.Posting(account, -units, None, None, None, None)
            txn = data.Transaction(meta, date, "*", payee, desc,
                    data.EMPTY_SET, data.EMPTY_SET, [frm, to])

            entries.append(txn)

        return entries

that you would plug in to your import config like this:

mappings = {
    'Salary':
        Target('Assets:Income', 'Foo Company'),
    'Walmart':
        Target('Expenses:Food:Groceries'),
}
CONFIG = [
    DeutscheBankImporter('Assets:Checking', 'Expenses:ReplaceMe', mappings)
]

Yes, that’s my answer to this statement from the official documentation:

My standard answer is that while it would be fun to have [automatic categorization], if you have a text editor with account name completion configured properly, it’s a breeze to do this manually and you don’t really need it.

On to the next years …