(One of my summaries of the one-day 2016 PyGrunn conference).
Gijs Molenaar works on processing big data for large radio telescopes ("Meerkat" in South Africa and "Lofar" in the Netherlands).
The data volumes coming from such telescopes are huge. 4 terabits per second, for example. So they do a lot of processing and filtering to get that number down. Gijs works on the "imaging and calibration" part of the process.
So: scientific software. Which is hard to install and fragile. Especially for scientists. So they use Ubuntu's "Launchpad" PPAs to package it all up as debian packages.
The new hit nowadays is docker. Containerization. A self-contained light-weight "virtual machine". Someone called it centralized agony: only one person needs to go through the pain of creating the container and all the rest of the world can use it... :-)
His line of work is often centered around pipelines. Data flows from one step to the other and on to the next. This is often done with bash scripts.
Docker is nice and you can hook up multiple docker containers. But... it is all network-centric: a web container plus a database container plus a redis container. It isn't centered on data flows.
So he built something new: kliko. He's got a spec for "kliko" containers. Like "read your input from /input". "Write your output to /output". There should be a kliko.yml that defines the parameters you can pass. There should be a /kliko script as an entry point.
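To make that concrete, here is a minimal sketch of what such an entry-point script could look like. The /input and /output directories come from the talk; the environment-variable parameter hand-over and the "threshold" parameter are my own assumptions for illustration, not the actual kliko spec.

```python
#!/usr/bin/env python
# Illustrative sketch of a kliko-style entry point, NOT kliko's real code.
import json
import os
import pathlib

INPUT = pathlib.Path("/input")
OUTPUT = pathlib.Path("/output")


def main():
    # Hypothetical parameter hand-over: a JSON blob in an environment
    # variable. The real kliko spec defines its own mechanism (kliko.yml).
    parameters = json.loads(os.environ.get("KLIKO_PARAMETERS", "{}"))
    threshold = parameters.get("threshold", 0.5)

    # The contract from the talk: read everything from /input, write
    # results to /output.
    for infile in INPUT.glob("*.txt"):
        values = [float(line) for line in infile.read_text().splitlines()
                  if line.strip()]
        kept = [value for value in values if value >= threshold]
        (OUTPUT / infile.name).write_text("\n".join(str(v) for v in kept))


if __name__ == "__main__":
    main()
```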
Apart from the kliko container, you also have the "kliko runner". It is the actor that runs the container with the right parameters. You can pass the parameters on the command line or via a web interface. Perfect for scientists! You get a form where you can fill in the various parameters (defined in the kliko.yml file) and "just" run the kliko container.
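Roughly, such a runner could look like the following sketch, which uses the Docker SDK for Python (docker-py). This is not kliko's actual implementation; the image name, host paths and parameter mechanism are made up.

```python
# Sketch of a "kliko runner": mount host directories on /input and /output
# and hand over the parameters. Purely illustrative.
import json
import docker


def run_kliko_container(image, input_dir, output_dir, parameters):
    client = docker.from_env()
    return client.containers.run(
        image,
        volumes={
            input_dir: {"bind": "/input", "mode": "ro"},
            output_dir: {"bind": "/output", "mode": "rw"},
        },
        # Hypothetical way of passing the form/command-line parameters;
        # the real mechanism is defined by the kliko spec.
        environment={"KLIKO_PARAMETERS": json.dumps(parameters)},
        remove=True,
    )


run_kliko_container(
    "example/kliko-step",      # hypothetical image name
    "/data/run1/raw",          # hypothetical host paths
    "/data/run1/calibrated",
    {"threshold": 0.5},
)
```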
An idea: you could use it almost like functional programming: functional containers. Containers that don't change the data they're operating on. Every time you run it on the same input data, you get the same results. And you can run them in parallel by definition. And you can do fun things with caching.
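The caching idea could be as simple as hashing the input plus the parameters and keeping the output directory around under that key. A rough sketch (reusing the hypothetical runner from above; names and layout are made up):

```python
# If a container is "functional" (same input + parameters => same output),
# its result can be cached under a hash of both.
import hashlib
import json
import pathlib


def cache_key(image, input_dir, parameters):
    digest = hashlib.sha256()
    digest.update(image.encode())
    digest.update(json.dumps(parameters, sort_keys=True).encode())
    for path in sorted(pathlib.Path(input_dir).rglob("*")):
        if path.is_file():
            digest.update(path.name.encode())
            digest.update(path.read_bytes())
    return digest.hexdigest()


def run_cached(image, input_dir, parameters, cache_root="/tmp/kliko-cache"):
    key = cache_key(image, input_dir, parameters)
    cached = pathlib.Path(cache_root) / key
    if cached.exists():
        return cached  # Reuse the earlier result instead of re-running.
    cached.mkdir(parents=True)
    run_kliko_container(image, input_dir, str(cached), parameters)
    return cached
```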
There are some problems with kliko:
- There is no streaming yet.
- It is filesystem based at the moment, which is slow.
These are known problems that are fine for what they're currently using it for. They'll work on it, though. One thing they're also looking at is "kliko-compose", something like "docker-compose".
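Conceptually, such a "kliko-compose" boils down to chaining steps so that each step's /output becomes the next step's /input. A purely illustrative sketch, again reusing the hypothetical helpers from above (the step images and parameters are invented):

```python
# A pipeline as a list of steps; each step's output directory feeds the
# next step's input. Not the planned kliko-compose implementation.
pipeline = [
    {"image": "example/flagging", "parameters": {"threshold": 0.5}},
    {"image": "example/calibration", "parameters": {"iterations": 3}},
    {"image": "example/imaging", "parameters": {"size": 1024}},
]

current_input = "/data/run1/raw"
for step in pipeline:
    current_input = str(
        run_cached(step["image"], current_input, step["parameters"])
    )
print("Final result in", current_input)
```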
Some (fundamental) problems with docker:
- Docker access means root access, basically.
- GPU acceleration is crap.
- Cached filesystem layers are just annoying. At first it seems fine that all the intermediary steps in your Dockerfile are cached, but it is really irritating once you install, for instance, debian packages. They're hard to update.
- You can't combine containers.