As a data engineer, I consult on designing, developing, and running a central data platform. By processing both small and big data, especially events, in a timely manner, the platform enables you to digitalise business processes and offer completely new services.
I help connect internal and external source systems. Working together with domain experts, I model incoming data so it is structured and can be enriched. I develop microservices to implement business processes. I collaborate with data scientists who explore the data and create prototype algorithms or machine learning models (for example using Python or R). In some cases I even develop the models myself.
In my role I also turn ad-hoc queries, algorithms, and new models into production jobs that integrate into the system and serve digital products. These jobs are tested, scheduled, versioned, monitored, and made highly available. If an error occurs, they can be rolled back to an older version.
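To illustrate what "versioned" and "rolled back" can mean for a production job, here is a minimal sketch in plain Python. The `JobRegistry` class, the version strings, and the two toy jobs are hypothetical names chosen for this example; a real platform would typically rely on its scheduler and deployment tooling instead.

```python
from typing import Callable, Dict, List

Job = Callable[[List[float]], float]


class JobRegistry:
    """Keeps every deployed version of a job and tracks which one is active."""

    def __init__(self) -> None:
        self._versions: Dict[str, Job] = {}
        self._history: List[str] = []  # deployment order; last entry is active

    def deploy(self, version: str, job: Job) -> None:
        """Register a new version and make it the active one."""
        self._versions[version] = job
        self._history.append(version)

    def rollback(self) -> str:
        """Reactivate the previously deployed version and return its name."""
        if len(self._history) < 2:
            raise RuntimeError("no older version to roll back to")
        self._history.pop()
        return self._history[-1]

    @property
    def active_version(self) -> str:
        return self._history[-1]

    def run(self, data: List[float]) -> float:
        """Run the currently active version of the job."""
        return self._versions[self.active_version](data)


registry = JobRegistry()
registry.deploy("1.0.0", lambda xs: sum(xs) / len(xs))         # mean
registry.deploy("1.1.0", lambda xs: sorted(xs)[len(xs) // 2])  # median (odd length)

print(registry.active_version, registry.run([1.0, 2.0, 9.0]))  # 1.1.0 2.0
registry.rollback()
print(registry.active_version, registry.run([1.0, 2.0, 9.0]))  # 1.0.0 4.0
```

Because older versions stay registered, a rollback only changes which version is active; nothing has to be rebuilt.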
You don’t have a central data platform yet? I can help you build such a system from the ground up. I can also run a remote workshop to get you started, and you can reuse standard components I’ve developed over the years.
Apache Spark is a framework that processes large amounts of data quickly by distributing the work across a cluster. Written in Scala, it offers a unified API for batch processing, SQL, machine learning, and streaming that can be used from Scala, Java, and Python.
Apache Kafka or Apache Pulsar can serve as the event-streaming backbone of a central data platform.
Microservices are a very good fit for a central data platform that processes streams of events. They can be written in any programming language that has a client API for the streaming system. I prefer Scala, Python, or Kotlin.
It’s important to model microservices correctly, especially in terms of responsibility for functions, ownership of data, data modelling, and communication with other services. This can be explored using Domain-Driven Design and Event Storming workshops held together with domain experts. Each microservice should be responsible for one specific domain and own its data, keeping its own copy of any data it needs from other services.
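The ownership idea can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: `EventBus` is an in-memory stand-in for a broker such as Kafka or Pulsar, and `CustomerService`, `ShippingService`, and the `customer-updated` topic are hypothetical names invented for this example.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Event:
    topic: str
    key: str
    payload: dict


class EventBus:
    """In-memory stand-in for an event broker (assumption for this sketch)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[Event], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[Event], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, event: Event) -> None:
        # Deliver the event synchronously to every subscriber of its topic.
        for handler in self._subscribers[event.topic]:
            handler(event)


class CustomerService:
    """Owns customer data; the only service that changes it."""

    def __init__(self, bus: EventBus) -> None:
        self._bus = bus
        self._customers: Dict[str, dict] = {}

    def update_customer(self, customer_id: str, name: str, city: str) -> None:
        record = {"name": name, "city": city}
        self._customers[customer_id] = record
        # Announce the change so other services can update their copies.
        self._bus.publish(Event("customer-updated", customer_id, dict(record)))


class ShippingService:
    """Owns shipments; keeps a local copy of the customer data it needs."""

    def __init__(self, bus: EventBus) -> None:
        self._customer_copy: Dict[str, dict] = {}
        self.shipments: List[dict] = []
        bus.subscribe("customer-updated", self._on_customer_updated)

    def _on_customer_updated(self, event: Event) -> None:
        self._customer_copy[event.key] = event.payload

    def create_shipment(self, customer_id: str, item: str) -> dict:
        # Reads only the local copy; never queries CustomerService directly.
        customer = self._customer_copy[customer_id]
        shipment = {"item": item, "to": customer["name"], "city": customer["city"]}
        self.shipments.append(shipment)
        return shipment


bus = EventBus()
customers = CustomerService(bus)
shipping = ShippingService(bus)

customers.update_customer("c-1", "Ada", "Berlin")
customers.update_customer("c-1", "Ada", "Hamburg")
print(shipping.create_shipment("c-1", "book"))
# {'item': 'book', 'to': 'Ada', 'city': 'Hamburg'}
```

The shipping service never reaches into the customer service's storage. It only reacts to events, so each service can be deployed and scaled independently, at the cost of its copy being updated slightly later than the original.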
For pure web services and RESTful backends I prefer the Play Framework. Play offers a scalable, non-blocking architecture and allows for a quick development cycle. Written in Scala and based on Akka, it’s well prepared for modern web development. I can also work with any other Scala-based framework.
I also develop front-ends using Scala.js. This way code can be shared between front-end and back-end. Scala.js is best suited for complex web applications and administration interfaces.
Available for custom software engineering: 2021