MATT
Travelling Britain: From City Vibes to Countryside Magic

Exploring the UK has always been a dream of mine, and recently I could finally turn it into reality. What fascinated me wasn’t just the landmarks, but the massive transformation the country is currently going through. Of course, that didn't stop me from having a great time and lots of fun.
Sep 10, 2024 2 min read

Agentic AI with Gemma 3

Although Gemma 3 is available as an instruction-tuned model, it does not offer dedicated support for tools. When you have a look at the documentation, it's just regular prompting without any special tokens for functions. So in order to use the model as an agent, you need to take
May 4, 2025 2 min read

How to fine-tune Gemma 3 with LoRA using MLX-LM

If you are on a Mac, you might actually miss CUDA. Without it, tuning an LLM with transformers or Flax is quite hard, as speed will be low. Recently, there has been some progress. With MLX and MLX-LM, there is a cool new package available
May 3, 2025 2 min read

Stop aggregating away the signal in your data

Not written by me, but I wanted to share a great post with you that was written some time ago. I think the author is absolutely right about everything with regard to processing and visualizing data. You can head over to Stack Overflow and read it there: https://stackoverflow.blog/
Apr 18, 2025 1 min read

How to generate embeddings for RAG

When using retrieval-augmented generation to provide additional context to an LLM, you have to generate the vectorized embeddings for a prompt. In the past year, a lot of ready-made libraries have become available, making this task rather trivial. The more interesting aspect is selecting the right model.
Apr 18, 2025 7 min read

Ingesting data into delta lakes without using Spark

When setting up a modern data lake, it is quite important to have ACID guarantees in place, the same as for traditional data warehouses. This helps to mitigate cases where two different jobs import data in parallel or one job fails while writing data. There are currently two
Apr 5, 2025 1 min read

Apache Arrow with Flight IPC

In terms of industry support, Apache Arrow is currently the de facto standard for in-memory data processing and has been adopted by a lot of modern data science frameworks. Besides being processed, that is, transformed and cleaned, data of course also has to be queried and transferred to the frontend for
Mar 26, 2025 2 min read

Love is in the air

When there is love, there is a way. In all kindness, respect and deepest of gratitude - for your enduring friendship, your great heart, your fairness and the endless conversations. For every dinner, every smile, every destination travelled. The only thing you’ll never learn: how to remember jokes. But
Mar 22, 2025

How to implement user defined functions in Acero

In my opinion, Apache Arrow is the de facto standard for in-memory data processing on modern hardware like CPUs and GPUs. It not only provides a modern specification for columnar data layout, but also has all the features an up-to-date data processing framework should provide: data transfer with gRPC, an
Mar 12, 2025 2 min read

How to deploy Swift to DigitalOcean App Platform

Recent developments in Swift have greatly simplified and improved the deployment story on Linux. You can now build minimal containers optimized for deployment on cloud platforms.
Oct 20, 2024 1 min read

Creating modern microservices

In today's fast-paced digital world, creating self-contained and connected microservices quickly and efficiently is crucial. Vapor, a powerful web framework for Swift, allows you to build robust and high-performance services and APIs.
Jun 30, 2024 1 min read

What's new in Spark 3.5

Apache Spark continues to evolve, and the latest release, Spark 3.5, brings a host of exciting new features and improvements that enhance performance, usability, and flexibility.
Jun 30, 2024 1 min read

Widgets for time-series analytics

We introduce Push, an innovative framework designed specifically for interactive time series analysis. This framework empowers regular business users and part-time analysts to conduct sophisticated data analyses effortlessly.
Jun 30, 2024 1 min read

What's new in Cassandra 5

Apache Cassandra has long been a favorite among developers for its robust, distributed NoSQL database capabilities. The release of Apache Cassandra 5 introduces several new features and improvements that enhance performance, scalability, and ease of use.
Jun 30, 2024 1 min read

Tutorial: How to virtualize FreeBSD on ARM-based Macs

This post shows you how to automate virtualization of FreeBSD 14.1 on macOS Sonoma with an open-source solution that runs at near bare-metal speed.
May 21, 2024 3 min read
MATT © 2025