Row-level Error Handling with Apache Spark SQL
If you’re using Apache Spark SQL to run ETL jobs and apply data transformations between different domain models, you might be wondering what the best way is to deal with errors when some values cannot be mapped according to the specified business rules. In this blog post I would like to share one approach that filters out the successful records and sends them to the next layer while moving the failed records to a quarantine table. I’ll be using PySpark and DataFrames, but the same concepts should apply when using Scala and Datasets.
Developer's guide on setting up a new MacBook in 2021
Introduction
So recently I got myself a shiny new M1-based MacBook Air. To make it clear, although it has ‘Air’ in its name, this machine is more than capable of handling all types of backend development workloads, such as coding, compilation, and ML tasks, really well. After switching from my 15” MacBook Pro (2018) I can say that pretty much everything seems faster and more responsive, even tools that have not yet been migrated to run on M1 natively and hence require translation via Rosetta 2. Added benefits of choosing the MacBook Air over the MacBook Pro (which at the time of writing is only available with an Intel CPU), besides of course being super light and portable, are its innovative fanless design and its welcome absence of a Touch Bar. So overall, after having used the MacBook Air for a few weeks as my primary dev machine, I have to agree with the many reviews on the web that this is probably the best laptop you can get right now as a Web and/or Backend Developer.
But to make the most out of it, it is absolutely key to set up this new toy for maximum productivity with the right choice of software tools and configuration. In this blog post I would like to share with you some of the tweaks and tools I personally used to transform my new MacBook into a productivity monster with a goal of hopefully making your life a little bit easier when you have to do the same.
Writing a Scala and Akka-HTTP based client for REST API (Part I)
And so I decided to write another HTTP client using Scala. Sounds easy? Sure, if you know where to start. As with so many things in the Scala world, picking your libraries can be the hardest part. In this post I’ll try to make the experience of writing an HTTP client library in Scala a little less painful so you don’t have to start from scratch. Ok, enough introduction, let’s start!
Scala Wrapper for Oanda REST API
Oanda is in my opinion one of the best online brokers for retail folks (like you and me). Originally an FX-only broker, Oanda has more recently expanded its offering to allow trading of stock and commodity indices through so-called CFDs (contracts for difference). I don’t want to go into too much detail on how it works; you can read up on that on their website if you’re interested. In this post I would like to offer you some guidance on how to use Oanda as a broker for executing algorithmic / automated trading strategies.
Installing Arch Linux on VirtualBox
Let’s assume you love your Linux terminal to get things done (like I do). Let’s also assume that the firm you’re working for is forcing you to use Windows. Bugger! What are you gonna do now? One option would be to use Cygwin, but unfortunately, it is just a Linux terminal emulator still based on Windows and very buggy on top of that. A better solution would be to install a clean Linux distribution inside a virtual machine. And this is exactly what I would like to share with you today, namely some tips on how to install a distribution of Arch Linux on VirtualBox.
Welcome and Merry X-Mas!
I would like to welcome you all to my new website. In this blog you will find posts on topics related to the things I know best, namely functional programming (Scala in particular), web development, and the world of finance and trading. I will also occasionally post photos here from my trips around the world. Meanwhile feel free to check out my projects on GitHub: