The Apache NiFi full guide

In this article I will explain the purpose of the Apache NiFi tool, when to use it, and how it interacts with other applications in your landscape. Furthermore, I will explain how you can install Apache NiFi on your Ubuntu virtual machine and subsequently load data into the HBase table explained in a separate article on this website.

What is the purpose of Apache NiFi? Why use it?

Tools and applications are created for a specific purpose or use case. They always try to solve a problem. So which problem does Apache NiFi aim to solve? Let’s dig a bit deeper into this.

Apache NiFi was originally developed at the NSA (under the name NiagaraFiles) and aimed to solve the problem of data management there. Specifically, it aimed to automate, in real time, the processing, manipulation, transfer, interpretation and storage of data from different sources (in different formats) for multiple users of that data.

The main challenges that Apache NiFi tries to address are, amongst others, security, scalability (it is often deployed alongside the Hadoop ecosystem), and ease of use via an interactive user interface. The latter is especially easy to see once you install Apache NiFi on your server. It provides a low-code environment in which functional components are configured via drag and drop to extract, transform and load data between systems. This is the main purpose of Apache NiFi.

The graphical user interface not only makes it easy to create new data flows that move and transform data between systems, it also lets you monitor, in real time, the flow of data between the different steps you created. In other words, tracking of data lineage and provenance is very strongly implemented in NiFi.

When not to use Apache NiFi?

While we now have a better idea of the purpose of Apache NiFi, I always find it helps to understand a technology better by considering the “not” question. So in which scenarios is Apache NiFi not the best solution?

Simply put, when you need to do complex data transformations and manipulation based on particular business logic, Apache NiFi might not be the way to go. When simple data and file transformations are needed prior to processing and storing the data elsewhere, Apache NiFi is very capable and also scalable. When this logic becomes more complex, it is better to choose a different solution.

How is Apache NiFi set up?

Apache NiFi implements several concepts to be able to extract, transform and load data between systems. This data processing capability requires these different concepts to work together seamlessly.

How to install Apache NiFi on your Linux server?

Downloading and installing Apache NiFi is similar to HBase. Java must already be installed before you can start Apache NiFi, because the platform runs on the Java runtime. For an explanation of how to install Java, please refer to the HBase guide.
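
Before starting NiFi it can be useful to confirm that Java is actually reachable from your shell. The snippet below is a minimal sketch of such a check, not part of NiFi itself: the function names parse_java_major and installed_java_major are made up for this illustration, and it assumes the java executable is on your PATH. Which Java version your NiFi release needs is not covered here, so check the NiFi documentation for that.

```python
import re
import shutil
import subprocess
from typing import Optional


def parse_java_major(version_output: str) -> Optional[int]:
    """Extract the major Java version from `java -version` output.

    Hypothetical helper for this article; returns None if no version is found.
    """
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first, second = match.group(1), match.group(2)
    # Legacy numbering: "1.8.0_381" means Java 8.
    if first == "1" and second is not None:
        return int(second)
    return int(first)


def installed_java_major() -> Optional[int]:
    """Return the major version of the `java` found on PATH, or None if absent."""
    java = shutil.which("java")
    if java is None:
        return None
    # `java -version` normally prints its banner to stderr, so read both streams.
    result = subprocess.run([java, "-version"], capture_output=True, text=True)
    return parse_java_major(result.stderr + result.stdout)


if __name__ == "__main__":
    major = installed_java_major()
    if major is None:
        print("No usable Java found on PATH; install Java before starting NiFi.")
    else:
        print(f"Found Java {major}.")
```

If the script reports that no Java was found, follow the Java installation steps in the HBase guide first and run the check again.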

FlowFile