How to Filter Rows Using Pentaho Data Integration

Filtering incoming data from a source is a routine requirement in data integration. An ETL process needs a good deal of conditioning and filtering to work around data quality issues, and all this processing can take a considerable amount of time if it is done piece by piece.

Pentaho provides a built-in step for exactly this purpose, called “Filter Rows”.
The Filter Rows step filters rows based on conditions and comparisons, so the output contains only the rows that satisfy the conditions you apply.
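
To make the idea concrete, here is a minimal Python sketch of what a filter of this kind does conceptually. It is not Pentaho code: the sample rows, the field names, and the `filter_rows` helper are all hypothetical and exist only to illustrate splitting rows by a set of comparisons.

```python
import operator

# Hypothetical sample rows; the field names are illustrative only.
rows = [
    {"id": 1, "country": "IN", "amount": 250},
    {"id": 2, "country": "US", "amount": 40},
    {"id": 3, "country": "IN", "amount": 15},
]

# Each condition is a (field, comparator, value) triple.
COMPARATORS = {"=": operator.eq, ">": operator.gt, "<": operator.lt}


def matches(row, condition):
    """Evaluate a single comparison against one row."""
    field, comparator, value = condition
    return COMPARATORS[comparator](row[field], value)


def filter_rows(rows, conditions):
    """Split rows into (true_rows, false_rows); conditions are ANDed together."""
    true_rows, false_rows = [], []
    for row in rows:
        if all(matches(row, c) for c in conditions):
            true_rows.append(row)
        else:
            false_rows.append(row)
    return true_rows, false_rows


true_rows, false_rows = filter_rows(
    rows, [("country", "=", "IN"), ("amount", ">", 100)]
)
```

Rows that pass every comparison end up in `true_rows` and the rest in `false_rows`, so no row is silently dropped.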

The following demo shows how to use the Filter Rows step in Pentaho:

1. After combining the incoming data in a single Dummy step, connect a Filter Rows step to it.


2. Apply the required conditions in the Filter Rows step.


3. You can then split the output of the Filter Rows step, sending the rows that satisfy the condition (true) to a Select values step and the rows that do not (false) to a Dummy step.


4. Preview the transformation and check that the result meets your requirements.
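
The four steps above can be mirrored in a short Python sketch for readers who want to reason about the data flow outside the Pentaho designer. Everything here is an assumption made for illustration: the two sources, the `age` condition, and the choice of fields kept on the true branch are hypothetical and are not taken from the demo transformation.

```python
# Hypothetical rows arriving from two separate sources.
source_a = [
    {"id": 1, "name": "Asha", "age": 34},
    {"id": 2, "name": "Ben", "age": 17},
]
source_b = [
    {"id": 3, "name": "Chen", "age": 52},
    {"id": 4, "name": "Dee", "age": None},
]

# Step 1: combine the incoming data into one stream.
combined = source_a + source_b


# Step 2: the filter condition (assumed: age is present and at least 18).
def is_adult(row):
    return row["age"] is not None and row["age"] >= 18


# Step 3: the true branch keeps only chosen fields (the role of Select values);
# the false branch passes rows through unchanged (the role of the Dummy step).
selected = [{"id": r["id"], "name": r["name"]} for r in combined if is_adult(r)]
rejected = [r for r in combined if not is_adult(r)]

# Step 4: preview both branches to check the result.
for row in selected:
    print("true :", row)
for row in rejected:
    print("false:", row)
```

Keeping the rejected rows on their own branch, rather than discarding them, makes it easy to inspect why a row failed the condition, such as the missing `age` value in this sample.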

