Cloning Rows Using Pentaho Data Integration

Cloning differs from replication and backup in that the cloned environment is both fully functional and separate in its own right. Additionally, the cloned environment may be modified at its inception through configuration changes or data subsetting.

When a certain number of rows needs to be cloned, Pentaho provides a built-in step called “Clone Rows”. It clones rows and can also flag the cloned rows, so that originals and copies can be told apart in the output.

This blog post briefly shows how to implement cloning using Kettle.

First, we generated some rows using the “Generate Rows” step; a preview of them is shown below.

Next, we used a “Get Value From Sequence” step to add a counter value to each row.
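For readers who want to see the idea outside Kettle, the two steps so far can be mimicked in a few lines of plain Python. This is only a sketch of the behaviour, not how PDI implements it. The field names `name` and `counter`, the limit of 3 rows, and the helper functions are assumptions made up for the example.

```python
from itertools import count


def generate_rows(limit, template):
    """Return `limit` identical copies of `template` (a stand-in for Generate Rows)."""
    return [dict(template) for _ in range(limit)]


def add_sequence(rows, field="counter", start=1, increment=1):
    """Return new rows with an incrementing counter stored under `field`."""
    seq = count(start, increment)
    return [{**row, field: next(seq)} for row in rows]


# Hypothetical input: three identical rows, then numbered 1..3.
rows = add_sequence(generate_rows(3, {"name": "pentaho"}))
for row in rows:
    print(row)
```

Each row keeps its original fields and gains a counter, which later makes it easy to see which output rows came from the same input row.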

Finally, we clone the rows to produce the required output. In this case 2 clones are created, because we set Nr Clones = 2 in the Clone Rows step.

The output is as follows, with the cloned rows flagged as Y and the original rows as N.
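The flagging logic can be sketched in plain Python as well. This is an illustration under assumptions, not the step's actual implementation: the function name `clone_rows`, the flag field name `is_clone`, and the sample input rows are invented for the example, and it assumes that each original row is emitted once and followed by its clones.

```python
def clone_rows(rows, nr_clones, flag_field="is_clone"):
    """Emit every row once flagged N, followed by `nr_clones` copies flagged Y."""
    out = []
    for row in rows:
        out.append({**row, flag_field: "N"})      # the original row
        for _ in range(nr_clones):
            out.append({**row, flag_field: "Y"})  # one clone of that row
    return out


# Hypothetical input: two rows that already carry a counter field.
source = [
    {"name": "pentaho", "counter": 1},
    {"name": "pentaho", "counter": 2},
]
cloned = clone_rows(source, nr_clones=2)
for row in cloned:
    print(row)
```

With two clones per row, every input row in this sketch yields three output rows: the original flagged N and two copies flagged Y.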

