Dynamic Looping in KNIME

Leveraging Athena with KNIME in a Robust Manner, Part One
October 24, 2019
Leveraging Athena with KNIME in a Robust Manner, Part 2
November 6, 2019
Show all

I recently came across a situation where I needed to load a couple of table partitions. I wanted to automate this process and have a table drive the process. To accomplish this task, I stated by creating a table of partitions that would drive the dynamic refresh process. My requirement was to be able to refresh the current month and the previous month’s data. (see the DDL for the partitioned fact table below)

CREATE TABLE FACT_BOOK_SALE
( FBS_BOOK_KEY NUMBER NOT NULL ENABLE
,FBS_CSTMR_KEY NUMBER NOT NULL ENABLE
,FBS_STORE_KEY NUMBER NOT NULL ENABLE
,FBS_DATE_KEY NUMBER NOT NULL ENABLE
,FBS_BOOK_ISBN_13 NUMBER
,FBS_CSTMR_ID VARCHAR2(50 CHAR)
,FBS_STORE_ID NUMBER
,FBS_QTY NUMBER
,FBS_DSCNT_PCT NUMBER(6,3)
,FBS_SALE_PRICE NUMBER(19,4)
,PRIMARY KEY (FBS_BOOK_KEY, FBS_CSTMR_KEY, FBS_STORE_KEY, FBS_DATE_KEY)
)
PARTITION BY RANGE(FBS_DATE_KEY)
( PARTITION FACT_BOOK_SALE_201901 VALUES LESS THAN (20190200),
PARTITION FACT_BOOK_SALE_201902 VALUES LESS THAN (20190300),
PARTITION FACT_BOOK_SALE_201903 VALUES LESS THAN (20190400),
PARTITION FACT_BOOK_SALE_201904 VALUES LESS THAN (20190500),
PARTITION FACT_BOOK_SALE_201905 VALUES LESS THAN (20190600),
PARTITION FACT_BOOK_SALE_201906 VALUES LESS THAN (20190700),
PARTITION FACT_BOOK_SALE_201907 VALUES LESS THAN (20190800),
PARTITION FACT_BOOK_SALE_201908 VALUES LESS THAN (20190900),
PARTITION FACT_BOOK_SALE_201909 VALUES LESS THAN (20191000),
PARTITION FACT_BOOK_SALE_201910 VALUES LESS THAN (20191100),
PARTITION FACT_BOOK_SALE_201911 VALUES LESS THAN (20191200),
PARTITION FACT_BOOK_SALE_201912 VALUES LESS THAN (20200100),
PARTITION FACT_BOOK_SALE_MAX VALUES LESS THAN (99999999)
);

The command we will dynamically create to truncate the partitions will be the following:

ALTER TABLE FACT_BOOK_SALE TRUNCATE PARTITION FACT_BOOK_SALE_201910;
ALTER TABLE FACT_BOOK_SALE TRUNCATE PARTITION FACT_BOOK_SALE_201909;

Once the data has been truncated, I can then load the data with a simple insert/select statement from the partition refresh utility table (see DDL below).

CREATE TABLE UTL_PARTITION_REFRESH
(UPR_YYYYMM                                NUMBER
,UPR_TBL_NM                                  VARCHAR2(100)
,UPR_PBL_PRTN_NM                     VARCHAR2(100)
,UPR_PRTTN_RFRSH_FL                NUMBER
);

INSERT INTO UTL_PARTITION_REFRESH VALUES(201912,’FACT_ORDER’,’FACT_ORDER_201912′,0);
INSERT INTO UTL_PARTITION_REFRESH VALUES(201911,’FACT_ORDER’,’FACT_ORDER_201911′,0);
INSERT INTO UTL_PARTITION_REFRESH VALUES(201910,’FACT_ORDER’,’FACT_ORDER_201910′,0);
INSERT INTO UTL_PARTITION_REFRESH VALUES(201909,’FACT_ORDER’,’FACT_ORDER_201909′,0);
INSERT INTO UTL_PARTITION_REFRESH VALUES(201908,’FACT_ORDER’,’FACT_ORDER_201908′,0);
INSERT INTO UTL_PARTITION_REFRESH VALUES(201907,’FACT_ORDER’,’FACT_ORDER_201907′,0);
INSERT INTO UTL_PARTITION_REFRESH VALUES(201906,’FACT_ORDER’,’FACT_ORDER_201906′,0);
INSERT INTO UTL_PARTITION_REFRESH VALUES(201905,’FACT_ORDER’,’FACT_ORDER_201905′,0);
INSERT INTO UTL_PARTITION_REFRESH VALUES(201904,’FACT_ORDER’,’FACT_ORDER_201904′,0);
INSERT INTO UTL_PARTITION_REFRESH VALUES(201903,’FACT_ORDER’,’FACT_ORDER_201903′,0);
INSERT INTO UTL_PARTITION_REFRESH VALUES(201902,’FACT_ORDER’,’FACT_ORDER_201902′,0);
INSERT INTO UTL_PARTITION_REFRESH VALUES(201901,’FACT_ORDER’,’FACT_ORDER_201901′,0);
COMMIT;

Now that we have the infrastructure set, let’s look at the KNIME workflow:

Figure 1

As you can see this is a fairly straight forward workflow. There is an extra step needed to pass the statements as table rows so the workflow will connect as the DB Table Selector will not connect to the Table Row To Variable Loop Start.

Let’s look inside the critical steps. First, we reset all of the flags in the UTL_PARTITION_REFRESH table and then set the refresh partition flags by using the sysdate function.

Figure 2

Now that we have the refresh flag set to the current month and the previous month let’s dynamically create the truncate statement in the next step.

Figure 3

In this step we run a dynamic select statement using the UTIL_PARTITION_REFRESH table as the driver. Once this is completed, we push the output on to the loop where we will see the first statement (DYNAMIC_SQL) and the number of statements which will be pushed through the loop (maxIterations).

Figure 4

Finally, we complete the process by executing the newly created flow variables in the DB SQL Executor inside the loop.

Figure 5

Now that the current partitions have been truncated, I can now continue on with my workflow to load new data into the table.

Jim McHugh
Jim McHugh
An accomplished Sr. IT leader with over 27 years of professional experience applying technology solutions to business challenges. Mr. McHugh has expertise in data modeling, data governance, business intelligence, predictive analytics and data science. His responsibilities include establishing and executing a strategy that ensures the application of data management & analytics to enable an organization to strategically leverage and fully realize the value of their data.

Leave a Reply

Your email address will not be published. Required fields are marked *

Contact