Athena

November 6, 2019

Leveraging Athena with KNIME in a Robust Manner, Part 2

In my previous blog posting, I introduced an issue we were having with seemingly random intermittent failures using Amazon Web Services’ Athena backed by a large number of data files in S3.   The issue was arising because S3 is eventually consistent and occasionally queries were being executed before their underlying data files were fully materialized in S3. Our solution was to introduce try/catch Knime nodes with a loop to retry failed queries a few times in case of intermittent failures.   To do this we had to do our own flow variable resolution in the Athena SQL queries since the standard […]
October 24, 2019

Leveraging Athena with KNIME in a Robust Manner, Part One

Recently, we began experiencing seemingly random intermittent failures in one of our modeling workflows used in VANE, an advanced predictive analytics project we are building for the US Army.  These failures were occurring with varying frequency inside of any one of several Database SQL Executor nodes.  These nodes were performing a large number of SQL queries against Amazon Web Services’ Athena backed by a large number of data files in S3.   The workflow was randomly failing in any of these nodes with a HIVE SPLIT error due to a missing partition file in S3.  When we investigated the failure, we […]
July 31, 2019

Serverless Analysis of data in Amazon S3 using Amazon Athena through KNIME.

This blog describes how to perform Serverless Analysis of data in Amazon S3 using Amazon Athena through KNIME. Let’s start with quick introduction about Amazon Athena. What is Athena? Athena is serverless query service for querying data in S3 using standard SQL, with no infrastructure to manage. It supports ANSI SQL queries with support for joins, JSON and window functions. How to connect Athena and execute SQL from KNIME? KNIME can interface with Athena using the following nodes. How to do Analysis of data in S3 using Athena trough KNIME? A traditional approach is to download the entire files from […]
Contact