Poor data quality and poor analytics drive down business value. In fact, Gartner has estimated that poor data quality costs organizations an average of $9.7 million per year. But bad data is much more than a cost center. By eroding trust in information, analytics and the business decisions based on them, it is a serious impediment to digital transformation.
Extract, transform and load (ETL) tools like AWS Glue bring much-needed functionality. AWS Glue enables new approaches to pulling, processing and pushing data from source to target, and introduces concepts such as performing data transformation tasks with Spark SQL scripts in an Apache Spark environment. However, AWS Glue has shortcomings, which lead to a number of challenges and questions: