Scalable ETL Frameworks for High Volume Transactional Systems in Distributed Data Warehouses

Authors

  • Srikanth Reddy Keshireddy, Harsha Vardhan Reddy Kavuluri

Keywords:

distributed ETL, scalability, transactional systems

Abstract

This article evaluates scalable ETL frameworks designed for high-volume transactional systems
operating within distributed data warehouses, emphasizing how parallel extraction, MPP-based transformation,
and multi-writer loading significantly enhance throughput, reduce latency, and strengthen fault tolerance.
Experimental results demonstrate that fully distributed ETL architectures outperform both monolithic and
partially distributed strategies by maintaining stable performance under fluctuating workloads, balancing
resource utilization across cluster nodes, and recovering rapidly from node-level failures. The findings
highlight distributed ETL as a critical enabler for real-time analytics, cloud-native data ecosystems, and
enterprise-scale digital operations, providing a resilient and future-ready foundation for continuous data
integration.
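
The abstract names three stages: parallel extraction, partition-wise transformation, and multi-writer loading. As a rough illustration only, the following minimal Python sketch wires those stages together on a single machine using a thread pool. It is not the authors' framework. The in-memory `SOURCE` list, the function names, the hash partitioning scheme, and the per-writer dictionary shards are all assumptions introduced here for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "source": transaction rows keyed by id.
SOURCE = [{"id": i, "amount_cents": 100 * i} for i in range(1, 101)]


def partition(rows, n):
    """Split rows into n partitions by id modulo n (simple hash partitioning)."""
    parts = [[] for _ in range(n)]
    for row in rows:
        parts[row["id"] % n].append(row)
    return parts


def extract(part):
    """Stand-in for reading one partition from the source system."""
    return list(part)


def transform(rows):
    """Per-partition transformation: convert integer cents to a decimal amount."""
    return [{"id": r["id"], "amount": r["amount_cents"] / 100} for r in rows]


def load(rows):
    """Stand-in for one writer; each writer builds its own target shard."""
    return {r["id"]: r["amount"] for r in rows}


def run_pipeline(rows, workers=4):
    """Run extract, transform, and load for each partition in parallel."""
    parts = partition(rows, workers)

    def etl(part):
        return load(transform(extract(part)))

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves partition order, so shards[i] holds partition i.
        return list(pool.map(etl, parts))


shards = run_pipeline(SOURCE)
```

Because each partition is handled by its own worker and written to its own shard, no writer contends with another for the same target. A real distributed deployment would replace the thread pool with cluster nodes and would add the retry and recovery logic that this sketch omits.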

Published

2021-08-24

Section

Articles