FEEDBACK-DRIVEN AUTOMATION IN DATA PREPARATION: A SYSTEMATIC LITERATURE REVIEW

Authors

  • Sa’adatu Abdulkadir, Department of Computer Science, Nigerian Defence Academy, Kaduna
  • Philip O. Odion
  • Darius T. Chinyio
  • Isah R. Saidu
  • Muhammad A. Ahmad

Abstract

Data preparation remains a major bottleneck in analytics and machine learning workflows, consuming by some estimates up to 80% of total project effort for data scientists. Although recent systems automate portions of the preparation pipeline, most lack mechanisms to capture or reuse user feedback, which limits their ability to adapt to evolving data characteristics and domain-specific requirements. This study systematically reviews the state of research on feedback-driven automation in data preparation, with emphasis on how feedback mechanisms are designed, integrated, and evaluated in contemporary systems. A systematic literature review was conducted following PRISMA 2020 guidelines, targeting peer-reviewed publications from 2010 to 2025 indexed in IEEE Xplore, ACM Digital Library, SpringerLink, Semantic Scholar, and Google Scholar. Twenty-nine primary studies met the predefined eligibility criteria and were included in the analysis. The findings indicate that feedback is predominantly utilised at the schema-alignment, error-detection, and transformation stages, where explicit user input or implicit behavioural signals guide corrective processes. Feedback has been shown to improve data quality and reduce the need for repeated manual intervention; however, existing solutions are fragmented and do not support persisting or reusing feedback across future datasets or sessions. The review identifies a significant gap: the absence of unified architectures that treat feedback as a sustained knowledge asset. Addressing this gap presents opportunities for developing adaptive, feedback-driven data-preparation platforms that enhance transparency, reliability, and long-term automation benefits.

Published

2026-03-30

Section

ARTICLES