Cleanix: A parallel big data cleaning system

H Wang, M Li, Y Bu, J Li, H Gao, J Zhang - ACM SIGMOD Record, 2016 - dl.acm.org
… background nor understand the semantics of a specific data cleaning … existing systems and
support data cleaning at a large scale, we design and implement a new system called Cleanix

A review on data cleansing methods for big data

F Ridzuan, WMNW Zainon - Procedia Computer Science, 2019 - Elsevier
… this paper will identify the data cleansing challenge in big data. … Cleanix, a parallel big
data cleansing system, aims to solve the issue related to the volume and variety of big data. Four …

Data cleansing mechanisms and approaches for big data analytics: a systematic study

M Hosseinzadeh, E Azhir, OH Ahmed… - Journal of Ambient …, 2023 - Springer
… (2016) presented a parallel big data cleansing system, named Cleanix, to handle mixed
errors. Cleanix supports four types of data cleansing tasks, namely detection and correction of …

Cleanix: A big data cleaning parfait

H Wang, M Li, Y Bu, J Li, H Gao, J Zhang - Proceedings of the 23rd ACM …, 2014 - dl.acm.org
systems and support data cleaning at a very large scale, we design and implement a new
system called Cleanix. We … purpose data parallel execution engine, with our user-defined data

Investigating Data Repair steps for EHR Big Data

S Juddoo - 2022 3rd International Conference on Next …, 2022 - ieeexplore.ieee.org
… claiming at undertaking data cleansing for Big Data, and therefore … ‘Cleanix’ is developed
with the ‘Hyracks’ execution engine, which is a data-parallel execution engine for Big Data

A five-layer architecture for big data processing and analytics

JY Zhu, B Tang, VOK Li - International Journal of Big Data …, 2019 - inderscienceonline.com
… development of big data solutions is progressing fast, we have updated the case studies of
big data solutions to … Generally, the data processing layer performs parallel computing, data

Data preparation as a service based on Apache Spark

N Mahasivam, N Nikolov, D Sukhobok… - Service-Oriented and …, 2017 - Springer
… , and parallel processing of very large volumes of data. In [18]… and transformation tools for
big data. In the following we discuss … compared to Cleanix since the data cleaning workflow is …

Application of attribute correlation in unsupervised data cleaning

P Li, C Dai, W Wang - Proceedings of the 5th International Conference …, 2019 - dl.acm.org
… correlation among attributes using the machine learning with huge data scales, but also
greater … on how to reduce the complexity of the algorithm and improve the efficiency in big data. …

When Considering More Elements: Attribute Correlation in Unsupervised Data Cleaning under Blocking

P Li, C Dai, W Wang - Symmetry, 2019 - mdpi.com
… -intensive enterprises and may cause huge economic losses… data cleaning to solve the data
quality problems accompanying big data and clean “dirty data” in datasets. How to solve data

The role of big data analytics in industrial Internet of Things

MH ur Rehman, I Yaqoob, K Salah, M Imran… - … Computer Systems, 2019 - Elsevier
… An opportunity exists to develop an end-to-end industrial analytics pipeline that can handle
big data from various data sources in parallel and find highly correlated knowledge patterns …