TY - GEN
T1 - Towards collaborative data analysis with diverse crowds – a design science approach
AU - Feldman, Michael
AU - Anastasiu, Cristian
AU - Bernstein, Abraham
N1 - Publisher Copyright: © Springer International Publishing AG, part of Springer Nature 2018.
PY - 2018/1/1
Y1 - 2018/1/1
AB - Recent years have witnessed an increasing shortage of data experts capable of analyzing the omnipresent data and producing meaningful insights. Furthermore, some data scientists report that data preprocessing takes up to 80% of total project time. This paper proposes a method for collaborative data analysis that involves a crowd without data analysis expertise. Orchestrated by an expert, a team of novices conducts data analysis through iterative refinement of results until its successful completion. To evaluate the proposed method, we implemented a tool that supports collaborative data analysis for teams with mixed levels of expertise. Our evaluation demonstrates that, with proper guidance, data analysis tasks, especially preprocessing, can be distributed and successfully accomplished by non-experts. Using the design science approach, iterative development also revealed some important features for the collaboration tool, such as support for dynamic development, code deliberation, and a project journal. As such, we pave the way for building tools that can leverage the crowd to address the shortage of data analysts.
KW - Collaborative data analysis
KW - Crowdsourcing
KW - Design science
UR - https://www.scopus.com/pages/publications/85047956680
U2 - 10.1007/978-3-319-91800-6_15
DO - 10.1007/978-3-319-91800-6_15
M3 - Conference contribution
AN - SCOPUS:85047956680
SN - 9783319917993
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 218
EP - 235
BT - Designing for a Digital and Globalized World - 13th International Conference, DESRIST 2018, Proceedings
A2 - Chatterjee, Samir
A2 - Dutta, Kaushik
A2 - Sundarraj, Rangaraja P.
PB - Springer Verlag
T2 - 13th International Conference on Design Science Research in Information Systems and Technology, DESRIST 2018
Y2 - 3 June 2018 through 6 June 2018
ER -