In 2016, Forbes assessed that “data preparation accounts for about 80% of the work of data scientists,” where preparation includes finding and collecting data, cleaning and integrating data, and managing data for data analysis. Forbes also concluded that this is the least enjoyable part of a data scientist’s job. As scientists, they would rather be deriving new knowledge and insights. The paradox is that without principled data management and preparation, those new insights are suspect at best. Data preparation, integration, curation, and management in support of analysis are so time-consuming and unenjoyable because of a lack of tools, scientific frameworks, and mathematical foundations to support principled data preparation. My research is helping to correct this deficit. As part of my methodology, I often use open data, both because of its availability for scientific research and because of its importance to governments and society.

Renée J. Miller is a Fellow of the Royal Society of Canada, Canada’s National Academy of Science, Engineering and the Humanities. She received the US Presidential Early Career Award for Scientists and Engineers (PECASE), the highest honor bestowed by the United States government on outstanding scientists and engineers beginning their careers. She received an NSF CAREER Award, the Ontario Premier’s Research Excellence Award, and an IBM Faculty Award. She has been named the Bell Canada Chair of Information Systems and a Fellow of the ACM. Her work has focused on the long-standing open problem of data integration and has achieved the goal of building practical data integration systems. She and her co-authors (Fagin, Kolaitis and Popa) received the (10 Year) ICDT Test-of-Time Award for their influential 2003 paper establishing the foundations of data exchange. Professor Miller has been a leader among her peers in Canada and abroad; she has led the NSERC Business Intelligence Strategic Network and the non-profit International Very Large Data Base Foundation. She received her PhD in Computer Science from the University of Wisconsin, Madison and bachelor’s degrees in Mathematics and Cognitive Science from MIT.


