On Big Data: How should we make sense of them?


Big Data is today extensively discussed, and not only on technical grounds. One reason is that Big Data are frequently presented as enabling an epistemological paradigm shift in scientific research, one that could supersede the traditional hypothesis-driven method. In this piece, I critically scrutinize two key claims usually associated with this approach: that data speak for themselves, which deflates the role of theories and models, and that correlation takes primacy over causation. My intention is both to acknowledge the value of Big Data analytics as an innovative heuristic and to provide a balanced account of what can and cannot be expected from it.


Big Data; data-driven science; epistemology; end of theory; causality; opacity of algorithms







Texts in the journal are, unless otherwise indicated, published under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.