On Big Data: How should we make sense of them?

Fulvio Mazzocchi


The topic of Big Data is today extensively discussed, not only on the technical ground. This also depends on the fact that Big Data are frequently presented as allowing an epistemological paradigm shift in scientific research, which would be able to supersede the traditional hypothesis-driven method. In this piece, I critically scrutinize two key claims that are usually associated with this approach, namely, the fact that data speak for themselves, deflating the role of theories and models, and the primacy of correlation over causation. My intention is both to acknowledge the value of Big Data analytics as innovative heuristics and to provide a balanced account of what could be expected and what not from it.


Big Data; data-driven science; epistemology; end of theory; causality; opacity of algorithm

Full Text: PDF

DOI: https://doi.org/10.7203/metode.11.15258


Anderson, C. (2008, June 23). The end of theory: The data deluge makes the scientific method obsolete. Wiredhttps://www.wired.com/2008/06/pb-theory/

Bollier, D. (2010). The promise and peril of big data. The Aspen Institute.

Bowker, G. (2014). The theory/data thing. Commentary. International Journal of Communication, 8(2043), 1795–1799.

Boyd, D., & Crawford, K. (2012). Critical questions for big data. Information, Communication and Society, 15(5), 662–679. http://doi.org/10.1080/1369118X.2012.678878

Burrell, J. (2016). How the machine ‘thinks’: Understanding opacity in machine learning algorithms. Big Data & Society, 3(1), 1–12. http://doi.org/10.1177/2053951715622512

Calude, C. S., & Longo, G. (2017). The deluge of spurious correlations in big data. Foundations of Science, 22(3), 595–612. http://doi.org/10.1007/s10699-016-9489-4

Canali, S. (2016). Big data, epistemology and causality: Knowledge in and knowledge out in EXPOsOMICS. Big Data & Society, 3(2), 1–11. http://doi.org/10.1177/2053951716669530

Chadeau-Hyam, M., Athersuch, T. J., Keun, H. C., De Iorio, M., Ebbels, T. M. D., Jenab, M., Sacerdote, C., Bruce, S. J., Holmes, E., & Vineis, P. (2010). Meeting-in-the-middle using metabolic profiling – A strategy for the identification of intermediate biomarkers in cohort studies. Biomarkers, 16(1), 83–88. https://doi.org/10.3109/1354750X.2010.533285

Chandler, D. (2015). A world without causation: Big data and the coming age of posthumanism. Millennium: Journal of International Studies, 43(3), 833–851. http://doi.org/10.1177/0305829815576817

Diakopoulos, N. (2015). Algorithmic accountability: Journalistic investigation of computational power structures. Digital Journalism, 3(3), 398–415. http://doi.org/10.1080/21670811.2014.976411

Giere, R. (2006). Scientific perspectivism. University of Chicago Press.

Gitelman, L. (Ed.). (2013). ‘Raw data’ is an oxymoron. The MIT Press.

Golub, T. (2010). Counterpoint: Data first. Nature, 464(7289), 679. http://doi.org/10.1038/464679a

Hales, D. (2013, February 1). Lies, damned lies and big data. https://aidontheedge.wordpress.com/2013/02/01/lies-damned-lies-and-big-data/

Humphreys, P. (2009). The philosophical novelty of computer simulation methods. Synthese, 169(3), 615–626. http://doi.org/10.1007/s11229-008-9435-2

Kitchin, R. (2014). Big data, new epistemologies and paradigm shifts. Big Data & Society, 1(1), 1–12. http://doi.org/10.1177/2053951714528481

Knobel, C. (2010). Ontic occlusion and exposure in sociotechnical systems (Doctoral dissertation). University of Michigan, USA.

Leonelli, S. (2015). What counts as scientific data? A relational framework. Philosophy of Science, 82(5), 810–821. http://doi.org/10.1086/684083

Mazzocchi, F. (2015). Could big data be the end of theory in science? A few remarks on the epistemology of data-driven science. EMBO Reports, 16(10), 1250–1255. http://doi.org/10.15252/embr.201541001

Popper, K. R. (1959). The logic of scientific discovery. Hutchinson.


  • There are currently no refbacks.