A methodological critique of the PISA evaluations


This paper offers a methodological assessment of the international PISA evaluations, presenting a critical analysis of their deficiencies and limitations. We conduct a methodological review, or meta-evaluation, of the successive PISA reports in order to test the plausible validity of the inferences PISA puts forward, in light of a series of methodological limitations, namely: incoherent logic, opaque sampling, an unstable evaluation design, measurement instruments of questionable validity, opportunistic use of normalization-transformed scores, reverential confidence in statistical significance, the absence of substantively important statistics focused on effect magnitudes, a problematic presentation of findings, and questionable implications drawn from the results for educational practice and legislation. The burden falls on PISA to provide and demonstrate greater methodological rigour in future technical reports and, accordingly, to take care not to present unfounded inferences from its findings.
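The contrast drawn above between statistical significance and effect magnitude can be made concrete with a small numerical sketch. The example below is hypothetical and not taken from any PISA report: it simulates two countries' scores on a PISA-like scale (mean about 500, standard deviation about 100) whose true means differ by only 3 points, and shows that with large samples such a trivial difference yields a "significant" p-value while the standardized effect size (Cohen's d) remains negligible.

```python
import math
import random

random.seed(0)

# Hypothetical illustration: two simulated countries on a PISA-like
# scale (mean ~500, SD ~100), differing by a trivial 3 points.
n = 30000
country_a = [random.gauss(503, 100) for _ in range(n)]
country_b = [random.gauss(500, 100) for _ in range(n)]

def mean(xs):
    return sum(xs) / len(xs)

def sd(xs):
    m = mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))

ma, mb = mean(country_a), mean(country_b)
sa, sb = sd(country_a), sd(country_b)

# Two-sample z statistic and two-sided p-value (normal approximation):
# with n this large, even a tiny mean difference is "significant".
se = math.sqrt(sa ** 2 / n + sb ** 2 / n)
z = (ma - mb) / se
p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# Cohen's d: the standardized mean difference (an effect-size measure),
# which stays tiny regardless of how large the samples are.
pooled_sd = math.sqrt((sa ** 2 + sb ** 2) / 2)
d = (ma - mb) / pooled_sd

print(f"p-value = {p:.4f}")
print(f"Cohen's d = {d:.3f}")
```

The p-value shrinks with sample size, so "significant" differences between countries are almost guaranteed at PISA's sample sizes; the effect size does not, which is why the critique calls for reporting effect magnitudes alongside (or instead of) significance tests.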

Keywords

Evaluation; Meta-evaluation; Evaluation methodology; PISA.



