Assessment capacity, cultural validity, and consequential validity in PISA


International student assessments have played an increasingly important role in educational policy. These test-based international comparisons generate valuable information about the performance of students in each participating country and the social and contextual factors associated with that performance. A complex picture of the cultural, economic, and social factors that shape participation in PISA is beginning to emerge. Our aim is to understand the relationship between national assessment capacity and the ways in which countries participate in these international comparisons. We propose a conceptual framework for examining assessment capacity as key to addressing two aspects of validity: cultural and consequential. We also discuss the multiple facets of assessment capacity as conditions for addressing cultural validity and consequential validity in international comparisons.

Keywords

PISA; assessment capacity; cultural validity; consequential validity.





Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 license.