In 2006, Vincent Dupriez and Xavier Dumay published a paper in ‘Comparative Education’ called ‘Inequalities in school systems: effect of school structure or of society structure?’ (which you can download from here), where they attempted to measure the degree of educational inequality in a country and whether it was related to school structure rather than societal inequality. Their argument is logically coherent and, in theory, its conclusions are valid. In this brief paper I attempt to replicate some of their findings and, where possible, extend them.

As a first analysis, Dupriez and Dumay (2006) pose a question: is the effect of family background related to the degree of tracking in a country, or is it related to the degree of economic inequality? This question arises because tracking has often been found to be correlated with, and often causally related to (Hanushek and others 2006), the dispersion of test scores. Dupriez and Dumay (2006) hypothesize that background effects are primarily related to the school structure and not to the level of inequality in society. In fact, they suggest that if society were behind the variation in background effects, we should expect to find a fairly strong relationship between economic inequality and those effects. Probably because there aren’t a lot of European countries to concentrate on, they only look at this relationship within Europe, a region with little variation in terms of economic equality. The authors warn that this might be one reason why their results are not entirely valid and that one should be careful in making general remarks.

They find that the Gini index¹ is unrelated to family background. What do they use as family background? Building on the definitions of Shavit and Blossfeld (1993) and Erikson and Goldthorpe (1992), they define an equal country as one where a child’s performance is statistically independent of the background they come from. For this reason, their index is simply the R-squared of a regression of child performance on three variables: mother’s education (coded as 1 for post-secondary education and 0 for less), whether the same language is spoken at home and at school, and the International Socio-Economic Index (ISEI).
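To make the index concrete, here is a minimal Python sketch (not the R code used for this replication) of an R-squared computed from an ordinary least squares regression with an intercept and three background predictors. The function names `solve_linear_system` and `r_squared`, the data layout and the toy values are all assumptions made for illustration. The sketch also ignores PISA plausible values and survey weights, which a real analysis would have to handle.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared(y, predictors):
    """R-squared of an OLS regression of y on the predictors plus an intercept.

    predictors holds one tuple of predictor values per student.
    """
    rows = [[1.0] + list(p) for p in predictors]  # prepend the intercept
    k = len(rows[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical students: (mother has post-secondary education,
# same language at home and school, ISEI score). Not real PISA data.
background = [(1, 1, 70), (0, 1, 40), (1, 0, 55), (0, 0, 30),
              (1, 1, 65), (0, 1, 35), (0, 0, 25), (1, 0, 60)]
scores = [560, 480, 510, 430, 530, 470, 440, 520]

background_index = r_squared(scores, background)
print(round(background_index, 3))
```

The closer the value is to 0, the more independent performance is of family background under this definition. In practice one such value would be computed per country and per PISA wave.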

Economic inequality

Their first results are below and show that the two variables are independent of each other.

The Y axis is the percent of variance explained by the model described above, whereas the X axis is the Gini coefficient. In this paper I try to replicate their model specification as closely as possible. I run a model with child’s performance as the dependent variable and three predictors: mother’s education coded as 1 for ISCED 5B or higher (which is the beginning of tertiary education, see here for a formal definition of each category), whether the mother is an immigrant or not, and the ISEI index. As one can see, the model specification is slightly different because the mother’s immigrant status replaces the home/school language variable. I chose this specification because the language variable was not easy to code in the newer PISA surveys (I couldn’t find it). However, I presume that the language variable and the mother’s immigrant status capture similar things, so the results should be fairly similar. Below I plot the same relationship for all European countries.

Dupriez and Dumay (2006) plotted the relationship only for literacy, for which I get a slightly positive slope rather than their flat relationship. In fact, the correlation is -0.05 for Mathematics and 0.11 for Literacy. Neither correlation is statistically significant, which is consistent with Dupriez and Dumay’s hypothesis. This replication suggests that the original relationship was estimated correctly and is robust even to a slightly different model specification. One additional contribution of this replication is to extend the same analysis to all other PISA surveys and to include several other economic inequality indicators. When Dupriez and Dumay (2006) carried out their analysis, only PISA 2000 and 2003 were available. Next, I replicate the same analysis for all available PISA surveys to check for changes over time.
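For readers who want to check the significance claim, a country-level Pearson correlation and its t statistic can be sketched in Python as follows. The helper names `pearson_r` and `t_statistic` are illustrative, and the sample size of 25 countries is a hypothetical value chosen for the example, since the exact number of countries is not restated here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for H0: correlation = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


# Literacy correlation reported in the text, with a HYPOTHETICAL n of 25.
t_literacy = t_statistic(0.11, 25)
print(round(t_literacy, 2))
```

With 23 degrees of freedom the two-sided 5% critical value is roughly 2.07, so under this assumed sample size a correlation of 0.11 is far from significant.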

Using OECD data, I searched for all available years for each country. Remember that the PISA waves range from 2000 to 2015, one every three years. Because not all indicators have data for all PISA years, I always choose the closest year available. For example, if we had data for 2001 but not for 2000, I imputed the year 2000 with the 2001 value. This gives us a time series not only for the Gini index but for all other economic inequality indicators. Below you can find the plots.
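The closest-year rule described above can be sketched in Python like this. The function names, the tie-breaking rule (prefer the earlier year when two years are equally close) and the example values are assumptions for illustration. The text does not say how ties were handled.

```python
PISA_YEARS = [2000, 2003, 2006, 2009, 2012, 2015]


def closest_available(values_by_year, target_year):
    """Return (year, value) for the observation closest to target_year.

    Ties go to the earlier year, which is an arbitrary choice for this sketch.
    """
    if not values_by_year:
        raise ValueError("no observations available")
    year = min(values_by_year, key=lambda y: (abs(y - target_year), y))
    return year, values_by_year[year]


def align_to_pisa(values_by_year):
    """Map every PISA wave to the value of the closest available year."""
    return {wave: closest_available(values_by_year, wave)[1]
            for wave in PISA_YEARS}


# Hypothetical Gini series for one country (not real OECD figures).
gini = {2001: 0.30, 2004: 0.31, 2010: 0.29}
aligned = align_to_pisa(gini)
print(aligned)
```

Here the 2000 wave takes the 2001 value, as in the example in the text. Note that a sparse series can assign the same observation to several waves, which flattens any apparent trend.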

Each row represents an economic inequality indicator. Below you can find a small description of each one.

Indicator   Description
GINI        Gini (disposable income, post taxes and transfers)
GINIB       Gini (market income, before taxes and transfers)
GINIG       Gini (gross income, before taxes)
PALMA       Palma ratio
P90P10      P90/P10 disposable income decile ratio
P90P50      P90/P50 disposable income decile ratio
P50P10      P50/P10 disposable income decile ratio
S80S20      S80/S20 disposable income quintile share
S90S10      S90/S10 disposable income decile share
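To give a feel for what some of these indicators measure, the Python sketch below computes a Gini coefficient, a Palma ratio (income share of the top 10% divided by that of the bottom 40%) and an S80/S20 ratio from a list of incomes. The function names and the toy incomes are assumptions. The published OECD figures are based on equivalised household income and survey weights, which this sketch leaves out.

```python
def gini(incomes):
    """Gini coefficient of a list of non-negative incomes (unweighted)."""
    s = sorted(incomes)
    n = len(s)
    weighted_sum = sum(rank * value for rank, value in enumerate(s, start=1))
    return 2 * weighted_sum / (n * sum(s)) - (n + 1) / n


def share(incomes, fraction, top):
    """Income share of the top or bottom `fraction` of the population.

    The group size is rounded to a whole number of people (a simplification).
    """
    s = sorted(incomes)
    k = max(1, round(len(s) * fraction))
    group = s[-k:] if top else s[:k]
    return sum(group) / sum(s)


def palma_ratio(incomes):
    """Share of the top 10% divided by the share of the bottom 40%."""
    return share(incomes, 0.10, top=True) / share(incomes, 0.40, top=False)


def s80_s20(incomes):
    """Share of the top 20% divided by the share of the bottom 20%."""
    return share(incomes, 0.20, top=True) / share(incomes, 0.20, top=False)


# Ten hypothetical incomes, not real data.
incomes = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
print(round(gini(incomes), 3), round(palma_ratio(incomes), 3),
      round(s80_s20(incomes), 3))
```

The percentile ratios (P90/P10 and so on) are left out because their value depends on the percentile interpolation method chosen.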

As we can see, the relationship is flat, not only for the Gini index but for all other indicators, across all years. Below is another set of indicators, which resembles the graph above as well².

Up next I plot the correlations for all years and all indicators.

The correlations are predominantly negative, the opposite of the sign one would expect, but they are not statistically significant. The correlations go up and down from wave to wave, but no discernible pattern emerges. Overall, the relationship does not appear to be strong.

Tracking

As a second step, they hypothesize that the relationship should be much stronger when using a summary measure of the level of differentiation of a country’s educational system. That is, countries with highly tracked educational systems should have strong family background effects. They find that this relationship is strong, with a correlation of about 0.50.

They measure differentiation by using an index developed by Duru-Bellat, Mons, and Suchaut (2004). This index is a factor score constructed on the basis of three variables: the age at which different tracks, or separate educational pathways, are first introduced in each school system; the percentage of pupils who have fallen behind their age group at age 15; and an index of academic segregation between schools at age 15. This index is indeed strongly correlated with family background effects.

Because the degree of tracking and differentiation of a country hardly changes over time, unless a top-to-bottom reform is implemented (Jakubowski et al. 2010), I trust that using the same indicator over time is a rough approximation of what happens in reality. In my replication, I will use a slightly different tracking indicator, namely the set of indicators from Bol and Van de Werfhorst (2013). Acknowledging that most authors use different representations of tracking, Bol and Van de Werfhorst (2013) created a comprehensive set of indicators for tracking and vocational enrollment that portray very nicely the level of fragmentation of an educational system. The data can be accessed here.

Up next I plot the new tracking indicator and the importance of family background.

Just like Dupriez and Dumay (2006), I find a strong correlation between the two. In fact, both correlations are close to 0.5, just as in their original paper. Over time the correlation becomes weaker, but it is still substantial. Next I plot the evolution of this correlation for Mathematics.

One can see a downward trend, from a high of 0.7 in 2003 to the current correlation coefficient of 0.5. All results at this point suggest that the results of Dupriez and Dumay (2006) were not only correct, but also hold over time and across a wide variety of indicators.

Before tracking

Dupriez and Dumay (2006) acknowledge that if their hypothesis is true, then early-tracking countries should have greater difficulty reducing inequalities relative to late-tracking countries. For that analysis they created a table that shows their inequality indicator estimated from PIRLS and PISA data. They then subtract one from the other and check which countries had bigger reductions, relative to their degree of tracking.

I replicated their analysis for all available PIRLS years (2001, 2006 and 2011) and compared them to their closest PISA equivalents (2000, 2006 and 2012). However, I had to make some compromises because not all variables were present in all data sets. For example, instead of the PISA socio-economic index, I had to use an index of home possessions for 2001, an index of home educational resources for 2006 and parents’ highest occupational status in 2011. I acknowledge that this substantially alters the analysis, but I was able to find neither a similar PISA index in PIRLS nor a consistent index available in all waves. All the analyses carried out for PIRLS are for literacy, just as in Dupriez and Dumay (2006).

Below I reproduce their comparison in a slightly different form.

In this plot I coloured inequality reductions blue and increases red. Blue means that inequality was higher in PIRLS (ages 9-10) than in PISA (age 15), whereas red squares indicate that inequality increased (PIRLS inequality is lower than PISA inequality). In this analysis, the results of Dupriez and Dumay (2006) do not replicate. For example, they find that Sweden, France, the Netherlands and the Czech Republic had reductions of inequality, whereas I find that they had increases. Moreover, I don’t find any strong distinctions between tracking ages (shown next to the label) and the level of inequality. Even more discouraging is the fact that no clear pattern emerges for any country over time: a country is just as likely to be red as blue across the three waves.
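The colour coding used in the plot amounts to taking the sign of the difference between the PISA and the PIRLS inequality index. A minimal Python sketch follows. The function name `classify_change`, the country labels and the R-squared values are hypothetical and are not taken from the analysis.

```python
def classify_change(pirls_index, pisa_index):
    """Label the change in the background-effect index from PIRLS to PISA.

    A negative difference means inequality fell between ages 9-10 and 15
    (blue in the plot); a positive difference means it rose (red).
    """
    difference = pisa_index - pirls_index
    if difference < 0:
        return "reduction (blue)"
    if difference > 0:
        return "increase (red)"
    return "no change"


# Hypothetical (PIRLS, PISA) R-squared values for two made-up countries.
examples = {"Country A": (0.15, 0.11), "Country B": (0.10, 0.16)}
labels = {name: classify_change(pirls, pisa)
          for name, (pirls, pisa) in examples.items()}
print(labels)
```

Because the classification only looks at the sign, very small differences are labelled the same way as large ones, which is worth keeping in mind when reading the plot.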

This paper is not yet a rigorous replication of Dupriez and Dumay (2006). I have not spoken to the authors to identify in detail how they coded each variable (although they specified this in their paper) nor which specific socio-economic indicator they used for the PIRLS analysis. However, I’ve tried to choose the most appropriate replication scheme. Although some results replicate very nicely, others don’t, even though I tried to stay as close as possible to the original analysis. The R code to replicate this analysis as-is is embedded in the Rmarkdown document, so anyone can replicate my analysis and improve it further.

Any comments can be addressed to cimentadaj@gmail.com or contributed directly to the Github repository of this replication here

Bibliography

Bol, Thijs, and Herman G Van de Werfhorst. 2013. “The Measurement of Tracking, Vocational Orientation, and Standardization of Educational Systems: A Comparative Approach.” Gini Discussion Paper 81.

Dupriez, Vincent, and Xavier Dumay. 2006. “Inequalities in School Systems: Effect of School Structure or of Society Structure?” Comparative Education 42 (02). Taylor & Francis: 243–60.

Duru-Bellat, Marie, Nathalie Mons, and Bruno Suchaut. 2004. “Caractéristiques des Systèmes éducatifs et Compétences des Jeunes à 15 Ans.” Les Cahiers de l’IREDU 66: 1–158.

Erikson, Robert, and John H Goldthorpe. 1992. The Constant Flux: A Study of Class Mobility in Industrial Societies. Oxford University Press, USA.

Hanushek, Eric A, and others. 2006. “Does Educational Tracking Affect Performance and Inequality? Differences-in-Differences Evidence Across Countries.” The Economic Journal 116 (510). Wiley Online Library: C63–C76.

Jakubowski, Maciej, Harry A Patrinos, Emilio Ernesto Porta, and Jerzy Wisniewski. 2010. “The Impact of the 1999 Education Reform in Poland.”

Shavit, Yossi, and Hans-Peter Blossfeld. 1993. Persistent Inequality: Changing Educational Attainment in Thirteen Countries. Social Inequality Series. ERIC.


  1. Taken from the World Development Report, United Nations, 2003

  2. I excluded Russia from all of these scatterplots because it was a strong outlier