Friday, June 28, 2019

IEA IRC 2019: Day 3

The final day of the IEA IRC 2019 started with Anna Rosling Rönnlund's talk "For a fact-based worldview". Of course, I had been looking forward to this; her book "Factfulness" (with Hans Rosling and Ola Rosling) is a wonderful eye-opener. She started by telling how the three of them have spent many years trying to make numbers more understandable and easily available. Then she went on to our results on the "test" (from Factfulness). Actually, 43 % of the conference participants did better than chimpanzees on these questions, while in representative surveys, 10 % do better than chimps. On average, we got 4.2 points out of 12, which is 0.2 points better than an average chimp. (In representative surveys, the average is 2. Personally, I think having read Factfulness helps a lot...)
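
The chimp baseline can be sketched in a few lines. This is only an illustration, and it rests on an assumption not stated above: that each of the 12 questions has three answer options, so that random guessing gives an expected score of 4 (which matches the "average chimp" figure of 4.0 implied by the numbers above). Reading "better than chimps" as "5 or more correct" is also my assumption.

```python
from math import comb

N_QUESTIONS = 12   # number of questions, from the text above
P_CORRECT = 1 / 3  # ASSUMPTION: three answer options per question


def prob_exactly(k, n=N_QUESTIONS, p=P_CORRECT):
    """Binomial probability of exactly k correct answers by pure guessing."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Expected score of a random guesser (the "average chimp").
expected_score = N_QUESTIONS * P_CORRECT

# Probability that a random guesser scores 5 or more,
# i.e. strictly above the expected score (ASSUMED reading of "better").
prob_above_chance = sum(prob_exactly(k) for k in range(5, N_QUESTIONS + 1))

print(f"Expected score by guessing: {expected_score:.1f}")
print(f"Chance of 5 or more correct by guessing: {prob_above_chance:.3f}")
```

The point of the sketch is that even pure guessing lands above the expected score fairly often, so the 10 % figure from representative surveys is well below what guessing alone would give.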

People tend to answer systematically wrong - they think the situation is worse than it is. (Moreover, if I remember the book correctly, people tend to answer "correctly" based on the situation fifty years ago. People should not learn about the world by heart and then think that they don't have to update their worldview.) Looking out of the window, we don't see the slow global trends. In newspapers, we see diseases and catastrophes, the extraordinary events.

She spent some time on Dollar Street - where differences in living standards are illustrated by lots of photos from 350 homes around the world (so far). She illustrated how families in different countries are amazingly similar when they live on the same income level, while the diversity within each country is amazing. For instance, as the income doubles from 2 to 4 dollars a day, people tend to prioritize the same things regardless of continent: stove instead of fire, toilet instead of no toilet, mattress instead of sleeping on the floor and so on.

Towards the end, she gave a few of the rules of thumb for critical thinking that are also in the book. I highly recommend reading the book...

Finally, she showed some animations using TIMSS data. They were interesting, but as hung up as we are on confidence intervals, it would be good to have a way of marking them, to avoid making a point of differences that are not actually differences or trends that are not trends. But she stressed that the tool should be seen as a hypothesis-generating tool, to notice things that seem interesting. Of course, we need to include the statistical models to investigate the hypotheses.
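
To make the confidence interval point concrete, here is a minimal sketch of checking whether a difference between two means is distinguishable from zero. The numbers are hypothetical and not taken from TIMSS, and the sketch assumes independent samples and a normal approximation; real TIMSS analyses also have to handle plausible values and the sampling design, which this ignores.

```python
from math import sqrt


def difference_with_ci(mean_a, se_a, mean_b, se_b, z=1.96):
    """Difference of two independent means with an approximate 95 % CI."""
    diff = mean_a - mean_b
    se_diff = sqrt(se_a**2 + se_b**2)  # standard error of the difference
    return diff, diff - z * se_diff, diff + z * se_diff


def is_distinguishable(mean_a, se_a, mean_b, se_b, z=1.96):
    """True if the confidence interval of the difference excludes zero."""
    _, lower, upper = difference_with_ci(mean_a, se_a, mean_b, se_b, z)
    return lower > 0 or upper < 0


# HYPOTHETICAL example: two country means five points apart.
diff, lower, upper = difference_with_ci(505.0, 3.0, 500.0, 3.5)
print(f"Difference: {diff:.1f}, 95 % CI: [{lower:.1f}, {upper:.1f}]")
print("Distinguishable from zero:", is_distinguishable(505.0, 3.0, 500.0, 3.5))
```

With these made-up standard errors, a five-point gap that looks clear in an animation is not distinguishable from no difference at all.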

After this, the penultimate session I went to was "TIMSS, PIRLS and ICILS: Utilizing in-depth analysis of large-scale assessment data to improve teaching". The first talk here was Franck Salles: "Clarifying TIMSS Advanced mathematics 2015 results: A didactical approach through levels of mathematics knowledge operation". The ministry of education in France does detailed analyses of task performance to inform inspectors and teacher educators. In TIMSS Advanced, there was a 1 SD drop in French students' performance from 1995 to 2015. There had been large maths curriculum changes in that period. Of the three cognitive domains, France does relatively badly in "applying". He showed how two different tasks concerning "applying" were very different, illustrating the need for a better math task analysis model. One part of the model he proposed is the distinction between tasks assessing mathematical knowledge as an object and tasks assessing it as a tool. As an object, there are items asking for a computation or asking students to show their understanding of a concept. As a tool, there are items asking for a direct application of knowledge, items asking for application with adaptation, and items asking for application "with intermediary" - students have to add something not in the task already. This classification was done by a national expert panel. He showed how the classification of a task of course has to depend on the prior knowledge of students. The classification therefore had to be based on the national curriculum, making it a national classification which would have been different in other countries.

Secondly in that session, I heard Jeppe Bundsgaard talk on "Differential item functioning as a pedagogical tool". He used ICILS 2013 (where Denmark didn't quite meet the sampling criteria), and the Rasch model. Differential Item Functioning (DIF) refers, of course, to the phenomenon that an item works differently for two different groups of students. Bundsgaard wanted to see if grouping items with DIF can identify challenging areas in the study. He studied DIFs for the countries together and the DIFs of Norway, Denmark and Germany separately. The short conclusion is that Norwegian and Danish students are better at items related to computer literacy, but worse at items related to information literacy.
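
For readers who have not met these terms, the idea can be sketched as follows. In the Rasch model, the probability of a correct answer depends only on the difference between the student's ability and the item's difficulty. An item shows DIF when its difficulty, estimated separately in two groups, differs. The numbers below are hypothetical, and the sketch is not a reproduction of Bundsgaard's actual analysis.

```python
from math import exp


def rasch_probability(ability, difficulty):
    """Rasch model: probability of a correct response (both on a logit scale)."""
    return 1.0 / (1.0 + exp(-(ability - difficulty)))


def dif_contrast(difficulty_group_a, difficulty_group_b):
    """Difference in item difficulty between two groups, in logits."""
    return difficulty_group_a - difficulty_group_b


# HYPOTHETICAL item: same student ability, but the item is
# half a logit harder for group B than for group A.
ability = 0.0
difficulty_a, difficulty_b = 0.0, 0.5

print("P(correct), group A:", round(rasch_probability(ability, difficulty_a), 3))
print("P(correct), group B:", round(rasch_probability(ability, difficulty_b), 3))
print("DIF contrast (A - B):", dif_contrast(difficulty_a, difficulty_b))
```

Two students with the same overall ability then have different chances of solving the item depending on which group they belong to, which is what makes such items interesting to look at more closely.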

Thirdly, Olesya Gladushyna and Rolf Strietholt talked on "Nerds or polymaths? Performance profiles at the end of primary education". Latent profile analysis (LPA) was used to try to see whether students (in 4th grade) have qualitatively distinct profiles (even though not differing quantitatively). (I'm not sure whether what I'm writing here makes sense.) They did find different models with different profiles; for instance, the three-profile model included one profile where students are better in math than in reading and science, one profile where students are worse in math than in reading and science, and one profile where students are equally good in all three. (The designation of those who are better in math as "nerds" is troublesome, as another result was that children with a home language other than the test language tend to belong to that group, which probably means that they are not so good in reading and science rather than that they are outstanding in math.)

Finally in this session, Nani Teig and others had the title "I know I can, but do I have the time? The role of teachers' self-efficacy and perceived time constraints in implementing cognitive-activation strategies". She used a framework for instructional quality with three basic dimensions: classroom management, supportive climate and cognitive activation. The focus here was cognitive activation strategies (CAS). These can be divided into general CAS ("asking challenging questions") and CAS specific to science - inquiry-based CAS, that is, learning how to do science. We know that teachers lack confidence in enacting CAS, and that CAS can be demanding and time-consuming. This study focused on the interplay between teacher self-efficacy and teachers' perceived time constraints, using Norwegian TIMSS 2015 data. They found that general and inquiry-based CAS are distinct but correlated constructs. They found a significant correlation between self-efficacy and both kinds of CAS, but a significant correlation between perceived time constraints and inquiry-based CAS only (which makes sense, I guess). The latter is significant only in grade 9 when analysed for each grade (at that point, of course, the number of students got smaller).

The final session (the closing ceremony excluded) was on "Socioeconomic background and student achievement: TIMSS and PIRLS". Here, there were three talks, the first of which was Rune Muller Kristensen: "Deconstruction of the negative social heritage? A search for variables confounding the simple relation between socioeconomic status and student achievement". They used Danish TIMSS data from 2015. ESCS (Economic, Social and Cultural Status) cancelled out other effects in that study, including school and class size. The point of this project was to understand these relationships better. However, no matter how many relevant variables were thrown into the model, not much of the relationship between ESCS and performance was explained. (The discussant at the end asked whether the variation in ESCS and in the potential confounding variables is too small in Denmark, and whether different results could be found in other countries with more variance.)

Secondly, Andrés Strello and others had the title "Effects of early tracking on performance and inequalities in achievement: Combined evidence from PIRLS, TIMSS and PISA". They studied all available cycles of the three studies and 75 countries, sorted according to when (or if) they started tracking, looking at dispersion, social inequality and performance level. They did 45 pairs of comparisons, and then used a meta-analytical approach (Card 2011). The meta-analysis showed that tracking has a significant effect on inequality as dispersion, on social inequality and on performance level. (It seemed, however, that tracking has a negative effect on reading performance as measured in PISA.) Also, the earlier tracking takes place, the larger the effects. (In the discussion afterwards, and in the presentation itself, it was pointed out that tracking is a complex phenomenon with different implementations between and even within countries. Still, that perhaps makes it more surprising that significant results were found.)
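
To give an idea of what combining many pairwise comparisons can look like, here is a minimal sketch of inverse-variance (fixed-effect) pooling, one common way of combining effect estimates. I did not note which model the authors actually used, so this is a generic illustration and not their method; the effect sizes and standard errors below are hypothetical.

```python
from math import sqrt


def pool_fixed_effect(effects, standard_errors):
    """Inverse-variance weighted mean of effect estimates and its standard error."""
    weights = [1.0 / se**2 for se in standard_errors]  # precise estimates count more
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_weight
    pooled_se = sqrt(1.0 / total_weight)
    return pooled, pooled_se


# HYPOTHETICAL effects from three pairs of comparisons.
effects = [0.10, 0.30, 0.20]
standard_errors = [0.10, 0.10, 0.10]

pooled, pooled_se = pool_fixed_effect(effects, standard_errors)
print(f"Pooled effect: {pooled:.3f} (SE {pooled_se:.3f})")
```

The pooled standard error is smaller than any of the individual ones, which is how a meta-analysis can find a significant overall effect even when the single comparisons are noisy.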

Finally, the last talk of the conference was Vasiliki Pitsia and others: "High achievement in mathematics and science: A multilevel analysis of TIMSS 2015 data for Ireland" (using 4th grade data). They divided the students into high achievers (TIMSS Advanced International Benchmark) and non-high achievers. As usual, confidence correlates highly with being a high achiever (probably meaning partly that high achievers notice that they are high achievers). Also, home resources are important. However, the chance of being a high achiever decreases when pupils think they get engaging teaching. (These were the results for mathematics; I didn't note down the results for science.) (Of course, it is tempting to look for an explanation for the last result. Might high achievers be less easily engaged because the level of mathematics in the teaching is too low?)

Then, there was only the closing ceremony left. We were invited to join the IEA IRC in the United Arab Emirates in 2021. For me, there are many reasons to avoid a conference in the UAE. Of course it is difficult to get to the UAE from Northern Europe in a climate-friendly way, and it is unpleasant to have a conference in high temperatures. Far more important, at least for me, is the human rights situation, where for instance gays are arrested and can in theory get a death sentence. There are examples of gay men being raped in the UAE only to be investigated for illegal gay sex. So I will leave the 2021 IRC for more thick-skinned people than myself, and instead aim for the 2023 IRC.


(I have nothing personally against the UAE representative advertising the beauty of the UAE and the happiness of its people, but it would have been fair to mention that certain subgroups of the population are not happy at all.)
