Friday, June 28, 2019

IEA IRC 2019: Day 3

The final day of the IEA IRC 2019 started with Anna Rosling Rönnlund's talk "For a fact-based worldview". Of course, I had been looking forward to this; her book "Factfulness" (with Hans Rosling and Ola Rosling) is a wonderful eye-opener. She started by telling how the three of them have spent many years trying to make numbers more understandable and easily available. Then she went on with our results on the "test" (from Factfulness). Actually, 43 % of this conference did better than chimpanzees on these questions, while in representative surveys, 10 % do better than chimps. On average, we got 4.2 points out of 12, which is 0.2 points better than an average chimp. (In representative surveys, the average is 2. Personally, I think having read Factfulness helps a lot...)

People tend to answer systematically wrong - they think the situation is worse than it is. (Moreover, if I remember the book correctly, people tend to answer "correctly" based on the situation fifty years ago. People should not learn about the world by heart and then think that they don't have to update their worldview.) Looking out of the window, we don't see the slow global trends. In newspapers, we see diseases and catastrophes, the extraordinary events.

She spent some time on Dollar Street - where differences in living standard are illustrated by lots of photos from 350 homes around the world (so far). She illustrated how families in different countries are amazingly similar when they live on the same income level, while diversity in each country is amazing. For instance, as the income doubles from 2 to 4 dollars a day, people tend to prioritize the same regardless of continent: stove instead of fire, toilet instead of no toilet, mattress instead of sleeping on the floor and so on.

Towards the end, she gave a few of the rules of thumb of critical thinking that are also in the book. I highly recommend reading the book...

Finally, she showed some animations using TIMSS data. They were interesting, but as hung up as we are on confidence intervals, it would be good to have a way of marking them, to avoid making a point of differences that are not actually differences or trends that are not trends. But she stressed that the tool should be seen as a hypothesis-generating tool, to notice things that seem interesting. Of course, we need to include the statistical models to investigate the hypotheses.

After this, the penultimate session I went to was "TIMSS, PIRLS and ICILS: Utilizing in-depth analysis of large-scale assessment data to improve teaching". The first talk here was Franck Salles: "Clarifying TIMSS Advanced mathematics 2015 results: A didactical approach through levels of mathematics knowledge operation". The ministry of education in France does detailed analysis of task performance to inform inspectors and teacher educators. In TIMSS Advanced, there was a 1 SD drop in French students' performance from 1995 to 2015. There had been large maths curriculum changes in that period. Of the three cognitive domains, France does relatively badly in "applying". He showed how two different tasks concerning "applying" were very different, illustrating the need for a better math task analysis model. One part of the model he proposed is the distinction between tasks assessing mathematical knowledge as an object and tasks assessing it as a tool. As an object, there are items asking for a computation or asking to show the understanding of a concept. As a tool, there are items asking for a direct application of knowledge, there are items asking for application with adaptation, and there are items asking for application "with intermediary" - students have to add something not in the task already. This classification was done by a national expert panel. He showed how the classification of a task of course has to depend on the prior knowledge of students. The classification then had to be done based on the national curriculum, making it a national classification which would have been different in other countries.

Secondly in that session, I heard Jeppe Bundsgaard talk on "Differential item functioning as a pedagogical tool". He used ICILS 2013 (where Denmark didn't quite meet the sampling criteria), and the Rasch model. Differential Item Functioning (DIF) refers, of course, to the phenomenon that an item works differently for two different groups of students. Bundsgaard wanted to see if grouping items with DIF can identify challenging areas in the study. He studied DIFs for the countries together and the DIFs of Norway, Denmark and Germany separately. The short conclusion is that Norwegian and Danish students are better at items related to computer literacy, but worse at items related to information literacy.
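
To make the idea of DIF a bit more concrete for myself, here is a minimal sketch. Note that it is not Bundsgaard's analysis: he used the Rasch model, while this sketch swaps in the simpler Mantel-Haenszel common odds ratio, which compares two groups on one item after matching students on their total score. The group labels, the helper make_records and all the counts are invented for illustration.

```python
from collections import defaultdict


def mantel_haenszel_odds_ratio(records):
    """Common odds ratio for one item across total-score strata.

    records: iterable of (group, total_score, correct), where group is
    "reference" or "focal" and correct is 0 or 1.
    A value above 1 means the item is easier for the reference group
    than for focal-group students with the same total score.
    """
    # strata[score][group] = [number correct, number incorrect]
    strata = defaultdict(lambda: {"reference": [0, 0], "focal": [0, 0]})
    for group, total_score, correct in records:
        strata[total_score][group][0 if correct else 1] += 1

    numerator = denominator = 0.0
    for cells in strata.values():
        ref_right, ref_wrong = cells["reference"]
        foc_right, foc_wrong = cells["focal"]
        n = ref_right + ref_wrong + foc_right + foc_wrong
        numerator += ref_right * foc_wrong / n
        denominator += ref_wrong * foc_right / n
    if denominator == 0:
        raise ValueError("odds ratio is undefined for these counts")
    return numerator / denominator


def make_records(group, total_score, n_right, n_wrong):
    """Hypothetical helper: expand counts into individual records."""
    return [(group, total_score, 1)] * n_right + [(group, total_score, 0)] * n_wrong


# Invented counts: in the low-score stratum the item favours the reference
# group, in the high-score stratum both groups do equally well.
invented = (
    make_records("reference", 1, 8, 2) + make_records("focal", 1, 5, 5)
    + make_records("reference", 2, 9, 1) + make_records("focal", 2, 9, 1)
)
odds_ratio = mantel_haenszel_odds_ratio(invented)
print(f"Mantel-Haenszel odds ratio: {odds_ratio:.2f}")
```

An odds ratio close to 1 would suggest that the item works the same way in both groups; grouping the items that deviate clearly from 1 is, as I understood it, the pedagogical idea of the talk.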

Thirdly, Olesya Gladushyna and Rolf Strietholt talked on "Nerds or polymaths? Performance profiles at the end of primary education". Latent profile analysis (LPA) was used to try to see whether students (of 4th grade) have qualitatively distinct profiles (even though not differing quantitatively). (I'm not sure whether what I'm writing here makes sense.) They did find different models with different profiles, for instance the three-profile model included one profile where students are better in math than in reading and science, one profile where students are worse in math than in reading and science, and one profile where students are equally good in all three. (The designation of those who are better in math as "nerds" is troublesome, as another result was that children with a home language other than the language of the test tend to belong to that group, which probably rather means that they are not so good in reading and science, not that they are outstanding in math.)

Finally in this session, Nani Teig and others had the title "I know I can, but do I have the time? The role of teachers' self-efficacy and perceived time constraints in implementing cognitive-activation strategies". She used a framework for instructional quality with three basic dimensions: classroom management, supportive climate and cognitive activation. The focus here is cognitive activation strategies (CAS). This can be divided into general CAS ("asking challenging questions") and CAS specific to science - inquiry-based CAS; learning about how to do science. We know that teachers have a lack of confidence in enacting CAS, and that CAS can be demanding and time-consuming. This study focused on the interplay between teacher self-efficacy and teacher perceived time constraints, using Norwegian TIMSS 2015 data. They found that general and inquiry-based CAS are distinct but correlated constructs. They found significant correlation between self-efficacy and both kinds of CAS, but only significant correlation between teacher perceived time constraints and inquiry-based CAS (which makes sense, I guess), and this is significant only in grade 9 when analysed for each grade (at that point, of course, the number of students got smaller).

The final session (the closing ceremony excluded) was on "Socioeconomic background and student achievement: TIMSS and PIRLS". Here, there were three talks, the first of which was Rune Muller Kristensen: "Deconstruction of the negative social heritage? A search for variables confounding the simple relation between socioeconomic status and student achievement". They used Danish TIMSS data from 2015. ESCS (Economic, Social and Cultural Status) cancelled out other effects in that study, including school and class size. The point of this project was to understand these relationships better. However, no matter how many relevant variables were thrown into the model, not much of the variation between ESCS and performance was explained. (The discussant at the end asked whether the variations of the ESCS and the potential confounding variables are too small in Denmark, and whether different results could be found in other countries with more variance.)

Secondly, Andrés Strello and others had the title "Effects of early tracking on performance and inequalities in achievement: Combined evidence from PIRLS, TIMSS and PISA". They studied all available cycles of the three studies and 75 countries, sorted according to when (or if) they started tracking, looking at dispersion, social inequality and performance level. They did 45 pairs of comparisons, and then used a meta-analytical approach (Card 2011). The meta-analysis showed that tracking has a significant effect on inequality as dispersion, on social inequality and on performance level. (It seemed, however, that tracking has a negative effect on reading performance as measured in PISA.) Also, the earlier tracking takes place, the larger the effects. (In the discussion afterwards, and in the presentation itself, it was pointed out that tracking is a complex phenomenon with different implementations between and even within countries. Still, that makes it perhaps more surprising that significant findings were found.)

Finally, the last talk of the conference was Vasiliki Pitsia and others: "High achievement in mathematics and science: A multilevel analysis of TIMSS 2015 data for Ireland" (using 4th grade data). They divided the students into high achievers (TIMSS Advanced International Benchmark) and non-high achievers. As usual, confidence correlates highly with being a high achiever (probably meaning partly that high achievers notice that they are high achievers). Also, home resources are important. However, the chance of being a high achiever decreases when pupils think they get engaging teaching. (These were the results for mathematics; I didn't note down the results for science.) (Of course, it is tempting to find an explanation for the last result. May high achievers be less easily engaged, because the level of mathematics in the teaching is too low?)

Then, there was only the closing ceremony left. We were invited to join the IEA IRC in United Arab Emirates in 2021. For me, there are many reasons to avoid a conference in UAE. Of course it is difficult to get to the UAE from Northern Europe in a climate-friendly way and it is unpleasant to have a conference in high temperatures. A lot more important, at least for me, is the human rights situation, where for instance gays are arrested and in theory get a death sentence. There are examples of gay men being raped in UAE only to be investigated for illegal gay sex. So I will leave the 2021 IRC for more thick-skinned people than myself, and instead aim for the 2023 IRC.

(I have nothing personally against the UAE representative advertising for the beauty of UAE and the happiness of its people, but it would have been fair to mention that certain subgroups of the population are not happy at all.)

IEA IRC 2019: Day 2

The morning plenary on day 2 of the IEA IRC was Aaron Benavot's talk "How can IEA make a difference in measuring and monitoring learning in the 2030 agenda for sustainable development?" He is a former director of UNESCO's Global Education Monitoring (GEM) Report. GEM Reports, published yearly, previously monitored progress on the 6 Education for All Goals; now they monitor educational targets in the 2030 Agenda for Sustainable Development.

He discussed the history behind the merging of several processes into the Sustainable Development Framework with 17 goals, 169 targets, 230 indicators. He stressed how this is the most aspirational and comprehensive international education agenda ever. This comprehensive agenda reinvigorates earlier debates on how to measure and monitor learning. The countries are supposed to have voluntary national reviews, and there will be an elaborate indicator framework with different indicators and measures - at least one global indicator per target, a number of thematic indicators (globally comparable indicators), regional indicators and national indicators. For instance, the target 4.1 talks about relevant and effective learning outcomes, while the global indicator narrows this down to reading and mathematics. However, the measuring on the global indicators is to be done in close cooperation with each state.

He stressed how different ways of measuring give very different results. For instance, the traditional way of measuring literacy is by census data, where often the leader of the household is asked who in the household is literate. Now, a few countries are moving towards testing - for instance asking people to read a sentence for the census taker. This reduces the literacy estimates. He gave examples of how IEA might help in developing ways of measuring.

He also discussed how the international assessments are increasingly supplemented by regional and national assessments; more than 150 countries have performed national assessments since 2000.

There was a discussion after the talk about the country-led nature of the reviews and measuring. A researcher from South Africa stressed the importance of each country being able to determine what are the education priorities in their context. If South Africa cannot itself decide but has to adopt measures from western "North" countries, that would not be suited to the local context.

Again, I decided to skip the panel (which was on PIRLS and not part of my research interest presently).

After lunch, there was a Norwegian symposium on the TIMSS. There were three presentations from the University of Oslo research team (CEMO: Centre for Educational Measurement). The first was Rolf Vegar Olsen and Sigrid Blömeke: "Predicting change in mathematics achievement in Norway over time". He started out by pointing out the dramatic fall in Norwegian TIMSS results from 1995 to 2003 (comparable to twice the difference between 8th and 9th grade in 2015), followed by an increase up to 2015.
The method used for this paper was the Oaxaca-Blinder Decomposition, a method for studying the mean differences between two groups - basically looking at both the constant and the slope of the regression lines for the two groups. (Actually, they used a threefold OBD, with "endowments", "coefficients" and "interaction" terms.) They wanted to include predictors which had changed in the period used. However, fairly little of the change in score could be explained by the included predictors. (A lot of possible predictors had to be excluded because the questions were different in 2003 and 2015.)
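
As a reminder to myself of how the threefold decomposition works, here is a minimal sketch with a single predictor. It is only an illustration under my own assumptions, not the authors' model: the function names and the two tiny "cohorts" are invented, and the decomposition is written from the viewpoint of group B (gap = endowments + coefficients + interaction).

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares with one predictor: returns (intercept, slope)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def threefold_decomposition(x_a, y_a, x_b, y_b):
    """Threefold Oaxaca-Blinder decomposition of mean(y_a) - mean(y_b)."""
    a_a, b_a = fit_line(x_a, y_a)
    a_b, b_b = fit_line(x_b, y_b)
    dx = mean(x_a) - mean(x_b)
    return {
        "gap": mean(y_a) - mean(y_b),
        # Part due to different levels of the predictor.
        "endowments": dx * b_b,
        # Part due to different intercepts and slopes.
        "coefficients": (a_a - a_b) + mean(x_b) * (b_a - b_b),
        # Part due to levels and slopes differing at the same time.
        "interaction": dx * (b_a - b_b),
    }


# Invented data: a predictor and a score for two cohorts.
x_late, y_late = [2, 3, 4, 5, 6], [500, 510, 525, 530, 545]
x_early, y_early = [1, 2, 3, 4, 5], [470, 485, 490, 505, 510]

parts = threefold_decomposition(x_late, y_late, x_early, y_early)
for name, value in parts.items():
    print(f"{name:>12}: {value:7.2f}")
```

The three terms add up to the gap exactly, because each regression line passes through the means of its own group. The finding that the predictors explain little of the change corresponds to a small endowments term.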

The second talk was Trude Nilsen, Julius Björnsson and Rolf Vegar Olsen: "Has equity changed in Norway over the last decades?" First, they discussed the definition of equity: it could be defined as lack of achievement differences between schools, a small SES effect on achievement or as a low proportion of pupils getting low scores. They looked at all cycles of TIMSS and PISA. While many measures had changed over time, "Number of books at home" had been kept stable in both TIMSS and PISA. The findings were that the total variance had decreased over time (which may, however, be because the proportion of high-performing students has decreased), while on the school level there have been different developments in the different studies. The variance explained by SES has increased over time. The main problem however is the lack of stability in the SES measures. A solution could be to combine ILSA (international large-scale assessments) with register data (but that could be controversial for privacy reasons).

The third talk of this symposium was Hege Kaarstein and Trude Nilsen: "Twenty years of science motivation mirrored through TIMSS: Examples of Norway". Their goal was to look at the development of science motivation. Methodically, every TIMSS study was compared to the 1995 study; in addition, comparisons between 4th and 8th grade and between girls and boys in all cycles were planned. They studied intrinsic motivation, self-concept and extrinsic motivation (the third one not measured in 4th grade). It is an important point that it must be checked (within the means available) that questions are understood similarly over time, but the details of the scalar measurement invariance (MI) I am not able to repeat. The results of the study were mixed, but the motivation seems to have increased. Self-concept has the highest correlation with performance, but the self-concept did not increase significantly in 8th grade. (However, Norwegian students already reported very high self-concept from the beginning.)

Jan-Eric Gustafsson was the discussant, who picked up on the difficulties of looking at change over time, and asked how the ILSAs could be improved to make it easier to study change over time. He also pointed out that many of the independent variables used here are prone to large errors in measurement (as they are self-reported by students), which can lead to regression coefficients being underestimated. He also pointed out that the PISA scales vary in reliability from year to year, while TIMSS scales have higher reliability. He also noted that "number of books at home" is shown to be working differently in different countries, so it may also be assumed to be working differently over time in one country. (It was actually pointed out in the plenary on Friday that the proportion of pupils with many books in the home is decreasing in rich countries.) Also, he criticised cutting out lots of the comparisons based on the MIs, as the MI analyses have such high power that they detect substantially insignificant differences. (A very interesting point, although he himself admitted that including these comparisons may make the paper impossible to publish.)

He also provided a fun example of problems of measuring: when a new grade scale was introduced in Sweden, confusion followed as teachers did not use the new scale consistently. This led to increased variance in the grades (and less correlation with the underlying competence of pupils, I guess), leading to a decrease in the SES effect. (Of course, if grades are more randomly assigned, all correlations between grades and other variables will decrease.)
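
The point about measurement error leading to underestimated regression coefficients is easy to check with a small simulation. This is just a sketch with invented numbers (the seed, the sample size and the function names are my own choices): when the error in the predictor has the same variance as the predictor itself, the estimated slope should shrink to about half of the true slope.

```python
import random
from statistics import mean


def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


def simulate_attenuation(n=20000, true_slope=1.0, error_sd=1.0, seed=1):
    """Return (slope using the true predictor, slope using the noisy predictor)."""
    rng = random.Random(seed)
    true_x = [rng.gauss(0, 1) for _ in range(n)]
    y = [true_slope * x + rng.gauss(0, 1) for x in true_x]
    # The predictor as it is observed, e.g. self-reported by students.
    observed_x = [x + rng.gauss(0, error_sd) for x in true_x]
    return ols_slope(true_x, y), ols_slope(observed_x, y)


slope_true_x, slope_observed_x = simulate_attenuation()
print(f"slope with error-free predictor: {slope_true_x:.2f}")
print(f"slope with noisy predictor:      {slope_observed_x:.2f}")
```

The expected shrinkage factor is var(x) / (var(x) + var(error)), which is 0.5 with the settings above. The Swedish grade example is the same mechanism seen from the other side: noise in the outcome does not bias the slope, but it does lower the correlation.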

Then, there was the last session of the second day. The first talk was Samo Varsik: "Differences in students' and teachers' characteristics between high and low performing classes in Slovakia". He used PIRLS 2016 4th grade data from the Slovak Republic. The methodological approach is based on similar research done in the Czech Republic. He first showed how SES has a huge impact in Slovakia. He also looked for differences in teaching methods between high- and low-performing classes, but found very few significant correlations. The only two significant differences were connected to high-performing classes being tested more often and being more often asked to summarize the main ideas. The second part of his work was regression models, showing for instance that students' confidence in reading is, not so surprisingly, correlated with performance, also when controlling for SES, gender and so on. However, he did not find significant results regarding teachers' characteristics. (Other than this, I did not manage to write down much of his results.) At the end, he noted that an important limitation of his method is the "Modern Teaching Methods" variable, based on a few self-reported questions.

The second paper of the session was supposed to be Bieke De Fraine et al: "Reading comprehension growth from PIRLS Grade 4 to Grade 6", but this was cancelled.

The third paper was Marie Wiberg and Ewa Rolfsman: "Nordic students' achievement and school effectiveness in TIMSS 2015". (Nordic = Sweden & Norway). They looked at student variables (sex, native father (NF) and number of books) and school variables (student behaviour, urban school location, school climate (teacher, student, parents), aggregated SES, aggregated NF, general resources at school and resources in mathematics) and used linear regression. They included the concept of "effective schools" based on them having better results than expected based on background data. Students' background was important everywhere. In Norway, school location and school climate were significant, while in Sweden only school climate was significant. For future research, the possibility to connect to register data will make other analyses possible.

The final paper of the day was André Rognes' talk on "Birth month and mathematics performance relationships in Norway", written with Annette Hessen Bjerke, Elisabeta Eriksen and myself. Of course, I knew the paper quite well in advance: the main point is that the Relative Age Effect (RAE) is statistically significant in all content and cognitive domains of mathematics in 4th, 5th and 8th grade. There was no statistically significant RAE in 9th grade. We also tested whether there was a significant difference in RAE between 4th and 5th grade and between 8th and 9th grade - there was not.

We did get a question about whether we had looked at SES in our research. We had not. It is unlikely that birth month can be predicted by SES (and a colleague actually pointed out that he had checked it). Whether the RAE is larger in some SES groups than others is another question that it would be interesting to investigate. (Although I fear the Norwegian data alone would not provide enough power to find out. Even with more than 4000 students in each cohort, the number of students per month gets quite small if dividing into different SES groups.)

That was the end of the second day. For me, this day was more aligned with my research interests than the first, so I was happy about it.

IEA IRC 2019: Day 1

My first IEA IRC (the 8th IEA International Research Conference) took place in Aarhus University in June 2019. As this was my second conference in this venue, I was not surprised to find that the conference was actually in Copenhagen... However, unlike the ESU five years ago, this conference started with a song (allsang): "Svantes lykkelige dag" and "I Danmark er jeg født".

The first keynote was Christian Christrup Kjeldsen: "Global attitudes and perceptions of social justice among youth: When no (in)differences make the difference". He reminded us of the concept of fuzzy sets, where elements can be members of a set to different degrees. Becoming a subject is part of life, and (I suppose) people cannot always be put 100% into the fixed boxes. (He argued based on his reading of Bourdieu, but of course I'm not able to summarize that.) Part of his talk was on what is significant: the differences between (the continuum of) statistical significance, (the continuum of) substantial issues in a moral philosophical approach and (the continuum of) effect sizes. He argued for the concept of "substantial significance": differences in capabilities supporting a life that the individual has reason to value. When talking about effect size, he connected this to Hattie, who he claimed could serve as an inspiration. At the end, he talked about a case in which he merged results from different studies in a fuzzy way while trying to keep enough noise to not understate the variance. Again, hard to summarize.

I think there was food for thought there. Take gender as an example. Of course, we are well aware that gender is more complex than the old-fashioned binary "man"/"woman" concept. However, there are important differences between "men" and "women" in most fields of research, when treated as a binary concept. So which underlying concepts can be found that can explain the differences, without having to keep using a binary concept that we know is too simplistic?

As happens at conferences, I had to spend the next slot doing some last-minute edits to our presentation with my colleague. For the after-lunch slot I chose to take part in the "Open source publishing with IEA" panel. Of course, if we are to do more analyses of international studies, we need to know as much as possible about the publishing possibilities.

The journal "Large-scale Assessments in Education" has had its fifth anniversary, and is now a Springer open source journal, giving it more visibility. Also, there is the IEA Research for Education Book Series, with books often 80-150 pages long. Calls for proposals are published biannually. (The authors actually get 25 000 euro for each book.) Unsolicited applications are also considered. Only IEA studies can be used for the book series, while the journal is more forgiving. The full process from accepted proposal to finished printed book is usually about two years.

Finally, Seamus Hegarty talked about the review process for the Book Series: There is a pre-review, then review of each chapter (based on an annotated ToC, which is mandatory for proposals). The review is not double-blind - only the reviewers are anonymous.

He gave some examples of some usual editorial suggestions:
  • Do provide an argument about the significance of your work
  • Contextualise your work
  • Detail your methodology
  • Be rigorous and coherent (it is especially difficult to obtain coherence when different teams of authors write different parts of the book)
  • Write clearly
  • Organise your own review! It is useful to use colleagues to do a "review" before the real review.

Then, for the final session of the day I decided to attend the session on "TIMSS and ICCS: Students' attitudes and achievement in TIMSS, TIMSS Advanced mathematics, and ICCS". The first paper was by Laura Palmerio and Elisa Caponera: "Relationship between students' attitudes and beliefs, and achievement in advanced mathematics". The TIMSS Advanced questionnaires and tests were supplemented by a national questionnaire given to the same students, on self-efficacy and anxiety. They found that self-efficacy is highly correlated with mathematics performance, not surprisingly. This could be a sign that we should work on students' self-efficacy.

A side note: they showed that "self-efficacy was the best predictor of mathematics performance" (according to the abstract). I think this is a good example of how the language of "predictor" can be problematic, as the relationship between self-efficacy and performance is of course going in two directions - performance leads to better self-efficacy and self-efficacy leads to better performance. (In the presentation it was very clear that self-efficacy and performance are part of a circular relationship which also includes behaviour and anxiety.)

The next talk was Michaelides and others: "Meaningful clusters of eight grade students in 2015 TIMSS mathematics using motivation variables". They focused on confidence, enjoyment and value (all three scales administered in 8th grade, the two first also in 4th grade), to look at what the interactions between them are. For instance, some students report that they value math but do not enjoy it. They did this across 12 jurisdictions, in TIMSS cycles 1995-2007. The analysis was based on a two-step clustering approach. These clusters were developed per country, and then the achievement and gender composition of each cluster's participants were explored. In inconsistent clusters, value did not play much of a role for achievement - self-confidence and enjoyment were more important.

The third talk in this session was Dupont et al: "The role of parents' literacy attitudes on children's reading achievement (PIRLS 2016)". They had different hypotheses on the connection between parents' reading attitudes and the outcomes (students' reading motivation and students' reading achievement). Regression analyses were done, controlling for home resources for learning. They found high correlation between parents' attitudes and students' reading achievement. (Some of the diagrams here could be useful in my teaching on quantitative methods in our master courses.) The study underlines the importance of parents' literacy practices for students' reading achievement and attitudes.

The final talk of the day was Kwong and Macaskill: "The relationship between student engagement and achievement across countries within regions using latent class analysis". They looked at Asia, Europe and Latin America as three regions. First, they used Exploratory Factor Analysis to explore relations among the attitude indices. Thereafter, LPA was used, with a two-level LPA model for the Asia region. (Obviously, I cannot summarize all the tables showing the results of these analyses.) Through lots of diagrams, we were shown how the three regions had different profiles, although for instance Taiwan seemed to stand out a bit from the other Asian regions included. (Sadly, complex diagrams with lots of small type do not work very well when sun is flooding the room, so it was hard to get the details.)

That ended the first day of the conference. It is a different experience than many other conferences, as I usually go to conferences where I can choose talks on topics I am very interested in. Here, I more often find myself listening to talks where the topic in itself is not that relevant to my interests, but where the methodological ideas can very well be useful for me to explore other topics. So it is a different focus.

Saturday, February 4, 2017

CERME 10 Day 4

The first TWG session on Saturday consisted of four ten-minute presentations, followed by discussions. As I had one of the presentations, it's a bit hard to give details on them (one does get a bit too occupied with one's own presentation in such circumstances). They were:
• Rodolfo Fallas-Soto: "Variational strategies on the study of the existence and uniqueness theorem for ordinary differential equations"
• Me: "Design research with history in mathematics education"
• Antonio Oller-Marcén: "Analyzing some algebraic mistakes from a XVI century Spanish text and observing their persistence among present 10th grade students"
• Katalin Gosztonyi: "Understanding didactical conceptions through their history: a comparison of Brousseau's and Varga's experimentations"

In the discussion, some of the points were:
• Tradition and contextualisation are important - the traditions researchers come from are important (in the case of my design research project). It is important to be clear about the context of them (but on the other hand, it is also important for design research projects to consider and describe which context they may be relevant for).
• There was a chicken-and-egg discussion on what comes first in historical research - the question and/or method, or the data. (Arguably, all the world is data - or you could say that things only become data when they can be helpful in answering a question someone poses.)
• In what way do theoretical frameworks "work"?
• What to do once epistemological obstacles are identified? Should we face or avoid them (until students are "hungry" - why feed them if they're not?).
• Design research - can it be called a "theoretical framework" (as the chairs did in their framing of question for the group discussion). (My answer would be no. A participant also said that it could rather be seen as a framework of aspects to be thought of in such projects.)

The next part of the programme was a plenary panel. The panellists were Marianna Bosch (Spain), Tommy Dreyfus (Israel), Caterina Primi (Italy) and Gerry Shiel (Ireland). The topic of the panel was "Solid findings in mathematics education: what are they and what are they good for?" Marianna Bosch was the chair. The background for the panel was EMS' series of articles on "Solid findings in mathematics education". "Solid findings" are defined as important contributions which are trustworthy and can be applied. The panel wanted to examine the notion of "solid finding" and consider possible utilities and weaknesses.

Tommy Dreyfus pointed out that there are not many review articles in the field of mathematics education. The European Mathematical Society (EMS) decided to help remedy this. (The articles are in the Newsletter of the EMS issues 81-94.)

One example: we know that many students "prove" a universal statement by providing examples, across many age levels and countries, including teachers. We call this "empirical proof schemes". But to be called "solid", an explanation is also needed, and here the explanations are varied. But the main criteria for being "solid" hold. Another example: concept image. Students tend to think with their personal image rather than the definition. This has been observed at all levels, in many countries, for almost 40 years and across many topics of mathematics. These images are often formed by prototypes. Instruction plays a (limited) role. These findings can be considered "solid".

Solidity cannot be "proved"; expert opinion is crucial, and experts from several fields should be consulted.

Caterina Primi talked about how psychometrics could contribute to solid findings in mathematics education. We often measure something other than the trait we are interested in - for instance signs of anxiety, even though it is the unobservable trait anxiety we are interested in. Of course, we can create instruments to try to measure the trait based on such signs, and these can also be used to find differences between groups. (And so on. It is hard to see how this rather elementary discussion of psychometrics contributes much to the general discussion of solid results - unless her talk is an implicit argument that psychometrics are more important than other research approaches to get solid results - as many would of course say about their own pet approach.)

Gerry Shiel's perspective was whether outcomes of international assessments (PISA) can contribute to evidence-based decision-making. Are PISA findings solid? On the one hand, it is huge (more than 500 000 students have contributed to it). He gave an introduction to PISA and how it tries to be an evidence-based series of studies including tests. He gave an example of how Ireland's performance in PISA changed over time, with a significant dip in 2009. This dip has not been explained. Ireland rebounded, while other countries had a dip in 2015 when digital testing was done. Ireland has also seen an increase in the difference between boys and girls, which is hard to explain. PISA results are used to inform policy - and PISA surprisingly tries to impact teaching directly by publishing their speculations on what can be inferred from the data.

In the discussion (which did not work very well, because of a somewhat confusing combination of "questions" from the floor and "questions" sent electronically), it was asked "solid for whom" - implying that what is solid for researchers may not be solid for teachers (and vice versa). This is an interesting point. Gabriele Kaiser mentioned that we need some methodology for writing review papers - it is a very difficult task, and for instance quantitative analyses are not always helpful.

(But in hindsight, it is easy to see that this topic invites people to promote their own research or conception of research...)

The last part of Saturday (before the gala dinner) was the last session of the TWG. First, there was a part where participants talked about planned or ongoing projects with calls for cooperation. Then we talked about future conferences, where I presented the plans for ESU8 in July, 2018. Plans for the HPM satellite conference to the ICME conference in Shanghai 2020 were presented - it will be somewhere in Asia. Then the process of the proceedings was discussed, and finally there was discussion on the report of the conference, the result of which will of course be seen in the proceedings of the conference.

Due to travel arrangements, for me the conference ended with the gala dinner on Saturday evening (which had much Irish music and rather less talk). Thus, this is the place for summarizing the experience. This was my first CERME conference, and I realized that CERME is not really one conference, it is rather ~25 mini-conferences under one roof and with shared amenities and a few common talks. This means that it is, in one sense, an intimate conference in the same way as smaller conferences are. However, getting the intimate feel demands some conscious choices - not to switch groups no matter how interesting the talks going on elsewhere are, and to try to socialize with people in the group and not be tempted to only socialize with the people you already know. Then, the CERME experience is quite different from for instance ICME, which is a smorgasbord of interesting talks where you risk never running into the same people twice (even though even ICME has some working groups, of course, so I am exaggerating a bit).

Dublin was great, the LGBT guided tour was great and the atmosphere throughout was also great. I did learn some new things during the conference, of course, but most importantly, I think, it solidified my determination to try to focus more in the future. I want to spend my research time getting deeper knowledge in some areas rather than having many parallel projects with different foci. I'll see how this works out...

CERME 10 Day 3

Day 3 consisted solely of TWGs and an excursion. The first TWG session was devoted to discussion on the draft chapter on this group for an ERME book. It was introduced by Uffe Jankvist, who has written the chapter with Jan van Maanen. I did not note down anything from that discussion - but I was perplexed to be put in an "old-timers" group despite this being my first CERME. :-) (My feeling of being "young" was destroyed due to my participation in similar conferences since 2000...)

The rest of the morning session was spent on participants sharing information on important publications that the others should know of. I have notes of this somewhere, but we were also promised an email later summarizing this.

The second session started off with Renaud Chorlay's paper, about using parts of Nine Chapters in teacher training. He has three goals for working with this problem (which may be a problem, as students often focus on at most one). Liu Hui gave two justifications for multiplication of fractions, the second of which could probably be used in teaching, in my opinion. The use of a semantic embedding (word problem) is a resource, but also a worry as it can decrease the generality. Renaud argued convincingly that this example can be useful for discussion with teacher students, even though (according to him) perhaps not useful for direct work with children. I am a big fan of Renaud's work and am happy that he is now working in teacher education, as it means that his work - which is as always historically solid - now includes sharp analyses of what might be the use of the historical examples in teacher education.

Next, Regina Moeller and Peter Collignon talked on their paper which concerns the work on infinity with children. The concept has a long history, while teacher education students tend to have only the epsilon-delta based concept. (Of course, this is context-dependent - most Norwegian teacher education students would look at you wide-eyed if you mention epsilon or delta.) In their opinion, teachers need to know other conceptions that may be closer to the steps children go through. They look especially at Hilbert and Cantor - including the hotel of Hilbert, of course. The work can make students more aware that there exist different conceptions that they have not learned and help them to be more open-minded.

Then, Rui Candeias presented "Mathematics in the initial pre-service education of primary school teachers in Portugal: analysis of Gabriel Gonçalves' proposal for the concept of unit and its application in solving problems with decimals". This is part of a larger research project comparing different textbooks for teacher training. He presented in detail the steps advised by Gonçalves. (Which makes me think that it could be a good idea to study historical teacher guides in Norway to point out to students the evolution of the field of mathematics education when it comes to concrete advice given to students.)

Maria Sanz gave the last presentation of the day: "Classification and Resolution of the Descriptive Historical Fraction Problems". She proposes a classification of the problems based on which methods can be used to solve them. It is unclear to me what this classification brings to the table - other aspects (known/unknown context, size of numbers, distractors included and so on) could be as important for practical use in classrooms. In the discussion, she was asked about the connection to the mathematics education research on the same issues. It was also mentioned that in some countries such problems are "banned" from textbooks, while in others they are obviously not banned.

Some comments that turned up:
• What can these examples bring to teacher training? The common denominator seems to be that they are in a preliminary phase - but they can work to show students that problems are not something to be solved but rather something to be analysed to decide whether and how to use in their teaching.
• Could students solve and classify problems in the way of Maria themselves? Would that be more useful than being presented with a classification?
• History can be a good tool to connect algebra without the symbolism with algebra with symbols.
• A book by Brian Clegg on infinity was recommended.

I do think that a closer collaboration between maths ed people and history of mathematics people is called for. In some cases, we see discussions on how historical sources can be used in teaching of subjects where there exists a huge amount of literature in the field of mathematics education, but where this work is disregarded. This is every bit as bad as the huge number of papers in mathematics education that completely disregard the history of the subjects that they want to discuss.

This was the end of the third day. Well, not quite. I was lucky enough to take part in the "lavender walking tour", which was a walking tour of Dublin LGBT History. We saw the Oscar Wilde monument, the Parliament, Dublin Castle, the national library and many other places of importance. We got detailed and enthusiastic information on the liberation fight, including the disgraceful attitudes of the government when activists tried to save lives by distributing condoms (which were illegal at the time). Today, Ireland has moved in a liberal direction and is one of the few countries where gay marriage has been decided in a referendum - although religious fundamentalists still have a role. The tour ended at a gay pub where we got to continue the discussion over some Irish refreshments.

CERME is the second big international mathematics education conference in less than a year with something concerning LGBT issues on or near the programme. I do hope that this is an emerging trend.

CERME 10 Day 2

The second day of CERME 10 started where the first one ended - with a TWG (topic working group session). Please excuse my extremely short descriptions of the papers - the authors were just given ten minutes to remind participants of their papers as a basis for discussion, and I do not have the time to go back to the papers to give more detailed accounts. First, Kathy Clark talked on the very interesting TRIUMPHS project, a big design research project based on original sources. At this time, the project reports on a pilot study in the first year. I notice an interesting focus on meta-discursive rules and on views of mathematics. They use Törner's aspects and his instrument - but the number of students included in the analysis at this point was small. It will be interesting to follow the project in years to come!

Rainer Kaenders talked about "Historical Methods for Drawing Anaglyphs". In this project, students draw 3d drawings using historical methods. The point was not to learn the methods, but to understand the mathematical principles in order to be able to do the drawings. Again, this was an interesting project giving ideas for working on geometry in new ways. Kaenders had used this in extracurricular activities with students, for which it seemed well suited.

Thirdly, Rita (Areti) Panaoura talked about the paper "Inquiry-based teaching approach in mathematics by using history of mathematics - a case study". In Cyprus, which has a centralized school system, history of mathematics is seen as a tool to investigate the mathematical concepts. She reiterated Siu's reasons that teachers hesitate in using HM. She gave examples of teachers' attitudes and knowledge. Teachers could not connect the HM and the inquiry-based teaching approach which was also mandated. Understanding what teachers need in order to include history of mathematics in their teaching is very important in order to implement HM in teaching. As such, I find this paper interesting. A participant questioned whether the use of Egyptian multiplication is helpful. I think that depends on the goal. According to Rita, there is no teacher guide saying what the point is, therefore it is difficult to see if the example is well-chosen or not - and difficult for teachers to use it in a meaningful way. Thus, this paper shows the problem of giving teachers resources without giving them the reasoning behind them.

The fourth presentation was of the paper "Teaching kinematics using mathematics history" (Alfredo Martinez). This is a paper concerning a reconstruction of a method of measuring time which Galileo may have used. Students were able to measure time using a rhythm, thereby being able to recreate Galileo's results. It is a bit unclear to me if this really fits in the history of mathematics group or would rather fit in a history of science group (at some unspecified conference), though.

Then there was a group discussion and sharing. Some points:
• It is a shame that the scaffolding was not there for the teachers or students in the Egyptian multiplication example to see the connection to our algorithms.
• What "scaffolding" is needed? Notes to teachers and workshops are parts of the project Kathy talked about. Also, use of history of mathematics should also be included in teacher training.
• A historical document is not necessary, historical problems (without giving the actual source) worked on with students are also useful. But what difference does the source make? (Of course, many authors have written extensively on this.)
• Can all topics be taught using history? Are there too big obstacles in some cases?
• Can we do good history and good mathematics at the same time? (My answer would be that we are never "perfect" in the classroom, teaching is always full of compromises. So there is a question of what is good enough.)
• The geographical and cultural distance is important. Is Greek mathematics more motivating for pupils in Greece?
• How much of the original context must a teacher understand?
• Choice of examples: should they be "exemplary" or could we have "fringe" examples? Papers that are most interesting from a historical point of view, may not be the best ones from an educational point of view.
• How do teachers come to have materials that they can use? And how do they (learn to) orchestrate the classroom experience?

Then, there was time for another plenary: Lieven Verschaffel on "Young children's early mathematical competencies: analysis and stimulation". Researchers today believe that children have a "starter kit": an object tracking system and an approximate number system (ANS). Gradually, there is a development towards a symbolic representation. There are significant correlations between numerical magnitude understanding and early mathematical achievement.

The ordinality aspect of number is neglected in the cognitive neuroscientific work. But research suggests a stronger correlation/predictability between the ordinal aspect and mathematical skill. For instance, Hyman Bass argues for developing number based on measurement. Basing the number concept on cardinality means that later developments, such as fractions, will be more difficult.

There is also more interest in children's understanding of basic arithmetic concepts and relations. There is little research on the consequences of this for later mathematics learning. Nunes et al (2015) is an exception.

Other researchers have looked at patterns and structures. Mulligan et al (2015) is the most comprehensive, looking at children's awareness of mathematical pattern and structure (AMPS). A related intervention study shows no improvement in general mathematics achievement.

The research studies mentioned so far look at children's abilities, not their dispositions. (I.e. asking children to look for a pattern, not measuring whether they see the pattern without a prompt.)

SFOR (spontaneous focusing on quantitative relations) shows individual differences, and has a direct effect on mathematical results at the end of elementary school. Several other such FLAs (four letter acronyms) were also mentioned - we do not know much about their development and interrelationship.

Then he went on to talk on domain-general (not domain-specific) abilities, such as attention, flexibility, inhibition, working memory etc. There is evidence of these abilities' importance - to a greater degree than domain-specific abilities.

Other aspects mentioned in the talk were the role of parents and early caregivers, preschool to elementary school transition, and the professional development of caregivers and teachers. He concluded by listing a whole range of important aspects which need to be further developed in years to come.

For the third session of the TWG, the first person was Luciane de Fatima Bertin, presenting the paper "Arithmetical problems in primary school: ideas that circulated in São Paulo/Brazil in the end of the 19th century". She highlighted the notion of appropriation and the notion of purpose. The word "problem" is undefined, but seems to be synonymous with "exercise", so it has no connection to the modern understanding connected to "problem solving". There was no discussion in the journals analysed on the use of problems in teaching.

Asger Senbergs talked on his article "Mathematics at the Royal Danish Military Academy of 1830". His article is based on his Master's thesis. The research was based on his curiosity about why mathematics became the main topic when Denmark created a military academy. The value of mathematics as a goal in itself was prominent - not just as a tool for action on the field.

Ildar Safuanov's paper "The role of genetic approach and history of mathematics in works of Russian mathematics educators (1850-1950)" was up next. The paper details early Russian ideas on the genetic approach. The genetic approach was connected to the idea that pupils should not just witness but also create mathematics, and was included in the guidelines for mathematics teaching after the 1917 revolution.

Tanja Hamman talked about ""Sickened by set theory?" - About New Math in German primary schools". The title is from Der Spiegel from March 1974 ("Macht Mengenlehre krank?"). She has looked at textbooks and teacher guides from West Germany to see whether the main ideas were present in the textbooks. Traditional education did influence the implementation; it is not possible to create a clean slate when dealing with teaching.

Then, it was time for group discussions. Here are some points from the discussion:
• Do we see history of mathematics education mainly as part of general history, part of mathematics education or as part of history of mathematics?
• It is interesting to look at historical cases to investigate conditions for ("successful") implementation of educational reforms. (Which is part of the value of history of mathematics education for teacher education?)
• How does it matter that a subject has a history? Does it provide a knowledge base to look at your subject?
• Who decides what are popular and unpopular subjects? What are the forces behind which topics are in vogue at a given time?
• When you know more about the past, you have more tools to deal with the present.
• New Math - was it never, anywhere, implemented as intended, with the intended outcomes?

Thus ended the second day of CERME. Although most participants probably continued their discussions into the early hours of the next day, I returned to my hotel room to prepare for the university board meeting next week. It is necessary to mention this, as some colleagues have developed an unhealthy interest in my nightlife while in Dublin... :-)

CERME 10 Day 1

CERME 10 was my first CERME, taking place at Croke Park in Dublin. With a capacity of more than 80000, the stadium had plenty of space for the 800 participants. The opening ceremony included short addresses from various dignitaries (of course, including the leaders of the groups actually doing the work of preparing the conference). For instance, we learned how Hamilton got a key insight (concerning quaternions) by the Royal Canal (which passes just outside the stadium). In addition, there was some beautiful Irish music, of course.

The first plenary lecturer was Elena Nardi. Her title was "From Advanced Mathematical Thinking to University Mathematics Education: A story of emancipation and enrichment". She opened with an image from the Coen film "A serious man" - pointing out the popular conception of what university mathematics teaching looks like: a professor filling a blackboard. University mathematics teaching today is much more varied than that - the demands on the teachers are quite varied. In her talk, she wanted to give an overview of the CERME work on university mathematics since the first CERME, in a way she called "impressionistic" and personal.

She pointed out that the field is quite young; for instance, important papers such as Yackel & Cobb ("Sociomathematical Norms, Argumentation, and Autonomy in Mathematics") arrived in 1996. She pointed out that research on university mathematics education has in this period been moving away from being a "hobby" done by mathematics professors without a connection to the general mathematics education research. However, she also mentioned how her field differs from other fields in that there is a less clear distinction between teacher and researcher - the university lecturers are also often researchers. However, she did not fully go into the implications of this.

Her (rapid) talk discussed a huge number of papers from different CERME conferences, pointing out developments. For me, who does not do research on or teach advanced mathematics, the talk was so full of unfamiliar names and developments that I will not attempt to summarize it here. Sadly, the speed of her talk also excluded some participants - not all of whom speak English on a daily basis. (In fact, 50 countries were represented in the conference.)

The main feature of the CERMEs is the TWGs (Topic Working Groups), which one is supposed to stay loyal to throughout the conference and which take up most of the conference time. The first session of the TWG took place at the end of the first day. Renaud Chorlay gave a quick introduction to the working of the group.

After we had all introduced ourselves, we were ready for the first paper. That was Elizabeth de Freitas' paper called "A course in the philosophy of mathematics for future high school mathematics teachers". She talked about a course she has given for three years at Adelphi University in New York, which was actually an alternative to a history of mathematics course. One important aspect is the philosophical paper students have to write - where they have to take a stand and defend a position on one central question from the philosophy of mathematics. Maurice O'Reilly presented his paper on "Multiple perspectives on working with original mathematical sources from the Edward Worth Library, Dublin". He stressed the scaffolding of students' work - helping and encouraging the students reading unfamiliar sources (to them) in foreign languages. These were short presentations as we had all read the papers in advance. Then we started discussing the expected and actual impact of the teaching projects. The discussion centered on whether there are ways of collecting data and convincing others of the potential value of such approaches. Here are some points:
• The researchers had some data that could have been analysed to shed light on the potential. However, as some of the assumed values concern students' long-term approach to and image of mathematics, maybe longitudinal studies are necessary?
• In some cases, the visceral reactions of the students are powerful but not measurable? Some participants in the group recognized their own reaction in students' reaction.
• The role of the teacher seemed to be different here than in "usual" teaching. The projects can give ideas on how to teach to avoid the students' imitation.
• There is a pull to prove effectiveness, but also a danger of being drawn into the metrics. We need more research that convinces others than ourselves, but we also need development and ideas that can later be explored more. So papers such as these are valuable although they may not convince others.

That was already the end of the first day at CERME. Three more blog posts will follow.