14 Feb 2017

First results of SERISS project

Synergies for Europe’s Research Infrastructures in the Social Sciences (SERISS) is a four-year project (2015-2019) funded by the European Commission as part of its Horizon 2020 programme. It aims to foster collaboration and develop shared standards between the three leading European research infrastructures in the social sciences – the European Social Survey (ESS ERIC), the Survey of Health, Ageing and Retirement in Europe (SHARE ERIC), and the Consortium of European Social Science Data Archives (CESSDA AS) – and organisations representing the Generations and Gender Programme (GGP), European Values Study (EVS) and the WageIndicator Survey. Work focuses on three key areas: addressing key challenges for cross-national data collection, breaking down barriers between social science infrastructures, and embracing the future of the social sciences.

The first results of the project are now available online. These include D3.9: ‘Report on findings from re-translation of ELSST terms and their use in the CESSDA Portal’, which reports on work done by the UK Data Service. The deliverable describes two methods that were used to assess the translation quality of ELSST terms, and is in two parts.

Back-translation

The first part describes the evaluation of a subset of ELSST French and German terms (1000 from each language) using the re-translation (or more precisely, the back-translation) method. The French and German terms were back-translated into the source language (English) and differences between the back-translations and the original source language terms were then analysed. This resulted in a classification of error types, and a number of recommendations. The deliverable shows that, while the back-translation method was useful in highlighting some issues with the thesaurus that affect both its ‘semantic adequacy’ (i.e. how adequate terms are from a semantics point of view) and its ‘formal adequacy’ (i.e. the extent to which terms conform to ELSST Translation Guidelines), it has nothing to say about ‘pragmatic adequacy’ (i.e. how acceptable terms are to users), or how the terms would function in an operational setting. Back-translation should, therefore, be seen as one of several complementary evaluation methods.
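As a rough illustration of the comparison step, the sketch below checks each back-translated term against its original English source term and sorts the pair into one of three coarse buckets. The bucket names, the normalisation rules and the sample terms are assumptions made for this example only; they are not the error classification or the data reported in the deliverable, where the differences were analysed by people rather than by script.

```python
import unicodedata


def normalise(term: str) -> str:
    """Lower-case, strip accents and collapse whitespace.

    Hypothetical normalisation: it hides purely formal variation
    (case, spacing, diacritics) so that only wording differences remain.
    """
    decomposed = unicodedata.normalize("NFKD", term)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_accents.lower().split())


def compare(source: str, back_translation: str) -> str:
    """Return a coarse, illustrative label for one source/back-translation pair."""
    if source == back_translation:
        return "identical"
    if normalise(source) == normalise(back_translation):
        # Same words, different form: a candidate 'formal adequacy' issue.
        return "formal difference"
    # Different wording: a human reviewer decides whether meaning has shifted.
    return "needs review"


# Invented sample pairs (source term, back-translated term).
pairs = [
    ("AGEING", "AGEING"),
    ("LABOUR MARKET", "labour  market"),
    ("WELFARE", "WELL-BEING"),
]

for source, back in pairs:
    print(f"{source!r} vs {back!r}: {compare(source, back)}")
```

A script like this can only flag pairs for attention. Deciding whether a flagged pair reflects a real semantic problem, an acceptable synonym or a back-translator's choice still needs human judgement.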

ELSST in use

One such complementary evaluation method is to compare the sets of terms that different archives have used to index the same resources. Where the sets differ, the aim is to see whether this is because the terms have been interpreted differently, which would indicate unintended ambiguity in either the source or target terms. This approach is explored in the second part of the deliverable, which compares the sets of ELSST terms that have been used to index specific cross-national surveys. The original plan was to use the CESSDA portal to find such studies, but this had to be revised since the portal has not been operational for some time. Instead, CESSDA-ELSST partners were asked via a questionnaire how they index a set of cross-national surveys. Many thanks to all who responded. Differences in the sets of terms assigned to each survey were then analysed. Results showed that, due to the paucity of the data and the differences in indexing practices across archives, it was not possible to draw any firm conclusions on the quality of the translation. However, the work highlighted ways in which ELSST could be better exploited within the archives.
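The set comparison described above could be sketched along the following lines. The archive names, the index terms and the choice of Jaccard overlap as a summary measure are all assumptions made for illustration; the deliverable does not say that this measure was used, and the terms are not taken from the questionnaire responses.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Share of terms common to both sets, out of all terms used by either."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def pairwise_report(indexing: dict) -> list:
    """Compare every pair of archives that indexed the same survey."""
    report = []
    for (name1, terms1), (name2, terms2) in combinations(sorted(indexing.items()), 2):
        report.append({
            "pair": (name1, name2),
            "overlap": jaccard(terms1, terms2),
            # Terms used by only one archive are the ones worth inspecting.
            "only_first": sorted(terms1 - terms2),
            "only_second": sorted(terms2 - terms1),
        })
    return report


# Hypothetical index terms assigned to one survey by three archives.
indexing = {
    "archive_a": {"AGEING", "HEALTH", "RETIREMENT"},
    "archive_b": {"AGEING", "HEALTH", "PENSIONS"},
    "archive_c": {"AGEING"},
}

for row in pairwise_report(indexing):
    print(row)
```

A low overlap does not by itself point to a translation problem. As the results above note, archives differ in how many terms they assign and at what level of detail, so terms that appear in only one set are a starting point for inspection, not a finding.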

Next steps

The results of the evaluation work described above will feed into ongoing work on HASSET and ELSST within the CESSDA-ELSST project, and into the next goals of the SERISS project. SERISS goals in the next phase (to June 2017) include updating the ELSST translation guidelines and producing the next deliverable: D3.10 ‘Best practice document on translation and use of thesaurus terms’. This work will be produced in consultation with CESSDA-ELSST partners.

Related SERISS work

Complementary work within SERISS is looking at how to improve the translation quality of the questionnaires in cross-national surveys. Different approaches to questionnaire translation are being investigated, including how computational linguistic methods could be exploited. A workshop on this last topic is planned in the near future.

Feedback

If you have any comments on this blog post, please either add them below or send them to the UK Data Service Thesaurus Team at thesaurus@ukdataservice.ac.uk.

Lorna Balkan