Each MAGNET tool underwent comprehensive testing to assess its psychometric and usability qualities. The scores obtained during these tests are shown at the bottom of each tool’s dedicated page. Statistical annexes are also provided; they present the analyses and the implementation duration of each tool across the different contexts in which it was tested. Further values will be added as the site is updated.
Summary of our scoring methodology:
- Each measure was assessed by a tool lead, with scoring decisions confirmed by other team members.
- Measures are evaluated on three bases: formative research, reliability, and validity (including respondent understanding).
- Predefined adequacy guidelines are applied to evaluate statistical values.
Following these guidelines, reliability was assessed through internal consistency, test-retest reliability, and inter-rater reliability. Validity was assessed through face validity, construct validity, and, for scales, structural validity, as detailed below. Our scoring methodology was informed by that of the Evidence-based Measures of Empowerment for Research on Gender Equality (EMERGE) repository (Bhan et al. 2017; Jose et al. 2017).
Formative research
- Expert input: All MAGNET tools were designed with the input of cross-disciplinary subject matter experts external to the MAGNET team.
- Cognitive interviews/pilot testing: All MAGNET tools were tested through a small-scale pilot prior to larger-scale implementation in an average of three contexts. In-depth cognitive interviewing was additionally conducted at least once for most tools. When needed, tools were subsequently adapted to improve respondent understanding prior to implementation.
Reliability
- Internal consistency: scored as “Full points” if Cronbach’s alpha was equal to or greater than 0.6 and “Partial points” if it was below 0.6 (Vaz et al. 2016; Henson 2001). It was scored as “Not applicable” if the tool type was not suited to this statistical test.
- Test-retest reliability: scored as “Full points” if the test-retest correlation was equal to or greater than 0.6 and “Partial points” otherwise (Cicchetti 1994).
- Inter-rater reliability: scored as “Full points” if Cohen’s kappa was equal to or greater than 0.8 and “Partial points” otherwise (Cohen 1960; McHugh 2012). This was assessed for tools that might be sensitive to enumerators’ interpretation of respondents’ answers.
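The reliability thresholds above can be expressed as a small scoring routine. The following is a minimal sketch rather than the MAGNET team's own code: the function and dictionary names are hypothetical, the alpha calculation uses the standard Cronbach's alpha formula with sample variances, and treating a missing statistic as “Not applicable” for all three criteria is an assumption (the text states it explicitly only for internal consistency).

```python
from statistics import variance
from typing import Optional, Sequence

FULL, PARTIAL, NOT_APPLICABLE = "Full points", "Partial points", "Not applicable"

# Thresholds as stated in the reliability criteria above.
THRESHOLDS = {
    "internal_consistency": 0.6,  # Cronbach's alpha
    "test_retest": 0.6,           # test-retest correlation
    "inter_rater": 0.8,           # Cohen's kappa
}


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for item scores, one row per respondent."""
    k = len(rows[0])  # number of items in the scale
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


def score_reliability(criterion: str, value: Optional[float]) -> str:
    """Map a statistic to its score label.

    Assumption: value=None means the test was not appropriate for the tool.
    """
    if value is None:
        return NOT_APPLICABLE
    return FULL if value >= THRESHOLDS[criterion] else PARTIAL


# Illustrative data only: three respondents answering a two-item scale.
responses = [[1, 1], [2, 2], [3, 3]]
print(score_reliability("internal_consistency", cronbach_alpha(responses)))
```

Keeping the thresholds in a single dictionary makes it straightforward to check them against the cited adequacy guidelines.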
Validity
- Face validity: Tools are considered to have face validity if they are supported by theoretical frameworks and have been assessed by experts in the subject matter. In the case of MAGNET, all tools were developed in collaboration with a diverse range of subject matter experts and therefore receive “Full points”.
- Construct validity: scored as “Full points” if the tool had a statistically significant (p < 0.1) correlation with more than three relevant variables and “Partial points” otherwise. Relevant variables are those expected to be correlated with the tool based on theory or prior empirical evidence. These variables include sociodemographic characteristics, measures of subjective well-being, economic achievement, and the intra-household position of the respondent.
- Structural validity: For scales, structural validity was assessed using exploratory factor analysis (EFA). Confirmatory factor analysis (CFA) was also conducted, with results for both available in the tools’ statistical annexes. MAGNET scales were scored as “Full points” if the EFA loading of each item on its factor was at least 0.3, no more than 10% of the items loaded on more than one factor, and the cumulative variance explained was at least 50% (Field 2018).
- Respondent understanding: To gauge respondent understanding, the following question was asked after each tool: “How clear did you find the phrasing of the preceding question?” The response options were: “very unclear and difficult to answer”, “slightly unclear and slightly difficult to answer” or “clear and simple to answer”. Respondent understanding was scored as “Full points” if at least 75% of respondents said that the tool was “clear and simple to answer” and “Partial points” if the figure was less than 75%.
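The construct validity, structural validity, and respondent understanding rules above can be sketched in the same way. This is an illustrative sketch under stated assumptions, not the MAGNET implementation, and all function names are hypothetical. It makes four assumptions: an item's “own” factor is the one with its largest absolute loading; an item counts as loading on more than one factor when two or more absolute loadings reach the same 0.3 cutoff; the 50% cumulative variance criterion is read as a minimum; and a scale that misses any structural criterion receives “Partial points”.

```python
FULL, PARTIAL = "Full points", "Partial points"


def score_construct_validity(p_values, alpha=0.1, min_significant=4):
    """p_values: p-values of the tool's correlations with relevant variables.

    "More than three" significant correlations means at least four.
    """
    significant = sum(1 for p in p_values if p < alpha)
    return FULL if significant >= min_significant else PARTIAL


def score_structural_validity(loadings, cumulative_variance, cutoff=0.3,
                              max_cross_share=0.10, min_variance=0.50):
    """loadings: one row per item, one column per EFA factor."""
    # Assumption: the primary factor is the one with the largest |loading|.
    primary_ok = all(max(abs(x) for x in row) >= cutoff for row in loadings)
    # Assumption: cross-loading uses the same cutoff as the primary loading.
    cross_loaded = sum(
        1 for row in loadings if sum(abs(x) >= cutoff for x in row) > 1
    )
    cross_ok = cross_loaded / len(loadings) <= max_cross_share
    variance_ok = cumulative_variance >= min_variance
    return FULL if (primary_ok and cross_ok and variance_ok) else PARTIAL


def score_respondent_understanding(responses,
                                   clear_label="clear and simple to answer",
                                   threshold=0.75):
    """responses: the answer each respondent gave to the clarity question."""
    share_clear = sum(1 for r in responses if r == clear_label) / len(responses)
    return FULL if share_clear >= threshold else PARTIAL


# Illustrative values only, not MAGNET results.
example_loadings = [[0.7, 0.1], [0.6, 0.2], [0.1, 0.8]]
print(score_structural_validity(example_loadings, cumulative_variance=0.55))
```

The loadings and cumulative variance would come from whatever EFA software is used; this sketch only applies the published thresholds to those outputs.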