Nutritools

Help/FAQ

Best Practice Guidelines

The Best Practice Guidelines (BPG) are a set of guidelines developed to help researchers select the most suitable Dietary Assessment Tool (DAT) for their study. Researchers are encouraged to follow these guidelines before designing a study. This help section first gives a brief explanation of how to navigate the initial BPG screens, which help you define what to measure.

You can access the Best Practice Guidelines from the homepage, as shown below.

Navigating through the Best Practice Guidelines:

To guide you through the Best Practice Guidelines, we recommend you follow each stage in the order displayed: "Define", "Investigate", "Evaluate" and "Think Through".

We also recommend you read the (s) for further information on each guideline statement. Please use the "Next" buttons to navigate through each stage, and the "Previous" button to return to an earlier stage.

To start, please click on "Define" to help you define what you would like to measure. This will open a new window with the first question of the BPG: "What? Characteristics of the main dietary component of interest."

Then tick the boxes beside your dietary exposure of interest (Full Nutrient, Energy, Macronutrients, Micronutrients, Food Groups). For instance, if you click on Full Nutrient (Energy, Macro & Micronutrient), this will show how many tools in the library can measure these and have been validated for them.

Click on the "Next" button to continue to the screen where you select the lifestage of your study population, i.e. "Who?" you wish to study (Infants and toddlers, Children, Adolescents, Adults, Elderly). Notice that clicking on additional lifestages (up to three) increases the number of tools listed in the tool library that would be suitable for your research. For example, if you click on Children and Adolescents, tools that have been validated with children or adolescents will be presented. You can select the format of the tool by clicking on "Paper" or "Online"; however, we recommend you leave this blank to maximise the number of tools presented.

You can also click on the hyperlinked words; this will take you to the Glossary of Terms, where the definition will be highlighted.

Move onto the next screen which allows you to select the timeframe of the tool you might require (short, medium, long, multiple) and the type of reporting method ("Retrospective", "Prospective" or "Usual") in terms of dietary intake. Then this will display the number of tools from the library that would be appropriate for your research given the requirements you have selected.
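
The way the "Define" selections narrow the tool list can be pictured with a minimal sketch. It assumes that ticked boxes within one question are combined with OR (as described above for Children and Adolescents), that the different questions are combined with AND, and that a question left blank does not restrict the list. The `Tool` record, its field names and the three library entries are hypothetical and are not taken from the Nutritools database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    # Hypothetical record; field names are illustrative, not the site's schema.
    name: str
    measures: frozenset
    lifestages: frozenset
    timeframe: str
    reporting: str


def facet_ok(selected, values):
    # A question left blank does not restrict the list;
    # ticked boxes within one question are OR-ed.
    return not selected or bool(set(selected) & set(values))


def select(library, measures=(), lifestages=(), timeframes=(), reportings=()):
    # The questions are AND-ed together (an assumption about the filters).
    return [
        tool.name
        for tool in library
        if facet_ok(measures, tool.measures)
        and facet_ok(lifestages, tool.lifestages)
        and facet_ok(timeframes, {tool.timeframe})
        and facet_ok(reportings, {tool.reporting})
    ]


# Made-up entries for illustration only.
LIBRARY = [
    Tool("Tool A", frozenset({"Food Groups"}), frozenset({"Children"}),
         "short", "Prospective"),
    Tool("Tool B", frozenset({"Food Groups", "Energy"}), frozenset({"Adolescents"}),
         "short", "Prospective"),
    Tool("Tool C", frozenset({"Energy"}), frozenset({"Adults", "Elderly"}),
         "long", "Retrospective"),
]

print(select(LIBRARY, lifestages=["Children"]))
print(select(LIBRARY, lifestages=["Children", "Adolescents"]))
print(select(LIBRARY, lifestages=["Children", "Adolescents"], measures=["Energy"]))
```

In this sketch, ticking a second lifestage widens the list, while answering a further question narrows it again, which mirrors the changing tool count shown on the BPG screens.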

Click the button and this will take you to the second stage of the Best Practice Guidelines which is Investigate.

Click on the "Show Tools in Library" tab, which opens the Tool Library in a new tab. The link takes you to the list of tools (shown on the right hand side of the tool library) that meet the requirements you have selected (shown on the left hand side).

You can also visualize the tools in a bubble plot by clicking on "Bubble Chart" from the menu above the Tool Library. Please note that the Tool Library has the same filters as the “Define” stage of the BPG. Once you are familiar with our guidelines, you can use these filters in the Tool Library to explore the validated tools available for a different set of criteria. You can move back and forth between the Tool Library and the Best Practice Guidelines. You are advised to continue reading the ‘Investigate’, ‘Evaluate’ and ‘Think through’ sections of these guidelines in order to help you select the most appropriate tool.

Clicking on a bubble in the bubble plot displays a short summary of the tool and the validation study, as shown below.

See the Best Practice Guidelines – Case Study help section for more information.

Best Practice Guidelines - Case Studies

The case studies illustrate the use of the Best Practice Guidelines (BPG) and help researchers understand and follow each step of the BPG. Start with the pre-study considerations and then go through the steps of the case study.

Pre-Study consideration.

Prior to using the BPG, the researcher should consider the aims and objectives of their study. It is also recommended that a nutritional statistician be consulted to calculate the required sample size, and to discuss the statistical tests that should be undertaken and other aspects of the study design.

Case study 1

Case study 1: Measure of fruit and vegetables intake, water, consumption of high energy-dense foods and sugar sweetened beverages.

Pre-study: what is your research objective?

Answer: The aim of the study is to encourage healthy eating (increasing the consumption of fruit and vegetables and water, and reducing the consumption of high energy-dense foods and sugar sweetened beverages), in a school based intervention programme.

Start by clicking on stage I: Define and follow the guidelines.

Stage I: Define what you want to measure in terms of dietary intake - the key a priori considerations to guide your choice of the appropriate type of Dietary Assessment Tool (DAT).
  1. What? - Characteristics of the main dietary component of interest.

    1. Clearly define what needs to be measured.

      Answer: The researcher plans to evaluate fruit and vegetables intake in children (students from public/private schools) in relation to the WHO recommended daily intake of 400g of fruit and vegetables. The researcher also plans to measure the consumption of high energy-dense foods and sugar sweetened beverages.

    2. Determine how the dietary data will be analysed and presented.

      Answer: The researcher will analyse food, energy and nutrient data using available nutrient software. Dietary data will be presented as the mean and standard deviation of the daily intake for a full day. The researcher clicked on the appropriate boxes in the ‘Define’ screen shown above, and the available tools meeting these options were then automatically selected in the tool library: in this case 27 (at the time of writing).

    Note: the total number of tools in the database will change as more tools from more countries are added to the nutritools website.

  2. Who? – Considerations around the characteristics of study participants.

    1. Define the target sample in terms of characteristics.

      Answer: The researcher wants to measure dietary intake in primary school age children (3 to 11y). Therefore the researcher selected the ‘Children’ option and the available tools meeting the criteria selected dropped to 6. As this is a large scale study, a representative random sample for a pre- and post-evaluation was required.

    2. Identify other issues that could affect the choice of DAT.

      Answer: There are issues that the researcher has considered for their study, such as the level of literacy, technology familiarity and the classification of fruits and vegetables. Also, the estimation of portion sizes by children can add measurement error. Similarly, household items used as a proxy for food measurement may vary among families. Other aspects that the researcher has taken into account were issues in collecting composite foods (food and recipes that include fruit or vegetables).

    3. Consider the study sample size required in relation to the level of variation of your dietary component of interest and study power.

      Answer: After consulting a statistician, the researcher has determined a sample size for their study that gives sufficient statistical power to detect an effect, i.e. an increase in fruit and vegetable intake.

  3. When? – Time frame considerations.

    1. Are you interested in ‘actual’/short-term (hours or several days, up to one week) or ‘usual’/long-term intake (e.g. months or years)? Consider what reference period (e.g. daily, weekly, monthly, yearly) would be best suited to your dietary component of interest.

      Answer: The researcher is interested in assessing short term intake over a 24 hour period (from midnight to midnight of previous day), of fruit and vegetables in school children, collected before and after the intervention (16 weeks apart).

    2. Will data collection in your study be retrospective or prospective?

      Answer: This is a prospective study of all food consumed, where intake is recorded as it is consumed. Dietary assessment will need to be undertaken both at school and at home.
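
The analysis plan described in Stage I (mean and standard deviation of daily intake, set against the WHO recommendation of 400g of fruit and vegetables) could be sketched as follows. This is only an illustration: the function name and the intake values are invented, and a real analysis would be agreed with the statistician.

```python
from statistics import mean, stdev

WHO_TARGET_G = 400  # WHO recommended daily fruit and vegetable intake (g)


def summarise(intakes_g):
    """Summarise one day of fruit and vegetable intake (g) per child."""
    return {
        "n": len(intakes_g),
        "mean_g": mean(intakes_g),
        "sd_g": stdev(intakes_g),  # sample standard deviation
        "share_meeting_target": sum(x >= WHO_TARGET_G for x in intakes_g)
        / len(intakes_g),
    }


# Invented values for five children, before and after the intervention.
baseline = [150, 220, 310, 405, 180]
follow_up = [210, 260, 400, 450, 230]

print(summarise(baseline))
print(summarise(follow_up))
```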

Stage II. Investigate the different types of DAT and their suitability for your research question.
Consider and appraise the different DAT types.

  1. In relation to your research question, consider the suitability, strengths and weaknesses of different DAT types.

    Answer: The researcher referred to the Strengths and Weaknesses of Different DATs on the Nutritools website by clicking on the link shown above. Based on the description and comparison of the strengths and weaknesses of each tool, the researcher decided that the most appropriate tool for their study was a short-term food checklist. In addition to low cost, food checklists have low researcher and participant burden, making them suitable for use in children.

    The food checklist was chosen because it shares many strengths with the longer FFQ: it has low researcher and participant burden, low cost, and the coding is generally simple. The food checklist is suitable for estimating intake of specific foods or nutrients, such as fruits and vegetables.

  2. Think about participant burden (e.g. study participants’ potential willingness, time, ability, ethical considerations, interest in using different tools, and access issues associated with different DATs).

    Answer: The researcher is aware that the assessment of dietary intake in young children is challenging, as many people may be involved in providing their meals. In addition, for a large scale school-based intervention, the cost of data entry and analysis, and the respondent age and burden all influence the choice of DAT. The researcher has also thought about the children’s and parents’ compliance in filling in the questionnaires and the parents’ skills in completing the food checklist. The researcher decided to undertake this assessment using both a school food checklist and a home food checklist. The researcher will advise parents how to complete the home food checklist.

  3. Identify the availability of resources (e.g. staff skill, time, finances).

    Answer: The researcher has considered the staff availability for their study, which is subject to the study financial support. Another aspect taken into account is the ability of the school field worker to collect dietary information about the food eaten in school and to collect the food checklists completed at home/outside of school.

    After answering the questions above from the BPG, the researcher can click the tab ‘Show tools in library’ in the ‘Investigate’ section to view the tools that have been validated for their selected criteria, and can look at these in more detail in the tool library. The tool library opens in a new tab, making it easier for the researcher to go back and forth between the Best Practice Guidelines and the Tool library. In this case, only three UK tools meet the criteria.

    Completing the ‘Define’ screens presents a number of tools in the Tool library that meet the researcher’s criteria (short term and prospective, for children). The UK tools are:

    • CADET (Cade)
    • Food Checklist (Holmes)
    • Semi-Weighed Food Diary (Holmes)

    These tools can also be visually represented through the bubble chart.

    Clicking on one of the bubbles displays information on this tool including the sample size used in the validation study.

    The darker coloured bubble indicates that there is more than one study here.

    The validation results of the DATs for energy and nutrients can also be represented through the summary plot which shows the upper and lower Bland-Altman Limits of Agreement.

    The researcher can click on the bubble or arrows to obtain the main information about the tool and the validation study, e.g. that CADET was compared against the weighed food diary.

    The researcher can move back and forth between these screens and the tab for the Best Practice Guidelines. They can also move back and forth within the Best Practice Guidelines screens. They are advised to continue reading the ‘Evaluate’ and ‘Think through’ sections of these Best Practice Guidelines in order to help them select the most appropriate tool.

Stage III. Evaluate existing tools to select the most appropriate DAT; Select your potential DATs.
Research and evaluate available tools of interest.

  1. Read any available published validation studies:

    • Has the DAT been evaluated to measure the dietary component you are interested in?

      Answer: As seen above, completing the ‘Define’ screens presents a number of UK tools in the Tool library that meet the researcher’s criteria, that are short term and prospective for children:

      • CADET (Cade)
      • Food Checklist (Holmes)
      • Semi-Weighed Food Diary (Holmes)

      The screenshots below show some of the validation details of the tools. All have used the recommended Bland-Altman Limits of Agreement validation method.

      • CADET (The Child and Diet Evaluation Tool) has been validated in Children. The researcher can access information about the tool and on the validation study by clicking on ‘CADET’ in the tool library.

      • The Food checklist (Holmes) has been validated in children as well as for other life stages and is used to measure intake over 4 days. Validation results for boys have been recorded separately from girls.

      • The Semi-Weighed Food Diary (Holmes) has been validated in children and other life stages. However, the use of this tool may be a burden to participants for data collection because of the difficulty and time taken to weigh the food.

    • Has the DAT been evaluated in a population similar to your population of interest?

      Answer: The Food Checklist (Holmes), the Semi-weighed method (Holmes) and the CADET tool have been used in school children. For example, CADET has been shown to be an effective method of capturing fruit and vegetable intake, having been validated for fruit and vegetable intake in two separate studies in children.

    • Is the nutrient database used appropriate?

      Answer: The nutrient software used with CADET, the Food Checklist (Holmes) and the Semi-weighed method (Holmes) was based on the UK food composition tables, McCance and Widdowson’s The Composition of Foods.

    • Are the portion sizes used relevant?

      Answer: The CADET uses age- and gender-specific food portion sizes to calculate food and nutrient intake for children aged 3–7 years old, and a modified version of CADET is available for 3-11 year olds. For the Food Checklist (Holmes) and Semi-weighed method (Holmes), the portion size used was a standard portion.

  2. Assess the quality of validation in terms of:

    • Has the DAT been compared to an objective method (e.g. biomarker)?

      Answer: No, CADET, Food Checklist (Holmes) and Semi-weighed method (Holmes) have not been compared to an objective method. They have been compared against weighed food diaries.

    • Has the DAT been compared to a subjective method (e.g. a different self-reported diet assessment)?

      Answer: Yes, CADET has been compared to a subjective method (one day semi-weighed food diary) for 3-11 year olds. The Food Checklist (Holmes) and the Semi-weighed method (Holmes) have been compared against the weighed food diaries.

    • What were the limitations of the validation study?

      Answer: In CADET, the food portion sizes were based on those reported in the National Diet and Nutrition Survey (NDNS). The number of children in specific age/gender groups was small for some of the foods, which may lead to unreliable food portion size estimates. This may or may not be better than the standard portions used in the Food Checklist (Holmes) and the Semi-weighed method (Holmes).

  3. The strength of agreement between the two methods:

    • Is there any evidence of bias; do the methods agree on average?

      Answer: The limits of agreement between each tool and the comparator (i.e. the validation method) were assessed using Bland-Altman plots. This involved plotting the mean values of nutrients from the two methods against the differences between the two methods.

    • Is there any evidence of imprecision; how closely do the methods agree for an individual?

      Answer: The researcher has considered the level of agreement of the tools, as this indicates their precision. The researcher has looked at the difference between the DATs and the validation methods in the summary plot for energy. Although the Semi-weighed Food Diary (Holmes) had the smallest mean difference (-25kcal for girls), the limits of agreement for this DAT as shown in the summary plot were similar to those of the other two DATs (except for the 2015 Christian validation, where the limits of agreement were much wider).

      In the summary plot, clicking on the bubbles shows some of the details of the tools and each validation study on children, including the limits of agreement and the mean difference between the tool and the validation method. For instance, details of the CADET validation study with the larger sample size (2016 Cade) for energy can be seen:

      To obtain the limits of agreement (LOA) for fruit and vegetable intake, the researcher needed to access the Validation Results from the detailed information screens from the tool library for each of the DATs. They found the LOA between CADET and the validation method in the 2006 Cade validation study was between -327 and 417g, and in the 2015 Christian validation study it was between -226 and 333g. However, no validation for fruit and vegetables had been undertaken for Food Checklist (Holmes) and the Semi-weighed Food Diary (Holmes).

      Clicking on the ‘Detailed Information’ button of the validation paper displays the following screen. The abstract explains that the tool was validated against a 24-hour semi-weighed food diary in children from a range of socio-economic and ethnic backgrounds. The researcher can access the full paper if available or click a web link for the abstract.

      The researcher can view the limits of agreement for fruit and vegetables by clicking on the ‘Validation Results’ tab.
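
For readers unfamiliar with the Bland-Altman limits of agreement referred to in this stage, the calculation can be sketched as below. The sketch uses the standard definition (mean difference plus or minus 1.96 standard deviations of the differences) and assumes each difference is taken as tool minus validation method; the sign convention used on the Nutritools site is not stated here. The function name and the paired intake values are invented for illustration.

```python
from statistics import mean, stdev


def bland_altman(tool_values, reference_values):
    """Return (mean difference, lower LOA, upper LOA) for paired measurements."""
    if len(tool_values) != len(reference_values):
        raise ValueError("measurements must be paired")
    # Assumed convention: difference = tool minus validation method.
    diffs = [t - r for t, r in zip(tool_values, reference_values)]
    bias = mean(diffs)   # do the methods agree on average?
    sd = stdev(diffs)    # spread of individual differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Invented paired intakes (g/day) for four children.
tool = [100, 200, 300, 400]
diary = [90, 210, 280, 390]

bias, lower, upper = bland_altman(tool, diary)
print(f"mean difference {bias:.1f}g, limits of agreement {lower:.1f}g to {upper:.1f}g")
```

A mean difference close to zero suggests little bias on average, while wide limits of agreement suggest that the tool and the validation method can differ considerably for an individual.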

If, based on the validation studies, none of the existing DATs is entirely suitable, consider the need to modify or update an existing DAT, or create a new DAT, and evaluate it.

  1. Decide whether an existing tool can be improved. Investigate whether:

    • Foods and portion sizes included are characteristic of your target population; and frequency categories are appropriate.

      Answer: Yes, the foods and portion sizes included in CADET are specific to the children’s age and sex. The Food Checklist (Holmes) and Semi-weighed Food Diary (Holmes) estimated portion sizes using a photographic food atlas and/or averaged portion sizes according to the age and sex of the respondent.

    • The time period that the questionnaire refers to could be modified to better suit your needs.

      Answer: The researcher may be able to use the questionnaire for CADET, Food Checklist (Holmes) and Semi-weighed Food Diary (Holmes) for a different time period, with caution, subject to the study aim. For instance, although CADET was designed to be used over one day, it could be used on additional days.

  2. Consider the face validity of existing tools. Is there evidence the tool has been used to measure dietary intake in your population of interest?

    Answer: Yes, there is evidence that CADET has been used to measure fruit and vegetables in children, and this has been validated against semi-weighed diaries.

    The researcher also planned to measure the consumption of high energy-dense foods and sugar sweetened beverages, which can be achieved using CADET, the Food Checklist (Holmes) or the Semi-weighed Food Diary (Holmes). Alternatively, the researcher can create a new questionnaire using the Food Questionnaire Creator, modifying the questions of an existing tool to ensure all energy-dense foods and sugar sweetened beverages are measured.

  3. Updated or modified tools may require re-evaluation. Consider if validation can be integrated into your study.

    Answer: Yes. For instance, if the researcher plans to update the CADET tool for use with older children, then the tool will be re-evaluated with another validation study.

Select your DAT.

Answer: As the target group in this scenario comprises children we should select tools that are acceptable, easy for them to use and validated among them.

There is not a suitable tool which fits all criteria. Thus, after evaluating the strengths and weaknesses of the 3 DATs, the researcher has chosen the CADET tool to be the most appropriate tool, based on the validity of the tool, particularly in relation to fruit and vegetable intake and ease of use. CADET is also available in the Food Questionnaire Creator where it can be modified if required.

Stage IV. Think through the implementation of your chosen DATs.
Consider issues relating to the chosen DAT and the measurement of your dietary component of interest.

  1. Obtain information regarding DAT logistics (e.g. tool manual, relevant documents and other requirements from the DAT developer).

    Answer: Obtain the manual guide for using CADET from the Nutritional Epidemiology Group. Contact tool owner Prof Cade (J.E.Cade@leeds.ac.uk). This information is displayed in the tool library, tool summary contact information.

  2. Check that the chosen DAT has the most appropriate food/nutrient database and software.

    Answer: After contacting the tool owner, the researcher found that CADET has the most appropriate nutrient database, as it uses nutritional information based on McCance and Widdowson’s The Composition of Foods.

  3. Check the requirements for dietary data collection (e.g. entry, coding and software).

    Answer: If a dish is not listed on CADET, the researcher considered selecting the closest food.

  4. Consider collecting additional related data (e.g. was intake typical; supplement use).

    Answer: The researcher may gather additional data, for example on whether the recorded intake was typical of the child’s usual diet.

Prepare an implementation plan to reduce potential biases when using your chosen DAT

  1. Consider potential sampling/selection bias and track non-participation/dropout/withdrawal at different stages.

    Answer: The researcher determined an appropriate sample size with the help of a statistician. To encourage participation of all families, the researcher also engaged the interest of schools, parents, and children with prior study information about benefits of their participation in the study, which can be stated in a leaflet.

  2. Minimise interviewer bias (e.g. ensure staff qualifications and training are appropriate; develop standardised training protocols and monitoring procedures).

    Answer: The field researcher is aware of the importance of a good level of skill needed to undertake the interview/measurement in children, as well as an ability to build a good rapport with them and any adults involved, in order to improve the quality of data.

  3. Minimise respondent biases (e.g. use prompts, clear instructions).

    Answer: The researcher has ensured that there are reminders or questions to prompt participants, in order to minimise omitted foods.

  4. Quantify misreporting.

    Answer: The researcher is taking into consideration that reporting of intake by parents and children may be prone to social desirability bias towards reporting healthy foods and under-reporting unhealthy foods. Previous research may provide an indication of under-reporting for these age-groups.

Case study 2

Case study 2: Assess the relative risk of being diagnosed in the UK with colorectal cancer in relation to fibre and calcium intake.

Pre-study: what is your research objective?

Answer: The aim of the study is to measure fibre from grains, energy and calcium intake in adults and elderly.

Stage I: Define what you want to measure in terms of dietary intake - the key a priori considerations to guide your choice of the appropriate type of Dietary Assessment Tool (DAT).
  1. What? - Characteristics of the main dietary component of interest.

    1. Clearly define what needs to be measured.

      Answer: The researcher would like to determine habitual dietary fibre intake from cereal, fruit and vegetables; calcium intake from dairy products, vegetables (e.g. spinach, broccoli) and grains; and total energy intake in adults and the elderly, in relation to their risk of colorectal cancer. Therefore the researcher looks for a tool that can measure food groups as well as energy, macro and micronutrients.

    2. Determine how the dietary data will be analysed and presented.

      Answer: The researcher plans to present the relative risk of colorectal cancer in relation to fibre and calcium intake. Mean values of fibre, energy and calcium will be determined.

  2. Who? – Considerations around the characteristics of study participants.

    1. Define the target sample in terms of characteristics.

      Answer: The researcher will compare people who were diagnosed with colorectal cancer and people without the disease. Intakes of the colorectal cancer cases will be compared with those of a sex-matched control group without the disease. Cancer cases are more likely to be found in adults, and particularly the elderly. Note that clicking on both the Adults and the Elderly boxes will select tools that have been validated in either of these life stages.

    2. Identify other issues that could affect the choice of DAT.

      Answer: The researcher has identified issues in measuring intake, such as defining serving or portion size. Other issues identified in defining dietary fibre are: the method for determining fibre intake (e.g. the AOAC gravimetric method or the Englyst method); use of recipe files; and differentiating between digestible and non-digestible mono- and disaccharides. In most countries, the Association of Official Agricultural Chemists (AOAC) gravimetric method is used to analyse fibre (covering soluble and insoluble forms, including lignin, as well as non-starch polysaccharides and resistant starch), except in the UK and Greece, where the Englyst method is used. The elderly may not be familiar enough with technology to complete an online tool; therefore a paper format would be preferable.

    3. Consider the study sample size required in relation to the level of variation of your dietary component of interest and study power.

      Answer: The researcher has estimated a suitable sample size for their case-control study with the help of a statistician. Although relatively small sample sizes are required for case-control studies, recruitment of new cancer patients via health practitioners in a number of hospital cancer units over a period of many months may be required to obtain a large enough sample for the analysis.

  3. When? – Time frame considerations.

    1. Are you interested in ‘actual’/short-term (hours or several days, up to one week) or ‘usual’/long-term intake (e.g. months or years)? Consider what reference period (e.g. daily, weekly, monthly, yearly) would be best suited to your dietary component of interest.

      Answer: The researcher is interested in total daily intake of dietary fibre and calcium (g/day) over the past year and therefore the researcher selects the long timeframe option.

    2. Will data collection in your study be retrospective or prospective?

      Answer: The researcher wants to measure habitual intake and should tick either the retrospective or the usual reporting method box (both will select the same tools).

    Note: the total number of tools in the database will change as more tools from more countries are added to the nutritools website.

Stage II. Investigate the different types of DAT and their suitability for your research question.
Consider and appraise the different DAT types.

  1. In relation to your research question, consider the suitability, strengths and weaknesses of different DAT types.

    Answer: After looking at the strengths and weaknesses of each DAT type [Strengths and Weaknesses of DATs] for the food diary, the 24HR and the FFQ, and comparing their use and administration, the researcher selected the self-administered FFQ as a suitable DAT to assess fibre intake in colorectal cancer patients.

  2. Think about participant burden (e.g. study participants’ potential willingness, time, ability, ethical considerations, interest in using different tools and access issues associated with different DATs).

    Answer: The researcher has taken into account that participants’ estimates of their fibre and calcium intake are likely to contain errors, as averaging intakes over a long period requires good memory, literacy and numerical skills.

  3. Identify the availability of resources (e.g. staff skill, time, finances).

    Answer: The researcher has identified skilful staff who will be available to administer the dietary assessment tool and estimate the dietary fibre and calcium intake from the responses.

    After completing the ‘Define’ screens, clicking on ‘Show Tools in Library’ opens the Tool Library in a new tab showing the researcher’s selected tool criteria on the left hand side of the screen. The researcher is interested in tools created for the UK population and therefore clicks on the UK box; six UK validated tools are presented in the Tool library that meet the researcher’s criteria.

    The researcher also looks at the bubble chart for the selected tools in the Tool library. When the researcher hovers over and clicks on each bubble, summary information for the tool and the validation paper is displayed, as shown below.

    Then the researcher navigates to the summary plots, selecting energy, fibre (NSP) and then calcium as the X variable of interest for the validation studies that have used Bland-Altman limits of agreement for validation. The black circles on the plots represent the difference in estimated intakes between the DAT and the reference validation method, for men and women separately. The arrows represent the upper and lower limits of agreement; clicking on the arrows gives information on each tool and its validation, for women and men. Not all of the UK tools have been validated for energy, fibre and calcium using the Bland-Altman validation method. The Short Food Group Questionnaire (Roddam) does not appear in the summary plots because it was not validated using this recommended method.

    Note: the summary plots can also display validation information relating to carbohydrates, protein, fat, saturated fat, sugar, iron, sodium and zinc.

Stage III. Evaluate existing tools to select the most appropriate DAT; Select your potential DATs.
Research and evaluate available tools of interest.

  1. Read any available published validation studies:

    • Has the DAT been evaluated to measure the dietary component you are interested in?

      Answer: Completing the ‘Define’ screens presents a number of validated tools in the Tool library that meet the researcher’s criteria.

      • Cambridge FFQ (Bingham)
      • UK EPIC FFQ (McKeown)
      • FFQ (Whitehall II Study) (Brunner)
      • Oxford FFQ (Bingham)
      • Quest1 (O’Donnell)
      • Short Food Group Questionnaire (Roddam)

      From the information on the summary plots shown above, energy, fibre and calcium have been validated using the Bland Altman method only for the Cambridge FFQ (Bingham), Oxford FFQ (Bingham) and UK EPIC FFQ (McKeown) tools. The researcher can look at the detailed information listed in the Tools library in the validation results screen for each validation study of each DAT which shows the results of the validations for fibre (shown as NSP – non-starch polysaccharides), calcium and energy. A variety of other methods have been used to validate intakes for these in all the tools. For more information on the validation methods please see the section on ‘Statistical tests used in validation studies’.

      Example of screenshot of Cambridge FFQ (Bingham):

      Once the researcher clicks on the detailed information of the validation paper, the following screen is displayed. The researcher can access the full paper if available or click a web link for the abstract.

      The researcher can view the results for the nutrients of interest, in this case study fibre, calcium and energy intake. In addition, validation information for 15 other nutrients, fruit and vegetables and urinary nitrogen (a biomarker for dietary protein) is displayed if these have been measured in the validation study. Later the researcher can check the validation papers to determine whether the tool has been validated for other foods that may be of interest where these are not presented on the Nutritools website.

    • Has the DAT been evaluated in a population similar to your population of interest?

      Answer: All the DATs have been validated in adults, but only the EPIC FFQ has been validated in elderly populations for the food and nutrients of interest. In addition, the Oxford FFQ (Bingham) and Cambridge FFQ (Bingham) were validated for NSP (fibre) (g) and calcium intake only in adult women.

    • Is the nutrient database used appropriate?

      Answer: The researcher is planning to use the UK McCance and Widdowson's composition of foods nutrient database to estimate fibre, energy and calcium intake. The researcher is aware that dietary fibre, or specific dietary fibre components, tends to be underestimated because of the analytical method used to determine fibre content; such analytical errors in the food composition database lead to underestimated fibre intake.

    • Are the portion sizes used relevant?

      Answer: The portion sizes related to the FFQ should be adequate for the adult and elderly population under study. Elderly populations tend to consume smaller portion sizes. To increase accuracy of portion sizes, photographs of food portion sizes can be included in the tool. The Cambridge FFQ (Bingham), EPIC FFQ (McKeown), FFQ (Brunner), and the Oxford FFQ (Bingham) also used photographs to help the respondents estimate their portion size. The elderly may need help completing the DAT.

  2. Assess the quality of validation in terms of:

    • Has the DAT been compared to an objective method (e.g. biomarker)?

      Answer: The EPIC FFQ (McKeown) has been compared to objective methods: biomarkers of intake in urine (nitrogen, sodium) and blood plasma (plasma ascorbic acid).

    • Has the DAT been compared to a subjective method (e.g. a different self-reported diet assessment)?

      Answer: Yes, they have been compared to a subjective method for the food and nutrients of interest and populations of interest. The Oxford FFQ (Bingham) and Cambridge FFQ (Bingham) have been compared to weighed food diaries. The FFQ (Brunner) and EPIC FFQ have been compared to the estimated food diary, which is less reliable than the weighed food diary method.

    • What were the limitations of the validation study?

      Answer: None were reported.

      In the Cambridge FFQ (Bingham) no units or portion sizes were specified, and the portion sizes were obtained from the weighed food records of one group. For instance, the portion size assigned to milk in the Oxford FFQ (Bingham) was 567g (one pint), whereas in the Cambridge FFQ (Bingham) it was 59g.

  3. The strength of agreement between the two methods:

    • Is there any evidence of bias; do the methods agree on average?

      Answer: The Bland-Altman mean difference and limits of agreement is the recommended validation method to measure absolute agreement when comparing continuous data. This method was used to validate energy, fibre and calcium in the Oxford FFQ (Bingham), Cambridge FFQ (Bingham), and UK EPIC FFQ. However, the FFQ (Whitehall) (Brunner) and the Quest1 (O’Donnell) FFQ did not validate fibre.

    • Is there any evidence of imprecision; how closely do the methods agree for an individual?

      Answer: The narrowest limits of agreement for fibre (NSP) were found for the Oxford FFQ (Bingham), UK EPIC FFQ (Bingham) and the Quest1 (O’Donnell) FFQ. For calcium they were narrowest in the UK EPIC FFQ (Bingham) and the Quest1 (O’Donnell) FFQ. The UK EPIC FFQ also has smaller mean differences between the DAT and the validation method.

If, based on the validation studies, none of the existing DATs is entirely suitable, consider the need to modify or update an existing DAT, or create a new DAT, and evaluate it.

  1. Decide whether an existing tool can be improved. Investigate whether:

    • Foods and portion sizes included are characteristic of your target population; and frequency categories are appropriate.

      Answer: The researcher has evaluated whether foods and portion sizes used with the DATs to estimate fibre intake from grains are relevant to the population of study, in this case, to adults and elderly.

    • The time period that the questionnaire refers to could be modified to better suit your needs.

      Answer: The researcher wishes to use a DAT that will assess intake in the 12 months before the participants were diagnosed with cancer; the Oxford FFQ (Bingham), Cambridge FFQ (Bingham), EPIC FFQ and FFQ (Brunner) are suitable for this time period. However, it is possible that dietary behaviour may have changed within the previous 12 months as a result of the disease process. Although it may be possible to modify the tool to assess intake over a longer period, this may be prone to recall errors.

  2. Consider the face validity of existing tools. Is there evidence the tool has been used to measure dietary intake in your population of interest?

    Answer: The tools selected by the tool library have been validated in adults, but not all were validated in the elderly.

  3. Updated or modified tools may require re-evaluation. Consider if validation can be integrated into your study.

    Answer: The researcher decided not to modify any of the tools.

Select your DAT.

Answer: After reviewing the information provided in the validation papers (mean difference, variance measured, correlation coefficients, percentage agreement, and limits of agreement) and the tool's validation in adult and elderly populations, the UK EPIC FFQ is selected for this hypothetical scenario.

If none of the DATs selected by the initial criteria used in the ‘Define’ screens were suitable, other potentially useful DATs could be selected and explored by relaxing the criteria and changing the filters on the left hand side in the Tools library.

Another suggestion would be to use the Food Questionnaire Creator developed through the Nutritools website to design the FFQ. The researcher should decide which foods to include in the questionnaire for the measurement of the nutrient of interest, and which questions to ask to measure dietary fibre.

Stage IV. Think through the implementation of your chosen DATs.
Consider issues relating to the chosen DAT and the measurement of your dietary component of interest.

  1. Obtain information regarding DAT logistics (e.g. tool manual, relevant documents and other requirements from the DAT developer).

    Answer: The researcher has looked at the tool summary and, if available, will obtain relevant documents on the use and administration of the UK EPIC FFQ.

  2. Check that the chosen DAT has the most appropriate food/nutrient database and software.

    Answer: The researcher checks whether the software specifically designed for the UK EPIC FFQ is appropriate to estimate fibre intake in addition to the other foods and nutrients of interest. The software uses McCance and Widdowson’s The Composition of Foods; the researcher will check whether the latest version has been incorporated.

  3. Check the requirements for dietary data collection (e.g. entry, coding and software).

    Answer: The researcher assesses the requirements for entering fibre intake data and for the software analysis. They identify food sources of cereal fibre (whole grain bread, crisp bread, oats, muesli, breakfast cereal, pasta and rice).

  4. Consider collecting additional related data (e.g. was intake typical; supplement use).

    Answer: The researcher has considered collecting data on other sources of fibre in the diet, such as those derived from non-cereal sources (vegetables or fruit).

Prepare an implementation plan to reduce potential biases when using your chosen DAT

  1. Consider potential sampling/selection bias and track non-participation/dropout/withdrawal at different stages.

    Answer: The researcher has ensured an adequate sample size of people with colorectal cancer. The researcher has also explained the benefits of the study to participants and provided a leaflet about this for them to read. In addition, the researcher has obtained contact details to follow up the patients.

  2. Minimise interviewer bias (e.g. ensure staff qualifications and training are appropriate; develop standardised training protocols and monitoring procedures).

    Answer: To minimise bias, the researcher has considered using trained interviewers to administer a standard FFQ to both case and control participants.

  3. Minimise respondent biases (e.g. use prompts, clear instructions).

    Answer: The researcher will give the interviewers clear instructions to ask for specific food items that are consumed over the reference period.

  4. Quantify misreporting.

    Answer: The researcher has considered misreporting. Misreporting is likely to be greater when using a self-administered FFQ, especially in patients who are unwell. Misreporting will be difficult to quantify without collecting weight, physical activity, age and sex data to estimate energy expenditure, e.g. by the Schofield and Goldberg methods. Previous research on the target population may provide an indication of misreporting.
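The idea behind quantifying misreporting can be sketched in a few lines of Python. This is a simplified illustration only: it compares reported energy intake (EI) with an estimated basal metabolic rate (BMR) and flags participants whose EI:BMR ratio falls below a cut-off. The function names, the participant records and the cut-off of 1.0 are all hypothetical; the BMR values are assumed to have been estimated beforehand (for example from weight, age and sex), and the published Goldberg method derives its cut-off from the assumed physical activity level and the sample size rather than using a fixed number.

```python
def ei_bmr_ratio(energy_intake_kcal, bmr_kcal):
    """Ratio of reported energy intake to estimated basal metabolic rate."""
    if bmr_kcal <= 0:
        raise ValueError("BMR must be positive")
    return energy_intake_kcal / bmr_kcal


def flag_possible_under_reporters(participants, cutoff):
    """Return the IDs of participants whose EI:BMR ratio is below the cut-off."""
    return [
        p["id"]
        for p in participants
        if ei_bmr_ratio(p["ei_kcal"], p["bmr_kcal"]) < cutoff
    ]


# Made-up records for illustration; BMR is assumed to be pre-computed.
participants = [
    {"id": "P01", "ei_kcal": 2100.0, "bmr_kcal": 1500.0},
    {"id": "P02", "ei_kcal": 1200.0, "bmr_kcal": 1600.0},
]

# The cut-off here is a placeholder, not a recommended value.
flagged = flag_possible_under_reporters(participants, cutoff=1.0)
print(flagged)
```

Without weight, age and sex data there is no BMR estimate to divide by, which is why the answer above notes that misreporting is hard to quantify when these are not collected.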

Further background information on dietary assessment methodology can be found at:

Dietary Assessment Primer

Diet, Anthropometry and Physical Activity (DAPA) Measurement Toolkit

Strengths and Weaknesses of DATs

To identify the Strengths and Weaknesses of Dietary Assessment tools click on the tab Dietary Assessment Guidelines and then on Strengths and Weaknesses.

A total of six dietary assessment tools (DATs) will be displayed:

  1. Food diaries
  2. 24hr recall
  3. Food frequency questionnaire
  4. Food checklists
  5. Dietary histories
  6. Emerging technologies

Clicking on each dietary assessment tool will provide a detailed description of the DAT and will open a screen in which information on its Strengths and Weaknesses is summarised in a table.

The information in this section will help you decide on the most suitable DAT to measure dietary intake in your study.

For further information on the definition of words refer to the glossary by clicking the underlined word. If you need more information, please refer to the following websites:

Library

Tool Library: Searching for a tool:
  1. To search for a dietary assessment tool click on Tool E-Library on the welcome page or click on Dietary Assessment Tools from the menu at the top, then select Library. On the left hand side of the tool library you will find the filter options to select the specific characteristics of the tool under “Tool Filter” and the characteristics of the validation method used to validate the tool under “Validation Method Filter”. On the right hand side the results (available tools) matching your selection criteria will be displayed. Being flexible with the filter options will allow you to display more tools.

Using the filters
  1. Depending on the characteristics of your study or research activity, click on the options under “Tool filter” to select the tool type (e.g. Food Diary weighed, Diet recall), dietary exposure (e.g. full nutrient, food groups), timeframe tool measures (e.g. short, long), reporting method (e.g. retrospective, prospective, usual), and the format (paper or online) that you would like to cover in your assessment methodology. To assist your decisions, please refer to section 2 “Best Practice Guidelines - Case Studies”, go to the paper, or navigate through the Best Practice Guidelines screens on the website by clicking on Dietary Assessment Guidelines > Best Practice Guidelines from the main menu. We recommend that you do this first.

  2. Click on the “Validation method filter” to select the sex (e.g. male, female), lifestage (e.g. children, adults), comparator method (e.g. food diary weighed, food diary estimated), year of tool validation (e.g. 1990 to 1999, 2000 to 2009), and geographical area (UK, Europe, North America) that meets your validation criteria and reflects the population of your study. For more help, please refer to the Best Practice Guidelines screens, section 2 “Best Practice Guidelines - Case Studies” or the paper, to assist you in your decision.

Note: When you tick more than one option within a filtering criterion, the results will show the tools that match one or more of the options selected. It will not merge tools using one option and another. The only exceptions are the domain of Dietary exposure, which will combine tools with one option and another, and where a “full nutrient” option is also available (combining tools that have measured energy plus a minimum of 3 macro and 3 micronutrients). There is also an option that combines “both” sexes (female and male).
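One way to picture the filter behaviour described in the note is the short Python sketch below. It is an interpretation, not the website's actual implementation: it assumes that options ticked within one criterion are alternatives (a tool needs to match any one of them), that different criteria must all be satisfied, and that a criterion with nothing ticked does not restrict the results. The tool records, field names and the `matches` function are hypothetical, and the special handling of Dietary exposure mentioned in the note is not modelled.

```python
def matches(tool, selected):
    """True if the tool satisfies every criterion that has at least one option ticked."""
    for criterion, options in selected.items():
        # An empty set means nothing was ticked, so the criterion is ignored.
        if options and tool.get(criterion) not in options:
            return False
    return True


# Hypothetical tool records for illustration only.
tools = [
    {"name": "Tool A", "tool_type": "FFQ", "format": "Paper"},
    {"name": "Tool B", "tool_type": "Food diary", "format": "Online"},
    {"name": "Tool C", "tool_type": "24hr recall", "format": "Paper"},
]

# Two tool types ticked, format left blank: either tool type is accepted.
broad = {"tool_type": {"FFQ", "Food diary"}, "format": set()}
broad_names = [t["name"] for t in tools if matches(t, broad)]

# Ticking a format as well narrows the list.
narrow = {"tool_type": {"FFQ", "Food diary"}, "format": {"Paper"}}
narrow_names = [t["name"] for t in tools if matches(t, narrow)]

print(broad_names, narrow_names)
```

Under these assumptions, leaving a filter blank keeps more tools in the results, which is consistent with the advice to be flexible with the filter options.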

Exploring the results
  1. Results listing all the tools that match your selected criteria will be shown on the right hand side of the webpage. The total number of tools selected is displayed at the bottom of the list. From the displayed list, click on each tool name to read information about the tools as well as their validation details. This action will open a new box as shown below with general information on the tool and the validation study (ies) describing when, how (comparator method used for validation, the number of nutrients that it has been validated or tested against -out of a list of 20- and which statistical methods have been used), as well as in which population(s) it has been validated. For further information on the tool itself or the validation study (ies), click on the “Detailed Information” and on the “Web link” or “Download paper”.

    Note: to go back to the results list of tools, click on the “close” button on the top right hand side.

    Assessing the tool information and validation of papers will help you to select the most suitable DAT for your study. Some tools have been validated in more than one study using different comparators and /or different populations e.g. elderly as well as adults, which may assess the validity of the tool for different nutrients.

  2. When you click on “Detailed information” in the tool information box you will be provided with a brief tool summary (including the type of dietary exposure it can collect, reporting method, timeframe of the tool, the method used to collect the data, the availability of the tool in the Food Questionnaire Creator and the web link to the paper). If available, it will also provide information on contact details and special considerations related to the version of the tool and other relevant validation details.

  3. When you click on the “How to use” tab, it will give you information related to the software needed to use the tool, training required for data collectors, instructions on how to use the tool, administration methods or steps to be considered when implementing the tool as well as how data analysis was conducted, if available.

  4. Click on “Validation Papers” to see, for each study, the author, year of study, comparator, lifestage, sex, the number of nutrients validated, whether the mean difference, variance measured, correlation coefficients, percentage agreement and limits of agreement are available on the website (these terms link to definitions in the glossary), and any special considerations of the validation studies.

    From the Library section, when clicking on a tool name, the same information is displayed. E.g.

    When clicking on “Detailed information” for the validation paper, this will provide a number of tabs which will firstly show the abstract of validation study.

    A summary of the study design information from the paper can be accessed when clicking the ‘Study Design’ tab.

    When you click on the ‘Validation Results’ tab you will be presented with the detailed statistical validation results for the nutrients validated. More information about the types of statistical tests commonly used for validation can be found later in the help section. Nutritools currently reports validation results on 20 nutrients; a list of these can be found in the info icon at the top of the validation results page next to Total nutrients validated. Clicking on ‘Best Practice Guidelines - Evaluation Checklist’ will open a new browser window showing the Best Practice Guidelines, so that you can easily refer to questions that help you research and evaluate the tools whilst keeping the validation results window open. Clicking on ‘Tool Library’ will take you back to the results of your search based on the filter criteria.

Accessing the original published validation/tool study (ies) online
  1. A web link to the website where the article is published will be provided for all of the studies which contain information on dietary assessment tools or validation details. The link will open in a new browser window.

    Nutritools will also provide the full text (in PDF format) of articles which have open access. These will be available when the Download paper symbol is present.

Visualisation plots: Bubble Chart and Summary plots

To allow comparison among DATs you can use the Bubble chart and the Summary plot.

Bubble Chart

The Bubble chart allows you to compare DATs based on the tools and validation characteristics. This can be found under the Dietary Assessment Tools tab:

Elements of the bubble chart

The x axis (horizontal) indicates the year in which the tool was validated, whereas the y axis (vertical) shows one of three possible variables:

  • Timeframe tool measures (Long, medium, short or multiple)
  • Number of Nutrients validated
  • Number of statistical methods used/calculated (out of a maximum of 6)
  1. To display these options click on the dropdown next to Y variable:

    The colour of the bubble equates to the tool type:

    The size of the bubble equates to the sample size. The bigger the bubble, the larger the sample size in the validation study. Larger sample sizes give more statistical power and therefore more certainty of detecting true effect sizes in what is being measured, tested or compared. However, for certain methodologies (such as studies including doubly labelled water), it may be difficult, costly and thus impractical to recruit a larger number of participants; results from smaller samples can therefore be equally informative and reliable to use.

    When you click on each bubble this will display summary information on the tool and its validation details (including the sample size). Each bubble represents one validation study.

  2. When you select "Timeframe tool measures" in the y axis, this will display the terms "Multiple, short, medium and/or long" on the chart, depending on the period of dietary intake that the tool is capable of measuring. Click on the button to see a detailed explanation of each category, as follows:

  3. When you select "Number of nutrients validated" a numerical scale of 0 to 20 will show on the Y axis. Note that there are some tools that have not been validated against any nutrient itself (and thus will appear aligned to "0") as their main purpose was measuring broader dietary components such as Food groups (i.e., Fruit and vegetables, dairy products).

  4. When you select "Number of statistical methods used/calculated" a numerical scale from 0 to 6 will appear on the Y axis.

    Statistical methods include: mean difference (Mean of nutrient from reference tool – Mean of nutrient from a specific tool), standard deviation, correlation coefficient (Pearson, Spearman), limits of agreement, percentage agreement and/or Cohen’s kappa coefficient. Please refer to the statistics help section if you require more information on these methods. Note that some tools appear aligned to "0" because they have not used the statistical validation methods mentioned above, but they may have reported other statistical methods which were outside the scope of those considered by the Nutritools website.

Comparing and interpreting results in the bubble chart

If you choose one of the options in the tool filter: "tool type" and one in the validation method filter: 'Comparator' the Bubble chart will show all the validated tools on the right hand side which fit the criteria chosen.

For instance, click on tool type: Food Checklist and then click on comparator: Food Diary Weighed. This will show several bubbles, each one of them represents one validation study. If you select as your Y variable "Timeframe tool measures" you will notice that those measuring a medium timeframe (a period of dietary intake between 1 week and 1 month) all correspond to the 7 day Food checklist. They appear in 3 different bubbles on the top as there has been more than one study validating different characteristics of this tool (for example, different sex, age group or number of micro and macronutrients).

Remember that clicking on each bubble will display summary information. Clicking on ‘See tool in library’ will direct you to the Tool Summary tab where more details can be found.

Looking at the size of the bubbles, the biggest ones correspond to the 7-day Food checklist by Bingham et al., 1994 and CADET by Cade et al., in 2006. Exploring the validation papers will reveal that they had both been validated in 160 participants, but in different age groups: adults and children, respectively.

When comparing tools by the "Number of nutrients validated" on the Y axis, you will again notice that the 7-day food checklist (Bingham et al, 1994) is at the top with 14 nutrients validated, followed by CADET and the 7-day Food checklist (Johansson et al, 2008), each with 13 nutrients validated. If a tool has been validated for a larger number of nutrients, it should provide more confidence when using the tool for a wider array of dietary components.

If you compare tools by the "Number of statistical methods used/calculated", then you will notice that the 7-day food checklist by Johansson et al has only reported 2 out of 6 possible statistical estimates (those being mean difference, standard deviation, correlation coefficient, limits of agreement, percentage agreement and/or Cohen's kappa coefficient, details of which can be found by exploring the validation paper in the tool library), whereas CADET has reported 4 and the 7-day food checklist by Bingham et al, 1994 has reported 5 statistical measures.

The "best options" of DATs will depend on your aims, needs and the resources available to conduct your study or research: for example, whether you need a tool that has been validated in children or the elderly, that is capable of measuring dietary intake over longer periods, or that has been evaluated on a broader range of dietary components. The bubble chart is yet another resource to help you make a decision. If in doubt, go through the steps described in the "Best Practice Guidelines" on the website or refer to the paper. You can also explore the case studies covered in previous sections to assist you in making an informed decision.

Summary Plots

The summary plots are a visual approach to compare the mean difference in intakes for certain nutrients between the dietary assessment tools (DATs) and the comparators used in the validation studies. They can help you assess agreement between the tools and identify how much one tool is likely to differ from a comparator/reference method when measuring a specific nutrient or dietary component.

Elements

The arrows represent the upper and lower limits of agreement, the bubble represents the mean difference of the two methods (or the difference between the average of individual measurements from a dietary assessment tool and the average of individual measurements from a comparator tool). The size of the bubble equates to the sample size. For more information about the Bland-Altman limits of agreement and other statistical methods used in validation studies please refer to the statistics help section.

Wider arrows represent more variation around the mean difference between methods and thus imply that the tools differ more in the results they provide.

The "0" in the middle represents the position at which there is no difference between the means of both methods (and that is a sign of a perfect agreement). A bubble that is closer to "0" is indicative of more similar results between the dietary assessment tool and its comparator. Bubbles on the right hand side indicate an overestimation (or positive difference) of a particular nutrient, whereas bubbles on the left hand side represent underestimation (or negative difference).
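The elements of a summary plot can be reproduced from paired measurements with a small Python sketch. It assumes the usual Bland-Altman convention of mean difference plus or minus 1.96 standard deviations of the individual differences, and it takes the difference as DAT minus comparator so that a positive value corresponds to overestimation, as described above. The function name and the intake values are hypothetical and for illustration only.

```python
from statistics import mean, stdev


def bland_altman(dat_values, reference_values):
    """Return (mean difference, lower limit, upper limit) for paired measurements.

    Differences are DAT minus reference, so a positive mean difference
    means the DAT overestimates relative to the comparator.
    """
    if len(dat_values) != len(reference_values):
        raise ValueError("Both methods must be measured on the same individuals")
    if len(dat_values) < 2:
        raise ValueError("At least two pairs are needed to estimate the spread")
    diffs = [d - r for d, r in zip(dat_values, reference_values)]
    mean_diff = mean(diffs)
    sd_diff = stdev(diffs)  # sample standard deviation of the differences
    return mean_diff, mean_diff - 1.96 * sd_diff, mean_diff + 1.96 * sd_diff


# Made-up energy intakes (kcal) for five individuals.
dat = [2100.0, 1850.0, 2400.0, 1950.0, 2250.0]
reference = [2000.0, 1900.0, 2300.0, 2000.0, 2100.0]

mean_diff, lower, upper = bland_altman(dat, reference)
print(f"mean difference {mean_diff:.1f}, limits of agreement {lower:.1f} to {upper:.1f}")
```

In plot terms, `mean_diff` is the position of the bubble and `lower` and `upper` are the ends of the arrows; the number of pairs would set the bubble size.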

There are different dietary components that you can visualise and compare in a summary plot. Depending on your study needs, define your X variable (on the x-axis) by selecting one of: Energy (kcal), Carbohydrates (g), Protein (g), Fat (g), Saturated fat (g), Sugar (g), Fibre (g), Calcium (mg), Iron (mg), Sodium (mg) or Zinc (mg) using the box at the top. To display these options click on the downwards arrow ▼ next to X variable:

Note that the scale of the X axis will vary depending on the units in which the measurement has been estimated (either kcals, grams or milligrams).

For instance, click on the tool filter, tool type: Dietary Recall. Click on the validation method filter, comparator: Doubly Labelled Water. On the right hand side a summary plot will be displayed of the tools meeting those criteria.

Click on the bubble of a particular tool to display summary information including the lifestage of the population validated, the comparator used and the specific data points of the mean difference and limits of agreement which are needed to compare the assessment tools.

Note: The author’s name in brackets next to each displayed tool is the author of the original paper for the tool and may be a different name from the validation study (the name that appears when you hover over or click on the bubble) as different authors may have used the original tool to validate it against other methods or different populations or nutrients.

In this example, selecting the filters tool type Dietary Recall and comparator method Doubly Labelled Water will retrieve one DAT meeting these criteria, the Multiple Pass 24h Recall (Johnson), which has been validated in children on three occasions: by Johnson (1996), by Reilly (2001), then by Montgomery (2005). This information can be seen if you hover over the bubbles or click on the bubbles to view the summary information. Looking at the plot below you will notice that in two validation studies the DAT overestimated energy intake (kcal) when compared to doubly labelled water, and these studies thus appear on the right hand side of the "0" reference point. The validation by Montgomery has a smaller mean difference (59.8 kcal), narrower limits of agreement (-568.8 to 688.3 kcal) and a larger sample size (n=63) in comparison to the study by Reilly (mean difference 157.7 kcal, limits of agreement -563.6 to 879.1 kcal, n=41). In the earlier validation by Johnson the tool underestimated energy intake by 54 kcal and had wider limits of agreement and a smaller sample size. In this example, only information on energy (kcal) will be displayed on the x axis, because doubly labelled water can only measure total energy expenditure.

** Screenshots of summary information of validation studies for Johnson et al.

Doubly labelled water has been recognised as the "gold standard" to measure total energy expenditure (TEE). In conditions of energy balance (that is, no weight gain and no weight loss), TEE should equal energy intake (EI) [TEE - EI = 0]. The multiple pass 24-hr recall was used to measure energy intake, whereas doubly labelled water was used to measure total energy expenditure. As observed, there were differences in the results provided by these methods in the studies.
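The comparison between the Reilly and Montgomery validations can be checked with a few lines of Python using the figures quoted in this example. The dictionary layout and the `loa_width` helper are hypothetical; only the numbers come from the text.

```python
# Figures quoted in the example (energy, kcal).
studies = {
    "Reilly (2001)": {"mean_diff": 157.7, "lower": -563.6, "upper": 879.1, "n": 41},
    "Montgomery (2005)": {"mean_diff": 59.8, "lower": -568.8, "upper": 688.3, "n": 63},
}


def loa_width(study):
    """Distance between the upper and lower limits of agreement."""
    return study["upper"] - study["lower"]


for name, study in studies.items():
    print(f"{name}: mean difference {study['mean_diff']:.1f} kcal, "
          f"width of limits {loa_width(study):.1f} kcal, n={study['n']}")

# Study whose limits of agreement are narrowest, and the one closest to zero bias.
narrowest = min(studies, key=lambda name: loa_width(studies[name]))
least_biased = min(studies, key=lambda name: abs(studies[name]["mean_diff"]))
```

With these figures the Montgomery validation has both the narrower limits of agreement (a width of about 1257 kcal against about 1443 kcal for Reilly) and the mean difference closer to zero, which matches the reading of the plot given above.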

Some results have been calculated using statistical techniques based on the published data. To find more information read the validation article.

To find a validation paper, click on . A new screen will open containing Tool information. Go to the tab "validation papers" and click on detailed information:

This will open another screen with Validation information. Results will be displayed on the Validation Results tab. There is also the web link and/or a download paper option (if available) to access the original publication.

When looking at the validation details of the studies obtained from the validation articles, you will notice that:

  1. The validation by Reilly (2001) was done in children aged 3 to 4 years (pre-schoolers)
  2. The validation by Johnson (1996) was done in children aged 4 to 7 years
  3. The study by Montgomery (2005) was done in children aged 5 to 7 years (school-aged) and concluded that the estimates of energy intake as measured by the multiple pass 24-hr recall and compared to doubly labelled water improved as the children were older.
This information together with the validation results can help you decide whether the DAT is suitable for the population you wish to measure.

When comparing different DATs, in order to make the most informed decision on which tool to choose for your study, refer to the Best Practice Guidelines section and to the case studies in the help section. Also remember to consider the differences in the statistical methods used, as discussed in the next section of help.

Statistical tests used in validation studies

This section provides information about the statistical tests used in validation studies of dietary assessment tools that have been reported in the Nutritools website (example shown in figure 1). Validation studies compare nutrient intake estimated by a test dietary assessment tool (DAT), such as an estimated food diary, Food Frequency Questionnaire (FFQ) or 24 hour recall with that from a reference/ validation method such as doubly labelled water or weighed food diary.

Figure 1: Example of validation study results shown in the Nutritools website

Absolute validity is the measure of agreement between two methods measuring the same variable (e.g. nutrient) with the same units. Relative validity describes the relationship between the measurements of the variable by two different methods; this can be done by ranking individuals or groups in the same order irrespective of the units of the variable. Measurements from a dietary assessment tool may correlate strongly with those from the reference method and yet show little agreement in absolute terms. Therefore, it is advisable not to rely on measures of relative validity, such as Pearson or Spearman correlation coefficients, even though these correlations have been most commonly used as measures of validity (Cade et al. 2002).
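The difference between relative and absolute validity can be illustrated with a deliberately artificial Python example. Here a hypothetical tool reports exactly 25% more energy than the reference method for every individual: the ranking of individuals is preserved perfectly, so the Pearson correlation is 1, yet the two methods disagree by several hundred kcal on average. All values and names are made up for illustration.

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


# Made-up reference intakes (kcal) and a tool that over-reports by 25%.
reference = [1800.0, 2000.0, 2200.0, 2400.0, 2600.0]
dat = [value * 1.25 for value in reference]

r = pearson_r(dat, reference)
mean_difference = mean([d - ref for d, ref in zip(dat, reference)])

print(f"correlation: {r:.2f}, mean difference: {mean_difference:.0f} kcal")
```

A correlation coefficient alone would suggest an excellent tool here, while the mean difference reveals a large systematic overestimate; this is the reason measures of absolute agreement are preferred.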

Ideally the Bland-Altman limits of agreement or intra class correlations (ICC) should be used to measure the extent of absolute agreement between a dietary assessment tool and the reference method (Cade et al. 2002). The Bland-Altman test also measures the extent of systematic bias. If continuous data are not available for these tests, then kappa statistics can be used as a measure of agreement using categorical data. Table 1 lists the different types of statistical tests commonly used in validation studies and whether they measure agreement or correlation at individual or group level. The right-hand column shows indicators that have been reported as ‘good’ outcomes of validation (and are referenced below the table); however, these are not universally agreed cut off points.

Table 1: Statistical tests used in validation studies
Statistical test | Validity measured | Reported as “Good” outcomes*
Bland-Altman analysis | Presence, direction and extent of bias at group level, and limits of agreement (absolute validity) | p > 0.05 [1] and within expected limits of agreement
Mean difference: paired t-test/Wilcoxon signed rank tests | Agreement at group level | p > 0.05
Weighted kappa statistic (coefficient) | Agreement (beyond chance) at individual level | ≥0.81 to ≤1.00 very good; ≥0.61 to ≤0.80 substantial; ≥0.41 to ≤0.60 moderate [2, 3]
Intra-class correlation coefficient | Agreement at individual level | >0.75 [4]
Percentage agreement (cross-classification of categories: tertiles/quartiles/quintiles) | Agreement (including chance) at individual level | ≥50% in same tertile/quartile/quintile [2]; ≤10% in opposite category [2]
Correlation coefficients (Spearman, Pearson) | Strength and direction of association at individual level (relative validity) | ≥0.5 [2, 5]

Based on table 1 Lombard et al. Nutrition Journal (2015) 14:40

* not universally agreed cut off points

1 Bland, J. M., & Altman, D. (1986). Statistical methods for assessing agreement between two methods of clinical measurement. The Lancet, 327(8476), 307-310.

2 Masson L, McNeill G, Tomany J, Simpson J, Peace H, Wei L, et al. Statistical approaches for assessing the relative validity of a food-frequency questionnaire: use of correlation coefficients and the kappa statistic. Public Health Nutr. 2003;6(03):313–21.

3 Landis, J. R., & Koch, G. G. (1977). The measurement of observer agreement for categorical data. Biometrics, 33(1), 159-174.

4 Koo, T. K., & Li, M. Y. (2016). A guideline of selecting and reporting intraclass correlation coefficients for reliability research. Journal of chiropractic medicine, 15(2), 155-163.

5 Brunner, E, Juneja, M, and Marmot, M. (2001). "Dietary assessment in Whitehall II: comparison of 7 d diet diary and food-frequency questionnaire and validity against biomarkers." British Journal of Nutrition 86(3), 405-414.

It has been suggested that measurement of relative validity may be appropriate when a tool such as an FFQ is to be later used in an epidemiological study to measure the association between a health outcome and nutrient intake, where the aim of the analyses is to rank individuals (i.e. between high and low intake) (Masson et al. 2003). However, measurement of absolute validity is still advisable (Cade et al. 2002).

The similarity between the two assessment methods should also be considered when interpreting the results. For instance, a short-term but more accurate measurement tool such as the weighed food diary will have greater within-person measurement error than a long-term FFQ. The measurement errors of the test tool and the reference tool should be independent in a validation study (Cade et al. 2002).

Bland-Altman Limits of Agreement

The Bland-Altman method can determine if there is any systematic difference between the test tool and reference methods (bias), and to what extent the two agree (limits of agreement). It also provides a method of assessing whether the difference between the methods is the same across the range of intakes, and whether the extent of agreement differs for low intakes compared with high intakes. These may be assessed by plotting the difference between the methods (test measure - reference measure) against the average of the two ((test measure + reference measure)/2). Provided the differences are approximately normally distributed, about 95% of the differences will lie between the lower and upper limits of agreement, which are calculated as the mean difference plus or minus 1.96 standard deviations of the differences.

The lower limit of agreement (LLA) = mean difference - 1.96 standard deviations (SD)

The upper limit of agreement (ULA) = mean difference + 1.96 standard deviations (SD)
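The two formulas above can be sketched in a few lines of Python. This is a minimal illustration rather than a full analysis: the intake values and the function name bland_altman are hypothetical, and the standard deviation is taken to be the sample SD of the paired differences.

```python
from statistics import mean, stdev

def bland_altman(test, reference):
    """Return (mean difference, lower limit, upper limit) for paired measurements."""
    differences = [t - r for t, r in zip(test, reference)]  # test - reference
    bias = mean(differences)          # systematic bias
    sd = stdev(differences)           # sample SD of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical energy intakes (kcal/day); illustrative values only.
test = [1900, 2100, 1750, 2400, 2050, 1600]
reference = [2000, 2050, 1800, 2500, 2150, 1700]

bias, lla, ula = bland_altman(test, reference)

# Coordinates for the plot: x = average of the two methods, y = difference.
plot_points = [((t + r) / 2, t - r) for t, r in zip(test, reference)]

print(f"bias={bias:.1f}, LLA={lla:.1f}, ULA={ula:.1f}")
```

The plot_points list holds the pairs that would be drawn on a Bland-Altman plot, with horizontal lines added at the bias and at the two limits.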

The Bland-Altman plot does not determine whether the agreement is sufficient to demonstrate validity. The best way to use the plot is to define a priori the limits of maximum acceptable differences i.e. the limits of agreement expected.

Figure 2 shows an example of a Bland-Altman plot from the validation study of the myfood24 self-administered online dietary assessment tool completed by adolescents (Albar et al. 2016). On average this DAT underestimated energy intake compared with an interviewer-administered 24 hour recall by only 55 kcal; this is the measure of systematic bias. The upper and lower limits of agreement were 687 kcal and -797 kcal, ranging from an underestimation of 39% to an overestimation of 34% in relation to the average intake for the two methods (2029 kcal, not shown in figure 2). This was considered acceptable. Looking across the range of intakes, the plot shows a slight tendency for myfood24 to overestimate rather than underestimate at the highest recorded energy intakes (>3250 kcal), whereas over- and underestimation appear roughly equal below intakes of 3000 kcal.
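The percentages quoted above follow from dividing each limit of agreement by the average intake for the two methods. A short check, using only the figures given in the text (the function name percent_of_mean is hypothetical):

```python
# Values reported in the text for the myfood24 example (kcal).
mean_intake = 2029
lower_limit = -797
upper_limit = 687

def percent_of_mean(limit, mean_value):
    """Express a limit of agreement as a percentage of the mean intake."""
    return 100 * limit / mean_value

print(round(percent_of_mean(lower_limit, mean_intake)))  # -39
print(round(percent_of_mean(upper_limit, mean_intake)))  # 34
```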

Figure 2: Example of a Bland-Altman plot (Albar et al. 2016)
Bland and Altman plot for energy intake for the two methods (myfood24 and interviewer-administered 24-h multiple-pass dietary recall (MPR)), including both days of measurements (n 75, using both days).

The absolute values for the upper and lower limits of agreement (ULA & LLA) from this myfood24 validation study can be compared with those of other validation studies in the summary plots on the Nutritools website. For instance, it can be compared with other dietary recall validation studies for adolescents and, as illustrated on the right-hand side of figure 3, the limits of agreement (shown by the arrow heads) are narrowest for myfood24. Additionally, the mean difference (shown by the circles) is also closer to zero for myfood24. These may indicate that the myfood24 tool is preferable.

Figure 3: Example of a summary plot in the Nutritools website comparing Bland-Altman results from validation studies

Mean difference: Paired t-test/Wilcoxon signed rank tests

If it is important for the dietary assessment tool to measure differences between absolute nutrient/food intakes, then the validation study should assess the ability of the tool to reflect the group mean (Cade et al. 2002; Nelson, in Margetts and Nelson (eds), 1997). The non-parametric Wilcoxon signed rank test may be more appropriate to use in the validation study than the paired t-test since dietary intakes are often not normally distributed. A p-value of over 0.05 indicates that no statistically significant difference was detected between the two measurements at group level. This does not by itself show that the methods are equivalent, so the researcher should also judge whether the estimated difference is large enough to matter and not rely on p-values alone.
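As a rough sketch of the non-parametric test, the function below computes the Wilcoxon signed rank statistic with a two-sided p-value from a normal approximation. Several simplifications are assumed here and are not taken from the sources cited: zero differences are dropped, tied absolute differences share their average rank, and no tie or continuity correction is applied, so the p-value is only approximate for small samples. The data and the function name are hypothetical; a statistical package would normally be used in practice.

```python
from math import sqrt
from statistics import NormalDist

def wilcoxon_signed_rank(test, reference):
    """Return (W+, approximate two-sided p-value) for paired measurements."""
    diffs = [t - r for t, r in zip(test, reference) if t != r]  # drop zeros
    n = len(diffs)
    if n == 0:
        return 0.0, 1.0
    # Rank the absolute differences, giving ties their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # ranks are 1-based
        i = j + 1
    w_plus = sum(rank for rank, d in zip(ranks, diffs) if d > 0)
    expected = n * (n + 1) / 4
    sd = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - expected) / sd
    return w_plus, 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical energy intakes (kcal/day); illustrative values only.
test = [1900, 2100, 1750, 2400, 2050, 1600, 2250, 1980, 2120, 1840]
reference = [2000, 2050, 1800, 2500, 2150, 1700, 2200, 2100, 2150, 1900]

w_plus, p_value = wilcoxon_signed_rank(test, reference)
print(w_plus, round(p_value, 3))
```

Whatever the p-value, the size of the mean difference should be reported and interpreted alongside it.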

Another way of expressing the difference between the mean intakes measured by the test tool and the reference tool is by percentage difference. This is the value measured using the reference/validation method subtracted from the value estimated from the tool tested, divided by the reference measure and multiplied by 100 for each participant. The mean percentage difference is then calculated for the total sample. However, it may be preferable to focus on the absolute mean difference itself rather than the percentage difference.
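The calculation described above can be sketched as follows. The intake values and the function name mean_percentage_difference are hypothetical and serve only to show the order of the operations: a percentage difference per participant first, then the mean across the sample.

```python
from statistics import mean

def mean_percentage_difference(test, reference):
    """Mean of the per-participant (test - reference) / reference * 100."""
    return mean(100 * (t - r) / r for t, r in zip(test, reference))

# Hypothetical energy intakes (kcal/day); illustrative values only.
test = [1800, 2100, 2500]
reference = [2000, 2000, 2500]

print(round(mean_percentage_difference(test, reference), 2))
```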

Intra-class correlation coefficient

The intra-class correlation coefficient (ICC), usually used to measure the reliability of a method (repeated measures of the same method), can also be used to measure the agreement between two assessment methods at individual level. It also incorporates a measure of correlation. It is the ratio of the variation between individuals to the total variation, that is, the fraction of the variability due to causes other than variability within an individual. ICC estimates closer to 1 represent greater validity. It can be used to show agreement between more than two methods. The ICC can be calculated in a number of different ways using mean squares obtained through analysis of variance; these can be found in table 3 of the article by Koo and Li (2016). https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4913118/
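As a minimal sketch of how an ICC can be obtained from analysis-of-variance mean squares, the example below computes two single-measure forms from a two-way layout (participants by methods): a consistency form and an absolute-agreement form. The formulas are the commonly quoted two-way mean-squares expressions and are assumed here to correspond to those tabulated by Koo and Li (2016). The data and the function name icc_two_way are hypothetical, and the form used in any particular validation study should be checked against that study's methods.

```python
from statistics import mean

def icc_two_way(data):
    """Single-measure ICCs from a two-way ANOVA without replication.

    data: one row per participant, one column per method.
    Returns (consistency ICC, absolute-agreement ICC).
    """
    n, k = len(data), len(data[0])
    grand = mean(x for row in data for x in row)
    row_means = [mean(row) for row in data]
    col_means = [mean(row[j] for row in data) for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)                 # between-participant mean square
    ms_cols = ss_cols / (k - 1)                 # between-method mean square
    ms_error = ss_error / ((n - 1) * (k - 1))   # residual mean square
    consistency = (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)
    agreement = (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
    return consistency, agreement

# Hypothetical intakes: (reference, test); the test tool reads 200 higher for everyone.
data = [(1500, 1700), (1800, 2000), (2000, 2200), (2300, 2500), (2600, 2800)]

consistency, agreement = icc_two_way(data)
print(round(consistency, 2), round(agreement, 2))
```

In this invented data the constant offset leaves the consistency form at 1, while the absolute-agreement form falls to about 0.90, which is why the choice of ICC form matters when the aim is to assess absolute agreement.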

A limited number of studies have used ICC to validate dietary assessment methods; one example is the validation of the myfood24 tool (Albar et al. 2016). ICC in this study was calculated using a two-way mixed-effects ANOVA model. The ICC for energy and the majority of nutrients was over 0.75 and considered good (shown in Figure 4 below, which reproduces table 3 of Albar et al. (2016)).

Figure 4: Example of ICCs used to measure validity of a dietary assessment tool (Albar et al. 2016).
Agreement between myfood24 and the interviewer-administered 24-h multiple-pass dietary recall (interview (MPR)) with multiple observations per individual (Mean differences, intraclass correlation coefficients (ICC) and 95 % confidence intervals)

Percentage agreement

Individuals can be categorised, usually into tertiles, quartiles, or quintiles depending on the sample size, for both the tool and reference methods according to their dietary intake. Cross-classification of the categories from the two methods allows the calculation of the percentage of individuals correctly classified in the same category (exact agreement) and the percentage misclassified in the opposite category (Altman and Bland, 1994; Lombard et al. 2015; Masson et al. 2003). It has been suggested that at least 50% of individuals should be correctly classified and no more than 10% of subjects grossly misclassified (Masson et al. 2003). Sometimes the total percentage of individuals classified into the same or adjacent quantile is quoted in validation papers.

Table 2: Cross-classification example to calculate percentage agreement
Rows: classification of intake measured by FFQ (Test tool). Columns: classification of intake measured by Weighed Food Diary (Reference tool).
FFQ | Quintile 1 | Quintile 2 | Quintile 3 | Quintile 4 | Quintile 5 | Total
Quintile 1 | 15% | 6% | 1% | 2% | 6% | 30%
Quintile 2 | 2% | 10% | 3% | 1% | 0% | 16%
Quintile 3 | 2% | 5% | 7% | 1% | 1% | 16%
Quintile 4 | 2% | 1% | 3% | 4% | 2% | 12%
Quintile 5 | 7% | 3% | 4% | 3% | 9% | 26%
Total | 28% | 25% | 18% | 11% | 18% | 100%

In the example above the percentage of exact agreement is 45% (the diagonal cells, shaded grey: 15 + 10 + 7 + 4 + 9), the percentage of exact agreement plus adjacent quintiles is 70% (shaded grey and hatched: 45 + 25) and the percentage grossly misclassified into the opposite extreme quintile is 13% (dark red: 6 + 7).
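The three percentages can be reproduced directly from the cells of Table 2. The sketch below is illustrative only; the function name cross_classification_summary is hypothetical.

```python
# Table 2 as percentages: rows = FFQ quintile, columns = weighed food diary quintile.
table = [
    [15, 6, 1, 2, 6],
    [2, 10, 3, 1, 0],
    [2, 5, 7, 1, 1],
    [2, 1, 3, 4, 2],
    [7, 3, 4, 3, 9],
]

def cross_classification_summary(table):
    """Return (% exact, % exact or adjacent, % in the opposite extreme category)."""
    k = len(table)
    exact = sum(table[i][i] for i in range(k))
    adjacent = sum(
        table[i][j] for i in range(k) for j in range(k) if abs(i - j) == 1
    )
    opposite = table[0][k - 1] + table[k - 1][0]
    return exact, exact + adjacent, opposite

print(cross_classification_summary(table))  # (45, 70, 13)
```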

Cohen’s kappa/ Weighted Cohen’s kappa

The percentage agreement, however, will include agreement that has occurred by chance. A superior method to this is Cohen’s kappa statistic (κ), which is a summary measure of cross-classification that allows for the agreement expected by chance. The weighted kappa (κw) has the added advantage over the unweighted Cohen’s kappa statistic in that it allows for the degree of misclassification. For weighted kappa, classification occurring in adjacent categories can be counted as partial agreement and weighted accordingly. Both statistics can be interpreted as suggested by Landis and Koch (1977) on p165 of their article.
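A minimal sketch of both statistics, applied to the cross-classification in Table 2, is given below. The choice of linear weights (full credit for the same quintile, decreasing by equal steps to zero for the opposite extreme) is an assumption made for illustration; other weighting schemes exist and a validation paper may have used a different one. The function names are hypothetical.

```python
def weighted_kappa(table, weights=None):
    """Cohen's kappa for a square cross-classification table.

    With weights=None only exact agreement counts (unweighted kappa).
    """
    k = len(table)
    total = sum(sum(row) for row in table)
    p = [[cell / total for cell in row] for row in table]
    row_tot = [sum(row) for row in p]
    col_tot = [sum(p[i][j] for i in range(k)) for j in range(k)]
    if weights is None:
        weights = [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
    cells = [(i, j) for i in range(k) for j in range(k)]
    observed = sum(weights[i][j] * p[i][j] for i, j in cells)
    expected = sum(weights[i][j] * row_tot[i] * col_tot[j] for i, j in cells)
    return (observed - expected) / (1 - expected)

def linear_weights(k):
    """Assumed weighting: 1 on the diagonal, falling linearly to 0 at the extremes."""
    return [[1 - abs(i - j) / (k - 1) for j in range(k)] for i in range(k)]

# Table 2 as percentages: rows = FFQ quintile, columns = weighed food diary quintile.
table = [
    [15, 6, 1, 2, 6],
    [2, 10, 3, 1, 0],
    [2, 5, 7, 1, 1],
    [2, 1, 3, 4, 2],
    [7, 3, 4, 3, 9],
]

kappa = weighted_kappa(table)
kappa_w = weighted_kappa(table, linear_weights(len(table)))
print(round(kappa, 2), round(kappa_w, 2))
```

For this table both values come out at roughly 0.3, below the 0.41 threshold for moderate agreement listed in Table 1, even though 45% of individuals fall in the same quintile.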

It is also informative to report the percentages of subjects categorised into the same group and the percentage categorised into the extreme opposite group of intake as explained in the previous section, and as shown in the example at the end of this section (figure 5).

Correlation Coefficient

Correlation coefficients can measure the strength of the association at an individual level between nutrient intakes measured as continuous variables by the test dietary assessment tool and the reference method; however, unlike the Bland-Altman technique they cannot assess the extent of absolute agreement between the two methods (Bland and Altman, 1986). Therefore, if it is important to use a tool to obtain valid absolute levels of dietary intake, e.g. to assess nutrient adequacy, then it is advisable not to rely on correlation coefficients from validation studies as a measure of validity (Cade et al., 2002).

  • Pearson correlation coefficient describes the strength and direction of a linear relationship between two continuous measures.
  • Spearman correlation coefficient compares the rank order of the individuals between two dietary assessment methods. It does not assume a normal distribution of intake and is not as sensitive to extreme values (outliers) as the Pearson correlation coefficient. However, it ignores the actual magnitude of the estimated intake: it determines whether an individual's intake is more or less than the next individual's, but does not take into account by how much.
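The contrast between the two coefficients can be sketched with a small example. The intakes below are invented, with one extreme value in the test tool's results; the function names are hypothetical, and Spearman's coefficient is computed here as the Pearson correlation of the ranks, with tied values sharing their average rank.

```python
from math import sqrt
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def average_ranks(values):
    """Rank values from 1 upwards; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman coefficient: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical nutrient intakes; the last test value is an extreme outlier.
reference = [10, 20, 30, 40, 50]
test = [12, 18, 33, 41, 200]

print(round(pearson(test, reference), 2), round(spearman(test, reference), 2))
```

Because the two methods rank the five participants identically, the Spearman coefficient is 1, whereas the single extreme value pulls the Pearson coefficient down to about 0.8.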

A correlation coefficient of 1 shows perfect positive correlation and -1 perfect negative correlation, whereas zero reflects no correlation. In the review by Cade et al. (2002) correlations between FFQs and a reference dietary measure were highest when subjects were able to describe their own portion size (0.5 to 0.6) compared with no portion size specified (0.2 to 0.5, where average portion weights were used to compute intakes) or portion size specified on the questionnaire (0.4 to 0.5). Nutrients such as vitamin A (β-carotene equivalents), which tend to have high within-person variation from day to day as well as by season, are likely to have lower correlation coefficients than nutrients such as vitamin C, which have less daily variation (Lombard et al. 2015; Masson et al. 2003), as observed in the example below (figure 5). Some validation studies adjust for energy, which may increase or decrease the strength of the correlation. The effect of energy adjustment depends on both the correlation between the intake of a nutrient and energy intake and the correlation between the errors of measurement for these two quantities (Bingham and Day, 1997).

Example: the study by Masson et al. (2003) assessed the validity of an FFQ against a weighed record using several statistical methods. Using the interpretation of a ‘good’ outcome (reported in previous research and collated in table 1 of this help document), results for male participants indicate that Spearman correlation coefficients were good (above 0.5) for 10 of the nutrients analysed (SFA, cholesterol, NSP, alcohol, riboflavin, folate, iron, magnesium, potassium and zinc). Many of these nutrients also had ≥50% of individuals correctly classified in tertiles and ≤10% classified in the opposite tertile (SFA, cholesterol, NSP, alcohol, riboflavin, magnesium, potassium). However, weighted Cohen’s kappa values ranged from -0.14 to 0.53, so none showed substantial agreement (though NSP, alcohol, riboflavin, iron, magnesium and potassium showed moderate agreement). Bland-Altman analysis using continuous intake data was not undertaken for this study, but the weighted Cohen’s kappa statistic, being a suitable validation test, indicates moderate agreement between the FFQ and the weighed record for some nutrients and below-moderate agreement for others.
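The interpretation applied in this example can be written as a simple lookup against the kappa cut-offs collated in table 1. This is only a sketch: the function name is hypothetical, the label for values under 0.41 is a paraphrase, and the cut-offs themselves are not universally agreed.

```python
def interpret_weighted_kappa(kappa):
    """Label a weighted kappa using the cut-offs listed in table 1."""
    if kappa >= 0.81:
        return "very good"
    if kappa >= 0.61:
        return "substantial"
    if kappa >= 0.41:
        return "moderate"
    return "below moderate"

# Lowest and highest weighted kappa values quoted in the text for men.
for value in (-0.14, 0.53):
    print(value, interpret_weighted_kappa(value))
```

Applied to the range quoted above, the lowest value falls below moderate agreement and the highest reaches moderate but not substantial agreement.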

Figure 5: Example showing weighted Cohen’s kappa alongside Pearson and Spearman correlation coefficients in Masson et al. (2003)
Pearson r and Spearman r_s correlation coefficients, percentages of subjects classified into the same and opposite thirds of intake, and weighted kappa (K_w) in 41 men

Additional information can be found at Diet, Anthropometry and Physical Activity (DAPA) Measurement Toolkit.

References

Albar, S, Alwan, N, Evans, C, Greenwood, D, & Cade, J. (2016). Agreement between an online dietary assessment tool (myfood24) and an interviewer-administered 24-h dietary recall in British adolescents aged 11–18 years. British Journal of Nutrition, 115(9), 1678-1686. https://www.cambridge.org/core/journals/british-journal-of-nutrition/article/agreement-between-an-online-dietary-assessment-tool-myfood24-and-an-intervieweradministered-24h-dietary-recall-in-british-adolescents-aged-1118-years/29F022E91D0337E4CB21376B1EF6EE40

Altman DG. (1991) Practical statistics for medical research. London: Chapman & Hall, pp403-409

Altman DG, Bland JM (1994). Quartiles, quintiles, centiles, and other quantiles. BMJ (Clin Res Ed), 309(6960), 996.

Bingham, SA, & Day, NE. (1997). Using biochemical markers to assess the validity of prospective dietary assessment methods and the effect of energy adjustment. The American journal of clinical nutrition, 65(4), 1130S-1137S.

Bland, J. M., & Altman, D. (1986). Statistical methods for assessing agreement between two methods of clinical measurement. The Lancet, 327(8476), 307-310. http://www.sciencedirect.com/science/article/pii/S0140673686908378

Brunner, E, Juneja, M, and Marmot, M. (2001). Dietary assessment in Whitehall II: comparison of 7 d diet diary and food-frequency questionnaire and validity against biomarkers. British Journal of Nutrition, 86(3), 405-414. https://www.cambridge.org/core/services/aop-cambridge-core/content/view/CBDEE6FDEF3D5A1FBC77D1E79444EDE6/S0007114501002069a.pdf/dietary_assessment_in_whitehall_ii_comparison_of_7_d_diet_diary_and_foodfrequency_questionnaire_and_validity_against_biomarkers.pdf

Cade J, Thompson R, Burley V, Warm D. (2002) Development, validation and utilisation of food-frequency questionnaires - a review. Public Health Nutrition, 5(4), 567–87. https://www.cambridge.org/core/services/aop-cambridge-core/content/view/463EFE9970053E8BD922CC88F52E6244/S1368980002000794a.pdf/development_validation_and_utilisation_of_foodfrequency_questionnaires_a_review.pdf

Koo, TK, & Li, MY. (2016). A guideline of selecting and reporting intraclass correlation coefficients for reliability research. Journal of chiropractic medicine, 15(2), 155-163.

Landis, JR, & Koch, GG. (1977). The measurement of observer agreement for categorical data. Biometrics, 33(1), 159-174. https://pdfs.semanticscholar.org/7e73/43a5608fff1c68c5259db0c77b9193f1546d.pdf

Lombard, MJ, Steyn, NP, Charlton, KE, & Senekal, M. (2015). Application and interpretation of multiple statistical tests to evaluate validity of dietary intake assessment methods. Nutrition journal, 14(1), 40. https://nutritionj.biomedcentral.com/articles/10.1186/s12937-015-0027-y

Masson, LF, McNeill, G, Tomany, JO, Simpson, JA, Peace, HS, Wei, L, ... & Bolton-Smith, C. (2003). Statistical approaches for assessing the relative validity of a food-frequency questionnaire: use of correlation coefficients and the kappa statistic. Public Health Nutrition, 6(3), 313-321. https://www.cambridge.org/core/services/aop-cambridge-core/content/view/8832DF5F69715C4F8DDC2E0F32BC4EF1/S1368980003000417a.pdf/statistical_approaches_for_assessing_the_relative_validity_of_a_foodfrequency_questionnaire_use_of_correlation_coefficients_and_the_kappa_statistic.pdf

Nelson M. (1997). The validation of dietary assessment. In: Margetts BM, Nelson M, eds. Design Concepts in Nutritional Epidemiology. Oxford: Oxford University Press, pp241–72.

Willett WC. (1994). Future directions in the development of food frequency questionnaires. American Journal of Clinical Nutrition, 59(Suppl.), 171S–4S.

Frequently Asked Questions (FAQ)