Reliability and Validation of the Persian Version of JUICE Study Questionnaire

1 Digestive Disease Research Institute, Tehran University of Medical Sciences, Tehran, Iran
2 Psychology Division of York University, Toronto, Canada
3 Islamic Azad University, Central Tehran Branch, Tehran, Iran
4 Department of Medicine, Division of Gastroenterology, McMaster University, Hamilton, Canada
Division of Gastroenterology, Tohoku University Graduate School of Medicine, Sendai, Japan
5 Medicine Digestive Disease Research Institute, Tehran University of Medical Sciences, Tehran, Iran
6 Department of Medicine, Division of Gastroenterology, McMaster University, Hamilton, Canada
7 Digestive Disease Research Institute, Tehran University of Medical Sciences, Tehran, Iran


Introduction
Our understanding of the microbes living in the human body has changed over the past decade, from a picture of only a few species to one of a large and diverse community that develops along with a person's chronological age. 1 The largest number of microbes exists in the lower gastrointestinal (GI) tract (large intestine); however, a variety of microbiota is found throughout the human body, including the upper GI tract. The microbiota in the GI tract has been associated with various aspects of health and disease, covering a broad spectrum from local GI conditions to psychiatric and behavioral ones. Therefore, a better understanding of the GI microbiota, and careful manipulation of it, may enable us to find new treatments for many conditions. 2 Most of our knowledge about the GI microbiota comes from studies of stool, which cannot tell us exactly where in the GI tract the identified microbes live or originate. We know little about the composition and correlates of the upper GI tract microbiota.
Therefore, we decided to perform a prospective multicenter cohort study to compare upper GI symptoms and endoscopy findings in Iran (Tehran University of Medical Sciences, Masoud Clinic) with those in Canada (McMaster University) and Japan (Tohoku University), and to correlate these findings with the upper GI microbiota (the JUICE Study: Japanese Upper GI symptoms compared with Iranian and Canadian patients presenting for Endoscopy). We will recruit adult patients who undergo esophagogastroduodenoscopy (EGD) for any reason during the study period and consent to be enrolled. For each eligible patient, we will record demographics and the main reason for endoscopy. Patients will be asked to complete the following questionnaires: upper GI symptoms measured by the Short-Form Leeds Dyspepsia Questionnaire (SFLDQ), quality of life measured by the EQ-5D, and anxiety and depression measured by the Hospital Anxiety and Depression Scale (HADS).
Reliability and validity of a measuring instrument are the key indicators of the quality of that instrument. The current study reports the reliability and validity of the questionnaires which will be used in the JUICE study (listed above) in Iran.
Reliability is defined as the extent to which results are consistent over time; if the results of a study can be reproduced using a similar methodology, then the research instrument is considered to be reliable. An instrument can be reliable without being valid. Validity or trustworthiness is defined as the degree to which the results of a questionnaire agree with the real world. 3

Materials and Methods
The original English questionnaire was translated into Farsi by one of the researchers (SaNM). It was then back-translated into English by an expert Iranian English teacher. The back-translated questionnaire was reviewed by one of the researchers (PM, a native English speaker), and corrections and refinements were made to produce the final Farsi translation. This final version was used for the validity and reliability study performed at the Digestive Disease Research Institute (DDRI) of Tehran University of Medical Sciences. The questionnaire contains three sections. The first section has 17 questions. The first six are demographic (name, sex, birth date, education level, birthplace, and race). Questions 7 and 8 ask about any current drug consumption, question 9 asks whether the participant has previously had abdominal surgery, and question 10 asks about a history of Helicobacter pylori infection. Questions 11 to 16 ask about use of PPIs (proton pump inhibitors), aspirin products, NSAIDs (non-steroidal anti-inflammatory drugs), and any other painkiller or antibiotic during the previous month, respectively. Question 17 asks about present and past smoking history.
The second section is the EQ-5D questionnaire, which asks about five items related to quality of life on the day of completion: "movement", "self care", "daily activities", "pain or discomfort", and "stress and depression". It also includes a chart on which the interviewee indicates his or her health status on the day of the study.
The third and last section is the HADS questionnaire, which consists of 14 questions about the respondent's psychological health during the previous week. A list of all brand names of PPIs and NSAIDs available in Iran, extracted from the official Iranian pharmaceutical statistics (11th version) (https://www.fda.gov.ir/fa/), is attached at the end of the questionnaire.
Reliability of the questionnaire was assessed using the standard test-retest method. 4 For the first (test) phase, 22 volunteers from DDRI were asked to complete a coded questionnaire anonymously. During this process, any ambiguity was clarified by the distributor, who had been oriented to the questionnaire. Most participants returned the completed form on the same day, but a few needed a reminder. After all the forms had been gathered, the data codes were entered into an Excel worksheet. Two weeks later, the same questionnaire was given to the same people. The two-week interval between the test and retest phases was chosen to reduce memory recall bias, and the participants were not told about the retest phase in advance. The data codes from the retest forms were entered into the same Excel sheet, with each question's two responses placed side by side and labeled as first and second.
To assess validity, 22 participants completed the questionnaire themselves; one of the researchers (MR) then interviewed them and completed the questionnaire on their behalf according to their answers. The paired questionnaires were then transferred to the Excel sheet.
Cohen's kappa coefficient and its 95% confidence interval (CI) were used to estimate inter-rater reliability for categorical variables. The kappa coefficient measures agreement beyond chance. Kappa can range from -1 to 1, where 0 indicates agreement no better than chance and 1 indicates perfect agreement; a value above 0.6 is considered acceptable (Table 1). 5,6 Percentage agreement was used for variables without a normal distribution, where kappa statistics cannot be used. For the purpose of this study, a percentage agreement of 0.6 or above was considered adequate and a percentage agreement of 0.5-0.6 was considered acceptable.
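As an illustration of the two agreement measures described above, the following minimal Python sketch computes percentage agreement and Cohen's kappa for one categorical item answered in two phases. It is not the SPSS procedure used in this study: the function names and the yes/no responses are hypothetical, and the confidence interval relies on a simple large-sample approximation of the standard error, which may differ from the value SPSS reports.

```python
import math
from collections import Counter


def percent_agreement(first, second):
    """Proportion of paired answers that are identical (0 to 1)."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty answer lists of equal length")
    return sum(a == b for a, b in zip(first, second)) / len(first)


def cohen_kappa(first, second, z=1.96):
    """Cohen's kappa with an approximate 95% CI (assumed simple SE formula)."""
    n = len(first)
    p_o = percent_agreement(first, second)            # observed agreement
    c1, c2 = Counter(first), Counter(second)
    # Agreement expected by chance, from the marginal category frequencies.
    p_e = sum((c1[c] / n) * (c2[c] / n) for c in set(c1) | set(c2))
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    kappa = (p_o - p_e) / (1 - p_e)
    # Large-sample approximation; SPSS uses a more detailed asymptotic SE.
    se = math.sqrt(p_o * (1 - p_o) / (n * (1 - p_e) ** 2))
    lower = max(-1.0, kappa - z * se)                 # clip to the valid range
    upper = min(1.0, kappa + z * se)
    return kappa, (lower, upper)


# Hypothetical test and retest answers to one yes/no item (not study data).
test_phase = ["yes", "no", "yes", "yes", "no", "no", "yes", "no", "yes", "yes"]
retest_phase = ["yes", "no", "yes", "no", "no", "no", "yes", "no", "yes", "yes"]

agreement = percent_agreement(test_phase, retest_phase)
kappa, (ci_low, ci_high) = cohen_kappa(test_phase, retest_phase)
print(f"agreement={agreement:.2f}, kappa={kappa:.2f}, "
      f"95% CI=({ci_low:.2f}, {ci_high:.2f})")
```

With a small number of paired answers, the interval around kappa is wide even when observed agreement is high.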
The Bland-Altman plot was used to assess agreement between the chart scores on current health status. We analyzed the data using IBM SPSS Statistics software, version 22 for Windows (released 2013, Armonk, NY, USA).

Results
Table 2 shows the reliability and validity results for the categorical variables of the first part of the questionnaire. As shown, in reliability assessment, only question number [...] The reliability and validity test results for the EQ-5D questionnaire are depicted in Table 3. All values were in the acceptable range.
In the last part (the HADS questionnaire), percentage agreement was calculated for all 14 items; the results are summarized in Table 4.

Discussion
This study was performed to establish the reliability and validity of the Persian version of the three-part questionnaire that will be used in the JUICE Study in Iran. The questions in the first and second parts of the questionnaire were found to be adequately valid and reliable. The third part, which consists of the HADS questionnaire, was moderately reliable and substantially valid.
To assess the sixth item of the EQ-5D questionnaire, we drew a Bland-Altman plot, because this item is a chart that rates the respondent's health status on the day of completion from 0 (worst) to 100 (best). This plot shows the limits of agreement but gives no information on whether these limits are clinically acceptable. 5,7 Because we did not set out to judge the acceptability of each individual's health status, and our main goal was to compare the two scores obtained in each phase, the plot appears to fulfill our purpose (Figures 1 and 2).
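The limits of agreement shown by a Bland-Altman plot can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name and the 0 to 100 chart scores below are hypothetical, and the conventional limits of mean difference ± 1.96 standard deviations are assumed rather than taken from the SPSS output of this study.

```python
import statistics


def bland_altman_limits(first, second, z=1.96):
    """Return (bias, lower limit, upper limit) for paired scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need at least two paired scores")
    diffs = [a - b for a, b in zip(first, second)]
    bias = statistics.mean(diffs)      # mean difference between the phases
    sd = statistics.stdev(diffs)       # sample SD of the differences
    return bias, bias - z * sd, bias + z * sd


# Hypothetical chart scores (0 = worst, 100 = best); not study data.
test_scores = [80, 70, 90, 60, 75]
retest_scores = [78, 72, 88, 63, 75]

bias, lower, upper = bland_altman_limits(test_scores, retest_scores)

# Each plotted point is (mean of the pair, difference of the pair).
points = [((a + b) / 2, a - b) for a, b in zip(test_scores, retest_scores)]
print(f"bias={bias:.2f}, limits of agreement=({lower:.2f}, {upper:.2f})")
```

The limits describe the range in which most between-phase differences fall; whether that range is narrow enough remains a clinical judgment.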
Most studies in this field have assessed the reliability and validity of a questionnaire in a specific, selected group of patients, whereas we enrolled participants at DDRI by convenience sampling, which may be a strength of our study. However, our sample size is smaller than that of other studies, which may explain the wide 95% confidence intervals for the kappa values.
In a study that assessed the reliability and validity of the Persian version of the HADS questionnaire in Iran among 261 depressed and/or anxious patients and 261 healthy controls, acceptable validity and reliability were found. 5 A cross-sectional study performed in an emergency department in Saudi Arabia to assess the reliability and validity of the Arabic version of the HADS questionnaire among 257 participants showed acceptable results in 95% of subjects. The authors used Cronbach's alpha coefficient to evaluate reliability, and it indicated a significant correlation with both the anxiety (0.73) and depression (0.77) subscales of HADS, thereby supporting the validity of the instrument. By means of factor analysis, they obtained a two-factor solution corresponding to the two HADS subscales (anxiety and depression), and they observed a statistically significant correlation (r = 0.57; P < 0.0001) between the two subscales. 7
In a study of the reliability, validity, and responsiveness of the EQ-5D for evaluating health status in patients with social phobia in Germany, a sample of 445 patients was followed with five measurement points over a 30-month period. The discriminative ability of the EQ-5D was analyzed by comparing the patients' responses with those of the general population and across different levels of disease severity. For test-retest reliability, the authors assessed the level of agreement in patients' responses over time, which was moderate (intraclass correlation coefficient > 0.6). Construct validity, analyzed by identifying correlations of the EQ-5D with more specific instruments, was limited. 8
Here, we report the validation procedure for the questionnaire to be used in the Iranian section of the JUICE study. According to our data, the Persian version of this questionnaire is reliable and valid for use in this study.