Validity and Reliability of Persian Version of Henry Ford Hospital Headache Disability Inventory Questionnaire

1Faculty of Rehabilitation Sciences, Tabriz University of Medical Sciences, Tabriz, Iran 2School of Public Health, Department of Statistic and Epidemiology, Tabriz University of Medical Sciences, Tabriz, Iran 3Neurosciences Research Center (NSRC), Aging Research Institute, Tabriz University of Medical Sciences, Tabriz, Iran 4Department of Health and Rehabilitation Sciences, University of Western Ontario, London, Canada


Introduction
Headache disorders are the second leading cause of disability worldwide. 1 According to 33% of patients, headaches have a negative impact on their careers and family relationships. 2 All types of headaches, especially migraine, are significantly associated with reduced quality of life. 3,4 Measuring disability in patients with chronic disorders can predict the severity of the problem and its effects on the psychosocial and economic aspects of life. 5 Although various generic and specific questionnaires have been designed to measure pain and quality of life, most well-known questionnaires in this area address specific types of headache, such as migraine. 5 The most well-known headache-specific questionnaires include the headache impact questionnaire, 6 migraine disability assessment, 7 migraine-specific quality of life questionnaire version 2.1, 8 headache self-efficacy scale, 9 and headache impact test questionnaire. 10 The migraine disability assessment and migraine-specific quality of life questionnaire version 2.1 were developed to assess the headache-related disabilities of migraine patients, while the headache self-efficacy scale contains 51 questions and focuses on the patients' level of self-efficacy. The headache impact questionnaire has only 8 questions and does not cover many aspects of disability, such as emotional and functional disabilities. The headache impact test questionnaire was localized and its validity and reliability were confirmed for migraine and tension-type headaches. However, its validity was not evaluated for some types of headache, such as cluster headaches. In this regard, the Henry Ford Hospital Headache Disability Inventory (HDI) is one of the most comprehensive questionnaires designed to measure disability in patients with headache. This questionnaire was developed to assess the impact of headache on the functional and emotional aspects of daily life.
The Alpha version of HDI (alpha-HDI) included 40 items, but the newest Beta version (beta-HDI) contains 25 items.
The Beta version of HDI, developed by Jacobson (1994) in English, is a generic headache disability questionnaire that is widely administered in related research. [11][12][13][14] This questionnaire measures the effect of different medical and rehabilitation treatments on physical and emotional disability in patients with different types of headaches. 15 Because it is generic, it is not specific to any headache type. Therefore, HDI can be applied to follow the general consequences of different headache types and to compare different aspects of disability across a variety of headaches. Based on the literature, the original HDI has strong internal consistency reliability and construct validity. 16 Test-retest results showed that HDI has an acceptable level of reliability for both the total score and the subscale scores. 16 The validity and reliability of non-English versions of HDI, such as the Spanish version, have been previously investigated and confirmed. 17 Iranian researchers and practitioners need reliable and valid tools to measure the emotional and functional states of patients with different headache disorders. However, the only available questionnaires are the validated Persian versions of the migraine-specific quality of life questionnaire version 2.1 18 and the headache impact test, 19 which measure the disability levels of patients with migraine and tension-type headaches. Therefore, this study was conducted because of the importance of HDI and the need for a valid Persian questionnaire to assess disability levels across different types of headaches.
Considering the growing number of multinational research endeavors, researchers are required to adapt and apply disability measures in various languages. So, this study was carried out to translate and culturally adapt the HDI questionnaire for administration among the Iranian patients with headaches. Moreover, we aimed to assess the validity and reliability of the developed Persian version of this questionnaire.

Study Design, Setting and Participants
The study participants included 250 Persian-speaking patients who attended the Neurology Clinic of Imam Reza Hospital, Tabriz University of Medical Sciences, Tabriz, Iran. Participants were recruited by convenience sampling. Regarding sample size, it has been suggested that the number of participants should be at least 10 times the number of items in the questionnaire. Given that the beta-HDI comprises 25 items, 250 participants were selected for this study. To evaluate the questionnaire's test-retest reliability, we selected 30 participants and asked them to complete the Persian version of the beta-HDI again at one-week and one-month intervals. The questionnaire's face validity and convergent validity were investigated using 20 and 50 patients, respectively. To select the participants, researchers examined the patients who were referred to the hospital over a 3-month period and selected those diagnosed with chronic primary headache. The other inclusion criteria were having chronic daily headaches on at least 15 days a month, as diagnosed by a neurologist, and being fluent in spoken and written Persian. Patients with secondary headaches who had serious underlying problems (infection, trauma, tumors, and brain bleeding) as well as those with severe psychotic and mental dysfunctions were excluded from the study. All participants were asked to sign informed consent forms to enter the study.
To translate the questionnaire and ensure its cross-cultural adaptation, the following steps were taken: establishing the experts' committee, forward and backward translation, preliminary pilot testing, validity assessment (content and construct validity), and reliability assessment (internal consistency and test-retest reliability).

Translation and Cultural Adaptation of HDI
The alpha version of HDI (alpha-HDI) includes 40 items, each of which is answered with "yes" (4 points), "sometimes" (2 points), or "no" (0 points). The items of this version were derived from the response history of the patients with headaches in the trial. From the alpha-HDI, Jacobson et al developed a 25-item beta version (beta-HDI) whose items are grouped into functional and emotional subscales. In this questionnaire, the maximum and minimum attainable scores are 100 and 0, respectively. For the functional and emotional subscales, the maximum attainable scores are 48 and 52, respectively. Participants with a low total HDI score of < 29 (with 95% confidence interval) do not experience great improvement with their headache treatments. 16 To translate the original questionnaire, the authors contacted the HDI developers by email and received the necessary permissions. To translate the HDI and ensure its cultural adaptation, we followed previously published guidelines. 20,21 First, two independent professional translators who were native Persian speakers were asked to translate the HDI into Persian (forward translation). One of the translators was aware of the questionnaire concept and the other was not. Both translators were instructed to translate the questionnaire conceptually rather than literally. The two translations were then compared and merged into a single questionnaire. In the next stage, two professional translators who were native English speakers and blind to the original version translated the Persian questionnaire back into English; both were unaware of the purpose of the questionnaire. Consensus was reached considering the semantic, idiomatic, experiential, and conceptual dimensions of the translated versions. As a result, a pre-final version of the questionnaire was developed.
Later, an expert committee consisting of the translators, researchers, neurologists, statisticians, and physiotherapists reviewed and evaluated the entire translation process and its cultural adaptation during a meeting. Based on the suggestions of this panel, some sentences of the questionnaire were revised in terms of cultural conventions, wording, and consistency. Consensus was reached with regard to semantic, idiomatic, experiential, and conceptual equivalence. For example, terms judged "difficult" or "incomprehensible" were identified and replaced with simpler, more commonly used terms. Finally, the pre-final version of the questionnaire was prepared.
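The beta-HDI scoring rule described at the start of this section can be illustrated with a minimal sketch. The point values (4, 2, 0), the 25-item length, and the subscale maxima (48 functional, 52 emotional) come from the text. The function name `score_hdi` and the item-to-subscale assignment `EMOTIONAL_PLACEHOLDER` are hypothetical, because the text does not list which items belong to which subscale.

```python
# Points per response option, as stated in the text.
POINTS = {"yes": 4, "sometimes": 2, "no": 0}


def score_hdi(responses, emotional_items):
    """Return (functional, emotional, total) scores for one respondent.

    responses: dict mapping item number (1..25) to "yes", "sometimes" or "no".
    emotional_items: set of item numbers in the emotional subscale.
        This key is an assumption of the sketch; the real assignment must
        be taken from the published questionnaire.
    """
    if len(responses) != 25:
        raise ValueError("the beta-HDI has 25 items")
    emotional = sum(POINTS[answer] for item, answer in responses.items()
                    if item in emotional_items)
    functional = sum(POINTS[answer] for item, answer in responses.items()
                     if item not in emotional_items)
    return functional, emotional, functional + emotional


# Placeholder split: 13 emotional items (maximum 52) and
# 12 functional items (maximum 48). The item numbers are NOT the real key.
EMOTIONAL_PLACEHOLDER = set(range(1, 14))

all_yes = {item: "yes" for item in range(1, 26)}
print(score_hdi(all_yes, EMOTIONAL_PLACEHOLDER))  # (48, 52, 100)
```

Answering "yes" to every item reproduces the maxima given in the text, which is a quick consistency check on any scoring key.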

Face Validity
To measure the comprehensibility of the Persian version of HDI, the pre-final version was administered to 20 patients in the presence of one of the authors as preliminary pilot testing. The questions were then analyzed to determine whether they measured what they were supposed to measure. The participants were asked to complete the questionnaire carefully and to report any misunderstandings or difficulties they had in understanding the questions. Furthermore, the authors examined how well each item matched the questionnaire's objectives and whether it was ambiguous, confusing, or misunderstood at the participants' level of understanding. Necessary revisions were made to the phrases or words of the final Persian version of the HDI. According to the patients' suggestions, the experts' panel replaced "handicap" with "disability", "entertainment" with "hobby", "fear" with "afraid of", "stress" with "tension", "around the people" with "along with people", "I believe" with "I accept", and "social relationship" with "social gathering". Moreover, the sentence "It is tough for me to divert my mind from the headache and think about another things" was changed to "I can hardly ever get my mind away from a headache and focus on other things".

Content Validity
Content validity of the questionnaire was assessed by a panel of experts including two neurologists, two experts with a PhD in physical therapy, one expert with a PhD in neuroscience, and one expert with a PhD in Persian literature, who was a native English speaker. For content validity, the experts' recommendations were adopted on grammar, use of appropriate and correct words, order of words in the items, and appropriate scoring. Modifications were evaluated by the authors and the required changes were applied. 22

Statistical Analysis
Construct validity of the HDI questionnaire was assessed using the structural and convergent validity. Confirmatory factor analysis (CFA) is commonly applied to investigate structural validity as a type of construct validity. [23][24] The purpose of CFA was to investigate whether the collected data fit the hypothesized measurement model. The original version of HDI included the functional and emotional subscales. Therefore, to evaluate the structural validity of the translated Persian questionnaire, the relationship between 25 items of the functional and emotional subscales was assessed by CFA using the maximum likelihood method.
In general, if the overwhelming majority of indices indicate a good fit, a good fit is probable. The model is considered to fit acceptably when the ratio of chi-square to degrees of freedom is ≤2 or 3, the Tucker-Lewis index and comparative fit index are ≥0.95, the standardized root mean square residual is ≤0.08, and the root mean square error of approximation is < 0.06-0.08. Following the original questionnaire, we assessed the fit of the model with its two domains and their items. After fitting this primary model, the goodness-of-fit criteria were moderate; therefore, the model was modified to improve its fit using the modification indices of the CFA. Moreover, some covariances were added to the model between items where this was theoretically justified.
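The cut-offs listed above can be written as a simple checklist. The following is only an illustrative sketch: the function name `check_fit` is hypothetical, and where the text gives a range ("≤2 or 3", "< 0.06-0.08") the sketch assumes the more lenient bound. The example values are the chi-square ratio, CFI, and RMSEA reported in the Results.

```python
def check_fit(chi2_df, cfi, rmsea, tli=None, srmr=None):
    """Compare reported fit indices with the cut-offs listed in the text.

    Assumption of this sketch: where the text gives a range, the more
    lenient bound is used (chi2/df <= 3, RMSEA < 0.08).
    Returns a dict mapping each criterion to True (met) or False (not met).
    """
    checks = {
        "chi2/df <= 3": chi2_df <= 3,
        "CFI >= 0.95": cfi >= 0.95,
        "RMSEA < 0.08": rmsea < 0.08,
    }
    # TLI and SRMR are optional because they are not always reported.
    if tli is not None:
        checks["TLI >= 0.95"] = tli >= 0.95
    if srmr is not None:
        checks["SRMR <= 0.08"] = srmr <= 0.08
    return checks


# Values reported in the Results: chi2/df = 1.99, CFI = 0.90, RMSEA = 0.06.
for criterion, met in check_fit(chi2_df=1.99, cfi=0.90, rmsea=0.06).items():
    print(f"{criterion}: {'met' if met else 'not met'}")
```

A checklist of this kind only restates the rule that fit is judged from several indices together; it does not replace inspection of the model itself.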
To assess convergent validity, the Short-Form Health Survey (SF-36) was used. The SF-36 is a general quality of life assessment tool that measures 8 health-related domains: physical functioning (10 items), role physical (4 items), bodily pain (2 items), general health perceptions (5 items), vitality (4 items), social functioning (2 items), role emotional (3 items), and perceived mental health (5 items). The SF-36 also contains an additional item on the perceived change in the individual's general health status during the last year (health transition). 26 The items of this questionnaire use different response formats, ranging from two response options to 6-point scales. Scores for each domain range from 0 (poor health) to 100 (good health). The psychometric properties of the Persian version of the SF-36 are well documented in the literature. 27 To verify convergent validity, Pearson correlation coefficients were calculated between the functional subscale of the HDI and the physical functioning, role physical, bodily pain, and general health domains of the SF-36. Furthermore, correlations between the emotional subscale of the HDI and the vitality, social functioning, role emotional, and mental health domains of the SF-36 were calculated.
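The Pearson coefficient used for convergent validity can be sketched in a few lines. The study itself used Stata; the function below is a plain-Python illustration, and the two score lists are invented for demonstration only, not study data.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations around the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Invented example scores: higher HDI functional scores (more disability)
# paired with lower SF-36 physical functioning scores (poorer health).
hdi_functional = [10, 24, 30, 38, 44]
sf36_physical = [85, 70, 75, 50, 40]
print(round(pearson_r(hdi_functional, sf36_physical), 3))
```

Because the two instruments are scored in opposite directions, convergence shows up as a negative coefficient.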
To assess the test-retest reliability of HDI, 30 patients were asked to complete its Persian version three times: initially, after one week, and after one month. Test-retest reliability was evaluated using the intraclass correlation coefficient (ICC) from a one-way random effects model for the whole questionnaire and for its functional and emotional subscales. The internal consistency of HDI was evaluated using Cronbach's alpha coefficient.
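Both reliability coefficients can be sketched from their textbook definitions. The text specifies a one-way random effects model but not whether single or average measures were reported, so the sketch assumes the single-measure form, ICC(1,1). The function names and the example data are illustrative only and do not reproduce the Stata analysis.

```python
from statistics import mean, variance


def icc_oneway(scores):
    """Single-measure ICC from a one-way random effects ANOVA, ICC(1,1).

    scores: list of rows, one per subject, each holding k repeated scores.
    """
    n, k = len(scores), len(scores[0])
    grand = mean(v for row in scores for v in row)
    row_means = [mean(row) for row in scores]
    # Between-subject and within-subject mean squares.
    msb = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    msw = sum((v - m) ** 2
              for row, m in zip(scores, row_means) for v in row) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


def cronbach_alpha(items):
    """Cronbach's alpha; items has one row per respondent, one column per item."""
    k = len(items[0])
    item_vars = [variance(col) for col in zip(*items)]
    total_var = variance([sum(row) for row in items])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Invented total scores at baseline and retest for four patients.
retest = [[40, 42], [60, 58], [22, 24], [80, 78]]
print(round(icc_oneway(retest), 3))

# Invented answers (0, 2 or 4 points) of four patients on three items.
answers = [[0, 2, 0], [2, 2, 4], [4, 4, 4], [0, 0, 2]]
print(round(cronbach_alpha(answers), 3))
```

The one-way model treats the measurement occasions as interchangeable, which is why only subject means and within-subject deviations enter the formula.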
The normality assumption underlying Pearson correlation analysis was assessed using two numerical measures, skewness and kurtosis. All statistical analyses were done using Stata 14.0 (StataCorp, College Station, Texas 77845 USA). The level of significance was set at a p-value of less than 0.05.
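A numerical normality screen of this kind can be sketched as follows. The text does not state which estimators were used; this sketch assumes moment-based skewness and excess kurtosis (so that a normal distribution scores 0 on both), with the -1 to +1 range applied in the Results as the acceptance limit. The names `skew_kurt` and `roughly_normal` and the sample data are hypothetical.

```python
def skew_kurt(x):
    """Moment-based skewness and excess kurtosis of a list of numbers."""
    n = len(x)
    m = sum(x) / n
    # Central moments of order 2, 3 and 4 (population form, divided by n).
    m2 = sum((v - m) ** 2 for v in x) / n
    m3 = sum((v - m) ** 3 for v in x) / n
    m4 = sum((v - m) ** 4 for v in x) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3


def roughly_normal(x, limit=1.0):
    """True when both indices fall within [-limit, +limit]."""
    skewness, kurtosis = skew_kurt(x)
    return abs(skewness) <= limit and abs(kurtosis) <= limit


# Invented sample for demonstration only.
sample = [1, 2, 2, 3, 3, 3, 4, 4, 5]
print(skew_kurt(sample), roughly_normal(sample))
```

Statistical packages may report raw kurtosis (3 for a normal distribution) rather than excess kurtosis, so the convention should be checked before applying a ±1 limit.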

Results
Just over half of the participants were female (52.4%), and 31.2% were housewives. Most patients reported moderate (44.4%) or severe (42.4%) headaches. Table 1 summarizes the participants' demographic information and Table 2 shows the participants' mean scores on the HDI questionnaire. The goodness-of-fit indices of the CFA for the construct validity of the HDI are shown in Table 3. The comparative fit index (CFI) was 0.90 and the root mean square error of approximation (RMSEA) with 95% CI was 0.06 (0.05, 0.07), which indicated a relatively good model fit. The ratio of chi-square to degrees of freedom was 1.99, which was acceptable. 28 Internal consistency reliability was evaluated using Cronbach's alpha, which was 0.91 [95% CI: (0.89, 0.92)] for the total questionnaire, 0.82 [95% CI: (0.78, 0.85)] for the functional subscale, and 0.86 [95% CI: (0.83, 0.88)] for the emotional subscale. The normality assumption for Pearson correlation was met, as the skewness and kurtosis indices ranged from -1 to +1 for all variables. Therefore, we used the Pearson correlation to assess the correlation between the HDI and SF-36 subscales. A significant negative correlation was observed between quality of life and HDI (Tables 4 and 5), which confirmed the construct validity of the HDI questionnaire. The ICCs (95% CI) for one-week reliability of the emotional and functional subscales as well as the total questionnaire score were 0.97 (0.93-0.98), 0.96 (0.95-0.97), and 0.97 (0.95-0.97), respectively. The ICCs (95% CI) for one-month reliability of the emotional and functional subscales as well as the total questionnaire score were 0.96 (0.91-0.98), 0.95 (0.98-0.97), and 0.96 (0.93-0.98), respectively.

Discussion
Given the growing demand for measurement of disability caused by chronic headaches, we decided to translate the HDI into Persian and to evaluate the validity and reliability of the Persian version. To this end, a panel of Iranian experts and a sample of patients were recruited.
A valid tool accurately reflects the conceptual areas that it was designed to measure. 22 Face validity is defined as the extent to which the items appear to be meaningful and indicates whether the tool is assessing the desired items. 23 Content validity indicates the extent to which a tool represents the construct that is to be measured. 25 According to the results of the present study, the Persian version of the HDI had satisfactory content validity. The expert panel judged all questions in the questionnaire to be necessary. Thus, content validity was verified for all questions in terms of necessity, and no questions were eliminated.
Note: VAS, visual analog scale (average pain in recent months): VAS = 1-3 mild, VAS = 3-7 moderate, VAS = 7-10 severe.

The results of the CFA showed that the questionnaire's [...] 16 The significant negative correlation between the scores obtained from the HDI and the SF-36 shows the convergence between the two questionnaires; in the SF-36, lower scores indicate higher disability, while in the HDI, lower scores indicate less disability. For example, the correlation between the HDI functional subscale and the physical functioning domain was -0.371, which was significant. Furthermore, the correlation of the physical functioning domain with the total HDI score was -0.358, which was also significant. However, the correlation between the bodily pain domain of the SF-36 and the functional subscale of the Persian version of the HDI was not significant. This finding was expected because of the low number of items in the bodily pain domain of the SF-36 (only 2 questions) compared with the functional subscale of the Persian version of the HDI (12 questions), as well as the dissimilarity between the questions in these two subscales. Regarding construct validity, a significant correlation was found between the physical domains of the SF-36 and both the functional subscale and the total score of the HDI. Moreover, a significant correlation was observed between the mental domains of the SF-36 and both the emotional subscale and the total score of the HDI. Therefore, the Persian version of the HDI also showed significant construct validity.
As evaluated by the ICC, the reliability of the total questionnaire score was 0.97 for the Persian version versus 0.83 for the original HDI. The findings indicated that the subscales were more reliable than those of the original HDI and that the Persian version of the HDI had satisfactory internal consistency and test-retest reliability. Our reliability findings were similar to those of the original HDI. 15 The first step in using any health tool in a country is to determine the "perceived health" of the patients, which depends on socio-cultural factors and cultural differences. These factors may affect the performance and interpretation of a questionnaire. We tried to remain faithful to the original text of the HDI, but we had to change and adapt some terms to the Iranian culture. For example, "social gathering" was very difficult to translate into Persian in a way that ordinary Iranian people would understand. Thus, we changed the wording several times to achieve the best construction for Persian-speaking people.
Floor and ceiling effects are among the challenging problems in measurement. They indicate that a questionnaire cannot adequately measure the minimum and maximum states of the participants (such as pain or disability). These effects are considered present when more than 15% of the patients achieve the worst or the best possible score.
In the Persian version of the HDI questionnaire, the percentage of people who obtained the highest or lowest score on the total scale was less than 15%. Therefore, ceiling and floor effects were not observed for the total scale. However, because the items of the HDI questionnaire have only three options (scored as 0, 2, and 4), the percentage of participants with the lowest or highest score on some items was higher than 15%. To eliminate floor and ceiling effects and to normalize the distribution of data, additional response options would be needed, with the scores distributed among them so that an item can have attainable scores of 0, 1, 2, 3, and 4.
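The 15% criterion for floor and ceiling effects can be expressed as a short check. This is a minimal sketch: the function name `floor_ceiling` is hypothetical, the default range of 0 to 100 is the total-score range given for the beta-HDI, and the example scores are invented rather than taken from the study.

```python
def floor_ceiling(scores, min_score=0, max_score=100, threshold=0.15):
    """Share of respondents at the lowest and highest attainable score.

    An effect is flagged when the share exceeds the threshold (15% in the text).
    """
    n = len(scores)
    floor_share = sum(s == min_score for s in scores) / n
    ceiling_share = sum(s == max_score for s in scores) / n
    return {
        "floor_share": floor_share,
        "ceiling_share": ceiling_share,
        "floor_effect": floor_share > threshold,
        "ceiling_effect": ceiling_share > threshold,
    }


# Invented total scores for ten respondents (not study data).
totals = [0, 10, 50, 100, 100, 60, 70, 80, 90, 40]
print(floor_ceiling(totals))
```

The same function can be applied to a single item by passing `max_score=4`, which is where three-option items are most likely to exceed the threshold.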
In a similar study, Rodríguez examined the validity and reliability of the Spanish HDI questionnaire. 17 In that study, 84 patients with migraine and tension-type headaches, with an average age of 38.4 years, were examined. The findings indicated good reliability based on internal consistency, with a Cronbach's alpha coefficient of 0.94. The reliability of the HDI in the reference study was also evaluated as very good, similar to the Cronbach's alpha coefficients reported by Jacobson et al 16 and in our study (0.91). The correlation coefficients were 0.76 and 0.97 for the Spanish and Persian versions of this questionnaire, respectively, which are comparable to that of the original version (0.83). Therefore, it can be concluded that the Persian version has achieved acceptable levels of reliability compared with the original version and the Spanish translation of the HDI. The HDI was also translated into German, but we could not access this version.
In conclusion, based on our findings, the Persian version of HDI was developed and its validity and reliability were confirmed as a tool for measuring the quality of life of patients with headaches in both functional and emotional subscales. The data collected through this questionnaire can help the authorities to design the required treatment strategies. This questionnaire can be administered in neurological and rehabilitation research projects and in headache clinical centers due to its good factor structure.

Clinical Relevance
There is an urgent need for a reliable and valid questionnaire to assess the quality of life of patients with chronic headaches who are referred to rehabilitation and health centers in Iran. Furthermore, clinicians need to know the effectiveness of the treatment process and the impact of therapeutic practices on patients before and after treatment. This study was conducted to fill these gaps.

Authors' Contribution
ZS, PS, SJ, MR and MF designed the study, SJ collected data, SJ and ZS and PS wrote the manuscript, PS analyzed data; MR and MG edited the manuscript and contributed to interpretation of the results. MF visited and referred the patients.