The First Inherited Retinal Disease Registry in Iran: Research Protocol and Results of a Pilot Study

Hamideh Sabbaghi, PhD1,2; Narsis Daftarian, MD3; Fatemeh Suri, PhD4; Mehraban Mirrahimi, MS4; Sina Madani, MD5; Abbas Sheikhtaheri, PhD6; Farid Khorrami, PhD7; Proshat Saviz, BS6; Mohammad Zarei Nejad, PhD8; Ali Tivay, MS8; Hossein Ali Shahriari, MD9; Alireza Maleki, MD9; Seyed Sajad Ahmadi, MD9; Monireh Sargazi, MS9; Frans P.M. Cremers, PhD10; Maryam Najafi, PhD10; Barbara Vona, PhD11,12; Thomas Haaf, MD11; Paulina Bahena-Carbajal, MD11; Afrooz Moghadasi, MS4; Houra Naraghi, MS4; Mehdi Yaseri, PhD13; Bahareh Kheiri, MS4; Masoumeh Kalantarion, MS4; Elham Sabbaghi, MS14; Mahtab Salami, BS2; Laleh Pazooki, BS2; Kazem Zendedel, MD, PhD15,16; Shahnaz Mojarrab, PhD15; Hamid Ahmadieh, MD4*

The prevalence of retinitis pigmentosa (RP), the most common type of IRD, is estimated to be 1 per 4000 individuals worldwide. [4-6] According to the definition presented by the National Committee on Vital and Health Statistics, a patient or disease registry is an organized system for the collection, storage, retrieval, analysis, and dissemination of information, and such registries are widely applied in medicine and public health. [7,8] Registries are essential tools for providing information about the prevalence and natural history of a particular disease, discovering associated environmental and genetic factors, and assessing prognosis under different therapeutic approaches. [8-11] From an administrative perspective, patient and disease registries are efficient tools for improving health outcomes and reducing treatment costs. In addition, they can be useful for developing clinical guidelines and recruiting patients for clinical trials. [8-13] The number, size, and scope of patient registries in ophthalmology have grown gradually worldwide during the last decade. [14] A comprehensive review by Tan et al showed that a total of 85 clinical registries have been established in various fields of ophthalmology, including blindness or low vision (n = 19), corneal transplantation (n = 10), glaucoma (n = 10), cataract or refractive surgery (n = 8), retinoblastoma (n = 7), endophthalmitis or uveitis (n = 6), ocular trauma (n = 6), age-related macular degeneration (n = 5), microphthalmia or congenital ocular anomalies (n = 4), diabetic retinopathy (n = 2), and retinal detachment or macular hole (n = 2). [14] Additionally, there are several clinical registries focusing on IRDs or RP throughout the world. [2,14-17]
A well-organized disease identification based on the genotypic and phenotypic characteristics of each type of IRD provides a comprehensive dataset of actual data with a geographical focus, which can be crucial for gene therapy in the future. [3] The complete set of inherited and environmental factors contributing to IRDs in the Iranian population has not been systematically cataloged because of a lack of access to standardized information resources. A national patient registry for IRDs has the potential to identify disease prevalence along with genetic inheritance patterns, disease-causing genes and mutations, and clinical outcomes, and to set a framework for applying possible therapeutic approaches. [8-13] The drive to learn more about these aspects in our patient population resulted in the development of the first national registry for IRDs; the research protocol and the initial results of the pilot phase are described below.

Material and Methods
This community-based participatory study was performed in two referral eye centers: Labbafinejad Medical Center in the capital city, Tehran, and Alzahra Eye Hospital in Zahedan, a city in the southeast of Iran. The pilot discovery phase was conducted to identify possible barriers to implementing the national phase in multiple cities with various geographical characteristics. The following steps describe how this registry was developed and implemented.

Case Definition and Recruitment in our National IRD Registry
Patients from all age groups were recruited sequentially following a general announcement via social media and patient support associations. Initial registration was performed for Iranian individuals who were diagnosed with IRD based on clinical and paraclinical findings by a board-certified retina specialist. The IRD diagnosis was later confirmed by genetic testing in some of the registrants.

Data Elements
In this phase, a minimum data set (MDS) for the IRD registry was developed in three steps: literature review, classification of data elements, and final approval by the Delphi technique. An initial data set was derived from a clinical form used in the ophthalmology clinic at Labbafinejad Medical Center, Tehran, Iran. The data set was then extended based on other IRD registries worldwide. We conducted several interviews with academic retina specialists to verify that the data set was accurate in practice.
Subsequently, the data set resulting from the interviews was presented to the Delphi panelists. In the last step, the final MDS was validated using two rounds of the Delphi technique. For this purpose, subject matter experts were invited from different domains, including board-certified retina specialists (n = 5), epidemiologists (n = 2), geneticists (n = 2), and a visual science specialist (n = 1). A questionnaire was presented to the panelists to determine, for each data element, its importance on a 5-point Likert scale (from 1, lowest, to 5, highest), its display format (numeric, text, multiple choice, or binary), and the necessity of recording it (required or optional). All participants were asked to add suggestions for each data element or to propose additional elements if necessary.
Inclusion criteria of data elements in the MDS were based on the level of agreement resulting from the participants' response in Delphi rounds. Elements with a score of 4 or 5 by at least 70% of participants were included in the MDS, while elements with a score of 1 or 2 from at least 70% of participants were excluded. Afterwards, the remaining data elements based on participants' suggestions were discussed in the second round of Delphi technique.
In this step, the first questionnaire was revised based on the feedback received from participants, with no option for additional suggestions, and was presented in the second Delphi round. The results of the first round were first presented to the participants, who were then asked to score each data element as before. As in the first round, data elements scored 4 or 5 by at least 70% of participants were included in the final MDS, and the remaining elements were disregarded.
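The consensus rule described above can be illustrated with a minimal sketch. It is not the authors' tooling; the function name, the return labels, and the `final_round` flag are assumptions introduced here, and only the 70% threshold and the score bands (4 or 5 to include, 1 or 2 to exclude) come from the text.

```python
from typing import List

AGREEMENT_PERCENT = 70  # "at least 70% of participants", as stated in the text


def classify_element(scores: List[int], final_round: bool = False) -> str:
    """Classify one data element from the panelists' 1-5 Likert scores.

    Labels are hypothetical: "include", "exclude", "second_round", "disregard".
    """
    if not scores:
        raise ValueError("at least one score is required")
    if any(s < 1 or s > 5 for s in scores):
        raise ValueError("scores must be on the 1-5 Likert scale")

    n = len(scores)
    high = sum(1 for s in scores if s >= 4)  # panelists scoring 4 or 5
    low = sum(1 for s in scores if s <= 2)   # panelists scoring 1 or 2

    # Integer comparison avoids floating-point edge cases at exactly 70%.
    if high * 100 >= AGREEMENT_PERCENT * n:
        return "include"
    if final_round:
        # In the second round, everything without consensus is disregarded.
        return "disregard"
    if low * 100 >= AGREEMENT_PERCENT * n:
        return "exclude"
    return "second_round"
```

With a ten-member panel, as in this study, seven scores of 4 or 5 are enough for inclusion, and an element with mixed scores is carried to the second round.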

Data Collection and Quality Assurance
The quality control phase comprised three steps: data collection, data entry, and final registration confirming the IRD diagnosis. For this purpose, a standard administrative protocol was developed and approved by the steering committee members. All examination tools, including the low vision chart, as well as the procedural guideline for patient examination, were standardized for consistent data collection. Furthermore, software was developed with validation rules that prevent non-standard data entry. Clinical staff were also trained to document patients' data either online or offline. The medical history of each patient was recorded carefully to support an accurate final diagnosis. Genetic testing was performed for some of the registered cases to verify the diagnosis. Furthermore, family members of some patients were recruited to better characterize genotype-phenotype correlations.

Governance and Steering Committee
The work for this national registry was approved and funded by the Iranian Ministry of Health and Medical Education in 2016. We established a national steering committee to manage eight sub-committees: health terminology and coding, financial and administrative, data entry, quality control and evaluation, statistical analysis and epidemiology, information technology, scientific and research, and documentation and website update (Figure 1). Meetings were held annually to investigate and resolve barriers and to make decisions about the future of the program.

Biobank and Biological Samples
Blood samples were collected from all consenting participants to create a DNA biobank for genetic studies. Genomic DNA was extracted from peripheral blood leukocytes using the standard salting out method. A unique ID for each sample was used for the DNA collection tubes. We aimed to create a DNA biobank linked to phenotypic data derived from a web-based registry software in order to increase our understanding of causative genes/mutations and genotype-phenotype correlations among Iranian patients affected by IRD. This knowledge base could potentially help with genetic counseling of the affected families and with planning for future genetic-based therapeutic approaches.

Genetic Testing
We analyzed 195 patients from 122 unrelated Iranian families who participated in our IRD registry, using two sequencing techniques: targeted sequencing of 108 IRD-associated genes (160 patients from 92 unrelated families) and whole exome sequencing (35 patients from 30 unrelated families), in collaboration with two academic institutes from the Netherlands and Germany, respectively.
A cost-effective targeted sequencing approach based on molecular inversion probes (MIPs) was used to sequence the coding regions of 108 non-syndromic IRD-associated genes in probands of 92 unrelated families. For the 30 other families, exome enrichment (Nextera Rapid Capture, Illumina) was carried out by hybridization of all coding exons and flanking intron sequences, followed by paired-end 2x76 bp sequencing on the NextSeq 500 sequencer (Illumina) using the NextSeq Reagent Kit v2. Sequence alignment and variant calling were performed against the human reference genome UCSC NCBI37/hg19 using CASAVA software (Illumina). Additionally, the wANNOVAR (http://wannovar.wglab.org/) and ENSEMBL (http://asia.ensembl.org/info/docs/tools/index.html) tools were used for variant detection.
Inherited retinal dystrophies are genetically heterogeneous disorders. Variants in more than 300 genes have been described to cause IRDs (RetNet; Retinal Information Network, https://sph.uth.tmc.edu/retnet/). Variations in all identified IRD genes were considered for genetic analysis in our research. Subsequently, candidate variations were selected by removing SNPs with a minor allele frequency (MAF) > 0.01 in the dbSNP database (https://www.ncbi.nlm.nih.gov/snp/), the 1000 Genomes database (www.1000genomes.org), the NHLBI Exome Sequencing Project (http://evs.gs.washington.edu/EVS/), and the Exome Aggregation Consortium database (http://exac.broadinstitute.org/). We first searched the list of variations for already known pathogenic mutations. In the absence of known pathogenic mutations, the candidate variants were screened for segregation in the pedigrees by Sanger sequencing of DNA obtained from available family members. For nucleotide changes that were novel or of unknown clinical significance, in silico bioinformatics tools were used to predict and score the deleterious effects of the identified variants on the protein product. Cases with likely pathogenic variant(s) were flagged when the affected gene matched the described phenotype and the detected variant(s) segregated in the family.
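The frequency-based filtering step can be sketched as follows. This is an illustrative sketch only, not the pipeline used in the study: the record layout, the variant identifiers, the frequency values, and the choice to keep variants that are absent from a database (recorded as `None`) are all assumptions. Only the cutoff (MAF > 0.01 is removed) and the four databases come from the text.

```python
from typing import Dict, List, Optional

MAF_CUTOFF = 0.01  # variants with MAF > 0.01 in a population database are removed


def is_rare(maf_by_database: Dict[str, Optional[float]],
            cutoff: float = MAF_CUTOFF) -> bool:
    """Return True if no database reports a MAF above the cutoff.

    A value of None means the variant is absent from that database;
    treating absence as "rare" is an assumption of this sketch.
    """
    return all(maf is None or maf <= cutoff for maf in maf_by_database.values())


def filter_candidates(variants: List[dict],
                      cutoff: float = MAF_CUTOFF) -> List[dict]:
    """Keep only variants that pass the rarity check in every database."""
    return [v for v in variants if is_rare(v["maf"], cutoff)]


# Hypothetical records, invented purely for illustration.
example_variants = [
    {"id": "variant_A",
     "maf": {"dbSNP": None, "1000G": None, "ESP": None, "ExAC": 0.0002}},
    {"id": "variant_B",
     "maf": {"dbSNP": 0.15, "1000G": 0.12, "ESP": 0.14, "ExAC": 0.13}},
    {"id": "variant_C",  # exactly at the cutoff, so not "> 0.01" and kept
     "maf": {"dbSNP": 0.01, "1000G": 0.004, "ESP": None, "ExAC": 0.003}},
]

candidate_ids = [v["id"] for v in filter_candidates(example_variants)]
```

Variants that survive this filter would then go on to the known-mutation lookup, segregation analysis, and in silico prediction steps described above.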

Software Development
The steering committee selected an expert engineering team affiliated with Amir Kabir University of Technology in Tehran, Iran, for software development, as well as a private vendor that contributed to software support and maintenance. The steering committee members held several business analysis meetings with the company's analysts and developers, who were medical informaticists and specialists in health information technology. In these meetings, the software functionalities, data forms, user access levels, validation rules, and pre-defined reports were specified. The Iran IRD registry software is a web-based application (http://irdreg.org) built on the open-source DHIS2 platform, utilizing PostgreSQL and Java.
The developed software covers several informational sections, such as patient demographic and hereditary characteristics, history of systemic and ocular diseases, biobank data, visual and ophthalmic findings, paraclinical information, genetic data, and final IRD diagnosis, in an electronic case report format. Several validation rules were designed to ensure data validity, format, and completeness. Additionally, the final IRD diagnosis concepts were verified (Figure 2).
For security and data accessibility, we defined several organizational units for our collaborative centers all around the country and designated specific users for each unit. Every user could only have access to his/her profile's data. The software is centrally maintained by the Ophthalmic Research Center affiliated to Shahid Beheshti University of Medical Sciences. In this center, the national principal investigator and executive director of the project have access to all data.

Statistical Analysis
To describe the data, we used mean, standard deviation, median and interquartile range, frequency, and percentage. The Clopper-Pearson 95% confidence interval was used to express the precision of percentage estimates. All statistical analyses were performed using the STATA software (StataCorp. 2017. Stata Statistical Software: Release 15. College Station, TX: StataCorp LLC).
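For readers unfamiliar with the Clopper-Pearson (exact binomial) interval, the following is a minimal sketch using only the Python standard library. The authors used Stata, so this is an independent illustration and not their code. The function names are assumptions, and the bounds are found by bisection on the binomial tail probabilities rather than by a beta quantile function.

```python
import math


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p), summed in log space for stability."""
    if k < 0:
        return 0.0
    if k >= n:
        return 1.0
    if p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 0.0
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for i in range(k + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1)
                   - math.lgamma(n - i + 1) + i * log_p + (n - i) * log_q)
        total += math.exp(log_pmf)
    return min(total, 1.0)


def _bisect_decreasing(f, iterations: int = 100) -> float:
    """Root of a function that decreases from positive to negative on (0, 1)."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def clopper_pearson(x: int, n: int, alpha: float = 0.05):
    """Exact two-sided (1 - alpha) confidence interval for x successes in n trials."""
    if n <= 0 or not 0 <= x <= n:
        raise ValueError("require n > 0 and 0 <= x <= n")
    # Lower bound: the p at which P(X >= x) equals alpha / 2 (0 when x == 0).
    lower = 0.0 if x == 0 else _bisect_decreasing(
        lambda p: alpha / 2 - (1.0 - binom_cdf(x - 1, n, p)))
    # Upper bound: the p at which P(X <= x) equals alpha / 2 (1 when x == n).
    upper = 1.0 if x == n else _bisect_decreasing(
        lambda p: binom_cdf(x, n, p) - alpha / 2)
    return lower, upper
```

Calling `clopper_pearson(x, 1001)` with the observed count `x` for a given diagnosis would produce an interval of the kind reported in the Results; the counts themselves are not restated here.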

Results
The IRD registry was developed according to the above-mentioned steps in two study phases, qualitative and quantitative. The registry protocol was also registered and disclosed on ClinicalTrials.gov (www.clinicaltrials.gov; registration number NCT04131400). The workflow from patient recruitment to patient registration is presented in Figure 1.

Minimum Data Set
Based on a comprehensive review of the literature, clinical forms from other registries, and reference textbooks, a total of 160 preliminary data elements were identified. This list was verified by individual interviews with each of the academic retina specialists and reduced to 145 data elements in eight categories: demographic information, hereditary characteristics, history of systemic and ocular status, biobank data, visual examination, ophthalmic findings, paraclinical data, and genetic findings. The final list, however, contains a total of 151 data elements in the eight above-mentioned categories, which were approved and validated by two rounds of the Delphi technique (some of the data elements are presented in Supplementary file 1). Figure 3 presents a flowchart showing the process by which the elements were included in the final dataset.

Demographic Characteristics
A summary of the demographic characteristics of the registered patients is presented in Table 1. A total of 1001 IRD patients with a mean age of 32.41 ± 15.60 years (range, 3 months to 74 years) were registered in our database. Clinical examinations of 555 male subjects (56%) and 446 female subjects (44%) were recorded. The largest age group of IRD patients (26.7%) was in the third decade of life, while only 10 patients were aged one year or younger. Furthermore, parental consanguinity was recorded in 76% of the registered patients (a first degree relationship was observed in 66%), and 58% of our cases had a positive family history of at least two affected individuals in their family or relatives.
The majority of registered patients were of Persian ethnicity (51%). Additionally, 35% of the registered patients were employed in occupations suited to their visual ability, while 13% were jobless due to their visual impairment. The majority of our registered patients (57%) had a low level of education, and 3% were illiterate due to the early onset of severe visual impairment.
[...] 19 was identified in 30% of our patients. The mean spherical equivalent of refractive errors was -1.36 ± 3.96 diopters (D; range, -17.75 myopia to +11.75 hyperopia), with a high percentage of myopia (44.5%, Table 2). Figure 4 illustrates the percentage of IRD diagnoses in the eight general categories according to the functional and genetic entities. The largest group of registered patients had retinitis pigmentosa (42%, 95% CI: 38.9% to 45%), which was classified as the most prevalent subtype of diffuse photoreceptor dystrophy. Leber's congenital amaurosis (15%, 95% CI: 12.9% to 17.3%) and cone-rod dystrophy (11.4%, 95% CI: 9.5% to 13.5%) were the second and third most common diagnoses in the study population.

Genetic Report
To date, our biobank holds DNA samples from the majority of the registered patients, representing 843 unrelated families. About 15% of these families (122 of 843 families) and 20% of all registered patients (195 of 1001 patients) have been genetically analyzed, and the genetic reports have been returned to the participants. The causative mutations in about 72% of the investigated families were identified in known IRD-causing genes, and about 80% of the identified variants were novel as IRD-causing mutations. The final diagnosis based on retinal imaging and clinical examination was confirmed by analysis of all genetically studied cases.

Discussion
The IRD registry that we developed is the first national registry established for eye diseases in Iran. For the pilot phase, two medical centers in different geographical zones with diverse socioeconomic conditions were selected, one in the capital and one in a city in the southeast of Iran, in order to identify possible obstacles to establishing the registry.
The IRD registry application software was developed based on the open-access DHIS2, which is one of the main health information systems (HIS) for data collection in disease registries. [20] Registry outputs are frequently used in decision making in health care systems. [20,21] According to a recent review by Dehnavieh et al, [21] DHIS2 has been used in patient registries at different levels of HIS in 46 countries. Additionally, electronic health records (EHRs) can not only benefit individual research studies but can also facilitate scientific collaboration with other international research centers. [22] Electronic health information from patient registries or other healthcare data sources is considered a valuable resource that is essential in any national HIS for identifying the natural history of diseases as well as common therapeutic plans. [23] Therefore, linkage between various data (and ultimately knowledge) objects is one of the fundamental considerations in registry database design. In the IRD registry, the personal profile of each patient is identified by his/her 10-digit standard national code, which makes it possible to link the data with other sources of patient information in EHRs in Iran, such as the SEPAS and SIB national health projects. Furthermore, the use of five standard terminology systems covering the anatomical and genetic characteristics of the IRD concepts improves the validity of data recording in our registry and enables more accurate scientific collaboration.
To our knowledge, several global or national patient registries have been established for IRDs throughout the world. [2,14-17] The most prominent is My Retina Tracker™, which was launched by the Foundation Fighting Blindness (FFB) to register patients around the world. [15] Blood samples of the unaffected relatives of patients with IRDs are kept in the biobank of My Retina Tracker, a standard practice that we have also integrated into our IRD registry workflow. In My Retina Tracker, patients can access their data. However, this feature was not included in our registry, in order to avoid sample selection bias. The steering committee of our registry believed that self-registration by patients induces selection bias, since only literate patients, those with access to an internet connection, and those who are aware of their disease condition would have an opportunity to register. With respect to the frequency of IRD subtypes, our findings are compatible with the My Retina Tracker report, in which about half of the registered patients had retinitis pigmentosa. Patients with a diagnosis of age-related macular degeneration were also registered in My Retina Tracker, although this condition is not classified as an IRD.
The Danish Retinitis Pigmentosa Registry is the national registry for IRDs in Denmark. [2] In this registry, patients are diagnosed based on clinical findings from color fundus photography and electrophysiological tests.
Three diagnostic procedures, namely pedigree drawing, electroretinography, and genetic testing, are recommended for a definite diagnosis of IRD, especially in the pediatric population. [1] In our IRD registry, comprehensive clinical examinations and retinal imaging were performed to determine the IRD diagnosis. Genetic testing is a supplementary diagnostic tool that has been applied to some of the registered patients and will be completed in the future. Syndromic retinitis pigmentosa was observed in 28% of the patients registered in the Danish Retinitis Pigmentosa Registry, whereas it was seen in only 8.89% of our registered patients.
Our review of the literature shows that the frequency of IRDs, as well as their inheritance patterns and genetic characteristics, has not been previously reported in the affected Iranian population. This registry would therefore be helpful in identifying eligible patients who may consent to participate in clinical trials. It can also be beneficial in providing gene- and variant-level information for the development of novel gene therapies.
Establishing a genetic diagnosis in patients with IRDs is essential to guide appropriate genetic counseling. Furthermore, recognition of the inheritance pattern is crucial for calculating risk in family planning for individuals with a positive family history. [24] This feature was also considered in the design of our registry.
Our clinical findings show that nearly one-third of our registered patients (30.5%) noticed symptoms such as nyctalopia, hemeralopia, photophobia, color vision deficiency, restricted visual field, and prolonged latency of light and dark adaptation in the first decade of life. The early onset of disease manifestation correlates with severe visual impairment in a large proportion of our patients (30.7%), who were categorized as blind based on the WHO classification. [18] Furthermore, the majority of our registered patients (74.43%) were under 40 years of age and would be expected to benefit from educational tools and occupational opportunities. The low levels of education and the high percentage of unemployment among our patients can be attributed to reduced vision and the resulting limitations in learning and working.
Because of the diversity of Iran's geographical features and the difficulty of referring patients to centers far from their residential areas, we intend to expand the project to the provinces that currently have no registration centers. This will also enhance our registration coverage.
The current study, as a pilot, establishes a database registry for IRDs in Iran, with the following lessons learned. To avoid selection bias, access to data entry was given to only one retina specialist as the focal point in each academic center. To enhance the validity of the data, no information extraction from other EMR sources was permitted. Data linkage among affected individuals belonging to one genetic pedigree was guaranteed by assigning a prefix to the IRD codes as a unique identifier. In addition, the attachment of affected patients to the main proband was anticipated in the software design. We avoided duplication in data entry through automated verification of the patients' national code. Lastly, to anticipate occasional internet connectivity disruption, both offline and online modules were provisioned for data entry, and the software was designed as a mobile application for better accessibility.
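Two of these lessons, duplicate prevention through the national code and pedigree linkage through a shared code prefix, can be illustrated with a short sketch. It is not the registry's DHIS2 implementation. The class name, the code format (`F001-01`), and the sequential member numbering are hypothetical, and the check only confirms that the national code has ten ASCII digits, without any checksum validation.

```python
from typing import Dict


class DuplicateNationalCode(ValueError):
    """Raised when a national code has already been registered."""


class PilotRegistry:
    """Toy in-memory registry; the IRD code format used here is hypothetical."""

    def __init__(self) -> None:
        self._ird_code_by_national_code: Dict[str, str] = {}
        self._members_per_family: Dict[str, int] = {}

    def register(self, national_code: str, family_prefix: str) -> str:
        """Register a patient and return an IRD code sharing the family prefix."""
        code = national_code.strip()
        if len(code) != 10 or not (code.isascii() and code.isdigit()):
            raise ValueError("national code must be exactly 10 digits")
        if code in self._ird_code_by_national_code:
            # Automated duplicate check: the same person cannot be entered twice.
            raise DuplicateNationalCode(code)

        # Members of one pedigree share a prefix, so relatives stay linked.
        member_number = self._members_per_family.get(family_prefix, 0) + 1
        self._members_per_family[family_prefix] = member_number
        ird_code = f"{family_prefix}-{member_number:02d}"
        self._ird_code_by_national_code[code] = ird_code
        return ird_code
```

In this sketch, the first person registered under a prefix would play the role of the proband, and later relatives receive the next member numbers under the same prefix.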
One of the limitations in data collection was the reliance on patients' self-reports for some data elements, such as age at disease onset. Therefore, patients' responses were cross-validated against clinical records to increase data validity. Data collection was incomplete for some data elements; however, the missing rate was generally under 8%, which may be considered acceptable.
In conclusion, our study demonstrates successful web-based software design and data collection as a proof of concept for the first Iranian IRD registry. We are therefore planning to extend this study and the IRD application to other medical centers throughout the country. These data will give researchers rapid access to structured information about the distribution of genetically heterogeneous retinal diseases in Iran, supporting the identification of preventive and therapeutic measures in the near future.