Building a Cancer Biobank in a Low-Resource Setting in Northern Iran: the Golestan Cancer Biobank

Fatemeh Ghasemi-Kebria, PhD candidate1#; Nastaran Jafari-Delouie, MSc1#; Taghi Amiriani, MD1; Alireza Norouzi, MD1; Behnoush Abedi-Ardekani, MD2‡; Dariush Nasrollahzadeh, PhD3,2‡; Mohammad Ashaari, MD4; Sima Besharat, PhD1; Mohammad NaeimiTabiei, MD5; Isen Gharanjic, MSc1; Zahra Babapalangi, MSc1; Hossein Poustchi, PhD3; Shahryar Semnani, MD6,1; Abdolreza Fazel, MD5*; Zisis Kozlakidis, PhD7‡; Elisabete Weiderpass, PhD8‡; Gholamreza Roshandel, PhD1*


Introduction
Despite advances in oncology and cancer genomics research, cancer is the second leading cause of death worldwide. The latest global cancer data from the International Agency for Research on Cancer (IARC) indicate that the cancer burden rose to 19.3 million new cases and 10.0 million cancer deaths in 2020. 1 The growing cancer incidence and mortality, especially in low- and middle-income countries (LMICs), are due to several factors, including population growth and aging as well as changes in the prevalence and distribution of the main risk factors. 1,2 Conducting cancer research to identify risk factors is one of the most important steps in designing effective cancer control programs. Identifying risk factors requires the analysis of large numbers of biological samples in order to differentiate causal biological relations effectively. Harmonized and standardized long-term collections of biological materials and associated data that are representative of populations can help achieve this goal while promoting cancer research. 3,4 In the current era of sophisticated analytical technologies, the quality of collected biological material plays a key role in ensuring experimental reproducibility and, eventually, meaningful interpretation of cancer research. Therefore, the standardized collection of high-quality, research-ready samples and data will help to expand medical research on cancer and will consequently inform cancer control planning. 5 Biobanking is a resource-intensive undertaking, and thus might be overlooked in systems with many competing priorities, especially within LMIC contexts, where it may seem too difficult or even impossible to set up such research infrastructure. 6,7 However, appropriately adapting standard operating procedures and guidelines to these limitations can provide a workable technical solution. 
At the same time, the consideration of the potential benefit(s) of the supported research and the downstream impacts of this research on community health may justify biobanking activities and allow them to be considered as one of the priorities in health-related research. 8 The Golestan province in Northern Iran has been known as a high-risk area for upper gastrointestinal (GI) cancers since the 1970s. 9,10 The Golestan Population-Based Cancer Registry (GPCR), a voting member of the International Association of Cancer Registries (IACR) since 2007, covers the total population of Golestan and has provided high-quality population-based cancer data to researchers as well as to health policy makers at local and national levels since 2004. 11 As reported in 2014, 12 the age-standardized incidence rate (ASR) (per 100 000 person-years) of esophageal cancer in Golestan males (15.81) and females (11.71) was considerably higher than the national rates for males (6.47) and females (5.13). The ASRs of stomach cancer were also higher in the Golestan male (24.79) and female (10.95) populations than those reported for Iranian males (21.24) and females (9.44). 12 In addition, recent reports from the GPCR showed increasing trends in the incidence rates of other cancers such as colorectal, 13 breast 14 and lung cancers. Therefore, because of the high rate of GI cancers and the increasing incidence trend of other cancers in the Golestan province, 15 the promotion of cancer research activities has been taken into account by the medical research communities and policymakers in this area. For this purpose, the Golestan Cancer Biobank (GoCB) was established in 2016 to provide biological samples and associated data collected from cancer patients by applying international standards, and thereby to support high-quality cancer research. 
This article will present the process of designing and creating this biobank infrastructure (within an LMIC context), as well as the initial results of the GoCB, as the first and only cancer-specific biobank in the Golestan province, a high-risk area in northern Iran.

GoCB Protocol
The GoCB was created as a 5-year research project. As the first step, the GoCB protocol and corresponding standard operating procedures (SOPs) were developed in line with internationally accepted standards and protocols. 16,17 They cover the different operational aspects of the GoCB, such as the setting up of the GoCB lab and the GoCB field centers, the types of samples, sample collection and processing methods, sample preservation and the access policy for samples.

GoCB Organization
The GoCB organization chart includes a steering committee, a scientific committee and the GoCB secretariat. The steering committee consists of the principal investigators (PIs) of the GoCB project. The scientific committee consists of clinicians and researchers specializing in various fields related to cancer and biorepositories, including cancer surgery, oncology, pathology, genetics, biochemistry, cellular and molecular biology, and epidemiology. The main activity of the GoCB scientific committee is to supervise all scientific aspects of the GoCB, especially the development and revisions of the main protocol and the specified SOPs. The GoCB secretariat consists of dedicated trained staff, located in the Golestan Research Center of Gastroenterology and Hepatology (GRCGH) affiliated with the Golestan University of Medical Sciences (GOUMS), Gorgan, Iran.

Setting up the GoCB Laboratory
The GoCB lab, at the GRCGH, is located within Sayyad Shirazi hospital, Gorgan city, northern Iran. This is a tertiary public referral hospital for cancer patients from the Golestan province and from cities in neighboring provinces, including Semnan and North Khorasan. The GoCB lab was equipped from the outset with standard laboratory cabinets (re-purposed), centrifuges and a vertical laminar flow hood, as well as equipment to process tissues into formalin-fixed paraffin-embedded (FFPE) tissue blocks. The GoCB lab was also equipped with -80°C freezers for storage of fresh tissue samples as well as blood and urine samples. A GoCB-specific FFPE box and cabinet were made in-house for the storage of FFPE samples (Figure 1).

GoCB Field Centers
The GoCB also receives biological samples from cancer patients through its field centers (sample collection sites), which are mainly located in public and private hospitals. Selected outpatient centers (e.g., centers providing endoscopy, colonoscopy, bronchoscopy and colposcopy services) also collaborate with the GoCB as potential field centers. In each GoCB field center, an experienced healthcare worker (nurse, laboratory assistant, midwife, surgical operation assistant or endoscopy/colonoscopy/bronchoscopy assistant) is selected and trained to serve as GoCB field staff. These GoCB field staff are full-time personnel of their field centers and are compensated for their efforts in proportion to the number of participants they recruit. As part of a continuous training program, annual GoCB meetings are arranged to which all GoCB staff are invited, both to review and openly discuss the protocols and to acknowledge their contribution to the GoCB.

Types of Biological Samples
The main biological samples collected by the GoCB are blood, urine, fresh mucosal biopsies obtained by endoscopy, fresh surgically resected tissue samples and FFPE tissue samples. The GoCB systematically collects tumor/non-tumor pairs from all cancer patients, provided that this does not hamper routine diagnostic sample collection.

Collection of Data
After obtaining informed consent, data collection is performed using two GoCB-specific data collection forms. The GoCB data collection form 1 (general questionnaire) covers patients' demographic data, clinical data and tumor characteristics (site of tumor, morphology, behavior and tumor grade); the latter are collected from the pathology report. The third edition of the International Classification of Diseases for Oncology (ICD-O-3) is used for coding tumor characteristics, including topography, morphology, behavior and grade. 18 The GoCB data collection form 2 (risk factor questionnaire) was developed using the Golestan Cohort Study questionnaire. 19 It covers a broad range of generic risk factors applicable to all cancer types as well as specific ones based on epidemiological studies of common cancer types in our population, including anthropometric measures, level of education, occupation, smoking behavior, alcohol and opium consumption, diet, oral hygiene habits, medical history, family history of cancer, exposure to X-ray radiation, exposure to pesticides, physical activity and data on reproductive behaviors (only for women).

Collection of Biological Samples
The GoCB guidelines for the collection of biological samples include detailed information on each type of biological sample collected, including the temperature, time and place of preservation of the sample in the field center, the conditions for transporting the sample to the GoCB lab, and the names and telephone numbers of contact persons in the GoCB laboratory. Additional information includes ischemia time (for surgically resected samples), date and time of sample shipment to the GoCB and cool box temperature during sample shipment.
Following the patient's informed consent, blood samples (9 mL) are collected in ethylenediaminetetraacetic acid (EDTA) tubes. Blood samples are shipped to the GoCB laboratory within 2 hours after collection using a cool box and ice pack to ensure appropriate temperature conditions during shipment. Mid-stream urine samples are collected in specific sterile containers and transported to the GoCB laboratory in the same package and by the same transport as the blood samples.
Fresh surgical tissue samples (0.5 cm × 1 cm × 1 cm) and fresh endoscopy tissue samples are collected in containers with RNA-later and stored at room temperature until shipment to the GoCB laboratory. Formalin-fixed surgical tissue samples are collected in containers prefilled with 10% buffered formalin and stored at room temperature until shipment to the GoCB laboratory.

Quality of Collected Samples and Data
Upon receiving the biological samples and questionnaires, GoCB secretariat staff check the quality of the data and biological samples to ensure adherence to the GoCB guidelines and protocols. An in-house quality control checklist allows the completeness and accuracy of the data to be checked in a consistent manner. If any deviation is observed (e.g., missing data, errors in filling out the questionnaire, unclear data, an inadequate amount of specimen (volume or size) or errors in labeling), the GoCB secretariat staff provide feedback to the field staff and follow up the issue until it is resolved.
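The checks described above could be sketched in code roughly as follows. This is only an illustrative sketch, not the GoCB checklist itself: the field names (`registration_number`, `label_code`, and so on) and the rule that a label must match the registration number are assumptions made for the example. The 2-hour limit for blood shipment is taken from the collection guidelines.

```python
from datetime import datetime, timedelta

# Hypothetical field names; the actual GoCB checklist items are not listed in the text.
REQUIRED_FIELDS = ("registration_number", "sample_type", "collected_at", "received_at")
MAX_BLOOD_TRANSIT = timedelta(hours=2)  # blood should reach the lab within 2 hours


def check_submission(record: dict) -> list[str]:
    """Return the deviations that should be fed back to the field staff."""
    issues = []
    # Completeness: every required field must be present and non-empty.
    for name in REQUIRED_FIELDS:
        if record.get(name) in (None, ""):
            issues.append(f"missing data: {name}")
    # Timing: only checked for blood, and only when both timestamps exist.
    collected, received = record.get("collected_at"), record.get("received_at")
    if record.get("sample_type") == "blood" and collected and received:
        if received - collected > MAX_BLOOD_TRANSIT:
            issues.append("blood sample received more than 2 hours after collection")
    # Labeling: assumed rule that the tube label carries the registration number.
    if record.get("label_code") != record.get("registration_number"):
        issues.append("labeling error: label does not match registration number")
    return issues
```

An empty list means the submission passes; any other result is the feedback to send to the field center and to follow up until resolved.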

Processing and Preservation of Biological Samples
The GoCB uses high-quality cryotubes (0.5, 1.4 and 3 mL) according to the type of biological sample. The tubes are sterile and free from DNase, RNase, DNA and endotoxins. Each tube has a unique ID in the form of a laser-etched 2-D barcode on its base. A 2-D barcode reader is used to ensure that the correct sample code is entered into the GoCB software.

Processing and Preservation of Blood and Urine Samples
Blood samples are separated into fractions and stored in 1 mL aliquots: whole blood (red cap), plasma (white cap), buffy coat (blue cap) and red blood cells (RBC) (gray cap). 16 Urine samples are also stored in 1 mL aliquots. The sample information is registered in the GoCB software through the sample barcode, together with detailed information on the sample location (freezer, rack, box, place within the box). After registration in the GoCB software, the samples are transferred to and stored in the -80°C freezer.
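As an illustration of the registration step, the following minimal sketch maps each cryotube barcode to one storage position (freezer, rack, box, place within the box). It does not reproduce the Bio-Tracker data model; the class and field names are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    """Storage position of one aliquot, following the four levels named in the text."""
    freezer: str
    rack: str
    box: str
    position: int  # place within the box


class Inventory:
    """Maps each cryotube barcode to exactly one storage position."""

    def __init__(self) -> None:
        self._by_barcode: dict[str, Location] = {}
        self._occupied: set[Location] = set()

    def register(self, barcode: str, location: Location) -> None:
        # Reject duplicates so that one barcode never points to two places
        # and one place never holds two tubes.
        if barcode in self._by_barcode:
            raise ValueError(f"barcode already registered: {barcode}")
        if location in self._occupied:
            raise ValueError(f"position already occupied: {location}")
        self._by_barcode[barcode] = location
        self._occupied.add(location)

    def locate(self, barcode: str) -> Location:
        return self._by_barcode[barcode]
```

Rejecting a second tube for an occupied position is a design choice made for the sketch; it mirrors the purpose of scanning the 2-D barcode, which is to avoid manual entry errors.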

Processing and Preservation of Fresh Tissue Samples
After fresh endoscopy and surgical tissue samples are received in the GoCB laboratory, the tissue containers with RNA-later are placed overnight in a refrigerator (4°C). On the next day, the tissue samples are removed from the container and the RNA-later and transferred into a 0.5 mL cryotube (endoscopy sample) or a 3 mL cryotube (surgically resected sample) with external thread screw caps. After the sample information is registered in the GoCB software, the samples are transferred to and stored in the -80°C freezer.

Processing and Preservation of FFPE Samples
After formalin-fixed tissue samples are received, the tissue specimens are processed and embedded in paraffin in accordance with routine histological techniques for the preparation of FFPE blocks. 20 The FFPE samples are placed in the GoCB-specific FFPE box and stored in our specific FFPE cabinet at room temperature.

GoCB Software and Dataset
In-house software was developed for the management of patient data and biological sample information (Figure 2). The GoCB software consists of two main interfaces: the sample registration interface and the patient registration interface. The two interfaces are linked using a unique GoCB registration number.
The sample registration interface is a laboratory information management system, called Bio-Tracker, version 3.3.0, locally developed by "Batab Tajhiz Parsin" Company, Iran. It contains detailed information on the biological samples.
The patient registration interface holds detailed information on the patients, including demographic data, clinical data and risk factor data, as collected by the standardized questionnaires. The patient registration interface of the GoCB software was developed using CanReg5, an open-source tool developed by the IARC to input, store, check and analyze cancer registry data. 21 The CanReg5 tables, fields and variables were modified and aligned with the GoCB questionnaires to create a GoCB-specific dataset in CanReg5.
Importantly, using the CanReg5 facilities (providing unique codes for patient, tumor and source) and considering patients' national identification numbers, all biological samples of a patient collected from different sources (field centers) or at different times and even from different tumor sites (in patients with multiple primary tumors) are linked to the patient's file using a unique GoCB registration number.
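The linkage idea can be illustrated with a short sketch in which every sample is filed under one registration number per patient, regardless of field center, collection date or tumor site. This is not how CanReg5 or the GoCB software implements it; the `GOCB-000001` number format, the class name and the tuple layout are assumptions made for the example.

```python
import itertools


class RegistrationIndex:
    """Assigns one registration number per patient and groups samples under it."""

    def __init__(self) -> None:
        self._counter = itertools.count(1)
        self._number_by_national_id: dict[str, str] = {}
        self._samples: dict[str, list[tuple[str, str, str]]] = {}

    def registration_number(self, national_id: str) -> str:
        # The national ID is used only to recognise a returning patient;
        # downstream records carry the registration number instead.
        if national_id not in self._number_by_national_id:
            number = f"GOCB-{next(self._counter):06d}"  # hypothetical format
            self._number_by_national_id[national_id] = number
        return self._number_by_national_id[national_id]

    def add_sample(self, national_id: str, barcode: str,
                   field_center: str, tumor_site: str) -> str:
        number = self.registration_number(national_id)
        self._samples.setdefault(number, []).append((barcode, field_center, tumor_site))
        return number

    def samples_for(self, number: str) -> list[tuple[str, str, str]]:
        return list(self._samples.get(number, []))
```

With this structure, a patient sampled at two field centers, or with two primary tumors, still resolves to a single file.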

Linkage between the GoCB Dataset and the Golestan Cancer Registry Dataset
The GPCR is a high-quality cancer registry that covers the entire Golestan population and has collected population-based cancer data since 2004. 11,15 The GoCB dataset is linked annually with the GPCR dataset to add further patient data (e.g., survival data) to the GoCB dataset.
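A minimal sketch of such a linkage, assuming both datasets share a patient key, is given below. The key name `national_id` and the follow-up field names are assumptions; the text does not specify which variables are used for matching or exactly which fields are imported from the GPCR.

```python
def link_with_registry(biobank_rows, registry_rows, key="national_id"):
    """Left-join biobank rows with registry follow-up fields on a shared key.

    Every biobank row is kept; rows without a registry match get None
    for the follow-up fields.
    """
    registry_by_key = {row[key]: row for row in registry_rows}
    linked = []
    for row in biobank_rows:
        match = registry_by_key.get(row[key], {})
        linked.append({
            **row,
            # Hypothetical follow-up fields taken from the registry record.
            "vital_status": match.get("vital_status"),
            "date_of_last_contact": match.get("date_of_last_contact"),
        })
    return linked
```

A left join is used so that biobank participants who are not yet in the registry extract are retained rather than dropped.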

Use of the GoCB Sample and Data
The GoCB samples and data may be provided to researchers on the basis of a research proposal. Proposals are submitted by email to the GoCB secretariat (gocb@goums.ac.ir). Only pseudo-anonymized samples and data are made available, after approval of the proposal by the GoCB scientific committee. The requested samples and data are provided to the PI of the research project upon signing a material transfer agreement. When delivering selected samples to research PIs, the GoCB secretariat may also provide the list of physicians involved in the collection of those samples (considering the number of samples), while both physicians and research PIs remain blinded to the personal information of the research participants. This helps the PIs to contact the physicians if they need further information or help regarding the clinical aspects of the disease and specimens, or regarding the interpretation and clinical application of the research findings. In addition, the physicians may receive (via the GoCB) feedback from the research team regarding the samples and research findings, which may also give them a chance to be actively involved and to collaborate in research projects.

Ethical Considerations
Participants must complete an informed consent form before sample and data collection starts. All information about the project, including its benefits and risks as well as the intended uses of the samples, especially their possible use in international research projects, is explained to the patient. All patients are informed of their right to refuse or stop their participation at any stage of the project. The GoCB secretariat and field staff as well as clinicians are trained to ensure that participation in the GoCB does not interfere with appropriate patient diagnosis or care. The GoCB uses anonymized codes to protect individuals' privacy. The GoCB secretariat and staff have limited access to the dataset according to their role, and only a representative of the GoCB steering committee (GR) is authorized to have full access to the GoCB dataset. Neither physicians nor patients are reimbursed for participation in the GoCB.

Results
The GoCB started collecting data and biological samples in December 2016. Between December 2016 and November 2020, a total of 1256 cancer patients were invited to participate in the project. Thirty-nine of these patients (3.1%) did not consent to sample collection and were therefore excluded. In total, 1217 cancer patients participated in the GoCB, including 484 (39.8%) males with a mean (SD) age of 62.5 (13.3) years and 733 (60.2%) females with a mean (SD) age of 53.2 (14.4) years. Of the total GoCB participants, 715 (58.8%) were residents of urban areas and 502 (41.2%) were from rural settings. Table 1 shows the number and proportions of the GoCB samples by tumor site. The majority of the GoCB participants (n = 942, 77%) were those with gastrointestinal (esophagus, stomach, colorectal) and breast cancers. The GoCB participants were recruited from 18 different field centers through the contribution of 29 field staff and 23 physicians of 5 different specialties (Table 2). Complete data sets were successfully collected from 793 (65.2%) of the total 1217 GoCB participants.
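The proportions reported above can be recomputed from the counts given in the text. The short sketch below is only an arithmetic check; the helper name `percent` is introduced for the example.

```python
def percent(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


invited, declined = 1256, 39
participants = invited - declined  # 1217

print(percent(declined, invited))   # 3.1  (did not consent)
print(percent(484, participants))   # 39.8 (males)
print(percent(733, participants))   # 60.2 (females)
print(percent(715, participants))   # 58.8 (urban residents)
print(percent(502, participants))   # 41.2 (rural residents)
print(percent(942, participants))   # 77.4 (GI and breast cancers)
print(percent(793, participants))   # 65.2 (complete data sets)
```

The non-consent proportion uses the 1256 invited patients as its denominator, whereas all other proportions use the 1217 participants.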
Overall, 3563 samples were collected from the study participants between December 2016 and November 2020, as shown in Table 3 by sample type.
By November 2020, a total of 730 GoCB samples had been used in 7 national and international research projects, for example the Mutographs project funded by the Cancer Research UK (CRUK) Grand Challenge program. 22 According to its preliminary findings, two GoCB tissue samples went through the standard pipeline for whole genome sequencing of tumors. The quality control steps for data and sample quality indicated that the percentage of tumor cells, estimated by allele-specific copy number analysis, was 48% and 60%, which is acceptable for downstream analysis (the average tumor purity in this multi-center study is around 42%). The findings also indicated that more than 97% of reads were mappable to the human genome, with no contamination reported.

Discussion
We presented detailed information on the design and operation of the GoCB, a cancer biobank in a resource-restricted setting in northern Iran. As in other LMICs, the limitations and challenges were addressed by identifying the best strategies available, based on specific conditions and available resources. During the design phase, GoCB protocols were developed through close collaboration with national and international institutions and experts and the use of available, validated international standard protocols and guidelines. 16,17 The GoCB was accepted as a member of the Biobank and Cohort Building Network (BCNet) of the IARC in August 2019, 23 and of the International Society for Biological and Environmental Repositories (ISBER) in January 2020.
The GoCB secretariat and laboratory were physically located within a public hospital as a cost mitigation strategy. By using the facilities available in the hospital (building, power supply, air circulation, sanitation, etc.), the GoCB could substantially reduce its set-up and operational costs. In addition, using equipment made in-house (e.g., the GoCB-specific FFPE box and cabinet) and developing in-house GoCB software, especially using the open-source CanReg5 tool, played a critical role in preserving the limited resources available in our setting.
According to the GoCB plan for the use of samples and data, the names of the physicians involved in sample collection are provided to the research team(s). This can establish a link between the physician and the research team, resulting in possible closer collaboration between all parties involved. Such collaborations could help research teams to improve the quality of their research and, on the other hand, could motivate the physicians to become even more actively involved in research projects through the GoCB. In order to maximize the use of funds in each GoCB field center, the center personnel operating as GoCB field staff collaborate with the GoCB on a part-time basis, while maintaining their main jobs in the field center.
In order to maintain stakeholder engagement over the longer term, all GoCB collaborators, including physicians and field staff, were given official certificates according to their contributions during the different phases of the GoCB project. As part of its quality maintenance activities, the GoCB also provided continuous professional training and education programs, as well as annual appreciation ceremonies for the GoCB collaborators. In addition, providing regular feedback on the quality of samples and data to the GoCB field centers was another strategy for maintaining strong communication channels.
GI and breast cancers were the most common malignancies among patients recruited in the GoCB during its first four years. This was not surprising, because they are the most common cancers in the Golestan population. 13-15,24,25 According to the latest GPCR report (unpublished data), cancers of the breast (female), stomach, colorectum, lung and esophagus were the most common malignancies in the Golestan population in 2017 (Table S1, Supplementary file 1).
Access to high-quality population-based cancer registry data is a major strength of the GoCB. By linking the GoCB dataset with the GPCR dataset, we can add complementary data (e.g., vital status and survival data) to the GoCB dataset, which improves the dataset and, consequently, the quality of research on GoCB samples. Collaboration of the GoCB with international research projects and networks 22,26 is another strength. These collaborations may give access to shared resources (e.g., expertise, protocols) and lead to further collaborative work, which will ultimately improve the quality of the GoCB project and promote it. As the GoCB was designed and funded as a research project, the lack of continuous financial support in our setting is a major limitation, and its maintenance may be affected once the project ends. Therefore, although research grants were an available and accessible resource for developing the GoCB, stable and continuous support is necessary for its successful maintenance. This point should be taken into account in all low-resource settings, including LMICs.
In conclusion, we adopted specific strategies to ensure the development and maintenance of the GoCB as a cancer-specific biological repository, given the limited resources in our high-risk population. Some of the most important strategies were using research grants as an available financial resource, using existing infrastructure in the public sector, re-purposing equipment, developing specific in-house equipment and software, making specific plans to promote clinicians' contribution to the collection of biological samples, assigning selected and trained personnel of healthcare centers as part-time GoCB field staff, and providing justified incentives for GoCB collaborators. These strategies may be applicable in other similar low-resource settings in Iran as well as in other LMICs. Therefore, the GoCB may be considered a model for the design and implementation of cancer biobanks in LMICs. As a research infrastructure, it may play an important role in the development of basic cancer research, especially in the field of cancer biology, in our high-risk area.