Research Article
A simple sequencing protocol for genotyping the HLA-C locus by the
Sanger method
Tran Thu Ha Pham a, Duc Minh Tran a, Tiep Khac Nguyen a, Thanh Huong Phung a,*
a Faculty of Biotechnology, Hanoi University of Pharmacy, 13-15 Le Thanh Tong, Hoan Kiem, Hanoi, Vietnam
Journal of Pharmaceutical Research and Drug Information, 2024, 16: 22-29
ARTICLE INFO
Article history
Received 02 Feb 2024
Revised 23 April 2024
Accepted 24 April 2024
Keywords
HLA-C genotyping
Sanger
Sequencing
ABSTRACT
The HLA-C gene, which belongs to the HLA superfamily (Human Leukocyte
Antigen), codes for the Major Histocompatibility Complex (MHC), which
plays crucial roles in the human immune system. This study aimed to develop a
simple sequencing protocol by the Sanger method using fewer primers and
reactions for genotyping the HLA-C gene than published protocols. The simple
protocol with three primers includes one PCR reaction and two sequencing
reactions. The primer set comprising SEQ ID1 and SEQ ID2 was used for the
PCR reaction to specifically amplify the exon 2 – exon 3 region of the HLA-C
locus, which contains the typical SNPs of each HLA-C allele. The PCR product
was purified and used as a template for the sequencing reactions. Two forward
primers, SEQ ID1 and SEQ ID3, were used for sequencing: the SEQ
ID1 forward primer is located in the intron 1 region and the SEQ ID3 forward
primer is located in the intron 2 region of the HLA-C gene. Testing the simple
sequencing protocol on four samples of known HLA-C genotypes showed 100%
accurate results. The established Sanger sequencing protocol is simple to
implement, and reduces cost and time. Thus, this protocol can be used for
HLA-C sequencing for pharmacogenetic studies and applications.
*Corresponding author: Thanh Huong Phung, email: huongpt@hup.edu.vn
https://doi.org/10.59882/1859-364X/165
Journal homepage: jprdi.vn/JP
Journal of Pharmaceutical Research and Drug Information
An official journal of Hanoi University of Pharmacy
1. Introduction
HLA (Human Leukocyte Antigen)
is a superfamily of genes located on the short
arm of chromosome 6 (6p21.1–6p21.3),
encoding the major histocompatibility
complex (MHC), which plays a crucial role
in the function of the immune system [1]. The
HLA superfamily is divided into three classes,
each of which is further subdivided into numerous
types with highly complex polymorphism. Of
these, HLA class I, comprising three main
types - HLA-A, HLA-B, and HLA-C, is most
closely associated with the risk of adverse
drug reactions (ADRs) [1, 2]. In addition to a
wide range of well-known ADR-associated
variant alleles of HLA-A and HLA-B, an
increasing number of variant alleles of HLA-C
have also been identified [3, 4].
A study by Kang et al. on Korean patients,
including 25 cases of allopurinol-induced
severe cutaneous adverse reactions (SCARs)
and 57 well-tolerated allopurinol users,
identified a significant association between the
HLA-C*03:02 allele and the risk of
allopurinol-induced SCARs
(OR=82.1; 95% CI: 15.8–426.5;
p=9.3×10⁻¹¹) [5]. In another study by Pham et
al. on 100 allopurinol-induced SCARs patients
and 183 well-tolerated patients, in addition to
HLA-B*58:01, the HLA-C*03:02 allele was
independently associated with the risk of
allopurinol-induced SCARs (OR=79.91;
95%CI: 2.91–2192.58; p=0.012) [6]. Another
HLA-C allele, HLA-C*04:01, was associated
with the risk of Stevens-Johnson syndrome
(SJS) in Malawian HIV patients using
nevirapine (OR = 17.52; 95% CI: 3.31–92.80)
[7]. A subsequent study in 2017 demonstrated
that this allele was linked to the risk of
nevirapine-induced skin hypersensitivity
reactions in various ethnic populations, with an
overall OR of 3.06, p=0.0001. Specifically, the
risk was highest in Asians (OR=5.49,
p=0.0001), followed by Africans (OR=3.84,
p=0.04) and Caucasians (OR=2.08, p=0.02) [8].
Furthermore, the study identified another HLA-
C allele, HLA-C*05:01, which was associated
with the risk of nevirapine-induced skin
hypersensitivity in Caucasians, with an OR of
2.84, p=0.002 [8]. Chonlaphat S. et al. found
that the HLA-C*08:01 allele was associated
with the risk of SJS/TEN in Thai HIV patients
using co-trimoxazole, with an OR of 10.74
(95% CI: 2.18-52.90, p=0.0035), and was
related to all forms of SCARs with an OR of
8.46 (95% CI: 1.96-36.47, p=0.0042) [9].
Additionally, another polymorphic HLA-C
allele was found to be associated with
the risk of liver injury following treatment
with infliximab. The study conducted on
European Caucasian patients in 2020 reported
the HLA-C*12:03 allele to be significantly
associated with the aforementioned risk
(OR=6.1; 95% CI: 0.9-47.4, p=0.032) [10].
Polymorphic HLA-C alleles have been
observed to be associated with adverse
reactions to various drugs. Therefore, HLA-C
genotyping can aid physicians in
selecting appropriate treatment regimens for
individuals, with the goal of achieving
optimal therapeutic effectiveness and
minimizing adverse drug reactions. Due to
the highly complex polymorphism of the
HLA-C gene, which comprises 7872
documented alleles [11], the gold standard
for determining HLA-C genotypes remains
the sequencing method. While next-generation
sequencing (NGS) techniques
are suitable for large-scale studies, they
involve significant equipment investment,
complex protocols, and the need for highly
skilled personnel. In contrast, publications
on sequencing HLA-C genes using the
Sanger method, though simpler, still require
multiple primers with bidirectional
sequencing reactions [12, 13]. The published
Sanger sequencing protocol for HLA-C with
the fewest primers to date,
by Peterson et al. [13], still
requires 4 primers corresponding to 4
sequencing reactions.
Therefore, our study aimed to establish a
simplified sequencing protocol for the HLA-C
gene using only 2 primers for the sequencing
reactions, thereby halving the number of
sequencing reactions and reducing both the
time and the cost of
implementation.
Tran Thu Ha Pham et al. J. Pharm. Res-DI. 2024, 16: 22-29
2. Materials and Methods
2.1. Samples
Four DNA samples previously genotyped
for HLA-C using the Sanger sequencing
method as described by Peterson et al. [13]
were utilized: C*01:02/C*15:05 (sample 1),
C*03:02/C*03:02 (sample 2), C*03:02/C*03:04
(sample 3), and C*01:02/C*04:03 (sample 4).
These DNA samples were extracted from
whole blood samples collected in the
research project funded by the Ministry of
Health, under decision No. 4694/QD-BYT,
provided by the Faculty of Biotechnology,
Hanoi University of Pharmacy. The study
received approval from the Ethics Committee
of the National Institute of Hygiene and
Epidemiology (Approval No. IRB-VN01057-
6/2018) [14].
Total DNA was extracted from whole
blood using the E.Z.N.A.® Tissue DNA Kit
(Omega Bio-tek, USA) following the
manufacturer's protocol. DNA quantity and
purity were assessed using the Nanodrop
2000 spectrophotometer (Thermo Fisher,
USA). DNA samples with concentrations
ranging from 35 to 250 ng/µL and A260/280
ratios between 1.65 and 1.95 were used for
subsequent experiments.
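The acceptance criteria above can be expressed as a small filter. The following is a minimal sketch only: the thresholds are those quoted in the text, while the `DnaReading` type, the sample labels, and the readings are hypothetical values made up for illustration and are not data from this study.

```python
from dataclasses import dataclass

# Acceptance ranges quoted in Section 2.1.
CONC_RANGE_NG_PER_UL = (35.0, 250.0)
A260_280_RANGE = (1.65, 1.95)


@dataclass
class DnaReading:
    """One spectrophotometer reading (hypothetical container for illustration)."""
    sample_id: str
    conc_ng_per_ul: float
    a260_280: float


def passes_qc(reading: DnaReading) -> bool:
    """Return True if both concentration and purity fall inside the stated ranges."""
    lo_c, hi_c = CONC_RANGE_NG_PER_UL
    lo_r, hi_r = A260_280_RANGE
    return (lo_c <= reading.conc_ng_per_ul <= hi_c
            and lo_r <= reading.a260_280 <= hi_r)


# Made-up readings, NOT measurements from the study.
readings = [
    DnaReading("demo-1", 120.0, 1.82),  # inside both ranges
    DnaReading("demo-2", 28.0, 1.80),   # concentration too low
    DnaReading("demo-3", 90.0, 2.05),   # A260/280 too high
]
accepted = [r.sample_id for r in readings if passes_qc(r)]
print(accepted)
```

Treating both bounds as inclusive is an assumption; the text does not state how boundary values were handled.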
2.2. PCR amplification and sequencing
primers
The protocol employs a total of 3 primers.
Two primers, comprising a forward primer and
a reverse primer, were utilized for the PCR
reaction to amplify the specific exon 2-3
region, where the characteristic SNPs
of each HLA-C allele are located. The forward primer had
its binding site on the template DNA at intron
1, while the reverse primer had its binding site
on the template DNA at intron 3.
Notably, the forward primer from the
primer set used for amplifying the exon 2-3
region is also employed in a sequencing
reaction. The binding site of the forward
primer utilized in the second sequencing
reaction is located on the DNA template at
intron 2.
2.3. PCR reactions
The total reaction volume for PCR
amplification of the exon 2 - exon 3 region
was 30 µL, comprising 30 ng of DNA, 9.0 µL
nuclease-free water, 15 µL GoTaq Green
Master Mix 2x (Promega, USA), and 0.5 pM each
primer (primers synthesized by Integrated
DNA Technologies, USA). The PCR Thermal
Cycler 2720 (Applied Biosystems, USA) was
used with the following thermal cycling
conditions: an initial cycle at 95 °C for 3 min,
followed by 35 cycles of 95 °C for 30 s, 65 °C
for 30 s, 72 °C for 30 s, and a final extension at
72 °C for 7 min. Five µL of the PCR product
was electrophoresed on a 1.0% agarose gel to
verify the product size.
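The reaction set-up and cycling programme described above involve some simple arithmetic, sketched below under stated assumptions. The volumes, DNA mass, and step durations are taken from the text; the function name and the example stock concentration are hypothetical, and the programme time ignores ramping between temperatures.

```python
# Values quoted in Section 2.3.
TOTAL_UL = 30.0
MASTER_MIX_UL = 15.0
WATER_UL = 9.0
DNA_NG = 30.0


def template_volume_ul(stock_ng_per_ul: float, dna_ng: float = DNA_NG) -> float:
    """Volume of DNA stock needed to deliver dna_ng of template."""
    if stock_ng_per_ul <= 0:
        raise ValueError("stock concentration must be positive")
    return dna_ng / stock_ng_per_ul


# Volume left for template DNA plus the two primers.
remaining_ul = TOTAL_UL - MASTER_MIX_UL - WATER_UL

# Cycling programme: 95 C 3 min; 35 x (95 C 30 s, 65 C 30 s, 72 C 30 s); 72 C 7 min.
INITIAL_S = 3 * 60
CYCLES = 35
STEP_SECONDS = (30, 30, 30)
FINAL_S = 7 * 60
hold_seconds = INITIAL_S + CYCLES * sum(STEP_SECONDS) + FINAL_S

print(f"volume left for DNA and primers: {remaining_ul} uL")
print(f"template from a 60 ng/uL stock (hypothetical): {template_volume_ul(60.0)} uL")
print(f"programmed hold time: {hold_seconds / 60} min (ramping not included)")
```

With these figures, 6 µL remain for template and primers, and the programmed hold time is 62.5 min before instrument ramping is added.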
The remaining PCR product was purified
using the Wizard® SV Gel and PCR Clean-
Up System (Promega, USA) for two
sequencing reactions with two forward
primers.
2.4. Sequencing and HLA-C Typing
Sequencing was performed using the
BigDye™ Terminator v3.1 Cycle Sequencing
Kit (ThermoFisher Scientific, USA) and the
ABI 3500 Genetic Analyzer (Applied
Biosystems, USA).
2.5. Statistical analysis
Sequencing data were processed using
BioEdit 7.0.5.3 software to compare the
sequencing results with the reference
sequences of HLA-C alleles obtained from
the IMGT/HLA Database [11].
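To illustrate the idea behind comparing a diploid Sanger read with reference alleles, the sketch below encodes heterozygous positions with IUPAC ambiguity codes and searches for allele pairs whose combined sequence matches the observed read. This is not the BioEdit workflow used in the study. The function names are hypothetical, and the six-base "alleles" are toy sequences invented for the example, not HLA-C sequences.

```python
from itertools import combinations_with_replacement

# IUPAC codes for single bases and two-base mixtures.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
}


def diploid_consensus(seq_a: str, seq_b: str) -> str:
    """Merge two aligned allele sequences into one IUPAC-coded sequence."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return "".join(IUPAC[frozenset((a, b))] for a, b in zip(seq_a, seq_b))


def matching_genotypes(observed: str, references: dict) -> list:
    """Return every allele pair whose merged sequence equals the observed read."""
    names = sorted(references)
    return [
        (n1, n2)
        for n1, n2 in combinations_with_replacement(names, 2)
        if diploid_consensus(references[n1], references[n2]) == observed
    ]


# Toy reference set (invented sequences, for illustration only).
toy_refs = {
    "allele_1": "ACGTAC",
    "allele_2": "ACGTGC",
    "allele_3": "ATGTAC",
}
observed = "ACGTRC"  # R = A/G double peak at position 5
result = matching_genotypes(observed, toy_refs)
print(result)
```

In this toy case only the allele_1/allele_2 pair reproduces the double peak. A real comparison would additionally need alignment to the reference and handling of low-quality base calls, which the sketch omits.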
3. Results and Discussion
In our previous study to detect carriers of
the HLA-C*03:02 allele and determine their
zygosities [14], the primer set including the
forward primer SEQ ID1 and the reverse
primer SEQ ID2 was employed in the PCR
reaction for the specific amplification of the
exon 2 - exon 3 region. In the current study,
the thermal cycling conditions for this primer set were
optimized to generate a sufficient
amount of PCR product, which served as a
template for the subsequent sequencing
reactions. The forward primer SEQ ID1 was
modified from the CPCRF forward primer
used by Peterson et al. [13], and the reverse
primer SEQ ID2 was de-novo designed
(Table 1). The forward primer SEQ ID1
(Figure 1A) contains a mismatch nucleotide
at the penultimate position of the 3' terminus
(replacing G with T), and the reverse primer
SEQ ID2 has nucleotides at the 3' end
(nucleotides 1007-1010) that are complementary
to the HLA-C template DNA but not
to the templates of other HLA class I genes,
ensuring specificity for HLA-C (Figure 1A).
The primer set (SEQ ID1 and SEQ ID2)
enhances specificity for the exon 2 - exon 3
region of the HLA-C locus compared to the
previously published primer set (CPCRF and
CPCRR) by Peterson et al. (Figure 1A).
In addition, this modification reduces the melting
temperature of primers and shortens the PCR
product length for sequencing, while still
effectively covering the entire target exon 2 -
exon 3 region (Table 1). Furthermore, in the
PCR step targeting the exon 2 - exon 3 region,
the simple protocol utilized only 30 ng of
template DNA, a notable reduction compared
to the 50 - 200 ng required by previous
studies [12, 13], thereby conserving input
DNA samples.
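The relationship between CPCRF and the modified SEQ ID1 primer can be checked directly from the sequences listed in Table 1. The sketch below right-aligns the two primers at their 3' ends and reports where they differ. The helper names are hypothetical, and the GC percentage is a simple base count, not the melting-temperature method used for Table 1 (which the text does not specify).

```python
# Primer sequences as listed in Table 1 (5'-3').
CPCRF = "AGCGAGGTGCCCGCCCGGCGA"
SEQ_ID1 = "GCGAGGTGCCCGCCCGGCTA"


def three_prime_differences(original: str, modified: str) -> list:
    """Compare two primers aligned at their 3' ends.

    Returns (position_from_3_prime, original_base, modified_base) tuples,
    where position 1 is the 3'-terminal base. Bases beyond the shorter
    primer's 5' end are not compared.
    """
    diffs = []
    pairs = zip(reversed(original), reversed(modified))
    for offset, (o, m) in enumerate(pairs, start=1):
        if o != m:
            diffs.append((offset, o, m))
    return diffs


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a sequence."""
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


diffs = three_prime_differences(CPCRF, SEQ_ID1)
print(diffs)                          # positions counted from the 3' end
print(len(CPCRF) - len(SEQ_ID1))      # bases trimmed from the 5' end
print(gc_percent(SEQ_ID1))
```

The comparison shows a single G-to-T substitution at the penultimate 3' position, consistent with the description above, plus one base removed from the 5' end of CPCRF.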
The polymorphic region of exon 2 - exon
3 in the HLA gene, particularly in HLA-C,
concentrates most of the characteristic SNPs
for each HLA allele. The sequence of this
polymorphic region determines the amino
acid sequence of the expressed HLA protein.
Previous studies on the Sanger sequencing
protocol for exon 2 - exon 3 of the HLA-C gene
typically suggested using 4 - 6 primers
corresponding to 4 - 6 bidirectional sequencing
reactions for both DNA strands [12, 13].
Meanwhile, the simplified protocol developed here
used only two primers for sequencing (the
forward primers SEQ ID1 and SEQ ID3),
halving the number of sequencing reactions
compared to the protocol with the fewest
sequencing primers, published by Peterson et al. [13] (Table 1).

Table 1. Sequences of primer sets used by Peterson and those used in this study

| Source | No. | Primer | Sequence (5'-3') | Amplicon (bp) | Location | Tm (°C) |
|---|---|---|---|---|---|---|
| Peterson | 1 | CPCRF | AGCGAGGTGCCCGCCCGGCGA | 946 | Intron 1 | 76 |
| Peterson | 2 | CPCRR | CGTGGGAGGCCATCCCGGGAGAT | 946 | Intron 3 | 78 |
| Peterson | 3 | CSEQ5F | GGGGACGGGGCTGAC | | Intron 2 | 54 |
| Peterson | 4 | CSEQ3R | GCCGTCCGTGGGGGATG | | Intron 2 | 60 |
| This study | 1 | SEQ ID1 | GCGAGGTGCCCGCCCGGCTA | 912 | Intron 1 | 72 |
| This study | 2 | SEQ ID2 | GAGATGGGGAAGGCTCCCCACT | 912 | Intron 3 | 72 |
| This study | 3 | SEQ ID3 | CCGGAGAGAGCCCCAGT | | Intron 2 | |

Note: Different nucleotides between CPCRF primer and modified SEQ ID1 primer are highlighted in gray
Previously published sequencing protocols
had to use multiple primers to sequence the
exon 2 - exon 3 region bidirectionally because the
primer binding sites were too close to the target
region. In theory, automated Sanger
sequencing systems can sequence DNA
fragments of < 1000 bp, but in practice
stable signals are usually obtained only in the range
of ~ 400-600 bp [15]. Additionally, sequencing
primers need to be positioned at least 50 - 60
bp away from the target DNA to avoid
interfering signals [16]. In Peterson's protocol [13],
however, the
forward primer CSEQ5F binds only 20
nucleotides from the start of exon 3
(Figure 1C), while the
reverse primer CPCRR binds to the DNA
template approximately 39 nucleotides away
from the end of exon 3 (Figure
1A, 1C). Another protocol, by Delfilo, faces
similar limitations, with binding sites 20-40
nucleotides away from the target exon region
[12]. This proximity leads to noisy sequencing
signals at the beginning and end of the exon,
necessitating bidirectional sequencing to
overcome this issue. This increases the number
of primers, reactions, and complexity in
handling sequencing results. Therefore,
designing primers with suitable binding sites
on the template DNA is crucial to ensuring
stable sequencing signals at both the beginning
and end of the target exons.
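The spacing rule discussed above can be written as a simple check. In this minimal sketch, the 50 bp threshold is the lower bound of the 50 - 60 bp guideline quoted in the text, and the two distances are those given for Peterson's protocol; the function name and dictionary labels are hypothetical.

```python
# Lower bound of the 50-60 bp guideline quoted in the text [16].
MIN_OFFSET_BP = 50


def needs_opposite_strand_read(distance_to_target_bp: int,
                               min_offset_bp: int = MIN_OFFSET_BP) -> bool:
    """True if a primer sits too close to the target for a clean read of its edge."""
    return distance_to_target_bp < min_offset_bp


# Distances stated in the text for Peterson's protocol [13].
distances_bp = {
    "CSEQ5F to start of exon 3": 20,
    "CPCRR to end of exon 3": 39,
}
flags = {name: needs_opposite_strand_read(d) for name, d in distances_bp.items()}
for name, flagged in flags.items():
    print(f"{name}: too close = {flagged}")
```

Both primers fall below the threshold, which is the situation the text describes as requiring bidirectional sequencing.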
The simplified protocol developed here used the
forward primer SEQ ID1 for one sequencing
reaction. SEQ ID1 binds
to the template DNA at nucleotides 116-135
(Figure 1A). The sequence length from the