Title:
Using plasma proteomic pattern for diagnosis, classification, prediction
of response to therapy and clinical behavior, stratification of therapy,
and monitoring disease in hematologic malignancies
United States Patent: 8,097,468
Issued: January 17, 2012
Inventors: Albitar; Maher (Coto
de Caza, CA), Estey; Elihu H. (Houston, TX), Kantarjian; Hagop M.
(Bellaire, TX), Giles; Francis J. (Bellaire, TX), Keating; Michael J.
(Houston, TX)
Assignee: Board of Regents,
The University of Texas System (Austin, TX)
Appl. No.: 12/582,998
Filed: October 21, 2009
Abstract
The present invention demonstrates that
the diagnosis and prediction of clinical behavior in patients with
hematologic malignancies, such as leukemia, can be accomplished by
analysis of proteins present in a plasma sample. Thus, in particular
embodiments the present invention uses plasma to create a diagnostic or
prognostic protein profile of a hematologic malignancy comprising
collecting plasma samples from a population of patients with hematologic
malignancies; generating protein spectra from the plasma samples with or
without fractionation; comparing the protein spectra with clinical data;
and identifying protein markers in the plasma samples that correlate with
the clinical data. Protein markers identified by this approach can then be
used to create a protein profile that can be used to diagnose the
hematologic malignancy or determine the prognosis of the hematologic
malignancy. Potentially these specific proteins can be identified and
targeted in the therapy of these malignancies.
Description of the Invention
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates generally to the field of proteomics. More
particularly, it concerns the use of proteomics for diagnosis and the
prognosis of hematologic malignancies. Also, the invention relates to
predicting the response to therapy and stratifying patients for therapy.
2. Description of Related Art
Hematologic malignancies are cancers of the blood and bone marrow,
including leukemia and lymphoma. Leukemia is a malignant neoplasm
characterized by abnormal proliferation of leukocytes and is one of the
four major types of cancer. Leukemia is diagnosed in about 29,000 adults
and 2,000 children each year in the United States. Leukemias are
classified according to the type of leukocyte most prominently involved.
Acute leukemias are predominantly undifferentiated cell populations and
chronic leukemias have more mature cell forms.
The acute leukemias are divided into lymphoblastic (ALL) and non-lymphoblastic
(ANLL) types and may be further subdivided by morphologic and cytochemical
appearance according to the French-American-British classification or
according to their type and degree of differentiation. Specific B- and
T-cell, as well as myeloid cell surface markers/antigens are used in the
classification too. ALL is predominantly a childhood disease while ANLL,
also known as acute myeloid leukemia (AML), is a more common acute
leukemia among adults.
Chronic leukemias are divided into lymphocytic (CLL) and myeloid (CML)
types. CLL is characterized by the increased number of mature lymphocytes
in blood, bone marrow, and lymphoid organs. Most CLL patients have clonal
expansion of lymphocytes with B cell characteristics. CLL is a disease of
older persons. In CML, the granulocytic cells predominate at all stages of
differentiation in blood and bone marrow, but may also affect liver,
spleen, and other organs.
Among patients with leukemia there can be a highly variable clinical
course as reflected by varying survival times and resistance to therapy.
Reliable individual prognostic tools are limited at present. Advances in
proteomic technologies may provide new diagnostic and prognostic
indicators for hematologic malignancies such as leukemia.
The term "proteome" refers to all the proteins expressed by a genome, and
thus proteomics involves the identification of proteins in the body and
the determination of their role in physiological and pathophysiological
functions. The approximately 30,000 genes defined by the Human Genome Project
translate into 300,000 to 1 million proteins when alternate splicing and
post-translational modifications are considered. While a genome remains
unchanged to a large extent, the proteins in any particular cell change
dramatically as genes are turned on and off in response to their
environment.
As a reflection of the dynamic nature of the proteome, some researchers
prefer to use the term "functional proteome" to describe all the proteins
produced by a specific cell in a single time frame. Ultimately, it is
believed that through proteomics, new disease markers and drug targets can
be identified.
Proteomics has previously been used in the study of leukemia. For example,
two-dimensional polyacrylamide gel electrophoresis (2-D PAGE) of proteins
from the lymphoblasts of patients with ALL was used to identify
polypeptides that could distinguish between the major subgroups of ALL (Hanash
et al., 1986). In other studies of ALL using 2-D PAGE, distinct levels of
a polypeptide were observed between infants and older children with
otherwise similar cell surface markers (Hanash et al., 1989). Voss et al.
demonstrated that B-CLL patient populations with shorter survival times
exhibited changed levels of redox enzymes, Hsp27, and protein disulfide
isomerase, as determined by 2-D PAGE of proteins prepared from mononuclear
cells (Voss et al., 2001).
As these studies indicate, proteomics can be a useful tool in the study of
hematologic malignancies. There is, however, a need for proteomics
techniques that are more reliable and simpler than those currently
available in the art.
SUMMARY OF THE INVENTION
The present invention provides a novel approach that uses plasma
proteomics to create a profile that can be used to diagnose hematologic
malignancies and predict a patient's clinical behavior and response to
therapy.
In one embodiment, the invention provides a method of creating a
diagnostic or prognostic protein profile of a hematologic malignancy
comprising: obtaining plasma samples from a population of patients with
hematologic malignancies; generating protein spectra from the plasma
samples; comparing the protein spectra with patients' clinical data
relating to the hematologic malignancy; identifying a protein marker or
group of protein markers in the plasma samples that correlate with the
clinical data; and creating a protein profile based on the identified
protein marker or group of protein markers, wherein the protein profile
can be used to diagnose the hematologic malignancy or determine the
prognosis of the hematologic malignancy.
In a preferred embodiment, the protein spectra are generated by mass
spectrometry. The mass spectrometry may be, for example, SELDI
(surface-enhanced laser desorption/ionization), MALDI (matrix-assisted
laser desorption/ionization), or tandem mass spectrometry (MS/MS). In
other embodiments of the invention, the protein spectra are generated by
two-dimensional gel
electrophoresis. In certain aspects, the protein samples are fractionated
before mass spectrometry analysis or two-dimensional gel electrophoresis.
Fractionation can be according to a variety of properties, such as pH,
size, structure, or binding affinity. In one aspect, plasma proteins are
fractionated into 4 different fractions according to pH using a strong
anion exchange column (Fraction 1 ≡ pH 9, pH 7; Fraction 2 ≡ pH 5;
Fraction 3 ≡ pH 4; Fraction 4 ≡ pH 3, organic).
In certain aspects, the protein marker or group of protein markers that
correlate with the clinical data are identified by univariate statistics,
multivariate statistics, or hierarchical cluster analysis. In a preferred
embodiment, the protein marker or group of protein markers that correlate
with the clinical data are identified using correlation statistics with
beta-uniform mixture analysis, genetic algorithms, univariate, and/or
multivariate statistics. In another preferred embodiment, the protein marker
or group of protein markers that correlate with the clinical data are
identified using a decision tree algorithm. In some embodiments of the
invention the clinical data comprises one or more of cytogenetics, age,
performance status, response to therapy, type of therapy, progression,
event-free survival, time from response to relapse, and survival time.
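As a rough illustration of the univariate approach mentioned above, the
following Python sketch ranks peaks by a Welch t-statistic computed between
responders and non-responders. The peak names, intensities, and response
labels are invented for illustration and are not data from the patent, and
the choice of this particular statistic is an assumption rather than the
method the inventors used.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t-statistic for two independent samples."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))

def rank_peaks(intensities, responded):
    """Rank peaks by |t| between responders and non-responders.

    intensities: dict mapping peak name -> list of per-patient intensities
    responded:   list of booleans, one per patient, in the same order
    """
    scores = {}
    for peak, values in intensities.items():
        yes = [v for v, r in zip(values, responded) if r]
        no = [v for v, r in zip(values, responded) if not r]
        scores[peak] = welch_t(yes, no)
    # Largest absolute t-statistic first.
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)

# Hypothetical intensities for six patients (illustrative only).
intensities = {
    "peak_A": [10.1, 9.8, 10.4, 5.2, 4.9, 5.5],
    "peak_B": [7.0, 7.3, 6.8, 7.1, 6.9, 7.2],
}
responded = [True, True, True, False, False, False]

ranking = rank_peaks(intensities, responded)
for peak, t in ranking:
    print(f"{peak}: t = {t:.2f}")
```

In practice a ranking like this would be followed by a correction for
multiple comparisons, since thousands of peaks may be tested at once.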
In preferred embodiments, the protein profile is used to diagnose the
hematologic malignancy; classify the type of hematologic malignancy;
predict a patient's response to drug therapy; predict a patient's survival
time; or predict a patient's time from response to relapse. In certain
embodiments, the hematologic malignancy is leukemia, non-Hodgkin lymphoma,
Hodgkin lymphoma, myeloma, or myelodysplastic syndrome. The leukemia may
be acute myeloid leukemia (AML), chronic myeloid leukemia (CML), acute
lymphocytic leukemia (ALL), or chronic lymphocytic leukemia (CLL).
In another embodiment, the invention provides a method of predicting
response to therapy in a patient with a hematologic malignancy comprising:
obtaining a plasma sample from a patient; identifying a protein marker or
group of protein markers in the plasma sample that is associated with
response to therapy; and predicting the patient's response to therapy. In
a preferred embodiment the hematologic malignancy is leukemia, non-Hodgkin
lymphoma, Hodgkin lymphoma, myeloma, or myelodysplastic syndrome. The
leukemia may be acute myeloid leukemia (AML), chronic myeloid leukemia (CML),
acute lymphocytic leukemia (ALL), or chronic lymphocytic leukemia (CLL).
The method may be used to predict a patient's response to therapy before
beginning therapy, during therapy, or after therapy is completed. For
example, by predicting a patient's response to therapy before beginning
therapy, the information may be used in determining the best therapy
option for the patient.
In one aspect of the invention, the protein marker is a peak. The peak may
be generated by mass spectrometry. The mass spectrometry may be, for
example, SELDI, MALDI, or MS/MS. In another aspect of the invention, the
protein marker is a spot. In a preferred embodiment the spot is generated
by two-dimensional gel electrophoresis.
In certain embodiments of the invention the therapy is chemotherapy,
immunotherapy, antibody-based therapy, radiation therapy, or supportive
therapy (essentially any therapy implemented for leukemia). In some
embodiments, the chemotherapy is Gleevec or idarubicin and ara-C.
In some embodiments the protein marker or group of protein markers
associated with response to a specific therapy in a patient with AML is
one or more of Peak 1 to Peak 17 generated by SELDI mass spectrometry as
defined in Table 1 (see Original Patent). In one embodiment, the group of
protein markers associated with response to a specific therapy in a
patient with AML comprises Peak 1 and Peak 2.
In one embodiment, the invention provides a method of predicting time to
relapse in a patient with a hematologic malignancy comprising: obtaining a
plasma sample from a patient; identifying a protein marker or group of
protein markers in the plasma sample that is associated with time to
relapse; and predicting the patient's time to relapse. In a preferred
embodiment the hematologic malignancy is leukemia, non-Hodgkin lymphoma,
Hodgkin lymphoma, myeloma, or myelodysplastic syndrome. The leukemia may
be acute myeloid leukemia (AML), chronic myeloid leukemia (CML), acute
lymphocytic leukemia (ALL), or chronic lymphocytic leukemia (CLL).
In one aspect of the invention, the protein marker is a peak. The peak may
be generated by mass spectrometry. Preferably the peak is generated by
SELDI mass spectrometry. In another aspect of the invention, the protein
marker is a spot. In a preferred embodiment the spot is generated by
two-dimensional gel electrophoresis.
In a preferred embodiment the protein marker or group of protein markers
associated with time from response to idarubicin and ara-C to relapse in a
patient with AML is one or more of the Peak 18 to Peak 29 generated by
SELDI mass spectrometry as defined in Table 2 (see Original Patent).
In a preferred embodiment the protein marker or group of protein markers
associated with relapse in a patient with ALL is one or more of the Peak
30 to Peak 49 generated by SELDI mass spectrometry as defined in Table 3 (see Original Patent).
In a preferred embodiment the protein marker or group of protein markers
that differentiate between patients with L1/L2 ALL and patients with L3
ALL is one or more of the Peak 50 to Peak 69 generated by SELDI mass
spectrometry as defined in Table 4 (see Original Patent).
Those skilled in the art will recognize that the specific identity of the
proteins represented by the protein markers described herein, or of
protein markers revealed by the methods described herein, is not necessary
to create or utilize a diagnostic or prognostic protein profile. The
presence or absence, or increased or decreased levels, of a protein marker
or group of protein markers can be used to create or utilize a diagnostic
or prognostic protein profile without knowledge of what the proteins are.
For example, a diagnostic or prognostic protein profile could be created
or utilized based on the pattern of a group of protein markers without
needing to know the specific identity of the protein markers in the
pattern.
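One way to picture a pattern-based profile that never names the underlying
proteins is a nearest-centroid rule over anonymous peak intensities. The
Python sketch below is only an illustration under that assumption: the class
labels, the intensity vectors, and the choice of Euclidean distance are
hypothetical and are not taken from the patent.

```python
from math import dist

def build_profile(samples, labels):
    """Average the peak-intensity vectors of each class into one centroid per class."""
    sums, counts = {}, {}
    for vec, label in zip(samples, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, value in enumerate(vec):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def classify(profile, vec):
    """Assign a new sample to the class whose centroid is closest (Euclidean)."""
    return min(profile, key=lambda label: dist(profile[label], vec))

# Hypothetical training data: each row holds the intensities of three
# anonymous peaks for one patient. No protein identities are needed.
training = [
    [9.0, 1.0, 4.0],
    [8.5, 1.5, 4.2],
    [2.0, 6.0, 4.1],
    [2.5, 6.5, 3.9],
]
labels = ["responder", "responder", "non-responder", "non-responder"]

profile = build_profile(training, labels)
prediction = classify(profile, [8.8, 1.2, 4.0])
print(prediction)
```

The profile here is nothing more than a set of intensity patterns, which is
the point of the passage: the classification works on the pattern alone.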
In another embodiment, the invention provides a method of predicting
response to therapy in a patient with a hematologic malignancy comprising:
obtaining a bone marrow aspirate sample from a patient; identifying a
protein marker or group of protein markers in the sample that is
associated with response to therapy; and predicting the patient's response
to therapy. In a preferred embodiment the hematologic malignancy is
leukemia, non-Hodgkin lymphoma, Hodgkin lymphoma, myeloma, or
myelodysplastic syndrome. The leukemia may be acute myeloid leukemia
(AML), chronic myeloid leukemia (CML), acute lymphocytic leukemia (ALL),
or chronic lymphocytic leukemia (CLL). In one aspect of the invention, the
leukemia is CML.
In certain aspects, a protein marker of the present invention may be a
P52rIPK homolog, follistatin-related protein 1 precursor, annexin A10,
annexin 14, tumor necrosis factor receptor superfamily member XEDAR, a
zinc finger protein, CD38 ADP-ribosyl cyclase 1, connective tissue growth
factor, CD28, Bcl2-related ovarian killer, tumor necrosis factor receptor
superfamily member 10D, X-linked ectodysplasin receptor, ectodysplasin A2
isoform receptor, or chromosome 21 open reading frame 63.
It is contemplated that any method or composition described herein can be
implemented with respect to any other method or composition described
herein.
The use of the term "or" in the claims is used to mean "and/or" unless
explicitly indicated to refer to alternatives only or the alternatives are
mutually exclusive, although the disclosure supports a definition that
refers to only alternatives and "and/or."
Throughout this application, the term "about" is used to indicate that a
value includes the standard deviation of error for the device or method
being employed to determine the value.
Following long-standing patent law, the words "a" and "an," when used in
conjunction with the word "comprising" in the claims or specification,
denote one or more, unless specifically noted.
Other objects, features and advantages of the present invention will
become apparent from the following detailed description. It should be
understood, however, that the detailed description and the specific
examples, while indicating specific embodiments of the invention, are
given by way of illustration only, since various changes and modifications
within the spirit and scope of the invention will become apparent to those
skilled in the art from this detailed description.
DESCRIPTION OF ILLUSTRATIVE EMBODIMENTS
A. The Present Invention
Among patients with hematologic malignancies there can be a highly
variable clinical course as reflected by varying survival times and
resistance to therapy. Depending on the type of hematologic malignancy a
patient has, therapy may include radiation, chemotherapy, bone marrow
transplant, biological therapy, or some combination of these therapies.
Thus, the accurate diagnosis of a patient's hematologic malignancy is
important in determining which therapy option to pursue, as different
malignancies respond differently to certain therapies. Even within a
particular form of hematologic malignancy (e.g., AML, ALL, CML, CLL) there
is significant variability in response to therapy among patients. For
example, in acute myeloid leukemia (AML), response to standard
chemotherapy (idarubicin+ara-C) varies significantly between patients,
with approximately 50% of patients not responding to therapy. Although
specific cytogenetic abnormalities in AML patients, such as -5, -7 and 11q
abnormalities, or poor performance status and advanced age are known to
be associated with poor response to therapy, accurate prediction of
response to therapy remains elusive. The ability to accurately diagnose
and predict clinical behavior in patients with hematologic malignancies
would allow stratification of patients for therapy options.
Current methods for determining diagnosis or clinical behavior in patients
with hematologic malignancies are not reliable and typically depend on one
molecule. The present invention enables the evaluation of thousands of
proteins at the same time from which a protein profile can be generated
that can be used to diagnose or predict clinical behavior in patients with
hematologic malignancies. In addition, the invention uses proteomics in
combination with blood plasma. Blood plasma is easy to collect and
provides the most complex human-derived proteome, making it superior to
cells and serum for proteomic studies of hematologic malignancies.
The present invention demonstrates that the diagnosis and prediction of
clinical behavior in patients with hematologic malignancies can be
accomplished by analysis of proteins present in a plasma sample. Thus, in
particular embodiments the present invention uses plasma to create a
diagnostic or prognostic protein profile of a hematologic malignancy
comprising collecting plasma samples from a population of patients with
hematologic malignancies; generating protein spectra from the plasma
samples; comparing the protein spectra with clinical data; and identifying
protein markers in the plasma samples that correlate with the clinical
data. Protein markers identified by this approach can then be used to
create a protein profile that can be used to diagnose the hematologic
malignancy or determine the prognosis of the hematologic malignancy. In
some embodiments, protein markers may be identified by comparing the
protein profile from patients with hematologic malignancies with protein
profiles from unaffected individuals.
Using the methods of the invention, those skilled in the art will be able
to identify protein markers that can accurately diagnose hematologic
malignancies, predict a patient's response to therapy, predict a patient's
time to relapse, and predict a patient's survival time. Furthermore, the
invention provides several protein markers shown to accurately predict
response to therapy in patients with AML, as well as several protein
markers shown to accurately predict the time to relapse in patients with
AML.
B. The Plasma Proteome
Blood plasma is easy to collect and provides the most complex
human-derived proteome, containing other tissue proteomes as subsets. The
protein content of plasma can be classified into the following groups:
proteins secreted by solid tissues and that act in the plasma;
immunoglobulins; "long distance" receptor ligands; "local" receptor
ligands; temporary passengers; tissue leakage products; aberrant
secretions from cancer cells and other diseased cells; and foreign
proteins (Anderson and Anderson, 2002).
Other body fluids including cerebrospinal fluid, synovial fluid, and urine
share some of the protein content with plasma. These samples, however, are
more difficult to obtain in a useful state than plasma. For example,
collection of cerebrospinal fluid and synovial fluid are invasive
procedures that can be painful and involve some risk, while processing
urine to a useful sample for protein analysis can be difficult in a
clinical setting. Blood plasma, however, may be easily collected by
venipuncture. For example, venous blood samples can be drawn and collected
in sterile ethylene diamine tetra acetate (EDTA) tubes. The plasma can
then be separated by centrifugation. If desired, the plasma may be stored
at -70° C. for later analysis.
Characterizing the proteins in plasma can be challenging due to the large
amount of albumin present and the wide range in abundance of other
proteins. The present invention, however, shows that proteomics in
combination with plasma can provide a reliable approach to diagnosing
hematologic malignancies and predicting clinical behavior in patients
with hematologic malignancies.
C. Protein Analysis
The present invention employs methods of separating proteins from plasma.
Methods of separating proteins are well known to those of skill in the art
and include, but are not limited to, various kinds of chromatography
(e.g., anion exchange chromatography, affinity chromatography, sequential
extraction, and high performance liquid chromatography) and mass
spectrometry. The separation and detection of the proteins in a plasma
sample generates a protein spectrum for that sample.
1. Mass Spectrometry
In preferred embodiments the present invention employs mass spectrometry.
Mass spectrometry provides a means of "weighing" individual molecules by
ionizing the molecules in vacuo and making them "fly" by volatilization.
Under the influence of combinations of electric and magnetic fields, the
ions follow trajectories depending on their individual mass (m) and charge
(z). Mass spectrometry (MS), because of its extreme selectivity and
sensitivity, has become a powerful tool for the quantification of a broad
range of bioanalytes including pharmaceuticals, metabolites, peptides and
proteins.
Of particular interest in the present invention is surface-enhanced laser
desorption ionization-time of flight mass spectrometry (SELDI-TOF MS).
Whole proteins can be analyzed by SELDI-TOF MS, which is a variant of
MALDI-TOF (matrix-assisted laser desorption ionization-time of flight) mass
spectrometry. In SELDI-TOF MS, fractionation based on protein affinity
properties is used to reduce sample complexity. For example, hydrophobic,
hydrophilic, anion exchange, cation exchange, and immobilized-metal
affinity surfaces can be used to fractionate a sample. The proteins that
selectively bind to a surface are then irradiated with a laser. The laser
desorbs the adherent proteins, causing them to be launched as ions. The
"time of flight" of the ion before detection by an electrode is a measure
of the mass-to-charge ratio (m/z) of the ion. The SELDI-TOF MS approach
to protein analysis has been implemented commercially (e.g., Ciphergen).
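The link between flight time and m/z can be sketched with the idealized
time-of-flight relation, in which an ion of charge z accelerated through a
potential V acquires kinetic energy zeV and then drifts a length L at constant
velocity. The Python example below assumes that simplified model; the
acceleration voltage and drift length are hypothetical values, and real
instruments are calibrated empirically rather than computed this way.

```python
from math import sqrt, isclose

DALTON_KG = 1.66053906660e-27          # mass of one dalton in kilograms
ELEMENTARY_CHARGE_C = 1.602176634e-19  # elementary charge in coulombs

def flight_time(mass_da, charge, accel_voltage_v, drift_length_m):
    """Idealized flight time in seconds: t = L * sqrt(m / (2 z e V))."""
    mass_kg = mass_da * DALTON_KG
    energy_j = charge * ELEMENTARY_CHARGE_C * accel_voltage_v
    velocity = sqrt(2.0 * energy_j / mass_kg)
    return drift_length_m / velocity

def mz_from_flight_time(time_s, accel_voltage_v, drift_length_m):
    """Invert the same relation: m/z (Da per charge) = 2 e V t^2 / L^2."""
    kg_per_charge = (2.0 * ELEMENTARY_CHARGE_C * accel_voltage_v
                     * time_s ** 2 / drift_length_m ** 2)
    return kg_per_charge / DALTON_KG

# Hypothetical instrument settings (illustrative only).
VOLTAGE_V = 20_000.0
DRIFT_M = 1.0

t_light = flight_time(10_000.0, 1, VOLTAGE_V, DRIFT_M)
t_heavy = flight_time(40_000.0, 1, VOLTAGE_V, DRIFT_M)

# A fourfold increase in m/z doubles the flight time.
print(isclose(t_heavy / t_light, 2.0))
```

Because flight time grows with the square root of m/z, heavier ions arrive
later, which is what lets the detector order ions by mass-to-charge ratio.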
2. Two-Dimensional Electrophoresis
In certain embodiments the present invention employs high-resolution
electrophoresis to separate proteins from a biological sample such as
plasma. Preferably, two-dimensional gel electrophoresis is used to
generate a two-dimensional array of spots of proteins from a sample.
Two-dimensional electrophoresis is a useful technique for separating
complex mixtures of molecules, often providing a much higher resolving
power than that obtainable in one-dimension separations. Two-dimensional
gel electrophoresis can be performed using methods known in the art (See,
e.g., U.S. Pat. Nos. 5,534,121 and 6,398,933). Typically, proteins in a
sample are separated by, e.g., isoelectric focusing, during which proteins
in a sample are separated in a pH gradient until they reach a spot where
their net charge is zero (i.e., isoelectric point). This first separation
step results in a one-dimensional array of proteins. The proteins in the
one-dimensional array are further separated using a technique generally
distinct from that used in the first separation step. For example, in the
second dimension, proteins separated by isoelectric focusing are further
separated using a polyacrylamide gel, such as polyacrylamide gel
electrophoresis in the presence of sodium dodecyl sulfate (SDS-PAGE). SDS-PAGE
gel allows further separation based on molecular mass of the protein.
Proteins in the two-dimensional array can be detected using any suitable
methods known in the art. Staining of proteins can be accomplished with
colorimetric dyes (coomassie), silver staining and fluorescent staining
(Ruby Red). As is known to one of ordinary skill in the art, spots and/or
protein profiling patterns generated can be further analyzed, for example,
by gas phase ion spectrometry. Proteins can be excised from the gel and
analyzed by gas phase ion spectrometry. Alternatively, the gel containing
proteins can be transferred to an inert membrane by applying an electric
field and the spot on the membrane that approximately corresponds to the
molecular weight of a marker can be analyzed by gas phase ion
spectrometry.
3. Other Methods of Protein Analysis
In addition to the methods described above, other methods of protein
separation known to those of skill in the art may be useful in the
practice of the present invention. The methods of protein analysis may be
used alone or in combination.
a. Chromatography
Chromatography is used to separate organic compounds on the basis of their
charge, size, shape, and solubilities. A chromatographic system consists of a
mobile phase (solvent and the molecules to be separated) and a stationary
phase either of paper (in paper chromatography) or glass beads, called
resin, (in column chromatography) through which the mobile phase travels.
Molecules travel through the stationary phase at different rates because
of their chemistry. Types of chromatography that may be employed in the
present invention include, but are not limited to, high performance liquid
chromatography (HPLC), ion exchange chromatography (IEC), and reverse
phase chromatography (RP). Other kinds of chromatography include:
adsorption, partition, affinity, gel filtration and molecular sieve, and
many specialized techniques for using them including column, paper,
thin-layer and gas chromatography (Freifelder, 1982).
i. High Performance Liquid Chromatography
High performance liquid chromatography (HPLC) is similar to reverse phase,
only in this method, the process is conducted at a high velocity and
pressure drop. The column is shorter and has a small diameter, but it is
equivalent to possessing a large number of equilibrium stages.
Although there are other types of chromatography (e.g., paper and thin
layer), most applications of chromatography employ a column. The column is
where the actual separation takes place. It is usually a glass or metal
tube of sufficient strength to withstand the pressures that may be applied
across it. The column contains the stationary phase. The mobile phase runs
through the column and is adsorbed onto the stationary phase. The column
can either be a packed bed or open tubular column. A packed bed column is
comprised of a stationary phase which is in granular form and packed into
the column as a homogeneous bed. The stationary phase completely fills the
column. An open tubular column's stationary phase is a thin film or layer
on the column wall. There is a passageway through the center of the
column.
The mobile phase is comprised of a solvent into which the sample is
injected. The solvent and sample flow through the column together; thus
the mobile phase is often referred to as the "carrier fluid." The
stationary phase is the material in the column for which the components to
be separated have varying affinities. The materials which comprise the
mobile and stationary phases vary depending on the general type of
chromatographic process being performed. The mobile phase in liquid
chromatography is a liquid of low viscosity which flows through the
stationary phase bed. This bed may be comprised of an immiscible liquid
coated onto a porous support, a thin film of liquid phase bonded to the
surface of a sorbent, or a sorbent of controlled pore size.
High-performance chromatofocusing (HPCF) produces liquid pI fractions as
the first-dimension of protein separation followed by high-resolution
reversed-phase (RP) HPLC of each of the pI fractions as the second
dimension. Proteins are now mapped (like gels), but the liquid fractions
make for easy interface with mass spectrometry (MS) for detailed intact
protein characterization and identification (unlike gels) on a more
selective basis without resorting to protein digestion.
ii. Reversed-Phase Chromatography
Reversed phase chromatography (RPC) utilizes solubility properties of the
sample by partitioning it between a hydrophilic and a lipophilic solvent.
The partition of the sample components between the two phases depends on
their respective solubility characteristics. Less hydrophobic components
end up primarily in the hydrophilic phase while more hydrophobic ones are
found in the lipophilic phase. In RPC, silica particles covered with
chemically-bonded hydrocarbon chains (2-18 carbons) represent the
lipophilic phase, while an aqueous mixture of an organic solvent
surrounding the particle represents the hydrophilic phase.
When a sample component passes through an RPC column the partitioning
mechanism operates continuously. Depending on the extractive power of the
eluent, a greater or lesser part of the sample component will be retained
reversibly by the lipid layer of the particles, in this case called the
stationary phase. The larger the fraction retained in the lipid layer, the
slower the sample component will move down the column. Hydrophilic
compounds will move faster than hydrophobic ones, since the mobile phase
is more hydrophilic than the stationary phase.
Compounds stick to reverse phase HPLC columns in high aqueous mobile phase
and are eluted from RP HPLC columns with high organic mobile phase. In RP
HPLC compounds are separated based on their hydrophobic character.
Peptides can be separated by running a linear gradient of the organic
solvent.
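A linear gradient can be pictured with a deliberately simplified model in
which each compound is released once the organic fraction of the mobile phase
reaches a compound-specific threshold. The Python sketch below assumes that
toy model; the gradient settings, peptide names, and threshold percentages
are hypothetical, and real retention behavior is more complex.

```python
def percent_organic(t_min, start_pct, end_pct, duration_min):
    """Organic-solvent percentage of the mobile phase at time t (linear gradient)."""
    if t_min <= 0:
        return start_pct
    if t_min >= duration_min:
        return end_pct
    return start_pct + (end_pct - start_pct) * t_min / duration_min

def elution_time(required_pct, start_pct, end_pct, duration_min):
    """Time at which the gradient first reaches the percentage needed to elute.

    Returns 0.0 if the compound is not retained at the starting composition,
    and None if the gradient never becomes strong enough.
    """
    if required_pct <= start_pct:
        return 0.0
    if required_pct > end_pct:
        return None
    return (required_pct - start_pct) * duration_min / (end_pct - start_pct)

# Hypothetical peptides and the organic percentage assumed to release each one.
peptides = {"hydrophilic": 20.0, "intermediate": 35.0, "hydrophobic": 50.0}
gradient = dict(start_pct=5.0, end_pct=65.0, duration_min=60.0)

times = {name: elution_time(pct, **gradient) for name, pct in peptides.items()}
order = sorted(times, key=times.get)
print(order)
```

Under this model the more hydrophobic a peptide is, the later it elutes,
which matches the separation by hydrophobic character described above.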
Along with the partitioning mechanism, adsorption operates at the
interface between the mobile and the stationary phases. The adsorption
mechanism is more pronounced for hydrophilic sample components while for
hydrophobic ones the liquid-liquid partitioning mechanism is prevailing.
Thus the retention of hydrophobic components is greatly influenced by the
thickness of the lipid layer. An 18 carbon layer is able to accommodate
more hydrophobic material than an 8 carbon or a 2 carbon layer.
The mobile phase can be considered as an aqueous solution of an organic
solvent, the type and concentration of which determines the extractive
power. Some commonly used organic solvents, in order of increasing
hydrophobicity are: methanol, propanol, acetonitrile, and tetrahydrofuran.
Due to the very small sizes of the particles employed as the stationary
phase, very narrow peaks are obtained. In some embodiments, reverse phase
HPLC peaks are represented by bands of different intensity in the
two-dimensional image, according to the intensity of the peaks eluting
from the HPLC. In some instances, peaks are collected as the eluent of the
HPLC separation in the liquid phase. To improve the chromatographic peak
shape and to provide a source of protons in reverse phase chromatography
acids are commonly used. Such acids are formic acid, trifluoroacetic acid,
and acetic acid.
iii. Ion Exchange Chromatography
Ion exchange chromatography (IEC) is applicable to the separation of
almost any type of charged molecule, from large proteins to small
nucleotides and amino acids. It is very frequently used for proteins and
peptides, under widely varying conditions. In protein structural work the
consecutive use of gel permeation chromatography (GPC) and IEC is quite
common.
In ion exchange chromatography, a charged particle (matrix) binds
reversibly to sample molecules (proteins, etc.). Desorption is then
brought about by increasing the salt concentration or by altering the pH
of the mobile phase. Ion exchangers containing diethyl aminoethyl (DEAE) or
carboxymethyl (CM) groups are most frequently used in biochemistry. The
ionic properties of both DEAE and CM are dependent on pH, but both are
sufficiently charged to work well as ion exchangers within the pH range 4
to 8 where most protein separations take place.
The property of a protein which governs its adsorption to an ion exchanger
is the net surface charge. Since surface charge is the result of weak
acidic and basic groups of the protein, separation is highly pH dependent.
Going from low to high pH values, the surface charge of proteins shifts
from positive to negative. The pH versus net surface charge curve is an
individual property of a protein, and constitutes the
basis for selectivity in IEC.
As in all forms of liquid chromatography, conditions are employed that
permit the sample components to move through the column with different
speeds. At low ionic strengths, all components with affinity for the ion
exchanger will be tightly adsorbed at the top of the ion exchanger and
nothing will remain in the mobile phase. When the ionic strength of the
mobile phase is increased by adding a neutral salt, the salt ions will
compete with the protein and more of the sample components will be
partially desorbed and start moving down the column. Increasing the ionic
strength even more causes a larger number of the sample components to be
desorbed, and the speed of the movement down the column will increase. The
higher the net charge of the protein, the higher the ionic strength needed
to bring about desorption. At a certain high level of ionic strength, all
the sample components are fully desorbed and move down the column with the
same speed as the mobile phase. Somewhere in between total adsorption and
total desorption one will find the optimal selectivity for a given pH
value of the mobile phase. Thus, to optimize selectivity in ion exchange
chromatography, a pH value is chosen that creates sufficiently large net
charge differences among the sample components. Then, an ionic strength is
selected that fully utilizes these charge differences by partially
desorbing the components. The respective speed of each component down the
column will be proportional to that fraction of the component which is
found in the mobile phase.
Very often the sample components vary so much in their adsorption to the
ion exchanger that a single value of the ionic strength cannot make the
slow ones pass through the column in a reasonable time. In such cases, a
salt gradient is applied to bring about a continuous increase of ionic
strength in the mobile phase.
D. Analysis of Protein Markers
1. Extraction of Protein Marker Locations
Following the generation of protein spectra by, for example, SELDI-TOF MS,
protein markers are identified for further analysis. Protein marker
detection can be made easier by reducing the background noise. The
background noise can be reduced at different levels. One method of
reducing background noise is to average the raw protein spectra data.
First, peaks should be normalized to assure that equal amounts of samples
are compared. There are several methods for normalization known to those
skilled in the art. A common approach is normalizing according to
intensity: Total Ion Current, height, area, or mass. A different method
for normalization is using the following formula (I=intensity):
Normalized I = (Current I - Minimum I) / (Maximum I - Minimum I)
After normalization, reducing background can be achieved by eliminating
peaks that are not seen in a majority (50-70%) of samples.
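The normalization formula and the majority filter described above can be sketched as follows. This is a minimal illustration, not the procedure used in the patent: the function names, the dictionary layout (peak label mapped to intensity), the example intensities, and the 0.5 cut-off are all assumptions chosen for the example.

```python
def min_max_normalize(intensities):
    """Rescale one spectrum's peak intensities to the range [0, 1]."""
    lo, hi = min(intensities), max(intensities)
    if hi == lo:
        # A flat spectrum carries no contrast; return zeros instead of dividing by zero.
        return [0.0 for _ in intensities]
    return [(x - lo) / (hi - lo) for x in intensities]


def keep_common_peaks(spectra, min_fraction=0.5):
    """Return the peak labels detected in at least min_fraction of the samples.

    spectra: list of dicts mapping a peak label (here a rounded mass) to intensity.
    """
    counts = {}
    for spectrum in spectra:
        for peak in spectrum:
            counts[peak] = counts.get(peak, 0) + 1
    needed = min_fraction * len(spectra)
    return sorted(p for p, n in counts.items() if n >= needed)


# Hypothetical spectra from three samples (values are illustrative only).
spectra = [
    {7728: 12.0, 10217: 4.0, 4925: 1.0},
    {7728: 30.0, 10217: 9.0},
    {7728: 8.0, 6809: 2.0},
]
normalized = min_max_normalize([12.0, 4.0, 1.0])
common = keep_common_peaks(spectra, min_fraction=0.5)
print(normalized)
print(common)
```

With a cut-off of 0.5, only peaks seen in at least two of the three hypothetical samples are retained; raising min_fraction toward 0.7 makes the filter stricter.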
Systems for mass spectra acquisition are commercially available. One
example is the Ciphergen ProteinChip.RTM. Reader (Ciphergen Biosystems,
Inc.). The chip reader may be used with peak detection software such as
CiphergenExpress 3.0. This software calculates clusters by determining
peaks that are above a given signal-to-noise ratio, and that are present
in multiple spectra. Various settings for noise subtraction, peak
detection, and cluster completion may be evaluated to optimize the
analysis. For example, a first pass peak detection of 5.0 signal-to-noise
on both peaks and valleys, and a cluster completion window of 1.0 times
peak width, with a second pass signal-to-noise setting of 2.0 for both
peaks and valleys may be used.
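The two-pass idea can be illustrated with a short sketch. It is not the algorithm implemented in CiphergenExpress, whose internals are not described here. It assumes that strong peaks (first-pass signal-to-noise) seed clusters and that weaker peaks (second-pass signal-to-noise) in other spectra are accepted if they fall within the cluster window. The function name, the tuple layout, and the fixed peak_width are hypothetical.

```python
def two_pass_clusters(spectra, first_snr=5.0, second_snr=2.0,
                      window=1.0, peak_width=10.0):
    """Group peaks across spectra into clusters using two S/N thresholds.

    spectra: list of spectra, each a list of (mass, signal_to_noise) tuples.
    Returns a dict mapping each cluster seed mass to one entry per spectrum:
    the best matching (mass, snr) tuple, or None if nothing qualified.
    """
    half_window = window * peak_width

    # First pass: only strong peaks may start a cluster.
    seeds = []
    for spectrum in spectra:
        for mass, snr in spectrum:
            if snr >= first_snr and all(abs(mass - s) > half_window for s in seeds):
                seeds.append(mass)

    # Second pass: complete each cluster with weaker peaks near the seed.
    clusters = {}
    for seed in sorted(seeds):
        members = []
        for spectrum in spectra:
            hits = [(m, s) for m, s in spectrum
                    if abs(m - seed) <= half_window and s >= second_snr]
            members.append(max(hits, key=lambda h: h[1]) if hits else None)
        clusters[seed] = members
    return clusters


# Hypothetical peak lists for two spectra.
spectra = [
    [(7728.0, 8.0), (9263.0, 3.0)],
    [(7730.0, 2.5), (9263.0, 1.5)],
]
clusters = two_pass_clusters(spectra)
print(clusters)
```

In this example the peak near 9263 never reaches the first-pass threshold in any spectrum, so it does not form a cluster, while the weak peak at 7730.0 is retained because a strong peak at 7728.0 seeded the cluster.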
The use of total ion current as a normalization factor is a common
practice in SELDI data analysis; however, other methods of normalization
may be used. For example, normalization could be done using the peak ratio
approach in which the ratios of peaks near each other (e.g., within 5
peaks upstream and downstream) are used for normalizing. The peak ratio
approach has an additional advantage of possibly detecting
post-translational modifications more effectively.
Peaks may also be detected manually. The results of manual peak detection
may then be analyzed using software, such as Matlab (MathWorks, Natick,
Mass.), followed by decision tree analysis. A non-limiting example of
decision tree analysis software is CART from Salford Systems, which is
implemented in Biomarker Patterns Software 4.0 from Ciphergen Biosystems,
Inc.
Replicate samples can be analyzed to confirm the reproducibility of the
protein spectra generated according to the methods of the invention. Those
of skill in the art are familiar with statistical methods that can be used
to determine the reproducibility of the analysis. For example, an
agglomerative clustering algorithm may be used to show that replicate
samples cluster as nearest neighbors, thus confirming reproducibility.
Agglomerative clustering analysis is the searching for groups in the data
in such a way that objects belonging to the same cluster resemble each
other. The computer analysis proceeds by combining or dividing existing
groups, producing a hierarchical structure displaying the order in which
groups are merged or divided. Agglomerative methods start with each
observation in a separate group and proceed until all observations are in
a single group.
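A reproducibility check of this kind can be sketched as below. The sketch assumes average linkage and Euclidean distance, neither of which is specified in the text, and the sample names and intensities are invented for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length intensity vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def agglomerate(samples):
    """Average-linkage agglomerative clustering.

    samples: dict mapping a sample name to its list of peak intensities.
    Returns the merges in the order they occur, each as a pair of frozensets.
    """
    clusters = [frozenset([name]) for name in samples]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average pairwise distance between the two groups.
                total = sum(euclidean(samples[a], samples[b])
                            for a in clusters[i] for b in clusters[j])
                dist = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or dist < best[0]:
                    best = (dist, i, j)
        _, i, j = best
        merges.append((clusters[i], clusters[j]))
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


# Hypothetical normalized peak intensities for two samples run in duplicate.
samples = {
    "A_rep1": [1.0, 0.2, 0.5],
    "A_rep2": [0.9, 0.25, 0.5],
    "B_rep1": [0.1, 0.9, 0.3],
    "B_rep2": [0.15, 0.85, 0.35],
}
merges = agglomerate(samples)
first_two = {a | b for a, b in merges[:2]}
print(first_two)
```

If the analysis is reproducible, the earliest merges should pair each sample with its own replicate, as they do for this hypothetical data.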
2. Determining the Relevance of Protein Markers
To test the relevance of the protein markers identified in the protein
spectra, various methods of statistical analysis known to those of skill
in the art may be employed. For example, a univariate model, multivariate
model, or hierarchical cluster analysis may be used.
a. Multivariate Modeling
A multivariate model is a model that aims to predict or explain the
behavior of a dependent variable on the basis of a set of known
independent variables. The purpose of using multivariate analysis is to
demonstrate that the proteomic analysis as a variable in predicting
response, survival, and duration of response is independent from the
currently known variables that can predict the same thing. If the
proteomic data adds to the model that includes the conventional markers,
the p-value will be significant, but if the proteomic data does not add to
the model and similar prediction can be achieved using other conventional
markers, the p-value will not be significant even if it was significant in
univariate analysis.
For predicting the response to therapy of a patient with a hematologic
malignancy, a multivariate model is preferred. An example of a
multivariate model for predicting response to therapy in a patient with
AML is (Response ~ Cytogenetics + Performance.Status + Age).
Cytogenetic findings represent the chromosomal abnormalities that were
found in the tumor cells. Depending on these abnormalities, the
leukemia/tumor can be classified as good, intermediate, or bad. For
example, a patient with AML and cytogenetic abnormalities including
deletion of chromosome 5 or 7 or abnormalities on chromosome 11 has
"bad" disease (>90% die within one year and will not respond
to therapy). Patients with AML and t(8;21), t(15;17), or Inv 16 are
classified as having "good" disease, and the rest as having "intermediate" disease.
With regard to age, the older the patient the worse the disease
(continuous variable). Patients >65 years old are classified with "bad"
disease.
Performance status is a scoring system to evaluate the patient's overall
health as described below in Table 5 (see Original Patent). Obviously, the
higher the grade (ECOG), the less likely the patient will survive.
To test the relevance of a specific protein marker to the prediction of a
behavior, the protein marker can be added to the multivariate model. For
example, the value (i.e., height) of a protein peak identified by SELDI MS
can be added to the base multivariate model for predicting response to
therapy in a patient with AML to give the extended multivariate model of
(Response ~ Cytogenetics + Performance.Status + Age + Peak Info), where
Peak Info is information from a given peak. Preferably Peak
Info is a transformed peak value, such as logPeak, logPeak+(logPeak)^2,
Peak+Peak^2, or Peak+logPeak.
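The four transformations listed above can be written out as the extra terms each one contributes to the extended model. This is a small sketch; the function name and the string keys are hypothetical, and the natural logarithm is assumed because the base of the logarithm is not stated.

```python
import math


def peak_terms(peak, transform):
    """Return the model terms that one peak height contributes under a transform."""
    if peak <= 0:
        raise ValueError("log transforms need a positive peak height")
    log_peak = math.log(peak)  # natural log is an assumption
    options = {
        "logPeak": [log_peak],
        "logPeak+(logPeak)^2": [log_peak, log_peak ** 2],
        "Peak+Peak^2": [peak, peak ** 2],
        "Peak+logPeak": [peak, log_peak],
    }
    return options[transform]


quadratic = peak_terms(4.0, "Peak+Peak^2")
print(quadratic)
```

The first transformation adds a single term to the base model, while the other three add two terms each, which matters when the base and extended models are compared.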
After applying the peak value to the multivariate model, a p-value is
produced. Those of skill in the art are familiar with methods of
calculating p-values. For example, a p-value may be determined by applying
ANOVA (analysis of variance between groups) on the base multivariate model
and the extended multivariate model.
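One way such a comparison is commonly carried out is sketched below. It assumes that the response model is fitted by maximum likelihood (for example, logistic regression) and that the base and extended models are compared with a likelihood-ratio (analysis of deviance) test; the text states only that ANOVA is applied. The function name is hypothetical, and the log-likelihood values would come from whatever software fitted the two models.

```python
import math


def lrt_p_value(loglik_base, loglik_extended, extra_terms):
    """Likelihood-ratio test p-value for a base model nested in an extended model.

    extra_terms: number of peak-derived terms added (1 or 2 in this sketch).
    """
    stat = max(2.0 * (loglik_extended - loglik_base), 0.0)
    if extra_terms == 1:
        # Chi-square survival function with 1 degree of freedom.
        return math.erfc(math.sqrt(stat / 2.0))
    if extra_terms == 2:
        # Chi-square survival function with 2 degrees of freedom.
        return math.exp(-stat / 2.0)
    raise ValueError("this sketch only handles 1 or 2 extra terms")


# Hypothetical log-likelihoods: adding one peak term improves the fit by 3.0.
p_one_term = lrt_p_value(-60.0, -57.0, extra_terms=1)
print(p_one_term)
```

A small p-value indicates that the peak adds information beyond cytogenetics, performance status, and age; a large one indicates that the conventional markers already capture it.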
To adjust for multiple testing a beta-uniform mixture analysis may be
used. The p-value is considered significant only if it is less than the
cut-off as determined by the beta-uniform mixture analysis, in which the
transformation is confirmed to be unique and not uniform. This adjusts for
the multiple testing.
b. Cox Model
Those of skill in the art are familiar with the Cox proportional hazards
model, which is a commonly used regression model for analyzing data points
with time, such as survival, time to progression, time to relapse, or time
to therapy. The Cox model allows the estimation of nonparametric survival
(or other event of interest) curves (such as Kaplan-Meier curves) in the
presence of covariates. This can be performed with continuous or
dichotomized variables. The effect of the covariates upon survival is
usually of primary interest. The Cox model can also be performed in the
context of multivariate analysis by incorporating several variables. In
the multivariate model, the analysis will first analyze the first
variable, then analyze the second variable in the groups generated from
the first variable and so on.
In one embodiment of the invention, protein peak values were fitted to the
Cox model: h(t) = h_0(t) exp(β f(Peak)),
where h(t) is the hazard at time t, h_0(t) is the baseline hazard, and
f(Peak) is some transformation of the peak value. When the Cox model was
applied to predict time to relapse, the "hazard" was relapse, and the
"baseline hazard" was the risk of relapsing based on variables other than
peak value. Resulting p-values may be analyzed by means of a beta-uniform
mixture analysis. A positive value of the coefficient β means that an
increased peak height corresponds to increased risk of relapse. The
p-value was considered significant only if it was less than the cut-off as
determined by the beta-uniform mixture analysis, in which the
transformation is confirmed to be unique and not uniform. This adjusts for
multiple testing.
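The meaning of the coefficient in the Cox model can be illustrated with a minimal single-covariate fit. The sketch assumes no tied relapse times, takes f(Peak) as a precomputed value for each patient, and finds the coefficient by bisection on the derivative of the partial log-likelihood; statistical packages typically use Newton-type iterations instead. The function names and the data are hypothetical.

```python
import math


def score(beta, times, events, x):
    """Derivative of the Cox partial log-likelihood with respect to beta."""
    total = 0.0
    for i, (t_i, observed) in enumerate(zip(times, events)):
        if observed:
            # Risk set: patients still being followed at time t_i.
            risk = [j for j, t_j in enumerate(times) if t_j >= t_i]
            weights = [math.exp(beta * x[j]) for j in risk]
            weighted_mean = sum(w * x[j] for w, j in zip(weights, risk)) / sum(weights)
            total += x[i] - weighted_mean
    return total


def cox_fit(times, events, x, lo=-10.0, hi=10.0, steps=60):
    """Estimate beta in h(t) = h0(t) * exp(beta * x) by bisection on the score.

    The partial log-likelihood is concave, so its derivative crosses zero
    at most once; the search is confined to the interval [lo, hi].
    """
    for _ in range(steps):
        mid = (lo + hi) / 2.0
        if score(mid, times, events, x) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Hypothetical data: months to relapse, relapse indicator (0 = censored),
# and a transformed peak value f(Peak) for six patients.
times = [2, 4, 5, 7, 9, 12]
events = [1, 1, 1, 0, 1, 1]
f_peak = [2.0, 1.2, 1.6, 0.9, 1.1, 0.4]

beta = cox_fit(times, events, f_peak)
hazard_ratio = math.exp(beta)  # relative hazard per unit increase in f(Peak)
print(beta, hazard_ratio)
```

The baseline hazard cancels out of the partial likelihood, which is why it never has to be estimated in this sketch. In the hypothetical data, patients with higher values tend to relapse earlier, so the fitted coefficient is positive and the hazard ratio exceeds one.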
In addition to the analyses described herein, many additional questions
can be asked using the Cox model. For example, the data can be used to
predict patients who will have fungal infection, or patients who would die
in the first two weeks. Similar statistical analysis can be used to
determine response to second therapy after relapsing.
c. Decision Tree Algorithm
In one embodiment of the present invention, a decision tree algorithm was
used to identify protein spectra useful for predicting clinical outcome
(e.g., responders versus non-responders). CART software from Salford
Systems is one example of a commercially available decision tree tool.
CART automatically sifts large, complex databases, searching for and
isolating significant patterns and relationships. This information can
then be used to generate predictive models. Variables that may be included
in the analysis along with peak values and peak ratios include clinical
outcome, patient demographics, and cellular analysis. When using decision
trees, caution must be exercised to prevent overfitting (Wiemer and
Prokudin 2004). One approach to limiting overfitting is to limit the
number of levels allowed. For example, the number of levels may be limited
to two, meaning that the model could only be comprised of at most two
variables from the set of all peak values and all observational variables
(e.g., clinical outcomes, patient demographics, cellular analysis).
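The effect of limiting the number of levels can be illustrated with a small depth-limited tree. This is not the CART software; it is a minimal sketch that assumes binary outcomes coded 0 and 1, numeric variables, and Gini impurity as the splitting criterion. Here "two levels" is interpreted as at most two tests on any path from root to leaf, which is an assumption. The variable names and data are invented.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split(rows, labels):
    """Return (impurity, variable, threshold) with the lowest weighted Gini, or None."""
    best = None
    for var in rows[0]:
        values = sorted({r[var] for r in rows})
        for low, high in zip(values, values[1:]):
            thr = (low + high) / 2.0
            left = [y for r, y in zip(rows, labels) if r[var] <= thr]
            right = [y for r, y in zip(rows, labels) if r[var] > thr]
            impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or impurity < best[0]:
                best = (impurity, var, thr)
    return best


def grow(rows, labels, max_depth=2):
    """Grow a tree as nested (variable, threshold, left, right) tuples; leaves are 0 or 1."""
    majority = int(2 * sum(labels) >= len(labels))
    split = best_split(rows, labels) if max_depth > 0 and gini(labels) > 0 else None
    if split is None:
        return majority
    _, var, thr = split
    left = [(r, y) for r, y in zip(rows, labels) if r[var] <= thr]
    right = [(r, y) for r, y in zip(rows, labels) if r[var] > thr]
    return (var, thr,
            grow([r for r, _ in left], [y for _, y in left], max_depth - 1),
            grow([r for r, _ in right], [y for _, y in right], max_depth - 1))


def predict(tree, row):
    """Follow the splits until a leaf label is reached."""
    while isinstance(tree, tuple):
        var, thr, left, right = tree
        tree = left if row[var] <= thr else right
    return tree


# Hypothetical training data: one peak value, age, and response (1 = responder).
rows = [
    {"peak": 2.0, "age": 40}, {"peak": 2.5, "age": 70}, {"peak": 3.0, "age": 35},
    {"peak": 8.0, "age": 45}, {"peak": 9.0, "age": 72}, {"peak": 7.5, "age": 30},
    {"peak": 8.5, "age": 36},
]
labels = [1, 0, 1, 0, 0, 0, 0]

tree = grow(rows, labels, max_depth=2)
print(tree)
```

For this data the tree first splits on the peak value and then on age, so a prediction depends on at most two variables. Allowing deeper trees would let the model fit individual patients, which is the overfitting risk noted above.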
Claim 1 of 4 Claims
1. A method of predicting an increased
risk of relapse following therapy or distinguishing between L1/L2 and L3
in a patient with acute lymphoblastic leukemia (ALL) comprising: (a)
performing mass spectrometry on a plasma sample from said patient to
generate a protein spectra comprising protein peaks; (b) identifying a
protein peak or group of protein peaks in the protein spectra
corresponding to one or more of Peak 30 (7727.972 Daltons), Peak 31
(61940.76 Daltons), Peak 32 (124797.7 Daltons), Peak 33 (53623.64
Daltons), Peak 34 (10216.72 Daltons), Peak 35 (145023.4 Daltons), Peak 36
(6808.864 Daltons), Peak 37 (7249.661 Daltons), Peak 38 (6588.005
Daltons), Peak 39 (78971.03 Daltons), Peak 40 (4924.562 Daltons), Peak 41
(55864.83 Daltons), Peak 42 (6801.569 Daltons), Peak 43 (13298.19
Daltons), Peak 44 (83531.42 Daltons), Peak 45 (39542.43 Daltons), Peak 46
(159276.8 Daltons), Peak 47 (106256.1 Daltons), Peak 48 (88687.58
Daltons), Peak 49 (135305.2 Daltons), Peak 50 (7727.865343 Daltons), Peak
51 (10214.09619 Daltons), Peak 52 (9263.336516 Daltons), Peak 53
(10217.12293 Daltons), Peak 54 (7722.657526 Daltons), Peak 55 (7728.041349
Daltons), Peak 56 (9268.979905 Daltons), Peak 57 (7741.020002 Daltons),
Peak 58 (9248.709422 Daltons), Peak 59 (7720.190664 Daltons), Peak 60
(13870.3916 Daltons), Peak 61 (7725.474001 Daltons), Peak 62 (9275.311795
Daltons), Peak 63 (41782.2775 Daltons), Peak 64 (8896.712054 Daltons),
Peak 65 (4911.78345 Daltons), Peak 66 (83363.03733 Daltons), Peak 67
(45087.95748 Daltons), Peak 68 (121673.475 Daltons), or Peak 69
(7727.155842 Daltons), and (c) predicting risk of relapse following
therapy or distinguishing between L1/L2 and L3 based on the identification
of one or more of Peaks 30-69, wherein Peaks 30-49 are predictive of an
increased risk of relapse following therapy, and Peaks 50-69 distinguish
between L1/L2 and L3 ALL.