Internet for Pharmaceutical and Biotech Communities
  Pharmaceutical Patents  

 

Title: Bioinformatic approach to disease diagnosis
United States Patent: 8,005,627
Issued: August 23, 2011
Inventors: Porwancher; Richard (Princeton, NJ)
Appl. No.: 11/852,283
Filed: September 8, 2007


 



Abstract

A multivariate diagnostic method based on optimizing diagnostic likelihood ratios through the effective use of multiple diagnostic tests is disclosed. The Neyman-Pearson Lemma provides a mathematical basis to produce optimal diagnostic results. The method can comprise identifying those tests optimal for inclusion in a diagnostic panel, weighting the result of each component test based on a multivariate algorithm described below, adjusting the algorithm's performance to satisfy predetermined specificity criteria, generating a likelihood ratio for a given patient's test results through said algorithm, providing a clinical algorithm that estimates the pretest probability of disease based on individual clinical signs and symptoms, combining the likelihood ratio and pretest probability of disease through Bayes' Theorem to generate a posttest probability of disease, interpreting that result as either positive or negative for disease based on a cutoff value, and treating a patient for disease if the posttest probability exceeds the cutoff value.

Description of the Invention

BACKGROUND OF THE INVENTION

The present invention relates to methods for constructing multivariate predictive models to diagnose diseases for which current test methods are considered inadequate in either sensitivity or specificity. In particular, the present invention relates to predictive models for diagnosing diseases with a combination of laboratory tests, generating specificities of at least 80%.

More particularly, the present invention relates to the construction of a multivariate predictive model for diagnosing Lyme disease (LD) by choosing the best tests from among those currently available, utilizing the raw data produced by these tests instead of the manufacturers' binary test results, combining the test values into a single score through a special statistical function, weighting the importance of each component of the function when producing the score, generating a likelihood ratio from each patient's score, determining the pretest probability of disease through a special algorithm utilizing individual clinical signs and symptoms, combining the likelihood ratio with the pretest probability of disease through Bayes' Theorem to produce a posttest probability of disease, and determining a posttest probability cutoff point through a prospective validation study of the multivariate predictive model, against which individual patients' test results can be interpreted as indicative of Lyme disease or not. The present invention also relates to component laboratory tests identified by the predictive model as critical for diagnosis in the form of test kits with the test panel components incorporated into a microtiter plate to be analyzed by a commercial laboratory.

Since the discovery that the spirochete Borrelia burgdorferi was the cause of LD over 25 years ago, numerous tests have been developed to detect this organism. Direct cultures of tissue or body fluids are possible, but suffer from low sensitivity. Direct detection methods involve assays for a component of B. burgdorferi or the DNA itself. Most PCR tests for B. burgdorferi DNA in specimens such as plasma, serum, whole blood, urine, and spinal fluid are insensitive. Although invasive, arthrocentesis and skin biopsies often detect DNA by PCR in acute cases, aiding diagnosis. Performing skin biopsies is unnecessary under most circumstances because a well-trained physician can usually diagnose the characteristic rash, erythema migrans, by visual inspection alone.

Patients presenting with neurological symptoms or chronic arthritic symptoms will usually not benefit from PCR tests for B. burgdorferi DNA. In the latter cases, serological tests for antibody to B. burgdorferi are commonly used. Numerous methods have been employed, including whole-cell EIA, capture-EIA, peptide-antigen EIA, recombinant protein EIA, immunofluorescent antibody, immunodot, and immunoblots to detect IgG, IgM, and IgA antibodies. All serological methods may lead to false-positive results; however, the most common test for B. burgdorferi antibody, the whole-cell EIA, is particularly susceptible to false-positive results. Therefore the CDC has advised a two-step process to confirm antibody: first test serum by whole-cell EIA or an equivalent method, then use a highly specific immunoblot to confirm those results positive or indeterminate by the first step.

Most antibody methods are insensitive early in the disease (<4 weeks), but become more sensitive after the first few weeks have passed. This lack of sensitivity for early disease and a high rate of false-positive serology have undermined public confidence in the two-step process. The CDC and NIH have conducted active research programs for better diagnostic tests. The most promising of these new tests have been the recombinant and peptide-antigen EIAs; these tests exhibit sensitivity and specificity similar to the prior two-step process, but embodied in a single test.

The concept of a single test is the most appealing and some experts have advocated using C6 IgG as an alternative to the two-step method. The lack of sensitivity in early disease persists (at least 40% false-negative rate) with this new generation of tests (including C6 IgG), leading to recommendations for alternative interpretive algorithms by some physicians and Lyme advocacy groups. Western immunoblots using alternative interpretive algorithms (Donta, Clin. Infect. Dis., 25 (Suppl. 1), S52-56 (1997)) have demonstrated better sensitivity, but much worse specificity (up to 40% false-positives). This trade-off between sensitivity and specificity is a well recognized limitation in diagnostic testing.

The use of multiple tests in combination is not new. The two-step algorithm is borrowed from the literature on syphilis and HIV testing: a sensitive but non-specific screening test is confirmed by a more specific test. Implicit in this paradigm is the knowledge that the second, confirmatory test is at least as sensitive as the screening test. This analogy breaks down for LD (Trevejo et al., J. Infect. Dis., 179(4), 931-8 (1999)). The Western blot, though specific, is not as sensitive for early disease as the EIA test. The improved specificity of the two-step method is offset by limited sensitivity.

Tests are used in combination to gain either sensitivity or specificity; interpretive rules are usually generated through Boolean operators. If the "OR" operator is used, then a combination test is positive if either component is positive. If each component detects a different antigenic epitope of B. burgdorferi, then a test fashioned using the "OR" operator will likely be more sensitive than any individual component. However, each new component also has its own intrinsic rate of false-positive reactions. Overall false positive rates increase linearly when using the "OR" operator combinations (Porwancher, J. Clin. Microbiol. 41(6), 2791 (2003)). If the "AND" operator is used, then a test is positive only when both components are positive; this operator is used to improve the specificity of a given combination of tests, often at the expense of sensitivity.
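The effect of the two Boolean operators can be illustrated with a small Python sketch. It assumes, purely for illustration, that the two component tests are conditionally independent given disease status; that assumption, the function names, and the example figures are not taken from the source. Under it, the false-positive rate of an "OR" combination is 1 - spec_a * spec_b, which is close to the sum of the component false-positive rates when those rates are small.

```python
def combine_or(sens_a, spec_a, sens_b, spec_b):
    """Sensitivity and specificity of 'A OR B' (positive if either is positive).

    Assumes the two tests are conditionally independent given disease status.
    """
    sens = 1 - (1 - sens_a) * (1 - sens_b)  # missed only if both tests miss
    spec = spec_a * spec_b                  # negative only if both are negative
    return sens, spec


def combine_and(sens_a, spec_a, sens_b, spec_b):
    """Sensitivity and specificity of 'A AND B' (positive only if both are positive)."""
    sens = sens_a * sens_b                  # detected only if both tests detect
    spec = 1 - (1 - spec_a) * (1 - spec_b)  # false positive only if both err
    return sens, spec


# Hypothetical component tests: A (60% sensitive, 95% specific), B (50%, 90%).
or_sens, or_spec = combine_or(0.60, 0.95, 0.50, 0.90)
and_sens, and_spec = combine_and(0.60, 0.95, 0.50, 0.90)
print(f"OR : sensitivity={or_sens:.3f} specificity={or_spec:.3f}")
print(f"AND: sensitivity={and_sens:.3f} specificity={and_spec:.3f}")
```

With these illustrative inputs the "OR" rule gains sensitivity and loses specificity, and the "AND" rule does the reverse, which is the trade-off described above.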

When using the "AND" operator, a counterintuitive event may occur: additional antigens can be used to improve specificity without loss of sensitivity. This effect has been demonstrated for ElpB1 and OspE: when FlaB and OspC were added to the mix, requiring multiple antibody responses actually improved specificity from 89% to 98%, while maintaining sensitivity (Porwancher, J. Clin. Microbiol., (2003)). Sensitivity was maintained because there were 15 new ways for antibody combinations to form when two new antigens were added; patients with disease tend to have multiple positive antibody combinations. Specificity improved because false-positive combinations are rare, even though there are more ways for these to form.

Bacon et al., J. Infect. Dis., 187, 1187-1199 (2003) evaluated using two peptide or recombinant antigens together in binary form and assigned equal importance to antibodies generated by either antigen. The authors used the Boolean "OR" operator, evaluated several different antibody combinations, and settled on two pairs of antibodies for diagnosis, either C6 IgG and pepC10 IgM or VlsE1 IgG and pepC10 IgM. While the 2-tier method using a VIDAS whole-cell EIA was included, no other recombinant antigens were evaluated. By limiting the choice of antigens and not weighting the ones that are included, this method compromises test performance.

Western blots are basically multiple binary test observations: a band is formed when antibody and antigen mix together in a clear electrophoretic gel, creating a visible line. Antibody is either observed or not. Of the 10 key antibodies detected by IgG Western blot, we do not know which antibody results contribute independent information to diagnosis. Nor is the information weighted according to its level of importance; all positive components are weighted the same. Failing to weight the importance of individual bands might have led to requiring an excessive number of bands to confirm disease, thus limiting sensitivity.

Honegr et al., Epidemiol. Mikrobiol. Immunol., 50(4), 147-156 (2001), interpreted Western blots using logistic regression analysis. While directed toward human diagnosis, the study tried to determine the optimal use of different species of B. burgdorferi to utilize in European tests, as well as determine interpretive criteria. Band results reported in binary fashion were used to create a quantitative rule; however, no likelihood ratios were reported from this regression technique, no partial ROC areas were maximized using the logistic method [as in McIntosh and Pepe (2002)], there were no specificity goals for ROC areas, and there was no attempt to utilize clinical information. While key Western blot bands were identified, and weighted, the failure to use clinical information, set specificity goals, or to maximize likelihood ratios (and therefore partial ROC areas) raises a question about the validity of the rules that were derived (according to the Neyman-Pearson Lemma).

Robertson et al., J. Clin. Microbiol., 38(6), 2097-2102 (2000), performed a study whose purpose was similar to that of Honegr et al. However, Robertson et al. did not produce a quantitative rule as a consequence of utilizing multiple Western blot bands. While significant bands were identified through logistic regression, they utilized this information in a binary fashion and generated interpretive rules using either two or three of the bands so identified. There was no attempt to weight the importance of individual bands. In the end, the purported rules developed by logistic regression were no better than pre-existing interpretive criteria. No likelihood ratios were generated, no ROC curves were produced, and no clinical information was utilized. There was no attempt to use the Western blot with other tests. Their failure to quantify their results severely limited the usefulness of those results.

Guerra et al., J. Clin. Microbiol., 38(7), 2628-2632 (2000), studied the use of log-likelihood analysis of Western blot data in dogs. The emphasis of her study was to develop a rule to diagnose Lyme disease in dogs that had received the Lyme disease vaccine (known to interfere with diagnosis). Guerra did produce a quantitative rule based on likelihood ratios. She combined this rule with epidemiological data to generate posttest probabilities. None of the animals were sick. No ROC analysis was performed, nor was there an attempt to determine the specificity or sensitivity of the technique. While a predictive rule could be generated, its performance was unclear because the epidemiological data was poorly utilized.

As demonstrated above, the LD field is limited by the lack of a theoretical basis for test strategy. There has been remarkably little work done using multivariate analysis and Lyme disease. Multiple tests exist to diagnose LD, but little is known about which tests are optimal or how to use tests together to enhance diagnostic power. U.S. Pat. No. 6,665,652 described an algorithm that enabled diagnosis of LD using multiple simultaneous immunoassays; this method required that the antibody response to antigens selected for diagnostic use be highly associated with LD (i.e. few false-positive results) and conditionally independent among controls. The disclosure of the above patent, particularly as it relates to LD diagnosis, is incorporated herein by reference.

Diagnostic methods are usually compared based on misclassification costs (utility loss), a value tied to the prevalence of LD in the general population. While the dollar cost of diagnostic tests is one means to compare outcomes, another and possibly more important goal is to estimate the loss of productive life (regret) from a given outcome. The two factors that generate regret are false-negative and false-positive serology.

The cost associated with false-negative results is the difference in regret between those with false-negative and true-positive serology, for which the increased personal, economic, and social cost of delaying disease treatment are factors. The cost associated with false-positive results is the difference in regret between those with false-positive and true-negative serology, for which the personal, economic, and social costs of administering the powerful intravenous antibiotics to healthy patients are all factors.

The foregoing issues also exist for many other infectious and non-infectious diseases. There remains a need for a predictive model that enables the selection of the fewest number of tests that contribute significantly to disease diagnosis, thereby limiting the cost of testing without sacrificing diagnostic sensitivity.

SUMMARY OF THE INVENTION

This need is met by the present invention. A multivariate diagnostic method based on optimizing diagnostic likelihood ratios through the effective use of multiple diagnostic tests is proposed. The Neyman-Pearson Lemma (Neyman and Pearson, Philosophical Transactions of the Royal Society of London, Series A, 231, 289-337 (1933)) provides a mathematical basis for relying on such methods to produce optimal diagnostic results. When individual diagnostic tests for a disease prove inadequate in terms of either sensitivity or specificity, the present invention provides a method for combining existing tests to enhance performance.

The method includes the steps of: identifying those tests optimal for inclusion in a diagnostic panel, weighting the result of each component test based on a multivariate algorithm described below, adjusting the algorithm's performance to satisfy predetermined specificity criteria, generating a likelihood ratio for a given patient's test results through said algorithm, providing a clinical algorithm that estimates the pretest probability of disease based on individual clinical signs and symptoms, combining the likelihood ratio and pretest probability of disease through Bayes' Theorem to generate a posttest probability of disease, interpreting that result as either positive or negative for disease based on a cutoff value, and treating a patient for disease if the posttest probability exceeds the cutoff value.

Therefore, according to one aspect of the present invention, a method is provided for constructing a multivariate predictive model for diagnosing a disease for which a plurality of test methods are individually inadequate, wherein the method includes the following steps:

(a) performing a panel of laboratory tests for diagnosing said disease on a test population including a statistically significant sample of individuals with at least one objective sign of disease and a statistically significant control sample of healthy individuals and persons with cross-reacting medical conditions;

(b) generating a score function from a linear combination of the test panel results, wherein the linear combination is expressed as $\beta^{T}Y$, wherein D is the disease; $Y_{1}, \ldots, Y_{k}$ is a set of K diagnostic tests for D; Y is a vector of diagnostic test results $\{Y_{1}, \ldots, Y_{k}\}$; D' = not D; $\beta$ is a vector of coefficients $\{\beta_{1}, \ldots, \beta_{k}\}$ for Y; and $\beta^{T}$ is the transpose of $\beta$;

(c) performing a receiver operating characteristic (ROC) regression or alternative regression technique of the score function, wherein the test panel is selected and .beta. coefficients are calculated simultaneously to maximize the area under the curve (AUC) of the empiric ROC as approximated by -- see Original Patent.

(d) calculating for each individual the pretest odds of disease; generating a diagnostic likelihood ratio of disease by determining the frequency of each individual's test score in said diseased population relative to said control population; and multiplying the pretest odds by the diagnostic likelihood ratio to determine the post-test odds of disease for each individual;

(e) converting a set of posttest odds into posttest probabilities and creating an ROC curve by altering the posttest probability cutoff value;

(f) comparing the ROC areas generated by one or more regression techniques to determine an optimal methodology comprising the tests to be included in an optimum test panel and the weight to be assigned each test score alone or in combination;

(g) dichotomizing the optimal methodology by finding that point on the final ROC graph tangent to a line with a slope of (1-p)C/pB, where p is the population prevalence of disease, B is the regret associated with failing to treat patients with disease and C is the regret associated with treating a patient without disease, thereby generating a posttest probability cutoff value; and

(h) displaying the optimum test panel for disease diagnosis, the weight each individual test score is to be assigned alone or in combination, and the cutoff value against which positive or negative diagnoses are to be made.
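Step (g) can be sketched in Python as follows. The sketch relies on a standard geometric fact: on a concave ROC curve, the point touched by a tangent line of slope m is the point that maximizes TPR - m * FPR. The function names, the list-of-points representation of the ROC curve, and the numbers are assumptions made for illustration, not details from the source.

```python
def regret_slope(p, B, C):
    """Slope (1 - p) * C / (p * B) from step (g).

    p: population prevalence of disease
    B: regret of failing to treat a patient with disease
    C: regret of treating a patient without disease
    """
    return (1 - p) * C / (p * B)


def tangent_point(roc_points, slope):
    """Return the (fpr, tpr) point touched by a line of the given slope.

    Equivalent to maximizing tpr - slope * fpr over the empirical ROC points.
    """
    return max(roc_points, key=lambda pt: pt[1] - slope * pt[0])


# Hypothetical ROC points (false-positive rate, true-positive rate).
roc = [(0.0, 0.0), (0.05, 0.60), (0.20, 0.80), (1.0, 1.0)]

m = regret_slope(p=0.2, B=4.0, C=1.0)  # 0.8 * 1 / (0.2 * 4) = 1.0
best = tangent_point(roc, m)
print(f"slope={m:.2f}, operating point={best}")
```

The posttest probability cutoff associated with the chosen operating point then serves as the dichotomizing threshold. Raising B relative to C flattens the slope and moves the operating point toward higher sensitivity.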

When $t_{0}$ is the maximum false-positive rate desired by a physician interpreting the tests and is a multiple of $1/n^{H}$, then the $\beta$ coefficients and test panel are chosen simultaneously through partial ROC regression in order to generate the largest area below the partial ROC curve for the $(1-t_{0})$ quantile of individuals without D, where $\beta^{T} Y_{j} > c$ and $S_{H}(c) = t_{0}$ (the survival function of patients without disease with a score of c). When several predictive models are under consideration, their partial AUC for the $(1-t_{0})$ quantile are compared with that produced by partial ROC regression in order to determine the optimal technique (Dodd and Pepe, 2003).
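A minimal Python sketch of an empirical partial AUC is given below, under stated assumptions: the scores are the values of the linear score function for each individual, the partial area is taken over false-positive rates from 0 to t0, t0 times the number of controls is a whole number, and ties count one half. The function name and data are hypothetical. Each diseased score is compared only with the controls whose scores fall in the top t0 fraction, so the maximum attainable value is t0.

```python
def partial_auc(diseased_scores, control_scores, t0):
    """Empirical partial AUC for false-positive rates in [0, t0].

    Counts (diseased, control) pairs in which the diseased score is higher,
    restricted to the controls with the highest t0 fraction of scores,
    and divides by the total number of pairs.
    """
    n_h = len(control_scores)
    k = round(t0 * n_h)  # number of controls above the cutoff c
    top_controls = sorted(control_scores, reverse=True)[:k]
    wins = 0.0
    for d in diseased_scores:
        for h in top_controls:
            if d > h:
                wins += 1.0
            elif d == h:
                wins += 0.5  # tie handling is an assumption of this sketch
    return wins / (len(diseased_scores) * n_h)


# Hypothetical scores for three patients with disease and four controls.
diseased = [3.0, 4.0, 5.0]
controls = [0.0, 1.0, 2.0, 3.5]
print(partial_auc(diseased, controls, t0=0.5))  # at most 0.5
```

Competing predictive models can be compared by computing this quantity for each model's scores at the same t0.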

Methods according to the present invention further include the steps of testing individual patient serum samples using the optimum methodology; reporting the diagnostic result to each patient's physician; and treating those patients whose posttest probability exceeds the cutoff value for disease D. When the posttest probability falls below the cutoff value, but the illness is of less than 2 weeks' duration, the test should be repeated in 14 days in order to look for seroconversion.
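The interpretive rule in the preceding paragraph can be written as a short Python sketch. The function name, the returned strings, and the handling of a posttest probability exactly equal to the cutoff (treated here as not exceeding it) are assumptions for illustration only.

```python
def interpret_result(posttest_probability, cutoff, illness_days):
    """Apply the reporting rule: treat above the cutoff, retest early negatives.

    posttest_probability: posttest probability of disease (0 to 1)
    cutoff: posttest probability cutoff value
    illness_days: duration of illness in days at the time of testing
    """
    if posttest_probability > cutoff:
        return "positive: treat"
    if illness_days < 14:
        # Early illness: antibody may not have developed yet.
        return "negative: repeat test in 14 days"
    return "negative"


print(interpret_result(0.85, 0.50, illness_days=30))
print(interpret_result(0.20, 0.50, illness_days=7))
print(interpret_result(0.20, 0.50, illness_days=30))
```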

Pretest risk can be determined using an individual's clinical signs and symptoms. In the event that there is insufficient data to determine the pretest risk that a patient has Lyme disease, then the laboratory may report the likelihood ratio for that patient's test results directly to the physician, as well as the cutoff value to distinguish positive from negative results. A diagnostic cutoff can be determined by observing the likelihood ratio which results in 99% specificity in a control population of patients.
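One way to derive a likelihood-ratio cutoff from a control population is sketched below in Python. It assumes a result is called positive only when its likelihood ratio strictly exceeds the cutoff, and it picks the smallest observed control value that keeps the requested fraction of controls negative. The function name and the synthetic control values are hypothetical.

```python
import math


def cutoff_for_specificity(control_values, specificity=0.99):
    """Smallest observed cutoff giving at least the requested specificity.

    A result is positive when its value is strictly greater than the cutoff,
    so at least ceil(specificity * n) controls must be at or below it.
    """
    ordered = sorted(control_values)
    n = len(ordered)
    # Small tolerance guards against floating-point error in specificity * n.
    must_be_negative = math.ceil(specificity * n - 1e-9)
    return ordered[must_be_negative - 1]


# Synthetic likelihood ratios for 200 control patients (illustrative only).
controls = [i / 10 for i in range(1, 201)]
cutoff = cutoff_for_specificity(controls, 0.99)
false_positives = sum(1 for v in controls if v > cutoff)
print(cutoff, 1 - false_positives / len(controls))
```

In a sample this small the achieved specificity is only an estimate, so a larger control population would normally be needed to pin down a 99% threshold with confidence.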

The present invention has also identified significant roles for pepC10 IgM, VlsE1 IgG and C6 IgG antibodies in the diagnosis of LD, in combination with one another or with different antibodies. The present invention therefore also includes a test panel comprising a plurality of antibody tests, and kits and methods for the detection of LD, including one or more of these additional antibodies.

A computer-based method is also provided for diagnosing a disease for which a plurality of test methods are individually inadequate, which method includes the steps of combining weighted scores from a panel of laboratory test results chosen through the multivariate techniques described above, comparing the combined weighted results to a cutoff value, and diagnosing and treating a patient for disease D based on exceeding the cutoff level. The disease D can be Lyme disease. Computer-based methods include methods evaluating results from a test panel including at least one antigen test selected from VlsE1 IgG, C6 IgG, and pepC10 IgM antigen tests.

The inventive method reduces error because specificity requirements are satisfied; this requirement is particularly important for LD because of overdiagnosis and overtreatment for false-positive results. When the disease is LD, the tests chosen by the proposed method may be employed by the algorithm described in U.S. Pat. No. 6,665,652 after being dichotomized. Alternatively, these tests can be directly utilized by new methodologies for LD prediction.

Alternative multivariate methods, including but not limited to logistic regression, log-likelihood regression, linear regression, and discriminant analysis, can learn which features are optimal from ROC regression methods. The learning process is particularly valuable for diseases where high specificity is needed. These alternative methods cannot focus their regression methodology on a portion of the ROC curve. By learning the optimal test choices, they can rerun the regression analysis using these specific variables, thus maximizing their predictive power.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

The LD field is limited by the lack of a theoretical basis for test strategy. Signal detection theory provides a theoretical basis to create rules to both include and weight the contribution of different tests. The likelihood ratio for a given set of test results is the probability that those results will be seen in patients with disease, divided by the probability that the same set of results will be seen in patients without disease. The Neyman-Pearson lemma (1933) states that the algorithm that produces the highest likelihood ratio for a given specificity is the optimal interpretive algorithm. This mathematical statement leads us to search for methods that will maximize the diagnostic likelihood ratio derived from a given set of tests.
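The definition of the likelihood ratio can be made concrete with a small Python sketch that estimates it from observed frequencies. The function name, the representation of a result pattern as a tuple of "+" and "-" marks, and the counts are all hypothetical.

```python
from collections import Counter


def likelihood_ratio(pattern, diseased_patterns, control_patterns):
    """Empirical likelihood ratio for one pattern of test results.

    P(pattern | disease) / P(pattern | no disease), each estimated as the
    fraction of individuals in that group who show the pattern.
    """
    p_disease = Counter(diseased_patterns)[pattern] / len(diseased_patterns)
    p_control = Counter(control_patterns)[pattern] / len(control_patterns)
    if p_control == 0:
        # Pattern never seen in controls: the ratio is unbounded or undefined.
        return float("inf") if p_disease > 0 else float("nan")
    return p_disease / p_control


# Hypothetical two-test panel results for 10 patients and 20 controls.
diseased = [("+", "+")] * 6 + [("+", "-")] * 3 + [("-", "-")] * 1
controls = [("+", "+")] * 1 + [("+", "-")] * 4 + [("-", "-")] * 15

for pattern in [("+", "+"), ("+", "-"), ("-", "-")]:
    print(pattern, round(likelihood_ratio(pattern, diseased, controls), 3))
```

In this made-up sample, a doubly positive result is twelve times as frequent among patients as among controls, while a doubly negative result has a likelihood ratio well below 1.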

ROC regression methods are the optimal methods to maximize likelihood ratios. (Pepe, The Statistical Evaluation of Medical Tests for Classification and Prediction, (First Edition ed. Oxford, U.K., Oxford University Press, 2003); Ma and Huang, Regularized ROC Estimation: With Applications to Disease Classification Using Microarray, (University of Iowa, Department of Statistics and Actuarial Science, Technical Report No. 345, 2005)). ROC curves are generated by varying the score cutoff values generated using a specific algorithm for a given set of tests. Sensitivity and specificity results follow from producing such cutoff values. An ROC curve quantifies the trade-off between sensitivity and specificity. It is not well known that the derivative of the ROC curve at any given specificity level is the likelihood ratio for that test cutoff value. Therefore ROC curves are, in essence, reflections of the likelihood ratio associated with a given set of test results. ROC regression methods attempt to maximize the ROC curve at each point (maximizing the likelihood ratio for each test cutoff value). Therefore ROC techniques are able to produce the optimal rules for any given set of test results.
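The link between the slope of the ROC curve and the likelihood ratio can be shown in a few lines. The derivation below assumes a continuous score with densities $f_{D}$ and $f_{H}$ in patients with and without disease; those density symbols and $S_{D}$ are introduced here for illustration, while $S_{H}$ follows the survival-function notation used earlier.

```latex
% S_D(c) = P(score > c | D),  S_H(c) = P(score > c | D')
% t = S_H(c) is the false-positive rate at cutoff c.
\begin{align*}
\mathrm{ROC}(t) &= S_{D}\!\left(S_{H}^{-1}(t)\right), \qquad c = S_{H}^{-1}(t) \\
\frac{d\,\mathrm{ROC}(t)}{dt}
  &= S_{D}'(c)\,\frac{dc}{dt}
   = \bigl(-f_{D}(c)\bigr)\cdot\frac{1}{-f_{H}(c)}
   = \frac{f_{D}(c)}{f_{H}(c)}
\end{align*}
```

The final ratio is the likelihood ratio of a score equal to the cutoff c, so a steeper ROC curve at a given specificity corresponds to a larger likelihood ratio at that cutoff.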

Regression techniques are approximations; for ROC regression, the approximation is to the empiric ROC curve. The empiric AUC (area under the curve) represents the optimal solution for a given set of tests. For large studies using multiple test results and covariates, the solution to the empiric ROC requires near impossible calculation power. Therefore approximation methods are needed. (Ma and Huang, 2005; McIntosh and Pepe, Biometrics, 58, 657-664 (2002)). One of the best methods is the sigmoid function approximation to the empiric ROC curve (Ma and Huang, 2005). Partial ROC regression maximizes the ROC curve within clinically acceptable limits of specificity (usually 95% to 100%).
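The idea of a sigmoid approximation can be sketched in Python. The empiric AUC counts the fraction of (diseased, control) pairs in which the diseased score is higher, which is a step function of the coefficients and therefore hard to optimize; replacing the step with a sigmoid gives a smooth surrogate. The scale parameter sigma, the function names, and the data below are assumptions for illustration and are not taken from Ma and Huang or from the source.

```python
import math


def score(beta, y):
    """Linear score: the dot product of coefficients and test results."""
    return sum(b * v for b, v in zip(beta, y))


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def empirical_auc(beta, diseased, controls):
    """Fraction of (diseased, control) pairs ranked correctly by the score."""
    pairs = [(score(beta, d), score(beta, h)) for d in diseased for h in controls]
    return sum(1.0 for sd, sh in pairs if sd > sh) / len(pairs)


def sigmoid_auc(beta, diseased, controls, sigma=0.1):
    """Smooth surrogate: the indicator I(sd > sh) becomes sigmoid((sd - sh) / sigma)."""
    pairs = [(score(beta, d), score(beta, h)) for d in diseased for h in controls]
    return sum(sigmoid((sd - sh) / sigma) for sd, sh in pairs) / len(pairs)


# Hypothetical two-test results for patients with disease and controls.
diseased = [(2.0, 1.0), (3.0, 0.0)]
controls = [(0.0, 0.0), (1.0, 1.0)]
beta = (1.0, 0.5)
print(empirical_auc(beta, diseased, controls))
print(sigmoid_auc(beta, diseased, controls))
```

Because the surrogate is differentiable in beta, it can be handed to a gradient-based optimizer; smaller values of sigma track the empiric AUC more closely at the cost of a less smooth objective.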

While logistic regression can attempt to approximate the empiric ROC curve over the entire ROC space, only partial ROC regression is able to maximize a portion of the curve; the clinical impact of this nuance is that partial ROC regression using a sigmoid function is better at choosing tests that produce high levels of specificity, while maintaining sensitivity. Penalized likelihood functions may also be employed using the LASSO technique with an $L_{1}$ penalty to choose the best tests among highly correlated methods. (Kim and Kim, "Gradient LASSO for feature selection," Proceedings of the 21st International Conference on Machine Learning, Banff, Canada, (2004)). By optimizing the number of tests, the specific tests chosen, and the rules used to combine those tests, it is possible to maximize the likelihood ratio at each point of the partial ROC curve.

Logistic regression using a log-likelihood method provides a good approximation to the empiric ROC curve, though imperfect in areas requiring high specificity (McIntosh and Pepe, 2002); good agreement has been demonstrated between log-likelihood and ROC methods for the CDC dataset (Bacon et al. 2003) used to confirm the inventive methodology. Regardless of the value of logistic methods using small sample sizes, picking the correct variables for evaluation of large samples is critical for performance reasons and cost (Pepe and Thompson, Biostatist., 1(2), 123-140 (2000)).

Partial ROC regression is theoretically superior to logistic regression because of its inherent ability to maximize a portion of the ROC curve. Because logistic regression methods are computationally easier and because of the need to compare multiple predictive models, logistic methods were chosen for the remainder of our analyses. (McIntosh and Pepe 2002). However, the above theoretical reasons predict that for some data sets, ROC regression will produce superior results, either by picking better tests or by using more efficient rules to maximize the critical portion of the ROC curve.

It is not sufficient to choose other regression methods that might produce results superior to current two-step techniques. Rather the ability to choose the best antigens is key, both from a therapeutic and cost perspective. The present invention helps other regression methods learn the correct antigens to use to achieve specificity and sensitivity goals, allowing them to recalibrate more accurately. Both because of theoretically superior overall performance and the ability to improve other techniques, partial ROC regression using a sigmoid approximation and penalized likelihood functions is an optimal means to both choose tests and produce optimal rules to combine tests. Techniques like logistic regression can utilize those features (variables) selected by partial ROC methods to optimize their selection of beta coefficients, thereby enhancing their predictive power.

Rules based on likelihood ratios produce outputs that can be easily combined with pretest probability results through Bayes' Theorem. By multiplying the pretest odds by the likelihood ratio, one generates the posttest odds, specific to that patient and their test results. The present invention uses an algorithm to determine the pretest probability of disease based on the signs and symptoms of disease. The method described in U.S. Pat. No. 6,665,652 and a new literature review helped formulate the estimates in Table 1 (see Original Patent). For example, the pretest probabilities listed in Table 1 can be used to optimize prediction of LD. Similar pretest probabilities and algorithms can be generated for other diseases without risky experimentation.
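The odds form of Bayes' Theorem described above reduces to three arithmetic steps, sketched here in Python. The function name and the example pretest probability and likelihood ratio are illustrative values, not figures from Table 1 of the original patent.

```python
def posttest_probability(pretest_probability, likelihood_ratio):
    """Combine a pretest probability with a likelihood ratio via Bayes' Theorem.

    1. Convert the pretest probability to odds.
    2. Multiply the pretest odds by the likelihood ratio.
    3. Convert the posttest odds back to a probability.
    """
    if not 0.0 < pretest_probability < 1.0:
        raise ValueError("pretest probability must lie strictly between 0 and 1")
    pretest_odds = pretest_probability / (1.0 - pretest_probability)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1.0 + posttest_odds)


# Hypothetical patient: 20% pretest probability, likelihood ratio of 12.
print(posttest_probability(0.20, 12.0))  # odds 0.25 -> 3.0 -> probability 0.75
```

A likelihood ratio of 1 leaves the pretest probability unchanged, which is a quick sanity check for any implementation.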

Although it is possible to use a likelihood ratio alone to categorize patients as having disease or not, combining clinical and laboratory results has demonstrated even more impressive performance relative to the CDC's 2-tier method. All tests seem to benefit from including information about the pretest risk of infection, but ROC and logistic regression seem to produce the best overall results when combined with pretest risk assessment.

The multivariate method of the present invention is used to select the optimum test panel for disease diagnosis, weight those results to maximize sensitivity and specificity, and ultimately choose a cutoff value for the posttest probability of disease that minimizes the regret associated with false positive and false negative test results. Component laboratory tests identified by the predictive model as critical for diagnosis can be manufactured in the form of test kits with the test panel components incorporated into a microtiter plate to be analyzed by a commercial laboratory.

The laboratory will utilize reading equipment and software provided by the present invention to collect and interpret test data, generating a likelihood ratio for each patient. According to one embodiment of the present invention, the commercial laboratory will electronically transfer each patient's likelihood ratio to their physician's office, to be received by software provided by the present invention for a computer or personal digital assistant. The physician will then evaluate each patient's individual signs and symptoms through a clinical algorithm on the office software to determine the pretest probability of disease. Should there be insufficient information to generate such a score, then the physician may choose to accept the laboratory-derived likelihood ratio for that patient and cutoff value as the final report.

The physician's software will combine the patient's likelihood ratio with the pretest probability of Lyme disease as determined by the physician, generating a posttest probability of Lyme disease. The physician's software will generate a report, including the above results and an interpretation of posttest probability of disease as it relates to the cutoff level we provide. Test results exceeding the cutoff level will help determine whether the patient requires additional tests or treatment for Lyme disease.

The test kit containing the component tests and interpretive clinical and laboratory software, plus the test kit reader, will be marketed as a single test to be FDA approved.

The present invention thus also provides diagnostic software containing code embodying a computer-based method for scoring results from the optimum test panels according to the weights assigned each test or combination thereof and comparing the results against the assigned cutoff value to render a positive or negative diagnosis. Optimum test panel kits are also provided, including kits in which the diagnostic software is included. Methods for diagnosing disease with the test panels and software are also provided.

The multivariate method of the present invention is performed as a computer-based method. The input, processor and output hardware and software other than that expressly described herein are essentially conventional to one of ordinary skill in the art and require no further description. The input, processor and output hardware employed by computer-based methods for diagnosing disease constructed from information derived by the multivariate method of the present invention are also essentially conventional to one of ordinary skill in the art and require no further description.

The foregoing principles are illustrated in the following example in the context of LD; however, it should be understood that the inventive method can also be applied to other diseases for which there exist multiple diagnostic tests, such as connective tissue diseases, Rocky Mountain Spotted Fever, Babesia microti, and Anaplasma granulocytophilia. Diagnostic testing panels can be developed for each of the foregoing against a test population according to the methods described herein, incorporating pretest clinical information to select the optimum test panel for disease diagnosis, the weight to assign each test or combination thereof, and cutoff values that minimize regret associated with false positive and false negative results. For example, the inventive method can be applied to a diagnostic test panel for the diagnosis of Lupus erythematosus, and the ARA diagnostic criteria for Lupus erythematosus can be used to determine the pretest probability of disease.


Claim 1 of 15 Claims

1. A method for constructing a multivariate predictive model for diagnosing a disease for which a plurality of test methods are individually inadequate, said method comprising:

(a) performing a panel of laboratory tests for diagnosing said disease on a test population comprising a statistically significant sample of individuals with at least one objective sign of disease and a statistically significant control sample of healthy individuals or persons with cross-reacting medical conditions;

(b) generating, by a computer, a score function from a linear combination of said test panel results, said linear combination expressed as $\beta^{T}Y$, wherein D is the disease; $Y_1, \ldots, Y_k$ is a set of K diagnostic tests for D; Y is a vector of diagnostic test results $\{Y_1, \ldots, Y_k\}$; D' = not D; $\beta$ is a vector of coefficients $\{\beta_1, \ldots, \beta_k\}$ for Y; and $\beta^{T}$ is the transpose of $\beta$;

(c) performing, by the computer, a receiver operating characteristic (ROC) regression or alternative regression technique of the score function, wherein the test panel is selected and $\beta$ coefficients are calculated simultaneously to maximize the area under the curve (AUC) of the empiric ROC as approximated by -- see Original Patent;

(d) calculating, by the computer, for each individual the pre-test odds of disease; generating a diagnostic likelihood ratio of disease by determining the frequency of each individual's test score in said diseased population relative to said control population; and multiplying said pretest odds by said likelihood ratio to determine the post-test odds of disease for each individual;

(e) converting, by the computer, a set of posttest odds into posttest probabilities for each methodology and creating an ROC curve for each methodology by altering its respective post-test probability cutoff value;

(f) comparing, by the computer, the ROC areas generated by one or more regression techniques to determine an optimal methodology, comprising the tests to be included in an optimum test panel and the weight to be assigned each test score alone or in combination;

(g) dichotomizing, by the computer, the optimal methodology by finding that point on the final ROC graph tangent to a line with a slope of $(1-p)C/(pB)$, where p is the population prevalence of disease, B is the regret associated with failing to treat patients with disease and C is the regret associated with treating a patient without disease; thereby generating a posttest probability cutoff value; and

(h) displaying, by the computer, the optimum test panel for disease diagnosis, the weight each individual test score is to be assigned alone or in combination, and the cutoff value against which positive or negative diagnoses are to be made, wherein said disease is Lyme Disease.
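Step (g) of the claim can be illustrated with a short sketch. It reads the slope as (1-p)C divided by the product pB, and it uses the standard observation that the point on a concave ROC curve tangent to a line of slope m is the point that maximizes TPR - m * FPR. The function names and the ROC points in the usage note are hypothetical; this is not the patent's implementation.

```python
from typing import List, Tuple


def tangent_slope(prevalence: float, regret_untreated: float, regret_overtreated: float) -> float:
    """Slope (1 - p) * C / (p * B), with B and C as defined in step (g)."""
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    if regret_untreated <= 0.0:
        raise ValueError("regret B must be positive")
    return (1.0 - prevalence) * regret_overtreated / (prevalence * regret_untreated)


def select_cutoff(roc_points: List[Tuple[float, float, float]], slope: float) -> float:
    """Pick the cutoff whose ROC point maximizes TPR - slope * FPR.

    Each entry of roc_points is (cutoff, false_positive_rate, true_positive_rate).
    """
    if not roc_points:
        raise ValueError("at least one ROC point is required")
    best = max(roc_points, key=lambda point: point[2] - slope * point[1])
    return best[0]
```

As a usage example, a hypothetical prevalence of 0.2 with B = 4 and C = 1 gives a slope of 1. Applied to the illustrative points `[(0.9, 0.0, 0.4), (0.5, 0.1, 0.8), (0.2, 0.4, 0.95), (0.0, 1.0, 1.0)]`, the selected posttest probability cutoff is 0.5.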
 

 

____________________________________________
If you want to learn more about this patent, please go directly to the U.S. Patent and Trademark Office Web site to access the full patent.
 

 

     