Plasma D-dimer data are often not normally distributed. In the research setting, such data are regarded as non-parametric and statistical analysis is often based on log-transformed data. In clinical pathology, results are not transformed but interpreted as they are.

This was a critical review of cross-sectional laboratory data. A total of ninety results (N = 30 per group) were selected in equal numbers from a pool of plasma D-dimer tests. The three groups were control, diabetes mellitus (DM), and diabetes plus cardiovascular disease (DM + CVD). Descriptive analysis and group comparison were performed on log-transformed and untransformed data.

ANOVA on untransformed data showed no significant difference between groups, but the log-normalized data achieved statistical significance (p < 0.03). Comparing the DM and DM + CVD groups, the mean value was higher in the DM group for untransformed data, but lower for the same data after log transformation.

There is a need to clarify the background statistics behind the reference values recommended in various quantitative kits, namely whether they are based on log-transformed or untransformed data. Either the researcher who transforms data or the clinician who does not transform results would need to review the correctness of employing the recommendations on the reagent kit to interpret results.

Clinical pathology, Data analysis, Log transformation, Non-parametric statistical methods, Translation of results

CVD: Cardiovascular Disease; DM: Diabetes Mellitus; DM + CVD: Diabetes with Cardiovascular Disease Co-Morbidity; DVT: Deep Vein Thrombosis; PE: Pulmonary Embolism

Plasma D-dimer is used as a screening marker for atherothrombosis [1,2]. It is a common clinical tool by which deep vein thrombosis (DVT) and pulmonary embolism (PE) may be excluded in a patient. Although D-dimer testing provides sufficient diagnostic accuracy, interpretation of the result and clinical judgement remain necessary [3].

Different methods exist for the determination of D-dimer, including automated vs. manual; qualitative, quantitative or semi-quantitative; and rapid or non-rapid [4,5]. In terms of principle, there are enzyme-linked immunosorbent assays as well as immunoturbidimetric and agglutination techniques. This assortment of methods constitutes a problem for the clinical utility of the plasma D-dimer test. For instance, while the qualitative techniques are less sensitive than quantitative ones and hardly discriminate a weakly positive case from normal [3], the results of quantitative assays vary between methods and are not transferable [6].

One problem associated with quantitative assays is that D-dimer is reportedly not normally distributed [7]. Data that are not normally distributed are usually analysed in one of two ways: either with non-parametric statistics or by transforming the data for parametric statistics. There are recommendations for log-normalization of data and subsequent back-transformation [8,9], which is now widely practised in research [7,10].

However, each approach has its own advantage and disadvantage. The advantage of the non-parametric approach is that the data are not altered, as medians and sums or signs of ranks are used rather than assuming a distribution. The disadvantage is that parametric identifiers such as standard deviation may not necessarily be useful interpreters for the raw data. Therefore, discussion of whether a patient falls inside or outside a normal range, which assumes a normal distribution in the population, is not usually valid in a non-parametric context. The closest non-parametric alternative would be to discuss whether a patient falls between certain percentiles [8].
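The percentile alternative mentioned above can be sketched as follows. This is a minimal illustration only: the values in `d_dimer` are hypothetical (not the study data), and the choice of the central 95% interval (2.5th to 97.5th percentiles) is an assumption made here for the example, not a recommendation from the cited sources.

```python
from statistics import median, quantiles

# Hypothetical, right-skewed D-dimer results in ug/L (NOT the study data).
d_dimer = [95, 110, 120, 130, 140, 150, 160, 175, 190, 200,
           215, 230, 250, 280, 320, 380, 450, 600, 900, 2500]

# n=40 yields 39 cut points at 2.5% steps; the first and last are the
# 2.5th and 97.5th percentiles. "inclusive" keeps them within the data range.
cuts = quantiles(d_dimer, n=40, method="inclusive")
lower, upper = cuts[0], cuts[-1]
mid = median(d_dimer)

print(f"median = {mid:.1f} ug/L")
print(f"central 95% interval = {lower:.1f} to {upper:.1f} ug/L")
```

No distribution is assumed here: the interval is read directly from the ordered raw values, so it stays on the same scale as the laboratory result.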

On the other hand, if the data are transformed to obtain a normal distribution, one can discuss how normal a patient is, because there would be an applicable standard deviation on which to base a reference range. However, it is noteworthy that the values (means and standard deviations) in such an instance have been transformed and therefore do not directly correspond to the raw data. That is, results from a transformed data analysis may not be directly applicable or interpretable in clinical diagnostic practice [11,12]. Indeed, there has been an outstanding recommendation that "careful consideration should be given to use of a log transformation at the protocol design stage" [9]. This brings to the fore the question: has this been the case with D-dimer testing in research and diagnostic pathology?
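To illustrate the point about transformed reference ranges, the sketch below derives a range as mean ± 1.96 SD on the natural-log scale and back-transforms the limits. The data are hypothetical, and both the function name `log_reference_range` and the 1.96 multiplier are assumptions chosen for this example.

```python
import math
from statistics import mean, stdev

def log_reference_range(values, z=1.96):
    """Return (lower, centre, upper) back-transformed from the ln scale."""
    logs = [math.log(v) for v in values]
    m, s = mean(logs), stdev(logs)
    return math.exp(m - z * s), math.exp(m), math.exp(m + z * s)

# Hypothetical, right-skewed D-dimer results in ug/L (NOT the study data).
raw = [100, 120, 150, 180, 200, 250, 300, 400, 600, 3000]

lower, centre, upper = log_reference_range(raw)

# The same rule applied to the untransformed values, for comparison.
raw_lower = mean(raw) - 1.96 * stdev(raw)
raw_upper = mean(raw) + 1.96 * stdev(raw)

print(f"log-based range: {lower:.0f} to {upper:.0f} (centre {centre:.0f})")
print(f"raw-based range: {raw_lower:.0f} to {raw_upper:.0f} (centre {mean(raw):.0f})")
```

The back-transformed limits are symmetric around the centre only as ratios, not as differences, and for these skewed values the raw-scale rule produces a negative lower limit. The two ranges therefore cannot be used interchangeably.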

This study attempts a critical evaluation of descriptive analysis of D-dimer data based on untransformed and log-transformed data. The importance is to reaffirm the need to improve plasma D-dimer testing techniques with a view to improving their clinical utility.

As part of a study on diabetes and its cardiovascular complications at Charles Sturt University, subjects were tested for plasma D-dimer amongst other parameters. The study was approved by the Ethics in Human Research Committee. Detailed information on the subjects' selection criteria has been previously published [7,13]. Plasma D-dimer was assayed using MiniQuant® D-dimer reagent kits (Biopool). The results were generated by the MiniQuant-1™ instrument (Trinity Biotech Ireland).

For the purpose of this critical review, a cross-section of N = 90 results was randomly selected from all three groups, with an equal number (30) per group. That is, 30 results each were randomly selected from the apparently healthy (Control), diabetes mellitus (DM) and diabetes with cardiovascular disease complications (DM + CVD) groups. Descriptive statistics were computed on transformed and untransformed data. Group comparison was by the ANOVA method. Statistical analysis for this report was performed with the Data Analysis ToolPak (Microsoft Excel).
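The group comparison described above can be sketched in a few lines. This is not the spreadsheet procedure used in the study: it is a hand-written one-way ANOVA F statistic, the three small groups are hypothetical, and the function name `one_way_anova_f` is invented for illustration. A p-value is not computed because the Python standard library has no F distribution.

```python
import math
from statistics import mean

def one_way_anova_f(groups):
    """F statistic: between-group mean square over within-group mean square."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

# Hypothetical, right-skewed results in ug/L (NOT the study data).
control = [100, 110, 120, 130, 150, 900]
dm = [200, 220, 250, 260, 300, 1800]
dm_cvd = [220, 250, 270, 300, 330, 1400]
raw_groups = [control, dm, dm_cvd]

# The same groups on the natural-log scale.
log_groups = [[math.log(x) for x in g] for g in raw_groups]

f_raw = one_way_anova_f(raw_groups)
f_log = one_way_anova_f(log_groups)

print(f"F on untransformed data:   {f_raw:.2f}")
print(f"F on log-transformed data: {f_log:.2f}")
```

With a single large value per group, the within-group variance on the raw scale swamps the differences between group means. The log scale compresses those values, so the same data give a larger F.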

In terms of scope, the limited scope and simple statistical analysis were chosen because, in clinical practice, clinicians interpret the D-dimer result without any transformation or logarithm. CVD was based on a clinical diagnosis of hypertension and/or any form of cardiovascular ill-health. Thus, all members of the DM + CVD group had a clinical diagnosis of both diabetes and a form of CVD. The focus of this critical review was not to confirm a diagnosis of pulmonary embolism, but to compare descriptive statistics from transformed vs. untransformed data. However, no participant selected in this review had CVD only.

Table 1 presents the raw (i.e. untransformed) data for the N = 90 results separated into the three groups. When data were analysed without log transformation, ANOVA showed no significant difference, but the DM group had the highest mean value (503 μg/L) for plasma D-dimer relative to the DM + CVD (465 μg/L) and control (210 μg/L) groups.

Table 1: Raw data.

Kurtosis was greater than three in all three groups (Table 2), which would justify consideration of log normalization for parametric data analysis. When data were converted to natural logs and re-analysed, and the mean values were back-transformed, DM + CVD showed the highest mean value (280 μg/L) relative to DM (271 μg/L). At this stage, data achieved normal distribution, with kurtosis < 1.6 in both groups (Table 3).
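The two steps described above, checking kurtosis and back-transforming the mean of the natural logs, can be sketched as follows. The data are hypothetical. The estimator in `sample_excess_kurtosis` is the bias-corrected sample excess kurtosis, which is assumed here to be comparable to what a spreadsheet's descriptive-statistics output reports. That assumption has not been checked against the study's software.

```python
import math
from statistics import mean, stdev

def sample_excess_kurtosis(values):
    """Bias-corrected sample excess kurtosis (0 for a normal distribution)."""
    n = len(values)
    m, s = mean(values), stdev(values)
    fourth = sum(((v - m) / s) ** 4 for v in values)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * fourth - correction

# Hypothetical, right-skewed D-dimer results in ug/L (NOT the study data).
raw = [100, 120, 150, 180, 200, 250, 300, 400, 600, 3000]
logged = [math.log(v) for v in raw]

kurt_raw = sample_excess_kurtosis(raw)
kurt_log = sample_excess_kurtosis(logged)

# Back-transforming the mean of the logs gives the geometric mean.
back_transformed_mean = math.exp(mean(logged))

print(f"kurtosis raw = {kurt_raw:.2f}, after ln = {kurt_log:.2f}")
print(f"raw mean = {mean(raw):.0f}, back-transformed mean = {back_transformed_mean:.0f}")
```

The back-transformed mean is a geometric mean, so it is a different summary from the arithmetic mean of the raw values and is smaller whenever the values differ.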

Table 2: Descriptive statistics of untransformed data.

Table 3: Descriptive statistics of transformed data.

Based on the laboratory observation data (Table 1), this report presents a classical situation where a mean value that is lower in the analysis of untransformed data can turn out to be higher if the same data are analysed as log-transformed and afterwards back-transformed. When means and standard deviations were determined on the raw data, the DM group had the highest mean value among the three groups, above that of the DM + CVD group. Meanwhile, the standard deviations were large, and each group showed kurtosis > 4 (Table 2). When the natural log values were used for the determination and back-transformed, the DM + CVD group, which originally had a lower mean value, presented the highest mean value compared to the DM and control groups. Further, statistical significance was achieved (Table 3).
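The reversal described above is easy to reproduce with a toy example. The two groups below are hypothetical and were constructed only to show the effect. They are not drawn from Table 1.

```python
from statistics import geometric_mean, mean

# Hypothetical groups in ug/L (NOT the study data).
# group_a has mostly low values plus one very high result;
# group_b has moderately raised values throughout.
group_a = [100, 100, 100, 1700]
group_b = [300, 400, 500, 600]

arith_a, arith_b = mean(group_a), mean(group_b)

# geometric_mean(x) equals exp(mean(ln x)), i.e. the back-transformed log mean.
geo_a, geo_b = geometric_mean(group_a), geometric_mean(group_b)

print(f"arithmetic means:       A = {arith_a:.0f}, B = {arith_b:.0f}")
print(f"back-transformed means: A = {geo_a:.0f}, B = {geo_b:.0f}")
```

Group A has the higher arithmetic mean because a single extreme result dominates it, while group B has the higher back-transformed mean because the log scale down-weights that extreme result. Neither ordering is a computational error. The two summaries answer different questions.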

The implication is that, from a statistical perspective, there is justification for a clinician to limit clinical judgement and utilization of the diagnostic test to exclusion only. There is also justification for a researcher to perform log-transformation and normalization as per statistical standards. However, from the perspective of this review's objective, i.e. translating natural log statistics in research to clinical diagnostic practice [11,12], this is an affirmation that reports of comparative levels that are based on log-transformed data are not true representations of the original comparisons. The implication is that application of 'log transformation and back-transformation' is at least misleading and requires clarification, if not a review.

What this report contributes to the literature is evidence to reaffirm and update the need to address the non-transferability of some statistical methods from research to clinical practice. The observation raises an important question about the current quantitative D-dimer techniques, *viz:* on what statistical basis are the given normal values or reference ranges made in the procedures of the various kits?

The importance of answering this question for the clinical utility of the plasma D-dimer test cannot be overstated. If the values provided were statistically determined from untransformed data, which have no normal distribution, it follows that a reference range is not applicable. This is assumed in the current practice whereby improvement is being sought. It also implies that the practice of logarithmic transformation of D-dimer data may need to be reviewed, as the interpretation of results is not *comparing apples with apples* per se. Besides, it means that such research results cannot be applied to clinical practice.

On the other hand, if the values provided were statistically determined from log-transformed data, which have normal distribution, it also means that clinicians are not comparing like with like. This is basically because log-transformed and back-transformed data do not directly correspond to values on the same scale as the raw data [11,12]. As shown in Table 2 and Table 3, the DM group presented with the highest mean value in the untransformed data, but not after log transformation. In particular, the result further shows that the mean values and standard deviations were much smaller for transformed data relative to untransformed data (Table 1). Therefore, pathologists in clinical practice may be comparing higher 'raw' laboratory results with reference ranges that are based on much lower 'transformed' values.
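One way to see why back-transformed means are smaller than raw means is the standard inequality between the geometric and arithmetic means. The notation is introduced here for illustration only: \(x_1, \dots, x_n\) are the positive raw results in one group, \(\bar{x}\) is their arithmetic mean, and \(\bar{x}_{\mathrm{back}}\) is the back-transformed mean of their natural logs.

```latex
\bar{x}_{\mathrm{back}}
  = \exp\!\left(\frac{1}{n}\sum_{i=1}^{n}\ln x_i\right)
  = \left(\prod_{i=1}^{n} x_i\right)^{1/n}
  \le \frac{1}{n}\sum_{i=1}^{n} x_i
  = \bar{x}
```

Equality holds only when all \(x_i\) are identical, so for skewed data such as D-dimer the back-transformed mean is always below the raw mean, and the gap widens as the spread of the data increases.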

In a comparison study, Triage and Vidas D-dimer kits were reported to have comparable accuracy but different levels of sensitivity. The report recommended that lowering the reference range of the Triage method would improve sensitivity at the expense of specificity [14]. This is the crux of the matter: if a clinician employs this recommendation to interpret the results presented in Table 1, the groups' mean values are normal on the basis of log-transformed data, but very high on the basis of untransformed data.

Currently, there seems to be a move towards age-adjusted reference values [15]. This will potentially never eliminate the wrong diagnostic exclusion of DVT or PE. It has been noted that the interpretation of D-dimer, as well as the variability in its units of measurement, is complicated and unreasonable [16]. What this report adds to the discourse is that the statistical basis of recommended reference values needs to be identified and aligned with clinical practice.

There is a need to clarify the background statistics by which the reference values recommended in the various quantitative kits of plasma D-dimer are made. If recommendations are based on non-parametric analysis, then the practice of log-transforming D-dimer data in research needs to be reviewed. The statistics also need to be reviewed in order to obtain an applicable reference range. This need for a reference range is imperative to make sense of quantitative assays relative to semi-quantitative methods. If the recommendations are based on log-transformed data, then clinicians and pathology services need to be updated regarding ongoing interpretation of plasma D-dimer.

This study was funded by the Charles Sturt University.

- Ridker PM, Brown NJ, Vaughan DE, Harrison DG, Mehta JL (2004) Established and emerging plasma biomarkers in the prediction of first atherothrombotic events. Circulation 109: IV6-IV19.
- Spronk HM, van der Voort D, Ten Cate H (2004) Blood coagulation and the risk of atherothrombosis: A complex relationship. Thromb J 2: 12.
- Siragusa S (2006) D-dimer testing: Advantages and limitations in emergency medicine for managing acute venous thromboembolism. Intern Emerg Med 1: 59-66.
- Gosselin RC, Owings JT, Kehoe J, Anderson JT, Dwyre DM, et al. (2003) Comparison of six D-dimer methods in patients suspected of deep vein thrombosis. Blood Coagul Fibrinolysis 14: 545-550.
- Freyburger G, Trillaud H, Labrouche S, Gauthier P, Javorschi S, et al. (1998) D-dimer strategy in thrombosis exclusion--a gold standard study in 100 patients suspected of deep venous thrombosis or pulmonary embolism: 8 DD methods compared. Thromb Haemost 79: 32-37.
- Edlund B, Nilsson TK (2006) A proposed stoichiometrical calibration procedure to achieve transferability of D-dimer measurements and to characterize the performance of different methods. Clinical Biochemistry 39: 137-142.
- Nwose EU, Richards RS, Jelinek HF, Kerr PG (2007) D-dimer identifies stages in the progression of diabetes mellitus from family history of diabetes to cardiovascular complications. Pathology 39: 252-257.
- Cole TJ (2000) Sympercents: Symmetric percentage differences on the 100 log(e) scale simplify the presentation of log transformed data. Stat Med 19: 3109-3125.
- Keene ON (1995) The log transformation is special. Stat Med 14: 811-819.
- Salomaa V, Stinson V, Kark JD, Folsom AR, Davis CE, et al. (1995) Association of fibrinolytic parameters with early atherosclerosis. Circulation 91: 284-290.
- Feng C, Wang H, Lu N, Chen T, He H, et al. (2014) Log-transformation and its implications for data analysis. Shanghai Arch Psychiatry 26: 105-109.
- Feng C, Wang H, Lu N, Tu XM (2013) Log transformation: Application and interpretation in biomedical research. Stat Med 32: 230-239.
- Richards RS, Nwose EU (2010) Blood viscosity at different stages of diabetes pathogenesis. Br J Biomed Sci 67: 67-70.
- Ghys T, Achtergael W, Verschraegen I, Leus B, Jochmans K (2008) Diagnostic accuracy of the Triage® D-dimer test for exclusion of venous thromboembolism in outpatients. Thromb Res 121: 735-741.
- Nybo M, Hvas AM (2017) Age-adjusted D-dimer cut-off in the diagnostic strategy for deep vein thrombosis: A systematic review. Scand J Clin Lab Invest 77: 568-573.
- Koracevic GP (2012) Nine modalities to report d-dimer concentration: How many is too many? Am J Emerg Med 30: 1007-1008.