Curiously, Dr. Bauchner’s resistance to ensuring accuracy of the data contradicts his own comments on editorial policy at JAMA regarding statistical analysis (6), published in July 2013, just 4 months prior to publication of the article by Vigen et al: “If concerns are raised in the process of editorial evaluation, we reserve the right to request the entire data set from authors to conduct our own statistical analysis…As with all manuscripts, the first priority in decisions about publication will always be the integrity of the research.” Although we note that Dr. Bauchner’s comments here pertained to the evaluation of manuscripts prior to publication, if the “integrity of the research” is “the first priority,” we see no reason why this should not equally apply to manuscripts post-publication when major errors are revealed.
Moreover, the authors were allowed by JAMA in their reply to letters to mischaracterize this series of errors as “an incorrect notation” in their original work. This is a gross distortion that minimizes the seriousness of the errors. An incorrect notation suggests a typographical error, or a single numerical error in the publication. In fact it was a series of errors, involving data entry, data review and oversight, data analysis, and publication. This mischaracterization of serious and multiple errors by JAMA is deceitful, and an ethical violation, falling well below the standard of transparency, clarity, and corrective actions for a journal that aims for “the highest standards of editorial integrity.”
These issues are more than sufficient for us to ask the Oversight Committee to investigate the actions of JAMA’s editorial board, and to support our call for retraction of this article. However, there have additionally been a series of troubling events which also raise serious ethical and editorial concerns which we wish to bring to your attention. These are as follows:
1. JAMA knew, or should have known, that the statistical methodology used by the authors has never been validated. The key citation in Methods (Reference #23 in Vigen et al), by Xu et al (7), was co-authored by senior author Michael Ho. That study explored the use of inverse propensity weighting when applied to Kaplan-Meier methods in time-varying treatment, which is the same methodology applied in Vigen et al. The paper by Xu et al, so central to the study by Vigen et al and published only one year earlier (2012), concluded: “Clearly, assessing and confirming adequate covariate balance in IPTW time-varying models is challenging and needs further study... Further work with simulations and contrasts to other methods and other study applications would help elucidate the advantages and disadvantages of this approach.” At the time of publication of Vigen et al, none of the 5 other citations of Xu et al had used the methodology to report results. This study thus relies on a new, inadequately studied statistical approach that has not yet been accepted by the statistical community, and must therefore be considered unvalidated. It is disturbing that the reviewers and JAMA’s editorial board would allow this to go unchallenged.
2. We wonder about the review process itself, given this and other previously stated concerns. Were the reviewers friends of the authors? We believe it would be worthwhile interviewing them to determine why they failed to note the misreporting of results, the improper exclusion of men who were treated with T prescriptions after an adverse event, and why they believed this unvetted methodology was valid.
3. The authors elected not to respond to all questions raised in letters, citing lack of space (see reply by authors to letters, attached). These were letters selected by JAMA’s editors, presumably because the editor(s) believed the concerns within those letters merited a response. In our own case, the editors substantially edited our letter, removing certain questions and editing others. It is troubling that JAMA would allow the authors to avoid responding to questions that the editors themselves had deemed worthy. This gives the impression the authors had something to hide. JAMA failed in its editorial duty to ensure that the authors responded to all questions. The end result is a lack of transparency, compromise of the scientific process, and the appearance of allowing the authors to hide behind JAMA’s skirt.
4. The authors failed to provide an adequate explanation for the improper exclusion of men who received T prescriptions after suffering an MI or stroke. The FDA agreed this was improper, writing, “Finally, the exclusion of 128 patients who experienced MI or stroke before initiating testosterone was not appropriate. These patients should have been included in the analysis…” (see attached analysis by FDA).
5. We find it curious, and troubling, that the lead author (Rebecca Vigen) did not participate in the response to letters.
6. The authors performed a post-hoc analysis of their data to include the 128 men who had been improperly excluded. Even though 128 events should have been added to the untreated group with the lower rate of events, the authors claimed this resulted in a stronger hazard ratio (1.30 instead of the original 1.29), with a stronger 95% CI (1.06-1.60 instead of the original 1.04-1.58). Either the authors have made a mathematical error, or their model makes no sense.
7. The results are unverifiable. The authors have not published their statistical programming for their analysis on SAS or made it available for inspection anywhere else. Even if an outside party were given the original dataset, it would be impossible to reproduce the results. This violates a basic tenet of the publication of scientific research, namely that enough information regarding methods should be provided so that others may attempt to replicate and thereby verify the results. The inability to examine the methods means also that any inadvertent errors in programming would escape detection.
8. Over 50 variables were used for adjustment, some of which have never been associated with risk for CV events, eg, transesophageal echocardiogram and transthoracic echocardiogram. The authors also failed to include obvious variables that have been shown to be associated with CV events, such as number of medications, and baseline serum testosterone, which differed significantly between groups, and could well have influenced results. Any reviewer familiar with testosterone literature would be aware of the known inverse relationship between T concentrations and mortality.
9. Figure 1 (study cohort) has now been the subject of a published correction but remains available for view with the incorrect values for several items. Much of this figure is inaccurate (see annotated PDF of revised article, attached), even beyond the acknowledged errors. For example, the initial cohort lists 23,173 “men” when it has now been revealed that this value also includes women. A critical, unanswered question is how many women were inadvertently included in the final study population of 8709 individuals. The accuracy of other values for excluded individuals is also highly questionable. For example, 12 men were excluded for having a PSA value >4 ng/mL, representing 0.05% of the initial study pool (N=23,173). It is not believable that only 5 men per 10,000 had an elevated PSA. In contrast, this value was 15% in the PCPT trial (8), a 300-fold difference.
10. Figure 2 (Kaplan-Meier curves) is misleading and nearly impossible to understand. This is the heart of the study, as the only statistically significant difference between groups was derived from these curves. There is no indication that these curves represent adjusted values, let alone for >50 variables. It is also misleading because at what appears to be the end of the study (2000 days) the total percent with events in the T group is presented as greater than 30% (<70% survival) when the actual percentage of events in this group was only 10.1%. It is impossible to understand because the authors indicate for the first time in their reply to letters that time zero for the T group began with filling of their first T prescription, which occurred at a median of 531 days following coronary angiography, whereas time zero for the untreated men was coronary angiography. Is this reflected in the X-axis? Should the legend for the X-axis of “days” be changed to “day-equivalents” or similar? See annotated copy of article (attached) for clarification.
11. The primary results of the revised version of the study still appear likely to be incorrect or inaccurate. The revised language in the abstract reads: “At 3 years after coronary angiography, the Kaplan-Meier estimated cumulative percentages with events were 19.9% in the no testosterone therapy group vs 25.7% in the testosterone therapy group…” Although time zero for the untreated group began with the date of coronary angiography, time zero for the T group began a median of 531d (approx 1.5y) after coronary angiography. It continues to be unclear how this time difference impacted the reporting of results. Were the values for estimated cumulative events accurately described for the time period 3y after coronary angiography? At that time approximately half the T group would have been exposed to T for less than 1.5y. It is almost certainly incorrect to describe results as stated. Also, the revised abstract still refers to “absolute risk difference of 5.8%” even though the term “absolute risk” has been replaced in the reporting of results. This appears to be incorrect and misleading, based on the definition of “absolute risk difference” in JAMAEvidence, which refers to actual percentages of events between groups.
12. Finally, we note the additional criticism of the FDA: “It is also unclear why the authors excluded 1,301 participants for not having coronary anatomy data (CAD status), considering the wealth of baseline information collected on medical and drug history. It is unclear how this exclusion might also have affected the risk estimate.” We understand this criticism and others are most relevant for the authors. However, it is JAMA’s editorial board that selected this paper for publication, and subsequently promoted it despite these and other serious criticisms. JAMA must bear ultimate responsibility for its publication.
13. Note that we have posed several of these scientific questions via email to senior author, Michael Ho, who replied that these questions should be directed to JAMA.
This article is a mess, and JAMA has behaved badly. Something is terribly amiss when a premier medical journal publishes such an obviously weak study that contradicts well-established literature, and in so doing, fosters fear among the public. The concern is heightened when the journal’s response to inescapable evidence that the study is meritless is to deceive, distort, stonewall, and dig in. JAMA’s behavior suggests it is more interested in sensationalism and media coverage than scientific accuracy and integrity.
We recognize this is a long letter with a long list of concerns. It is unfortunate that so many troubling irregularities have occurred with publication of this article and with JAMA’s actions or inactions since that time. We note the following violations of JAMA’s critical objectives as noted at the outset of this letter:
Objective #1- JAMA has failed to maintain the “highest standards of editorial integrity.”
Objective #2- By no reasonable standard can this article be considered “valid.”
Objective #3- There can be no “responsible and balanced debate” when JAMA publishes misreported data using unvalidated statistical methodologies, and fails to note that the actual percentages of events in two groups were reversed by this methodology.
Objective #4- Publication of an erroneous study undermines the goal of helping physicians make “informed clinical decisions.” In this case, JAMA is responsible for a media firestorm that has affected clinical practice, all based on a meritless study.
Objective #8- JAMA has failed terribly in its goal of “promoting the integrity of science.”
Objective #10- JAMA has violated basic rules of “ethical medical journalism” and has published a study that cannot be considered “credible.”
JAMA has enormous responsibility here. It has created unfounded concerns among millions of men treated appropriately for testosterone deficiency, caused patients to question the care provided by their physicians, and prompted many men to discontinue treatment even among those who had clearly benefitted. It precipitated a safety bulletin by the FDA that will result in a public review on September 17, 2014. And it has birthed a new area of medical malpractice. Although Dr. Ho declined responsibility for any such effects by citing his call for further research at the end of the article, this stance is contradicted by the final sentence of the abstract, which clearly expresses confidence in the validity of the reported results and its utility in clinical decision-making: “These findings may inform the discussion about the potential risks of testosterone therapy.”
We note that the global community of testosterone experts has called for retraction of this study. Even the FDA has rejected the conclusions of the study, which is particularly noteworthy since, as a government agency dedicated to public health and safety, it routinely errs on the side of caution in announcing warnings about medication risks. In this case, the FDA concluded (see attached): “Given the described limitations of the study by Vigen et al. it is difficult to attribute the reported findings to testosterone treatment.”
We believe that with regard to this article JAMA has failed in its review process, its selection of this article for publication, its promotion of this article, its post-publication response to important revealed errors, and in its editorial ethics and integrity. We believe these failures have damaged public health, compromised the medical literature, injured patient-physician relationships, and furthered public suspicions regarding scientific research.
We hope you will give serious consideration to the issues we have raised, and we look forward to your response.
Abraham Morgentaler, MD (Chairman), on behalf of the Androgen Study Group
André Guay, MD
Mohit Khera, MD
Martin Miner, MD
Abdulmaged Traish, PhD
* Note about the Androgen Study Group. The Androgen Study Group is a multidisciplinary group with extensive clinical and research experience with T deficiency (hypogonadism) and its treatment, representing the disciplines of urology, endocrinology, family medicine, and basic science research. Our mission is to promote accurate reporting of results of testosterone research. Our members teach medical and doctoral students, residents, and fellows at the medical schools of Harvard, Tufts, Brown, Boston University, and Baylor College of Medicine; participate as faculty or course directors at more than two dozen CME events annually; and have served on national and international clinical guidelines and recommendations committees regarding T therapy. We are wholly dedicated to the science of testosterone in men, and to the wellbeing of our patients. The ASG receives no funding from any pharmaceutical company, and no member has any direct financial stake in any pharmaceutical company manufacturing or selling T products. However, several ASG members have received payments from pharmaceutical companies for consulting, participation in scientific advisory boards, research grants, and lecture honoraria.
1. Vigen R, O’Donnell CI, Baron AE, et al. Association of testosterone therapy with mortality, myocardial infarction, and stroke in men with low testosterone levels. JAMA. 2013;310(17):1829-1836.
2. Morgentaler A, Kacker R. Andrology: Testosterone and cardiovascular risk--deciphering the statistics. Nat Rev Urol. 2014 Mar;11(3):131-2.
3. Traish AM, Guay AT, Morgentaler A. Death by testosterone? We think not! J Sex Med. 2014 Mar;11(3):624-9.
4. Cappola AR. Testosterone therapy and risk of cardiovascular events in men. JAMA. 2013;310:1805-6.
5. Wager E, Barbour V, Yentis S, Kleinert S; COPE Council. Retractions: guidance from the Committee on Publication Ethics (COPE). Obes Rev. 2010 Jan;11(1):64-6.
6. Bauchner H. Editorial Policies for Clinical Trials and the Continued Changes in Medical Journalism. JAMA. 2013;310:149-150.
7. Xu S, Shetterly S, Powers D, Raebel MA, Tsai TT, Ho PM, Magid D. Extension of Kaplan-Meier methods in observational studies with time-varying treatment. Value Health. 2012 Jan;15(1):167-74.
8. Thompson IM, Pauler DK, Goodman PJ, Tangen CM, Lucia MS, Parnes HL, Minasian LM, Ford LG, Lippman SM, Crawford ED, Crowley JJ, Coltman CA Jr. Prevalence of prostate cancer among men with a prostate-specific