History of clinical trials

Before describing the importance of sex as a biological variable in clinical (and preclinical) research, it is worth first considering the history of clinical trials and the major developments that led to the randomised controlled trials (RCTs) that are the current standard practice.

This section covers early, more general work before focussing on the legislation surrounding the inclusion and/or exclusion of women, from approximately the 1950s onwards.

Overview (Pre-1950)

Up to the 16th Century

The first uncontrolled clinical trial was recorded, the first rules for testing novel compounds were documented, and the first uncontrolled clinical trial of a novel therapy was performed.

Main developments in: 6th century BCE, 1025 CE, 1537

18th Century

The first controlled clinical trial was performed.

Main developments in: 1747

19th Century

The first trial to compare a novel therapy to an inactive agent (placebo) was performed, and alternate allocation trials became the new standard.

Main developments in: 1863, 1898

20th Century

The British Medical Research Council (MRC) performed the first double-blind trial and the first randomised controlled trial (the current standard practice).

Main developments in: 1943, 1947

Timeline (not to scale): 6th century BCE, 1025 CE, 1537, 1747, 1863, 1898, 1943, 1947

The first uncontrolled clinical trial was recorded in the ‘Book of Daniel’ in the Bible, in the 6th century BCE. King Nebuchadnezzar of Babylon ordered his people to eat only meat and drink only red wine, as he believed this would keep them healthy. When a few royals objected, the king allowed them to follow a diet of only legumes and water for 10 days. At the end of the 10 days, the legume eaters appeared better nourished than the meat eaters and hence were permitted to carry on with their diet. (Bhatt, 2010, p6). Other early milestones in the history of clinical trials include the documentation of an approach for trials and an accidental uncontrolled clinical trial. Avicenna, in his encyclopaedic ‘Canon of Medicine’ (1025 CE), set out some rules for testing novel compounds, such as studying two contrasting cases to allow for comparison, but the application of these principles isn’t documented. (Bhatt, 2010, p6). Subsequently, in 1537, the surgeon Ambroise Paré accidentally performed the first uncontrolled clinical trial of a novel therapy. He was tasked with treating wounded soldiers, but with the traditional treatment not readily available and the number of injured rapidly increasing, he developed a mixture of egg yolks, rose oil and turpentine to replace the boiling oil conventionally used to cauterise wounds. He feared the soldiers wouldn’t make it through the night but instead found them in better health than those who had received the conventional treatment. (Bhatt, 2010, p6-7).

Dr James Lind is considered the first physician to have performed a controlled clinical trial. He started his study on 20th May 1747 with 12 sailors suffering from scurvy. He described the main components of a controlled trial, selecting patients who were as similar as possible, and divided them into six groups of two, one of which was given two oranges and a lemon daily. He concluded that the citrus diet was the most effective treatment for scurvy and published his ‘Treatise on Scurvy’ in 1753. However, the British Navy didn’t make citrus a compulsory component of sailors’ diets until 1795. (Bhatt, 2010, p7). The use of a placebo in a trial followed in 1863, when Austin Flint completed the first trial to compare a novel treatment with an inactive remedy (placebo). He gave 13 subjects herbal extracts for rheumatism instead of the established remedy. In 1886, Flint described the study in his book ‘A Treatise on the Principles and Practice of Medicine’. (Bhatt, 2010, p7). The word ‘placebo’ itself predates Flint’s trial: it was first used in medical literature in the early 1800s and was added to Hooper’s Medical Dictionary of 1811 as “an epithet given to any medicine more to please than benefit the patient”.

After this, alternate allocation trials became the standard practice for testing new therapies. They are conventionally dated to Johannes Fibiger’s study of diphtheria antitoxin in Copenhagen (1898). In its simplest form, the technique entailed giving every other patient the experimental remedy, withholding it from the rest, and comparing outcomes. Fibiger’s study, for example, alternated by day rather than by patient, treating patients every other day, for a total of 484 patients. Fibiger’s study is the most famous use of the technique, but it appeared throughout the literature of the 1890s. The technique could also involve patient or researcher blinding, the use of placebos as controls and statistical analysis of results. (Bothwell and Podolsky, 2016, p502). Alternate allocation remained the dominant model for controlled trials in the late 19th century and the first half of the 20th. However, many professionals lacked the economic, regulatory or social incentives to evaluate their new treatments in trials and tended to rely on established methods that were widely accepted by society. Others refused to use controlled trials because they didn’t believe it ethical to withhold treatment from patients based on the experimental group they were allocated to.

“The main difficulty encountered was the inability of our special investigators to withhold this promising agent from any stricken child. . .Our sentiment overruled our reason.”

A 1935 statement from researchers regarding a trial evaluating convalescent serum for the treatment of poliomyelitis (Bothwell and Podolsky, 2016, p503).

In the decades after Fibiger’s study, selection bias became the main issue with alternate allocation. This was demonstrated in a collection of pneumonia trials, in which the benefit of an antipneumococcal antiserum varied from trial to trial. The first trial, run by Edwin Locke of Boston City Hospital in 1924, found no difference in mortality between experimental groups. Maxwell Finland repeated the study in 1930 and found a benefit of the treatment, but added that some unconscious choice may have been exerted in his study. Selection bias was the topic of many debates regarding alternate allocation throughout the 1930s and 1940s, especially surrounding pneumonia treatments. (Bothwell and Podolsky, 2016, p503).

Sir Austin Bradford Hill, a British epidemiologist and statistician, evaluated a group of British Medical Research Council (MRC) trials of antipneumococcal antiserum in the 1930s. In an attempt to eradicate the selection bias associated with these trials and other alternate allocation experiments, he designed the randomised controlled trial. Concealed randomisation of patients to treatment or control groups was introduced first, followed by blinding of researchers to patients’ allocation. Hill and his colleagues soon gained MRC funding and completed a spate of RCTs (as described later). (Bothwell and Podolsky, 2016, p503). The MRC also carried out the first double-blind trial, of patulin (an extract of Penicillium patulinum) for the common cold, in 1943. The study enrolled over 1000 British workers. The treatment was allocated in strict rotation by a nurse in a separate room from the doctor and patient. Unfortunately, the trial did not suggest any protective effect of patulin, but the study was a step in the right direction for clinical research. (Bhatt, 2010, p7-8). Returning to RCTs, the MRC Streptomycin in Tuberculosis Trials Committee (1946) started the first randomised controlled trial, of streptomycin for pulmonary tuberculosis, in 1947. This trial is regarded as groundbreaking in the history of clinical research. The previously mentioned statistician, Sir Austin Bradford Hill, allocated patients to streptomycin and bed rest or to bed rest alone by reference to a statistical series of random sampling numbers. (Bhatt, 2010, p8). The Hill group were rapidly followed by other researchers, including those in the US, and by 1970 the FDA required that RCT results be submitted with new drug applications.
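
The difference between alternate allocation and randomisation can be illustrated with a short sketch. The Python below is purely illustrative: the function names and the simple coin-flip scheme are assumptions made for this example, not the procedure used by Fibiger, Hill or the MRC.

```python
import random


def alternate_allocation(n_patients):
    """Assign every other patient to treatment, in order of arrival.

    Hypothetical helper: the sequence is fixed in advance, so anyone who
    knows the previous assignment also knows the next one.
    """
    return ["treatment" if i % 2 == 0 else "control" for i in range(n_patients)]


def random_allocation(n_patients, seed=None):
    """Assign each patient by an independent coin flip (simple randomisation).

    Hypothetical helper: the seed is only there to make the example
    reproducible; a real trial would keep the sequence concealed.
    """
    rng = random.Random(seed)
    return [rng.choice(["treatment", "control"]) for _ in range(n_patients)]


if __name__ == "__main__":
    print(alternate_allocation(8))
    print(random_allocation(8, seed=1))
```

With alternate allocation, the next assignment is always known in advance, which is what leaves room for unconscious choice over which patient is enrolled next. With a concealed random sequence, the next assignment cannot be predicted from the assignments made so far.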

1950s Onwards

In the trials described so far, few women were included. In a form of observer bias, male clinicians are more likely to study men. Additionally, women have historically been excluded from clinical trials, partly because of concerns that female hormone fluctuations make women difficult to study (and the effort wasn’t considered worthwhile), and partly because of liability concerns that a new drug could damage the foetuses of women with childbearing potential. Childbearing potential was defined broadly as having the capacity to become pregnant, so the exclusion covered not only sexually active premenopausal women who were not using contraceptives, but also women who were unlikely to become pregnant. This included sexually active, premenopausal women who were using contraceptives (oral, injectable, or mechanical) or whose partner(s) had had vasectomies, women who were sexually inactive, and lesbians. This extensive exclusion of women conveys the lack of respect held for women’s autonomy and their capacity to make their own medical decisions. Moreover, if female hormone fluctuations are important enough to disrupt clinical trial results, is it not worth studying drugs in women before administering them? The issue is that too many biological factors surrounding the female body, such as menstruation, menopause and pregnancy, have been used to justify women’s subordination to men.

The timeline below briefly depicts the main advancements in the history of trials that relate to women. This is also covered in more detail under the figure.

Timeline from 1950 to 2020 depicting the major developments in the history of clinical trials. Created by author.

The thalidomide scandal of the 1950s and 1960s significantly changed the course of clinical trials, as will be further discussed. Thalidomide was originally used as a sedative in the late 1950s but was rapidly marketed for the treatment of morning sickness, with widespread use across Europe, Australia, and Japan. After thousands of babies were born with phocomelia (a deformity in which the hands or feet are attached directly to the body, or the limbs are underdeveloped or absent), a ban was introduced in 1961 across many countries. However, in areas of poor medical surveillance, the drug is still used. (Kim and Scialli, 2011, p1-2). The lack of worldwide restriction of thalidomide has resulted in further cases of thalidomide-induced phocomelia, with the most thoroughly documented cases in South America. For example, access to thalidomide in Brazil is high because of the sizeable number of patients suffering from leprosy and the use of thalidomide as a treatment. Since the scandal, the drug has been used by many Brazilian women not suffering from leprosy who weren’t warned of the dangers. Hence, since the mid-1960s, over 30 cases of thalidomide-induced phocomelia have been reported in Brazil. In 1998 the FDA approved its use for a form of leprosy, and it was later approved for multiple myeloma, under restricted access that includes mandatory classes on the drug’s dangers and the importance of contraception, and a requirement that women use two forms of contraception. (Swann, 2009, p139).

Timeline (not to scale): 1966, 1977, 1993, 2013, 2016

One benefit of the scandal is that it led to the development of systematic bioassay testing of pharmaceuticals for developmental toxicity before marketing. In 1966 the FDA adopted segment I, II and III protocols, relating to fertility and general reproduction, teratogenicity, and perinatal studies, respectively. Prior to this change, toxicology testing was hypothesis driven. The tragedy also showed that species differ in their sensitivity to drugs and that the use of a second species in animal testing is important: thalidomide caused similar birth defects in rabbits but did not do so in the same way in rodents. However, the thalidomide scandal also intensified women’s exclusion from clinical trials and led, in 1977, to FDA guidance excluding women of childbearing potential from early-phase clinical trials.

In response to women’s health advocates and AIDS activists in the 1980s, the US National Institutes of Health (NIH) Revitalization Act of 1993 reversed the previous ban and mandated the enrolment of women, as well as minorities, in federally supported phase III clinical trials, except when their exclusion could be justified. The 1993 FDA guidelines emphasise three pharmacokinetic issues in drug design and testing: the effects of the menstrual cycle and menopausal status on drug pharmacokinetics, the effects of oestrogen supplementation or contraceptive use on drug pharmacokinetics, and the effect of the drug on the effectiveness of oral contraceptives. (Merkatz et al, 1993, p294). The issue with this language is that the inclusion of women in clinical trials is seen as a burden and, again, the male anatomy is seen as the norm: there is no consideration of the effect of male sex hormones on drug pharmacokinetics.

The requirement to include women in trials led to an increase in the inclusion of women in research, but analysis by sex has not kept pace. A 2018 paper looking at 142 RCTs published in US medical journals in 2015 found that, of the studies that didn’t focus on one sex, only 26% reported an outcome by sex or included sex as a covariate, and that only 13.4% of all 142 studies reported outcomes by race/ethnicity. Of the 107 studies that enrolled both sexes, the median enrolment of women was 46%. While this figure doesn’t seem troubling, 16 of those trials enrolled fewer than 30% women, and 7 of those 16 enrolled fewer than 15%. It is interesting to note that only 4 of the 16 studies discussed the fact that their findings may not be generalisable to women. The authors also compared their results with their previous studies from 2009 and 2004 and found no significant increases in reporting by sex, race, or ethnicity despite the changes in NIH policies. (Geller et al, 2018, p632-3). They state that simply enrolling more women and non-white subjects does not by itself lead to increased analysis by sex or ethnicity, and they recommend strong journal policies to increase compliance with NIH policies.

Despite the changes to experimental design in 1993, the same legislation did not apply to animal or cellular research. There is typically an over-reliance on male animals and cells in preclinical research, which can obscure key sex differences that could guide clinical studies. The exclusion is again based on the idea that females are more variable than males and as such need to be tested at all four stages of the menstrual/oestrous cycle. This is not only time consuming but also significantly increases research costs, and hence is not seen as worth the hassle. However, in a meta-analysis of 293 studies, Prendergast et al. (2014, p3-4) conclude that female mice tested throughout their hormone cycles display no more variability than males. In fact, male mice exhibited greater variation in several of the traits measured. In response to the lack of legislation, the Office of Research on Women’s Health (ORWH) launched a programme in 2013 that provides funding supplements to existing grants, either to add subjects, tissues, or cells of the sex opposite to that used in the original grant, or to increase the power of a study to analyse for a sex or gender difference by adding more subjects of either sex to a sample that already includes both males and females. (Clayton and Collins, 2014, p283). As previously mentioned, it is also important for journals to encourage, or to tighten their criteria for, the inclusion of both sexes in cell and animal studies. For example, the American Physiological Society mandates that papers report the sex of experimental animals, the sex of the animals from which cells are derived, and the sex and/or gender of human subjects. In a similar vein to the ORWH programme, the National Institutes of Health mandated that, from January 2016, grant applications must include both sexes in vertebrate animal and human studies. This is to ensure that sex as a biological variable is considered. If only one sex is proposed for a study, there must be strong justification from the literature or other data for the application to be considered. (NIH, 2015).

The development of clinical trials has been a long journey, not only to the randomised controlled trial that is the current standard, but also to the recognition that women must be consistently included and that sex must be treated as a biological variable in drug testing and design. As has been mentioned, and will be further discussed, more work needs to be done to ensure women are equally considered in healthcare, drug administration and clinical trials.