A History of Women in Sport: Part 1

I want to leave the recent bit of Internet drama behind for a bit (don’t worry, it’s not over) and post an excerpt from The Women’s Book Vol 2 (which will deal with training) on The History of Women in Sport.  Since it’s nearly 11 pages, I am going to divide it into two parts.  Today I will look at the involvement, development and other aspects of women in sport from the turn of the century up until the start of the modern era and will finish next week with the modern era and beyond.


Chapter 1: A Brief History of Women in Sport

In Volume 1 of this book, I addressed at least briefly the fact that, for the majority of the time sports have been contested, men have made up the majority of both competitors and coaches (and likely the primary audience as well). Practically, this means that the approaches taken to training, diet, etc. have primarily come from work with or on men. I used this to examine in some detail issues of physiology and research in order to make the point that women cannot be treated as nothing more than little men in this regard.

Since this book is oriented more specifically to training, I want to start by looking in somewhat more detail at this topic and the changes that have occurred over the 120-some-odd years in terms of women’s involvement in and acceptance in sport. Let me make it clear that I am no historian and will not attempt to be comprehensive here, as I am sure others have done much more thorough examinations of the topic. Rather I want to look at some of the overall changes that have occurred over time along with some of the driving forces behind them.

For much of the discussion, I will be focusing on the changes that occurred during the 20th century, as this represents the vast majority of the relevant time span (if I’m honest, it’s also the period for which the most readily available history exists). At least in broad strokes, I will also look at some of the more current changes that have occurred.

A great deal of the information will examine changes that have occurred in Olympic involvement as this tends to be representative of global sport as a whole (1). I will also include information about the changes that occurred in America specifically as I think they probably broadly represent the changes that have been occurring in non-Socialist, non-Communist Western countries. As needed, I will mention specific countries or exceptions to the overall trends. In many cases, I will also look at some of the forces that were driving the changes that occurred (or did not occur in some cases).

Women in Sport Part 1: The Turn of the 20th Century

Throughout the majority of the 1800’s, sport was considered almost exclusively the domain of men. Early German competitions allowed no women and only a handful would be involved in sports such as bicycle racing, swimming, parachuting or ski jumping (I cannot begin to explain these last two). They assuredly were not accepted in sport on any real level and were too much of a minority to be representative of much.

At the turn of the 20th century came the creation of the modern Olympic games by Baron Pierre de Coubertin. In attempting to revive the earlier Greek Olympics, he considered it a completely male affair and it should come as no surprise that exactly zero women were present during the games except to crown the winners. At the next two games, women made up 1.7% and 0.9% of the athletes respectively, and women would represent no more than 3% of the total athletes by 1920. In most cases, the women who were competing were either citizens of the host country or came from Great Britain, which had the longest sporting tradition to date. Even then, women were limited to competing in sports that did not involve “visible exertion”.

During the 1920’s, the aftermath of World War I would cause huge sociological changes due to the necessity for women to take on traditionally “male” jobs to support the war effort. These changes would include women gaining the right to vote and, along with others, led to a greater push for women’s involvement in sport, at least in certain countries (in others, there was still significant resistance to the idea). Germany was one exception to that resistance, encouraging sporting clubs to create women’s sections, and a women’s sporting movement would also develop in France during this time. Of some importance, it was during these years that the first women’s athletic championships were organized.

The Women’s World Games, which included the “unfeminine” track and field events, were held in 1921, 1922 and 1923, and organizations dedicated to women’s sports would be created during this time. A Women’s Olympic Games was even held in 1922, 1926, 1930 and 1934, although it was not officially associated with the Olympics and would ultimately have to give the name up. But the mere existence of these games not only showed that women had the capacity for high performance sport but gave their organizing body (the Fédération Sportive Féminine Internationale, or FSFI) a way to exert pressure on the International Olympic Committee (IOC) to advocate for greater involvement of women.

During this time, de Coubertin and others still maintained that women should be excluded from the Olympic games, but this had already become a losing battle. Women’s fencing and gymnastics had already been added and a limited number of track and field events would be added in 1928 (I’ll come back to this below). Controversy erupted at these games when many women were reported to have collapsed near the end of the 800m race. Most likely this was due to inadequate training (at the time, the very idea of training for sport was more or less considered cheating) but it was still deemed scandalous and unaesthetic. This event would be used as “proof” that women were simply not suited to sports or competition. Even with increasing involvement, women would make up no more than 4.5-8.5% of the total athletes at the Olympic games during this time.

Of some interest is that, during this same time, an entirely separate sporting event called the Workers’ World Games occurred in Germany. This was organized by a socialist sporting federation and would include both gymnastics and track and field. In 1937, an Alternative Olympic Games would also be held in protest of the official Olympic Games. Its poster was of a muscular woman throwing a discus, which shows how much cultural norms and acceptance of women in sport varied even at this time.

Even at the official 1936 Olympics in Berlin, the Germans had the strongest women’s team, especially in track and field. Even here, the idea of women in sport went against Nazi political beliefs about racial hygiene and femininity, but it was more important for Germany to show the superiority of Nazi ideals on the world stage of international sport. In the 1970’s and 1980’s, the German Democratic Republic (GDR, East Germany) would field a dominant women’s team for similar political reasons.

Medical Issues: Part 1

We might ask why there was such a huge push to exclude women from sporting competition in general and the Olympic games in particular. The most general reasons were sociological and cultural, as it was simply not seen as appropriate for women to be involved in what were considered “masculine” activities. De Coubertin even stated that he did not want women to “…sully the Olympic games with their sweat.” Sport was simply seen as an exclusively male domain.

This idea came as much from cultural norms as from the majority-held belief that women were the weaker sex: that they needed more rest than men and were less physically capable or resilient. This is ironic in that, as I discussed in Volume 1, women are actually far more likely than men to survive many threats such as famine and in many ways are far more physically resilient (the basic reason being that they had to be, since they were tasked with the survival of the human race).

But this idea was taken even further as the medical experts of the day, almost universally men, promoted ideas and theories that were used to support the belief that women should be disallowed from sport. Broadly in medical literature at that time, men were considered the norm while women were considered deviant (used literally here as a deviation from the norm) or deficient. One prominent physician described female organs as “incomplete”, presumably implying that they had not completed their development into a penis.

Others developed various theories and ideas about how sport was potentially damaging to women, primarily in terms of their reproductive capacity. At the time, for fairly obvious reasons (i.e. its absolutely crucial importance for the survival of the human race), much of women’s medicine revolved around reproductive function and what impacted it. And sport was felt to, in one way or another, damage a woman’s potential ability to give birth. A number of different ideas were suggested relating to this.

One early idea, coming out of 19th-century notions of vitalism, was that the body had a limited and non-renewable amount of stored energy to use over a lifetime. It was felt that by expending energy on sport, women would essentially deplete this energy permanently and be unable to bear or look after children.

It was also felt that the uterus was the most vulnerable and fragile part of a woman. One prominent German gynecologist believed that the uterus “…pulls at its sinews with every vigorous jump a woman makes and may even tilt backwards”. The same individual wrote that “…each attempt to train the muscles of the female abdomen and pelvis leads to a tautening of the muscle fibers so that childbirth becomes much more difficult if not impossible.” Basically, it was felt that exercise would render women infertile, an idea that persists to some degree even today.

Finally he opined that “too frequent exercise will lead to masculinization…the female abdominal organs wither and the artificially created virago is complete”. Since I doubt many know the word (I didn’t), virago is a term that currently means domineering or bad-tempered woman but which has an archaic meaning of female warrior or woman of masculine strength. Ignoring the abject absurdity of this in a biological sense, I have to think that many women in the modern era might look upon this term as more of a compliment (in the sense of female warrior) than an insult.

I’d point out, at this point, that nobody had actually studied or examined any of this directly. Rather, a bunch of (invariably male) doctors were making proclamations based on their inherent biases and assumptions. Essentially they had decided that women were the weaker sex, should not be involved in sport, and then came up with the rationale for those beliefs more or less after the fact. Even among female doctors, many of these ideas were still held to be true but once again it was based almost solely on theory rather than any sort of direct experimentation.

Interestingly, in the late 1920’s, German doctors actually took the time to examine female competitors at a sports festival and were unable to find any of the claimed negatives. By 1934, 10,000 girls and women had been examined with no support for any of the medical claims that had been made. Since that data did not fit the narrative of the day, these findings went more or less ignored and the idea that women were not only unsuited for sport but could be physically damaged by it had taken hold. Even if these ideas seem patently ludicrous in hindsight, they are entrenched enough that they often continue to this day.

The fear that women will become masculinized by sport, especially weight training, is still prevalent, along with the idea that sports can damage their reproductive function. As a personal anecdote, when I worked at a wellness center in the mid-1990’s, a female member was told by her doctor that the Stairstepper would make her ovaries swell, and I imagine that many female readers have come across these ideas in one form or another: that heavy lifting will make their ovaries fall out, damage them reproductively, and so on.

Women in Sport Part 2: The Post WWII Years

Following World War II, arguably the first major change in women’s sports involvement would occur. In 1953, IOC president Avery Brundage floated the idea of removing women’s competition from the Olympics completely while also advocating that they only be allowed in sports “appropriate” for them (whatever that means). As late as 1966, the IOC was still discussing whether women’s discus and shot put should be part of the games, presumably due to both events being considered “unfeminine”.

But these individuals were ultimately fighting a losing battle and the inclusion of women in the Olympic games was going to happen whether they liked it or not. Much of this was driven by the importance that the games had taken on in terms of global politics. Just as with Berlin in 1936, the games were being increasingly used to promote the superiority of political ideologies. Specifically, the Soviet Union and Germany saw the games as a way to promote Communist and Socialist ideology (respectively) through sports performance. They didn’t care whether it was women or men winning medals and were just as supportive of their women’s teams as of their men’s (the GDR women’s team would be absolutely dominant in the late 70’s and early 80’s).

Since Western countries were not sending many women to the games at this point, it made sense for these countries to invest proportionally more in their women’s teams since it was relatively easier to win medals. In addition to pushing for greater inclusion of women overall, the Soviet Olympic Committee pushed for an increased number of events for women in order to increase the potential medals that they could win. Not only did this serve to cement the presence of women at the games, it would force Western countries to make a greater effort towards women’s sport to keep up with those countries on the global stage.

Even so, the changes were only relative with women’s involvement at the Olympics increasing from 9.5% in 1952 to 20.6% by 1976. It was a huge increase compared to the 5% or less from the early 20th century but women were clearly still vastly underrepresented.

A Bit of Trivia Regarding Exercise and the Menstrual Cycle

Before continuing with the discussion, I want to make readers aware of a bit of historical trivia that I think is interesting. While women were not entering sport on a global scale, there were countries with relatively more involvement and acceptance. And even there, there were concerns, primarily revolving around the menstrual cycle and whether or not exercising during menstruation was safe. Even female exercise advocates of the day argued that exercise during menstruation should be avoided, although this was not based on anything more factual than the other medical theories about the topic.

But in the relatively sports-oriented country of Australia, one of the single largest events for sportswomen occurred in the 1950’s with the introduction of the tampon, whose advertisements apparently provided some of the most accurate information regarding the topic yet available. One stated that women could now swim “at any time of the month,” and I have to wonder if the colloquial phrase “shark week” to describe women’s menstruation didn’t come out of the literal fear of being attacked by a shark while swimming in open water. But even with those changes, there were still many concerns being voiced regarding women’s involvement in sport overall.

Medical Issues Part 2

While at least some sociocultural changes had occurred by the middle 1950’s, with proportionally more women entering sports, it’s fair to say that, at least in the Western world, women were still largely absent from sport. Whether this was due to a lack of interest or of availability is not a question that I can answer, but it sort of doesn’t matter in a practical sense: by the halfway point of the 20th century, women were simply not involved on any major scale.

Of more importance, the medical concerns of the early 20th century were still prevalent, with arguments about whether or not women could or should be allowed to engage in certain sports. I mentioned shot put and discus above, but there was still some debate over whether women were suited to long-duration endurance sports such as the marathon, in the sense of being able to physically complete it. The Boston Marathon, even then the largest event of its kind, actively banned women from competing until 1972, and female competitors were often physically pulled from the course prior to that.

The marathon, along with long-distance cycling, would not be added to the Olympics until 1984, nearly 90 years after men first ran the race. Amusingly, around the time of the first modern Olympics, a Greek woman unofficially completed the marathon distance in 4.5 hours, with a second, a 35-year-old mother of 7, completing it in 5.5 hours. Women were clearly capable of completing the distance and, ironically, women’s physiology is in many ways better suited to long-duration activities than men’s.

Similar ideas were held regarding sports involving physical contact including many team sports where it was felt that women might be damaged from the physical contact that was often involved. Women’s volleyball would not be added until 1964, basketball and handball in 1976 and hockey in 1980. Perhaps surprisingly, women’s soccer, not often thought of as a high-impact sport, would not be added to the Olympic games until 1996.

Women in Sport Part 3: Over in America

Leaving the topic of the Olympics briefly, I want to look at the changes that were occurring during this time in America specifically. As I mentioned above, it’s a topic I have actual data on and one I will assume is more or less representative of other Western countries. Certainly sport in America was huge and had been since the turn of the century, but it was predominantly a male domain.  I grew up in the 70’s and it was simply the norm for boys to be involved in little league soccer, baseball, football and many other sports from a young age. I certainly was, as was everyone else I knew. In contrast, outside of a few select sports, this was simply not part of most girls’ childhoods.

This can be made clear by looking at the statistics for high-school sports. In 1971, 295,000 girls played sports compared to 3.7 million boys, a 12.5:1 ratio. Above, I said that it’s often difficult to say whether women’s lack of involvement in sports is due to a lack of interest or of accessibility, but at least in this case there’s a very good indication that it was the latter: the lack of women in sport was due to a lack of accessibility more than any other factor. I say this because the passing of Title IX in 1972 would signal a step change for female involvement in sports, and women’s involvement would increase drastically from that point forwards. Once girls had access to sports, they began to enter them in increasing numbers with each passing year.

Title IX was an amendment to the then-current education laws and said, essentially, that there could be no discrimination based on sex in any educational program or activity that received federal funds. While this wasn’t explicitly aimed at sports, inasmuch as the public school system in America is federally funded, it would have a tremendous impact in that area. Going forwards, schools would legally have to provide equal access to sports and this would have the long-term effect of increasing women’s sports involvement enormously.

Mind you, other changes were occurring at the time and there were likely shifts in both interest and acceptance throughout this period. Second-wave feminism had taken hold by the 70’s and a push for women’s equality was occurring throughout the country in all domains. By the late 1970’s, a number of high profile female athletes would begin to act as positive role models for girls and make it more socially acceptable for women to enter sport in the first place. This would create a cycle whereby girls who were interested in sports would finally have access to them and would then go on to become role models for the next generation of girls, increasing interest and involvement further.

And the changes that occurred in response to Title IX and the other societal changes are again borne out by looking at the statistics on high-school sports involvement. By the 1999-2000 school year, 2.7 million girls would be involved in sport compared to 3.8 million boys. The 1971 ratio of 12.5:1 had dropped to 1.4:1 in under three decades and it’s interesting to note that the boys’ numbers were essentially unchanged. Rather, girls were beginning to enter sports in increasingly larger numbers.
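For anyone who wants to check those ratios, here is a trivial calculation using the participation numbers cited above (nothing here beyond the figures already given in the text):

```python
# Quick check of the high-school sports participation ratios cited above.
girls_1971, boys_1971 = 295_000, 3_700_000
girls_2000, boys_2000 = 2_700_000, 3_800_000

print(f"1971 ratio: {boys_1971 / girls_1971:.1f}:1")       # roughly 12.5:1
print(f"1999-2000 ratio: {boys_2000 / girls_2000:.1f}:1")  # roughly 1.4:1
```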

Similar changes were seen at the collegiate level (public colleges are also federally funded, meaning that Title IX applied). In 1972, there were only 80,000 female collegiate athletes and this had increased to 150,000 by 1998-1999. I don’t know how this compares to the number of male collegiate athletes and will only comment that the reduction in total numbers from the high-school level most likely reflects the fact that most high-school athletes (female or male) don’t continue competing into college or beyond. That said, there is some indication that, due to inadequate training, girls tend to quit sport at an earlier age, so there is more going on than just this one issue. I will come back to this later in the book.

Regardless of the specifics, the passing of Title IX, along with other sociocultural changes clearly opened the door for women to begin to enter sport in increasingly larger numbers and this trend would continue into the modern age. Accessibility more than inherent interest seemed to be what was holding women back.

Women in Sport Part 4: The Olympics at the Turn of the Century

By the end of the 20th century, it was clear that a step change had occurred with women’s involvement in sport. I mentioned some statistics for American high school and collegiate sport above but, by 1996, women would represent 35% of the athletes at the Olympic games (contrast this to zero at the first games, 11.5% in 1952 and 21% by 1980). At least some of this was due to some sports not having official women’s competition which prevented women from achieving true parity with men in terms of numbers. The inclusion of women’s wrestling and boxing was still being debated due to fears over the potential for physical damage to the athletes although wrestling would be added in 2004 and boxing in 2012.

But at least one factor at the Olympic level is related to culture. While 35% of the total athletes were women at this point, the actual percentage varied enormously by country. For example, at the Seoul Olympics in 1988, the Spanish delegation was only 18% female, well below the overall percentage at the games. Some of this is assuredly sociological in the sense of cultural beliefs about what are appropriate activities for women, but there were also economic factors at play. In general, the greater the economic resources available to a country, the greater the proportion of female athletes that will be present. This assuredly represents the fact that, right or wrong, the men’s events often carry more “weight” (in a viewership or global status sense) and countries with limited resources would logically allocate them where they feel there are the most potential political benefits to be gained.

It’s worth mentioning that this is not universal; the Chinese Olympic team has been 46% female since they appeared on the global stage. As with the Russians and Germans before them, this is clearly a politically motivated decision to promote the superiority of Communist ideals on the world stage. With women’s competition often being less strongly contested than the men’s, the Chinese clearly see the women’s events as a place to earn medals and status globally.

For at least some time, some countries sent no women to the games. Typically these were Muslim countries, and there was a brief push to have them excluded from the games entirely, but this was never implemented as it was felt to be anti-Islamic. Of some interest, an Islamic Women’s World Games was held in 1993 and 1997, with men barred from both competing and spectating. This would change going forwards, as a number of Muslim women have since entered and won Olympic medals. Even with those changes, there will likely remain cultural and financial reasons why some countries send male-dominated teams, and this may prevent women from ever being truly equally represented in the Olympics. But overall, it’s clear that the inclusion of women at the games has increased to a staggering degree.

Medical Issues Part 3

Even at the turn of the century, at least some of the same medical ideas regarding women in sport persisted. Women’s pole vault, for example, was not added to the Olympic games until the year 2000. Amazingly, the reasoning for this seems to go back to a 1950’s physician who felt that if a woman were turned upside down, it would damage her internal organs and reduce her ability to have children (2). While this sounds absurd to a modern audience, as late as 2005 the International Ski Federation argued that women should not ski jump, stating that “…it’s like jumping down from, let’s say about two meters on the ground about a thousand times a year, which seems not to be appropriate for ladies from a medical point of view.”



And that’s where I will cut it for today. Next week I will pick up with women’s sports in the modern era.

Training Volume and Muscle Growth: Part 3

Ok, let’s finish this thing up.  So far I’ve looked at the 7 current studies (as of this article’s writing in October of 2018) in often excessive detail in Part 1 and Part 2 and now it’s time to put them all together to see how training volume and muscle growth relate.  As noted in Part 1, I’m throwing the Radaelli paper into the trash.  I consider the results too random and nonsensical to be worth considering.

There is simply no world where growth in triceps in beginners doesn’t start until 45 sets per week but 18 sets for biceps is effective and where LBM gains are higher for calisthenics than low-volume weight training.  So it’s out.  Agree or not, I put my reasoning up front and looked at it in detail to explain why I think it’s garbage so it wasn’t just a hand-wave like most would do.

That leaves 6 studies in trained individuals (minimum 1 year training experience and a usual range of 1-4 years) looking at different volumes of training and the muscle growth response.  Yes, they used varying methodologies: some used only body composition methods via DEXA, some used DEXA and ultrasound, and one used DEXA, ultrasound and muscle biopsy (of the quads only).

As I stated at the outset, I’m going to simply take them at face value for the time being.  It’s the data we have and with the qualifications I was sure to make as I went, it’s what we have to build the model on at this point.  Yes, future data may change the model.   When it arrives, the model will have to be updated with it.

Building the Model

First, let me put the 6 remaining studies together to see if a pattern shows up.

Yes, this is a terrible chart and I probably got at least one of the numbers typed in wrong since I type fast and frequently don’t take the time to check after the fact.  The true guru will dismiss the entire article based on a typo.   But guru gon’ guru and there’s nothing I can do about that.

Instead let’s focus more on the generalities of the data and less on my lack of proofreading.   As no values went down, all numbers represent an increase from the beginning of the study.  Percentage values are percentage changes and any absolute numbers are mm changes in muscle thickness.  The best response in each comparison is noted along with the volume (in sets/week) at which it occurred.

(Values are listed for the low / moderate / high volume groups in that order unless noted otherwise.)

Ostrowski
Quads: 6.7% / 5% / 13.3%. Best response at 12 sets/wk (nothing higher tested).
Triceps: 2.3% / 4.7% / 4.8%. Best response at 14 sets/wk; 28 sets no better than 14.

Amirthalingam (two volume groups compared; MT values in mm)
Total LBM: 2.7% and 1.9% (lower volume better than higher)
Trunk LBM: 4.1% and 1% (lower volume better than higher)
Arm LBM: 7.8% and 3.4% (lower volume better than higher)
Leg LBM: 0.5% and 2.4% (higher volume better than lower)
Triceps MT: 2.3 and 4.5
Biceps MT: 2.4 and 0.3
Anterior thigh MT: 2.6 and 1.1
Posterior thigh MT: 1.2 and 2.2
Best response at 18-19 sets/wk for triceps and 12-18 sets/wk for quads. For triceps, 26-28 upper body sets were no better than 18-19. The leg volumes of 12-13 and 16-18 were so similar that I am considering them as part of one full range.

Hackett (DEXA only; compared 18-19 vs. 26-28 sets for the upper body and 12-13 vs. 16-18 sets for the lower body)
Essentially identical to Amirthalingam, with moderate volume superior for overall LBM gains and the upper body and the slightly higher volume slightly better for the lower body. Differences were small overall and the moderate and high volumes were more or less identical. The very small number of subjects weakens statistical power and the lack of a direct measure of muscle thickness is not ideal.

Haun (LBM changes)
Gains occurred up to 20 sets/week with no further gains from 20-32. By muscle thickness via ultrasound, triceps showed growth up to 20 sets and then SHRANK. By biopsy, quads shrank up to 20 sets and grew after that (it’s unclear why ultrasound did not pick up the quad growth that the biopsy did). Total summed changes in triceps and quads were a whopping 1.8 mm, minuscule changes overall. Over 20 sets per week, water retention (measured by TBW and ECW) significantly increased, making LBM gains above that point insignificant. Two conclusions: there is a cap to useful volume of ~20 sets per week, and volume without tension is shit for growth because volume is NOT the primary driver of hypertrophy and never will be.

Schoenfeld
Biceps: 0.7% / 2.1% / 2.9%. Best response at 18 sets.
Triceps: 0.6% / 1.4% / 2.6%. Best response at 18 sets.
RF: 2.0% / 3.0% / 6.8%. Best response at 27 sets.
VL: 2.9% / 4.6% / 7.2%. Best response at 27 sets.
By their own statistical methods, the highest volume was weakly/insignificantly better than moderate (a difference described as ‘not worth mentioning’ in stats texts). Conclusion: 18 sets upper and 27 sets lower were optimal, with higher volumes showing no meaningful differences (and certainly not for doubling the volume). Note that the RF and VL changes were ADDED together for a poor comparison to Ostrowski and the data is in question because of an outright LIE in the discussion regarding the Ostrowski triceps data.

Heaselgrave
Biceps: 1 / 3 / 2. No statistical difference, but a trend for 18 sets being better than 9, with 27 no better than 18 (or even worse).

Ok, now let me summarize that horrible chart by showing where each study finds that their optimal results fall in terms of sets per muscle group per week.

Study: Optimal volume
Ostrowski: 12 sets lower body (nothing higher tested), 14 sets upper body
Amirthalingam/Hackett*: 12-18 sets lower body, 18-19 sets upper body
Haun: 20 sets upper body as a cap, possibly 20+ for lower body (needs more study)
Schoenfeld: 18 sets upper body (compared to 6 with no middle value for comparison), 27 sets lower body (compared to 9 with no middle value for comparison)
Heaselgrave: Trend for 18 better than 9 but 27 no better than 18.

*I grouped the two studies together since they used an identical methodology and only differed in length, and I’m getting tired of writing and making tables as it’s a pain in the ass in WordPress.

Looked at this way, a pattern starts to show up: a moderate volume tends to beat both lower and very high volumes under basically every circumstance in trained individuals (again defined in most studies as 1-4 years of training or a minimum of 1 year of regular training). Or rather, a set count somewhere between 10-12 and 20 sets/week provides about the optimal results in all cases, at least within the limitations of the data available. Only Schoenfeld’s leg data exceeds this, but that comes from comparing a low volume of 9 sets to 27 with no middle value; we can’t know what would have happened between those numbers.
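If you want to see that clustering made concrete, here’s a throwaway sketch. The ranges are nothing more than my loose reading of the summary above (Schoenfeld is left out since his 9-vs-27 design had no middle value from which to define a range), so don’t treat them as gospel:

```python
# Illustrative only: a loose reading of each study's "optimal" weekly set
# range from the summary above, upper and lower body pooled.
optima = {
    "Ostrowski": (12, 14),                # 12 lower, 14 upper (nothing higher tested)
    "Amirthalingam/Hackett": (12, 19),    # 12-18 lower, 18-19 upper
    "Haun": (10, 20),                     # gains up to a ~20 sets/week cap
    "Heaselgrave": (10, 18),              # trend for 18 > 9, 27 no better
}

# The band where every study's range overlaps.
low = max(lo for lo, hi in optima.values())
high = min(hi for lo, hi in optima.values())
print(f"All of these ranges overlap at {low}-{high} sets/week")
```

The overlap lands comfortably inside the broader 10-20 band, which is the point.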

Let me note that even IF you prefer the conclusion that Brad’s highest volume groups gave a trend towards higher growth, it STILL contradicts the broader body of literature.  He still can’t explain why he needed to use 2X or 4X the volume to achieve the SAME growth as Ostrowski.  He can’t (read: won’t) explain a damn thing, especially when the data disagrees with him.  Let me note again that James Krieger made the explicit point that you have to look at all of the data and not focus on one study.  Yup.  And what all of the data except Brad’s study says is that 10-12 to 20 sets/week is the right number and Brad’s numbers are wrong.  Gotcha, James.  You played yourself, too.

Spoiler: My conclusion above is EXACTLY what Eric concluded as well in his MASS piece, that 10-20 sets/week was about optimal (and of course he did because it is what the MAJORITY of data supports).  This was after he desperately tried to make Brad’s numbers and study not be total bullshit by dismissing the endless problems with it methodologically, by playing the “I do science” card and all other manners of silliness.  Which makes you wonder why he tried so hard to defend it with such pitiful arguments and reasoning.  He doesn’t even think the numbers are right or he’d have drawn a different conclusion.  And yet he keeps trying to defend it with weaksauce defenses (leg extension volume load hahahahahaha.  I will never stop laughing at this).  I guess when seminar appearances are on the line you have to toe that line…

But this is kind of interesting because it does actually agree with Brad’s original meta-analysis (I am giving him the benefit of the doubt that it’s worth a shit to begin with and I question that with every passing day) which concluded only that 10+ sets gave the best growth response with no ability at the time to determine an upper cap.  So that passes the first reality check.  In all cases, up to 10 sets there is a clear improvement in growth response.  Above that, TO A POINT, there is a greater growth response but it shows a clear cap where higher volumes do NOT generate a greater response.  In most cases, it’s the same; in the case of Haun and triceps, it was worse.  Since this is the only study showing a worse response at the highest volumes, no global conclusions can be drawn here in terms of more volume being detrimental.  It simply isn’t any better.

But despite Brad’s attempts to make huge volumes better, the broader body of work (5 of 6 studies) supports a cap of about 20 sets for upper body, if that, and possibly more for the legs (for which we need far more systematic research).  Not 45 sets/week more.  But possibly more than 20.  This goes along with endless anecdotal beliefs that legs need more training volume but more systematized studies need to be done to show where an optimum might fall and whether or not upper and lower body truly have different optimal volume levels in terms of their growth response.

Now I could cut this article here, going a really long way to reach the above conclusion.  But that would be the easy way out and I’m not done yet.  Because I now want to return to an issue I brought up in Part 1 and said I’d revisit.

The Set Count Issue Redux

I want to return to the issue of how sets should be counted that I mentioned in Part 1.  When the Heaselgrave study came out, Brad responded with the following to try to make the study fit his conclusions.  Or he may have been talking about both that paper and the modified GVT paper as they are the only two that used isolation movements and he needed to dismiss the fact that they contradict his results (nevermind that his own study used leg extensions).

He was jumping around a lot and it was tough to tell what he was trying to dismiss to make his own (incorrect) results happen.  Guru speak can be tough since the goalposts change with every post, sometimes including leg extension load volume data when nothing else will work….sorry can’t resist.

His assertion was that perhaps volume requirements are different for compound and isolation movements and that that changed how sets should be counted (which means that his numbers could still be right).   Or rather, that isolation movements should be counted differently (basically trying to dismiss Heaselgrave’s set count accuracy).

And honestly, this is just total guru bullshit in the sense that it is a change in argument when he needed it.  For years now, in every study he’s done and every meta-analysis, Brad and his group have counted sets on a 1:1 basis.  They’ve always done training with compound movements and measured peripheral muscles and treated the total set count for the compounds as applying to those muscles on a 1:1 basis.

It’s always been 1:1.

If someone does 1 set of bench press, that’s 1 set for every muscle involved in the bench, so one for shoulders and one for triceps.  It has to be because even in his own paper he was measuring bicep and tricep size changes in response to compound work.  He wasn’t looking at chest and back thickness (again I question why pec isn’t done more since it is clearly technically possible and wonder if back can be measured).  If you’re going to look at bicep/tricep growth in response to only compound work (and note the odd little bench press study I described that looked at growth in pecs and triceps in response to bench only, which means that pec CAN be measured) and count those sets in total, you’re calling it 1:1 for compound and isolation.  Brad and his group have treated it as such from the get-go.  It’s also how he reported the ‘findings’ of his recent study.  He didn’t qualify the 30 and 45 sets as needing to be kept in the context of measuring triceps and biceps with only compound training movements.  He said 30 and 45 sets was best for growth with NO qualification whatsoever (well, he didn’t qualify anything until he got backed into a corner).

It’s been 1:1 from the get go for him.

Or it was UNTIL a study/studies came out that he wanted to dismiss.  Suddenly, it’s no longer accurate to count it 1:1 which is just terribly convenient.  Because you do not get to decide after the fact that the most recent study (or the GVT study) was different due to the isolation work and therefore does not contradict your results.  Even here the argument is totally worthless.  The Heaselgrave study did row, pulldown and curls.  Two compounds and one isolation so at most you count the third exercise differently.  Brad’s leg workout was 2 compounds and one isolation too so what’s the difference except that he doesn’t like data that contradicts him?  The Ostrowski study used isolation work too and Brad was happy to lie about the numbers there without considering set count.  He considered it 1:1 when it was convenient and then lied about the data to change the conclusion on top of it all and then decided it wasn’t 1:1 when it was no longer convenient to do so.

Pure guru shenanigans. The argument changes when it needs to.

I said back in Part 1 that I only used that convention for consistency with what they had been doing and that I didn’t agree with it at face value.  I’ve said the same thing for years.  Brad only said it when he needed it to defend his paper.  But even so, let’s go with this logic and see where it leads us.

Because NOW Brad seems to be arguing that isolation work counts differently towards the training response than compound work.  Presumably it’s worth more since it’s, well, direct.  That is, nobody is denying that bench works triceps and delts.  The question is to what degree in terms of tension and volume overload it works them and how it compares to direct work in that regard.

Let me add for honesty: in the discussion section of his most recent study, Brad does acknowledge in the limitations that the use of compounds and measurements of isolation might alter the set count conclusion and he used this as an argument for why his change of attitude wasn’t just a convenient excuse.  Which might fly except that this was NEVER brought up until he was backed into a corner and needed to bring it up to dismiss a study that contradicted his.  So he can say all he wants that he considered it but he knows that the majority don’t read the discussion (maybe why he thought he’d get away with his lie about the Ostrowski data).  But when every post he makes crowing about his results IGNORES it, he’s just bullshitting after the fact so far as I’m concerned.  An honest scientist mentions the limitations of their work UP FRONT when they present it whether in research OR PUBLICLY.  Like I have done in this article series for each paper, addressing the potential limitations (small subject number, DEXA only) as I went.  I’m not waiting to get backed into a corner to change my argument or magically find new data (cough cough leg extension load volume…..fucking seriously?)  He did it just like James Krieger did.  You can ask me any question about this article series, and nothing I report will change from what I’ve already written unless I made an explicit mistake (which I will then fix).

But let’s go from the assumption (which I have felt from the get-go) that you need to count volume differently for compound and isolation work in terms of determining the growth response to training (note that Eric said this was also true in MASS even if he hemmed and hawed about the ratios involved).  I always have in practice and I’d say anyone with real-world training/coaching experience does as well.  We don’t consider a set of compound chest work to be 100% a triceps exercise (and nobody counts pec deck as a triceps movement although it does involve a little biceps although absolutely nobody counts that).  Most people aren’t even aware that triceps long head is involved in rowing due to it’s function as a shoulder extensor but nobody on the planet would count that towards triceps volume.

It might be conditionally true for trainees with very specific levers who can just bench and build big tris (and even perfectly built benchers do extra triceps work) but we don’t generally count it that way.  Well, I don’t and neither does anybody else I know with actual training or coaching experience.  If you look at every workout routine I’ve ever written, total sets of chest (which always includes a compound movement which might be followed by a second compound or an isolation movement depending) are higher than for direct arm work because I’m counting some of the compound chest (or back) work towards arms in terms of daily and weekly totals.  Delts are always funky but I do the same thing, kind of.  Shoulders are complicated since it’s three heads with different functions: one is pushing, one is pulling and one is either neither or both depending on how you look at it (it’s really humeral abduction but let’s not get too entrenched in this).

How Do We Count Sets?

The question is then how should you count the sets of a compound exercise towards smaller muscles.  I don’t know but I’m going to start with an assumption of a 0.5:1 relationship.  That is, I will count one set of bench or row as one half of a set for triceps or biceps.  Is this right?  Doesn’t really matter, ignore the specifics and follow the logic.  You can math it out for a different value if you’d like.  Call it 1/3rd or 2/3rds.  Call it 3/4ths.   Whatever fits your personal bias. This is my assumption but it’s only that.  Make your own and do the math based on it.  It will only change the specifics but won’t change the general conclusions that come out of the exercise.

Let me note that the ratio must be lower than 1:1 since nobody would EVER count a compound movement as MORE than 1 set for the smaller muscles (ok, I know how people read my articles and someone will make a strawman about poorly done rows being more biceps and that’s fine. Let’s define this as the movement being done properly in a technical sense).  The question is simply how much lower.  Pick your ratio and break out the calculator.

Just don’t call it 1:1 until it’s convenient not to do so like Brad did.

Re-Analyzing the Volume and Hypertrophy Data

Because if Brad is now going to say that compound and isolation movements count differently in terms of sets then EVERY OTHER ONE OF BRAD’S STUDIES AND ALL THE REST have to be recalculated in terms of their effective set count including his original meta-analysis AND his most recent paper and all of the others I’ve examined.  His study used almost exclusively compound movements (plus leg extensions for quads) but looked only at single joint muscles and many studies seem to do this.  If he NOW thinks isolation exercises count differently then his effective bodypart volume changes.  And so do the set counts on every other study (the set counts on his meta-analysis will also change but I don’t know what body of literature it used and what proportion used compounds versus compounds and isolations).

So let me recalculate them based on my assumption that a compound set counts as 1/2 a set for smaller muscles and a direct exercise counts as a full set (i.e. bench press is 0.5 sets for triceps, triceps extension is 1 set for triceps).  Again, this is my starting assumption and nothing more.  Use whatever ratio makes you happy so long as it’s less than 1:1.  And don’t play silly buggers at the extremes; we all know it’s not 0.9:1 or 0.1:1.  It’s probably not 1/4:1 or 4/5:1 either but somewhere clustered around the middle depending on the movement.  Maybe 1/3, 1/2, 2/3, 3/4…again, I don’t know for sure and nobody else does either.  So I’m using 1/2 and literally splitting the middle.

So all I did was go back and count the sets.  If it was compound exercises, I cut the number of sets in half (10 sets per week becomes 5).  If isolation, I didn’t (5 sets per week = 5 sets per week).  If there was a mix I counted compounds as half and isolations as one and added them (10 sets compound chest = 5 sets + 5 sets isolation triceps = 10 total sets for triceps, down from 15 originally).  Only two used a mixture so I probably didn’t screw the math up too badly.  Even if I got a number slightly wrong it doesn’t change the overall conclusions.
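To make that counting procedure concrete, here it is as a short Python sketch.  The 0.5:1 ratio and the example numbers are the same assumptions as above, nothing more; swap in your own ratio if you prefer.

```python
def effective_sets(compound_sets, isolation_sets=0, ratio=0.5):
    """Re-count weekly sets for a smaller muscle trained by compounds.

    ratio: how much one compound set counts toward the smaller muscle.
    0.5 is my working assumption; it just has to be less than 1:1.
    """
    return compound_sets * ratio + isolation_sets

# Pure compound: 10 sets/week of compound chest becomes 5 for triceps.
print(effective_sets(10))        # 5.0

# Mixed: 10 compound chest + 5 isolation triceps = 10 effective
# triceps sets, down from the 15 you'd get counting 1:1.
print(effective_sets(10, 5))     # 10.0
```

Change the `ratio` argument to 1/3 or 3/4 and the specifics shift a little but the direction never does.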

Now let me be clear again, I am NOT saying that this is a perfect analysis or that a 0.5:1 estimate is right so spare me the strawman arguments that I’m trying to force a set of data.  I’m simply saying that if Brad is going to dismiss a study result he doesn’t like based on isolation vs. compound needing to be counted differently, that opens the door for this type of analysis.  Beating a (very) dead horse, you can redo it assuming a compound is worth 2/3 or 3/4 of a set.  The numbers will change slightly and that’s fine; they’ll be marginally higher than with my 1/2 assumption.  If you go with 1/3 they will be marginally lower.  But for rational set counts, the differences aren’t even that much.  Focus on the principles, not the specifics, folks.

But they will ALL go DOWN from what Brad was claiming them to be originally.  EVERY SINGLE STUDY.  That means in his meta, in his own study and in his examination of previous studies the numbers will all decrease.  Even in Ostrowski which he lied about in his discussion.  All of the set counts decrease.  None of the studies I examined used only isolation movements so there is NO situation where the numbers don’t go down.  And since nobody would ever count a compound movement as MORE than one set, they can’t EVER go up.

Yes I am beating a dead horse but I know how people read my articles. At least one person will claim “Lyle said all compound movements are worth one half a set for the other muscles involved” which I am not saying in the least.  I am saying this is my working assumption for lack of a better one and that’s all it is.  Again, use your own number that is lower than 1:1.  Just follow the logic here.

Yes, we need data on how to compare the exercises to see what the best counting approach would be.  I am aware of one that compared pulldown to biceps curls for recovery and the biceps curls took longer to recover from than the pulldowns so clearly it’s NOT 1:1.  The pulldowns didn’t hit the biceps as much as direct arm work did.  Dadoi.  Until more data exists, we make assumptions.

It might even and probably will turn out that different movements should be counted differently.  An undergrip pulldown is more biceps involvement than overgrip and a parallel grip is halfway in between (with more brachialis).  A high bar squat is more quads than low bar and a close grip bench is close to a compound triceps exercise but a flared elbow bench is more pec specific and how you’d count those towards triceps would likely differ (I’d call close grip almost 1:1 for triceps but flared elbow as 0.5:1), etc.  Back gets super complicated as we’re dealing with the traps (with multiple sections), rhomboids, teres, lats (which have two segments with slightly different orientations) and back movements work them to varying degrees based on movement, grip, bar, etc.

Coaches make adjustments for this based on experience.  If I were using overgrip pulldowns with a trainee, I’d give them slightly more direct biceps work to compensate compared to if they were doing undergrip pulldowns which would have worked the biceps more.  If they did V-bar rows, I’d make adjustments to biceps compared to doing an undergrip row.  I’d give a low bar squatter who sits back more direct quad work than one squatting high bar, for example.  Everybody with any real-world experience does this in practice to some degree.  We do this based on 1/2 guesswork, 1/2 experience, 1/2 science, and 1/2 intuition (and sometimes 1/2 luck).

Again, let’s not get too mired in the specifics here (as I do that very thing).  Follow the logic.

Rebuilding the Model: Part 1

And with my assumption of a 0.5:1 relationship, here are the re-mathed set counts for each of the 6 studies I’ve included. I’ve shown the original set counts in parentheses next to the re-mathed value and I probably messed at least one of these up because math is hard, my brain is tired, and I don’t bother to run it twice.  And this is total sets per week.  I’ve indicated in red which group did best (based on the analysis above) and might have even gotten it mostly right.  I am quite sure that anybody wishing to dismiss my conclusions based on a single typo will make me aware of that typo and I will change it because that’s the intellectually honest thing to do.

Study Muscle Low Moderate High
Ostrowski Lower 2 (3) 4 (6) 8 (12)
Upper 5 (7) 8 (14) 20 (28)
Amirthalingam/Hackett* Triceps 10-11 (16-18) 16 (26-28)
  Quads 7-8.5 (11-13) 8.5-9 (16-18)
Haun Increasing from 5-16 sets with a cap on growth at 10 sets for upper and possibly more (up to 16) for lower.
Schoenfeld Upper 3 (6) 9 (18) 15 (30)
Lower 6 (9) 18 (27) 30 (45)
Heaselgrave 6 (9) 12 (18) 18.5 (27)

*I grouped the two studies together since they were an identical methodology and only differed by length
and I’m getting tired of writing and making tables as it’s a pain in the ass in WordPress.

And this brings the results into even starker view.  Ostrowski fits with all of the other data, showing a clear dose response relationship for legs, although the lack of data above 8 sets limits this finding and we can’t know if more would generate more growth.  For upper body, 20 sets (down from 28) wasn’t better than 8 (down from 14).  Since 8 and 20 got the same growth, it seems unlikely that a middle value would get different results although it’s possible that 14 would but 20 was too much for some reason.  Without data, this is a guess and we need studies examining different intermediate values to know for sure.  Test 8, 14 and 20 next time.

For the GVT studies, 10-11 sets per week as a mix of compounds and isolation was as good as 16 sets/week for upper body.  Inasmuch as the differences were minuscule, 8.5-9 sets per week for legs was better than 7-8.5 but at this point we’re looking at a single set difference when it’s re-mathed and that alone would explain why the results were essentially identical (the stimulus was essentially identical).  You wouldn’t expect 1 set to matter but maybe if it were 7-8.5 vs. 15-16 it would.  That group has already done the same study twice; now do it a third time with real differences in lower body volumes (give them a second leg day).

Schoenfeld’s data becomes a lot less idiotic now and at least starts to pass the reality check, in line with the other studies.  9 sets was as good as 15 in terms of triceps growth (because his stats did NOT show that the highest volume was more than insignificantly superior to the moderate).  Even if you believe that his highest volume was superior, it’s cut to a realistic 15 sets per week from an absolutely moronic 30 sets per week.  This starts to fit the reality check and is still well within the realm of 10-20 sets.  Like I said, big picture whether you accept my contention that moderate was as good as high or potential trend for highest to be superior, when you count the sets rationally, it stops mattering, at least for upper body where both moderate and high fall within 10-20 sets/week.

We still lack data on chest growth per se and it might require more volume or it might not so whether or not the original values matter is unknown (i.e. does the chest somehow need 30 sets of direct work…I doubt it). Until it’s measured, we don’t know.  No study can address that yet  and the bench press only study I referenced in Part 1 didn’t compare different volumes although it would be hard to see how the 9 sets that worked for 6 months suddenly needed to be tripled after one year.   But he doesn’t get to count chest volume and then measure triceps to draw conclusions about optimal sets for all muscle groups (which he essentially did) and then decide that you have to count volume differently for isolation exercises after the fact (which he actually did).

For lower body the 18 sets was as good as 30, again passing the reality check.  Here, if you take his higher volume claims as better, that’s a pretty high set count (30 sets/week) although there might very well be a plateau value between 18 and 30 sets (we don’t know) which would be consistent with Haun (maybe) and anecdote.  Maybe.  If more than 20 sets IS optimal for legs (and this is still in the IF stage), a third group at 24 sets might have done better.  Testing 18 vs. 24 vs. 30 sets would be very informative but it has to be a lab that isn’t Brad’s.  The stats and strength gains still don’t support it and the fact that he lied about data should make his study inadmissible on fundamental grounds.

There’s still that pesky ECW issue to worry about above 20 sets per week which now ONLY the highest volume leg work in Brad’s study crosses (maybe that explains the almost significantly higher leg extension load volume.  Hahahahaha.  I’m never gonna stop laughing at that shit).  Then again, Haun was using pure compounds so that probably doesn’t make any sense as I think about it since I’m now comparing a compound only study to remathed sets.  So yeah, forget that bit, it’s wrong.  Based on initial volume, several groups in Schoenfeld cross 20 sets/week.  And that means ECW might be playing a role or artificially increasing the results.  I’d only note that with a spread of 18 to 30 sets, we don’t know if a middle value (i.e. 24 sets) would be superior until it’s directly tested.  Finally is Haeselgrave which found that 12 sets was better than 6 but no better than 18.5.

Rebuilding the Model: Part 2

Recreating the chart from above with the new numbers we get the following optimal volumes per week.

Study Optimal Volume
Ostrowski 8 sets lower body (no higher tested), 8 sets upper body
Amirthalingam/Hackett* 10-11 sets upper, 8.5-9 sets lower (the huge drop in set count is due to the single leg day)
Haun Increasing from 5-16 sets with a cap on growth at 10 sets for upper and up to 16 for lower.
Schoenfeld 9 sets for upper body, 18 sets for lower body (15 and 30 if you accept the highest volumes)
Heaselgrave 12 better than 6 but 18.5 no better than 12

So we get systematically lower numbers here, as expected.  And again, if you disagree with my 0.5:1 and use a different value, the numbers change slightly but they still all go down (i.e. if you use 3/4:1, Ostrowski’s leg data might be 10 sets instead of the original 12 or the 8 from my 0.5:1 assumption).  Basically, for any moderate set count, the differences in remathed sets just aren’t that significant.  I mean, consider a group that did 10 sets/week of compound work.  If I assume 0.5:1 that goes to 5 sets.  Assume 1/3 and it goes to 3.  Assume 2/3 and it goes to 6.  Assume 3/4 and it goes to 7.5 or whatever and we are looking at about a 4 set spread.  Use the lower ratio and it’s a little lower, use a higher ratio and it’s a little higher.  And it all more or less stays in the same overall range we’re looking at here.
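The sensitivity check in the paragraph above can be mathed out directly; a minimal sketch (values rounded to one decimal, and the ratios themselves are assumptions, as always):

```python
# How much does the chosen ratio actually matter for a moderate
# set count?  Take 10 weekly sets of compound work and re-count
# under several plausible ratios.
ratios = {"1/3": 1/3, "1/2": 1/2, "2/3": 2/3, "3/4": 3/4}

effective = {name: round(10 * r, 1) for name, r in ratios.items()}
print(effective)  # {'1/3': 3.3, '1/2': 5.0, '2/3': 6.7, '3/4': 7.5}

# Roughly a 4 set spread, and every value is below the original
# 1:1 count of 10, which is the point.
print(max(effective.values()) - min(effective.values()))
```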

But it’s ALWAYS less than the original value which is my point here.

Looking at the new numbers, Ostrowski’s upper body optimal volume is 8 sets/week.  Lower body matches 8 sets/week with no higher values tested so we can’t know what would happen above that.  The two GVT studies are 10-11 sets for upper and 8.5-9 sets for lower with no higher volumes of lower body tested.  Haun finds a cap on upper body of 10 sets (down from 20) and possibly up to 16 for lower (down from 32).  Brad’s numbers stop being totally moronic when you don’t count them in an ass-backwards way, with 9 sets for upper and 18 sets for lower body, generally matching the results of Haun.  Heaselgrave is at 12 sets for biceps but 18.5 was no better.  And all of this basically agrees with the original 10+ set meta-analysis even remathed (though its conclusions should probably change if it is remathed) except that we now have a much better idea of the upper caps on weekly set volume.  There’s a dearth of leg data at higher volumes and more study is needed here.

So my next to final comment: whether you look at the original unadjusted data or the semi-adjusted data for set count, you still see a general optimal range of 10-20 (original count)/8-16 (remathed count) sets per week per muscle group which is all close enough for government work.  Let’s just call it 8-20 sets/week and move on with our lives.  And again, this is consistent with Eric Helms’ own conclusions in MASS of 10-20 sets/week after his pitiful defense of Brad’s paper.

There is still the slight indication that *maybe* more sets/week for legs would be better but it’s understudied and any conclusions would be tentative as hell.  But there is no way on god’s green earth to justify the 30 and 45 sets Schoenfeld et. al. is so desperate to prove as optimal.  His own data doesn’t support it, his stats don’t support it, the bullshit apologism by everyone involved doesn’t support it, a rational re-analysis of the set count doesn’t support it and neither, his lie about the Ostrowski data notwithstanding, does the broad body of literature.  Nothing supports it except his burning desire to support it with guru games and because he’s believed all along in these types of high volumes.

Now We Refine the Model

Because like I said, this is how science works: you take all the available data and you make a model.  You don’t fixate on individual studies and it’s the overall body of literature that is relevant (again, I thank James Krieger for making my point for me on this).  And ignoring Radaelli, I have presented 5 studies showing that, in the aggregate, moderate volumes somewhere between 8-20 sets per week provide the maximal growth response and 1 that fails the reality check so hard it hurts, where data was lied about in the discussion (a fact that NOBODY has yet addressed directly for me) and which should be dismissed on that fact alone.  Legs maybe need a bit more volume but we need more data.

Now, it’s possible that more work will change that and I’ll change my model and opinion when and if they do.  But unless we do find out that Ultrasound doesn’t measure muscle growth or something and all of these studies go into the junk pile I won’t hold my breath.  They match one another despite varying methodologies and they match (for what it’s worth) with real-world training practices.  They pass the reality check is what I’m saying.  At best we’ll refine the above numbers with more targeted research.

That is, future research might start from the idea that 10 to 20 sets/week is optimal and determine what specific volume is optimal within that range.  Or more systematically compare lower body and upper body.  Perhaps look at 10, 15, 20 for upper body exercises and 15, 20, 25 for lower.  But stop doing 9, 18, 30 where the variance is just too huge to know what happens in the middle ranges.  Is 12 the same as 18, is 24 the same as 30?  Stop focusing on sets per exercise.  If you want to test sets per week, set it up to do that in a rational way.

Feel free to contact me with help with the study design.  I can also probably figure out how to pre-register the study, describe the randomization in the methods (and randomize the subjects in a blinded fashion to do the Ultrasound) and efficiently write the discussion with accurate data representation for anybody who just doesn’t have time…..

But realistically, based on the broad body of literature, optimal results are likely to be found between 8-10 and 20 sets per muscle group per week.  Once again, Eric drew this same conclusion after his desperate efforts to make Brad’s paper not be shit.  And I just read something by Bret Contreras of all people saying to stop doing insane volumes and focus on intensity.  Good lord, when Bret is the rational one and Mike Israetel is the intellectually honest one in all of this, Mercury must be in retrograde.  But Bret was on Brad’s paper and he better be careful or Brad will kick him out of the paper publishing circle jerk or prevent him from getting seminar appearances for not toeing the party line.

Ok, two more comments and I’m done.

The Generic Bulking Routine

With the above in mind, a rough volume of perhaps 10-20 (or 8-16 depending on the analysis) sets per week as an optimal growth number, I want to look at what I have presented for years as my Generic Bulking Routine.  This was an intermediate program I drew up absolute ages ago that has proven to work for intermediates for over a decade.  I report this only anecdotally and nothing more, I’m not James Krieger who thinks anecdote counts as ‘science’.  But if we’re going to pretend to integrate science and practice, then it is always nice when practice actually matches up with the science.

It was an Upper/Lower routine done 4 times per week with each day having the general structure shown below.  It was meant to be done as 2 weeks of a submaximal run up and then 6 weeks of trying to make progressive weight increases (progressive tension overload being the PRIMARY driver of growth, with sufficient volume within that being optimal) prior to backcycling the weights and starting over with the goal of ending up stronger over time.  It was meant to be an intermediate program used from about the 1-1.5 year mark of consistent training to maybe the 3 year mark before my specialization routines were implemented.

Weights didn’t HAVE to be increased every week or workout, that was simply the goal (as Dante Trudell put it in his Doggcrapp system, you should be trying to beat the log book at each workout).  In my experience, so long as folks were eating and recovering well and started submaximally, they could do so over relatively short time periods like this (over a longer training cycle, I’d do different things).  Women perhaps less so than men for unrelated reasons but no matter, Volume 2 is coming eventually…

I’m going to provide specific exercises in the template but just think of them as either compound or isolation for the muscles involved since exercise selection is highly individually dependent. RI is rest interval and note that I use fairly long ones so ensure quality of training with real weights and ideally all sets are at the same heavy weight (oh yeah, in ‘Merkun a single apostrophe is minutes and a double is seconds and I am told this is the opposite of the rest of the world).  Big compound movements get 3 minutes, smaller muscles get 2 minutes and high rep work gets 90 seconds since it’s meant to be more of a fatigue stimulus to begin with.

But for big movements, a 90 second rest interval is bullshit and means that you’re probably squatting with 95 lbs on the bar by your fifth set ‘to failure’.  Better to do fewer sets and give yourself long enough to do quality work.  In that vein, after the submax run-up, the goal RIR was maybe 2-3 for the initial set, which would likely drop to 1 or even near failure by the last set.  The goal is progressive TENSION overload over time (meaning multiple training cycles).  When your workouts don’t use stupid volumes, you’re in the gym the same amount of time but can actually do quality work, rather than trying to fit in 45 fucking sets and get done before tomorrow.

Upper SetsXReps (RI) | Lower SetsXReps (RI)
Flat Bench 3-4X6-8 (3′) | Squat 3-4X6-8 (3′)
Row 3-4X6-8 (3′) | RDL or Leg Curl 3-4X6-8 (3′)
Incline DB Bench 2-3X10-12 (2′) | Leg Press 2-3X10-12 (2′)
Pulldown 2-3X10-12 (2′) | Another Leg Curl 2-3X10-12 (2′)
Lateral Raise 3-4X6-8 (3′) | Calf Raise 3-4X6-8 (3′)
Rear Delt 3-4X6-8 (3′) | Seated Calf Raise 2-3X10-12 (2′)
Direct Triceps 1-2X12-15 (90″) | Abs: a couple of heavy sets
Direct Biceps 1-2X12-15 (90″) | Low Back: a couple of heavy sets

The exercises are for example only and the other two workouts per week could be a repeat of the same movements or different ones within that general structure (exercises can also be changed with each succeeding training cycle).  So start with incline bench and pulldown for the sets of 6-8 and do flat bench and row for the sets of 10-12 or whatever.

Now let’s add up the set count:

Compound Chest: 5-7 sets twice/week for 10-14 sets/week (counted as 5-7 sets for tris at 0.5:1)
Compound Back: 5-7 sets twice/week for 10-14 sets/week (counted as 5-7 sets for bis at 0.5:1)
Side delts: 6-8 sets per week.
Rear delts: 6-8 sets per week (gets hit somewhat by pulling but hard to math out)

Note: Effective delt volume is likely a bit higher than this but it’s a pain in the ass to estimate how much side delts do or do not get hit by compound pushing.  Or rear delts via compound pulling.  It might math out to 8-10 sets/week or less or maybe more.  Again, hard to say but most report just fine delt growth from the above (and no shoulder problems which is why it’s an upper/lower to begin with).

Bis/Tris: 1-2 direct sets added to compound work = 2-4 direct sets/week + 5-7 indirect sets/week = 7-11 sets/week.  Add a third or even fourth set if you like to get 8-12 or 10-14 sets/week of combined indirect and direct arm work.  I certainly agree that if you do heavy pushing and pulling you don’t need a lot of direct arm work.  I simply do NOT agree that the sets count 1:1.  But my workout designs usually have proportionally less direct arm work since I partially count the compound pushing/pulling and always have and always will.
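To make the counting convention concrete, here is a minimal sketch (my own illustration, not anything from a study) of the 0.5:1 indirect credit in Python:

```python
# Hypothetical helper illustrating the set-counting convention described above:
# each compound pressing/pulling set credits the arms at 0.5 sets, while
# direct arm sets count 1:1, over 2 upper body workouts per week.

def weekly_arm_sets(compound_sets, direct_sets, workouts=2):
    """Return (indirect, direct, total) weekly arm set counts."""
    indirect = compound_sets * 0.5 * workouts  # compound work counted at 0.5:1
    direct_total = direct_sets * workouts      # direct work counted 1:1
    return indirect, direct_total, indirect + direct_total

# Triceps in the template: 5-7 compound pressing sets and 1-2 direct sets per workout
print(weekly_arm_sets(5, 1))  # (5.0, 2, 7.0) -> low end: 7 sets/week
print(weekly_arm_sets(7, 2))  # (7.0, 4, 11.0) -> high end: 11 sets/week
```

Run with the template’s ranges, this reproduces the 7-11 sets/week figure quoted above.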

Let me comment before moving forwards that while this might seem like a low per-workout volume to some (and high to others), it matches the set count data from my analysis above.  As well, I have contended for years that if you can’t get a proper stimulus to your muscles with that number of sets, volume is not the problem.  Rather, you are.  Whether it’s due to suboptimal intensity, poor focus, technique sucking, etc., you are the problem with your workout.  Doing more crap sets will never top doing a moderate amount of GOOD sets.

Regardless, looking at it now, with 15 years of experience with it and the data analysis I just did, I might bump up the side delt volume a bit.   As noted above, the contribution from chest work is tough to really establish here and the delt has three heads with differing functions.  But no matter.  Let’s focus on generalities.  Which are that my general set count for this workout is and has always been right in the range of what the analysis of the majority of the training studies found to be optimal.

This template could be adjusted in various ways.  The second chest and back movements could be isolation, which would reduce the indirect set count for the arms, necessitating an increase in direct work.  So if someone did 4 sets of flat bench and 3 sets of incline flye, that’s still 14 sets/week for chest but it reduces indirect arm work to only 4 sets/week (8 sets of compound pushing per week divided by 2), so you bump direct arms to 3-4 sets per workout to get 6-8 direct sets per week plus 4 indirect sets for 10-12 sets per week.  I think that makes sense.  The point being that I am looking at total set counts per week (actually I was counting reps but it all evens out) and adjusting volumes for smaller bodyparts based on exercise selection.  If you use more isolation movements for chest or back, that decreases the indirect set count for bis and tris, so I’d add more direct work there.

The same holds for legs, where quads are worked for 5-7 sets twice weekly or 10-14 sets/week, same for hams and calves.  I might bump this up slightly, although high volumes of truly HEAVY leg work are pretty brutal: add a third movement like a leg extension and another leg curl for a couple of higher rep (12-15 rep) sets apiece.  Now it’s 7-9 sets twice a week or 14-18 sets/week.  That’s towards the higher end of volume but until we know for sure that it’s 20+, I’m not changing much here.  And, again, a workout with 20+ heavy sets of legs (including quads, hams and calves) is gruelling.

But overall upper body comes in at somewhere between 7-14 sets for upper body muscles and 10-14 for legs.  Again, intermediate program from like 1.5-3 years or so.

Those numbers look so very familiar.

The Wernbom Meta-Analysis

Let me finish by revisiting the original Wernbom analysis that looked at intensity, volume and frequency in terms of optimal growth.  It’s become pretty fashionable these days to dump on it for various reasons: it’s fairly old, there is more data now, and there was simply very little work done on intermediate, much less advanced, trainees at the time.

Irrespective of that, within moderate intensities (the typical ‘hypertrophy zone’ of perhaps 70-85% of 1RM), it concluded that a volume of 40-70 repetitions done twice weekly was optimal for growth, with triceps and quads being the muscles of interest.  I honestly think using reps per week is a better approach than sets since obviously 10 sets of 1 and 10 sets of 10 are not the same stimulus.  That said, since almost all of the work on this topic stays in the 8-12 rep range or so, set counts are at least conditionally appropriate.  Within any rationally accepted repetition range, it all just sort of balances out.

If you add up the reps on my GBR you can see where my numbers come from. I use a combination of heavy 6-8’s for tension and 10-12 or 12-15 for more fatigue which is why I mix them but you end up with roughly that number of reps for every muscle group (you can count reps on compound chest/back/legs as half the reps for arms but it should all math out more or less correctly because that’s how I set it up).

No, Wernbom’s analysis wasn’t done on well trained subjects, but none of the above studies used elite guys either, because a 1.1X bodyweight bench is not elite in men, it’s advanced noob.  Wernbom was working from a limited data set with, at best, limited work on even intermediate trainees (again, just like the above studies) and still concluded that 40-70 contractions twice a week gave optimal growth compared to lower and higher values.

So we double 40-70 and that’s 80-140 repetitions per week per muscle group.   Some quick maths.

At 10 reps per set 80-140 reps per week yields 8-14 sets per week.
At 8 reps per set 80-140 reps per week yields 10-16 sets per week.
A mix of 4X8 (32 reps) and 3X12-15 (36-45 reps) for 68-77 reps per workout is 14 sets/week.
A mix of say 5X5 (25 reps) and 3-4X10-12 (30-48 reps) for 55-73 reps per workout is 16-18 sets/week.

So for any rational workout design, an optimal repetition count of 40-70 reps/workout done twice per week for 80-140 total reps per week puts us somewhere in the realm of 8-18 sets/week for the optimal growth response.
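The quick maths above can be sketched in a couple of lines of Python (my own illustration; the rep figures are the Wernbom numbers quoted):

```python
# Convert Wernbom's optimal 40-70 reps/workout (done twice weekly, so 80-140
# reps/week per muscle group) into weekly set counts at a given reps-per-set.

def sets_per_week(weekly_reps, reps_per_set):
    return weekly_reps / reps_per_set

low, high = sets_per_week(80, 10), sets_per_week(140, 10)
print(low, high)  # 8.0 14.0 -> 8-14 sets/week at 10 reps/set
```

Plugging in lower reps-per-set values pushes the set count up, which is how the full 8-18 sets/week range falls out.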

Well whaddya know about that?

Training Volume and Muscle Growth: Part 2

So, continuing from last time, where I looked at four studies on the topic of training volume and hypertrophy (one of which I threw out based on what I consider absurd results that made zero sense), I want to look at the remaining three studies (the ones that came out in the past few weeks) to complete the set.  I’ll do the same basic analysis, and this will all lead into the final Part 3 where I’ll look at the results in overview to see if any general conclusions can be drawn regarding the questions I originally posed.

Effects of Graded Whey Supplementation during Extreme-Volume Resistance Training

The next paper is by Haun et al. and was published in Frontiers in Nutrition in 2018.  It is notable for having been (at least partially) funded by Renaissance Periodization and having Mike Israetel as one of the authors.  I do NOT mention this to dismiss it out of hand in a “Who funded it?” kind of way because I think that’s crap.  I mention it only because it was clearly an attempt to support/test Mike’s ideas about volume, MRV, etc.  Of some trivial interest, it literally came out about 3 days before Brad Schoenfeld et al.’s paper.  Which is a shame because, had it come out last year, it would have brought up a critically important issue that will have to be considered going forwards on this topic.  More below.

As you’ll see, it also had a semi-negative outcome in terms of NOT supporting what I suspect Mike was trying to prove to begin with.  Yet it was still published (and I am told that Mike has adjusted his workout template volumes down in response).  That’s the mark of intellectual honesty since I’m 100% sure they wanted the opposite result of what they got.  Or Mike did anyhow.  Not only did they publish an essentially negative finding but Mike adjusted his recommendations (mind you, it would have been faster to have just listened to me in the first place but no matter).

The paper actually had a couple of different goals.  One was to examine the muscle growth response to progressively increasing volumes, basically to see what happened as volume went up weekly.  This is where I will put my focus as hypertrophy has been my criterion endpoint from the outset.  A second goal was to see if increasing protein intake along with increasing volume had any additional benefit (the title makes it sound like this was the primary goal and it might have been; whatever, there were two goals).  The idea being that getting optimal growth from increasing volumes might require more protein or whatever.

Dietarily, the groups were either supplemented with maltodextrin, a single serving of whey or given graded doses of whey protein (from 25-150 grams/day).  But they all did the same training program.  Since the dietary manipulation ended up having zero effect on the results, I won’t mention it again.

In it, 34 subjects were recruited with 3 dropouts, resulting in 31 total subjects.  The participants were resistance trained, with at least 1 year of self-reported resistance training experience and a back squat 1RM of greater than or equal to 1.5 times body weight.  The subjects performed the following workout with barbell back squat, barbell bench press, barbell SLDL and lat pulldown at every workout.  So one compound movement per muscle group(s).

Haun Squat Workout

The sets were done at 60% of maximum and you can see how training volumes increased weekly from 10, 15, 20, 24, 28 and finally 32 sets/week.  Yes, 32 sets per week of squats.   Even at 60% of max, well…yeesh.  That makes GVT look sane by comparison. That said…

It’s a little bit tough to tell from the methods how the workout was performed but it almost looks like one set of each movement was done before moving to the next, the next and the next and then going back to the first exercise.  As 2′ were given between each exercise unless the subject wanted to go sooner or needed a bit longer, this is like 10 minutes between sets of any individual exercise.  That is assuming I’m reading this right.

Exercises were completed one set at a time, in the following order during each training session: Days 1 and 3—barbell (BB) back squat, BB bench press, BB SLDL, and an underhand grip cable machine pulldown exercise designed to target the elbow flexors and latissimus dorsi muscles (Lat Pulldown); Day 2— BB back squat, BB overhead (OH) press, BB SLDL, and Lat Pulldown. A single set of one exercise was completed, followed by a set of each of the succeeding exercises before starting back at the first exercise of the session (e.g., compound sets or rounds). Participants were recommended to take 2min of rest between each exercise of the compound set. Additionally, participants were recommended to take 2 min of rest between each compound set. However, if participants felt prepared to execute exercises with appropriate technique under investigator supervision they were allowed to proceed to the next exercise without 2 min of rest. Additionally, if participants desired slightly longer than 2 min of rest, this was allowed with intention for the participant to execute the programmed training volume in <2 h each training session.

Subjects reported their Reps in Reserve (RIR) for each set, which just means how many more reps they think they could have done.  So an RIR of 3 on a set of 10 means they think they could have done 13 (and this method is fairly accurate for trained folks).  The average RIR started at roughly 3.7±1 and this went up slightly to 4.3±1.6 by week 6.  So on their sets of 10, they could have done 13-14 reps.  Basically, it was all pretty submaximal, as would be expected for sets of 10 at 60% with an almost 10 minute rest interval.  Ten reps is usually ~75% of 1RM and with a 10 minute rest there simply isn’t any accumulated fatigue occurring.
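The RIR bookkeeping is simple enough to sketch (my own one-liner, not anything from the paper):

```python
# Reps in Reserve (RIR): estimated rep capacity = reps actually performed + RIR.
def estimated_capacity(reps_done, rir):
    return reps_done + rir

print(estimated_capacity(10, 3))  # 13: an RIR of 3 on a set of 10
print(estimated_capacity(10, 4))  # 14: roughly the week 6 average RIR
```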

Body composition changes were measured by DEXA (which is at best rough for estimating true muscular change) but muscle thickness via Ultrasound was also measured for the vastus lateralis (VL) and biceps.  Biopsies (where a chunk of muscle is literally cut out of it) of the vastus lateralis were also taken to determine the physiological cross sectional area of actual muscle fibers.   While way more invasive (with its own limitations), biopsy is arguably a far more direct method of assessing fiber size changes since you are literally looking at the muscle fiber area directly.

Of some interest, total body water (TBW) and extra cellular water (ECW) were measured via BIA (it’s only real use) as this is representative of inflammation and edema (basically, the body retains water when inflammation is present) and this will be important in a moment.   Measurements of mood (POMS, profile of mood state) and muscle tenderness were also measured, basically to check for overtraining and inflammation although I won’t focus on this.  This was a thoroughly done study to be sure.

Measurements were made before week 1, at week 3 and again at week 6.  And here is where it gets interesting since the results ended up being pretty different between Weeks 1-3 and Weeks 4-6 as the volumes got stupid.  First let me look at the lean body mass changes.  From week 1 to week 3 (10->20 sets/week), the subjects gained 1.35 kg/3ish pounds, with that value dropping to 0.85 kg/2ish pounds from week 3 to week 6 (20->32 sets/week).  So already there is a reduction in training gains, with more volume producing smaller gains.  Still, size is size and 2 lbs in 3 weeks is still good, right?  Hang on.

The study did something I haven’t seen before, which was to correct the change in LBM for extracellular water (ECW), which will show up as LBM.  Basically swelling and edema (not the same as sarcoplasmic hypertrophy) that can occur.  And when this correction was made, the LBM change went from 1.35 kg/3ish pounds to 1.2 kg/2.6 lbs from week 1 to week 3, about the same as before, and from 0.85 kg (2ish pounds) to a statistically insignificant 0.25 kg/0.55 lbs from week 4 to week 6.  So they gained about 0.9 lbs/week for the first 3 weeks and this dropped to an insignificant 0.2 lbs/week from week 3 to 6 when the volume got stupid.  Basically the increased LBM in the last three weeks of the study was just fluid accumulation (perhaps lending credence to the idea of pump ‘growth’ occurring, just not by the previously thought mechanism).

I’ve shown this data below.

Measure | Weeks 1-3 (10->20 sets/week) | Weeks 4-6 (20->32 sets/week)
LBM gains | 1.35 kg (3ish lbs) | 0.85 kg (2ish lbs)
ECW-corrected LBM gains | 1.2 kg (2.6 lbs) | 0.25 kg (0.55 lbs)
ECW-corrected gains/week | 0.4 kg (0.9 lbs) | 0.08 kg (0.2 lbs) BFD
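The per-week rates follow from simple division over each 3-week block (a sketch of my own using the ECW-corrected values as quoted here):

```python
# ECW-corrected LBM gain per week, from the study values quoted above,
# where each measurement block spans 3 weeks.

def rate_per_week(total_gain_kg, weeks=3):
    return round(total_gain_kg / weeks, 2)

print(rate_per_week(1.2))   # 0.4 kg/week for weeks 1-3
print(rate_per_week(0.25))  # 0.08 kg/week for weeks 4-6: basically nothing
```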


In the abstract (which most in the industry don’t read past) the authors try to spin this as the higher volumes still being superior but in the discussion itself they state:

Thus, when considering uncorrected DXA LBM changes, one interpretation of these data is that participants did not experience a hypertrophy threshold to increasing volumes up to 32 sets per week. However, if accounting for ECW changes during RT does indeed better reflect changes in functional muscle mass, then it is apparent participants were approaching a maximal adaptable volume at ~20 sets per exercise per week.

Note their use of the word ‘if’ after ‘however’ in the last sentence.  They are kind of hedging their bets here, and it is NOT yet clear whether you do or do not have to correct for ECW when measuring changes in muscle thickness via Ultrasound.  It’s hard for me to see how it wouldn’t matter, given how Ultrasound works, but that may reflect my own bias on the matter.  In their discussion they do state:

In this regard, Yamada et al. (24) suggest expansions of ECW may be representative of edema or inflammation and can mask true alterations in functional skeletal muscle mass. Further, these authors suggest the measurements of fluid compartmentalization (e.g., ICW, ECW), which are not measured by DXA, are needed if accurate representation of functional changes in LBM are to be inferred.

Suggesting that ECW can skew Ultrasound measurements and must be both measured and accounted for to get any idea about actual changes in skeletal muscle size or amount.  More importantly, this has to be directly EXAMINED before we go forwards with any more high volume studies.  I’ll harp on this but we must KNOW if the increase in ECW/edema impacts on measurements or not.

Regardless, a cap was seen where anything over 20 sets had no further advantage in total LBM gains.  But as I’ve said, LBM gains themselves aren’t necessarily indicative of actual muscular growth since they can represent a lot of things (perhaps the increased LBM even in weeks 1-3 was glycogen storage; since it was not measured, we can’t know).

So let’s look at the changes in thickness via Ultrasound.   As with Radaelli I’m not providing specific numbers because the raw data wasn’t presented, the graphics in the paper are tiny and my eyes hurt trying to read them.  So I can’t even begin to try to extrapolate it out (and if I did I’d be subject to my bias guessing at the numbers which I won’t do).   But here it is.

Haun Figure 5 Data

The verbiage in the results is even more obscure in terms of what happened.  The best I can do is quote from the legend of Figure 5 above.

Muscle thickness and VL fiber size differences between supplementation groups. Only a significant time effect was observed for biceps thickness (A) assessed via ultrasound with MID values being greater than PRE- and POST values. Only a significant time effect was observed for VL thickness (B) assessed via ultrasound with MID values being less than POST values. Panel (C) provides representative images of ultrasound scans from the same participants. Only significant time effects were observed for total fiber cross sectional area (fCSA) (D), type I fCSA (E), and type II fCSA (F) assessed via histology with MID values being less than PRE and POST values. Panel (G) provides representative 10x objective histology images from VL biopsies of the same participant. All data are presented as means ± standard deviation values, and values are indicated above each bar. Additionally, each data panel has delta values from PRE included as inset data. MALTO, maltodextrin group; WP, standardized whey protein group; GWP, graded whey protein group.

This is really hard to parse.  By Ultrasound, biceps thickness was bigger at week 3 than at the beginning or the end of the study, possibly suggesting that it increased in size and then decreased again.  For quads it’s even weirder.  By the Ultrasound, the Week 6 size was greater than the Week 3 size but, statistically, neither seems to have been different from the starting value.  So while there does appear to be growth from Week 3 to 6, there was no net size gain from Week 1 to 6.  As I’ll discuss below, the biopsy data showed shrinkage from Week 1 to 3 before regrowth to Week 6 and it’s possible that the Ultrasound simply didn’t pick up the same pattern statistically.  Perhaps the larger point is that there was no net growth from Week 1 to 6.  That’s a lot of squatting to achieve nothing.

Mind you, overall, the actual total change in muscle thickness via Ultrasound appears to have been absolutely tiny to begin with.  They state:

When summing biceps and VL thicknesses at each level of time, there were no significant differences between groups at each level of time. However, a significant main effect of time revealed that the summed values of thickness measurements were significantly higher at POST compared to PRE (p = 0.049). The summed value at POST was 7.16 ± 0.77 cm where the summed value was 6.98 ± 0.81 cm at PRE (data not shown).

What they actually did here was add up the total growth for the biceps and VL to make the number larger (and presumably reach statistical significance).   And looked at that way, it was a bit higher at 6 weeks than at the beginning of the study.  But look at that change, the summed total of both muscles was all of 0.18 cm (7.16 – 6.98) or 1.8 mm TOTAL (if this was split evenly and I don’t know if it was, that’s less than 1 mm growth per muscle).  And the only way they got this to be significant was by adding up two different muscles (a scientific approach called playing silly buggers).  It would be like a training study where the subjects improved their bench by 15 lbs and squat by 25 lbs and neither were meaningful but if you ADD THEM UP, suddenly the total 40 lb improvement from the training program is significant.  Feh.
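The arithmetic is worth seeing laid out (my own sketch; the values are the PRE and POST sums quoted above):

```python
# Summed biceps + VL thickness change, from the quoted PRE and POST values.
pre_sum_cm, post_sum_cm = 6.98, 7.16

delta_mm = round((post_sum_cm - pre_sum_cm) * 10, 1)
print(delta_mm)      # 1.8 mm total across BOTH muscles combined
print(delta_mm / 2)  # 0.9 mm per muscle IF split evenly (the split wasn't reported)
```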

Looking at the quad biopsy data, quad size actually went down from Week 1 to 3 (visible in D/E/F in the figure above) before returning to the starting size by week 6. This backs up the tentative conclusion on the Ultrasound above.  Basically at the lower volumes, quads shrank before increasing to their starting size at the end of the study.  If they hadn’t measured at week 3, they would have seen zero change from start to finish.

This is perhaps the more interesting finding: that there is a possible discrepancy in growth requirements between the upper and lower body.  For the upper body, biceps size went up from week 1 to 3 before going back down, ending up right where it started: a lot of work to achieve nothing.  Not only was more volume not better, it was worse.  In contrast, quads (by biopsy anyhow) shrank from Weeks 1 to 3 before growing from Weeks 4 to 6.  This suggests that the lower volumes early on were insufficient but doesn’t change the fact that, by biopsy, there was no overall muscle growth from Week 1 to 6.  The subjects did a LOT of squatting to make zero gains.

This does suggest, as I’ve noted already, that upper and lower body might have different optimal volume requirements.  Perhaps if the quad volume had started at a higher level, there would not have been size loss from Weeks 1 to 3 and there might have been a net size gain from Weeks 1 to 6.  Perhaps if upper body volume had been capped at 20 sets, there would have been no shrinkage, or even a further increase in size.  Perhaps, perhaps, perhaps.  Since they didn’t do it, we can’t know.

This is all still colored by the fact that any total growth of any sort was absolutely tiny.  When you have to add both values together to get significance, that’s pretty damn telling.  This is what the Ostrowski study got in triceps alone with 14 sets/week.  Mind you, this study was a mere 6 weeks, which is seriously short.  Perhaps we should consider that significant growth for that time period.  Perhaps we shouldn’t.  Perhaps over a longer study, or with slower volume increases, different results would have been seen.  Perhaps, perhaps, perhaps.  This is what they did and this is the data they have.  In this vein they state:

Finally, while a 6-week RT program seems rather abbreviated, we chose to implement this duration due to the concern a priori that the implemented volume would lead to injuries past 6 weeks of training.

Which is a consideration a lot of people seem to be missing in the current volume wars.  Even IF super high volumes generate better growth, what happens when they are followed over extended periods?  I’ll tell you what happens: overuse and other injuries, overtraining, burnout, etc., none of which are beneficial for long-term progress.  It’s just something nobody seems to be talking about outside of a few folks on my Facebook group.

Simply, even IF massive volumes generate better short-term growth (and the overall indication is that they do not), if they get you hurt, that’s not a good thing.  Long-term progress is the goal here and that usually means a more moderate approach over time.  Put differently, it’s better to do a series of training cycles with 15 sets/week that get some growth than to do one with 30 sets/week that keeps you out of the gym for 3 months with tendinitis.

Since it will likely be used to rebut the above, let me note that Brad Schoenfeld’s study, discussed next, used high volumes over 8 weeks and saw no dropouts due to injury.  Well, 8 vs. 6 weeks is not much of a difference and it’s when absurd levels of training are followed for months at a time (as most tend to do) that people get into trouble.  Try it for 3-4 months and get back to me is what I’m saying.  As well, the nature of the workout in Brad’s study had some real implications for the training poundages that were used.  It’s one thing to do all the volume when the overall loading is low and another to do it when it’s heavy.  Only the latter wears stuff out.

Ignoring that for the time being, there is another issue that has to be addressed, which is the ECW issue I mentioned above.  IF increased ECW is throwing off the Ultrasound, then it is specifically impacting the high volume training conditions and it MUST be accounted for.  But first it has to be determined if it impacts anything.  Do a pilot study, do something, but figure this out before another of these damn studies is done.  All you have to do is measure muscle thickness when there is no increase in ECW, then do something (like 32 sets of squats) to increase ECW and re-measure at a point when true growth could not have occurred.  BIA can clearly pick up ECW so this shouldn’t be terribly difficult to do methodologically.

Either muscle thickness does or does not change in response to it.  If it doesn’t, there’s no problem.  If it does, there is a HUGE problem where high volume studies are now measuring changes in ECW rather than actual growth. Or, at the very least, ECW changes are causing an artificial elevation of the Ultrasound BUT ONLY IN THE HIGH VOLUME GROUPS which would make it look like they are generating MORE growth than they actually are.  But ONLY in the high volume groups.

There is a second issue I want to discuss.  A current theme in the training world is that “volume is the primary driver on hypertrophy” and this study was clearly designed to test that.  Except that what it really did was disprove the idea entirely.  Keep in mind that the intensity of training was 60% of maximum with stupidly long rest periods and the subjects reported 3-4 reps in reserve throughout the study.  So pretty submaximal for endless sets of 10 with little to no cumulative fatigue (even Poliquin’s original GVT was 10X10@60% on a 1 minute rest interval and it got HARD by set 10 as fatigue set in).

And despite throwing all the volume at folks, growth was, factually, sucky, a mere 1.8 mm which they only achieved by adding up biceps and VL to begin with.  Basically, with insufficient TENSION overload, all the volume in the world doesn’t generate meaningful growth.  It does cause lots of water retention and maybe this does lend some credence to the whole idea of pump growth, it just happens to be increased ECW rather than sarcoplasmic growth.  So pump it up with light weights and you look bigger (maybe, for a little bit) and weigh more.  But it’s just fluid accumulation.  Or you could lift real weights for less volume and actually grow better.

And all of the above is true because TENSION is the primary driver on hypertrophy and this study shows it in spades (in his apologist article, Eric even states this very thing, citing the same study I’ve been referring people to for almost 20 years).  Ostrowski got much more growth in the triceps (2 mm) with only 14 sets by using heavy loads (admittedly over a longer time period).  So did the two GVT studies (which at least had some heavy work, 70% for sets of 10).  A relatively small study by Mangine showed the same thing (when I mentioned this study to Brad he hand-waved it away for reasons I forget): much lower volumes at a higher intensity generated more growth and strength than higher volumes at a lower intensity.

Tension/progressive tension overload trumps volume every time.

It really is that simple: tension/progressive tension overload beats volume every time.  Yes, volume plays a role but, in the absence of sufficient tension, it doesn’t mean shit.  That means that volume can’t be the primary driver because that’s not what primary means.  If this is unclear, consider the following:

  1. Insufficient intensity + high volumes = dick for growth (this study)
  2. Sufficient intensity + low volumes = growth (all other studies)
  3. Sufficient intensity + higher volumes SOMETIMES = more growth (all other studies)

What’s the common factor in the two situations that generate growth? It’s tension, NOT volume.     That makes tension overload primary as without it, growth is effectively nil.  Volume is purely secondary to sufficient tension overload and there’s no escaping that fact.  Yes, more volume AT A SUFFICIENT INTENSITY may yield more growth. But ALL the volume at submaximal intensities accomplishes jack squat.

Since I imagine some will use them to rebut the above, I should address the low-load training studies (usually taking 30% of 1RM to FAILURE) that do generate growth.  Note my capitalized word.  By taking light loads to failure, the muscle fibers ARE exposed to high tension at the end of the set.  This is due to fatigue occurring and requiring the high-threshold fibers to be recruited to keep the weight moving until FAILURE occurs.  And this is distinctly different from doing 10 non-fatiguing sets (remember, RIR never went below 3-4) at 60% with 10 minutes of rest.  The latter simply never exposes the muscle fibers to a high-tension overload.  And it clearly doesn’t work very well (if at all).  Even if it did, why bother with 20 light sets when, most likely, 8-10 heavy sets would be more effective?


As their own conclusion in the discussion (not the abstract) shows, the growth response seemed to hit a top-end cap at 20 sets, with more sets generating insignificant gains in LBM, just increased water storage due to edema/inflammation.  It also suggests a difference in optimal volume levels for the upper and lower body, with the upper body seemingly responding better to lower volumes and the legs actually getting smaller with those same lower volumes (before returning to their initial size as volume went up).

But this has to be considered along with the fact that the overall growth was miniscule, only being even remotely significant with the silly buggers method of adding biceps and VL growth together.  Admittedly, this was a short study using insufficient loads (also demonstrating clearly that volume is NOT the primary driver on growth, because volume didn’t drive jack shit for growth here).

Perhaps the most interesting point is that water retention may have to be accounted for going forwards since it can clearly skew the results and make it look like more growth is occurring than actually is.  And since it looks like higher volumes cause more water retention than lower ones, well….this has HUGE implications for a lot of volume studies.  Because if water retention is ONLY an issue beyond 20 sets/week, then any study comparing volumes below and above that level has a confound that must be taken into account for the high volume groups (but NOT the lower).

That is, if you compare 12 sets to 24, the 24 set group MUST be adjusted for ECW but the 12 set group doesn’t have to be.   And IF that ECW is shown to impact on Ultrasound measurements, this means that any “apparent” benefit of the highest volumes might just be a measurement artifact of increased ECW.  Note my use of the word might.  We don’t know if ECW colors Ultrasound or not.  But it MUST be studied before another one of these studies is done.

Bringing me to Brad Schoenfeld’s study.

Resistance Training Volume Enhances Muscle Hypertrophy but not Strength in Trained Men

This is the paper by Brad Schoenfeld et al. that has been the driver of all of the recent Internet drama.  It’s been discussed to death and I’ll try to keep this short (and probably fail).  It took 45 college aged men with at least 1 year of training experience (the single reported value was something like 4 ± 3.9 years).  Of those initial 45, 11 dropped out (not due to the workouts) leaving 34 total subjects.

They did the following workout: flat barbell bench press, barbell military press, wide grip lat pulldown, seated cable row, back squat, leg press and one-legged leg extension, performed three times weekly for either 1, 3 or 5 sets per exercise.  For the upper body, this meant that the weekly volumes ranged from 6 to 18 to 30 sets and for the lower body it was 9, 27 or 45 sets per week.  Sets were 8-12RM (to concentric failure, supposedly) on a 90 second rest interval.  Which is already failing the reality check.

Let me ask:

  1. How many people have you ever seen squat to failure voluntarily?  By this I mean continue lifting until they fail mid-rep and either need a spotter to get them to the top or lower the bar to safety pins or dump it?
  2. How many could do 5 sets of squats to failure with 90 seconds rest with any decent poundage?

The answers will likely be:

  1. Almost nobody.  I’ve done it, I’ve had the occasional trainee do it.   I’ve seen almost none do it on purpose.  If failure occurs on squats, it’s usually because the person is a macho dipshit and something goes desperately wrong mid-set (or it’s a powerlifter missing a heavy max).  But few, if anybody, do it on purpose.
  2. Literally nobody.  Well, certainly no men, who would never survive it.  Due to differences in fatigability, women would be far more likely to survive this, but the study only used men so that doesn’t matter.  Do a true set of 12RM in the squat and you have to lie down for a few minutes.  5 sets of 12RM on 90 seconds rest?  Maybe with 95 lbs on the bar, and then the first set wouldn’t be to failure.  What you’d probably see is a decent first set with the poundages or reps dropping excruciatingly with every set.  This would be the definition of junk volume; anything less than 2-3 minutes rest between heavy sets of squats anywhere near limits is a minimum.  I’d love to see what poundages were actually used in the high-set workouts because I bet they started ok and ended up as jack shit.

Which once again raises immediate questions about the study in the sense that the high-volume workouts were basically impossible to complete but no matter.

Body composition was, shockingly, not measured (even post-training weight wasn’t reported, which is a bizarre oversight), especially given that the head researcher apparently likes to crow about the super amazing InBody body comp device in his lab.  Sure, BIA is shit but if you have this amazing gizmo, why not use it?  Guess they couldn’t find the time between not pre-registering the study, doing the unblinded Ultrasounds, not describing the randomization in the paper, and figuring out how to misrepresent the Ostrowski data to change its actual conclusion to match theirs.

However, muscle thickness for the triceps, biceps, vastus lateralis and rectus femoris was measured by Ultrasound.  I’ve presented the data for each muscle group in the chart below.  All that is presented is the mm change for each muscle and the %age change this represents (I calculated this by dividing the average change by the average starting value in the study).  I’ve indicated sets as sets/week for the upper and lower body.  I’ve also indicated, using THEIR statistical methods, which groups were different from which.  An asterisk means a value is different from a non-asterisked value but equal to any other value with an asterisk.  NS means statistically non-significant.

Upper body:

|                              | 6 sets/week | 18 sets/week | 30 sets/week |
|------------------------------|-------------|--------------|--------------|
| Triceps Change               | 0.6 mm      | 1.4 mm NS    | 2.6 mm NS    |
| Triceps %age Change          | 1.3% NS     | 2.9% NS      | 5.5% NS      |
| Biceps Change                | 0.7 mm      | 2.1 mm*      | 2.9 mm*      |
| Biceps %age Change           | 1.6%        | 4.7%*        | 6.9%*        |

Lower body:

|                              | 9 sets/week | 27 sets/week | 45 sets/week |
|------------------------------|-------------|--------------|--------------|
| Rectus Femoris Change        | 2.0 mm      | 3.0 mm       | 6.8 mm       |
| Rectus Femoris %age Change   | 3.3%        | 5.1%*        | 12.5%*       |
| Vastus Lateralis Change      | 2.9 mm      | 4.6 mm       | 7.2 mm       |
| Vastus Lateralis %age Change | 5.0%        | 7.9%*        | 13.7%*       |
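The %age change column above is simple arithmetic; here is a minimal sketch of the calculation, where the baseline thickness is illustrative only (back-calculated from the table, not a value reported in the paper):

```python
# Sketch of the %age change calculation: average change divided by
# average starting value, times 100. The baseline value below is
# illustrative, back-calculated from the table, not from the paper.

def pct_change(avg_change_mm: float, avg_baseline_mm: float) -> float:
    """Return the change as a percentage of the starting value."""
    return avg_change_mm / avg_baseline_mm * 100

# A 0.6 mm change on a ~46 mm baseline works out to roughly 1.3%,
# matching the triceps low-volume cell above.
print(round(pct_change(0.6, 46.0), 1))
```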

So despite a numerical change, the triceps showed no significant differences between groups even though the absolute values were quite different: 1.4 is over double 0.6 and 2.6 is nearly double that.  But it was non-significant.  Such is the beauty of statistics, where an apparently real change can be statistically irrelevant while an irrelevant real-world change can be significant.

For biceps, the 18 and 30 set/week groups were not statistically different from one another but both were higher than the 6-set group (and somehow the smaller absolute changes in this case were significant).  This is likely due to the study being statistically underpowered, and I guess there is a small trend for 30 sets to be superior, but that’s an enormous increase in volume and training time for a relatively small extra gain (15 more sets/week for 0.8 mm).  Basically, the percentage change hides the fact that going from 6 to 18 sets tripled the growth while adding another 12 sets got only about 40% more.  Even if the higher volumes are superior, it’s a terrible return on investment.

The same is true for the RF and VL changes.  Both the 27 and 45 set groups were better than the 9 set group but were NOT statistically different from one another.  But there is a visible trend present.  Clearly 12.5% is more than double 5.1% and 13.7% is a little less than double 7.9%.  Like in Ostrowski, the visible trend simply didn’t show up as a statistical difference.

Let me note that the above is true for all statistical methods applied: there was NO statistical difference between the moderate and high volume groups although both were better than the lowest volume groups.  This actually makes sense given the general improvement in growth up to 10+ sets.  But there is also the issue of the spread of sets.  The jump from 6 to 18 to 30 sets is big and 9 to 27 to 45 is enormous.  What happens in-between those values?  We don’t know.

Now when the P-value statistics failed to show a benefit, James Krieger invoked something called Bayesian statistics (a probability approach that I only vaguely understand and won’t attempt to discuss further).  Here the BF values were roughly 1.2 for the 3 set and 2.4 for the 5 set group, and this was claimed over and over by James Krieger as double the probability of a real difference in favor of the higher volume group, usually to deflect my direct questions.  But in Bayesian statistics, numbers that low are called a weak effect (as Eric honestly reported, even if he still concluded it meant something).

As a friend with much more experience in statistics explained:

Know what another way to say “weak” effect is? “Not worth more than a bare mention” (that comes from the Jeffreys (1961) reference that the Raftery (1995) paper [cite 20] refers to).

Cite 20 was in Brad’s paper and Jeffreys (1961) is presumably one of those ancient statistics references.  Basically they cite a paper on Bayesian statistics under which those BF values are barely worth a mention (until they get to something like 100+ they just don’t mean much).  Yet Brad and James et al. used it to draw their conclusion and James continually repeated it in lieu of answering direct questions.  I believe his response to me was “in Lyle’s world a doubling of the chance of rain isn’t important.”  Not when the doubling is from 0.1% to 0.2% it’s not, James.  Is it in yours?  Because the simple fact is that double jack shit is still jack shit.
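To make that interpretation scale concrete, here is a sketch that buckets a Bayes factor using the Raftery (1995) cutoffs, one common adaptation of Jeffreys’ scale; other authors draw the boundaries slightly differently, so treat the exact numbers as illustrative:

```python
# Bucket a Bayes factor using Raftery (1995) cutoffs, a common
# adaptation of Jeffreys' scale. Boundaries vary by author; this is
# illustrative, not the exact table from the paper under discussion.

def interpret_bf(bf: float) -> str:
    if bf < 1:
        return "favors the null"
    if bf <= 3:
        return "weak / barely worth mentioning"
    if bf <= 20:
        return "positive"
    if bf <= 150:
        return "strong"
    return "very strong"

# The BF values thrown around in this debate (~1.2 and ~2.4) both
# land in the weakest evidence category.
print(interpret_bf(1.2), "|", interpret_bf(2.4))
```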

Basically, the paper was really struggling to make the highest volume group look better than the moderate volume group (the P-hacking I mentioned in Part 1) and I’ll link out to the extremely detailed statistical analysis my friend did on the topic for anybody interested in this. Basically none of the statistical metrics used support that the high volume group was better than the moderate volume group although I will still acknowledge that there was a trend, just as with Ostrowski (it would be disingenuous for me to acknowledge it for one paper but not this one).

I’ve brought up my other issues with this study endlessly and said at the outset that I wasn’t going to harp on methodological issues for any one paper; it would be unfair of me to do it for this one only.  If Eric thinks that the argument “Well, other papers are methodologically flawed so it’s ok for this one to be” is correct, I’ll concede the point and apply that rule.  This makes all studies fair game by his own argument (recall that I dismissed Radaelli on its totally nonsensical results, not its methods).

He opened the door, I get to walk through it.   Eric, you played yourself.

One oddity is that, Eric’s and James’s attempt to dredge up nonsense about leg extension volume load notwithstanding (data once again NOT reported in the paper and ONLY brought up when they were desperate to maintain the bullshit), the strength gains in all groups were identical over the length of the study.  This fails the reality check so hard it hurts.  It contradicts literally every past study on the topic showing that strength gains increase with increasing volume (at least up to a point).  Several explanations were suggested, from the length of the study to the repetition ranges that were used.  But it still fails the reality check.

However, given that muscle size and strength show a strong (but imperfect) relationship, the lack of differences in strength gains implies something else: a lack of differences in muscular size gains.  Same size gains, same strength gains.  QED.  Note my use of the word IMPLIES.  But there is a big issue when this paper runs basically opposite to every other one.  One argument I’ve seen is that the short rest periods prevented the weights from being heavy enough.  Well, that’s true.

Brad himself has published a paper showing that longer rest intervals beat out shorter ones.  So why design the study like this to begin with?  Brad says it was for practical reasons, to keep the workout to about an hour.  Which raises a practical issue with insane volumes: with a real rest interval, workouts with this much volume are impossibly long (I believe the 32 set workouts in Haun were about 2 hours).  But it still fails the reality check and contradicts all other literature on the topic.  Brad has since spun this to the NY Times as “You can make strength gains with 13-minute workouts”.

Now, acknowledging that there is a trend (albeit one that could not be forced into statistical significance despite throwing multiple statistical methods at it), let’s go back to the Ostrowski data since that is what Brad compared his results to in his paper’s discussion of muscle growth.  In doing so, Brad represented the leg data as 6.7% for the low volume group and 13% for the high (3 and 12 sets respectively), stating that this is similar to his own data.  This is accurate but incomplete and you have to wonder why he left out the middle data point.  Was it to save ink (on a PDF, and yes I know that the article is also printed, spare me)?  Was he running out of words?  The journal doesn’t have a word count limit so that’s not it.  It’s not even as if adding that information would have added more than 3-4 words.  Allow me to demonstrate:

Brad wrote this:
“Ostrowski et al. (11) showed an increase of 6.8% in quadriceps MT for the lowest volume condition (3 sets per muscle/week) while growth in the highest volume condition (12 sets per muscle/week) was 13.1%”

36 words

I can rewrite this as:
Ostrowski et al. (11) showed a dose-response for growth in the quadriceps of 6.8%, 5% and 13.1% for 3, 6 and 12 sets/week respectively.

26 words.

Perhaps Brad should hire me to help him with the writing of his papers.  I saved him 10 words and accurately represented the data.

But let’s look at this side by side.  Below I’ve presented Ostrowski’s data for his 3, 6 and 12 set groups and Brad’s for his 9, 27 and 45 set groups.  I can only compare percentages here since the Ostrowski data was presented in mm^2 and there’s no way to back-calculate that to mm to compare it directly.

|                           | Low volume         | Moderate volume     | High volume          |
|---------------------------|--------------------|---------------------|----------------------|
| Schoenfeld RF %age Change | 3.3% (9 sets/week) | 5.1% (27 sets/week) | 12.5% (45 sets/week) |
| Ostrowski RF %age Change  | 6.5% (3 sets/week) | 5% (6 sets/week)    | 13.1% (12 sets/week) |

Ok, so yes, it is true that both studies showed about 13% growth at the highest volumes.  Mind you, Ostrowski got DOUBLE the growth with only 3 sets that Brad needed 9 sets to get, and even the 6 set/week group equalled Brad’s 27 set group.  And while the third groups got roughly the same RF growth, we have to ask why it took 4 times as much volume in Brad’s study to accomplish it.
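One way to make the efficiency argument explicit is to divide each study’s reported %age growth by the weekly sets that produced it; a quick back-of-envelope sketch using the percentages quoted above:

```python
# Crude efficiency metric: %age RF growth per weekly set, using the
# percentages quoted in the side-by-side comparison above.

schoenfeld = {9: 3.3, 27: 5.1, 45: 12.5}   # sets/week -> % RF growth
ostrowski = {3: 6.5, 6: 5.0, 12: 13.1}

def growth_per_set(data: dict) -> dict:
    """Return % growth per weekly set for each volume condition."""
    return {sets: round(pct / sets, 2) for sets, pct in data.items()}

print(growth_per_set(schoenfeld))  # every Schoenfeld condition is under 0.4
print(growth_per_set(ostrowski))   # every Ostrowski condition is 0.83 or higher
```

On this crude metric, every Ostrowski condition produced more growth per set than any Schoenfeld condition, which is the point being made.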

Seriously, in what world do you need 45 sets/week to achieve the same growth that another study got in 12 sets/week?  In discussing this discrepancy, Brad says this:

Interestingly, the group performing the lowest volume for the lower-body performed 9 sets in our study, which approaches the highest volume condition in Ostrowski et al. (11), yet much greater levels of volume were required to achieve similar hypertrophic responses in the quadriceps. The reason for these discrepancies remains unclear.

Basically a shoulder shrug and “dunno”.  Not even a speculation as to why.  Fantastic.

He actually tried to defend this on his website by writing the following:

“Some have asked why we did not discuss the dose-response implications between their study and ours. This was a matter of economy. Comparing and contrasting findings would have required fairly extensive discussion to properly cover nuances of the topic. Moreover, for thoroughness we then would have had to delve into the other dose-response paper by Radaelli et al, further increasing word count. Our discussion section was already quite lengthy, and we felt it was better to err on the side of brevity. However, it’s certainly a fair point and I will aim to address those studies now.”

This is seriously sad.  First off, the journal doesn’t have a word limit so who cares about economy.  Second, it’s good scientific practice to examine the discrepancies between your work and previous research to actually move the field forward by determining what the explanation might be.  Third, with respect to Ostrowski, what nuance?  The study design was nearly identical, the weekly set counts nearly identical, the training status nearly identical.  There’s no nuance; the results are the results and the FACT is that Brad needed 4 times the volume to get them and should try to explain why.  Radaelli is an issue because that paper is a mess, but so what.  He was happy to address it for the STRENGTH data and examine the discrepancy, writing this:

However, for the bench press and lat-pulldown exercises, the 30 weekly set group experienced greater increases than the two other groups. Given that their subjects did not have any RT experience it might be that the greater strength gains in the 30 weekly set group are due to the greater opportunities to practice the exercise and thus an enhanced “learning” effect (22). Also, their intervention lasted 6 months while the present study had a duration of 8 weeks. It might be that higher training volumes become of greater importance for strength gains over longer time courses; future studies exploring this topic using longer duration interventions are needed to confirm this hypothesis.

But by the time he got to the GROWTH data, he didn’t have the energy or time to discuss the differences?  Please.  But I guess when he’s putting out 5 papers per month, Brad is probably too time stretched to do anything properly.

Anyhow, since Brad is too time crunched to do it, I will speculate on the possible reasons for the discrepancy in the growth data.  I think one possibility, though this is impossible to know given that Ostrowski failed to report their rest intervals, is that the absolutely sub-optimal rest interval in Schoenfeld’s study was the issue.  Basically, when you train on 90 seconds and can’t use heavy loads, maybe you need an absolute metric ton of junk volume to get the same growth you could get doing 12 quality sets.  This is MY speculation and nothing more, but at least I provided one.  Perhaps Brad should get me to help him with his discussions in the future since he clearly can’t do it himself.  As noted above, I guess when you have to get your name on 45 papers per year (as of October 2018), you don’t have the time.

Now we come to the triceps data.  First let me reiterate that Brad deliberately misrepresented the data set here, reporting that Ostrowski found that 28 sets gave better results than 7 sets (4.8% vs. 2.3%), which was broadly similar to his results.  This is technically true but inaccurate and misleading since Ostrowski found a plateau at 14 sets, with that data going unreported (arguably why he left out the middle data point for the thigh changes, to establish the pattern).  By leaving out this data point, Brad changed the results of Ostrowski from disagreeing with him to agreeing with him.  This is called lying.

Honestly, that single fact should disallow this finding on every level and this paper should never have passed peer review for that alone (too bad I wasn’t on the peer review since I caught the Ostrowski lie on my first read and I bet Brad wishes he hadn’t sent me the pre-publication paper at all).

Regardless of that, let me look at the side by side data in terms of %age gains.

|                            | Low volume       | Moderate volume   | High volume       |
|----------------------------|------------------|-------------------|-------------------|
| Schoenfeld Tri %age Change | 1.3% (6 sets/wk) | 2.9% (18 sets/wk) | 5.5% (30 sets/wk) |
| Ostrowski Tri %age Change  | 2.3% (7 sets/wk) | 4.7% (14 sets/wk) | 4.8% (28 sets/wk) |

So we see a similar pattern to the legs.  First, Ostrowski got almost double the %age growth with 7 sets/week that Brad got with 6.  Moving up to 14 sets/week, Ostrowski still crushed Brad’s 18 set/week group.  And ignoring the misrepresentation of the data, it took Brad 30 sets/week to achieve only a little more growth than Ostrowski got in 14.  So over double the volume for only a slight improvement.  Just like with the leg data, we have to ask why it seemed to take twice as much volume to generate essentially identical growth in Brad’s study (bonus question: why will nobody directly address Brad’s lie about this data set?).

Now, since Brad misrepresented the data to make it look like it supported him, there was no discussion of the discrepancy.  It makes me wonder why more researchers don’t simply lie about data.  It would save so much time discussing reasons why it disagrees with you.  Just lie about the data and, boom, all research agrees with you. The amount of time it would save typing is enormous.

As with the leg data, I’d speculate again that the brutally suboptimal design of the workouts in Brad’s study is at fault.  Even for the upper body, how many people can do a lot of RM-load sets on a short rest interval?  Not many.  But again, maybe you need twice as much volume when those sets are lower quality junk volume.  Simply put, Ostrowski got the same growth as Brad with half as many sets.  Why do double for essentially the same results (doubling your volume for a small %age increase is asinine)?

There are other issues relating to this paper but I’ll only focus on the one that the Haun paper recently brought to light, which is the issue of ECW and water retention that only appears to become an issue when more than 20 sets/week are done (note: Brad couldn’t have known about this paper since it came out a week before his so I can’t say he ignored it).  As a reminder, for the upper body workout, the set counts were 6, 18 and 30.  Only the last hits the volume level of 20+ sets where ECW might be relevant.  For legs it was 9, 27 and 45, so both the moderate and high volume groups clear it.

As above, do we know that ECW impacts Ultrasound?  No.  But we don’t know that it doesn’t (and there’s endless other work showing that edema is still present at the 48-72 hour time point Brad measured at anyhow, even if James Krieger is desperate to dismiss it).  Haun clearly showed that it skewed the results hugely at high volumes and it needs to be studied and addressed.  But the issue only applies to 3 of the groups in this study (30 set upper body, 27 and 45 set lower body).  And if ECW turns out to skew the Ultrasound measurements, then even the trend towards better growth may very well disappear or be a measurement artifact.

Note I said may, not will and it may turn out not to have an effect (and I’ll accept that).
It has to be studied so we can know for sure.

Rather, based on the detailed analysis of the statistics, it’s clear that the high volume group did in fact NOT do better than the moderate volume group although there was an apparent visible trend based on absolute mm change and %age change.  Not by P value and not by Bayesian factors (which were too small to be relevant because double jack shit is still jack shit) and not by any nonsensical argument by the people involved or those defending it.    Going forwards, I’m going to treat this study as if the moderate volume group did best. I’ll look at the high volume group too but NOTHING about this study supports the conclusions being drawn.  NOTHING. You can agree or not.  As I said above, Brad’s lie alone should dismiss the paper out of hand but I’m keeping it in so that I won’t be accused of bias or simply making data I don’t like disappear (you know, like Brad did with Ostrowski).

Because even if you take the highest volume results at face value, you’ll see next week that they contradict the other 5 papers I will have looked at.  And, as I discussed in Part 1, we base models on the overall data, not one study (as James so helpfully pointed out despite the fact that only HIS group of folks were doing it).  If 1 study is the outlier against the other 5, we ignore the 1 until it’s replicated.  And in this case it must be replicated.


Brad can continue to churn out studies showing that volume is all that matters but I suspect that looking at all of them in detail would turn up an equal amount of shenanigans (a project for another day).  As he’s now shown that he is willing to lie in a discussion to change the conclusion of a contradictory paper, nothing he puts out from here on can be taken at face value: what’s to stop him from lying about data again?  If ONLY he can produce these results, then we have another issue (remember, not only is replication required but it’s better if someone else does it).  If another lab, perhaps one that thinks he’s wrong, replicates it, then we can consider it.  Let’s get Jeremy Loenneke on the job, hahahaha (in-joke, sorry).

You’ll also see in Part 3 that if we take into account another issue that has only recently been brought up (by Brad himself), that even if we take the highest set counts at face value as generating higher growth, it still stops mattering.  You’ll have to wait until next week for that.


Despite their attempts to make volume happen, Brad’s own statistics at best support that moderate volumes of 18 sets for the upper body and 27 for the lower body give better growth than lower volumes, with statistically irrelevant support for the highest volumes of 30 and 45 sets being any better (there was a trend that did not reach significance by any of the methods used).  Even if you accept the highest volume data, it doesn’t change the fact that Brad needed 2X and 4X the volume to get the SAME growth as Ostrowski, while attempting to defend why he did NOT explain the discrepancy.

Since the leg volumes jump from a low of 9 sets to a moderate 27, it’s impossible to know if a value between those levels would have generated different results.  It’s a huge spread of volume and an 18 set group would have been very informative.  But they didn’t do it, the data is the data and we don’t know what might have happened (any speculation I could make is colored by my own bias so I won’t make one).

Also, the lack of differences in strength gains between groups goes against literally every other study on the topic where more volume, up to a point, leads to greater strength gains.  Literally every study.   Brad speculates on the reason but it still contradicts a lot of other data.  But the most parsimonious explanation, given the general relationship of muscle size and strength, would simply be that the gains in muscle didn’t differ.  Occam’s Razor folks, now available from Dollar Shave Company. Same muscular gains equal same strength gains.  QED.

This study just fails the reality check so hard it hurts.  The workouts can’t possibly have been completed using any decent poundages, the lack of strength gains scaling with the supposed size gains contradicts all previous data, etc.  It goes against every other study.  So either they are all wrong or it is wrong and well…..

Couple that with the deliberate misrepresentation of the Ostrowski data and this study has some real issues.  It failed half of the methodological criteria Greg Nuckols himself laid out in MASS even if Eric still said “Good study, broseph” (and Greg tried to somehow dismiss what I wrote last time).  But I’m nice; I won’t dismiss it out of hand like I did with Radaelli, because then people will say I’m biased against Brad, and I’ll include it in the rest of my analysis.  Even IF its results are valid, and I clearly don’t think they are, it won’t end up mattering.

Dose-Response of Weekly Resistance Training Volume and Frequency on Muscular Adaptations in Trained Males

And on the heels of all of that is a brand new paper that came out a week after Brad’s, by Heaselgrave et al. and published in the Int J Sports Physiology and Performance.  In it, 49 resistance trained males (1 year or more of consistent training, so not untrained) were put in either a low, moderate or high volume group that did 9, 18 or 27 weekly sets for biceps for 6 weeks.

This was accomplished with a workout of biceps curls, bent over row and pulldown and the workout structure is a little bit goofy, unfortunately having both a frequency and volume component which adds an unnecessary second independent variable.

The low volume group trained once weekly performing 3 sets of each exercise for a total of 9 sets.  The moderate group performed that workout twice per week for a total of 18 sets.  The high volume group did one workout consisting of 5 sets of curls and row and 4 sets of pulldowns (14 sets) and a second workout of 4 sets of curls and row and 5 sets of pulldowns (13 sets) for a total of 27 sets.  I’ve replicated it to the best of my ability below with the + in the high group referring to the second workout.

| Exercise      | Low (9 sets/wk) | Moderate (18 sets/wk) | High (27 sets/wk)               |
|---------------|-----------------|-----------------------|---------------------------------|
| Pulldown      | 3 sets 1X/week  | 3 sets 2X/week        | 4 sets 1X/week + 5 sets 1X/week |
| Bent over row | 3 sets 1X/week  | 3 sets 2X/week        | 5 sets 1X/week + 4 sets 1X/week |
| Biceps curl   | 3 sets 1X/week  | 3 sets 2X/week        | 5 sets 1X/week + 4 sets 1X/week |
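The weekly totals implied by that structure can be sanity checked in a few lines (the per-exercise set counts come straight from the description above):

```python
# Sanity check on the weekly set totals implied by the workout table:
# 3 exercises, with the set/frequency schemes described above.
low = 3 * 3                   # 3 sets x 3 exercises, performed once weekly
moderate = 3 * 3 * 2          # same workout performed twice weekly
workout_1 = 5 + 5 + 4         # high group: curls, rows, pulldowns
workout_2 = 4 + 4 + 5         # high group: second weekly workout
high = workout_1 + workout_2
print(low, moderate, high)
```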

They did sets of 10-12 with a goal RIR of 2 reps (so close to failure), starting at 75% of 1RM with a 3 minute rest between sets.  The workouts were supervised, with a lifting tempo of one second up and three seconds down.  Basically they lifted heavy weights, near limits, and used a long enough rest interval to keep the loads heavy, as they state “…to maximize MPS [muscle protein synthesis] and strength gains.”

The subjects were allowed to work out outside of the study which is a HUGE methodological problem but they provided workout logs in order to show that they were not doing extra biceps work.  This unfortunately makes it easy to hand-wave these results away which is more or less what Brad tried to do online.  How do we KNOW that they didn’t do more arm work?

Well we don’t so we kind of have to trust them and their self-reporting.  And since Brad said he can do studies unblinded because you can trust him, I think it’s only fair to apply the same standard here unless he wants to call the study subjects liars.  I for one would hope Brad would not impugn someone’s integrity in such a fashion unless he is the only human in the world who can be trusted.  Shouldn’t we give the subjects of this study the benefit of the doubt?

Of course, I could claim for any study I didn’t like that the subjects did extra training outside of the study (call it The Colorado Experiment effect).  In Brad’s study or in Haun’s study, nobody can say for sure whether the subjects did more training outside of the study itself.  Unless you lock them up in a metabolic ward, you can never know for sure.

Is this design ideal?  No.  It simply is what it is and I’m describing this limitation up front because that’s the honest thing to do and I think the results are still worth examining.  And if I’m keeping Brad’s methodological shit show of a study (remember, it failed about HALF of Nuckols’ methodology list) and Eric says it’s ok if one study is unsound because others are, I’m keeping this one too.

They opened the door, I get to use it.

Looking at muscular thickness, the average changes in size were 0.1 cm (1mm) for the low volume group, 0.3 cm (3mm) for the moderate volume group and 0.2 cm (2mm) for the high volume group.

Heaselgrave Biceps Change

Now all three groups were significantly different from their initial values.  However, as with Ostrowski, there was no difference between groups.  That is, statistically the researchers conclude that all groups grew the same.  At the same time, as with Ostrowski, there is a visible trend (admittedly with LARGE variance) from low (1 mm) up to moderate (3 mm) and back down at high (2 mm).  Unfortunately the researchers did not provide the starting or ending values, only the change, so I can’t calculate the percentage changes here.  And I am hesitant to eyeball it as that would reflect my own bias.

At the same time, statistically significant or not, the values are similar in absolute terms to the Ostrowski triceps data (1 mm, 3 mm, 2 mm here vs. 1 mm, 2 mm, 2 mm in Ostrowski).  If Brad Schoenfeld et al. get to use that magnitude of non-significant changes (albeit misrepresented), I feel I have the same right.  They opened the door, I get to walk through it.  An alternate conclusion, which would be consistent with other work, is that all groups did in fact grow the same and that 9 sets was just as effective as 18 or 27.  You can take your pick on which you think is correct.

What kind of stands out is the individual variance (and this is always an issue with these studies, hi James).  Clearly some subjects in the low volume group grew better than others in the moderate volume group (which has both the highest AND lowest responder).

The high volume group is certainly more tightly clustered.  But if other papers are going to look at the average response, I’m using this one’s average response for consistency.  And on average, moderate volumes beat out low, and high volume was either equal to or even a bit less than moderate.



While the researchers reported no difference in growth from low to moderate to higher volumes there is a trend towards better growth from 9 to 18 sets with no further growth at 27 sets.  Again we see a cap/threshold at moderate volumes.

And with that final study out of the way, I’ll cut it here.  Next week in the third and final part, I’ll look at the studies in the aggregate to see if any general patterns show up along with re-addressing the set counting issue.  Don’t worry, this is almost over.


Training Volume and Muscle Growth: Part 1

So for the last few weeks, I've been addressing different issues regarding Brad Schoenfeld's recent paper suggesting that an incredibly high training volume, far more than has ever been suggested or used by any sane human, gives the most growth.  I won't re-examine the issues I have with it but you can read my first diss track and my second diss track if you're not yet caught up.

Rather, as discussed two weeks ago, I want to now look at the other papers examining the issue of training volume and muscle growth.  As it turns out there are currently 7 of relevance, including Brad’s, of which 2 came out within roughly a week of his.  Since I have a lot to cover, this will take 3 articles to address everything I want to say.  First, a bit of a tangent and this will be a long piece.

Building Scientific Models

While imperfect, mainly due to the fact that scientists are only human, the scientific method is currently the best approach to answer questions about our universe.  This is because rather than being predicated on faith, intuition or being told by someone in charge, it is predicated on testing a hypothesis.  A scientist asks a question, designs an experiment and sees if the data does or does not support it.  If it does, that means little until verification occurs.  One study simply doesn't mean squat in isolation.  Maybe it was a crap study, sometimes researchers fake data (see Andrew Wakefield and vaccines/autism).

Ideally for science to occur you get verification, with more than one study showing the same finding.  If verification comes independently, from a different lab, that’s even better.  It’s always a little bit curious when one lab seems to always find the same results time after time but another lab always seems to find the opposite result time after time.

And note how those results almost always reflect the bias of the person or people running the lab.  But when two different labs get similar or the same results studying the same thing, that's good.  Presumably they don't share the same biases and this means that the finding is more solid.   It's even betterer if that lab thinks you're an idiot and wants to prove you wrong but can't.  Boom, now the finding is that much stronger.  Because if that lab actually wants to prove you wrong and can't, well…

One of my favorite stories from ages ago.  The issue of muscle fiber hyperplasia had been long debated for years.  One lab said it happened, another said it didn’t.  In a fit of intellectual honesty, the second lab sent one of their people to the first lab to do some of the data gathering.   Basically they sent someone who thought the first lab was full of shit to help them do the science.  And as it turned out the first lab was right.  That’s how you do good science and I wonder when this concept got lost.

But sometimes a similar study finds divergent results.  Now, this might be methodological, perhaps the scientists don’t know how to properly blind, describe randomization or register studies for example.  Sometimes it’s an issue of population or specifics.  A training study in young men probably should get different results than in older men.  Or older women.  Or women on birth control or with PCOS.  Which is why you have to study all of them rather than assuming they are the same and why science often only inches forwards.  So over time you accumulate all of these freaking studies, some of which agree, some of which show different findings, etc.

Now you build a model.  By that I mean you now have to take all of the available data and come up with a single overarching model that can explain all of it.  Or at least the studies that are methodologically sound (the problem being here that researchers tend to define studies they don’t agree with as being methodologically unsound.  Well, unless it’s their study, then it’s ok  for it to be a pile of shit.)  But presuming that the data is all good, the model has to include it.  And sometimes when you build the model, the divergent findings make sense.

A non-training example.  For about 40 years there has been debate over whether metabolic adaptation occurs during dieting.  By this I mean the extra adaptive bit that causes a drop in metabolic rate outside of the weight loss related bit.  Understanding that most of the work was in overweight individuals, about half said it happened and half didn’t.   I have endless review papers in this folder on my computer and it’s fun reading the ones who say it does and doesn’t happen, especially when they carefully pick different studies in support of their belief.

But when you put the studies up against one another, collect the ones that said yes and the ones that said no, and see what their commonalities are, invariably what happened was the ones that said no were in extremely obese individuals, for whom adaptive thermogenesis is often trivial.  It's only when folks get past a certain level of leanness that it kicks in to enough of a degree to matter.  I even remember one study that indirectly demonstrated this, observing no major adaptation in the first half of the study (when the subjects were fatter) and an increase once they got beyond a certain level of leanness.   Mechanistically this can be explained by leptin transport saturating above a certain blood level, thought to be 20-25 ng/ml or so.

So if someone is super obese and has a leptin of 40, the body doesn’t really sense it until they get leaner and leptin drops below that level.  If someone starts with 20, they get hit immediately.  Aha, the model fits the data and can explain all of it and there’s even a plausible biological reason for the difference.  Science crawls forward another inch.

But that’s how you do it in science, you take the study results on a given topic, group them roughly based on who they are studying (i.e. comparing studies in complete beginners and advanced trainees is kind of pointless) and see what falls out of the aggregate data.  One way to do this is with meta-analysis.  Take all the studies, statistically massage the numbers and come up with a way to compare them using effect sizes or something. Or you can do what I’m about to do and just look at each one in detail and then compare the results at the end.

Building a Model of Training Volume and Hypertrophy

Because what I'm going to do over the course of 3 articles is examine the aggregate data on training volume and muscle growth. But first some qualifiers.  I won't be looking at the tons of low volume studies out there.   There's really no debate that there is a dose-response relationship with more volume and better gains up to about 10 sets per week, with 10+ having been established as providing the optimal response.    Some of the studies I will examine have groups that are below the 10 set cutoff but they usually end up just supporting what is already known so meh.

It is beyond this 10+ sets per week point that the questions arise as, up until fairly recently, there just wasn’t enough data to draw any useful conclusions.   But as of the writing of this in October of 2018, there are 7 studies (3 of which came out in the last few weeks amazingly enough) and this provides enough data to draw at least tentative conclusions.  Yes, future data may add to this and, as above, the model should always be updated with that new data.

At the same time, if you find that the majority of studies are in overall agreement, it tends to be rare that new research will turn the model on its head or change it completely.  At best it will refine it somewhat (perhaps adding data for a different group, so maybe older trainees need more or less volume or whatever) or let you draw betterer conclusions.  I'm not saying it can't happen, simply that once you have enough studies saying the same general thing, you don't expect to see hugely divergent findings show up unless it's in a completely different population or there is some huge change to the methodology being used.  So if all of a sudden a new and more accurate way of measuring muscle growth showed up it might very well turn the model on its head since old results would stop being relevant or might be proven totally false (i.e. what if Ultrasound weren't even measuring actual muscle growth…).

This has actually happened on the issue of muscle fiber conversion.  From about the 70's, for like 30 years, research had said it doesn't meaningfully happen but new methods and technologies of measurement have turned that idea on its head.  With a better ability to identify muscle fiber types (specifically, Myosin Heavy Chain or MHC subtypes), hybrid fibers and transitions between them, the old conclusions basically got thrown out and it's now understood that much more fiber conversion than previously thought can occur.  So it can happen where a well-established model gets demolished by a change in technology or measurement techniques.  And when it does, the model has to be changed.  And until that occurs, all we can do is look at the current data with the current methodologies to build the best current model that we can.

I’m also not interested in beginners.  There are a zillion studies on this already and, often, you see no meaningful difference in the responses to training almost irrespective of volume or anything else.  One set or three sets, 60% of max or 90% of max, twice a week or three times per week, it’s all about the same.  A lot of the gains early on are neural, learning to lift the weight basically, and it all just cancels out.   Mostly what these are looking at is whether tripling training time or increasing training days from 2 to 3 is worth it for the average individual.  And it’s usually not as any slight gains are far out of proportion to the increased training time commitment.

Yes, fine, sometimes there are small differences in results, more volume gives more practice and improves neural effects or whatever and you see a little bit more strength gains for 3 vs. 1 set or three days per week versus two or whatever.  But it mostly equals out and isn’t relevant to the question at hand.   The volumes are low to begin with, beginners all respond more or less the same and that’s not the question I want to address here.

The Question

Rather, the question is whether or not intermediate or advanced trainees need higher volumes for optimal growth.  Even the original Wernbom review, which I've often used myself but which a lot of people like to dismiss of late, had limited data on intermediate or advanced subjects and it was unclear if those numbers (~40-70 reps twice a week per muscle group) applied to intermediate or advanced trainees.  But that was then and this is now and we have at least a decent research base to examine that might shed more light on the topic.

So the focus of the papers I will examine, 7 of them, will be on studies that examined volumes above 10+ sets (often compared to lower volumes) and that were done in individuals who were not rank beginners.  There is one exception, a study I will include that used military recruits new to strength training but you’ll see that it doesn’t really matter in the big picture. Some of them compare two different volumes, some three different volumes, one examined escalating volumes over a number of weeks. But they are all fundamentally examining the issue of training volume and hypertrophy response and can be used to build a model of this to see what falls out.


So, as it turns out, this week Eric Helms' analysis of Brad's paper and review of studies on training volume came out in his MASS newsletter as well.  And in it 11 studies were cited.  And I want to explain why I'm only including 7 of those 11 because I feel that complete transparency is always the best approach.  This is unlike the gurus who wait until they are backed into a corner over something to come up with a new argument.

Two in his list (Paulsen, Ronnestad) were on beginners and I don't care.  One (Baker) was in recreationally trained men but trained only pecs, delts and biceps with three exercises three times per week and either 1 or 3 sets of each exercise (so 9 vs. 27 sets/week for the three muscles trained although Eric reports it as 9-12 vs. 27-36, probably due to differences in set counting).  It used skinfolds for body composition, only reported the change in skinfolds without body composition data per se, did no direct measure of muscle growth and, in any case, both groups got the same strength gains (bizarrely the 1 set group had a bigger drop in skinfolds than the 3 set). I can't be bothered to address it in any more detail than that so I'm ignoring it.  It would not materially alter anything in my analysis, especially given that it found that higher volumes provided no more gains than lower.  The fourth (Gonzalez-Badillo) was a study in Junior Olympic Weightlifters and looked only at strength gains in the competition movements.  Humorously, it found that moderate volumes were superior to high or low but I'm still not including it.

But those 4 studies do not fit the criteria I laid out above to begin with regarding volume (10+ sets), training status (trained) or end point (hypertrophy).  I just don’t want anybody who has read both thinking that I deliberately left out relevant studies.  But that’s why I was specific in my criteria above ahead of time.  If anything, those studies would all support my overall conclusions but since they don’t meet MY criteria, I’m not going to include them after the fact because that is intellectually dishonest.

Ironically, in the issue itself there is a piece by Greg Nuckols on how to trust studies.  It talks about how study pre-registration is important to avoid P-hacking (throwing different statistical methods at the paper until something sticks), how not blinding a study with subjective measures is a problem, and a bunch of other issues.  You know, the exact same criticisms I brought up with Brad's study that went unaddressed or were simply excused because "I don't know science".  Well, James, I guess Greg must not either because his article indirectly says that your study is methodological shit, too.  It contained 17 total points and, depending on how you count, Brad's paper failed 7-9 of them.

Despite that, Eric then concludes that Brad's study, despite literally breaking about half of the points that Greg said are important, was methodologically sound and based this on "I do research".  Basically just an appeal to authority and the "If you don't do science you don't get an opinion" like James used.  Eric also uses the "Since other studies have this flaw, it's ok that Brad's does" argument.  It's really sad to watch.

Eric also says that the authors of the paper addressed the online criticisms which is an outright lie.  Brad never addressed a single question, Eric deflected with the 9 year old blog bullshit (he wasn’t an author) and James attempted feebly to defend it (at least he tried).   NOBODY has even attempted to address Brad’s LIE in the discussion.  NOT A SINGLE PERSON.

Eric also repeats the Bayesian factors, pointing out (honestly) that the BF of 1-3 is weak (it's NOT WORTH MENTIONING).  And yet he concludes that the study still showed a trend for the highest volume to be better.  It's odd, almost like he can't make up his mind whether he's evidence based (in which case the Bayesian values are meaningless and don't support the conclusion) or just trying to stay on Brad's good side by trying to defend the paper with weak sauce apologism.

Perhaps the most desperate argument was data that James ONLY put up on his BLOG that was NOT in the paper (so it doesn’t count since nobody else had access to it until it was necessary) that found an almost significantly greater volume load for leg extensions in the high volume group.  OMG, this changes everything.  No, wait, it doesn’t.  I mean, does anyone give a flying fuck that the leg extension load volume (weight * sets * reps) was ALMOST higher in the highest volume group?  This is a study primarily about fucking MUSCLE GROWTH and this is the best Eric or James can do?  Fucking leg extension training volume?  Jesus.

Oh oh wait, one more: the letter from the editors states that Eric provided an UNBIASED look at Brad's study to address online chatter.


Unbiased my hairy ass.   Nobody is unbiased, including me and true objectivity cannot exist.   I’m simply the only one honest enough to admit it.  I have my biases, I just try to put them up front so you know where I’m coming from.  Everybody else ignores their biases and pretends they aren’t biased.  But I guess Eric pulled out his Fair Witness cloak (get THAT reference, nerds) to write his piece.  It’s the only explanation.

The same intro mentioned that the paper was published in MSSE, one of the top journals in the field.  An appeal to authority and irrelevant. The Lancet, a top tier medical journal for over a century published Wakefield’s autism study so clearly even good journals can publish utter shit. This is just pitiful.

At least Eric made the rational conclusion regarding weekly sets (spoiler: NOWHERE CLOSE TO BRAD’S NUMBERS) but it was wrapped in nonsense and vagaries and apologism to sort of explain away why Brad’s numbers were so far out of the reality we live in without just saying it was a piece of shit paper with shitty methodology and shittier statistics that in no way supported Brad’s shitty claims or conclusions.  But when seminar money is on the line I guess you pretend you’re unbiased and evidence based and hope nobody notices how full of it you are.

Unfortunately for him, I noticed.  Stay tuned for my lesson on integrity for him.

I could literally make this another diss piece about Eric’s apologism for Brad’s study and just tear his MASS article to shreds but I already dropped the mic on that topic.  So….

Back to the Point

I will be looking at the 7 relevant studies on the topic in some detail and will do so in their chronological order of publication.  I will not really be comparing them as I go.  Rather, that final analysis, looking at the body of literature in toto (No, not this Toto or that Toto) to see if any patterns arise will have to wait until Part 3 when I’ll also look at them in a slightly different way related to the next topic.

I’d note that there is still little to no data on truly advanced trainees at this point, perhaps 4+ years of consistent training.  Or not enough to draw any real conclusions.  We can quibble about what defines a true intermediate or advanced.  Most of the studies had training ranges of 1-4 years or a minimum of 1 year of weight training with the one exception.  I’d say at 3-4 years someone is advanced intermediate approaching advanced.    Most of the strength levels were distinctly advanced beginner or approaching intermediate.  So this still isn’t in highly trained individuals.  But it’s the data we have.

And the specific question I will be examining is what the relationship between weekly training volumes (set count) and muscular hypertrophy is.  As noted above, the original meta-analysis only concluded that 10+ sets was optimal but the data didn't allow for conclusions beyond that at the time.  So we ask:

  • Does more volume keep generating more growth indefinitely?
  • Is there some upper limit where the growth response stops or even reverses?
  • Can we try to define some optimal volume where growth is maximized?
  • Does that optimal volume possibly differ for different muscle groups (or upper vs. lower)?

The only way to determine this is to look at ALL the studies in some detail to try to build a model which is why this will take a solid 3 articles.

A Note on Volume Nomenclature

Since it will come up in Part 3, I want to talk about how volume is being counted or considered.   In many studies, only compound exercises are used although triceps, biceps or quadriceps size (either by vastus lateralis or rectus femoris) are all that is being measured.   Others use a mix of compound and isolation.  This raises the question of how to consider the set count. Does one set of cable rows or bench press count as one set of direct work for biceps or triceps respectively?  Should it be counted the same as or differently than a set of biceps curls or triceps pushdowns?

As I look initially at the data, I will count it as if it does, primarily because this is how Brad Schoenfeld and his group have done it in their studies and meta-analyses and it makes the most sense to use a consistent (and their) method throughout at least initially.  So any exercise done involving any muscle is counted as 1 set for that muscle and any other muscle that it works.   A set of bench press is thus counted as one set for chest, shoulders and triceps.  A set of rows or pulldowns is one set for back and biceps and a set of squats or leg press is one set for quads.  A set of triceps extensions is only triceps, a set of curls is only biceps, and leg extensions are only for quads.  I’ll say right now that I do not agree with this approach but I will go along with this nomenclature to maintain consistency with their original analyses and their own recent study until I re-examine it at the end a little bit differently.
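The counting convention described above is mechanical enough that it can be sketched in a few lines of code.  The exercise-to-muscle mapping below is just my shorthand for the examples given in the text, not anything from the studies themselves.

```python
# Sketch of the set-counting convention: every set of an exercise counts
# as one set for each muscle it involves, compound or isolation alike.
EXERCISE_MUSCLES = {
    "bench press":       ["chest", "shoulders", "triceps"],
    "row":               ["back", "biceps"],
    "pulldown":          ["back", "biceps"],
    "squat":             ["quads"],
    "leg press":         ["quads"],
    "triceps extension": ["triceps"],
    "biceps curl":       ["biceps"],
    "leg extension":     ["quads"],
}

def weekly_sets(workout):
    """workout: list of (exercise, weekly sets) tuples -> sets per muscle."""
    totals = {}
    for exercise, sets in workout:
        for muscle in EXERCISE_MUSCLES[exercise]:
            totals[muscle] = totals.get(muscle, 0) + sets
    return totals

# e.g. 3 weekly sets of bench plus 2 of triceps extensions counts as
# 5 weekly sets for triceps (and 3 each for chest and shoulders)
totals = weekly_sets([("bench press", 3), ("triceps extension", 2)])
```

This is also where the disagreements over set counting come from: change the mapping (say, count a pressing set as only half a set for triceps) and every total in every study shifts with it.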

And with that out of the way, let’s start with the paper that kind of kicked off this whole mess to some degree.  I have linked the title of each study I will be examining to the Pubmed reference so that if anybody wants to check and see if I’m honestly reporting the data or not, they can.  I am quite sure I have flubbed a number or two in this series since I write and type quickly but you can spare me the claim that it was deliberate.  Usually it’s an obvious typo and I’m too lazy to check it that hard.

The Effect of Weight Training Volume on Hormonal Output and Muscular Size and Function

The first paper I want to look at is titled The Effect of Weight Training Volume on Hormonal Output and Muscular Size and Function by Karl J Ostrowski et al., which was published in the Journal of Strength and Conditioning Research in 1997.   I feel like I've talked about this a zillion times but will examine it again here.

It recruited 35 men who had been weight training from 1-4 years (hence not untrained) with a minimum of 1 year of regular weight training experience.  They needed to be able to squat and bench at least 130% and 100% of bodyweight (we might quibble how trained they were but they were not beginners in any case).   More specifically, the average squat 1RM was 133 kg (294 lbs) at a bodyweight of 77kg or 1.7xBW.  Bench average was 87.5 kg (192.5 lbs) for a 1.13 x BW.   This puts them at a solid intermediate squat and somewhere between a beginner and intermediate bench based on the standards you typically find online.

Ostrowski Workout Program

Each followed the workout that appears to the left (click to enlarge) and performed 1, 2 or 4 sets per exercise.  The study lasted 10 weeks and size changes in the triceps and rectus femoris (representing quad growth) were measured via Ultrasound.

8 subjects dropped out for reasons unrelated to the training, leaving only 27 subjects but each group still had 9 subjects so it was balanced (although underpowered in a statistical sense).  There were no significant initial differences between groups in anthropometric variables, training history, performance, or hormonal variables prior to training (and they were listed individually for each group which is fairly standard, or should be).   So the groups were also balanced in this regard.

So let’s add up the volumes.  Legs are easy, there were three quad and three hamstring movements once a week for 1,2 or 4 sets apiece and this yields a lower body volume of 3, 6 or 12 sets per week.  We might question if this is truly high volume but it is looking at increasing volumes in a dose specific way and that’s what they did so that’s the data we have.

For upper body it’s a bit more complicated due to the presence of a push, pull and arm day.  Using the counting convention I described above (where both compound and isolation movements count as one set for all muscles involved), and focusing on triceps, we count the pressing day as having 4, 8 and 16 sets for triceps (4 exercises * 1, 2 or 4 sets).  The additional 3 triceps movements on Day 4 add another 3, 6 and 12 sets.  Totaling that up we reach a triceps volume of 7, 14 and 28 sets per week for the different groups.  I hope that makes sense.  The same would hold for biceps but it wasn’t measured so it doesn’t really matter.

First I've presented the results for the rectus femoris indicating the starting and ending size, change and percentage increase (calculated by dividing the change by the starting value and multiplying by 100).  I've indicated changes based on total weekly sets.  This value is in mm^2 (millimeters squared, explaining why they are so much higher than the triceps values below)

              3 sets     6 sets     12 sets
Quads (pre)   930 mm^2   940 mm^2   860 mm^2
Quads (post)  993 mm^2   987 mm^2   973 mm^2
Change        +63        +47        +113
%age Change   6.7%       5%         13.1%
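The percentage column is nothing more than the change divided by the starting value, times 100, which is easy to verify:

```python
# Percentage change = change / starting value * 100, per group.
# Values taken from the quad table above (mm^2).
quads_pre    = {"3 sets": 930, "6 sets": 940, "12 sets": 860}
quads_change = {"3 sets":  63, "6 sets":  47, "12 sets": 113}

pct_change = {group: round(quads_change[group] / quads_pre[group] * 100, 1)
              for group in quads_pre}
# Gives roughly 6.8%, 5.0% and 13.1%; the article's 6.7% for the low
# volume group just truncates 6.77 instead of rounding it up.
```

Nothing fancy, but it is worth doing the arithmetic yourself whenever a paper only reports one of the three numbers (start, change, percentage), since that's where transcription errors hide.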

As I’ve discussed previously, this data is a little bit weird with the moderate volume group growing less than the low volume group although it’s probably just a statistical blip of zero relevance and it’s probably that in this group low volume and moderate volume were about the same in terms of the growth response.  Percentage wise, the higher volume group got about double the growth.

However the researchers found none of this was statistically significantly different between groups.  Not the starting values and not the changes, regardless of volume.  Still it's hard not to see a trend between the 3 and 6 set groups and the 12 set group, where 3 and 6 are the same and 12 sets got about double the growth.  Given what we know about the dose response to training volume (gains keep going up to 10+ sets), this makes sense.  In somewhat trained lifters, you might expect a higher volume to be superior or required for optimal results.    The lack of statistical significance is likely just a statistical blip related to the study being underpowered statistically.  But let's all agree that a trend is there, one that would pass a reality check with the real world of the gym.

Make no mistake, 12 sets is only high volume by comparison and it's a shame they didn't have an even higher volume group as that would have been informative.  Perhaps 15 or 20 sets would have achieved an even higher percentage growth that reached statistical significance. But that's mere speculation and without the data we can't conclude anything.  But for the legs there is something of a dose response relationship: both lower volumes grew the same, the higher volume grew better.

Moving to triceps, I’ve presented the same data.  These are in mm, explaining the vast difference in absolute magnitude of values from the leg data above (mm vs. mm^2).

                7 sets   14 sets   28 sets
Triceps (pre)   44 mm    43 mm     42 mm
Triceps (post)  45 mm    45 mm     44 mm
Change          +1 mm    +2 mm     +2 mm
%age Change     2.3%     4.7%      4.8%

As with quadriceps, the researchers concluded that these changes were statistically non-significant although there is an apparent trend towards better growth from the lowest to moderate volume.  Given the set count this also passes the reality check and fits previous data.  Growth improves up to 10+ sets and 14 is higher than 10 and 7 is lower than 10 so that fits.

Importantly, there was NO REAL FURTHER INCREASE ABOVE THAT.  Basically, unless you consider a doubling of training volume and time to be worth an additional 0.1% growth, we can conclude that this study found a plateau in growth with the moderate weekly set count.   Basically 14 sets outperformed 7 but 28 sets was not better.  So here we have a first indication that there might be an upper limit above which no further growth is seen.


Over 10 weeks, Ostrowski et al. found that trained men showed a trend (non-statistically significant) for better growth at the highest volume (12 sets) for quads and a plateau at a moderate volume (14 sets/week) for triceps with no further increase at 28 sets/week.  Moving on.  Of some interest, the highest quad set count and middle triceps set count are in similar ranges of 12 and 14 sets per week.  Since this study didn't examine even higher quad volumes, we don't know if there would have been further growth above that point.

Dose-response of 1, 3, and 5 sets of resistance exercise on strength, local muscular endurance, and hypertrophy.

The next study is by Radaelli et al., published in 2015 in the Journal of Strength and Conditioning Research. It recruited 48 men from the Brazilian Navy School of Lieutenants.   The subjects were untrained in the sense of having zero weight training experience but were experienced with "traditional military training involving body weight exercises, such as push-ups, pull-ups, and abdominal exercises". This is reflected in their baseline strength values; the 5 set group had a 5RM bench of 89.6 kg at baseline which might translate to roughly a 105 kg max (assuming a 5RM is 85% of 1RM).    The average bodyweight was 79.3 kg and this equates to a 1.3 x BW bench in this group.  I'd note that the 1 set group was much weaker with only a 64.5 kg 5RM, equating to roughly a 76 kg 1RM or less than bodyweight.  The 3 set group was in the middle with a 73.4 kg 5RM and the control group was in that realm with a 68.3 kg bench.
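The 1RM figures in that paragraph come from the common rule of thumb that a 5RM is roughly 85% of a 1RM; that percentage is my working assumption (and varies between lifters), not something the study reports.  The back-of-envelope math looks like this:

```python
# Rough 1RM estimates from 5RM loads, assuming a 5RM is ~85% of 1RM.
# The 0.85 figure is a rule of thumb, not a value from the study.
def est_1rm_from_5rm(weight_5rm, pct_of_max=0.85):
    return weight_5rm / pct_of_max

bench_1rm = est_1rm_from_5rm(89.6)   # ~105 kg for the 5-set group
ratio     = bench_1rm / 79.3         # ~1.33 x the 79.3 kg average bodyweight
one_set   = est_1rm_from_5rm(64.5)   # ~76 kg for the much weaker 1-set group
```

Different rep-max formulas (Epley, Brzycki, etc.) would shift these numbers by a few kilos either way, which is fine since the point is only that the groups started at noticeably different strength levels.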

The study lasted 6 months which would have taken them through the rank beginning weight training stage.   So while this is still a study on “beginners” the results should be illustrative.  You’ll see at the end that it doesn’t matter much.

The men were divided into four groups.  The first did bodyweight calisthenics 3X/week for an hour while the others performed a weight training workout consisting of bench press, leg press, lat pulldown, leg extension, shoulder press, leg curl, biceps curl, crunch and triceps extension three times per week.  Groups did either 1, 3 or 5 sets per exercise in the 8-12 RM range.   Growth was measured only in the arm flexors (biceps + brachialis) and triceps.  But here the volumes differed.

Again using the counting convention described above, there are three exercises that work the triceps: bench press, shoulder press, triceps extension.   So that's 3, 9 and 15 sets done three times weekly for 9, 27 and 45 sets per week for triceps.  For biceps there are only pulldown and biceps curl so that's 2, 6 and 10 sets per workout for 6, 18 and 30 sets per week.   Quads would have been the same as for pulldowns, with only 2 exercises for 6, 18 and 30 sets per week.

First let me look at the overall body composition results since I think they are vaguely indicative of just what this study did not show.

Radaelli Body Comp Changes

I’m gonna focus on Fat Free Mass (FFM) and let me note that this is not a particularly good proxy for actual muscle growth since it can be impacted by too many things (i.e. water, glycogen, poo).  It’s still worth examining.  Because what we see is the following.

Group     Pre        Post       Change
Control   61.95 kg   64.86 kg   +3 kg
1-set     67.24 kg   67.70 kg   +0.5 kg
3-set     63.01 kg   65.99 kg   +3 kg
5-set     71.4 kg    74.7 kg    +3.3 kg

So the calisthenics group gained 3 kg, identical to the 3-set group, while the 1-set weight training group gained almost nothing.  And the 5-set group only gained a little bit more than the control/calisthenics or 3-set groups, although everybody stomped the 1-set group.

Ok, does this pass the reality check to anybody?  Does anybody believe that an hour of calisthenics three times per week beats out even low volume resistance training in beginners training for FFM gains?   Yeah, me neither.

Irrespective of that, it certainly appears that going from low to moderate volumes improves results (0.5 kg to 3kg) but going from moderate to high doesn’t do much more (3.3 kg).  Basically, there is a seeming cap for volume and FFM gains, inasmuch as FFM gains per se are a good proxy for muscle growth.  Which they really aren’t.  But it already sort of illustrates how goofy the results are at the outset.  And it gets worse.

Ok, moving on to the actual growth data. Unfortunately, the researchers chose to present the results in a tiny-ass little graphic that is really hard to see, and the numerical data was not provided so I cannot present it as I did in the study above. I could try to estimate the numbers from the graphs but then we have the issue of MY bias and it hurts my eyes anyhow. So I'm just focusing on the changes and whether or not they were statistically significant. This is shown in the graphic by the symbols and shit over the bars. Click it to make it bigger and read the legend to figure out what was different from what.

Radaelli Size Changes

Ok, so biceps is on the left and recall that that group was doing 6, 18 and 30 sets per week.  Here were the results:

Group: Control / 6 sets / 18 sets / 30 sets
Change: None / None / Some (more than 6 sets or control) / Most (more than 18 sets, 6 sets and control)

I guess this is kind of a dose-response relationship, although not really since it's actually none, none, some, more. For a true dose-response relationship, you'd expect zero for control, some for the 1-set group, more for the 3-set group and even more for the 5-set group. Ignoring that, let me ask the following question:

Does anybody believe that, in beginners (well, newbies to weight training), literally ZERO biceps size was gained with 6 sets per week over 6 months of training? Because this runs counter to essentially every study ever done on beginning trainees, where those volumes reliably generate at least some growth. And yet this study found zero (or nothing that was statistically significant). Yeah, I don't believe it either.

Now let's look at triceps, where we see an even weirder pattern.

Group: Control / 9 sets / 27 sets / 45 sets
Change: Zero / Zero / Zero / Some (different than all other groups)

First and foremost, this isn't a dose-response result, which, as described above, means increasing results with increasing volumes. Second and secondmost, does anybody believe this? Does anybody believe that, in total beginners, 9 or even 27 sets per week generated ZERO growth in the triceps and it took 45 sets/week to see anything happen? Can this be rationalized against at least some biceps growth in the 18-set per week group (do triceps need 2.5 times as much volume to grow)? Because, again, this runs counter to every study ever done. And none, none, none, tons is not a dose-response. The results seem random and don't make an iota of sense.
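To be precise about the terminology, a true dose-response means the outcome rises as the dose rises. A tiny sketch (my own formalization, not from the paper) of that check:

```python
# A strict dose-response means each higher dose produces a larger outcome.
def is_strict_dose_response(outcomes):
    """True if outcomes strictly increase with dose (control first, highest dose last)."""
    return all(later > earlier for earlier, later in zip(outcomes, outcomes[1:]))

# Using 0 = no growth, with ordinal stand-ins for "some"/"more" growth:
print(is_strict_dose_response([0, 1, 2, 3]))  # True: what a real dose-response looks like
print(is_strict_dose_response([0, 0, 1, 2]))  # False: the biceps pattern (none, none, some, more)
print(is_strict_dose_response([0, 0, 0, 1]))  # False: the triceps pattern (none, none, none, some)
```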

I'd note that at least a plausible explanation is that the prior military training the subjects had done altered their response to training. It's still difficult to reconcile the different responses between the groups and for the different muscles. It's also important to note that body fat percentage went down in all the training groups: by 4% in the 1-set, 6% in the 3-set and 5.3% in the 5-set group (the control group saw no change in body fat), so the subjects were in a deficit. This could definitely have altered the results, although it still does not explain the overall pattern of growth in individuals new to weight training. But there was also that pesky LBM gain. So something weird is going on. It also still fails the reality check completely. Nothing in this study's results makes an ounce of sense.

In this vein, I want to refer back to a rather cool older paper titled "Time Course for Arm and Chest Muscle Thickness Changes Following Bench Press Training" by Ogasawara et al., published in 2012 in the journal Interventional Medicine and Applied Science.

In it, 7 healthy untrained men performed nothing but the bench press for 3 sets, three times per week, for 24 weeks, the same 6 months as the Radaelli study. Biceps, triceps AND chest muscle thickness were measured (making me question why chest isn't measured more often, since it's clearly technically possible and would give way more information than doing a ton of chest and shoulder work and only measuring triceps). I've presented the results below.

Ogasawara Muscle Thickness Changes

So this is super interesting. No shock, there was essentially no growth in the biceps. Good control here; you wouldn't expect the bench press to have much impact on biceps growth. Triceps did not show significant growth until week 5 (the asterisk) and then more or less plateaued about halfway through the study. In contrast, the chest showed significant growth by week 3 and a more or less constant, gradual increase to the end of the study, as did 1RM (which you'd expect as muscle size increased gradually).

So why am I bringing this up? This study was identical in duration to the Radaelli study at 6 months and used a very moderate volume of 9 sets per week for chest with no other confounds in terms of exercises being done. And it generated measurable growth in both the pecs AND triceps with nothing but that volume in rank beginners. And yet somehow Radaelli couldn't generate ANY triceps growth until 45 sets were done. None. Zero. Zip. Nada. Sure.

Like I said, Radaelli's results run counter to every study ever done on the topic, along with common sense. Combined with the FFM data, where the calisthenics group outperformed the low-volume group and matched both the medium- and high-volume weight training groups, there's simply no reason to even consider the results as meaningful or correct. The data is seemingly random and makes no sense; there is no logic to biceps getting at least some growth from 18 sets but triceps getting NONE from 27 and needing 45 to get anything.

So forget this study.


This paper is a pile of crap so far as I'm concerned. The results are random and fail every reality check while contradicting every previous beginner study to date. I'm throwing it out and you can agree with that decision or not, but I think the results are garbage and I will be ignoring it going forwards. It doesn't materially impact my conclusion, as you'll see in Part 3.


Effects of a Modified German Volume Training Program on Muscular Hypertrophy and Strength.

The next paper I want to examine was done by Amirthalingam et al. and published in the Journal of Strength and Conditioning Research in 2017. It recruited 19 healthy males with an average resistance training experience between 3.5-4.8 years, although the standard deviations were pretty big (the 10-set group was 4.8 +/- 3.9 years, a spread of roughly 1-9 years). So they were trained, at least to varying degrees.

However, looking at bench press, the 5-set group had a 1RM of 70.7 kg at a body weight of 74.8 kg for a 0.95 X bodyweight bench, while the 10-set group had a 1RM of 79.7 kg at a body weight of 77.5 kg for just over a bodyweight bench. This is a little weird as you'd expect individuals with this much weight training experience to be stronger. There is also the roughly 9 kg/20 lb difference in bench between the groups, though it's hard to say if that's meaningful or not. Certainly they were weaker than the subjects in Ostrowski, which could impact the results.

Subjects were placed into one of two groups who performed what was called a Modified German Volume Training (GVT) program (GVT was originally popularized by Charles Poliquin and consisted of 10 sets of 10 at 60% of 1RM). The following split routine was used; body composition was measured by DEXA with muscle thickness of the biceps, triceps and quads measured by ultrasound.

Day 1: Flat bench, incline bench, lat pulldown, seated row
Day 2: Leg press, lunges, leg extension, leg curl, calf raise
Day 3: Shoulder press, upright row, biceps curl, triceps pushdown


The two groups performed either 5 or 10 sets of the first compound exercise for each muscle group (i.e. flat bench, pulldown) at 60% of 1RM (so the loads were submaximal) and this was followed by 3-4 sets for the secondary movement (i.e. incline bench, row) at 70% of 1RM so closer to limits.   This makes it a little bit different than most studies I’ll look at since a lot of the volume was submaximal but, as above, that’s the GVT method.

Counting volume is a hassle here but the 5-set group did 8-9 sets hitting triceps on Day 1 (5 sets bench, 3-4 sets incline) and another 8-9 on Day 3 (5 sets press, 3-4 sets triceps pushdown) for a total of 16-18 sets/week, while the 10-set group would have done 13-14 sets on Day 1 and another 13-14 sets on the arm day for 26-28 sets per week. Biceps is vaguely in the same range depending on whether you count the upright row as hitting biceps (it is a pulling movement). For legs, quads would have gotten 5 or 10 sets of leg press plus 3-4 of lunges and 3-4 of leg extensions, so that's 11-13 sets (5 + 6-8) or 16-18 sets (10 + 6-8).
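As a sanity check on that arithmetic, here's a small sketch (my own, based on the protocol as described above) that reproduces the weekly triceps ranges:

```python
# Weekly triceps sets in the modified GVT protocol, counting all pressing work.
# Each pressing day = primary movement (5 or 10 sets) + one secondary movement
# (3-4 sets), and triceps get hit on two days per week (Day 1 and Day 3).
def triceps_weekly_range(primary_sets):
    day_low, day_high = primary_sets + 3, primary_sets + 4
    return (2 * day_low, 2 * day_high)

print(triceps_weekly_range(5))   # (16, 18) sets/week for the 5-set group
print(triceps_weekly_range(10))  # (26, 28) sets/week for the 10-set group
```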

Looking first at overall body composition, the results for lean body mass and regional lean body mass appear below.  I’ve shown the largest changes in each in red to make it easier to see.  I am not saying that these are or are not statistically significant, just looking at absolute magnitudes.

Measure 5-set (pre) 5-set (post) Change 10-set (pre) 10-set (post) Change
LBM (kg) 58.3 60.1 +1.8 61.8 63.0 +1.2
Trunk LBM (kg) 26.5 27.6 +1.1 28.5 28.8 +0.3
Arm LBM (kg) 7.7 8.3 +0.6 8.6 8.9 +0.3
Legs LBM (kg) 19.9 20.0 +0.1 20.6 21.1 +0.5



The researchers reported that the 5-set group increased total LBM by 2.7% while the 10-set group increased by only 1.9%, with the 5-set group showing a SUPERIOR response trend for trunk and arm. I think it is interesting to note that, while non-significant, the leg growth was higher in absolute terms in the 10-set group, although overall small in magnitude (0.5 kg vs. 0.1 kg). Let's face it, 5X not a lot is still not a lot.

Looking at muscle thickness, they report the following data:

Muscle (thickness, mm) 5-set (Pre) 5-set (Post) Change 10-set (Pre) 10-set (Post) Change
Triceps 41.1 43.4 +2.3 42.0 46.5 +4.5
Biceps 33.1 35.5 +2.4 34.6 34.9 +0.3
Anterior Thigh 53.1 55.7 +2.6 53.3 54.4 +1.1
Posterior Thigh 66.4 67.6 +1.2 66.7 68.9 +2.2

Looking at muscle thickness, they report no significant interactions for either group, meaning that everybody grew the same, and the results are kind of all over the place. I've put the largest changes in thickness in red above and you can see that 5 sets was better for biceps and anterior thigh while 10 sets was better for triceps and posterior thigh. So there's really no aggregate advantage for either training volume. 5 sets was superior on two measures, 10 sets was superior on two measures, and now we go to Thunderdome to find a winner.

I suppose it's possible that this represents a difference in optimal volumes for biceps and triceps, but it's more likely that the small number of subjects made the study underpowered to detect whether this was a true difference or just statistical noise. It might also depend on whether or not you count upright rows as a biceps movement, which would change the set counts a bit, but the difference would be small. Overall, neither group showed any real advantage.

Admittedly this study was small and short but overall it found that higher volume did not generate any further growth than lower volume in terms of actual muscle thickness. Specifically, 26-28 sets per week for arms (counting compound and direct work) was not superior to 16-18 sets per week. The leg volumes were very similar, 11-13 vs. 16-18 sets, which might explain the lack of differences there.

Again, leg LBM gains were a little higher for the higher volume group, which fits with anecdotal beliefs that legs need more volume, but the differences were still small overall and the volumes were so close as to make it nearly irrelevant. Once again, without truly higher volumes for legs, we can't really draw many conclusions. Perhaps 20+ sets for legs would be better but this study didn't examine it. As well, the volume for triceps and biceps was near the high end so we don't know if lower volumes would have been equally effective. Without data, no conclusions can be made.


Overall, this study found no difference in upper body growth with very high versus lower volumes, although the leg volumes were so close together it's hard to tell. Growth was similar with 11-18 sets for legs and with 16-18 sets for the upper body, with 26-28 sets showing no greater growth. Whether lower biceps/triceps volume would have been as effective, or higher leg volume more effective, is unknown.

Effects of a 12-Week Modified German Volume Training Program on Muscle Strength and Hypertrophy—A Pilot Study

This was a follow-up paper to the previous one, done by the same group with Hackett as the primary author. It was literally identical in structure and methodology with the primary difference being that it ran 12 weeks rather than 6. This was done to address the fact that the original had found marginally different results from other studies in terms of strength gains being greater with the lower volume work. They comment that other researchers suggest studies need to be at least 12 weeks to show meaningful results, and they wanted to see if their previous results were real or just an artifact of the study being so short.

Towards that goal, 12 males (an admittedly small number) with a minimum of 1 year training experience were recruited and split into one of two groups, with the design being identical to the previous study in terms of the workout done and the number of sets, so I won't repeat that here. Pretty much the entire methodology was identical; genuinely, it was just twice as long as before. The subjects had similar levels of strength as in the previous study: the 5-set group weighed 75 kg with a starting bench of 76 kg (a bodyweight bench) and the 10-set group weighed 83 kg with about an 82 kg 1RM bench, so again a bodyweight bench. Fairly beginner, but that's consistent with the minimum 1 year of training experience.

The real limitation of this study was the measurement of body composition: lean body mass and fat mass via DEXA with no direct measurement of muscle thickness being made. As I've mentioned, LBM gains are not a fantastic indicator of actual muscle gains and are, at best, kind of indicative of what's going on. It's a little curious that the 6-week study used ultrasound and this one didn't, but the title of this paper...a Pilot Study...suggests it might actually have been done first. Generally speaking, pilot studies are small with a reduced methodology (reducing cost) to see if it's worth doing a longer/larger study or not. Really just a proof-of-concept study before you throw the big bucks at bigger studies. Which makes me wonder why this is written as if it were done second. I truly have no idea what's going on.

Looking at body composition changes, the 5-set group gained weight and LBM throughout with a small increase in fat mass while the 10-set group did not (while not statistically significant, their average weight went up in the first 6 weeks and then decreased). As per the previous study, trunk, leg and arm mass were measured and there were no differences between groups (so the lower volume worked just as well as the higher). Of some interest, leg mass started to decrease from week 6 to 12 in the 10-set group, although the exact reason is unclear. The researchers note that the 10-set group started to lose weight and BF% in the second half of the study so this might have been a diet issue.

Measure 5-set (pre) 5-set (6 wks) 5-set (12 wks) 10-set (pre) 10-set (6 wks) 10-set (12 wks)
LBM (kg) 59.9 60.6 60.9 62.5 63.8 63.5
Trunk LBM (kg) 26.7 27.7 28.0 29.1 28.8 29.6
Arm LBM (kg) 7.9 8.5 8.5 9.1 9.1 9.2
Legs LBM (kg) 20.3 20.4 20.5 20.7 21.7 20.4

Of perhaps more importance, while the 5-set group gained a small amount of body fat, the 10-set group lost a small amount. This suggests that the diets might have varied, with only the 5-set group being in a surplus while the 10-set group was in a deficit. The lack of dietary control is always a problem with such studies (and why global measures of body composition are informative in addition to direct muscle size measurements). But it could have altered the results, to be sure.

Looking at strength gains, neither group increased their leg press 1RM, which is weird, but the 5-set group is reported as being the only group showing strength gains in terms of bench 1RM. This occurred over the first 6 weeks of the study but I think this is a misrepresentation of what actually happened. I've shown the strength changes below.

GVT Bench Press gains

As you can see above, the 5-set group started off with a lower 1RM bench press, which wasn't statistically significant but is probably real-world significant (7ish kg = 15ish pounds), and basically caught up over the first 6 weeks. So yeah, they made more strength gains to end up just as strong, so meh. And both groups made the same small improvement from week 6 to 12, both of which were significantly different from starting levels. So I don't think this means much. The bigger conclusion is that, over the length of the study, the absolute strength gains were really no different between groups despite the different training volumes: the moderate upper body volumes worked just as well as the higher. Once the 5-set group caught up, there was no meaningful difference in gains from that point until the end of the study. Even the slope of the line is basically identical.

I am not going to present the detailed statistical analysis they did in terms of effect sizes but only the conclusions.    They state:

The trivial effects for leg lean mass in the 5-SET group compared to the small effect found for the 10-SET group at six weeks suggests that a greater volume over a relatively short duration is effective for leg hypertrophy gains. These trivial effects on leg lean mass were maintained over the whole 12 weeks for the 5-SET group, whereas, as discussed previously, there was an unusual decrease for this measure in the 10-SET group. In contrast, for the upper body hypertrophy measure at each time point, there was a tendency toward slightly greater effects for the 5-SET compared to 10-SET group. This may also point toward a different response to resistance training volumes for muscles of the upper and lower body.

Which is basically a long way of saying that, statistically, the higher volume group was a little better for leg mass gains while the lower volume group was a little better for upper body mass gains. But the effects were tiny in either direction; a trivial versus a small effect just doesn't mean a lot in statistical jargon. Even the researchers didn't think it meant much and only mentioned it for completeness. It is at least generally similar to the previous study in terms of the results and their direction. Anecdotally, many feel that the lower body needs more volume than the upper body, and this speaks to that, if just barely.

Since it bears repeating: the 5-set group did 8-9 sets hitting triceps on Day 1 (5 sets bench, 3-4 sets incline) and another 8-9 on Day 3 (5 sets press, 3-4 sets triceps pushdown) for a total of 16-18 sets/week, while the 10-set group would have done 13-14 sets on Day 1 and another 13-14 sets on the arm day for 26-28 sets per week. Biceps is vaguely in the same range depending on whether you count the upright row as hitting biceps. For legs, quads would have gotten 5 or 10 sets of leg press plus 3-4 of lunges and 3-4 of leg extensions, so that's 11-13 sets (5 + 6-8) or 16-18 sets (10 + 6-8).

Putting that together we get:

Measure 5-set 10-set
Upper body volume 16-18 sets 26-28 sets
Upper body gains Slightly better Slightly worse
Lower body volume 11-13 sets 16-18 sets
Lower body gains Slightly worse Slightly better

With the differences being essentially insignificant in the big picture.  It is interesting that the lower body did a little bit better with a rather small volume increase (only 5 sets difference) but this still doesn’t help to answer the question of whether or not higher leg volumes (i.e. 20+) would be superior.  But basically the higher volumes of training didn’t outperform the lower volumes.


So overall, even with double the training volume and workout duration (over a longer study), the results were basically the same, supporting the idea that moderate volumes are sufficient for maximal gains in the upper body. The lower volume group did just as well as the higher volume group for the upper body, with a small indication that legs responded to the higher (but still relatively reasonable) weekly volume, volumes which were in the same range as the moderate upper body set counts to begin with (i.e. 11-13 or 16-18 sets vs. 16-18 sets). Since no higher lower-body training volume was tested, no conclusions can be drawn as to whether that would have been superior.

And that’s where I’m going to cut Part 1 of this series.  In Training Volume and Muscle Growth: Part 2, I’ll look at three newer studies in detail to see how they fit with the above before finishing up in Part 3 by trying to put together a model based on all of the data (except the Radaelli study which I will ignore going forwards).

Read Part 2 of Training Volume and Muscle Growth.

A Challenge to Brad Schoenfeld and Others

So I had originally said I would leave this be, that this wasn’t a rap battle, after writing my last detailed criticism of the recent Brad Schoenfeld study.  Well clearly that’s not the case.

More on the Statistics

First let me point readers to a thorough analysis of the statistics used in Brad’s paper by Brian Bucher.  Basically he takes them apart and shows that none of the THREE metrics supports their strongly worded conclusions.

None of them.

In this vein, here’s something interesting.

Brad and his group have NEVER used Bayesian statistics until this paper. I searched my folder of his papers and the term Bayesian shows up 4 times. Three are papers that Menno Henselmans was on, where the term only appears in his email address. The fourth is one of James Krieger's meta-analyses. At best, James has used them before.

Now I find this interesting because there is no way to know if Brad and James had planned to use this approach ahead of time. James has asserted that they did but this cannot be proven. Here's why: in research it is common to register trials before doing them. This is required in medical research by the Declaration of Helsinki (note that not all journals choose to follow this).

Basically, you outline your goal, hypothesis, methodology and what statistical methods you intend to use. This prevents researchers from gaming it after the fact, i.e. using different statistical methods to try to make an outcome happen. That practice is common in research: you just keep throwing different statistical methods at your data until something says it's significant. Registration is just another way to reduce bias or, in scientific terms, shenanigans.
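To illustrate why this matters, here's a toy simulation (my own, and idealized: it treats the analyses as independent, which real re-analyses of the same data are not) showing how "try analyses until one works" inflates the false-positive rate:

```python
import random

random.seed(0)
ALPHA, K, TRIALS = 0.05, 3, 100_000

# Under the null (no true effect), a valid test's p-value is uniform on [0, 1].
# If a researcher runs K analyses and reports whichever gives the smallest p,
# the chance of a "significant" result is 1 - (1 - ALPHA)**K, not ALPHA.
false_positives = sum(
    min(random.random() for _ in range(K)) < ALPHA for _ in range(TRIALS)
)
rate = false_positives / TRIALS
print(f"nominal alpha: {ALPHA}, effective rate with {K} analyses: {rate:.3f}")
# effective rate is ~0.14, close to 1 - 0.95**3 = 0.143
```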

Brad Schoenfeld appears to have never registered a single trial of his. Again, apparently he is above scientific standards. But this allows for the following potential to occur.

They gather their data, colored by every issue I already described. Now they run standard P-value stats and find that there's no difference between moderate and high volume. That's what Brian's analysis showed, as the subscript on the 3- and 5-set groups is identical. They were NOT different from each other by that method. Both were better than low volume, but moderate and high were IDENTICAL statistically.

In the absence of registration, there is NOTHING to stop them from applying another method, say Bayesian to try and make their desired outcome happen.  Which they did.  Even here, the Bayesian factors were weak, approaching not worth mentioning by standard interpretations.  This didn’t prevent them from making a strong conclusion in the paper or online.  Brad is still crowing about it despite the simple fact that the statistics do not support the conclusion.

There is no way to prove one way or the other whether they did this. But without registration of the trial, they can't prove that they didn't, and the onus is on THEM to do so. Somehow I doubt that they will provide said proof. Registering the trial would have prevented yet another criticism of their paper. More below.

More on the Individuals Involved

Before I get into anything else, I want to examine how three individuals involved in this responded to both my original and more recent criticisms.

Brad Schoenfeld: Brad ducked them completely.  He said he wouldn’t respond because I insulted him.  You know who else told me that?  Layne Norton when I took him to task over reverse dieting years ago.  Then Brad left my FB group and blocked me on Facebook.  Because it’s always easiest to win an argument when you don’t allow for dissenting opinions.  Gary Taubes, Noakes, Fung and Layne do the same thing.  It’s standard guru operating procedure.  Just make any criticism you can’t address disappear.  If anyone had done it to Brad you’d have never heard the end of it.  But he’s above the law.

I’d note that I never removed Brad from my group.  He left voluntarily, presumably because he got tired of seeing people ask him to address my criticism when he couldn’t.  So he just punked out completely.  I’d also note that he played the “you don’t even science” card on a couple of critics.  That’s a Layne Norton tactic too.  Typical guru approach.

I’d add that Brad also blocked Lucas Tufur (who wrote an excellent article on their paper) as well for the mere act of “suggesting Brad was misinterpreting a study somewhat”.  Typical guru behavior and don’t pretend it’s anything else.

Science is based on discourse and debate and that is how it progresses.  Honest scientists embrace debate because it gives them the opportunity to defend their work (flatly: if you can’t address criticism, perhaps your work is not as strong as you think).

When Brad writes letters to the editor about papers he doesn’t like, he expects a response.  But just as with blinding and randomization and Cochrane guidelines, Brad is clearly above the scientific method.  He gets to guru out. Others do not.  The behavior he would never allow anyone else to engage in is acceptable for him and him alone.  Well and other gurus like Layne Norton who built himself up as the “anti-guru” until he became one himself.  The standards he held ALL OTHERS to stopped mattering when he was selling reverse dieting (seminars on which bought him a mansion).

Eric Helms: Now, in a sense, Eric Helms has no dog in this fight in that he wasn't involved with the paper. Except that he does, because of how he dealt with this issue. Over email, I had asked him about it and told him my issues; he debated back and forth before telling me he hadn't even read the paper. But he was already defending Brad.

In my group (and note that I tagged him, forcing him to get involved) I asked him about Brad's misrepresentation of the Ostrowski data. His response, a total deflection, was "But you've done it too." I asked him when. And now get this: he referred me to a NINE YEAR OLD blog post I did on FFMI. NINE YEARS OLD.

In it, I looked at some data from the Pope paper on previous Mr. America winners and had stated that only one or two exceeded the FFMI cutoff. The real number was closer to 6. But it's of no relevance: my mistyping didn't change my conclusion that, while there are exceptions to the FFMI cutoff, overall it is a good cutoff in 99% of cases. But it was an error, yes. I admitted it and changed it immediately (because intellectually honest individuals admit a mistake and fix it, something more in this field should try). I'd have changed it sooner had I known before a week or so ago.

Regardless, what Eric did was to compare a nine-year-old blog post (where the error changed nothing) to a scientist lying about data (in such a way as to change its conclusion) in a published peer-reviewed journal, said scientist using that lie to support a paper's conclusion and increase his own visibility (and presumably seminar visits at $5k per appearance).

I’m sorry but does Eric really think a NINE YEAR OLD blog post can or should be held to the same standard as a published scientific paper?  Apparently so but only because it allowed him to completely avoid addressing my issue with Brad’s paper.  It was a guru deflection and nothing more.   He has never since addressed a single criticism I have levied against Brad’s paper.  NOT ONE.

More than that, it was a blindside. Eric has been a colleague and, I guess, friend for many years. I edited his books; he contributed a lot to The Women's Book, including an appendix on peak week and making weight. He could have told me about this error at any time in the last half decade. Instead, he apparently saved it up as ammunition against me in case he ever needed it. It would be like me leaving one of the myriad errors in his books when I edited them, to use against him if the need arose. But I didn't do that. Because I have intellectual integrity.

Eric has since blocked me on FB and left my group. Again, it's easy to win an argument when the other person can't defend themselves. He has apparently claimed it was due to me 'impugning his integrity'. Sorry, Eric, I can't impugn something that doesn't exist. He has acted unprofessionally and, in an effort to defend Brad (with whom he also does seminars), blindsided a different colleague entirely.

A man with integrity would not deflect a real criticism with a blindside.  A man with integrity doesn’t have to crow online about how he has integrity.  A man with integrity shows that he has integrity by his actions.  By being intellectually honest and not applying a pathetic double standard when it suits him.

Eric has not shown integrity in this matter.

Whatever, I will show him the meaning of true integrity shortly.

James Krieger: And finally James himself. First let me point readers to a FB thread where James is getting kicked around by Lucas Tufur over his recent post. You can watch James mis-reference and mis-represent studies front, back and sideways while Lucas points out his errors and he just moves the goalposts.  Maybe that will tell you what’s going on.  It’s just desperation at this point, he can’t admit that the study was methodologically unsound, the statistics didn’t support the conclusion or say they were wrong.  So it’s pure guru behavior.

Now, I will continue to give James credit in that he was the only one with the balls to even attempt a defense. Brad punked out and deflected, and Eric did too, which is simply pathetic. James at least tried, even if he used the same guru tactics, deflections and obfuscations in doing so. He still doesn't understand and can't rebut Bucher's analysis of his stats, and even then I'd question why someone with an MS in nutrition and exercise physiology is doing the stats in the first place. That's what mathematicians are for. I have a cousin with an MS in applied mathematics and she runs the stats on big medical trials. Why is James doing it?

Well, I think it's simple: he's good enough at it (mind you, computer programs do most of the work at this point), knows how to use Bayesian statistics to obfuscate stuff and, most importantly, shares Brad's bias about volume. An unbiased statistician wouldn't play silly buggers like James did. Note again my comment above: in an unregistered trial there is NO evidence that James didn't run the frequentist methods first and then, when they didn't support the conclusion they wanted, use other stats in a feeble attempt to make it happen. This happens a LOT in science. That's why you register trials: to eliminate the potential for, or accusation of, exactly that. It's why you randomize and blind: to reduce the RISK of bias.

After writing my last criticism, James first attempted to defend against some of the criticisms before writing HIS own final response.  You can find it here.  Note that the intellectually honest individual shows both sides of the story, something I doubt he has done.  Now I’m sure he made some good points.  But he also made some gross misrepresentations, ones most won’t catch.  Some highlights.

Lyle Doesn’t Even Science

James asserts that I simply don’t have enough experience doing science (there it is again, “Lyle doesn’t even science”) to understand the realities of doing it.  And yet I know that proper randomization, blinding, trial registration, data reporting etc. are good practices to reduce the risk of bias.  Maybe Brad and James should do some remedial work since I seem to know a lot more about good scientific practices than they do.  Seriously, if a bunch of randos on the Internet are having to educate ‘professional’ researchers about basic methodology….

Cost and Funding

He also blathers about the cost and funding involved and how many of the methodological issues aren’t realistic financially.  I never said science was easy and I know it’s expensive, so there’s his strawman.  According to Google Scholar, Brad has published about forty-seven papers already this year.  That’s about 5 per month since it’s only September; most researchers do maybe 1 a year because that is how long data gathering and analysis usually take.  Funding is clearly not an issue and perhaps Brad should do one GOOD study per year instead of putting his name on 5 per month.  Or maybe that’s why he’s hiring a non-mathematician to do the stats: he can’t afford an actual statistician because he’s spending his funding on too many poor-quality papers.

Let me add: describing the randomization of a study is free.  Registering a trial is free.  Blinding might increase the costs for technical reasons but, as above, rather than doing endless sub-par studies, why not put the funding towards fewer QUALITY studies?  The same fancy computer that James uses to run his stats can randomize subjects to the different groups at no cost.  Certainly getting a second ultrasound tech might cost money; perhaps Brad is the only one on campus trained in it.  But he could still be blinded to who he is measuring which, per Cochrane, reduces the risk of bias from high to low.
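As a trivial demonstration of the "free" part, here is a sketch of computerized group allocation.  The subject IDs and group names are made up for illustration; nothing here comes from the actual study.

```python
import random

# Hypothetical subject IDs; in a real trial these are the enrolled participants.
subjects = [f"subject_{i:02d}" for i in range(1, 21)]

def randomize(subjects, groups=("1_set", "3_sets", "5_sets"), seed=2018):
    """Shuffle the subject list and deal it into groups round-robin.

    Recording the seed makes the allocation reproducible, which is the
    sort of detail a registered protocol can document at zero cost.
    """
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    return {g: shuffled[i::len(groups)] for i, g in enumerate(groups)}

allocation = randomize(subjects)
for group, members in allocation.items():
    print(f"{group}: {len(members)} subjects")
```

A dozen lines, no budget required, and every subject lands in exactly one group.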

Blinding of Ostrowski

He also babbles something about whether or not Ostrowski was blinded or why I didn’t mention it above.  This is a pure deflection.  Essentially he’s arguing that since other studies might be methodologically unsound, it’s ok for theirs to be.  This is like arguing in court that “Yes, this man may have murdered someone.  But how do we know YOU haven’t murdered someone?” to deflect attention from the issue at hand.  The methodology of Ostrowski is not in question here; the methodology and discussion of Brad’s paper is.  Regardless, it’s irrelevant.

Whether or not the Ostrowski study was blinded doesn’t matter because I’m not the one holding it up as providing evidence.  I agree that there is a trend as claimed by Brad but I’m not using the data per se as evidence.  If James is saying it should be dismissed for not being blinded, then Brad can’t use it in his discussion to support his conclusions.  And in using it in his discussion, Brad is saying it’s valid.  Which means that how it was represented is all that is at issue here.  And it was misrepresented completely, a point that James finally acknowledged himself.

Basically, either the data is valid or it’s not and James can’t have it both ways.  And my actions don’t impact that.  Only Brad’s do.

And that is still just a deflection from the fact that, whether the Ostrowski data is good or not, BRAD LIED ABOUT WHAT IT SAID to change its conclusions from contradicting him to agreeing with him.  Of course, James has still failed to address that, so far as I can tell, at least not directly.  He even said it was a misrepresentation.  Ok, so why is it still not an issue that has to be addressed?

Put differently, why does Brad Schoenfeld get to lie about data in a published paper and nobody blinks?  I make an inconsequential error in a 9 year old BLOG post and I’m at fault.

Edema Studies

James also argues that the studies on edema timing aren’t relevant since it was a new stimulus to the trainees.  The Ogasawara study cited in Brad’s paper was in beginners while the Ahtiainen study I cited in my last piece was a long-term training study in strength-trained men.  So James is not only trying to have it both ways but he’s factually wrong: the study THEY used is in untrained individuals, the study showing edema is in trained individuals.  So this is just more of his endless deflection.  I refer you back to the link above where Lucas Tufur is kicking James around on this topic and you can see James continue to defend what is indefensible.

Oh yeah, James Krieger has now blocked me on FB as well, right after publishing his article.  But it’s always easier to win an argument when the person you’re arguing with or attacking can’t argue back, isn’t it?  I’d note again that I left all three of them in my FB group to give them the opportunity to address criticisms and all three voluntarily left.  I did not and would not have blocked or booted them so that I could win by default.  They left by choice.  And then they blocked me so that I don’t even have the opportunity to respond to them.

On his blog, James has asserted that he blocked me due to me sending them nasty emails and calling them mean names.  All true.  And?  First and foremost, Layne Norton played the same card: said he didn’t have to address my criticisms since I made his wife cry (I wonder if she cried more when he left her for an Australian bikini chick).  Second, this is just disingenuous posturing.  I’ve been emailing Brad, Eric and James for a couple of weeks calling them guru shitbags and more (my creativity for insults is quite well developed after so many years).

And it wasn’t until the day he posted his last response that he blocked me (or at least that’s when I noticed; he had been in my group the previous week).  He waited to do it to ensure I couldn’t respond to him.  So it’s just another, well, let’s call it what it is: a lie.  He’s lied in his articles about studies and data, and he’s lying now.

Finally, what is with this industry and otherwise big muscular dudes being such insecure children?  Do words on a screen really hurt them that much?  Or is this just more pathetic guru behavior to avoid my criticisms? Hint: it’s the latter. Seriously, these guys need to eat less tilapia: their skin is too thin.  But I digress.

The Anecdote

Oh yeah, if you look at James’s article, he put up a picture of one of his clients who does the high-volume bicep work as supposed proof of concept.  But the last time I looked, anecdote (i.e. one individual) doesn’t count as science.  It never has and never will.

This is like the cancer quack holding up a SINGLE survivor from their program and ignoring everybody who died.  This is like Layne Norton, who pretends to be the anti-guru, saying he’s got 100’s of emails so science doesn’t matter.  So James, either pretend to be a scientist or don’t.  Don’t play silly buggers where it’s science until it’s not.  Even if it’s bad science in this case, which it is.

If I were a different person, I’d do the same and put up a picture of someone who got big using moderate volumes.  But that’s not an argument that has any validity so I won’t.  I’ll leave such nonsense to gurus and pay attention to the scientific facts.  If you want to science, stick with the science.  If you want to use anecdote, that’s fine.  But don’t pretend it’s science.  James wants it both ways, just like he does in most of his discussion above.

The Guru Crew

So add James Krieger to the guru group of Brad Schoenfeld and Eric Helms.  Their actions are no different than endless others before them: Tim Noakes, Dr. Fung, Gary Taubes, Layne Norton, individuals that most reading this piece see as stupid gurus while they give Brad, James and the rest a pass for identical behaviors.

Individuals who just block and ignore criticism rather than address it, usually after endless deflections and obfuscations.   Brad ducked every criticism, Eric deflected it and blindsided me and James used a mix of deflection and obfuscation.  Standard guru operating manual.

If that doesn’t tell you everything you need to know about the conclusions of this study, no amount of in-depth analysis by me or Lucas Tufur or Brian Bucher will help.  They have played nothing but guru games from the get go.  They shouldn’t get a pass when others do not.  And yet here we are.

Others at Play

I am told that others have joined the circle jerk.  Greg Nuckols of course, but he writes for MASS with Eric Helms so of course he has to agree.  There’s also that pesky seminar circuit he has to keep himself on to make the big bucks.  And all are, of course, saying my criticisms have no weight or that I have no weight.

Nevermind that Brad routinely shared my previous research reviews, that Eric thought enough of me to contribute to my book, and that James and I have done a webinar together.  All I have done is ask them to address specific criticisms which, for the most part, they have not.  So like many before them, they resort to simple ad hominem attacks against me.  Now I don’t even science, now I don’t have any weight.  They didn’t mind when they were making money off of me.  But that’s par for this course, isn’t it?

This is every guru in the history of ever.  Food Babe, the Snake Diet guy, more than I can think to name.  The true guru ignores criticism, deflects, obfuscates, blocks critics and then attacks them in a forum where they can’t defend themselves.  Don’t tell yourself it’s anything else.  Don’t let the fact that you like them and think I’m a foul-mouthed asshole color it.  Simply ask yourself why this group of individuals is being allowed to engage in behavior that, if it were anybody else, would be criticized and destroyed.  It’s that simple.

Ask yourself why they get a pass.

A Challenge to the Whole Crew

And now my challenge.  All of the individuals involved claim to blend science and practice in terms of their training recommendations.  I have done the same for nearly 25 years (and note that most of these people came up reading MY books to begin with).  I’ve been in the weight room since I was 15 and I’ve been involved in the science of performance since college.  That’s 35 years of training and nearly 30 years of being a training and nutrition nerd.  I know that science is good but limited; so is anecdotal evidence.  So you have to see how they fit together.  Sometimes science supports what the bros knew (protein is good), sometimes it contradicts it (meal frequency doesn’t matter).  It’s useful to compare.

But if the crew is going to say that they blend science and practice, well, that has an implication: new data, such as Brad’s volume data, should be incorporated into their training advice or how they train themselves.  But will it be?  Somehow I doubt it.

I doubt that any of them are going to move their trainees to the workout that Brad’s data would suggest as optimal.  And that tells you all you need to know.  If they believe in this study’s results, they should all adapt their own personal training and that of their trainees to it.  When/if they do not, then they either don’t believe it or are being disingenuous in saying they blend science and practice.  It’s that simple.

They can’t have it both ways.


Ignore the study, ignore the fundamentally flawed methodology, ignore Brad’s lie about the Ostrowski data, ignore the back and forth between the nerds and everything else and just ask yourself this question:

If Brad et al. think this data is valid, why aren’t they implementing it in their trainees?  Yes, I know, Brad has prattled on about using high-volume overreaching cycles with folks.  Huge volumes for a couple of weeks.  Fine, I’ve no issue with that.  But this study was 8 weeks straight of volumes no human, juiced or otherwise, has done with good result.  If they’ve done it at all.

Now, if I saw data that I found applicable, I would implement it into the advice I give.  If I found data I didn’t agree with, I wouldn’t.  I don’t think this study is worth a damn and won’t change my approach to training or my recommendations based on it.  I’m finishing up an analysis of the 7 extant volume papers to present in a week or two when it’s done.  And the gross data supports not only what I’ve always recommended but will continue to recommend.

But I bet that they will not.  They are defending a piece of data that I can almost assure readers they will not apply.  Why?  Because they know it’s bullshit.  They know it’s not right.  Because if it were, they’d have moved their trainees to that style of training 2 weeks ago.  I even asked them via email when they would be changing their own or their clients’ training.


They know this finding doesn’t mean shit.  It also goes against 5 of the 7 total papers on the topic: one is garbage, 5 say moderate volumes beat out high, and Brad’s paper is the outlier that’s left.  In science, you build models from the totality of the data, not a single study.  Again note that James stated this explicitly, how you can’t draw a conclusion from a single paper, implying that I was doing that.  Except that I wasn’t; Brad et al. are the ONLY ones drawing strong conclusions and it was yet another deflection by James, attempting to project THEIR behaviors onto me.  I love it when other people make my argument for me.  I will be examining all 7 current studies on the topic in a week or two and you’ll see what falls out of the model.  And it’s not Brad’s conclusions.

But whatever, I’m back into nerd mode.

Do the Workout

So back to the challenge, either to them or anybody who thinks they are more right than I am (which is fine, I never said everybody had to agree with me, I just asked that they address my criticisms honestly which has not been done).

Do the workout.  If you think the results are valid then DO THE WORKOUT.

And because I am a helper, I have drawn one up based on THEIR findings.  Recall that it suggested 30 sets per muscle group per week for upper body and 45 for lower body as providing optimal growth.  This was in individuals with a minimum of 1 year of training and their strength levels were advanced noob at best.  So if you’re not a rank beginner, and believe their data, it applies to you.  One year or more of regular training and at least a 1.2x bodyweight squat and 1.1x bodyweight bench and you can DO THE WORKOUT.

But also keep in mind that the study only used compound movements (ok, leg extension for quads) and only measured quads, biceps and triceps.  We have no data on pecs or back or delts or glutes or hamstrings, so volumes for those have to be extrapolated from the measured muscle groups, especially for lower body.  And the below is what that implies in terms of a practical workout scheme based around the performance of 30 sets/week for each upper body muscle (all sets to failure) and 45 sets/week for each lower body muscle.

I’ve provided two options below.  The first is a non-split routine training either upper or lower body on each of 3 days/week to get the total volume in.  All sets are done with 90 seconds rest and taken to concentric failure.  If you get more than 15 reps on any given set, add 3-5% to the weight.

Option 1: Non split routine

Mon/Wed/Fri: Lower body
Squat: 5X8-12RM
Leg press: 5X8-12RM
Leg extension: 5X8-12RM
RDL: 5X8-12RM
Lying leg curl: 5X8-12RM
Seated leg curl: 5X8-12RM
Standing calf raise: 5X8-12RM
Leg press calf raise: 5X8-12RM
Seated calf raise: 5X8-12RM

So that’s a 45 set workout for just legs.  If I wanted to get pedantic, I’d suggest another 15 sets for glutes to make Bret Contreras happy*.  I leave that to the individual trainee but that would take it to a 60 set workout three times weekly.  Have fun.
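And "fun" is doing a lot of work there.  Some back-of-envelope arithmetic on that 45-set day (the per-set and warm-up times below are my own assumptions, not study parameters):

```python
# Rough session length for the 45-set lower body day.
# Assumed (my numbers, for illustration): ~40 seconds per work set,
# 90 seconds rest between sets, plus ~10 minutes of warm-up/setup.
sets = 45
set_time_s = 40
rest_s = 90
overhead_min = 10

work_min = sets * set_time_s / 60      # time actually lifting
rest_min = (sets - 1) * rest_s / 60    # rest between sets
total_min = work_min + rest_min + overhead_min
print(f"~{total_min:.0f} minutes per lower body session")  # ~106 minutes
```

Call it an hour and three quarters of all-out leg work, three times a week, before the optional 15 sets of glute work even enters the picture.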

Tue/Thu/Sat: Upper Body
Flat bench: 5X8-12RM
Incline bench: 5X8-12RM
Cable row: 5X8-12RM
Undergrip pulldown: 5X8-12RM
Shoulder press: 5X8-12RM
Lateral raise: 5X8-12RM
Rear delt on pec deck: 5X8-12RM
Face pull: 5X8-12RM
Barbell curl: 5X8-12RM
Preacher or incline DB curl: 5X8-12RM
Close grip bench: 5X8-12RM
Triceps pushdown: 5X8-12RM

Now we might quibble over the above.  Brad’s study did 30 sets/week of compound pushing and pulling, but they measured biceps and triceps.  Should we take out the direct arm work, or is it 30 sets of compound pushing and pulling plus 30 more sets of direct arm work?  I leave it to you to decide and I’ll address the odd way of counting sets (which is fundamentally wrong in my opinion) in a future article.

Split Routine

This is a 6 day/week split routine with each muscle hit once/week.  That means that all 30 sets for upper or 45 sets for lower body muscle groups have to be done on that single day.  Note that even Arnold and his ilk did 20 sets per muscle once a week, and not all sets were remotely close to failure.  With drugs.  Brad is suggesting 1.5 times that for upper body and a little over double for legs.  Enjoy.

Monday: Quads
Squat: 15X8-12RM
Leg Press: 15X8-12RM
Leg extension: 15X8-12RM

You could technically pick more movements but this is what they used in their paper so I’m using it too. If you want to do 5 sets of 9 different movements that target quads, go to town.  But it’s 45 sets of 8-12 to positive failure with 90 seconds rest for quads no matter what.

Tuesday: Chest/back
Flat bench press: 5X8-12RM
Flat DB press: 5X8-12RM
Cable crossover/pec deck: 5X8-12RM
Incline bench: 5X8-12RM
Incline DB press: 5X8-12RM
Incline flye: 5X8-12RM
Narrow grip Cable row: 5X8-12RM
DB row: 5X8-12RM
Shrugback: 5X8-12RM
Undergrip lat pulldown: 5X8-12RM
Medium grip lat pulldown: 5X8-12RM
Cable pullover: 5X8-12RM

Wednesday: Calves
Standing calf raise: 15X8-12RM
Leg press calf raise: 15X8-12RM
Seated calf raise: 15X8-12RM

Thursday: Delts
DB overhead press: 5X8-12RM
Barbell overhead press: 5X8-12RM
Hammer overhead press: 5X8-12RM
DB Lateral raise: 5X8-12RM
Cable lateral raise: 5X8-12RM
Machine lateral raise: 5X8-12RM
Pec deck rear delt: 10X8-12RM
DB bent over rear delt: 10X8-12RM
Face pull: 10X8-12RM
I can’t think of more rear delt movements.

Friday: Hamstrings
RDL: 15X8-12RM
Lying leg curl: 15X8-12RM
Seated leg curl: 15X8-12RM

Saturday: Arms
Barbell curl: 5X8-12RM
DB curl: 5X8-12RM
Preacher curl: 5X8-12RM
1-arm preacher curl: 5X8-12RM
Incline DB curl: 5X8-12RM
Cable curl: 5X8-12RM
Close grip bench: 5X8-12RM
Triceps pushdown: 5X8-12RM
1-arm triceps pushdown: 5X8-12RM
Barbell nosebreaker: 5X8-12RM
French press: 5X8-12RM
Cable French press: 5X8-12RM

Again we might quibble over the set count on arms.  Does it count separately or does the compound pushing and pulling get it done? Another topic for another day.

But there ya’ go.  The applied Schoenfeld et al. workout routine.

Believe their data?  Think I’m full of shit?  Then do the workout.  When you get overuse injuries, overtrain and develop tendinitis, you can buy my injury nutrition recovery book.  Regardless, if you think they are in the right, it’s simple: make up your own mind.  If you believe them, do the workout above and report back on your results.

I love to be proven wrong and will always say I was wrong when that happens.

I also love to tell people “I told you so” and have them tell me “You were right.”

Time will tell which will occur.

See you in 8 weeks.

* I mentioned Bret Contreras above and wanted to add this comment.  Bret is listed as the third author on the paper.  But what did he actually do on it?  Mind you, this is another issue: in most papers, the contributions of each individual author are commonly listed.  Is Bret even in the same location as Brad?  What did he contribute to the study?  As importantly, why has he remained completely quiet on the issue (so far as I can tell)?  He’s not defending it nor promoting it and one has to wonder why.