Training Volume and Muscle Growth: Part 3

Ok, let’s finish this thing up. So far I’ve looked at the 7 current studies (as of this article’s writing in October of 2018) in often excessive detail in Part 1 and Part 2 and now it’s time to put them all together to see how training volume and muscle growth relate. As noted in Part 1, I’m throwing the Radaelli paper into the trash. I consider the results too random and nonsensical to be worth considering.

There is simply no world where growth in triceps in beginners doesn’t start until 45 sets per week but 18 sets for biceps is effective and where LBM gains are higher for calisthenics than low-volume weight training.  So it’s out.  Agree or not, I put my reasoning up front and looked at it in detail to explain why I think it’s garbage so it wasn’t just a hand-wave like most would do.

That leaves 6 studies in trained individuals (minimum 1 year training experience and a usual range of 1-4 years) looking at different volumes of training and the muscle growth response. Yes, they used varying methodologies: some used only body composition methods via DEXA, some used DEXA and Ultrasound, and one used DEXA, Ultrasound and muscle biopsy (of quads only).

As I stated at the outset, I’m going to simply take them at face value for the time being.  It’s the data we have and with the qualifications I was sure to make as I went, it’s what we have to build the model on at this point.  Yes, future data may change the model.   When it arrives, the model will have to be updated with it.

Building the Model

First, let me put the 6 remaining studies together to see if a pattern shows up.

Yes, this is a terrible chart and I probably got at least one of the numbers typed in wrong since I type fast and frequently don’t take the time to check after the fact.  The true guru will dismiss the entire article based on a typo.   But guru gon’ guru and there’s nothing I can do about that.

Instead let’s focus more on the generalities of the data and less on my lack of proofreading.   As no values went down, all numbers represent an increase from the beginning of the study.  Percentage change is a percentage change and any absolute numbers are mm changes in muscle thickness.  I’ve shown the best response in red.

Paper Muscle Low Mod High Volume of best response (sets/wk)
Ostrowski Quads 6.7% 5% 13.3% 12 Nothing higher tested
Triceps 2.3% 4.7% 4.8% 14 28 sets no better than 14
Amirthalingam Totes LBM 2.7% 1.9% Lower volume better than higher
Trunk LBM 4.1% 1% Lower volume better than higher
Arm LBM 7.8% 3.4% Lower volume better than higher
Leg LBM 0.5% 2.4% Higher volume better than lower.
Tri MT 2.3 4.5 18-19 triceps
12-18 quads
For triceps 26-28 sets upper no better than 18-19
Leg volume of 12-13 and 16-18 so similar that I am considering them as part of a full range.
Bi MT 2.4 0.3
Ant Thigh MT 2.6 1.1
Post Thigh MT 1.2 2.2
Hackett DEXA only 18-19 vs. 26-28 upper

12-13 vs. 16-18 lower

Essentially identical to Amirthalingam with moderate volume superior for overall LBM gains and upper body and the slightly higher volume slightly better for lower body.  Differences were overall small and moderate and high volumes were more or less identical.  Very small number of subjects weakens statistical power.  Lack of direct measure of muscle thickness is not ideal.
Haun LBM changes Gains up to 20 sets/week.

No further from 20-32.

By muscle thickness via Ultrasound, triceps showed growth up to 20 sets and then SHRANK.
By biopsy, quads shrank up to 20 sets and grew after that. Unclear why Ultrasound did not pick up quad growth but biopsy did. Total summed changes in triceps and quads a whopping 1.8 mm. Minuscule changes overall.
Over 20 sets per week, water retention measured by TBW and ECW significantly increased, making LBM gains above that point insignificant. Two conclusions: cap to useful volume of ~20 sets per week. Volume without tension is shit for growth because volume is NOT the primary driver of hypertrophy and never will be.
Schoenfeld Biceps 0.7% 2.1% 2.9% 18 sets By their own statistical methods, the highest volume was weakly/insignificantly better than moderate (described as ‘not worth mentioning’ in stats texts).

Conclusion: 18 sets upper and 27 lower optimal with higher volumes showing no meaningful differences (and certainly not for doubling the volume).

RF and VL changes ADDED together for poor comparison to Ostrowski.

Data in question because of an outright LIE in the discussion regarding Ostrowski triceps data.

Triceps 0.6% 1.4% 2.6% 18 sets
RF 2.0% 3.0% 6.8% 27 sets
VL 2.9% 4.6% 7.2% 27 sets
Heaselgrave Biceps 1 3 2 No statistical difference but a trend for 18 sets better than 9 and 27 no better than 18 (or even worse).

Ok, now let me summarize that horrible chart by showing where each study finds that their optimal results fall in terms of sets per muscle group per week.

Study | Optimal Volume
Ostrowski | 12 sets lower body (no higher tested), 14 sets upper body
Amirthalingam/Hackett* | 12-18 sets lower body, 18-19 sets upper body
Haun | 20 sets upper as a cap, possibly 20+ for lower body (needs more study)
Schoenfeld | 18 sets upper (compared to 6 with no middle value for comparison); 27 sets lower (compared to 9 with no middle value for comparison)
Heaselgrave | Trend for 18 better than 9 but 27 no better than 18.

*I grouped the two studies together since they were an identical methodology and only differed by length and I’m getting tired of writing and making tables as it’s a pain in the ass in WordPress.

Looked at this way a pattern starts to show up. Which is that a moderate volume tends to beat out either lower or very high volumes under basically every circumstance in trained individuals (again defined in most studies as 1-4 years of training or a minimum of 1 year regular training). Or rather, a set count somewhere between 10/12 to 20 sets/week provides about the optimal results in all cases, at least within the limitations of the data available. Only Schoenfeld’s leg data exceeds this but this is from a low volume of 9 compared to 27 with no middle value for comparison. We can’t know what would have happened between those numbers.

Let me note that even IF you prefer the conclusion that Brad’s highest volume groups gave a trend towards higher growth, it STILL contradicts the broader body of literature. He still can’t explain why he needed to use 2X or 4X the volume to achieve the SAME growth as Ostrowski. He can’t (read: won’t) explain a damn thing, especially when the data disagrees with him. Let me note again that James Krieger made the explicit point that you have to look at all of the data and not focus on one study. Yup. And what all of the data except Brad’s study says is that 10-12 to 20 sets/week is the right number and Brad’s numbers are wrong. Gotcha, James. You played yourself, too.

Spoiler: My conclusion above is EXACTLY what Eric concluded as well in his MASS piece, that 10-20 sets/week was about optimal (and of course he did because it is what the MAJORITY of data supports). This was after he desperately tried to make Brad’s numbers and study not be total bullshit by dismissing the endless problems with it methodologically, by playing the “I do science” card and all other manners of silliness. Which makes you wonder why he tried so hard to defend it with such pitiful arguments and reasoning. He doesn’t even think the numbers are right or he’d have drawn a different conclusion. And yet he keeps trying to defend it with weaksauce defenses (leg extension volume load hahahahahaha. I will never stop laughing at this). I guess when seminar appearances are on the line you have to toe that line…

But this is kind of interesting because it does actually agree with Brad’s original meta-analysis (I am giving him the benefit of the doubt that it’s worth a shit to begin with and I question that with every passing day) which concluded only that 10+ sets gave the best growth response with no ability at the time to determine an upper cap. So that passes the first reality check. In all cases, up to 10 sets there is a clear improvement in growth response. Above that, TO A POINT, there is a greater growth response but it shows a clear cap where higher volumes do NOT generate a greater response. In most cases, it’s the same; in the case of Haun and triceps, it was worse. As this is the only study showing a worse response at the highest volumes, no global conclusions can be drawn here in terms of more volume being detrimental. It simply isn’t any better.

But despite Brad’s attempts to make huge volumes better, the broader body of work (5 of 6 studies) supports a cap of about 20 sets for upper body, if that, and possibly more for the legs (for which we need far more systematic research).  Not 45 sets/week more.  But possibly more than 20.  This goes along with endless anecdotal beliefs that legs need more training volume but more systematized studies need to be done to show where an optimum might fall and whether or not upper and lower body truly have different optimal volume levels in terms of their growth response.

Now I could cut this article here, going a really long way to reach the above conclusion.  But that would be the easy way out and I’m not done yet.  Because I now want to return to an issue I brought up in Part 1 and said I’d revisit.

The Set Count Issue Redux

I want to return to the issue of how sets should be counted that I mentioned in Part 1. When the Heaselgrave study came out Brad responded with the following to try to make the study fit his conclusions. Or he may have been talking about both that paper and the modified GVT paper as they are the only two that used isolation movements and he needed to dismiss the fact that they contradict his results (never mind that his own study used leg extensions).

He was jumping around a lot and it was tough to tell what he was trying to dismiss to make his own (incorrect) results happen.  Guru speak can be tough since the goalposts change with every post, sometimes including leg extension load volume data when nothing else will work….sorry can’t resist.

His assertion was that perhaps volume requirements are different for compound and isolation movements and that that changed how sets should be counted (which means that his numbers could still be right).   Or rather, that isolation movements should be counted differently (basically trying to dismiss Heaselgrave’s set count accuracy).

And honestly, this is just total guru bullshit in the sense that it is a change in argument when he needed it. For years now, in every study he’s done and every meta-analysis, Brad and his group have counted sets on a 1:1 basis. They’ve always done training with compound movements and measured peripheral muscles and treated the total set count for the compounds as applying to those muscles on a 1:1 basis.

It’s always been 1:1.

If someone does 1 set of bench press, that’s 1 set for every muscle involved in the bench, so one for shoulders and one for triceps. It has to be because even in his own paper he was measuring bicep and tricep size changes in response to compound work. He wasn’t looking at chest and back thickness (again I question why pec isn’t done more since it is clearly technically possible and wonder if back can be measured). If you’re going to look at bicep/tricep growth in response to only compound work (and note the odd little bench press study I described that looked at growth in pecs and triceps in response to bench only which means that pec CAN be measured) and count those sets in total, you’re calling it 1:1 for compound and isolation. Brad and his group have treated it as such from the get go. It’s also how he reported the ‘findings’ of his recent study. He didn’t say “it was 30 and 45 sets, but this has to be kept in the context of measuring triceps and biceps with only compound training movements.” He said 30 and 45 sets was best for growth with NO qualification whatsoever (well, he didn’t qualify anything until he got backed into a corner).

It’s been 1:1 from the get go for him.

Or it was UNTIL a study/studies came out that he wanted to dismiss. Suddenly, it’s no longer accurate to count it 1:1 which is just terribly convenient. Because you do not after the fact get to decide that the most recent study (or the GVT study) was different due to the isolation work and therefore does not contradict your results. Even here the argument is totally worthless. The Heaselgrave study did row, pulldown and curls. Two compounds and one isolation so at most you count the third exercise differently. Brad’s leg workout was 2 compounds and one isolation too so what’s the difference except that he doesn’t like data that contradicts him? The Ostrowski study used isolation work too and Brad was happy to lie about the numbers there without considering set count. He considered it 1:1 when it was convenient and then lied about the data to change the conclusion on top of it all and then decided it wasn’t 1:1 when it was no longer convenient to do so.

Pure guru shenanigans. The argument changes when it needs to.

I said back in Part 1 that I only used that convention for consistency to what they had been doing and that I didn’t agree with it at face value.  I’ve said the same thing for years.  Brad only said it when he needed it to defend his paper.  But even so, let’s go with this logic and see where it leads us.

Because NOW Brad seems to be arguing that isolation work counts differently towards the training response than compound work. Presumably it’s worth more since it’s, well, direct. That is, nobody is denying that bench works triceps and delts. The question is to what degree in terms of tension and volume overload it works them and how it compares to direct work in that regard.

Let me add for honesty: in the discussion section of his most recent study, Brad does acknowledge in the limitations that the use of compounds and measurements of isolation might alter the set count conclusion and he used this as an argument for why his change of attitude wasn’t just a convenient excuse.  Which might fly except that this was NEVER brought up until he was backed into a corner and needed to bring it up to dismiss a study that contradicted his.  So he can say all he wants that he considered it but he knows that the majority don’t read the discussion (maybe why he thought he’d get away with his lie about the Ostrowski data).  But when every post he makes crowing about his results IGNORES it, he’s just bullshitting after the fact so far as I’m concerned.  An honest scientist mentions the limitations of their work UP FRONT when they present it whether in research OR PUBLICLY.  Like I have done in this article series for each paper, addressing the potential limitations (small subject number, DEXA only) as I went.  I’m not waiting to get backed into a corner to change my argument or magically find new data (cough cough leg extension load volume…..fucking seriously?)  He did just like James Krieger did.  You can ask me any question about this article series, and nothing I report will change from what I’ve already written unless I made an explicit mistake (which I will then fix).

But let’s go from the assumption (which I have felt from the get-go) that you need to count volume differently for compound and isolation work in terms of determining the growth response to training (note that Eric said this was also true in MASS even if he hemmed and hawed about the ratios involved). I always have in practice and I’d say anyone with real-world training/coaching experience does as well. We don’t consider a set of compound chest work to be 100% a triceps exercise (and nobody counts pec deck as a triceps movement; it does involve a little biceps, although absolutely nobody counts that either). Most people aren’t even aware that triceps long head is involved in rowing due to its function as a shoulder extensor but nobody on the planet would count that towards triceps volume.

It might be conditionally true for trainees with very specific levers who can just bench and build big tris (and even perfectly built benchers do extra triceps work) but we don’t generally count it that way. Well I don’t and neither does anybody else I know with actual training or coaching experience. If you look at every workout routine I’ve ever written, total sets of chest (which always includes a compound movement which might be followed by a second compound or an isolation movement depending) are higher than for direct arm work because I’m counting some of the compound chest (or back) work towards arms in terms of daily and weekly totals. Delts is always funky but I do the same thing kind of. Shoulders are complicated since it’s three heads with different functions and one is pushing, one is pulling and one is either neither or both depending on how you look at it (it’s really humeral abduction but let’s not get too entrenched in this).

How Do We Count Sets?

The question is then how should you count the sets of a compound exercise towards smaller muscles.  I don’t know but I’m going to start with an assumption of a 0.5:1 relationship.  That is, I will count one set of bench or row as one half of a set for triceps or biceps.  Is this right?  Doesn’t really matter, ignore the specifics and follow the logic.  You can math it out for a different value if you’d like.  Call it 1/3rd or 2/3rds.  Call it 3/4ths.   Whatever fits your personal bias. This is my assumption but it’s only that.  Make your own and do the math based on it.  It will only change the specifics but won’t change the general conclusions that come out of the exercise.

Let me note that the ratio must be lower than 1:1 since nobody would EVER count a compound movement as MORE than 1 set for the smaller muscles (ok, I know how people read my articles and someone will make a strawman about poorly done rows being more biceps and that’s fine. Let’s define this as the movement being done properly in a technical sense).  The question is simply how much lower.  Pick your ratio and break out the calculator.

Just don’t call it 1:1 until it’s convenient not to do so like Brad did.

Re-Analyzing the Volume and Hypertrophy Data

Because if Brad is now going to say that compound and isolation movements count differently in terms of sets then EVERY OTHER ONE OF BRAD’S STUDIES AND ALL THE REST have to be recalculated in terms of their effective set count including his original meta-analysis AND his most recent paper and all of the others I’ve examined. His study used only compound movements (with leg extensions for quads) but looked only at single joint muscles and many studies seem to do this. If he NOW thinks isolation exercises count differently then his effective bodypart volume changes. And so do the set counts on every other study (the set counts on his meta-analysis will also change but I don’t know what body of literature it used and what proportion used compounds versus compounds and isolations).

So let me recalculate them based on my assumption that a compound set counts as 1/2 a set for smaller muscles and a direct exercise counts as a full set (i.e. bench press is 0.5 sets for triceps, triceps extension is 1 set for triceps). Again, this is my starting assumption and nothing more. Use whatever ratio makes you happy so long as it’s less than 1:1. And don’t play silly buggers at the extremes, we all know it’s not 0.9:1 or 0.1:1. It’s probably not 1/4 to 1 or 4/5ths to 1 either but somewhere clustered around the middle depending on the movement. Maybe 1/3, 1/2, 2/3, 3/4…again I don’t know for sure and nobody else does either. So I’m using 1/2 and literally splitting the middle.

So all I did was go back and count the sets.  If it was compound exercises, I cut the number of sets in half (10 sets per week becomes 5).  If isolation, I didn’t (5 sets per week = 5 sets per week).  If there was a mix I counted compounds as half and isolations as one and added them (10 sets compound chest = 5 sets + 5 sets isolation triceps = 10 total sets for triceps, down from 15 originally).  Only two used a mixture so I probably didn’t screw the math up too badly.  Even if I got a number slightly wrong it doesn’t change the overall conclusions.
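As a rough sketch of the counting rule just described, the arithmetic fits in a few lines of Python. The function name `effective_sets` and its parameter names are made up for illustration, and the 0.5 default is only the working assumption from this article, not a measured value. Swap in any ratio you like.

```python
def effective_sets(compound_sets, isolation_sets, compound_ratio=0.5):
    """Weekly effective sets for a smaller muscle (e.g. triceps or biceps).

    compound_ratio is an ASSUMPTION: the fraction of a set that each
    compound set is credited toward the smaller muscle. A value of 1
    reproduces the original 1:1 counting convention.
    """
    if not 0 < compound_ratio <= 1:
        raise ValueError("compound_ratio must be above 0 and at most 1")
    return compound_sets * compound_ratio + isolation_sets


# Compound only: 10 sets per week becomes 5.
print(effective_sets(10, 0))   # 5.0
# Isolation only: 5 sets per week stays 5.
print(effective_sets(0, 5))    # 5.0
# Mix: 10 compound chest sets + 5 isolation triceps sets = 10, down from 15.
print(effective_sets(10, 5))   # 10.0
```

Passing `compound_ratio=1` gives back the original 1:1 count (15 in the mixed example), which makes it easy to see how far any chosen ratio pulls the number down.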

Now let me be clear again, I am NOT saying that this is a perfect analysis or that a 0.5:1 estimate is right so spare me the strawman arguments that I’m trying to force a set of data. I’m simply saying that if Brad is going to dismiss a study result he doesn’t like based on isolation vs. compound needing to be counted differently, that opens the door for this type of analysis. Beating a (very) dead horse, you can redo it assuming compound is worth 2/3rds of a set or 3/4. The numbers will change slightly and that’s fine and they’ll be marginally higher than my 1/2 assumption. If you go with 1/3rd they will be marginally lower. But for rational set counts, the differences aren’t even that much. Focus on the principles, not the specifics, folks.

But they will ALL go DOWN from what Brad was claiming them to be originally. EVERY SINGLE STUDY. That means in his meta, in his own study and in his examination of previous studies the numbers will all decrease. Even in Ostrowski which he lied about in his discussion. All of the set counts decrease. None of the studies I examined used only isolation movements so there is NO situation where the numbers don’t go down. And since nobody would ever count a compound movement as MORE than one set, they can’t EVER go up.

Yes I am beating a dead horse but I know how people read my articles. At least one person will claim “Lyle said all compound movements are worth one half a set for the other muscles involved” which I am not saying in the least.  I am saying this is my working assumption for lack of a better one and that’s all it is.  Again, use your own number that is lower than 1:1.  Just follow the logic here.

Yes, we need data on how to compare the exercises to see what the best counting approach would be.  I am aware of one that compared pulldown to biceps curls for recovery and the biceps curls took longer to recover from than the pulldowns so clearly it’s NOT 1:1.  The pulldowns didn’t hit the biceps as much as direct arm work did.  Dadoi.  Until more data exists, we make assumptions.

It might even and probably will turn out that different movements should be counted differently. An undergrip pulldown is more biceps involvement than overgrip and a parallel grip is halfway in between (with more brachialis). A high bar squat is more quads than low bar and a close grip bench is close to a compound triceps exercise but a flared elbow bench is more pec specific and how you’d count those towards triceps would likely differ (I’d call close grip almost 1:1 for triceps but flared elbow as 0.5:1), etc. Back gets super complicated as we’re dealing with the traps (with multiple sections), rhomboids, teres, lats (which have two segments with slightly different orientations) and back movements work them to varying degrees based on movement, grip, bar, etc.

Coaches make adjustments for this based on experience. If I were using overgrip pulldowns with a trainee, I’d give them slightly more direct biceps work to compensate compared to if they were doing undergrip pulldowns which would have worked the biceps more. If they did V-bar rows, I’d make adjustments to biceps compared to doing an undergrip row. I’d give a low bar squatter who sits back more direct quad work than one squatting high bar for example. Everybody with any real-world experience does this in practice to some degree. We do this based on 1/2 guesswork, 1/2 experience, 1/2 science, and 1/2 intuition (and sometimes 1/2 luck).

Again, let’s not get too mired in the specifics here (as I do that very thing).  Follow the logic.

Rebuilding the Model: Part 1

And with my assumption of a 0.5:1 relationship, here are the re-mathed set counts for each of the 6 studies I’ve included. I’ve shown the original set counts in parentheses next to the re-mathed value and I probably messed at least one of these up because math is hard, my brain is tired, and I don’t bother to run it twice.  And this is total sets per week.  I’ve indicated in red which group did best (based on the analysis above) and might have even gotten it mostly right.  I am quite sure that anybody wishing to dismiss my conclusions based on a single typo will make me aware of that typo and I will change it because that’s the intellectually honest thing to do.

Study Muscle Low Moderate High
Ostrowski Lower 2 (3) 4 (6) 8 (12)
Upper 5 (7) 8 (14) 20 (28)
Amirthalingam/Hackett* Triceps 10-11 (16-18) 16 (26-28)
  Quads 7-8.5 (11-13)  8.5-9 (16-18)
Haun Increasing from 5-16 sets with a cap on growth at 10 sets for upper and possibly more (up to 16) for lower.
Schoenfeld Upper 3 (6) 9 (18) 15 (30)
Lower 6 (9) 18 (27) 30 (45)
Heaselgrave 6 (9) 12 (18) 18.5 (27)

*I grouped the two studies together since they were an identical methodology and only differed by length and I’m getting tired of writing and making tables as it’s a pain in the ass in WordPress.

And this brings the results into even starker view.  Ostrowski fits with all other data which shows a clear dose response relationship up to 10 sets for legs although the lack of data above 8 sets limits this finding and we can’t know if more would generate more growth. For upper, 20 sets (down from 28) wasn’t better than 8 down from 14.   Since 8 and 20 got the same growth, it seems unlikely that a middle value would get different results although it’s possible that 14 would but 20 was too much for some reason.  Without data, this is a guess and we need studies examining different intermediate values to know for sure.  Test 8, 14 and 20 next time.

For the GVT studies 10-11 sets per week as a mix of compounds and isolation was as good as 16 sets/week for upper body. Inasmuch as the differences were minuscule, 8.5-9 sets per week for legs was better than 7.5-8.5 but at this point, we’re looking at a single set difference when it’s re-mathed and that alone would explain why the results were essentially identical (the stimulus was essentially identical). You wouldn’t expect 1 set to matter but maybe if it were 7-8.5 vs. 15-16 it would. That group has already done the same study twice, now do it a third time with real differences in lower body volumes (give them a second leg day).

Schoenfeld’s data becomes a lot less idiotic now and at least starts to pass the reality check, in line with the other studies. 9 sets was as good as 15 in terms of triceps growth (because his stats did NOT show that the highest volume was more than insignificantly superior to the moderate). Even if you believe that his highest volume was superior, it’s cut to a realistic 15 sets per week from an absolutely moronic 30 sets per week. This starts to fit the reality check and is still well within the realm of 10-20 sets. Like I said, big picture, whether you accept my contention that moderate was as good as high or a potential trend for the highest to be superior, when you count the sets rationally, it stops mattering, at least for upper body where both moderate and high fall within 10-20 sets/week.

We still lack data on chest growth per se and it might require more volume or it might not so whether or not the original values matter is unknown (i.e. does the chest somehow need 30 sets of direct work…I doubt it). Until it’s measured, we don’t know.  No study can address that yet  and the bench press only study I referenced in Part 1 didn’t compare different volumes although it would be hard to see how the 9 sets that worked for 6 months suddenly needed to be tripled after one year.   But he doesn’t get to count chest volume and then measure triceps to draw conclusions about optimal sets for all muscle groups (which he essentially did) and then decide that you have to count volume differently for isolation exercises after the fact (which he actually did).

For lower body the 18 sets was as good as 30, again passing the reality check. Here, if you take his higher volume claims as better that’s a pretty high set count (30 sets/week) although there might very well be a plateau value between 18 and 30 sets (we don’t know) which would be consistent with Haun (maybe) and anecdote. Maybe. If more than 20 sets IS optimal for legs (and this is still in the IF stage), a third group at 24 sets might have done better. Testing 18 vs. 24 vs. 30 sets would be very informative but it has to be a lab that isn’t Brad’s. The stats and strength gains still don’t support it and the fact that he lied about data should make his study inadmissible on fundamental grounds.
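For anyone who wants to check where the re-mathed Schoenfeld numbers come from, here is a minimal sketch. It relies on what is stated in this article (upper body was compound only, leg work was 2 compounds plus one isolation) plus one extra assumption that is mine for illustration: the leg sets were split evenly across the three exercises. The name `remathed` and the `compound_share` parameter are hypothetical.

```python
from fractions import Fraction

# Working assumption from this article, not a measured value.
COMPOUND_RATIO = Fraction(1, 2)


def remathed(total_sets, compound_share):
    """Effective weekly sets when compound_share of all sets are compound."""
    compound = total_sets * compound_share
    isolation = total_sets - compound
    return compound * COMPOUND_RATIO + isolation


# Upper body: compound movements only.
upper = [remathed(s, Fraction(1)) for s in (6, 18, 30)]
# Lower body: 2 of 3 exercises compound (assumed even split of sets).
lower = [remathed(s, Fraction(2, 3)) for s in (9, 27, 45)]

print([float(x) for x in upper])  # [3.0, 9.0, 15.0]
print([float(x) for x in lower])  # [6.0, 18.0, 30.0]
```

Fractions are used instead of floats only so that the 2/3 split comes out as clean whole numbers.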

There’s still that pesky ECW issue to worry about above 20 sets per week which now ONLY the highest volume leg work in Brad’s study crosses (maybe that explains the almost significantly higher leg extension load volume. Hahahahaha. I’m never gonna stop laughing at that shit). Then again, Haun was using pure compounds so that probably doesn’t make any sense as I think about it since I’m now comparing a compound only study to remathed sets. So yeah, forget that bit, it’s wrong. Based on initial volume, several groups in Schoenfeld cross 20 sets/week. And that means ECW might be playing a role or artificially increasing the results. I’d only note that with a spread of 18 to 30 sets, we don’t know if a middle value (i.e. 24 sets) would be superior until it’s directly tested. Finally is Heaselgrave which found that 12 sets was better than 6 but no better than 18.5.

Rebuilding the Model: Part 2

Recreating the chart from above with the new numbers we get the following optimal volumes per week.

Study | Optimal Volume
Ostrowski | 8 sets lower body (no higher tested), 8 sets upper body
Amirthalingam/Hackett* | 10-11 sets upper, 8.5-9 sets lower (the huge drop in set count is due to the single leg day)
Haun | Increasing from 5-16 sets with a cap on growth at 10 sets for upper and up to 16 for lower.
Schoenfeld | 9 sets for upper body, 18 sets for lower body (15 and 30 if you accept the highest volumes)
Heaselgrave | 12 better than 6 but 18.5 no better than 12

So we get systematically lower numbers here, as expected. And again, if you disagree with my 0.5:1 and use a different value, the numbers change slightly but they still all go down (i.e. if you use 3/4:1, Ostrowski’s leg data might be 10 sets instead of the original 12 or the 8 from my 0.5:1 assumption). Basically, for any moderate set count, the differences in remathed sets just aren’t that significant. I mean, consider a group that did 10 sets/week of compound. If I assume 0.5:1 that goes to 5 sets. Assume 1/3rd and it goes to about 3.3. Assume 2/3rds and it goes to about 6.7. Assume 3/4 and it goes to 7.5 or whatever and we are looking at a roughly 4 set spread. Use the lower ratio and it’s a little lower, use a higher ratio and it’s a little higher. And it all more or less stays in the same overall range we’re looking at here.
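To see how little the choice of ratio matters for a moderate set count, the 10 sets/week example can be run across the candidate ratios. This is just a sketch of the arithmetic; the variable names are made up and the ratios are the ones floated in this article, none of them measured.

```python
from fractions import Fraction

compound_sets = 10  # weekly compound sets in the example above

# Candidate compound-to-direct ratios (all assumptions).
ratios = {
    "1/3": Fraction(1, 3),
    "1/2": Fraction(1, 2),
    "2/3": Fraction(2, 3),
    "3/4": Fraction(3, 4),
}

effective = {label: float(compound_sets * r) for label, r in ratios.items()}

for label, sets in effective.items():
    print(f"{label}: {sets:.1f} sets")
# 1/3: 3.3 sets
# 1/2: 5.0 sets
# 2/3: 6.7 sets
# 3/4: 7.5 sets

spread = max(effective.values()) - min(effective.values())
print(f"spread: {spread:.1f} sets")  # spread: 4.2 sets
```

Every value is below the original 10, whichever ratio is picked.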

But it’s ALWAYS less than the original value which is my point here.

Looking at the new numbers, Ostrowski’s upper body optimal volume is 8 sets/week. Lower body matches at 8 sets/week with no higher values tested so we can’t know what would happen above that. The two GVT studies are 10-11 sets for upper and 8.5-9 sets for lower with no higher volumes of lower body tested. Haun finds a cap on upper body of 10 sets (down from 20) and possibly up to 16 for lower (down from 32). Brad’s numbers stop being totally moronic when you don’t count them in an ass-backwards way, with 9 sets for upper and 18 sets for lower body, generally matching the results of Haun. Heaselgrave is at 12 sets for biceps but 18.5 was no better. And all of this basically agrees with the original 10+ set meta-analysis even remathed (though its conclusions should probably change if it is remathed) except that we now have a much better idea of the upper caps on weekly set volume. There’s a dearth of leg data at higher volumes and more study is needed here.

So my next to final comment: whether you look at the original unadjusted data or the semi-adjusted data for set count, you still see a general optimal range of 10-20 (original count)/8-16 (remathed count) sets per week per muscle group which is all close enough for government work.   Let’s just call it 8-20 sets/week and move on with our lives.  And again, this is consistent with Eric Helms’ own conclusions in MASS of 10-20 sets/week after his pitiful defense of Brad’s paper.

There is still the slight indication that *maybe* more sets/week for legs would be better but it’s understudied and any conclusions would be tentative as hell.  But there is no way on god’s green earth to justify the 30 and 45 sets Schoenfeld et al. is so desperate to prove as optimal.  His own data doesn’t support it, his stats don’t support it, the bullshit apologism by everyone involved doesn’t support it, a rational re-analysis of the set count doesn’t support it and neither does the broad body of literature, his lie about the Ostrowski data notwithstanding.  Nothing supports it except his burning desire to support it with guru games and because he’s believed all along in these types of high volumes.

Now We Refine the Model

Because like I said, this is how science works: you take all the available data and you make a model.  You don’t fixate on individual studies and it’s the overall body of literature that is relevant (again, I thank James Krieger for making my point for me on this).   And ignoring Radaelli, I have presented 5 studies showing that, in the aggregate, moderate volumes somewhere between 8-20 sets per week provide the maximal growth response and 1 that fails the reality check so hard it hurts, where data was lied about in the discussion (a fact that NOBODY has yet addressed directly for me) and which should be dismissed on that fact alone.  Legs maybe need a bit higher but we need more data.

Now, it’s possible that more work will change that and I’ll change my model and opinion when and if it does.  But unless we do find out that Ultrasound doesn’t measure muscle growth or something and all of these studies go into the junk pile I won’t hold my breath.  They match one another despite varying methodologies and they match (for what it’s worth) with real-world training practices.  They pass the reality check is what I’m saying.  At best we’ll refine the above numbers with more targeted research.

That is, future research might start from the idea that 10 to 20 sets/week is optimal and determine what specific volume is optimal within that range.  Or more systematically compare lower body and upper body.  Perhaps look at 10,15,20 for upper body exercises and 15,20,25 for lower. But stop doing 9,18,30 where the variance is just too huge to know what happens in the middle ranges.  Is 12 the same as 18, is 24 the same as 30?  Stop focusing on sets per exercise.  If you want to test sets per week, set it up to do that in a rational way.

Feel free to contact me for help with the study design.  I can also probably figure out how to pre-register the study, describe the randomization in the methods (and randomize the subjects in a blinded fashion to do the Ultrasound) and efficiently write the discussion with accurate data representation for anybody who just doesn’t have time…..

But based on the broad body of literature, the expectation is that optimal results are likely to be found between 8-10 and 20 sets per muscle group per week.  Once again, Eric drew this same conclusion after his desperate efforts to make Brad’s paper not be shit.   And I just read something by Bret Contreras of all people saying to stop doing insane volumes and focus on intensity.  Good lord, when Bret is the rational one and Mike Israetel is the intellectually honest one in all of this, Mercury must be in retrograde.  But Bret was on Brad’s paper and he better be careful or Brad will kick him out of the paper publishing circle jerk or prevent him from getting seminar appearances for not toeing the party line.

Ok, two more comments and I’m done.

The Generic Bulking Routine

With the above in mind, a rough volume of perhaps 10-20 (or 8-16 depending on the analysis) sets per week as an optimal growth number, I want to look at what I have presented for years as my Generic Bulking Routine.  This was an intermediate program I drew up absolute ages ago that has proven to work for intermediates for over a decade.  I report this only anecdotally and nothing more, I’m not James Krieger who thinks anecdote counts as ‘science’.  But if we’re going to pretend to integrate science and practice, then it is always nice when practice actually matches up with the science.

It was an Upper/Lower routine done 4 times per week with each day having the general structure shown below.  It was meant to be done as 2 weeks of a submaximal run up and then 6 weeks of trying to make progressive weight increases (progressive tension overload being the PRIMARY driver of growth, with sufficient volume within that being optimal) prior to backcycling the weights and starting over with the goal of ending up stronger over time.   It was meant to be an intermediate program used from about the 1-1.5 year mark of consistent training to maybe the 3 year mark before my specialization routines were implemented.

Weights didn’t HAVE to be increased every week or workout, that was simply the goal (as Dante Trudel put it in his Doggcrapp system, you should be trying to beat the log book at each workout).  In my experience, so long as folks were eating and recovering well and started submaximally, they could do so over relatively short time periods like this (over a longer training cycle, I’d do different things).  Women perhaps less so than men for unrelated reasons but no matter, Volume 2 is coming eventually…

I’m going to provide specific exercises in the template but just think of them as either compound or isolation for the muscles involved since exercise selection is highly individually dependent. RI is rest interval and note that I use fairly long ones to ensure quality of training with real weights and ideally all sets are at the same heavy weight (oh yeah, in ‘Merkun a single apostrophe is minutes and a double is seconds and I am told this is the opposite of the rest of the world).  Big compound movements get 3 minutes, smaller muscles get 2 minutes and high rep work gets 90 seconds since it’s meant to be more of a fatigue stimulus to begin with.

But for big movements, a 90 second rest interval is bullshit and means that you’re probably squatting with 95 lbs on the bar by your fifth set ‘to failure’.  Better to do fewer sets and give yourself long enough to do quality work.  In that vein, after the submax run up, the goal RIR was maybe 2-3 for the initial set which would likely drop to 1 or even near failure by the last set.  The goal is progressive TENSION overload over time (meaning multiple training cycles).  When your workouts don’t use stupid volumes, you’re in the gym the same amount of time but can actually do quality work, unlike when you’re trying to fit in 45 fucking sets and get done before tomorrow.

| Upper | SetsXReps (RI) | Lower | SetsXReps (RI) |
| --- | --- | --- | --- |
| Flat Bench | 3-4X6-8 (3′) | Squat | 3-4X6-8 (3′) |
| Row | 3-4X6-8 (3′) | RDL or Leg Curl | 3-4X6-8 (3′) |
| Incline DB Bench | 2-3X10-12 (2′) | Leg Press | 2-3X10-12 (2′) |
| Pulldown | 2-3X10-12 (2′) | Another leg curl | 2-3X10-12 (2′) |
| Lateral Raise | 3-4X6-8 (3′) | Calf Raise | 3-4X6-8 (3′) |
| Rear delt | 3-4X6-8 (3′) | Seated Calf Raise | 2-3X10-12 (2′) |
| Direct Triceps | 1-2X12-15 (90″) | Abs | Couple of heavy sets |
| Direct Biceps | 1-2X12-15 (90″) | Low Back | Couple of heavy sets |

The exercises are for example only and the other two workouts per week could be a repeat of the same movements or different ones within that general structure (exercises can also be changed with each succeeding training cycle).  So start with incline bench and pulldown for the sets of 6-8 and do flat bench and row for the sets of 10-12 or whatever.

Now let’s add up the set count:

Compound Chest: 5-7 sets twice/week for 10-14 sets/week (counted as 5-7 sets for tris at 0.5:1)
Compound Back: 5-7 sets twice/week for 10-14 sets/week (counted as 5-7 sets for bis at 0.5:1)
Side delts: 6-8 sets per week.
Rear delts: 6-8 sets per week (gets hit somewhat by pulling but hard to math out)

Note: Effective delt volume is likely a bit higher than this but it’s a pain in the ass to estimate how much side delts do or do not get hit by compound pushing.  Or rear delts via compound pulling.  It might math out to 8-10 sets/week or less or maybe more.  Again, hard to say but most report just fine delt growth from the above (and no shoulder problems which is why it’s an upper/lower to begin with).

Bis/Tris: 1-2 direct sets added to compound work = 2-4 sets/week + 5-7 indirect sets/week = 7-11 sets/week.  Add a third or even fourth set if you like to get 8-12 sets/week or 10-14 sets/week of combined indirect and direct arm work.  I certainly agree that if you do heavy pushing and pulling you don’t need a lot of arm work.  I simply do NOT agree that the sets count 1:1.  But my workout designs usually have proportionally less direct arm work since I partially count the compound pushing/pulling and always have and always will.
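The arm count for the template as written can be sketched in a few lines of Python. This is only an illustration of the bookkeeping: the function name and parameters are hypothetical, and the 0.5:1 indirect ratio is my own assumption from earlier, not a measured value.

```python
def weekly_arm_sets(compound_per_session, direct_per_session,
                    sessions_per_week=2, indirect_ratio=0.5):
    """Return (indirect, direct, total) weekly sets for biceps or triceps.

    indirect_ratio is the assumed credit each compound set gets for arms.
    """
    indirect = compound_per_session * sessions_per_week * indirect_ratio
    direct = direct_per_session * sessions_per_week
    return indirect, direct, indirect + direct


# Template as written: 5-7 compound sets and 1-2 direct arm sets per session.
low = weekly_arm_sets(compound_per_session=5, direct_per_session=1)
high = weekly_arm_sets(compound_per_session=7, direct_per_session=2)

print(f"indirect: {low[0]:g}-{high[0]:g} sets/week")
print(f"direct:   {low[1]:g}-{high[1]:g} sets/week")
print(f"total:    {low[2]:g}-{high[2]:g} sets/week")
```

Change `indirect_ratio` or the per-session set counts to see how a different assumption or exercise selection moves the weekly total.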

Let me comment before moving forwards that while this might seem like a low per workout volume to some (and high to others), it matches the set count data based on my analysis above.  As well, I have contended for years that if you can’t get a proper stimulus to your muscles with that number of sets, volume is not the problem.  Rather, you are.  Whether it’s due to suboptimal intensity, focus, technique sucking, etc. you are the problem with your workout.  Doing more crap sets will never top doing a moderate amount of GOOD sets.

Regardless, looking at it now, with 15 years of experience with it and the data analysis I just did, I might bump up the side delt volume a bit.   As noted above, the contribution from chest work is tough to really establish here and the delt has three heads with differing functions.  But no matter.  Let’s focus on generalities.  Which are that my general set count for this workout is and has always been right in the range of what the analysis of the majority of the training studies found to be optimal.

This template could be adjusted in various ways.  The second chest and back movement could be isolation which would reduce the indirect set count on arms, necessitating an increase in direct work.   So if someone did 4 sets of flat bench and 3 sets of incline flye that’s still 14 sets/week for chest but reduces indirect arm work to only 4 sets/week (8 sets of compound pushing divided by 2) so you bump direct arms to 3-4 sets per workout to get to 6-8 direct sets per week and 4 indirect sets for 10-12 sets per week.  I think that makes sense.  The point being that I am looking at total set counts per week (actually I was counting reps but it all evens out) and adjusting volumes for smaller bodyparts based on exercise selection.  If you use more isolation movements for chest or back that decreases the indirect set count for bis and tris so I’d add more direct work there.

The same holds for legs where quads are worked for 5-7 sets twice weekly or 10-14 sets, same for hams and calves.  I might bump this up slightly, although high volumes of truly HEAVY leg work are pretty brutal: add a third movement like leg extension and another leg curl for a couple of higher rep (12-15 rep) sets apiece.  Now it’s 7-9 sets twice a week or 14-18 sets.  Towards the higher end of volume but until we know for sure that it’s 20+, I’m not changing much here.  And, again, a workout with 20+ heavy sets of legs (including quads, hams and calves) is gruelling.

But overall this comes in at somewhere between 7-14 sets for upper body muscles and 10-14 for legs.  Again, this is an intermediate program for like 1.5-3 years or so.

Those numbers look so very familiar.

The Wernbom Meta-Analysis

Let me finish by revisiting the original Wernbom analysis that looked at intensity, volume and frequency in terms of optimal growth.  It’s become pretty fashionable these days to dump on it for various reasons.  It’s fairly old, there is more data now and there was simply very little work done on intermediate much less advanced trainees at the time.

Irrespective of that, within moderate intensities (the typical ‘hypertrophy zone’ of perhaps 70-85% 1RM), it concluded that a volume of 40-70 repetitions twice/weekly was optimal for growth with triceps and quads being the muscles of interest.   I honestly think using reps per week is a better approach than sets since obviously 10 sets of 1 and 10 sets of 10 are not the same stimulus.  That said, since almost all of the work on this topic stays in the 8-12 range or so, set counts are at least conditionally appropriate.  Within any rationally accepted repetition range, it just all sort of balances out.

If you add up the reps on my GBR you can see where my numbers come from. I use a combination of heavy 6-8’s for tension and 10-12 or 12-15 for more fatigue which is why I mix them but you end up with roughly that number of reps for every muscle group (you can count reps on compound chest/back/legs as half the reps for arms but it should all math out more or less correctly because that’s how I set it up).

No, Wernbom wasn’t on well trained subjects but none of the above studies used elite guys either because a 1.1 bodyweight bench is not elite in men, it’s advanced noob.  Wernbom was working from a limited data set with, at best, limited work on even intermediate trainees (again, just like the above studies) and still concluded 40-70 contractions twice a week gave optimal growth compared to lower and higher values.

So we double 40-70 and that’s 80-140 repetitions per week per muscle group.   Some quick maths.

At 10 reps per set 80-140 reps per week yields 8-14 sets per week.
At 8 reps per set 80-140 reps per week yields 10-17.5 sets per week.
A mix of 4X8 (32 reps) and 3X12-15 (36-45) for 68-77 reps twice a week is 14 sets/week.
A mix of say 5X5 (25 reps) and 3-4X10-12 (30-36 reps) for 55-71 reps twice a week is 16-18 sets/week.

So for any rational workout design an optimal repetition count of 40-70 reps/workout done twice per week for 80-140 total reps per week put us somewhere in the realm of 8-18 sets/week for the optimal growth response.
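The quick maths can be checked with a short Python sketch. It is straight division with no rounding; the function name and the dictionary of example workouts are just illustrative labels for the cases listed above.

```python
def sets_per_week(reps_per_week, reps_per_set):
    """Convert a weekly rep target into a weekly set count."""
    return reps_per_week / reps_per_set


# Wernbom's 40-70 reps per workout, done twice per week.
weekly_reps = (2 * 40, 2 * 70)

for reps_per_set in (10, 8):
    low = sets_per_week(weekly_reps[0], reps_per_set)
    high = sets_per_week(weekly_reps[1], reps_per_set)
    print(f"{reps_per_set} reps/set: {low:g}-{high:g} sets/week")

# Mixed workouts: count the sets in one session, then double for twice/week.
mixed = {
    "4X8 + 3X12-15": (4 + 3, 4 + 3),
    "5X5 + 3-4X10-12": (5 + 3, 5 + 4),
}
for name, (low, high) in mixed.items():
    print(f"{name}: {2 * low}-{2 * high} sets/week")
```

Any sane rep range per set lands the weekly set count in roughly the same neighborhood, which is why set counts work as a stand-in for reps here.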

Well whaddya know about that?

Training Volume and Muscle Growth: Part 2

So continuing from last time, when I looked at four studies (one of which I threw out based on what I consider absurd results in terms of making zero sense) on the topic of training volume and hypertrophy, I want to look at the remaining three studies (these are the ones that came out in the past few weeks) next to complete the set.  I’ll do the same basic analysis and this will all lead into the final part 3 where I’ll look at the results in overview to see if any general conclusions can be drawn regarding the questions I originally posed.

Effects of Graded Whey Supplementation during Extreme-Volume Resistance Training

The next paper is by Haun et al. and was published in Frontiers in Nutrition in 2018.  It is notable for having been (at least partially) funded by Renaissance Periodization and having Mike Israetel as one of the authors.   I do NOT mention this to dismiss it out of hand in a “Who funded it?” kind of way because I think that’s crap.  Just mentioning it since it was clearly an attempt to support/test Mike’s ideas about volume, MRV, etc.   Of some trivial interest, it literally came out like 3 days before Brad Schoenfeld et al.’s paper.  Which is a shame because had it come out last year, it would have brought up a critically important issue that will have to be considered going forwards on this topic.  More below.

As you’ll see, it also had a semi-negative outcome in terms of NOT supporting what I suspect Mike was trying to prove to begin with.  Yet it was still published (and I am told that Mike has adjusted his workout template volumes down in response).  That’s the mark of intellectual honesty since I’m 100% sure they wanted the opposite result of what they got.  Or Mike did anyhow.  Not only did they publish essentially a negative finding but Mike adjusted his recommendations (mind you, it would have been faster to have just listened to me in the first place but no matter).

The paper actually had a couple of different goals.  One was to examine the response of muscle growth to progressively increasing volumes.  Basically to see what happened as volume went up weekly.  This is where I will put my focus as hypertrophy has been my criterion endpoint from the outset.  But a second goal was to see if increasing protein intake along with increasing volume had any additional benefit (the title makes it sound like this was the primary goal and it might be.  Whatever, there were two goals).   The idea being that getting optimal growth from increasing volumes requires more protein or whatever.

Dietarily, the groups were either supplemented with maltodextrin, a single serving of whey or given graded doses of whey protein (from 25-150 grams/day).  But they all did the same training program.  Since the dietary manipulation ended up having zero effect on the results, I won’t mention it again.

In it 34 subjects were recruited with 3 dropouts resulting in 31 total subjects.  The participants were resistance-trained with at least 1 year of self-reported resistance training experience and a back squat 1RM of greater than or equal to 1.5 times body weight.    The subjects performed the following workout with barbell back squat, barbell bench press, barbell SLDL and lat pulldown at every workout.  So one compound movement per muscle group(s).

[Figure: Haun Squat Workout]

The sets were done at 60% of maximum and you can see how training volumes increased weekly from 10, 15, 20, 24, 28 and finally 32 sets/week.  Yes, 32 sets per week of squats.   Even at 60% of max, well…yeesh.  That makes GVT look sane by comparison. That said…

It’s a little bit tough to tell from the methods how the workout was performed but it almost looks like one set of each movement was done before moving to the next, the next and the next and then going back to the first exercise.  As 2′ were given between each exercise unless the subject wanted to go sooner or needed a bit longer, this is like 10 minutes between sets of any individual exercise.  That is assuming I’m reading this right.

Exercises were completed one set at a time, in the following order during each training session: Days 1 and 3—barbell (BB) back squat, BB bench press, BB SLDL, and an underhand grip cable machine pulldown exercise designed to target the elbow flexors and latissimus dorsi muscles (Lat Pulldown); Day 2— BB back squat, BB overhead (OH) press, BB SLDL, and Lat Pulldown. A single set of one exercise was completed, followed by a set of each of the succeeding exercises before starting back at the first exercise of the session (e.g., compound sets or rounds). Participants were recommended to take 2min of rest between each exercise of the compound set. Additionally, participants were recommended to take 2 min of rest between each compound set. However, if participants felt prepared to execute exercises with appropriate technique under investigator supervision they were allowed to proceed to the next exercise without 2 min of rest. Additionally, if participants desired slightly longer than 2 min of rest, this was allowed with intention for the participant to execute the programmed training volume in <2 h each training session.

Subjects reported their Reps in Reserve (RIR) for each set which just means how many more reps they think they could have done. So an RIR of 3 on a set of 10 means they think they could have done 13 (and this method is fairly accurate for trained folks).   The average RIR started at roughly 3.7 ± 1 and this went up slightly to 4.3 ± 1.6 by week 6.  So on their sets of 10, they could have done 13-14 reps.  Basically, it was all pretty submaximal as would be expected for sets of 10 at 60% with an almost 10 minute rest interval.  Ten reps is usually ~75% and with a 10 minute rest there simply isn’t any accumulated fatigue occurring.

Body composition changes were measured by DEXA (which is at best rough for estimating true muscular change) but muscle thickness via Ultrasound was also measured for the vastus lateralis (VL) and biceps.  Biopsies (where a chunk of muscle is literally cut out of it) of the vastus lateralis were also taken to determine the physiological cross sectional area of actual muscle fibers.   While way more invasive (with its own limitations), biopsy is arguably a far more direct method of assessing fiber size changes since you are literally looking at the muscle fiber area directly.

Of some interest, total body water (TBW) and extra cellular water (ECW) were measured via BIA (its only real use) as this is representative of inflammation and edema (basically, the body retains water when inflammation is present) and this will be important in a moment.   Mood (POMS, profile of mood state) and muscle tenderness were also measured, basically to check for overtraining and inflammation although I won’t focus on this.  This was a thoroughly done study to be sure.

Measurements were made before week 1, at week 3 and again at week 6.  And here is where it gets interesting since the results ended up being pretty different from Weeks 1-3 and Weeks 4-6 as the volumes got stupid.    First let me look at the lean body mass changes.  From week 1 to week 3 (10->20 sets/week), the subjects gained 1.35 kg/3ish pounds with that value dropping to 0.85 kg/2ish pounds from week 3 to week 6 (20->32 sets/week).  So already there is a reduction in training gains with more volume producing far smaller gains.  Still size is size and 2 lbs in 3 weeks is still good, right?  Hang on.

The study did something I haven’t seen before which was to correct the change in LBM for extracellular water (ECW) which will show up as LBM.  Basically swelling and edema (not the same as sarcoplasmic hypertrophy) that can occur.  And when this correction was made, the LBM changes dropped from 1.35 kg/3 ish pounds to 1.18 kg/2.6 lbs from week 1 to week 3 which was about the same as before and from 0.85 kg (2 ish pounds) to a statistically insignificant 0.25 kg/0.55 lbs from week 3 to week 6.  So they gained about 0.9 lb/week for the first 3 weeks and this dropped to an insignificant 0.2 lbs/week from 3 to 6 when the volume got stupid.  Basically the increased LBM in the last three weeks of the study was just fluid accumulation (perhaps lending credence to the idea of pump ‘growth’ occurring, just not by the previously thought mechanism).

I’ve shown this data below.

| Time | Weeks 1-3 (10->20 sets/week) | Weeks 4-6 (20->32 sets/week) |
| --- | --- | --- |
| LBM Gains | 1.35 kg (3 ish pounds) | 0.85 kg (2 ish pounds) |
| ECW Corrected LBM Gains | 1.2 kg (2.6 lbs) | 0.25 kg (0.55 lbs) |
| LBM Gains/week | 0.4 kg (0.9 lbs) | 0.08 kg (0.2 lbs) BFD |
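The per-week figures are just the ECW-corrected totals divided by the length of each phase. Here is a minimal Python sketch of that arithmetic, assuming each phase is treated as three weeks and using the 1.18 kg and 0.25 kg corrected values quoted above; the variable names are mine.

```python
KG_TO_LB = 2.20462  # pounds per kilogram
WEEKS_PER_PHASE = 3  # assumption: each measurement phase counted as 3 weeks

# ECW-corrected LBM gains per phase, in kg, as quoted in the text.
corrected_gain_kg = {
    "Weeks 1-3 (10->20 sets/week)": 1.18,
    "Weeks 4-6 (20->32 sets/week)": 0.25,
}

rates_kg = {}
for phase, gain_kg in corrected_gain_kg.items():
    rate = gain_kg / WEEKS_PER_PHASE
    rates_kg[phase] = rate
    print(f"{phase}: {rate:.2f} kg/week ({rate * KG_TO_LB:.2f} lb/week)")

# How much of the early rate of gain survives once volume passes 20 sets/week.
early, late = rates_kg.values()
print(f"late-phase rate as a fraction of early-phase rate: {late / early:.2f}")
```

Run that way, the later phase produces roughly a fifth of the weekly gain of the earlier one, which is the drop-off being pointed at.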


In the abstract (which most in the industry don’t read past) the authors try to spin this as the higher volumes still being superior but in the discussion itself they state:

Thus, when considering uncorrected DXA LBM changes, one interpretation of these data is that participants did not experience a hypertrophy threshold to increasing volumes up to 32 sets per week. However, if accounting for ECW changes during RT does indeed better reflect changes in functional muscle mass, then it is apparent participants were approaching a maximal adaptable volume at ~20 sets per exercise per week.

Note their use of the word ‘if’ after however in the last sentence.  They are kind of hedging their bets here but it is NOT yet clear if you do or do not have to correct for ECW when measuring changes in muscle thickness via Ultrasound.  It’s hard to see how you wouldn’t have to given how Ultrasound works but that may reflect my own bias on the matter.   In their discussion they do state:

In this regard, Yamada et al. (24) suggest expansions of ECW may be representative of edema or inflammation and can mask true alterations in functional skeletal muscle mass. Further, these authors suggest the measurements of fluid compartmentalization (e.g., ICW, ECW), which are not measured by DXA, are needed if accurate representation of functional changes in LBM are to be inferred.

Suggesting that ECW can skew Ultrasound measurements and must be both measured and accounted for to get any idea about actual changes in skeletal muscle size or amount.  More importantly, this has to be directly EXAMINED before we go forwards with any more high volume studies.  I’ll harp on this but we must KNOW if the increase in ECW/edema impacts on measurements or not.

Regardless, a cap was seen where anything over 20 sets had no further advantage in total LBM gains.    But as I’ve said, LBM gains themselves aren’t necessarily indicative of actual muscular growth since it can represent a lot of things (perhaps the increased LBM even in weeks 1-3 was glycogen storage, since it was not measured, we can’t know).

So let’s look at the changes in thickness via Ultrasound.   As with Radaelli I’m not providing specific numbers because the raw data wasn’t presented, the graphics in the paper are tiny and my eyes hurt trying to read them.  So I can’t even begin to try to extrapolate it out (and if I did I’d be subject to my bias guessing at the numbers which I won’t do).   But here it is.

[Figure: Haun Figure 5 Data]

The verbiage in the results is even more obscure in terms of what happened.  The best I can do is quote from the legend on Figure 5 above.

Muscle thickness and VL fiber size differences between supplementation groups. Only a significant time effect was observed for biceps thickness (A) assessed via ultrasound with MID values being greater than PRE- and POST values. Only a significant time effect was observed for VL thickness (B) assessed via ultrasound with MID values being less than POST values. Panel (C) provides representative images of ultrasound scans from the same participants. Only significant time effects were observed for total fiber cross sectional area (fCSA) (D), type I fCSA (E), and type II fCSA (F) assessed via histology with MID values being less than PRE and POST values. Panel (G) provides representative 10x objective histology images from VL biopsies of the same participant. All data are presented as means ± standard deviation values, and values are indicated above each bar. Additionally, each data panel has delta values from PRE included as inset data. MALTO, maltodextrin group; WP, standardized whey protein group; GWP, graded whey protein group.

This is really hard to parse.  By Ultrasound, biceps thickness was bigger at week 3 than at the beginning or the end of the study, possibly suggesting that it increased in size and then decreased again.  For quads it’s even weirder.   By the Ultrasound the Week 6 size was greater than the Week 3 size but, statistically neither seems to have been different than the starting value.   So while there does appear to be growth from Week 3 to 6, there was no net size gain from Week 1 to 6.  As I’ll discuss below, the biopsy data showed shrinkage from Week 1 to 3 before regrowth to Week 6 and it’s possible that the Ultrasound simply didn’t pick the same pattern up statistically.   Perhaps the larger point is that there was no net growth from Week 1 to 6.  That’s a lot of squatting to achieve nothing.

Mind you, overall, the actual total change in muscle thickness via Ultrasound appears to have been absolutely tiny to begin with.   They state:

When summing biceps and VL thicknesses at each level of time, there were no significant differences between groups at each level of time. However, a significant main effect of time revealed that the summed values of thickness measurements were significantly higher at POST compared to PRE (p = 0.049). The summed value at POST was 7.16 ± 0.77 cm where the summed value was 6.98 ± 0.81 cm at PRE (data not shown).

What they actually did here was add up the total growth for the biceps and VL to make the number larger (and presumably reach statistical significance).   And looked at that way, it was a bit higher at 6 weeks than at the beginning of the study.  But look at that change, the summed total of both muscles was all of 0.18 cm (7.16 – 6.98) or 1.8 mm TOTAL (if this was split evenly and I don’t know if it was, that’s less than 1 mm growth per muscle).  And the only way they got this to be significant was by adding up two different muscles (a scientific approach called playing silly buggers).  It would be like a training study where the subjects improved their bench by 15 lbs and squat by 25 lbs and neither was meaningful but if you ADD THEM UP, suddenly the total 40 lb improvement from the training program is significant.  Feh.
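For anyone who wants to check the numbers, here is a small Python sketch of the summed-thickness arithmetic. The PRE and POST values are the ones quoted from the paper; the even split between the two muscles is purely an assumption for illustration since the paper doesn't report it, and the percentage is simply derived from those same two numbers.

```python
pre_cm = 6.98   # summed biceps + VL thickness at PRE (quoted from the paper)
post_cm = 7.16  # summed biceps + VL thickness at POST (quoted from the paper)

change_cm = post_cm - pre_cm
change_mm = change_cm * 10

# Assumption: growth split evenly across the two muscles (not reported).
per_muscle_mm = change_mm / 2

# Relative change in the summed value.
percent_change = change_cm / pre_cm * 100

print(f"summed change: {change_cm:.2f} cm ({change_mm:.1f} mm)")
print(f"per muscle if split evenly: {per_muscle_mm:.1f} mm")
print(f"relative change in summed thickness: {percent_change:.1f}%")
```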

Looking at the quad biopsy data, quad size actually went down from Week 1 to 3 (visible in D/E/F in the figure above) before returning to the starting size by week 6. This backs up the tentative conclusion on the Ultrasound above.  Basically at the lower volumes, quads shrank before increasing to their starting size at the end of the study.  If they hadn’t measured at week 3, they would have seen zero change from start to finish.

This is perhaps the more interesting finding, that there is a possible discrepancy in growth requirements for upper and lower body.  For the upper body, biceps size went up from week 1 to 3 before going back down but ended up right where it started: a lot of work to achieve nothing.  Not only was more volume not better, it was worse.  In contrast, quads (by biopsy anyhow) shrank from Weeks 1 to 3 before growing from Week 4 to 6.  This suggests that the lower volumes early on were insufficient but doesn’t change the fact that, by biopsy, there was no overall muscle growth from Week 1 to 6.  The subjects did a LOT of squatting to make zero gains.

This does suggest, as I’ve noted already, that upper and lower body might have different optimal volume requirements. Perhaps if the quad volume had started at a higher level, there would not have been size loss from Weeks 1 to 3, or there would have been a net size gain from Weeks 1 to 6.  Perhaps if upper body volume had been capped at 20 sets, there would have been no loss of size or even a further increase.  Perhaps perhaps perhaps.  Since they didn’t do it, we can’t know.

This is all still colored by the fact that any total growth of any sort was absolutely tiny.  When you have to add both values together to get it to be significant, that’s pretty damn telling.   This is what the Ostrowski study got in triceps alone with 14 sets/week.  Mind you, this study was a mere 6 weeks which is seriously short.  Perhaps we should consider that significant growth in that time period.  Perhaps we shouldn’t.  Perhaps over a longer study, perhaps with slower volume increases, different results would have been seen.  Perhaps, perhaps, perhaps.  This is what they did and this is the data they have.  In this vein they state:

Finally, while a 6-week RT program seems rather abbreviated, we chose to implement this duration due to the concern a priori that the implemented volume would lead to injuries past 6 weeks of training.

Which is a consideration a lot of people seem to be missing in the current volume wars.  Even IF super high volumes generate better growth, what happens when they are followed over extended periods? I’ll tell you what happens: overuse and other injuries, overtraining, burnout, etc, none  of which are beneficial for long-term progress.  It’s just something nobody seems to be talking about outside of a few folks on my Facebook group.

Simply, even IF massive volumes generate better short-term growth (and the overall indication is that they do not), if they get you hurt that’s not a good thing.  Long-term progress is the goal here and that usually means a more moderate approach over time.  Put differently, it’s better to do a series of training cycles with 15 sets/week that gets some growth than doing one with 30 sets/week that keeps you out of the gym for 3 months with tendinitis.

Since it will likely be used to rebut the above, let me note that Brad Schoenfeld’s study, discussed next, used high volumes over 8 weeks and saw no dropouts due to injury.  Well, 8 vs. 6 weeks is not much of a difference and it’s when absurd levels of training are followed for months at a time (as most tend to do) that they get into trouble.  Try it for 3-4 months and get back to me is what I’m saying.  As well, the nature of the workout in Brad’s study had some real implications for training poundages that were used.  It’s one thing to do all the volume when the overall loading is low and another to do it when it’s heavy.  Only the latter wears stuff out.

Ignoring that for the time being, there is another issue that has to be addressed which is the ECW issue I mentioned above.  IF increased ECW is throwing off the Ultrasound, then it is only impacting on high volumes of training and it MUST be accounted for. But first it has to be determined if it impacts anything.  Do a pilot study, do something, but figure this out before another of these damn studies is done.  All you have to do is measure muscle thickness when there is no increase in ECW, then do something (like 32 sets of squats) to increase ECW and re-measure it when true growth could not have occurred.  BIA can clearly pick up ECW so this shouldn’t be terribly difficult to do methodologically.

Either muscle thickness does or does not change in response to it.  If it doesn’t, there’s no problem.  If it does, there is a HUGE problem where high volume studies are now measuring changes in ECW rather than actual growth. Or, at the very least, ECW changes are causing an artificial elevation of the Ultrasound BUT ONLY IN THE HIGH VOLUME GROUPS which would make it look like they are generating MORE growth than they actually are.  But ONLY in the high volume groups.

A second issue I want to discuss.  A current theme in the training world is that “volume is the primary driver on hypertrophy” and this study was clearly designed to test that.  Except that what it really did was disprove the idea entirely.  Keep in mind that the intensity of training was 60% of maximum with stupid long rest periods and the subjects reported 3-4 reps in reserve during the study.  So pretty submaximal for endless sets of 10 with little to no cumulative fatigue (even Poliquin’s original GVT was 10X10@60% on 1 minute and it got HARD by set 10 as fatigue set in).

And despite throwing all the volume at folks, growth was, factually, sucky, a mere 1.8 mm which they only achieved by adding up biceps and VL to begin with.  Basically, with insufficient TENSION overload, all the volume in the world doesn’t generate meaningful growth.  It does cause lots of water retention and maybe this does lend some credence to the whole idea of pump growth, it just happens to be increased ECW rather than sarcoplasmic growth.  So pump it up with light weights and you look bigger (maybe, for a little bit) and weigh more.  But it’s just fluid accumulation.  Or you could lift real weights for less volume and actually grow better.

And all of the above is true because TENSION is the primary driver on hypertrophy and this study shows it in spades (in his apologist article, Eric even states this very thing, citing the same study I’ve been referring people to for almost 20 years).  Ostrowski got much more growth in the triceps (2 mm) with only 14 sets by using heavy loads (admittedly over a longer time period).  So did the two GVT studies (which at least had some heavy work, 70% for sets of 10).  A relatively small study by Mangine showed the same (when I mentioned this study to Brad he hand-waved it away for reasons I forget) where much lower volumes at a higher intensity generated more growth and strength than higher volumes at a lower intensity.

Tension/progressive tension overload trumps volume every time.

It’s really that simple: tension/progressive tension overload beats volume every time. Yes, volume plays a role but, in the absence of sufficient tension, it doesn’t mean shit.   That means that volume can’t be the primary driver because that’s not what primary means. If this is unclear consider the following:

  1. Insufficient intensity + high volumes = dick for growth (this study)
  2.  Sufficient intensity + low volumes = growth (all other studies)
  3. Sufficient intensity + higher volumes SOMETIMES = more growth (all other studies)

What’s the common factor in the two situations that generate growth? It’s tension, NOT volume.     That makes tension overload primary as without it, growth is effectively nil.  Volume is purely secondary to sufficient tension overload and there’s no escaping that fact.  Yes, more volume AT A SUFFICIENT INTENSITY may yield more growth. But ALL the volume at submaximal intensities accomplishes jack squat.

Since I imagine some will use it to rebut the above, I should address the low-load training studies (usually using 30% 1RM to FAILURE) to generate growth.  This is true but note my capitalized word.  By taking light loads to failure, the muscle fibers ARE exposed to high tension loads at the end of the set.  This is due to fatigue occurring and requiring the high threshold fibers to be recruited to keep the weight moving until FAILURE occurs.  And this is distinctly different than doing 10 non-fatiguing sets (remember RIR never went below 3-4) at 60% with 10 minutes rest.  The latter simply never exposes muscle fibers to a high-tension overload.  And clearly doesn’t work very well (if at all).  Even if it did, why bother with 20 light sets when, most likely, 8-10 heavy sets would be more effective?


As their own conclusion in the discussion (not the abstract) shows, the growth response seemed to hit a top-end cap of 20 sets with more sets generating insignificant gains in LBM but rather increased water storage due to edema/inflammation.  It suggests a difference in optimal volume levels for upper and lower body as well with the upper body seemingly responding better to lower volumes and the legs actually getting smaller with the same lower volumes (before returning to their initial size as volume went up).

But this has to be considered with the fact that the overall growth was minuscule, only being even remotely significant with the silly buggers method of adding biceps and VL growth together.  Admittedly, this was a short study using insufficient loads (also demonstrating clearly that volume is NOT the primary driver on growth because volume didn’t drive jack shit for growth here).

Perhaps the most interesting point is that water retention may have to be accounted for going forwards since it can clearly skew the results and make it look like more growth is occurring than actually is.  And since it looks like higher volume causes more water retention than lower, well….this has HUGE implications for a lot of volume studies.  Because if water retention is ONLY an issue beyond 20 sets/week, then it means that any study comparing volumes below and above this level has a confound that must be taken into account for the high volume groups (but NOT the lower).

That is, if you compare 12 sets to 24, the 24 set group MUST be adjusted for ECW but the 12 set group doesn’t have to be.   And IF that ECW is shown to impact on Ultrasound measurements, this means that any “apparent” benefit of the highest volumes might just be a measurement artifact of increased ECW.  Note my use of the word might.  We don’t know if ECW colors Ultrasound or not.  But it MUST be studied before another one of these studies is done.

Bringing me to Brad Schoenfeld’s study.

Resistance Training Volume Enhances Muscle Hypertrophy but not Strength in Trained Men

This is the paper by Brad Schoenfeld et al. that has been a driver on all of the recent Internet drama.  It’s been discussed to death and I’ll try to keep this short (and probably fail).  It took 45 college aged men with at least 1 year of training experience (the single reported value was like 4 years ±3.9 or something).  Of those initial 45, 11 dropped out (not due to the workouts) leaving 34 total subjects.

They did the following workout: flat barbell bench press, barbell military press, wide grip lat pulldown, seated cable row, back squat, leg press and one-legged leg extension.  This was done three times weekly for either 1, 3 or 5 sets per exercise.  For the upper body, this meant that the volumes ranged from 6 to 18 to 30 sets and for lower it was 9, 27 or 45 sets per week.  Sets were 8-12RM (to concentric failure supposedly) on a 90 second rest interval.  Which is really already failing the reality check.

Let me ask:

  1. How many people have you ever seen squat to failure voluntarily?  By this I mean continue lifting until they fail mid-rep and either need a spotter to get them to the top or lower the bar to safety pins or dump it?
  2. How many could do 5 sets of squats to failure with 90 seconds rest with any decent poundage?

The answers will likely be

  1. Almost nobody.  I’ve done it, I’ve had the occasional trainee do it.   I’ve seen almost none do it on purpose.  If failure occurs on squats, it’s usually because the person is a macho dipshit and something goes desperately wrong mid-set (or it’s a powerlifter missing a heavy max).  But few, if anybody, do it on purpose.
  2. Literally nobody.  Well, certainly no men.  Due to differences in fatigue, women would be far more likely to survive this, but the study only used men so that doesn’t matter, and they would never survive it.  A true set of 12RM and you have to lay down for a few minutes.  5 sets of 12RM on 90 seconds?  Maybe with 95 lbs on the bar and then the first set wouldn’t be to failure.  What you’d probably see is a decent first set and the poundages or reps dropping excruciatingly with every set.  This would be the definition of junk volume and 2-3′ rest between heavy sets of squats anywhere near limits would be a minimum.  I’d love to see what poundages were actually used in the high-set workouts because I bet they started ok and ended up as jack shit.

Which once again raises immediate questions about the study in the sense that the high-volume workouts were basically impossible to complete but no matter.

Body composition was, shockingly, not measured (even post-training weight wasn’t reported which is a bizarre oversight) especially given that the head researcher apparently likes to crow about his super amazing Inbody body comp device in his lab.  Sure, BIA is shit but if you have this amazing gizmo, why not use it?  Guess they couldn’t find the time with all the not pre-registering, doing the unblinded Ultrasounds, not describing the randomization in the paper they were doing along with figuring out how to misrepresent the Ostrowski data to change its actual conclusion to match theirs.

However, muscle thickness for triceps, biceps, vastus lateralis and rectus femoris was done by Ultrasound.   I’ve presented the data for each muscle group in the chart below.  All that is presented is the mm change for each muscle and the %age change this represents (I did this by dividing the average change by the average starting value in the study).   I’ve indicated sets as sets/week for upper and lower body.  I’ve indicated, using THEIR statistical methods which groups were different from which group.  An asterisk means it is different from a non-asterisk but equal to any other value with an asterisk.  NS means non-significant statistically.

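Upper Body                      | 6 sets/week | 18 sets/week | 30 sets/week
--------------------------------|-------------|--------------|-------------
Triceps Change                  | 0.6 mm      | 1.4 mm NS    | 2.6 mm NS
Triceps %age Change             | 1.3% NS     | 2.9% NS      | 5.5% NS
Biceps Change                   | 0.7 mm      | 2.1 mm*      | 2.9 mm*
Biceps %age Change              | 1.6%        | 4.7%*        | 6.9%*

Lower Body                      | 9 sets/week | 27 sets/week | 45 sets/week
--------------------------------|-------------|--------------|-------------
Rectus Femoris Change           | 2.0 mm      | 3.0 mm       | 6.8 mm
Rectus Femoris %age Change      | 3.3%        | 5.1%*        | 12.5%*
Vastus Lateralis Change         | 2.9 mm      | 4.6 mm       | 7.2 mm
Vastus Lateralis %age Change    | 5%          | 7.9%*        | 13.7%*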

So despite a numerical change, triceps showed no significant differences between groups even if the absolute values were kind of different.  1.4 is over double 0.6 and 2.6 is nearly double that.  But it was non-significant.  But this is the beauty of statistics where an apparent real-world change can be statistically irrelevant but an irrelevant real-world change can be significant.

For biceps, the 18 and 30 set/week groups were not statistically different from one another but both were higher than the 6-set group (and somehow smaller absolute changes in this case were significant).  This is likely due to it being statistically underpowered and I guess there is a small trend for 30 sets to be superior but that’s an enormous increase in volume and training time to get a relatively smaller increase (12 more sets/week for 0.8 mm).  Basically, the percentage change kind of hides the fact that going from 6 to 18 sets tripled the growth (200% more) and adding another 12 sets got only about 40% more.  Even if the higher volumes are superior, it’s a terrible return on investment.
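The arithmetic behind that return-on-investment point can be sketched in a few lines of Python, using only the biceps values from the chart above.  The function and variable names are illustrative assumptions, not anything from the paper.

```python
# Biceps muscle-thickness change (mm) by weekly set count, from the chart above.
biceps_mm = {6: 0.7, 18: 2.1, 30: 2.9}

def marginal_returns(growth_by_sets):
    """For each step up in volume, return
    (added sets, added mm, mm per added set, % more growth than the lower group)."""
    steps = []
    levels = sorted(growth_by_sets)
    for lower, higher in zip(levels, levels[1:]):
        added_sets = higher - lower
        added_mm = growth_by_sets[higher] - growth_by_sets[lower]
        pct_more = 100 * added_mm / growth_by_sets[lower]
        steps.append((added_sets,
                      round(added_mm, 2),
                      round(added_mm / added_sets, 3),
                      round(pct_more)))
    return steps

for step in marginal_returns(biceps_mm):
    print(step)
```

The first 12 added sets buy 1.4 mm and the next 12 buy 0.8 mm, so each added set in the second step returns a bit over half of what it did in the first.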

The same is true for the RF and VL changes.  Both the 27 and 45 set groups were better than the 9 set group but were NOT statistically different from one another.  But there is a visible trend present.  Clearly 12.5% is more than double 5.1% and 13.7% is a little less than double 7.9%.  Like in Ostrowski, the visible trend simply didn’t show up as a statistical difference.

Let me note that the above is true for all statistical methods applied, there was NO statistical difference between the moderate and high volume groups although both were better than the lowest volume groups.  This actually makes sense given the general improvement in growth up to 10+ sets.  But there is also the issue of the spread of sets.  From 6 to 18 sets up to 30 is big although 9 to 27 to 45 is enormous.  What happens in-between those values?  We don’t know.

Now when the P-value statistics failed to show a benefit, James Krieger invoked something called Bayesian statistics (a probability thing that I only vaguely understand and won’t attempt to discuss further).  Here the Bf values were like 1.2 for the 3 set and 2.4 for the 5 set group and this was claimed as double the probability of a real difference for the higher volume group to be superior by James Krieger over and over, usually to deflect my direct questions.   But in Bayesian statistics numbers that low are called a weak effect (as Eric honestly reported even if he still concluded it meant something).

As a friend with much more experience in statistics explained.

Know what another way to say “weak” effect is? “Not worth more than a bare mention” (that comes from the Jeffreys (1961) reference that the Raftery (1995) paper [cite 20] refers to).

Cite 20 was in Brad’s paper and Jeffreys is probably one of those ancient statistics papers.  Basically they reference a paper on Bayesian statistics that says that those Bf values are barely worth a mention (until they get to like 100+ they just don’t mean anything).  Yet Brad and James et al. used it to draw their conclusion and James continually repeated it in lieu of answering direct questions.  I believe his response to me was “in Lyle’s world a doubling of the chance of rain isn’t important.”  Not when the doubling is from 0.1% to 0.2% it’s not, James.  Is it in yours?  Because the simple fact is that double jack shit is still jack shit.
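For anybody who wants to see why Bf values in this range are underwhelming, here is a rough sketch.  It assumes equal prior odds (a 50/50 starting point, which is an assumption of this example and not something the paper states), uses the approximate Bf values quoted above, and applies the commonly cited Raftery (1995) evidence cutoffs of 3, 20 and 150.  The function names are hypothetical.

```python
def posterior_probability(bayes_factor, prior_odds=1.0):
    """Convert a Bayes factor into a posterior probability for the favored hypothesis.

    posterior odds = Bayes factor * prior odds.
    prior_odds=1.0 (a 50/50 prior) is an assumption made for illustration.
    """
    posterior_odds = bayes_factor * prior_odds
    return posterior_odds / (1.0 + posterior_odds)

def evidence_grade(bayes_factor):
    """Label a Bayes factor (>= 1) using the commonly cited Raftery (1995) cutoffs."""
    if bayes_factor < 3:
        return "weak"
    if bayes_factor < 20:
        return "positive"
    if bayes_factor < 150:
        return "strong"
    return "very strong"

for bf in (1.2, 2.4):  # approximate values quoted in the text
    print(bf, evidence_grade(bf), round(posterior_probability(bf), 2))
```

Under that 50/50 assumption, doubling the Bf from 1.2 to 2.4 moves the probability from about 55% to about 71%, and both values still sit in the lowest evidence grade.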

Basically, the paper was really struggling to make the highest volume group look better than the moderate volume group (the P-hacking I mentioned in Part 1) and I’ll link out to the extremely detailed statistical analysis my friend did on the topic for anybody interested in this. Basically none of the statistical metrics used support that the high volume group was better than the moderate volume group although I will still acknowledge that there was a trend, just as with Ostrowski (it would be disingenuous for me to acknowledge it for one paper but not this one).

I’ve brought up my other issues with this study endlessly and said at the outset that I wasn’t going to harp on methodological issues one way or another for any paper and it would be unfair of me to do it for this one only.  If Eric thinks that the argument “Well other papers are methodologically flawed so it’s ok for this one to be” is correct, I’ll concede the point and apply that rule.  This makes all studies fair game by his own argument (recall that I dismissed Radaelli on the totally nonsensical results, not the methods).

He opened the door, I get to walk through it.   Eric, you played yourself.

One oddity is that, Eric’s and James’s attempt to dredge up nonsense about leg extension volume load notwithstanding (data once again NOT reported in the paper and ONLY brought up when they were desperate to maintain the bullshit), the strength gains in all groups were identical over the length of the study.  This fails the reality check so hard it hurts.  It contradicts literally every past study on the topic showing that strength gains increase with increasing volume (at least up to a point).  Several suggestions were made from the length of the study to the repetition ranges that were used.  But it still fails the reality check.

However, in that muscle size and strength show a strong (but imperfect) relationship, the lack of strength gains differences implies something else: a lack of differences in muscular size gains.   Same size gains, same strength gains.  QED.  Note my use of the word IMPLIES.  But there is a big issue when this paper runs basically opposite to every other one.  One argument I’ve seen is that the short rest periods prevented the weights from being heavy enough.  Well, that’s true.

Brad himself has done a paper showing that longer rest intervals beat out shorter.  So why design the study like this to begin with?  Brad says it was for practical reasons, to keep the workout about an hour.  Which raises a practical issue about insane volumes: with a real rest interval, workouts with this much volume are impossibly long (I believe the 32 set workouts in Haun were about 2 hours).   But it still fails the reality check and contradicts all other literature on the topic.  Brad has since spun this to the NY Times as “You can make strength gains with 13′ workouts”.

Now, acknowledging that there is a trend (albeit one that could not even be forced into being statistically significant despite throwing multiple statistical methods at it), let’s go back to the Ostrowski data since that is what Brad compared his results to in his paper’s discussion in terms of muscle growth.  In doing so, Brad represented the leg data as 6.8% for the low volume group and 13.1% for the high (3 and 12 sets respectively) stating that this is similar to his own data.  This is accurate but incomplete and you have to wonder why he left out the middle data point.  Was it to save ink (on a PDF and yes I know that the article is also printed, spare me), was he running out of words?  The journal doesn’t have a word count limit so that’s not it.  It’s not even as if adding that information would have added more than like 3-4 words.  Allow me to demonstrate:

Brad wrote this:
“Ostrowski et al. (11) showed an increase of 6.8% in quadriceps MT for the lowest volume condition (3 sets per muscle/week) while growth in the highest volume condition (12 sets per muscle/week) was 13.1%”

36 words

I can rewrite this as:
“Ostrowski et al. (11) showed a dose-response for growth in the quadriceps of 6.8%, 5% and 13.1% for 3, 6 and 12 sets/week respectively”

26 words.

Perhaps Brad should hire me to help him with the writing of his papers.  I saved him 10 words and accurately represented the data.

But let’s look at this side by side.  Below I’ve presented Ostrowski’s data for his 3, 6 and 12 set groups and Brad’s for his 9, 27 and 45 set groups.  I can only compare percentages here since the Ostrowski data was presented in mm^2 and there’s no way to back calculate that to mm to compare it directly.

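Study                      | Low volume          | Moderate volume      | High volume
---------------------------|---------------------|----------------------|----------------------
Schoenfeld RF %age Change  | 3.3% (9 sets/week)  | 5.1% (27 sets/week)  | 12.5% (45 sets/week)
Ostrowski RF %age Change   | 6.8% (3 sets/week)  | 5% (6 sets/week)     | 13.1% (12 sets/week)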

Ok, so yes, it is true that both studies showed about 13% growth with the highest volumes.  Mind you, Ostrowski got DOUBLE the growth with only 3 sets that Brad needed 9 sets to get and even the 6 set/week group equalled his 27 set group.  And while the third groups got roughly the same RF growth, we have to ask why it took 4 times as much volume in Brad’s study to accomplish this.

Seriously, in what world do you need 45 sets/week to achieve the same growth as another study got in 12 sets/week?  In discussing this discrepancy, Brad says this.

Interestingly, the group performing the lowest volume for the lower-body performed 9 sets in our study, which approaches the highest volume condition in Ostrowski et al. (11), yet much greater levels of volume were required to achieve similar hypertrophic responses in the quadriceps. The reason for these discrepancies remains unclear.

Basically a shoulder shrug and “dunno”.  Not even a speculation as to why.  Fantastic.

He actually tried to defend this on his website by writing the following:

“Some have asked why we did not discuss the dose-response implications between their study and ours. This was a matter of economy. Comparing and contrasting findings would have required fairly extensive discussion to properly cover nuances of the topic. Moreover, for thoroughness we then would have had to delve into the other dose-response paper by Radaelli et al, further increasing word count. Our discussion section was already quite lengthy, and we felt it was better to err on the side of brevity. However, it’s certainly a fair point and I will aim to address those studies now.”

This is seriously sad.  First off the journal doesn’t have a word limit so who cares for economy.  Second off, it’s good scientific practice to examine the discrepancies between your work and previous research to actually try to forward the field by determining what the explanation might be.  Third, with respect to Ostrowski, what nuance?  The study design was nearly identical, the weekly set count nearly identical, the training status nearly identical.  There’s no nuance and the results are the results and the FACT is that Brad needed 4 times the volume to get them and should try to explain why.  Radaelli is an issue because that paper is a mess but so what.  He was happy to address it in the STRENGTH data and examine the discrepancy, writing this:

However, for the bench press and lat pulldown exercises, the 30 weekly set group experienced greater increases than the two other groups. Given that their subjects did not have any RT experience it might be that the greater strength gains in the 30 weekly set group are due to the greater opportunities to practice the exercise and thus an enhanced “learning” effect (22). Also, their intervention lasted 6 months while the present study had a duration of 8 weeks. It might be that higher training volumes become of greater importance for strength gains over longer time courses; future studies exploring this topic using longer duration interventions are needed to confirm this hypothesis.

But what, by the time he got to the GROWTH data, he didn’t have the energy or time to discuss the differences?   Please.  But I guess when he’s putting out 5 papers per month, Brad is probably too time stretched to actually do something properly.

Anyhow, since Brad is too time crunched to do it, I will speculate on the possible reasons for the discrepancy in the growth data.  I think a possibility, and this is impossible to know given that Ostrowski failed to report on their rest intervals, is that the absolutely sub-optimal rest interval in Schoenfeld’s study was the issue.  Basically, when you train on 90 seconds and can’t use heavy loads, maybe you need an absolute metric ton of junk volume to get the same growth as you could get doing 12 quality sets. This is MY speculation and nothing more but at least I provided one.  Perhaps Brad should get me to help him with the discussion in the future since he clearly can’t do it himself.  As noted above, I guess when you have to get your name on 45 papers per year (as of October 2018), you don’t have the time.

Now we come to the triceps data.  First let me reiterate that Brad deliberately misrepresented the data set here, reporting that Ostrowski found that 28 sets gave better results than 7 sets (4.8% vs. 2.3%) which was broadly similar to his results.  This is technically true but incomplete and misleading since Ostrowski found a plateau at 14 sets, said data going unreported (arguably why he left out the middle data for the thigh changes, to establish the pattern).  By leaving out this data point, Brad changed the results of Ostrowski from disagreeing with him to agreeing with him.  This is called lying.

Honestly, that single fact should disallow this finding on every level and this paper should have never passed peer review for that alone (too bad I wasn’t on the peer review since I caught the Ostrowski lie upon my first read and I bet Brad wishes he hadn’t sent me the pre-publication paper at all).

Regardless of that, let me look at the side by side data in terms of %age gains.

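Study                       | Low volume        | Moderate volume     | High volume
----------------------------|-------------------|---------------------|-------------------
Schoenfeld Tri %age Change  | 1.3% (6 sets/wk)  | 2.9% (18 sets/wk)   | 5.5% (30 sets/wk)
Ostrowski Tri %age Change   | 2.3% (7 sets/wk)  | 4.7% (14 sets/wk)   | 4.8% (28 sets/wk)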

So we see a similar pattern to legs.  First, Ostrowski got almost double the %age growth with 7 sets/week that Brad got with 6.  Moving up to 14 sets/week, Ostrowski still crushed Brad’s 18 set/week group.  And ignoring the misrepresentation of the data, it took Brad 30 sets/week to achieve a little more growth than Ostrowski got in 14.  So over double the volume to achieve only a slight improvement.  So just like with the leg data we ask why it seemed to take twice as much volume to generate nearly identical growth in Brad’s study (bonus question: why will nobody directly address Brad’s lie about this data set?).
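Both side-by-side comparisons boil down to two ratios: how much more weekly volume one group did and how much more growth it got for it.  A small sketch, using only the percentages quoted above (the helper name is made up for illustration):

```python
def compare(sets_a, growth_a, sets_b, growth_b):
    """Return (volume ratio, growth ratio) of group A relative to group B."""
    return round(sets_a / sets_b, 2), round(growth_a / growth_b, 2)

# Triceps: Schoenfeld 30 sets/week (5.5%) vs. Ostrowski 14 sets/week (4.7%)
print(compare(30, 5.5, 14, 4.7))

# Rectus femoris: Schoenfeld 45 sets/week (12.5%) vs. Ostrowski 12 sets/week (13.1%)
print(compare(45, 12.5, 12, 13.1))
```

For triceps that is a bit over twice the volume for roughly 17% more growth, and for rectus femoris it is 3.75 times the volume for slightly less growth.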

Now, since Brad misrepresented the data to make it look like it supported him, there was no discussion of the discrepancy.  It makes me wonder why more researchers don’t simply lie about data.  It would save so much time discussing reasons why it disagrees with you.  Just lie about the data and, boom, all research agrees with you. The amount of time it would save typing is enormous.

As with the leg data, I’d speculate again that the brutally suboptimal design of the workouts in Brad’s study is again at fault.  Even for upper body how many can do a lot of RM load sets on a short rest interval?  Not many.  But again maybe you need twice as much volume when those sets are lower quality junk volume.  Simply put, Ostrowski got the same growth as Brad’s with 1/2 as many sets.  Why do double for essentially the same results (doubling your volume for a small %age increase is asinine)?

There are other issues relating to this paper but I’ll only focus on the one that the Haun paper recently brought into light which is the issue of ECW and water retention which only appears to become an issue when more than 20 sets/week are done (note: Brad couldn’t have known about this paper since it came out a week before his so I can’t say he ignored it).  As a reminder, for the upper body workout, the set count was 6, 18 and 30.  Only the last one hits the volume level of 20+ sets where ECW might be relevant.  For legs it was 9, 27 and 45 so both the moderate and high volume group clear it.
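Which groups would need an ECW check follows mechanically from the set counts.  A minimal sketch, assuming the cutoff is simply "more than 20 sets/week" as described above (the names are illustrative, and the threshold itself is the tentative one suggested by the Haun data, not an established constant):

```python
ECW_THRESHOLD = 20  # sets/week; tentative cutoff taken from the discussion above

weekly_sets = {
    "upper body": [6, 18, 30],
    "lower body": [9, 27, 45],
}

def groups_needing_ecw_check(sets_by_region, threshold=ECW_THRESHOLD):
    """List the (region, sets/week) groups whose weekly volume exceeds the threshold."""
    return [(region, sets)
            for region, volumes in sets_by_region.items()
            for sets in volumes
            if sets > threshold]

print(groups_needing_ecw_check(weekly_sets))
```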

As above, do we know that ECW impacts on Ultrasound?  No.  But we don’t know that it doesn’t (and there’s endless other work that edema is still present at the 48-72 hour time point Brad measured at anyhow even if James Krieger is desperate to dismiss it).  Haun clearly showed that it skewed the results hugely at high volumes and it needs to be studied and addressed.  But the issue only applies to 3 of the groups in this study (30 set upper body, 27 and 45 set lower body).   And if ECW turns out to skew the Ultrasound measurements, then even the trend towards better growth may very well disappear or be a measurement artifact.

Note I said may, not will and it may turn out not to have an effect (and I’ll accept that).  It has to be studied so we can know for sure.

Rather, based on the detailed analysis of the statistics, it’s clear that the high volume group did in fact NOT do better than the moderate volume group although there was an apparent visible trend based on absolute mm change and %age change.  Not by P value and not by Bayesian factors (which were too small to be relevant because double jack shit is still jack shit) and not by any nonsensical argument by the people involved or those defending it.    Going forwards, I’m going to treat this study as if the moderate volume group did best. I’ll look at the high volume group too but NOTHING about this study supports the conclusions being drawn.  NOTHING. You can agree or not.  As I said above, Brad’s lie alone should dismiss the paper out of hand but I’m keeping it in so that I won’t be accused of bias or simply making data I don’t like disappear (you know, like Brad did with Ostrowski).

Because even if you take the highest volume results at face value, you’ll see next week that it contradicts the other 5 papers I will have looked at.  And, as I discussed in Part 1, we base models on the overall data, not the one study (as James so helpfully pointed out despite the fact that only HIS group of folks were doing it).  If 1 study is the outlier on the other 5, we ignore the 1 until it’s replicated.  And in this case it must be replicated.


Brad can continue to churn out studies showing that volume is all that matters but I suspect that looking at all of them in detail would turn up an equal amount of shenanigans (a project for another day).  As he’s now shown that he is willing to lie in a discussion to change the conclusion of a contradictory paper, nothing he puts out from here on out can be taken at face value: what’s to stop him from lying about data again?  If ONLY he can produce these results, then we have another issue (remember, not only is replication required but is better if someone else does it).  If another lab, perhaps one that thinks he’s wrong replicates it, then we can consider it. Let’s get Jeremy Loenneke on the job, hahahaha (in-joke, sorry).

You’ll also see in Part 3 that if we take into account another issue that has only recently been brought up (by Brad himself), that even if we take the highest set counts at face value as generating higher growth, it still stops mattering.  You’ll have to wait until next week for that.


Despite their attempts to make volume happen, Brad’s own statistics at best support that moderate volumes of 18 sets for upper body and 27 for lower body give better growth than lower volumes with statistically irrelevant support for the highest volumes of 30 and 45 sets being any better (there was a trend that did not reach significance by any of the methods used).  Even if you accept the highest volume data, it doesn’t change the fact that Brad needed 2X and 4X the volume to get the SAME growth as Ostrowski with him attempting to defend why he did NOT explain the discrepancy.

Since the leg volumes vary from a low of 9 sets to a medium of 27 it’s impossible to know if a value between those levels would have generated different results.  But it’s a huge spread of volume and an 18 set group would be very informative.  But they didn’t do it, the data is the data and we don’t know what might have happened (any speculation I could make is colored by my own bias so I won’t make one).

Also, the lack of differences in strength gains between groups goes against literally every other study on the topic where more volume, up to a point, leads to greater strength gains.  Literally every study.   Brad speculates on the reason but it still contradicts a lot of other data.  But the most parsimonious explanation, given the general relationship of muscle size and strength, would simply be that the gains in muscle didn’t differ.  Occam’s Razor folks, now available from Dollar Shave Company. Same muscular gains equal same strength gains.  QED.

This study just fails the reality check so hard it hurts.  The workouts can’t possibly have been completed to begin with using any decent poundages, the lack of strength gains scaling with supposed size gains contradicts all previous data, etc.  It goes against every other study.  So either they are all wrong or it is wrong and well…..

Couple that with the deliberate misrepresentation of the data of Ostrowski and this study has some real issues.  Methodologically, it failed half of the criteria Greg Nuckols himself laid out in MASS even if Eric still said “Good study, broseph” (and Greg tried to somehow dismiss what I wrote last time).  But, I’m nice, I won’t dismiss it out of hand like I did with Radaelli because then people will say I’m biased against Brad and I’ll include it in the rest of my analysis.  Even IF its results are valid, and I clearly don’t think they are, it won’t end up mattering.

Dose-Response of Weekly Resistance Training Volume and Frequency on Muscular Adaptations in Trained Males

And on the heels of all of that is a brand new paper that came out a week after Brad’s by Heaselgrave et al. and published in the International Journal of Sports Physiology and Performance.    In it, 49 resistance trained males (1 year or more consistent training so not untrained) were put in either a low, moderate or high volume group that did 9, 18 or 27 sets for biceps for 6 weeks.

This was accomplished with a workout of biceps curls, bent over row and pulldown and the workout structure is a little bit goofy, unfortunately having both a frequency and volume component which adds an unnecessary second independent variable.

The low volume group trained once weekly performing 3 sets of each exercise for a total of 9 sets.  The moderate group performed that workout twice per week for a total of 18 sets.  The high volume group did one workout consisting of 5 sets of curls and row and 4 sets of pulldowns (14 sets) and a second workout of 4 sets of curls and row and 5 sets of pulldowns (13 sets) for a total of 27 sets.  I’ve replicated it to the best of my ability below with the + in the high group referring to the second workout.

Volume | Low (9 sets/wk) | Moderate (18 sets/wk) | High (27 sets/wk)
Pulldown | 3 sets 1X/week | 3 sets 2X/week | 4 sets 1X/week + 5 sets 1X/week
Bent over row | 3 sets 1X/week | 3 sets 2X/week | 5 sets 1X/week + 4 sets 1X/week
Biceps curl | 3 sets 1X/week | 3 sets 2X/week | 5 sets 1X/week + 4 sets 1X/week
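
For anyone who wants to check the arithmetic, here is a minimal Python sketch that tallies the weekly sets per group from the protocol as described above.  The variable and function names are made up for illustration and are not from the paper.  Note that it counts every pulldown and row set as a biceps set, which is how the 9/18/27 totals are arrived at; the direct curl-only count is shown alongside for comparison.

```python
# Sets per workout as described in the text.
# Each inner list is one workout: [pulldown, bent over row, biceps curl].
# Names here are illustrative only, not taken from the paper.
protocol = {
    "low": [[3, 3, 3]],                  # one workout per week
    "moderate": [[3, 3, 3], [3, 3, 3]],  # same workout twice per week
    "high": [[4, 5, 5], [5, 4, 4]],      # 14-set workout + 13-set workout
}

CURL_INDEX = 2  # position of biceps curl in each workout list


def weekly_sets(workouts):
    """Total sets per week, counting all three exercises as biceps sets."""
    return sum(sum(workout) for workout in workouts)


def weekly_curl_sets(workouts):
    """Direct biceps curl sets per week only."""
    return sum(workout[CURL_INDEX] for workout in workouts)


weekly_totals = {group: weekly_sets(w) for group, w in protocol.items()}
curl_totals = {group: weekly_curl_sets(w) for group, w in protocol.items()}

for group in protocol:
    print(f"{group}: {weekly_totals[group]} total sets/week, "
          f"{curl_totals[group]} direct curl sets/week")
```

The totals come out to 9, 18 and 27 sets per week, of which 3, 6 and 9 are direct curls.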

They did sets of 10-12 with a goal RIR of 2 reps (so close to failure), starting at 75% of 1RM and a 3 minute rest between sets.  The workouts were overseen with a lifting tempo of one second up and 3 seconds down.  Basically they lifted heavy weights, near limits and used a long enough rest interval to keep the loads heavy and as they state “…to maximize MPS [muscle protein synthesis] and strength gains.”

The subjects were allowed to work out outside of the study which is a HUGE methodological problem but they provided workout logs in order to show that they were not doing extra biceps work.  This unfortunately makes it easy to hand-wave these results away which is more or less what Brad tried to do online.  How do we KNOW that they didn’t do more arm work?

Well we don’t so we kind of have to trust them and their self-reporting.  And since Brad said he can do studies unblinded because you can trust him, I think it’s only fair to apply the same standard here unless he wants to call the study subjects liars.  I for one would hope Brad would not impugn someone’s integrity in such a fashion unless he is the only human in the world who can be trusted.  Shouldn’t we give the subjects of this study the benefit of the doubt?

Of course, I could claim on any study I didn’t like that the subjects did extra training outside of the study (call it The Colorado Experiment effect).  In Brad’s study or in Haun’s study, nobody can say for sure if the subjects did more training outside of the study itself.  Unless you lock them up in a metabolic ward, you can never know for sure.

Is this design ideal?  No.  It simply is what it is and I’m describing this limitation up front because that’s the honest thing to do and I think the results are still worth examining.  And if I’m keeping in Brad’s methodological shit show of a study (remember, it failed about HALF of Nuckols’ methodology list) and Eric says it’s ok if one study is unsound because others are, I’m keeping this too.

They opened the door, I get to use it.

Looking at muscular thickness, the average changes in size were 0.1 cm (1mm) for the low volume group, 0.3 cm (3mm) for the moderate volume group and 0.2 cm (2mm) for the high volume group.

Heaselgrave Biceps Change

Now all three groups were significantly different from the initial value.  However, as with Ostrowski, there was no difference between groups.  That is, statistically the researchers conclude that all groups grew the same.  At the same time, as with Ostrowski there is a visible trend (admittedly with LARGE variance) from low (1 mm) to moderate (3 mm) back to high (2 mm).  Unfortunately the researchers did not provide the starting or ending levels, only the change, so I can’t calculate the percentage changes here.   And I am hesitant to eyeball it as it will reflect my own bias.
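
To make it concrete why the change alone isn’t enough, here is a small Python sketch of the percentage calculation.  The changes in mm are the group averages reported above; the baseline thicknesses are purely HYPOTHETICAL placeholders chosen to show the arithmetic.  They are not from the paper and are not an estimate of what the subjects actually started at.

```python
def percent_change(change_mm, baseline_mm):
    """Percentage change relative to a starting muscle thickness."""
    if baseline_mm <= 0:
        raise ValueError("baseline thickness must be positive")
    return 100.0 * change_mm / baseline_mm


# Average changes reported in the study (mm).
reported_change_mm = {"low": 1.0, "moderate": 3.0, "high": 2.0}

# HYPOTHETICAL baselines (mm), for illustration only; NOT from the paper.
hypothetical_baselines_mm = [30.0, 40.0]

for baseline in hypothetical_baselines_mm:
    for group, change in reported_change_mm.items():
        pct = percent_change(change, baseline)
        print(f"baseline {baseline:.0f} mm, {group}: "
              f"+{change:.0f} mm = {pct:.1f}%")
```

The same 3 mm works out to 10% on one placeholder baseline and 7.5% on the other, so without the real starting values any percentage would just be a guess.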

At the same time, statistically insignificant or not, the values are similar in absolute terms to the Ostrowski triceps data (1mm, 3mm, 2mm here vs. 1 mm, 2mm, 2mm in Ostrowski).  If Brad Schoenfeld et al. get to use that magnitude of non-significant changes (albeit misrepresented) I feel as if I have the same right.  They opened the door, I get to walk through it.  An alternate conclusion, and this would be consistent with other work, would be that all groups did in fact do the same and that 9 sets was just as effective as 18 or 27.    You can take your pick on which you think is correct.

What kind of stands out is the individual variance (and this is always an issue with these studies, hi James).  Clearly some subjects in the low volume group grew better than others in the moderate volume group (which has both the highest AND lowest responder).

The high volume group is certainly more tightly clustered together.  But if other papers are going to look at average response, I’m using this one’s average response for consistency.  And on average, moderate volumes beat out low and high volume was either equal to or even a bit less than moderate.



While the researchers reported no difference in growth from low to moderate to higher volumes there is a trend towards better growth from 9 to 18 sets with no further growth at 27 sets.  Again we see a cap/threshold at moderate volumes.

And with that final study out of the way, I’ll cut it here.  Next week in the third and final part, I’ll look at the studies in the aggregate to see if any general patterns show up along with re-addressing the set counting issue.  Don’t worry, this is almost over.


Why do Leg Extensions Hurt So Much?

Ok, this is going to be one of my stupid, pointless, non-applied articles that I just need to write to get something out of my head (so unless you’re really interested in minutial trivia go read something else).  It’s also a way to actually update the site as I finish up getting ready to launch the Women’s book (no foolin’ this time, the book is done and it’s just some busywork to launch in the third week of January).

Question 1: Why do leg extensions hurt so much for high reps?  I mean locally hurt, the quads are screaming and hurt more than other similar movements done for similar reps.

Question 2: What do blood flow restriction (KAAATSUUUUUU!!!), speed skating and leg extensions have in common?

Read more to find out.

Blood Flow Restriction (BFR)

Ok, for the 3 people who don’t know what BFR is, it’s a relatively new method of training where you basically use pressure to reduce blood flow to the muscle and then use relatively light loads for training.  And research has generally found that it provides similar hypertrophy gains to muscle as heavier training and does so with lighter loads with various mechanisms being involved.  Please note that the size gains are, at best, identical but not greater.  And you don’t get the strength gains you’d get from lifting real weights since you aren’t training the neural components.

Now, BFR is nice in that it does reduce joint strain, which can be fantastic if you have a joint injury or otherwise need to deliberately keep joint strain down.

But it has drawbacks.  One is setup since you’re having to go to the trouble to get everything tied off.  I’m not sure the average trainee can get the pressure right since it tends to be pretty specific.  Cutting off blood flow to muscles is not a good thing.  Necrosis anybody?  And while excruciatingly minor in the big scheme, there are two case studies of rhabdomyolysis occurring with BFR.  Mind you, that’s a weekly occurrence for CrossFit.


Hyperplasia vs. Hypertrophy in Skeletal Muscle

I received the following question in the mailbag and, for a fairly short question I’m going to give a fairly long answer since it gives me something to write about today.

Question: Does the number of fast/slow twitch muscle fiber types in your body actually change in response to strength or endurance stimulus? Or just the volume, and you’re stuck with what your genetics dictate?

The short answer is yes-ish.  Here’s the long answer.

Let me make one clarification here.  Well, two.  The first is that I am talking about skeletal muscle.  Cardiac muscle acts a little bit differently in how it grows with stress and we don’t lift weights for a bigger heart (perhaps if we did there would be more love in the world).

Also, I’m talking about training induced growth.  You can cause some goofy stuff to occur when you ablate a muscle (i.e. cut a muscle in a larger group and you see the other muscles grow like crazy) or with other distinctly non-physiological types of research methods.  Here we’re talking about moving iron (the original question asked about endurance training but there’s no reason to begin to suspect that hyperplasia occurs from that type of training in my mind).

Hyperplasia vs. Hypertrophy: Definitions

First, let’s define the terms hypertrophy and hyperplasia.  Hypertrophy means an increase in cell size.  Fat cell hypertrophy occurs when the fat cell increases in volume (by storing fatty acids as triglyceride) and skeletal muscle hypertrophy occurs when skeletal muscle increases in volume.

Hyperplasia means an increase in cell number.  Fat cell hyperplasia (which does occur in adults, contrary to old belief) is an increase in fat cell number.  Skeletal muscle hyperplasia would be an increase in muscle cell, or in this case, fiber number.


It’s Time to Forget About Bulgarian Training

I’m actually not entirely sure how to introduce this piece, it’s just been something that’s been going through my head when I walk my dogs in the morning and I’m not even sure what stimulated it in the first place.  But as the title suggests, basically I think it’s time for the majority of the general training world to forget Bulgarian training.

Now, I’ve been in this field professionally for nearly 2 decades at this point and I have watched this endless fascination with what the Bulgarian OL’ers are supposedly doing come and go for the entire time.  And it was around far longer than that.  From about the time that the Bulgarians came on the scene (in roughly the 80’s) and started handing the Russians their asses in Olympic Lifting (at least in the lighter weight classes), all while using a training system that went more or less against the beliefs of the day, people have been fascinated with their training.

Since that time, various athletes, mostly Western Olympic Lifters (but every so often powerlifters) have attempted to apply the Bulgarian system to their training.  Without fail, it fails.  They get broken off, injured and unless they use it in fairly specific ways for fairly short periods of time, they get injured or worse.   They don’t have the buildup, the background, the drug support and it simply breaks them.

But to understand that, first let me look at the system in brief.

What is Bulgarian Training?

At the time that Ivan Abadjaev took over Bulgarian Olympic lifting, the common model was fairly stock standard periodization moving from transition to general prep to specific prep for competition.  You worked at lower intensities and higher volumes (typically more sets of relatively more repetitions and here I’m talking about 3-5’s depending on the lift) in the preparation phase only using lower repetitions and higher intensities, nearing maximum near competition.  Generally more assistance or partial movements were used during preparation with more specific competition work done nearer competition.