To generalize somewhat, I think one could say that in American practice, locomotives with four-wheeled pilot trucks drove on the second axle, whilst those with two-wheeled pilot trucks drove on the third axle. In each case that was likely the best trade-off between conflicting requirements, such as avoiding undue angularity of main rod movement on the one hand, and avoiding excessive mass on the other.
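For anyone who likes to see a rule of thumb written down explicitly, here is a minimal Python sketch of the generalization above. The function name `expected_main_axle` is my own invention, it handles only simple three-number Whyte arrangements (no articulateds or duplexes), and it encodes only the rule of thumb, not actual practice for any given class.

```python
def expected_main_axle(whyte: str) -> int:
    """Return the driving axle (1 = leading) predicted by the rule of thumb.

    Four-wheel pilot truck -> second axle; two-wheel pilot truck -> third axle.
    This is a generalization with known exceptions, not a record of what
    any particular locomotive class actually did.
    """
    parts = [int(p) for p in whyte.split("-")]
    if len(parts) != 3:
        raise ValueError("only simple rigid-frame arrangements are handled")

    pilot_wheels, driving_wheels, _trailing_wheels = parts
    driving_axles = driving_wheels // 2

    if pilot_wheels == 4:
        predicted = 2
    elif pilot_wheels == 2:
        predicted = 3
    else:
        raise ValueError("rule of thumb covers two- and four-wheel pilot trucks only")

    if predicted > driving_axles:
        raise ValueError("arrangement has too few driving axles for the rule to apply")
    return predicted


for arrangement in ("4-6-2", "4-8-4", "2-8-2", "2-8-4"):
    print(arrangement, "-> axle", expected_main_axle(arrangement))
```

The interesting cases are of course the ones where actual practice differs from what this returns.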
As already noted, there are exceptions. But some of these tend to reinforce rather than contradict generalized practice.
The SP and UP 4-10-2s and the UP 4-12-2s had their outside cylinders driving the third, not second axles. But in each case, the locomotives had extra-long guides and piston rods that put the crossheads in more-or-less the same position as would have been the case had they had two-wheel pilot trucks and conventional length guides. Essentially they were equipped with four-wheel pilot trucks less because they were perceived as necessary for the desired road speeds, than because they were needed to carry the extra weight of the three-cylinder front ends. Still, UP was convinced of the merits of the four-wheel pilot truck for freight locomotives, and to a first approximation, its “small” Challengers were a “bent” version of the 4-12-2. The small 4-6-6-4 shared the same grate area as the 4-12-2, which tended to run against the idea that there was a sharp division between locomotives with two-wheel trailing trucks and those with four-wheel trailing trucks, the latter being a necessary (but not sufficient) condition for application of the “Super Power” moniker.
And the Challenger had third axle drive for both engine units. For the trailing unit, that aligns fully with the generalization, as the cylinder placement relative to the drivers was akin to that of a locomotive with a two-wheel pilot truck. For the leading unit, third axle drive was achieved through the use of extended piston rods and guides located somewhat rearwards of the cylinders.
A wider look at articulateds generally and the Pennsy duplexii will, I think, tend to support the generalization.
Genuine exceptions included the low-drivered 4-8-2s with third axle drive, as already noted, including the relatively late Bangor & Aroostook example.
2-6-2s had either second or third axle drive, the former usually associated with drivers above 69 inches (and reputedly not very stable in a yaw sense) and the latter with 63 inch drivers.
4-4-2s had either front or rear axle drive; in the former case with extended spacing between the trailing pilot-truck axle and the leading driving axle.
2-6-0s usually had centre axle drive, but with longer driving wheelbases than other 6-coupled types with similar driving wheel diameters. E.g. compare the SP 2-6-0 with the Milwaukee 2-6-2.
Re the 69 and 70 inch drivered Berkshires, third axle drive is consistent with the generalization; for example the rod layout essentially aligns with that of the rear unit of the UP 4-8-8-4.
That Berkshires were a problem might be a wider issue than just the drive axle choice, at least for the 63 inch drivered set. Notwithstanding the “hype” associated with the Lima A-1 and its progeny, it does not escape notice that the fate of the type on several roads that purchased it is not quite consistent with what might be expected for a “wonder machine”. Thus for example the B&M opted for a big 4-8-2 rather than more 2-8-4s in the post-depression era. The IC also opted for the 4-8-2 type, and in its major post-depression rebuilding program, seemed to be a bit diffident about what to do with its 2-8-4 fleet when the 4-6-4 conversion idea did not work out, whereas rebuilding its large 2-8-2 fleet in kind seemed to be a key activity. The Mopac rebuilt some of its 2-8-4s into 4-8-4s. And the Santa Fe seemed to be less sure as to where its 2-8-4 fleet fitted in later times, so it escaped any major modernization.
I suspect that the problem was that the 2-8-4 was more powerful than a 2-8-2 of similar adhesive weight, which meant that it would balance at higher speed with any given train weight, but not start a heavier train than the 2-8-2. But satisfactory running at those higher speeds required larger diameter driving wheels, hence the move up to 69 and 70 inches. However, then realizable speeds were probably beyond what some roads were prepared to accept for locomotives with two-wheeled pilot trucks, hence the preference in some cases for large 4-8-2s. And with the latter wheel arrangement, with its higher safe speeds, 73 and 74 inch drivers were feasible if desired.
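To put rough numbers on that argument, here is a minimal Python sketch. Everything in it is an assumption for illustration only: the factor of adhesion of 4, the flat 5 lb/ton rolling resistance (real resistance rises with speed), and the two drawbar horsepower figures, which are not data for any actual 2-8-2 or 2-8-4. The two relations I am relying on are the usual ones: horsepower = pull (lbf) x speed (mile/h) / 375, and 20 lb per ton of grade resistance for each 1% of grade.

```python
# All figures are illustrative assumptions, not data for any real locomotive.
HP_CONSTANT = 375.0                   # hp = pull (lbf) * speed (mile/h) / 375
GRADE_LB_PER_TON_PER_PERCENT = 20.0   # 1% of a 2000 lb ton


def starting_tractive_effort(adhesive_weight_lb, factor_of_adhesion=4.0):
    """Adhesion-limited starting pull in lbf (factor of adhesion is assumed)."""
    return adhesive_weight_lb / factor_of_adhesion


def train_resistance(train_tons, grade_percent, rolling_lb_per_ton=5.0):
    """Crude train resistance in lbf, treated as independent of speed."""
    per_ton = rolling_lb_per_ton + GRADE_LB_PER_TON_PER_PERCENT * grade_percent
    return train_tons * per_ton


def balancing_speed(drawbar_hp, train_tons, grade_percent):
    """Speed in mile/h at which power-limited pull equals train resistance."""
    return HP_CONSTANT * drawbar_hp / train_resistance(train_tons, grade_percent)


ADHESIVE_WEIGHT_LB = 270_000   # same weight on drivers for both engines
TRAIN_TONS = 4000              # hypothetical train
GRADE_PERCENT = 0.5            # hypothetical ruling grade

# Same adhesive weight means the same starting pull, whatever the boiler.
print("starting pull, lbf:", starting_tractive_effort(ADHESIVE_WEIGHT_LB))

# Hypothetical drawbar horsepower for a smaller and a larger boiler.
for label, hp in (("lower-power engine", 3000), ("higher-power engine", 4000)):
    speed = balancing_speed(hp, TRAIN_TONS, GRADE_PERCENT)
    print(f"{label}: balances at about {speed:.1f} mile/h")
```

The point is only the shape of the result: the extra power buys speed with a given train, not a heavier train at starting.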
In fact one might undertake a paper comparison of the relative utilities of a set of 8-coupled locomotives, all normalized to say 270 000 lb on drivers. A possible set of examples could be:
2-8-2 63 inch: A thoroughly modernized USRA heavy, e.g. Atlanta & West Point.
2-8-4 69/70 inch: The Van Sweringen design, e.g. Wheeling & Lake Erie.
4-8-2 73/74 inch: The B&M design.
4-8-4 73/74 inch: The Alco WWII standard, say in Rock Island form.
4-8-4 80 inch: The UP FEF-2.
Of that list, the UP FEF-2 was perhaps in a class of its own when it comes to very fast passenger haulage with 100 mile/h+ capability.
At the other end, the 2-8-2 was probably nicely balanced for freight haulage where moderate balancing speeds were acceptable; it would start the same weight of train as the others, and might have been an effective helper for the 73/74 inch 4-8-2 and 4-8-4 on grades where it was desired to double the low speed tractive effort but not the power.
For fast freight in lowish grade territory, say not much above 1%, the 73/74 inch 4-8-4 would be about right. (For steeper grades, a 4-6-6-4 might have been considered, particularly if one did not want to sacrifice too much speed capability on level sections within heavy grade divisions.)
That leaves the 4-8-2 and 2-8-4. An obvious role for both would have been on divisions where bridge loading constraints, etc., would not allow the use of a 4-8-4 because of its total weight. But which one? With the same adhesive weight, they would both have started about the same trailing loads, but with lower balancing speeds than the 4-8-4, and perhaps with the 2-8-4 having a slight speed edge over the 4-8-2 with the heavier trains. But the 4-8-2 would have matched the 4-8-4 for top operating speed, albeit with a lighter load. On the other hand, I suspect that some roads at least would have had an aversion to running locomotives with two-wheel pilot trucks at much above 50 mile/h, even if the capability to go materially faster than this without overstressing the machinery was there, as it probably was for the Van Sweringen design.
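As a starting point for such a paper comparison, the candidate set could be tabulated along the following lines. This is only a sketch: the `Candidate` class and the 50 mile/h policy figure are my own illustrative assumptions (the latter taken loosely from the remark above, not from any road's rulebook), and the list carries no weights, powers or speeds beyond what is stated in this post.

```python
from dataclasses import dataclass

# Hypothetical operating-policy figure, not an actual rule on any road.
TWO_WHEEL_PILOT_CAP_MPH = 50


@dataclass(frozen=True)
class Candidate:
    whyte: str       # wheel arrangement
    drivers_in: str  # nominal driver diameter(s) in inches, as listed above
    example: str     # representative design

    @property
    def pilot_wheels(self) -> int:
        # First figure of the Whyte notation is the pilot truck wheel count.
        return int(self.whyte.split("-")[0])


CANDIDATES = [
    Candidate("2-8-2", "63", "modernized USRA heavy, e.g. Atlanta & West Point"),
    Candidate("2-8-4", "69/70", "Van Sweringen design, e.g. Wheeling & Lake Erie"),
    Candidate("4-8-2", "73/74", "B&M design"),
    Candidate("4-8-4", "73/74", "Alco WWII standard, Rock Island form"),
    Candidate("4-8-4", "80", "UP FEF-2"),
]


def speed_restricted(candidates):
    """Candidates a cautious road might hold to the assumed pilot-truck cap."""
    return [c for c in candidates if c.pilot_wheels == 2]


for c in CANDIDATES:
    if c.pilot_wheels == 2:
        note = f"possibly held to about {TWO_WHEEL_PILOT_CAP_MPH} mile/h by policy"
    else:
        note = "no pilot-truck speed restriction assumed"
    print(f"{c.whyte} ({c.drivers_in} inch): {note}")
```

On such a policy the 2-8-4 loses its speed advantage over the 2-8-2 in practice, which is really the nub of the 2-8-4 versus 4-8-2 question.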
I haven’t taken a more holistic look at the relative numbers of 2-8-4s and 4-8-2s built since 1924, when the A-1 arrived, nor at their dispositions and inferred standing in their respective owners’ eyes, but I suspect that if one did that, then the 4-8-2 might come out as the dark horse that won by virtue of greater general utility. But as a type, and particularly in its later (say post-1925) form, it has not been subject to the same coherent treatment in the literature as either the 2-8-4 group in toto or its Van Sweringen subgroup. Heresy perhaps...
Cheers,