How to review what is described in the preface as technical research literature, not pop science?
First, it is well written for what it is (a PhD thesis modified for book publication), with the possible exception of the treatment of the statistics used - and already we are diving into technical matters: any research literature is meant to describe the experimental method employed in sufficient detail that a reader could replicate it. I couldn't do this with regard to the statistical analysis, but I hope that someone versed in the type of statistical models used would be able to. Which leads on to another difficulty: I'm a trained scientist, but I'm a physicist, not a molecular biologist or psychologist, so in some respects I'm on very familiar ground and in others I'm swimming in deep waters. Still, the preface suggests that the general reader should be able to get the gist and I think I did. Be warned, however - people without a scientific background are probably going to find this both very dull and very hard work.
So what's it all about? Testosterone. More precisely, whether foetal testosterone, as measured by its concentration in amniotic fluid, causes sex differences in the social behaviour and language development of infants and toddlers (0 - 4 years old). The motivation for this experiment involves a group of hypotheses, one of which was notable for not being mentioned even once (perhaps the most important error I noted). This tacit assumption is that widely documented sex differences in human psychology are caused by structural differences between female and male brains. It would appear that the relevant scientific community are receptive to this hypothesis but it really is just that: an unproven idea.
This hypothesis has in turn produced several more in a long chain. A key one is the idea that female brains are less "lateralised" than male brains, which is to say female brains show less dominance of one side of the brain over the other with regard to certain types of task, and this in turn explains the observed psychological differences. Two more hypotheses are mutually contradictory ideas about how male and female brains might differ structurally in order to generate the observed lesser "lateralisation" in women. The key one here is the Callosal Hypothesis: the corpus callosum is a major linkage between the two brain hemispheres. If this linkage is bigger in women, then there would be greater communication between hemispheres and therefore less lateralisation, goes this hypothesis. Additionally, it is suggested that foetal testosterone levels are important in determining the size of the corpus callosum. It is expected that the correlation is inverse, i.e. the more foetal testosterone, the smaller the corpus callosum and the greater the lateralisation. Hence the desire to study amniotic testosterone.
This is where the fun starts: the evidence regarding the Callosal Hypothesis (CH) is contradictory. Direct anatomical studies support it. An MRI study finds the opposite. Which brings up the topic of sample sizes - they are tiny! Fairly general and straightforward statistical results show that if you want to talk about a general population that is large but not necessarily exactly known, you need a sample size in excess of 1600 in order to feel very confident about your results. The largest important study mentioned in this book has a sample size of ~100. The smallest has 40. The opposing study is the one mentioned above with a sample size over 100. So the fact is that the reason for conducting the studies reported here - the CH - has no clear case in its favour.
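For a rough feel of where a figure of that order can come from, here is a minimal sketch. It assumes the standard margin-of-error formula for estimating a proportion in a large population, n = z^2 * p * (1 - p) / e^2, with the worst case p = 0.5. The review does not say which statistical result it has in mind, so the formula, the 95% confidence level, the 2.5% margin and the function names are all my own illustrative assumptions. Note too that this is a survey-style calculation; a power analysis for a correlational study would look different.

```python
import math
from statistics import NormalDist


def required_sample_size(margin, confidence=0.95, p=0.5):
    """Sample size needed to estimate a proportion to within +/- margin.

    Assumes a large population and the normal approximation;
    p = 0.5 is the worst case (largest required sample).
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided critical value
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


def margin_of_error(n, confidence=0.95, p=0.5):
    """Margin of error achieved by a sample of size n (same assumptions)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * math.sqrt(p * (1 - p) / n)


if __name__ == "__main__":
    # Illustrative choice: 95% confidence, +/- 2.5% margin.
    print(required_sample_size(0.025))
    # Margins achieved by samples of the sizes mentioned in the review.
    for n in (40, 100, 1600):
        print(n, round(margin_of_error(n), 3))
```

Under these assumed settings the required sample comes out at a little over 1500, while a sample of 100 gives a margin of error of roughly ten percentage points and a sample of 40 roughly fifteen.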
So what did they find out? With sample sizes varying in the range 50 - 100, they determined that there were significant links between amniotic testosterone concentration and infant gaze time for 12-month-olds, both intra-sex and inter-sex, i.e. applying for girls alone, boys alone or both sexes taken together. Gaze time is a measure of how much time the baby spent looking at a parent during a "free play" experiment. The girls spent a lot longer doing it than the boys. At 24 months the toddlers' vocabulary was assessed by means of a parental report. This time there was an overall link between amniotic testosterone level and vocabulary but no link was observable for the sexes individually. At 48 months a group of tests that are not explained in detail (though fully referenced) were performed. These related to communication skills and "restricted interests" - the latter never being precisely defined, though it presumably is in the literature describing the tests fully. The only link to amniotic testosterone found was for boys and restricted interests. It should be noted that the relationships discovered by these studies were not all linear but I'm not going to go into the details.
The authors give an honest assessment of the weaknesses of their study, which, whilst numerous, are extremely difficult to avoid. I'm going to discuss one - the most worrying one and the one that afflicts the entire research area, not just the studies presented here: sample size.
Given what I said above about the validity of statistics of the general population of humans, why are people doing studies with sample sizes one or even two orders of magnitude smaller than the desirable minimum? It's not because they're laughably incompetent. It's because they have to. The discussion of sampling given in this book illustrates why. There were 500 amniocentesis samples in the freezers of the local hospitals. After applying various practical and ethical filters this number was reduced by half. An example of such a filter is eliminating any sample where the foetus or subsequent baby suffered any serious chromosomal abnormality (e.g. Down's Syndrome) or died for any reason at all. On ethical grounds the parents were asked to participate once only (by their GPs) so as not to harass them. The positive response number appears to have been ~80, roughly equally divided by sex. So a potential 500 became fewer than 100.
I conclude from reading this that the evidence in favour of foetal testosterone exposure levels influencing sex differences in human psychology is weak, but not so weak as to be laughed out of court; there's reason enough for further investigation of the hypothesis. It should also be noted that even if the CH is wrong, the foetal testosterone hypothesis may not be.
Finally, my motivation for reading this was to try to understand a hypothesis regarding the cause of autism. This hypothesis is that autism is caused by having an "extreme male" brain, which is to say that autism is just the extreme of the structural differences in brains that cause sex differences in human psychology. The final study is the most telling in this regard: "restricted interests" are one of the diagnostic criteria for Asperger's Syndrome. It should be noted that the authors theorise that the appropriate foetal testosterone exposure level must be accompanied by other genetic factors in order to cause autism. The evidence base for this is even weaker than for the general case of overall sex differences.
If you got through this review without skimming give yourself a round of applause from me!