The "Full" Network
The "full" network isn't.
Mann et al 2008 describe a "full" network consisting of 1209 proxies:
We made use of a multiple proxy (‘‘multiproxy’’) database consisting of a diverse (1,209) set of annually (1,158) and decadally (51) resolved proxy series … Reconstructions were performed based on both the ‘‘full’’ proxy data network and on a ‘‘screened’’ network (Table S1)
The locations of the "full" network are illustrated in their Figure 1, captioned "spatial distribution of full (A) and screened (B)" network. Their Figure S7 compares "long-term CPS NH land (a) and EIV NH land plus ocean (full global proxy network) (b) with and without using tree-ring data". Their Figure S8 compares "long-term CPS NH land (a) and EIV NH land plus ocean (b) reconstructions (full global proxy network)". Their Figure S11 shows "plots of the NH CPS Land reconstruction (A) and EIV Land plus ocean (full global proxy network)".
Table S1 shows breakdown counts by dendro and non-dendro, by annual-resolved and total, by NH, SH and Global. The top portion of Table S1 is excerpted here:
The problem is that their actual calculations appear almost certainly not to have used the network with the count statistics shown in Table S1, but a truncated version of that network, with their "full" network using only 663 (560 - NH) of the 1209 sites (1026- NH). Jean S has run the gridproxy.m program using the FULL option and so we have been able to "prove" this for this part of the program (CPS) through an actual run of the program. (I'm not sure that the EIV portion of the program can be run with present code.)
First here is a table showing counts that compare to the top right counts (NH all) for Table S1. The column headed Matlab shows the counts obtained from the gridproxy.m program, while the column headed "Reported" shows the corresponding column from Table S1. I'll explain the calculation of the other 3 columns below (which I calculated from first principles. For now, you can see that my NH total in the third column matches the counts from gridproxy.m exactly in nearly every case and, in no case, differs by more than 1.)
|
NH Dendro |
NH Other |
NH Total |
Matlab |
Reported |
|
421 |
139 |
560 |
560 |
1036 |
|
297 |
129 |
426 |
427 |
700 |
|
162 |
124 |
286 |
287 |
408 |
|
90 |
120 |
210 |
211 |
277 |
|
62 |
39 |
101 |
102 |
151 |
|
28 |
34 |
62 |
62 |
102 |
|
19 |
30 |
49 |
49 |
77 |
|
11 |
30 |
41 |
41 |
63 |
|
6 |
24 |
30 |
30 |
46 |
|
4 |
21 |
25 |
25 |
40 |
|
4 |
20 |
24 |
24 |
37 |
|
4 |
20 |
24 |
24 |
34 |
|
4 |
19 |
23 |
23 |
33 |
|
4 |
18 |
22 |
22 |
28 |
|
4 |
18 |
22 |
22 |
28 |
|
3 |
18 |
21 |
21 |
27 |
|
2 |
17 |
19 |
19 |
24 |
An Unreported Screening Procedure on the Full Network
The difference appears to result from an unreported screening procedure on the "full" network.
Mann's SI does describe a screening procedure on his "screened" network:
Screening Procedure. To pass screening, a series was required to exhibit a statistically significant (P 0.10) correlation with either one of the two closest instrumental surface temperature grid points over the calibration interval (although see discussion below about the influence of temporal autocorrelation, which reduces the effective criterion to roughly P 0.13 in the case of correlations at the annual time scales)
The implementation of a procedure along these lines can be discerned in gridproxy.m in the case (confusingly) entitled "WHOLE", but which means screened. At one stage of the program, correlation hurdles are defined for various proxy classes (using the proxy class codes to set the hurdle.) "Annual" and "decadally" resolved proxies have different benchmarks. Documentary and speleothems (5000,6000 series) have somewhat different benchmarks (corals - 7000 check for negative correlation). The relevant lines are:
case 'WHOLE'
ia=4; %% annually resolved
ib=5; %% decadally resolved
corra=0.106; %% 9000,8000,7500,4000,3000,2000
corra2=-0.106; %% 7000
corra3=0.136; %% 6000, 5000
corrb=0.337; %% same as above +1
corrb2=-0.337;
corrb3=0.425;
Later in the program, they test the observed correlation for each proxy against the relevant benchmark (their code is very inelegant; they speak Matlab with a heavy Fortran accent). For the "screened" network, corra is 0.106.
for i=1:m1-1 % This is for searching annually-resolved proxies
if (z(3,i)==9000 | z(3,i)==8000 | z(3,i)==7500 | z(3,i)==4000 | z(3,i)==3000 | z(3,i)==2000) &…
x(kk,i+1)>-99999 & x(kkk,i+1)>-99999 &…
z(1,i)>=ilon1 & z(1,i)< =ilon2 & z(2,i)>=ilat1 & z(2,i)< =ilat2 & z(ia,i)>=corra
n=n+1;
…
end
For corals and speleothems, they do it a little differently, testing the absolute value of the correlation:
… z(1,i)>=ilon1 & z(1,i)< =ilon2 & z(2,i)>=ilat1 & z(2,i)< =ilat2 & abs(z(ia,i))>=corra3
Now watch what happens in the "FULL" case. Here's how they set their hurdles. Can you see what happens?
case 'FULL_'
ia=4;
ib=5;
corra=0;
corra2=0;
corra3=0;
corrb=0;
corrb2=0;
corrb3=0;
If a series has a negative correlation to temperature (as about half of the tree ring series do), the test shown above will lead to the exclusion of this series. So this test either intentionally or unintentionally eliminates all the series with negative correlations to gridcell temperature - thus the reduction in NH count from 1036 to 560. Jean S got this count from running gridproxy.m and I got this number by independently comparing reported correlations to a benchmark of 0. So while it's not impossible that I've misinterpreted something here, until we see some alternate explanation, I think that this interpretation should be regarded as valid.
Now this creates problems - both with the statistical significance of what was reported and with the accuracy of the methodological description in the PNAS article, both familiar themes in this corpus.
We've discussed on many occasions that you can "get" a HS merely from picking upward-trending series from networks of red noise (David Stockwell had a good note on this phenomenon on his blog a couple of years ago. My first experiments of this type were on the cherry picks in the original Jacoby network.) Since gridcell temperature series from the mid-19th century to late 20th century by and large have an upward trend, this test is equivalent to a simple pick operation. Whatever the merits of effectively screening the data set for upward trending data, this affects statistical benchmarks, as the null is applying the same sort of operation to red noise networks.
Secondly, whatever one may think of the Mannian PC algorithm (and defenders are reduced to a pretty hard core right now), few people defend the non-reporting of their modification of the PC algorithm and their failure to assess the statistical significance and properties of these changes. (Quite aside from whether they can "get" a HS using some other method.) We're into something pretty similar here. Again there seems to be an unreported screening operation on the "full" network. The value of archiving code is once again demonstrated as the unreported operation can only be discerned through parsing code. Because Mann et al commendably archived code concurrent with publication, it hasn't taken years to identify the issue. (I note that the precise identification of the problem with Mannian PCs also was identified when a little bit of code was inadvertently left in a directory identified for the public after MM03.) So code really does help pin down problems.












The U.S. Financial Crisis:
Sea Ice - End of Game Analysis:
The "Full" Network:
BBC "Climate Wars":
It's Saturday Night Live:
Pielke Jr on Spinning Science:
Code and realcode:
Risk Management Solutions Ltd and the 37 Professors: