Climate Audit

by Steve McIntyre

 

The "Full" Network

The "full" network isn't.

Mann et al 2008 describe a "full" network consisting of 1209 proxies:

We made use of a multiple proxy (‘‘multiproxy’’) database consisting of a diverse (1,209) set of annually (1,158) and decadally (51) resolved proxy series … Reconstructions were performed based on both the ‘‘full’’ proxy data network and on a ‘‘screened’’ network (Table S1)

The locations of the "full" network are illustrated in their Figure 1, captioned "spatial distribution of full (A) and screened (B)" network. Their Figure S7 compares "long-term CPS NH land (a) and EIV NH land plus ocean (full global proxy network) (b) with and without using tree-ring data". Their Figure S8 compares "long-term CPS NH land (a) and EIV NH land plus ocean (b) reconstructions (full global proxy network)". Their Figure S11 shows "plots of the NH CPS Land reconstruction (A) and EIV Land plus ocean (full global proxy network)".

Table S1 shows breakdown counts by dendro and non-dendro, by annual-resolved and total, by NH, SH and Global. The top portion of Table S1 is excerpted here:

The problem is that their actual calculations appear almost certainly not to have used the network with the count statistics shown in Table S1, but a truncated version of that network, with their "full" network using only 663 (560 - NH) of the 1209 sites (1026- NH). Jean S has run the gridproxy.m program using the FULL option and so we have been able to "prove" this for this part of the program (CPS) through an actual run of the program. (I'm not sure that the EIV portion of the program can be run with present code.)

First here is a table showing counts that compare to the top right counts (NH all) for Table S1. The column headed Matlab shows the counts obtained from the gridproxy.m program, while the column headed "Reported" shows the corresponding column from Table S1. I'll explain the calculation of the other 3 columns below (which I calculated from first principles. For now, you can see that my NH total in the third column matches the counts from gridproxy.m exactly in nearly every case and, in no case, differs by more than 1.)

NH Dendro

NH Other

NH Total

Matlab

Reported

421

139

560

560

1036

297

129

426

427

700

162

124

286

287

408

90

120

210

211

277

62

39

101

102

151

28

34

62

62

102

19

30

49

49

77

11

30

41

41

63

6

24

30

30

46

4

21

25

25

40

4

20

24

24

37

4

20

24

24

34

4

19

23

23

33

4

18

22

22

28

4

18

22

22

28

3

18

21

21

27

2

17

19

19

24

An Unreported Screening Procedure on the Full Network
The difference appears to result from an unreported screening procedure on the "full" network.

Mann's SI does describe a screening procedure on his "screened" network:

Screening Procedure. To pass screening, a series was required to exhibit a statistically significant (P 0.10) correlation with either one of the two closest instrumental surface temperature grid points over the calibration interval (although see discussion below about the influence of temporal autocorrelation, which reduces the effective criterion to roughly P 0.13 in the case of correlations at the annual time scales)

The implementation of a procedure along these lines can be discerned in gridproxy.m in the case (confusingly) entitled "WHOLE", but which means screened. At one stage of the program, correlation hurdles are defined for various proxy classes (using the proxy class codes to set the hurdle.) "Annual" and "decadally" resolved proxies have different benchmarks. Documentary and speleothems (5000,6000 series) have somewhat different benchmarks (corals - 7000 check for negative correlation). The relevant lines are:

case 'WHOLE'
ia=4; %% annually resolved
ib=5; %% decadally resolved
corra=0.106; %% 9000,8000,7500,4000,3000,2000
corra2=-0.106; %% 7000
corra3=0.136; %% 6000, 5000
corrb=0.337; %% same as above +1
corrb2=-0.337;
corrb3=0.425;

Later in the program, they test the observed correlation for each proxy against the relevant benchmark (their code is very inelegant; they speak Matlab with a heavy Fortran accent). For the "screened" network, corra is 0.106.

for i=1:m1-1 % This is for searching annually-resolved proxies
if (z(3,i)==9000 | z(3,i)==8000 | z(3,i)==7500 | z(3,i)==4000 | z(3,i)==3000 | z(3,i)==2000) &…
x(kk,i+1)>-99999 & x(kkk,i+1)>-99999 &…
z(1,i)>=ilon1 & z(1,i)< =ilon2 & z(2,i)>=ilat1 & z(2,i)< =ilat2 & z(ia,i)>=corra
n=n+1;

end

For corals and speleothems, they do it a little differently, testing the absolute value of the correlation:

… z(1,i)>=ilon1 & z(1,i)< =ilon2 & z(2,i)>=ilat1 & z(2,i)< =ilat2 & abs(z(ia,i))>=corra3

Now watch what happens in the "FULL" case. Here's how they set their hurdles. Can you see what happens?

case 'FULL_'
ia=4;
ib=5;
corra=0;
corra2=0;
corra3=0;
corrb=0;
corrb2=0;
corrb3=0;

If a series has a negative correlation to temperature (as about half of the tree ring series do), the test shown above will lead to the exclusion of this series. So this test either intentionally or unintentionally eliminates all the series with negative correlations to gridcell temperature - thus the reduction in NH count from 1036 to 560. Jean S got this count from running gridproxy.m and I got this number by independently comparing reported correlations to a benchmark of 0. So while it's not impossible that I've misinterpreted something here, until we see some alternate explanation, I think that this interpretation should be regarded as valid.

Now this creates problems - both with the statistical significance of what was reported and with the accuracy of the methodological description in the PNAS article, both familiar themes in this corpus.

We've discussed on many occasions that you can "get" a HS merely from picking upward-trending series from networks of red noise (David Stockwell had a good note on this phenomenon on his blog a couple of years ago. My first experiments of this type were on the cherry picks in the original Jacoby network.) Since gridcell temperature series from the mid-19th century to late 20th century by and large have an upward trend, this test is equivalent to a simple pick operation. Whatever the merits of effectively screening the data set for upward trending data, this affects statistical benchmarks, as the null is applying the same sort of operation to red noise networks.

Secondly, whatever one may think of the Mannian PC algorithm (and defenders are reduced to a pretty hard core right now), few people defend the non-reporting of their modification of the PC algorithm and their failure to assess the statistical significance and properties of these changes. (Quite aside from whether they can "get" a HS using some other method.) We're into something pretty similar here. Again there seems to be an unreported screening operation on the "full" network. The value of archiving code is once again demonstrated as the unreported operation can only be discerned through parsing code. Because Mann et al commendably archived code concurrent with publication, it hasn't taken years to identify the issue. (I note that the precise identification of the problem with Mannian PCs also was identified when a little bit of code was inadvertently left in a directory identified for the public after MM03.) So code really does help pin down problems.

Code and realcode

Mann et al (PNAS 2008) states:

All data used in this study and Matlab source codes for implementing the algorithms described, as well as additional supporting information are available at www.meteo.psu.edu/~mann/supplements/MultiproxyMeans07 .

They did make a much better effort than others in the field, but unfortunately the reality falls short of the PNAS representation. They have already coopered up some defects in the original documentation in response to observations made at Climate Audit e.g. by adding "original" data to the archive in early September (notably failing to acknowledge us, presumably following the example set by industry leader, James Hansen).

One of the sections of code that was added post-publication was the directory www.meteo.psu.edu/~mann/supplements/MultiproxyMeans07/code/codeprepdata/infillproxydata/ , which contained code for the infilling of proxy data. There's something interesting about the code in this directory.

First take a look at the programs in the directory www.meteo.psu.edu/~mann/supplements/MultiproxyMeans07/code/coderecon/ supposedly containing code to do the EIV reconstruction. This contains 14 programs:

center.m gcvfctn.m gcvridge.m iridge.m mridge.m nancov.m nanmean.m nanstd.m nansum.m newregem.m peigs.m pttls.m regrechigh.m regreclow.m

Now return to the directory www.meteo.psu.edu/~mann/supplements/MultiproxyMeans07/code/codeprepdata/infillproxydata/ which supposedly containing code to do proxy infilling. It contains 24 programs, of which 14 program names are identical to the names in the recon code directory. In a quick spot check, it appears that it is not just the names of the programs that are identical, but the programs themselves are identical, even up to the file names of output.

The recon code directory lacks a readme and lacks any programs comparable to the above 3 programs which actually carry the water in the infillproxydata operations. My partial error here. As observed by a reader below, the recon code directory does contain a readme which states the following:

This directory contains all the codes used for reconstuction. Users need to run regrechigh.m for the reconstruction of high-frequency band, and run regreclow.m for the reconstruction of low-frequency band. The other codes are subroutines that have brief description of the function of the codes.

The infillproxydata code directory includes a Readme, which states that the following 3 programs need to be run in order to execute the operations:

1) trimdata1227nolu.m 2)masterinfillproxy.m 3)backsiginfilled1227nolu.m

Given that the programs regrechigh.m and regreclow.m in the recon code directory are identical to the corresponding programs in the infilledproxydata directory, it seems odd that the same program can be used in both places in an identical way and that covering programs are needed in the one case, but not the other. Thoughts on this welcomed.

Proxy Counts: Today's Crossword Puzzle
Table S1 contains a table showing the counts of proxies starting before specified date. There are 12 such tables. I've excerpted the tables at the bottom of the first page of the SI showing the counts of proxies (GLB), the 3 columns on the left show "annually" resolved proxies, the 3 columns on the right "all" proxies. The classification into "annual" is not provided anywhere and appears to have some traps for the unwary.

I tried to replicate the counts shown in the 3 columns to the left (code shown in a comment below). My counting method replicated the top line counts of the 3 right-hand columns.

The "dendro" count (!032) equals the total of code -9000 and code -7500 proxies. In passing, this means that there is another Mannian trap for the unwary: some dendro series are classified as non-dendro - the ones with code 3000. These are two of Cook's Tasmanian dendro series, the Jacoby-D'Arrigo Mongolian series and the Taymyr series and are all long series (through the MWP). One feels that this detracts somewhat from whatever point was being made in the "non-dendro" reconstructions. Luterbacher (instrumental in their later portion) are included among the "other" proxies, but are really sui generis.

However, as shown below, I was unable to replicate the counts for any earlier period. I place this before you as today's crossword puzzle. Given that observed counts exceed Mann counts in most periods, the discrepancy is a real braintwister.

1032

177

1209

926

154

1080

553

145

698

317

139

456

197

58

255

117

52

169

90

47

137

73

45

118

58

37

95

43

34

77

31

31

62

27

30

57

24

29

53

19

28

47

17

25

42

11

25

36

10

24

34

10

24

34

9

23

32

More on the MXD Data Set

On Sep 9, 2008, I sent an FOI request to the University of East Anglia, requesting a copy of the MXD data set as provided to Mann et al. Today (Oct 2, 2008), I was notified that they would provide this data and, sure enough the data is now posted (Oct 2, 2008) at www.cru.uea.ac.uk/~timo/datapages/mxdtrw.htm under the heading Rutherford et al 2008.

There are some puzzles.

The website reports the use of 341 sites, in Rutherford et al 2005, while the text of Rutherford et al reports the use of 387 sites. So one or the other is incorrect. An earlier article (Briffa et al, Holocene, 2002a) used 387 sites, listed here. I presume that the website is correct and the article is wrong and that a small corrigendum on this matter should be issued. All 341 sites said to have been used in Rutherford et al 2005 are included in the list of 387. Why were 46 sites removed from the network? Surely some sort of explanation should be provided.

The website (as of Oct 2, 2008) states:

The values after 1960 are a combination of information from high-frequency MXD variations and low-frequency instrumental temperature variations. We recommend, therefore, that the post-1960 values be deleted or ignored in any analysis that might be biased by the inclusion of this observed temperature information, such as the calibration of these data to form a climate reconstruction, or comparision of these data with instrumental climate observations for the purpose of assessing the ability of these data to represent temperature variability.

This is also a very puzzling comment as Rutherford et al 2005 nowhere mentions the blending of instrumental temperature variations back into proxy data after 1960. And if this was done, it is rather troubling. The explanation in Rutherford et al 2005 (Briffa, Osborn being coauthors with the Mann crowd) said:

Because the age-banding method requires large numbers of samples throughout the time period being studied, it has been applied only at a regional scale for the MXD network used here, rather than at the level of the 387 original site chronologies.

OSB therefore worked first with the traditionally standardized data at the individual chronology scale and gridded them to provide values in 115 5° by 5° grid boxes (26 available back to A.D. 1400) in the extratropical NH (Fig. 1b). They then developed temperature reconstructions by the local calibration of the MXD grid-box data against the corresponding instrumental grid-box temperatures.

The “missing” low-frequency temperature variability was then identified as the difference between the 30-yr smoothed regional reconstructions of Briffa et al. (2001) and the corresponding 30-yr smoothed regional averages of the gridded reconstructions. OSB add this missing low-frequency variability to each grid box in a region. After roughly 1960, the trends in the MXD data deviate from those of the collocated instrumental grid-box SAT data for reasons that are not yet understood (Briffa et al. 1998b, 2003; Vaganov et al. 1999). To circumvent this complication, we use only the pre-1960 instrumental record for calibration/cross validation of this dataset in the CFR experiments.

As I read the above, it describes a sort of coercion of the individual gridded series at "low frequency" to the corresponding "low frequency" shape of the regional ABD data (which is available at WDCP by the way). However it's hard to be sure right now.

Here is a plot of average of the Briffa-Osborn gridded data, with dotted red line showing the part deleted in Mann et al 2008 (where Osborn and Briffa are coauthors).

Note that Briffa and Osborn also archived today various data used in IPCC AR4 graphics - previously unavailable in these versions.

It's Saturday Night Live

When you're trading in puts and calls (or derivatives), it's important to know the sign of the relationship between the value of the derivative and the market. Short positions will go up in value as the market goes down. And, unfortunately, you don't get to decide afterwards whether you wanted to be short or long. Proxies in climate can, in a sense, either be "short" or "long", in the sense that the values of some proxies (e.g. coral dO18) are said to go down with higher temperatures, while the values of other proxies (e.g. ice core dO18) are said to go up with higher temperatures.

One feels that it is not asking too much of paleoclimatologists to know the expected sign of a proxy derivative. Traders would like to decide on the sign of a proxy derivative after the fact, by taking a correlation to market performance, but this luxury is denied to them, as it should be denied to climate scientists.

In Mann et al 2008, there is a truly remarkable example of opportunistic after-the-fact sign selection, which, in addition, beautifully illustrates the concept of spurious regression, a concept that seems to baffle signal mining paleoclimatologists. For this example, we turn to the highly HS-shaped Finnish sediment series of Tiljander et al 2003.

Tiljander et al cored varved sediments from Lake Korttajarvi, Finland, going back through most of the Holocene. In Tiljander et al 2003, they distinguished the amount of mineral and organic matter in each varve. The basis for using mineral and organic matter as climate proxies is set out as follows:

The amounts of inorganic and organic matter, form the basis of the climate interpretations. Periods rich in organic matter indicate favourable climate conditions, when less snow accumulates in winter by diminished precipitation and/or increased thawing, causing weaker spring flow and formation of a thin mineral layer. In addition, a long growing season thickens the organic matter. More severe climate conditions occur with higher winter precipitation, a longer cold period and rapid melting at spring, shown as thicker mineral matter within a varve.

The caption to their Figure 5 reports the following link between X-ray density and their climate mechanism:

High X-ray density corresponds to high amount of mineral matter (light grey value tints in X-ray film) and low X-ray density corresponds to dark grey values caused by a higher proportion of organic matter.

Putting the two paragraphs together: warmer climate favors more organic material and thus a low X-ray density. In order to show warm values at the top of a graph, you need to invert the plot (i.e. you have to pay attention to the sign of your climate derivative.)

In the figure below, on the left, I show an excerpt from their Figure 5 which they show vertically (only the X-ray density is shown here - consult the original paper for the other plots.) The left portion of their Figure 5 shows an organic-rich period in the MWP, about which they say:

An organic rich period from AD 980 to 1250 in the Lake Korttajarvi record is chronologically comparable with the well-known ‘Medieval Warm Period’ (e.g. Lamb 1965; Grove & Switsur 1994; Broecker 2001). The sediment structure changes, less mineral material accumulates on the lake bottom than at any other time in the 3000 years sequence analysed and the sediment is quite organic rich (LOI ~20%). Thus, the winter snow cover must have been negligible, if it existed at all, and spring floods must have been of considerably lower magnitude than during the instrumental period (since AD 1881). According to the scenarios presented by Solantie & Drebs (2001), a 2°C increase in winter temperature would decrease the amount of snow in southern Finland significantly. Under such conditions, winter snow accumulation and intense spring floods would be rare events….

The Lake Korttajarvi record also indicates a climatically more severe period in the 17th century. Two periods, AD 1580–1630 and AD 1650–1710, are marked by an increase in both sedimentation (varve thickness) and mineral matter accumulation (relative Xray density). Also, magnetic susceptibility values are high between AD 1650 and 1710, indicating increasing mineral matter input into the lake.

They cite literature, including Hulden 2001, showing mild conditions in Finland in the MWP.

On the right, I've plotted the corresponding data so that "warm" grey values are on the top. I've also highlighted the (MWP) period identified as having elevated values of organic matter. If you squint, you can satisfy yourself that the left-hand and right-hand panels are showing the same data.

   

Fig. 1. Left from Tiljander et al 2003 Figure 5; right - plot of X-ray density (inverted).

Plotted according to the climatic interpretation offered by Tjilander et al, the modern warm period shows as colder than the Little Ice Age, something which makes no sense if this data is to be used as a climate proxy. Tiljander et al provide a plausible interpretation of the "divergence" of the proxy from its climatic interpretation as a result of agricultural and construction disturbance to sediment patterns, actually tying several especially thick varves to ditch and bridge construction:

This recent increase in thickness is due to the clay-rich varves caused by intensive cultivation in the late 20th century. …

There are two exceptionally thick clay-silt layers caused by man. The thick layer of AD 1930 resulted from peat ditching and forest clearance (information from a local farmer in 1999) and the thick layer of AD 1967 originated due to the rebuilding of the bridge in the vicinity of the lake’s southern corner (information from the Finnish Road Administration).

Now let's see what Mann et al did with this data. All of the 20th century values of varve thickness, X-ray density etc go up like crazy with the agricultural and construction activities as shown below for 2 of the 4 series (the other two are similar). Instead of using the climatic interpretation of the data described by Tiljander et al, Mann correlates the increases in varve thickness and changes in density and color, originating from local construction and farming, to world climate.

   

Figure 2. Two of 4 versions used in Mann et al 2008

By flipping the data opposite to the interpretation of Tiljander et al, Mann shows the Little Ice Age in Finland as being warmer than the MWP, 100% opposite to the interpretation of the authors and the paleoclimate evidence. The flipping is done because the increase in varve thickness due to construction and agricultural activities is interpreted by Mann et al as a "nonlocal statistical relationship" or "teleconnection" to world climate. Mann:

the EIV approach, which makes use of nonlocal statistical relationships, allowing temperature changes over distant regions to be effectively represented through their covariance with climatic changes recorded by the network

A more convincing example of spurious regression in "peer reviewed" literature will be hard to find. After reading through this, I keep expecting someone to say:

Live from New York, it's Saturday Night.

Reference:
TILJANDER, MIA, M. SAARNISTO, AEK OJALA, and T. SAARINEN. 2003. A 3000-year palaeoenvironmental record from annually laminated sediment of Lake Korttajarvi, central Finland. Boreas 32, no. 4: 566-577.

Mann et al 2008 and GAAP Accounting

Let's say that you were a partner of the firm North, Hegerl and Cicerone and charged with issuing an opinion on the financial statements of Team Capital Management Inc.(TCM) And let's say that you were doing so in heady pre-crash days when markets were going up and mark-to-market accounting was something that the companies wanted to do.

The footnotes to the TCM statements said that mark-to-market accounting had not been used for non-arms-length investments in securities issued by Briffa MXD Inc. which had shown a lack of "sensitivity" to rising world stock market prices, while mark-to-market accounting had been used for a high-flying Finnish penny stock (Kortajarvi Lake Gold). And while TCM statements did not mention their holdings in controversial Bristlecone Estates, a project that was much in the news, it turned out that mark-to-market accounting had been used on this speculation as well.

In the MD&A (Management Discussion and Analysis), the company said that they were profitable even without their investments in U.S. mortgages. Elsewhere in the footnotes, they said that TCM was profitable even without mark-to-market accounting in the Finnish penny gold stock. However, the combined effect wasn't discussed.

Can the partners in North, Hegerl and Cicerone certify that these statements meet GAAP? Of course they can't.

If you're using mark-to-market accounting on the penny gold stock that went up, then you have to use mark-to-market accounting on the shares in Briffa MXD Inc that went down. While disclosure is important, you can't disclose your away around this sort of inconsistent non-GAAP accounting. You can't use mark-to-market accounting on stocks that went up and not do so for stocks that went down.

In case this parable seems too harsh, here are the exact statements from Mann et al 2008. Obviously the precise issue of "mark-to-market" accounting doesn't arise: I use this as a parable for inconsistent accounting for data that goes up as opposed to data that goes down,

Accounting for MXD
Mann et al say of their handling of the Briffa MXD series (which "diverge" from rising world markets):

Because of the evidence for loss of temperature sensitivity after ~1960, MXD data were eliminated for the post-1960 interval.

In 1998, Briffa postulated an unknown anthropogenic cause for the post-1960 downturn, but nothing has turned up 10 years later. An open alternative is that there is a non-linear upside-down U-shaped temperature response, an alternative referred to even in IPCC AR4 as follows:

Others, however, argue for a breakdown in the assumed linear tree growth response to continued warming, invoking a possible threshold exceedance beyond which moisture stress now limits further growth (D’Arrigo et al., 2004). If true, this would imply a similar limit on the potential to reconstruct possible warm periods in earlier times at such sites. At this time there is no consensus on these issues (for further references see NRC, 2006)

An opinion that you'd think that the auditing firm of North, Hegerl and Cicerone would be aware of.

In respect to the accounting of Mann et al, they truncated this data merely because of a potential non-climate anthropogenic impact on the data, an impact which is postulated, but not proven.


Accounting for Finnish Speculative Stocks

The Mann et al 2008 footnotes say of the speculative Finnish sediments:

These records include the four Tijander et al. (12) series used (see Fig. S9) for which the original authors note that human effects over the past few centuries unrelated to climate might impact records (the original paper states ‘‘Natural variability in the sediment record was disrupted by increased human impact in the catchment area at A.D. 1720.’’ and later, ‘‘In the case of Lake Korttajarvi it is a demanding task to calibrate the physical varve data we have collected against meteorological data, because human impacts have distorted the natural signal to varying extents’’)…

We therefore … compaired [sic] the reconstructions both with and without the above seven potentially problematic series, as shown in Fig. S8.

Unlike their handling of Briffa MXD data which was truncated merely because of a potential non-climate anthropogenic impact (without reporting on the impact of this truncation), Mann et al included the Finnish sediment (which went up), purporting to justify this by arguing that they were still profitable without the data. A policy which, regardless of its individual merit or lack of merit, is inconsistent with the handling of the Briffa MXD data which went down.

It's actually worse here, because the original authors do not merely say that human effects unrelated to climate "might" impact the record. They categorically say that they were so caused. No "might" about it. For example:

This recent increase in thickness is due to the clay-rich varves caused by intensive cultivation in the late 20th century. …

There are two exceptionally thick clay-silt layers caused by man. The thick layer of AD 1930 resulted from peat ditching and forest clearance (information from a local farmer in 1999) and the thick layer of AD 1967 originated due to the rebuilding of the bridge in the vicinity of the lake’s southern corner (information from the Finnish Road Administration).

There is another shoe to drop in the handling of the Finnish sediments, which I will discuss in my next post. It appears to me that Mann et al actually have a short position on the Finnish speculative stocks, which they have inadvertently accounted for as a long position. But more on this in another post.

Accounting for Subprime Mortgages on Bristlecone Estates
A consulting report by North and associates, also sponsored by Cicerone, said that subprime mortgages on Bristlecone Estates, in particular, "strip bark" derivatives, should be "avoided". However, Mann et al 2008 nowhere mention that strip bark derivatives remain on their books.

GAAP Accounting
It's one thing to object individually to (1) the truncation of post-1960 Briffa MXD data due to a potential anthropogenic disturbance (which may actually be a non-linear response); (2) the use of the 20th century Graybill bristlecone chronologies despite (a) potential non-climate anthropogenic fertilization or (b) recommendations by the NAS panel not to use strip bark derivatives; (3) the use of 20th century Finnish sediment data despite an established non-climate anthropogenic disturbance.

However, the combination of the three is much worse than any one of these individually. The inconsistency of (1) and (3) is particularly opportunistic.

I make no imputation as to intent on the part of the authors, as intent is irrelevant (and, as is blog policy, I ask readers not to speculate on intent either). All that matters is what is presented. Auditors confronted with opportunistic accounting policies of this type cannot certify that the statements comply with GAAP. If North, Hegerl and Cicerone were chartered accountants and certified that such inconsistent and opportunistic practices complied with GAAP, they would be treated very severely by any supervisor of accounting practices.

The MBH98 Corrigendum

The SI for MBH98 listed 34 tree ring series that were not actually used. This was acknowledged in their 2004 Corrigendum which provided the following implausible excuse:

These series, all of which come from the International Tree Ring Data Bank (ITRDB), met all the tests used for screening of the ITRDB data used in ref. 1 (see ref. 5), except one—namely, that in 1997, either it could not be ascertained by the authors how these series had been standardized by the original contributors, or it was known that the series had been aggressively standardized, removing multidecadal to century-scale fluctuations.

A number of the unused series were generated by Schweingruber. Other Schweingruber series arising from the same collection and publication were used in MBH98, making the application of the Corrigendum excuse to the Schweingruber series highly implausible to say the least. Other unused series were generated by prominent dendrochronologists in published articles, whose methodologies were either published or easily available to MBH. Whatever the actual reason for the omission of the 34 series, the above reason seemed highly implausible.

On a small whim, I cross-checked the Mann et al 2008 against the Corrigendum exclusions.

25 of the 34 series said to be unsuitable in the Corrigendum were used directly in Mann et al 2008; 4 excluded Schweingruber MXD series are used in the MXD grid. The only series not used in Mann et al 2008 are 4 Schweingruber RW series and one oddball (chil016, a Roig series - other Roig series are used.)

For 23 of the 25 series, the versions used in Mann et al 2008 are identical to the ITRDB version, previously said to be unsuitable in the Corrigendum. For the other 2 (arge065, cana106), the version differences were not material.

Wouldn't it be nice…

if Team methodological descriptions were correct?

Mann says:

All ITRDB tree-ring proxy series were required to pass a series of minimum standards to be included in the network: (i) series must cover at least the interval 1750 to 1970, (ii) correlation between individual cores for a given site must be 0.50 for this period, (iii) there must be at least eight samples during the screened period 1800–1960 and for every year used.

I checked two series familiar to CA readers - Gaspe (cana036) and Sheep Mountain (ca534). Aside from any other issues with these series, the early portion of the Gaspe chronology has only a couple of samples as does the post-1983 portion of the Sheep Mountain chronology. I checked these series in both the "original" and "infilled" versions. The "original" version transcribed ITRDB crn series, including portions with fewer than 8 samples. The "infilled" version was identical to the "original" version where the original version had values. Thus, contrary to the statement in the SI, the "infilled" versions used data with only one sample.

Does it "matter"? Well, why say it if it's untrue?

When Good Proxies Go Bad

Many of the good folks who write the papers and keep the databases seem not to use their naked eyeballs. By that I mean, they seriously think that you can invent some new procedure, and then apply it across the board to transform a group of a thousand datasets without looking at each and every dataset to see what effect it might have. Or alternatively, they simply grab a great whacking heap of proxies and toss them into their latest statistical meat grinder. They take a hard look at what comes out of the grinder … but they seem not to have applied the naked eyeball to the proxies going into the grinder.

So, I thought I might remedy that shortcoming for the proxies used in Mann 2008. As a first cut at understanding the proxies, I use a three-pronged approach — I examine a violinplot, a "q-q" plot, and a conventional plot (amplitude vs. time) of each proxy. To start with, let's see what some standard distributions look like. The charts below are in horizontal groups of three (violinplot, qq plot, and amp/time) with the name over the middle (qq) chart of each group of three.

A "violinplot" (left of each group of three, yellow) can be thought of as two smoothed histograms back-to-back. In addition, the small boxplot in the center of the violinplot shows interquartile range as a small black rectangle. The median of the dataset is shown by the red dot. Below each violinplot are the best estimates of the AR and MA variables as fit by the R function "arima(x,c(1,0,1))". Note that in some cases either the MA coefficient, or both the AR and MA coefficients, are not significant and are not shown.

The "QQ Plot" (center of three, black circles) compares the actual distribution with the theoretical distribution. If the distribution is normal, the circles will run from corner to corner of the box. The green "QQLine" is drawn through the first and third quartile points. The red dotted line shows the Normal QQLine, and goes corner to corner. It is drawn underneath the QQLine, which sometimes hides it if the distribution is near-normal.

The standard plot (right of three, blue) shows the actual data versus time. The heteroscedasticity (Breusch-Godfrey test) is shown above the standard plot, with a rejection value of p less than .05.

Below are some possibile distributions that we might find. A number of them look related to "power law" type distributions (exponential, pareto, lognorm, gamma, fdist) although there are subtle differences. These all look like the letter "J", and seem to differ in the upper tails, the tightness of the bend, and the placement of the QQLine.

The sinusoidal, Sin + Noise, and Uniform distributions all look like the letter "S". Comparison with the QQLine shows that there is too much data in the tails. On the other hand, the "normal squared" distribution is a signed square of a normal distribution, and has too little data in the tails.

Finally, note that the ARMA distributions are not at first blush visually distinguishable from normal distributions. However, as you can see by the blue plots, they are more lifelike.

Sample Distributions

With that in hand, let's take a look at some of the Mann proxies. What I did was simply go through the proxies and look for anomalies, proxies that seemed strange in some way. I didn't look at numbers, that comes later.

More »

Visiting St Peter's, Rome

While we were in Italy this summer, my wife and I did the usual tourist things in Rome, visiting the Roman Colliseum and Forum and the Vatican. I noticed something at St Peter's which is reported first here at Climate Audit. More »

Manchester United

It's Sunday and I was just watching the sports reports while surfing financial crises. A trivia question only for people who do not know the answer and PROMISE to simply guess. The answer is easy if you know it and equally easy to find if you research it, so please don't spoil the guesses.

What corporate logo is on the front of the jerseys of Cristiano Ronaldo and the other Manchester United players? (Yes, we get European soccer on many Toronto sports news reports.)

Mann Sediments and Noise Simulation

Both in climate blog world and the financial world, there has been much talk recently about the interaction of models and data distributions. Linear regression models assume normal distributions. What happens to models when the data distributions don't meet the assumptions. Sometimes it doesn't matter much, sometimes it does. But it seems like an important thing to study. More »

Mann's PC1 in Esper and Frank 2008

On previous occasions, we've noticed some strange appearances of the Mann hockey stick under different disguises. In Inconvenient Truth, a splice of Mann's hockey stick and CRU instrumental data is described as "Dr Thompson's thermometer". Today, I noticed another peculiar incident, where Esper and Frank (Clim Chg 2008), who one would think would know better, identify Mann's PC1 as originating from Lloyd and Graumlich 1997.

Esper and Frank 2008, entitled The IPCC on a heterogeneous Medieval Warm Period, is fairly critical of IPCC conclusions. They discuss the proxies in IPCC AR4 Box 6.4 Figure 1 (interested readers may also look at my Erice presentation where this figure is discussed.) They list the proxies as follows:

Proxies shown in the AR4 include an ice core record from W Greenland (Fisher et al. 1996), a multi-proxy record from E Asia (Yang et al. 2002), and six treering records representing: SW Canada (Luckman and Wilson 2005), W USA (Lloyd and Graumlich 1997), N Sweden (Grudd et al. 2002), NW Russia (Hantemirov and Shiyatov 2002), N Russia (Naurzbaev et al. 2002), and Mongolia (D’Arrigo et al. 2001)…. Importantly, all tree-ring records shown in AR4 were detrended using a method known as ‘Regional Curve Standardization’ (RCS; Esper et al. 2003).

Now Esper himself produced a chronology from Graumlich foxtail measurements, which is actually quite similar to nearby Sheep Mountain bristlecone measurements and which I'll discuss briefly below. First I simply want to show that IPCC used Mann's PC1 and not the Graumlich series.

IPCC AR4 Second Draft
First here is the figure from the IPCC second draft. The Mann PC1 is not labelled as such, but is the orange series ("WUSA") with one of the most distinctive closing uptrends. The Graumlich series (light purple), also labeled "WUSA", also has a closing uptrend, but does turn down a little at the end. The data is noted as coming from the Osborn and Briffa 2006 collation (which I've used in my plots below).


SOD Legend: Box 6.4, Figure 1. (a) The heterogeneous nature of climate during the MWP is illustrated by the wide spread of values exhibited by the individual records that have been used to reconstruct NH-mean temperature. Individual, or small regional averages of, proxy records used in various studies (see Osborn and Briffa, 2006), (collated from those used by Mann and Jones (2003), Esper et al. (2002) and Luckman and Wilson (2005) but excluding shorter series or those with an ambiguous relationship to local temperature).

To highlight the differences between the two series, I've replotted them below, Mann PC1 in red, Graumlich in light grey.

In Review Comments on the Second Draft, I objected to the inclusion of both Mann's PC1 and foxtails. Here's the comment on the PC1 (see the source for the next comment criticizing the foxtails):

6-1143 B 29:14 29:14 One of the most prominent series on the right hand side of Box 6.4 Figure 1 is Mann's PC1, which uses his biased PC methodology. It is so weighted that the series is virtually indistinguishable from the Sheep Mountain bristlecone series discussed in Lamarche, Fritts, Graybill and Rose (1984). These authors compared growth to gridcell temperature and concluded that the bristlecone growth pulse could not be accounted for by temperature, hypothesizing CO2 fertilization. Graybill and Idso (1993) also stated this. One of the MBH coauthors Hughes in Biondi et al 1999 said that bristlecones were not a reliable temperature proxy in the 20th century. IPCC Second Assessment Report expressed cautions about the effect of CO2 fertilization on tree ring proxies, which were not over-ruled in IPCc Third Assessment Report. At a minimum, the relationship is "ambiguous". In addition, I tested the correlation of this series with HadCRU2 gridcell temperature and obtained a correlation of 0.0. Osborn and Briffa say that they themselves did not verify the temperature relationship for this data. Why not? At any rate, in this example, the authors have not excluded an important series with a well-known "ambiguous" relation to temperature. [Stephen McIntyre (Reviewer’s comment ID #: 309-39)]

This comment was rejected as follows (but the rejection acknowledges the use of the Mann PC1):

Rejected – the purpose of this Figure is to illustrate in a simple fashion, the variability of numerous records that have been used in published reconstructions of large-scale temperature changes. The text is not intended to give a very detailed account of the specific limitations in data or interpretation for each. Furthermore though there is an ambiguity in the time-dependent strength of the response of Bristlecone Pine trees to temperature variability, there is other evidence that these trees do display a temperature response . Right or wrong, Mann and colleagues do apply an adjustment to the western trees PC1 in their (1999) analysis to account for possible CO2 fertilization. Other authors ( Graumlich et al ., 1991) assert that the recent rise in some high elevation conifers in the western U.S. could be explained as a temperature response (she can not confirm the LaMarche et al findings). The issue is clearly complex , as will be noted in a new papragraph on tree-ring problems that will be added to the text

IPCC AR4 Final Draft
In the final draft, the number of illustrated proxies was pared down, with the Graumlich foxtails being dropped. Here's the graphic with the caption. Comparing captions, one notes two interesting changes. First the attribution of versions to Osborn and Briffa 2006 is deleted. This is unfortunate as the versions illustrated here can at least be traced in Osborn and Briffa 2006, but cannot necessarily be spotted in the citations provided here. Second, notice the change in criterion - the earlier version used an "ambiguous" relationship to temperature, this one uses "no" relationship.

Looking at the form of the one remaining WUSA series, it clearly has the shape of the Mann PC1 going up at the end, rather than the foxtail series, which is similar but with a slight downtick at the end. So we can safely conclude that the series illustrated in the IPCC document is definitely the Mann PC1 (Sheep Mountain) and not the Graumlich series.


Box 6.4, Figure 1. The heterogeneous nature of climate during the ‘Medieval Warm Period’ is illustrated by the wide spread of values exhibited by the individual records that have been used to reconstruct NH mean temperature. These consist of individual, or small regional averages of, proxy records collated from those used by Mann and Jones (2003), Esper et al. (2002) and Luckman and Wilson (2005), but exclude shorter series or those with no evidence of sensitivity to local temperature. These records have not been calibrated here, but each has been smoothed with a 20-year filter and scaled to have zero mean and unit standard deviation over the period 1001 to 1980.

Although the foxtail series is attributed to Llloyd and Graumlich 1997, neither this chronology nor any similar chronology appears in that article, which I've placed online here. Nor does Lloyd and Graumlich even say that their foxtail series records medieval temperatures; quite the opposite. They conclude that their foxtails were limited in growth by medieval drought.

The period from 950 to 550 BP illustrates the extent to which water balance can reverse treeline response to temperature. Whereas climate appears to have been warm when treeline forests expanded, warmth does not necessarily lead in subalpine forest expansion.

Graumlich reported a number of other treeline sites, which did not have HS shapes. Indeed, she opposed Graybill's CO2 theory largely on the basis that she could not replicate his results.

I discussed strip bark at these foxtail sites here, obtaining confirmation from Andrea Lloyd on a concordance of names. She also provided an interesting confirmation from Andrea Lloyd about Pete Holzmann and my strip bark theory. I identified several trees whose growth patterns looked a lot like Almagre strip bark and asked her to check if these trees were strip bark in her notebooks. Bingo. They were.

There is one other notable mis-identification of provenance in Esper and Frank - one which is endemic. Briffa's Yamal version is attributed to Hantemirov and Shiyatov 2002, which actually has a non-HS chronology from that site. Juckes made the same incorrect identification.

CA Tip Jar

Amount:

Website(Optional):

CA Forum

Pages

Categories

Articles

Blogroll

Favorite posts

Links

Weblogs and resources

Archives

Search

 

Recent Posts

Recent Comments

The U.S. Financial Crisis:

Sea Ice - End of Game Analysis:

The "Full" Network:

BBC "Climate Wars":

It's Saturday Night Live:

Pielke Jr on Spinning Science:

Code and realcode:

Risk Management Solutions Ltd and the 37 Professors:

Adsense

Meta

Bad Behavior has blocked 8539 access attempts in the last 7 days.