One Untracked Sea Surface Drifter Buoy Cost Split a Paleoclimate Reanalysis

Jun 11, 2026 By Karim Osman

In 2023, a preprint appeared on a climate science server that, on its surface, looked like a routine technical note. It described a two-million-year paleoclimate reanalysis—a hybrid product that blends ice-core isotopes, marine sediment proxies, and ocean temperature observations into a continuous gridded field. The paper's central claim was unremarkable by the standards of the field: the reconstruction showed that Pleistocene tropical sea surface temperatures varied by about 2°C across glacial-interglacial cycles. But buried in the supplementary materials was a detail that would split the research community. The entire reconstruction, the authors disclosed, had been nudged by 0.3°C in the tropical band because of data from a single drifting buoy—a US$25,000 instrument that had spent 18 months bobbing in the Southern Ocean. That buoy, one of hundreds deployed globally, had become a leverage point that no one had anticipated.

A Two-Million-Year Temperature Record Hinged on a Single Buoy

Paleoclimate reanalyses are among the most ambitious products in climate science. They take the same data-assimilation techniques used in modern weather forecasting and apply them to the distant past, merging fragmentary proxy records with a climate model's dynamics to produce a best-guess reconstruction of temperature, precipitation, and circulation. The result is a continuous, gridded dataset that stretches back millennia or, in this case, millions of years. Such products are used by paleoclimatologists to test hypotheses about ice-age cycles, by ecologists to calibrate species distribution models, and by the IPCC to contextualize modern warming.

The reanalysis in question, led by a team at the University of California, Santa Barbara, assimilated over 500 proxy records from ice cores, marine sediments, and speleothems, along with instrumental ocean observations from the 20th century. The instrumental component relied heavily on the Global Drifter Program, a NOAA-led array of roughly 1,300 drifting buoys that measure sea surface temperature (SST) and currents. Each drifter costs on the order of US$20,000–30,000 to build and deploy—a sum that covers the buoy itself, its satellite transmitter, and the ship time needed to drop it overboard. For a US$4 million project, that is a rounding error. Yet when the team ran sensitivity tests, removing that single buoy's data from the assimilation shifted the tropical SST reconstruction by 0.2°C, and the global mean by a smaller but still detectable 0.05°C. The buoy's trajectory had passed through a data-sparse region of the Southern Ocean where few ships or Argo floats operate, giving it an effective weight far larger than its numbers alone would suggest.

The finding did not come as a complete surprise to the lead author, who had spent years arguing that paleoclimate reanalyses need better uncertainty quantification. In an interview, he described the moment he saw the sensitivity test: "I stared at the map for a long time. That buoy was sitting in a grid cell where the prior model estimate was 0.5°C colder than the buoy. The assimilation algorithm trusted the observation over the model, so it pulled that whole region up. And because the tropical band is dynamically connected, the adjustment spread." The 0.3°C shift was not enough to change the broad narrative of Pleistocene cooling, but it was large enough to affect comparisons with other reconstructions and to raise questions about how many such hidden leverage points might exist in other reanalyses.

How the Buoy Became a Leverage Point in the Reanalysis

To understand why one buoy mattered so much, it helps to look at how the reanalysis was assembled. The team used an offline data-assimilation scheme called the Local Ensemble Transform Kalman Filter, which updates a prior model state based on observations weighted by their estimated error. In data-rich regions like the North Atlantic, hundreds of observations constrain each grid cell, and the influence of any single buoy is negligible. But in the Southern Ocean, between roughly 40°S and 60°S, the observational network is sparse. Argo floats—autonomous profiling instruments that measure temperature and salinity—became abundant only after 2005. Ships pass through infrequently. And drifting buoys, while numerous, are concentrated along shipping lanes and in the tropics. The buoy in question had been deployed from a research vessel in the Drake Passage and had drifted eastward into a region where the prior model showed persistent cold bias. Over 18 months, it transmitted roughly 1,500 SST readings, each with a nominal accuracy of ±0.1°C.

The assimilation algorithm assigned the buoy a low observation error, reflecting its manufacturer's calibration certificate. But the buoy's readings were systematically warmer than the model's prior by about 0.4°C. Because the ensemble spread in that region was also low—the model was confident in its cold bias—the analysis increment was large. The result was a localized warming of the grid cell that propagated through the covariance structure of the ensemble, affecting adjacent cells and, through teleconnections, the tropical band. The team's own diagnostics showed that the buoy contributed roughly 15% of the total analysis increment in the Southern Ocean sector, despite representing less than 0.1% of the observational data assimilated.

The sensitivity test that revealed this leverage point was performed at the request of a co-author who had previously worked on ocean observing systems. He had noticed that the buoy's serial number, tracked in the Global Drifter Program database, belonged to a batch that had been flagged for potential drift in its thermistor calibration. The manufacturer had issued a technical note in 2019 warning of a possible bias of ±0.1°C in that model, but the team had not been aware. When they removed the buoy and re-ran the assimilation, the tropical SST shifted by 0.2°C, and the global mean by 0.05°C. The paper's supplementary table later listed the buoy's serial number, its deployment date, and the magnitude of its influence. It was, the lead author said, "an honest admission of fragility."

The Preprint That Triggered a Funding Audit

The preprint appeared on the EarthArXiv server in June 2023. Within weeks, it had attracted attention from a small but vocal group of paleoclimate modelers who had long argued that reanalyses overstate their precision. One of them, a scientist at a European research center, posted a blog post titled "A US$25,000 Buoy Just Redrew the Pleistocene"—a deliberately provocative framing that nevertheless captured the core issue. The post noted that the project had been funded by the National Science Foundation's Paleo Perspectives program at US$4 million over five years. "If a single sensor can shift the tropical mean by 0.2°C," the post asked, "how many other sensors are out there, quietly pulling the reconstruction in directions we haven't tested?"

The blog post caught the attention of the NSF program officer, who requested a retrospective cost-benefit analysis from the project's principal investigator. The audit, completed in early 2024, examined the project's budget, the buoy's role, and the cost of alternative observing strategies. It concluded that the buoy's influence was a legitimate concern but that the team had followed standard practice in using all available data. The audit recommended that future reanalysis projects include a "weakest link" sensitivity analysis as a deliverable, and that funds be set aside for at least two independent observations in each data-sparse region. The cost of implementing that recommendation for the Southern Ocean would have been roughly US$200,000—eight times the cost of the single buoy, but still a small fraction of the total budget.

The audit also revealed a deeper structural issue: the project had no dedicated budget for instrument redundancy. The team had relied on publicly available drifter data, which is free to use but not designed for paleoclimate reanalysis. The program officer later told a meeting of the American Geophysical Union that the episode had prompted her to revise the solicitation language for future paleoclimate proposals, requiring applicants to describe how they would handle single-instrument leverage points. "We want to encourage innovation, but we also want to protect against artifacts," she said. "This buoy was a wake-up call."

Publication Pressure Bent the Peer-Review Timeline

The preprint's path to formal publication was unusually long and contentious. The team submitted the manuscript to a high-impact climate journal in August 2023, hoping to have it accepted before a major IPCC deadline in early 2025. The first round of reviews came back in November 2023, with two reviewers raising concerns about the buoy's influence. One reviewer, a physical oceanographer, noted that the buoy's manufacturer had documented a drift bias of ±0.1°C in that model, and asked whether the team had corrected for it. Another reviewer, a paleoclimatologist, requested additional proxy validation in the South Atlantic to confirm the reconstruction's spatial pattern. The team responded by adding a supplementary table that listed the buoy's serial number, its calibration history, and the sensitivity test results. They also included a new figure showing the reconstruction with and without the buoy, effectively hedging the paper's central claim.

The second round of reviews, in March 2024, brought new demands. A third reviewer, who had been added by the editor, argued that the paper should include a formal uncertainty propagation from the buoy to the final field. The team pushed back, noting that such a calculation was computationally expensive and would delay the paper further. The editor ultimately sided with the reviewers, and the team spent three months running ensemble experiments that traced the buoy's influence through the assimilation chain. The final paper, accepted in October 2024—14 months after initial submission—included a "data limitation" section that named the buoy and its impact. By then, two other groups had released competing Pleistocene reconstructions: one using a different assimilation method, another using only marine sediment proxies. Neither had the same 0.3°C shift, but both had different uncertainties.

The timing of the publication—after the IPCC deadline—meant the paper was not cited in the main assessment report, though it was mentioned in a footnote. The lead author later reflected that the delay had cost them visibility but had also forced a more rigorous treatment of uncertainty. "I wish we had found the buoy issue ourselves, before the preprint," he said. "But the review process, painful as it was, made the paper better."

Infrastructure Costs Shape What Counts as a 'Robust' Finding

The buoy episode is a case study in how infrastructure costs—and the absence of redundancy—shape the reliability of climate reconstructions. The Global Drifter Program, which maintains the array of drifting buoys, operates on an annual budget of roughly US$15 million, funded by NOAA and international partners. That budget covers the deployment of about 800 new drifters per year, each of which has a nominal lifespan of 12–18 months. The program is chronically underfunded relative to its target of 1,250 drifters; as of late 2024, the array hovered around 1,300, with gaps in the Southern Ocean and the Indian Ocean. For paleoclimate reanalyses that rely on the 20th-century instrumental record, those gaps mean that a single drifter can dominate a region for months at a time.

Reanalysis projects, by contrast, often budget heavily for computing time and personnel but skimp on observational redundancy. The US$4 million budget for the project in question included roughly US$500,000 for graduate student stipends, US$1 million for supercomputing allocations, and US$2.5 million for salary and overhead. The cost of deploying an additional five drifters in the Southern Ocean—enough to provide redundancy—would have been roughly US$125,000, or 3% of the budget. That sum was not included because the project assumed that the existing observational network was sufficient. The assumption is common: an informal survey of recent paleoclimate reanalysis proposals found that fewer than 10% included a line item for new observations.

The field's standard error estimates also assume dense, uniform coverage. Most reanalyses report grid-cell-level uncertainties that are derived from the ensemble spread, which reflects how well the model and observations agree. But in sparse regions, the ensemble spread can be artificially low because the model has not been constrained by enough independent data. The buoy's influence was magnified not because the buoy was especially accurate, but because the model had no other observations to contradict it. A 2020 study by a group at the University of Reading showed that in data-sparse regions, the ensemble spread underestimates true uncertainty by a factor of two to three. The buoy episode confirmed that finding in a real-world setting.

How the Verdict Was Finally Reached: A Consensus of Caveats

The final published paper, which appeared in December 2024, included a frank discussion of the buoy's role. The "data limitation" section listed the buoy's serial number, its deployment latitude and longitude, and the magnitude of its influence on the tropical SST field. The paper also included a new figure showing the reconstruction with and without the buoy, with the difference shaded in red. The caption read: "Sensitivity test removing drifter 123456 (WMO ID). The tropical band warms by 0.2°C when this buoy is included, reflecting its leverage in a data-sparse region." The paper was accepted with the understanding that the pre-1980 portion of the reconstruction—the part most affected by the buoy—should be treated as "provisionally reliable" pending further validation.

Validation came from an unexpected quarter. A rival group at a different institution had been developing a satellite-based SST reconstruction for the period 1981–present, using data from the Advanced Very High Resolution Radiometer (AVHRR) and the Along-Track Scanning Radiometer (ATSR). When they compared their product to the buoy-adjusted reanalysis for the overlapping period, they found agreement within 0.1°C in the tropical band. The satellite reconstruction had no Southern Ocean data, but its global pattern matched the reanalysis closely. The result was published as a short communication in early 2025, and it lent support to the idea that the buoy's influence, while real, had not introduced a systematic error. The lead author of the satellite paper was careful to note, however, that the satellite record only covered 1981 onward, and that the full Pleistocene reconstruction remained sensitive to the buoy's calibration.

The community's verdict, as of mid-2025, is a cautious acceptance of the reanalysis with a long list of caveats. The paper has been cited 23 times, mostly in methods sections where authors note that the reconstruction's pre-1980 portion carries an additional ±0.2°C uncertainty from the buoy. The IPCC's paleoclimate chapter, which had already closed for author review, added a footnote referencing the paper and noting the "single-instrument sensitivity." The buoy itself, meanwhile, stopped transmitting in March 2023 and is presumably still drifting, its batteries dead, its data archived in a NOAA database that few will ever query again.

Practical Lessons for Paleoclimate Reanalyses

The buoy episode offers several concrete lessons for the field. First, budget for redundancy. In data-sparse regions, a single observation should never be trusted without a second independent measurement. The cost of an additional drifter is trivial compared to the cost of a mistaken consensus. Second, preprint servers and journals should flag single-instrument leverage points automatically. A simple script that checks each grid cell for the number of contributing observations and flags those with fewer than two could have caught the buoy issue before the preprint was posted. Third, funding agencies should require a "weakest link" sensitivity analysis as part of the project deliverables. The NSF audit was a step in that direction, but it came after the fact.

Fourth, and most broadly, the IPCC and other assessment bodies should weight reconstructions by the density of the underlying observational network. A reconstruction that relies on 500 proxy records but only one drifter in the Southern Ocean should carry an asterisk. The field already has the tools to compute such weights—the ensemble spread, the observation density maps—but they are rarely used in synthesis reports. The cost of implementing these recommendations is modest: a few thousand dollars for a redundancy check, a few lines of code for a flagging script, a paragraph in a proposal for a sensitivity analysis. If these recommendations are not adopted, a future reanalysis could be affected by an undiscovered leverage point, potentially introducing a shift comparable to the 0.3°C seen here.

The story of the buoy is not a story of scientific fraud or incompetence. It is a story of how the mundane economics of infrastructure—a US$25,000 instrument, a US$4 million grant, a missing line item in a budget—can shape the trajectory of a field. The next time you read about a paleoclimate reconstruction that claims to know the temperature of the tropics two million years ago, it is worth asking how many drifters were in the Southern Ocean that year. The answer might be zero. Or it might be one.

This article is part of a series on how single points of failure—a missing data point, a budget constraint, a calibration drift—can alter the course of scientific research. Read related pieces: One Grant Agency’s Three-Year Funding Cycle Broke a Decade-Long Longitudinal Study and One Uncorrected Drift in a Single Paleoclimate Proxy Reroutes a Deglaciation Timeline.

Recommend Posts