Towards Instrument-Agnostic Exoplanet Candidate Prioritization

A new preprint describes a novel machine learning approach for predicting the likelihood of exoplanet candidate confirmation, reported to perform equally well on TESS and Kepler data.

Original source: arXiv Earth & Planetary, cited and editorially framed by Cosmos Week.

Editorial signature: Cosmos Week Editorial Desk

Published: 05 Jun 2026 18:26 UTC

Updated: 2026-06-05

Coverage type: Preprint

Evidence level: Preliminary result

Read time: 4 min

Key points

Focus: a novel machine learning approach for predicting the likelihood of exoplanet candidate confirmation, reported to perform equally well on TESS and Kepler data.
Editorial reading: provisional result, not yet formally peer reviewed.

Full story

The authors of a new preprint describe a novel machine learning approach for predicting the likelihood that an exoplanet candidate will be confirmed, and they report that it performs equally well on TESS and Kepler data. The new analysis still awaits peer review, but it already lays out the central claim clearly.

The significance lies in the fact that exoplanet science has moved beyond the era of simple discovery into a period of comparative characterization. With more than five thousand confirmed planets known, the scientifically productive questions now concern atmospheric composition, internal structure, orbital history and the statistical properties of populations rather than the existence of individual worlds. A new detection or spectral measurement is most valuable when it adds a well-constrained data point to those comparative frameworks, not when it stands alone as an anecdote.

From the post-processed Kepler and TESS databases of the NASA Exoplanet Archive, the authors chose six parameters that they assessed to be predictive of the planet transit signature: planet [...]. These parameters serve as inputs to a machine learning (ML) approach intended to perform equally well on TESS and Kepler data.

The authors used these parameters to evaluate eleven different ML models on all possible train/test combinations of TESS and Kepler data, using the confirmed planets and false positives. They found that, because the chosen parameters are distributed substantially differently in the Kepler and TESS databases, models trained on data from one instrument have difficulty with data from the other.
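As a rough illustration of what evaluating "all possible train/test combinations" can look like, the sketch below trains on Kepler, on TESS, or on both, and scores each model on each instrument. Everything in it is an assumption made for illustration: the data are synthetic, the single feature is made up, and the nearest-centroid classifier is a stand-in chosen for brevity, not one of the eleven models in the preprint. All function and variable names are hypothetical.

```python
from itertools import product
from statistics import mean

# Synthetic, made-up data: one hypothetical feature per object.
# Label 1 = confirmed planet, label 0 = false positive.
# The "tess" values are the "kepler" values shifted by +3 to mimic an
# instrument-dependent shift in the feature distribution.
OFFSETS = [-0.4, -0.2, 0.0, 0.2, 0.4]


def make_set(planet_center, fp_center):
    """Build labeled rows scattered around two class centers."""
    planets = [((planet_center + o,), 1) for o in OFFSETS]
    false_positives = [((fp_center + o,), 0) for o in OFFSETS]
    return planets + false_positives


DATA = {
    "kepler": make_set(0.0, 4.0),
    "tess": make_set(3.0, 7.0),
}
DATA["joint"] = DATA["kepler"] + DATA["tess"]


def fit_centroids(rows):
    """Return {label: centroid}, the mean feature vector of each class."""
    centroids = {}
    for label in {y for _, y in rows}:
        members = [x for x, y in rows if y == label]
        centroids[label] = tuple(mean(col) for col in zip(*members))
    return centroids


def predict(centroids, x):
    """Assign x to the class whose centroid is closest."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def accuracy(centroids, rows):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(predict(centroids, x) == y for x, y in rows) / len(rows)


def cross_instrument_scores(data, test_names=("kepler", "tess")):
    """Score every (training set, test set) pair.

    For brevity there is no hold-out split, so same-instrument and joint
    scores are in-sample; a real evaluation would keep test rows unseen.
    """
    scores = {}
    for train_name, test_name in product(data, test_names):
        model = fit_centroids(data[train_name])
        scores[(train_name, test_name)] = accuracy(model, data[test_name])
    return scores


if __name__ == "__main__":
    for (train_name, test_name), acc in cross_instrument_scores(DATA).items():
        print(f"train={train_name:<6} test={test_name:<6} accuracy={acc:.2f}")
```

On this toy data, a model trained on one instrument scores 0.50 on the other, while the jointly trained model scores 1.00 on both. That reproduces the qualitative pattern described in the preprint summary by construction and says nothing about the authors' actual numbers.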

However, models trained jointly on both TESS and Kepler data can perform well on both. The authors combined their best models into an ensemble, which they describe as statistically robust, to evaluate the planet candidates in both Kepler and TESS, and they provide a list of the top predicted candidates.
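The summary does not say how the best models are combined. One common option is soft voting, in which each model's predicted confirmation probability is averaged per candidate and the candidates are then ranked by that average. The sketch below assumes this scheme; the candidate names and probabilities are made up, and the spread across models is included only as one possible indicator of disagreement.

```python
from statistics import mean, pstdev

# Hypothetical per-model confirmation probabilities for made-up candidates.
# Each list holds one probability per model in the ensemble.
MODEL_SCORES = {
    "cand-A": [0.9, 0.8, 0.7],
    "cand-B": [0.2, 0.4, 0.3],
    "cand-C": [0.6, 0.9, 0.3],
}


def ensemble_rank(model_scores, top_n=None):
    """Rank candidates by mean probability across models (soft voting).

    Returns (name, mean probability, spread across models) tuples,
    highest mean first. top_n=None returns every candidate.
    """
    rows = [
        (name, mean(probs), pstdev(probs))
        for name, probs in model_scores.items()
    ]
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:top_n]


if __name__ == "__main__":
    for name, avg, spread in ensemble_rank(MODEL_SCORES, top_n=2):
        print(f"{name}: mean={avg:.2f} spread={spread:.2f}")
```

A candidate with a high mean and a small spread is one that the models agree on, which is a plausible reading of what a top-candidate list rewards, though the preprint may use a different criterion.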

According to the authors, confirmed planets and false positives that have been resolved since the analysis was completed demonstrate the effectiveness of the model and lend support to its top candidates. They also point to the upcoming launch of the Nancy Grace Roman Space Telescope (Roman), and the expected order-of-magnitude increase in planet candidates, as a setting in which the method could be put to use.

The broader interest lies in making each candidate less anecdotal and more comparable with the rest of the known planetary population. Population-level questions, such as the frequency of atmospheres around small rocky planets or the prevalence of water-rich worlds in the habitable zone, require well-characterized individual data points before statistical patterns become meaningful. Each new planet with a measured radius, mass and, ideally, atmospheric constraint is a brick in that larger structure, and the accumulation of bricks eventually allows theorists to test formation models against real distributions rather than projections.

Because this is still a preprint, the result should be read with genuine interest and proportionate caution. Peer review is not a guarantee of correctness, but it is a process that forces authors to respond to technical criticism from specialists who have no stake in a particular outcome. Preprints that survive that process, often with substantive revisions, emerge with a stronger evidential base than the version that first appeared. Until that stage is complete, the responsible reading keeps uncertainty explicitly visible rather than treating the claims as established findings.

The next step is to improve independent constraints on the mass, radius, atmospheric composition and orbital dynamics of the top-ranked candidates. Transmission spectroscopy with JWST, radial velocity campaigns with high-resolution ground-based spectrographs and phase-curve measurements from space photometry represent the observational toolkit that can move characterization from plausible to robust. That convergence of techniques is the standard the community now expects before a planetary atmosphere result is treated as confirmed. Until peer review and independent follow-up address those open questions, skepticism is not a failure of appreciation for the work; it is part of how science decides what to keep.

Source

Original source: arXiv Earth & Planetary
