A very timely paper that captures the current zeitgeist in EO and AI. If nothing else, it serves as a fantastic introduction to one of the technologies that I think (and hope) will do the most to bring imagery to the masses in the coming years.

Metadata

Highlights

Earth embedding vectors emb are produced by a family of embedding functions E that map continuous location inputs (i.e., longitude and latitude, optionally with elevation, and time) into a d-dimensional vector space.
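Note to self: the highlight did not capture the formula, so here is one way to write the mapping in symbols. The notation (λ for longitude, φ for latitude, h for elevation, t for time) is my own assumption and is not copied from the paper.

```latex
% Assumed notation, not the paper's: lambda = longitude, phi = latitude,
% h = optional elevation, t = time, d = embedding dimension.
E : (\lambda, \varphi, [h], t) \;\longmapsto\; \mathrm{emb} \in \mathbb{R}^{d}
```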

Figure 2: Earth embeddings provide different functions: (1) They compress high-dimensional data into a lower-dimensional vector format. (2) They fuse together different geospatial data modalities, from different types of images to text and tabular data. (3) They can interpolate to unseen spatiotemporal locations, where raw data is missing. (4) They are interoperable with other AI foundation models, such as LLMs, through aligned embedding spaces.

as explicit models, extracting embeddings from raw data (e.g. satellite imagery) associated with a location (emb ∼ E_explicit(data_location))

implicit models, returning embeddings from only location inputs (emb ∼ E_implicit(location)).
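To make the explicit/implicit distinction concrete for myself, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the function names `explicit_embed` and `implicit_embed`, the summary statistics, the sine/cosine coordinate features, and the random weights that stand in for trained parameters. It is not the architecture of any model discussed in the paper.

```python
import numpy as np

D = 8  # embedding dimension d (illustrative choice)
rng = np.random.default_rng(0)

# Hypothetical random weights standing in for trained model parameters.
W_explicit = rng.normal(size=(4, D))   # 4 summary features -> d
W_implicit = rng.normal(size=(10, D))  # 10 coordinate features -> d


def explicit_embed(patch: np.ndarray) -> np.ndarray:
    """Explicit model: embed raw data observed at a location.

    `patch` is an (H, W) array, e.g. one band of a satellite image tile.
    """
    feats = np.array([patch.mean(), patch.std(), patch.min(), patch.max()])
    return np.tanh(feats @ W_explicit)


def implicit_embed(lon: float, lat: float) -> np.ndarray:
    """Implicit model: embed from coordinates alone, with no raw data."""
    lam, phi = np.radians(lon), np.radians(lat)
    feats = []
    for k in (1, 2):  # two frequencies of sine/cosine coordinate features
        feats += [np.sin(k * lam), np.cos(k * lam),
                  np.sin(k * phi), np.cos(k * phi)]
    feats += [np.sin(lam) * np.cos(phi), np.cos(lam) * np.cos(phi)]
    return np.tanh(np.array(feats) @ W_implicit)


# Usage: the explicit model needs data at the location,
# the implicit model needs only the coordinates.
tile = np.arange(16.0).reshape(4, 4)        # toy stand-in for an image tile
emb_from_data = explicit_embed(tile)
emb_from_coords = implicit_embed(13.4, 52.5)
```

The practical difference shows up in the call signatures: the implicit function can be queried anywhere on the globe, while the explicit one can only be evaluated where raw data exists.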

Earth embeddings map places and times that share similar properties closer together in embedding space.
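A small sketch of what "closer together in embedding space" could mean in practice, assuming cosine similarity as the distance measure (the paper highlight does not name a metric). The three embeddings and their labels are hand-made toy values, not real model outputs, and `nearest` is a hypothetical helper.

```python
import numpy as np


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def nearest(query: np.ndarray, index: dict, k: int = 1) -> list:
    """Return the k entries of `index` most similar to `query`."""
    names = list(index)
    mat = np.stack([index[n] for n in names])
    mat = mat / np.linalg.norm(mat, axis=1, keepdims=True)  # unit rows
    q = query / np.linalg.norm(query)
    scores = mat @ q                       # cosine similarity per entry
    order = np.argsort(-scores)[:k]        # highest similarity first
    return [(names[i], float(scores[i])) for i in order]


# Toy, hand-made embeddings (assumed values for illustration only).
index = {
    "boreal_forest_A": np.array([0.9, 0.1, 0.0]),
    "boreal_forest_B": np.array([0.8, 0.2, 0.1]),
    "desert_C": np.array([0.0, 0.1, 0.9]),
}

# Search for the place most similar to forest A among the others.
others = {name: v for name, v in index.items() if name != "boreal_forest_A"}
best = nearest(index["boreal_forest_A"], others, k=1)
```

With these toy vectors the two forest sites score far higher against each other than either does against the desert site, which is the property the highlight describes.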

GeoFMs are large-scale modeling and learning frameworks, whereas Earth embeddings constitute the interoperable, location-indexed data outputs that can be stored, shared, or queried independently of the model that created them.

We posit that Earth embeddings will emerge as the dominant format of geospatial data in the AI age

ways in which users can employ Earth embeddings for prediction, conditioning, simulation, and search
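For the "prediction" use, my understanding is that one common pattern is to keep the embeddings fixed and fit a small model on top of them. The sketch below assumes that pattern and uses ridge regression on synthetic data: the embeddings `X`, the target `y`, and the helper `fit_ridge` are all made up for illustration and do not come from the paper.

```python
import numpy as np

rng = np.random.default_rng(42)
n, d = 200, 16

# Synthetic stand-ins: X plays the role of precomputed Earth embeddings,
# y the role of a target variable measured at the same locations.
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.01 * rng.normal(size=n)


def fit_ridge(X: np.ndarray, y: np.ndarray, lam: float = 1e-3) -> np.ndarray:
    """Closed-form ridge regression: solve (X^T X + lam I) w = X^T y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)


# Fit on the first 150 locations, evaluate on the remaining 50.
w = fit_ridge(X[:150], y[:150])
pred = X[150:] @ w
rmse = float(np.sqrt(np.mean((pred - y[150:]) ** 2)))
```

The embeddings are never updated here; only the small linear map `w` is learned, which is what makes stored embeddings reusable across tasks.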

Call to action: Advancing analyses and applications with Earth embeddings.

• Evaluating and benchmarking Earth embeddings

• Explainable and interpretable Earth embeddings

• Learning planetary processes with Earth embeddings

Earth Embedding Models: Explicit Feature Extraction versus Implicit Neural Representation

Challenges and opportunities for improving Earth embeddings.

• Model capacity

• Spatio-temporal heterogeneity

• Data curation and scaling

• Learning objective

The research agenda we outline is fundamentally interdisciplinary: Earth embeddings will rely on feedback from domain scientists, e.g. in ecological, geological, oceanographic, and atmospheric sciences, that incorporate Earth embeddings into their analyses and from data practitioners applying Earth embeddings in their workflows and products.