Finding Our Google Maps Moment in Translational Genomics

Sandor Szalma, Scientific Advisor   |   7mins

The growth of large, well-defined biobanks like UK Biobank, FinnGen, and the All of Us Research Program has changed what is possible in the field of genetics research.

Researchers now have access to huge genetic and phenotypic datasets that cover entire populations and can show the genetic structure of disease in ways that have never been possible before. The promise is clear: enable discoveries, make drug development better, improve patient stratification, and eventually make precision medicine a part of everyday healthcare.

However, despite this abundance of data, the path from biobank to biological insight is still very fragmented. In translational genomics, researchers today are in a strange situation: they have more data than ever, but the tools they have to find, combine, and use this data often don't work well.

Metaphorically speaking, bridging large-scale genomic data to biological insights is like driving through a dense city -- each path may lead to a different destination, whether it’s a Polygenic Risk Score, a Biomarker, a Drug Target, or a Pharmacogenomic Guide. The challenge lies in choosing the optimal route and interpreting the signs along the way.

(Sandor Szalma, Scientific Advisor at Zifo)

Our current tools are like paper road maps: helpful, but not very flexible or connected. This is where the “Google Maps” metaphor makes sense. Translational genomics needs a system that is dynamic, interactive, and layered, just as a map app lets us zoom in and out, see traffic and adapt the route, find alternative routes, add stops, and share our journeys. It needs a system that not only shows where associations are but also guides researchers from population-scale signals to interpretable biological insights.

From Atlases to GPS: Why Current Tools Fall Short

Genome-wide association studies (GWAS) and related analyses have been the workhorse of human genetics, producing summary statistics that capture associations between genetic variants and traits. You can now query, visualise, and share these results thanks to public resources like the GWAS Catalog and open-source tools like PheWeb, Open Targets, and GWAS Atlas.

Manhattan plots and QQ plots, for example, are now familiar landmarks on the map of discovery. The most sophisticated of these tools is Open Targets, which integrates genetic association data with a vast array of other biological evidence types (e.g., functional genomics, expression data, pathways, animal models, drug data) and offers a suite of methods such as fine-mapping, colocalization, and machine-learning-based locus-to-gene (L2G) prioritization to support in silico target triaging. The software is available for download and local implementation, but running it needs dedicated effort, and local customizations can make it harder to keep pace with the package as it evolves along the Open Targets Consortium roadmap.

Even so, these tools aren't enough. At best, they are static snapshots that are often optimised for one dataset and are rarely built with scalability, collaboration, or integrative analysis in mind. A researcher who wants to go from a GWAS hit to a possible drug target may have to use several different platforms, download datasets, run command-line scripts, and connect the dots between genetic signals, biological pathways, and phenotypes by hand. The trip takes a long time, and the path is often unclear.

Biobanks: The New Cities of Genomics

Think about the biobank landscape itself to get an idea of how big the problem is. These massive datasets are like huge new cities: they hold a lot of information, but it's easy to get lost in them if you don't have the right tools. The UK Biobank has genomic and longitudinal phenotypic data on about 500,000 people. FinnGen combines genetic information with health records that go back decades. All of Us aims to enrol more than a million people from different backgrounds in the U.S. Together, they make up an unprecedented map of human health and disease.

Privacy restrictions often make it hard to get to individual-level data, so summary statistics are usually the best way to do large-scale analysis. Summary statistics are powerful, but they need advanced downstream analysis -- fine-mapping, colocalization, variant annotation, and Mendelian Randomization (MR) -- to turn associations into biological meaning. These steps become roadblocks when workflows aren't integrated.
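To make one of these downstream steps concrete, here is a minimal sketch of the simplest form of Mendelian Randomization: a fixed-effect inverse-variance weighted (IVW) estimate computed purely from summary statistics. It assumes the variants are independent, valid instruments that have already been harmonised to the same effect allele. The function name and the toy numbers are hypothetical and for illustration only; real analyses typically rely on dedicated packages and sensitivity checks.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted (IVW) MR estimate.

    Each list holds one value per genetic variant (instrument):
    variant-exposure effect, variant-outcome effect, and the standard
    error of the variant-outcome effect.
    Returns (causal_estimate, standard_error).
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("all inputs must have the same length")
    if not beta_exposure:
        raise ValueError("at least one variant is required")

    # Weight each variant by the precision of its outcome association.
    weights = [1.0 / se ** 2 for se in se_outcome]

    numerator = sum(
        w * bx * by for w, bx, by in zip(weights, beta_exposure, beta_outcome)
    )
    denominator = sum(w * bx ** 2 for w, bx in zip(weights, beta_exposure))

    estimate = numerator / denominator
    std_error = math.sqrt(1.0 / denominator)
    return estimate, std_error


# Hypothetical toy summary statistics for three variants (not real data).
beta_x = [0.10, 0.20, 0.15]    # variant -> exposure effects
beta_y = [0.05, 0.10, 0.075]   # variant -> outcome effects
se_y = [0.01, 0.02, 0.015]     # standard errors of the outcome effects

estimate, std_error = ivw_estimate(beta_x, beta_y, se_y)
print(f"IVW estimate: {estimate:.3f} (SE {std_error:.3f})")
```

Even this small calculation presumes that instruments have been selected, clumped, and harmonised beforehand, which is exactly the kind of preparatory work that turns into a roadblock when each step lives in a separate tool.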

What a Google Maps for Genomics Would Look Like

So, what would it mean to have a Translational Genomics version of Google Maps? A few ideas stand out:

Zoom In, Zoom Out -- Customisable visualisation that goes beyond basic Manhattan/QQ plots. You can compare multiple traits, explore different regions, and layer functional annotations on top of one another.

Interactive and Intuitive -- Tools that let you explore, test hypotheses, and see things in real time, not just in static plots.

User-friendly data management -- Easy upload and management of any GWAS summary statistics (private or public), including user management.

Integrated downstream analysis workflows -- The platform integrates with pipelines for methods like MR, fine-mapping, and variant annotation, so you don't need to know a lot about the command line or download data for each step.

Ready for collaboration -- Shared workspaces where research teams can mark up and talk about their findings in the same way that maps today let people share routes and get live updates.

Scalable and Extensible -- A platform that can take on custom analysis and visualisation tools, and whose interactive performance does not degrade significantly as the number of variants, phenotypes, and studies grows.

Such a system would not only help researchers navigate today’s complexity but also accelerate the journey from biobank-scale data to meaningful biological discovery -- guiding scientists to see not just where the roads are, but which ones are most promising for translational insights and therapeutic discovery.

The Road Ahead: Why This Is Important Right Now

It is very important to build this next-generation infrastructure as soon as possible. Drug discovery pipelines increasingly rely on human genetics to de-risk targets and improve the probability of technical success.

Robust interpretation of genomic data across populations is essential for patient stratification and risk prediction. And for precision medicine to work, we need to find a way to connect large-scale research with personalised care.

The promise of biobanks could be put on hold if navigation tools don't get better. Data will keep piling up, but insights will stay hidden behind technical problems, broken workflows, and separate systems. The opportunity is clear: we can unlock the full potential of biobank-scale research by thinking of new ways to manage, analyse, and visualise genomic data.

It's Time for a Google Maps Moment

Translational genomics is at a turning point. We have the data, the computing power, and the ambition. What we need now is a platform that is integrated, intuitive, and collaborative -- one that can take us from biobank-scale data to discovery-ready insights. A “Google Maps” for genomics could change the way we understand the complexity of human health in the same way that GPS changed how we get around cities. The goal is clear: precision medicine for everyone. It's time to make the map that will take us there.

(Sandor Szalma's career spans over two decades of senior leadership in drug discovery and computational biology at top pharmaceutical companies, as well as a long tenure as an Adjunct Professor at Rutgers University. His experience reflects deep expertise in leveraging machine learning, data science, and human genetics to drive innovation across the scientific value chain.)