It has been called one of the most important scientific milestones in history: The Human Genome Project.

It was launched in 1990 by an international consortium of scientists at a cost of 3 billion US dollars.

Their goal was to determine the sequence of the 3.2 billion base pairs (or letters) that make up human DNA: all of a person's hereditary information and the instructions for building and maintaining their cells, tissues and organs.

In 2000, with great fanfare, it was announced that the first draft of the human genome was complete.

“Today’s announcement represents more than a historic triumph of science and reason…With this profound new knowledge, humanity will gain tremendous new healing power,” said then-President of the United States, Bill Clinton.

The project promised many things: it would reveal the function of genes, especially those involved in disease, which would enable personalized medicine with treatments based on our genetic makeup.

The genome also promised to reveal information about our evolutionary origins so that we would know exactly where we came from and what made us different from other primates.

But the genome presented in 2000 was not complete. Not only was it a rough first draft, it also had large regions where the DNA sequence was missing altogether.

The project continued. And in 2003, it was announced again, this time with less fanfare, that the human genome was complete.

But… about 8% of its information was still missing.

These gaps included the most difficult fragments to sequence, in which the DNA letters were repeated over and over. With the technology available at the time, it was impossible to read them.

And so the human genome, officially complete, remained only partly deciphered for almost 20 years.

Until 2021, when a scientific consortium called Telomere-to-Telomere (T2T) announced that it had managed to read the entire genome.

But had it really?

Yes, but… even though they’ve reached previously inaccessible places – specifically the 8% that couldn’t be read before – the reality is that parts of the human genome still remain beyond the reach of geneticists.

In other words, with the advancement of technology, it was possible to read the entire human genome, without gaps and with minimal errors.

But that reference human genome is a "composite", built from DNA extracted from several individuals.

That is, it is not the genome of an actual person who ever lived.

Difficulties

Why is deciphering the human genome such a difficult task?

“The main limitation has been that the technologies that allow us to decipher the DNA sequence do so through short fragments that are read by a machine and then need to be reassembled, as if they were pieces of a complex puzzle,” Dr. Manuel Corpas, Professor of Genomics in the School of Life Sciences at the University of Westminster, London, told BBC Mundo.

“If you encounter an area in a puzzle where the color and shape of the pieces do not change (repeat), putting them in the correct order unambiguously and without a frame of reference is complicated.”

In fact, sequencing a genome is like cutting a book into text fragments and trying to reconstruct that book by putting all the fragments back together.

Fragments containing repeated words or common phrases are much harder to put back in place than pieces of text that are unique and distinctive.
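The book analogy can be tried out on a toy example. The sketch below is only an illustration, not the method used by any real sequencing project: the two short "genomes", the repeat and the helper name `read_fragments` are all invented here. It cuts each sequence into overlapping fragments and shows that, when the fragments are shorter than the repeat, two different sequences produce exactly the same pile of pieces, so the pieces alone cannot tell us which order is correct.

```python
from collections import Counter


def read_fragments(sequence: str, length: int) -> Counter:
    """Return every overlapping fragment of the given length, with counts."""
    last_start = len(sequence) - length + 1
    return Counter(sequence[i:i + length] for i in range(last_start))


# Toy data, invented for illustration: unique stretches of letters
# separated by three copies of the same 8-letter repeat.
REPEAT = "TATATATA"
genome_a = "ACCGT" + REPEAT + "GGC" + REPEAT + "CAG" + REPEAT + "TTGCA"
# Same pieces, but the two middle stretches have swapped places.
genome_b = "ACCGT" + REPEAT + "CAG" + REPEAT + "GGC" + REPEAT + "TTGCA"

# Fragments shorter than the repeat: compare the two piles of pieces.
short_reads_match = read_fragments(genome_a, 6) == read_fragments(genome_b, 6)

# Fragments long enough to cover a whole repeat plus a letter on each side.
long_reads_match = read_fragments(genome_a, 10) == read_fragments(genome_b, 10)

print(short_reads_match)  # True: the short pieces cannot tell the genomes apart
print(long_reads_match)   # False: longer pieces reveal the difference
```

With 6-letter fragments the two sequences are indistinguishable, while 10-letter fragments, which reach across the whole repeat, separate them. This is the intuition behind why reading longer stretches of DNA helps with repetitive regions.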

With the human genome, it is necessary to put together millions of parts that describe the diversity of an individual.

Many of these pieces are full of repetitions, and these are the most difficult regions of the human genome to read.

That changed in 2021, when new sequencing techniques were able to capture these repeats.

“It’s as if we had an 18th-century map describing the geography of the world. First the coastlines of the nearest continents were charted, and the blank spaces were filled in as our ability to resolve ambiguous areas was refined,” says Dr. Corpas.

A key advance made by T2T in 2021, which became official in 2022 with several studies published in the journal Science, was the ability to accurately read much longer stretches of DNA, after finding a way to map its most mysterious and neglected repetitive regions.

The T2T Consortium was founded in 2018 by Adam Phillippy of the National Human Genome Research Institute in Maryland, USA, and Karen Miga, a geneticist at the University of California, Santa Cruz.

T2T was not a multi-billion dollar project, but its achievement, the ability to read the entire human genome, was considered a turning point.

In order to sequence it, the scientists used a kind of “shortcut”.

Normal human cells are diploid, meaning they have two copies of each type of chromosome. The father and the mother each contribute one chromosome to every pair.

But the cells the T2T team used for their sequencing contained only one set of chromosomes, inherited from the father. This made it easier to reconstruct the precise sequence, but it also meant that the T2T genome could not reveal how DNA varies within a person.

That is, despite the enormous progress made by T2T, the genome they sequenced is one version of the genome, and it does not represent a person who actually lived. It is not “the” human genome.

But this sequenced genome will now lay the groundwork for new genomic research.

With the ability to read the entire human genome, scientists now hope to be able to sequence the genomes of people from different populations around the world to build a true picture of the genetic diversity of our species.

In other words, the real achievement will be to read many genomes, allowing us to see how their regions vary within a person, from one person to another, from one population to another, or from one species to another.

“There are many variants or differences in every organism, about 5 million in every human. The vast majority of variants do not produce any effect, but a small percentage do,” explains Dr. Manuel Corpas.

“Understanding the effects these variants cause and how they determine the function of the organism is one of the main frontiers of knowledge about the genome, but not the only one.”

“Clarifying what predisposes us to common and rare diseases is therefore one of the main goals to be achieved.”

“Another important goal is to understand how the many variants that determine the occurrence of cancer develop within the body to produce tumors,” adds the expert.

The human pangenome

A new effort in this regard is that of scientists from the so-called Human Pangenome Reference Consortium.

Together with T2T, the Pangenome Consortium hopes to sequence the genomes of about 450 people from around the world to better understand how DNA varies within a person and from person to person.

One of the main goals of this knowledge will be to identify variants that contribute to disease risk and to make personalized medicine possible in the future.

“The possibility of developing cancer therapies that are personalized for the patient is a very active field, as is pharmacogenomics, that is, how genetics affects the optimal dose for each of us, or even adverse reactions to drugs,” explains genomics professor Manuel Corpas.

The expert adds that efforts are also being made to modify our genetic code with techniques such as CRISPR, which aim to “edit” genes in order to remove or correct the errors that cause diseases.

But this, Manuel Corpas points out, is “just the tip of the iceberg”: the medicine of the future will be based on genomics and on the way genetic information is inherited and changes from one generation to the next.

Achievements

Much of what was promised in 1990 when the Human Genome Project was launched has already been realized.

Today we know much more about the functions of many genes and their role in diseases ranging from breast cancer to schizophrenia.

In practice, however, genomic medicine has not been able to get very far, since most diseases have been found to be influenced by hundreds of genes.

Very few inherited diseases are caused by a single faulty gene, and genetic testing to screen for rare diseases is generally offered only to people thought to be at the highest risk.

But genetics has managed to change our understanding of human evolution.

For example, we now know that our ancestors interbred with other hominids such as Neanderthals.

The question that arises is whether we will finally succeed in completing the human genome with new initiatives, such as the Human Pangenome project.

The answer is no.

The reason is that there is no single human genome. Everyone’s DNA is different and those differences matter.

“We all have a unique genome that determines our response to pathogens, disease, drugs, etc.,” explains Dr. Corpas.

“The time will come when the reference genome will be each person’s own genome, unique to each individual, used to detect and predict disease before symptoms appear.”

“In the meantime, there’s a lot we can already do with the common variation we find, at different frequencies, in populations of individuals.”

“These variations help us understand why Asians are less tolerant to alcohol or lactose and why Europeans are more susceptible to skin cancer,” says the genomics professor.

Therefore, we will only really understand the genome when we have data on how it varies from person to person and from population to population.

In short, as long as there are people, there will be new genomes, and we will never fully read the human genome.