Model predicts gene responses to cold across diverse plant species

Craig Chandler | University Communication
James Schnable (left), Rebecca Roston and their colleagues have developed a machine-learning model that, given just the DNA sequence of a single plant species, can predict how another's genes will turn off and on in the face of freezing temperatures.
March 4, 2021

Lincoln, Neb. — When Xiaoxi Meng and Zhikai Liang first proposed the idea a couple of years ago, James Schnable was skeptical. To say the least.

“‘Well, you can try, but I don’t think it’s going to work,’” the associate professor of agronomy and horticulture recalled saying to Meng and Liang, then postdoctoral researchers in Schnable’s lab at the University of Nebraska–Lincoln.

He was wrong and, in hindsight, never happier to be. Yet at the time, Schnable had fair reason to raise an eyebrow. The duo’s idea — that the DNA sequences of cold-sensitive crops that surrender to a hard frost could help predict how wilder, hardier plants tolerate freezing conditions — seemed audacious. To say the least. Still, it was a low-risk, high-reward proposition. Because if Meng and Liang could get it to work, it might just fast-track efforts to make cold-sensitive crops a little or even a lot more like their cold-resistant counterparts.

Some of the world’s most important crops were domesticated in tropical regions — corn in southern Mexico, sorghum in eastern Africa — that put no selective pressure on them to evolve defenses against cold or freezing. When those crops are grown in harsher climates, their sensitivity to cold limits how early they can be planted and how late they can be harvested. Shorter growing seasons equal less time for photosynthesis, resulting in smaller yields and less food for a global population expected to approach 10 billion people by 2050.

Plant species that already grow in colder climates, meanwhile, evolved tricks to endure the cold. They can reconfigure their cellular membranes to maintain fluidity at lower temperatures, preventing the membranes from freezing and fracturing. They can add dashes of sugars to the liquids in and around those membranes, lowering their freezing point in much the same way that salt does a sidewalk’s. They can even produce proteins that smother minuscule ice crystals before those crystals grow into cell-busting masses.

All of those defenses originate at the genetic level, though not just in the sequences of DNA itself. When plants begin to freeze, they can respond by essentially turning certain genes off or on — preventing or allowing their genetic instruction manuals to be transcribed and carried out. Knowing which genes cold-tolerant plants turn off and on in the face of freezing temperatures, then, can help researchers grasp the very foundations of their fortifications and, ultimately, engineer similar defenses into cold-sensitive crops.

But Schnable also knew, as Meng and Liang did, that even an identical gene often responds differently to cold across plant species, even closely related ones. Which means, frustratingly, that understanding how a gene responds to cold in one species tends to tell plant scientists almost nothing conclusive about the gene’s behavior in another. That unpredictability, in turn, has hindered efforts to learn the rules dictating what will deactivate or activate genes.

“We’re still really, really bad at understanding why genes turn off and on,” Schnable said.

Lacking a rulebook, the researchers turned to machine learning, a form of artificial intelligence that can essentially write its own. They specifically developed a supervised classification model — the sort that can, when presented with enough labeled images of, say, cats and not-cats, eventually learn to distinguish the former from the latter. The team initially presented its own model with an enormous pile of sequenced genes from corn, along with the average activity levels of those genes when the plant was subjected to freezing temperatures. The model was also fed “every feature we could think of” for each corn gene, Schnable said, including its length, its stability and any differences between it and other versions of it found in other corn plants.

Later, the researchers tested their model by concealing from it just one piece of information in a subset of those genes: whether they responded to the onset of freezing temperatures, or whether they didn’t. By analyzing the features of genes it had been told were either responsive or non-responsive, the model discerned which combinations of those features were relevant to each — and then successfully slotted the majority of the remaining, mystery-box genes into their correct categories.
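As a rough illustration of that train-then-conceal procedure, the sketch below fits a very simple classifier on genes whose labels are visible and scores it on genes whose labels are hidden. Everything in it is an assumption made for illustration: the data are synthetic, the three feature names are placeholders loosely based on the examples Schnable mentioned, and the nearest-centroid classifier is a stand-in, not the model the team built.

```python
import random

# Hypothetical feature names; the article lists gene length, stability and
# within-corn variation only as examples of what the model was fed.
FEATURES = ("length", "stability", "variation")


def make_genes(n, seed):
    """Return synthetic (features, responsive) pairs. Not real corn data."""
    rng = random.Random(seed)
    genes = []
    for _ in range(n):
        responsive = rng.random() < 0.5
        # Assumption: responsive genes have features shifted upward on average.
        center = 1.5 if responsive else 0.0
        features = [rng.gauss(center, 1.0) for _ in FEATURES]
        genes.append((features, responsive))
    return genes


def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [sum(column) / len(rows) for column in zip(*rows)]


def train(labeled_genes):
    """Learn one centroid per class from genes whose labels are visible."""
    responsive = [f for f, label in labeled_genes if label]
    unresponsive = [f for f, label in labeled_genes if not label]
    return centroid(responsive), centroid(unresponsive)


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(model, features):
    """Call a gene responsive if it sits closer to the responsive centroid."""
    responsive_center, unresponsive_center = model
    return (squared_distance(features, responsive_center)
            < squared_distance(features, unresponsive_center))


def accuracy(model, labeled_genes):
    """Fraction of genes whose hidden label the model recovers."""
    hits = sum(predict(model, f) == label for f, label in labeled_genes)
    return hits / len(labeled_genes)


genes = make_genes(1000, seed=0)
visible, hidden = genes[:800], genes[800:]  # labels of `hidden` are concealed
model = train(visible)
held_out_accuracy = accuracy(model, hidden)
print(f"held-out accuracy: {held_out_accuracy:.2f}")
```

The point of the split is that the score comes only from genes the classifier never saw labels for, so it reflects what was learned from the features rather than what was memorized.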

It was a promising start, no doubt. But the real test remained: Could the model take the training it had received in one species and apply it to another?

The answer was a definitive yes. After being trained with DNA data from just one of six species — corn, sorghum, pearl millet, proso millet, foxtail millet or switchgrass — the model was generally able to predict which genes in any of the other five would respond to freezing. To Schnable’s surprise, the model held up even when it was trained on a cold-sensitive species — corn, sorghum, pearl or proso millet — but tasked with predicting gene responses in the cold-tolerant foxtail millet or switchgrass.

“The models we trained worked almost as well across species as if you actually had data in one species and used the internal data to make the predictions in that same species,” he said, a hint of wonder lingering in his voice months later. “I really would not have predicted that.

“The idea that we can just feed all of this information into a computer, and it can figure out at least some rules to make predictions that work, is still kind of amazing to me.”
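The train-on-one, test-on-the-other-five setup can be sketched in a few lines. The example below is only a toy under stated assumptions: the six species names come from the study, but the gene data are randomly generated, the assumption that every species shares a similar signal is built in by hand, and the nearest-centroid classifier is a placeholder for whatever model the team actually used.

```python
import random

SPECIES = ["corn", "sorghum", "pearl millet", "proso millet",
           "foxtail millet", "switchgrass"]


def make_species(seed, n=400, n_features=3):
    """Synthetic labeled genes for one species. Not real expression data."""
    rng = random.Random(seed)
    # Assumption: every species shares the same kind of signal, but its
    # strength per feature differs somewhat from species to species.
    signal = [rng.uniform(1.2, 1.8) for _ in range(n_features)]
    genes = []
    for _ in range(n):
        responsive = rng.random() < 0.5
        features = [rng.gauss(s if responsive else 0.0, 1.0) for s in signal]
        genes.append((features, responsive))
    return genes


def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [sum(column) / len(rows) for column in zip(*rows)]


def train(genes):
    """Learn one centroid per class (responsive, non-responsive)."""
    on = [f for f, label in genes if label]
    off = [f for f, label in genes if not label]
    return centroid(on), centroid(off)


def predict(model, features):
    """Call a gene responsive if it is closer to the responsive centroid."""
    on_center, off_center = model
    d_on = sum((x - c) ** 2 for x, c in zip(features, on_center))
    d_off = sum((x - c) ** 2 for x, c in zip(features, off_center))
    return d_on < d_off


def accuracy(model, genes):
    """Fraction of genes whose label the model predicts correctly."""
    return sum(predict(model, f) == label for f, label in genes) / len(genes)


data = {name: make_species(seed) for seed, name in enumerate(SPECIES)}

# Train on each species in turn, then score on every other species.
cross_species = {}
for source in SPECIES:
    model = train(data[source])
    for target in SPECIES:
        if target != source:
            cross_species[(source, target)] = accuracy(model, data[target])

for source in SPECIES:
    scores = [acc for (src, _), acc in cross_species.items() if src == source]
    mean_score = sum(scores) / len(scores)
    print(f"trained on {source:>14}: mean accuracy elsewhere {mean_score:.2f}")
```

In this toy, transfer works because the shared signal was written into the data generator. What surprised Schnable is that real genomes behaved in a comparable way.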

Those predictions could prove especially useful when considering the alternative. For roughly a decade, plant biologists have actually been able to measure the number of RNA molecules (the transcripts that carry a gene’s DNA instructions to the rest of the cell) produced by every gene in a living plant. But comparing how that gene expression responds to cold in living specimens, and across multiple species, is a painstaking undertaking, Schnable said. That’s particularly true with wild plants, whose seeds can be difficult to even acquire. Those seeds may not germinate when expected, if at all, and can take years to grow. Even if they do, every resulting plant has to be cultivated in an identical, controlled environment and studied at the same developmental stage.

All of that poses a massive challenge to growing enough wild specimens, from enough wild species, to replicate and statistically evaluate their genes’ responses to cold.

“If we really want to get at what genes are important — that actually play a role in how the plant adapts to cold — we need to be looking at more than two species,” Schnable said. “We want to look at a group of species that are tolerant of cold and a group that are sensitive, and look at the patterns: ‘This same gene always responds in one and always doesn’t respond in the other.’

“That starts to become a really big and expensive experiment. It’d be really nice if we could just make predictions from the DNA sequences of those species instead of, say, taking 20 species and trying to get all of them at the same stage, put them all through the exact same stress treatments, and measure the amount of RNA produced for each gene in each species.”
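The pattern Schnable describes, a gene that always responds in one group of species and never in the other, amounts to a simple screen once per-species predictions exist. The sketch below assumes those predictions are already in hand and that matching genes across species have already been paired up. The gene names and response calls are invented for illustration; only the species groupings come from the study.

```python
# Species groupings as described in the study.
TOLERANT = {"foxtail millet", "switchgrass"}
SENSITIVE = {"corn", "sorghum", "pearl millet", "proso millet"}

# Hypothetical predictions: for each gene, the set of species in which it is
# predicted to respond to cold. Gene names and calls are made up.
predicted_responders = {
    "gene_A": {"foxtail millet", "switchgrass"},
    "gene_B": {"corn", "foxtail millet", "switchgrass"},
    "gene_C": {"switchgrass"},
    "gene_D": {"corn", "sorghum", "pearl millet", "proso millet"},
}


def split_pattern(responders):
    """Label a gene that responds in all of one group and none of the other."""
    if responders == TOLERANT:
        return "tolerant-only"
    if responders == SENSITIVE:
        return "sensitive-only"
    return None  # mixed or partial response: no clean split


patterns = {}
for gene, responders in predicted_responders.items():
    label = split_pattern(responders)
    if label is not None:
        patterns[gene] = label

print(patterns)
```

Here only gene_A and gene_D pass the screen. With just two tolerant species such a split could easily arise by chance, which is why the researchers want many more species in each group.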

Fortunately for the model, researchers have already sequenced the genomes of more than 300 plant species. An ongoing international effort could push that number as high as 10,000 over the next few years.

Though the model has already wildly exceeded his modest expectations, Schnable said the next step will nevertheless involve “convincing both ourselves and other people” that it works as well as it has so far. In every test case to date, the researchers have asked the model to tell them what they already knew. The ultimate test, he said, will come when both the humans and the machine are starting from scratch.

“The next big experiment I think we need to do is to make predictions on a species where we don’t have any data at all,” he said. “To convince people that it really works in cases where even we don’t know the answers.”

The team reported its findings in the journal Proceedings of the National Academy of Sciences. Meng, Liang and Schnable authored the study with Nebraska’s Rebecca Roston, Yang Zhang, Samira Mahboub and undergraduate student Daniel Ngu, along with Xiuru Dai, a visiting scholar from Shandong Agricultural University.

The researchers received funding from the U.S. Department of Agriculture’s National Institute of Food and Agriculture and the U.S. Department of Energy’s Office of Science.

by Scott Schrage | University Communication