A UdeM computer-science professor wins a big U.S. prize

Irina Rish

Credit: Amélie Philibert | Université de Montréal

In 5 seconds

As part of a U.S. Department of Energy program and as the sole Canadian-based researcher, Irina Rish gets to spend 990,000 node-hours on the IBM Summit supercomputer at Oak Ridge National Laboratory.

The U.S. Department of Energy's Office of Science has released the list of the 56 winners of its 2023 INCITE program – and renowned Université de Montréal computer-science professor Irina Rish is among them.

The sole Canadian-based scientist on the list, Rish was awarded 990,000 “node-hours” to work over the next 12 months on the powerful IBM Summit supercomputer at Oak Ridge National Laboratory in Tennessee.

A node-hour is the use of one node (or computing unit) of a supercomputer for one hour; 990,000 node-hours could mean, for example, running 990 nodes for 1,000 hours.

Having access to the Summit will allow Rish to develop her project ‘Scalable Foundational Models for Transferable Generalist AI,’ which aims to train large-scale multimodal deep neural network models.

“With this support,” she said, “we are not only getting a supercomputing resource, we are also taking a step towards our long-term goal of democratizing artificial intelligence and demonstrating that universities and open source communities can compete in this field.”

Rish is a professor in UdeM’s Department of Computer Science and Operations Research and a member of Mila. She holds a Canada Excellence Research Chair in Autonomous Artificial Intelligence and a Canada CIFAR AI Chair.

20 years at IBM

Her research interests include AI, machine learning and neural data analysis. At UdeM since 2019, she previously taught at Columbia University and spent 20 years at the IBM T. J. Watson Research Center, where she worked on several projects in neuroscience and AI.

Established in 2003 by U.S. Under Secretary for Science Raymond Lee Orbach, the Innovative and Novel Computational Impact on Theory and Experiment (INCITE) program aims to assist research communities in advancing science and engineering.

The selection process for the winners is highly competitive and takes place over a four-month period. Experts and peer review committees evaluate the projects from various angles to select the teams that will have access to one of the program's four supercomputers for one year.

Of the 97 applicants this year, more than half were awarded supercomputing time, including Rish and her two co-investigators: Stella Biderman, a Georgia Institute of Technology master’s student who is chief scientist and organizer of EleutherAI, a grassroots AI research collective; and Jenia Jitsev, a senior researcher who leads the Scalable Learning and Multi-Purpose AI Lab at Germany’s Juelich Supercomputing Center and is the scientific lead and founder of LAION, a non-profit organization focused on open-source AI.

Additional collaborators who contributed to the INCITE proposal include Quentin Anthony, Guillermo Cecchi, Mehdi Cherti, Guillaume Dumas, Eric Hallahan, Yonggang Hu, Sergey Panitkin, Christoph Schuhmann and Rio Yokota.

In their work in Tennessee, besides training large-scale neural network models, Rish and her team will investigate scaling laws and emergent behavior, and apply their generalization and knowledge-transfer capabilities in a variety of practical applications.

‘Alignment with human values’

The team’s goal is to help advance AI from narrow to broad AI while ensuring AI safety and “alignment with human values, and contribute towards advances in other fields such as healthcare, biomedical sciences, and others,” said Rish.

That includes developing generic, powerful, large-scale models that are pretrained in a self-supervised way on a broad variety of datasets. Such models could serve as a foundation of transferable knowledge and have much wider applications than current AI allows.

Building on recent successes in this area, the team plans to train large-scale neural network models called Transformers, which recently demonstrated impressive performance in language modeling and image processing.

Rish and her colleagues will evaluate how these models scale with larger models and pretraining datasets, and plan to extend them to handle a much wider range of modalities beyond text and images, as well as various machine-learning tasks.

The goal will be to expand them towards adaptive, continually learning systems and develop them for predictive modeling in several applications such as healthcare and brain imaging, said Rish.

“And all this, we’ll make publicly available, so that everyone can benefit. Science is for everyone, and this project is an example of that.”