Science doesn’t discriminate, but technology arguably does, at least in terms of accessibility. New research has found that the unequal distribution of compute power in academia is promoting inequality in the era of deep learning. The study, conducted jointly by AI researchers from Virginia Tech and Western University, found that this de-democratisation of AI has pushed people to leave academia for high-paying industry jobs.
The study found that the compute power available at elite universities, ranked among the top 50 in the QS World University Rankings, far exceeds that at mid-to-low tier institutions. For the research, the authors analysed over 170,000 papers presented across 60 prestigious computer science conferences, such as ACL, ICML, and NeurIPS, in categories like computer vision, data mining, NLP, and machine learning.
Recording their study in a paper titled ‘The De-democratization of AI: Deep Learning and the Compute Divide in Artificial Intelligence Research’, the authors noted, “This compute divide between large firms and non-elite universities increases concerns around bias and fairness within AI technology, and presents an obstacle towards “democratising” AI. These results suggest that a lack of access to specialised equipment such as compute can de-democratise knowledge production.”
De-Democratisation of AI
The study found that mid-range universities (ranked below 300) have published six fewer papers at AI research conferences than elite universities since the rise of deep learning. Fortune 500 companies, Big Tech leaders and elite universities saw dramatically different trends.
The ‘modern era’ of AI research, which started around 2012, has been characterised largely by compute usage far greater than ever seen before. In fact, the compute used in the largest AI training runs has been doubling every 3.4 months, as opposed to the earlier pace of doubling every two years. This increase is attributed to specialised hardware and advances in processing units such as GPUs for training models.
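To see how stark the gap between the two doubling rates is, a short sketch can compound each rate over the same window. The six-year horizon and the helper function below are illustrative choices, not figures from the study; only the 3.4-month and two-year doubling periods come from the article.

```python
def growth_factor(months: float, doubling_period_months: float) -> float:
    """Multiplier on compute after `months`, given a fixed doubling period."""
    return 2 ** (months / doubling_period_months)

# Over an illustrative six-year span (72 months):
modern_era_growth = growth_factor(72, 3.4)  # ~3.4-month doubling
two_year_growth = growth_factor(72, 24)     # earlier two-year doubling

print(f"3.4-month doubling over 6 years: x{modern_era_growth:,.0f}")
print(f"Two-year doubling over 6 years:  x{two_year_growth:,.0f}")
```

Under the earlier pace, compute grows 8x in six years; under the 3.4-month doubling it grows by a factor in the millions, which is why access to specialised hardware has become such a decisive advantage.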
The advancement in hardware and software capabilities has significantly outpaced their availability and accessibility, and access to specialised hardware appears to be concentrated in certain research groups.
Drawing on previous research, the authors of the study formulated a simple model of knowledge production to understand how assets such as datasets and compute play a role. They assumed that scientific knowledge production depends on knowledge, skills, materials, equipment and effort through the following relationship:
KP = f(knowledge, skills, materials, equipment, effort)
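The paper leaves the functional form of f unspecified. One common way to instantiate such a production function is a multiplicative (Cobb-Douglas-style) model, in which output suffers sharply if any single input is scarce. The form and the equal weights below are illustrative assumptions, not taken from the study.

```python
def knowledge_production(knowledge, skills, materials, equipment, effort,
                         weights=(0.2, 0.2, 0.2, 0.2, 0.2)):
    """Illustrative Cobb-Douglas-style model of KP = f(...).

    The multiplicative form (assumed here, not from the paper) means no
    amount of skill fully compensates for a lack of equipment, and vice
    versa -- each input contributes according to its exponent weight.
    """
    inputs = (knowledge, skills, materials, equipment, effort)
    kp = 1.0
    for value, weight in zip(inputs, weights):
        kp *= value ** weight
    return kp

# All else equal, a lab with 10x the equipment (e.g. compute) out-produces
# an otherwise identical lab -- the mechanism behind the "compute divide".
well_resourced = knowledge_production(1, 1, 1, 10, 1)
under_resourced = knowledge_production(1, 1, 1, 1, 1)
print(well_resourced > under_resourced)
```

Under this toy model, unequal access to a single input such as compute is enough to drive a persistent gap in research output between institutions.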
For the entire study, the data was collected from sources such as csrankings.org, Elsevier’s Scopus, the QS World University Rankings, the US News & World Report rankings, and Fortune magazine.
The second aspect of the research showed that the presence of corporates and big firms in AI-related research has also gone up in the last few years. This increase of firms in the research space has been attributed to two points:
- Firms have increased ‘firm-only’ publications, meaning publications without any independent participation from third parties (universities or research institutes).
- Firms have actively promoted collaborations with universities. In such collaborations, firms were found to favour top-ranked universities over mid- and low-ranked universities combined.
Drawing on their research, the authors also say that the ‘compute divide’ runs along a series of social fault lines. Elite universities typically house wealthy students, and diversity at such universities is poor. Big Tech firms, too, are found to be lacking in diversity, particularly among engineers, product designers, and AI researchers. This inevitably lets personal biases creep into research and products. And since AI has become a mainstream, general-purpose technology that affects businesses and the private and public lives of people, this demographic imbalance has widespread consequences.
Wrapping Up
Many researchers in the past have highlighted the divide in access and availability across the AI ecosystem, and this study references quite a few of them. Since the study is largely based on its survey of US universities, the authors, in their concluding note, suggest to the concerned authorities that public datasets be shared indiscriminately. They also suggest that the government fund research and outreach to attract people from nontraditional backgrounds.
Read the full paper here.