Tuesday, October 4, 2022

AI could be used to generate more biographies of women on Wikipedia

An algorithm capable of writing biographies of women who deserve to be included in Wikipedia has been developed.

The idea is to automatically submit a multitude of profiles that the editors of the famous online encyclopedia can then validate and complete.

At present, just 20 per cent of the biographies on Wikipedia in English concern women. The low proportion of women is even more obvious in certain fields, such as science, and is extremely glaring when it comes to women of Asian or African origin. However, things could change. Angela Fan, of Facebook AI Research Paris, worked on a PhD project as a computer science student at the Université de Lorraine, CNRS, in France that involved developing a solution that could help remedy this imbalance through the use of artificial intelligence.

This “intelligent” system was trained to search the internet and “automatically” write the first draft of a future article to be published on Wikipedia, in the style of the online encyclopedia. The goal is to quickly give Wikipedia editors thousands of new, reliable and interesting biographical articles about prominent people who are not yet on the site.

For example, this model was used to generate a short biography of Libbie Hyman, a pioneer in the field of invertebrate zoology. The text was generated from information collected in a reference article, supplemented by additional details found on the web. The AI retained only information that was considered relevant and that came from reliable sources. It’s envisioned that this type of text, sometimes very short, could serve as a starting point for a new Wikipedia entry, which contributors to the encyclopedia could then flesh out.

In this case, the AI's ability to search for the right information is based on large-scale pretraining, which uses machine learning to identify useful information such as date and place of birth, schools attended and professional background. A generation module then creates the text from this information, and a citation module builds the bibliography by “looking back” at the sources used, creating a solid basis for an entry.
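To make the three stages described above more concrete, here is a minimal sketch in Python. It is only an illustration and not the researchers' system: simple keyword matching stands in for the pretrained retrieval model, sentence joining stands in for the generation module, and the citation step just records which source each sentence came from. Every name in it (`Source`, `retrieve`, `generate`, `KEYWORDS`), the subject "Jane Doe" and the example.org addresses are invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Source:
    url: str        # hypothetical address of a web page
    text: str       # plain text found on that page
    reliable: bool  # assumed to be decided by some earlier check


# Assumed stand-in for "useful information": a few biographical keywords.
KEYWORDS = ("born", "studied", "worked")


def retrieve(subject, sources):
    """Keep sentences from reliable sources that mention the subject and a keyword."""
    facts = []
    for src in sources:
        if not src.reliable:
            continue  # ignore sources flagged as unreliable
        for sentence in src.text.split("."):
            sentence = sentence.strip()
            if subject in sentence and any(k in sentence for k in KEYWORDS):
                facts.append((sentence, src.url))
    return facts


def generate(facts):
    """Join the retained sentences into a draft and number the sources they came from."""
    bibliography = []
    parts = []
    for sentence, url in facts:
        if url not in bibliography:
            bibliography.append(url)
        parts.append(f"{sentence} [{bibliography.index(url) + 1}].")
    return " ".join(parts), bibliography


# Invented sample data, for illustration only.
sources = [
    Source("https://example.org/a", "Jane Doe was born in 1900. Jane Doe likes tea.", True),
    Source("https://example.org/b", "Jane Doe worked as a zoologist.", False),
    Source("https://example.org/c", "Jane Doe studied biology", True),
]

draft, bibliography = generate(retrieve("Jane Doe", sources))
print(draft)
print(bibliography)
```

In this toy version, each sentence carries its source along with it, so the bibliography falls out of the draft automatically. The system described in the article instead "looks back" at the sources after the text has been written.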

The goal of this initiative is to improve the “equitable availability” of content offered on Wikipedia in the near future. Further work could extend the approach to other underrepresented groups.

