31 May 2022
Meta is developing an algorithm to address the gender gap on Wikipedia.
According to the Wikimedia Foundation, only 20% of the biographies on the world’s most famous online encyclopedia are about women.
That’s why a researcher at the company founded by Mark Zuckerberg has launched an artificial intelligence project that helps Wikipedia editors create and update biographies of women whose work merits an entry. And the applications of this system do not stop there…
How does this system work? What is the real impact it can have on the world?
Today, we at Social Thingum will explain how this Artificial Intelligence system works and how it is helping the world discover female figures who are currently largely unknown. If you want to receive updates on the latest news in Innovation and Artificial Intelligence, follow our page on LinkedIn.
And now, let’s begin!
Overcoming the Gender Gap Online
Angela Fan, now a researcher at Meta, developed a method during her PhD studies at the Université de Lorraine that uses artificial intelligence to narrow the gender gap in online biographical writing.
The researcher was inspired by an episode from her childhood. In third grade, a teacher asked her to write an essay about a historical figure of her choice. The only requirement was that information about that person be available in one of the books in the library.
Angela wanted to write about Eleanor Roosevelt, but her research turned up no book that told her story. She therefore had to write about President Franklin Delano Roosevelt, husband of the great human rights activist and First Lady.
Fan believes that if the same task were assigned today, the same problem would occur despite the enormous number of sources now available. According to the researcher, women are not absent from online sources, but they are heavily underrepresented.
One example is Donna Strickland, winner of the Nobel Prize in Physics, who had no Wikipedia entry at the time her prize was announced.
As the Wikimedia Foundation notes, only 20% of the biographies on Wikipedia are about women, even though Wikipedia is among the 10 most visited websites in the world.
These are the reasons that led Angela Fan to undertake the project.
The Project
In Angela Fan’s project, artificial intelligence is used to retrieve information about the women in question from various web pages. The system then generates a detailed biography that Wikipedia editors can use to quickly create new encyclopedia entries, thereby increasing female representation online.
The model uses a retrieval-augmented generation (RAG) architecture. A retrieval component identifies the most relevant information available online on a given subject, and the retrieved passages are then passed to a sequence-to-sequence (seq2seq) generator, which compiles the biography.
With this architecture, the system retrieves in just a few seconds information such as the place and date of birth, schools attended, and positions held by the person in question. The model can also automatically append to the generated text citations for the sources used during retrieval.
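To make the retrieve-then-generate idea more concrete, here is a minimal sketch in Python. It is not Meta's implementation: the retriever is a simple word-overlap (cosine similarity) ranking, the "generator" is a plain template standing in for a trained seq2seq model, and the corpus, the person "Jane Example", and the example.org URLs are all hypothetical placeholders.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Passage:
    source: str  # hypothetical URL of the page the text came from
    text: str


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, corpus, k=2):
    """Return up to k passages that share words with the query, best first."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(p.text))), p) for p in corpus]
    scored = [(s, p) for s, p in scored if s > 0]  # drop unrelated passages
    scored.sort(key=lambda sp: sp[0], reverse=True)
    return [p for _, p in scored[:k]]


def generate(subject, passages):
    """Stand-in for the seq2seq generator (assumption: a real system would
    use a trained neural model here). Joins the retrieved passages and
    appends a numbered list of the sources that were used."""
    if not passages:
        return f"No sources found for {subject}."
    body = " ".join(f"{p.text} [{i}]" for i, p in enumerate(passages, 1))
    refs = "\n".join(f"[{i}] {p.source}" for i, p in enumerate(passages, 1))
    return f"{subject}\n{body}\nSources:\n{refs}"


# Hypothetical toy corpus; a real system would search the web instead.
CORPUS = [
    Passage("https://example.org/a", "Jane Example was born in 1950 in Springfield."),
    Passage("https://example.org/b", "Jane Example studied physics and later led a research laboratory."),
    Passage("https://example.org/c", "The weather in Springfield is mild in autumn."),
]

if __name__ == "__main__":
    hits = retrieve("Jane Example born studied", CORPUS)
    print(generate("Jane Example", hits))
```

The point of the sketch is the division of labor: retrieval decides which passages are relevant, generation turns them into text, and because each sentence is tied to a retrieved passage, the sources can be listed alongside the output.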
Unfortunately, the quality of the output depends significantly on the amount of information available online.
Many people belonging to marginalized social groups are currently underrepresented, not only on Wikipedia but across the entire web. Meta’s researchers say they want to overcome this obstacle through further work, so that these groups receive the representation they deserve.
Irene Azzimondi
