This researcher has created an AI that writes biographies of notable women

The online encyclopedia Wikipedia is a wonderful humanistic project, but even it is not free from gender bias. Women are the subject of only 20% of its English-language biographies, and they make up only 15% of Wikipedia editors.
To counter this under-representation, Angela Fan, a researcher at Meta’s FAIR Paris center, created an artificial intelligence capable of compiling information from the Web about a person and turning it into a Wikipedia-style biography article.

The task is far from straightforward. Admittedly, models such as GPT-3 can already generate impressively convincing text on a given subject. But here the bar is higher: the goal is not to invent sentences one after another, but to compose a complete article built on sourced information.

“Text-generating AIs such as GPT-3 struggle to plan and structure long text. Beyond a paragraph, they lose coherence,” explains Angela Fan.

To overcome this problem, the researcher designed a highly structured system that works as a framework. It subdivides the biography into sections (introduction, early life, career, etc.) and, for each of them, carries out three machine-learning-based steps: searching for and ranking relevant information on the Web, generating text, and integrating references.
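The section-by-section loop described above can be pictured with a minimal Python sketch. It is not the researcher's code: the names `Evidence`, `retrieve`, `generate`, `cite` and `build_biography` are hypothetical, the section list is illustrative, and each stage is a toy stand-in (word overlap instead of a learned retriever, simple concatenation instead of a language model).

```python
from dataclasses import dataclass

# Illustrative section names; the real system's sections are not specified here.
SECTIONS = ["introduction", "early life", "career"]


@dataclass
class Evidence:
    """A snippet of Web text together with the URL it came from."""
    url: str
    text: str


def retrieve(name, section, corpus, top_k=2):
    """Stage 1 (toy): rank snippets by word overlap with 'name + section'."""
    query = set(f"{name} {section}".lower().split())
    scored = []
    for doc in corpus:
        overlap = len(query & set(doc.text.lower().split()))
        if overlap > 0:
            scored.append((overlap, doc))
    # Sort on the score only; ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def generate(section, evidence):
    """Stage 2 (toy): a real system would call a text generator here."""
    return " ".join(doc.text for doc in evidence)


def cite(text, evidence):
    """Stage 3 (toy): append numbered references for the snippets used."""
    if not evidence:
        return text
    refs = " ".join(f"[{i}] {doc.url}" for i, doc in enumerate(evidence, 1))
    return f"{text} {refs}"


def build_biography(name, corpus, top_k=2):
    """Run the three stages once per section and collect the results."""
    biography = {}
    for section in SECTIONS:
        evidence = retrieve(name, section, corpus, top_k)
        draft = generate(section, evidence)
        biography[section] = cite(draft, evidence)
    return biography
```

Calling `build_biography("Ada Example", corpus)` with a list of `Evidence` snippets returns one cited paragraph per section. Keeping the stages separate means each one could be swapped for a trained model without touching the loop.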

The system was trained on a set of over 677,000 biographies from Wikipedia. To measure its performance more precisely, Angela Fan put together a dedicated evaluation set of 1,527 biographies of women. She then used her system to generate a series of biographies. Unfortunately, the results are still far from perfect.
Of the information contained in the generated articles, 32% can also be found in the reference biographies and 17% can be traced to Web sources. The rest, roughly half, is invention that cannot be verified.

Another result: the more information there is about a person on the Web, the better the quality of the biography generated. And this is also one of the fundamental problems.

“On the web, there is less information about female personalities than male ones. That makes generating biographies more difficult,” notes Angela Fan.

It’s something of a vicious circle.

Either way, this research is a good start and points the way for future work. The system could also be used to counterbalance other inequalities, such as linguistic, cultural or political ones.

Xiang Qin
