FrBMedQA: the first French biomedical question answering dataset

Zakaria Kaddari; Toumi Bouchentouf

doi:10.11591/ijai.v11.i4.pp1588-1595

Abstract

FrBMedQA is the first French biomedical question answering dataset, containing more than 41k passage-question instances. It was constructed automatically, in a cloze style, from French Wikipedia articles on biomedical topics. To test the validity and difficulty of the dataset, we experimented with four statistical baseline models, a biomedical bidirectional encoder representations from transformers (BERT)-based model, and two French BERT-based language models. We also conducted a human evaluation on a subset of the test set. None of the three tested models surpassed the best-performing baseline. Human performance, at 61.11%, leads the leaderboard, more than 8% ahead of the best-performing model. We have made the dataset and the code to reproduce our results available.
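
To illustrate what a cloze-style passage-question instance looks like, the following is a minimal sketch and not the authors' construction pipeline, which the abstract does not describe in detail. The function name `make_cloze_instance`, the `@placeholder` mask token, the dictionary layout, and the sample French sentences are all assumptions made for this example.

```python
import re

# Hypothetical mask token; the token used in the actual dataset may differ.
PLACEHOLDER = "@placeholder"


def make_cloze_instance(passage, sentence, answer):
    """Build one cloze-style instance by masking `answer` in `sentence`.

    Returns None when the answer does not occur in the sentence as a whole word.
    """
    # Match the answer as a whole word, ignoring case.
    pattern = re.compile(r"\b" + re.escape(answer) + r"\b", flags=re.IGNORECASE)
    if not pattern.search(sentence):
        return None
    # Mask only the first occurrence to form the question.
    question = pattern.sub(PLACEHOLDER, sentence, count=1)
    return {"passage": passage, "question": question, "answer": answer}


# Illustrative input only; these sentences are not taken from the dataset.
passage = "L'insuline est une hormone sécrétée par le pancréas. Elle régule la glycémie."
sentence = "L'insuline est une hormone sécrétée par le pancréas."
instance = make_cloze_instance(passage, sentence, "insuline")
print(instance["question"])
```

A model is then asked to recover the masked term from the passage, and accuracy is the fraction of instances where the predicted term matches the stored answer.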

Keywords

Biomedical; Dataset; FrBMedQA; Information retrieval; Question answering

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

IAES International Journal of Artificial Intelligence (IJ-AI)
ISSN/e-ISSN 2089-4872/2252-8938
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).
