Species Classification for Neuroscience Literature Based on Span of Interest Using Sequence-to-Sequence Learning Model

Research output: Contribution to journal › Journal article › Research › peer-review

Standard

Species Classification for Neuroscience Literature Based on Span of Interest Using Sequence-to-Sequence Learning Model. / Zhu, Hongyin; Zeng, Yi; Wang, Dongsheng; Huangfu, Cunqing.

In: Frontiers in Human Neuroscience, Vol. 14, 128, 21.04.2020.

Harvard

Zhu, H, Zeng, Y, Wang, D & Huangfu, C 2020, 'Species Classification for Neuroscience Literature Based on Span of Interest Using Sequence-to-Sequence Learning Model', Frontiers in Human Neuroscience, vol. 14, 128. https://doi.org/10.3389/fnhum.2020.00128

APA

Zhu, H., Zeng, Y., Wang, D., & Huangfu, C. (2020). Species Classification for Neuroscience Literature Based on Span of Interest Using Sequence-to-Sequence Learning Model. Frontiers in Human Neuroscience, 14, [128]. https://doi.org/10.3389/fnhum.2020.00128

Vancouver

Zhu H, Zeng Y, Wang D, Huangfu C. Species Classification for Neuroscience Literature Based on Span of Interest Using Sequence-to-Sequence Learning Model. Frontiers in Human Neuroscience. 2020 Apr 21;14:128. https://doi.org/10.3389/fnhum.2020.00128

Author

Zhu, Hongyin ; Zeng, Yi ; Wang, Dongsheng ; Huangfu, Cunqing. / Species Classification for Neuroscience Literature Based on Span of Interest Using Sequence-to-Sequence Learning Model. In: Frontiers in Human Neuroscience. 2020 ; Vol. 14.

Bibtex

@article{84822c88659b4263b9d0c5e36a63472b,

title = "Species Classification for Neuroscience Literature Based on Span of Interest Using Sequence-to-Sequence Learning Model",

abstract = "Large-scale neuroscience literature call for effective methods to mine the knowledge from species perspective to link the brain and neuroscience communities, neurorobotics, computing devices, and AI research communities. Structured knowledge can motivate researchers to better understand the functionality and structure of the brain and link the related resources and components. However, the abstracts of massive scientific works do not explicitly mention the species. Therefore, in addition to dictionary-based methods, we need to mine species using cognitive computing models that are more like the human reading process, and these methods can take advantage of the rich information in the literature. We also enable the model to automatically distinguish whether the mentioned species is the main research subject. Distinguishing the two situations can generate value at different levels of knowledge management. We propose SpecExplorer project which is used to explore the knowledge associations of different species for brain and neuroscience. This project frees humans from the tedious task of classifying neuroscience literature by species. Species classification task belongs to the multi-label classification which is more complex than the single-label classification due to the correlation between labels. To resolve this problem, we present the sequence-to-sequence classification framework to adaptively assign multiple species to the literature. To model the structure information of documents, we propose the hierarchical attentive decoding (HAD) to extract span of interest (SOI) for predicting each species. We create three datasets from PubMed and PMC corpora. We present two versions of annotation criteria (mention-based annotation and semantic-based annotation) for species research. Experiments demonstrate that our approach achieves improvements in the final results. Finally, we perform species-based analysis of brain diseases, brain cognitive functions, and proteins related to the hippocampus and provide potential research directions for certain species.",

keywords = "brain science, cognitive computing, corpus annotation, linked brain data, multi-label classification, neuroscience, PubMed",

author = "Hongyin Zhu and Yi Zeng and Dongsheng Wang and Cunqing Huangfu",

year = "2020",

month = apr,

day = "21",

doi = "10.3389/fnhum.2020.00128",

language = "English",

volume = "14",

journal = "Frontiers in Human Neuroscience",

issn = "1662-5161",

publisher = "Frontiers Research Foundation",

}

RIS

TY - JOUR

T1 - Species Classification for Neuroscience Literature Based on Span of Interest Using Sequence-to-Sequence Learning Model

AU - Zhu, Hongyin

AU - Zeng, Yi

AU - Wang, Dongsheng

AU - Huangfu, Cunqing

PY - 2020/4/21

Y1 - 2020/4/21

AB - Large-scale neuroscience literature call for effective methods to mine the knowledge from species perspective to link the brain and neuroscience communities, neurorobotics, computing devices, and AI research communities. Structured knowledge can motivate researchers to better understand the functionality and structure of the brain and link the related resources and components. However, the abstracts of massive scientific works do not explicitly mention the species. Therefore, in addition to dictionary-based methods, we need to mine species using cognitive computing models that are more like the human reading process, and these methods can take advantage of the rich information in the literature. We also enable the model to automatically distinguish whether the mentioned species is the main research subject. Distinguishing the two situations can generate value at different levels of knowledge management. We propose SpecExplorer project which is used to explore the knowledge associations of different species for brain and neuroscience. This project frees humans from the tedious task of classifying neuroscience literature by species. Species classification task belongs to the multi-label classification which is more complex than the single-label classification due to the correlation between labels. To resolve this problem, we present the sequence-to-sequence classification framework to adaptively assign multiple species to the literature. To model the structure information of documents, we propose the hierarchical attentive decoding (HAD) to extract span of interest (SOI) for predicting each species. We create three datasets from PubMed and PMC corpora. We present two versions of annotation criteria (mention-based annotation and semantic-based annotation) for species research. Experiments demonstrate that our approach achieves improvements in the final results. Finally, we perform species-based analysis of brain diseases, brain cognitive functions, and proteins related to the hippocampus and provide potential research directions for certain species.

KW - brain science

KW - cognitive computing

KW - corpus annotation

KW - linked brain data

KW - multi-label classification

KW - neuroscience

KW - PubMed

UR - http://www.scopus.com/inward/record.url?scp=85084364415&partnerID=8YFLogxK

U2 - 10.3389/fnhum.2020.00128

DO - 10.3389/fnhum.2020.00128

M3 - Journal article

C2 - 32372933

AN - SCOPUS:85084364415

VL - 14

JO - Frontiers in Human Neuroscience

JF - Frontiers in Human Neuroscience

SN - 1662-5161

M1 - 128

ER -

ID: 243526271

Department of Computer Science