8.22 MSR Talk & 组会笔记
Published:
Microsoft Research Talk
Machine Reading for Precision Medicine
Annotation Bottleneck
Document-Level N-array Relation Extraction
Self-supervised Learning
Knowledge Base
Distant supervision: Relation from database/knowledge graph as noisy labels.
Knowledge -> Probabilistic Logic -> Distant Supervised Label -> Deep Learning
Neural Architecture
Graph LSTM
Graph -> Exploit rich linguistic structures
Multi-scale Representation Learning
for Document-Level relation extraction
Input Text -> Mention-level representations -> Entity-level representation -> final predictions
“Document-level n-ary relation extraction with Multi-scale representation learning” - NAACL-19
Application
Machine reading for assisted curation. curator
- molecular tumor board
- EMR: 60%-80% in unstructured text
Some discussion
BERT is too general. Prior knowledge is very useful and may be added during training BERT.
CoAI Lab Disscussion
Pre-training Language Model
Auto-regressive language model behave as well as non-autoregressive auto-encoder in pre-training methods.