With negation detection turned off, the precision is ~0.4 and the recall is ~0.7.
Turning on negation detection raises the precision to ~0.7.
Speed ranking (fast to slow): string-based (default) > MetaMap Lite > MetaMap = NCBO annotator > Ensemble
Recall rate (high to low): Ensemble > MetaMap > MetaMap Lite = NCBO annotator > string-based (default)
In general, if the input is long (more than one Word page), we suggest using the default method or MetaMap Lite. For better recall at the cost of speed, use the ensemble method.
Negation detection is provided by Wendy Chapman's NegEx, a rule- and keyword-based method.
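To give a feel for what a rule- and keyword-based negation check does, here is a minimal sketch. It is not NegEx itself: the trigger list, the five-word window, and the function name `is_negated` are assumptions made for illustration, and the real NegEx uses a much larger trigger lexicon together with additional rules.

```python
import re

# Hypothetical, heavily reduced trigger list; NegEx ships a far larger one.
NEGATION_TRIGGERS = ("no", "not", "without", "denies", "negative for")
WINDOW = 5  # assumed: how many words before the concept a trigger may reach


def is_negated(sentence: str, start: int) -> bool:
    """Return True if a trigger occurs within WINDOW words before offset `start`."""
    preceding = re.findall(r"[a-z]+", sentence[:start].lower())[-WINDOW:]
    # Pad with spaces so that triggers only match whole words.
    window_text = " " + " ".join(preceding) + " "
    return any(f" {trigger} " in window_text for trigger in NEGATION_TRIGGERS)


sentence = "The patient has seizures but no hearing loss."
for term in ("seizures", "hearing loss"):
    print(term, is_negated(sentence, sentence.index(term)))
```

In this sketch "hearing loss" is flagged as negated because "no" falls inside the window, while "seizures" is not. Dropping negated concepts is how such a check can raise precision.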
If two annotations overlap in the text, only the longest one is displayed, but both are stored in the JSON output.
If two annotations are duplicates, one of them is picked at random for display, but both are stored in the JSON output.
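The display rules above can be sketched as a greedy longest-first selection. This is an illustration under stated assumptions, not the Doc2Hpo implementation: the `Annotation` type and `select_for_display` function are hypothetical, duplicates are assumed to mean annotations covering the same span, and ties are broken deterministically here instead of at random.

```python
from typing import NamedTuple


class Annotation(NamedTuple):
    start: int   # inclusive character offset
    end: int     # exclusive character offset
    hpo_id: str


def select_for_display(annotations: list[Annotation]) -> list[Annotation]:
    """Keep the longest annotation among any spans that overlap each other."""
    # Longest first; ties are broken by position and ID (the tool picks at random).
    by_length = sorted(annotations, key=lambda a: (-(a.end - a.start), a.start, a.hpo_id))
    kept: list[Annotation] = []
    for cand in by_length:
        # Accept a candidate only if it overlaps nothing already kept.
        if all(cand.end <= k.start or cand.start >= k.end for k in kept):
            kept.append(cand)
    return sorted(kept, key=lambda a: a.start)


text = "sensorineural hearing loss and seizures"


def annotate(phrase: str, hpo_id: str) -> Annotation:
    start = text.index(phrase)
    return Annotation(start, start + len(phrase), hpo_id)


# HPO IDs are used for illustration only.
all_annotations = [
    annotate("hearing loss", "HP:0000365"),
    annotate("sensorineural hearing loss", "HP:0000407"),
    annotate("seizures", "HP:0001250"),
    annotate("seizures", "HP:0001250"),  # duplicate from a second engine
]
shown = select_for_display(all_annotations)
print([a.hpo_id for a in shown])
```

The full `all_annotations` list is left untouched, mirroring the fact that every annotation stays in the JSON output while only the selected ones are displayed.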
The ensemble method takes the union of the results generated by the other parsing engines.
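As a rough sketch of the union step, assume each engine returns a list of `(start, end, hpo_id)` tuples. That tuple layout and the `ensemble` function name are assumptions for illustration, not the tool's actual data model.

```python
def ensemble(*engine_results):
    """Union the annotations of several engines, dropping exact duplicates."""
    merged = set()
    for results in engine_results:
        merged.update(results)
    # Sort by offset so the output order does not depend on the engine order.
    return sorted(merged)


# Illustrative outputs for "sensorineural hearing loss and seizures".
string_based = [(14, 26, "HP:0000365")]
metamap = [(0, 26, "HP:0000407"), (31, 39, "HP:0001250")]
ncbo = [(14, 26, "HP:0000365"), (31, 39, "HP:0001250")]

combined = ensemble(string_based, metamap, ncbo)
print(combined)
```

Because every engine has to run before the union can be taken, this approach gives the highest recall but is also the slowest.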
The string-based method uses the Aho–Corasick algorithm for fast concept extraction. The full input is treated as a single string and searched against all HPO terms and their synonyms under ‘phenotypic abnormality’ (HP:0000118).
When 'allow partial search' is unchecked, post-processing rules are applied to remove partial matches, such as 'tic' in 'genetics'.
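The following is a compact sketch of Aho–Corasick matching with an optional whole-word filter. It is not the Doc2Hpo code: the function names, the two-term dictionary, and the exact boundary rule (no letter or digit directly before or after the match) are assumptions chosen to reproduce the 'tic' in 'genetics' example.

```python
from collections import deque


def build_automaton(terms):
    """Build Aho-Corasick goto/fail/output tables for lowercase terms."""
    goto, fail, out = [{}], [0], [[]]
    for term in terms:
        state = 0
        for ch in term:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(term)
    queue = deque(goto[0].values())  # depth-1 states fail back to the root
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] = out[nxt] + out[fail[nxt]]  # inherit matches ending here
    return goto, fail, out


def is_whole_word(text, start, end):
    before_ok = start == 0 or not text[start - 1].isalnum()
    after_ok = end == len(text) or not text[end].isalnum()
    return before_ok and after_ok


def find_terms(text, automaton, allow_partial=True):
    """Scan the text once; offsets refer to the lowercased text."""
    goto, fail, out = automaton
    lowered = text.lower()
    state, hits = 0, []
    for i, ch in enumerate(lowered):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for term in out[state]:
            start, end = i - len(term) + 1, i + 1
            if allow_partial or is_whole_word(lowered, start, end):
                hits.append((start, end, term))
    return hits


automaton = build_automaton(["tic", "seizure"])  # hypothetical mini dictionary
text = "Genetics consult for tic disorder and seizure."
print(find_terms(text, automaton, allow_partial=True))
print(find_terms(text, automaton, allow_partial=False))
```

With `allow_partial=True` the 'tic' inside 'Genetics' is reported; with `allow_partial=False` the boundary check discards it and keeps only the standalone 'tic' and 'seizure'. The input is scanned in a single pass regardless of how many terms are in the dictionary, which is why this method is the fastest option.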
The MetaMap method splits the input into sentences and then feeds each sentence to a locally configured MetaMap server via the Java API.
MetaMap first identifies candidate clinical terms through lexical and syntactic analysis and maps them to standard UMLS concepts.
The UMLS concepts are then mapped to HPO concepts using the mapping available here.
MetaMap Lite is a fast version of MetaMap. It provides a near real-time named-entity recognizer that is not as rigorous as MetaMap but is much faster.
Currently, MetaMap Lite does not support dynamic variant generation. Named entities are found using longest match.
The NCBO Annotator method uses the online NCBO Annotator API for HPO concept recognition. Doc2Hpo exposes several NCBO Annotator options through its interface so that users can customize the parsing.
When the network connection is slow, large inputs may cause noticeable delays.
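For orientation, a request to the NCBO Annotator might be built roughly as follows. This is a sketch under assumptions, not the code Doc2Hpo uses: the endpoint and the parameter names (`text`, `ontologies`, `longest_only`, `whole_word_only`, `apikey`) are believed to match the public BioPortal REST documentation but should be checked against it, the helper names are hypothetical, and a personal BioPortal API key is required.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint of the public BioPortal Annotator service.
ANNOTATOR_URL = "https://data.bioontology.org/annotator"


def build_request_url(text, api_key, longest_only=True, whole_word_only=True):
    """Build a GET URL that restricts annotation to the HPO ontology ("HP")."""
    params = {
        "text": text,
        "ontologies": "HP",
        "longest_only": str(longest_only).lower(),
        "whole_word_only": str(whole_word_only).lower(),
        "apikey": api_key,
    }
    return ANNOTATOR_URL + "?" + urllib.parse.urlencode(params)


def annotate(text, api_key, timeout=30):
    """Call the service; the timeout bounds the wait on a slow network."""
    url = build_request_url(text, api_key)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

A call such as `annotate("hearing loss and seizures", "YOUR_API_KEY")` would then return the parsed JSON response. Because every request goes over the network, an explicit timeout is one way to keep a slow connection from stalling the caller indefinitely.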