semantic_html.models
Attributes
Classes
BaseGraphItem | Base class for all graph items with standardized fields. |
NoteItem | Base class for all graph items with standardized fields. |
StructureItem | Base class for all graph items with standardized fields. |
LocatorItem | Base class for all graph items with standardized fields. |
DocItem | Base class for all graph items with standardized fields. |
AnnotationItem | Base class for all graph items with standardized fields. |
QuotationItem | Base class for all graph items with standardized fields. |
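Functions
generate_wadm_annotation(item) | |
build_tei_from_items(base_items) | |
wadm_to_conll(wadm, config=None, jsonld=None) | Convert WADM annotations into CoNLL format. |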
Module Contents
- semantic_html.models.DEFAULT_CONTEXT
- semantic_html.models.WADM_CONTEXT = 'https://www.w3.org/ns/anno.jsonld'
- class semantic_html.models.BaseGraphItem(type_, text=None, metadata=None, selector=None, **kwargs)
Base class for all graph items with standardized fields.
- data
- selector
- wadm_metadata
- to_dict()
Return the graph item as a dictionary.
- to_wadm()
Return a WADM-conformant dictionary representation.
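As a rough illustration of what a dictionary view and a WADM-shaped view of one item might look like, here is a minimal sketch. `SketchItem`, its attribute layout, the `source` parameter, and the `urn:example:doc` value are all hypothetical and are not taken from the library; only the constructor signature and the `WADM_CONTEXT` value come from this page. The output shape follows the general W3C Web Annotation Data Model (an annotation with a body and a target narrowed by a selector), and the real `to_wadm()` may differ.

```python
import uuid

WADM_CONTEXT = "https://www.w3.org/ns/anno.jsonld"  # value listed on this page


class SketchItem:
    """Hypothetical stand-in for a graph item; NOT the library's class."""

    def __init__(self, type_, text=None, metadata=None, selector=None, **kwargs):
        self.type = type_
        self.text = text
        self.metadata = metadata or {}
        self.selector = selector
        self.extra = kwargs

    def to_dict(self):
        # Plain dictionary view of the item's fields.
        return {
            "type": self.type,
            "text": self.text,
            "metadata": self.metadata,
            "selector": self.selector,
            **self.extra,
        }

    def to_wadm(self, source="urn:example:doc"):
        # Assumed shape: an Annotation with a textual body and a target
        # that points into `source` through the item's selector.
        return {
            "@context": WADM_CONTEXT,
            "id": f"urn:uuid:{uuid.uuid4()}",
            "type": "Annotation",
            "body": {"type": "TextualBody", "value": self.text},
            "target": {"source": source, "selector": self.selector},
        }


item = SketchItem(
    "Note",
    text="Vienna",
    selector={"type": "TextPositionSelector", "start": 10, "end": 16},
)
wadm = item.to_wadm()
```

Keeping the selector as a plain dictionary means the same value can be passed through unchanged in both views, which is one plausible reason for a shared `selector` field on the base class.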
- class semantic_html.models.NoteItem(text, **kwargs)
Bases: BaseGraphItem
Base class for all graph items with standardized fields.
- class semantic_html.models.StructureItem(text, level, **kwargs)
Bases: BaseGraphItem
Base class for all graph items with standardized fields.
- class semantic_html.models.LocatorItem(text, **kwargs)
Bases: BaseGraphItem
Base class for all graph items with standardized fields.
- class semantic_html.models.DocItem(text, **kwargs)
Bases: BaseGraphItem
Base class for all graph items with standardized fields.
- class semantic_html.models.AnnotationItem(text, **kwargs)
Bases: BaseGraphItem
Base class for all graph items with standardized fields.
- class semantic_html.models.QuotationItem(text, **kwargs)
Bases: BaseGraphItem
Base class for all graph items with standardized fields.
- semantic_html.models.generate_wadm_annotation(item)
- semantic_html.models.build_tei_from_items(base_items: list[BaseGraphItem])
- semantic_html.models.wadm_to_conll(wadm, config: dict = None, jsonld: dict = None)
Convert WADM annotations into CoNLL format.
  - wadm: dict with 'text' + 'annotations', OR a list of annotations
  - jsonld: optional ground-truth JSON-LD, used to resolve source->text
  - config: options (max_span_tokens, whitelist, blacklist, type_whitelist)
Returns: CoNLL string
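To make the input and output shapes concrete, the following is a small self-contained sketch of one way such a conversion could work. It is not the library's implementation: `sketch_wadm_to_conll` is a hypothetical name, the annotation layout (label in `body.value`, character offsets in a `TextPositionSelector`) is an assumption, tokenization is plain whitespace splitting, and `max_span_tokens` is interpreted here as "skip spans longer than this many tokens", which the docstring does not confirm.

```python
import re


def sketch_wadm_to_conll(wadm, config=None):
    """Illustrative only: whitespace tokens, BIO tags from position selectors."""
    config = config or {}
    max_span = config.get("max_span_tokens")  # assumed meaning: skip longer spans
    text = wadm["text"]
    # Each token keeps its character offsets so spans can be matched to it.
    tokens = [(m.group(), m.start(), m.end()) for m in re.finditer(r"\S+", text)]
    tags = ["O"] * len(tokens)

    for anno in wadm["annotations"]:
        sel = anno["target"]["selector"]  # assumed: TextPositionSelector
        label = anno["body"]["value"]     # assumed: label stored in the body
        # Indices of tokens that overlap the annotated character range.
        hit = [
            i for i, (_, start, end) in enumerate(tokens)
            if start < sel["end"] and end > sel["start"]
        ]
        if not hit or (max_span is not None and len(hit) > max_span):
            continue
        for n, i in enumerate(hit):
            tags[i] = ("B-" if n == 0 else "I-") + label

    # One "token<TAB>tag" pair per line.
    return "\n".join(f"{tok}\t{tag}" for (tok, _, _), tag in zip(tokens, tags))


doc = {
    "text": "Ada Lovelace visited London",
    "annotations": [
        {"body": {"value": "PER"},
         "target": {"selector": {"type": "TextPositionSelector", "start": 0, "end": 12}}},
        {"body": {"value": "LOC"},
         "target": {"selector": {"type": "TextPositionSelector", "start": 21, "end": 27}}},
    ],
}
conll = sketch_wadm_to_conll(doc)
print(conll)
```

With the sample document, the two-token person span receives `B-PER` and `I-PER`, the unannotated token receives `O`, and the location receives `B-LOC`. Passing `{"max_span_tokens": 1}` in this sketch drops the two-token span and leaves only the location tagged.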