semantic_html.models
====================

.. py:module:: semantic_html.models


Attributes
----------

.. autoapisummary::

   semantic_html.models.DEFAULT_CONTEXT
   semantic_html.models.WADM_CONTEXT


Classes
-------

.. autoapisummary::

   semantic_html.models.BaseGraphItem
   semantic_html.models.NoteItem
   semantic_html.models.StructureItem
   semantic_html.models.LocatorItem
   semantic_html.models.DocItem
   semantic_html.models.AnnotationItem
   semantic_html.models.QuotationItem


Functions
---------

.. autoapisummary::

   semantic_html.models.generate_wadm_annotation
   semantic_html.models.build_tei_from_items
   semantic_html.models.wadm_to_conll


Module Contents
---------------

.. py:data:: DEFAULT_CONTEXT

.. py:data:: WADM_CONTEXT
   :value: 'https://www.w3.org/ns/anno.jsonld'


.. py:class:: BaseGraphItem(type_, text=None, metadata=None, selector=None, **kwargs)

   Base class for all graph items with standardized fields.


   .. py:attribute:: data


   .. py:attribute:: selector


   .. py:attribute:: wadm_metadata


   .. py:method:: to_dict()

      Return the graph item as a dictionary.


   .. py:method:: to_wadm()

      Return a WADM-conformant dictionary representation.


.. py:class:: NoteItem(text, **kwargs)

   Bases: :py:obj:`BaseGraphItem`


   Base class for all graph items with standardized fields.


.. py:class:: StructureItem(text, level, **kwargs)

   Bases: :py:obj:`BaseGraphItem`


   Base class for all graph items with standardized fields.


.. py:class:: LocatorItem(text, **kwargs)

   Bases: :py:obj:`BaseGraphItem`


   Base class for all graph items with standardized fields.


.. py:class:: DocItem(text, **kwargs)

   Bases: :py:obj:`BaseGraphItem`


   Base class for all graph items with standardized fields.


.. py:class:: AnnotationItem(text, **kwargs)

   Bases: :py:obj:`BaseGraphItem`


   Base class for all graph items with standardized fields.


.. py:class:: QuotationItem(text, **kwargs)

   Bases: :py:obj:`BaseGraphItem`


   Base class for all graph items with standardized fields.


.. py:function:: generate_wadm_annotation(item)

.. py:function:: build_tei_from_items(base_items: list[BaseGraphItem])

.. py:function:: wadm_to_conll(wadm, config: dict = None, jsonld: dict = None)

   Convert WADM annotations into CoNLL format.

   - wadm: dict with 'text' + 'annotations', OR list of annotations
   - jsonld: optional ground-truth JSON-LD, used to resolve source->text
   - config: options (max_span_tokens, whitelist, blacklist, type_whitelist)

   Returns: CoNLL string
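
The ``wadm_to_conll`` description above mentions a dict with ``'text'`` and ``'annotations'`` keys. As a rough illustration of that input shape and of what a CoNLL-style result might look like, here is a minimal standalone sketch. It does not call or reproduce ``semantic_html``: the helper name ``sketch_wadm_to_conll``, the ``body``/``target`` layout of each annotation, the whitespace tokenization, and the BIO tagging scheme are all assumptions made for the example. Only the ``TextPositionSelector`` with ``start``/``end`` character offsets comes from the W3C Web Annotation Data Model.

```python
import re


def sketch_wadm_to_conll(wadm: dict) -> str:
    """Hypothetical stand-in, NOT the library's wadm_to_conll."""
    text = wadm["text"]

    # Collect (start, end, label) character spans.
    # The body/target/selector layout used here is an assumption.
    spans = []
    for anno in wadm["annotations"]:
        selector = anno["target"]["selector"]
        label = anno["body"]["value"]
        spans.append((selector["start"], selector["end"], label))

    lines = []
    started = set()  # indices of spans that already received a B- tag
    for match in re.finditer(r"\S+", text):  # naive whitespace tokens
        tag = "O"
        for i, (start, end, label) in enumerate(spans):
            # A token is tagged if it lies fully inside the span.
            if start <= match.start() and match.end() <= end:
                tag = ("I-" if i in started else "B-") + label
                started.add(i)
                break
        lines.append(f"{match.group()}\t{tag}")
    return "\n".join(lines)


example = {
    "text": "Ada Lovelace wrote notes",
    "annotations": [
        {
            "type": "Annotation",
            "body": {"type": "TextualBody", "value": "PERSON"},
            "target": {
                "selector": {
                    "type": "TextPositionSelector",
                    "start": 0,
                    "end": 12,
                }
            },
        }
    ],
}

print(sketch_wadm_to_conll(example))
```

In this sketch each output line holds one token and its tag separated by a tab. The real function additionally accepts ``config`` and ``jsonld`` arguments whose behavior is not modeled here.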