semantic_html.models
====================

.. py:module:: semantic_html.models

Attributes
----------

.. autoapisummary::

   semantic_html.models.DEFAULT_CONTEXT
   semantic_html.models.WADM_CONTEXT

Classes
-------

.. autoapisummary::

   semantic_html.models.BaseGraphItem
   semantic_html.models.NoteItem
   semantic_html.models.StructureItem
   semantic_html.models.LocatorItem
   semantic_html.models.DocItem
   semantic_html.models.AnnotationItem
   semantic_html.models.QuotationItem

Functions
---------

.. autoapisummary::

   semantic_html.models.generate_wadm_annotation
   semantic_html.models.build_tei_from_items
   semantic_html.models.wadm_to_conll

Module Contents
---------------

.. py:data:: DEFAULT_CONTEXT

.. py:data:: WADM_CONTEXT
   :value: 'https://www.w3.org/ns/anno.jsonld'

.. py:class:: BaseGraphItem(type_, text=None, metadata=None, selector=None, **kwargs)

   Base class for all graph items with standardized fields.

   .. py:attribute:: data

   .. py:attribute:: selector

   .. py:attribute:: wadm_metadata

   .. py:method:: to_dict()

      Return the graph item as a dictionary.

   .. py:method:: to_wadm()

      Return a WADM-conformant dictionary representation.
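
To make "WADM-conformant" concrete, the sketch below builds one annotation dictionary by hand, following the W3C Web Annotation Data Model. It is not the output of ``to_wadm()``: the helper ``make_wadm_annotation``, its parameters, and the ``urn:example:doc-1`` source are hypothetical, and the fields the library actually emits may differ.

```python
# Context URL as documented for WADM_CONTEXT in this module.
WADM_CONTEXT = "https://www.w3.org/ns/anno.jsonld"


def make_wadm_annotation(source, start, end, label):
    """Hypothetical helper: build one W3C Web Annotation as a plain dict."""
    return {
        "@context": WADM_CONTEXT,
        "type": "Annotation",
        # The body carries the label attached to the selected text.
        "body": {"type": "TextualBody", "value": label, "purpose": "tagging"},
        # The target points at a character range inside the source document.
        "target": {
            "source": source,
            "selector": {
                "type": "TextPositionSelector",
                "start": start,
                "end": end,
            },
        },
    }


# Assumed example values, for illustration only.
annotation = make_wadm_annotation("urn:example:doc-1", 0, 12, "PERSON")
```

A ``TextPositionSelector`` addresses the text by character offsets, which is the form a span-based export such as CoNLL would most plausibly consume.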

.. py:class:: NoteItem(text, **kwargs)

   Bases: :py:obj:`BaseGraphItem`

   Graph item for notes. Inherits the standardized fields and methods of :py:class:`BaseGraphItem`.

.. py:class:: StructureItem(text, level, **kwargs)

   Bases: :py:obj:`BaseGraphItem`

   Graph item for structural elements; takes a ``level`` in addition to ``text``. Inherits the standardized fields and methods of :py:class:`BaseGraphItem`.

.. py:class:: LocatorItem(text, **kwargs)

   Bases: :py:obj:`BaseGraphItem`

   Graph item for locators. Inherits the standardized fields and methods of :py:class:`BaseGraphItem`.

.. py:class:: DocItem(text, **kwargs)

   Bases: :py:obj:`BaseGraphItem`

   Graph item for documents. Inherits the standardized fields and methods of :py:class:`BaseGraphItem`.

.. py:class:: AnnotationItem(text, **kwargs)

   Bases: :py:obj:`BaseGraphItem`

   Graph item for annotations. Inherits the standardized fields and methods of :py:class:`BaseGraphItem`.

.. py:class:: QuotationItem(text, **kwargs)

   Bases: :py:obj:`BaseGraphItem`

   Graph item for quotations. Inherits the standardized fields and methods of :py:class:`BaseGraphItem`.

.. py:function:: generate_wadm_annotation(item)

.. py:function:: build_tei_from_items(base_items: list[BaseGraphItem])

.. py:function:: wadm_to_conll(wadm, config: dict = None, jsonld: dict = None)

   Convert WADM annotations into CoNLL format.

   - ``wadm``: dict with ``'text'`` and ``'annotations'`` keys, or a list of annotations
   - ``jsonld``: optional ground-truth JSON-LD, used to resolve source->text
   - ``config``: options (``max_span_tokens``, ``whitelist``, ``blacklist``, ``type_whitelist``)

   Returns: CoNLL string
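
The general idea of turning character-span annotations into CoNLL rows can be sketched as follows. This is an illustration only, not the implementation of ``wadm_to_conll``: the function ``annotations_to_conll`` is hypothetical, and whitespace tokenization, BIO tags, tab-separated columns, and the shape of each annotation entry are all assumptions. Only the ``'text'`` + ``'annotations'`` input shape comes from the description above, and the ``config`` options are not modeled.

```python
import re


def annotations_to_conll(text, annotations):
    """Illustrative only: tag whitespace tokens with BIO labels from character spans."""
    # Collect (start, end, label) triples from TextPositionSelector targets.
    spans = []
    for anno in annotations:
        selector = anno["target"]["selector"]
        spans.append((selector["start"], selector["end"], anno["body"]["value"]))

    lines = []
    open_span = None  # span that the previous token belonged to, if any
    for match in re.finditer(r"\S+", text):
        tag = "O"
        current = None
        for span in spans:
            start, end, label = span
            # A token belongs to a span when it lies entirely inside it.
            if match.start() >= start and match.end() <= end:
                prefix = "I" if span == open_span else "B"
                tag = f"{prefix}-{label}"
                current = span
                break
        open_span = current
        lines.append(f"{match.group()}\t{tag}")
    return "\n".join(lines)


# Assumed sample input in the dict form with 'text' and 'annotations'.
sample = {
    "text": "Ada Lovelace wrote notes",
    "annotations": [
        {
            "body": {"value": "PERSON"},
            "target": {
                "selector": {"type": "TextPositionSelector", "start": 0, "end": 12}
            },
        }
    ],
}
conll = annotations_to_conll(sample["text"], sample["annotations"])
```

In this sketch the first token inside a span receives a ``B-`` tag and following tokens in the same span receive ``I-``; everything else is tagged ``O``. The real function may tokenize, filter, and label differently depending on ``config``.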