semantic_html.parser
Functions
parse_note | Parses an HTML note string into a JSON-LD dictionary (optionally also returning RDFa-annotated HTML).
Module Contents
- semantic_html.parser.parse_note(html: str, mapping: dict, note_uri: str = None, metadata: dict = None, rdfa: bool = False, wadm: bool = False, remove_empty_tags: bool = True) -> dict
Parses an HTML note string into a JSON-LD dictionary (optionally also returning RDFa-annotated HTML).
- Parameters:
html (str) – The HTML content of the note.
mapping (dict) – A dictionary mapping classes, tags, styles, and types.
note_uri (str, optional) – If provided, used as the Note’s @id. Can also be supplied as a key in the mapping dict.
metadata (dict, optional) – A dictionary of additional keys to append to each item (e.g. provenance information). Can also be set as a dict under the ‘metadata’ key in mapping.
rdfa (bool, optional) – If True, also return RDFa-annotated HTML.
wadm (bool, optional) – If True, also return Web Annotation Data Model conformant JSON-LD.
remove_empty_tags (bool, optional) – If True, empty tags are removed from the HTML before parsing.
- Returns:
A dict with keys for the JSON-LD, WADM, and RDFa outputs.
- Return type:
dict
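As an illustration of the signature above, the following is a minimal sketch of a thin wrapper that requests all three outputs in one call. The wrapper name `parse_note_all_outputs` and its `provenance` parameter are hypothetical and not part of the package; only the keyword arguments passed through to `parse_note` are taken from the documented signature. The layout of `mapping` and the exact key names of the returned dict are not described in this section, so the sketch makes no assumptions about them.

```python
from typing import Callable, Optional


def parse_note_all_outputs(
    parse_note: Callable[..., dict],
    html: str,
    mapping: dict,
    note_uri: Optional[str] = None,
    provenance: Optional[dict] = None,
) -> dict:
    """Hypothetical helper: call parse_note asking for JSON-LD, WADM and RDFa.

    `parse_note` is passed in as an argument so the helper can be exercised
    without the package installed (e.g. with a stub in tests).
    """
    return parse_note(
        html,
        mapping,
        note_uri=note_uri,        # used as the Note's @id when provided
        metadata=provenance,      # extra keys appended to each item
        rdfa=True,                # also return RDFa-annotated HTML
        wadm=True,                # also return WADM-conformant JSON-LD
        remove_empty_tags=True,   # the documented default, made explicit
    )


# Example call (requires the semantic_html package; `my_mapping` stands in
# for a real mapping dict, whose schema is not covered in this section):
#
#   from semantic_html.parser import parse_note
#   outputs = parse_note_all_outputs(
#       parse_note,
#       "<p>Some note text</p>",
#       my_mapping,
#       note_uri="https://example.org/notes/1",
#       provenance={"creator": "example-user"},
#   )
#   print(outputs.keys())  # inspect which output keys are present
```

Passing the parser in as a callable is a design choice made for this sketch only; in ordinary use, calling `parse_note` directly with `rdfa=True, wadm=True` is equivalent.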