An application that uses the xmlproc API has to import the xmlproc module (non-validating parsing) or the xmlval module (validating parsing). A parser object is created by instantiating an object of the XMLProcessor class (non-validating) or XMLValidator (validating). Both classes have the same interface.
If you want to receive information about the document being parsed you must implement an object conforming to the Application interface, and tell the parser about it with the set_application method.
If you want to receive error events and react to them you must implement an object conforming to the ErrorHandler interface, and tell the parser to use your error handler with the set_error_handler method.
It is also possible to control the way the parser interprets system identifiers, by implementing an object conforming to the InputSourceFactory interface and giving it to the parser with the set_inputsource_factory method.
See also the DTD API documentation and the catalog file documentation.
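The wiring described above can be sketched as follows. This is only an illustration: run_parser and RecordingParser are hypothetical names, and RecordingParser is a stand-in that records calls so the example runs without xmlproc installed. In real use the parser argument would be an xmlproc.XMLProcessor() or xmlval.XMLValidator() instance.

```python
class RecordingParser:
    """Hypothetical stand-in for XMLProcessor/XMLValidator.

    It parses nothing; it only records which interface methods were called.
    """

    def __init__(self):
        self.calls = []

    def set_application(self, app):
        self.calls.append(("set_application", app))

    def set_error_handler(self, err):
        self.calls.append(("set_error_handler", err))

    def set_inputsource_factory(self, isf):
        self.calls.append(("set_inputsource_factory", isf))

    def parse_resource(self, sysID, bufsize=16384):
        self.calls.append(("parse_resource", sysID))


def run_parser(parser, sysid, app=None, err=None, isf=None):
    """Register whichever handlers were supplied, then parse sysid."""
    if app is not None:
        parser.set_application(app)
    if err is not None:
        parser.set_error_handler(err)
    if isf is not None:
        parser.set_inputsource_factory(isf)
    parser.parse_resource(sysid)
    return parser


# "doc.xml" is a placeholder system identifier; object() stands in for an
# Application implementation.
parser = run_parser(RecordingParser(), "doc.xml", app=object())
call_names = [name for name, _ in parser.calls]
print(call_names)  # ['set_application', 'parse_resource']
```

Handlers that are not supplied are simply left at whatever the parser uses by default.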
These are the classes of interest to xmlproc application writers:
This interface is implemented by both XML parser classes and is used to control parsing.
def __init__(self):
def get_dtd(self):
def set_application(self,app):
def set_error_handler(self,err):
def set_inputsource_factory(self,isf):
def set_pubid_resolver(self,pubres):
def parse_resource(self,sysID,bufsize=16384):
def reset(self):
def feed(self,new_data):
def close(self):
def get_current_sysid(self):
def get_offset(self):
def get_line(self):
def get_column(self):
def set_error_language(self,language):
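As an alternative to parse_resource, the feed and close methods suggest an incremental style in which the application hands data to the parser piece by piece. The sketch below assumes exactly that reading of the two names: feed accepts successive pieces of the document and close signals the end of input. feed_in_chunks and ChunkRecorder are hypothetical; ChunkRecorder is only a stand-in so the example runs without xmlproc.

```python
def feed_in_chunks(parser, text, chunk_size=4):
    """Pass text to parser.feed() in pieces of chunk_size, then close."""
    for i in range(0, len(text), chunk_size):
        parser.feed(text[i:i + chunk_size])
    parser.close()


class ChunkRecorder:
    """Hypothetical stand-in for a parser: records what it was fed."""

    def __init__(self):
        self.chunks = []
        self.closed = False

    def feed(self, new_data):
        self.chunks.append(new_data)

    def close(self):
        self.closed = True


recorder = ChunkRecorder()
feed_in_chunks(recorder, "<a>hi</a>", chunk_size=4)
print(recorder.chunks)  # ['<a>h', 'i</a', '>']
```

The chunk boundaries fall in the middle of tags on purpose: an incremental parser has to cope with input split at arbitrary points.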
This is the interface of the objects that receive data events from the parsed document.
def set_locator(self,locator):
def doc_start(self):
def doc_end(self):
def handle_comment(self,data):
def handle_start_tag(self,name,attrs):
def handle_end_tag(self,name):
def handle_data(self,data,start,end):
def handle_ignorable_data(self,data,start,end):
def handle_pi(self,target,data):
def handle_doctype(self,root,pubID,sysID):
def set_entity_info(self,xmlver,enc,sddecl):
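A minimal sketch of an object conforming to this interface follows. OutlineApp is a hypothetical name, and two details are assumptions rather than something stated here: that attrs behaves like a mapping from attribute names to values, and that the character data passed to handle_data is the slice data[start:end]. The events are fired by hand so the example runs without a parser.

```python
class OutlineApp:
    """Collects an indented outline of element names and text."""

    def __init__(self):
        self.locator = None
        self.depth = 0
        self.lines = []

    def set_locator(self, locator):
        self.locator = locator

    def doc_start(self):
        self.depth = 0
        self.lines = []

    def doc_end(self):
        pass

    def handle_comment(self, data):
        pass

    def handle_start_tag(self, name, attrs):
        self.lines.append("  " * self.depth + name)
        self.depth += 1

    def handle_end_tag(self, name):
        self.depth -= 1

    def handle_data(self, data, start, end):
        # Assumption: only data[start:end] belongs to this event.
        text = data[start:end].strip()
        if text:
            self.lines.append("  " * self.depth + repr(text))

    def handle_ignorable_data(self, data, start, end):
        pass

    def handle_pi(self, target, data):
        pass

    def handle_doctype(self, root, pubID, sysID):
        pass

    def set_entity_info(self, xmlver, enc, sddecl):
        pass


# Simulate the events for <doc><title id="t1">Hello</title></doc>.
app = OutlineApp()
app.doc_start()
app.handle_start_tag("doc", {})
app.handle_start_tag("title", {"id": "t1"})
app.handle_data("xxHelloxx", 2, 7)
app.handle_end_tag("title")
app.handle_end_tag("doc")
app.doc_end()
print(app.lines)  # ['doc', '  title', "    'Hello'"]
```

An object like this would be handed to the parser with set_application before parsing starts.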
This interface is used to receive information about errors encountered during the parsing of the document.
def __init__(self,locator):
def set_locator(self,loc):
def get_locator(self):
def warning(self,msg):
def error(self,msg):
def fatal(self,msg):
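An error handler that collects messages instead of printing them might look like the sketch below. CollectingErrorHandler and FixedLocator are hypothetical names. The sketch assumes the locator offers get_current_sysid, get_line and get_column, as the parser interface does; FixedLocator is a stand-in so the example runs without a parser.

```python
class CollectingErrorHandler:
    """Stores every reported problem as a formatted string."""

    def __init__(self, locator):
        self.locator = locator
        self.messages = []

    def set_locator(self, loc):
        self.locator = loc

    def get_locator(self):
        return self.locator

    def _record(self, level, msg):
        loc = self.locator
        if loc is None:
            prefix = ""
        else:
            # Assumption: the locator exposes these three position methods.
            prefix = "%s:%d:%d: " % (
                loc.get_current_sysid(), loc.get_line(), loc.get_column())
        self.messages.append("%s%s: %s" % (prefix, level, msg))

    def warning(self, msg):
        self._record("warning", msg)

    def error(self, msg):
        self._record("error", msg)

    def fatal(self, msg):
        self._record("fatal", msg)


class FixedLocator:
    """Hypothetical locator that always reports the same position."""

    def __init__(self, sysid, line, column):
        self._sysid, self._line, self._column = sysid, line, column

    def get_current_sysid(self):
        return self._sysid

    def get_line(self):
        return self._line

    def get_column(self):
        return self._column


handler = CollectingErrorHandler(FixedLocator("doc.xml", 3, 7))
handler.error("Undeclared element")
print(handler.messages)  # ['doc.xml:3:7: error: Undeclared element']
```

Whether parsing should stop after a fatal error is left to the application; this sketch only records the message.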
This interface is used by the parser to resolve any public identifiers used in the document to their corresponding system identifiers. The default implementation always returns the given system identifier, but the interface has been included mainly to allow support for catalog files.
def resolve_pe_pubid(self,pubid,sysid):
def resolve_doctype_pubid(self,pubid,sysid):
def resolve_entity_pubid(self,pubid,sysid):
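A resolver backed by a simple lookup table could be sketched as follows. TablePubidResolver is a hypothetical name and the local path in the table is a placeholder. Unknown public identifiers fall back to the given system identifier, mirroring the default behaviour described above.

```python
class TablePubidResolver:
    """Maps public identifiers to system identifiers via a dictionary."""

    def __init__(self, table):
        self.table = dict(table)

    def _lookup(self, pubid, sysid):
        # Fall back to the system identifier given in the document.
        return self.table.get(pubid, sysid)

    def resolve_pe_pubid(self, pubid, sysid):
        return self._lookup(pubid, sysid)

    def resolve_doctype_pubid(self, pubid, sysid):
        return self._lookup(pubid, sysid)

    def resolve_entity_pubid(self, pubid, sysid):
        return self._lookup(pubid, sysid)


resolver = TablePubidResolver({
    # Placeholder local copy of a DTD.
    "-//W3C//DTD XHTML 1.0 Strict//EN": "dtds/xhtml1-strict.dtd",
})
known = resolver.resolve_doctype_pubid(
    "-//W3C//DTD XHTML 1.0 Strict//EN",
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd")
unknown = resolver.resolve_entity_pubid("-//Example//Unknown//EN", "other.ent")
print(known)    # dtds/xhtml1-strict.dtd
print(unknown)  # other.ent
```

Such an object would be registered with set_pubid_resolver.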
This interface allows users to control the way in which the parser interprets system identifiers. This is especially useful when embedding the parser in a larger document system, which may want system identifiers to refer to other documents inside the document system rather than being ordinary URLs. It is also useful for letting the application interpret system identifiers that are URIs but not URLs, such as URNs.
The default implementation interprets system identifiers as URLs.
def create_input_source(self,sysid):
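The sketch below shows one way an application might serve some documents from memory while leaving everything else to URL handling. It rests on assumptions: that create_input_source is expected to return a file-like object with a read method, and that falling back to urlopen approximates the default behaviour. MemoryInputSourceFactory and the "mem:" prefix are made up for this illustration.

```python
import io
from urllib.request import urlopen


class MemoryInputSourceFactory:
    """Serves "mem:<name>" identifiers from a dictionary of byte strings."""

    PREFIX = "mem:"  # hypothetical scheme, not part of xmlproc

    def __init__(self, documents):
        self.documents = dict(documents)

    def create_input_source(self, sysid):
        if sysid.startswith(self.PREFIX):
            name = sysid[len(self.PREFIX):]
            # Assumption: a file-like object is what the parser expects.
            return io.BytesIO(self.documents[name])
        # Anything else is treated as an ordinary URL.
        return urlopen(sysid)


factory = MemoryInputSourceFactory({"greeting": b"<greeting>hello</greeting>"})
source = factory.create_input_source("mem:greeting")
content = source.read()
print(content)  # b'<greeting>hello</greeting>'
```

The factory would be given to the parser with set_inputsource_factory, after which system identifiers such as "mem:greeting" could appear in entity declarations.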