Interface IHTMLHandler
All Superinterfaces:
IDiscoveredLinkHandler, IMetaTagHandler
All Known Implementing Classes:
FindContentHandler, FindHTMLFormHandler, FindHTMLHrefHandler, WebcrawlerConnector.ProcessActivityHTMLHandler
public interface IHTMLHandler extends IDiscoveredLinkHandler, IMetaTagHandler
This interface describes the functionality needed by an HTML processor in order to handle an HTML document.
-
Method Summary
Modifier and Type   Method                                        Description
void                finishUp()                                    Done with the document.
void                noteAHREF(java.lang.String rawURL)            Note discovered href
void                noteBASEHREF(java.lang.String rawURL)         Note base href
void                noteFormEnd()                                 Note the end of a form
void                noteFormInput(java.util.Map inputAttributes)  Note an input tag
void                noteFormStart(java.util.Map formAttributes)   Note the start of a form
void                noteFRAMESRC(java.lang.String rawURL)         Note discovered FRAME SRC
void                noteIMGSRC(java.lang.String rawURL)           Note discovered IMG SRC
void                noteLINKHREF(java.lang.String rawURL)         Note discovered href
void                noteTextCharacter(char textCharacter)         Note a character of text.
Methods inherited from interface org.apache.manifoldcf.crawler.connectors.webcrawler.IDiscoveredLinkHandler
noteDiscoveredBase, noteDiscoveredLink
-
Methods inherited from interface org.apache.manifoldcf.crawler.connectors.webcrawler.IMetaTagHandler
noteMetaTag
-
Method Detail
-
noteFormStart
void noteFormStart(java.util.Map formAttributes) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Note the start of a form
Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
noteFormInput
void noteFormInput(java.util.Map inputAttributes) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Note an input tag
Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
noteFormEnd
void noteFormEnd() throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Note the end of a form
Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
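The three form callbacks suggest a start / input / end sequence per form. The following is a minimal sketch of a handler that groups inputs under their enclosing form. To stay self-contained it declares a hypothetical stand-in interface, FormCallbacks, instead of implementing IHTMLHandler itself. The typed Map<String, String> parameters (the real methods take a raw java.util.Map), the class names, and the sample attribute values are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the three form callbacks of IHTMLHandler.
// The real methods take a raw java.util.Map and throw ManifoldCFException.
interface FormCallbacks {
    void noteFormStart(Map<String, String> formAttributes) throws Exception;
    void noteFormInput(Map<String, String> inputAttributes) throws Exception;
    void noteFormEnd() throws Exception;
}

// Collects each form's attributes together with the inputs seen inside it.
final class FormCollector implements FormCallbacks {
    static final class Form {
        final Map<String, String> attributes;
        final List<Map<String, String>> inputs = new ArrayList<>();

        Form(Map<String, String> attributes) {
            // Copy defensively in case the caller reuses its map.
            this.attributes = new LinkedHashMap<>(attributes);
        }
    }

    private final List<Form> forms = new ArrayList<>();
    private Form current; // null while the parser is outside any form

    @Override
    public void noteFormStart(Map<String, String> formAttributes) {
        current = new Form(formAttributes);
    }

    @Override
    public void noteFormInput(Map<String, String> inputAttributes) {
        // Inputs that appear outside a form are ignored in this sketch.
        if (current != null) {
            current.inputs.add(new LinkedHashMap<>(inputAttributes));
        }
    }

    @Override
    public void noteFormEnd() {
        if (current != null) {
            forms.add(current);
            current = null;
        }
    }

    List<Form> forms() {
        return forms;
    }
}

class FormCollectorDemo {
    public static void main(String[] args) {
        FormCollector collector = new FormCollector();
        collector.noteFormStart(Map.of("action", "/login", "method", "post"));
        collector.noteFormInput(Map.of("name", "user", "type", "text"));
        collector.noteFormInput(Map.of("name", "pass", "type", "password"));
        collector.noteFormEnd();
        System.out.println(collector.forms().size() + " form(s), "
                + collector.forms().get(0).inputs.size() + " input(s)");
    }
}
```

Ignoring inputs that arrive outside a form is a choice made by this sketch; the interface description does not say how such input should be treated.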
-
noteAHREF
void noteAHREF(java.lang.String rawURL) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Note discovered href
Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
noteLINKHREF
void noteLINKHREF(java.lang.String rawURL) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Note discovered href
Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
noteBASEHREF
void noteBASEHREF(java.lang.String rawURL) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Note base href
Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
noteIMGSRC
void noteIMGSRC(java.lang.String rawURL) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Note discovered IMG SRC
Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
noteFRAMESRC
void noteFRAMESRC(java.lang.String rawURL) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Note discovered FRAME SRC
Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
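The five URL callbacks (noteAHREF, noteLINKHREF, noteBASEHREF, noteIMGSRC, noteFRAMESRC) all receive a raw, unresolved URL string. As a rough sketch of how a handler might use them, the code below records raw links and resolves them later against either the document URL or a base href, if one was reported. LinkCallbacks is a hypothetical stand-in interface, and LinkCollector, its constructor argument, and resolvedLinks() are invented for illustration; none of this is taken from the ManifoldCF implementing classes.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the URL callbacks of IHTMLHandler.
// The real methods throw ManifoldCFException.
interface LinkCallbacks {
    void noteAHREF(String rawURL) throws Exception;
    void noteLINKHREF(String rawURL) throws Exception;
    void noteBASEHREF(String rawURL) throws Exception;
    void noteIMGSRC(String rawURL) throws Exception;
    void noteFRAMESRC(String rawURL) throws Exception;
}

// Records raw URLs as they are reported and resolves them on request.
final class LinkCollector implements LinkCallbacks {
    private final URI documentUri;
    private URI base; // set by noteBASEHREF; null if the document has none
    private final List<String> rawLinks = new ArrayList<>();

    LinkCollector(String documentUrl) {
        this.documentUri = URI.create(documentUrl);
    }

    @Override public void noteAHREF(String rawURL)    { rawLinks.add(rawURL); }
    @Override public void noteLINKHREF(String rawURL) { rawLinks.add(rawURL); }
    @Override public void noteIMGSRC(String rawURL)   { rawLinks.add(rawURL); }
    @Override public void noteFRAMESRC(String rawURL) { rawLinks.add(rawURL); }

    @Override
    public void noteBASEHREF(String rawURL) {
        try {
            // A base href may itself be relative to the document URL.
            base = documentUri.resolve(rawURL.trim());
        } catch (IllegalArgumentException e) {
            // Malformed base: keep resolving against the document URL.
        }
    }

    // Resolves every recorded link; malformed URLs are skipped.
    List<String> resolvedLinks() {
        URI root = (base != null) ? base : documentUri;
        List<String> resolved = new ArrayList<>();
        for (String raw : rawLinks) {
            try {
                resolved.add(root.resolve(raw.trim()).toString());
            } catch (IllegalArgumentException e) {
                // Not a syntactically valid URI reference; skip it.
            }
        }
        return resolved;
    }
}

class LinkCollectorDemo {
    public static void main(String[] args) {
        LinkCollector collector = new LinkCollector("http://example.com/docs/index.html");
        collector.noteAHREF("page2.html");
        collector.noteIMGSRC("/img/logo.png");
        for (String link : collector.resolvedLinks()) {
            System.out.println(link);
        }
    }
}
```

Resolution is deferred until resolvedLinks() is called so that a base href reported after some links still applies to them. That ordering policy belongs to this sketch and is not documented by the interface.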
-
noteTextCharacter
void noteTextCharacter(char textCharacter) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Note a character of text. Structured this way to keep overhead low for handlers that don't use text.
Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
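The per-character design means a handler with no interest in text pays only for an empty method call, while a handler that wants the text can buffer it itself. The sketch below contrasts the two, using a hypothetical stand-in interface, TextCallbacks, that mirrors noteTextCharacter and finishUp. The class names and the maxChars cap are assumptions for illustration and are not part of the ManifoldCF API.

```java
// Hypothetical stand-in for two callbacks of IHTMLHandler.
// The real methods throw ManifoldCFException.
interface TextCallbacks {
    void noteTextCharacter(char textCharacter) throws Exception;
    void finishUp() throws Exception;
}

// A handler that does not use text: each call is a no-op, so nothing is buffered.
final class TextIgnorer implements TextCallbacks {
    @Override public void noteTextCharacter(char textCharacter) { }
    @Override public void finishUp() { }
}

// A handler that does use text: it buffers characters up to a cap and
// publishes the result once the document is finished.
final class TextAccumulator implements TextCallbacks {
    private final int maxChars; // hypothetical cap to bound memory use
    private final StringBuilder buffer = new StringBuilder();
    private String text; // null until finishUp() has been called

    TextAccumulator(int maxChars) {
        this.maxChars = maxChars;
    }

    @Override
    public void noteTextCharacter(char textCharacter) {
        if (buffer.length() < maxChars) {
            buffer.append(textCharacter);
        }
    }

    @Override
    public void finishUp() {
        text = buffer.toString().trim();
    }

    String text() {
        return text;
    }
}

class TextAccumulatorDemo {
    public static void main(String[] args) {
        TextAccumulator accumulator = new TextAccumulator(1024);
        for (char ch : "  Hello, crawler  ".toCharArray()) {
            accumulator.noteTextCharacter(ch);
        }
        accumulator.finishUp();
        System.out.println(accumulator.text());
    }
}
```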
-
finishUp
void finishUp() throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Done with the document.
Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException