Interface IHTMLHandler
All Superinterfaces:
IDiscoveredLinkHandler, IMetaTagHandler
All Known Implementing Classes:
FindContentHandler, FindHTMLFormHandler, FindHTMLHrefHandler, WebcrawlerConnector.ProcessActivityHTMLHandler
public interface IHTMLHandler extends IDiscoveredLinkHandler, IMetaTagHandler
This interface describes the functionality needed by an HTML processor in order to handle an HTML document.
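As a rough illustration of the callback style described above, the following sketch shows a handler that collects link URLs as a parser reports them. It is not ManifoldCF code: `HtmlHandlerSketch` is a hypothetical, reduced stand-in that mirrors only three of the methods listed on this page, and `SketchException` is a hypothetical stand-in for ManifoldCFException, so that the example compiles on its own. A real implementation would implement IHTMLHandler itself, including the methods inherited from its superinterfaces.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for ManifoldCFException so the sketch compiles alone.
class SketchException extends Exception {
  SketchException(String message) { super(message); }
}

// Hypothetical, reduced stand-in for IHTMLHandler: only three of its methods.
interface HtmlHandlerSketch {
  void noteAHREF(String rawURL) throws SketchException;
  void noteIMGSRC(String rawURL) throws SketchException;
  void finishUp() throws SketchException;
}

// Collects raw URLs in the order the parser reports them.
class LinkCollector implements HtmlHandlerSketch {
  private final List<String> links = new ArrayList<>();
  private boolean finished = false;

  @Override
  public void noteAHREF(String rawURL) throws SketchException {
    record(rawURL);
  }

  @Override
  public void noteIMGSRC(String rawURL) throws SketchException {
    record(rawURL);
  }

  @Override
  public void finishUp() {
    // After this call the sketch rejects further callbacks.
    finished = true;
  }

  private void record(String rawURL) throws SketchException {
    if (finished) {
      throw new SketchException("document already finished");
    }
    links.add(rawURL);
  }

  List<String> links() { return links; }
  boolean isFinished() { return finished; }
}

class LinkCollectorDemo {
  public static void main(String[] args) throws SketchException {
    LinkCollector handler = new LinkCollector();
    // A parser would issue these calls while scanning the document.
    handler.noteAHREF("a.html");
    handler.noteIMGSRC("logo.png");
    handler.finishUp();
    System.out.println(handler.links());
  }
}
```

Rejecting callbacks after finishUp() is a choice made for this sketch; the page above does not say what the real implementations do in that case.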

Method Summary

Modifier and Type  Method                                        Description
void               finishUp()                                    Done with the document.
void               noteAHREF(java.lang.String rawURL)            Note a discovered A HREF.
void               noteBASEHREF(java.lang.String rawURL)         Note a BASE HREF.
void               noteFormEnd()                                 Note the end of a form.
void               noteFormInput(java.util.Map inputAttributes)  Note an input tag.
void               noteFormStart(java.util.Map formAttributes)   Note the start of a form.
void               noteFRAMESRC(java.lang.String rawURL)         Note a discovered FRAME SRC.
void               noteIMGSRC(java.lang.String rawURL)           Note a discovered IMG SRC.
void               noteLINKHREF(java.lang.String rawURL)         Note a discovered LINK HREF.
void               noteTextCharacter(char textCharacter)         Note a character of text.

Methods inherited from interface org.apache.manifoldcf.crawler.connectors.webcrawler.IDiscoveredLinkHandler
noteDiscoveredBase, noteDiscoveredLink

Methods inherited from interface org.apache.manifoldcf.crawler.connectors.webcrawler.IMetaTagHandler
noteMetaTag

Method Detail

noteFormStart
void noteFormStart(java.util.Map formAttributes) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Note the start of a form.
Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException

noteFormInput
void noteFormInput(java.util.Map inputAttributes) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Note an input tag.
Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException

noteFormEnd
void noteFormEnd() throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Note the end of a form.
Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
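
The three form callbacks suggest a start/input/end sequence per form. The sketch below shows one way a handler might group inputs under the form they belong to. Several things here are assumptions and not taken from this page: the attribute maps are treated as String-to-String (the real signatures use a raw java.util.Map), the `throws` clauses are omitted for brevity, and `FormRecorder` and `Form` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Groups reported inputs under the form that was open when they arrived.
class FormRecorder {
  // Hypothetical value type, not part of ManifoldCF.
  static final class Form {
    final Map<String, String> attributes;
    final List<Map<String, String>> inputs = new ArrayList<>();
    Form(Map<String, String> attributes) { this.attributes = attributes; }
  }

  private final List<Form> forms = new ArrayList<>();
  private Form current; // null while the parser is outside any form

  public void noteFormStart(Map<String, String> formAttributes) {
    // Copy the map in case the caller reuses it for the next tag.
    current = new Form(new HashMap<>(formAttributes));
  }

  public void noteFormInput(Map<String, String> inputAttributes) {
    // Inputs seen outside a form are ignored in this sketch.
    if (current != null) {
      current.inputs.add(new HashMap<>(inputAttributes));
    }
  }

  public void noteFormEnd() {
    if (current != null) {
      forms.add(current);
      current = null;
    }
  }

  List<Form> forms() { return forms; }
}

class FormRecorderDemo {
  public static void main(String[] args) {
    FormRecorder recorder = new FormRecorder();
    recorder.noteFormStart(Map.of("action", "/login", "method", "post"));
    recorder.noteFormInput(Map.of("name", "user", "type", "text"));
    recorder.noteFormInput(Map.of("name", "pass", "type", "password"));
    recorder.noteFormEnd();
    System.out.println(recorder.forms().size() + " form(s) recorded");
  }
}
```

Ignoring inputs that arrive outside a form is a design choice of the sketch; a handler could equally record them separately.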

noteAHREF
void noteAHREF(java.lang.String rawURL) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Note a discovered A HREF.
Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException

noteLINKHREF
void noteLINKHREF(java.lang.String rawURL) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Note a discovered LINK HREF.
Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException

noteBASEHREF
void noteBASEHREF(java.lang.String rawURL) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Note a BASE HREF.
Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
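
The link callbacks receive raw URLs, so a handler will often need to resolve them against the document URL or a reported base href. This page does not describe how the ManifoldCF implementations do that; the sketch below is only one possible approach using java.net.URI, following the usual HTML meaning of a base element. `BaseAwareLinkResolver` is a hypothetical class, the `throws` clauses are omitted, and silently skipping malformed URLs is an assumption of the sketch.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Resolves raw hrefs against the document URL, or against a base href once one is seen.
class BaseAwareLinkResolver {
  private URI base;
  private final List<String> resolved = new ArrayList<>();

  BaseAwareLinkResolver(String documentUrl) {
    this.base = URI.create(documentUrl);
  }

  public void noteBASEHREF(String rawURL) {
    try {
      // A relative base href is itself resolved against the current base.
      base = base.resolve(rawURL.trim());
    } catch (IllegalArgumentException e) {
      // Malformed base href: keep the previous base (assumption of this sketch).
    }
  }

  public void noteAHREF(String rawURL) {
    try {
      resolved.add(base.resolve(rawURL.trim()).toString());
    } catch (IllegalArgumentException e) {
      // Malformed link: skip it (assumption of this sketch).
    }
  }

  List<String> resolved() { return resolved; }
}

class BaseAwareLinkResolverDemo {
  public static void main(String[] args) {
    BaseAwareLinkResolver resolver =
        new BaseAwareLinkResolver("http://example.com/docs/index.html");
    resolver.noteAHREF("page2.html");
    resolver.noteBASEHREF("http://cdn.example.com/assets/");
    resolver.noteAHREF("about.html");
    for (String url : resolver.resolved()) {
      System.out.println(url);
    }
  }
}
```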

noteIMGSRC
void noteIMGSRC(java.lang.String rawURL) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Note a discovered IMG SRC.
Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException

noteFRAMESRC
void noteFRAMESRC(java.lang.String rawURL) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Note a discovered FRAME SRC.
Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException

noteTextCharacter
void noteTextCharacter(char textCharacter) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Note a character of text. Text is reported one character at a time to keep overhead low for handlers that don't use text.
Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
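
To illustrate the per-character design, the sketch below contrasts a handler that accumulates text with one that ignores it. Both classes are hypothetical and mirror only the noteTextCharacter signature (without its `throws` clause); the point is that the ignoring handler does no work at all, while the collecting handler decides for itself how to buffer.

```java
// Accumulates reported characters into a single string.
class TextCollector {
  private final StringBuilder text = new StringBuilder();

  public void noteTextCharacter(char textCharacter) {
    text.append(textCharacter);
  }

  String text() { return text.toString(); }
}

// A handler with no interest in text: the callback is an empty method.
class TextIgnorer {
  public void noteTextCharacter(char textCharacter) {
    // Intentionally empty; nothing is buffered or allocated.
  }
}

class TextCollectorDemo {
  public static void main(String[] args) {
    TextCollector collector = new TextCollector();
    // Simulates a parser reporting the text content of a document.
    for (char c : "Hello, world".toCharArray()) {
      collector.noteTextCharacter(c);
    }
    System.out.println(collector.text());
  }
}
```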

finishUp
void finishUp() throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Done with the document.
Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException