Class BaseOutputConnector
- java.lang.Object
  - org.apache.manifoldcf.core.connector.BaseConnector
    - org.apache.manifoldcf.agents.output.BaseOutputConnector
- All Implemented Interfaces:
IOutputConnector, IPipelineConnector, IConnector
public abstract class BaseOutputConnector extends BaseConnector implements IOutputConnector
This base class describes an instance of a connection between an output pipeline and the Connector Framework. Each instance is used in only one thread at a time. Connection pooling for these objects is performed by the factory that instantiates connectors from symbolic names and configuration parameters, and pooling is keyed by those parameters: a pooled connector handle is reused only if all of its connection parameters match. Implementers should provide a default (no-argument) constructor with this signature: xxx();
Connectors are either configured or not. If configured, they persist in a pool and are reused multiple times. Certain methods of a connector may be called before the connector is configured; these include essentially all methods that permit inspection of the connector's capabilities.
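The pooling rule described above can be illustrated with a small, self-contained sketch. It does not use the ManifoldCF pool or any ManifoldCF classes; `ConnectorPoolSketch`, `Handle`, `grab`, and `release` are hypothetical names chosen only to show the idea that an idle handle is reused when, and only when, the class name and every configuration parameter match.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Illustrative stand-in only: this is not the ManifoldCF connector pool.
final class ConnectorPoolSketch {
  /** Hypothetical stand-in for a configured connector instance. */
  static final class Handle {
    final String className;
    final Map<String, String> params;

    Handle(String className, Map<String, String> params) {
      this.className = className;
      this.params = params;
    }
  }

  // Idle handles, keyed by class name plus the full, sorted parameter set.
  private final Map<String, Deque<Handle>> idle = new HashMap<>();

  private static String key(String className, Map<String, String> params) {
    // TreeMap gives a stable ordering, so equal parameter sets yield equal keys.
    return className + "|" + new TreeMap<>(params);
  }

  /** Reuse an idle handle only when every parameter matches; otherwise create one. */
  Handle grab(String className, Map<String, String> params) {
    Deque<Handle> queue = idle.get(key(className, params));
    if (queue != null && !queue.isEmpty()) {
      return queue.pop();
    }
    return new Handle(className, new TreeMap<>(params));
  }

  /** Return a handle to the pool once the calling thread is done with it. */
  void release(Handle handle) {
    idle.computeIfAbsent(key(handle.className, handle.params), k -> new ArrayDeque<>())
        .push(handle);
  }
}
```

Because a handle is removed from the pool while it is in use, no two threads hold the same handle at once, which mirrors the one-thread-at-a-time guarantee stated above.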
-
-
Field Summary
Modifier and Type Field Description
static java.lang.String _rcsid
-
Fields inherited from class org.apache.manifoldcf.core.connector.BaseConnector
currentContext, params
-
Fields inherited from interface org.apache.manifoldcf.agents.interfaces.IPipelineConnector
DOCUMENTSTATUS_ACCEPTED, DOCUMENTSTATUS_REJECTED
-
-
Constructor Summary
Constructor Description
BaseOutputConnector()
-
Method Summary
Modifier and Type Method Description
int addOrReplaceDocumentWithException(java.lang.String documentURI, VersionContext pipelineDescription, RepositoryDocument document, java.lang.String authorityNameString, IOutputAddActivity activities)
Add (or replace) a document in the output data store using the connector.
boolean checkDateIndexable(VersionContext pipelineDescription, java.util.Date date, IOutputCheckActivity checkActivity)
Detect if a document date is acceptable or not.
boolean checkDocumentIndexable(VersionContext pipelineDescription, java.io.File localFile, IOutputCheckActivity checkActivity)
Pre-determine whether a document (passed here as a File object) is acceptable or not.
boolean checkLengthIndexable(VersionContext pipelineDescription, long length, IOutputCheckActivity checkActivity)
Pre-determine whether a document's length is acceptable.
boolean checkMimeTypeIndexable(VersionContext pipelineDescription, java.lang.String mimeType, IOutputCheckActivity checkActivity)
Detect if a mime type is acceptable or not.
boolean checkURLIndexable(VersionContext pipelineDescription, java.lang.String url, IOutputCheckActivity checkActivity)
Pre-determine whether a document's URL is acceptable.
java.lang.String[] getActivitiesList()
Return the list of activities that this connector supports (i.e. writes into the log).
java.lang.String getFormCheckJavascriptMethodName(int connectionSequenceNumber)
Obtain the name of the form check javascript method to call.
java.lang.String getFormPresaveCheckJavascriptMethodName(int connectionSequenceNumber)
Obtain the name of the form presave check javascript method to call.
VersionContext getPipelineDescription(Specification spec)
Get a pipeline version string, given a pipeline specification object.
void noteAllRecordsRemoved()
Notify the connector that all records associated with this connection have been removed.
void noteJobComplete(IOutputNotifyActivity activities)
Notify the connector of a completed job.
void outputSpecificationBody(IHTTPOutput out, java.util.Locale locale, Specification os, int connectionSequenceNumber, int actualSequenceNumber, java.lang.String tabName)
Output the specification body section.
void outputSpecificationHeader(IHTTPOutput out, java.util.Locale locale, Specification os, int connectionSequenceNumber, java.util.List<java.lang.String> tabsArray)
Output the specification header section.
java.lang.String processSpecificationPost(IPostParameters variableContext, java.util.Locale locale, Specification os, int connectionSequenceNumber)
Process a specification post.
void removeDocument(java.lang.String documentURI, java.lang.String outputDescription, IOutputRemoveActivity activities)
Remove a document using the connector.
boolean requestInfo(Configuration output, java.lang.String command)
Request arbitrary connector information.
void viewSpecification(IHTTPOutput out, java.util.Locale locale, Specification os, int connectionSequenceNumber)
View specification.
-
Methods inherited from class org.apache.manifoldcf.core.connector.BaseConnector
check, clearThreadContext, connect, deinstall, disconnect, getConfiguration, install, isConnected, outputConfigurationBody, outputConfigurationBody, outputConfigurationHeader, outputConfigurationHeader, outputConfigurationHeader, pack, packFixedList, packList, packList, poll, processConfigurationPost, processConfigurationPost, setThreadContext, unpack, unpackFixedList, unpackList, viewConfiguration, viewConfiguration
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.apache.manifoldcf.core.interfaces.IConnector
check, clearThreadContext, connect, deinstall, disconnect, getConfiguration, install, isConnected, outputConfigurationBody, outputConfigurationHeader, poll, processConfigurationPost, setThreadContext, viewConfiguration
-
-
-
-
Field Detail
-
_rcsid
public static final java.lang.String _rcsid
- See Also:
- Constant Field Values
-
-
Method Detail
-
getActivitiesList
public java.lang.String[] getActivitiesList()
Return the list of activities that this connector supports (i.e. writes into the log).
- Specified by:
getActivitiesList in interface IOutputConnector
- Returns:
- the list.
-
requestInfo
public boolean requestInfo(Configuration output, java.lang.String command) throws ManifoldCFException
Request arbitrary connector information. This method is called directly from the API in order to allow API users to perform any one of several connector-specific queries.
- Specified by:
requestInfo in interface IOutputConnector
- Parameters:
output
- is the response object, to be filled in by this method.
command
- is the command, which is taken directly from the API request.
- Returns:
- true if the resource is found, false if not. In either case, output may be filled in.
- Throws:
ManifoldCFException
-
noteJobComplete
public void noteJobComplete(IOutputNotifyActivity activities) throws ManifoldCFException, ServiceInterruption
Notify the connector of a completed job. This is meant to allow the connector to flush any internal data structures it has been keeping around, or to tell the output repository that this is a good time to synchronize things. It is called whenever a job is either completed or aborted.
- Specified by:
noteJobComplete in interface IOutputConnector
- Parameters:
activities
- is the handle to an object that the implementer of an output connector may use to perform operations, such as logging processing activity.
- Throws:
ManifoldCFException
ServiceInterruption
-
checkDateIndexable
public boolean checkDateIndexable(VersionContext pipelineDescription, java.util.Date date, IOutputCheckActivity checkActivity) throws ManifoldCFException, ServiceInterruption
Detect if a document date is acceptable or not. This method is used to determine whether it makes sense to fetch a document in the first place.
- Specified by:
checkDateIndexable in interface IPipelineConnector
- Parameters:
pipelineDescription
- is the document's pipeline version string, for this connection.
date
- is the date of the document.
checkActivity
- is an object including the activities that can be performed by this method.
- Returns:
- true if the document with that date can be accepted by this connector.
- Throws:
ManifoldCFException
ServiceInterruption
-
checkMimeTypeIndexable
public boolean checkMimeTypeIndexable(VersionContext pipelineDescription, java.lang.String mimeType, IOutputCheckActivity checkActivity) throws ManifoldCFException, ServiceInterruption
Detect if a mime type is acceptable or not. This method is used to determine whether it makes sense to fetch a document in the first place.
- Specified by:
checkMimeTypeIndexable in interface IPipelineConnector
- Parameters:
pipelineDescription
- is the document's pipeline version string, for this connection.
mimeType
- is the mime type of the document.
checkActivity
- is an object including the activities that can be performed by this method.
- Returns:
- true if the mime type can be accepted by this connector.
- Throws:
ManifoldCFException
ServiceInterruption
-
checkDocumentIndexable
public boolean checkDocumentIndexable(VersionContext pipelineDescription, java.io.File localFile, IOutputCheckActivity checkActivity) throws ManifoldCFException, ServiceInterruption
Pre-determine whether a document (passed here as a File object) is acceptable or not. This method is used to determine whether a document needs to be actually transferred. This hook is provided mainly to support search engines that only handle a small set of accepted file types.
- Specified by:
checkDocumentIndexable in interface IPipelineConnector
- Parameters:
pipelineDescription
- is the document's pipeline version string, for this connection.
localFile
- is the local file to check.
checkActivity
- is an object including the activities that can be performed by this method.
- Returns:
- true if the file is acceptable, false if not.
- Throws:
ManifoldCFException
ServiceInterruption
-
checkLengthIndexable
public boolean checkLengthIndexable(VersionContext pipelineDescription, long length, IOutputCheckActivity checkActivity) throws ManifoldCFException, ServiceInterruption
Pre-determine whether a document's length is acceptable. This method is used to determine whether to fetch a document in the first place.
- Specified by:
checkLengthIndexable in interface IPipelineConnector
- Parameters:
pipelineDescription
- is the document's pipeline version string, for this connection.
length
- is the length of the document.
checkActivity
- is an object including the activities that can be performed by this method.
- Returns:
- true if the length is acceptable, false if not.
- Throws:
ManifoldCFException
ServiceInterruption
-
checkURLIndexable
public boolean checkURLIndexable(VersionContext pipelineDescription, java.lang.String url, IOutputCheckActivity checkActivity) throws ManifoldCFException, ServiceInterruption
Pre-determine whether a document's URL is acceptable. This method is used to filter out, in advance, documents that cannot be indexed.
- Specified by:
checkURLIndexable in interface IPipelineConnector
- Parameters:
pipelineDescription
- is the document's pipeline version string, for this connection.
url
- is the URL of the document.
checkActivity
- is an object including the activities that can be performed by this method.
- Returns:
- true if the URL is acceptable, false if not.
- Throws:
ManifoldCFException
ServiceInterruption
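The indexability checks above let a caller reject a document from cheap metadata before paying for the fetch. The following sketch shows that idea with plain Java stand-ins: the methods here drop the `VersionContext` and `IOutputCheckActivity` arguments and the checked exceptions of the real signatures, and the acceptance rules (a mime-type allow-list, a maximum length, an `http` URL prefix) are assumptions made only for illustration.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Stand-in sketch: simplified signatures, not the IPipelineConnector methods themselves.
final class IndexabilityChecksSketch {
  private final Set<String> acceptedMimeTypes; // entries are expected in lower case
  private final long maxLength;

  IndexabilityChecksSketch(Set<String> acceptedMimeTypes, long maxLength) {
    this.acceptedMimeTypes = new HashSet<>(acceptedMimeTypes);
    this.maxLength = maxLength;
  }

  /** Accept only mime types on the (hypothetical) allow-list. */
  boolean checkMimeTypeIndexable(String mimeType) {
    return mimeType != null
        && acceptedMimeTypes.contains(mimeType.toLowerCase(Locale.ROOT));
  }

  /** Accept only documents no longer than the (hypothetical) configured maximum. */
  boolean checkLengthIndexable(long length) {
    return length <= maxLength;
  }

  /** Accept only http/https URLs; this rule is an assumption for the sketch. */
  boolean checkURLIndexable(String url) {
    return url != null && (url.startsWith("http://") || url.startsWith("https://"));
  }

  /** Combine the metadata checks; a document is fetched only if all of them pass. */
  boolean worthFetching(String url, String mimeType, long length) {
    return checkURLIndexable(url)
        && checkMimeTypeIndexable(mimeType)
        && checkLengthIndexable(length);
  }
}
```

A connector with no restrictions of its own could simply return true from every check, leaving filtering to other pipeline stages.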
-
getPipelineDescription
public VersionContext getPipelineDescription(Specification spec) throws ManifoldCFException, ServiceInterruption
Get a pipeline version string, given a pipeline specification object. The version string is used to uniquely describe the pertinent details of the specification and the configuration, to allow the Connector Framework to determine whether a document will need to be processed again. Note that the contents of any document cannot be considered by this method; only configuration and specification information can be considered. This method presumes that the underlying connector object has been configured.
- Specified by:
getPipelineDescription in interface IPipelineConnector
- Parameters:
spec
- is the current pipeline specification object for this connection for the job that is doing the crawling.
- Returns:
- a string, of unlimited length, which uniquely describes configuration and specification in such a way that if two such strings are equal, nothing that affects how or whether the document is indexed will be different.
- Throws:
ManifoldCFException
ServiceInterruption
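The version-string contract above (equal strings imply nothing relevant to indexing has changed) can be sketched without any ManifoldCF types. In this illustration the configuration and specification are modelled as plain string maps, and `PipelineVersionSketch`, `versionString`, and `needsReprocessing` are hypothetical names; a real connector would derive the string from its own `Specification` and configuration parameters.

```java
import java.util.Map;
import java.util.TreeMap;

// Stand-in sketch: maps replace the real Specification and configuration objects.
final class PipelineVersionSketch {
  /** Build a deterministic string from configuration and specification values only. */
  static String versionString(Map<String, String> configuration,
                              Map<String, String> specification) {
    StringBuilder sb = new StringBuilder();
    appendSorted(sb, configuration);
    sb.append('+'); // separates the configuration part from the specification part
    appendSorted(sb, specification);
    return sb.toString();
  }

  private static void appendSorted(StringBuilder sb, Map<String, String> values) {
    // Sort by key so that insertion order never changes the result.
    for (Map.Entry<String, String> e : new TreeMap<>(values).entrySet()) {
      appendField(sb, e.getKey());
      appendField(sb, e.getValue());
    }
  }

  private static void appendField(StringBuilder sb, String field) {
    // Length-prefix each field so that field boundaries stay unambiguous.
    sb.append(field.length()).append(':').append(field);
  }

  /** A document needs reprocessing when the stored and current strings differ. */
  static boolean needsReprocessing(String previousVersion, String currentVersion) {
    return !currentVersion.equals(previousVersion);
  }
}
```

Document content never enters the string, which matches the restriction that only configuration and specification information may be considered.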
-
addOrReplaceDocumentWithException
public int addOrReplaceDocumentWithException(java.lang.String documentURI, VersionContext pipelineDescription, RepositoryDocument document, java.lang.String authorityNameString, IOutputAddActivity activities) throws ManifoldCFException, ServiceInterruption, java.io.IOException
Add (or replace) a document in the output data store using the connector. This method presumes that the connector object has been configured, and it is thus able to communicate with the output data store should that be necessary.
- Specified by:
addOrReplaceDocumentWithException in interface IPipelineConnector
- Parameters:
documentURI
- is the URI of the document. The URI is presumed to be the unique identifier which the output data store will use to process and serve the document. This URI is constructed by the repository connector which fetches the document, and is thus universal across all output connectors.
pipelineDescription
- includes the description string that was constructed for this document by the getPipelineDescription() method.
document
- is the document data to be processed (handed to the output data store).
authorityNameString
- is the name of the authority responsible for authorizing any access tokens passed in with the repository document. May be null.
activities
- is the handle to an object that the implementer of a pipeline connector may use to perform operations, such as logging processing activity, or sending a modified document to the next stage in the pipeline.
- Returns:
- the document status (accepted or permanently rejected).
- Throws:
java.io.IOException
- only if there's a stream error reading the document data.
ManifoldCFException
ServiceInterruption
-
removeDocument
public void removeDocument(java.lang.String documentURI, java.lang.String outputDescription, IOutputRemoveActivity activities) throws ManifoldCFException, ServiceInterruption
Remove a document using the connector. Note that the last outputDescription is included, since it may be necessary for the connector to use such information to know how to properly remove the document.
- Specified by:
removeDocument in interface IOutputConnector
- Parameters:
documentURI
- is the URI of the document. The URI is presumed to be the unique identifier which the output data store will use to process and serve the document. This URI is constructed by the repository connector which fetches the document, and is thus universal across all output connectors.
outputDescription
- is the last description string that was constructed for this document by the getPipelineDescription() method above.
activities
- is the handle to an object that the implementer of an output connector may use to perform operations, such as logging processing activity.
- Throws:
ManifoldCFException
ServiceInterruption
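The add/replace and remove contracts above both key on the document URI, and add/replace reports an accepted or rejected status. The sketch below models this with an in-memory map standing in for the output data store. The signatures are simplified (no `VersionContext`, `RepositoryDocument`, or activity handles), the size limit is a hypothetical rejection rule, and the constant values are stand-ins rather than the values of `IPipelineConnector.DOCUMENTSTATUS_ACCEPTED` and `DOCUMENTSTATUS_REJECTED`. It uses `InputStream.readAllBytes()`, which requires Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;

// Stand-in sketch: an in-memory map plays the role of the output data store.
final class AddOrReplaceSketch {
  // Stand-in values; the real constants live on IPipelineConnector.
  static final int DOCUMENTSTATUS_ACCEPTED = 0;
  static final int DOCUMENTSTATUS_REJECTED = 1;

  private final Map<String, byte[]> store = new HashMap<>();
  private final long maxLength; // hypothetical rejection rule

  AddOrReplaceSketch(long maxLength) {
    this.maxLength = maxLength;
  }

  /** Store the document under its URI, replacing any earlier version. */
  int addOrReplaceDocument(String documentURI, InputStream content) throws IOException {
    // IOException escapes only for stream errors while reading the document data.
    byte[] bytes = content.readAllBytes();
    if (bytes.length > maxLength) {
      return DOCUMENTSTATUS_REJECTED;
    }
    store.put(documentURI, bytes);
    return DOCUMENTSTATUS_ACCEPTED;
  }

  /** Remove the document identified by the same URI used when it was added. */
  void removeDocument(String documentURI) {
    store.remove(documentURI);
  }

  boolean contains(String documentURI) {
    return store.containsKey(documentURI);
  }
}
```

In this sketch a rejected replacement leaves the previously stored version in place; whether a real connector should do the same depends on its output data store.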
-
noteAllRecordsRemoved
public void noteAllRecordsRemoved() throws ManifoldCFException
Notify the connector that all records associated with this connection have been removed. This method allows the connector to remove any internal data storage that is associated with records sent to the index on behalf of a connection. It should not attempt to communicate with the output index.
- Specified by:
noteAllRecordsRemoved in interface IOutputConnector
- Throws:
ManifoldCFException
-
getFormCheckJavascriptMethodName
public java.lang.String getFormCheckJavascriptMethodName(int connectionSequenceNumber)
Obtain the name of the form check javascript method to call.
- Specified by:
getFormCheckJavascriptMethodName in interface IPipelineConnector
- Parameters:
connectionSequenceNumber
- is the unique number of this connection within the job.
- Returns:
- the name of the form check javascript method.
-
getFormPresaveCheckJavascriptMethodName
public java.lang.String getFormPresaveCheckJavascriptMethodName(int connectionSequenceNumber)
Obtain the name of the form presave check javascript method to call.
- Specified by:
getFormPresaveCheckJavascriptMethodName in interface IPipelineConnector
- Parameters:
connectionSequenceNumber
- is the unique number of this connection within the job.
- Returns:
- the name of the form presave check javascript method.
-
outputSpecificationHeader
public void outputSpecificationHeader(IHTTPOutput out, java.util.Locale locale, Specification os, int connectionSequenceNumber, java.util.List<java.lang.String> tabsArray) throws ManifoldCFException, java.io.IOException
Output the specification header section. This method is called in the head section of a job page which has selected an output connection of the current type. Its purpose is to add the required tabs to the list, and to output any javascript methods that might be needed by the job editing HTML.
- Specified by:
outputSpecificationHeader in interface IPipelineConnector
- Parameters:
out
- is the output to which any HTML should be sent.
locale
- is the preferred locale of the output.
os
- is the current output specification for this job.
connectionSequenceNumber
- is the unique number of this connection within the job.
tabsArray
- is a list of tab names. Add to this list any tab names that are specific to the connector.
- Throws:
ManifoldCFException
java.io.IOException
-
outputSpecificationBody
public void outputSpecificationBody(IHTTPOutput out, java.util.Locale locale, Specification os, int connectionSequenceNumber, int actualSequenceNumber, java.lang.String tabName) throws ManifoldCFException, java.io.IOException
Output the specification body section. This method is called in the body section of a job page which has selected an output connection of the current type. Its purpose is to present the required form elements for editing. The coder can presume that the HTML that is output from this configuration will be within appropriate <html>, <body>, and <form> tags. The name of the form is "editjob".
- Specified by:
outputSpecificationBody in interface IPipelineConnector
- Parameters:
out
- is the output to which any HTML should be sent.
locale
- is the preferred locale of the output.
os
- is the current output specification for this job.
connectionSequenceNumber
- is the unique number of this connection within the job.
actualSequenceNumber
- is the connection within the job that has currently been selected.
tabName
- is the current tab name.
- Throws:
ManifoldCFException
java.io.IOException
-
processSpecificationPost
public java.lang.String processSpecificationPost(IPostParameters variableContext, java.util.Locale locale, Specification os, int connectionSequenceNumber) throws ManifoldCFException
Process a specification post. This method is called at the start of a job's edit or view page, whenever there is a possibility that form data for a connection has been posted. Its purpose is to gather form information and modify the output specification accordingly. The name of the posted form is "editjob".
- Specified by:
processSpecificationPost in interface IPipelineConnector
- Parameters:
variableContext
- contains the post data, including binary file-upload information.
locale
- is the preferred locale of the output.
os
- is the current output specification for this job.
connectionSequenceNumber
- is the unique number of this connection within the job.
- Returns:
- null if all is well, or a string error message if there is an error that should prevent saving of the job (and cause a redirection to an error page).
- Throws:
ManifoldCFException
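The return contract of processSpecificationPost (null when all is well, an error message when the job must not be saved) can be sketched as follows. Plain maps stand in for `IPostParameters` and `Specification`, and both the form field naming scheme (`"s" + connectionSequenceNumber + "_maxlength"`) and the `maxlength` setting are hypothetical, chosen only to show how the sequence number can keep one connection's form fields apart from another's.

```java
import java.util.Map;

// Stand-in sketch: maps replace IPostParameters and Specification.
final class SpecificationPostSketch {
  /**
   * Gather posted form data into the specification.
   * Returns null on success, or an error message that should block saving the job.
   */
  static String processSpecificationPost(Map<String, String> postParameters,
                                         Map<String, String> specification,
                                         int connectionSequenceNumber) {
    // Hypothetical field name, made unique per connection within the job.
    String fieldName = "s" + connectionSequenceNumber + "_maxlength";
    String posted = postParameters.get(fieldName);
    if (posted == null) {
      // Nothing was posted for this connection; leave the specification alone.
      return null;
    }
    try {
      long maxLength = Long.parseLong(posted.trim());
      if (maxLength < 0) {
        return "Maximum length must not be negative";
      }
      specification.put("maxlength", Long.toString(maxLength));
      return null;
    } catch (NumberFormatException e) {
      return "Maximum length must be an integer: " + posted;
    }
  }
}
```

The specification is modified only when validation succeeds, so a rejected post leaves the previous settings intact.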
-
viewSpecification
public void viewSpecification(IHTTPOutput out, java.util.Locale locale, Specification os, int connectionSequenceNumber) throws ManifoldCFException, java.io.IOException
View specification. This method is called in the body section of a job's view page. Its purpose is to present the output specification information to the user. The coder can presume that the HTML that is output from this configuration will be within appropriate <html> and <body> tags.
- Specified by:
viewSpecification in interface IPipelineConnector
- Parameters:
out
- is the output to which any HTML should be sent.
locale
- is the preferred locale of the output.
os
- is the current output specification for this job.
connectionSequenceNumber
- is the unique number of this connection within the job.
- Throws:
ManifoldCFException
java.io.IOException
-
-