Class IncrementalIngester.MonitoredAddActivityWrapper
java.lang.Object
    org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.MonitoredAddActivityWrapper
- All Implemented Interfaces:
IOutputAddActivity, IOutputCheckActivity, IOutputHistoryActivity, IOutputQualifyActivity
- Enclosing class:
- IncrementalIngester
protected static class IncrementalIngester.MonitoredAddActivityWrapper extends java.lang.Object implements IOutputAddActivity
This class passes everything through, and monitors what happens so that the framework can compensate for any transformation connector coding errors.
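The pass-through-and-monitor idea can be sketched as follows. This is a minimal illustration, not the ManifoldCF source: AddActivity is a hypothetical, cut-down stand-in for IOutputAddActivity, the document is reduced to a String, and the rule that both sendDocument and noDocument set the flag is an assumption inferred from the method names on this page. The compensation step in runStep is likewise only one plausible form of what the framework might do.

```java
// Hypothetical, cut-down stand-in for IOutputAddActivity (not the real interface).
interface AddActivity {
  int sendDocument(String documentURI, String document);
  void noDocument();
}

// Sketch of a wrapper that delegates every call and remembers whether
// the wrapped code acted on the document at all.
class MonitoringWrapper implements AddActivity {
  protected final AddActivity activities;
  protected boolean documentProcessed = false;

  MonitoringWrapper(AddActivity activities) {
    this.activities = activities;
  }

  public boolean wasDocumentActedUpon() {
    return documentProcessed;
  }

  @Override
  public int sendDocument(String documentURI, String document) {
    int status = activities.sendDocument(documentURI, document);
    documentProcessed = true; // assumption: a completed send counts as "acted upon"
    return status;
  }

  @Override
  public void noDocument() {
    activities.noDocument();
    documentProcessed = true; // assumption: an explicit "no document" also counts
  }
}

// Records what reached the downstream stage, for demonstration only.
class RecordingSink implements AddActivity {
  final StringBuilder log = new StringBuilder();

  @Override
  public int sendDocument(String documentURI, String document) {
    log.append("send:").append(documentURI).append(';');
    return 0; // illustrative status value
  }

  @Override
  public void noDocument() {
    log.append("none;");
  }
}

class MonitorDemo {
  // Runs a (possibly buggy) transformation step. If the step neither sent a
  // document nor called noDocument(), the caller compensates by doing so itself.
  static String runStep(java.util.function.Consumer<AddActivity> step) {
    RecordingSink sink = new RecordingSink();
    MonitoringWrapper wrapper = new MonitoringWrapper(sink);
    step.accept(wrapper);
    if (!wrapper.wasDocumentActedUpon()) {
      wrapper.noDocument();
    }
    return sink.log.toString();
  }

  public static void main(String[] args) {
    System.out.println(runStep(a -> a.sendDocument("doc1", "body")));
    System.out.println(runStep(a -> { /* buggy step: forgets to act */ }));
  }
}
```

Keeping the flag inside the wrapper means the wrapped connector needs no changes: it calls the same methods it always would, and the caller inspects the wrapper afterwards.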
-
-
Field Summary
Fields (Modifier and Type, Field)
- protected IOutputAddActivity activities
- protected boolean documentProcessed
Fields inherited from interface org.apache.manifoldcf.agents.interfaces.IOutputAddActivity
_rcsid
-
Fields inherited from interface org.apache.manifoldcf.agents.interfaces.IOutputHistoryActivity
CREATED_DIRECTORY, EXCEPTION, EXCLUDED_CONTENT, EXCLUDED_DATE, EXCLUDED_LENGTH, EXCLUDED_MIMETYPE, EXCLUDED_URL, HTTP_ERROR, IOEXCEPTION, UNKNOWN_SECURITY
-
-
Constructor Summary
Constructors
- MonitoredAddActivityWrapper(IOutputAddActivity activities)
-
Method Summary
Methods (Modifier and Type, Method, Description)
- boolean checkDateIndexable(java.util.Date date): Detect if a date is acceptable downstream or not.
- boolean checkDocumentIndexable(java.io.File localFile): Pre-determine whether a document (passed here as a File object) is acceptable downstream.
- boolean checkLengthIndexable(long length): Pre-determine whether a document's length is acceptable downstream.
- boolean checkMimeTypeIndexable(java.lang.String mimeType): Detect if a mime type is acceptable downstream or not.
- boolean checkURLIndexable(java.lang.String url): Pre-determine whether a document's URL is acceptable downstream.
- void noDocument(): Send NO document via the pipeline to the next output connection.
- java.lang.String qualifyAccessToken(java.lang.String authorityNameString, java.lang.String accessToken): Qualify an access token appropriately, to match access tokens as returned by mod_aa.
- void recordActivity(java.lang.Long startTime, java.lang.String activityType, java.lang.Long dataSize, java.lang.String entityURI, java.lang.String resultCode, java.lang.String resultDescription): Record time-stamped information about the activity of the output connector.
- int sendDocument(java.lang.String documentURI, RepositoryDocument document): Send a document via the pipeline to the next output connection.
- boolean wasDocumentActedUpon()
-
-
-
Field Detail
-
activities
protected final IOutputAddActivity activities
-
documentProcessed
protected boolean documentProcessed
-
-
Constructor Detail
-
MonitoredAddActivityWrapper
public MonitoredAddActivityWrapper(IOutputAddActivity activities)
-
-
Method Detail
-
wasDocumentActedUpon
public boolean wasDocumentActedUpon()
-
sendDocument
public int sendDocument(java.lang.String documentURI, RepositoryDocument document) throws ManifoldCFException, ServiceInterruption, java.io.IOException
Send a document via the pipeline to the next output connection.
- Specified by:
sendDocument in interface IOutputAddActivity
- Parameters:
documentURI - is the document's URI.
document - is the document data to be processed (handed to the output data store).
- Returns:
the document status (accepted or permanently rejected); return codes are listed in IPipelineConnector.
- Throws:
java.io.IOException - only if there's an IO error reading the data from the document.
ManifoldCFException
ServiceInterruption
-
noDocument
public void noDocument() throws ManifoldCFException, ServiceInterruption
Send NO document via the pipeline to the next output connection. This is equivalent to sending an empty document placeholder.
- Specified by:
noDocument in interface IOutputAddActivity
- Throws:
ManifoldCFException
ServiceInterruption
-
qualifyAccessToken
public java.lang.String qualifyAccessToken(java.lang.String authorityNameString, java.lang.String accessToken) throws ManifoldCFException
Qualify an access token appropriately, to match access tokens as returned by mod_aa. This method includes the authority name with the access token, if any, so that each authority may establish its own token space.
- Specified by:
qualifyAccessToken in interface IOutputQualifyActivity
- Parameters:
authorityNameString - is the name of the authority to use to qualify the access token.
accessToken - is the raw, repository access token.
- Returns:
the properly qualified access token.
- Throws:
ManifoldCFException
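To see why qualification gives each authority its own token space, consider the sketch below. It is illustrative only: the ':' separator, the lack of escaping, and the handling of a null authority name are all assumptions, since this page does not show the format ManifoldCF actually produces. TokenSpaceDemo and the authority names are hypothetical.

```java
class TokenSpaceDemo {
  // Illustrative only: joins the authority name and the raw token with ':'.
  // The real ManifoldCF format (separator, escaping) is not shown on this page.
  static String qualify(String authorityName, String accessToken) {
    if (authorityName == null) {
      return accessToken; // assumption: no authority means the token is left as-is
    }
    return authorityName + ":" + accessToken;
  }

  public static void main(String[] args) {
    // The same raw token under two authorities yields two distinct qualified tokens.
    System.out.println(qualify("AuthorityA", "S-1-5-32-544"));
    System.out.println(qualify("AuthorityB", "S-1-5-32-544"));
  }
}
```

Because the authority name is part of the qualified token, identical raw tokens issued by different authorities can never be mistaken for one another downstream.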
-
recordActivity
public void recordActivity(java.lang.Long startTime, java.lang.String activityType, java.lang.Long dataSize, java.lang.String entityURI, java.lang.String resultCode, java.lang.String resultDescription) throws ManifoldCFException
Record time-stamped information about the activity of the output connector.
- Specified by:
recordActivity in interface IOutputHistoryActivity
- Parameters:
startTime - is either null or the time since the start of epoch in milliseconds (Jan 1, 1970). Every activity has an associated time; the startTime field records when the activity began. A null value indicates that the start time and the finishing time are the same.
activityType - is a string which is fully interpretable only in the context of the connector involved, which is used to categorize what kind of activity is being recorded. For example, a web connector might record a "fetch document" activity. Cannot be null.
dataSize - is the number of bytes of data involved in the activity, or null if not applicable.
entityURI - is a (possibly long) string which identifies the object involved in the history record. The interpretation of this field will differ from connector to connector. May be null.
resultCode - contains a terse description of the result of the activity. The description is limited in size to 255 characters, and can be interpreted only in the context of the current connector. May be null.
resultDescription - is a (possibly long) human-readable string which adds detail, if required, to the result described in the resultCode field. This field is not meant to be queried on. May be null.
- Throws:
ManifoldCFException
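A common calling pattern is to capture the start time before the work begins and to record the outcome afterwards. The sketch below assumes a hypothetical HistoryActivity interface that mirrors only the recordActivity signature above, without the checked exception. The "fetch document" activity type comes from the parameter description; the "OK" result code, the URI, and the in-memory store are illustrative.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in mirroring only the recordActivity signature above.
interface HistoryActivity {
  void recordActivity(Long startTime, String activityType, Long dataSize,
                      String entityURI, String resultCode, String resultDescription);
}

// Keeps records in memory so the example can run without ManifoldCF.
class InMemoryHistory implements HistoryActivity {
  final List<String> records = new ArrayList<>();

  @Override
  public void recordActivity(Long startTime, String activityType, Long dataSize,
                             String entityURI, String resultCode, String resultDescription) {
    if (activityType == null) {
      throw new IllegalArgumentException("activityType cannot be null");
    }
    records.add(activityType + "|" + entityURI + "|" + dataSize + "|" + resultCode);
  }
}

class HistoryDemo {
  static void fetchAndRecord(HistoryActivity history, String uri, byte[] content) {
    long startTime = System.currentTimeMillis(); // milliseconds since Jan 1, 1970
    // ... the actual work (fetching, indexing) would happen here ...
    history.recordActivity(startTime, "fetch document", (long) content.length,
        uri, "OK", null);
  }

  public static void main(String[] args) {
    InMemoryHistory history = new InMemoryHistory();
    fetchAndRecord(history, "http://example.com/a", new byte[] {1, 2, 3});
    System.out.println(history.records);
  }
}
```

Passing null for startTime instead would signal that the activity started and finished at the same moment, as the parameter description states.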
-
checkDateIndexable
public boolean checkDateIndexable(java.util.Date date) throws ManifoldCFException, ServiceInterruption
Detect if a date is acceptable downstream or not. This method is used to determine whether it makes sense to fetch a document in the first place.
- Specified by:
checkDateIndexable in interface IOutputCheckActivity
- Parameters:
date - is the date of the document.
- Returns:
true if the document described by the date can be accepted by the downstream connection.
- Throws:
ManifoldCFException
ServiceInterruption
-
checkMimeTypeIndexable
public boolean checkMimeTypeIndexable(java.lang.String mimeType) throws ManifoldCFException, ServiceInterruption
Detect if a mime type is acceptable downstream or not. This method is used to determine whether it makes sense to fetch a document in the first place.
- Specified by:
checkMimeTypeIndexable in interface IOutputCheckActivity
- Parameters:
mimeType - is the mime type of the document.
- Returns:
true if the mime type can be accepted by the downstream connection.
- Throws:
ManifoldCFException
ServiceInterruption
-
checkDocumentIndexable
public boolean checkDocumentIndexable(java.io.File localFile) throws ManifoldCFException, ServiceInterruption
Pre-determine whether a document (passed here as a File object) is acceptable downstream. This method is used to determine whether a document actually needs to be transferred. This hook is provided mainly to support search engines that only handle a small set of accepted file types.
- Specified by:
checkDocumentIndexable in interface IOutputCheckActivity
- Parameters:
localFile - is the local file to check.
- Returns:
true if the file is acceptable to the downstream connection.
- Throws:
ManifoldCFException
ServiceInterruption
-
checkLengthIndexable
public boolean checkLengthIndexable(long length) throws ManifoldCFException, ServiceInterruption
Pre-determine whether a document's length is acceptable downstream. This method is used to determine whether to fetch a document in the first place.
- Specified by:
checkLengthIndexable in interface IOutputCheckActivity
- Parameters:
length - is the length of the document.
- Returns:
true if a document of this length is acceptable to the downstream connection.
- Throws:
ManifoldCFException
ServiceInterruption
-
checkURLIndexable
public boolean checkURLIndexable(java.lang.String url) throws ManifoldCFException, ServiceInterruption
Pre-determine whether a document's URL is acceptable downstream. This method is used to help filter out, in advance, documents that cannot be indexed.
- Specified by:
checkURLIndexable in interface IOutputCheckActivity
- Parameters:
url - is the URL of the document.
- Returns:
true if the URL is acceptable to the downstream connection.
- Throws:
ManifoldCFException
ServiceInterruption
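Taken together, the check methods let a caller rule a document out before paying the cost of fetching or transferring it. The sketch below assumes a hypothetical CheckActivity interface containing a subset of the methods above, with the checked exceptions omitted. The downstream policy in SamplePolicy and the order of the checks are illustrative and are not taken from ManifoldCF.

```java
// Hypothetical subset of IOutputCheckActivity, without the checked exceptions.
interface CheckActivity {
  boolean checkURLIndexable(String url);
  boolean checkMimeTypeIndexable(String mimeType);
  boolean checkLengthIndexable(long length);
}

// Illustrative downstream policy: http(s) URLs, plain text or HTML, at most 1 MiB.
class SamplePolicy implements CheckActivity {
  @Override
  public boolean checkURLIndexable(String url) {
    return url.startsWith("http://") || url.startsWith("https://");
  }

  @Override
  public boolean checkMimeTypeIndexable(String mimeType) {
    return mimeType.equals("text/plain") || mimeType.equals("text/html");
  }

  @Override
  public boolean checkLengthIndexable(long length) {
    return length <= 1024L * 1024L;
  }
}

class PreFilterDemo {
  // Returns true only if every cheap check passes, so the fetch is worth doing.
  static boolean shouldFetch(CheckActivity checks, String url, String mimeType, long length) {
    return checks.checkURLIndexable(url)
        && checks.checkMimeTypeIndexable(mimeType)
        && checks.checkLengthIndexable(length);
  }

  public static void main(String[] args) {
    CheckActivity checks = new SamplePolicy();
    System.out.println(shouldFetch(checks, "https://example.com/a.html", "text/html", 2048L));
    System.out.println(shouldFetch(checks, "https://example.com/b.zip", "application/zip", 2048L));
  }
}
```

The short-circuiting && stops at the first failed check, so later checks are skipped once a document is already known to be unacceptable.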
-
-