Interface ISeedingActivity
-
- All Superinterfaces:
IAbortActivity, IHistoryActivity, INamingActivity
- All Known Implementing Classes:
SeedingActivity
public interface ISeedingActivity extends IHistoryActivity, INamingActivity, IAbortActivity
This interface abstracts the activities that a seeding operation can perform. See IProcessActivity for a description of the framework's prerequisite event model, which this interface also supports.
-
-
Field Summary
Modifier and Type, Field, Description:
static java.lang.String _rcsid
-
Fields inherited from interface org.apache.manifoldcf.crawler.interfaces.IHistoryActivity
BAD_URL, EXCLUDED_CONTENT, EXCLUDED_DATE, EXCLUDED_LENGTH, EXCLUDED_MIMETYPE, EXCLUDED_URL, NULL_URL
-
-
Method Summary
Modifier and Type, Method, Description:
void addSeedDocument(java.lang.String documentIdentifier)
Record a "seed" document identifier.
void addSeedDocument(java.lang.String documentIdentifier, java.lang.String[] prereqEventNames)
Record a "seed" document identifier.
void addUnqueuedSeedDocument(java.lang.String documentIdentifier)
This method receives document identifiers that should be considered part of the seeds, but do not need to be queued for processing at this time.
-
Methods inherited from interface org.apache.manifoldcf.crawler.interfaces.IAbortActivity
checkJobStillActive
-
Methods inherited from interface org.apache.manifoldcf.crawler.interfaces.IHistoryActivity
recordActivity
-
Methods inherited from interface org.apache.manifoldcf.crawler.interfaces.INamingActivity
createConnectionSpecificString, createGlobalString, createJobSpecificString
-
Field Detail
-
_rcsid
static final java.lang.String _rcsid
- See Also:
- Constant Field Values
-
-
Method Detail
-
addSeedDocument
void addSeedDocument(java.lang.String documentIdentifier, java.lang.String[] prereqEventNames) throws ManifoldCFException
Record a "seed" document identifier. Seeds passed to this method are loaded into the job's queue at the beginning of the job's execution and, for continuous crawling jobs, periodically throughout the crawl. Every document passed to this method is placed on the "pending documents" list and marked as a seed document. All pending documents are then processed to determine whether they have changed or been deleted. Passing more documents than strictly necessary is harmless; it only increases the overall work required, so when in doubt, send more documents rather than fewer.
- Parameters:
documentIdentifier - is the identifier of the document to add to the "pending" queue.
prereqEventNames - is the list of prerequisite events required for this document, or null if none.
- Throws:
ManifoldCFException
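The following is a minimal sketch of how a caller might use the two-argument form, passing null when a document has no prerequisites. It does not use the ManifoldCF classes: `PrereqSeedSink` and `SketchException` are hypothetical stand-ins that mirror only the signature shown above, and the document identifiers and the event name `login:example.com` are invented for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PrereqSeedingSketch {
  /** Hypothetical stand-in for ManifoldCFException; unchecked here only to keep the sketch short. */
  static class SketchException extends RuntimeException {
    SketchException(String message) { super(message); }
  }

  /** Hypothetical stand-in exposing only the two-argument method described above. */
  interface PrereqSeedSink {
    void addSeedDocument(String documentIdentifier, String[] prereqEventNames) throws SketchException;
  }

  /** Seeds one document without prerequisites and one that waits on a made-up event. */
  static void seedSite(PrereqSeedSink activities) throws SketchException {
    // No prerequisite events: pass null, as the parameter description allows.
    activities.addSeedDocument("https://example.com/public", null);
    // One prerequisite event; the event name is an assumption for this sketch.
    activities.addSeedDocument("https://example.com/private", new String[] {"login:example.com"});
  }

  public static void main(String[] args) {
    // Record each call in insertion order so the result can be inspected.
    Map<String, String[]> recorded = new LinkedHashMap<>();
    seedSite(recorded::put);
    for (Map.Entry<String, String[]> entry : recorded.entrySet()) {
      int prereqCount = entry.getValue() == null ? 0 : entry.getValue().length;
      System.out.println(entry.getKey() + " has " + prereqCount + " prerequisite event(s)");
    }
  }
}
```

In a real connector the `activities` argument would be the ISeedingActivity instance supplied by the framework rather than an in-memory map.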
-
addSeedDocument
void addSeedDocument(java.lang.String documentIdentifier) throws ManifoldCFException
Record a "seed" document identifier. Seeds passed to this method are loaded into the job's queue at the beginning of the job's execution and, for continuous crawling jobs, periodically throughout the crawl. Every document passed to this method is placed on the "pending documents" list and marked as a seed document. All pending documents are then processed to determine whether they have changed or been deleted. Passing more documents than strictly necessary is harmless; it only increases the overall work required, so when in doubt, send more documents rather than fewer.
- Parameters:
documentIdentifier - is the identifier of the document to add to the "pending" queue.
- Throws:
ManifoldCFException
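As an illustration of the "more rather than fewer" guidance, the sketch below sends every candidate identifier without filtering or de-duplicating. It is self-contained and does not use the ManifoldCF classes: `SeedSink`, `SketchException`, and `seedAll` are hypothetical names that mirror only the single-argument signature shown above.

```java
import java.util.ArrayList;
import java.util.List;

public class SeedingSketch {
  /** Hypothetical stand-in for ManifoldCFException; unchecked here only to keep the sketch short. */
  static class SketchException extends RuntimeException {
    SketchException(String message) { super(message); }
  }

  /** Hypothetical stand-in exposing only the single-argument method described above. */
  interface SeedSink {
    void addSeedDocument(String documentIdentifier) throws SketchException;
  }

  /** Sends every candidate as a seed and returns how many calls were made. */
  static int seedAll(SeedSink activities, List<String> candidates) throws SketchException {
    int count = 0;
    for (String id : candidates) {
      // Over-reporting only costs extra work, so no filtering is attempted here.
      activities.addSeedDocument(id);
      count++;
    }
    return count;
  }

  public static void main(String[] args) {
    // An in-memory list plays the role of the "pending documents" list.
    List<String> pending = new ArrayList<>();
    int sent = seedAll(pending::add, List.of("doc-1", "doc-2", "doc-2"));
    System.out.println(sent + " seeds sent, pending list: " + pending);
  }
}
```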
-
addUnqueuedSeedDocument
void addUnqueuedSeedDocument(java.lang.String documentIdentifier) throws ManifoldCFException
This method receives document identifiers that should be considered part of the seeds, but do not need to be queued for processing at this time. (This method is used to keep the hopcount tables up to date.) It may receive more identifiers than strictly necessary, including identifiers that may also have been sent to the getDocumentIdentifiers() method. However, the connector must constrain the identifiers it sends by the document specification. This method only needs to be called if the connector supports hopcount determination, which it should signal by returning at least one legal relationship type from the getRelationshipTypes() method.
- Parameters:
documentIdentifier - is the identifier of the document to consider as a seed, but not to put in the "pending" queue.
- Throws:
ManifoldCFException
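One way a hopcount-aware connector might divide its seeds between the two methods is sketched below: seeds believed to have changed are queued, and the remaining seeds are reported as unqueued so they still count as seeds. The split criterion (a set of changed identifiers) is an assumption made for this sketch, and `HopcountSeedSink`, `RecordingSink`, and `SketchException` are hypothetical stand-ins rather than ManifoldCF classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class UnqueuedSeedingSketch {
  /** Hypothetical stand-in for ManifoldCFException; unchecked here only to keep the sketch short. */
  static class SketchException extends RuntimeException {
    SketchException(String message) { super(message); }
  }

  /** Hypothetical stand-in mirroring two of the method signatures documented above. */
  interface HopcountSeedSink {
    void addSeedDocument(String documentIdentifier) throws SketchException;
    void addUnqueuedSeedDocument(String documentIdentifier) throws SketchException;
  }

  /** In-memory recorder used in place of the framework-supplied activity object. */
  static class RecordingSink implements HopcountSeedSink {
    final List<String> queued = new ArrayList<>();
    final List<String> unqueued = new ArrayList<>();
    @Override public void addSeedDocument(String documentIdentifier) { queued.add(documentIdentifier); }
    @Override public void addUnqueuedSeedDocument(String documentIdentifier) { unqueued.add(documentIdentifier); }
  }

  /** Queues changed seeds; reports the rest as unqueued seeds. */
  static void seed(HopcountSeedSink activities, List<String> seeds, Set<String> changed) throws SketchException {
    for (String id : seeds) {
      if (changed.contains(id)) {
        activities.addSeedDocument(id);          // needs processing now
      } else {
        activities.addUnqueuedSeedDocument(id);  // still a seed, but not queued
      }
    }
  }

  public static void main(String[] args) {
    RecordingSink sink = new RecordingSink();
    seed(sink, List.of("a", "b", "c"), Set.of("b"));
    System.out.println("queued=" + sink.queued + " unqueued=" + sink.unqueued);
  }
}
```

The `seeds` list is assumed to already be restricted to identifiers that match the document specification, as the description above requires.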
-
-