Class JobManager

  • All Implemented Interfaces:
    IJobManager

    public class JobManager
    extends java.lang.Object
    implements IJobManager
    This is the main job manager. It provides methods that support both job definition and the threads that execute the jobs.
    • Method Detail

      • writeEnumeratedValues

        protected static void writeEnumeratedValues​(java.io.OutputStream os,
                                                    EnumeratedValues ev)
                                             throws java.io.IOException
        Throws:
        java.io.IOException
      • readEnumeratedValues

        protected EnumeratedValues readEnumeratedValues​(java.io.InputStream is)
                                                 throws java.io.IOException
        Throws:
        java.io.IOException
      • noteConnectorDeregistration

        public void noteConnectorDeregistration​(java.lang.String[] connectionNames)
                                         throws ManifoldCFException
        Note the deregistration of a connector used by the specified connections. This method will be called when the connector is deregistered. Jobs that use these connections must therefore enter appropriate states.
        Specified by:
        noteConnectorDeregistration in interface IJobManager
        Parameters:
        connectionNames - is the set of connection names.
        Throws:
        ManifoldCFException
      • noteConnectionDeregistration

        protected void noteConnectionDeregistration​(java.util.List<java.lang.String> list)
                                             throws ManifoldCFException
        Note deregistration for a batch of connection names.
        Throws:
        ManifoldCFException
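        The public method above takes an array of connection names, while the protected helper handles one batch supplied as a list. A minimal sketch of how such an array might be split into bounded batches follows. The class ConnectionNameBatcher, the method toBatches, and the maxBatchSize limit are hypothetical names chosen for illustration; they are not part of JobManager.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of ManifoldCF: splits connection names into bounded batches.
final class ConnectionNameBatcher {
  private ConnectionNameBatcher() {}

  // Returns consecutive batches, each holding at most maxBatchSize names, in input order.
  static List<List<String>> toBatches(String[] connectionNames, int maxBatchSize) {
    if (maxBatchSize < 1) {
      throw new IllegalArgumentException("maxBatchSize must be at least 1");
    }
    List<List<String>> batches = new ArrayList<>();
    List<String> current = new ArrayList<>();
    for (String name : connectionNames) {
      // Close the current batch once it is full, then start a new one.
      if (current.size() == maxBatchSize) {
        batches.add(current);
        current = new ArrayList<>();
      }
      current.add(name);
    }
    // Keep the final, possibly partial, batch.
    if (!current.isEmpty()) {
      batches.add(current);
    }
    return batches;
  }
}
```

        Each resulting list could then be passed to a per-batch method such as noteConnectionDeregistration.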
      • noteConnectorRegistration

        public void noteConnectorRegistration​(java.lang.String[] connectionNames)
                                       throws ManifoldCFException
        Note the registration of a connector used by the specified connections. This method will be called when a connector is registered, on which the specified connections depend.
        Specified by:
        noteConnectorRegistration in interface IJobManager
        Parameters:
        connectionNames - is the set of connection names.
        Throws:
        ManifoldCFException
      • noteConnectionRegistration

        protected void noteConnectionRegistration​(java.util.List<java.lang.String> list)
                                           throws ManifoldCFException
        Note registration for a batch of connection names.
        Throws:
        ManifoldCFException
      • noteNotificationConnectorDeregistration

        public void noteNotificationConnectorDeregistration​(java.lang.String[] connectionNames)
                                                     throws ManifoldCFException
        Note the deregistration of a notification connector used by the specified connections. This method will be called when the connector is deregistered. Jobs that use these connections must therefore enter appropriate states.
        Specified by:
        noteNotificationConnectorDeregistration in interface IJobManager
        Parameters:
        connectionNames - is the set of connection names.
        Throws:
        ManifoldCFException
      • noteNotificationConnectionDeregistration

        protected void noteNotificationConnectionDeregistration​(java.util.List<java.lang.String> list)
                                                         throws ManifoldCFException
        Note deregistration for a batch of notification connection names.
        Throws:
        ManifoldCFException
      • noteNotificationConnectorRegistration

        public void noteNotificationConnectorRegistration​(java.lang.String[] connectionNames)
                                                   throws ManifoldCFException
        Note the registration of a notification connector used by the specified connections. This method will be called when a connector is registered, on which the specified connections depend.
        Specified by:
        noteNotificationConnectorRegistration in interface IJobManager
        Parameters:
        connectionNames - is the set of connection names.
        Throws:
        ManifoldCFException
      • noteNotificationConnectionRegistration

        protected void noteNotificationConnectionRegistration​(java.util.List<java.lang.String> list)
                                                       throws ManifoldCFException
        Note registration for a batch of notification connection names.
        Throws:
        ManifoldCFException
      • noteOutputConnectorDeregistration

        public void noteOutputConnectorDeregistration​(java.lang.String[] connectionNames)
                                               throws ManifoldCFException
        Note the deregistration of an output connector used by the specified connections. This method will be called when the connector is deregistered. Jobs that use these connections must therefore enter appropriate states.
        Specified by:
        noteOutputConnectorDeregistration in interface IJobManager
        Parameters:
        connectionNames - is the set of connection names.
        Throws:
        ManifoldCFException
      • noteOutputConnectionDeregistration

        protected void noteOutputConnectionDeregistration​(java.util.List<java.lang.String> list)
                                                   throws ManifoldCFException
        Note deregistration for a batch of output connection names.
        Throws:
        ManifoldCFException
      • noteOutputConnectorRegistration

        public void noteOutputConnectorRegistration​(java.lang.String[] connectionNames)
                                             throws ManifoldCFException
        Note the registration of an output connector used by the specified connections. This method will be called when a connector is registered, on which the specified connections depend.
        Specified by:
        noteOutputConnectorRegistration in interface IJobManager
        Parameters:
        connectionNames - is the set of connection names.
        Throws:
        ManifoldCFException
      • noteOutputConnectionRegistration

        protected void noteOutputConnectionRegistration​(java.util.List<java.lang.String> list)
                                                 throws ManifoldCFException
        Note registration for a batch of output connection names.
        Throws:
        ManifoldCFException
      • noteTransformationConnectorDeregistration

        public void noteTransformationConnectorDeregistration​(java.lang.String[] connectionNames)
                                                       throws ManifoldCFException
        Note the deregistration of a transformation connector used by the specified connections. This method will be called when the connector is deregistered. Jobs that use these connections must therefore enter appropriate states.
        Specified by:
        noteTransformationConnectorDeregistration in interface IJobManager
        Parameters:
        connectionNames - is the set of connection names.
        Throws:
        ManifoldCFException
      • noteTransformationConnectionDeregistration

        protected void noteTransformationConnectionDeregistration​(java.util.List<java.lang.String> list)
                                                           throws ManifoldCFException
        Note deregistration for a batch of transformation connection names.
        Throws:
        ManifoldCFException
      • noteTransformationConnectorRegistration

        public void noteTransformationConnectorRegistration​(java.lang.String[] connectionNames)
                                                     throws ManifoldCFException
        Note the registration of a transformation connector used by the specified connections. This method will be called when a connector is registered, on which the specified connections depend.
        Specified by:
        noteTransformationConnectorRegistration in interface IJobManager
        Parameters:
        connectionNames - is the set of connection names.
        Throws:
        ManifoldCFException
      • noteTransformationConnectionRegistration

        protected void noteTransformationConnectionRegistration​(java.util.List<java.lang.String> list)
                                                         throws ManifoldCFException
        Note registration for a batch of transformation connection names.
        Throws:
        ManifoldCFException
      • noteConnectionChange

        public void noteConnectionChange​(java.lang.String connectionName)
                                  throws ManifoldCFException
        Note a change in connection configuration. This method will be called whenever a connection's configuration is modified, or when an external repository change is signalled.
        Specified by:
        noteConnectionChange in interface IJobManager
        Throws:
        ManifoldCFException
      • noteNotificationConnectionChange

        public void noteNotificationConnectionChange​(java.lang.String connectionName)
                                              throws ManifoldCFException
        Note a change in notification connection configuration. This method will be called whenever a notification connection's configuration is modified, or when an external repository change is signalled.
        Specified by:
        noteNotificationConnectionChange in interface IJobManager
        Throws:
        ManifoldCFException
      • noteOutputConnectionChange

        public void noteOutputConnectionChange​(java.lang.String connectionName)
                                        throws ManifoldCFException
        Note a change in output connection configuration. This method will be called whenever an output connection's configuration is modified, or when an external target configuration change is signalled.
        Specified by:
        noteOutputConnectionChange in interface IJobManager
        Throws:
        ManifoldCFException
      • getHopLockName

        protected java.lang.String getHopLockName​(java.lang.Long jobID)
        Get the name of the hop lock for a given job ID.
      • deleteJob

        public void deleteJob​(java.lang.Long id)
                       throws ManifoldCFException
        Delete a job. This method will purge all the records belonging to the job from the database, as well as remove all documents indexed by the job from the index.
        Specified by:
        deleteJob in interface IJobManager
        Parameters:
        id - is the job's identifier.
        Throws:
        ManifoldCFException
      • checkIfReference

        public boolean checkIfReference​(java.lang.String connectionName)
                                 throws ManifoldCFException
        See if there's a reference to a connection name.
        Specified by:
        checkIfReference in interface IJobManager
        Parameters:
        connectionName - is the name of the connection.
        Returns:
        true if there is a reference, false otherwise.
        Throws:
        ManifoldCFException
      • checkIfNotificationReference

        public boolean checkIfNotificationReference​(java.lang.String connectionName)
                                             throws ManifoldCFException
        See if there's a reference to a notification connection name.
        Specified by:
        checkIfNotificationReference in interface IJobManager
        Parameters:
        connectionName - is the name of the connection.
        Returns:
        true if there is a reference, false otherwise.
        Throws:
        ManifoldCFException
      • checkIfOutputReference

        public boolean checkIfOutputReference​(java.lang.String connectionName)
                                       throws ManifoldCFException
        See if there's a reference to an output connection name.
        Specified by:
        checkIfOutputReference in interface IJobManager
        Parameters:
        connectionName - is the name of the connection.
        Returns:
        true if there is a reference, false otherwise.
        Throws:
        ManifoldCFException
      • checkIfTransformationReference

        public boolean checkIfTransformationReference​(java.lang.String connectionName)
                                               throws ManifoldCFException
        See if there's a reference to a transformation connection name.
        Specified by:
        checkIfTransformationReference in interface IJobManager
        Parameters:
        connectionName - is the name of the connection.
        Returns:
        true if there is a reference, false otherwise.
        Throws:
        ManifoldCFException
      • cleanupProcessData

        public void cleanupProcessData​(java.lang.String processID)
                                throws ManifoldCFException
        Reset the job queue for an individual process ID. If a node was shut down in the middle of doing something, sufficient information should be around in the database to allow the node's activities to be cleaned up.
        Specified by:
        cleanupProcessData in interface IJobManager
        Parameters:
        processID - is the process ID of the node we want to clean up after.
        Throws:
        ManifoldCFException
      • cleanupProcessData

        public void cleanupProcessData()
                                throws ManifoldCFException
        Reset the job queue for all process IDs. If a node was shut down in the middle of doing something, sufficient information should be around in the database to allow the node's activities to be cleaned up.
        Specified by:
        cleanupProcessData in interface IJobManager
        Throws:
        ManifoldCFException
      • prepareForClusterStart

        public void prepareForClusterStart()
                                    throws ManifoldCFException
        Prepare to start the entire cluster. If there are no other nodes alive, then at the time the first node comes up, we need to reset the job queue for ALL processes that had been running before. This method must be called in addition to cleanupProcessData().
        Specified by:
        prepareForClusterStart in interface IJobManager
        Throws:
        ManifoldCFException
      • getNextCleanableDocuments

        public DocumentSetAndFlags getNextCleanableDocuments​(java.lang.String processID,
                                                             int maxCount,
                                                             long currentTime)
                                                      throws ManifoldCFException
        Get list of cleanable document descriptions. This list will take into account multiple jobs that may own the same document. All documents for which a description is returned will be transitioned to the "beingcleaned" state. Documents which are not in transition and are eligible, but are owned by other jobs, will have their jobqueue entries deleted by this method.
        Specified by:
        getNextCleanableDocuments in interface IJobManager
        Parameters:
        processID - is the current process ID.
        maxCount - is the maximum number of documents to return.
        currentTime - is the current time; some fetches do not occur until a specific time.
        Returns:
        the document descriptions for these documents.
        Throws:
        ManifoldCFException
      • makeCompositeID

        protected static java.lang.String makeCompositeID​(java.lang.String docIDHash,
                                                          java.lang.String connectionName)
        Create a composite document hash key. This consists of the document id hash plus the connection name.
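        As an illustration of the composite key described above, here is a minimal sketch. The class name CompositeIdSketch and the ':' separator are assumptions; the description only states that the key combines the document id hash with the connection name.

```java
// Hypothetical stand-in illustrating a composite key; not the ManifoldCF implementation.
final class CompositeIdSketch {
  private CompositeIdSketch() {}

  // Joins the document id hash and the connection name.
  // The ':' separator is an assumption made for this sketch.
  static String makeCompositeID(String docIDHash, String connectionName) {
    return docIDHash + ":" + connectionName;
  }
}
```

        Keying on both values lets the same document hash be tracked separately for each connection.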
      • getNextDeletableDocuments

        public DocumentDescription[] getNextDeletableDocuments​(java.lang.String processID,
                                                               int maxCount,
                                                               long currentTime)
                                                        throws ManifoldCFException
        Get list of deletable document descriptions. This list will take into account multiple jobs that may own the same document. All documents for which a description is returned will be transitioned to the "beingdeleted" state. Documents which are not in transition and are eligible, but are owned by other jobs, will have their jobqueue entries deleted by this method.
        Specified by:
        getNextDeletableDocuments in interface IJobManager
        Parameters:
        processID - is the current process ID.
        maxCount - is the maximum number of documents to return.
        currentTime - is the current time; some fetches do not occur until a specific time.
        Returns:
        the document descriptions for these documents.
        Throws:
        ManifoldCFException
      • getUnindexableDocumentIdentifiers

        protected java.lang.String[] getUnindexableDocumentIdentifiers​(DocumentDescription[] documentIdentifiers,
                                                                       java.lang.String connectionName)
                                                                throws ManifoldCFException
        Get a list of document identifiers that should actually be deleted from the index, from a list that might contain identifiers that are shared with other jobs, which are targeted to the same output connection. The input list is guaranteed to be smaller in size than maxInClauseCount for the database.
        Parameters:
        documentIdentifiers - is the set of document identifiers to consider.
        connectionName - is the connection name for ALL the document identifiers.
        Returns:
        the set of documents which should be removed from the index, where there are no potential conflicts.
        Throws:
        ManifoldCFException
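        The filtering idea above can be sketched as follows: from a candidate list, keep only the identifiers that no other job shares. The class UnindexableFilterSketch, the method selectRemovable, and the ownedByOtherJobs set are hypothetical; in JobManager the sharing information comes from the database rather than from an in-memory set, and the real method works with DocumentDescription objects.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch, not the ManifoldCF implementation.
final class UnindexableFilterSketch {
  private UnindexableFilterSketch() {}

  // Returns the candidates that are safe to remove from the index, preserving input order.
  // ownedByOtherJobs is an assumed in-memory view of identifiers shared with other jobs
  // that target the same output connection.
  static String[] selectRemovable(String[] candidateHashes, Set<String> ownedByOtherJobs) {
    List<String> removable = new ArrayList<>();
    for (String hash : candidateHashes) {
      if (!ownedByOtherJobs.contains(hash)) {
        removable.add(hash);
      }
    }
    return removable.toArray(new String[0]);
  }
}
```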
      • clearAllDocumentPriorities

        public void clearAllDocumentPriorities()
                                        throws ManifoldCFException
        Clear all document priorities, in preparation for reprioritization of all previously-prioritized documents. This method starts the dynamic reprioritization cycle, in which all documents are then explicitly prioritized piecemeal using getNextNotYetProcessedReprioritizationDocuments() and writeDocumentPriorities().
        Specified by:
        clearAllDocumentPriorities in interface IJobManager
        Throws:
        ManifoldCFException
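        The reprioritization cycle described above (clear, then fetch and write in batches) might be driven by a loop like the one below. This is a sketch under stated assumptions: PriorityStore is a hypothetical stand-in for the three IJobManager calls, with String and double replacing DocumentDescription and IPriorityCalculator, checked exceptions omitted, and an empty array assumed to mean that no documents remain.

```java
// Hypothetical sketch of the clear / fetch / write cycle; not the ManifoldCF implementation.
final class ReprioritizationSketch {
  private ReprioritizationSketch() {}

  // Simplified stand-in for the three job manager calls named in the description.
  interface PriorityStore {
    void clearAllDocumentPriorities();
    String[] getNextNotYetProcessedReprioritizationDocuments(String processID, int n);
    void writeDocumentPriorities(String[] documents, double[] priorities);
  }

  // Hypothetical priority calculation supplied by the caller.
  interface PriorityFunction {
    double priorityOf(String document);
  }

  // Clears all priorities, then reprioritizes documents batch by batch.
  // Returns the number of documents that received a new priority.
  static int reprioritizeAll(PriorityStore store, String processID, int batchSize, PriorityFunction f) {
    store.clearAllDocumentPriorities();
    int total = 0;
    while (true) {
      String[] docs = store.getNextNotYetProcessedReprioritizationDocuments(processID, batchSize);
      // Assumption: an empty batch means every document has been handled.
      if (docs.length == 0) {
        break;
      }
      double[] priorities = new double[docs.length];
      for (int i = 0; i < docs.length; i++) {
        priorities[i] = f.priorityOf(docs[i]);
      }
      store.writeDocumentPriorities(docs, priorities);
      total += docs.length;
    }
    return total;
  }
}
```

        Working in batches of at most n documents keeps each round of work bounded, which matches the piecemeal wording of the description.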
      • getNextNotYetProcessedReprioritizationDocuments

        public DocumentDescription[] getNextNotYetProcessedReprioritizationDocuments​(java.lang.String processID,
                                                                                     int n)
                                                                              throws ManifoldCFException
        Get a list of not-yet-processed documents to reprioritize. Documents in all jobs will be returned by this method. Up to n document descriptions will be returned.
        Specified by:
        getNextNotYetProcessedReprioritizationDocuments in interface IJobManager
        Parameters:
        processID - is the process that requests the reprioritization documents.
        n - is the maximum number of document descriptions desired.
        Returns:
        the document descriptions.
        Throws:
        ManifoldCFException
      • writeDocumentPriorities

        public void writeDocumentPriorities​(DocumentDescription[] documentDescriptions,
                                            IPriorityCalculator[] priorities)
                                     throws ManifoldCFException
        Save a set of document priorities. If a document was eligible to have its priority set but is no longer eligible, the provided priority will not be written.
        Specified by:
        writeDocumentPriorities in interface IJobManager
        Parameters:
        documentDescriptions - are the document descriptions.
        priorities - are the desired priorities.
        Throws:
        ManifoldCFException
      • getExpiredDocuments

        public DocumentSetAndFlags getExpiredDocuments​(java.lang.String processID,
                                                       int n,
                                                       long currentTime)
                                                throws ManifoldCFException
        Get up to the next n documents to be expired. This method marks the documents whose descriptions have been returned as "being processed", or active. The same marking is used as is used for documents that have been queued for worker threads. The model is thus identical.
        Specified by:
        getExpiredDocuments in interface IJobManager
        Parameters:
        processID - is the current process ID.
        n - is the maximum number of records desired.
        currentTime - is the current time.
        Returns:
        the array of document descriptions to expire.
        Throws:
        ManifoldCFException
      • getNextDocuments

        public DocumentDescription[] getNextDocuments​(java.lang.String processID,
                                                      int n,
                                                      long currentTime,
                                                      long interval,
                                                      BlockingDocuments blockingDocuments,
                                                      PerformanceStatistics statistics,
                                                      DepthStatistics scanRecord)
                                               throws ManifoldCFException
        Get up to the next n documents to be fetched and processed. This fetch returns records that contain the document identifier, plus all instructions pertaining to the document's handling (e.g. whether it should be refetched if the version has not changed). This method also marks the documents whose descriptions have been returned as "being processed".
        Specified by:
        getNextDocuments in interface IJobManager
        Parameters:
        processID - is the current process ID.
        n - is the maximum number of records desired.
        currentTime - is the current time; some fetches do not occur until a specific time.
        interval - is the number of milliseconds that this set of documents should represent (for throttling).
        blockingDocuments - is the place to record documents that were encountered, are eligible for reprioritization, but could not be queued due to throttling considerations.
        statistics - are the current performance statistics per connection, which are used to balance the queue stuffing so that individual connections are not overwhelmed.
        scanRecord - retains the bins from all documents encountered from the query, even those that were skipped due to being overcommitted.
        Returns:
        the array of document descriptions to fetch and process.
        Throws:
        ManifoldCFException
      • checkJobActive

        public boolean checkJobActive​(java.lang.Long jobID)
                               throws ManifoldCFException
        Verify that a specific job is indeed still active. This is used to permit abort or pause to be relatively speedy. The query done within MUST be cached in order to not cause undue performance degradation.
        Specified by:
        checkJobActive in interface IJobManager
        Parameters:
        jobID - is the job identifier.
        Returns:
        true if the job is in one of the "active" states.
        Throws:
        ManifoldCFException
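        The description requires the underlying query to be cached but does not say how. One possible shape, shown purely as a sketch, is a small time-bounded cache in front of the lookup. The class JobActiveCacheSketch, the ttlMillis setting, and the injected query and clock are all hypothetical; this is not the caching mechanism ManifoldCF itself uses.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;
import java.util.function.Predicate;

// Hypothetical time-bounded cache in front of an "is this job active?" query.
final class JobActiveCacheSketch {
  // One cached answer together with the time at which it stops being valid.
  private static final class Entry {
    final boolean active;
    final long expiresAt;

    Entry(boolean active, long expiresAt) {
      this.active = active;
      this.expiresAt = expiresAt;
    }
  }

  private final Map<Long, Entry> cache = new HashMap<>();
  private final Predicate<Long> query;  // the expensive lookup, e.g. a database query
  private final long ttlMillis;         // how long a cached answer may be reused
  private final LongSupplier clock;     // injected so that tests can control time

  JobActiveCacheSketch(Predicate<Long> query, long ttlMillis, LongSupplier clock) {
    this.query = query;
    this.ttlMillis = ttlMillis;
    this.clock = clock;
  }

  // Returns the cached answer when it is still fresh, otherwise re-runs the query.
  synchronized boolean checkJobActive(Long jobID) {
    long now = clock.getAsLong();
    Entry entry = cache.get(jobID);
    if (entry == null || now >= entry.expiresAt) {
      entry = new Entry(query.test(jobID), now + ttlMillis);
      cache.put(jobID, entry);
    }
    return entry.active;
  }
}
```

        A short time-to-live trades a small delay in noticing an abort or pause for far fewer queries from worker threads.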
      • markDocumentCompletedMultiple

        public void markDocumentCompletedMultiple​(DocumentDescription[] documentDescriptions)
                                           throws ManifoldCFException
        Note that a job thread has completed processing of a set of documents. This method causes the state of each document to be marked as "completed".
        Specified by:
        markDocumentCompletedMultiple in interface IJobManager
        Parameters:
        documentDescriptions - are the description objects for the documents that were processed.
        Throws:
        ManifoldCFException
      • markDocumentCompleted

        public void markDocumentCompleted​(DocumentDescription documentDescription)
                                   throws ManifoldCFException
        Note that a job thread has completed processing of a document. This method causes the state of the document to be marked as "completed".
        Specified by:
        markDocumentCompleted in interface IJobManager
        Parameters:
        documentDescription - is the description object for the document that was processed.
        Throws:
        ManifoldCFException
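        The handshake between getNextDocuments (which marks documents as "being processed") and markDocumentCompleted can be sketched as a single worker pass. DocumentQueue below is a hypothetical, heavily simplified stand-in: the real getNextDocuments takes additional throttling and statistics arguments, returns DocumentDescription objects, and throws ManifoldCFException.

```java
import java.util.function.Consumer;

// Hypothetical sketch of one fetch / process / complete pass; not the ManifoldCF worker thread.
final class WorkerLoopSketch {
  private WorkerLoopSketch() {}

  // Simplified stand-in for the two job manager calls involved.
  interface DocumentQueue {
    String[] getNextDocuments(String processID, int n, long currentTime);
    void markDocumentCompleted(String document);
  }

  // Fetches up to n documents, processes each one, and marks it completed.
  // Returns the number of documents handled in this pass.
  static int drainOnce(DocumentQueue queue, String processID, int n, Consumer<String> processor) {
    String[] docs = queue.getNextDocuments(processID, n, System.currentTimeMillis());
    for (String doc : docs) {
      processor.accept(doc);
      // Only mark a document completed after its processing has finished.
      queue.markDocumentCompleted(doc);
    }
    return docs.length;
  }
}
```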
      • markDocumentDeletedMultiple

        public DocumentDescription[] markDocumentDeletedMultiple​(java.lang.Long jobID,
                                                                 java.lang.String[] legalLinkTypes,
                                                                 DocumentDescription[] documentDescriptions,
                                                                 int hopcountMethod)
                                                          throws ManifoldCFException
        Delete from queue as a result of processing of an active document. The document is expected to be in one of the active states: ACTIVE, ACTIVESEEDING, ACTIVENEEDSRESCAN, ACTIVESEEDINGNEEDSRESCAN. The RESCAN variants are interpreted as meaning that the document should not be deleted, but should instead be popped back on the queue for a repeat processing attempt.
        Specified by:
        markDocumentDeletedMultiple in interface IJobManager
        Parameters:
        documentDescriptions - are the set of description objects for the documents that were processed.
        hopcountMethod - describes how to handle deletions for hopcount purposes.
        Returns:
        the set of documents for which carrydown data was changed by this operation. These documents are likely to be requeued as a result of the change.
        Throws:
        ManifoldCFException
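        The state rule above (RESCAN variants are requeued rather than deleted) can be expressed as a small decision function. The four state names come from the description; the class ActiveStateSketch, the Outcome enum, and the method onProcessingDelete are hypothetical and only illustrate the rule.

```java
// Hypothetical sketch of the RESCAN rule; not the ManifoldCF implementation.
final class ActiveStateSketch {
  private ActiveStateSketch() {}

  // The four active states named in the description.
  enum ActiveState { ACTIVE, ACTIVESEEDING, ACTIVENEEDSRESCAN, ACTIVESEEDINGNEEDSRESCAN }

  // What happens to the queue entry; these names are assumptions made for this sketch.
  enum Outcome { DELETE, REQUEUE }

  // RESCAN variants go back on the queue for another attempt; the others are deleted.
  static Outcome onProcessingDelete(ActiveState state) {
    switch (state) {
      case ACTIVENEEDSRESCAN:
      case ACTIVESEEDINGNEEDSRESCAN:
        return Outcome.REQUEUE;
      default:
        return Outcome.DELETE;
    }
  }
}
```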
      • markDocumentDeleted

        public DocumentDescription[] markDocumentDeleted​(java.lang.Long jobID,
                                                         java.lang.String[] legalLinkTypes,
                                                         DocumentDescription documentDescription,
                                                         int hopcountMethod)
                                                  throws ManifoldCFException
        Delete from queue as a result of processing of an active document. The document is expected to be in one of the active states: ACTIVE, ACTIVESEEDING, ACTIVENEEDSRESCAN, ACTIVESEEDINGNEEDSRESCAN. The RESCAN variants are interpreted as meaning that the document should not be deleted, but should instead be popped back on the queue for a repeat processing attempt.
        Specified by:
        markDocumentDeleted in interface IJobManager
        Parameters:
        documentDescription - is the description object for the document that was processed.
        hopcountMethod - describes how to handle deletions for hopcount purposes.
        Returns:
        the set of documents for which carrydown data was changed by this operation. These documents are likely to be requeued as a result of the change.
        Throws:
        ManifoldCFException
      • markDocumentHopcountRemovalMultiple

        public DocumentDescription[] markDocumentHopcountRemovalMultiple​(java.lang.Long jobID,
                                                                         java.lang.String[] legalLinkTypes,
                                                                         DocumentDescription[] documentDescriptions,
                                                                         int hopcountMethod)
                                                                  throws ManifoldCFException
        Mark hopcount removal from queue as a result of processing of an active document. The document is expected to be in one of the active states: ACTIVE, ACTIVESEEDING, ACTIVENEEDSRESCAN, ACTIVESEEDINGNEEDSRESCAN. The RESCAN variants are interpreted as meaning that the document should not be marked as removed, but should instead be popped back on the queue for a repeat processing attempt.
        Specified by:
        markDocumentHopcountRemovalMultiple in interface IJobManager
        Parameters:
        documentDescriptions - are the set of description objects for the documents that were processed.
        hopcountMethod - describes how to handle deletions for hopcount purposes.
        Returns:
        the set of documents for which carrydown data was changed by this operation. These documents are likely to be requeued as a result of the change.
        Throws:
        ManifoldCFException
      • markDocumentHopcountRemoval

        public DocumentDescription[] markDocumentHopcountRemoval​(java.lang.Long jobID,
                                                                 java.lang.String[] legalLinkTypes,
                                                                 DocumentDescription documentDescription,
                                                                 int hopcountMethod)
                                                          throws ManifoldCFException
        Mark hopcount removal from queue as a result of processing of an active document. The document is expected to be in one of the active states: ACTIVE, ACTIVESEEDING, ACTIVENEEDSRESCAN, ACTIVESEEDINGNEEDSRESCAN. The RESCAN variants are interpreted as meaning that the document should not be marked as removed, but should instead be popped back on the queue for a repeat processing attempt.
        Specified by:
        markDocumentHopcountRemoval in interface IJobManager
        Parameters:
        documentDescription - is the description object for the document that was processed.
        hopcountMethod - describes how to handle deletions for hopcount purposes.
        Returns:
        the set of documents for which carrydown data was changed by this operation. These documents are likely to be requeued as a result of the change.
        Throws:
        ManifoldCFException
      • markDocumentExpiredMultiple

        public DocumentDescription[] markDocumentExpiredMultiple​(java.lang.Long jobID,
                                                                 java.lang.String[] legalLinkTypes,
                                                                 DocumentDescription[] documentDescriptions,
                                                                 int hopcountMethod)
                                                          throws ManifoldCFException
        Delete from queue as a result of expiration of an active document. The document is expected to be in one of the active states: ACTIVE, ACTIVESEEDING, ACTIVENEEDSRESCAN, ACTIVESEEDINGNEEDSRESCAN. Since the document expired, no special activity takes place as a result of the document being in a RESCAN state.
        Specified by:
        markDocumentExpiredMultiple in interface IJobManager
        Parameters:
        documentDescriptions - are the set of description objects for the documents that were processed.
        hopcountMethod - describes how to handle deletions for hopcount purposes.
        Returns:
        the set of documents for which carrydown data was changed by this operation. These documents are likely to be requeued as a result of the change.
        Throws:
        ManifoldCFException
      • markDocumentExpired

        public DocumentDescription[] markDocumentExpired​(java.lang.Long jobID,
                                                         java.lang.String[] legalLinkTypes,
                                                         DocumentDescription documentDescription,
                                                         int hopcountMethod)
                                                  throws ManifoldCFException
        Delete from queue as a result of expiration of an active document. The document is expected to be in one of the active states: ACTIVE, ACTIVESEEDING, ACTIVENEEDSRESCAN, ACTIVESEEDINGNEEDSRESCAN. Since the document expired, no special activity takes place as a result of the document being in a RESCAN state.
        Specified by:
        markDocumentExpired in interface IJobManager
        Parameters:
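        jobID - is the job identifier.
        legalLinkTypes - is the set of legal link types that this connector generates.
        documentDescription - is the description object for the document that was processed.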
        hopcountMethod - describes how to handle deletions for hopcount purposes.
        Returns:
        the set of documents for which carrydown data was changed by this operation. These documents are likely to be requeued as a result of the change.
        Throws:
        ManifoldCFException
      • markDocumentCleanedUpMultiple

        public DocumentDescription[] markDocumentCleanedUpMultiple​(java.lang.Long jobID,
                                                                   java.lang.String[] legalLinkTypes,
                                                                   DocumentDescription[] documentDescriptions,
                                                                   int hopcountMethod)
                                                            throws ManifoldCFException
        Delete from the queue as a result of cleaning up unreachable documents. Each document is expected to be in the PURGATORY state. There is never any need to reprocess these documents.
        Specified by:
        markDocumentCleanedUpMultiple in interface IJobManager
        Parameters:
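        jobID - is the job identifier.
        legalLinkTypes - is the set of legal link types that this connector generates.
        documentDescriptions - is the set of description objects for the documents that were processed.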
        hopcountMethod - describes how to handle deletions for hopcount purposes.
        Returns:
        the set of documents for which carrydown data was changed by this operation. These documents are likely to be requeued as a result of the change.
        Throws:
        ManifoldCFException
      • markDocumentCleanedUp

        public DocumentDescription[] markDocumentCleanedUp​(java.lang.Long jobID,
                                                           java.lang.String[] legalLinkTypes,
                                                           DocumentDescription documentDescription,
                                                           int hopcountMethod)
                                                    throws ManifoldCFException
        Delete from queue as a result of cleaning up an unreachable document. The document is expected to be in the PURGATORY state. There is never any need to reprocess the document.
        Specified by:
        markDocumentCleanedUp in interface IJobManager
        Parameters:
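        jobID - is the job identifier.
        legalLinkTypes - is the set of legal link types that this connector generates.
        documentDescription - is the description object for the document that was processed.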
        hopcountMethod - describes how to handle deletions for hopcount purposes.
        Returns:
        the set of documents for which carrydown data was changed by this operation. These documents are likely to be requeued as a result of the change.
        Throws:
        ManifoldCFException
      • doDeleteMultiple

        protected DocumentDescription[] doDeleteMultiple​(java.lang.Long jobID,
                                                         java.lang.String[] legalLinkTypes,
                                                         DocumentDescription[] documentDescriptions,
                                                         int hopcountMethod)
                                                  throws ManifoldCFException
        Delete documents with no repercussions. We don't have to worry about the current state of each document, since the document is definitely going away.
        Parameters:
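        jobID - is the job identifier.
        legalLinkTypes - is the set of legal link types that this connector generates.
        documentDescriptions - is the set of description objects for the documents that were processed.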
        hopcountMethod - describes how to handle deletions for hopcount purposes.
        Returns:
        the set of documents for which carrydown data was changed by this operation. These documents are likely to be requeued as a result of the change.
        Throws:
        ManifoldCFException
      • calculateAffectedDeleteCarrydownChildren

        protected DocumentDescription[] calculateAffectedDeleteCarrydownChildren​(java.lang.Long jobID,
                                                                                 java.lang.String[] docIDHashes)
                                                                          throws ManifoldCFException
        Helper method: Find the document descriptions that will be affected due to carrydown row deletions.
        Throws:
        ManifoldCFException
      • maxClauseProcessDeleteHashSet

        protected int maxClauseProcessDeleteHashSet()
        Get maximum count.
      • processDeleteHashSet

        protected void processDeleteHashSet​(java.lang.Long jobID,
                                            java.util.HashMap resultHash,
                                            java.util.ArrayList list)
                                     throws ManifoldCFException
        Helper method: look up rows affected by a deleteRecords operation.
        Throws:
        ManifoldCFException
      • requeueDocumentMultiple

        public void requeueDocumentMultiple​(DocumentDescription[] documentDescriptions,
                                            java.lang.Long[] executeTimes,
                                            int[] actions)
                                     throws ManifoldCFException
        Requeue a set of documents for further processing in the future. This method is called after the documents are processed, when the job is a "continuous" one. It is essentially equivalent to noting that the document processing is complete, except that the documents remain on the queue.
        Specified by:
        requeueDocumentMultiple in interface IJobManager
        Parameters:
        documentDescriptions - is the set of description objects for the documents that were processed.
        executeTimes - are the times that the documents should be rescanned. Null indicates "never".
        actions - are what should be done when the time arrives. Choices are ACTION_RESCAN or ACTION_REMOVE.
        Throws:
        ManifoldCFException
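        The parallel-array convention described above can be sketched as follows. This is only an illustration, not ManifoldCF code: the class RequeueArgsSketch, the helper buildExecuteTimes, and the numeric values assigned to ACTION_RESCAN and ACTION_REMOVE are hypothetical placeholders. It shows one way a caller might build an executeTimes array in which a null entry means "never".

```java
// Illustrative sketch only; not part of ManifoldCF.
class RequeueArgsSketch {
    // Placeholder values: the real constants are defined by the framework.
    static final int ACTION_RESCAN = 0;
    static final int ACTION_REMOVE = 1;

    // Builds the executeTimes array from per-document delays in milliseconds.
    // A null delay means the document should never be revisited, so the
    // corresponding execute time is also null.
    static Long[] buildExecuteTimes(long now, Long[] delaysMs) {
        Long[] executeTimes = new Long[delaysMs.length];
        for (int i = 0; i < delaysMs.length; i++) {
            executeTimes[i] = (delaysMs[i] == null)
                ? null
                : Long.valueOf(now + delaysMs[i].longValue());
        }
        return executeTimes;
    }

    public static void main(String[] args) {
        long now = 1_000L;
        Long[] delays = { 500L, 2_000L, null };
        // One action per document, in the same order as the delays.
        int[] actions = { ACTION_RESCAN, ACTION_REMOVE, ACTION_RESCAN };
        Long[] times = buildExecuteTimes(now, delays);
        for (int i = 0; i < times.length; i++) {
            System.out.println("document " + i + ": time=" + times[i] + ", action=" + actions[i]);
        }
    }
}
```

        The three arrays passed to requeueDocumentMultiple are indexed together, so entry i of executeTimes and actions applies to entry i of documentDescriptions.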
      • requeueDocument

        public void requeueDocument​(DocumentDescription documentDescription,
                                    java.lang.Long executeTime,
                                    int action)
                             throws ManifoldCFException
        Requeue a document for further processing in the future. This method is called after a document is processed, when the job is a "continuous" one. It is essentially equivalent to noting that the document processing is complete, except the document remains on the queue.
        Specified by:
        requeueDocument in interface IJobManager
        Parameters:
        documentDescription - is the description object for the document that was processed.
        executeTime - is the time that the document should be rescanned. Null indicates "never".
        action - is what should be done when the time arrives. Choices are ACTION_RESCAN or ACTION_REMOVE.
        Throws:
        ManifoldCFException
      • resetDocumentMultiple

        public void resetDocumentMultiple​(DocumentDescription[] documentDescriptions,
                                          long executeTime,
                                          int action,
                                          long failTime,
                                          int failCount)
                                   throws ManifoldCFException
        Reset a set of documents for further processing in the future. This method is called after some unknown number of the documents were processed, but then a service interruption occurred. Note well: the logic here presumes that we cannot know whether the documents were in fact processed. If we knew for certain that none of the documents had been handled, it would be possible to look at each document's current status and decide what the new status ought to be, based on a true rollback scenario. Such cases, however, are rare enough that special logic is probably not worth it.
        Specified by:
        resetDocumentMultiple in interface IJobManager
        Parameters:
        documentDescriptions - is the set of description objects for the documents that were processed.
        executeTime - is the time that the documents should be rescanned.
        action - is what should be done when the time arrives.
        failTime - is the time beyond which a service interruption will be considered a hard failure.
        failCount - is the number of retries beyond which a service interruption will be considered a hard failure.
        Throws:
        ManifoldCFException
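        The failTime and failCount parameters describe two independent limits. A minimal sketch of how such limits might be evaluated is shown below. It is not the JobManager logic: the class HardFailureSketch, the method isHardFailure, and the retriesSoFar argument are hypothetical, and treating -1 as "no limit" is an assumption borrowed from the retryStartup parameter descriptions.

```java
// Illustrative sketch only; not part of ManifoldCF.
class HardFailureSketch {
    // Assumed sentinels meaning "no limit" (borrowed from retryStartup's docs).
    static final long NO_FAIL_TIME = -1L;
    static final int NO_FAIL_COUNT = -1;

    // Returns true when a service interruption should be treated as a hard
    // failure: either the deadline has passed or too many retries occurred.
    static boolean isHardFailure(long now, long failTime, int retriesSoFar, int failCount) {
        boolean pastDeadline = failTime != NO_FAIL_TIME && now > failTime;
        boolean tooManyRetries = failCount != NO_FAIL_COUNT && retriesSoFar > failCount;
        return pastDeadline || tooManyRetries;
    }

    public static void main(String[] args) {
        // Deadline of 100 has passed at time 101, so this is a hard failure.
        System.out.println(isHardFailure(101L, 100L, 0, NO_FAIL_COUNT)); // prints true
        // No limits configured, so an interruption is never a hard failure.
        System.out.println(isHardFailure(101L, NO_FAIL_TIME, 7, NO_FAIL_COUNT)); // prints false
    }
}
```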
      • resetCleaningDocumentMultiple

        public void resetCleaningDocumentMultiple​(DocumentDescription[] documentDescriptions,
                                                  long checkTime)
                                           throws ManifoldCFException
        Reset a set of cleaning documents for further processing in the future. This method is called after some unknown number of the documents were cleaned, but then an ingestion service interruption occurred. Note well: the logic here presumes that we cannot know whether the documents were in fact cleaned. If we knew for certain that none of the documents had been handled, it would be possible to look at each document's current status and decide what the new status ought to be, based on a true rollback scenario. Such cases, however, are rare enough that special logic is probably not worth it.
        Specified by:
        resetCleaningDocumentMultiple in interface IJobManager
        Parameters:
        documentDescriptions - is the set of description objects for the documents that were cleaned.
        checkTime - is the minimum time for the next cleaning attempt.
        Throws:
        ManifoldCFException
      • resetCleaningDocument

        public void resetCleaningDocument​(DocumentDescription documentDescription,
                                          long checkTime)
                                   throws ManifoldCFException
        Reset a cleaning document back to its former state. This gets done when a cleanup thread sees a service interruption, etc., from the ingestion system.
        Specified by:
        resetCleaningDocument in interface IJobManager
        Parameters:
        documentDescription - is the description of the document that was cleaned.
        checkTime - is the minimum time for the next cleaning attempt.
        Throws:
        ManifoldCFException
      • resetDeletingDocumentMultiple

        public void resetDeletingDocumentMultiple​(DocumentDescription[] documentDescriptions,
                                                  long checkTime)
                                           throws ManifoldCFException
        Reset a set of deleting documents for further processing in the future. This method is called after some unknown number of the documents were deleted, but then an ingestion service interruption occurred. Note well: the logic here presumes that we cannot know whether the documents were in fact processed. If we knew for certain that none of the documents had been handled, it would be possible to look at each document's current status and decide what the new status ought to be, based on a true rollback scenario. Such cases, however, are rare enough that special logic is probably not worth it.
        Specified by:
        resetDeletingDocumentMultiple in interface IJobManager
        Parameters:
        documentDescriptions - is the set of description objects for the documents that were processed.
        checkTime - is the minimum time for the next deletion attempt.
        Throws:
        ManifoldCFException
      • resetDeletingDocument

        public void resetDeletingDocument​(DocumentDescription documentDescription,
                                          long checkTime)
                                   throws ManifoldCFException
        Reset a deleting document back to its former state. This gets done when a deleting thread sees a service interruption, etc., from the ingestion system.
        Specified by:
        resetDeletingDocument in interface IJobManager
        Parameters:
        documentDescription - is the description object for the document that was being deleted.
        checkTime - is the minimum time for the next deletion attempt.
        Throws:
        ManifoldCFException
      • resetDocument

        public void resetDocument​(DocumentDescription documentDescription,
                                  long executeTime,
                                  int action,
                                  long failTime,
                                  int failCount)
                           throws ManifoldCFException
        Reset an active document back to its former state. This gets done when there's a service interruption and the document cannot be processed yet. Note well: this method formerly presumed that a perfect rollback was possible, and that there was zero chance of any processing activity occurring before it got called. That assumption appears to be incorrect, so the method now presumes that processing may have occurred. Perfect rollback is thus no longer possible.
        Specified by:
        resetDocument in interface IJobManager
        Parameters:
        documentDescription - is the description object for the document that was processed.
        executeTime - is the time that the document should be rescanned.
        action - is what should be done when the time arrives.
        failTime - is the time that the document should be considered to have failed, if it has not been successfully read until then.
        failCount - is the number of permitted failures before a hard error is signalled.
        Throws:
        ManifoldCFException
      • eliminateDuplicates

        protected static java.lang.String[] eliminateDuplicates​(java.lang.String[] docIDHashes)
        Eliminate duplicates and sort the result.
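        As a rough illustration of "eliminate duplicates and sort", the sketch below uses a TreeSet. It is not the JobManager implementation; the class name EliminateDuplicatesSketch is hypothetical, and sorting by natural String order is an assumption.

```java
import java.util.Arrays;
import java.util.TreeSet;

// Illustrative sketch only; not the actual JobManager implementation.
class EliminateDuplicatesSketch {
    // Returns the distinct hashes in ascending (natural String) order.
    static String[] eliminateDuplicates(String[] docIDHashes) {
        // A TreeSet drops duplicates and keeps its elements sorted.
        TreeSet<String> unique = new TreeSet<>(Arrays.asList(docIDHashes));
        return unique.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] result = eliminateDuplicates(new String[] {"c3", "a1", "c3", "b2"});
        System.out.println(Arrays.toString(result)); // prints [a1, b2, c3]
    }
}
```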
      • buildReorderMap

        protected static java.util.HashMap buildReorderMap​(java.lang.String[] originalIDHashes,
                                                           java.lang.String[] reorderedIDHashes)
        Build a reorder map, describing how to convert an original index into a reordered index.
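        One way to picture a reorder map is shown below. This is a sketch under assumptions, not the JobManager code: the class ReorderMapSketch is hypothetical, and it uses a typed Map<Integer, Integer> where the documented method returns a raw java.util.HashMap. Each original index is mapped to the index at which the same hash appears in the reordered array, so duplicate entries in the original array map to the same reordered index.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; not the actual JobManager implementation.
class ReorderMapSketch {
    static Map<Integer, Integer> buildReorderMap(String[] originalIDHashes,
                                                 String[] reorderedIDHashes) {
        // First record where each hash sits in the reordered array.
        Map<String, Integer> position = new HashMap<>();
        for (int i = 0; i < reorderedIDHashes.length; i++) {
            position.put(reorderedIDHashes[i], i);
        }
        // Then map every original index to its reordered index.
        Map<Integer, Integer> reorder = new HashMap<>();
        for (int i = 0; i < originalIDHashes.length; i++) {
            Integer newIndex = position.get(originalIDHashes[i]);
            if (newIndex != null) {
                reorder.put(i, newIndex);
            }
        }
        return reorder;
    }

    public static void main(String[] args) {
        String[] original = {"c3", "a1", "c3", "b2"};
        String[] reordered = {"a1", "b2", "c3"}; // deduplicated and sorted
        Map<Integer, Integer> map = buildReorderMap(original, reordered);
        for (int i = 0; i < original.length; i++) {
            System.out.println(i + " -> " + map.get(i));
        }
    }
}
```

        With such a map, results computed against the deduplicated, sorted array can be reported back in the caller's original order.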
      • retryStartup

        public void retryStartup​(JobStartRecord jsr,
                                 long failTime,
                                 int failCount)
                          throws ManifoldCFException
        Retry startup.
        Specified by:
        retryStartup in interface IJobManager
        Parameters:
        jsr - is the current job notification record.
        failTime - is the new fail time (-1L if none).
        failCount - is the new fail retry count (-1 if none).
        Throws:
        ManifoldCFException
      • addDocumentsInitial

        public void addDocumentsInitial​(java.lang.String processID,
                                        java.lang.Long jobID,
                                        java.lang.String[] legalLinkTypes,
                                        java.lang.String[] docIDHashes,
                                        java.lang.String[] docIDs,
                                        boolean overrideSchedule,
                                        int hopcountMethod,
                                        IPriorityCalculator[] documentPriorities,
                                        java.lang.String[][] prereqEventNames)
                                 throws ManifoldCFException
        Add an initial set of documents to the queue. This method is called during job startup, when the queue is being loaded. A set of document references is passed to this method, which updates the status of each document in the specified job's queue, according to specific state rules.
        Specified by:
        addDocumentsInitial in interface IJobManager
        Parameters:
        processID - is the current process ID.
        jobID - is the job identifier.
        legalLinkTypes - is the set of legal link types that this connector generates.
        docIDHashes - are the hashes of the local document identifiers (primary key).
        docIDs - are the local document identifiers.
        overrideSchedule - is true if any existing document schedule should be overridden.
        hopcountMethod - is either accurate, nodelete, or neverdelete.
        documentPriorities - are the document priorities corresponding to the document identifiers.
        prereqEventNames - are the events that must be completed before each document can be processed.
        Throws:
        ManifoldCFException
      • addRemainingDocumentsInitial

        public void addRemainingDocumentsInitial​(java.lang.String processID,
                                                 java.lang.Long jobID,
                                                 java.lang.String[] legalLinkTypes,
                                                 java.lang.String[] docIDHashes,
                                                 int hopcountMethod)
                                          throws ManifoldCFException
        Add an initial set of remaining documents to the queue. This method is called during job startup, when the queue is being loaded, to list documents that were NOT included by calling addDocumentsInitial(). Documents listed here are simply designed to enable the framework to get rid of old, invalid seeds. They are not queued for processing.
        Specified by:
        addRemainingDocumentsInitial in interface IJobManager
        Parameters:
        processID - is the current process ID.
        jobID - is the job identifier.
        legalLinkTypes - is the set of legal link types that this connector generates.
        docIDHashes - are the local document identifier hashes.
        hopcountMethod - is either accurate, nodelete, or neverdelete.
        Throws:
        ManifoldCFException
      • doneDocumentsInitial

        public void doneDocumentsInitial​(java.lang.Long jobID,
                                         java.lang.String[] legalLinkTypes,
                                         boolean isPartial,
                                         int hopcountMethod)
                                  throws ManifoldCFException
        Signal that a seeding pass has been done. Call this method at the end of a seeding pass. It is used to perform the bookkeeping necessary to maintain the hopcount table.
        Specified by:
        doneDocumentsInitial in interface IJobManager
        Parameters:
        jobID - is the job identifier.
        legalLinkTypes - is the set of legal link types that this connector generates.
        isPartial - is set if the seeds provided are only a partial list. Some connectors cannot supply a full list of seeds on every seeding iteration; this acknowledges that limitation.
        hopcountMethod - describes how to handle deletions for hopcount purposes.
        Throws:
        ManifoldCFException
      • findHopCounts

        public boolean[] findHopCounts​(java.lang.Long jobID,
                                       java.lang.String[] legalLinkTypes,
                                       java.lang.String[] docIDHashes,
                                       java.lang.String linkType,
                                       int limit,
                                       int hopcountMethod)
                                throws ManifoldCFException
        Get the specified hop counts, with the limit as described.
        Specified by:
        findHopCounts in interface IJobManager
        Parameters:
        jobID - is the job identifier.
        legalLinkTypes - is the set of legal link types that this connector generates.
        docIDHashes - are the hashes for the set of documents to find the hopcount for.
        linkType - is the kind of link to find the hopcount for.
        limit - is the limit, beyond which a negative distance may be returned.
        hopcountMethod - is the method for managing hopcounts that is in effect.
        Returns:
        a vector of booleans corresponding to the documents requested. A true value is returned if the document is within the specified limit, false otherwise.
        Throws:
        ManifoldCFException
      • getAllSeeds

        public java.lang.String[] getAllSeeds​(java.lang.Long jobID)
                                       throws ManifoldCFException
        Get all the current seeds. Returns the seed document identifiers for a job.
        Specified by:
        getAllSeeds in interface IJobManager
        Parameters:
        jobID - is the job identifier.
        Returns:
        the document identifiers that are currently considered to be seeds.
        Throws:
        ManifoldCFException
      • addDocuments

        public void addDocuments​(java.lang.String processID,
                                 java.lang.Long jobID,
                                 java.lang.String[] legalLinkTypes,
                                 java.lang.String[] docIDHashes,
                                 java.lang.String[] docIDs,
                                 java.lang.String parentIdentifierHash,
                                 java.lang.String relationshipType,
                                 int hopcountMethod,
                                 java.lang.String[][] dataNames,
                                 java.lang.Object[][][] dataValues,
                                 IPriorityCalculator[] documentPriorities,
                                 java.lang.String[][] prereqEventNames)
                          throws ManifoldCFException
        Add documents to the queue in bulk. This method is called during document processing, when a set of document references is discovered. The document references are passed to this method, which updates the status of the documents in the specified job's queue, according to specific state rules.
        Specified by:
        addDocuments in interface IJobManager
        Parameters:
        processID - is the process ID.
        jobID - is the job identifier.
        legalLinkTypes - is the set of legal link types that this connector generates.
        docIDHashes - are the local document identifier hashes.
        docIDs - are the local document identifiers.
        parentIdentifierHash - is the optional parent identifier hash of this document. Pass null if none. MUST be present in the case of carrydown information.
        relationshipType - is the optional link type between this document and its parent. Pass null if there is no relationship with a parent.
        hopcountMethod - is the desired method for managing hopcounts.
        dataNames - are the names of the data to carry down to the child from this parent.
        dataValues - are the values to carry down to the child from this parent, corresponding to dataNames above. If CharacterInput objects are passed in here, it is the caller's responsibility to clean these up.
        documentPriorities - are the desired document priorities for the documents.
        prereqEventNames - are the events that must be completed before a document can be queued.
        Throws:
        ManifoldCFException
      • addDocument

        public void addDocument​(java.lang.String processID,
                                java.lang.Long jobID,
                                java.lang.String[] legalLinkTypes,
                                java.lang.String docIDHash,
                                java.lang.String docID,
                                java.lang.String parentIdentifierHash,
                                java.lang.String relationshipType,
                                int hopcountMethod,
                                java.lang.String[] dataNames,
                                java.lang.Object[][] dataValues,
                                IPriorityCalculator priority,
                                java.lang.String[] prereqEventNames)
                         throws ManifoldCFException
        Add a document to the queue. This method is called during document processing, when a document reference is discovered. The document reference is passed to this method, which updates the status of the document in the specified job's queue, according to specific state rules.
        Specified by:
        addDocument in interface IJobManager
        Parameters:
        processID - is the process ID.
        jobID - is the job identifier.
        legalLinkTypes - is the set of legal link types that this connector generates.
        docIDHash - is the local document identifier hash value.
        docID - is the local document identifier.
        parentIdentifierHash - is the optional parent identifier hash of this document. Pass null if none. MUST be present in the case of carrydown information.
        relationshipType - is the optional link type between this document and its parent. Pass null if there is no relationship with a parent.
        hopcountMethod - is the desired method for managing hopcounts.
        dataNames - are the names of the data to carry down to the child from this parent.
        dataValues - are the values to carry down to the child from this parent, corresponding to dataNames above.
        priority - is the desired document priority for the document.
        prereqEventNames - are the events that must be completed before the document can be processed.
        Throws:
        ManifoldCFException
      • revertDocuments

        public void revertDocuments​(java.lang.Long jobID,
                                    java.lang.String[] legalLinkTypes,
                                    java.lang.String[] parentIdentifierHashes)
                             throws ManifoldCFException
        Undo the addition of child documents to the queue, for a set of documents. This method is called at the end of document processing, to back out any incomplete additions to the queue, and restore the status quo ante prior to the incomplete additions. Call this method instead of finishDocuments() if the addition of documents was not completed.
        Specified by:
        revertDocuments in interface IJobManager
        Parameters:
        jobID - is the job identifier.
        legalLinkTypes - is the set of legal link types that this connector generates.
        parentIdentifierHashes - are the hashes of the document identifiers for which child link extraction just took place.
        Throws:
        ManifoldCFException
      • finishDocuments

        public DocumentDescription[] finishDocuments​(java.lang.Long jobID,
                                                     java.lang.String[] legalLinkTypes,
                                                     java.lang.String[] parentIdentifierHashes,
                                                     int hopcountMethod)
                                              throws ManifoldCFException
        Complete adding child documents to the queue, for a set of documents. This method is called at the end of document processing, to help the hopcount tracking engine do its bookkeeping.
        Specified by:
        finishDocuments in interface IJobManager
        Parameters:
        jobID - is the job identifier.
        legalLinkTypes - is the set of legal link types that this connector generates.
        parentIdentifierHashes - are the document identifier hashes for which child link extraction just took place.
        hopcountMethod - describes how to handle deletions for hopcount purposes.
        Returns:
        the set of documents for which carrydown data was changed by this operation. These documents are likely to be requeued as a result of the change.
        Throws:
        ManifoldCFException
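        The relationship between revertDocuments() and finishDocuments() can be sketched as a "finish on success, revert on failure" pattern. The code below is a simplified illustration only: the ChildQueue interface, the RecordingQueue test double, and the queueChildren helper are hypothetical stand-ins with far fewer parameters than the real methods, and they use unchecked exceptions where the real methods throw ManifoldCFException.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; the types here are hypothetical stand-ins.
class FinishOrRevertSketch {
    // Pared-down stand-in for the queue operations.
    interface ChildQueue {
        void addDocument(String parentHash, String childHash);
        void finishDocuments(String[] parentHashes);
        void revertDocuments(String[] parentHashes);
    }

    static void queueChildren(ChildQueue queue, String parentHash, List<String> childHashes) {
        String[] parents = { parentHash };
        try {
            for (String child : childHashes) {
                queue.addDocument(parentHash, child);
            }
        } catch (RuntimeException e) {
            // The additions did not complete: back them out instead of finishing.
            queue.revertDocuments(parents);
            throw e;
        }
        // Every child was added: let the bookkeeping run.
        queue.finishDocuments(parents);
    }

    // Test double that records calls and can fail on one chosen child.
    static class RecordingQueue implements ChildQueue {
        final List<String> calls = new ArrayList<>();
        private final String failOn;

        RecordingQueue(String failOn) { this.failOn = failOn; }

        public void addDocument(String parentHash, String childHash) {
            if (childHash.equals(failOn)) {
                throw new IllegalStateException("simulated failure on " + childHash);
            }
            calls.add("add:" + childHash);
        }
        public void finishDocuments(String[] parentHashes) { calls.add("finish"); }
        public void revertDocuments(String[] parentHashes) { calls.add("revert"); }
    }

    public static void main(String[] args) {
        RecordingQueue ok = new RecordingQueue(null);
        queueChildren(ok, "parent", Arrays.asList("c1", "c2"));
        System.out.println(ok.calls); // prints [add:c1, add:c2, finish]
    }
}
```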
      • calculateAffectedRestoreCarrydownChildren

        protected DocumentDescription[] calculateAffectedRestoreCarrydownChildren​(java.lang.Long jobID,
                                                                                  java.lang.String[] parentIDHashes)
                                                                           throws ManifoldCFException
        Helper method: Calculate the unique set of affected carrydown children resulting from a "restoreRecords" operation.
        Throws:
        ManifoldCFException
      • processParentHashSet

        protected void processParentHashSet​(java.lang.Long jobID,
                                            java.util.HashMap resultHash,
                                            java.util.ArrayList list)
                                     throws ManifoldCFException
        Helper method: look up rows affected by a restoreRecords operation.
        Throws:
        ManifoldCFException
      • beginEventSequence

        public boolean beginEventSequence​(java.lang.String processID,
                                          java.lang.String eventName)
                                   throws ManifoldCFException
        Begin an event sequence.
        Specified by:
        beginEventSequence in interface IJobManager
        Parameters:
        processID - is the current process ID.
        eventName - is the name of the event.
        Returns:
        true if the event could be created, or false if it's already there.
        Throws:
        ManifoldCFException
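        The return contract of beginEventSequence (true if the event could be created, false if it is already there) resembles the contract of Set.add. The in-memory sketch below illustrates only that contract. It is not how JobManager stores events; the class EventSequenceSketch and the method completeEventSequence are hypothetical, and the processID parameter is omitted for brevity.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch only; an in-memory analogy, not the JobManager logic.
class EventSequenceSketch {
    // Thread-safe set of the event names that are currently in progress.
    private final Set<String> activeEvents = ConcurrentHashMap.newKeySet();

    // Returns true if the event was created, false if it already existed.
    boolean beginEventSequence(String eventName) {
        return activeEvents.add(eventName);
    }

    // Hypothetical counterpart that ends the sequence so it can begin again.
    void completeEventSequence(String eventName) {
        activeEvents.remove(eventName);
    }

    public static void main(String[] args) {
        EventSequenceSketch events = new EventSequenceSketch();
        System.out.println(events.beginEventSequence("login")); // prints true
        System.out.println(events.beginEventSequence("login")); // prints false
    }
}
```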
      • carrydownChangeDocumentMultiple

        public void carrydownChangeDocumentMultiple​(DocumentDescription[] documentDescriptions,
                                                    IPriorityCalculator[] docPriorities)
                                             throws ManifoldCFException
        Requeue a document set because of carrydown changes. This method is called when carrydown data is modified for a set of documents. The documents must be requeued for immediate reprocessing, even to the extent that if one is *already* being processed, it will need to be done over again.
        Specified by:
        carrydownChangeDocumentMultiple in interface IJobManager
        Parameters:
        documentDescriptions - is the set of description objects for the documents that have had their parent carrydown information changed.
        docPriorities - are the document priorities to assign to the documents, if needed.
        Throws:
        ManifoldCFException
      • carrydownChangeDocument

        public void carrydownChangeDocument​(DocumentDescription documentDescription,
                                            IPriorityCalculator docPriority)
                                     throws ManifoldCFException
        Requeue a document because of carrydown changes. This method is called when carrydown data is modified for a document. The document must be requeued for immediate reprocessing, even to the extent that if it is *already* being processed, it will need to be done over again.
        Specified by:
        carrydownChangeDocument in interface IJobManager
        Parameters:
        documentDescription - is the description object for the document that has had its parent carrydown information changed.
        docPriority - is the document priority to assign to the document, if needed.
        Throws:
        ManifoldCFException
      • getRandomAmount

        protected long getRandomAmount()
        Get a random amount of time to sleep after a transaction abort.
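        Sleeping for a random interval after a transaction abort is a common way to keep competing threads from retrying in lockstep. A minimal sketch of that idea follows. It is not the JobManager implementation: the class RandomBackoffSketch, the AbortException type, the runWithRetry helper, and the 1 to 50 millisecond bounds are all assumptions made for illustration.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.Supplier;

// Illustrative sketch only; names and bounds are assumptions.
class RandomBackoffSketch {
    static final long MIN_SLEEP_MS = 1L;
    static final long MAX_SLEEP_MS = 50L;

    // Picks a random sleep time in [MIN_SLEEP_MS, MAX_SLEEP_MS].
    static long getRandomAmount() {
        return ThreadLocalRandom.current().nextLong(MIN_SLEEP_MS, MAX_SLEEP_MS + 1);
    }

    // Hypothetical signal that a transaction was aborted and may be retried.
    static class AbortException extends RuntimeException {
    }

    // Runs the transaction, sleeping a random amount between aborted attempts.
    static <T> T runWithRetry(Supplier<T> transaction, int maxAttempts) {
        for (int attempt = 1; ; attempt++) {
            try {
                return transaction.get();
            } catch (AbortException e) {
                if (attempt >= maxAttempts) {
                    throw e;
                }
                try {
                    Thread.sleep(getRandomAmount());
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted during backoff", ie);
                }
            }
        }
    }

    public static void main(String[] args) {
        int[] calls = { 0 };
        String result = runWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new AbortException(); // first two attempts abort
            }
            return "committed";
        }, 5);
        System.out.println(result + " after " + calls[0] + " attempts"); // prints committed after 3 attempts
    }
}
```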
      • retrieveParentData

        public java.lang.String[] retrieveParentData​(java.lang.Long jobID,
                                                     java.lang.String docIDHash,
                                                     java.lang.String dataName)
                                              throws ManifoldCFException
        Retrieve specific parent data for a given document.
        Specified by:
        retrieveParentData in interface IJobManager
        Parameters:
        jobID - is the job identifier.
        docIDHash - is the document identifier hash value.
        dataName - is the kind of data to retrieve.
        Returns:
        the unique data values.
        Throws:
        ManifoldCFException
      • retrieveParentDataAsFiles

        public CharacterInput[] retrieveParentDataAsFiles​(java.lang.Long jobID,
                                                          java.lang.String docIDHash,
                                                          java.lang.String dataName)
                                                   throws ManifoldCFException
        Retrieve specific parent data for a given document, as CharacterInput objects.
        Specified by:
        retrieveParentDataAsFiles in interface IJobManager
        Parameters:
        jobID - is the job identifier.
        docIDHash - is the document identifier hash value.
        dataName - is the kind of data to retrieve.
        Returns:
        the unique data values.
        Throws:
        ManifoldCFException
      • startJobs

        public void startJobs​(long currentTime,
                              java.util.List<java.lang.Long> unwaitList)
                       throws ManifoldCFException
        Start all jobs in need of starting. This method marks all the appropriate jobs as "in progress", which is all that should be needed to start them. The start event should also be logged in the event log. To make it possible for the caller to do this logging, the IDs of the affected jobs are passed back through unwaitList.
        Specified by:
        startJobs in interface IJobManager
        Parameters:
        currentTime - is the current time in milliseconds since epoch.
        unwaitList - is filled in with the set of job ID objects that were resumed.
        Throws:
        ManifoldCFException
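        The caller-supplied list pattern used by startJobs can be sketched as follows. This is a minimal illustration, not ManifoldCF code: the JobStarter interface is a hypothetical stand-in for the one method involved (it omits ManifoldCFException for brevity), and the log line format is invented.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of consuming a list that the callee fills in. */
final class StartJobsSketch {
  /** Hypothetical stand-in for the single job manager method used here. */
  interface JobStarter {
    void startJobs(long currentTime, List<Long> unwaitList);
  }

  /** Start jobs, then build one log line per job ID that was passed back. */
  static List<String> startAndLog(JobStarter starter, long now) {
    List<Long> affected = new ArrayList<>();
    starter.startJobs(now, affected); // the callee adds job IDs to the list
    List<String> logLines = new ArrayList<>();
    for (Long jobID : affected) {
      logLines.add("Job " + jobID + " started at " + now);
    }
    return logLines;
  }
}
```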
      • waitJobs

        public void waitJobs​(long currentTime,
                             java.util.List<java.lang.Long> waitList)
                      throws ManifoldCFException
        Put active or paused jobs in wait state, if they've exceeded their window.
        Specified by:
        waitJobs in interface IJobManager
        Parameters:
        currentTime - is the current time in milliseconds since epoch.
        waitList - is filled in with the set of job IDs that were put into a wait state.
        Throws:
        ManifoldCFException
      • resetJobSchedule

        public void resetJobSchedule​(java.lang.Long jobID)
                              throws ManifoldCFException
        Reset job schedule. This re-evaluates whether the job should be started now. This method would typically be called after a job's scheduling window has been changed.
        Specified by:
        resetJobSchedule in interface IJobManager
        Parameters:
        jobID - is the job identifier.
        Throws:
        ManifoldCFException
      • checkTimeMatch

        protected static java.lang.Long checkTimeMatch​(long startTime,
                                                       long currentTimestamp,
                                                       EnumeratedValues daysOfWeek,
                                                       EnumeratedValues daysOfMonth,
                                                       EnumeratedValues months,
                                                       EnumeratedValues years,
                                                       EnumeratedValues hours,
                                                       EnumeratedValues minutes,
                                                       java.lang.String timezone,
                                                       java.lang.Long duration)
        Check if the specified job parameters have a 'hit' within the specified interval.
        Parameters:
        startTime - is the start time.
        currentTimestamp - is the end time.
        daysOfWeek - is the enumerated days of the week, or null.
        daysOfMonth - is the enumerated days of the month, or null.
        months - is the enumerated months, or null.
        years - is the enumerated years, or null.
        hours - is the enumerated hours, or null.
        minutes - is the enumerated minutes, or null.
        Returns:
        null if there is NO hit within the interval; otherwise the actual time of the hit in milliseconds from epoch is returned.
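        To make the "hit within an interval" idea concrete, here is a simplified sketch that scans the interval minute by minute. It is not the ManifoldCF algorithm: it only considers hours and minutes, uses plain Set<Integer> in place of EnumeratedValues, ignores duration, and the class and method names are hypothetical. As in the description above, a null set means "any value".

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.temporal.ChronoUnit;
import java.util.Set;

/** Hypothetical, simplified schedule matcher; not ManifoldCF code. */
final class ScheduleHitSketch {
  /**
   * Return the first whole minute in [startTime, endTime] whose hour and minute
   * are both allowed, in milliseconds since epoch, or null if there is none.
   */
  static Long firstHit(long startTime, long endTime,
                       Set<Integer> hours, Set<Integer> minutes, String timezone) {
    ZoneId zone = ZoneId.of(timezone);
    // Begin at the first whole minute at or after startTime.
    ZonedDateTime t = Instant.ofEpochMilli(startTime).atZone(zone)
        .truncatedTo(ChronoUnit.MINUTES);
    if (t.toInstant().toEpochMilli() < startTime) {
      t = t.plusMinutes(1);
    }
    while (t.toInstant().toEpochMilli() <= endTime) {
      boolean hourOk = hours == null || hours.contains(t.getHour());
      boolean minuteOk = minutes == null || minutes.contains(t.getMinute());
      if (hourOk && minuteOk) {
        return t.toInstant().toEpochMilli();
      }
      t = t.plusMinutes(1);
    }
    return null; // no hit within the interval
  }
}
```

        A minute-by-minute scan is easy to reason about but slow for long intervals; a production implementation would more likely jump directly to the next candidate time.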
      • manualStart

        public void manualStart​(java.lang.Long jobID)
                         throws ManifoldCFException
        Manually start a job. The specified job will be run REGARDLESS of the timed windows, and will not cease until complete. If the job is already running, this operation will ensure that the job does not pause when its window ends. The job can be manually paused, or manually aborted.
        Specified by:
        manualStart in interface IJobManager
        Parameters:
        jobID - is the ID of the job to start.
        Throws:
        ManifoldCFException
      • manualStart

        public void manualStart​(java.lang.Long jobID,
                                boolean requestMinimum)
                         throws ManifoldCFException
        Manually start a job. The specified job will be run REGARDLESS of the timed windows, and will not cease until complete. If the job is already running, this operation will ensure that the job does not pause when its window ends. The job can be manually paused, or manually aborted.
        Specified by:
        manualStart in interface IJobManager
        Parameters:
        jobID - is the ID of the job to start.
        requestMinimum - is true if a minimal job run is requested.
        Throws:
        ManifoldCFException
      • noteJobStarted

        public void noteJobStarted​(java.lang.Long jobID,
                                   long startTime,
                                   java.lang.String seedingVersion)
                            throws ManifoldCFException
        Note job started.
        Specified by:
        noteJobStarted in interface IJobManager
        Parameters:
        jobID - is the job id.
        startTime - is the job start time.
        seedingVersion - is the seeding version to record with the job start.
        Throws:
        ManifoldCFException
      • noteJobSeeded

        public void noteJobSeeded​(java.lang.Long jobID,
                                  java.lang.String seedingVersion)
                           throws ManifoldCFException
        Note job seeded.
        Specified by:
        noteJobSeeded in interface IJobManager
        Parameters:
        jobID - is the job id.
        seedingVersion - is the job seeding version string to record.
        Throws:
        ManifoldCFException
      • prepareJobScan

        public void prepareJobScan​(java.lang.Long jobID,
                                   java.lang.String[] legalLinkTypes,
                                   int hopcountMethod,
                                   int connectorModel,
                                   boolean continuousJob,
                                   boolean fromBeginningOfTime,
                                   boolean requestMinimum)
                            throws ManifoldCFException
        Prepare a job to be run. This method is called regardless of the details of the job; what differs is only the flags that are passed in. The code inside will determine the appropriate procedures. (This method replaces prepareFullScan() and prepareIncrementalScan().)
        Specified by:
        prepareJobScan in interface IJobManager
        Parameters:
        jobID - is the job id.
        legalLinkTypes - are the link types allowed for the job.
        hopcountMethod - describes how to handle deletions for hopcount purposes.
        connectorModel - is the model used by the connector for the job.
        continuousJob - is true if the job is a continuous one.
        fromBeginningOfTime - is true if the job is running starting from time 0.
        requestMinimum - is true if the minimal amount of work is requested for the job run.
        Throws:
        ManifoldCFException
      • queueAllExisting

        protected void queueAllExisting​(java.lang.Long jobID,
                                        java.lang.String[] legalLinkTypes)
                                 throws ManifoldCFException
        Queue all existing.
        Parameters:
        jobID - is the job id.
        legalLinkTypes - are the link types allowed for the job.
        Throws:
        ManifoldCFException
      • prepareFullScan

        protected void prepareFullScan​(java.lang.Long jobID,
                                       java.lang.String[] legalLinkTypes,
                                       int hopcountMethod)
                                throws ManifoldCFException
        Prepare for a full scan.
        Parameters:
        jobID - is the job id.
        legalLinkTypes - are the link types allowed for the job.
        hopcountMethod - describes how to handle deletions for hopcount purposes.
        Throws:
        ManifoldCFException
      • manualAbort

        public void manualAbort​(java.lang.Long jobID)
                         throws ManifoldCFException
        Manually abort a running job. The job will be permanently stopped, and will not run again until automatically started based on schedule, or manually started.
        Specified by:
        manualAbort in interface IJobManager
        Parameters:
        jobID - is the job to abort.
        Throws:
        ManifoldCFException
      • manualAbortRestart

        public void manualAbortRestart​(java.lang.Long jobID,
                                       boolean requestMinimum)
                                throws ManifoldCFException
        Manually restart a running job. The job will be stopped and restarted. Any schedule affinity will be lost, until the job finishes on its own.
        Specified by:
        manualAbortRestart in interface IJobManager
        Parameters:
        jobID - is the job to abort and restart.
        requestMinimum - is true if a minimal job run is requested.
        Throws:
        ManifoldCFException
      • manualAbortRestart

        public void manualAbortRestart​(java.lang.Long jobID)
                                throws ManifoldCFException
        Manually restart a running job. The job will be stopped and restarted. Any schedule affinity will be lost, until the job finishes on its own.
        Specified by:
        manualAbortRestart in interface IJobManager
        Parameters:
        jobID - is the job to abort and restart.
        Throws:
        ManifoldCFException
      • errorAbort

        public boolean errorAbort​(java.lang.Long jobID,
                                  java.lang.String errorText)
                           throws ManifoldCFException
        Abort a running job due to a fatal error condition.
        Specified by:
        errorAbort in interface IJobManager
        Parameters:
        jobID - is the job to abort.
        errorText - is the error text.
        Returns:
        true if this is the first logged abort request for this job.
        Throws:
        ManifoldCFException
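        The "first logged abort request" return value of errorAbort can be pictured with an in-memory analogue. This sketch is an assumption-laden illustration: the real job manager keeps job state in a database, whereas this hypothetical class uses a ConcurrentHashMap, and its names are invented.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical in-memory analogue of "first abort request wins". */
final class FirstAbortSketch {
  private final ConcurrentMap<Long, String> abortReasons = new ConcurrentHashMap<>();

  /** Record the error text; return true only for the first request per job. */
  boolean recordAbort(Long jobID, String errorText) {
    // putIfAbsent returns null only when no earlier value was stored.
    return abortReasons.putIfAbsent(jobID, errorText) == null;
  }

  /** Return the error text recorded by the first request, or null if none. */
  String reasonFor(Long jobID) {
    return abortReasons.get(jobID);
  }
}
```

        A caller can use the boolean to log the error once, even when several worker threads report a failure for the same job.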
      • getJobsReadyForSeeding

        public JobSeedingRecord[] getJobsReadyForSeeding​(java.lang.String processID,
                                                         long currentTime)
                                                  throws ManifoldCFException
        Get the list of jobs that are ready for seeding.
        Specified by:
        getJobsReadyForSeeding in interface IJobManager
        Parameters:
        processID - is the current process ID.
        currentTime - is the current time in milliseconds since epoch.
        Returns:
        jobs that are active and are running in adaptive mode. These will be seeded based on what the connector says should be added to the queue.
        Throws:
        ManifoldCFException
      • getJobsReadyForStartup

        public JobStartRecord[] getJobsReadyForStartup​(java.lang.String processID)
                                                throws ManifoldCFException
        Get the list of jobs that are ready for startup.
        Specified by:
        getJobsReadyForStartup in interface IJobManager
        Parameters:
        processID - is the current process ID.
        Returns:
        jobs that were in the "readyforstartup" state. These will be marked as being in the "starting up" state.
        Throws:
        ManifoldCFException
      • getJobsReadyForInactivity

        public JobNotifyRecord[] getJobsReadyForInactivity​(java.lang.String processID)
                                                    throws ManifoldCFException
        Find the list of jobs that need to have their connectors notified of job completion.
        Specified by:
        getJobsReadyForInactivity in interface IJobManager
        Parameters:
        processID - is the process ID.
        Returns:
        the IDs of jobs that need their output connectors notified in order to become inactive.
        Throws:
        ManifoldCFException
      • getJobsReadyForDelete

        public JobNotifyRecord[] getJobsReadyForDelete​(java.lang.String processID)
                                                throws ManifoldCFException
        Find the list of jobs that need to have their connectors notified of job deletion.
        Specified by:
        getJobsReadyForDelete in interface IJobManager
        Parameters:
        processID - is the process ID.
        Returns:
        the IDs of jobs that need their output connectors notified in order to be removed.
        Throws:
        ManifoldCFException
      • finishJobResumes

        public void finishJobResumes​(long timestamp,
                                     java.util.List<IJobDescription> modifiedJobs)
                              throws ManifoldCFException
        Complete the sequence that resumes jobs, either from a pause or from a scheduling window wait. The logic will restore the job to an active state (many possibilities depending on connector status), and will record the jobs that have been so modified.
        Specified by:
        finishJobResumes in interface IJobManager
        Parameters:
        timestamp - is the current time in milliseconds since epoch.
        modifiedJobs - is filled in with the set of IJobDescription objects that were resumed.
        Throws:
        ManifoldCFException
      • finishJobStops

        public void finishJobStops​(long timestamp,
                                   java.util.List<IJobDescription> modifiedJobs,
                                   java.util.List<java.lang.Integer> stopNotificationTypes)
                            throws ManifoldCFException
        Complete the sequence that stops jobs, either for abort, pause, or because of a scheduling window. The logic will move the job to its next state (INACTIVE, PAUSED, ACTIVEWAIT), and will record the jobs that have been so modified.
        Specified by:
        finishJobStops in interface IJobManager
        Parameters:
        timestamp - is the current time in milliseconds since epoch.
        modifiedJobs - is filled in with the set of IJobDescription objects that were stopped.
        stopNotificationTypes - is filled in with the type of stop notification.
        Throws:
        ManifoldCFException
      • mapToNotificationType

        protected static java.lang.Integer mapToNotificationType​(int jobStatus,
                                                                 boolean noErrorText)
      • resetJobs

        public void resetJobs​(long currentTime,
                              java.util.List<IJobDescription> resetJobs)
                       throws ManifoldCFException
        Reset eligible jobs either back to the "inactive" state, or make them active again. The latter will occur if the cleanup phase of the job generated more pending documents. This method is used to pick up all jobs in the shutting down state whose purgatory or being-cleaned records have all been processed.
        Specified by:
        resetJobs in interface IJobManager
        Parameters:
        currentTime - is the current time in milliseconds since epoch.
        resetJobs - is filled in with the set of IJobDescription objects that were reset.
        Throws:
        ManifoldCFException
      • getStatus

        public JobStatus getStatus​(java.lang.Long jobID,
                                   boolean includeCounts)
                            throws ManifoldCFException
        Get the status of a job.
        Specified by:
        getStatus in interface IJobManager
        Parameters:
        jobID - is the job ID.
        includeCounts - is true if document counts should be included.
        Returns:
        the status object for the specified job.
        Throws:
        ManifoldCFException
      • getAllStatus

        public JobStatus[] getAllStatus​(boolean includeCounts)
                                 throws ManifoldCFException
        Get a list of all jobs, and their status information.
        Specified by:
        getAllStatus in interface IJobManager
        Parameters:
        includeCounts - is true if document counts should be included.
        Returns:
        an ordered array of job status objects.
        Throws:
        ManifoldCFException
      • getRunningJobs

        public JobStatus[] getRunningJobs​(boolean includeCounts)
                                   throws ManifoldCFException
        Get a list of running jobs. This is for status reporting.
        Specified by:
        getRunningJobs in interface IJobManager
        Parameters:
        includeCounts - is true if document counts should be included.
        Returns:
        an array of the job status objects.
        Throws:
        ManifoldCFException
      • getFinishedJobs

        public JobStatus[] getFinishedJobs​(boolean includeCounts)
                                    throws ManifoldCFException
        Get a list of completed jobs, and their statistics.
        Specified by:
        getFinishedJobs in interface IJobManager
        Parameters:
        includeCounts - is true if document counts should be included.
        Returns:
        an array of the job status objects.
        Throws:
        ManifoldCFException
      • getStatus

        public JobStatus getStatus​(java.lang.Long jobID,
                                   boolean includeCounts,
                                   int maxCount)
                            throws ManifoldCFException
        Get the status of a job.
        Specified by:
        getStatus in interface IJobManager
        Parameters:
        jobID - is the job ID.
        includeCounts - is true if document counts should be included.
        maxCount - is the maximum number of documents we want to count for each status.
        Returns:
        the status object for the specified job.
        Throws:
        ManifoldCFException
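        A capped count, as suggested by the maxCount parameter, can be sketched as counting rows until the cap is reached and remembering whether the result is exact. This is a hypothetical in-memory illustration: the real method presumably counts in the database, and the class, the Result type, and the "exact" flag are assumptions made for this example.

```java
import java.util.Iterator;

/** Hypothetical illustration of counting at most maxCount items. */
final class CappedCountSketch {
  /** A count together with a flag saying whether the cap cut it short. */
  static final class Result {
    final long count;
    final boolean exact;

    Result(long count, boolean exact) {
      this.count = count;
      this.exact = exact;
    }
  }

  /** Count rows, stopping once maxCount rows have been seen. */
  static Result countUpTo(Iterator<?> rows, int maxCount) {
    long n = 0;
    while (n < maxCount && rows.hasNext()) {
      rows.next();
      n++;
    }
    // The count is exact only if no rows remain after we stopped.
    return new Result(n, !rows.hasNext());
  }
}
```

        Capping the count bounds the cost of a status query on very large jobs, at the price of reporting "at least N" instead of a precise figure.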
      • getAllStatus

        public JobStatus[] getAllStatus​(boolean includeCounts,
                                        int maxCount)
                                 throws ManifoldCFException
        Get a list of all jobs, and their status information.
        Specified by:
        getAllStatus in interface IJobManager
        Parameters:
        includeCounts - is true if document counts should be included.
        maxCount - is the maximum number of documents we want to count for each status.
        Returns:
        an ordered array of job status objects.
        Throws:
        ManifoldCFException
      • getRunningJobs

        public JobStatus[] getRunningJobs​(boolean includeCounts,
                                          int maxCount)
                                   throws ManifoldCFException
        Get a list of running jobs. This is for status reporting.
        Specified by:
        getRunningJobs in interface IJobManager
        Parameters:
        includeCounts - is true if document counts should be included.
        maxCount - is the maximum number of documents we want to count for each status.
        Returns:
        an array of the job status objects.
        Throws:
        ManifoldCFException
      • getFinishedJobs

        public JobStatus[] getFinishedJobs​(boolean includeCounts,
                                           int maxCount)
                                    throws ManifoldCFException
        Get a list of completed jobs, and their statistics.
        Specified by:
        getFinishedJobs in interface IJobManager
        Parameters:
        includeCounts - is true if document counts should be included.
        maxCount - is the maximum number of documents we want to count for each status.
        Returns:
        an array of the job status objects.
        Throws:
        ManifoldCFException
      • makeJobStatus

        protected JobStatus[] makeJobStatus​(java.lang.String whereClause,
                                            java.util.ArrayList whereParams,
                                            boolean includeCounts,
                                            int maxCount)
                                     throws ManifoldCFException
        Make a job status array from a query result.
        Parameters:
        whereClause - is the where clause for the jobs we are interested in.
        Returns:
        the status array.
        Throws:
        ManifoldCFException
      • buildCountsUsingIndividualQueries

        protected void buildCountsUsingIndividualQueries​(java.lang.String whereClause,
                                                         java.util.ArrayList whereParams,
                                                         int maxCount,
                                                         java.util.Map<java.lang.Long,​java.lang.Long> set2Hash,
                                                         java.util.Map<java.lang.Long,​java.lang.Long> set3Hash,
                                                         java.util.Map<java.lang.Long,​java.lang.Long> set4Hash,
                                                         java.util.Map<java.lang.Long,​java.lang.Boolean> set2Exact,
                                                         java.util.Map<java.lang.Long,​java.lang.Boolean> set3Exact,
                                                         java.util.Map<java.lang.Long,​java.lang.Boolean> set4Exact)
                                                  throws ManifoldCFException
        Throws:
        ManifoldCFException
      • buildCountsUsingGroupBy

        protected void buildCountsUsingGroupBy​(java.lang.String whereClause,
                                               java.util.ArrayList whereParams,
                                               java.util.Map<java.lang.Long,​java.lang.Long> set2Hash,
                                               java.util.Map<java.lang.Long,​java.lang.Long> set3Hash,
                                               java.util.Map<java.lang.Long,​java.lang.Long> set4Hash,
                                               java.util.Map<java.lang.Long,​java.lang.Boolean> set2Exact,
                                               java.util.Map<java.lang.Long,​java.lang.Boolean> set3Exact,
                                               java.util.Map<java.lang.Long,​java.lang.Boolean> set4Exact)
                                        throws ManifoldCFException
        Throws:
        ManifoldCFException
      • addWhereClause

        protected void addWhereClause​(java.lang.StringBuilder sb,
                                      java.util.ArrayList list,
                                      java.lang.String whereClause,
                                      java.util.ArrayList whereParams,
                                      boolean wherePresent)
      • genDocumentStatus

        public IResultSet genDocumentStatus​(java.lang.String connectionName,
                                            StatusFilterCriteria filterCriteria,
                                            SortOrder sortOrder,
                                            int startRow,
                                            int rowCount)
                                     throws ManifoldCFException
        Run a 'document status' report.
        Specified by:
        genDocumentStatus in interface IJobManager
        Parameters:
        connectionName - is the name of the connection.
        filterCriteria - are the criteria used to limit the records considered for the report.
        sortOrder - is the specified sort order of the final report.
        startRow - is the first row to include.
        rowCount - is the number of rows to include.
        Returns:
        the results, with the following columns: identifier, job, state, status, scheduled, action, retrycount, retrylimit. The "scheduled" column and the "retrylimit" column are long values representing a time; all other values will be user-friendly strings.
        Throws:
        ManifoldCFException
      • genQueueStatus

        public IResultSet genQueueStatus​(java.lang.String connectionName,
                                         StatusFilterCriteria filterCriteria,
                                         SortOrder sortOrder,
                                         BucketDescription idBucketDescription,
                                         int startRow,
                                         int rowCount)
                                  throws ManifoldCFException
        Run a 'queue status' report.
        Specified by:
        genQueueStatus in interface IJobManager
        Parameters:
        connectionName - is the name of the connection.
        filterCriteria - are the criteria used to limit the records considered for the report.
        sortOrder - is the specified sort order of the final report.
        idBucketDescription - is the bucket description for generating the identifier class.
        startRow - is the first row to include.
        rowCount - is the number of rows to include.
        Returns:
        the results, with the following columns: idbucket, inactive, processing, expiring, deleting, processready, expireready, processwaiting, expirewaiting
        Throws:
        ManifoldCFException
      • addBucketExtract

        protected void addBucketExtract​(java.lang.StringBuilder sb,
                                        java.util.ArrayList list,
                                        java.lang.String columnPrefix,
                                        java.lang.String columnName,
                                        BucketDescription bucketDesc)
        Turn a bucket description into a return column. This is complicated by the fact that the extraction code is inherently case sensitive. If case-insensitive matching is desired, the whole value is converted to lower case before the match is done.
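        The lower-casing approach described above can be shown with an in-memory analogue. This is only a sketch: the real method emits a query column rather than matching in Java, and this example assumes the bucket description boils down to a regular expression with one capturing group, written in lower case. The class and method names are hypothetical.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical in-memory analogue of case-insensitive bucket extraction. */
final class BucketSketch {
  /**
   * Extract the first capturing group of regex from identifier.
   * For case-insensitive matching, the identifier is lower-cased first and
   * the regex is assumed to be written in lower case already.
   */
  static String bucketOf(String identifier, String regex, boolean caseInsensitive) {
    String input = caseInsensitive ? identifier.toLowerCase(Locale.ROOT) : identifier;
    Matcher m = Pattern.compile(regex).matcher(input);
    if (m.find() && m.groupCount() >= 1) {
      return m.group(1);
    }
    return ""; // no match: empty bucket
  }
}
```

        For example, with the pattern ^https?://([^/]+)/ an identifier beginning with HTTP://Example.COM/ only matches when case-insensitive handling is requested, and then lands in the bucket example.com.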
      • addCriteria

        protected boolean addCriteria​(java.lang.StringBuilder sb,
                                      java.util.ArrayList list,
                                      java.lang.String fieldPrefix,
                                      java.lang.String connectionName,
                                      StatusFilterCriteria criteria,
                                      boolean whereEmitted)
                               throws ManifoldCFException
        Add criteria clauses to query.
        Throws:
        ManifoldCFException
      • emitClauseStart

        protected boolean emitClauseStart​(java.lang.StringBuilder sb,
                                          boolean whereEmitted)
        Emit a WHERE or an AND, depending on whether a WHERE has already been emitted.
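        A minimal sketch of this pattern follows. It assumes the returned boolean is the updated "where emitted" flag, which the signature suggests but this page does not state; the class name, method name, and the query text are hypothetical.

```java
/** Hypothetical illustration of emitting WHERE for the first clause, AND afterwards. */
final class ClauseStartSketch {
  /** Append " WHERE " or " AND " and return the new value of the flag. */
  static boolean startClause(StringBuilder sb, boolean whereEmitted) {
    sb.append(whereEmitted ? " AND " : " WHERE ");
    return true; // after this call a WHERE is always present
  }

  /** Build a query with two conditions to show the flag being threaded through. */
  static String example() {
    StringBuilder sb = new StringBuilder("SELECT id FROM t");
    boolean whereEmitted = false;
    whereEmitted = startClause(sb, whereEmitted);
    sb.append("status=?");
    whereEmitted = startClause(sb, whereEmitted);
    sb.append("jobid=?");
    return sb.toString();
  }
}
```

        Threading the flag through each call lets optional criteria be appended in any combination without special-casing the first one.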
      • addOrdering

        protected void addOrdering​(java.lang.StringBuilder sb,
                                   java.lang.String[] completeFieldList,
                                   SortOrder sort)
        Add ordering.
      • addLimits

        protected void addLimits​(java.lang.StringBuilder sb,
                                 int startRow,
                                 int maxRowCount)
        Add limit and offset.
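        As a rough illustration of appending a limit and offset to a query under construction, the sketch below uses PostgreSQL-style LIMIT ... OFFSET syntax. That syntax is an assumption: the clause differs between databases, and this page does not show what the real method emits. The class and method names are hypothetical.

```java
/** Hypothetical illustration of appending paging clauses to a query. */
final class LimitSketch {
  /** Append a PostgreSQL-style paging clause for the requested window of rows. */
  static void appendLimits(StringBuilder sb, int startRow, int maxRowCount) {
    sb.append(" LIMIT ").append(maxRowCount)
      .append(" OFFSET ").append(startRow);
  }
}
```

        Paging is only stable when the query also has a deterministic ordering, which is why the ordering clause is added before the limits.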