Class HopCount.DocumentHash
- java.lang.Object
  - org.apache.manifoldcf.crawler.jobs.HopCount.DocumentHash
- Enclosing class:
- HopCount
protected class HopCount.DocumentHash extends java.lang.Object
The document hash structure contains the document nodes we are interested in, including those whose answers are needed before processing can proceed. The main interface consists of submitting a set of questions and receiving the answers. The structure permits multiple requests to be made against the same object, and uses in-memory caching to reduce database activity as much as possible. These requests are presumed to take place inside appropriate transactions, since both read and write database activity may occur.
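The caching idea described above can be sketched as follows. This is a minimal illustration, not the ManifoldCF implementation: the class `QuestionCacheSketch`, its string question keys, and the `loader` function standing in for a database lookup are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical illustration of question caching; not the real DocumentHash. */
class QuestionCacheSketch {
    // Maps a question key (hypothetical format) to its cached answer.
    private final Map<String, Integer> cache = new HashMap<>();
    // Stand-in for a database lookup (assumption for this sketch).
    private final Function<String, Integer> loader;
    // Counts how many times the loader was actually consulted.
    private int loadCount = 0;

    QuestionCacheSketch(Function<String, Integer> loader) {
        this.loader = loader;
    }

    /** Answers each question, calling the loader only for keys not yet cached. */
    int[] ask(String[] questionKeys) {
        int[] answers = new int[questionKeys.length];
        for (int i = 0; i < questionKeys.length; i++) {
            String key = questionKeys[i];
            Integer cached = cache.get(key);
            if (cached == null) {
                cached = loader.apply(key);
                loadCount++;
                cache.put(key, cached);
            }
            answers[i] = cached;
        }
        return answers;
    }

    int getLoadCount() {
        return loadCount;
    }
}
```

Repeated questions against the same object are served from the map, so the simulated database is consulted once per distinct key.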
-
-
Field Summary
Fields (Modifier and Type, Field, Description)
protected HopCount.NodeQueue childFetchQueue
This is the queue for nodes that need to be initialized and need their children fetched.
protected HopCount.NodeQueue evaluationQueue
This is the queue for evaluating nodes.
protected int hopcountMethod
The hopcount method.
protected java.lang.Long jobID
The job identifier.
protected java.lang.String[] legalLinkTypes
These are the legal link types for the job.
protected java.util.Map questionLookupMap
This is the map of known questions to DocumentNode objects.
-
Constructor Summary
Constructors (Constructor, Description)
DocumentHash(java.lang.Long jobID, java.lang.String[] legalLinkTypes, int hopcountMethod)
Constructor.
-
Method Summary
Methods (Modifier and Type, Method, Description)
int[] askQuestions(HopCount.Question[] questions)
Throw in some questions, and prepare for the answers.
protected void evaluateNode(HopCount.DocumentNode node)
Evaluate a node from the evaluation queue.
protected void findChildren(java.util.Map referenceMap, java.lang.Long jobID, java.util.ArrayList list)
Get the children of a bunch of nodes.
protected void getNodeChildren(HopCount.DocumentNode[] nodes)
Fetch the children of a bunch of nodes, and initialize all of the nodes appropriately.
protected void makeNodeComplete(HopCount.DocumentNode node)
Make a node complete.
protected int maxClauseFindChildren(java.lang.Long jobID)
Get the max clauses.
protected void notifyParents(HopCount.DocumentNode node)
Notify parents of a node's change of state.
protected void queueParents(HopCount.DocumentNode node)
Queue the parents on the evaluation queue.
protected HopCount.DocumentNode[] queueQuestions(HopCount.Question[] questions)
Queue up a set of questions.
protected void removeChildLinks(HopCount.DocumentNode dn)
Remove remaining links to children.
-
-
-
Field Detail
-
jobID
protected java.lang.Long jobID
The job identifier.
-
questionLookupMap
protected java.util.Map questionLookupMap
This is the map of known questions to DocumentNode objects.
-
childFetchQueue
protected HopCount.NodeQueue childFetchQueue
This is the queue for nodes that need to be initialized and need their children fetched.
-
evaluationQueue
protected HopCount.NodeQueue evaluationQueue
This is the queue for evaluating nodes. For all of these nodes, the processing has begun: all child nodes have been queued, and at least a partial answer is present. Evaluating one of these nodes involves potentially updating the node's answer, and when that is done, all listed parents will be requeued on this queue.
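The requeue-on-change behavior described above resembles a standard worklist algorithm. The sketch below is an illustration under stated assumptions, not the actual `evaluateNode` logic: it assumes a node's answer is a minimum distance derived from its children's answers plus one, and the names `EvaluationQueueSketch`, `Node`, `link`, `enqueue`, and `drain` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical worklist sketch; not the real HopCount evaluation code. */
class EvaluationQueueSketch {
    static final int UNKNOWN = Integer.MAX_VALUE;

    static class Node {
        final String name;
        int answer = UNKNOWN;            // best (partial) answer known so far
        boolean queued = false;          // avoids duplicate queue entries
        final List<Node> parents = new ArrayList<>();
        final List<Node> children = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }
    }

    /** Records that parent's answer depends on child's answer. */
    static void link(Node parent, Node child) {
        parent.children.add(child);
        child.parents.add(parent);
    }

    /** Adds a node to the queue unless it is already waiting there. */
    static void enqueue(Deque<Node> queue, Node node) {
        if (!node.queued) {
            node.queued = true;
            queue.add(node);
        }
    }

    /** Evaluates queued nodes; a node whose answer improves requeues its parents. */
    static void drain(Deque<Node> queue) {
        while (!queue.isEmpty()) {
            Node node = queue.poll();
            node.queued = false;
            int best = node.answer;
            for (Node child : node.children) {
                if (child.answer != UNKNOWN && child.answer + 1 < best) {
                    best = child.answer + 1;
                }
            }
            if (best < node.answer) {
                node.answer = best;
                for (Node parent : node.parents) {
                    enqueue(queue, parent);
                }
            }
        }
    }
}
```

Because parents are requeued only when an answer actually changes, the loop ends once every answer is stable.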
-
legalLinkTypes
protected java.lang.String[] legalLinkTypes
These are the legal link types for the job.
-
hopcountMethod
protected int hopcountMethod
The hopcount method.
-
-
Method Detail
-
askQuestions
public int[] askQuestions(HopCount.Question[] questions) throws ManifoldCFException
Throw in some questions, and prepare for the answers.
- Throws:
ManifoldCFException
-
evaluateNode
protected void evaluateNode(HopCount.DocumentNode node) throws ManifoldCFException
Evaluate a node from the evaluation queue.
- Throws:
ManifoldCFException
-
getNodeChildren
protected void getNodeChildren(HopCount.DocumentNode[] nodes) throws ManifoldCFException
Fetch the children of a bunch of nodes, and initialize all of the nodes appropriately.
- Throws:
ManifoldCFException
-
maxClauseFindChildren
protected int maxClauseFindChildren(java.lang.Long jobID)
Get the max clauses.
-
findChildren
protected void findChildren(java.util.Map referenceMap, java.lang.Long jobID, java.util.ArrayList list) throws ManifoldCFException
Get the children of a bunch of nodes.
- Throws:
ManifoldCFException
-
queueParents
protected void queueParents(HopCount.DocumentNode node)
Queue the parents on the evaluation queue.
-
makeNodeComplete
protected void makeNodeComplete(HopCount.DocumentNode node) throws ManifoldCFException
Make a node complete. This involves writing the node's data to the database, if appropriate.
- Throws:
ManifoldCFException
-
queueQuestions
protected HopCount.DocumentNode[] queueQuestions(HopCount.Question[] questions) throws ManifoldCFException
Queue up a set of questions. If a question is already completed, nothing is done and its node is returned. If the question is already queued, the node may be modified if the new question is more specific than what was already there. In any case, if the answer isn't ready, null is returned.
- Parameters:
questions - the set of questions.
- Throws:
ManifoldCFException
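The return contract of queueQuestions described above can be sketched as follows. This is a hedged illustration only: the class `QueueQuestionsSketch`, the string keys, and the integer `specificity` rank (higher meaning more specific) are hypothetical stand-ins, since the real HopCount.Question and HopCount.DocumentNode types are not documented here.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of the queueQuestions contract; not the real code. */
class QueueQuestionsSketch {
    static class Node {
        final String key;
        int specificity;      // hypothetical rank; higher means more specific
        boolean complete;     // true once the answer is ready

        Node(String key, int specificity) {
            this.key = key;
            this.specificity = specificity;
        }
    }

    final Map<String, Node> lookup = new HashMap<>();
    final Deque<Node> childFetchQueue = new ArrayDeque<>();

    /** Returns, per question, the node if its answer is ready, otherwise null. */
    Node[] queueQuestions(String[] keys, int[] specificity) {
        Node[] result = new Node[keys.length];
        for (int i = 0; i < keys.length; i++) {
            Node node = lookup.get(keys[i]);
            if (node == null) {
                // Unknown question: create its node and queue it for initialization.
                node = new Node(keys[i], specificity[i]);
                lookup.put(keys[i], node);
                childFetchQueue.add(node);
            } else if (!node.complete && specificity[i] > node.specificity) {
                // Already queued: tighten the node if the new question is more specific.
                node.specificity = specificity[i];
            }
            // Completed nodes are returned untouched; unfinished ones yield null.
            result[i] = node.complete ? node : null;
        }
        return result;
    }
}
```

A caller would treat a null entry as "not ready yet" and ask again after the queues have been processed.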
-
notifyParents
protected void notifyParents(HopCount.DocumentNode node)
Notify parents of a node's change of state.
-
removeChildLinks
protected void removeChildLinks(HopCount.DocumentNode dn)
Remove remaining links to children.
-
-