Class SharedDriveConnector
- java.lang.Object
  - org.apache.manifoldcf.core.connector.BaseConnector
    - org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector
      - org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector
- All Implemented Interfaces:
org.apache.manifoldcf.core.interfaces.IConnector, org.apache.manifoldcf.crawler.interfaces.IRepositoryConnector
public class SharedDriveConnector extends org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector
This is the "repository connector" for an SMB/CIFS shared drive file system. It is a relative of the share crawler and should have comparable basic functionality.
-
-
Nested Class Summary
Modifier and Type | Class | Description
protected class
SharedDriveConnector.ProcessDocumentsFilter
This is the filter class that actually receives the files in batches.
-
Field Summary
Modifier and Type | Field | Description
static java.lang.String
_rcsid
static java.lang.String
ACTIVITY_ACCESS
static java.lang.String
ATTRIBUTE_FILESPEC
static java.lang.String
ATTRIBUTE_INDEXABLE
static java.lang.String
ATTRIBUTE_MATCH
static java.lang.String
ATTRIBUTE_PATH
static java.lang.String
ATTRIBUTE_REPLACE
static java.lang.String
ATTRIBUTE_TOKEN
static java.lang.String
ATTRIBUTE_TYPE
static java.lang.String
ATTRIBUTE_VALUE
static java.lang.String
NODE_ACCESS
static java.lang.String
NODE_EXCLUDE
static java.lang.String
NODE_FILEMAP
static java.lang.String
NODE_INCLUDE
static java.lang.String
NODE_MAXLENGTH
static java.lang.String
NODE_PARENTFOLDERACCESS
static java.lang.String
NODE_PARENTFOLDERSECURITY
static java.lang.String
NODE_PATHMAP
static java.lang.String
NODE_PATHNAMEATTRIBUTE
static java.lang.String
NODE_SECURITY
static java.lang.String
NODE_SHAREACCESS
static java.lang.String
NODE_SHARESECURITY
static java.lang.String
NODE_STARTPOINT
static java.lang.String
NODE_URIMAP
static java.lang.String
PROPERTY_JCIFS_USE_NTLM_V1
static java.lang.String
VALUE_DIRECTORY
static java.lang.String
VALUE_FILE
-
Fields inherited from class org.apache.manifoldcf.core.connector.BaseConnector
currentContext, params
-
Fields inherited from interface org.apache.manifoldcf.crawler.interfaces.IRepositoryConnector
GLOBAL_DENY_TOKEN, JOBMODE_CONTINUOUS, JOBMODE_ONCEONLY, MODEL_ADD, MODEL_ADD_CHANGE, MODEL_ADD_CHANGE_DELETE, MODEL_ALL, MODEL_CHAINED_ADD, MODEL_CHAINED_ADD_CHANGE, MODEL_CHAINED_ADD_CHANGE_DELETE, MODEL_PARTIAL
-
-
Constructor Summary
Constructor | Description
SharedDriveConnector()
Constructor.
-
Method Summary
Modifier and Type | Method | Description
protected static void
addSecuritySet(java.lang.StringBuilder description, boolean enabled, java.lang.String[] allowTokens, java.lang.String[] denyTokens)
java.lang.String
addSeedDocuments(org.apache.manifoldcf.crawler.interfaces.ISeedingActivity activities, org.apache.manifoldcf.core.interfaces.Specification spec, java.lang.String lastSeedVersion, long seedTime, int jobMode)
Queue "seed" documents.
java.lang.String
check()
Check status of connection.
protected boolean
checkInclude(boolean isDirectory, java.lang.String fileName, org.apache.manifoldcf.core.interfaces.Specification documentSpecification)
Check if a file or directory should be included, given a document specification.
protected static boolean
checkIncludeFile(long fileLength, java.lang.String fileName, org.apache.manifoldcf.core.interfaces.Specification documentSpecification, org.apache.manifoldcf.crawler.interfaces.IFingerprintActivity activities)
Check if a file's stats are OK for inclusion.
protected boolean
checkIngest(java.io.File localFile, java.lang.String fileName, org.apache.manifoldcf.core.interfaces.Specification documentSpecification, org.apache.manifoldcf.crawler.interfaces.IFingerprintActivity activities)
Check if a file should be ingested, given a document specification and a local copy of the file.
protected static boolean
checkMatch(java.lang.String sourceMatch, int sourceIndex, java.lang.String match)
Check a match between two strings with wildcards.
protected boolean
checkNeedFileData(java.lang.String fileName, org.apache.manifoldcf.core.interfaces.Specification documentSpecification)
Check to see whether we need the contents of the file for anything.
void
connect(org.apache.manifoldcf.core.interfaces.ConfigParams configParameters)
Connect.
protected void
convertACEs(java.util.List<java.lang.String> allowList, java.util.List<java.lang.String> denyList, jcifs.ACE[] aces)
protected static java.lang.String
convertToURI(java.lang.String documentIdentifier, MatchMap fileMap, MatchMap uriMap)
Convert a document identifier to a URI.
void
disconnect()
Close the connection.
protected static boolean
equivalentIOExceptions(java.io.IOException e1, java.io.IOException e2)
Check if two IOExceptions are equivalent.
protected static boolean
equivalentSmbExceptions(jcifs.smb.SmbException e1, jcifs.smb.SmbException e2)
Check if two SmbExceptions are equivalent.
protected static boolean
fileExists(jcifs.smb.SmbFile file)
Check for file/directory existence.
protected static boolean
fileIsDirectory(jcifs.smb.SmbFile file)
Check if file is a directory.
protected static long
fileLastModified(jcifs.smb.SmbFile file)
Get last modified date for file.
protected static long
fileLength(jcifs.smb.SmbFile file)
Get file length.
protected static jcifs.smb.SmbFile[]
fileListFiles(jcifs.smb.SmbFile file, jcifs.smb.SmbFileFilter filter)
List files.
java.lang.String[]
getActivitiesList()
Return the list of activities that this connector supports (i.e. writes into the log).
java.lang.String[]
getBinNames(java.lang.String documentIdentifier)
Get the bin name string for a document identifier.
java.lang.String[]
getChildFolderNames(java.lang.String folder)
Given an SMB URI, return all child directories.
protected static java.lang.String
getFileCanonicalPath(jcifs.smb.SmbFile file)
Get canonical path.
protected static java.io.InputStream
getFileInputStream(jcifs.smb.SmbFile file)
Get input stream for file.
protected static jcifs.ACE[]
getFileSecurity(jcifs.smb.SmbFile file, boolean useSIDs)
Get file security.
protected boolean
getFileSecuritySet(java.util.List<java.lang.String> allowList, java.util.List<java.lang.String> denyList, jcifs.smb.SmbFile file, java.lang.String[] forced)
protected static jcifs.ACE[]
getFileShareSecurity(jcifs.smb.SmbFile file, boolean useSIDs)
Get share security.
protected boolean
getFileShareSecuritySet(java.util.List<java.lang.String> allowList, java.util.List<java.lang.String> denyList, jcifs.smb.SmbFile file, java.lang.String[] forced)
protected static int
getFileType(jcifs.smb.SmbFile file)
Get file type.
protected static java.lang.String[]
getForcedAcls(org.apache.manifoldcf.core.interfaces.Specification spec)
Grab forced ACLs out of document specification.
protected static java.lang.String[]
getForcedParentFolderAcls(org.apache.manifoldcf.core.interfaces.Specification spec)
Grab forced parent folder ACLs out of document specification.
protected static java.lang.String[]
getForcedShareAcls(org.apache.manifoldcf.core.interfaces.Specification spec)
Grab forced share ACLs out of document specification.
protected void
getSession()
Establish a "session".
jcifs.smb.SmbFile[]
getShareNames(java.lang.String serverURI)
Given a server URI, return all shares.
protected static void
handleIOException(java.lang.String documentIdentifier, java.io.IOException e)
protected static java.lang.String
mapExtensionToMimeType(java.lang.String fileName)
Map an extension to a MIME type.
protected java.lang.String
mapToIdentifier(java.lang.String path)
Map a "path" specification to a full identifier.
protected static int
matchSubPath(java.lang.String subPath, java.lang.String fullPath)
Match a sub-path.
void
outputConfigurationBody(org.apache.manifoldcf.core.interfaces.IThreadContext threadContext, org.apache.manifoldcf.core.interfaces.IHTTPOutput out, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.ConfigParams parameters, java.lang.String tabName)
Output the configuration body section.
void
outputConfigurationHeader(org.apache.manifoldcf.core.interfaces.IThreadContext threadContext, org.apache.manifoldcf.core.interfaces.IHTTPOutput out, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.ConfigParams parameters, java.util.List<java.lang.String> tabsArray)
Output the configuration header section.
void
outputSpecificationBody(org.apache.manifoldcf.core.interfaces.IHTTPOutput out, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.Specification ds, int connectionSequenceNumber, int actualSequenceNumber, java.lang.String tabName)
Output the specification body section.
void
outputSpecificationHeader(org.apache.manifoldcf.core.interfaces.IHTTPOutput out, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.Specification ds, int connectionSequenceNumber, java.util.List<java.lang.String> tabsArray)
Output the specification header section.
protected static boolean
processCheck(boolean caseSensitive, java.lang.String sourceMatch, int sourceIndex, java.lang.String match, int matchIndex)
Recursive worker method for checkMatch.
java.lang.String
processConfigurationPost(org.apache.manifoldcf.core.interfaces.IThreadContext threadContext, org.apache.manifoldcf.core.interfaces.IPostParameters variableContext, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.ConfigParams parameters)
Process a configuration post.
void
processDocuments(java.lang.String[] documentIdentifiers, org.apache.manifoldcf.crawler.interfaces.IExistingVersions statuses, org.apache.manifoldcf.core.interfaces.Specification spec, org.apache.manifoldcf.crawler.interfaces.IProcessActivity activities, int jobMode, boolean usesDefaultAuthority)
Process a set of documents.
protected static void
processSMBException(jcifs.smb.SmbException se, java.lang.String documentIdentifier, java.lang.String activity, java.lang.String operation)
java.lang.String
processSpecificationPost(org.apache.manifoldcf.core.interfaces.IPostParameters variableContext, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.Specification ds, int connectionSequenceNumber)
Process a specification post.
boolean
requestInfo(org.apache.manifoldcf.core.interfaces.Configuration output, java.lang.String command)
Request arbitrary connector information.
protected static void
setDocumentSecurity(org.apache.manifoldcf.agents.interfaces.RepositoryDocument rd, java.lang.String[] shareAllow, java.lang.String[] shareDeny, java.lang.String[] parentAllow, java.lang.String[] parentDeny, java.lang.String[] allow, java.lang.String[] deny)
protected static void
setPathMetadata(org.apache.manifoldcf.agents.interfaces.RepositoryDocument rd, java.lang.String pathAttributeName, java.lang.String pathAttributeValue)
void
setThreadContext(org.apache.manifoldcf.core.interfaces.IThreadContext threadContext)
Set thread context.
java.lang.String
validateFolderName(java.lang.String folder)
Given a folder path, determine if the folder is in fact legal and accessible (and is a folder).
void
viewConfiguration(org.apache.manifoldcf.core.interfaces.IThreadContext threadContext, org.apache.manifoldcf.core.interfaces.IHTTPOutput out, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.ConfigParams parameters)
View configuration.
void
viewSpecification(org.apache.manifoldcf.core.interfaces.IHTTPOutput out, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.Specification ds, int connectionSequenceNumber)
View specification.
protected boolean
wouldFileBeIncluded(java.lang.String fileName, org.apache.manifoldcf.core.interfaces.Specification documentSpecification, boolean pretendIndexable)
Pretend that a file is either indexable or not, and return whether or not it would be ingested.
-
Methods inherited from class org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector
getConnectorModel, getFormCheckJavascriptMethodName, getFormPresaveCheckJavascriptMethodName, getMaxDocumentRequest, getRelationshipTypes
-
Methods inherited from class org.apache.manifoldcf.core.connector.BaseConnector
clearThreadContext, deinstall, getConfiguration, install, isConnected, outputConfigurationBody, outputConfigurationHeader, outputConfigurationHeader, pack, packFixedList, packList, packList, poll, processConfigurationPost, unpack, unpackFixedList, unpackList, viewConfiguration
-
-
-
-
Field Detail
-
_rcsid
public static final java.lang.String _rcsid
- See Also:
- Constant Field Values
-
ACTIVITY_ACCESS
public static final java.lang.String ACTIVITY_ACCESS
- See Also:
- Constant Field Values
-
NODE_STARTPOINT
public static final java.lang.String NODE_STARTPOINT
- See Also:
- Constant Field Values
-
NODE_INCLUDE
public static final java.lang.String NODE_INCLUDE
- See Also:
- Constant Field Values
-
NODE_EXCLUDE
public static final java.lang.String NODE_EXCLUDE
- See Also:
- Constant Field Values
-
NODE_PATHNAMEATTRIBUTE
public static final java.lang.String NODE_PATHNAMEATTRIBUTE
- See Also:
- Constant Field Values
-
NODE_PATHMAP
public static final java.lang.String NODE_PATHMAP
- See Also:
- Constant Field Values
-
NODE_FILEMAP
public static final java.lang.String NODE_FILEMAP
- See Also:
- Constant Field Values
-
NODE_URIMAP
public static final java.lang.String NODE_URIMAP
- See Also:
- Constant Field Values
-
NODE_SHAREACCESS
public static final java.lang.String NODE_SHAREACCESS
- See Also:
- Constant Field Values
-
NODE_SHARESECURITY
public static final java.lang.String NODE_SHARESECURITY
- See Also:
- Constant Field Values
-
NODE_PARENTFOLDERACCESS
public static final java.lang.String NODE_PARENTFOLDERACCESS
- See Also:
- Constant Field Values
-
NODE_PARENTFOLDERSECURITY
public static final java.lang.String NODE_PARENTFOLDERSECURITY
- See Also:
- Constant Field Values
-
NODE_MAXLENGTH
public static final java.lang.String NODE_MAXLENGTH
- See Also:
- Constant Field Values
-
NODE_ACCESS
public static final java.lang.String NODE_ACCESS
- See Also:
- Constant Field Values
-
NODE_SECURITY
public static final java.lang.String NODE_SECURITY
- See Also:
- Constant Field Values
-
ATTRIBUTE_PATH
public static final java.lang.String ATTRIBUTE_PATH
- See Also:
- Constant Field Values
-
ATTRIBUTE_TYPE
public static final java.lang.String ATTRIBUTE_TYPE
- See Also:
- Constant Field Values
-
ATTRIBUTE_INDEXABLE
public static final java.lang.String ATTRIBUTE_INDEXABLE
- See Also:
- Constant Field Values
-
ATTRIBUTE_FILESPEC
public static final java.lang.String ATTRIBUTE_FILESPEC
- See Also:
- Constant Field Values
-
ATTRIBUTE_VALUE
public static final java.lang.String ATTRIBUTE_VALUE
- See Also:
- Constant Field Values
-
ATTRIBUTE_TOKEN
public static final java.lang.String ATTRIBUTE_TOKEN
- See Also:
- Constant Field Values
-
ATTRIBUTE_MATCH
public static final java.lang.String ATTRIBUTE_MATCH
- See Also:
- Constant Field Values
-
ATTRIBUTE_REPLACE
public static final java.lang.String ATTRIBUTE_REPLACE
- See Also:
- Constant Field Values
-
VALUE_DIRECTORY
public static final java.lang.String VALUE_DIRECTORY
- See Also:
- Constant Field Values
-
VALUE_FILE
public static final java.lang.String VALUE_FILE
- See Also:
- Constant Field Values
-
PROPERTY_JCIFS_USE_NTLM_V1
public static final java.lang.String PROPERTY_JCIFS_USE_NTLM_V1
- See Also:
- Constant Field Values
-
-
Method Detail
-
setThreadContext
public void setThreadContext(org.apache.manifoldcf.core.interfaces.IThreadContext threadContext) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Set thread context. Use the opportunity to set the system properties we'll need.
- Specified by:
setThreadContext in interface org.apache.manifoldcf.core.interfaces.IConnector
- Overrides:
setThreadContext in class org.apache.manifoldcf.core.connector.BaseConnector
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
getSession
protected void getSession() throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Establish a "session". In the case of the jcifs connector, this just builds the appropriate smbconnectionPath string and performs the necessary checks.
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
getActivitiesList
public java.lang.String[] getActivitiesList()
Return the list of activities that this connector supports (i.e. writes into the log).
- Specified by:
getActivitiesList in interface org.apache.manifoldcf.crawler.interfaces.IRepositoryConnector
- Overrides:
getActivitiesList in class org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector
- Returns:
- the list.
-
disconnect
public void disconnect() throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Close the connection. Call this before discarding the repository connector.
- Specified by:
disconnect in interface org.apache.manifoldcf.core.interfaces.IConnector
- Overrides:
disconnect in class org.apache.manifoldcf.core.connector.BaseConnector
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
connect
public void connect(org.apache.manifoldcf.core.interfaces.ConfigParams configParameters)
Connect.
- Specified by:
connect in interface org.apache.manifoldcf.core.interfaces.IConnector
- Overrides:
connect in class org.apache.manifoldcf.core.connector.BaseConnector
- Parameters:
configParameters - is the set of configuration parameters, which in this case describe the root directory.
-
getBinNames
public java.lang.String[] getBinNames(java.lang.String documentIdentifier)
Get the bin name string for a document identifier. The bin name describes the queue to which the document will be assigned for throttling purposes. Throttling controls the rate at which items in a given queue are fetched; it does not say anything about the overall fetch rate, which may operate on multiple queues or bins. For example, if you implement a web crawler, a good choice of bin name would be the server name, since that is likely to correspond to a real resource that will need real throttle protection.
- Specified by:
getBinNames in interface org.apache.manifoldcf.crawler.interfaces.IRepositoryConnector
- Overrides:
getBinNames in class org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector
- Parameters:
documentIdentifier - is the document identifier.
- Returns:
- the bin name.
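The following is a minimal sketch of how a server-based bin name could be derived, following the web crawler advice above. It assumes document identifiers have the form smb://server/share/path, which this page does not state; the class BinNameSketch is hypothetical and is not the connector's actual implementation.

```java
// Hypothetical illustration only; not taken from SharedDriveConnector.
final class BinNameSketch {
    /**
     * Returns a single bin name equal to the server portion of the identifier.
     * Assumes identifiers look like "smb://server/share/path".
     */
    static String[] getBinNames(String documentIdentifier) {
        // Skip past the scheme separator if one is present.
        int schemeEnd = documentIdentifier.indexOf("://");
        int start = schemeEnd < 0 ? 0 : schemeEnd + 3;
        // The server name runs up to the next '/' (or to the end of the string).
        int slash = documentIdentifier.indexOf('/', start);
        String server = slash < 0
            ? documentIdentifier.substring(start)
            : documentIdentifier.substring(start, slash);
        return new String[] { server };
    }

    public static void main(String[] args) {
        // All documents on the same server share one throttling bin.
        System.out.println(getBinNames("smb://fileserver01/share/docs/a.txt")[0]);
    }
}
```

With this choice, every document on the same file server lands in the same queue, so throttling protects the server rather than an individual share or folder.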
-
convertToURI
protected static java.lang.String convertToURI(java.lang.String documentIdentifier, MatchMap fileMap, MatchMap uriMap) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Convert a document identifier to a URI. This URI serves as the unique key in the search index and is presented to the user as part of the search results.
- Parameters:
documentIdentifier - is the document identifier.
- Returns:
- the document uri.
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
requestInfo
public boolean requestInfo(org.apache.manifoldcf.core.interfaces.Configuration output, java.lang.String command) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Request arbitrary connector information. This method is called directly from the API in order to allow API users to perform any one of several connector-specific queries.
- Specified by:
requestInfo in interface org.apache.manifoldcf.crawler.interfaces.IRepositoryConnector
- Overrides:
requestInfo in class org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector
- Parameters:
output - is the response object, to be filled in by this method.
command - is the command, which is taken directly from the API request.
- Returns:
- true if the resource is found, false if not. In either case, output may be filled in.
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
addSeedDocuments
public java.lang.String addSeedDocuments(org.apache.manifoldcf.crawler.interfaces.ISeedingActivity activities, org.apache.manifoldcf.core.interfaces.Specification spec, java.lang.String lastSeedVersion, long seedTime, int jobMode) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException, org.apache.manifoldcf.agents.interfaces.ServiceInterruption
Queue "seed" documents. Seed documents are the starting places for crawling activity. Documents are seeded when this method calls appropriate methods in the passed-in ISeedingActivity object. This method can choose to find repository changes that happen only during the specified time interval. The seeds recorded by this method will be viewed by the framework based on what the getConnectorModel() method returns. It is not a big problem if the connector chooses to create more seeds than are strictly necessary; it is merely a question of overall work required. The end time and seeding version string passed to this method may be interpreted for greatest efficiency. For continuous crawling jobs, this method will be called once, when the job starts, and at various periodic intervals as the job executes. When a job's specification is changed, the framework automatically resets the seeding version string to null. The seeding version string may also be set to null on each job run, depending on the connector model returned by getConnectorModel(). Note that it is always OK to send more documents rather than fewer to this method. The connector will be connected before this method can be called.
- Specified by:
addSeedDocuments in interface org.apache.manifoldcf.crawler.interfaces.IRepositoryConnector
- Overrides:
addSeedDocuments in class org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector
- Parameters:
activities - is the interface this method should use to perform whatever framework actions are desired.
spec - is a document specification (that comes from the job).
seedTime - is the end of the time range of documents to consider, exclusive.
lastSeedVersion - is the last seeding version string for this job, or null if the job has no previous seeding version string.
jobMode - is an integer describing how the job is being run, whether continuous or once-only.
- Returns:
- an updated seeding version string, to be stored with the job.
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
org.apache.manifoldcf.agents.interfaces.ServiceInterruption
-
processDocuments
public void processDocuments(java.lang.String[] documentIdentifiers, org.apache.manifoldcf.crawler.interfaces.IExistingVersions statuses, org.apache.manifoldcf.core.interfaces.Specification spec, org.apache.manifoldcf.crawler.interfaces.IProcessActivity activities, int jobMode, boolean usesDefaultAuthority) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException, org.apache.manifoldcf.agents.interfaces.ServiceInterruption
Process a set of documents. This is the method that should cause each document to be fetched and processed, with the results added to the queue of documents for the current job, entered into the incremental ingestion manager, or both. The document specification allows this class to filter what is done based on the job. The connector will be connected before this method can be called.
- Specified by:
processDocuments in interface org.apache.manifoldcf.crawler.interfaces.IRepositoryConnector
- Overrides:
processDocuments in class org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector
- Parameters:
documentIdentifiers - is the set of document identifiers to process.
statuses - are the currently-stored document versions for each document in the set of document identifiers passed in above.
activities - is the interface this method should use to queue up new document references and ingest documents.
jobMode - is an integer describing how the job is being run, whether continuous or once-only.
usesDefaultAuthority - will be true only if the authority in use for these documents is the default one.
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
org.apache.manifoldcf.agents.interfaces.ServiceInterruption
-
handleIOException
protected static void handleIOException(java.lang.String documentIdentifier, java.io.IOException e) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException, org.apache.manifoldcf.agents.interfaces.ServiceInterruption
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
org.apache.manifoldcf.agents.interfaces.ServiceInterruption
-
mapExtensionToMimeType
protected static java.lang.String mapExtensionToMimeType(java.lang.String fileName)
Map an extension to a mime type
-
addSecuritySet
protected static void addSecuritySet(java.lang.StringBuilder description, boolean enabled, java.lang.String[] allowTokens, java.lang.String[] denyTokens)
-
getFileSecuritySet
protected boolean getFileSecuritySet(java.util.List<java.lang.String> allowList, java.util.List<java.lang.String> denyList, jcifs.smb.SmbFile file, java.lang.String[] forced) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException, java.io.IOException
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
java.io.IOException
-
getFileShareSecuritySet
protected boolean getFileShareSecuritySet(java.util.List<java.lang.String> allowList, java.util.List<java.lang.String> denyList, jcifs.smb.SmbFile file, java.lang.String[] forced) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException, java.io.IOException
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
java.io.IOException
-
convertACEs
protected void convertACEs(java.util.List<java.lang.String> allowList, java.util.List<java.lang.String> denyList, jcifs.ACE[] aces)
-
processSMBException
protected static void processSMBException(jcifs.smb.SmbException se, java.lang.String documentIdentifier, java.lang.String activity, java.lang.String operation) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException, org.apache.manifoldcf.agents.interfaces.ServiceInterruption
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
org.apache.manifoldcf.agents.interfaces.ServiceInterruption
-
setDocumentSecurity
protected static void setDocumentSecurity(org.apache.manifoldcf.agents.interfaces.RepositoryDocument rd, java.lang.String[] shareAllow, java.lang.String[] shareDeny, java.lang.String[] parentAllow, java.lang.String[] parentDeny, java.lang.String[] allow, java.lang.String[] deny)
-
setPathMetadata
protected static void setPathMetadata(org.apache.manifoldcf.agents.interfaces.RepositoryDocument rd, java.lang.String pathAttributeName, java.lang.String pathAttributeValue) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
check
public java.lang.String check() throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Check status of connection.
- Specified by:
check in interface org.apache.manifoldcf.core.interfaces.IConnector
- Overrides:
check in class org.apache.manifoldcf.core.connector.BaseConnector
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
checkIncludeFile
protected static boolean checkIncludeFile(long fileLength, java.lang.String fileName, org.apache.manifoldcf.core.interfaces.Specification documentSpecification, org.apache.manifoldcf.crawler.interfaces.IFingerprintActivity activities) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException, org.apache.manifoldcf.agents.interfaces.ServiceInterruption
Check if a file's stats are OK for inclusion.
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
org.apache.manifoldcf.agents.interfaces.ServiceInterruption
-
checkInclude
protected boolean checkInclude(boolean isDirectory, java.lang.String fileName, org.apache.manifoldcf.core.interfaces.Specification documentSpecification) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Check if a file or directory should be included, given a document specification.
- Parameters:
isDirectory - is true if the file is a directory.
fileName - is the canonical file name.
documentSpecification - is the specification.
- Returns:
- true if it should be included.
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
wouldFileBeIncluded
protected boolean wouldFileBeIncluded(java.lang.String fileName, org.apache.manifoldcf.core.interfaces.Specification documentSpecification, boolean pretendIndexable) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Pretend that a file is either indexable or not, and return whether or not it would be ingested. This is only ever called for files.
- Parameters:
fileName - is the canonical file name.
documentSpecification - is the specification.
pretendIndexable - should be set to true if the document's contents would be fingerprinted as "indexable", or false otherwise.
- Returns:
- true if the file would be ingested given the parameters.
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
checkNeedFileData
protected boolean checkNeedFileData(java.lang.String fileName, org.apache.manifoldcf.core.interfaces.Specification documentSpecification) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Check to see whether we need the contents of the file for anything. We do this by evaluating inclusion twice, once assuming the file is indexable and once assuming it is not, and checking whether the outcome is the same in both cases.
- Parameters:
fileName - is the name of the file.
documentSpecification - is the document specification.
- Returns:
- true if the file needs to be fingerprinted.
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
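The two-way check described for checkNeedFileData can be sketched as follows. This is an illustration under stated assumptions, not the connector's code: the class NeedFileDataSketch is hypothetical, and the BiPredicate argument stands in for wouldFileBeIncluded(fileName, documentSpecification, pretendIndexable) with the specification already bound.

```java
import java.util.function.BiPredicate;

// Hypothetical illustration only; not taken from SharedDriveConnector.
final class NeedFileDataSketch {
    /**
     * inclusionRule.test(fileName, pretendIndexable) plays the role of
     * wouldFileBeIncluded for a fixed document specification.
     */
    static boolean checkNeedFileData(String fileName,
                                     BiPredicate<String, Boolean> inclusionRule) {
        boolean ifIndexable = inclusionRule.test(fileName, true);
        boolean ifNotIndexable = inclusionRule.test(fileName, false);
        // The contents matter only when indexability changes the decision.
        return ifIndexable != ifNotIndexable;
    }

    public static void main(String[] args) {
        // A rule that looks only at the extension never needs the contents.
        BiPredicate<String, Boolean> byExtension =
            (name, indexable) -> name.endsWith(".txt");
        // A rule that depends on indexability does need the contents.
        BiPredicate<String, Boolean> byIndexability =
            (name, indexable) -> indexable;
        System.out.println(checkNeedFileData("report.txt", byExtension));
        System.out.println(checkNeedFileData("report.txt", byIndexability));
    }
}
```

The point of the design is to avoid fetching a file just to fingerprint it when the inclusion rules would reach the same decision either way.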
-
checkIngest
protected boolean checkIngest(java.io.File localFile, java.lang.String fileName, org.apache.manifoldcf.core.interfaces.Specification documentSpecification, org.apache.manifoldcf.crawler.interfaces.IFingerprintActivity activities) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException, org.apache.manifoldcf.agents.interfaces.ServiceInterruption
Check if a file should be ingested, given a document specification and a local copy of the file. It is presumed that only files that passed checkInclude() and were also flagged as needing file data by checkNeedFileData() will be checked by this method.
- Parameters:
localFile - is the file.
fileName - is the JCIFS file name.
documentSpecification - is the specification.
activities - are the activities available to determine indexability.
- Returns:
- true if the file should be ingested.
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
org.apache.manifoldcf.agents.interfaces.ServiceInterruption
-
matchSubPath
protected static int matchSubPath(java.lang.String subPath, java.lang.String fullPath)
Match a sub-path. The sub-path must match the complete starting part of the full path, in a path sense. The returned value should point into the file name beyond the end of the matched path, or be -1 if there is no match.
- Parameters:
subPath - is the sub path.
fullPath - is the full path.
- Returns:
- the index of the start of the remaining part of the full path, or -1.
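As an illustration of matching "in a path sense", here is a minimal sketch of the contract described for matchSubPath. It assumes '/' is the path separator and that an empty sub-path matches everything; both are assumptions, and the class SubPathSketch is hypothetical rather than the connector's implementation.

```java
// Hypothetical illustration only; not taken from SharedDriveConnector.
final class SubPathSketch {
    /**
     * Returns the index in fullPath just past the matched sub-path,
     * or -1 if subPath is not a whole-component prefix of fullPath.
     */
    static int matchSubPath(String subPath, String fullPath) {
        // Treat "a/b" and "a/b/" alike by dropping one trailing separator.
        String prefix = subPath.endsWith("/")
            ? subPath.substring(0, subPath.length() - 1)
            : subPath;
        if (prefix.isEmpty()) {
            return 0; // assumption: an empty sub-path matches any full path
        }
        if (!fullPath.startsWith(prefix)) {
            return -1;
        }
        int end = prefix.length();
        if (end == fullPath.length()) {
            return end; // exact match, nothing remains
        }
        // The match must stop on a component boundary, so that
        // "docs/rep" does not match "docs/reports/...".
        if (fullPath.charAt(end) != '/') {
            return -1;
        }
        return end + 1;
    }

    public static void main(String[] args) {
        String full = "docs/reports/2020/q1.txt";
        int index = matchSubPath("docs/reports", full);
        System.out.println(index < 0 ? "no match" : full.substring(index));
    }
}
```

The boundary check is what distinguishes a path match from a plain string prefix test.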
-
checkMatch
protected static boolean checkMatch(java.lang.String sourceMatch, int sourceIndex, java.lang.String match)
Check a match between two strings with wildcards.
- Parameters:
sourceMatch - is the expanded string (no wildcards).
sourceIndex - is the starting point in the expanded string.
match - is the wildcard-based string.
- Returns:
- true if there is a match.
-
processCheck
protected static boolean processCheck(boolean caseSensitive, java.lang.String sourceMatch, int sourceIndex, java.lang.String match, int matchIndex)
Recursive worker method for checkMatch. Returns 'true' if there is a path that consumes both strings in their entirety in a matched way.
- Parameters:
caseSensitive - is true if file names are case sensitive.
sourceMatch - is the source string (without wildcards).
sourceIndex - is the current point in the source string.
match - is the match string (with wildcards).
matchIndex - is the current point in the match string.
- Returns:
- true if there is a match.
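The recursive descent described for checkMatch and processCheck can be sketched as below. The wildcard syntax is an assumption ('*' for any run of characters, '?' for exactly one), as is the choice to match case-insensitively from checkMatch; the class WildcardSketch is hypothetical and only mirrors the documented signatures.

```java
// Hypothetical illustration only; not taken from SharedDriveConnector.
final class WildcardSketch {
    /** Entry point: match source (from sourceIndex) against a wildcard pattern. */
    static boolean checkMatch(String source, int sourceIndex, String pattern) {
        // Assumption: case-insensitive matching, as is common for SMB file names.
        return processCheck(false, source, sourceIndex, pattern, 0);
    }

    /** True if some path consumes both strings in their entirety. */
    static boolean processCheck(boolean caseSensitive, String source, int sourceIndex,
                                String pattern, int patternIndex) {
        // Pattern exhausted: succeed only if the source is exhausted too.
        if (patternIndex == pattern.length()) {
            return sourceIndex == source.length();
        }
        char p = pattern.charAt(patternIndex);
        if (p == '*') {
            // Let '*' absorb zero characters, then one, then two, and so on.
            for (int i = sourceIndex; i <= source.length(); i++) {
                if (processCheck(caseSensitive, source, i, pattern, patternIndex + 1)) {
                    return true;
                }
            }
            return false;
        }
        if (sourceIndex == source.length()) {
            return false; // pattern still needs a character, source has none
        }
        char s = source.charAt(sourceIndex);
        boolean same = p == '?'
            || (caseSensitive
                ? s == p
                : Character.toLowerCase(s) == Character.toLowerCase(p));
        return same
            && processCheck(caseSensitive, source, sourceIndex + 1, pattern, patternIndex + 1);
    }

    public static void main(String[] args) {
        // Match only the part of the path after the 8-character "reports/" prefix.
        System.out.println(checkMatch("reports/Q1.TXT", 8, "*.txt"));
    }
}
```

Each '*' causes one level of recursion per candidate split point, which is simple but can be slow for patterns with many wildcards; a sourceIndex such as the one returned by a sub-path match lets the caller test only the remainder of a path.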
-
getForcedAcls
protected static java.lang.String[] getForcedAcls(org.apache.manifoldcf.core.interfaces.Specification spec)
Grab forced ACLs out of document specification.
- Parameters:
spec - is the document specification.
- Returns:
- the acls.
-
getForcedShareAcls
protected static java.lang.String[] getForcedShareAcls(org.apache.manifoldcf.core.interfaces.Specification spec)
Grab the forced share ACLs out of the document specification.
- Parameters:
spec - is the document specification.
- Returns:
- the acls.
-
getForcedParentFolderAcls
protected static java.lang.String[] getForcedParentFolderAcls(org.apache.manifoldcf.core.interfaces.Specification spec)
Grab the forced parent folder ACLs out of the document specification.
- Parameters:
spec - is the document specification.
- Returns:
- the acls.
-
mapToIdentifier
protected java.lang.String mapToIdentifier(java.lang.String path) throws java.net.MalformedURLException, java.net.UnknownHostException
Map a "path" specification to a full identifier.
- Throws:
java.net.MalformedURLException
java.net.UnknownHostException
-
getFileCanonicalPath
protected static java.lang.String getFileCanonicalPath(jcifs.smb.SmbFile file)
Get canonical path
-
fileExists
protected static boolean fileExists(jcifs.smb.SmbFile file) throws jcifs.smb.SmbException
Check for file/directory existence.
- Throws:
jcifs.smb.SmbException
-
fileIsDirectory
protected static boolean fileIsDirectory(jcifs.smb.SmbFile file) throws jcifs.smb.SmbException
Check if file is a directory.
- Throws:
jcifs.smb.SmbException
-
fileLastModified
protected static long fileLastModified(jcifs.smb.SmbFile file) throws jcifs.smb.SmbException
Get last modified date for file.
- Throws:
jcifs.smb.SmbException
-
fileLength
protected static long fileLength(jcifs.smb.SmbFile file) throws jcifs.smb.SmbException
Get file length.
- Throws:
jcifs.smb.SmbException
-
fileListFiles
protected static jcifs.smb.SmbFile[] fileListFiles(jcifs.smb.SmbFile file, jcifs.smb.SmbFileFilter filter) throws jcifs.smb.SmbException
List files.
- Throws:
jcifs.smb.SmbException
-
getFileInputStream
protected static java.io.InputStream getFileInputStream(jcifs.smb.SmbFile file) throws java.io.IOException
Get input stream for file.
- Throws:
java.io.IOException
-
getFileSecurity
protected static jcifs.ACE[] getFileSecurity(jcifs.smb.SmbFile file, boolean useSIDs) throws java.io.IOException
Get file security.
- Throws:
java.io.IOException
-
getFileShareSecurity
protected static jcifs.ACE[] getFileShareSecurity(jcifs.smb.SmbFile file, boolean useSIDs) throws java.io.IOException
Get share security.
- Throws:
java.io.IOException
-
getFileType
protected static int getFileType(jcifs.smb.SmbFile file) throws jcifs.smb.SmbException
Get file type.
- Throws:
jcifs.smb.SmbException
-
equivalentSmbExceptions
protected static boolean equivalentSmbExceptions(jcifs.smb.SmbException e1, jcifs.smb.SmbException e2)
Check if two SmbExceptions are equivalent
-
equivalentIOExceptions
protected static boolean equivalentIOExceptions(java.io.IOException e1, java.io.IOException e2)
Check if two IOExceptions are equivalent
-
outputConfigurationHeader
public void outputConfigurationHeader(org.apache.manifoldcf.core.interfaces.IThreadContext threadContext, org.apache.manifoldcf.core.interfaces.IHTTPOutput out, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.ConfigParams parameters, java.util.List<java.lang.String> tabsArray) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException, java.io.IOException
Output the configuration header section. This method is called in the head section of the connector's configuration page. Its purpose is to add the required tabs to the list, and to output any JavaScript methods that might be needed by the configuration editing HTML.
- Specified by:
outputConfigurationHeader in interface org.apache.manifoldcf.core.interfaces.IConnector
- Overrides:
outputConfigurationHeader in class org.apache.manifoldcf.core.connector.BaseConnector
- Parameters:
threadContext - is the local thread context.
out - is the output to which any HTML should be sent.
parameters - are the configuration parameters, as they currently exist, for this connection being configured.
tabsArray - is the list of tab names. Add to this list any tab names that are specific to the connector.
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
java.io.IOException
-
outputConfigurationBody
public void outputConfigurationBody(org.apache.manifoldcf.core.interfaces.IThreadContext threadContext, org.apache.manifoldcf.core.interfaces.IHTTPOutput out, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.ConfigParams parameters, java.lang.String tabName) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException, java.io.IOException
Output the configuration body section. This method is called in the body section of the connector's configuration page. Its purpose is to present the required form elements for editing. The coder can presume that the HTML that is output from this configuration will be within appropriate <html>, <body>, and <form> tags. The name of the form is "editconnection".
- Specified by:
outputConfigurationBody in interface org.apache.manifoldcf.core.interfaces.IConnector
- Overrides:
outputConfigurationBody in class org.apache.manifoldcf.core.connector.BaseConnector
- Parameters:
threadContext - is the local thread context.
out - is the output to which any HTML should be sent.
parameters - are the configuration parameters, as they currently exist, for this connection being configured.
tabName - is the current tab name.
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
java.io.IOException
-
processConfigurationPost
public java.lang.String processConfigurationPost(org.apache.manifoldcf.core.interfaces.IThreadContext threadContext, org.apache.manifoldcf.core.interfaces.IPostParameters variableContext, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.ConfigParams parameters) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Process a configuration post. This method is called at the start of the connector's configuration page, whenever there is a possibility that form data for a connection has been posted. Its purpose is to gather form information and modify the configuration parameters accordingly. The name of the posted form is "editconnection".
- Specified by:
processConfigurationPost in interface org.apache.manifoldcf.core.interfaces.IConnector
- Overrides:
processConfigurationPost in class org.apache.manifoldcf.core.connector.BaseConnector
- Parameters:
threadContext - is the local thread context.
variableContext - is the set of variables available from the post, including binary file post information.
parameters - are the configuration parameters, as they currently exist, for this connection being configured.
- Returns:
- null if all is well, or a string error message if there is an error that should prevent saving of the connection (and cause a redirection to an error page).
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
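The return contract of processConfigurationPost (null on success, an error message otherwise) can be illustrated with a simplified sketch. Everything here is a stand-in: plain maps replace IPostParameters and ConfigParams, and the form field name "server" and parameter name "Server" are hypothetical, chosen only for the example.

```java
import java.util.HashMap;
import java.util.Map;

final class ConfigurationPostSketch {
  /**
   * Copies a posted form value into the configuration parameters.
   * Returns null if all is well, or an error message that should block saving.
   * Assumption: "server" (form field) and "Server" (parameter) are made-up names.
   */
  static String processConfigurationPost(Map<String, String> posted,
                                         Map<String, String> parameters) {
    String server = posted.get("server");
    // A missing field means this part of the form was not posted; leave parameters alone.
    if (server != null) {
      String trimmed = server.trim();
      if (trimmed.isEmpty()) {
        return "Server name must not be empty";
      }
      parameters.put("Server", trimmed);
    }
    return null;
  }

  public static void main(String[] args) {
    Map<String, String> parameters = new HashMap<>();
    String error = processConfigurationPost(Map.of("server", " fileserver "), parameters);
    System.out.println(error == null ? "saved: " + parameters : "error: " + error);
  }
}
```

Treating an absent field as "nothing to update" matters because the method is called whenever form data might have been posted, not only when this connector's fields are present.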
-
viewConfiguration
public void viewConfiguration(org.apache.manifoldcf.core.interfaces.IThreadContext threadContext, org.apache.manifoldcf.core.interfaces.IHTTPOutput out, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.ConfigParams parameters) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException, java.io.IOException
View configuration. This method is called in the body section of the connector's view configuration page. Its purpose is to present the connection information to the user. The coder can presume that the HTML that is output from this configuration will be within appropriate <html> and <body> tags.
- Specified by:
viewConfiguration in interface org.apache.manifoldcf.core.interfaces.IConnector
- Overrides:
viewConfiguration in class org.apache.manifoldcf.core.connector.BaseConnector
- Parameters:
threadContext - is the local thread context.
out - is the output to which any HTML should be sent.
parameters - are the configuration parameters, as they currently exist, for this connection being configured.
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
java.io.IOException
-
outputSpecificationHeader
public void outputSpecificationHeader(org.apache.manifoldcf.core.interfaces.IHTTPOutput out, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.Specification ds, int connectionSequenceNumber, java.util.List<java.lang.String> tabsArray) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException, java.io.IOException
Output the specification header section. This method is called in the head section of a job page which has selected a repository connection of the current type. Its purpose is to add the required tabs to the list, and to output any JavaScript methods that might be needed by the job editing HTML. The connector will be connected before this method can be called.
- Specified by:
outputSpecificationHeader in interface org.apache.manifoldcf.crawler.interfaces.IRepositoryConnector
- Overrides:
outputSpecificationHeader in class org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector
- Parameters:
out - is the output to which any HTML should be sent.
locale - is the locale the output is preferred to be in.
ds - is the current document specification for this job.
connectionSequenceNumber - is the unique number of this connection within the job.
tabsArray - is the list of tab names. Add to this list any tab names that are specific to the connector.
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
java.io.IOException
-
outputSpecificationBody
public void outputSpecificationBody(org.apache.manifoldcf.core.interfaces.IHTTPOutput out, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.Specification ds, int connectionSequenceNumber, int actualSequenceNumber, java.lang.String tabName) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException, java.io.IOException
Output the specification body section. This method is called in the body section of a job page which has selected a repository connection of the current type. Its purpose is to present the required form elements for editing. The coder can presume that the HTML that is output from this configuration will be within appropriate <html>, <body>, and <form> tags. The name of the form is always "editjob". The connector will be connected before this method can be called.
- Specified by:
outputSpecificationBody in interface org.apache.manifoldcf.crawler.interfaces.IRepositoryConnector
- Overrides:
outputSpecificationBody in class org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector
- Parameters:
out - is the output to which any HTML should be sent.
locale - is the locale the output is preferred to be in.
ds - is the current document specification for this job.
connectionSequenceNumber - is the unique number of this connection within the job.
actualSequenceNumber - is the connection within the job that has currently been selected.
tabName - is the current tab name. (actualSequenceNumber, tabName) form a unique tuple within the job.
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
java.io.IOException
-
processSpecificationPost
public java.lang.String processSpecificationPost(org.apache.manifoldcf.core.interfaces.IPostParameters variableContext, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.Specification ds, int connectionSequenceNumber) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Process a specification post. This method is called at the start of a job's edit or view page, whenever there is a possibility that form data for a connection has been posted. Its purpose is to gather form information and modify the document specification accordingly. The name of the posted form is always "editjob". The connector will be connected before this method can be called.
- Specified by:
processSpecificationPost in interface org.apache.manifoldcf.crawler.interfaces.IRepositoryConnector
- Overrides:
processSpecificationPost in class org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector
- Parameters:
variableContext - contains the post data, including binary file-upload information.
locale - is the locale the output is preferred to be in.
ds - is the current document specification for this job.
connectionSequenceNumber - is the unique number of this connection within the job.
- Returns:
- null if all is well, or a string error message if there is an error that should prevent saving of the job (and cause a redirection to an error page).
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
viewSpecification
public void viewSpecification(org.apache.manifoldcf.core.interfaces.IHTTPOutput out, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.Specification ds, int connectionSequenceNumber) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException, java.io.IOException
View specification. This method is called in the body section of a job's view page. Its purpose is to present the document specification information to the user. The coder can presume that the HTML that is output from this configuration will be within appropriate <html> and <body> tags. The connector will be connected before this method can be called.
- Specified by:
viewSpecification in interface org.apache.manifoldcf.crawler.interfaces.IRepositoryConnector
- Overrides:
viewSpecification in class org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector
- Parameters:
out - is the output to which any HTML should be sent.
locale - is the locale the output is preferred to be in.
ds - is the current document specification for this job.
connectionSequenceNumber - is the unique number of this connection within the job.
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
java.io.IOException
-
getShareNames
public jcifs.smb.SmbFile[] getShareNames(java.lang.String serverURI) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Given a server URI, return all shares.
- Parameters:
serverURI - is the URI of the server.
- Returns:
- an array of SmbFile
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
validateFolderName
public java.lang.String validateFolderName(java.lang.String folder) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Given a folder path, determine if the folder is in fact legal and accessible (and is a folder).
- Parameters:
folder - is the relative folder from the network root.
- Returns:
- the canonical folder name if valid, or null if not.
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
getChildFolderNames
public java.lang.String[] getChildFolderNames(java.lang.String folder) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Given an SMB URI, return all child directories.
- Parameters:
folder - is the relative folder from the network root.
- Returns:
- array of child folder names
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
-