Class BOMEncodingDetector
- java.lang.Object
- org.apache.manifoldcf.connectorcommon.fuzzyml.ByteReceiver
- org.apache.manifoldcf.connectorcommon.fuzzyml.SingleByteReceiver
- org.apache.manifoldcf.connectorcommon.fuzzyml.BOMEncodingDetector
All Implemented Interfaces: EncodingDetector
public class BOMEncodingDetector extends SingleByteReceiver implements EncodingDetector
This class represents the parse state of the BOM (byte order mark) parser. The parser looks for a byte order mark at the start of a byte sequence and selects a preliminary character encoding based on whether a mark is present and, if so, which one. Once the preliminary encoding is determined, an EncodingAccepter is notified, and all further bytes are sent to the provided ByteReceiver.
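As a rough illustration of the mapping from a leading byte order mark to a preliminary encoding, the following stand-alone sketch inspects the first few bytes of an array. It is not the ManifoldCF implementation: the class `BomSniffSketch` and its methods are hypothetical, and the returned strings are Java charset names chosen for illustration, since this page does not say which names BOMEncodingDetector reports.

```java
/** Hypothetical stand-alone sketch; not part of ManifoldCF. */
class BomSniffSketch {

  /** Returns the charset name implied by a leading BOM, or null if none is present. */
  static String detect(byte[] data) {
    // The four-byte UTF-32LE mark starts with the two-byte UTF-16LE mark,
    // so the longer pattern has to be tested first.
    if (startsWith(data, 0xFF, 0xFE, 0x00, 0x00)) return "UTF-32LE";
    if (startsWith(data, 0x00, 0x00, 0xFE, 0xFF)) return "UTF-32BE";
    if (startsWith(data, 0xEF, 0xBB, 0xBF)) return "UTF-8";
    if (startsWith(data, 0xFF, 0xFE)) return "UTF-16LE";
    if (startsWith(data, 0xFE, 0xFF)) return "UTF-16BE";
    return null;
  }

  /** Compares the start of data against unsigned byte values. */
  private static boolean startsWith(byte[] data, int... prefix) {
    if (data.length < prefix.length) return false;
    for (int i = 0; i < prefix.length; i++) {
      if ((data[i] & 0xFF) != prefix[i]) return false;
    }
    return true;
  }

  public static void main(String[] args) {
    byte[] utf8Doc = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 0x3C};
    System.out.println(detect(utf8Doc));
  }
}
```

A streaming detector cannot look ahead like this, which is presumably why the class tracks intermediate states such as BOM_SEEN_EF and BOM_SEEN_FFFE instead.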
Field Summary
Fields (Modifier and Type, Field):
- protected static int BOM_NOTHINGYET
- protected static int BOM_SEEN_0000
- protected static int BOM_SEEN_0000FE
- protected static int BOM_SEEN_EF
- protected static int BOM_SEEN_EFBB
- protected static int BOM_SEEN_FE
- protected static int BOM_SEEN_FF
- protected static int BOM_SEEN_FFFE
- protected static int BOM_SEEN_FFFE00
- protected static int BOM_SEEN_ZERO
- protected int currentState
- protected java.lang.String encoding
- protected ByteReceiver overflowByteReceiver
- protected ByteBuffer replayBuffer
Fields inherited from class org.apache.manifoldcf.connectorcommon.fuzzyml.SingleByteReceiver:
- byteBuffer
Constructor Summary
Constructors (Constructor, Description):
- BOMEncodingDetector(ByteReceiver overflowByteReceiver): Constructor.
Method Summary
Methods (Modifier and Type, Method, Description):
- boolean dealWithByte(byte b): Receive a byte.
- protected boolean dealWithRemainder(byte[] buffer, int offset, int len, java.io.InputStream inputStream): Deal with the remainder of the input.
- protected boolean establishEncoding(java.lang.String encoding): Establish the provided encoding, and send the rest to the child, if any.
- java.lang.String getEncoding(): Retrieve final encoding determination.
- protected void mark(): Set a "mark".
- protected boolean playFromCurrentPoint(): Send stream from current point onward with the current encoding.
- protected boolean replay(): Establish NO encoding, and replay from the current saved point to the child, if any.
- void setEncoding(java.lang.String encoding): Set initial encoding.
Methods inherited from class org.apache.manifoldcf.connectorcommon.fuzzyml.SingleByteReceiver:
- dealWithBytes
Methods inherited from class org.apache.manifoldcf.connectorcommon.fuzzyml.ByteReceiver:
- finishUp
Field Detail
encoding
protected java.lang.String encoding

overflowByteReceiver
protected final ByteReceiver overflowByteReceiver

replayBuffer
protected ByteBuffer replayBuffer

BOM_NOTHINGYET
protected static final int BOM_NOTHINGYET
See Also: Constant Field Values

BOM_SEEN_EF
protected static final int BOM_SEEN_EF
See Also: Constant Field Values

BOM_SEEN_FF
protected static final int BOM_SEEN_FF
See Also: Constant Field Values

BOM_SEEN_FE
protected static final int BOM_SEEN_FE
See Also: Constant Field Values

BOM_SEEN_ZERO
protected static final int BOM_SEEN_ZERO
See Also: Constant Field Values

BOM_SEEN_EFBB
protected static final int BOM_SEEN_EFBB
See Also: Constant Field Values

BOM_SEEN_FFFE
protected static final int BOM_SEEN_FFFE
See Also: Constant Field Values

BOM_SEEN_0000
protected static final int BOM_SEEN_0000
See Also: Constant Field Values

BOM_SEEN_FFFE00
protected static final int BOM_SEEN_FFFE00
See Also: Constant Field Values

BOM_SEEN_0000FE
protected static final int BOM_SEEN_0000FE
See Also: Constant Field Values

currentState
protected int currentState
Constructor Detail
BOMEncodingDetector
public BOMEncodingDetector(ByteReceiver overflowByteReceiver)
Constructor.
Parameters:
overflowByteReceiver - Pass in the receiver of all overflow bytes. If no receiver is passed in, the detector will stop as soon as the BOM is either seen or not seen.
Method Detail
setEncoding
public void setEncoding(java.lang.String encoding)
Set initial encoding.
Specified by: setEncoding in interface EncodingDetector

getEncoding
public java.lang.String getEncoding()
Retrieve final encoding determination.
Specified by: getEncoding in interface EncodingDetector

dealWithByte
public boolean dealWithByte(byte b) throws ManifoldCFException
Receive a byte.
Specified by: dealWithByte in class SingleByteReceiver
Returns: true to stop further processing.
Throws: ManifoldCFException
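
The "return true to stop" contract pairs naturally with a byte-at-a-time state machine. The sketch below is a simplified, hypothetical analogue: the class `BomStateSketch`, its `accept` method, and its enum states are illustrative names that only echo the documented BOM_SEEN_* constants, and the UTF-32 paths are left out for brevity. It assumes nothing about how dealWithByte is actually written.

```java
/** Hypothetical byte-at-a-time sketch; state names only echo the documented constants. */
class BomStateSketch {
  enum State { NOTHING_YET, SEEN_EF, SEEN_EFBB, SEEN_FF, SEEN_FE, DONE }

  private State state = State.NOTHING_YET;
  private String encoding; // stays null when no BOM was recognised

  /** Feeds one byte; returns true once a decision is made and feeding should stop. */
  boolean accept(byte b) {
    int v = b & 0xFF; // treat the byte as unsigned
    switch (state) {
      case NOTHING_YET:
        if (v == 0xEF) { state = State.SEEN_EF; return false; }
        if (v == 0xFF) { state = State.SEEN_FF; return false; }
        if (v == 0xFE) { state = State.SEEN_FE; return false; }
        return finish(null);
      case SEEN_EF:
        if (v == 0xBB) { state = State.SEEN_EFBB; return false; }
        return finish(null);
      case SEEN_EFBB:
        return finish(v == 0xBF ? "UTF-8" : null);
      case SEEN_FF:
        return finish(v == 0xFE ? "UTF-16LE" : null);
      case SEEN_FE:
        return finish(v == 0xFF ? "UTF-16BE" : null);
      default:
        return true; // already decided; ignore further input
    }
  }

  private boolean finish(String detected) {
    encoding = detected;
    state = State.DONE;
    return true;
  }

  String getEncoding() {
    return encoding;
  }

  public static void main(String[] args) {
    BomStateSketch sketch = new BomStateSketch();
    for (byte b : new byte[] {(byte) 0xFE, (byte) 0xFF, 0x00, 0x3C}) {
      if (sketch.accept(b)) break;
    }
    System.out.println(sketch.getEncoding());
  }
}
```

Unlike the documented class, this sketch drops the bytes it consumed when no mark is found; a real detector has to keep them so they can be passed on.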

establishEncoding
protected boolean establishEncoding(java.lang.String encoding) throws ManifoldCFException
Establish the provided encoding, and send the rest to the child, if any.
Throws: ManifoldCFException

mark
protected void mark()
Set a "mark".

replay
protected boolean replay() throws ManifoldCFException
Establish NO encoding, and replay from the current saved point to the child, if any.
Throws: ManifoldCFException
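
One way to picture mark() and replay() is a small hold-back buffer: bytes that might belong to a BOM are held, and if they turn out not to be one, they are forwarded downstream unchanged. The sketch below rests on assumptions. `ReplaySketch`, `Sink`, and `hold` are hypothetical names, and treating mark() as "discard everything held so far" is a guess at the semantics, not something this page confirms.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical hold-back buffer; not the ManifoldCF replayBuffer. */
class ReplaySketch {
  /** Hypothetical downstream consumer standing in for a ByteReceiver. */
  interface Sink {
    void write(byte[] data, int offset, int len);
  }

  private final ByteArrayOutputStream held = new ByteArrayOutputStream();
  private final Sink sink; // may be null, mirroring "the child, if any"

  ReplaySketch(Sink sink) {
    this.sink = sink;
  }

  /** Remembers a byte whose meaning is not yet known. */
  void hold(byte b) {
    held.write(b);
  }

  /** Assumed semantics: forget what was held, so a later replay starts here. */
  void mark() {
    held.reset();
  }

  /** Forwards every byte held since the last mark, then clears the buffer. */
  void replay() {
    byte[] data = held.toByteArray();
    if (sink != null && data.length > 0) {
      sink.write(data, 0, data.length);
    }
    held.reset();
  }

  public static void main(String[] args) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    ReplaySketch sketch = new ReplaySketch((d, o, l) -> out.write(d, o, l));
    sketch.hold((byte) 0xEF); // looked like the start of a UTF-8 BOM
    sketch.hold((byte) 0x41); // but the next byte did not match
    sketch.replay();          // so both bytes go downstream unchanged
    System.out.println(out.size());
  }
}
```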

playFromCurrentPoint
protected boolean playFromCurrentPoint() throws ManifoldCFException
Send stream from current point onward with the current encoding.
Throws: ManifoldCFException

dealWithRemainder
protected boolean dealWithRemainder(byte[] buffer, int offset, int len, java.io.InputStream inputStream) throws java.io.IOException, ManifoldCFException
Deal with the remainder of the input. This is called only when dealWithByte() returns true.
Overrides: dealWithRemainder in class SingleByteReceiver
Parameters:
buffer - is the buffer of bytes that should come first.
offset - is the offset within the buffer of the first byte.
len - is the number of bytes in the buffer.
inputStream - is the stream that should come after the bytes in the buffer.
Returns: true to abort, false if the end of the stream has been reached.
Throws: java.io.IOException, ManifoldCFException
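
The parameter order implies a two-phase hand-off: the bytes already read into the buffer go first, followed by whatever is left on the stream. A minimal sketch of that ordering follows, assuming a hypothetical `Sink` callback that returns true to abort. `RemainderSketch` and `forwardRemainder` are illustrative names, not ManifoldCF API, and the 4096-byte chunk size is arbitrary.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical sketch of "buffer first, then stream" forwarding. */
class RemainderSketch {
  /** Hypothetical downstream consumer; returns true to request an abort. */
  interface Sink {
    boolean write(byte[] data, int offset, int len);
  }

  /** Returns true if the sink aborted, false once the stream is exhausted. */
  static boolean forwardRemainder(byte[] buffer, int offset, int len,
                                  InputStream in, Sink sink) throws IOException {
    // Phase 1: bytes that were already pulled off the stream.
    if (len > 0 && sink.write(buffer, offset, len)) {
      return true;
    }
    // Phase 2: everything still waiting on the stream.
    byte[] chunk = new byte[4096];
    int n;
    while ((n = in.read(chunk)) != -1) {
      if (n > 0 && sink.write(chunk, 0, n)) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buffered = {0x3C, 0x61};
    InputStream rest = new ByteArrayInputStream(new byte[] {0x3E});
    boolean aborted = forwardRemainder(buffered, 0, buffered.length, rest,
        (d, o, l) -> { out.write(d, o, l); return false; });
    System.out.println(aborted + " " + out.size());
  }
}
```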