com.armatiek.infofuze.source.extractor
Class CSVFileExtractor

java.lang.Object
  extended by com.armatiek.infofuze.source.extractor.FileExtractor
      extended by com.armatiek.infofuze.source.extractor.FileFileExtractor
          extended by com.armatiek.infofuze.source.extractor.CSVFileExtractor
All Implemented Interfaces:
IOFileFilter

public class CSVFileExtractor
extends FileFileExtractor

Represents the definition of a filesystem-based source within infofuze-config.xml. The class implements IOFileFilter and can therefore be used to determine whether a specific FileIf is accepted by this extractor definition.

Author:
Maarten Kroon

Field Summary
 
Fields inherited from class com.armatiek.infofuze.source.extractor.FileExtractor
logger
 
Constructor Summary
CSVFileExtractor(org.w3c.dom.Element configElem)
           
 
Method Summary
 char getCommentStart()
          Returns the configured CSV comment start character.
 java.lang.String getDefaultCharsetName()
          Returns the name of the character encoding to assume for the CSV file when it cannot be determined automatically, or "UTF-8" when none is configured.
 char getDelimiter()
          Returns the configured CSV delimiter character.
 char getEncapsulator()
          Returns the configured CSV encapsulator character.
 char getEscape()
          Returns the configured CSV escape character.
 boolean getHasHeading()
          Returns whether the first line of the CSV contains column headings; defaults to false when not configured.
 boolean getIgnoreEmptyLines()
          Returns whether empty lines are ignored; defaults to true when not configured.
 boolean getIgnoreLeadingWhitespace()
          Returns whether leading whitespace is ignored; defaults to true when not configured.
 boolean getIgnoreTrailingWhitespace()
          Returns whether trailing whitespace is ignored; defaults to true when not configured.
 boolean getInterpretUnicodeEscapes()
          Returns whether Unicode escapes are interpreted; defaults to false when not configured.
 
Methods inherited from class com.armatiek.infofuze.source.extractor.FileFileExtractor
getCacheStream, getIncludeBinary
 
Methods inherited from class com.armatiek.infofuze.source.extractor.FileExtractor
accept
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

CSVFileExtractor

public CSVFileExtractor(org.w3c.dom.Element configElem)
                 throws java.lang.Exception
Throws:
java.lang.Exception
Method Detail

getDelimiter

public char getDelimiter()
Returns the configured CSV delimiter character. If the delimiter is not configured, it returns a comma character.


getEncapsulator

public char getEncapsulator()
Returns the configured CSV encapsulator character. If the encapsulator is not configured, it returns a quote character.


getCommentStart

public char getCommentStart()
Returns the configured CSV comment start character. If the comment start character is not configured, it returns CSVStrategy.COMMENTS_DISABLED.


getEscape

public char getEscape()
Returns the configured CSV escape character. If the escape character is not configured, it returns CSVStrategy.ESCAPE_DISABLED.
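
The four character getters above share one pattern: return the configured character, or a documented default when nothing is configured. The sketch below illustrates that pattern against an org.w3c.dom.Element, the type the constructor accepts. It is not the Infofuze implementation: the helper charOption and the element names "csv", "delimiter" and "encapsulator" are hypothetical, since this page does not specify how the options are named in infofuze-config.xml.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class CsvCharOptionSketch {

    // Hypothetical helper: returns the first character of the first descendant
    // element called `name`, or `fallback` when it is absent or empty.
    static char charOption(Element parent, String name, char fallback) {
        NodeList nodes = parent.getElementsByTagName(name);
        if (nodes.getLength() == 0) {
            return fallback;
        }
        String text = nodes.item(0).getTextContent();
        return (text == null || text.isEmpty()) ? fallback : text.charAt(0);
    }

    // Parses an XML string and returns its root element (illustration only).
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse example XML", e);
        }
    }

    public static void main(String[] args) {
        // Assumed configuration shape; only the delimiter is set here.
        Element config = parse("<csv><delimiter>;</delimiter></csv>");
        System.out.println(charOption(config, "delimiter", ','));    // configured value
        System.out.println(charOption(config, "encapsulator", '"')); // falls back to the default
    }
}
```

A caller that needs to know whether comments or escapes are active would compare the getter result with the corresponding CSVStrategy constant rather than with a literal character.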


getIgnoreLeadingWhitespace

public boolean getIgnoreLeadingWhitespace()
Returns whether leading whitespace is ignored. If this is not configured, it returns true.


getIgnoreTrailingWhitespace

public boolean getIgnoreTrailingWhitespace()
Returns whether trailing whitespace is ignored. If this is not configured, it returns true.


getInterpretUnicodeEscapes

public boolean getInterpretUnicodeEscapes()
Returns whether Unicode escapes are interpreted. If this is not configured, it returns false.
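
This page does not define what interpreting Unicode escapes involves. As a rough illustration only, and assuming the common convention of a backslash, the letter u and four hexadecimal digits, the sketch below replaces each such sequence in a field value with the character it names and leaves malformed sequences untouched. The class and method names are hypothetical, and the actual parser may behave differently.

```java
public class UnicodeEscapeSketch {

    // True when the four characters starting at `from` are all hexadecimal digits.
    static boolean isHex4(String s, int from) {
        if (from + 4 > s.length()) {
            return false;
        }
        for (int i = from; i < from + 4; i++) {
            if (Character.digit(s.charAt(i), 16) < 0) {
                return false;
            }
        }
        return true;
    }

    // Replaces every well-formed escape (backslash, 'u', four hex digits)
    // with the corresponding character; everything else is copied verbatim.
    static String interpret(String field) {
        StringBuilder out = new StringBuilder(field.length());
        int i = 0;
        while (i < field.length()) {
            if (field.startsWith("\\u", i) && isHex4(field, i + 2)) {
                out.append((char) Integer.parseInt(field.substring(i + 2, i + 6), 16));
                i += 6;
            } else {
                out.append(field.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The Java literal below contains a real backslash followed by "u00e9".
        System.out.println(interpret("caf\\u00e9"));
    }
}
```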


getIgnoreEmptyLines

public boolean getIgnoreEmptyLines()
Returns whether empty lines are ignored. If this is not configured, it returns true.


getHasHeading

public boolean getHasHeading()
Returns whether the first line of the CSV contains column headings. If this is not configured, it returns false.


getDefaultCharsetName

public java.lang.String getDefaultCharsetName()
Returns the name of the character encoding to assume for the CSV file when it cannot be determined automatically. If no default is configured, it returns "UTF-8".
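
The fallback described above can be sketched with java.nio.charset.Charset. This is a minimal illustration, not the Infofuze code: the detection step is represented by a plain string that may be null, and the class and method names are hypothetical. The second argument stands in for the value returned by getDefaultCharsetName().

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

public class CharsetFallbackSketch {

    // Returns the detected charset when it is usable, otherwise the default one.
    // `detectedName` may be null when automatic detection found nothing.
    static Charset resolve(String detectedName, String defaultName) {
        if (detectedName != null) {
            try {
                if (Charset.isSupported(detectedName)) {
                    return Charset.forName(detectedName);
                }
            } catch (IllegalCharsetNameException e) {
                // Syntactically invalid name: fall through to the default.
            }
        }
        return Charset.forName(defaultName);
    }

    public static void main(String[] args) {
        System.out.println(resolve(null, "UTF-8"));         // nothing detected, default used
        System.out.println(resolve("ISO-8859-1", "UTF-8")); // detected charset wins
    }
}
```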