org.apache.solr.handler.dataimport
Class TikaEntityProcessor

java.lang.Object
  extended by org.apache.solr.handler.dataimport.EntityProcessor
      extended by org.apache.solr.handler.dataimport.EntityProcessorBase
          extended by org.apache.solr.handler.dataimport.TikaEntityProcessor

public class TikaEntityProcessor
extends EntityProcessorBase

An implementation of EntityProcessor which reads data from rich documents using Apache Tika.

Since:
Solr 3.1

Field Summary
 
Fields inherited from class org.apache.solr.handler.dataimport.EntityProcessorBase
ABORT, cacheSupport, context, CONTINUE, entityName, isFirstInit, ON_ERROR, onError, query, rowIterator, SKIP, SKIP_DOC, TRANSFORM_ROW, TRANSFORMER
 
Constructor Summary
TikaEntityProcessor()
           
 
Method Summary
protected  void firstInit(Context context)
          First-time init call.
 Map<String,Object> nextRow()
          For a simple implementation, this is the only method that the subclass should implement.
 
Methods inherited from class org.apache.solr.handler.dataimport.EntityProcessorBase
destroy, getNext, init, initCache, nextDeletedRowKey, nextModifiedParentRowKey, nextModifiedRowKey
 
Methods inherited from class org.apache.solr.handler.dataimport.EntityProcessor
close, postTransform
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

TikaEntityProcessor

public TikaEntityProcessor()
Method Detail

firstInit

protected void firstInit(Context context)
Description copied from class: EntityProcessorBase
First-time init call. Do one-time operations here.

Overrides:
firstInit in class EntityProcessorBase
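
The one-time initialization idea can be pictured with a small stand-alone sketch. It does not use the Solr classes: `OneTimeInitSketch`, the `Map`-based context argument, and the two counters are hypothetical stand-ins, and only the `isFirstInit` flag name is borrowed from the inherited fields listed above. The assumption is that `init` may be called many times while the work in `firstInit` should run only once.

```java
import java.util.HashMap;
import java.util.Map;

public class OneTimeInitSketch {
    // Assumption: a flag guards the one-time work, mirroring the inherited isFirstInit field name.
    private boolean isFirstInit = true;

    // Hypothetical counters, present only to make the behavior observable.
    public int firstInitCount = 0;
    public int initCount = 0;

    // Called every time the processor is (re)initialized.
    public void init(Map<String, Object> context) {
        if (isFirstInit) {
            firstInit(context);
            isFirstInit = false;
        }
        initCount++;
    }

    // One-time operations go here (for example, reading configuration).
    protected void firstInit(Map<String, Object> context) {
        firstInitCount++;
    }

    public static void main(String[] args) {
        OneTimeInitSketch processor = new OneTimeInitSketch();
        processor.init(new HashMap<String, Object>());
        processor.init(new HashMap<String, Object>());
        // firstInit ran once even though init ran twice.
        System.out.println(processor.firstInitCount + " " + processor.initCount);
    }
}
```

Whether the real base class resets the flag in exactly this place is not stated on this page; the sketch only illustrates the "do one-time operations here" contract.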

nextRow

public Map<String,Object> nextRow()
Description copied from class: EntityProcessorBase
For a simple implementation, this is the only method that the subclass should implement. This is intended to stream rows one by one. Return null to signal end of rows.

Overrides:
nextRow in class EntityProcessorBase
Returns:
a row where each key is the name of a field and each value can be any Object or a Collection of objects. Return null to signal end of rows.
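
The null-terminated row contract described above might be consumed as in the following stand-alone sketch. It does not depend on Solr or Tika: `NextRowSketch`, `SingleRowSource`, `drain`, and the `"text"` field name are hypothetical, and the one-row-per-document behavior is an assumption chosen for illustration, not a statement about how `TikaEntityProcessor` is implemented.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class NextRowSketch {

    // Hypothetical stand-in for a processor that yields a single row, then null.
    public static class SingleRowSource {
        private final String text;
        private boolean done = false;

        public SingleRowSource(String text) {
            this.text = text;
        }

        // Returns a row keyed by field name, or null once the rows are exhausted.
        public Map<String, Object> nextRow() {
            if (done) {
                return null;
            }
            Map<String, Object> row = new HashMap<String, Object>();
            row.put("text", text);
            done = true;
            return row;
        }
    }

    // Pulls rows one by one until nextRow() signals the end with null.
    public static List<Map<String, Object>> drain(SingleRowSource source) {
        List<Map<String, Object>> rows = new ArrayList<Map<String, Object>>();
        Map<String, Object> row;
        while ((row = source.nextRow()) != null) {
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> rows = drain(new SingleRowSource("extracted body"));
        System.out.println(rows.size() + " row(s): " + rows.get(0).get("text"));
    }
}
```

The caller never needs to know the row count in advance; it simply loops until it sees null, which is what allows rows to be streamed.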


Copyright © 2000-2012 Apache Software Foundation. All Rights Reserved.