| 1 | Apache Solr - DataImportHandler |
---|
| 2 | Release Notes |
---|
| 3 | |
---|
| 4 | Introduction |
---|
| 5 | ------------ |
---|
| 6 | DataImportHandler is a data import tool for Solr which makes importing data from Databases, XML files and |
---|
| 7 | HTTP data sources quick and easy. |
---|
| 8 | |
---|
| 9 | |
---|
| 10 | $Id: CHANGES.txt 1350278 2012-06-14 14:52:22Z jdyer $ |
---|
| 11 | ================== 4.0.0-ALPHA ============== |
---|
| 12 | Bug Fixes |
---|
| 13 | ---------------------- |
---|
| 14 | * SOLR-3430: Added a new test against a real SQL database. Fixed problems revealed by this new test |
---|
| 15 | related to the expanded cache support added to 3.6/SOLR-2382 (James Dyer) |
---|
| 16 | |
---|
| 17 | * SOLR-1958: When using the MailEntityProcessor, import would fail if fetchMailsSince was not specified. |
---|
| 18 | (Max Lynch via James Dyer) |
---|
| 19 | |
---|
| 20 | Other Changes |
---|
| 21 | ---------------------- |
---|
| 22 | * SOLR-3262: The "threads" feature is removed (deprecated in Solr 3.6) (James Dyer) |
---|
| 23 | |
---|
| 24 | * SOLR-3422: Refactored internal data classes. |
---|
| 25 | All entities in data-config.xml must have a name (James Dyer) |
---|
| 26 | |
---|
| 27 | ================== 3.6.1 ================== |
---|
| 28 | |
---|
| 29 | Bug Fixes |
---|
| 30 | ---------------------- |
---|
| 31 | * SOLR-3336: SolrEntityProcessor substitutes most variables at query time |
---|
| 32 | (Michael Kroh, Lance Norskog, via Martijn van Groningen) |
---|
| 33 | |
---|
| 34 | ================== 3.6.0 ================== |
---|
| 35 | |
---|
| 36 | New Features |
---|
| 37 | ---------------------- |
---|
| 38 | * SOLR-1499: Added SolrEntityProcessor that imports data from another Solr core or instance based on a specified query. |
---|
| 39 | (Lance Norskog, Erik Hatcher, Pulkit Singhal, Ahmet Arslan, Luca Cavanna, Martijn van Groningen) |
---|
| 40 | Additional Work: |
---|
| 41 | SOLR-3190: Minor improvements to SolrEntityProcessor. Add more consistency between solr parameters |
---|
| 42 | and parameters used in SolrEntityProcessor and ability to specify a custom HttpClient instance. |
---|
| 43 | (Luca Cavanna via Martijn van Groningen) |
---|
| 44 | * SOLR-2382: Added pluggable cache support so that any Entity can be made cache-able by adding the "cacheImpl" parameter. |
---|
| 45 | Include "SortedMapBackedCache" to provide in-memory caching (as previously this was the only option when |
---|
| 46 | using CachedSqlEntityProcessor). Users can provide their own implementations of DIHCache for other |
---|
| 47 | caching strategies. Deprecate CachedSqlEntityProcessor in favor of specifying "cacheImpl" with |
---|
| 48 | SqlEntityProcessor. Make SolrWriter implement DIHWriter and allow the possibility of pluggable Writers |
---|
| 49 | (DIH writing to something other than Solr). (James Dyer, Noble Paul) |
---|
| 50 | |
---|
| 51 | Changes in Runtime Behavior |
---|
| 52 | ---------------------- |
---|
| 53 | * SOLR-3142: Imports no longer default optimize to true, instead false. If you want to force all segments to be merged |
---|
| 54 | into one, you can specify this parameter yourself. NOTE: this can be a very expensive operation and usually |
---|
| 55 | does not make sense for delta-imports. (Robert Muir) |
---|
| 56 | |
---|
| 57 | ================== 3.5.0 ================== |
---|
| 58 | |
---|
| 59 | Bug Fixes |
---|
| 60 | ---------------------- |
---|
| 61 | * SOLR-2875: Fix the incorrect url in tika-data-config.xml (Shinichiro Abe via koji) |
---|
| 62 | |
---|
| 63 | ================== 3.4.0 ================== |
---|
| 64 | |
---|
| 65 | Bug Fixes |
---|
| 66 | ---------------------- |
---|
| 67 | * SOLR-2644: When using threads=2 the default logging is set too high (Bill Bell via shalin) |
---|
| 68 | * SOLR-2492: DIH does not commit if only deletes are processed (James Dyer via shalin) |
---|
| 69 | * SOLR-2186: DataImportHandler's multi-threaded option throws NPE (Lance Norskog, Frank Wesemann, shalin) |
---|
| 70 | * SOLR-2655: DIH multi threaded mode does not resolve attributes correctly (Frank Wesemann, shalin) |
---|
| 71 | * SOLR-2695: Documents are collected in unsynchronized list in multi-threaded debug mode (Michael McCandless, shalin) |
---|
| 72 | * SOLR-2668: DIH multithreaded mode does not rollback on errors from EntityProcessor (Frank Wesemann, shalin) |
---|
| 73 | |
---|
| 74 | ================== 3.3.0 ================== |
---|
| 75 | |
---|
| 76 | * SOLR-2551: Check dataimport.properties for write access (if delta-import is supported |
---|
| 77 | in DIH configuration) before starting an import (C S, shalin) |
---|
| 78 | |
---|
| 79 | ================== 3.2.0 ================== |
---|
| 80 | |
---|
| 81 | (No Changes) |
---|
| 82 | |
---|
| 83 | ================== 3.1.0 ================== |
---|
| 84 | Upgrading from Solr 1.4 |
---|
| 85 | ---------------------- |
---|
| 86 | |
---|
| 87 | Versions of Major Components |
---|
| 88 | --------------------- |
---|
| 89 | |
---|
| 90 | Detailed Change List |
---|
| 91 | ---------------------- |
---|
| 92 | |
---|
| 93 | New Features |
---|
| 94 | ---------------------- |
---|
| 95 | |
---|
| 96 | * SOLR-1525 : allow DIH to refer to core properties (noble) |
---|
| 97 | |
---|
| 98 | * SOLR-1547 : TemplateTransformer copies objects more intelligently when the template is a single variable (noble) |
---|
| 99 | |
---|
| 100 | * SOLR-1627 : VariableResolver should be fetched just in time (noble) |
---|
| 101 | |
---|
| 102 | * SOLR-1583 : Create DataSources that return InputStream (noble) |
---|
| 103 | |
---|
| 104 | * SOLR-1358 : Integration of Tika and DataImportHandler (Akshay Ukey, noble) |
---|
| 105 | |
---|
| 106 | * SOLR-1654 : TikaEntityProcessor example added to DIHExample (Akshay Ukey via noble) |
---|
| 107 | |
---|
| 108 | * SOLR-1678 : Move onError handling to DIH framework (noble) |
---|
| 109 | |
---|
| 110 | * SOLR-1352 : Multi-threaded implementation of DIH (noble) |
---|
| 111 | |
---|
| 112 | * SOLR-1721 : Add explicit option to run DataImportHandler in synchronous mode (Alexey Serba via noble) |
---|
| 113 | |
---|
| 114 | * SOLR-1737 : Added FieldStreamDataSource (noble) |
---|
| 115 | |
---|
| 116 | Optimizations |
---|
| 117 | ---------------------- |
---|
| 118 | |
---|
| 119 | * SOLR-2200: Improve the performance of DataImportHandler for large delta-import |
---|
| 120 | updates. (Mark Waddle via rmuir) |
---|
| 121 | |
---|
| 122 | Bug Fixes |
---|
| 123 | ---------------------- |
---|
| 124 | * SOLR-1638: Fixed NullPointerException during import if uniqueKey is not specified |
---|
| 125 | in schema (Akshay Ukey via shalin) |
---|
| 126 | |
---|
| 127 | * SOLR-1639: Fixed misleading error message when dataimport.properties is not writable (shalin) |
---|
| 128 | |
---|
| 129 | * SOLR-1598: Reader used in PlainTextEntityProcessor is not explicitly closed (Sascha Szott via noble) |
---|
| 130 | |
---|
| 131 | * SOLR-1759: $skipDoc was not working correctly (Gian Marco Tagliani via noble) |
---|
| 132 | |
---|
| 133 | * SOLR-1762: DateFormatTransformer does not work correctly with non-default locale dates (tommy chheng via noble) |
---|
| 134 | |
---|
| 135 | * SOLR-1757: DIH multithreading sometimes throws NPE (noble) |
---|
| 136 | |
---|
| 137 | * SOLR-1766: DIH with threads enabled doesn't respond to the abort command (Michael Henson via noble) |
---|
| 138 | |
---|
| 139 | * SOLR-1767: dataimporter.functions.escapeSql() does not escape backslash character (Sean Timm via noble) |
---|
| 140 | |
---|
| 141 | * SOLR-1811: formatDate should use the current NOW value always (Sean Timm via noble) |
---|
| 142 | |
---|
| 143 | * SOLR-1794: Dataimport of CLOB fields fails when getCharacterStream() is |
---|
| 144 | defined in a superclass. (Gunnar Gauslaa Bergem via rmuir) |
---|
| 145 | |
---|
| 146 | * SOLR-2057: DataImportHandler never calls UpdateRequestProcessor.finish() |
---|
| 147 | (Drew Farris via koji) |
---|
| 148 | |
---|
| 149 | * SOLR-1973: Empty fields in XML update messages confuse DataImportHandler. (koji) |
---|
| 150 | |
---|
| 151 | * SOLR-2221: Use StrUtils.parseBool() to get values of boolean options in DIH. |
---|
| 152 | true/on/yes (for TRUE) and false/off/no (for FALSE) can be used for sub-options |
---|
| 153 | (debug, verbose, synchronous, commit, clean, optimize) for full/delta-import commands. (koji) |
---|
| 154 | |
---|
| 155 | * SOLR-2310: getTimeElapsedSince() returns incorrect hour value when the elapsed time is over 60 hours |
---|
| 156 | (tom liu via koji) |
---|
| 157 | |
---|
| 158 | * SOLR-2252: When a child entity in nested entities is rootEntity="true", delta-import doesn't work. |
---|
| 159 | (koji) |
---|
| 160 | |
---|
| 161 | * SOLR-2330: solrconfig.xml files in example-DIH are broken. (Matt Parker, koji) |
---|
| 162 | |
---|
| 163 | * SOLR-1191: resolve DataImportHandler deltaQuery column against pk when pk |
---|
| 164 | has a prefix (e.g. pk="book.id" deltaQuery="select id from ..."). More |
---|
| 165 | useful error reporting when no match found (previously failed with a |
---|
| 166 | NullPointerException in log and no clear user feedback). (gthb via yonik) |
---|
| 167 | |
---|
| 168 | * SOLR-2116: Fix TikaConfig classloader bug in TikaEntityProcessor |
---|
| 169 | (Martijn van Groningen via hossman) |
---|
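The boolean spellings listed under SOLR-2221 can be illustrated with a small sketch. This is not Solr's StrUtils.parseBool(); the class `BoolOptionSketch` is hypothetical, and the handling of case, surrounding whitespace, and unrecognized values is an assumption made for the example.

```java
import java.util.Locale;
import java.util.Optional;

/** Hypothetical helper for illustration only; not Solr's StrUtils. */
final class BoolOptionSketch {

    /**
     * Maps true/on/yes to TRUE and false/off/no to FALSE.
     * Assumption: matching is case-insensitive and anything else yields empty.
     */
    static Optional<Boolean> parse(String raw) {
        if (raw == null) {
            return Optional.empty();
        }
        switch (raw.trim().toLowerCase(Locale.ROOT)) {
            case "true":
            case "on":
            case "yes":
                return Optional.of(Boolean.TRUE);
            case "false":
            case "off":
            case "no":
                return Optional.of(Boolean.FALSE);
            default:
                return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Values as they might arrive for sub-options such as clean or optimize.
        for (String value : new String[] {"on", "No", "maybe"}) {
            System.out.println(value + " -> " + parse(value));
        }
    }
}
```

Returning an empty Optional for unknown input leaves the caller free to fall back to the option's default, which is one possible design rather than the documented behavior.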
| 170 | |
---|
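The lookup described under SOLR-1191 (pk="book.id" with a deltaQuery that returns a column named id) might be approximated as follows. The class `PkLookupSketch`, its method, and the error message are hypothetical and only show the idea of retrying with the unprefixed column name and failing with a clear message instead of a NullPointerException.

```java
import java.util.Collections;
import java.util.Map;

/** Hypothetical illustration of resolving a prefixed pk against a delta row. */
final class PkLookupSketch {

    /** Looks up the pk column in the row, retrying without the "table." prefix. */
    static Object resolve(Map<String, Object> row, String pk) {
        if (row.containsKey(pk)) {
            return row.get(pk);
        }
        int dot = pk.lastIndexOf('.');
        if (dot >= 0) {
            String bare = pk.substring(dot + 1);
            if (row.containsKey(bare)) {
                return row.get(bare);
            }
        }
        // A descriptive failure instead of a later NullPointerException.
        throw new IllegalArgumentException(
            "deltaQuery returned no column matching pk='" + pk + "'");
    }

    public static void main(String[] args) {
        Map<String, Object> row = Collections.<String, Object>singletonMap("id", 42);
        System.out.println(resolve(row, "book.id"));
    }
}
```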
| 171 | |
---|
| 172 | Other Changes |
---|
| 173 | ---------------------- |
---|
| 174 | |
---|
| 175 | * SOLR-1821: Fix TimeZone-dependent test failure in TestEvaluatorBag. |
---|
| 176 | (Chris Male via rmuir) |
---|
| 177 | |
---|
| 178 | * SOLR-2367: Reduced noise in test output by ensuring the properties file can be written. |
---|
| 179 | (Gunnlaugur Thor Briem via rmuir) |
---|
| 180 | |
---|
| 181 | |
---|
| 182 | Build |
---|
| 183 | ---------------------- |
---|
| 184 | |
---|
| 185 | |
---|
| 186 | Documentation |
---|
| 187 | ---------------------- |
---|
| 188 | |
---|
| 189 | ================== Release 1.4.0 ================== |
---|
| 190 | |
---|
| 191 | Upgrading from Solr 1.3 |
---|
| 192 | ----------------------- |
---|
| 193 | |
---|
| 194 | Evaluator API has been changed in a non back-compatible way. Users who have developed custom Evaluators will need |
---|
| 195 | to change their code according to the new API for it to work. See SOLR-996 for details. |
---|
| 196 | |
---|
| 197 | The formatDate evaluator's syntax has been changed. The new syntax is formatDate(<variable>, '<format_string>'). |
---|
| 198 | For example, formatDate(x.date, 'yyyy-MM-dd'). In the old syntax, the date string was written without single quotes. |
---|
| 199 | The old syntax has been deprecated and will be removed in 1.5; until then, using the old syntax will log a warning. |
---|
| 200 | |
---|
| 201 | The Context API has been changed in a non back-compatible way. In particular, the Context.currentProcess() method |
---|
| 202 | now returns a String describing the type of the current import process instead of an int. Similarly, the public |
---|
| 203 | constants in Context viz. FULL_DUMP, DELTA_DUMP and FIND_DELTA are changed to a String type. See SOLR-969 for details. |
---|
| 204 | |
---|
| 205 | The EntityProcessor API has been simplified by moving logic for applying transformers and handling multi-row outputs |
---|
| 206 | from Transformers into an EntityProcessorWrapper class. The EntityProcessor#destroy is now called once per |
---|
| 207 | parent-row at the end of row (end of data). A new method EntityProcessor#close is added which is called at the end |
---|
| 208 | of import. |
---|
| 209 | |
---|
| 210 | In Solr 1.3, if the last_index_time was not available (first import) and a delta-import was requested, a full-import |
---|
| 211 | was run instead. This is no longer the case. In Solr 1.4 delta import is run with last_index_time as the epoch |
---|
| 212 | date (January 1, 1970, 00:00:00 GMT) if last_index_time is not available. |
---|
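The epoch fallback described above can be pictured with a short sketch. The class `LastIndexTimeSketch` and its method are hypothetical and are not part of DataImportHandler; they only show the rule that a missing last_index_time is treated as January 1, 1970, 00:00:00 GMT.

```java
import java.time.Instant;
import java.util.Optional;

/** Hypothetical illustration of the Solr 1.4 delta-import fallback rule. */
final class LastIndexTimeSketch {

    /** Returns the stored last_index_time, or the epoch when none was recorded. */
    static Instant effectiveLastIndexTime(Optional<Instant> stored) {
        return stored.orElse(Instant.EPOCH);
    }

    public static void main(String[] args) {
        // First import: nothing stored yet, so the delta runs against the epoch.
        System.out.println(effectiveLastIndexTime(Optional.empty()));
        // Later imports: the stored timestamp is used unchanged.
        System.out.println(effectiveLastIndexTime(Optional.of(Instant.parse("2009-10-01T12:00:00Z"))));
    }
}
```

With the epoch as the lower bound, a delta query that filters on a modification timestamp would select every row, so the first delta-import touches roughly the same data as a full-import but without the full-import's cleanup step.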
| 213 | |
---|
| 214 | Detailed Change List |
---|
| 215 | ---------------------- |
---|
| 216 | |
---|
| 217 | New Features |
---|
| 218 | ---------------------- |
---|
| 219 | 1. SOLR-768: Set last_index_time variable in full-import command. |
---|
| 220 | (Wojtek Piaseczny, Noble Paul via shalin) |
---|
| 221 | |
---|
| 222 | 2. SOLR-811: Allow a "deltaImportQuery" attribute in SqlEntityProcessor which is used for delta imports |
---|
| 223 | instead of DataImportHandler manipulating the SQL itself. |
---|
| 224 | (Noble Paul via shalin) |
---|
| 225 | |
---|
| 226 | 3. SOLR-842: Better error handling in DataImportHandler with options to abort, skip and continue imports. |
---|
| 227 | (Noble Paul, shalin) |
---|
| 228 | |
---|
| 229 | 4. SOLR-833: A DataSource to read data from a field as a reader. This can be used, for example, to read XMLs |
---|
| 230 | residing as CLOBs or BLOBs in databases. |
---|
| 231 | (Noble Paul via shalin) |
---|
| 232 | |
---|
| 233 | 5. SOLR-887: A Transformer to strip HTML tags. |
---|
| 234 | (Ahmed Hammad via shalin) |
---|
| 235 | |
---|
| 236 | 6. SOLR-886: DataImportHandler should rollback when an import fails or it is aborted |
---|
| 237 | (shalin) |
---|
| 238 | |
---|
| 239 | 7. SOLR-891: A Transformer to read strings from Clob type. |
---|
| 240 | (Noble Paul via shalin) |
---|
| 241 | |
---|
| 242 | 8. SOLR-812: Configurable JDBC settings in JdbcDataSource including optimized defaults for read only mode. |
---|
| 243 | (David Smiley, Glen Newton, shalin) |
---|
| 244 | |
---|
| 245 | 9. SOLR-910: Add a few utility commands to the DIH admin page such as full import, delta import, status, reload config. |
---|
| 246 | (Ahmed Hammad via shalin) |
---|
| 247 | |
---|
| 248 | 10.SOLR-938: Add event listener API for import start and end. |
---|
| 249 | (Kay Kay, Noble Paul via shalin) |
---|
| 250 | |
---|
| 251 | 11.SOLR-801: Add support for configurable pre-import and post-import delete query per root-entity. |
---|
| 252 | (Noble Paul via shalin) |
---|
| 253 | |
---|
| 254 | 12.SOLR-988: Add a new scope for session data stored in Context to store objects across imports. |
---|
| 255 | (Noble Paul via shalin) |
---|
| 256 | |
---|
| 257 | 13.SOLR-980: A PlainTextEntityProcessor which can read from any DataSource<Reader> and output a String. |
---|
| 258 | (Nathan Adams, Noble Paul via shalin) |
---|
| 259 | |
---|
| 260 | 14.SOLR-1003: XPathEntityProcessor must allow slurping all text from a given xml node and its children. |
---|
| 261 | (Noble Paul via shalin) |
---|
| 262 | |
---|
| 263 | 15.SOLR-1001: Allow variables in various attributes of RegexTransformer, HTMLStripTransformer |
---|
| 264 | and NumberFormatTransformer. |
---|
| 265 | (Fergus McMenemie, Noble Paul, shalin) |
---|
| 266 | |
---|
| 267 | 16.SOLR-989: Expose running statistics from the Context API. |
---|
| 268 | (Noble Paul, shalin) |
---|
| 269 | |
---|
| 270 | 17.SOLR-996: Expose Context to Evaluators. |
---|
| 271 | (Noble Paul, shalin) |
---|
| 272 | |
---|
| 273 | 18.SOLR-783: Enhance delta-imports by maintaining separate last_index_time for each entity. |
---|
| 274 | (Jon Baer, Noble Paul via shalin) |
---|
| 275 | |
---|
| 276 | 19.SOLR-1033: Current entity's namespace is made available to all Transformers. This allows one to use an output field |
---|
| 277 | of TemplateTransformer in other transformers, among other things. |
---|
| 278 | (Fergus McMenemie, Noble Paul via shalin) |
---|
| 279 | |
---|
| 280 | 20.SOLR-1066: New methods in Context to expose Script details. ScriptTransformer changed to read scripts |
---|
| 281 | through the new API methods. |
---|
| 282 | (Noble Paul via shalin) |
---|
| 283 | |
---|
| 284 | 21.SOLR-1062: A LogTransformer which can log data in a given template format. |
---|
| 285 | (Jon Baer, Noble Paul via shalin) |
---|
| 286 | |
---|
| 287 | 22.SOLR-1065: A ContentStreamDataSource which can accept HTTP POST data in a content stream. This can be used to |
---|
| 288 | push data to Solr instead of just pulling it from DB/Files/URLs. |
---|
| 289 | (Noble Paul via shalin) |
---|
| 290 | |
---|
| 291 | 23.SOLR-1061: Improve RegexTransformer to create multiple columns from regex groups. |
---|
| 292 | (Noble Paul via shalin) |
---|
| 293 | |
---|
| 294 | 24.SOLR-1059: Special flags introduced for deleting documents by query or id, skipping rows and stopping further |
---|
| 295 | transforms. Use $deleteDocById, $deleteDocByQuery for deleting by id and query respectively. |
---|
| 296 | Use $skipRow to skip the current row but continue with the document. Use $stopTransform to stop |
---|
| 297 | further transformers. New methods are introduced in Context for deleting by id and query. |
---|
| 298 | (Noble Paul, Fergus McMenemie, shalin) |
---|
| 299 | |
---|
| 300 | 25.SOLR-1076: JdbcDataSource should resolve variables in all its configuration parameters. |
---|
| 301 | (shalin) |
---|
| 302 | |
---|
| 303 | 26.SOLR-1055: Make DIH JdbcDataSource easily extensible by making the createConnectionFactory method protected and |
---|
| 304 | return a Callable<Connection> object. |
---|
| 305 | (Noble Paul, shalin) |
---|
| 306 | |
---|
| 307 | 27.SOLR-1058: JdbcDataSource can lookup javax.sql.DataSource using JNDI. Use a jndiName attribute to specify the |
---|
| 308 | location of the data source. |
---|
| 309 | (Jason Shepherd, Noble Paul via shalin) |
---|
| 310 | |
---|
| 311 | 28.SOLR-1083: An Evaluator for escaping query characters. |
---|
| 312 | (Noble Paul, shalin) |
---|
| 313 | |
---|
| 314 | 29.SOLR-934: A MailEntityProcessor to enable indexing mails from POP/IMAP sources into a solr index. |
---|
| 315 | (Preetam Rao, shalin) |
---|
| 316 | |
---|
| 317 | 30.SOLR-1060: A LineEntityProcessor which can stream lines of text from a given file to be indexed directly or |
---|
| 318 | for processing with transformers and child entities. |
---|
| 319 | (Fergus McMenemie, Noble Paul, shalin) |
---|
| 320 | |
---|
| 321 | 31.SOLR-1127: Add support for field name to be templatized. |
---|
| 322 | (Noble Paul, shalin) |
---|
| 323 | |
---|
| 324 | 32.SOLR-1092: Added a new command named 'import' which does not automatically clean the index. This is useful and |
---|
| 325 | more appropriate when one needs to import only some of the entities. |
---|
| 326 | (Noble Paul via shalin) |
---|
| 327 | |
---|
| 328 | 33.SOLR-1153: 'deltaImportQuery' is honored on child entities as well (noble) |
---|
| 329 | |
---|
| 330 | 34.SOLR-1230: Enhanced dataimport.jsp to work with all DataImportHandler request handler configurations, |
---|
| 331 | rather than just a hardcoded /dataimport handler. (ehatcher) |
---|
| 332 | |
---|
| 333 | 35.SOLR-1235: disallow period (.) in entity names (noble) |
---|
| 334 | |
---|
| 335 | 36.SOLR-1234: Multiple DIH does not work because all of them write to dataimport.properties. |
---|
| 336 | Use the handler name as the properties file name (noble) |
---|
| 337 | |
---|
| 338 | 37.SOLR-1348: Support binary field type in convertType logic in JdbcDataSource (shalin) |
---|
| 339 | |
---|
| 340 | 38.SOLR-1406: Make FileDataSource and FileListEntityProcessor more extensible (Luke Forehand, shalin) |
---|
| 341 | |
---|
| 342 | 39.SOLR-1437 : XPathEntityProcessor can deal with xpath syntaxes such as //tagname , /root//tagname (Fergus McMenemie via noble) |
---|
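Item 23 above (SOLR-1061) mentions creating multiple columns from regex groups. The sketch below shows the general idea with java.util.regex only. `RegexGroupsSketch`, the column names, and the choice to return an empty map when nothing matches are assumptions for illustration, not the RegexTransformer implementation.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical illustration of mapping regex capture groups to named columns. */
final class RegexGroupsSketch {

    /** Puts capture group i+1 under columns.get(i); adds nothing if the regex does not match. */
    static Map<String, String> split(String value, Pattern regex, List<String> columns) {
        Map<String, String> row = new LinkedHashMap<>();
        Matcher m = regex.matcher(value);
        if (!m.find()) {
            return row;
        }
        for (int i = 0; i < columns.size() && i < m.groupCount(); i++) {
            row.put(columns.get(i), m.group(i + 1));
        }
        return row;
    }

    public static void main(String[] args) {
        Pattern name = Pattern.compile("(\\w+),\\s*(\\w+)");
        // Column names "last" and "first" are made up for this example.
        System.out.println(split("Doe, Jane", name, Arrays.asList("last", "first")));
    }
}
```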
| 343 | |
---|
| 344 | Optimizations |
---|
| 345 | ---------------------- |
---|
| 346 | 1. SOLR-846: Reduce memory consumption during delta import by removing keys when used |
---|
| 347 | (Ricky Leung, Noble Paul via shalin) |
---|
| 348 | |
---|
| 349 | 2. SOLR-974: DataImportHandler skips commit if no data has been updated. |
---|
| 350 | (Wojtek Piaseczny, shalin) |
---|
| 351 | |
---|
| 352 | 3. SOLR-1004: Check for abort more frequently during delta-imports. |
---|
| 353 | (Marc Sturlese, shalin) |
---|
| 354 | |
---|
| 355 | 4. SOLR-1098: DateFormatTransformer can cache the format objects. |
---|
| 356 | (Noble Paul via shalin) |
---|
| 357 | |
---|
| 358 | 5. SOLR-1465: Replaced string concatenations with StringBuilder append calls in XPathRecordReader. |
---|
| 359 | (Mark Miller, shalin) |
---|
| 360 | |
---|
| 361 | |
---|
| 362 | Bug Fixes |
---|
| 363 | ---------------------- |
---|
| 364 | 1. SOLR-800: Deep copy collections to avoid ConcurrentModificationException in XPathEntityProcessor while streaming |
---|
| 365 | (Kyle Morrison, Noble Paul via shalin) |
---|
| 366 | |
---|
| 367 | 2. SOLR-823: Request parameter variables ${dataimporter.request.xxx} are not resolved |
---|
| 368 | (Mck SembWever, Noble Paul, shalin) |
---|
| 369 | |
---|
| 370 | 3. SOLR-728: Add synchronization to avoid race condition of multiple imports working concurrently |
---|
| 371 | (Walter Ferrara, shalin) |
---|
| 372 | |
---|
| 373 | 4. SOLR-742: Add ability to create dynamic fields with custom DataImportHandler transformers |
---|
| 374 | (Wojtek Piaseczny, Noble Paul, shalin) |
---|
| 375 | |
---|
| 376 | 5. SOLR-832: Rows parameter is not honored in non-debug mode and can abort a running import in debug mode. |
---|
| 377 | (Akshay Ukey, shalin) |
---|
| 378 | |
---|
| 379 | 6. SOLR-838: The VariableResolver obtained from a DataSource's context does not have current data. |
---|
| 380 | (Noble Paul via shalin) |
---|
| 381 | |
---|
| 382 | 7. SOLR-864: DataImportHandler does not catch and log Errors (shalin) |
---|
| 383 | |
---|
| 384 | 8. SOLR-873: Fix case-sensitive field names and columns (Jon Baer, shalin) |
---|
| 385 | |
---|
| 386 | 9. SOLR-893: Unable to delete documents via SQL and deletedPkQuery with deltaimport |
---|
| 387 | (Dan Rosher via shalin) |
---|
| 388 | |
---|
| 389 | 10. SOLR-888: DateFormatTransformer cannot convert non-string type |
---|
| 390 | (Amit Nithian via shalin) |
---|
| 391 | |
---|
| 392 | 11. SOLR-841: DataImportHandler should throw exception if a field does not have column attribute |
---|
| 393 | (Michael Henson, shalin) |
---|
| 394 | |
---|
| 395 | 12. SOLR-884: CachedSqlEntityProcessor should check if the cache key is present in the query results |
---|
| 396 | (Noble Paul via shalin) |
---|
| 397 | |
---|
| 398 | 13. SOLR-985: Fix thread-safety issue with TemplateString for concurrent imports with multiple cores. |
---|
| 399 | (Ryuuichi Kumai via shalin) |
---|
| 400 | |
---|
| 401 | 14. SOLR-999: XPathRecordReader fails on XMLs with nodes mixed with CDATA content. |
---|
| 402 | (Fergus McMenemie, Noble Paul via shalin) |
---|
| 403 | |
---|
| 404 | 15.SOLR-1000: FileListEntityProcessor should not apply fileName filter to directory names. |
---|
| 405 | (Fergus McMenemie via shalin) |
---|
| 406 | |
---|
| 407 | 16.SOLR-1009: Repeated column names result in duplicate values. |
---|
| 408 | (Fergus McMenemie, Noble Paul via shalin) |
---|
| 409 | |
---|
| 410 | 17.SOLR-1017: Fix thread-safety issue with last_index_time for concurrent imports in multiple cores due to unsafe usage |
---|
| 411 | of SimpleDateFormat by multiple threads. |
---|
| 412 | (Ryuuichi Kumai via shalin) |
---|
| 413 | |
---|
| 414 | 18.SOLR-1024: Calling abort on DataImportHandler import commits data instead of calling rollback. |
---|
| 415 | (shalin) |
---|
| 416 | |
---|
| 417 | 19.SOLR-1037: DIH should not add null values in a row returned by EntityProcessor to documents. |
---|
| 418 | (shalin) |
---|
| 419 | |
---|
| 420 | 20.SOLR-1040: XPathEntityProcessor fails with an xpath like /feed/entry/link[@type='text/html']/@href |
---|
| 421 | (Noble Paul via shalin) |
---|
| 422 | |
---|
| 423 | 21.SOLR-1042: Fix memory leak in DIH by making TemplateString non-static member in VariableResolverImpl |
---|
| 424 | (Ryuuichi Kumai via shalin) |
---|
| 425 | |
---|
| 426 | 22.SOLR-1053: IndexOutOfBoundsException in SolrWriter.getResourceAsString when size of data-config.xml is a |
---|
| 427 | multiple of 1024 bytes. |
---|
| 428 | (Herb Jiang via shalin) |
---|
| 429 | |
---|
| 430 | 23.SOLR-1077: IndexOutOfBoundsException with useSolrAddSchema in XPathEntityProcessor. |
---|
| 431 | (Sam Keen, Noble Paul via shalin) |
---|
| 432 | |
---|
| 433 | 24.SOLR-1080: RegexTransformer should not replace if regex is not matched. |
---|
| 434 | (Noble Paul, Fergus McMenemie via shalin) |
---|
| 435 | |
---|
| 436 | 25.SOLR-1090: DataImportHandler should load the data-config.xml using UTF-8 encoding. |
---|
| 437 | (Rui Pereira, shalin) |
---|
| 438 | |
---|
| 439 | 26.SOLR-1146: ConcurrentModificationException in DataImporter.getStatusMessages |
---|
| 440 | (Walter Ferrara, Noble Paul via shalin) |
---|
| 441 | |
---|
| 442 | 27.SOLR-1229: Fixes for deletedPkQuery, particularly when using transformed Solr unique id's |
---|
| 443 | (Lance Norskog, Noble Paul via ehatcher) |
---|
| 444 | |
---|
| 445 | 28.SOLR-1286: Fix the commit parameter always defaulting to "true" even if "false" is explicitly passed in. |
---|
| 446 | (Jay Hill, Noble Paul via ehatcher) |
---|
| 447 | |
---|
| 448 | 29.SOLR-1323: Reset XPathEntityProcessor's $hasMore/$nextUrl when fetching next URL (noble, ehatcher) |
---|
| 449 | |
---|
| 450 | 30.SOLR-1450: Jdbc connection properties such as batchSize are not applied if the driver jar is placed |
---|
| 451 | in solr_home/lib. |
---|
| 452 | (Steve Sun via shalin) |
---|
| 453 | |
---|
| 454 | 31.SOLR-1474: Delta-import should run even if last_index_time is not set. |
---|
| 455 | (shalin) |
---|
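Item 22 above (SOLR-1053) concerns a failure that appeared only when data-config.xml was an exact multiple of 1024 bytes long. One plausible way such a boundary bug arises is a read loop that stops only on a short read, so an input that exactly fills the buffer leads to one more read that returns -1. The sketch below is not the code from SolrWriter; `ReadAllSketch` is a hypothetical helper that tests for end of stream explicitly and decodes the bytes as UTF-8, in the spirit of item 25.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical illustration of a buffer-size-safe read loop. */
final class ReadAllSketch {

    /** Reads the whole stream in 1024-byte chunks and decodes it as UTF-8. */
    static String readAll(InputStream in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        try {
            int n;
            // read() returns -1 at end of stream; never pass that value to write().
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] exact = new byte[2048]; // an exact multiple of the buffer size
        Arrays.fill(exact, (byte) 'x');
        System.out.println(readAll(new ByteArrayInputStream(exact)).length());
    }
}
```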
| 456 | |
---|
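Item 17 above (SOLR-1017) attributes a thread-safety problem to SimpleDateFormat being shared by multiple threads. One common remedy, shown here as a sketch and not necessarily the change made in DataImportHandler, is to give each thread its own formatter. The class name and the date pattern are assumptions for illustration.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

/** Hypothetical illustration of per-thread SimpleDateFormat instances. */
final class ThreadSafeDateFormatSketch {

    // SimpleDateFormat is not safe for concurrent use, so each thread gets its own copy.
    // The pattern is an assumption for the example.
    private static final ThreadLocal<SimpleDateFormat> FORMAT =
        ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.ROOT));

    /** Formats the date with the calling thread's private formatter. */
    static String format(Date date) {
        return FORMAT.get().format(date);
    }

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> System.out.println(
            Thread.currentThread().getName() + " " + format(new Date()));
        Thread a = new Thread(task, "import-core-a");
        Thread b = new Thread(task, "import-core-b");
        a.start();
        b.start();
        a.join();
        b.join();
    }
}
```

On current Java versions, the immutable java.time.format.DateTimeFormatter is another option that avoids the shared-state problem altogether.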
| 457 | |
---|
| 458 | Documentation |
---|
| 459 | ---------------------- |
---|
| 460 | 1. SOLR-1369: Add HSQLDB Jar to example-DIH, unzip database and update instructions. |
---|
| 461 | |
---|
| 462 | Other |
---|
| 463 | ---------------------- |
---|
| 464 | 1. SOLR-782: Refactored SolrWriter to make it a concrete class and removed wrappers over SolrInputDocument. |
---|
| 465 | Refactored to load Evaluators lazily. Removed multiple document nodes in the configuration xml. |
---|
| 466 | Removed support for 'default' variables, they are automatically available as request parameters. |
---|
| 467 | (Noble Paul via shalin) |
---|
| 468 | |
---|
| 469 | 2. SOLR-964: XPathEntityProcessor now ignores DTD validations |
---|
| 470 | (Fergus McMenemie, Noble Paul via shalin) |
---|
| 471 | |
---|
| 472 | 3. SOLR-1029: Standardize Evaluator parameter parsing and added helper functions for parsing all evaluator |
---|
| 473 | parameters in a standard way. |
---|
| 474 | (Noble Paul, shalin) |
---|
| 475 | |
---|
| 476 | 4. SOLR-1081: Change EventListener to be an interface so that components such as an EntityProcessor or a Transformer |
---|
| 477 | can act as an event listener. |
---|
| 478 | (Noble Paul, shalin) |
---|
| 479 | |
---|
| 480 | 5. SOLR-1027: Alias the 'dataimporter' namespace to a shorter name 'dih'. |
---|
| 481 | (Noble Paul via shalin) |
---|
| 482 | |
---|
| 483 | 6. SOLR-1084: Better error reporting when entity name is a reserved word and data-config.xml root node |
---|
| 484 | is not <dataConfig>. |
---|
| 485 | (Noble Paul via shalin) |
---|
| 486 | |
---|
| 487 | 7. SOLR-1087: Deprecate 'where' attribute in CachedSqlEntityProcessor in favor of cacheKey and cacheLookup. |
---|
| 488 | (Noble Paul via shalin) |
---|
| 489 | |
---|
| 490 | 8. SOLR-969: Change the FULL_DUMP, DELTA_DUMP, FIND_DELTA constants in Context to String. |
---|
| 491 | Change Context.currentProcess() to return a string instead of an integer. |
---|
| 492 | (Kay Kay, Noble Paul, shalin) |
---|
| 493 | |
---|
| 494 | 9. SOLR-1120: Simplified EntityProcessor API by moving logic for applying transformers and handling multi-row outputs |
---|
| 495 | from Transformers into an EntityProcessorWrapper class. The behavior of the method |
---|
| 496 | EntityProcessor#destroy has been modified to be called once per parent-row at the end of row. A new |
---|
| 497 | method EntityProcessor#close is added which is called at the end of import. A new method |
---|
| 498 | Context#getResolvedEntityAttribute is added which returns the resolved value of an entity's attribute. |
---|
| 499 | Introduced a DocWrapper which takes care of maintaining document level session variables. |
---|
| 500 | (Noble Paul, shalin) |
---|
| 501 | |
---|
| 502 | 10.SOLR-1265: Add variable resolving for URLDataSource properties like baseUrl. (Chris Eldredge via ehatcher) |
---|
| 503 | |
---|
| 504 | 11.SOLR-1269: Better error messages from JdbcDataSource when JDBC Driver name or SQL is incorrect. |
---|
| 505 | (ehatcher, shalin) |
---|
| 506 | |
---|
| 507 | ================== Release 1.3.0 ================== |
---|
| 508 | |
---|
| 509 | Status |
---|
| 510 | ------ |
---|
| 511 | This is the first release since DataImportHandler was added to the Solr contrib distribution. |
---|
| 512 | The following list covers changes since the code was introduced, not since |
---|
| 513 | the first official release. |
---|
| 514 | |
---|
| 515 | |
---|
| 516 | Detailed Change List |
---|
| 517 | -------------------- |
---|
| 518 | |
---|
| 519 | New Features |
---|
| 520 | 1. SOLR-700: Allow configurable locales through a locale attribute in fields for NumberFormatTransformer. |
---|
| 521 | (Stefan Oestreicher, shalin) |
---|
| 522 | |
---|
| 523 | Changes in runtime behavior |
---|
| 524 | |
---|
| 525 | Bug Fixes |
---|
| 526 | 1. SOLR-704: NumberFormatTransformer can silently ignore part of the string while parsing. Now it tries to |
---|
| 527 | use the complete string for parsing. Failure to do so will result in an exception. |
---|
| 528 | (Stefan Oestreicher via shalin) |
---|
| 529 | |
---|
| 530 | 2. SOLR-729: Context.getDataSource(String) gives current entity's DataSource instance regardless of argument. |
---|
| 531 | (Noble Paul, shalin) |
---|
| 532 | |
---|
| 533 | 3. SOLR-726: Jdbc Drivers and DataSources fail to load if placed in multicore sharedLib or core's lib directory. |
---|
| 534 | (Walter Ferrara, Noble Paul, shalin) |
---|
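The behavior described in item 1 above (SOLR-704), where part of the string was silently ignored during parsing, is a general property of java.text.NumberFormat: parse(String) stops at the first character it cannot consume. A minimal sketch of whole-string parsing follows. `StrictNumberParseSketch` is a hypothetical helper, not the NumberFormatTransformer code, and the exception type is an assumption.

```java
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Locale;

/** Hypothetical illustration of rejecting partially parsed numbers. */
final class StrictNumberParseSketch {

    /** Parses text as a number in the given locale and fails unless all of it was consumed. */
    static Number parseStrict(String text, Locale locale) {
        NumberFormat format = NumberFormat.getNumberInstance(locale);
        ParsePosition position = new ParsePosition(0);
        Number parsed = format.parse(text, position);
        // parse() returns null on failure and otherwise stops at the first unparseable character.
        if (parsed == null || position.getIndex() != text.length()) {
            throw new IllegalArgumentException("Cannot parse whole input as a number: " + text);
        }
        return parsed;
    }

    public static void main(String[] args) {
        System.out.println(parseStrict("1,234", Locale.US));
        try {
            parseStrict("12abc", Locale.US);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Passing the locale explicitly mirrors the locale attribute mentioned under SOLR-700, since grouping and decimal separators differ between locales.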
| 535 | |
---|
| 536 | Other Changes |
---|
| 537 | |
---|
| 538 | |
---|