Solr 1.4 Enterprise Search Server- P8

Shared by: Thanh Cong | Date: | File type: PDF | Pages: 18


B

batchSize 78
bf parameter 117
Blacklight Online Public Access Catalog. See Blacklight OPAC, Ruby On Rails integrations
Blacklight OPAC, Ruby On Rails integrations
    about 263
    data, indexing 263-267
Boolean operators
    AND 100
    AND operator, combining with OR operator 101
    AND or && operator 101
    NOT 100
    NOT operator 101
    OR 100
    OR or || operator 101
bool element 92
boost functions
    boosting 137, 138
    r_event_date_earliest field 138
boosting 70, 107
boost queries
    boosting 134-137
bq parameter(s) 134
bucketFirstLetter 148
buildOnCommit 174
buildOnCommit, spellchecker option 174
buildOnOptimize, spellchecker option 174

C

caches
    tuning 281
CapitalizationFilterFactory filter 63
CCK 252
Chainsaw
    URL 204
characterEncoding, FileBasedSpellChecker option 175
CharFilterFactory 62
CI 128
classname 173
CM 197
CMS 250
Co-ordination Factor. See coord
collapse.facet, field collapsing 192
collapse.field, field collapsing 192
collapse.info.doc, field collapsing 193
collapse.maxdocs, field collapsing 193
collapse.threshold, field collapsing 193
collapse.type, field collapsing 192
combined index 32
CommonsHttpSolrServer 235
complex systems, tuning
    about 271
    CPU usage 272
    memory usage 272
    scale deep 273
    scale high 273
    scale wide 273
    system changes 272
components
    about 111, 159
    solrconfig.xml 159
compressed, field option 41
configuration files, Solr
    tag 25
    solrconfig.xml file 25
    standard request handler 26
Configuration Management. See CM
ConsoleHandler 204
Content Construction Kit 252
Content Management System. See CMS
Continuous Integration. See CI
coord 112
copyField directive
    about 46
    uses 46
CoreDescriptor classes 231
core, managing 209, 210
count, Stats component 189
CPU usage 272
cron 289
CSV, sending to Solr
    about 72
    configuration options 73, 74
curl
    using, to interact with Solr 66, 68

D

data, indexing
    stream.body parameter 67
    stream.file parameter 67
    stream.url parameter 67
    through HTTP POST 67
    ways 67
database
    and Lucene search index, differences 9, 10
DataImportHandler. See DIH
dataSource attribute 78
date element 93
date facet, parameters
    facet.date 151
    facet.date.end 151
    facet.date.gap 151
    facet.date.hardend 151
    facet.date.other 152
    facet.date.start 151
dates, Faceting 146
debugQuery, diagnostic parameter
    about 98
    explainOther 98
defaults 111
defaultSearchField, schema.xml settings 47
defType, query parameter 95
defType parameter 128
deleteById() 232
deleteByQuery() 232
denormalizing
    one to many associated data 36, 37
    one to one associated data 36
deployment process, Solr 197, 198
df, query parameter 95
diagnostic query parameters
    debugQuery 98
    echoHandler 98
    echoParams 98
    indent 98
dictionary
    about 169
    building, from source 176, 177
DIH
    about 74, 236
    capabilities 74
    dataSource attribute 78
    development console 76, 77
    documents, entities 78
    entity 78
    getting started 75
    mb-dih-artists-jdbc.xml file 75, 76
    query attribute 78
    reference document, URL 74
    Solr, registering with 75
    solrconfig.xml 75
DIH, development console
    DataSources, JdbcDataSource type 77, 78
    DIH control form 77
    documents, entities 79
    fields 79
    importing with 80
DIH, transformers
    dateTimeFormat attributes 79
    splitBy attributes 79
    template attributes 79
DIH fields
    column attribute 79
    name attribute 79
directory structure, Solr
    build 13
    client 13
    dist 13
    example 14
    example/etc 14
    example/multicore 14
    example/solr 14
    example/webapps 14
    lib 14
    site 14
    src 14
    src/java 14
    src/scripts 14
    src/solrj 14
    src/test 14
    src/webapp 14
Disjunction-Max. See dismax
DisjunctionMaxQuery
    about 130
    boosts, configuring 131
    queried fields, configuring 131
dismax 113
dismax handler. See Dismax Solr request handler
dismax query handler 131
dismax request handler 128
Dismax Solr request handler
    about 128
    automatic phrase boosting 132, 133
    boost functions, boosting 137, 138
    boost queries, boosting 134-137
    debugQuery option used 129
    default search 140, 141
    DisjunctionMaxQuery 130
    features, over standard handler 129
    limited query syntax 131
    min-should-match 138
    mm query parameter 138
    phrase slop, configuring 134
distanceMeasure, spellchecker option 174
distributed search 32
div(x,y), mathematical primitives 121
doc element 93
docText field data 233
document
    deleting 70
documentCache 281
Domain Specific Language. See DSL
double element 92
DoubleMetaphone, phonetic encoding algorithms 58
DoubleMetaphoneFilterFactory analysis filter, options
    inject 59
    maxCodeLength 59
Drupal, options
    Apache Solr Search integration module 251
    Solr, hosted by Acquia 252
DSL 269
dynamic fields
    * fallback 46
    about 45

E

echoHandler, diagnostic parameter 98
echoParams 152
echoParams, diagnostic parameter 98
EdgeNGram analyzer 61
EdgeNGramFilterFactory 61
EdgeNGramTokenizerFactory 61
Elasticfox 276
Embedded-Solr 65
embedded Solr
    legacy Lucene, upgrading from 237
    using for rich clients 237
    using in in-process streaming 236, 237
EmbeddedSolrServer class 224
encoder attribute 59
EnglishPorterFilterFactory, stemming 54
Entity tags 279
ETag 279
ETL 78
eval() function 238
existence (and non-existence) queries 107
explicit mapping 56
Extract Transform and Load. See ETL
extraParams entry 242

F

facet 146
facet.date 151, 286
    examples 151
facet.date.end 151
facet.date.gap 151
facet.date.hardend 151
facet.date.other 152
facet.date.start 151
facet.field 147
facet.limit 147
facet.method 148
facet.mincount 147
facet.missing 148
facet.missing parameter 143
facet.offset 147
facet.prefix 148, 156
facet.query 286
facet.query parameter 152, 153
facet.sort 147
facet_counts 143
faceted navigation 7, 141, 145, 153
faceted search 149, 220, 221
faceting
    about 141
    alphabetic range bucketing (A-C, D-F, and so on) 148, 149
    date facet parameters 151, 152
    dates 146, 149, 150
    example 142, 143
    facet.field 147
    facet.limit 147
    facet.method 148
    facet.mincount 147
    facet.missing 148
    facet.missing parameter 143
    facet.offset 147
    facet.prefix 148
    facet.sort 147
    facet_counts 143
    facet prefixing (term suggest) 156-158
    field, requisites 146
    field values (text) 146
    filters, excluding 153-155
    Local Params 155
    on arbitrary parameters 152, 153
    queries 146
    release types, example 142, 143
    schema changes, MusicBrainz example 144, 145
    text 147
    types 146
faceting, dates
    about 149
    examples 150
Facet prefixing 156
Familiarity
    URL 204
FastLRUCache 280
fetchSize 78
field, attributes
    default (optional) 42
    name 42
    required (optional) 42
    type 42
field, IndexBasedSpellChecker option 174
field collapsing, search components
    about 191, 192
    collapse.facet 192
    collapse.field 192
    collapse.info.count 193
    collapse.info.doc 193
    collapse.maxdocs 193
    collapse.threshold 193
    collapse.type 192
    configuring 192, 193
    SOLR-236 191
field definitions, schema.xml file
    attributes 42
    copyField, using 46
    copyField directive, using 46
    default (optional) 42
    dynamic fields 45
    name 42
    required (optional) 42
    schema.xml, settings 47
    sorting 44
    sorting, limitations 44, 45
    type 42
field length. See fieldNorm
field list. See fl
fieldNorm 112
field options, schema.xml file
    compressed 41
    indexed 41
    multiValued 41
    omitNorms (advanced) 41
    positionIncrementGap (advanced) 42
    sortMissingFirst 41
    sortMissingLast 41
    stored 41
    termVectors (advanced) 41
field qualifier 102, 103
field references, function queries 120
fieldType, spellchecker option 174
field types, schema.xml file
    tag 40
    tag 40
    class attribute 40
field values (text), Faceting 146
file, spellchecker 172
FileBasedSpellChecker options
    characterEncoding 175
    sourceLocation 175
FileHandler logging 204
filterCache 280
filter element 50
filtering 108, 109
filters, Faceting
    excluding 153, 155
first-components 111
fl 220
fl, output related parameter 96
float element 92
fq, query parameter 95
function argument
    limitations 120
function queries
    _val_ pseudo-field hack 117
    about 117
    bf parameter 117
    Daydreaming search example 119
    example 118
    field references 120
    function references 120
    incorporating, to searches 117
    t_trm_lookups 118
function query, tips 128
function references
    mathematical primitives 121
function references, function queries 120

G

g, query parameter 95
g.op, query parameter 95
generic XML data structure
    about 92
    appends 111
    arr, XML element 92
    bool element 92
    components 111
    date element 93
    defaults 111
    double element 92
    first-components 111
    float element 92
    int element 92
    invariants 111
    last-components 111
    long element 92
    lst, XML element 92
    str element 92
Git
    URL 11

H

Hadoop 225
HathiTrust 273
Heritrix
    using, to download artist pages 226, 227
highlighted field list. See hl.fl
highlighting component, search components
    about 161
    configuring 163
    example 161, 163
    hl 164
    hl.fl 164
    hl.fragsize 164
    hl.highlightMultiTerm 164
    hl.mergeContiguous 165
    hl.requireFieldMatch 164
    hl.snippets 164
    hl.usePhraseHighlighter 164
    hl alternateField 165
    hl formatter 165
    hl fragmenter 165
    hl maxAnalyzedChars 165
    parameters 164
hl, highlighting component 164
hl.fl 161
hl.fl, highlighting component 164
hl.fragsize, highlighting component 164
hl.highlightMultiTerm, highlighting component 164
hl.increment, regex fragmenter 166
hl.mergeContiguous, highlighting component 165
hl.regex.maxAnalyzedChars, regex fragmenter 166
hl.regex.pattern, regex fragmenter 166
hl.regex.slop, regex fragmenter 166
hl.requireFieldMatch, highlighting component 164
hl.snippets, highlighting component 164
hl.usePhraseHighlighter, highlighting component 164
hl alternateField, highlighting component 165
hl formatter, highlighting component
    about 165
    hl.simple.pre and hl.simple.post 165
hl fragmenter, highlighting component 165
hl maxAlternateFieldLength, highlighting component 165
hl maxAnalyzedChars, highlighting component 165
home directory, Solr
    bin 15
    conf 15
    conf/schema.xml 15
    conf/solrconfig.xml 15
    conf/xslt 15
    data 15
    lib 15
HTML, indexing in Solr 227
HTMLStripStandardTokenizerFactory 52
HTMLStripStandardTokenizerFactory tokenizer 227
HTMLStripWhitespaceTokenizerFactory 52
HTTP caching 277-279
HTTP server request access logs, logging
    about 201, 202
    log directory, creating 201
    Tailing 202

I

IDF 33
idf 112
ID field 44
indent, diagnostic parameter 98
index 31
index-time
    and query-time, boosting 113
    versus query-time 57
index-time boosting 70
IndexBasedSpellChecker options
    field 174
    sourceLocation 174
    thresholdTokenFrequency 175
index data
    document access, controlling 221
    securing 220
indexed, field option 41
indexed, schema design 282
indexes
    sharding 295
indexing strategies
    about 283
    factors, committing 285
    factors, optimizing 285
    unique document checking, disabling 285
Index Searchers 280
Information Retrieval. See IR
int element 92
InternetArchive 226
invariants 111
Inverse Document Frequency. See IDF
inverse reciprocals 125
IR 8
ISOLatin1AccentFilterFactory filter 62
issue tracker, Solr 27

J

J2SE
    with JConsole 212
JARmageddon 205
jarowinkler, spellchecker 172
java.util.logging package 203
Java class names
    abbreviated 40
    org.apache.solr.schema.BoolField 40
Java Development Kit (JDK)
    URL 11
JavaDoc tags 234
Java Management Extensions. See JMX
Java Naming and Directory Interface. See JNDI
Java replication
    versus script 289
JavaScript Object Notation. See JSON
Java Server Pages. See JSPs
JConsole GUI
    about 212
    URL 212
JDK [1.4] logging 203
JDK logging 203
Jetty
    startup integration 205
    web.xml, customizing 218
jetty.xml 201
JIRB tool 215
JMX
    about 212
    access, controlling 220
    information extracting, JRuby used 215
    Solr, starting with 212-215
Jmx4r 217
JMX Console 212
JNDI 16, 200
JNDI name 200
jQuery 240
jQuery Autocomplete widget 241, 242
JRuby
    using, to extract JMX information 215
JRuby Interactive Browser tool. See JIRB tool
JSON 238
JSONP 242
JSON with Padding. See JSONP
JSPs 17
JUL 203
JVM
    configuration 277

K

KeepWordFilterFactory filter 62
KeywordTokenizerFactory 52
KStem, stemming 55

L

last-components 111
LengthFilterFactory 145
LengthFilterFactory filter 62
LetterTokenizerFactory 52
limited query syntax 131
    disabling 132
linear(x,m,c), miscellaneous math 122
Local Params 155
LocalSolr component 194
log(x), mathematical primitives 121
Log4j
    configuring, URL 205
    logging to 204
Log4j JAR file
    URL 204
logarithms 123, 124
Logback
    URL 204
logging
    about 201
    HTTP server request access logs 201, 202
    levels, managing at runtime 205, 206
    Solr application logging 203
    types 201
logging.properties file 204
long element 92
LowerCaseFilterFactory filter 62
LRUCache 280
lst, XML element 92
Lucene
    about 8
    DisjunctionMaxQuery 130
    features 8
    scoring 112
Lucene’s query syntax
    URL 44
LUCENE-1435 45
Lucene search index
    and database, differences 9, 10
Lucene syntax
    query expression 100
    query syntax 99
    sub-expressions 101

M

mailing lists, Solr
    URL 26
Managed Bean. See MBeans
mandatory clause, expression query 100
map() function 243
map(x,min,max,target), miscellaneous math 121
master server
    indexing into 292
mathematical primitives, function references
    abs(x) 121
    div(x,y) 121
    log(x) 121
    pow(x,y) 121
    product(x,y,z,...) 121
    sqrt(x) 121
    sum(x,y,z, ... ) 121
Maven 228
max(x,c), miscellaneous math 121
max, Stats component 189
maxGramSize 60
maxScore 93
maxWarmingSearchers 284
mb-dih-artists-jdbc.xml file 75, 76
mb_attributes.txt
    content 145
MBeans 212
mean, Stats component 189
member_id field 36
memory usage 272
Metaphone, phonetic encoding algorithms 58
min, Stats component 189
min-should-match
    about 138
    basic rules 139
    multiple rules 139
    rules 139
    rules, choosing 140
minGramSize 60
miscellaneous math, function references
    linear(x,m,c) 122
    map(x,min,max,target) 121
    max(x,c) 121
    recip(x,m,a,c) 122
    scale(x,minTarget,maxTarget) 121
missing, Stats component 189
MLT, search components
    as dedicated request handler 182
    as request handler, with external input document 183
    as Solr component 182
    configuration parameters 183
    mlt 183
    mlt.boost 186
    mlt.count 183
    mlt.fl 185
    mlt.maxntp 186
    mlt.maxqt 186
    mlt.maxwl 185
    mlt.mindf 185
    mlt.mintf 185
    mlt.minwl 185
    mlt.qf 185
    parameters 185, 186
    parameters, specific to MLT request handler 184
    results, example 186, 188
    specific parameters 183
    using, ways 182
mlt.boost 186
mlt.fl 185
mlt.maxntp 186
mlt.maxqt 186
mlt.maxwl 185
mlt.mindf 185
mlt.mintf 185
mlt.minwl 185
mlt.qf 185
mm query parameter 138
mm specification formats
    as examples 139
more-like-this search component. See MLT, search components
more like this plugin 9
multi-word synonyms 56
multicore
    need for 210, 211
multiple indices 32
multiple Solr servers
    documents, assigning to shards 296
    indexes, sharding 295
    master server, indexing into 292
    replication, configuring 291
    script versus Java replication 289
    searches, distributing 291
    search queries, distributing across slaves 293, 294
    shards, searching across 297, 298
    slaves, configuring 292, 293
    starting 290, 291
multiValued, field option 41
multiValued field 221
MusicBrainz.org 30, 31

N

n-gramming costs
    Edge n-gramming costs 62
    tokenizer based n-gramming costs 62
N-gramming costs, substring indexing
    a_name field 61
    a_name field + a_ngram field 61
    minGramSize 62
name 173
name attribute 143
name field 33
newSearch query 284
NOT operator 100, 101
numFound 93
Nutch 225
Nutch + Web Archive eXtensions. See NutchWAX
NutchWAX 225

O

OLTP 78
omitNorms (advanced), field option 41
omitNorms, schema design 282
omitTermFreqAndPositions, schema design 282
Online Transaction Processing systems. See OLTP
optional clause, expression query 100
ord() function 120, 122
ord(fieldReference) 122
ord/rord 122
ord and rord, function references
    ord(fieldReference) 122
    rord(fieldReference) 122
OR operator 100
OR or || operator 101
output related parameters, query parameters
    fl 96
    sort 96
    version 98
    wt 97
outputUnigrams controls 288

P

parse
    parameter 243
parse() function 244
partial indexing. See substring indexing
PatternReplaceFilterFactory filter 63
PatternTokenizerFactory 53
pf, tips 134
pf parameter 133
phoneme 58
phonetic encoding algorithms
    DoubleMetaphone 58
    encoder attribute 59
    Metaphone 58
    RefinedSoundex 58
    Soundex 58
PhoneticFilterFactory filter 59
phonetic sounds-like
    about 58
    phonetic encoding algorithms 58
phrase queries 103
phrase search performance
    improving 287
    shingling, solution 287, 288
phrase slop
    configuring 134
Plain Old Java Objects. See POJOs
POJOs
    indexing 234
PorterStemFilterFactory, stemming 54
positionIncrementGap (advanced), field option 42
pow(x,y), mathematical primitives 121
product(x,y,z, ... ), mathematical primitives 121
prohibited clause, expression query 100
PRONOM Unique Identifier. See PUID
public searches
    securing 219, 220
PUID 31

Q

q parameter
    processing 175
qt, miscellaneous parameter 95
QTime 93
queries, Faceting 146
query-time
    and index-time, boosting 113
    versus index-time 57
query-time boosting 70
query attribute 78
query converter 175
query elevation, search components
    about 166
    config-file 167, 168
    configuration parameters 167
    configuring 167
    elevateArtists.xml 168
    forceElevation 168
    queryFieldType 168
query expression, clauses
    mandatory clause 100
    optional clause 100
    prohibited clause 100
query parameters
    about 95
    defType 95
    df 95
    diagnostic 98
    fq 95
    output related parameters 96
    q 95
    q.op 95
    qt 95
    result paging 96
    rows 96
    start 96
query parser plugin 128
QueryResponse object 235
queryResultCache 280
query spell checker
    indexed content based 8, 9
query syntax
    about 99
    boosting 107
    documents, matching 99
    existence (and non-existence) queries 107
    field qualifier 102, 103
    fuzzy queries 105
    phrase queries 103
    query expression, clauses 100
    special characters 108
    sub-expressions 101
    term proximity 103
    wildcard queries 103, 104

R

r_a_name 42
r_attributes 144
r_event_date_earliest field 138
r_name_facetLetter 148
r_official 144
r_type 144
range queries
    [ and ] brackets 106
    { and } brackets 106
    about 105, 106
    date math 106, 107
readOnly 77
recip(x,m,a,c), miscellaneous math 122
reciprocals and rord, with dates 126, 127
RecordItem 234
RefinedSoundex, phonetic encoding algorithms 58
regex fragmenter, options
    hl.increment 166
    hl.regex.pattern 166
    hl.regex.slop 166
    hl regex.maxAnalyzedChars 166
release’s artist’s name. See r_a_name
remote streaming
    about 68, 221
    disabling 69
    enabling 69
remote streaming feature 224
RemoveDuplicatesTokenFilterFactory filter 62
renderResult() method 247
replication
    and sharding, combining 298-300
    configuring 291
requestHandler 207
request handler
    about 110
    configuration, creating 110
    configuring 110
result() function 243, 244
right field type/analysis, using 109
rOfficial 144
rord() 122
rord(fieldReference) 122
rows parameter 96, 242
rsolr
    versus solr-ruby 269
Ruby On Rails integrations
    acts_as_solr 254-259
    acts_as_solr plugin 253
    Blacklight OPAC 263
    Convention over Configuration 253
    display, customizing 267
    fields display, customizing 268, 269
    solr-ruby versus rsolr 269
    solr_data 257

S

scale() function
    example 123
    inverse reciprocals, using 124, 125
    logarithms, using 123, 124
    reciprocals and rord with dates, using 126, 127
scale(x,minTarget,maxTarget), miscellaneous math 121
scale deep 298
scale high 276
scale wide 289
schema, Solr
    tag 25
    tag 25
    tag 25
    primary key 25
    text, field name 25
schema.xml, settings
    defaultSearchField 47
    solrconfig.xml 47
    solrQueryParser 47
    uniqueKey 47
schema.xml file
    tag 40
    tag 40
    field definitions 42, 43
    field options 40
    field types 40
    sample 45
schema design
    about 34
    compressed field option 282
    data, denormalizing 36
    entities returned from search, determining 35
    inclusion of fields used in search results, omitting 38, 39
    indexed 282
    omitNorms 282
    omitTermFreqAndPositions 282
    one to many associated data, denormalizing 36, 37
    one to one associated data, denormalizing 36
    Solr powered search, determining 35
    stored 282
score boosting. See boosting
scoring
    about 112
    co-ordination factor (coord) 112
    factors 112
    field length (fieldNorm) 112
    Inverse Document Frequency (idf) 112
    query-time and index-time, boosting 113
    term frequency (tf) 112
    troubleshooting 113, 114
script
    versus Java replication 289
search, distributing across slaves
    about 291
    master server, indexing into 292
    slaves, configuring 292, 293
search components
    about 161
    field collapsing 191, 192
    highlighting component 161
    MLT (more-like-this) 182
    query elevation 166
    spellcheck 169
    Stats component 189
    terms component 194
    termVector component 194
search engine 161, 223, 237, 266, 272
searcher.num_docs attribute 216
SearchHandler
    per search interface 207
search handler 128
searching 89, 90
server access
    limiting 217, 219
Servlet container
    and Solr, differences 199
    installing in 199
    solr.home property, defining 199
sharding
    and replication, combining 298-300
    documents, assigning 296
    indexes 295, 296
    searching across 297, 298
ShingleFilterFactory 288
shingling 133, 127, 287
Simple Java interface. See SolrJ
Simple Logging Facade for Java package. See SLF4J package
single combined index
    issues 34
    schema.xml snippet, sample 32
    using, issues 33
single Solr server
    optimizing 276
single Solr server, optimizing
    faceting performance, enhancing 286
    HTTP caching 277-279
    indexing strategies 283, 284
    JVM configuration 277
    phrase search performance, improving 287
    schema design considerations 282
    Solr caching 280, 281
    term vectors, using 286, 287
    tuning caches 281
slaves
    configuring 292
    search queries, distributing across slaves 293, 294
SLF4j 20
SLF4J package 203
SnowballPorterFilterFactory, stemming 54
Solr
    about 7, 10
    and Servlet container, differences 199
    building 13
    communicating with 65
    complex systems, tuning 271, 272
    configuration files 25, 26
    cores, managing 209, 210
    CSV, sending to 72
    deploying 17
    deployment process 197, 198
    directory structure 13
    disjunction-max query handler 9
    Faceting 141
    features 8, 9
    filtering 108, 109
    function query, incorporating to searches 117
    generic XML data structure 92
    home directory 15
    interacting with, curl used 66, 68
    issue tracker 27
    local file accessing, example 68
    logging 201
    mailing list 26
    official site, URL 11
    powered artists building, autocomplete widget with jQuery used 240, 241, 242
    powered artists building, autocomplete widget with JSONP used 243
    prerequisites 11
    query parameters 95
    query syntax 99
    remote streaming 68, 69
    request handlers 110
    resources 26
    running 17-19
    sample data, loading 20, 21
    schema 25
    search request handler 128
    securing 217
    simple query, running 22-24
    solr.solr.home, searching for 16
    sorting 109
    spell check plugin 9
    starting 15, 16
    starting, with JMX 212-215
    statistics page 24
    system changes 272
    testing 13
    tools 58
    XML, sending to 69, 70
    XML response format 93
Solr’s DIH DataImportHandler contrib add-on 66
Solr’s Wiki 26
Solr, accessing from PHP applications
    about 247, 248
    Drupal, options 250
    solr-php-client 248-250
Solr, communicating with
    convenient client API 65
    data formats 66
    data streamed remotely 66
    Direct HTTP 65
    Solr’s filesystem 66
Solr, data formats
    rich documents 66
    Solr-binary 66
    Solr-XML 66
Solr, examples
    structure 223
    summary 224
Solr, filters
    CapitalizationFilterFactory 63
    CharFilterFactory 62
    ISOLatin1AccentFilterFactory 62
    KeepWordFilterFactory 62
    LengthFilterFactory 62
    LowerCaseFilterFactory 62
    PatternReplaceFilterFactory 63
    RemoveDuplicatesTokenFilterFactory 62
    StandardFilterFactory 62
    write your own 63
Solr, integrating
    JavaScript used 238, 239
Solr, prerequisites
    Apache ant 11
    Java Development Kit (JDK) 11
    Subversion or Git 11
Solr, securing
    document access, controlling 221
    index data, securing 220
    JMX access, controlling 220
    server access, limiting 217, 219, 220
SOLR-236 191
solr-balancer 294
Solr-binary 66
solr-php-client
    a_member_name array 249
    about 248, 249, 250
    Apache_Solr_Service, configuration 249
solr-ruby
    versus rsolr 269
Solr-XML 66
solr.body feature 68
solr.home property
    defining 199
    JNDI (Java Naming and Directory Interface) 200
    solr.war file 200
solr.setParser(new XMLResponseParser()) 235
solr.solr.home
    searching for 16
solr.TextField 48
Solr 1.3 11
Solr 1.4 11
Solr admin
    Assistance area 20
    example 19
    Make a Query text box 20
    navigation menu 19
Solr application logging, logging 203
    Jetty, startup integration 205
    Log4j, logging to 204
    logging output, configuring 203
    log levels, managing at runtime 205, 206
solrbook-packtpub 273
Solr caching
    autowarmCount 281
    class 281
    configuring 281
    documentCache 281
    filterCache 280
    queryResultCache 280
    size 281
Solr cell
    binary content, extracting 81, 82
    documents, indexing with 81
    karaoke lyrics, extracting 83-85
    richer documents, indexing 85-87
    Solr, configuring 83
Solr cores
    cores, managing 209, 210
    multicore, need for 210, 211
    solr.xml, configuring 208, 209
solrconfig.xml
    elements 159
    about 75
solrconfig.xml, schema.xml settings 47
Solr DIH Wiki page
    URL 79
SolrDocumentList object 235
SolrDocument object 235
Solr home 16
SolrIndexSearch Mbean 214
SolrJ
    about 65, 224
    client API 230-233
    CommonsHttpSolrServer 224
    embedded Solr, need for 235, 236
    EmbeddedSolrServer class 224
    Heritrix using, to download artist pages 226, 227
    HTML, indexing 227-230
    HTMLStripStandardTokenizerFactory tokenizer 227
    POJOs, indexing 234, 235
    stream.file parameter 224
Solr JIRA
    URL 12
SolrJS
    about 245, 246
    addWidget() method 247
    project homepage, URL 245
    SolrJS Manager object 247
    URL 220
Solrmarc 236
SolrQuery object 235
solrQueryParser, schema.xml settings 47
Solr resources
    about 26
    issue tracker 27
    mailing lists 26
    Solr’s Wiki 26
Solr search components
    LocalSolr component 194
    terms component 194
    termVector component 194
sort, output related parameter 97
sorting
    about 44, 109
    limitations 44
    string type 45
    title_sort type 45
sortMissingFirst, field option 41
sortMissingLast, field option 41
Soundex, phonetic encoding algorithms 58
sourceLocation, FileBasedSpellChecker option 175
sourceLocation, IndexBasedSpellChecker option 174
spellcheck 177
spellcheck, search components
    a_spell, spellchecker 172
    a_spellPhrase, spellchecker 172
    about 169
    alternative approach 180, 182
    classname 173
    dictionary, building from source 176
    file, spellchecker 172
    FileBasedSpellChecker options 175
    IndexBasedSpellChecker options 174
    indexed content 169
    jarowinkler, spellchecker 172
    misspelled query, example 178, 180
    name 173
    q parameter, processing 175
    requests, issuing 177, 178
    schema configuration 169-171
    solrconfig.xml, configuration in 171, 172
    Solr configuring, ways 169
    spellcheck.q parameter, processing 176
    spellchecker, index and file based 173
    spellcheckers (dictionaries), configuring 173
    spellcheckIndexDir 173
    text file of words 169
spellcheck.collate 178
spellcheck.count 177
spellcheck.dictionary 177
spellcheck.extendedResults 178
spellcheck.onlyMorePopular 178
spellcheck.q 177
spellcheck.q parameter
    processing 176
spellchecker, index and file based
    accuracy 174
    buildOnCommit 174
    buildOnOptimize 174
    classname 173
    distanceMeasure 174
    fieldType 174
    name 173
    spellcheckIndexDir 173
spellcheckIndexDir 173
spell check plugin 9
Splunk 205
sqrt(x), mathematical primitives 121
Squid
    URL 279
standard component list 160
StandardFilterFactory filter 62
StandardTokenizerFactory 52
start 93
startEmbeddedSolr() 234
start parameter 96
stats, Stats component 189
stats.facet, Stats component 190
stats.field, Stats component 189
Stats component, search components
    about 189
    configuring 189
    count 189
    max 189
    mean 189
    min 189
    missing 189
    statistics, for track durations 190
    stats 189
    stats.facet 190
    stats.field 189
    stddev 189
    sum 189
    sumOfSquares 189
status 93
stddev, Stats component 189
stemming
    about 54
    EnglishPorterFilterFactory 54
    implementations 54
    KStem 55
    PorterStemFilterFactory 54
    SnowballPorterFilterFactory 54
StopFilterFactory 186
    used, for stop words filtering 57
stop words
    filtering, StopFilterFactory used 57
stored, field option 41
stored, schema design 282
stream.body parameter 67
stream.file parameter 67, 224
stream.url parameter 67
StreamingUpdateSolrServer 284
str element 92
string type 45
sub-expressions
    about 101
    prohibited clause, limitations 102
substring indexing
    about 60
    analyzer configuration, n-grams used 60
    EdgeNGramFilterFactory 61
    EdgeNGramTokenizerFactory 61
    n-gramming costs 61
    NGramFilterFactory, configuring with minGramSize of 2 60
    NGramFilterFactory, configuring with minGramSize of 5 60
Subversion
    URL 11
sum(x,y,z, ... ), mathematical primitives 121
sum, Stats component 189
sumOfSquares, Stats component 189
synonyms
    => 56
    about 55
    ignoreCase, setting true 56
    index-time versus query-time 57
    WordNet, thesaurus 55

T

t_duration 152
t_shingle 288
t_trm_lookups 118
Tailing 202
term-suggest 141, 156
term frequency. See tf
term proximity 103
terms component 194
termVector component 194
termVectors 186
term vectors 286, 287
termVectors (advanced), field option 41
text analysis
    about 47
    experimenting with 50, 51
    highlight matches 51
    index box 51
    multi-word synonyms 56
    n-gram 60
    n-gramming costs 61, 62
    partial indexing 60
    phonetic sounds-like 58
    query box 51
    stemming 54, 55
    stop words 58
    substring indexing 60
    synonyms 55
    term text 51
    text field type 50
    text field type definition, configuration 48
    text field type definition, configuring 49
    tokenizer 52
    tokenizer action 50
    WordDelimiter analyzer 53
    WordDelimiterFilterFactory 53
    WorkDelimiterFilterFactory 54
text field type 50
tf 112
threaded_test.rb script 283, 284
thresholdTokenFrequency, IndexBasedSpellChecker option 175
title_sort type 45
tokenizer
    about 50
    HTMLStripStandardTokenizerFactory 52
    HTMLStripWhitespaceTokenizerFactory 52
    KeywordTokenizerFactory 52
    LetterTokenizerFactory 52
    PatternTokenizerFactory 53
    StandardTokenizerFactory 52
    WhitespaceTokenizerFactory 52
Tomcat 199
TPS 272

U

uniqueKey, schema.xml settings 47
uniqueKey field 232, 233

V

version, output related parameter 98
Vigilog
    URL 204

W

WAR 199
web.xml
    customizing, in Jetty 218
Web application archive. See WAR
WebTrends 202
WhitespaceTokenizerFactory 52
wildcard queries
    about 103, 104
    fuzzy queries 105
WordDelimeterFilterFactory 51
WordDelimeterFilterFactory, verbose output 51
WordDelimiter analyzer
    splitting, ways 53, 54
    tokenizing, ways 53, 54
WordDelimiterFilterFactory 53
WordNet thesaurus 55
write your own filter 63
wt, output related parameter 97

X

XML, sending to Solr
    about 69, 70
    changes, committing 71
    commit and optimize 71
    documents, deleting 70
    rollback command 71
    uncommitted changes, withdrawing 71
XML response format
    93
    93
    about 93
    maxScore 93
    numFound 93
    QTime 93
    start 93
    status 93
    URL, parsing 94

Y

y, argument 120

Z

zip format 292