- Hi everybody,
-
- I have a problem searching for Russian strings (UTF-8 encoded)
with Zend_Search_Lucene. Here is a short code sample:
<?php
require_once 'ZendInit.php';
require_once 'Zend/Search/Lucene.php';
require_once 'Zend/Search/Lucene/Document.php';

// Create the index and add one document with a mixed Russian/English field
$index = Zend_Search_Lucene::create('data/index');
$doc = new Zend_Search_Lucene_Document();
$doc->addField(Zend_Search_Lucene_Field::Text('samplefield',
    'русский текст; english text',
    'utf-8'));
$index->addDocument($doc);
$index->commit();

// Open the index and set the search defaults
$index = Zend_Search_Lucene::open('data/index');
Zend_Search_Lucene_Search_QueryParser::setDefaultEncoding('utf-8');
Zend_Search_Lucene::setDefaultSearchField('samplefield');

// Query the index
$queryStr = 'english';
$query = Zend_Search_Lucene_Search_QueryParser::parse($queryStr, 'utf-8');
$hits = $index->find($query);
foreach ($hits as $hit) {
    /* @var $hit Zend_Search_Lucene_Search_QueryHit */
    $doc = $hit->getDocument();
    echo $doc->getField('samplefield')->value, PHP_EOL;
}
-
- The 'samplefield' field of the document contains a string in two
languages, Russian and English (see the code). Searching for
'english' works fine and the document is found, but searching for
the Russian part of the field (setting $queryStr to 'русский')
returns no documents.
-
- What is wrong with my code? Please help me find a solution.
-
- Thank you guys
-
- Maxim Savenko