Problem adding to the Solr index from Django using zc.buildout
3 votes
/ September 10, 2010

I am trying to get Apache Solr running in my zc.buildout environment.

I have defined a simple model:

class NewsItem(models.Model):
    title = models.CharField(blank=False, max_length=255, help_text=u"Title of this news item")
    slug = models.SlugField(blank=False, help_text=u"Slug will be automatically generated from the title")
    article = models.TextField(help_text=u"The body text of this news item")
    created_on = models.DateTimeField(auto_now_add = True)
    updated_on = models.DateTimeField(auto_now = True)
    published = models.BooleanField(default=True)

    def __unicode__(self):
        return self.title

search_index.py:

import datetime
from haystack.indexes import *
from haystack import site
from appname.models import *


class NewsItemIndex(RealTimeSearchIndex):
    text = CharField(document=True, use_template=True)

    def get_queryset(self):
        """Used when the entire index for model is updated."""
        return NewsItem.objects.all()


site.register(NewsItem, NewsItemIndex)

And search_sites.py defines:

import haystack
haystack.autodiscover()

The settings file contains:

HAYSTACK_SITECONF = 'appname.search_sites'
HAYSTACK_SEARCH_ENGINE = 'solr'
HAYSTACK_SOLR_URL = 'http://127.0.0.1:8983/solr'
HAYSTACK_SEARCH_RESULTS_PER_PAGE = 30
HAYSTACK_INCLUDE_SPELLING = True

'haystack' is listed in INSTALLED_APPS, and pysolr is listed in 'install_requires' in setup.py (which is passed to buildout).

My buildout.cfg contains the solr-files, solr, solr-conf and supervisor parts.

I added ${buildout:directory}/solr-conf to the [mkdir] paths.

The supervisor and solr sections of buildout.cfg look like this:

[supervisor]
recipe = collective.recipe.supervisor
port = localhost:9001
user = admin
password = admin
plugins =
   superlance

# solr security settings: see
# http://docs.codehaus.org/display/JETTY/Connectors+slow+to+startup
programs =
   10 solr     (startsecs=10) java [-Djava.security.egd=file:/dev/urandom -jar start.jar] ${buildout:parts-directory}/solr true

eventlisteners =
   SolrHttpOk TICK_60 ${buildout:bin-directory}/httpok [-p solr -t 20 http://localhost:8983/solr/]


[solr-files]
recipe = hexagonit.recipe.download
url = ftp://mir1.ovh.net/ftp.apache.org/dist/lucene/solr/1.3.0/apache-solr-1.3.0.tgz
md5sum = 23774b077598c6440d69016fed5cc810
strip-top-level-dir = true

[solr]
recipe = collective.recipe.solrinstance
solr-location = ${buildout:parts-directory}/solr-files
host = localhost
port = 8983

unique-key = uniqueID
default-search-field = text

index =
   name:uniqueID type:string indexed:true stored:true required:true
   name:text type:string indexed:true stored:true required:false omitnorms:false multivalued:true

[solr-conf]
recipe = iw.recipe.cmd
on_install = true
on_update = true
cmds =
   cp -v ${buildout:directory}/solr-conf/jetty.xml ${solr:jetty-destination}
   cp -v ${buildout:directory}/solr-conf/schema.xml ${solr:schema-destination}
   cp -v ${buildout:directory}/solr-conf/stopwords_fr.txt ${solr:schema-destination}

[solr-rebuild]
recipe = iw.recipe.cmd
on_install = true
on_update = true

# since solr is not started by solr-instance but supervisord, solr-instance has
# no pid file and thinks that solr is down. Thus we must run it with
# solr-instance to be able to "solr-instance purge"
cmds =
   ${buildout:bin-directory}/supervisorctl stop solr
   cp -v ${buildout:directory}/solr-conf/schema.xml ${solr:schema-destination}
   ${buildout:bin-directory}/solr-instance start
   COUNT=15; echo "Waiting $COUNT s"; sleep $COUNT
   ${buildout:bin-directory}/solr-instance purge
   time ${buildout:bin-directory}/${django:control-script} rebuild_index
   ${buildout:bin-directory}/solr-instance stop
   ${buildout:bin-directory}/supervisorctl start solr

When I run $ bin/buildout install solr-rebuild, I get the following output:

`/appname/solr-conf/schema.xml' -> `/appname/parts/solr/solr/conf/schema.xml'
Solr started with pid 16023
Waiting 15 s
SimplePostTool: version 1.2
SimplePostTool: WARNING: Make sure your XML documents are encoded in UTF-8, other encodings are not currently supported
SimplePostTool: POSTing args to http://localhost:8983/solr/update..
SimplePostTool: COMMITting Solr index changes..

WARNING: This will irreparably remove EVERYTHING from your search index.
Your choices after this are to restore from backups or rebuild via the `rebuild_index` command.
Are you sure you wish to continue? [y/N] y

Removing all documents from your index because you said so.
All documents removed.
Indexing 1 news items.
Failed to add documents to Solr: [Reason: ERROR:unknown field 'django_ct']
0.32user 0.05system 0:02.82elapsed 13%CPU (0avgtext+0avgdata 57872maxresident)k
160inputs+8outputs (3major+4257minor)pagefaults 0swaps
Solr stopped successfully.

Similarly, running $ bin/django rebuild_index or $ bin/buildout update_index complains about 'django_ct':

Failed to add documents to Solr: [Reason: ERROR:unknown field 'django_ct']

(One thing I am going to try is upgrading Solr to the latest version. I will report back if that does it.)

I'm not sure where to look next. Searching Google, groups and Stack Overflow has not gotten me past this point. Thanks in advance!

1 Answer

1 vote
/ September 14, 2010

OK, problem solved. Upgrading to Solr 1.4.1 (and, oddly enough, rebooting afterwards) did the trick.
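
For anyone who hits the same "unknown field 'django_ct'" message: Haystack adds its own id, django_ct and django_id fields to every document it sends, so the schema.xml that Solr actually loads has to declare them. Below is a minimal sketch of a check for that, assuming a schema.xml that declares fields as plain <field name="..."> elements. The function name missing_haystack_fields and the sample schema are made up for illustration; the sample only mirrors the two `index =` lines from the [solr] part of the question and is not the real file from the buildout.

```python
import xml.etree.ElementTree as ET

# Fields Haystack adds to every document, in addition to the ones
# declared on the SearchIndex class.
HAYSTACK_FIELDS = ("id", "django_ct", "django_id")


def missing_haystack_fields(schema_xml, extra_fields=("text",)):
    """Return the required field names that schema_xml does not declare.

    Only explicit <field name="..."> elements are considered;
    <dynamicField> patterns are ignored in this sketch.
    """
    root = ET.fromstring(schema_xml)
    declared = {field.get("name") for field in root.iter("field")}
    required = HAYSTACK_FIELDS + tuple(extra_fields)
    return [name for name in required if name not in declared]


if __name__ == "__main__":
    # Hypothetical schema equivalent to the `index =` lines in [solr].
    sample_schema = """
    <schema name="example" version="1.1">
      <fields>
        <field name="uniqueID" type="string" indexed="true" stored="true" required="true"/>
        <field name="text" type="string" indexed="true" stored="true" multiValued="true"/>
      </fields>
    </schema>
    """
    print(missing_haystack_fields(sample_schema))
```

To run it against a real install, read the file at ${solr:schema-destination}/schema.xml and pass its contents to the function. If fields are reported missing, Haystack's build_solr_schema management command (here presumably $ bin/django build_solr_schema) prints a schema that includes them.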

...