PHP Class SolrWebService, ojs

Inheritance: extends XmlWebService

Public properties

Property Type Description
$_fieldCache A cache containing the available search fields.
$_instId The unique ID identifying this OJS installation to the solr server.
$_issueCache An issue cache.
$_journalCache A journal cache.
$_serviceMessage A description of the last error or message that occurred when calling the service.
$_solrCore The solr core we get our data from.
$_solrSearchHandler The solr search handler name we place our searches on.
$_solrServer The base URL of the solr server without core and search handler.
$_useProxySettings Whether the proxy settings in the config.inc.php should be considered for the web service request.

Public methods

Method Description
__construct ( $searchHandler, $username, $password, $instId, $useProxy = false ) Constructor
_addArticleXml ( &$articleDoc, &$article, &$journal, $markToDelete = false ) Add the metadata XML of a single article to an XML article list.
_addSubquery ( $fieldList, $searchPhrase, $params ) Add a subquery to the search query.
_cacheMiss ( $cache, $id ) : array Refresh the cache from the solr server.
_convertDate ( $timestamp ) : string Convert a date from local time (unix timestamp or ISO date string) to UTC time as understood by solr.
_deleteFromIndex ( $xml ) : boolean Delete documents from the index (by ID or by query).
_expandFieldList ( $fields ) : string Expand the given list of fields.
_getAdminUrl ( ) : string Identifies the general solr admin endpoint from the search handler URL.
_getArticleListXml ( &$articles, $totalCount, &$numDeleted ) : string Retrieve the XML for a batch of articles to be updated.
_getAutosuggestUrl ( $autosuggestType ) : string Returns the solr auto-suggestion endpoint.
_getCache ( ) : FileCache Get the field cache.
_getCoreAdminUrl ( ) : string Identifies the solr core-specific admin endpoint from the search handler URL.
_getDihUrl ( ) : string Returns the solr DIH endpoint.
_getDocumentsProcessed ( $result ) : integer Retrieve the number of indexed documents from a DIH response XML
_getFacetingAutosuggestions ( $url, $searchRequest, $userInput, $fieldName ) : array Retrieve auto-suggestions from the faceting service.
_getFieldNames ( $fieldType ) : array Return a list of all text fields that may occur in the index.
_getInterestingTermsUrl ( ) : string Returns the solr endpoint to retrieve "interesting terms" from a given document.
_getIssue ( $issueId, $journalId ) : Issue Retrieve an issue (possibly from the cache).
_getJournal ( $journalId ) : Journal Retrieve a journal (possibly from the cache).
_getLocalesAndFormats ( $field ) : array Identify all format/locale versions of the given field.
_getOrdering ( $field, $direction ) : string Generate the ordering parameter of a search query.
_getReloadExternalFilesUrl ( ) Returns the solr endpoint to reload external files.
_getSearchQueryParameters ( &$searchRequest ) : array | null Create the edismax query parameters from a search request.
_getSearchUrl ( ) : string Returns the solr search endpoint.
_getSuggesterAutosuggestions ( $url, $userInput, $fieldName ) : array Retrieve auto-suggestions from the suggester service.
_getUpdateUrl ( ) : string Returns the solr update endpoint.
_indexingTransaction ( $sendXmlCallback, $batchSize = SOLR_INDEXING_MAX_BATCHSIZE, $journalId = null ) This method encapsulates an indexing transaction (pull or push).
_isArticleAccessAuthorized ( &$article ) : boolean Check whether access to the given article is authorized to the requesting party (i.e. the Solr server).
_makeRequest ( $url, $params = [], $method = 'GET' ) : DOMXPath Make a request
_pushIndexingCallback ( &$articleXml, $batchCount, $numDeleted ) : integer Handle push indexing.
_setQuery ( $fieldList, $searchPhrase, $spellcheck = false ) Set the query parameters for a search query.
_translateSearchPhrase ( $searchPhrase, $backwards = false ) Translate query keywords.
deleteArticleFromIndex ( $articleId ) : boolean Deletes the given article from the Solr index.
deleteArticlesFromIndex ( $journalId = null ) : boolean Deletes all articles of a journal or of the installation from the Solr index.
flushFieldCache ( ) Flush the field cache.
getArticleFromIndex ( $articleId ) : array Retrieve a document directly from the index (for testing/debugging purposes only).
getAutosuggestions ( $searchRequest, $fieldName, $userInput, $autosuggestType ) : array Retrieve auto-suggestions from the solr index corresponding to the given user input.
getAvailableFields ( $fieldType ) : array Returns an array with all (dynamic) fields in the index.
getInterestingTerms ( $articleId ) : array Retrieve "interesting terms" from a document to be used in a "similar documents" search.
getServerStatus ( ) : integer Checks the solr server status.
getServiceMessage ( ) : string Get the last service message.
markArticleChanged ( $articleId ) Mark a single article "changed" so that the indexing back-end will update it during the next batch update.
markJournalChanged ( $journalId ) : integer Mark the given journal for re-indexing.
pullChangedArticles ( $pullIndexingCallback, $batchSize = SOLR_INDEXING_MAX_BATCHSIZE, $journalId = null ) : integer Retrieves a batch of articles in XML format.
pushChangedArticles ( $batchSize = SOLR_INDEXING_MAX_BATCHSIZE, $journalId = null ) : integer (Re-)indexes all changed articles in Solr.
rebuildDictionaries ( ) Rebuilds the spelling/auto-suggest dictionaries.
reloadExternalFiles ( ) Reloads external files.
retrieveResults ( &$searchRequest, &$totalResults ) : array Execute a search against the Solr search server.

Method details

__construct() public method

Constructor
public __construct ( $searchHandler, $username, $password, $instId, $useProxy = false )
$searchHandler string The search handler URL. We assume the embedded server as a default.
$username string The HTTP BASIC authentication username.
$password string The corresponding password.
$instId string The unique ID of this OJS installation to partition a shared index.
$useProxy boolean Whether the proxy settings from config.inc.php should be considered.
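
The class keeps the server base URL, the core and the search handler name in separate properties ($_solrServer, $_solrCore, $_solrSearchHandler). A minimal sketch of how a search handler URL could be split into those three parts follows. The function name splitSearchHandlerUrl(), the example URL and the assumed layout <server>/<core>/<handler> are illustrative assumptions, not taken from the OJS source.

```php
<?php
// Hypothetical helper (not part of SolrWebService): split a search handler
// URL into server base URL, core name and handler name.
// Assumption: the URL has the layout <server>/<core>/<handler>.
function splitSearchHandlerUrl(string $searchHandler): ?array
{
    $parts = explode('/', rtrim($searchHandler, '/'));
    if (count($parts) < 5) {
        // Too short to contain scheme, host, core and handler.
        return null;
    }
    $handler = array_pop($parts); // last path segment
    $core = array_pop($parts);    // second to last path segment
    return [
        'server' => implode('/', $parts),
        'core' => $core,
        'handler' => $handler,
    ];
}

// Example URL is an assumption for illustration only.
print_r(splitSearchHandlerUrl('http://localhost:8983/solr/ojs/search'));
```

Returning null for malformed input mirrors the way other methods of this class signal errors.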

_addArticleXml() public method

Add the metadata XML of a single article to an XML article list.
public _addArticleXml ( &$articleDoc, &$article, &$journal, $markToDelete = false )
$articleDoc DOMDocument
$article PublishedArticle
$journal Journal
$markToDelete boolean If true the returned XML will only contain a deletion marker.

_addSubquery() public method

Add a subquery to the search query. NB: subqueries do not support collation (for alternative spelling suggestions).
public _addSubquery ( $fieldList, $searchPhrase, $params )
$fieldList string A list of fields to be queried, separated by '|'.
$searchPhrase string The search phrase to be added.
$params array The existing query parameters.

_cacheMiss() public method

Refresh the cache from the solr server.
public _cacheMiss ( $cache, $id ) : array
$cache FileCache
$id string The field type.
Returns array The available field names.

_convertDate() public method

Convert a date from local time (unix timestamp or ISO date string) to UTC time as understood by solr. NB: Using intermediate unix timestamps can be a problem in older PHP versions, especially on Windows, where negative timestamps are not supported. As Solr requires PHP5, that should not be a big problem in practice, except for electronic publications dated earlier than 1901. It does not seem probable that such a situation could realistically arise with OJS.
public _convertDate ( $timestamp ) : string
$timestamp int|string Unix timestamp or local ISO time.
Returns string ISO UTC timestamp
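
The conversion described above can be sketched with PHP's date classes, which avoid intermediate unix timestamps for string input. This is a sketch under assumptions, not the OJS implementation: the function name toSolrUtc() and the $localTz parameter are hypothetical, and the output uses the UTC form `YYYY-MM-DDThh:mm:ssZ` that Solr date fields accept.

```php
<?php
// Hypothetical helper (not the OJS code): convert a unix timestamp or a
// local ISO date string to a UTC timestamp in Solr's date format.
function toSolrUtc($value, string $localTz = 'UTC'): string
{
    if (is_int($value)) {
        // "@<seconds>" is always interpreted as UTC by PHP.
        $date = new DateTimeImmutable('@' . $value);
    } else {
        // Interpret the string in the given local time zone.
        $date = new DateTimeImmutable($value, new DateTimeZone($localTz));
    }
    return $date->setTimezone(new DateTimeZone('UTC'))->format('Y-m-d\TH:i:s\Z');
}

echo toSolrUtc('2012-03-01 12:00:00', 'Europe/Berlin'), "\n";
```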

_deleteFromIndex() public method

Delete documents from the index (by ID or by query).
public _deleteFromIndex ( $xml ) : boolean
$xml string The documents to delete.
Returns boolean true if successful, otherwise false.
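
Solr's XML update format expresses deletions as a `<delete>` element containing either `<id>` or `<query>` children. The sketch below builds such a message with DOMDocument so that special characters are escaped. The helper name buildDeleteXml() and the sample values are assumptions. The source does not say whether _deleteFromIndex() expects the full message or only the inner elements.

```php
<?php
// Hypothetical helper: build a Solr delete message by ID or by query.
function buildDeleteXml(string $type, string $value): string
{
    if ($type !== 'id' && $type !== 'query') {
        throw new InvalidArgumentException('Type must be "id" or "query".');
    }
    $doc = new DOMDocument('1.0', 'UTF-8');
    $delete = $doc->appendChild($doc->createElement('delete'));
    $child = $delete->appendChild($doc->createElement($type));
    // A text node guarantees that characters such as "&" are escaped.
    $child->appendChild($doc->createTextNode($value));
    // Serialize only the <delete> element, without the XML declaration.
    return $doc->saveXML($delete);
}

// Sample document ID is made up for illustration.
echo buildDeleteXml('id', 'inst-42'), "\n";
```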

_expandFieldList() public method

Expand the given list of fields.
public _expandFieldList ( $fields ) : string
$fields array
Returns string A space-separated field list (e.g. to be used in edismax's qf parameter).
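
As an illustration of what such an expansion might look like, the sketch below combines base field names with locale suffixes and joins them with spaces. The naming scheme `<field>_<locale>`, the function name expandFields() and the sample values are assumptions made for this example. The real index field names are not given in the source.

```php
<?php
// Hypothetical sketch: expand base field names into one index field per
// suffix and return them as a space-separated list (e.g. for "qf").
function expandFields(array $fields, array $suffixes): string
{
    $expanded = [];
    foreach ($fields as $field) {
        foreach ($suffixes as $suffix) {
            // Assumed naming scheme: <field>_<suffix>.
            $expanded[] = $field . '_' . $suffix;
        }
    }
    return implode(' ', $expanded);
}

echo expandFields(['title', 'abstract'], ['en_US', 'de_DE']), "\n";
```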

_getAdminUrl() public method

Identifies the general solr admin endpoint from the search handler URL.
public _getAdminUrl ( ) : string
Returns string

_getArticleListXml() public method

Retrieve the XML for a batch of articles to be updated.
public _getArticleListXml ( &$articles, $totalCount, &$numDeleted ) : string
$articles DBResultFactory The articles to be included in the list.
$totalCount integer The overall number of changed articles (not only the current batch).
$numDeleted integer An output parameter that returns the number of documents that will be deleted.
Returns string The XML ready to be consumed by the Solr data import service.

_getAutosuggestUrl() public method

Returns the solr auto-suggestion endpoint.
public _getAutosuggestUrl ( $autosuggestType ) : string
$autosuggestType string One of the SOLR_AUTOSUGGEST_* constants
Returns string

_getCache() public method

Get the field cache.
public _getCache ( ) : FileCache
Returns FileCache

_getCoreAdminUrl() public method

Identifies the solr core-specific admin endpoint from the search handler URL.
public _getCoreAdminUrl ( ) : string
Returns string

_getDihUrl() public method

Returns the solr DIH endpoint.
public _getDihUrl ( ) : string
Returns string

_getDocumentsProcessed() public method

Retrieve the number of indexed documents from a DIH response XML.
public _getDocumentsProcessed ( $result ) : integer
$result DOMXPath
Returns integer
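
A sketch of reading such a counter from a response with DOMXPath follows. It assumes that the DIH response carries a `statusMessages` list with a "Total Documents Processed" entry. The sample XML and the function name documentsProcessed() are illustrative assumptions, not the OJS code.

```php
<?php
// Hypothetical sketch: extract the processed-document count from a
// DIH-style response. The response layout is an assumption.
function documentsProcessed(DOMXPath $xpath): ?int
{
    $nodes = $xpath->query(
        '//lst[@name="statusMessages"]/str[@name="Total Documents Processed"]'
    );
    if ($nodes === false || $nodes->length === 0) {
        return null; // Counter not present in the response.
    }
    $text = trim($nodes->item(0)->textContent);
    return ctype_digit($text) ? (int) $text : null;
}

// Made-up sample response for illustration.
$sampleDoc = new DOMDocument();
$sampleDoc->loadXML(
    '<response><lst name="statusMessages">'
    . '<str name="Total Documents Processed">3</str>'
    . '</lst></response>'
);
var_dump(documentsProcessed(new DOMXPath($sampleDoc)));
```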

_getFacetingAutosuggestions() public method

Retrieve auto-suggestions from the faceting service.
public _getFacetingAutosuggestions ( $url, $searchRequest, $userInput, $fieldName ) : array
$url string
$searchRequest SolrSearchRequest
$userInput string
$fieldName string
Returns array The generated suggestions.

_getFieldNames() public method

Return a list of all text fields that may occur in the index.
public _getFieldNames ( $fieldType ) : array
$fieldType string "search", "sort" or "all"
Returns array

_getInterestingTermsUrl() public method

Returns the solr endpoint to retrieve "interesting terms" from a given document.
public _getInterestingTermsUrl ( ) : string
Returns string

_getIssue() public method

Retrieve an issue (possibly from the cache).
public _getIssue ( $issueId, $journalId ) : Issue
$issueId int
$journalId int
Returns Issue

_getJournal() public method

Retrieve a journal (possibly from the cache).
public _getJournal ( $journalId ) : Journal
$journalId int
Returns Journal

_getLocalesAndFormats() public method

Identify all format/locale versions of the given field.
public _getLocalesAndFormats ( $field ) : array
$field string A field name without any extension.
Returns array A list of index fields.

_getOrdering() public method

Generate the ordering parameter of a search query.
public _getOrdering ( $field, $direction ) : string
$field string The field to order by.
$direction boolean true for ascending, false for descending
Returns string The ordering to be used (default: descending relevance).
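
Solr's `sort` parameter takes the form `<field> asc|desc`, and relevance ordering is expressed as `score desc`. The sketch below shows one way to produce such a value with a fallback to descending relevance. The function name orderingParameter() and the whitelist argument are assumptions for this example, not the actual method.

```php
<?php
// Hypothetical sketch: build a Solr "sort" value, falling back to
// descending relevance when the field is missing or not sortable.
function orderingParameter(?string $field, bool $ascending, array $sortableFields): string
{
    if ($field === null || !in_array($field, $sortableFields, true)) {
        return 'score desc';
    }
    return $field . ' ' . ($ascending ? 'asc' : 'desc');
}

// Field names below are made up for illustration.
echo orderingParameter('publicationDate', true, ['publicationDate', 'title']), "\n";
```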

_getReloadExternalFilesUrl() public method

Returns the solr endpoint to reload external files.
public _getReloadExternalFilesUrl ( )

_getSearchQueryParameters() public method

Create the edismax query parameters from a search request.
public _getSearchQueryParameters ( &$searchRequest ) : array | null
$searchRequest SolrSearchRequest
Returns array | null A parameter array or null if something went wrong.
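
For orientation, an edismax request is usually described by parameters such as `defType`, `q`, `qf`, `start` and `rows`. The sketch below assembles such an array from plain values and returns null for invalid input. The function name edismaxParameters() and its arguments are hypothetical. The real method works on a SolrSearchRequest whose interface is not shown here.

```php
<?php
// Hypothetical sketch: assemble edismax parameters from plain values.
function edismaxParameters(string $phrase, string $queryFields, int $page, int $perPage): ?array
{
    if (trim($phrase) === '' || $page < 1 || $perPage < 1) {
        return null; // Nothing sensible to query.
    }
    return [
        'defType' => 'edismax',
        'q' => $phrase,
        'qf' => $queryFields,               // space-separated field list
        'start' => ($page - 1) * $perPage,  // zero-based result offset
        'rows' => $perPage,
    ];
}

// Field names are made up for illustration.
$exampleParams = edismaxParameters('open access', 'title_en_US abstract_en_US', 2, 25);
echo http_build_query($exampleParams), "\n";
```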

_getSearchUrl() public method

Returns the solr search endpoint.
public _getSearchUrl ( ) : string
Returns string

_getSuggesterAutosuggestions() public method

Retrieve auto-suggestions from the suggester service.
public _getSuggesterAutosuggestions ( $url, $userInput, $fieldName ) : array
$url string
$userInput string
$fieldName string
Returns array The generated suggestions.

_getUpdateUrl() public method

Returns the solr update endpoint.
public _getUpdateUrl ( ) : string
Returns string

_indexingTransaction() public method

This method encapsulates an indexing transaction (pull or push). The transaction consists of generating the XML, transferring it to the server, and marking the transferred articles as "indexed".
public _indexingTransaction ( $sendXmlCallback, $batchSize = SOLR_INDEXING_MAX_BATCHSIZE, $journalId = null )
$sendXmlCallback callback This function will be called with the generated XML.
$batchSize integer The maximum number of articles to be returned.
$journalId integer If given, only retrieves articles for the given journal.

_isArticleAccessAuthorized() public method

Check whether access to the given article is authorized to the requesting party (i.e. the Solr server).
public _isArticleAccessAuthorized ( &$article ) : boolean
$article Article
Returns boolean True if authorized, otherwise false.

_makeRequest() public method

Make a request.
public _makeRequest ( $url, $params = [], $method = 'GET' ) : DOMXPath
$url string The request URL
$params mixed array (key value pairs) or string request parameters
$method string GET or POST
Returns DOMXPath An XPath object with the response loaded. Null if an error occurred. See _serviceMessage for more details about the error.
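
The return contract (a DOMXPath on success, null on error) can be illustrated without any network access. The sketch below turns a response body into a DOMXPath and returns null if the body is empty or not well-formed XML. The function name responseToXPath() and the sample response are assumptions. The HTTP transfer itself is left out.

```php
<?php
// Hypothetical sketch: load an XML response body into a DOMXPath,
// returning null when the body cannot be parsed.
function responseToXPath(string $xml): ?DOMXPath
{
    if ($xml === '') {
        return null; // loadXML() rejects empty input.
    }
    // Collect parser errors internally instead of emitting warnings.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $loaded = $doc->loadXML($xml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return $loaded ? new DOMXPath($doc) : null;
}

// Made-up sample in the style of a Solr XML response header.
$sampleResponse = '<response><lst name="responseHeader">'
    . '<int name="status">0</int></lst></response>';
$xpath = responseToXPath($sampleResponse);
echo $xpath->evaluate('string(//int[@name="status"])'), "\n";
```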

_pushIndexingCallback() public method

Handle push indexing. This method pushes XML with index changes directly to the Solr data import handler for immediate processing.
public _pushIndexingCallback ( &$articleXml, $batchCount, $numDeleted ) : integer
$articleXml string The XML with index changes to be pushed to the Solr server.
$batchCount integer The number of articles in the XML list (i.e. the expected number of documents to be indexed).
$numDeleted integer The number of articles in the XML list that are marked for deletion.
Returns integer The number of articles processed or null if an error occurred. After an error the method SolrWebService::getServiceMessage() will return details of the error.

_setQuery() public method

Set the query parameters for a search query.
public _setQuery ( $fieldList, $searchPhrase, $spellcheck = false )
$fieldList string A list of fields to be queried, separated by '|'.
$searchPhrase string The search phrase to be added.
$spellcheck boolean Whether to switch spellchecking on.

_translateSearchPhrase() public method

Translate query keywords.
public _translateSearchPhrase ( $searchPhrase, $backwards = false )
$searchPhrase string
Returns The translated search phrase.

deleteArticleFromIndex() public method

Deletes the given article from the Solr index.
public deleteArticleFromIndex ( $articleId ) : boolean
$articleId integer The ID of the article to be deleted.
Returns boolean true if successful, otherwise false.

deleteArticlesFromIndex() public method

Deletes all articles of a journal or of the installation from the Solr index.
public deleteArticlesFromIndex ( $journalId = null ) : boolean
$journalId integer If given, only articles from this journal will be deleted.
Returns boolean true if successful, otherwise false.

flushFieldCache() public method

Flush the field cache.
public flushFieldCache ( )

getArticleFromIndex() public method

Retrieve a document directly from the index (for testing/debugging purposes only).
public getArticleFromIndex ( $articleId ) : array
$articleId
Returns array The document fields.

getAutosuggestions() public method

Retrieve auto-suggestions from the solr index corresponding to the given user input.
public getAutosuggestions ( $searchRequest, $fieldName, $userInput, $autosuggestType ) : array
$searchRequest SolrSearchRequest Active search filters. Choosing the faceting auto-suggest implementation via $autosuggestType will pre-filter auto-suggestions based on this search request. In case of the suggester component, the search request will simply be ignored.
$fieldName string The field to suggest values for. Values are queried on field level to improve relevance of suggestions.
$userInput string Partial query input. This input will be split up into terms. Only the last query term will be used to suggest values.
$autosuggestType string One of the SOLR_AUTOSUGGEST_* constants. The faceting implementation is slower but will return more relevant suggestions. The suggester implementation is faster and scales better in large deployments. It will, however, return terms from a field-specific global dictionary, e.g. terms from different journals.
Returns array A list of suggested queries
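
The handling of $userInput described above (split the input, suggest only for the last term) can be sketched as follows. Both function names are hypothetical, and splitting on whitespace is an assumption. The source does not state how the input is tokenized.

```php
<?php
// Hypothetical sketch: separate the last (partial) term from the rest of
// the user input so that only this term is sent to the suggestion service.
function splitForSuggestion(string $userInput): array
{
    $terms = preg_split('/\s+/', trim($userInput), -1, PREG_SPLIT_NO_EMPTY);
    $last = $terms === [] ? '' : array_pop($terms);
    return ['prefix' => implode(' ', $terms), 'term' => $last];
}

// Re-attach a suggested term to the untouched leading part of the query.
function applySuggestion(string $prefix, string $suggestion): string
{
    return $prefix === '' ? $suggestion : $prefix . ' ' . $suggestion;
}

$inputParts = splitForSuggestion('open access publ');
echo applySuggestion($inputParts['prefix'], 'publishing'), "\n";
```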

getAvailableFields() public method

Returns an array with all (dynamic) fields in the index. NB: This is cached data, so after an index update we may have to flush the field cache to re-read the current index state.
public getAvailableFields ( $fieldType ) : array
$fieldType string Either 'search' or 'sort'.
Returns array

getInterestingTerms() public method

Retrieve "interesting terms" from a document to be used in a "similar documents" search.
public getInterestingTerms ( $articleId ) : array
$articleId integer The article from which we retrieve "interesting terms".
Returns array An array of terms that can be used to execute a search for similar documents.

getServerStatus() public method

Checks the solr server status.
public getServerStatus ( ) : integer
Returns integer One of the SOLR_STATUS_* constants.

getServiceMessage() public method

Get the last service message.
public getServiceMessage ( ) : string
Returns string

markArticleChanged() public method

Mark a single article "changed" so that the indexing back-end will update it during the next batch update.
public markArticleChanged ( $articleId )
$articleId integer

markJournalChanged() public method

Mark the given journal for re-indexing.
public markJournalChanged ( $journalId ) : integer
$journalId integer The ID of the journal to be (re-)indexed.
Returns integer The number of articles that have been marked.

pullChangedArticles() public method

Retrieves a batch of articles in XML format. This is the pull-indexing implementation of the Solr web service. To control memory usage and response time we index articles in batches. Batches should be as large as possible to reduce index commit overhead.
public pullChangedArticles ( $pullIndexingCallback, $batchSize = SOLR_INDEXING_MAX_BATCHSIZE, $journalId = null ) : integer
$batchSize integer The maximum number of articles to be returned.
$journalId integer If given, only returns articles from the given journal.
Returns integer The number of articles processed or null if an error occurred. After an error the method SolrWebService::getServiceMessage() will return details of the error.

pushChangedArticles() public method

(Re-)indexes all changed articles in Solr. This is the push-indexing implementation of the Solr web service. To control memory usage and response time we index articles in batches. Batches should be as large as possible to reduce index commit overhead.
public pushChangedArticles ( $batchSize = SOLR_INDEXING_MAX_BATCHSIZE, $journalId = null ) : integer
$batchSize integer The maximum number of articles to be indexed in this run.
$journalId integer If given, restrains index updates to the given journal.
Returns integer The number of articles processed or null if an error occurred. After an error the method SolrWebService::getServiceMessage() will return details of the error.
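
Because each call indexes at most one batch, a caller that wants to index everything has to call the method repeatedly. The sketch below shows such a loop against a stand-in object. FakePushIndexer and pushUntilDone() are hypothetical, and the stop condition (a batch smaller than $batchSize means nothing is left) is an assumption about the method's behaviour.

```php
<?php
// Stand-in for a service object: pretends to index up to $batchSize
// pending articles per call and returns the number processed.
class FakePushIndexer
{
    private $pending;

    public function __construct(int $pending)
    {
        $this->pending = $pending;
    }

    public function pushChangedArticles(int $batchSize)
    {
        $count = min($batchSize, $this->pending);
        $this->pending -= $count;
        return $count;
    }
}

// Hypothetical driver: push batches until a short batch or an error occurs.
function pushUntilDone($indexer, int $batchSize): ?int
{
    $total = 0;
    do {
        $processed = $indexer->pushChangedArticles($batchSize);
        if ($processed === null) {
            return null; // The service message would hold the details.
        }
        $total += $processed;
    } while ($processed === $batchSize);
    return $total;
}

var_dump(pushUntilDone(new FakePushIndexer(120), 50));
```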

rebuildDictionaries() public method

Rebuilds the spelling/auto-suggest dictionaries.
public rebuildDictionaries ( )

reloadExternalFiles() public method

Reloads external files.
public reloadExternalFiles ( )

retrieveResults() public method

Execute a search against the Solr search server.
public retrieveResults ( &$searchRequest, &$totalResults ) : array
$searchRequest SolrSearchRequest
$totalResults integer An output parameter returning the total number of search results found by the query. This differs from the actual number of returned results as the search can be limited.
Returns array An array of search results. The main keys are result types. These are "scoredResults" and "alternativeSpelling". The keys in the "scoredResults" sub-array are scores (1-9999) and the values are article IDs. The alternative spelling sub-array returns an alternative query string (if any) and the number of hits for this string. Null if an error occurred while querying the server.
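
A caller typically wants the article IDs ordered by score. The sketch below does this for a hand-written array that follows the "scoredResults" description above. The function name articleIdsByScore() and the sample data are assumptions, and the internal layout of "alternativeSpelling" is not touched because the source does not specify its keys.

```php
<?php
// Hypothetical sketch: turn a result array into article IDs, best score first.
function articleIdsByScore(?array $results): array
{
    if ($results === null || empty($results['scoredResults'])) {
        return []; // Error or no hits.
    }
    $scored = $results['scoredResults']; // score => article ID
    krsort($scored, SORT_NUMERIC);       // highest score first
    return array_values($scored);
}

// Made-up sample data in the documented shape.
$sampleResults = ['scoredResults' => [120 => 7, 9500 => 3, 4000 => 11]];
print_r(articleIdsByScore($sampleResults));
```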

Property details

$_fieldCache public property

A cache containing the available search fields.
public $_fieldCache

$_instId public property

The unique ID identifying this OJS installation to the solr server.
public $_instId

$_issueCache public property

An issue cache.
public $_issueCache

$_journalCache public property

A journal cache.
public $_journalCache

$_serviceMessage public property

A description of the last error or message that occurred when calling the service.
public $_serviceMessage

$_solrCore public property

The solr core we get our data from.
public $_solrCore

$_solrSearchHandler public property

The solr search handler name we place our searches on.
public $_solrSearchHandler

$_solrServer public property

The base URL of the solr server without core and search handler.
public $_solrServer

$_useProxySettings public property

Whether the proxy settings in the config.inc.php should be considered for the web service request.
public $_useProxySettings