PHP Class SolrWebService, ojs

Inheritance: extends XmlWebService
Open project: pkp/ojs

Public Properties

Property Description
$_fieldCache A cache containing the available search fields.
$_instId The unique ID identifying this OJS installation to the solr server.
$_issueCache An issue cache.
$_journalCache A journal cache.
$_serviceMessage A description of the last error or message that occurred when calling the service.
$_solrCore The solr core we get our data from.
$_solrSearchHandler The solr search handler name we place our searches on.
$_solrServer The base URL of the solr server without core and search handler.
$_useProxySettings Whether the proxy settings in config.inc.php should be considered for the web service request.

Public Methods

Method Description
__construct ( $searchHandler, $username, $password, $instId, $useProxy = false ) Constructor
_addArticleXml ( &$articleDoc, &$article, &$journal, $markToDelete = false ) Add the metadata XML of a single article to an XML article list.
_addSubquery ( $fieldList, $searchPhrase, $params ) Add a subquery to the search query.
_cacheMiss ( $cache, $id ) : array Refresh the cache from the solr server.
_convertDate ( $timestamp ) : string Convert a date from local time (unix timestamp or ISO date string) to UTC time as understood by solr.
_deleteFromIndex ( $xml ) : boolean Delete documents from the index (by ID or by query).
_expandFieldList ( $fields ) : string Expand the given list of fields.
_getAdminUrl ( ) : string Identifies the general solr admin endpoint from the search handler URL.
_getArticleListXml ( &$articles, $totalCount, &$numDeleted ) : string Retrieve the XML for a batch of articles to be updated.
_getAutosuggestUrl ( $autosuggestType ) : string Returns the solr auto-suggestion endpoint.
_getCache ( ) : FileCache Get the field cache.
_getCoreAdminUrl ( ) : string Identifies the solr core-specific admin endpoint from the search handler URL.
_getDihUrl ( ) : string Returns the solr DIH endpoint.
_getDocumentsProcessed ( $result ) : integer Retrieve the number of indexed documents from a DIH response XML.
_getFacetingAutosuggestions ( $url, $searchRequest, $userInput, $fieldName ) : array Retrieve auto-suggestions from the faceting service.
_getFieldNames ( $fieldType ) : array Return a list of all text fields that may occur in the index.
_getInterestingTermsUrl ( ) : string Returns the solr endpoint to retrieve "interesting terms" from a given document.
_getIssue ( $issueId, $journalId ) : Issue Retrieve an issue (possibly from the cache).
_getJournal ( $journalId ) : Journal Retrieve a journal (possibly from the cache).
_getLocalesAndFormats ( $field ) : array Identify all format/locale versions of the given field.
_getOrdering ( $field, $direction ) : string Generate the ordering parameter of a search query.
_getReloadExternalFilesUrl ( ) Returns the solr endpoint to reload external files.
_getSearchQueryParameters ( &$searchRequest ) : array | null Create the edismax query parameters from a search request.
_getSearchUrl ( ) : string Returns the solr search endpoint.
_getSuggesterAutosuggestions ( $url, $userInput, $fieldName ) : array Retrieve auto-suggestions from the suggester service.
_getUpdateUrl ( ) : string Returns the solr update endpoint.
_indexingTransaction ( $sendXmlCallback, $batchSize = SOLR_INDEXING_MAX_BATCHSIZE, $journalId = null ) This method encapsulates an indexing transaction (pull or push).
_isArticleAccessAuthorized ( &$article ) : boolean Check whether access to the given article is authorized to the requesting party (i.e. the Solr server).
_makeRequest ( $url, $params = [], $method = 'GET' ) : DOMXPath Make a request
_pushIndexingCallback ( &$articleXml, $batchCount, $numDeleted ) : integer Handle push indexing.
_setQuery ( $fieldList, $searchPhrase, $spellcheck = false ) Set the query parameters for a search query.
_translateSearchPhrase ( $searchPhrase, $backwards = false ) : string Translate query keywords.
deleteArticleFromIndex ( $articleId ) : boolean Deletes the given article from the Solr index.
deleteArticlesFromIndex ( $journalId = null ) : boolean Deletes all articles of a journal or of the installation from the Solr index.
flushFieldCache ( ) Flush the field cache.
getArticleFromIndex ( $articleId ) : array Retrieve a document directly from the index (for testing/debugging purposes only).
getAutosuggestions ( $searchRequest, $fieldName, $userInput, $autosuggestType ) : array Retrieve auto-suggestions from the solr index corresponding to the given user input.
getAvailableFields ( $fieldType ) : array Returns an array with all (dynamic) fields in the index.
getInterestingTerms ( $articleId ) : array Retrieve "interesting terms" from a document to be used in a "similar documents" search.
getServerStatus ( ) : integer Checks the solr server status.
getServiceMessage ( ) : string Get the last service message.
markArticleChanged ( $articleId ) Mark a single article "changed" so that the indexing back-end will update it during the next batch update.
markJournalChanged ( $journalId ) : integer Mark the given journal for re-indexing.
pullChangedArticles ( $pullIndexingCallback, $batchSize = SOLR_INDEXING_MAX_BATCHSIZE, $journalId = null ) : integer Retrieves a batch of articles in XML format.
pushChangedArticles ( $batchSize = SOLR_INDEXING_MAX_BATCHSIZE, $journalId = null ) : integer (Re-)indexes all changed articles in Solr.
rebuildDictionaries ( ) Rebuilds the spelling/auto-suggest dictionaries.
reloadExternalFiles ( ) Reloads external files.
retrieveResults ( &$searchRequest, &$totalResults ) : array Execute a search against the Solr search server.

Method Details

__construct() public method

Constructor
public __construct ( $searchHandler, $username, $password, $instId, $useProxy = false )
$searchHandler string The search handler URL. The embedded server is assumed by default.
$username string The HTTP BASIC authentication username.
$password string The corresponding password.
$instId string The unique ID of this OJS installation to partition a shared index.
$useProxy boolean Whether the proxy settings from config.inc.php should be considered.
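
The constructor receives a single search handler URL, while the class keeps the server base URL, the core and the handler name in separate properties ($_solrServer, $_solrCore, $_solrSearchHandler). A minimal sketch of such a split, assuming the URL has the shape `<server>/<core>/<handler>`. The function name and the example URL are illustrative and not taken from the OJS source:

```php
<?php
/**
 * Hypothetical helper: split a search handler URL into the three parts
 * that the class stores separately. Assumes "<server>/<core>/<handler>".
 */
function splitSearchHandlerUrl(string $searchHandler): array
{
    $parts = explode('/', rtrim($searchHandler, '/'));
    if (count($parts) < 3) {
        throw new InvalidArgumentException('Unexpected search handler URL.');
    }
    $handler = array_pop($parts); // last path segment
    $core = array_pop($parts);    // second to last path segment
    return [
        'server' => implode('/', $parts) . '/', // base URL with trailing slash
        'core' => $core,
        'handler' => $handler,
    ];
}

// The example URL is an assumption, not a documented default.
$parts = splitSearchHandlerUrl('http://localhost:8983/solr/ojs/search');
echo $parts['server'], ' | ', $parts['core'], ' | ', $parts['handler'], PHP_EOL;
```

Endpoint helpers such as _getAdminUrl() or _getUpdateUrl() could then be built by recombining these parts, although the exact paths they produce are not documented here.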

_addArticleXml() public method

Add the metadata XML of a single article to an XML article list.
public _addArticleXml ( &$articleDoc, &$article, &$journal, $markToDelete = false )
$articleDoc DOMDocument
$article PublishedArticle
$journal Journal
$markToDelete boolean If true the returned XML will only contain a deletion marker.

_addSubquery() public method

Add a subquery to the search query. NB: Subqueries do not support collation (for alternative spelling suggestions).
public _addSubquery ( $fieldList, $searchPhrase, $params )
$fieldList string A list of fields to be queried, separated by '|'.
$searchPhrase string The search phrase to be added.
$params array The existing query parameters.

_cacheMiss() public method

Refresh the cache from the solr server.
public _cacheMiss ( $cache, $id ) : array
$cache FileCache
$id string The field type.
Returns array The available field names.

_convertDate() public method

Convert a date from local time (unix timestamp or ISO date string) to UTC time as understood by solr. NB: Using intermediate unix timestamps can be a problem in older PHP versions, especially on Windows where negative timestamps are not supported. As the Solr integration requires PHP 5, this should not be a big problem in practice, except for electronic publications dated earlier than 1901. Such a situation seems unlikely to arise with OJS.
public _convertDate ( $timestamp ) : string
$timestamp int|string Unix timestamp or local ISO time.
Returns string ISO UTC timestamp
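
To illustrate the conversion, here is a minimal sketch that is not the OJS implementation. It uses PHP's DateTime classes instead of intermediate unix timestamps; the function name and the explicit `$localTimezone` parameter are assumptions made for this example:

```php
<?php
/**
 * Hypothetical helper: convert a local time (unix timestamp or ISO date
 * string) to the UTC notation Solr uses for dates, e.g. 2012-01-15T11:00:00Z.
 */
function toSolrUtcDate($timestamp, string $localTimezone = 'UTC'): string
{
    if (is_int($timestamp)) {
        // Unix timestamps are timezone-independent by definition.
        $date = new DateTime('@' . $timestamp);
    } else {
        // Date strings without an offset are read in the local timezone.
        $date = new DateTime($timestamp, new DateTimeZone($localTimezone));
    }
    $date->setTimezone(new DateTimeZone('UTC'));
    return $date->format('Y-m-d\TH:i:s\Z');
}

echo toSolrUtcDate(0), PHP_EOL;
echo toSolrUtcDate('2012-01-15 12:00:00', 'Europe/Berlin'), PHP_EOL;
```

Parsing ISO strings directly with DateTime sidesteps the negative-timestamp limitation mentioned above for string input.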

_deleteFromIndex() public method

Delete documents from the index (by ID or by query).
public _deleteFromIndex ( $xml ) : boolean
$xml string The documents to delete.
Returns boolean true if successful, otherwise false.
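
Solr's XML update format expresses deletions either by document ID or by query. The sketch below shows what a complete delete command could look like; whether the wrapper element is added by the caller or inside _deleteFromIndex() is not stated here. The helper name, the ID format and the field name in the query are assumptions for illustration only:

```php
<?php
/**
 * Hypothetical helper: build a Solr XML delete command.
 * $type is either "id" (delete one document) or "query" (delete matches).
 */
function buildDeleteXml(string $type, string $value): string
{
    if ($type !== 'id' && $type !== 'query') {
        throw new InvalidArgumentException('Type must be "id" or "query".');
    }
    // Escape &, < and > so the value is safe inside an XML text node.
    $escaped = htmlspecialchars($value, ENT_XML1 | ENT_NOQUOTES, 'UTF-8');
    return "<delete><{$type}>{$escaped}</{$type}></delete>";
}

// Both the document ID and the field name below are made up.
echo buildDeleteXml('id', 'my-installation-17'), PHP_EOL;
echo buildDeleteXml('query', 'journal_id:"my-installation-3"'), PHP_EOL;
```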

_expandFieldList() public method

Expand the given list of fields.
public _expandFieldList ( $fields ) : string
$fields array
Returns string A space-separated field list (e.g. to be used in edismax's qf parameter).
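
As an illustration of the return format, the following sketch expands logical field names into index field names and joins them with spaces. The lookup table and the locale-suffixed names are made up; the real expansion depends on the fields actually present in the index:

```php
<?php
/**
 * Hypothetical helper: expand logical field names via a lookup table
 * and return them as a space-separated list.
 */
function expandFieldList(array $fields, array $indexFields): string
{
    $expanded = [];
    foreach ($fields as $field) {
        // Unknown fields simply contribute nothing.
        foreach ($indexFields[$field] ?? [] as $indexField) {
            $expanded[] = $indexField;
        }
    }
    return implode(' ', array_unique($expanded));
}

// Invented index field names, for illustration only.
$indexFields = [
    'title' => ['title_en_US', 'title_de_DE'],
    'abstract' => ['abstract_en_US'],
];
echo expandFieldList(['title', 'abstract'], $indexFields), PHP_EOL;
// prints "title_en_US title_de_DE abstract_en_US"
```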

_getAdminUrl() public method

Identifies the general solr admin endpoint from the search handler URL.
public _getAdminUrl ( ) : string
Returns string

_getArticleListXml() public method

Retrieve the XML for a batch of articles to be updated.
public _getArticleListXml ( &$articles, $totalCount, &$numDeleted ) : string
$articles DBResultFactory The articles to be included in the list.
$totalCount integer The overall number of changed articles (not only the current batch).
$numDeleted integer An output parameter that returns the number of documents that will be deleted.
Returns string The XML ready to be consumed by the Solr data import service.

_getAutosuggestUrl() public method

Returns the solr auto-suggestion endpoint.
public _getAutosuggestUrl ( $autosuggestType ) : string
$autosuggestType string One of the SOLR_AUTOSUGGEST_* constants
Returns string

_getCache() public method

Get the field cache.
public _getCache ( ) : FileCache
Returns FileCache

_getCoreAdminUrl() public method

Identifies the solr core-specific admin endpoint from the search handler URL.
public _getCoreAdminUrl ( ) : string
Returns string

_getDihUrl() public method

Returns the solr DIH endpoint.
public _getDihUrl ( ) : string
Returns string

_getDocumentsProcessed() public method

Retrieve the number of indexed documents from a DIH response XML.
public _getDocumentsProcessed ( $result ) : integer
$result DOMXPath
Returns integer

_getFacetingAutosuggestions() public method

Retrieve auto-suggestions from the faceting service.
public _getFacetingAutosuggestions ( $url, $searchRequest, $userInput, $fieldName ) : array
$url string
$searchRequest SolrSearchRequest
$userInput string
$fieldName string
Returns array The generated suggestions.

_getFieldNames() public method

Return a list of all text fields that may occur in the index.
public _getFieldNames ( $fieldType ) : array
$fieldType string "search", "sort" or "all"
Returns array

_getInterestingTermsUrl() public method

Returns the solr endpoint to retrieve "interesting terms" from a given document.
public _getInterestingTermsUrl ( ) : string
Returns string

_getIssue() public method

Retrieve an issue (possibly from the cache).
public _getIssue ( $issueId, $journalId ) : Issue
$issueId int
$journalId int
Returns Issue

_getJournal() public method

Retrieve a journal (possibly from the cache).
public _getJournal ( $journalId ) : Journal
$journalId int
Returns Journal

_getLocalesAndFormats() public method

Identify all format/locale versions of the given field.
public _getLocalesAndFormats ( $field ) : array
$field string A field name without any extension.
Returns array A list of index fields.

_getOrdering() public method

Generate the ordering parameter of a search query.
public _getOrdering ( $field, $direction ) : string
$field string The field to order by.
$direction boolean true for ascending, false for descending.
Returns string The ordering to be used (default: descending relevance).
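
Solr's sort parameter takes the form "field direction", with `score` as the relevance pseudo-field. A minimal sketch of the described behaviour, assuming that an empty field name triggers the default; the function name and the example field name are hypothetical:

```php
<?php
/**
 * Hypothetical helper: build a Solr sort expression.
 * Falls back to descending relevance when no field is given.
 */
function buildOrdering(?string $field, bool $direction): string
{
    if ($field === null || $field === '') {
        return 'score desc'; // default: most relevant documents first
    }
    return $field . ' ' . ($direction ? 'asc' : 'desc');
}

echo buildOrdering(null, true), PHP_EOL;                    // prints "score desc"
echo buildOrdering('issuePublicationDate', true), PHP_EOL;  // prints "issuePublicationDate asc"
```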

_getReloadExternalFilesUrl() public method

Returns the solr endpoint to reload external files.
public _getReloadExternalFilesUrl ( )

_getSearchQueryParameters() public method

Create the edismax query parameters from a search request.
public _getSearchQueryParameters ( &$searchRequest ) : array | null
$searchRequest SolrSearchRequest
Returns array | null A parameter array or null if something went wrong.

_getSearchUrl() public method

Returns the solr search endpoint.
public _getSearchUrl ( ) : string
Returns string

_getSuggesterAutosuggestions() public method

Retrieve auto-suggestions from the suggester service.
public _getSuggesterAutosuggestions ( $url, $userInput, $fieldName ) : array
$url string
$userInput string
$fieldName string
Returns array The generated suggestions.

_getUpdateUrl() public method

Returns the solr update endpoint.
public _getUpdateUrl ( ) : string
Returns string

_indexingTransaction() public method

This method encapsulates an indexing transaction (pull or push). It consists of generating the XML, transferring it to the server and marking the transferred articles as "indexed".
public _indexingTransaction ( $sendXmlCallback, $batchSize = SOLR_INDEXING_MAX_BATCHSIZE, $journalId = null )
$sendXmlCallback callback This function will be called with the generated XML.
$batchSize integer The maximum number of articles to be returned.
$journalId integer If given, only retrieves articles for the given journal.

_isArticleAccessAuthorized() public method

Check whether access to the given article is authorized to the requesting party (i.e. the Solr server).
public _isArticleAccessAuthorized ( &$article ) : boolean
$article Article
Returns boolean True if authorized, otherwise false.

_makeRequest() public method

Make a request.
public _makeRequest ( $url, $params = [], $method = 'GET' ) : DOMXPath
$url string The request URL
$params mixed array (key value pairs) or string request parameters
$method string GET or POST
Returns DOMXPath An XPath object with the response loaded. Null if an error occurred. See _serviceMessage for more details about the error.
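
The return contract (a DOMXPath on success, null on error) can be sketched without any network access. The helper below only covers the parsing step; its name is hypothetical and the sample XML is a trimmed, illustrative Solr-style response rather than captured server output:

```php
<?php
/**
 * Hypothetical helper: load an XML response body into a DOMXPath object.
 * Returns null when the body is empty or not well-formed.
 */
function loadResponse(string $responseBody): ?DOMXPath
{
    $previous = libxml_use_internal_errors(true); // collect parser errors silently
    $document = new DOMDocument();
    $loaded = $responseBody !== '' && $document->loadXML($responseBody);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return $loaded ? new DOMXPath($document) : null;
}

$xpath = loadResponse('<response><result name="response" numFound="3" start="0"/></response>');
if ($xpath === null) {
    echo 'Request failed', PHP_EOL;
} else {
    // Read the total hit count from the result element.
    echo $xpath->evaluate('string(/response/result/@numFound)'), PHP_EOL;
}
```

Callers of the real method would check for null in the same way before evaluating XPath expressions.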

_pushIndexingCallback() public method

Handle push indexing. This method pushes XML with index changes directly to the Solr data import handler for immediate processing.
public _pushIndexingCallback ( &$articleXml, $batchCount, $numDeleted ) : integer
$articleXml string The XML with index changes to be pushed to the Solr server.
$batchCount integer The number of articles in the XML list (i.e. the expected number of documents to be indexed).
$numDeleted integer The number of articles in the XML list that are marked for deletion.
Returns integer The number of articles processed or null if an error occurred. After an error the method SolrWebService::getServiceMessage() will return details of the error.

_setQuery() public method

Set the query parameters for a search query.
public _setQuery ( $fieldList, $searchPhrase, $spellcheck = false )
$fieldList string A list of fields to be queried, separated by '|'.
$searchPhrase string The search phrase to be added.
$spellcheck boolean Whether to switch spellchecking on.

_translateSearchPhrase() public method

Translate query keywords.
public _translateSearchPhrase ( $searchPhrase, $backwards = false ) : string
$searchPhrase string
Returns string The translated search phrase.

deleteArticleFromIndex() public method

Deletes the given article from the Solr index.
public deleteArticleFromIndex ( $articleId ) : boolean
$articleId integer The ID of the article to be deleted.
Returns boolean true if successful, otherwise false.

deleteArticlesFromIndex() public method

Deletes all articles of a journal or of the installation from the Solr index.
public deleteArticlesFromIndex ( $journalId = null ) : boolean
$journalId integer If given, only articles from this journal will be deleted.
Returns boolean true if successful, otherwise false.

flushFieldCache() public method

Flush the field cache.
public flushFieldCache ( )

getArticleFromIndex() public method

Retrieve a document directly from the index (for testing/debugging purposes only).
public getArticleFromIndex ( $articleId ) : array
$articleId integer
Returns array The document fields.

getAutosuggestions() public method

Retrieve auto-suggestions from the solr index corresponding to the given user input.
public getAutosuggestions ( $searchRequest, $fieldName, $userInput, $autosuggestType ) : array
$searchRequest SolrSearchRequest Active search filters. Choosing the faceting auto-suggest implementation via $autosuggestType will pre-filter auto-suggestions based on this search request. In case of the suggester component, the search request will simply be ignored.
$fieldName string The field to suggest values for. Values are queried on field level to improve relevance of suggestions.
$userInput string Partial query input. This input will be split up. Only the last query term will be used to suggest values.
$autosuggestType string One of the SOLR_AUTOSUGGEST_* constants. The faceting implementation is slower but will return more relevant suggestions. The suggester implementation is faster and scales better in large deployments. It will, however, return terms from a field-specific global dictionary, e.g. terms from different journals.
Returns array A list of suggested queries
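
The handling of $userInput (only the last query term is completed) can be illustrated as follows. This is a sketch under the assumption that terms are separated by single spaces; both function names and the sample suggestions are invented for the example:

```php
<?php
/**
 * Hypothetical helper: split partial input into the part that stays
 * untouched and the last term that should be completed.
 */
function splitUserInput(string $userInput): array
{
    $pos = strrpos($userInput, ' ');
    if ($pos === false) {
        return ['', $userInput]; // single term: nothing to keep
    }
    return [substr($userInput, 0, $pos + 1), substr($userInput, $pos + 1)];
}

/**
 * Hypothetical helper: turn term suggestions for the last term into
 * complete suggested queries.
 */
function completeInput(string $userInput, array $termSuggestions): array
{
    [$prefix] = splitUserInput($userInput);
    return array_map(function ($term) use ($prefix) {
        return $prefix . $term;
    }, $termSuggestions);
}

print_r(completeInput('open access jour', ['journal', 'journalism']));
```

With the sample input, the suggested queries are "open access journal" and "open access journalism".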

getAvailableFields() public method

Returns an array with all (dynamic) fields in the index. NB: This is cached data, so after an index update the field cache may have to be flushed (see flushFieldCache()) to re-read the current index state.
public getAvailableFields ( $fieldType ) : array
$fieldType string Either 'search' or 'sort'.
Returns array

getInterestingTerms() public method

Retrieve "interesting terms" from a document to be used in a "similar documents" search.
public getInterestingTerms ( $articleId ) : array
$articleId integer The article from which we retrieve "interesting terms".
Returns array An array of terms that can be used to execute a search for similar documents.

getServerStatus() public method

Checks the solr server status.
public getServerStatus ( ) : integer
Returns integer One of the SOLR_STATUS_* constants.

getServiceMessage() public method

Get the last service message.
public getServiceMessage ( ) : string
Returns string

markArticleChanged() public method

Mark a single article "changed" so that the indexing back-end will update it during the next batch update.
public markArticleChanged ( $articleId )
$articleId integer

markJournalChanged() public method

Mark the given journal for re-indexing.
public markJournalChanged ( $journalId ) : integer
$journalId integer The ID of the journal to be (re-)indexed.
Returns integer The number of articles that have been marked.

pullChangedArticles() public method

Retrieves a batch of articles in XML format. This is the pull-indexing implementation of the Solr web service. To control memory usage and response time we index articles in batches. Batches should be as large as possible to reduce index commit overhead.
public pullChangedArticles ( $pullIndexingCallback, $batchSize = SOLR_INDEXING_MAX_BATCHSIZE, $journalId = null ) : integer
$batchSize integer The maximum number of articles to be returned.
$journalId integer If given, only returns articles from the given journal.
Returns integer The number of articles processed or null if an error occurred. After an error the method SolrWebService::getServiceMessage() will return details of the error.

pushChangedArticles() public method

(Re-)indexes all changed articles in Solr. This is the push-indexing implementation of the Solr web service. To control memory usage and response time we index articles in batches. Batches should be as large as possible to reduce index commit overhead.
public pushChangedArticles ( $batchSize = SOLR_INDEXING_MAX_BATCHSIZE, $journalId = null ) : integer
$batchSize integer The maximum number of articles to be indexed in this run.
$journalId integer If given, restricts index updates to the given journal.
Returns integer The number of articles processed or null if an error occurred. After an error the method SolrWebService::getServiceMessage() will return details of the error.
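
Because each call handles at most one batch and signals errors with null, a caller that wants to index everything would need a loop. The following is a sketch of such a loop under the assumption that a call returning 0 means nothing is left to index. The `$pushBatch` callable stands in for the real method and the function name is hypothetical:

```php
<?php
/**
 * Hypothetical driver: call a batch indexer until no changes remain.
 * Returns the total number of processed articles or null on error.
 */
function pushAllChanges(callable $pushBatch, int $batchSize): ?int
{
    $total = 0;
    do {
        $processed = $pushBatch($batchSize);
        if ($processed === null) {
            return null; // the real class exposes details via getServiceMessage()
        }
        $total += $processed;
    } while ($processed > 0);
    return $total;
}

// Stand-in for the web service: 250 pending articles, no errors.
$pending = 250;
$fakePush = function (int $batchSize) use (&$pending): ?int {
    $count = min($batchSize, $pending);
    $pending -= $count;
    return $count;
};

echo pushAllChanges($fakePush, 100), PHP_EOL; // prints 250
```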

rebuildDictionaries() public method

Rebuilds the spelling/auto-suggest dictionaries.
public rebuildDictionaries ( )

reloadExternalFiles() public method

Reloads external files.
public reloadExternalFiles ( )

retrieveResults() public method

Execute a search against the Solr search server.
public retrieveResults ( &$searchRequest, &$totalResults ) : array
$searchRequest SolrSearchRequest
$totalResults integer An output parameter returning the total number of search results found by the query. This differs from the actual number of returned results as the search can be limited.
Returns array An array of search results, or null if an error occurred while querying the server. The main keys are the result types "scoredResults" and "alternativeSpelling". In the "scoredResults" sub-array the keys are scores (1-9999) and the values are article IDs. The "alternativeSpelling" sub-array holds an alternative query string (if any) and the number of hits for that string.
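
To make the shape of the return value concrete, the sketch below extracts article IDs from a "scoredResults" sub-array in descending score order. The helper name and the sample data are invented; only the key layout follows the description above:

```php
<?php
/**
 * Hypothetical helper: list article IDs from a result array,
 * best score first. A null result (error) yields an empty list.
 */
function articleIdsByScore(?array $results): array
{
    if ($results === null || empty($results['scoredResults'])) {
        return [];
    }
    $scored = $results['scoredResults']; // score => article ID
    krsort($scored, SORT_NUMERIC);       // highest score first
    return array_values($scored);
}

// Made-up sample in the documented layout.
$results = ['scoredResults' => [4210 => 17, 9999 => 5, 130 => 42]];
print_r(articleIdsByScore($results));
```

For the sample data the resulting order of article IDs is 5, 17, 42.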

Property Details

$_fieldCache public property

A cache containing the available search fields.
public $_fieldCache

$_instId public property

The unique ID identifying this OJS installation to the solr server.
public $_instId

$_issueCache public property

An issue cache.
public $_issueCache

$_journalCache public property

A journal cache.
public $_journalCache

$_serviceMessage public property

A description of the last error or message that occurred when calling the service.
public $_serviceMessage

$_solrCore public property

The solr core we get our data from.
public $_solrCore

$_solrSearchHandler public property

The solr search handler name we place our searches on.
public $_solrSearchHandler

$_solrServer public property

The base URL of the solr server without core and search handler.
public $_solrServer

$_useProxySettings public property

Whether the proxy settings in config.inc.php should be considered for the web service request.
public $_useProxySettings