RetrieverReranker

class ragger_duck.retrieval.RetrieverReranker(*, retrievers, cross_encoder, min_top_k=None, max_top_k=None, threshold=None, drop_duplicates=True)

Hybrid retriever (lexical and semantic) followed by a cross-encoder reranker.

Several retrievers can be passed, in which case their combined results are reranked together.

Parameters:
retrievers : list of retriever instances

The retrievers used to retrieve the context. Each retriever is expected to implement a query method.

cross_encoder : CrossEncoder

Cross-encoder used to rerank the results of the hybrid retriever.

min_top_k : int, default=None

Minimum number of documents to retrieve. If None, no minimum is enforced.

max_top_k : int, default=None

Maximum number of documents to retrieve. If None, all the documents are retrieved.

threshold : float, default=None

Threshold used to filter the scores of the cross_encoder. If None, the scores are not filtered based on a threshold.

drop_duplicates : bool, default=True

Whether to drop duplicates from the retrieved documents. This step is done right after the retrieval step.
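The interaction between min_top_k, max_top_k and threshold is easier to see on a small example. The sketch below is not the library's implementation: `select_indices` is a hypothetical helper, and the fallback to the best min_top_k documents when too few scores pass the threshold is an assumption made for illustration.

```python
def select_indices(scores, min_top_k=None, max_top_k=None, threshold=None):
    """Return the indices of the documents to keep, best score first.

    Illustrative only: the real selection logic may differ.
    """
    # Rank every candidate by decreasing cross-encoder score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    if threshold is None:
        # No threshold: every candidate stays in play.
        kept = ranked
    else:
        kept = [i for i in ranked if scores[i] > threshold]
        # Assumed behavior: if the threshold is too strict, fall back to
        # the min_top_k best-scoring documents.
        if min_top_k is not None and len(kept) < min_top_k:
            kept = ranked[:min_top_k]

    # Cap the number of returned documents.
    if max_top_k is not None:
        kept = kept[:max_top_k]
    return kept


scores = [0.1, 0.9, 0.4, 0.7]
above_threshold = select_indices(scores, threshold=0.5)
fallback = select_indices(scores, threshold=0.95, min_top_k=2)
capped = select_indices(scores, max_top_k=3)
```

Under these assumptions, a strict threshold combined with min_top_k still yields some context, while max_top_k bounds how many documents are returned.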

Methods

fit([X, y])

Compute the vocabulary and the idf.

get_metadata_routing()

Get metadata routing of this object.

get_params([deep])

Get parameters for this estimator.

query(query)

Retrieve the most relevant documents for the query.

set_params(**params)

Set the parameters of this estimator.

fit(X=None, y=None)

Compute the vocabulary and the idf.

Parameters:
X : list of str or dict

The input data.

y : None

This parameter is ignored.

Returns:
self

The fitted estimator.

get_metadata_routing()

Get metadata routing of this object.

Please check the User Guide on how the routing mechanism works.

Returns:
routing : MetadataRequest

A MetadataRequest encapsulating routing information.

get_params(deep=True)

Get parameters for this estimator.

Parameters:
deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:
params : dict

Parameter names mapped to their values.

query(query)

Retrieve the most relevant documents for the query.

Parameters:
query : str

The user query.

Returns:
list of str or dict

The list of the most relevant documents from the training set.
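The retrieve, deduplicate and rerank flow described on this page can be sketched with toy components. Everything below is a stand-in written for illustration: `KeywordRetriever`, `OverlapCrossEncoder` and `retrieve_and_rerank` are hypothetical names, the scorer is a simple word-overlap ratio rather than a real cross-encoder, and a `predict` method taking (query, document) pairs is assumed for the scorer.

```python
class KeywordRetriever:
    """Toy retriever: returns documents sharing at least one word with the query."""

    def __init__(self, documents):
        self.documents = documents

    def query(self, query):
        words = set(query.lower().split())
        return [doc for doc in self.documents if words & set(doc.lower().split())]


class OverlapCrossEncoder:
    """Stand-in scorer: Jaccard word overlap between query and document."""

    def predict(self, pairs):
        scores = []
        for query, doc in pairs:
            q, d = set(query.lower().split()), set(doc.lower().split())
            scores.append(len(q & d) / len(q | d))
        return scores


def retrieve_and_rerank(query, retrievers, cross_encoder, drop_duplicates=True):
    # 1. Gather the candidates proposed by every retriever.
    candidates = []
    for retriever in retrievers:
        candidates.extend(retriever.query(query))

    # 2. Drop duplicates right after retrieval, keeping first-seen order.
    #    (Works for str documents; dict documents would need a hashable key.)
    if drop_duplicates:
        candidates = list(dict.fromkeys(candidates))

    # 3. Score each (query, document) pair and sort by decreasing score.
    scores = cross_encoder.predict([(query, doc) for doc in candidates])
    order = sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
    return [candidates[i] for i in order]


retriever_a = KeywordRetriever(["fit a random forest", "plot a decision tree"])
retriever_b = KeywordRetriever(
    ["fit a random forest", "tune a random forest with grid search"]
)
result = retrieve_and_rerank(
    "fit a random forest", [retriever_a, retriever_b], OverlapCrossEncoder()
)
with_duplicates = retrieve_and_rerank(
    "fit a random forest",
    [retriever_a, retriever_b],
    OverlapCrossEncoder(),
    drop_duplicates=False,
)
```

Both toy retrievers return "fit a random forest"; with duplicates dropped it is scored once, and the reranker places it first because it matches the query exactly.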

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:
**params : dict

Estimator parameters.

Returns:
self : estimator instance

Estimator instance.
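The <component>__<parameter> naming can be illustrated with a plain scikit-learn Pipeline. This is a generic sketch of the convention, not a statement about which nested names RetrieverReranker itself exposes; the step names "scale" and "clf" are chosen for this example only.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipeline = Pipeline([("scale", StandardScaler()), ("clf", LogisticRegression())])

# Update parameters of nested components using <component>__<parameter>.
returned = pipeline.set_params(clf__C=0.5, scale__with_mean=False)

# get_params(deep=True) exposes the same nested names.
params = pipeline.get_params(deep=True)
```

set_params returns the estimator itself, so calls can be chained.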