RetrieverReranker

class ragger_duck.retrieval.RetrieverReranker(*, retrievers, cross_encoder, min_top_k=None, max_top_k=None, threshold=None, drop_duplicates=True)

Hybrid retriever (lexical and semantic) followed by a cross-encoder reranker.

Several retrievers can be passed, in which case the cross-encoder reranks their combined results.

Parameters:

retrievers (list of retriever instances): The retrievers to use for retrieving the context. Each retriever is expected to implement a query method.
cross_encoder (CrossEncoder): Cross-encoder used to rerank the results of the hybrid retriever.
min_top_k (int, default=None): Minimum number of documents to retrieve. If None, no minimum is enforced, so fewer documents may be returned.
max_top_k (int, default=None): Maximum number of documents to retrieve. If None, all the documents are retrieved.
threshold (float, default=None): Threshold used to filter documents by their cross_encoder score. If None, no score-based filtering is applied.
drop_duplicates (bool, default=True): Whether to drop duplicates from the retrieved documents. This step is done right after the retrieval step.
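
To illustrate how the parameters above could interact, here is a minimal, self-contained sketch of a retrieve, deduplicate, score, and filter flow. It is not the ragger_duck implementation: `KeywordRetriever`, `overlap_score`, and `rerank` are hypothetical names, the word-overlap scorer merely stands in for a real cross-encoder, and both the strict `>` comparison against `threshold` and the fallback to the best `min_top_k` documents are assumptions about the intended behavior.

```python
from typing import Callable, List, Optional, Sequence


class KeywordRetriever:
    """Hypothetical retriever: returns documents sharing a word with the query."""

    def __init__(self, documents: Sequence[str]):
        self.documents = list(documents)

    def query(self, query: str) -> List[str]:
        words = set(query.lower().split())
        return [d for d in self.documents if words & set(d.lower().split())]


def overlap_score(query: str, document: str) -> float:
    """Hypothetical stand-in for a cross-encoder: fraction of query words found."""
    q_words = set(query.lower().split())
    d_words = set(document.lower().split())
    return len(q_words & d_words) / len(q_words) if q_words else 0.0


def rerank(
    query: str,
    retrievers: Sequence[KeywordRetriever],
    score_fn: Callable[[str, str], float],
    min_top_k: Optional[int] = None,
    max_top_k: Optional[int] = None,
    threshold: Optional[float] = None,
    drop_duplicates: bool = True,
) -> List[str]:
    # 1. Collect candidates from every retriever.
    candidates: List[str] = []
    for retriever in retrievers:
        candidates.extend(retriever.query(query))

    # 2. Drop duplicates right after retrieval, keeping first occurrences.
    if drop_duplicates:
        candidates = list(dict.fromkeys(candidates))

    # 3. Score each candidate and sort by decreasing score (stable for ties).
    scored = sorted(
        ((score_fn(query, doc), doc) for doc in candidates),
        key=lambda pair: pair[0],
        reverse=True,
    )

    # 4. Filter on the threshold (assumed strict), falling back to the best
    #    min_top_k documents when too few pass (assumed behavior).
    kept = scored
    if threshold is not None:
        kept = [pair for pair in scored if pair[0] > threshold]
        if min_top_k is not None and len(kept) < min_top_k:
            kept = scored[:min_top_k]

    # 5. Cap the number of returned documents.
    if max_top_k is not None:
        kept = kept[:max_top_k]
    return [doc for _, doc in kept]


retrievers = [
    KeywordRetriever(["fit a random forest", "plot a learning curve"]),
    KeywordRetriever(["fit a random forest", "tune a random forest"]),
]
question = "fit random forest"
strict = rerank(question, retrievers, overlap_score, threshold=0.9)
relaxed = rerank(question, retrievers, overlap_score, threshold=0.9, min_top_k=2)
```

In this toy setup, `strict` keeps only the document that clears the threshold, while `relaxed` also returns the next-best document so that `min_top_k` is honored. The sketch deduplicates with `dict.fromkeys`, which only works for hashable documents such as strings.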

Methods

`fit`([X, y])	Compute the vocabulary and the idf.
`get_metadata_routing`()	Get metadata routing of this object.
`get_params`([deep])	Get parameters for this estimator.
`query`(query)	Retrieve the most relevant documents for the query.
`set_params`(**params)	Set the parameters of this estimator.

fit(X=None, y=None)

Compute the vocabulary and the idf.

Parameters:

X (list of str or dict): The input data.
y (None): This parameter is ignored.

Returns:

self: The fitted estimator.

get_metadata_routing()

Get metadata routing of this object.

Please check the User Guide on how the routing mechanism works.

Returns:

routing (MetadataRequest): A MetadataRequest encapsulating routing information.

get_params(deep=True)

Get parameters for this estimator.

Parameters:

deep (bool, default=True): If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params (dict): Parameter names mapped to their values.

query(query)

Retrieve the most relevant documents for the query.

Parameters:

query (str): The user query.

Returns:

list of str or dict: The most relevant documents from the training set.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:

**params (dict): Estimator parameters.

Returns:

self (estimator instance): Estimator instance.
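
The `<component>__<parameter>` convention described for set_params can be illustrated with a toy class. This is only a sketch of the naming scheme, not the scikit-learn or ragger_duck code: `ToyEstimator` and the parameter names `retriever`, `top_k`, and `threshold` used on it are hypothetical.

```python
class ToyEstimator:
    """Hypothetical stand-in that mimics the double-underscore convention."""

    def __init__(self, **params):
        self._params = dict(params)

    def get_params(self):
        return dict(self._params)

    def set_params(self, **params):
        for key, value in params.items():
            # Split "component__parameter" into the component name and the rest.
            name, sep, sub_key = key.partition("__")
            if name not in self._params:
                raise ValueError(f"Invalid parameter {name!r}")
            if sep:
                # Delegate the remainder of the key to the nested object.
                self._params[name].set_params(**{sub_key: value})
            else:
                self._params[name] = value
        return self


inner = ToyEstimator(top_k=5)
outer = ToyEstimator(retriever=inner, threshold=None)

# One call updates a parameter of the nested object and one of the outer object.
outer.set_params(retriever__top_k=10, threshold=0.5)
```

After the call, the nested object holds `top_k=10` and the outer object holds `threshold=0.5`. Returning `self` allows calls to be chained, matching the documented return value.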