Istella is glad to release the Istella22 Dataset to the public.

To use the dataset, you must read and accept the Istella22 Licence Agreement. By using the dataset, you agree to be bound by the terms of the licence: the Istella22 dataset is solely for non-commercial use.

Neural approaches that use pre-trained language models are effective at various ranking tasks, such as question answering and ad-hoc document ranking. However, their effectiveness compared to feature-based Learning-to-Rank (LtR) methods has not yet been well established. A major reason is that current LtR benchmarks, which contain query-document feature vectors, do not contain the raw query and document text needed by neural models. Conversely, the benchmarks often used for evaluating neural models (e.g., MS MARCO and TREC Robust) provide text but not query-document feature vectors.

The Istella22 dataset enables such comparisons by providing both query/document text and strong query-document feature vectors used by an industrial search engine. The dataset consists of a comprehensive corpus of 8.4M web documents, a collection of query-document pairs including 220 hand-crafted features, relevance judgments on a 5-graded scale, and a set of 2,198 textual queries used for testing purposes.

Istella22 enables a fair evaluation of traditional learning-to-rank and transfer ranking techniques on the same data. LtR models exploit the feature-based representations of training samples, while pre-trained transformer-based neural rankers can be evaluated on the corresponding textual content of queries and documents.

If you want to use the dataset in your research, you can download the Istella22 Dataset here:

May 2023: You can optionally download the document identifiers of the query-document pairs in the test.svm file (row-aligned).
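
As an illustration of what row alignment means in practice, the following Python sketch pairs each feature row with its document identifier. It rests on assumptions not stated above: that test.svm follows the common SVMlight/LETOR text layout (`label qid:<id> <index>:<value> ...`) and that the identifier file holds one identifier per line. The function names `parse_svm_line` and `align_rows` are hypothetical helpers, not part of any Istella tooling.

```python
from typing import Dict, Iterable, List, Tuple

Row = Tuple[int, str, Dict[int, float]]


def parse_svm_line(line: str) -> Row:
    """Parse one line, assuming the layout 'label qid:<id> <index>:<value> ...'."""
    # Drop an optional trailing comment, then split on whitespace.
    tokens = line.split("#", 1)[0].split()
    if len(tokens) < 2 or not tokens[1].startswith("qid:"):
        raise ValueError(f"unexpected line layout: {line!r}")
    label = int(float(tokens[0]))          # graded relevance label
    qid = tokens[1].split(":", 1)[1]       # query identifier, kept as a string
    features: Dict[int, float] = {}
    for token in tokens[2:]:
        index, value = token.split(":", 1)
        features[int(index)] = float(value)
    return label, qid, features


def align_rows(
    svm_lines: Iterable[str], doc_id_lines: Iterable[str]
) -> List[Tuple[str, int, str, Dict[int, float]]]:
    """Pair the i-th feature row with the i-th document identifier."""
    rows = [parse_svm_line(l) for l in svm_lines if l.strip()]
    doc_ids = [d.strip() for d in doc_id_lines if d.strip()]
    if len(rows) != len(doc_ids):
        # Row alignment only makes sense when both files have the same length.
        raise ValueError(f"{len(rows)} feature rows but {len(doc_ids)} identifiers")
    return [(doc_id, label, qid, feats)
            for doc_id, (label, qid, feats) in zip(doc_ids, rows)]
```

Both arguments can be open file objects, since iterating over a text file yields its lines. The length check guards against silently mispairing rows if either file is truncated.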

If you use the dataset, we ask you to acknowledge Istella SpA and cite the following publication in your research:

D. Dato, S. MacAvaney, F.M. Nardini, R. Perego, N. Tonellotto
The Istella22 Dataset: Bridging Traditional and Neural Learning to Rank Evaluation
SIGIR ’22: 45th International ACM SIGIR Conference on Research and Development in Information Retrieval
DOI: 10.1145/3477495.3531740