Web Data

istella has developed a B2C search engine, investing in the creation of one of the largest Big Data infrastructures in Europe. This infrastructure makes it possible to crawl and process hundreds of millions of documents per day, and to retarget indexes, ranking and machine learning algorithms in a fraction of the time required by most competitors.

The platform that we run for web crawling analyzes and processes over 7 billion documents (web pages, social network interactions, videos, images, news, etc.). It is a unique set of data that we offer to enrich our customers' data, and it provides unique insight for data intelligence products and services:

7 billion URLs indexed;

400 signals extracted;

more than 15 billion URLs discovered;

100 million URLs refreshed daily;

2000 RSS feeds refreshed every 10 minutes;

1500 news sites refreshed every 10 minutes.