Discovery Challenge

Farfetch Fashion Recommendations Challenge


The importance of online sales in the luxury fashion space has grown at an accelerating pace over the last few years, as consumers of this once traditional industry now expect easy access to a worldwide network of brands and retailers. Farfetch operates in this space, and its mission is to bring together creators, curators, and consumers of fashion from all over the world.

To be successful in this landscape, it's necessary to provide a tailored, personalised and authoritative fashion shopping experience. Recommendation systems play an important role in the user journey, allowing customers to discover products that speak to their style, complement their choices, or challenge them with bold new ideas. Farfetch continuously works to improve its own recommender system with this ambitious goal in mind.

In this challenge, you will attempt to solve this unique problem by predicting which products were clicked in a set of recommendations shown to a user. This is a core task of any successful recommender system, and the key to understanding user behaviour. For this, we will provide an anonymised dataset consisting of real interactions of users with the Farfetch platform.

Finding the planet in the noise: De-trending exoplanet lightcurves from non-linear noise


To date we know of over 4600 extrasolar planets – planets orbiting other stars – in our galaxy. By observing and analysing the thin atmospheric envelope of these planets, we can characterise their nature and understand their formation and evolution histories. The European Space Agency’s Ariel space telescope, to be launched in 2029, will aim to characterise 1000 of these foreign worlds. Ariel will measure the chemical and cloud signatures of these atmospheres by collecting the stellar light that passes through the atmospheric annulus as the planet transits in front of its star in our line of sight. This measurement pushes current technology to its very limits, and the scientific signal is heavily corrupted by non-Gaussian instrument noise and astrophysical distortions originating from the activity of the host star.

Modelling and correcting for the non-linear, and often unknown, nature of these noise sources is one of the central challenges. To date, solutions to this data mining problem are often ad hoc and prone to significant biases because they apply inaccurate or incomplete parametric corrections. This remains an unsolved problem, and learning the non-linear nature of the noise is a promising new direction. The challenge will focus on building supervised time series de-trending algorithms that detect and remove non-linear instrument and stellar noise (signal denoising). The data consist of sets of spectroscopic light curves (simultaneously recorded time series of flux in different wavelengths of light) corrupted by stellar and instrument noise, the corresponding clean sets, and auxiliary observation information.
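
To make the supervised setup concrete, the sketch below trains a correction on pairs of corrupted and clean light curves and then applies it to an unseen curve. Everything in it is an illustrative assumption rather than part of the challenge: the light curves are synthetic box-shaped transits at a single wavelength, the "instrument" distortion is a simple linear gain and offset (the real noise is non-linear), and the helper names (`box_transit`, `distort`, `fit_linear`, `detrend`, `mse`) are hypothetical. A linear least-squares fit stands in for whatever learned model a participant would actually use.

```python
# Illustrative sketch only: synthetic data and a linear baseline,
# not the challenge's data format or a recommended method.

def box_transit(n_steps, start, end, depth):
    """Clean light curve: flux 1.0 out of transit, 1.0 - depth in transit."""
    return [1.0 - depth if start <= t < end else 1.0 for t in range(n_steps)]

def distort(clean, gain, offset):
    """Toy instrument distortion, assumed linear here for simplicity."""
    return [gain * flux + offset for flux in clean]

def fit_linear(noisy_curves, clean_curves):
    """Least-squares fit of clean ~ a * noisy + b over all training samples."""
    xs = [x for curve in noisy_curves for x in curve]
    ys = [y for curve in clean_curves for y in curve]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov_xy / var_x
    b = mean_y - a * mean_x
    return a, b

def detrend(noisy, a, b):
    """Apply the learned correction to one corrupted light curve."""
    return [a * x + b for x in noisy]

def mse(pred, target):
    """Mean squared error between a predicted and a clean light curve."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)

# Training set: (corrupted, clean) pairs with different transit depths.
train_clean = [box_transit(50, 20, 30, depth) for depth in (0.01, 0.02, 0.03)]
train_noisy = [distort(curve, gain=0.9, offset=0.05) for curve in train_clean]
a, b = fit_linear(train_noisy, train_clean)

# Held-out curve with a transit shape not seen during training.
test_clean = box_transit(50, 15, 35, 0.015)
test_noisy = distort(test_clean, gain=0.9, offset=0.05)
test_pred = detrend(test_noisy, a, b)

error_before = mse(test_noisy, test_clean)
error_after = mse(test_pred, test_clean)
```

Because the toy distortion is exactly linear, the fitted correction recovers the held-out curve almost perfectly. With realistic non-linear noise across many wavelengths, a linear model like this would only serve as a baseline to compare learned models against.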

Solving this problem will mean improving our understanding of the characteristics of currently confirmed exoplanets, potentially recognising false positive / false negative detections of atmospheric chemistries and improving our ability to analyse observations of smaller and temperate worlds.

Discover the mysteries of the Maya


Remote sensing has greatly accelerated traditional archaeological landscape surveys in the forested regions of the ancient Maya. Typical exploration and discovery attempts, besides targeting whole ancient cities, also focus on individual buildings and structures. Recently, there have been successful attempts at using machine learning to identify ancient Maya settlements. These attempts, while relevant, focus on narrow areas and rely on high-quality aerial laser scanning (ALS) data, which covers only a fraction of the region where the ancient Maya once settled. Satellite image data produced by the European Space Agency’s (ESA) Sentinel missions, on the other hand, is abundant and, more importantly, publicly available.

In particular, the Sentinel-1 satellites are equipped with Synthetic Aperture Radar (SAR) operating globally with frequent revisit periods, while the Sentinel-2 satellites are equipped with one of the most sophisticated optical sensors (MSI and SWIR), capturing imagery from the visible to the medium-infrared spectrum with a spatial resolution of 10-60 m. While the latter has been shown to lead to accurate performance on a variety of remote sensing tasks, the data from the optical sensors is heavily affected by cloud cover, so combining it with radar data from the Sentinel-1 satellites provides an additional benefit. Integrating Sentinel data has been shown to lead to improved performance for different tasks of land-use and land-cover classification. This is the goal of the challenge: to explore the potential of the Sentinel satellite data, together with the available ALS data, for integrated image segmentation in order to locate and identify “lost” ancient Maya settlements hidden under the thick forest canopy.
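
One simple way to combine the two sensors is early fusion, where the optical and radar channels of each pixel are concatenated into a single feature vector before segmentation. The sketch below illustrates that idea only. It assumes the optical and radar tiles have already been resampled onto the same pixel grid and that a per-pixel cloud mask is available. The function name `fuse_pixels`, the band counts, and the sample values are hypothetical and are not taken from the challenge data.

```python
# Illustrative sketch only: per-pixel channel stacking on toy values.

def fuse_pixels(optical, radar, cloud_mask):
    """Concatenate optical and radar channels for every pixel.

    optical:    H x W x C_opt nested lists of optical band values
    radar:      H x W x C_sar nested lists of radar backscatter values
    cloud_mask: H x W nested lists of booleans (True = cloudy)

    For cloudy pixels the optical channels are zeroed and a cloud flag
    is appended, so a downstream model can learn to rely on radar there.
    """
    fused = []
    for opt_row, sar_row, cloud_row in zip(optical, radar, cloud_mask):
        fused_row = []
        for opt_px, sar_px, cloudy in zip(opt_row, sar_row, cloud_row):
            opt_part = [0.0] * len(opt_px) if cloudy else list(opt_px)
            flag = [1.0 if cloudy else 0.0]
            fused_row.append(opt_part + list(sar_px) + flag)
        fused.append(fused_row)
    return fused

# A 1 x 2 tile: three optical bands and two radar channels per pixel.
optical = [[[0.2, 0.4, 0.1], [0.9, 0.9, 0.9]]]
radar = [[[-8.0, -15.0], [-7.5, -14.0]]]
cloud_mask = [[False, True]]  # the second pixel is under cloud

fused = fuse_pixels(optical, radar, cloud_mask)
```

Each fused pixel then carries six values (three optical, two radar, one cloud flag) that a segmentation model could take as input channels. Other fusion strategies, such as processing each sensor separately and merging later, are equally plausible.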