Algorithm learning for large-scale facilities
A large-scale facility can be described as an object that produces, as output, datasets D_i, which scientists analyse to obtain results R_i. The ideal data analysis trajectory for an experiment is thus
D_i → R_i → P_i,
where P_i denotes the desired output: a publication.
Most of the time, however, something goes wrong along the way: there is no time to analyse the data, or the analysis turns out to be harder than expected. To speed up the process and make it more efficient, I suggest a new approach based on machine learning methods. Once developed, it could be applied to various industries.
With time, a large set of algorithms A_ij is developed to analyse the data collected at a given large-scale facility. I propose to profile these algorithms, translating them into a high-level formal language, so as to create a platform that:
- if the data collected during an experiment are similar to a previous dataset, suggests in a concise format which analysis steps to follow;
- if the experiment is new, searches for how similar data have been treated elsewhere and uses machine learning techniques to suggest a possible data analysis approach.
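The first branch above can be sketched as a nearest-neighbour lookup: describe each past dataset by a feature vector, and retrieve the analysis pipeline of the closest match, falling back to the learning branch when nothing is close enough. The following is a minimal sketch under that assumption; all names, features, and pipeline steps are hypothetical illustrations, not part of the proposal.

```python
# Minimal sketch (hypothetical names): suggest an analysis pipeline for a new
# dataset via nearest-neighbour search over feature vectors that describe
# previously analysed datasets (e.g. detector type, resolution, sample
# environment, encoded as numbers).
import math

# Catalogue of past datasets: feature vector -> ordered list of analysis steps.
catalogue = [
    ([1.0, 0.2, 3.5], ["normalise", "background_subtraction", "peak_fit"]),
    ([0.1, 4.0, 0.3], ["calibrate", "fourier_transform", "phase_retrieval"]),
    ([0.9, 0.3, 3.4], ["normalise", "background_subtraction", "rietveld_refinement"]),
]

def euclidean(a, b):
    """Plain Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def suggest_pipeline(features, threshold=1.0):
    """Return the pipeline of the closest past dataset, or None when the new
    dataset is too dissimilar -- the cue to fall back on the learning branch."""
    best_vec, best_pipeline = min(
        catalogue, key=lambda entry: euclidean(features, entry[0])
    )
    return best_pipeline if euclidean(features, best_vec) <= threshold else None

print(suggest_pipeline([1.0, 0.25, 3.45]))  # close to the first entry
print(suggest_pipeline([9.0, 9.0, 9.0]))    # no similar past dataset -> None
```

In a real system the feature vectors would come from experiment metadata and the catalogue from profiled algorithms A_ij; the threshold decides when the platform switches from direct reuse to the machine-learned suggestion.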
Such a system would help speed up data analysis for both well-established and new techniques.