Supervised and unsupervised machine learning approach to the CMS data quality monitoring
The CMS experiment at the LHC is one of the biggest and most complex general-purpose detectors ever built. Constant monitoring of the data quality is vital to guarantee proper and efficient operation of the detector and reliable physics results. The choice of the key variables to be monitored by shifters and experts in the Data Quality Monitoring (DQM) framework relies on the expertise of the detector operators. Using supervised machine learning techniques in this process, trained on the data collected and scrutinised during Run 1, could save a considerable fraction of the manpower spent on data quality assessment. Recent data taking has made clear that, with the constant evolution of the detector hardware and software, the current DQM system does not cover every corner of the phase space of possible failures. Approaching data quality with unsupervised feature learning techniques would highlight unforeseen patterns and anomalies while data are being taken. Fast feedback to experts, and the chance to predict failures before they manifest themselves, would allow the CMS collaboration to save both money and data usable for the final analyses.