CMS Data-Mining projects
The CMS experiment at CERN
is seeking CS/MEng/Physics students to work on the following computing data-mining projects. Please contact Valentin Kuznetsov
or Peter Wittich
for project assignment and further details.
Studying dataset popularity
A new round of the physics program at the Large Hadron Collider (LHC) at CERN has begun. The data volume produced by the CMS experiment is already O(10) PB per year and growing. Intelligent data placement at sites within the GRID infrastructure can speed up physics analysis, spread the load among participating sites, and use their capacity at an optimal level. We would like to explore the possibility of applying Machine Learning techniques to predict dataset popularity and to improve data placement according to the obtained predictions.
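As a first feasibility sketch (not the project's prescribed approach), popularity prediction can be framed as binary classification. The numpy-only snippet below fits a logistic-regression model by gradient descent on synthetic per-dataset features; the three features and the labeling rule are hypothetical stand-ins for real popularity metrics such as access counts:

```python
import numpy as np

# Hypothetical per-dataset features (e.g. weekly accesses, size, age),
# standardized; labels: 1 = dataset was "popular" next week, 0 = not.
rng = np.random.default_rng(42)
X = rng.normal(size=(400, 3))
y = (X[:, 0] - 0.5 * X[:, 2] > 0).astype(float)  # toy labeling rule

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(X, y, lr=0.1, epochs=1000):
    """Logistic regression via full-batch gradient descent."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        p = sigmoid(X @ w + b)
        w -= lr * X.T @ (p - y) / len(y)
        b -= lr * np.mean(p - y)
    return w, b

w, b = train(X, y)
accuracy = np.mean((sigmoid(X @ w + b) > 0.5) == (y > 0.5))
```

In production one would replace the synthetic matrix with features extracted from the CMS popularity service and validate on held-out weeks rather than on the training sample.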
Transfer metrics analytics
LHC experiments transfer more than 10 PB/week among grid sites using FTS (the File Transfer Service); CMS alone manages almost 5 PB/week of FTS transfers.
FTS sends metrics about each transfer (e.g. transfer rate, duration) to central HDFS storage at CERN. We propose to use ML techniques to process this raw data and generate predictions of transfer rates/latencies on all links between grid sites.
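One possible starting point, assuming the FTS records can be reduced to simple per-transfer features, is an ordinary least-squares fit of transfer duration. The file-size and link-latency features below, and the synthetic data generated from them, are illustrative assumptions, not the actual FTS schema:

```python
import numpy as np

# Toy FTS-like transfer records: file size (GB) and a per-link base
# latency (s); the target is the observed transfer duration (s).
rng = np.random.default_rng(7)
size_gb = rng.uniform(1, 100, 500)
link_latency = rng.uniform(0.1, 2.0, 500)
# Synthetic ground truth: duration grows linearly with size and latency.
duration = 3.0 * size_gb + 20.0 * link_latency + rng.normal(0, 1.0, 500)

# Ordinary least squares with an intercept column.
A = np.column_stack([size_gb, link_latency, np.ones_like(size_gb)])
coef, *_ = np.linalg.lstsq(A, duration, rcond=None)

predicted_duration = A @ coef
# A rate prediction then follows directly from size / predicted duration.
predicted_rate = size_gb / predicted_duration
```

A real model would also encode the source/destination link (e.g. one-hot) and time-of-day effects, and would likely run on the HDFS data via Spark rather than in-memory numpy.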
CMS event classification using Deep-Learning Networks
We would like to perform feasibility studies of mainstream ML toolkits with CMS ROOT-based
files. We aim to create a common framework for exploring Big Data datasets within Machine Learning (ML)/Deep-Learning (DL) frameworks. The success of this work can lead to the adoption of ML toolkits in High-Energy Physics (HEP). The problem here is two-fold: on one hand we need to efficiently handle PBs of data, and on the other we should explore how that amount of data can be processed via ML/DL framework(s). The particular DL topic would be event classification of CMS data. Here we can use DL either to classify events (e.g. by trigger) or to find new event types via unsupervised learning.
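To make the classification task concrete, here is a minimal numpy sketch of a one-hidden-layer network trained by backpropagation on synthetic "events"; the two features and the nonlinear signal region are invented for illustration, and reading real inputs from CMS ROOT files is outside this sketch:

```python
import numpy as np

# Toy stand-in for per-event features (e.g. a pair of kinematic
# quantities); label 1 marks a nonlinear "signal" region.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 2))
y = (np.sum(X**2, axis=1) > 1.0).astype(float)

# One hidden layer (tanh) + sigmoid output, full-batch gradient descent.
W1 = rng.normal(0, 0.5, (2, 16)); b1 = np.zeros(16)
W2 = rng.normal(0, 0.5, (16, 1)); b2 = np.zeros(1)
lr, losses = 0.5, []
for _ in range(2000):
    H = np.tanh(X @ W1 + b1)                      # hidden activations
    p = 1.0 / (1.0 + np.exp(-(H @ W2 + b2))).ravel()
    losses.append(-np.mean(y * np.log(p + 1e-9)
                           + (1 - y) * np.log(1 - p + 1e-9)))
    # Backpropagation of the cross-entropy loss.
    d_out = (p - y)[:, None] / len(y)
    dW2 = H.T @ d_out; db2 = d_out.sum(0)
    d_hid = (d_out @ W2.T) * (1 - H**2)
    dW1 = X.T @ d_hid; db1 = d_hid.sum(0)
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

accuracy = np.mean((p > 0.5) == (y > 0.5))
```

A serious study would replace this toy network with a mainstream DL toolkit and feed it event features converted from ROOT files, which is exactly the framework question this project sets out to answer.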
-- ValentinKuznetsov - 11 Dec 2014